1
Zeng C, Zhuo C, Gao J, Liu H, Zhao Y. Advances and Challenges in Scoring Functions for RNA-Protein Complex Structure Prediction. Biomolecules 2024; 14:1245. [PMID: 39456178 PMCID: PMC11506084 DOI: 10.3390/biom14101245] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/04/2024] [Revised: 09/24/2024] [Accepted: 09/30/2024] [Indexed: 10/28/2024]
Abstract
RNA-protein complexes play a crucial role in cellular functions, providing insights into cellular mechanisms and potential therapeutic targets. However, experimental determination of these complex structures is often time-consuming and resource-intensive, and it rarely yields high-resolution data. Many computational approaches have been developed to predict RNA-protein complex structures in recent years. Despite these advances, achieving accurate and high-resolution predictions remains a formidable challenge, primarily due to the limitations inherent in current RNA-protein scoring functions. These scoring functions are critical tools for evaluating and interpreting RNA-protein interactions. This review comprehensively explores the latest advancements in scoring functions for RNA-protein docking, delving into the fundamental principles underlying various approaches, including coarse-grained knowledge-based, all-atom knowledge-based, and machine-learning-based methods. We critically evaluate the strengths and limitations of existing scoring functions, providing a detailed performance assessment. Considering the significant progress demonstrated by machine learning techniques, we discuss emerging trends and propose future research directions to enhance the accuracy and efficiency of scoring functions in RNA-protein complex prediction. We aim to inspire the development of more sophisticated and reliable computational tools in this rapidly evolving field.
Affiliation(s)
- Yunjie Zhao
  - Institute of Biophysics and Department of Physics, Central China Normal University, Wuhan 430079, China
2
Sabei A, Hognon C, Martin J, Frezza E. Dynamics of Protein-RNA Interfaces Using All-Atom Molecular Dynamics Simulations. J Phys Chem B 2024; 128:4865-4886. [PMID: 38740056 DOI: 10.1021/acs.jpcb.3c07698] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/16/2024]
Abstract
Addressing the current challenges posed by human diseases requires understanding the cell machinery at the molecular level. The interplay between proteins and RNA, and protein-RNA interactions in particular, is key to any physiological phenomenon. To understand these interactions, many experimental techniques have been developed, spanning a very wide range of spatial and temporal resolutions. In particular, knowledge of the three-dimensional structures of protein-RNA complexes provides structural, mechanical, and dynamical information essential to understanding their functions. To gain insight into the dynamics of protein-RNA complexes, we carried out all-atom molecular dynamics simulations in explicit solvent on nine protein-RNA complexes with different functions and interface sizes, taking into account both the bound and unbound forms. First, we characterized structural changes upon binding and, for the RNA part, the change in the puckering. Second, we extensively analyzed the interfaces, their dynamics and structural properties, and the structural waters involved in the binding, as well as the contacts mediated by them. Based on our analysis, the interfaces rearranged during the simulation time, showing alternative and stable residue-residue contacts with respect to the experimental structure.
Affiliation(s)
- Afra Sabei
  - Université Paris Cité, CiTCoM, CNRS, Paris F-75006, France
- Cécilia Hognon
  - Université Paris Cité, CiTCoM, CNRS, Paris F-75006, France
- Juliette Martin
  - Univ Lyon, Université Claude Bernard Lyon 1, CNRS, UMR 5086 MMSB, Lyon 69367, France
  - Laboratory of Biology and Modeling of the Cell, Université de Lyon, ENS de Lyon, Université Claude Bernard, CNRS UMR 5239, Inserm U1293, Lyon 69367, France
- Elisa Frezza
  - Université Paris Cité, CiTCoM, CNRS, Paris F-75006, France
3
Harini K, Sekijima M, Gromiha MM. PRA-Pred: Structure-based prediction of protein-RNA binding affinity. Int J Biol Macromol 2024; 259:129490. [PMID: 38224813 DOI: 10.1016/j.ijbiomac.2024.129490] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/10/2023] [Revised: 01/10/2024] [Accepted: 01/12/2024] [Indexed: 01/17/2024]
Abstract
Understanding the crucial factors that affect the binding affinity of protein-RNA complexes is vital for comprehending their recognition mechanisms. This study involved compiling experimentally measured binding affinity (ΔG) values of 217 protein-RNA complexes and extracting numerous structure-based features, considering RNA, protein, and interactions between protein and RNA. Our findings indicate the significance of RNA base-step parameters, interaction energies, number of atomic contacts in the complex, hydrogen bonds, and contact potentials in understanding the binding affinity. Further, we observed that these factors are influenced by the type of RNA strand and the function of the protein in a protein-RNA complex. Multiple regression equations were developed for different classes of complexes to predict the binding affinity between the protein and RNA. We evaluated the models using the jack-knife test and achieved an overall correlation of 0.77 between the experimental and predicted binding affinities, with a mean absolute error of 1.02 kcal/mol. Furthermore, we introduced a web server, PRA-Pred, intended for the prediction of protein-RNA binding affinity, and it is freely accessible through https://web.iitm.ac.in/bioinfo2/prapred/. We propose that our approach could function as a potential resource for investigating protein-RNA recognition and developing therapeutic strategies.
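The evaluation protocol named in this abstract (multiple regression assessed with a jack-knife test, reported as a correlation and a mean absolute error) can be illustrated with a minimal sketch. It assumes ordinary least squares as the regression and leave-one-out as the jack-knife; the function names and the synthetic data are hypothetical and are not taken from PRA-Pred.

```python
import numpy as np

def jackknife_predictions(X, y):
    """Leave-one-out predictions from an ordinary least-squares fit (assumed model)."""
    n = len(y)
    A = np.column_stack([np.ones(n), X])  # prepend an intercept column
    preds = np.empty(n)
    for i in range(n):
        keep = np.arange(n) != i  # train on every complex except the i-th
        coef, *_ = np.linalg.lstsq(A[keep], y[keep], rcond=None)
        preds[i] = A[i] @ coef  # predict the held-out complex
    return preds

def pearson_r(a, b):
    """Pearson correlation between two 1-D arrays."""
    return float(np.corrcoef(a, b)[0, 1])

def mean_absolute_error(a, b):
    """Mean absolute difference between two 1-D arrays."""
    return float(np.mean(np.abs(np.asarray(a) - np.asarray(b))))

# Hypothetical data: 40 "complexes", 3 structure-based features, noisy linear target.
rng = np.random.default_rng(0)
X = rng.normal(size=(40, 3))
y = -9.0 + X @ np.array([1.5, -0.8, 0.4]) + rng.normal(scale=0.5, size=40)

preds = jackknife_predictions(X, y)
print("r   =", round(pearson_r(y, preds), 2))
print("MAE =", round(mean_absolute_error(y, preds), 2))
```

Because each prediction comes from a model that never saw the held-out sample, the resulting correlation and error are less optimistic than values computed on the training fit.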
Affiliation(s)
- K Harini
  - Department of Biotechnology, Bhupat and Jyoti Mehta School of Biosciences, Indian Institute of Technology Madras, Chennai 600036, India
- M Sekijima
  - Department of Computer Science, Tokyo Institute of Technology, Yokohama, Japan
- M Michael Gromiha
  - Department of Biotechnology, Bhupat and Jyoti Mehta School of Biosciences, Indian Institute of Technology Madras, Chennai 600036, India; International Research Frontiers Initiative, School of Computing, Tokyo Institute of Technology, Yokohama, 226-8501, Japan; Department of Computer Science, National University of Singapore, Singapore.
4
Agarwal A, Kant S, Bahadur RP. Efficient mapping of RNA-binding residues in RNA-binding proteins using local sequence features of binding site residues in protein-RNA complexes. Proteins 2023; 91:1361-1379. [PMID: 37254800 DOI: 10.1002/prot.26528] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/07/2022] [Revised: 04/13/2023] [Accepted: 05/02/2023] [Indexed: 06/01/2023]
Abstract
Protein-RNA interactions play vital roles in a plethora of biological processes such as regulation of gene expression, protein synthesis, mRNA processing and biogenesis. Identification of RNA-binding residues (RBRs) in proteins is essential to understand RNA-mediated protein functioning, to perform site-directed mutagenesis and to develop novel targeted drug therapies. Moreover, the extensive gap between sequence and structural data restricts the identification of binding sites in unsolved structures. However, efficient use of computational methods that require only sequence to identify binding residues can bridge this huge sequence-structure gap. In this study, we have extensively studied the protein-RNA interface in known RNA-binding proteins (RBPs). We find that the interface is highly enriched in basic and polar residues, with Gly being the most common interface neighbor. We investigated several amino acid features and developed a method to predict putative RBRs from amino acid sequence. We have implemented a balanced random forest (BRF) classifier with local residue features of protein sequences for prediction. With 5-fold cross-validation, the sequence-pattern-derived dipeptide composition based BRF model (DCP-BRF) resulted in an accuracy of 87.9%, specificity of 88.8%, sensitivity of 82.2%, Matthews correlation coefficient of 0.60 and AUC of 0.93, performing better than a few existing methods. We further validated our prediction model on known human RBPs through RBR prediction and could map ~54% of them. Further, knowledge of binding site preferences obtained from computational predictions combined with experimental validation of potential RNA binding sites can enhance our understanding of protein-RNA interactions. This may serve to accelerate investigations into the functional roles of many novel RBPs.
Affiliation(s)
- Ankita Agarwal
  - School of Bio Science, Indian Institute of Technology Kharagpur, Kharagpur, India
  - Computational Structural Biology Laboratory, Department of Biotechnology, Indian Institute of Technology Kharagpur, Kharagpur, India
- Shri Kant
  - Computational Structural Biology Laboratory, Department of Biotechnology, Indian Institute of Technology Kharagpur, Kharagpur, India
- Ranjit Prasad Bahadur
  - Computational Structural Biology Laboratory, Department of Biotechnology, Indian Institute of Technology Kharagpur, Kharagpur, India
5
Sabei A, Caldas Baia TG, Saffar R, Martin J, Frezza E. Internal Normal Mode Analysis Applied to RNA Flexibility and Conformational Changes. J Chem Inf Model 2023; 63:2554-2572. [PMID: 36972178 DOI: 10.1021/acs.jcim.2c01509] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/29/2023]
Abstract
We investigated the capability of internal normal modes to reproduce RNA flexibility and predict observed RNA conformational changes and, notably, those induced by the formation of RNA-protein and RNA-ligand complexes. Here, we extended our iNMA approach developed for proteins to study RNA molecules using a simplified representation of the RNA structure and its potential energy. Three data sets were also created to investigate different aspects. Despite all the approximations, our study shows that iNMA is a suitable method to take into account RNA flexibility and describe its conformational changes opening the route to its applicability in any integrative approach where these properties are crucial.
6
Li H, Huang E, Zhang Y, Huang S, Xiao Y. HDOCK update for modeling protein-RNA/DNA complex structures. Protein Sci 2022; 31:e4441. [PMID: 36305764 PMCID: PMC9615301 DOI: 10.1002/pro.4441] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Received: 07/26/2022] [Revised: 09/05/2022] [Accepted: 09/06/2022] [Indexed: 11/05/2022]
Abstract
Protein-nucleic acid interactions are involved in various cellular processes. Therefore, determining the structures of protein-nucleic acid complexes can provide insights into the mechanisms of the interactions and thus guide the rational drug design to modulate these interactions. Due to the high cost and technical difficulties of solving complex structures experimentally, computational modeling such as molecular docking has been playing an important role in the study of molecular interactions. In order to make it easier for researchers to obtain biomolecular complex structures through molecular docking, we developed the HDOCK server for protein-protein and protein-RNA/DNA docking (accessed at http://hdock.phys.hust.edu.cn/). Since its first release in 2017, HDOCK has been widely used in the scientific community. As nucleic acids may include single-stranded (ss) RNA/DNA and double-stranded (ds) RNA/DNA, we now present an updated version of HDOCK, which offers new options for structural modeling of ssRNA, ssDNA, dsRNA, and dsDNA. We hope this update will better help the scientific community solve important biological problems, thereby advancing the field. In this article, we describe the general protocol of HDOCK with emphasis on the new functions on RNA/DNA modeling. Several application examples are also given to illustrate the usage of the new functions.
Affiliation(s)
- Hao Li
  - School of Physics, Huazhong University of Science and Technology, Wuhan, Hubei, China
- Yi Zhang
  - School of Physics, Huazhong University of Science and Technology, Wuhan, Hubei, China
- Sheng-You Huang
  - School of Physics, Huazhong University of Science and Technology, Wuhan, Hubei, China
- Yi Xiao
  - School of Physics, Huazhong University of Science and Technology, Wuhan, Hubei, China
7
Rodríguez-Lumbreras LA, Jiménez-García B, Giménez-Santamarina S, Fernández-Recio J. pyDockDNA: A new web server for energy-based protein-DNA docking and scoring. Front Mol Biosci 2022; 9:988996. [PMID: 36275623 PMCID: PMC9582769 DOI: 10.3389/fmolb.2022.988996] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 07/07/2022] [Accepted: 09/20/2022] [Indexed: 11/16/2022]
Abstract
Proteins and nucleic acids are essential biological macromolecules for cell life. Indeed, interactions between proteins and DNA regulate many biological processes such as protein synthesis, signal transduction, DNA storage, or DNA replication and repair. Despite their importance, less than 4% of total structures deposited in the Protein Data Bank (PDB) correspond to protein-DNA complexes, and very few computational methods are available to model their structure. We present here the pyDockDNA web server, which can successfully model a protein-DNA complex with a reasonable predictive success rate (as benchmarked on a standard dataset of protein-DNA complex structures, where DNA is in B-DNA conformation). The server implements the pyDockDNA program, as a module of pyDock suite, thus including third-party programs, modules, and previously developed tools, as well as new modules and parameters to handle the DNA properly. The user is asked to enter Protein Data Bank files for protein and DNA input structures (or suitable models) and select the chains to be docked. The server calculations are mainly divided into three steps: sampling by FTDOCK, scoring with new energy-based parameters and the possibility of applying external restraints. The user can select different options for these steps. The final output screen shows a 3D representation of the top 10 models and a table sorting the model according to the scoring function selected previously. All these output files can be downloaded, including the top 100 models predicted by pyDockDNA. The server can be freely accessed for academic use (https://model3dbio.csic.es/pydockdna).
Affiliation(s)
- Brian Jiménez-García
  - Barcelona Supercomputing Center, Barcelona, Spain
  - Zymvol Biomodeling SL, Barcelona, Spain
- Juan Fernández-Recio
  - Barcelona Supercomputing Center, Barcelona, Spain
  - Instituto de Ciencias de la Vid y del Vino (ICVV), Logroño, Spain
  - *Correspondence: Juan Fernández-Recio
8
Yang R, Liu H, Yang L, Zhou T, Li X, Zhao Y. RPpocket: An RNA–Protein Intuitive Database with RNA Pocket Topology Resources. Int J Mol Sci 2022; 23:6903. [PMID: 35805909 PMCID: PMC9266927 DOI: 10.3390/ijms23136903] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 05/24/2022] [Revised: 06/13/2022] [Accepted: 06/20/2022] [Indexed: 02/04/2023]
Abstract
RNA–protein complexes regulate a variety of biological functions. Thus, it is essential to explore and visualize RNA–protein structural interaction features, especially pocket interactions. In this work, we develop an easy-to-use bioinformatics resource: RPpocket. This database provides RNA–protein complex interactions based on sequence, secondary structure, and pocket topology analysis. We extracted 793 pockets from 74 non-redundant RNA–protein structures. Then, we calculated the binding- and non-binding pocket topological properties and analyzed the binding mechanism of the RNA–protein complex. The results showed that the binding pockets were more extended than the non-binding pockets. We also found that long-range forces were the main interaction for RNA–protein recognition, while short-range forces strengthened and optimized the binding. RPpocket could facilitate RNA–protein engineering for biological or medical applications.
9
Evtugyn G, Porfireva A, Tsekenis G, Oravczova V, Hianik T. Electrochemical Aptasensors for Antibiotics Detection: Recent Achievements and Applications for Monitoring Food Safety. Sensors (Basel, Switzerland) 2022; 22:3684. [PMID: 35632093 PMCID: PMC9143886 DOI: 10.3390/s22103684] [Citation(s) in RCA: 10] [Impact Index Per Article: 5.0] [Received: 04/12/2022] [Revised: 05/05/2022] [Accepted: 05/07/2022] [Indexed: 06/15/2023]
Abstract
Antibiotics are often used in human and veterinary medicine for the treatment of bacterial diseases. However, extensive use of antibiotics in agriculture can result in the contamination of common food staples such as milk. Consumption of contaminated products can cause serious illness and a rise in antibiotic resistance. Conventional methods of antibiotic detection, such as microbiological assays and chromatographic and mass spectroscopy methods, are sensitive; however, they require qualified personnel, expensive instruments, and sample pretreatment. Biosensor technology can overcome these drawbacks. This review focuses on recent achievements in electrochemical biosensors based on nucleic acid aptamers for antibiotic detection. A brief explanation of conventional methods of antibiotic detection is also provided. The methods of aptamer selection are explained, together with the approaches used to improve aptamer affinity by post-SELEX modification and computer modeling. The main focus of this review is on the principles of the electrochemical detection of antibiotics by aptasensors and on recent achievements in the development of electrochemical aptasensors. The current trends and problems in practical applications of aptasensors are also discussed.
Affiliation(s)
- Gennady Evtugyn
  - A.M. Butlerov’ Chemistry Institute, Kazan Federal University, 18 Kremlevskaya Street, 420008 Kazan, Russia
  - Analytical Chemistry Department, Chemical Technology Institute, Ural Federal University, 19 Mira Street, 620002 Ekaterinburg, Russia
- Anna Porfireva
  - A.M. Butlerov’ Chemistry Institute, Kazan Federal University, 18 Kremlevskaya Street, 420008 Kazan, Russia
- George Tsekenis
  - Biomedical Research Foundation, Academy of Athens, 4 Soranou Ephessiou Street, 115 27 Athens, Greece
- Veronika Oravczova
  - Department of Nuclear Physics and Biophysics, Comenius University, Mlynska Dolina F1, 842 48 Bratislava, Slovakia
- Tibor Hianik
  - Department of Nuclear Physics and Biophysics, Comenius University, Mlynska Dolina F1, 842 48 Bratislava, Slovakia
10
Mias-Lucquin D, Chauvot de Beauchene I. Conformational variability in proteins bound to single-stranded DNA: A new benchmark for new docking perspectives. Proteins 2022; 90:625-631. [PMID: 34617336 PMCID: PMC9292434 DOI: 10.1002/prot.26258] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 03/25/2021] [Revised: 09/15/2021] [Accepted: 09/27/2021] [Indexed: 12/19/2022]
Abstract
We explored the Protein Data Bank (PDB) to collect protein-ssDNA structures and create a multi-conformational docking benchmark including both bound and unbound protein structures. Because ssDNA is highly flexible when not bound, no unbound ssDNA structure is included in the benchmark. For the 91 sequence-identity groups identified as bound-unbound structures of the same protein, we studied the conformational changes in the protein induced by ssDNA binding. Moreover, based on several bound or unbound protein structures in some groups, we also assessed the intrinsic conformational variability in either bound or unbound conditions and compared it to the supposedly binding-induced modifications. To illustrate a use case of this benchmark, we performed docking experiments using the ATTRACT docking software. This benchmark is, to our knowledge, the first to examine available structures of ssDNA-protein interactions to such an extent, with the aim of improving computational docking tools dedicated to this kind of molecular interaction.
11
Nithin C, Mukherjee S, Basak J, Bahadur RP. NCodR: A multi-class support vector machine classification to distinguish non-coding RNAs in Viridiplantae. Quantitative Plant Biology 2022; 3:e23. [PMID: 37077974 PMCID: PMC10095871 DOI: 10.1017/qpb.2022.18] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/25/2021] [Revised: 08/22/2022] [Accepted: 08/24/2022] [Indexed: 05/02/2023]
Abstract
Non-coding RNAs (ncRNAs) are major players in the regulation of gene expression. This study analyses seven classes of ncRNAs in plants using sequence and secondary structure-based RNA folding measures. We observe distinct regions in the distribution of AU content along with overlapping regions for different ncRNA classes. Additionally, we find similar averages for the minimum folding energy index across the various ncRNA classes, except for pre-miRNAs and lncRNAs. Various RNA folding measures show similar trends among the different ncRNA classes, except for pre-miRNAs and lncRNAs. We observe different k-mer repeat signatures of length three among the various ncRNA classes. However, in pre-miRNAs and lncRNAs, a diffuse pattern of k-mers is observed. Using these attributes, we train eight different classifiers to discriminate the various ncRNA classes in plants. Support vector machines employing a radial basis function show the highest accuracy (average F1 of ~96%) in discriminating ncRNAs, and the classifier is implemented as a web server, NCodR.
Affiliation(s)
- Chandran Nithin
  - Computational Structural Biology Lab, Department of Biotechnology, Indian Institute of Technology, Kharagpur 721302, India
  - Laboratory of Computational Biology, Faculty of Chemistry, Biological and Chemical Research Centre, University of Warsaw, 02-089 Warsaw, Poland
- Sunandan Mukherjee
  - Computational Structural Biology Lab, Department of Biotechnology, Indian Institute of Technology, Kharagpur 721302, India
  - Laboratory of Bioinformatics and Protein Engineering, International Institute of Molecular and Cell Biology, PL-02-109 Warsaw, Poland
- Jolly Basak
  - Department of Biotechnology, Visva-Bharati, Santiniketan, 731235, India
- Ranjit Prasad Bahadur
  - Computational Structural Biology Lab, Department of Biotechnology, Indian Institute of Technology, Kharagpur 721302, India
  - Author for correspondence: R. P. Bahadur
12
A comparative analysis of machine learning classifiers for predicting protein-binding nucleotides in RNA sequences. Comput Struct Biotechnol J 2022; 20:3195-3207. [PMID: 35832617 PMCID: PMC9249596 DOI: 10.1016/j.csbj.2022.06.036] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 03/23/2022] [Revised: 06/14/2022] [Accepted: 06/14/2022] [Indexed: 11/24/2022]
Abstract
RNAs are master players in various cellular and biological processes, and RNA-protein interactions are vital for the proper functioning of cellular machineries. Knowledge of binding sites is crucial to decipher their functional implications. RNA NC-triplet and NC-quartet features give reasonably high performance. The RF model outperformed other machine learning classifiers with 85% accuracy and 0.93 AUC and performed better than a few existing methods. An online web server, “Nucpred”, was developed with the trained model and is freely accessible to the scientific community.
RNA-protein interactions play vital roles in driving the cellular machineries. Despite their significant involvement in several biological processes, the underlying molecular mechanism of RNA-protein interactions is still elusive. This may be due to the experimental difficulties in solving co-crystallized RNA-protein complexes. The inherent flexibility of RNA molecules to adopt different conformations makes them functionally diverse. Their interactions with proteins have implications in RNA disease biology. Thus, the study of binding interfaces can provide mechanistic insight into molecular functioning and the aberrations caused by altered interactions. Moreover, high-throughput sequencing technologies have generated huge amounts of sequence data compared to the available structural data on RNA-protein complexes. In such a scenario, efficient computational algorithms are required to identify the protein-binding interfaces of RNA in the absence of known structures. We have investigated several machine learning classifiers and various features derived from nucleotide sequences to identify protein-binding nucleotides in RNA. We achieve the best performance with nucleotide-triplet and nucleotide-quartet feature-based random forest models. An overall accuracy of 84.8%, sensitivity of 83.2%, specificity of 86.1%, MCC of 0.70 and AUC of 0.93 is achieved. We have further implemented the developed models in a user-friendly web server, “Nucpred”, which is freely accessible at http://www.csb.iitkgp.ac.in/applications/Nucpred/index.
13
Ramos TAR, Galindo NRO, Arias-Carrasco R, da Silva CF, Maracaja-Coutinho V, do Rêgo TG. RNAmining: A machine learning stand-alone and web server tool for RNA coding potential prediction. F1000Res 2021; 10:323. [PMID: 34164114 PMCID: PMC8201426 DOI: 10.12688/f1000research.52350.2] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Accepted: 06/02/2021] [Indexed: 12/22/2022]
Abstract
Non-coding RNAs (ncRNAs) are important players in the cellular regulation of organisms from different kingdoms. One of the key steps in ncRNA research is the ability to distinguish coding from non-coding sequences. We applied seven machine learning algorithms (Naive Bayes, Support Vector Machine, K-Nearest Neighbors, Random Forest, Extreme Gradient Boosting, Neural Networks and Deep Learning) to model organisms from different evolutionary branches to create a stand-alone and web server tool (RNAmining) that distinguishes coding and non-coding sequences. First, we used coding/non-coding sequences downloaded from Ensembl (April 14th, 2020). Then, the coding/non-coding sequences were balanced, their trinucleotide counts were analysed (64 features), and the counts were normalized by sequence length, resulting in a total of 180 models. The machine learning algorithms were validated using 10-fold cross-validation, and we selected the algorithm with the best results (eXtreme Gradient Boosting) to implement in RNAmining. Best F1-scores ranged from 97.56% to 99.57% depending on the organism. Moreover, we benchmarked RNAmining against other tools already in the literature (CPAT, CPC2, RNAcon and TransDecoder), and our results outperformed them. Both stand-alone and web server versions of RNAmining are freely available at https://rnamining.integrativebioinformatics.me/.
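The feature construction described in this abstract (64 trinucleotide counts normalized by sequence length) can be sketched as follows. The abstract does not state whether windows overlap or how ambiguous bases are treated, so the overlapping sliding window, the U-to-T conversion, and the skipping of non-ACGT triplets below are assumptions; the function name is hypothetical and is not RNAmining's code.

```python
from itertools import product

# All 64 trinucleotides over the DNA alphabet, in a fixed order.
TRINUCLEOTIDES = ["".join(p) for p in product("ACGT", repeat=3)]

def trinucleotide_features(seq):
    """Return 64 trinucleotide counts divided by the sequence length."""
    seq = seq.upper().replace("U", "T")  # assumption: treat RNA input as DNA
    counts = dict.fromkeys(TRINUCLEOTIDES, 0)
    for i in range(len(seq) - 2):  # assumption: overlapping windows, step 1
        kmer = seq[i:i + 3]
        if kmer in counts:  # assumption: triplets with ambiguous bases are skipped
            counts[kmer] += 1
    length = len(seq)
    return [counts[k] / length if length else 0.0 for k in TRINUCLEOTIDES]

features = trinucleotide_features("ATGATG")
print(len(features))                            # number of features
print(features[TRINUCLEOTIDES.index("ATG")])    # normalized count of ATG
```

A vector of this kind, one per sequence, is what a classifier such as gradient boosting would take as input; normalizing by length keeps long and short transcripts on a comparable scale.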
Affiliation(s)
- Thaís A R Ramos
  - Programa de Pós-Graduação em Bioinformática, Bioinformatics Multidisciplinary Environment (BioME), Instituto Metrópole Digital, Universidade Federal do Rio Grande do Norte, Natal, Brazil; Departamento de Informática, Centro de Informática, Universidade Federal da Paraíba, João Pessoa, Brazil; Advanced Center for Chronic Diseases (ACCDiS), Facultad de Ciencias Químicas y Farmacéuticas, Universidad de Chile, Santiago, Chile
- Nilbson R O Galindo
  - Departamento de Informática, Centro de Informática, Universidade Federal da Paraíba, João Pessoa, Brazil
- Raúl Arias-Carrasco
  - Advanced Center for Chronic Diseases (ACCDiS), Facultad de Ciencias Químicas y Farmacéuticas, Universidad de Chile, Santiago, Chile
- Cecília F da Silva
  - Departamento de Informática, Centro de Informática, Universidade Federal da Paraíba, João Pessoa, Brazil
- Vinicius Maracaja-Coutinho
  - Programa de Pós-Graduação em Bioinformática, Bioinformatics Multidisciplinary Environment (BioME), Instituto Metrópole Digital, Universidade Federal do Rio Grande do Norte, Natal, Brazil; Advanced Center for Chronic Diseases (ACCDiS), Facultad de Ciencias Químicas y Farmacéuticas, Universidad de Chile, Santiago, Chile; Instituto Vandique, João Pessoa, Brazil
- Thaís G do Rêgo
  - Programa de Pós-Graduação em Bioinformática, Bioinformatics Multidisciplinary Environment (BioME), Instituto Metrópole Digital, Universidade Federal do Rio Grande do Norte, Natal, Brazil; Departamento de Informática, Centro de Informática, Universidade Federal da Paraíba, João Pessoa, Brazil
14
González-Alemán R, Chevrollier N, Simoes M, Montero-Cabrera L, Leclerc F. MCSS-Based Predictions of Binding Mode and Selectivity of Nucleotide Ligands. J Chem Theory Comput 2021; 17:2599-2618. [PMID: 33764770 DOI: 10.1021/acs.jctc.0c01339] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 01/01/2023]
Abstract
Computational fragment-based approaches are widely used in drug design and discovery. One of their limitations is the lack of performance of docking methods, mainly the scoring functions. With the emergence of fragment-based approaches for single-stranded RNA ligands, we analyze the performance in docking and screening powers of an MCSS-based approach. The performance is evaluated on a benchmark of protein-nucleotide complexes where the four RNA residues are used as fragments. The screening power can be considered the major limiting factor for the fragment-based modeling or design of sequence-selective oligonucleotides. We show that the MCSS sampling is efficient even for such large and flexible fragments. Hybrid solvent models based on some partial explicit representations improve both the docking and screening powers. Clustering of the n best-ranked poses can also contribute to a lesser extent to better performance. A detailed analysis of molecular features suggests various ways to optimize the performance further.
Affiliation(s)
- Roy González-Alemán
- Institute for Integrative Biology of the Cell (I2BC), CEA, CNRS, Université Paris Saclay, Gif-sur-Yvette F-91198, France; Laboratorio de Química Computacional y Teórica (LQCT), Facultad de Química, Universidad de La Habana, 10400 La Habana, Cuba
- Nicolas Chevrollier
- Institute for Integrative Biology of the Cell (I2BC), CEA, CNRS, Université Paris Saclay, Gif-sur-Yvette F-91198, France
- Manuel Simoes
- CPC Manufacturing Analytics, 67000 Strasbourg, France
- Luis Montero-Cabrera
- Laboratorio de Química Computacional y Teórica (LQCT), Facultad de Química, Universidad de La Habana, 10400 La Habana, Cuba
- Fabrice Leclerc
- Institute for Integrative Biology of the Cell (I2BC), CEA, CNRS, Université Paris Saclay, Gif-sur-Yvette F-91198, France
|
15
|
Wang H, Zhao Y. RBinds: A user-friendly server for RNA binding site prediction. Comput Struct Biotechnol J 2020; 18:3762-3765. [PMID: 34136090 PMCID: PMC8164131 DOI: 10.1016/j.csbj.2020.10.043] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.0] [Received: 09/17/2020] [Revised: 10/27/2020] [Accepted: 10/31/2020] [Indexed: 12/03/2022] Open
Abstract
RNA performs various biological functions by interacting with other molecules. Knowledge of RNA binding sites is essential for understanding RNA-protein and RNA-ligand complex structures and their mechanisms. However, RNA binding site prediction requires tedious scripting and manual handling, and a user-friendly bioinformatics tool for the task has been missing. This limitation motivated us to develop RBinds, a user-friendly web server that predicts RNA binding sites through a simple graphical user interface. Advanced features implemented in RBinds include (1) transforming the RNA structure into a network automatically; (2) analyzing the structural network properties to predict binding sites; (3) constructing an annotated force-directed network; (4) providing a visualization tool for users to scale and rotate the structure; and (5) offering related tools to predict or simulate RNA structures. The RBinds web server is a reliable and user-friendly tool that facilitates RNA binding site studies without installing programs locally. RBinds is freely accessible at http://zhaoserver.com.cn/RBinds/RBinds.html.
Affiliation(s)
- Huiwen Wang
- Institute of Biophysics and Department of Physics, Central China Normal University, Wuhan 430079, China
- Yunjie Zhao
- Institute of Biophysics and Department of Physics, Central China Normal University, Wuhan 430079, China
|
16
|
Yasmeen F, Seo H, Javaid N, Kim MS, Choi S. Therapeutic Interventions into Innate Immune Diseases by Means of Aptamers. Pharmaceutics 2020; 12:955. [PMID: 33050544 PMCID: PMC7600108 DOI: 10.3390/pharmaceutics12100955] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 09/08/2020] [Revised: 10/03/2020] [Accepted: 10/04/2020] [Indexed: 12/25/2022] Open
Abstract
The immune system plays a crucial role in the body's defense against various pathogens, such as bacteria, viruses, and parasites, and in distinguishing non-self from self molecules. The innate immune system is composed of special receptors known as pattern recognition receptors, which play a crucial role in the identification of pathogen-associated molecular patterns from diverse microorganisms. Any disequilibrium in the activation of a particular pattern recognition receptor leads to various inflammatory, autoimmune, or immunodeficiency diseases. Aptamers are short single-stranded deoxyribonucleic acid or ribonucleic acid molecules, also termed "chemical antibodies," which have tremendous specificity and affinity for their target molecules. Their features, such as stability, low immunogenicity, ease of manufacturing, and facile screening against a target, make them preferable as therapeutics. Immune-system-targeting aptamers have great potential as a targeted therapeutic strategy against immune diseases. This review summarizes components of the innate immune system, aptamer production, pharmacokinetic characteristics of aptamers, and aptamers related to innate-immune-system diseases.
|
17
|
Zheng J, Hong X, Xie J, Tong X, Liu S. P3DOCK: a protein-RNA docking webserver based on template-based and template-free docking. Bioinformatics 2020; 36:96-103. [PMID: 31173056 DOI: 10.1093/bioinformatics/btz478] [Citation(s) in RCA: 9] [Impact Index Per Article: 2.3] [Received: 01/02/2019] [Revised: 05/24/2019] [Accepted: 06/04/2019] [Indexed: 01/02/2023] Open
Abstract
MOTIVATION: The main function of protein-RNA interaction is to regulate the expression of genes. Therefore, studying protein-RNA interactions is of great significance. The information of three-dimensional (3D) structures reveals that atomic interactions are particularly important. Computational modeling of the 3D structure of a complex mainly follows two strategies: free docking and template-based docking. These two methods are complementary in protein-protein docking. Therefore, integrating these two methods may improve the prediction accuracy. RESULTS: In this article, we compare free docking with the template-based algorithm. Then we show the complementarity of these two methods. Based on the analysis of the calculation results, the transition point is confirmed and used to integrate the two docking algorithms to develop P3DOCK. P3DOCK holds the advantages of both algorithms. The results on three docking benchmarks show that P3DOCK is better than the two non-hybrid docking algorithms. The success rate of P3DOCK is also higher (3-20%) than state-of-the-art hybrid and non-hybrid methods. Finally, a hierarchical clustering algorithm is used to cluster P3DOCK's decoys. The clustering algorithm improves the success rate of P3DOCK. For ease of use, we provide a P3DOCK webserver, which can be accessed at www.rnabinding.com/P3DOCK/P3DOCK.html. An integrated protein-RNA docking benchmark can be downloaded from http://rnabinding.com/P3DOCK/benchmark.html. AVAILABILITY AND IMPLEMENTATION: www.rnabinding.com/P3DOCK/P3DOCK.html. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Jinfang Zheng
- School of Physics, Huazhong University of Science and Technology, Wuhan, Hubei 430074, China
- Xu Hong
- School of Physics, Huazhong University of Science and Technology, Wuhan, Hubei 430074, China
- Juan Xie
- School of Physics, Huazhong University of Science and Technology, Wuhan, Hubei 430074, China
- Xiaoxue Tong
- School of Physics, Huazhong University of Science and Technology, Wuhan, Hubei 430074, China
- Shiyong Liu
- School of Physics, Huazhong University of Science and Technology, Wuhan, Hubei 430074, China
|
18
|
He J, Tao H, Huang SY. Protein-ensemble-RNA docking by efficient consideration of protein flexibility through homology models. Bioinformatics 2019; 35:4994-5002. [PMID: 31086984 DOI: 10.1093/bioinformatics/btz388] [Citation(s) in RCA: 15] [Impact Index Per Article: 3.8] [Received: 01/01/2019] [Revised: 04/28/2019] [Accepted: 05/03/2019] [Indexed: 12/18/2022] Open
Abstract
MOTIVATION: Given the importance of protein-ribonucleic acid (RNA) interactions in many biological processes, a variety of docking algorithms have been developed to predict the complex structure from individual protein and RNA partners in the past decade. However, due to the impact of molecular flexibility, the performance of current methods has hit a bottleneck in realistic unbound docking. Pushing the limit, we have proposed a protein-ensemble-RNA docking strategy to explicitly consider the protein flexibility in protein-RNA docking through an ensemble of multiple protein structures, which is referred to as MPRDock. Instead of taking conformations from MD simulations or experimental structures, we obtained the multiple structures of a protein by building models from its homologous templates in the Protein Data Bank (PDB). RESULTS: Our approach can not only avoid the reliability issue of structures from MD simulations but also circumvent the limited number of experimental structures for a target protein in the PDB. Tested on 68 unbound-bound and 18 unbound-unbound protein-RNA complexes, our MPRDock/DITScorePR considerably improved the docking performance and achieved a significantly higher success rate than single-protein rigid docking whether pseudo-unbound templates are included or not. Similar improvements were also observed when combining our ensemble docking strategy with other scoring functions. The present homology model-based ensemble docking approach will have a general application in molecular docking for other interactions. AVAILABILITY AND IMPLEMENTATION: http://huanglab.phys.hust.edu.cn/mprdock/. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Jiahua He
- Institute of Biophysics, School of Physics, Huazhong University of Science and Technology, Wuhan, Hubei, China
- Huanyu Tao
- Institute of Biophysics, School of Physics, Huazhong University of Science and Technology, Wuhan, Hubei, China
- Sheng-You Huang
- Institute of Biophysics, School of Physics, Huazhong University of Science and Technology, Wuhan, Hubei, China
|
19
|
Yan Y, Tao H, He J, Huang SY. The HDOCK server for integrated protein–protein docking. Nat Protoc 2020; 15:1829-1852. [DOI: 10.1038/s41596-020-0312-x] [Citation(s) in RCA: 288] [Impact Index Per Article: 72.0] [Received: 09/01/2019] [Accepted: 02/03/2020] [Indexed: 12/27/2022]
|
20
|
Nithin C, Mukherjee S, Bahadur RP. A structure-based model for the prediction of protein-RNA binding affinity. RNA (New York, N.Y.) 2019; 25:1628-1645. [PMID: 31395671 PMCID: PMC6859855 DOI: 10.1261/rna.071779.119] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.4] [Received: 05/04/2019] [Accepted: 08/05/2019] [Indexed: 05/28/2023]
Abstract
Protein-RNA recognition is highly affinity-driven and regulates a wide array of cellular functions. In this study, we have curated a binding affinity data set of 40 protein-RNA complexes, for which at least one unbound partner is available in the docking benchmark. The data set covers a wide affinity range of eight orders of magnitude as well as four different structural classes. On average, we find that the complexes with single-stranded RNA have the highest affinity, whereas the complexes with duplex RNA have the lowest. Nevertheless, the free energy gain upon binding is the highest for the complexes with ribosomal proteins and the lowest for the complexes with tRNA, with an average of -5.7 cal/mol/Å² in the entire data set. We train regression models to predict the binding affinity from the structural and physicochemical parameters of protein-RNA interfaces. The best-fit model with the lowest maximum error uses three interface parameters: relative hydrophobicity, conformational change upon binding and relative hydration pattern. This model has been used for predicting the binding affinity on a test data set, generated using mutated structures of yeast aspartyl-tRNA synthetase, for which experimentally determined ΔG values of 40 mutations are available. The predicted ΔG_empirical values correlate highly with the experimental observations. The data set provided in this study should be useful for further development of binding affinity prediction methods. Moreover, the model developed in this study enhances our understanding of the structural basis of protein-RNA binding affinity and provides a platform to engineer protein-RNA interfaces with desired affinity.
Affiliation(s)
- Chandran Nithin
- Computational Structural Biology Lab, Department of Biotechnology, Indian Institute of Technology Kharagpur, Kharagpur 721302, India
- Sunandan Mukherjee
- Computational Structural Biology Lab, Department of Biotechnology, Indian Institute of Technology Kharagpur, Kharagpur 721302, India
- Ranjit Prasad Bahadur
- Computational Structural Biology Lab, Department of Biotechnology, Indian Institute of Technology Kharagpur, Kharagpur 721302, India
|
21
|
Protein-assisted RNA fragment docking (RnaX) for modeling RNA-protein interactions using ModelX. Proc Natl Acad Sci U S A 2019; 116:24568-24573. [PMID: 31732673 PMCID: PMC6900601 DOI: 10.1073/pnas.1910999116] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.2] [Indexed: 12/12/2022] Open
Abstract
Protein–RNA interactions, key in biological processes, remained refractory to prediction algorithms. Here we present a new extension of the ModelX tool suite designed for this purpose. RNA–protein complexes in the Protein Data Bank were decomposed into small peptide–oligonucleotide interacting fragment pairs and used as building blocks to assemble big scaffolds representing complex RNA–protein interactions. This method has already been successful for designing DNA–protein and protein–protein interfaces. Areas under the curve up to 0.86 were achieved on binding site prediction showing the accuracy and coverage of our approach over established and in-house benchmarking sets. Together with FoldX protein design tool suite we were able to engineer backbone- and side chain-compatible interfaces using naked protein structures as input. RNA–protein interactions are crucial for such key biological processes as regulation of transcription, splicing, translation, and gene silencing, among many others. Knowing where an RNA molecule interacts with a target protein and/or engineering an RNA molecule to specifically bind to a protein could allow for rational interference with these cellular processes and the design of novel therapies. Here we present a robust RNA–protein fragment pair-based method, termed RnaX, to predict RNA-binding sites. This methodology, which is integrated into the ModelX tool suite (http://modelx.crg.es), takes advantage of the structural information present in all released RNA–protein complexes. This information is used to create an exhaustive database for docking and a statistical forcefield for fast discrimination of true backbone-compatible interactions. 
RnaX, together with the protein design forcefield FoldX, enables us to predict RNA–protein interfaces and, when sufficient crystallographic information is available, to reengineer the interface at the sequence-specificity level by mimicking those conformational changes that occur on protein and RNA mutagenesis. These results, obtained at just a fraction of the computational cost of methods that simulate conformational dynamics, open up perspectives for the engineering of RNA–protein interfaces.
|
22
|
Yan Y, Zhang D, Zhou P, Li B, Huang SY. HDOCK: a web server for protein-protein and protein-DNA/RNA docking based on a hybrid strategy. Nucleic Acids Res 2017; 45:W365-W373. [PMID: 28521030 PMCID: PMC5793843 DOI: 10.1093/nar/gkx407] [Citation(s) in RCA: 651] [Impact Index Per Article: 130.2] [Received: 01/25/2017] [Accepted: 04/29/2017] [Indexed: 12/16/2022] Open
Abstract
Protein–protein and protein–DNA/RNA interactions play a fundamental role in a variety of biological processes. Determining the complex structures of these interactions is valuable, in which molecular docking has played an important role. To automatically make use of the binding information from the PDB in docking, here we have presented HDOCK, a novel web server of our hybrid docking algorithm of template-based modeling and free docking, in which cases with misleading templates can be rescued by the free docking protocol. The server supports protein–protein and protein–DNA/RNA docking and accepts both sequence and structure inputs for proteins. The docking process is fast and consumes about 10–20 min for a docking run. Tested on the cases with weakly homologous complexes of <30% sequence identity from five docking benchmarks, the HDOCK pipeline tied with template-based modeling on the protein–protein and protein–DNA benchmarks and performed better than template-based modeling on the three protein–RNA benchmarks when the top 10 predictions were considered. The performance of HDOCK became better when more predictions were considered. Combining the results of HDOCK and template-based modeling by ranking first of the template-based model further improved the predictive power of the server. The HDOCK web server is available at http://hdock.phys.hust.edu.cn/.
Affiliation(s)
- Yumeng Yan
- School of Physics, Huazhong University of Science and Technology, Wuhan, Hubei 430074, P. R. China
- Di Zhang
- School of Physics, Huazhong University of Science and Technology, Wuhan, Hubei 430074, P. R. China
- Pei Zhou
- School of Physics, Huazhong University of Science and Technology, Wuhan, Hubei 430074, P. R. China
- Botong Li
- School of Physics, Huazhong University of Science and Technology, Wuhan, Hubei 430074, P. R. China
- Sheng-You Huang
- School of Physics, Huazhong University of Science and Technology, Wuhan, Hubei 430074, P. R. China
|
23
|
Yan Y, Huang SY. RRDB: a comprehensive and non-redundant benchmark for RNA-RNA docking and scoring. Bioinformatics 2018; 34:453-458. [PMID: 29028888 DOI: 10.1093/bioinformatics/btx615] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.8] [Received: 05/08/2017] [Accepted: 09/23/2017] [Indexed: 12/15/2022] Open
Abstract
Motivation: With the discovery of more and more noncoding RNAs and their versatile functions, RNA-RNA interactions have received increased attention. Therefore, determination of their complex structures is valuable to understand the molecular mechanism of the interactions. Given the high cost of experimental methods, computational approaches like molecular docking have played an important role in the determination of complex structures, in which a benchmark is critical for the development of docking algorithms. Results: Meeting the need, we have developed the first comprehensive and nonredundant RNA-RNA docking benchmark (RRDB). The diverse dataset of 123 targets consists of 78 unbound-unbound and 45 bound-unbound (or unbound-bound) test cases. The dataset was classified into three groups according to the interface conformational changes between bound and unbound structures: 47 'easy', 38 'medium' and 38 'difficult' targets. A docking test with the benchmark using ZDOCK 2.1 demonstrated the challenging nature of the RNA-RNA docking problem and the important value of the present benchmark. The bound and unbound cases of the benchmark will be beneficial for the development and optimization of docking and scoring algorithms for RNA-RNA interactions. Availability and implementation: The benchmark is available at http://huanglab.phys.hust.edu.cn/RRDbenchmark/. Contact: huangsy@hust.edu.cn. Supplementary information: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Yumeng Yan
- School of Physics, Huazhong University of Science and Technology, Wuhan, Hubei 430074, People's Republic of China
- Sheng-You Huang
- School of Physics, Huazhong University of Science and Technology, Wuhan, Hubei 430074, People's Republic of China
|
24
|
Krüger A, Zimbres FM, Kronenberger T, Wrenger C. Molecular Modeling Applied to Nucleic Acid-Based Molecule Development. Biomolecules 2018; 8:E83. [PMID: 30150587 PMCID: PMC6163985 DOI: 10.3390/biom8030083] [Citation(s) in RCA: 15] [Impact Index Per Article: 2.5] [Received: 07/31/2018] [Revised: 08/12/2018] [Accepted: 08/16/2018] [Indexed: 12/15/2022] Open
Abstract
Molecular modeling by means of docking and molecular dynamics (MD) has become an integral part of early drug discovery projects, enabling the screening and enrichment of large libraries of small molecules. In the past decades, special emphasis was drawn to nucleic acid (NA)-based molecules in the fields of therapy, diagnosis, and drug delivery. Research has increased dramatically with the advent of the SELEX (systematic evolution of ligands by exponential enrichment) technique, which results in single-stranded DNA or RNA sequences that bind with high affinity and specificity to their targets. Herein, we discuss the role and contribution of docking and MD to the development and optimization of new nucleic acid-based molecules. This review focuses on the different approaches currently available for molecular modeling applied to NA interaction with proteins. We discuss topics ranging from structure prediction to docking and MD, highlighting their main advantages and limitations and the influence of flexibility on their calculations.
Affiliation(s)
- Arne Krüger
- Unit for Drug Discovery, Department of Parasitology, Institute of Biomedical Sciences, University of São Paulo, São Paulo, SP 05508-000, Brazil
- Flávia M Zimbres
- Department of Biochemistry and Molecular Biology and Center for Tropical and Emerging Global Diseases, University of Georgia, Athens, GA 30602, USA
- Thales Kronenberger
- Department of Internal Medicine VIII, University Hospital of Tübingen, 72076 Tübingen, Germany
- Carsten Wrenger
- Unit for Drug Discovery, Department of Parasitology, Institute of Biomedical Sciences, University of São Paulo, São Paulo, SP 05508-000, Brazil
|
25
|
Nithin C, Ghosh P, Bujnicki JM. Bioinformatics Tools and Benchmarks for Computational Docking and 3D Structure Prediction of RNA-Protein Complexes. Genes (Basel) 2018; 9:432. [PMID: 30149645 PMCID: PMC6162694 DOI: 10.3390/genes9090432] [Citation(s) in RCA: 26] [Impact Index Per Article: 4.3] [Received: 05/14/2018] [Revised: 07/26/2018] [Accepted: 08/21/2018] [Indexed: 12/29/2022] Open
Abstract
RNA-protein (RNP) interactions play essential roles in many biological processes, such as regulation of co-transcriptional and post-transcriptional gene expression, RNA splicing, transport, storage and stabilization, as well as protein synthesis. An increasing number of RNP structures would aid in a better understanding of these processes. However, due to the technical difficulties associated with experimental determination of macromolecular structures by high-resolution methods, studies on RNP recognition and complex formation present significant challenges. As an alternative, computational prediction of RNP interactions can be carried out. Structural models obtained by theoretical predictive methods are, in general, less reliable than models based on experimental measurements, but they can be sufficiently accurate to be used as a basis for formulating functional hypotheses. In this article, we present an overview of computational methods for 3D structure prediction of RNP complexes. We discuss currently available methods for macromolecular docking and for scoring 3D structural models of RNP complexes in particular. We also review benchmarks that have been developed to assess the accuracy of these methods.
Affiliation(s)
- Chandran Nithin
- Laboratory of Bioinformatics and Protein Engineering, International Institute of Molecular and Cell Biology in Warsaw, ul. Ks. Trojdena 4, PL-02-109 Warsaw, Poland
- Pritha Ghosh
- Laboratory of Bioinformatics and Protein Engineering, International Institute of Molecular and Cell Biology in Warsaw, ul. Ks. Trojdena 4, PL-02-109 Warsaw, Poland
- Janusz M Bujnicki
- Laboratory of Bioinformatics and Protein Engineering, International Institute of Molecular and Cell Biology in Warsaw, ul. Ks. Trojdena 4, PL-02-109 Warsaw, Poland
- Bioinformatics Laboratory, Institute of Molecular Biology and Biotechnology, Faculty of Biology, Adam Mickiewicz University, ul. Umultowska 89, PL-61-614 Poznan, Poland
|
26
|
An account of solvent accessibility in protein-RNA recognition. Sci Rep 2018; 8:10546. [PMID: 30002431 PMCID: PMC6043566 DOI: 10.1038/s41598-018-28373-2] [Citation(s) in RCA: 18] [Impact Index Per Article: 3.0] [Received: 03/01/2018] [Accepted: 06/21/2018] [Indexed: 01/16/2023] Open
Abstract
Protein–RNA recognition often induces conformational changes in binding partners. Consequently, the solvent accessible surface area (SASA) buried in contact estimated from the co-crystal structures may differ from that calculated using their unbound forms. To evaluate the change in accessibility upon binding, we compare the SASA of 126 protein-RNA complexes between bound and unbound forms. We observe that, in the majority of cases, the interfaces of both binding partners gain accessibility upon binding, which is often associated with either large domain movements or secondary structural transitions in RNA-binding proteins (RBPs), and binding-induced conformational changes in RNAs. At the non-interface region, the majority of RNAs lose accessibility upon binding; however, no such preference is observed for RBPs. Side chains of RBPs make the major contribution to the change in accessibility. In the case of flexible binding, we find a moderate correlation between the binding free energy and the change in accessibility at the interface. Finally, we introduce a parameter, the ratio of gain to loss of accessibility upon binding, which can be used to identify the native solution among the flexible docking models. Our findings provide fundamental insights into the relationship between flexibility and solvent accessibility, and advance our understanding of binding-induced folding in protein-RNA recognition.
|