1
Mahmoudi I, Quignot C, Martins C, Andreani J. Structural comparison of homologous protein-RNA interfaces reveals widespread overall conservation contrasted with versatility in polar contacts. PLoS Comput Biol 2024; 20:e1012650. [PMID: 39625988 PMCID: PMC11642956 DOI: 10.1371/journal.pcbi.1012650] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/24/2024] [Revised: 12/13/2024] [Accepted: 11/18/2024] [Indexed: 12/14/2024]
Abstract
Protein-RNA interactions play a critical role in many cellular processes and pathologies. However, experimental determination of protein-RNA structures is still challenging, therefore computational tools are needed for the prediction of protein-RNA interfaces. Although evolutionary pressures can be exploited for structural prediction of protein-protein interfaces, and recent deep learning methods using protein multiple sequence alignments have radically improved the performance of protein-protein interface structural prediction, protein-RNA structural prediction is lagging behind, due to the scarcity of structural data and the flexibility involved in these complexes. To study the evolution of protein-RNA interface structures, we first identified a large and diverse dataset of 2,022 pairs of structurally homologous interfaces (termed structural interologs). We leveraged this unique dataset to analyze the conservation of interface contacts among structural interologs based on the properties of involved amino acids and nucleotides. We uncovered that 73% of distance-based contacts and 68% of apolar contacts are conserved on average, and the strong conservation of these contacts occurs even in distant homologs with sequence identity below 20%. Distance-based contacts are also much more conserved compared to what we had found in a previous study of homologous protein-protein interfaces. In contrast, hydrogen bonds, salt bridges, and π-stacking interactions are very versatile in pairs of protein-RNA interologs, even for close homologs with high interface sequence identity. We found that almost half of the non-conserved distance-based contacts are linked to a small proportion of interface residues that no longer make interface contacts in the interolog, a phenomenon we term "interface switching out". 
We also examined possible recovery mechanisms for non-conserved hydrogen bonds and salt bridges, uncovering diverse scenarios of switching out, change in amino acid chemical nature, intermolecular and intramolecular compensations. Our findings provide insights for integrating evolutionary signals into predictive protein-RNA structural modeling methods.
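Illustrative sketch (not taken from the paper): one simple way to quantify how many interface contacts are conserved between two structural interologs. The contact representation, the alignment maps `prot_map` and `rna_map`, and the choice of interface A as the denominator are all assumptions made for this example; the abstract does not specify the authors' exact definition.

```python
from typing import Dict, Set, Tuple

# A contact is a (protein residue index, RNA nucleotide index) pair.
Contact = Tuple[int, int]


def conserved_fraction(
    contacts_a: Set[Contact],
    contacts_b: Set[Contact],
    prot_map: Dict[int, int],
    rna_map: Dict[int, int],
) -> float:
    """Fraction of contacts in interface A that are also contacts in B.

    prot_map / rna_map are hypothetical alignment maps from positions in
    complex A to the equivalent positions in complex B. Contacts involving
    an unaligned position count as not conserved.
    """
    if not contacts_a:
        return 0.0
    conserved = 0
    for res, nt in contacts_a:
        mapped = (prot_map.get(res), rna_map.get(nt))
        if None not in mapped and mapped in contacts_b:
            conserved += 1
    return conserved / len(contacts_a)


# Toy data: residue 11 is aligned but its partner makes no contact in B,
# and residue 15 has no aligned counterpart.
contacts_a = {(10, 1), (11, 1), (12, 2), (15, 3)}
contacts_b = {(20, 1), (22, 2), (30, 3)}
prot_map = {10: 20, 11: 21, 12: 22}
rna_map = {1: 1, 2: 2, 3: 3}
fraction = conserved_fraction(contacts_a, contacts_b, prot_map, rna_map)
```

In this toy case two of the four contacts in A map onto contacts in B. The same function could be applied separately to distance-based, apolar, or hydrogen-bond contact sets to compare their conservation.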
Affiliation(s)
- Ikram Mahmoudi
  - Université Paris-Saclay, CEA, CNRS, Institute for Integrative Biology of the Cell (I2BC), 91198, Gif-sur-Yvette, France
- Chloé Quignot
  - Université Paris-Saclay, CEA, CNRS, Institute for Integrative Biology of the Cell (I2BC), 91198, Gif-sur-Yvette, France
- Carla Martins
  - Université Paris-Saclay, CEA, CNRS, Institute for Integrative Biology of the Cell (I2BC), 91198, Gif-sur-Yvette, France
- Jessica Andreani
  - Université Paris-Saclay, CEA, CNRS, Institute for Integrative Biology of the Cell (I2BC), 91198, Gif-sur-Yvette, France
2
Zeng C, Zhuo C, Gao J, Liu H, Zhao Y. Advances and Challenges in Scoring Functions for RNA-Protein Complex Structure Prediction. Biomolecules 2024; 14:1245. [PMID: 39456178 PMCID: PMC11506084 DOI: 10.3390/biom14101245] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/04/2024] [Revised: 09/24/2024] [Accepted: 09/30/2024] [Indexed: 10/28/2024]
Abstract
RNA-protein complexes play a crucial role in cellular functions, providing insights into cellular mechanisms and potential therapeutic targets. However, experimental determination of these complex structures is often time-consuming and resource-intensive, and it rarely yields high-resolution data. Many computational approaches have been developed to predict RNA-protein complex structures in recent years. Despite these advances, achieving accurate and high-resolution predictions remains a formidable challenge, primarily due to the limitations inherent in current RNA-protein scoring functions. These scoring functions are critical tools for evaluating and interpreting RNA-protein interactions. This review comprehensively explores the latest advancements in scoring functions for RNA-protein docking, delving into the fundamental principles underlying various approaches, including coarse-grained knowledge-based, all-atom knowledge-based, and machine-learning-based methods. We critically evaluate the strengths and limitations of existing scoring functions, providing a detailed performance assessment. Considering the significant progress demonstrated by machine learning techniques, we discuss emerging trends and propose future research directions to enhance the accuracy and efficiency of scoring functions in RNA-protein complex prediction. We aim to inspire the development of more sophisticated and reliable computational tools in this rapidly evolving field.
Affiliation(s)
- Yunjie Zhao
  - Institute of Biophysics and Department of Physics, Central China Normal University, Wuhan 430079, China; (C.Z.); (C.Z.); (J.G.); (H.L.)
3
Roche R, Tarafder S, Bhattacharya D. Single-sequence protein-RNA complex structure prediction by geometric attention-enabled pairing of biological language models. bioRxiv 2024:2024.07.27.605468. [PMID: 39091736 PMCID: PMC11291176 DOI: 10.1101/2024.07.27.605468] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 08/04/2024]
Abstract
Ground-breaking progress has been made in structure prediction of biomolecular assemblies, including the recent breakthrough of AlphaFold 3. However, it remains challenging for AlphaFold 3 and other state-of-the-art deep learning-based methods to accurately predict protein-RNA complex structures, in part due to the limited availability of evolutionary and structural information related to protein-RNA interactions that are used as inputs to the existing approaches. Here, we introduce ProRNA3D-single, a new deep-learning framework for protein-RNA complex structure prediction with only single-sequence input. Using a novel geometric attention-enabled pairing of biological language models of protein and RNA, a previously unexplored avenue, ProRNA3D-single enables the prediction of interatomic protein-RNA interaction maps, which are then transformed into multi-scale geometric restraints for modeling 3D structures of protein-RNA complexes via geometry optimization. Benchmark tests show that ProRNA3D-single convincingly outperforms current state-of-the-art methods including AlphaFold 3, particularly when evolutionary information is limited; and exhibits remarkable robustness and performance resilience by attaining better accuracy with only single-sequence input than what most methods can achieve even with explicit evolutionary information. Freely available at https://github.com/Bhattacharya-Lab/ProRNA3D-single, ProRNA3D-single should be broadly useful for modeling 3D structures of protein-RNA complexes at scale, regardless of the availability of evolutionary information.
Affiliation(s)
- Rahmatullah Roche
  - Department of Computer Science, Virginia Tech, Blacksburg, VA 24061, United States of America
- Sumit Tarafder
  - Department of Computer Science, Virginia Tech, Blacksburg, VA 24061, United States of America
- Debswapna Bhattacharya
  - Department of Computer Science, Virginia Tech, Blacksburg, VA 24061, United States of America
4
Kravchenko A, de Vries SJ, Smaïl-Tabbone M, Chauvot de Beauchene I. HIPPO: HIstogram-based Pseudo-POtential for scoring protein-ssRNA fragment-based docking poses. BMC Bioinformatics 2024; 25:129. [PMID: 38532339 DOI: 10.1186/s12859-024-05733-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/25/2023] [Accepted: 03/06/2024] [Indexed: 03/28/2024]
Abstract
BACKGROUND The RNA-Recognition motif (RRM) is a protein domain that binds single-stranded RNA (ssRNA) and is present in as much as 2% of the human genome. Despite this important role in biology, RRM-ssRNA interactions are very challenging to study on the structural level because of the remarkable flexibility of ssRNA. In the absence of atomic-level experimental data, the only method able to predict the 3D structure of protein-ssRNA complexes with any degree of accuracy is ssRNA'TTRACT, an ssRNA fragment-based docking approach using ATTRACT. However, since ATTRACT parameters are not ssRNA-specific and were determined in 2010, there is substantial opportunity for enhancement. RESULTS Here we present HIPPO, a composite RRM-ssRNA scoring potential derived analytically from contact frequencies in near-native versus non-native docking models. HIPPO consists of a consensus of four distinct potentials, each extracted from a distinct reference pool of protein-trinucleotide docking decoys. To score a docking pose with one potential, for each pair of RNA-protein coarse-grained bead types, each contact is awarded or penalised according to the relative frequencies of this contact distance range among the correct and incorrect poses of the reference pool. Validated on a fragment-based docking benchmark of 57 experimentally solved RRM-ssRNA complexes, HIPPO achieved a threefold or higher enrichment for half of the fragments, versus only a quarter with the ATTRACT scoring function. In particular, HIPPO drastically improved the chance of very high enrichment (12-fold or higher), a scenario where the incremental modelling of entire ssRNA chains from fragments becomes viable. However, for the latter result, more research is needed to make it directly practically applicable. Regardless, our approach already improves upon the state of the art in RRM-ssRNA modelling and is in principle extendable to other types of protein-nucleic acid interactions.
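Illustrative sketch (not the HIPPO implementation): the abstract describes rewarding or penalising each bead-bead contact according to how often its distance range occurs among correct versus incorrect reference poses. The code below shows one way such a histogram-based potential could be built. The log-ratio form, the distance bins in `BIN_EDGES`, the pseudocount, and the bead labels are assumptions chosen for the example, and only a single potential is shown rather than the consensus of four.

```python
import math
from collections import Counter

# Hypothetical distance ranges in angstroms; a pose is a list of
# (protein_bead_type, rna_bead_type, distance) contacts.
BIN_EDGES = [0.0, 4.0, 6.0, 8.0, 10.0]


def bin_index(distance):
    """Return the index of the distance bin, or None if out of range."""
    for i in range(len(BIN_EDGES) - 1):
        if BIN_EDGES[i] <= distance < BIN_EDGES[i + 1]:
            return i
    return None


def histogram(poses):
    """Count contacts per (protein bead, RNA bead, distance bin)."""
    counts = Counter()
    for pose in poses:
        for prot_bead, rna_bead, dist in pose:
            b = bin_index(dist)
            if b is not None:
                counts[(prot_bead, rna_bead, b)] += 1
    return counts


def build_potential(correct_poses, incorrect_poses, pseudocount=1.0):
    """Log-ratio of smoothed contact frequencies, correct vs incorrect."""
    h_c, h_i = histogram(correct_poses), histogram(incorrect_poses)
    keys = set(h_c) | set(h_i)
    n_c = sum(h_c.values()) + pseudocount * len(keys)
    n_i = sum(h_i.values()) + pseudocount * len(keys)
    return {
        key: math.log(((h_c[key] + pseudocount) / n_c)
                      / ((h_i[key] + pseudocount) / n_i))
        for key in keys
    }


def score_pose(pose, potential):
    """Sum the per-contact rewards and penalties; unseen contacts add 0."""
    total = 0.0
    for prot_bead, rna_bead, dist in pose:
        b = bin_index(dist)
        if b is not None:
            total += potential.get((prot_bead, rna_bead, b), 0.0)
    return total


# Toy reference pool: close contacts in correct poses, distant in incorrect.
correct = [[("P1", "R1", 3.5)], [("P1", "R1", 3.0)]]
incorrect = [[("P1", "R1", 9.0)], [("P1", "R1", 9.5)]]
potential = build_potential(correct, incorrect)
near_score = score_pose([("P1", "R1", 3.2)], potential)
far_score = score_pose([("P1", "R1", 9.2)], potential)
```

With this toy pool, a pose whose contact falls in the range favoured by correct poses scores higher than one whose contact falls in the range favoured by incorrect poses.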
Affiliation(s)
- Anna Kravchenko
  - Université de Lorraine, CNRS, Inria, LORIA, 54000, Nancy, France
5
Hong X, Tong X, Xie J, Liu P, Liu X, Song Q, Liu S, Liu S. An updated dataset and a structure-based prediction model for protein-RNA binding affinity. Proteins 2023; 91:1245-1253. [PMID: 37186412 DOI: 10.1002/prot.26503] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 06/30/2022] [Revised: 03/08/2023] [Accepted: 04/12/2023] [Indexed: 05/17/2023]
Abstract
Understanding the process of protein-RNA interaction is essential for structural biology. The thermodynamics of binding is an important part of uncovering the protein-RNA interaction mechanism. The regulatory networks between protein and RNA in organisms are dominated by binding and dissociation in cells. Therefore, determining the binding affinity of protein-RNA complexes can help us understand the regulation mechanism of protein-RNA interaction. Since determining the binding affinity of protein-RNA complexes experimentally is time-consuming and labor-intensive, computational methods to predict it are urgently needed. To develop a binding affinity prediction model, we first update the protein-RNA binding affinity benchmark dataset (PRBAB), which now includes 145 complexes. Second, we extract structural features from the complex structures, then analyze and select representative structural features to train the regression model. Third, we randomly select a subset of PRBAB2.0 to fit the experimentally determined protein-RNA binding affinities. Finally, we tested our model on the nonredundant PDBbind dataset, obtaining a Pearson correlation coefficient r = 0.57 and RMSE = 2.51 kcal/mol. The Pearson correlation coefficient reaches 0.7 after removing 5 complex structures with modified residues/nucleotides and metal ions. On ProNAB, 71.60% of the predictions achieve a Pearson correlation coefficient r = 0.61 and RMSE = 1.56 kcal/mol against experimental values.
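Illustrative sketch (not from the paper): the two evaluation metrics quoted in this abstract, the Pearson correlation coefficient and the root-mean-square error, can be computed as follows. The function names and the toy values are assumptions for the example and are not the study's data.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)


def rmse(predicted, observed):
    """Root-mean-square error, in the same units as the inputs."""
    squared = sum((p - o) ** 2 for p, o in zip(predicted, observed))
    return math.sqrt(squared / len(predicted))


# Toy binding free energies in kcal/mol (made up for illustration).
predicted = [-9.0, -10.5, -12.0, -8.0]
observed = [-9.5, -10.0, -13.0, -8.5]
r = pearson_r(predicted, observed)
error = rmse(predicted, observed)
```

The two metrics answer different questions: r measures whether predictions rank and scale with experiment, while RMSE measures the typical absolute deviation, so a model can have a high r and still a large RMSE if it is systematically offset.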
Affiliation(s)
- Xu Hong
  - School of Physics, Huazhong University of Science and Technology, Wuhan, Hubei, China
- Xiaoxue Tong
  - School of Physics, Huazhong University of Science and Technology, Wuhan, Hubei, China
- Juan Xie
  - School of Physics, Huazhong University of Science and Technology, Wuhan, Hubei, China
- Pinyu Liu
  - School of Physics, Huazhong University of Science and Technology, Wuhan, Hubei, China
- Xudong Liu
  - School of Physics, Huazhong University of Science and Technology, Wuhan, Hubei, China
- Qi Song
  - Key Laboratory of Fermentation Engineering (Ministry of Education), Hubei University of Technology, Wuhan, China
- Sen Liu
  - Key Laboratory of Fermentation Engineering (Ministry of Education), Hubei University of Technology, Wuhan, China
- Shiyong Liu
  - School of Physics, Huazhong University of Science and Technology, Wuhan, Hubei, China
6
Liu X, Duan Y, Hong X, Xie J, Liu S. Challenges in structural modeling of RNA-protein interactions. Curr Opin Struct Biol 2023; 81:102623. [PMID: 37301066 DOI: 10.1016/j.sbi.2023.102623] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/31/2023] [Revised: 05/14/2023] [Accepted: 05/16/2023] [Indexed: 06/12/2023]
Abstract
In the past few years, the number of known RNA-binding proteins (RBPs) and RNA-RBP interactions has increased significantly. Here, we review recent developments in the methodology for protein-RNA and protein-protein complex structure modeling with deep learning and co-evolution, and discuss the challenges and opportunities for building a reliable approach for protein-RNA complex structure modeling. Protein Data Bank (PDB) and cross-linking immunoprecipitation (CLIP) data could be combined and used to infer the 2D geometry of protein-RNA interactions by deep learning.
Affiliation(s)
- Xudong Liu
  - School of Physics, Huazhong University of Science and Technology, Wuhan, Hubei, 430074, China
- Yingtian Duan
  - School of Physics, Huazhong University of Science and Technology, Wuhan, Hubei, 430074, China
- Xu Hong
  - School of Physics, Huazhong University of Science and Technology, Wuhan, Hubei, 430074, China
- Juan Xie
  - School of Physics, Huazhong University of Science and Technology, Wuhan, Hubei, 430074, China
- Shiyong Liu
  - School of Physics, Huazhong University of Science and Technology, Wuhan, Hubei, 430074, China
7
Mias-Lucquin D, Chauvot de Beauchene I. Conformational variability in proteins bound to single-stranded DNA: A new benchmark for new docking perspectives. Proteins 2022; 90:625-631. [PMID: 34617336 PMCID: PMC9292434 DOI: 10.1002/prot.26258] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 03/25/2021] [Revised: 09/15/2021] [Accepted: 09/27/2021] [Indexed: 12/19/2022]
Abstract
We explored the Protein Data Bank (PDB) to collect protein-ssDNA structures and create a multi-conformational docking benchmark including both bound and unbound protein structures. Because ssDNA is highly flexible when not bound, no unbound ssDNA structure is included in the benchmark. For the 91 sequence-identity groups identified as bound-unbound structures of the same protein, we studied the conformational changes in the protein induced by ssDNA binding. Moreover, based on several bound or unbound protein structures in some groups, we also assessed the intrinsic conformational variability in either bound or unbound conditions and compared it to the supposedly binding-induced modifications. To illustrate a use case of this benchmark, we performed docking experiments using the ATTRACT docking software. This benchmark is, to our knowledge, the first one made to peruse available structures of ssDNA-protein interactions to such an extent, aiming to improve computational docking tools dedicated to this kind of molecular interaction.
8
Feng Y, Zhang K, Wu Q, Huang SY. NLDock: a Fast Nucleic Acid-Ligand Docking Algorithm for Modeling RNA/DNA-Ligand Complexes. J Chem Inf Model 2021; 61:4771-4782. [PMID: 34468128 DOI: 10.1021/acs.jcim.1c00341] [Citation(s) in RCA: 23] [Impact Index Per Article: 5.8] [Indexed: 12/24/2022]
Abstract
Nucleic acid-ligand interactions play an important role in numerous cellular processes such as gene function expression and regulation. Therefore, nucleic acids such as RNAs have become more and more important drug targets, where the structural determination of nucleic acid-ligand complexes is pivotal for understanding their functions and thus developing therapeutic interventions. Molecular docking has been a useful computational tool in predicting the complex structure between molecules. However, although a number of docking algorithms have been developed for protein-ligand interactions, only a few docking programs were presented for nucleic acid-ligand interactions. Here, we have developed a fast nucleic acid-ligand docking algorithm, named NLDock, by implementing our intrinsic scoring function ITScoreNL for nucleic acid-ligand interactions into a modified version of the MDock program. NLDock was extensively evaluated on four test sets and compared with five other state-of-the-art docking algorithms including AutoDock, DOCK 6, rDock, GOLD, and Glide. It was shown that our NLDock algorithm obtained a significantly better performance than the other docking programs in binding mode predictions and achieved the success rates of 73%, 36%, and 32% on the largest test set of 77 complexes for local rigid-, local flexible-, and global flexible-ligand docking, respectively. In addition, our NLDock approach is also computationally efficient and consumed an average of as short as 0.97 and 2.08 min for a local flexible-ligand docking job and a global flexible-ligand docking job, respectively. These results suggest the good performance of our NLDock in both docking accuracy and computational efficiency.
Affiliation(s)
- Yuyu Feng
  - School of Physics, Huazhong University of Science and Technology, Wuhan, Hubei 430074, P. R. China
- Keqiong Zhang
  - School of Physics, Huazhong University of Science and Technology, Wuhan, Hubei 430074, P. R. China
- Qilong Wu
  - School of Physics, Huazhong University of Science and Technology, Wuhan, Hubei 430074, P. R. China
- Sheng-You Huang
  - School of Physics, Huazhong University of Science and Technology, Wuhan, Hubei 430074, P. R. China
9
Feng Y, Huang SY. ITScore-NL: An Iterative Knowledge-Based Scoring Function for Nucleic Acid-Ligand Interactions. J Chem Inf Model 2020; 60:6698-6708. [PMID: 33291885 DOI: 10.1021/acs.jcim.0c00974] [Citation(s) in RCA: 19] [Impact Index Per Article: 3.8] [Indexed: 12/11/2022]
Abstract
Nucleic acid-ligand complexes underlie numerous cellular processes, such as gene function expression and regulation, in which their three-dimensional structures are important to understand their functions and thus to develop therapeutic interventions. Given the high cost and technical difficulties in experimental methods, computational methods such as molecular docking have been actively used to investigate nucleic acid-ligand interactions in which an accurate scoring function is crucial. However, because of the limited number of experimental nucleic acid-ligand binding data and structures, the scoring function development for nucleic acid-ligand interactions falls far behind that for protein-protein and protein-ligand interactions. Here, based on our statistical mechanics-based iterative approach, we have developed an iterative knowledge-based scoring function for nucleic acid-ligand interactions, named as ITScore-NL, by explicitly including stacking and electrostatic potentials. Our ITScore-NL scoring function was extensively evaluated for its ability in the binding mode and binding affinity predictions on three diverse test sets and compared with state-of-the-art scoring functions. Overall, ITScore-NL obtained significantly better performance than the other 12 scoring functions and predicted near-native poses with rmsd ≤ 1.5 Å for 71.43% of the cases when the top three binding modes were considered and a good correlation of R = 0.64 in binding affinity prediction on the large test set of 77 nucleic acid-ligand complexes. These results suggested the accuracy of ITScore-NL and the necessity of explicitly including stacking and electrostatic potentials.
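Illustrative sketch (not from the paper): a top-N success rate of the kind quoted in this abstract counts a case as a success when at least one of the N best-scored binding modes lies within an RMSD cutoff of the native pose. The function name, the input layout (RMSD values already sorted by score rank), and the toy numbers are assumptions for the example.

```python
def success_rate(ranked_rmsds, top_n=3, cutoff=1.5):
    """Fraction of cases with a near-native pose among the top_n modes.

    ranked_rmsds holds one list per test case: the RMSD (in angstroms)
    of each predicted binding mode, ordered from best to worst score.
    """
    if not ranked_rmsds:
        raise ValueError("at least one test case is required")
    hits = sum(
        1 for rmsds in ranked_rmsds
        if any(value <= cutoff for value in rmsds[:top_n])
    )
    return hits / len(ranked_rmsds)


# Toy cases: the second case has a near-native pose only at rank 4,
# so it does not count when the top three modes are considered.
cases = [
    [2.0, 1.2, 5.0],
    [3.0, 4.0, 2.5, 1.0],
    [0.8],
]
top3 = success_rate(cases, top_n=3, cutoff=1.5)
top4 = success_rate(cases, top_n=4, cutoff=1.5)
```

Reporting the rate for several values of `top_n` separates sampling quality (is a near-native pose generated at all?) from scoring quality (is it ranked near the top?).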
Affiliation(s)
- Yuyu Feng
  - School of Physics, Huazhong University of Science and Technology, Wuhan, Hubei 430074, P. R. China
- Sheng-You Huang
  - School of Physics, Huazhong University of Science and Technology, Wuhan, Hubei 430074, P. R. China
10
Zheng J, Hong X, Xie J, Tong X, Liu S. P3DOCK: a protein-RNA docking webserver based on template-based and template-free docking. Bioinformatics 2020; 36:96-103. [PMID: 31173056 DOI: 10.1093/bioinformatics/btz478] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.8] [Received: 01/02/2019] [Revised: 05/24/2019] [Accepted: 06/04/2019] [Indexed: 01/02/2023]
Abstract
MOTIVATION The main function of protein-RNA interaction is to regulate the expression of genes. Therefore, studying protein-RNA interactions is of great significance. The information of three-dimensional (3D) structures reveals that atomic interactions are particularly important. The calculation method for modeling a 3D structure of a complex mainly includes two strategies: free docking and template-based docking. These two methods are complementary in protein-protein docking. Therefore, integrating these two methods may improve the prediction accuracy. RESULTS In this article, we compare the difference between the free docking and the template-based algorithm. Then we show the complementarity of these two methods. Based on the analysis of the calculation results, the transition point is confirmed and used to integrate two docking algorithms to develop P3DOCK. P3DOCK holds the advantages of both algorithms. The results of the three docking benchmarks show that P3DOCK is better than those two non-hybrid docking algorithms. The success rate of P3DOCK is also higher (3-20%) than state-of-the-art hybrid and non-hybrid methods. Finally, the hierarchical clustering algorithm is utilized to cluster the P3DOCK's decoys. The clustering algorithm improves the success rate of P3DOCK. For ease of use, we provide a P3DOCK webserver, which can be accessed at www.rnabinding.com/P3DOCK/P3DOCK.html. An integrated protein-RNA docking benchmark can be downloaded from http://rnabinding.com/P3DOCK/benchmark.html. AVAILABILITY AND IMPLEMENTATION www.rnabinding.com/P3DOCK/P3DOCK.html. SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Jinfang Zheng
  - School of Physics, Huazhong University of Science and Technology, Wuhan, Hubei 430074, China
- Xu Hong
  - School of Physics, Huazhong University of Science and Technology, Wuhan, Hubei 430074, China
- Juan Xie
  - School of Physics, Huazhong University of Science and Technology, Wuhan, Hubei 430074, China
- Xiaoxue Tong
  - School of Physics, Huazhong University of Science and Technology, Wuhan, Hubei 430074, China
- Shiyong Liu
  - School of Physics, Huazhong University of Science and Technology, Wuhan, Hubei 430074, China
11
He J, Tao H, Huang SY. Protein-ensemble-RNA docking by efficient consideration of protein flexibility through homology models. Bioinformatics 2020; 35:4994-5002. [PMID: 31086984 DOI: 10.1093/bioinformatics/btz388] [Citation(s) in RCA: 15] [Impact Index Per Article: 3.0] [Received: 01/01/2019] [Revised: 04/28/2019] [Accepted: 05/03/2019] [Indexed: 12/18/2022]
Abstract
MOTIVATION Given the importance of protein-ribonucleic acid (RNA) interactions in many biological processes, a variety of docking algorithms have been developed to predict the complex structure from individual protein and RNA partners in the past decade. However, due to the impact of molecular flexibility, the performance of current methods has hit a bottleneck in realistic unbound docking. Pushing the limit, we have proposed a protein-ensemble-RNA docking strategy to explicitly consider the protein flexibility in protein-RNA docking through an ensemble of multiple protein structures, which is referred to as MPRDock. Instead of taking conformations from MD simulations or experimental structures, we obtained the multiple structures of a protein by building models from its homologous templates in the Protein Data Bank (PDB). RESULTS Our approach can not only avoid the reliability issue of structures from MD simulations but also circumvent the limited number of experimental structures for a target protein in the PDB. Tested on 68 unbound-bound and 18 unbound-unbound protein-RNA complexes, our MPRDock/DITScorePR considerably improved the docking performance and achieved a significantly higher success rate than single-protein rigid docking whether pseudo-unbound templates are included or not. Similar improvements were also observed when combining our ensemble docking strategy with other scoring functions. The present homology model-based ensemble docking approach will have a general application in molecular docking for other interactions. AVAILABILITY AND IMPLEMENTATION http://huanglab.phys.hust.edu.cn/mprdock/. SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Jiahua He
  - Institute of Biophysics, School of Physics, Huazhong University of Science and Technology, Wuhan, Hubei, China
- Huanyu Tao
  - Institute of Biophysics, School of Physics, Huazhong University of Science and Technology, Wuhan, Hubei, China
- Sheng-You Huang
  - Institute of Biophysics, School of Physics, Huazhong University of Science and Technology, Wuhan, Hubei, China
12
Qiu L, Zou X. Scoring Functions for Protein-RNA Complex Structure Prediction: Advances, Applications, and Future Directions. Communications in Information and Systems 2020; 20:1-22. [PMID: 33867869 DOI: 10.4310/cis.2020.v20.n1.a1] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.6] [Indexed: 11/22/2022]
Abstract
Protein-RNA interaction is among the most essential biological events in living cells, being involved in protein synthesis, RNA processing and transport, DNA transcription, regulation of gene expression, and many other critical biomolecular activities. A thorough understanding of this interaction is of paramount importance in the fundamental study of a variety of vital cellular processes and in therapeutic applications for a broad range of diseases. Experimental high-resolution 3D structure determination is the primary source of knowledge for protein-RNA complexes. However, due to technical limitations, the existing techniques for experimental structure determination cannot match the demand from fast-growing interest in academia and industry. This problem necessitates alternative high-throughput computational methods for protein-RNA complex structure prediction. Similar to the in silico methods used for protein-protein and protein-DNA interactions, a reliable prediction of protein-RNA complex structure requires a scoring function with commensurate discriminatory power. Derived from determined structures and intended to predict to-be-determined structures, the scoring function is not only a predictive tool but also a gauge of our knowledge of protein-RNA interaction. In this review, we present an overview of the status of existing scoring functions and the scientific principles behind their construction, as well as their strengths and limitations. Finally, we discuss future directions of scoring function development for protein-RNA structure prediction.
Affiliation(s)
- Liming Qiu
  - Dalton Cardiovascular Research Center, University of Missouri, Columbia, Missouri 65211
- Xiaoqin Zou
  - Dalton Cardiovascular Research Center, University of Missouri, Columbia, Missouri 65211; Department of Physics & Astronomy, University of Missouri, Columbia, Missouri 65211; Department of Biochemistry, University of Missouri, Columbia, Missouri 65211; Informatics Institute, University of Missouri, Columbia, Missouri 65211
13
Glashagen G, de Vries S, Uciechowska-Kaczmarzyk U, Samsonov SA, Murail S, Tuffery P, Zacharias M. Coarse-grained and atomic resolution biomolecular docking with the ATTRACT approach. Proteins 2019; 88:1018-1028. [PMID: 31785163 DOI: 10.1002/prot.25860] [Citation(s) in RCA: 12] [Impact Index Per Article: 2.0] [Received: 09/15/2019] [Revised: 11/20/2019] [Accepted: 11/27/2019] [Indexed: 01/17/2023]
Abstract
The ATTRACT protein-protein docking program has been employed to predict protein-protein complex structures in CAPRI rounds 38-45. For 11 out of 16 targets (~70%), solutions of acceptable or better quality were submitted. These include several cases of peptide-protein docking and the successful prediction of the geometry of carbohydrate-protein interactions. The option of combining rigid body minimization and simultaneous optimization in collective degrees of freedom based on elastic network modes was employed and systematically evaluated. Application to a large benchmark set indicates a modest improvement in docking performance compared to rigid docking. Possible further improvements of the docking approach, in particular at the scoring and flexible refinement steps, are discussed.
Affiliation(s)
- Glenn Glashagen
  - Physik-Department T38, Technische Universität München, Garching, Germany
- Sjoerd de Vries
  - Université de Paris, CNRS UMR 8251, INSERM ERL U1133, Paris, France; Ressource Parisienne en Bioinformatique Structurale (RPBS), Paris, France
- Samuel Murail
  - Université de Paris, CNRS UMR 8251, INSERM ERL U1133, Paris, France
- Pierre Tuffery
  - Université de Paris, CNRS UMR 8251, INSERM ERL U1133, Paris, France; Ressource Parisienne en Bioinformatique Structurale (RPBS), Paris, France
- Martin Zacharias
  - Physik-Department T38, Technische Universität München, Garching, Germany
14
Nithin C, Mukherjee S, Bahadur RP. A structure-based model for the prediction of protein-RNA binding affinity. RNA (New York, N.Y.) 2019; 25:1628-1645. [PMID: 31395671 PMCID: PMC6859855 DOI: 10.1261/rna.071779.119] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.2] [Received: 05/04/2019] [Accepted: 08/05/2019] [Indexed: 05/28/2023]
Abstract
Protein-RNA recognition is highly affinity-driven and regulates a wide array of cellular functions. In this study, we have curated a binding affinity data set of 40 protein-RNA complexes, for which at least one unbound partner is available in the docking benchmark. The data set covers a wide affinity range of eight orders of magnitude as well as four different structural classes. On average, we find that the complexes with single-stranded RNA have the highest affinity, whereas the complexes with duplex RNA have the lowest. Nevertheless, free energy gain upon binding is the highest for the complexes with ribosomal proteins and the lowest for the complexes with tRNA, with an average of -5.7 cal/mol/Å² in the entire data set. We train regression models to predict the binding affinity from the structural and physicochemical parameters of protein-RNA interfaces. The best fit model with the lowest maximum error is provided with three interface parameters: relative hydrophobicity, conformational change upon binding and relative hydration pattern. This model has been used for predicting the binding affinity on a test data set, generated using mutated structures of yeast aspartyl-tRNA synthetase, for which experimentally determined ΔG values of 40 mutations are available. The predicted ΔG_empirical values highly correlate with the experimental observations. The data set provided in this study should be useful for further development of binding affinity prediction methods. Moreover, the model developed in this study enhances our understanding of the structural basis of protein-RNA binding affinity and provides a platform to engineer protein-RNA interfaces with desired affinity.
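To give a sense of scale for an affinity range of eight orders of magnitude, the sketch below converts dissociation constants to binding free energies. It assumes the standard relation ΔG = RT ln(Kd / 1 M) and a temperature of 298.15 K; neither is stated in the abstract, and the function names are illustrative only.

```python
import math

# Gas constant in kcal/(mol*K); the 1 M standard state and the default
# temperature below are assumptions, not values taken from the abstract.
R_KCAL = 1.987204e-3


def delta_g_from_kd(kd_molar: float, temperature_k: float = 298.15) -> float:
    """Binding free energy in kcal/mol from a dissociation constant in mol/L."""
    if kd_molar <= 0:
        raise ValueError("Kd must be positive")
    return R_KCAL * temperature_k * math.log(kd_molar)


def delta_g_span(orders_of_magnitude: float, temperature_k: float = 298.15) -> float:
    """Free energy difference in kcal/mol spanned by a given range of Kd values."""
    return R_KCAL * temperature_k * math.log(10.0) * orders_of_magnitude


if __name__ == "__main__":
    print(f"Kd = 1 nM: {delta_g_from_kd(1e-9):.2f} kcal/mol")
    print(f"Eight orders of magnitude: {delta_g_span(8):.2f} kcal/mol")
```

Under these assumptions, each order of magnitude in Kd corresponds to roughly 1.4 kcal/mol, so the quoted range spans about 11 kcal/mol.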
Affiliation(s)
- Chandran Nithin
- Computational Structural Biology Lab, Department of Biotechnology, Indian Institute of Technology Kharagpur, Kharagpur 721302, India
- Sunandan Mukherjee
- Computational Structural Biology Lab, Department of Biotechnology, Indian Institute of Technology Kharagpur, Kharagpur 721302, India
- Ranjit Prasad Bahadur
- Computational Structural Biology Lab, Department of Biotechnology, Indian Institute of Technology Kharagpur, Kharagpur 721302, India
15
Deng L, Yang W, Liu H. PredPRBA: Prediction of Protein-RNA Binding Affinity Using Gradient Boosted Regression Trees. Front Genet 2019; 10:637. [PMID: 31428122 PMCID: PMC6688581 DOI: 10.3389/fgene.2019.00637] [Citation(s) in RCA: 19] [Impact Index Per Article: 3.2] [Received: 04/03/2019] [Accepted: 06/18/2019] [Indexed: 01/24/2023] Open
Abstract
Protein-RNA interactions play essential roles in many biological processes. Quantifying the binding affinity of protein-RNA complexes helps in understanding protein-RNA recognition mechanisms and in identifying strong binding partners. Because experimentally measured protein-RNA binding affinity data are still limited, there is a pressing demand for accurate and reliable computational approaches. In this paper, we propose a computational approach, PredPRBA, which can effectively predict protein-RNA binding affinity using gradient boosted regression trees. We build a dataset of protein-RNA binding affinity that includes 103 protein-RNA complex structures manually collected from the related literature. Then, we generate 37 kinds of sequence and structural features and explore the relationship between the features and protein-RNA binding affinity. We find that the binding affinity mainly depends on the structure of RNA molecules. According to the type of RNA bound to the protein in each complex, we split the 103 protein-RNA complexes into six categories. For each category, we build a gradient boosted regression tree (GBRT) model based on the generated features. We perform a comprehensive evaluation of the proposed method on the binding affinity dataset using leave-one-out cross-validation. We show that PredPRBA achieves correlations ranging from 0.723 to 0.897 among the six categories, which is significantly better than other typical regression methods and the pioneer protein-RNA binding affinity predictor SPOT-Seq-RNA. In addition, a user-friendly web server has been developed to predict the binding affinity of protein-RNA complexes. The PredPRBA webserver is freely available at http://PredPRBA.denglab.org/.
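The evaluation protocol named in this abstract, leave-one-out cross-validation scored by correlation, can be sketched in a few lines. This is not the PredPRBA implementation: a one-feature least-squares line stands in for the gradient boosted regression tree model, and all function names and the toy data are hypothetical.

```python
from statistics import mean


def fit_line(xs, ys):
    """Least-squares slope and intercept; a stand-in for the GBRT model."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    if sxx == 0:
        return 0.0, my  # degenerate case: predict the mean
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    return slope, my - slope * mx


def leave_one_out_predictions(xs, ys):
    """Predict each sample from a model trained on all the other samples."""
    predictions = []
    for i in range(len(xs)):
        train_x = xs[:i] + xs[i + 1:]
        train_y = ys[:i] + ys[i + 1:]
        slope, intercept = fit_line(train_x, train_y)
        predictions.append(slope * xs[i] + intercept)
    return predictions


def pearson(a, b):
    """Pearson correlation coefficient between two equal-length sequences."""
    ma, mb = mean(a), mean(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    var_a = sum((x - ma) ** 2 for x in a)
    var_b = sum((y - mb) ** 2 for y in b)
    return cov / (var_a * var_b) ** 0.5


if __name__ == "__main__":
    # Toy feature values and affinities, invented for illustration.
    feature = [0.2, 0.5, 0.9, 1.4, 1.8, 2.3]
    affinity = [-7.1, -7.9, -8.4, -9.6, -10.1, -11.0]
    held_out = leave_one_out_predictions(feature, affinity)
    print(f"LOO Pearson r = {pearson(held_out, affinity):.3f}")
```

The point of the protocol is that every reported prediction comes from a model that never saw that complex, which matters for data sets of around a hundred structures.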
Affiliation(s)
- Lei Deng
- School of Computer Science and Engineering, Central South University, Changsha, China; School of Software, Xinjiang University, Urumqi, China
- Wenyi Yang
- School of Computer Science and Engineering, Central South University, Changsha, China
- Hui Liu
- Lab of Information Management, Changzhou University, Changzhou, China
16
Yan Y, Wen Z, Zhang D, Huang SY. Determination of an effective scoring function for RNA-RNA interactions with a physics-based double-iterative method. Nucleic Acids Res 2019; 46:e56. [PMID: 29506237 PMCID: PMC5961370 DOI: 10.1093/nar/gky113] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.0] [Received: 10/22/2017] [Accepted: 02/08/2018] [Indexed: 11/15/2022] Open
Abstract
RNA–RNA interactions play fundamental roles in gene and cell regulation. Therefore, accurate prediction of RNA–RNA interactions is critical to determine their complex structures and understand the molecular mechanism of the interactions. Here, we have developed a physics-based double-iterative strategy to determine the effective potentials for RNA–RNA interactions based on a training set of 97 diverse RNA–RNA complexes. The double-iterative strategy circumvented the reference state problem in knowledge-based scoring functions by updating the potentials through iteration and also overcame the decoy-dependent limitation in previous iterative methods by constructing the decoys iteratively. The derived scoring function, which is referred to as DITScoreRR, was evaluated on an RNA–RNA docking benchmark of 60 test cases and compared with three other scoring functions. It was shown that for bound docking, our scoring function DITScoreRR obtained the excellent success rates of 90% and 98.3% in binding mode predictions when the top 1 and 10 predictions were considered, compared to 63.3% and 71.7% for van der Waals interactions, 45.0% and 65.0% for ITScorePP, and 11.7% and 26.7% for ZDOCK 2.1, respectively. For unbound docking, DITScoreRR achieved the good success rates of 53.3% and 71.7% in binding mode predictions when the top 1 and 10 predictions were considered, compared to 13.3% and 28.3% for van der Waals interactions, 11.7% and 26.7% for our ITScorePP, and 3.3% and 6.7% for ZDOCK 2.1, respectively. DITScoreRR also performed significantly better in ranking decoys and obtained significantly higher score-RMSD correlations than the other three scoring functions. DITScoreRR will be of great value for the prediction and design of RNA structures and RNA–RNA complexes.
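The top-1 and top-10 success rates quoted above follow a common docking convention: a test case counts as a success if at least one of its N best-scored predictions is close enough to the native structure. A minimal sketch, assuming lower scores rank first and using a hypothetical RMSD cutoff (the abstract does not state the criterion that was used):

```python
def top_n_success_rate(cases, n, rmsd_cutoff):
    """Fraction of test cases with a near-native pose among the n best-scored.

    cases: one list per test case, each holding (score, rmsd) tuples.
    Lower score is assumed to mean a better-ranked prediction.
    """
    if not cases:
        raise ValueError("at least one test case is required")
    hits = 0
    for decoys in cases:
        ranked = sorted(decoys, key=lambda decoy: decoy[0])
        if any(rmsd <= rmsd_cutoff for _, rmsd in ranked[:n]):
            hits += 1
    return hits / len(cases)


if __name__ == "__main__":
    # Three toy test cases with invented scores and RMSD values.
    toy_cases = [
        [(-10.0, 2.0), (-5.0, 12.0)],
        [(-8.0, 15.0), (-7.0, 3.0)],
        [(-9.0, 20.0), (-3.0, 18.0)],
    ]
    for n in (1, 2):
        rate = top_n_success_rate(toy_cases, n, rmsd_cutoff=5.0)
        print(f"top-{n} success rate: {rate:.2f}")
```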
Affiliation(s)
- Yumeng Yan
- School of Physics, Huazhong University of Science and Technology, Wuhan, Hubei 430074, P.R. China
- Zeyu Wen
- School of Physics, Huazhong University of Science and Technology, Wuhan, Hubei 430074, P.R. China
- Di Zhang
- School of Physics, Huazhong University of Science and Technology, Wuhan, Hubei 430074, P.R. China
- Sheng-You Huang
- School of Physics, Huazhong University of Science and Technology, Wuhan, Hubei 430074, P.R. China
17
Pal A, Levy Y. Structure, stability and specificity of the binding of ssDNA and ssRNA with proteins. PLoS Comput Biol 2019; 15:e1006768. [PMID: 30933978 PMCID: PMC6467422 DOI: 10.1371/journal.pcbi.1006768] [Citation(s) in RCA: 37] [Impact Index Per Article: 6.2] [Received: 09/13/2018] [Revised: 04/16/2019] [Accepted: 01/01/2019] [Indexed: 02/06/2023] Open
Abstract
Recognition of single-stranded DNA (ssDNA) or single-stranded RNA (ssRNA) is important for many fundamental cellular functions. A variety of single-stranded DNA-binding proteins (ssDBPs) and single-stranded RNA-binding proteins (ssRBPs) have evolved that bind ssDNA and ssRNA, respectively, with varying degree of affinities and specificities to form complexes. Structural studies of these complexes provide key insights into their recognition mechanism. However, computational modeling of the specific recognition process and to predict the structure of the complex is challenging, primarily due to the heterogeneity of their binding energy landscape and the greater flexibility of ssDNA or ssRNA compared with double-stranded nucleic acids. Consequently, considerably fewer computational studies have explored interactions between proteins and single-stranded nucleic acids compared with protein interactions with double-stranded nucleic acids. Here, we report a newly developed energy-based coarse-grained model to predict the structure of ssDNA–ssDBP and ssRNA–ssRBP complexes and to assess their sequence-specific interactions and stabilities. We tuned two factors that can modulate specific recognition: base–aromatic stacking strength and the flexibility of the single-stranded nucleic acid. The model was successfully applied to predict the binding conformations of 12 distinct ssDBP and ssRBP structures with their cognate ssDNA and ssRNA partners having various sequences. Estimated binding energies agreed well with the corresponding experimental binding affinities. Bound conformations from the simulation showed a funnel-shaped binding energy distribution where the native-like conformations corresponded to the energy minima. The various ssDNA–protein and ssRNA–protein complexes differed in the balance of electrostatic and aromatic energies. 
The lower affinity of the ssRNA–ssRBP complexes compared with the ssDNA–ssDBP complexes stems from lower flexibility of ssRNA compared to ssDNA, which results in higher rate constants for the dissociation of the complex (koff) for complexes involving the former.

Quantifying bimolecular self-assembly is pivotal to understanding cellular function. In recent years, great progress has been made in understanding the structure and biophysics of protein-protein interactions. In particular, various computational tools are available for predicting these structures and for estimating their stability and the driving forces of their formation. The understanding of the interactions between proteins and nucleic acids, however, is still limited, presumably due to the involvement of non-specific interactions as well as the high conformational plasticity that may demand an induced-fit mechanism. In particular, the interactions between proteins and single-stranded nucleic acids (i.e., single-stranded DNA and RNA) are very challenging to model due to their high flexibility. Furthermore, the interface between proteins and single-stranded nucleic acids is often chemically more heterogeneous than the interface between proteins and double-stranded DNA. In this study, we developed a coarse-grained computational model to predict the structure of complexes between proteins and single-stranded nucleic acids. The model was applied to estimate binding affinities, and the estimated binding energies agreed well with the corresponding experimental binding affinities. The kinetics of association as well as the specificity of the complexes between proteins and ssDNA are different from those with ssRNA, mostly due to differences in their conformational flexibility.
Affiliation(s)
- Arumay Pal
- Department of Structural Biology, Weizmann Institute of Science, Rehovot, Israel
- Yaakov Levy
- Department of Structural Biology, Weizmann Institute of Science, Rehovot, Israel
18
Potter TD, Tasche J, Wilson MR. Assessing the transferability of common top-down and bottom-up coarse-grained molecular models for molecular mixtures. Phys Chem Chem Phys 2019; 21:1912-1927. [DOI: 10.1039/c8cp05889j] [Citation(s) in RCA: 33] [Impact Index Per Article: 5.5] [Indexed: 11/21/2022]
Abstract
Assessing the performance of top-down and bottom-up coarse-graining approaches.
Affiliation(s)
- Jos Tasche
- Department of Chemistry, Durham University, Lower Mountjoy, Durham, UK
- Mark R. Wilson
- Department of Chemistry, Durham University, Lower Mountjoy, Durham, UK
19
Corona RI, Sudarshan S, Aluru S, Guo JT. An SVM-based method for assessment of transcription factor-DNA complex models. BMC Bioinformatics 2018; 19:506. [PMID: 30577740 PMCID: PMC6302363 DOI: 10.1186/s12859-018-2538-y] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.9] [Indexed: 01/06/2023] Open
Abstract
Background: Atomic details of protein-DNA complexes can provide insightful information for better understanding of the function and binding specificity of DNA binding proteins. In addition to experimental methods for solving protein-DNA complex structures, protein-DNA docking can be used to predict native or near-native complex models. A docking program typically generates a large number of complex conformations and predicts the complex model(s) based on interaction energies between protein and DNA. However, the prediction accuracy is hampered by current approaches to model assessment, especially when docking simulations fail to produce any near-native models.
Results: We present here a Support Vector Machine (SVM)-based approach for quality assessment of the predicted transcription factor (TF)-DNA complex models. Besides a knowledge-based protein-DNA interaction potential DDNA3, we applied several structural features that have been shown to play important roles in binding specificity between transcription factors and DNA molecules to quality assessment of complex models. To address the issue of unbalanced positive and negative cases in the training dataset, we applied hard-negative mining, an iterative training process that selects an initial training dataset by combining all of the positive cases and a random sample from the negative cases. Results show that the SVM model greatly improves prediction accuracy (84.2%) over two knowledge-based protein-DNA interaction potentials, orientation potential (60.8%) and DDNA3 (68.4%). The improvement is achieved through reducing the number of false positive predictions, especially for the hard docking cases, in which a docking algorithm fails to produce any near-native complex models.
Conclusions: A learning-based SVM scoring model with structural features for specific protein-DNA binding and an atomic-level protein-DNA interaction potential DDNA3 significantly improves prediction accuracy of complex models by successfully identifying cases without near-native structural models.
Affiliation(s)
- Rosario I Corona
- Department of Bioinformatics and Genomics, University of North Carolina at Charlotte, 9201 University City Blvd, Charlotte, NC, 28223, USA
- Sanjana Sudarshan
- Department of Bioinformatics and Genomics, University of North Carolina at Charlotte, 9201 University City Blvd, Charlotte, NC, 28223, USA
- Srinivas Aluru
- School of Computational Science and Engineering, Georgia Institute of Technology, 266 Ferst Drive, Atlanta, GA, 30332, USA
- Jun-Tao Guo
- Department of Bioinformatics and Genomics, University of North Carolina at Charlotte, 9201 University City Blvd, Charlotte, NC, 28223, USA
20
Kappel K, Das R. Sampling Native-like Structures of RNA-Protein Complexes through Rosetta Folding and Docking. Structure 2018; 27:140-151.e5. [PMID: 30416038 DOI: 10.1016/j.str.2018.10.001] [Citation(s) in RCA: 26] [Impact Index Per Article: 3.7] [Received: 06/07/2018] [Revised: 08/27/2018] [Accepted: 10/05/2018] [Indexed: 10/27/2022]
Abstract
RNA-protein complexes underlie numerous cellular processes including translation, splicing, and posttranscriptional regulation of gene expression. The structures of these complexes are crucial to their functions but often elude high-resolution structure determination. Computational methods are needed that can integrate low-resolution data for RNA-protein complexes while modeling de novo the large conformational changes of RNA components upon complex formation. To address this challenge, we describe RNP-denovo, a Rosetta method to simultaneously fold-and-dock RNA to a protein surface. On a benchmark set of diverse RNA-protein complexes not solvable with prior strategies, RNP-denovo consistently sampled native-like structures with better than nucleotide resolution. We revisited three past blind modeling challenges involving the spliceosome, telomerase, and a methyltransferase-ribosomal RNA complex in which previous methods gave poor results. When coupled with the same sparse FRET, crosslinking, and functional data used previously, RNP-denovo gave models with significantly improved accuracy. These results open a route to modeling global folds of RNA-protein complexes from low-resolution data.
Affiliation(s)
- Kalli Kappel
- Biophysics Program, Stanford University, Stanford, CA 94305, USA
- Rhiju Das
- Biophysics Program, Stanford University, Stanford, CA 94305, USA; Department of Biochemistry, Stanford University School of Medicine, Stanford, CA 94305, USA; Department of Physics, Stanford University, Stanford, CA 94305, USA
21
Chen F, Sun H, Wang J, Zhu F, Liu H, Wang Z, Lei T, Li Y, Hou T. Assessing the performance of MM/PBSA and MM/GBSA methods. 8. Predicting binding free energies and poses of protein-RNA complexes. RNA (New York, N.Y.) 2018; 24:1183-1194. [PMID: 29930024 PMCID: PMC6097651 DOI: 10.1261/rna.065896.118] [Citation(s) in RCA: 84] [Impact Index Per Article: 12.0] [Received: 01/29/2018] [Accepted: 06/13/2018] [Indexed: 05/10/2023]
Abstract
Molecular docking provides a computationally efficient way to predict the atomic structural details of protein-RNA interactions (PRI), but accurate prediction of the three-dimensional structures and binding affinities for PRI is still notoriously difficult, partly due to the unreliability of the existing scoring functions for PRI. MM/PBSA and MM/GBSA are more theoretically rigorous than most scoring functions for protein-RNA docking, but their prediction performance for protein-RNA systems remains unclear. Here, we systematically evaluated the capability of MM/PBSA and MM/GBSA to predict the binding affinities and recognize the near-native binding structures for protein-RNA systems with different solvent models and interior dielectric constants (ε_in). For predicting the binding affinities, the predictions given by MM/GBSA based on the minimized structures in explicit solvent and the GB^GBn1 model with ε_in = 2 yielded the highest correlation with the experimental data. Moreover, the MM/GBSA calculations based on the minimized structures in implicit solvent and the GB^GBn1 model distinguished the near-native binding structures within the top 10 decoys for 117 out of the 148 protein-RNA systems (79.1%). This performance is better than that of all docking scoring functions studied here. Therefore, MM/GBSA rescoring is an efficient way to improve the prediction capability of scoring functions for protein-RNA systems.
Affiliation(s)
- Fu Chen
- College of Pharmaceutical Sciences, Zhejiang University, Hangzhou, Zhejiang 310058, China
- State Key Lab of CAD&CG, Zhejiang University, Hangzhou, Zhejiang 310058, China
- College of Life and Environmental Sciences, Shanghai Normal University, Shanghai 200234, China
- Huiyong Sun
- College of Pharmaceutical Sciences, Zhejiang University, Hangzhou, Zhejiang 310058, China
- Junmei Wang
- Department of Pharmaceutical Sciences, University of Pittsburgh, Pittsburgh, Pennsylvania 15261, USA
- Feng Zhu
- College of Pharmaceutical Sciences, Zhejiang University, Hangzhou, Zhejiang 310058, China
- Hui Liu
- College of Pharmaceutical Sciences, Zhejiang University, Hangzhou, Zhejiang 310058, China
- Zhe Wang
- College of Pharmaceutical Sciences, Zhejiang University, Hangzhou, Zhejiang 310058, China
- State Key Lab of CAD&CG, Zhejiang University, Hangzhou, Zhejiang 310058, China
- Tailong Lei
- College of Pharmaceutical Sciences, Zhejiang University, Hangzhou, Zhejiang 310058, China
- Youyong Li
- Institute of Functional Nano and Soft Materials (FUNSOM), Soochow University, Suzhou, Jiangsu 215123, China
- Tingjun Hou
- College of Pharmaceutical Sciences, Zhejiang University, Hangzhou, Zhejiang 310058, China
- State Key Lab of CAD&CG, Zhejiang University, Hangzhou, Zhejiang 310058, China
22
Nithin C, Ghosh P, Bujnicki JM. Bioinformatics Tools and Benchmarks for Computational Docking and 3D Structure Prediction of RNA-Protein Complexes. Genes (Basel) 2018; 9:genes9090432. [PMID: 30149645 PMCID: PMC6162694 DOI: 10.3390/genes9090432] [Citation(s) in RCA: 26] [Impact Index Per Article: 3.7] [Received: 05/14/2018] [Revised: 07/26/2018] [Accepted: 08/21/2018] [Indexed: 12/29/2022] Open
Abstract
RNA-protein (RNP) interactions play essential roles in many biological processes, such as regulation of co-transcriptional and post-transcriptional gene expression, RNA splicing, transport, storage and stabilization, as well as protein synthesis. An increasing number of RNP structures would aid in a better understanding of these processes. However, due to the technical difficulties associated with experimental determination of macromolecular structures by high-resolution methods, studies on RNP recognition and complex formation present significant challenges. As an alternative, computational prediction of RNP interactions can be carried out. Structural models obtained by theoretical predictive methods are, in general, less reliable than models based on experimental measurements, but they can be sufficiently accurate to be used as a basis for formulating functional hypotheses. In this article, we present an overview of computational methods for 3D structure prediction of RNP complexes. We discuss currently available methods for macromolecular docking and for scoring 3D structural models of RNP complexes in particular. Additionally, we review benchmarks that have been developed to assess the accuracy of these methods.
Affiliation(s)
- Chandran Nithin
- Laboratory of Bioinformatics and Protein Engineering, International Institute of Molecular and Cell Biology in Warsaw, ul. Ks. Trojdena 4, PL-02-109 Warsaw, Poland
- Pritha Ghosh
- Laboratory of Bioinformatics and Protein Engineering, International Institute of Molecular and Cell Biology in Warsaw, ul. Ks. Trojdena 4, PL-02-109 Warsaw, Poland
- Janusz M Bujnicki
- Laboratory of Bioinformatics and Protein Engineering, International Institute of Molecular and Cell Biology in Warsaw, ul. Ks. Trojdena 4, PL-02-109 Warsaw, Poland
- Bioinformatics Laboratory, Institute of Molecular Biology and Biotechnology, Faculty of Biology, Adam Mickiewicz University, ul. Umultowska 89, PL-61-614 Poznan, Poland
23
Abstract
Protein-RNA interactions play an important role in many biological processes. Computational methods such as docking have been developed to complement existing biophysical and structural biology techniques. Computational prediction of protein-RNA complex structures includes two steps: generating candidate structures from the individual protein and RNA parts and scoring the generated poses to pick out the correct one. In this work, we considered three recently developed data sets of protein-RNA complexes to evaluate and improve the performance of the FFT-based rigid-body docking algorithm implemented in the ICM package. An electrostatic term describing interactions between negatively charged phosphate groups and positively charged protein residues was added to the energy function used during the docking step to take into account the greater role that electrostatic interactions play in protein-RNA complexes. Next, the docking results were used to optimize a scoring function including van der Waals, electrostatic, and solvation terms. This optimization yielded a much smaller weight for the solvation term indicating that solvation energy may be less important for the scoring of protein-RNA structures. Rescoring of the generated poses with the new scoring function led to much higher success rates, while pose clustering by contact fingerprints produced further improvements, achieving a success rate of 0.66 for the top 100 structures.
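The rescoring step described above amounts to re-ranking poses by a weighted sum of energy terms. The sketch below is not the ICM scoring function: the weights, pose names and energy values are invented, and it only illustrates how lowering the solvation weight can change which pose ranks first.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Pose:
    """One docking pose with per-term energies (hypothetical values)."""
    name: str
    vdw: float    # van der Waals term
    elec: float   # electrostatic term
    solv: float   # solvation term


def weighted_score(pose, w_vdw, w_elec, w_solv):
    """Linear combination of the three terms; lower is better."""
    return w_vdw * pose.vdw + w_elec * pose.elec + w_solv * pose.solv


def rescore(poses, w_vdw, w_elec, w_solv):
    """Return the poses sorted from best to worst under the given weights."""
    return sorted(poses, key=lambda p: weighted_score(p, w_vdw, w_elec, w_solv))


if __name__ == "__main__":
    poses = [
        Pose("pose_a", vdw=-10.0, elec=-2.0, solv=-8.0),
        Pose("pose_b", vdw=-9.0, elec=-6.0, solv=-1.0),
    ]
    equal = rescore(poses, 1.0, 1.0, 1.0)
    low_solv = rescore(poses, 1.0, 1.0, 0.1)
    print("equal weights:", [p.name for p in equal])
    print("small solvation weight:", [p.name for p in low_solv])
```

In practice the weights would be fitted against docking results on a benchmark, as the abstract describes; here they are set by hand.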
Affiliation(s)
- Yelena A Arnautova
- Molsoft L.L.C., 11199 Sorrento Valley Road, S209, San Diego, California 92121, United States
- Ruben Abagyan
- Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California San Diego, La Jolla, California 92093, United States
- Maxim Totrov
- Molsoft L.L.C., 11199 Sorrento Valley Road, S209, San Diego, California 92121, United States
24
Orr AA, Gonzalez-Rivera JC, Wilson M, Bhikha PR, Wang D, Contreras LM, Tamamis P. A high-throughput and rapid computational method for screening of RNA post-transcriptional modifications that can be recognized by target proteins. Methods 2018; 143:34-47. [DOI: 10.1016/j.ymeth.2018.01.015] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.9] [Received: 11/16/2017] [Revised: 01/14/2018] [Accepted: 01/26/2018] [Indexed: 12/25/2022] Open
25
de Vries SJ, Zacharias M. Fast and accurate grid representations for atom-based docking with partner flexibility. J Comput Chem 2017; 38:1538-1546. [DOI: 10.1002/jcc.24795] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 09/22/2016] [Revised: 01/18/2017] [Accepted: 01/19/2017] [Indexed: 12/12/2022]
Affiliation(s)
- Sjoerd J. de Vries
- MTi, UMR-S 973, Physics Department T38; Technische Universität München; James-Franck-Strasse 1 85748 Garching Germany
- Martin Zacharias
- MTi, UMR-S 973, Physics Department T38; Technische Universität München; James-Franck-Strasse 1 85748 Garching Germany
26
A pair-conformation-dependent scoring function for evaluating 3D RNA-protein complex structures. PLoS One 2017; 12:e0174662. [PMID: 28358834 PMCID: PMC5373608 DOI: 10.1371/journal.pone.0174662] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.1] [Received: 12/12/2016] [Accepted: 03/13/2017] [Indexed: 01/04/2023] Open
Abstract
Computational prediction of RNA-protein complex 3D structures includes two basic steps: one is sampling possible structures and another is scoring the sampled structures to pick out the correct one. At present, constructing accurate scoring functions is still not well solved and the performances of the scoring functions usually depend on used benchmarks. Here we propose a pair-conformation-dependent scoring function, 3dRPC-Score, for 3D RNA-protein complex structure prediction by considering the nucleotide-residue pairs having the same energy if their conformations are similar, instead of the distance-only dependence of the most existing scoring functions. Benchmarking shows that 3dRPC-Score has a consistent performance in three test sets.
27
Application of the ATTRACT Coarse-Grained Docking and Atomistic Refinement for Predicting Peptide-Protein Interactions. Methods Mol Biol 2017. [PMID: 28236233 DOI: 10.1007/978-1-4939-6798-8_5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0]
Abstract
Peptide-protein interactions are abundant in the cell and form an important part of the interactome. Large-scale modeling of peptide-protein complexes requires a fully blind approach; i.e., simultaneously predicting the peptide-binding site and the peptide conformation to high accuracy. Here, we present one of the first fully blind peptide-protein docking protocols, pepATTRACT. It combines a coarse-grained ensemble docking search of the entire protein surface with two stages of atomistic flexible refinement. pepATTRACT yields high-quality predictions for 70% of the cases when tested on a large benchmark of peptide-protein complexes. This performance in fully blind mode is similar to state-of-the-art local docking approaches that use information on the location of the binding site. Limiting the search to the peptide-binding region, the resulting pepATTRACT-local approach further improves the performance. Docking scripts for pepATTRACT and pepATTRACT-local can be generated via a web interface at www.attract.ph.tum.de/peptide.html. Here, we explain how to set up a docking run with the pepATTRACT web interface and demonstrate its usage by an application on binding of disordered regions from tumor suppressor p53 to a partner protein.
28
Sasse A, de Vries SJ, Schindler CEM, de Beauchêne IC, Zacharias M. Rapid Design of Knowledge-Based Scoring Potentials for Enrichment of Near-Native Geometries in Protein-Protein Docking. PLoS One 2017; 12:e0170625. [PMID: 28118389 PMCID: PMC5261736 DOI: 10.1371/journal.pone.0170625] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.8] [Received: 09/29/2016] [Accepted: 01/07/2017] [Indexed: 01/15/2023] Open
Abstract
Protein-protein docking protocols aim to predict the structures of protein-protein complexes based on the structure of individual partners. Docking protocols usually include several steps of sampling, clustering, refinement and re-scoring. The scoring step is one of the bottlenecks in the performance of many state-of-the-art protocols. The performance of scoring functions depends on the quality of the generated structures and its coupling to the sampling algorithm. A tool kit, GRADSCOPT (GRid Accelerated Directly SCoring OPTimizing), was designed to allow rapid development and optimization of different knowledge-based scoring potentials for specific objectives in protein-protein docking. Different atomistic and coarse-grained potentials can be created by a grid-accelerated directly scoring dependent Monte-Carlo annealing or by a linear regression optimization. We demonstrate that the scoring functions generated by our approach are similar to or even outperform state-of-the-art scoring functions for predicting near-native solutions. Of additional importance, we find that potentials specifically trained to identify the native bound complex perform rather poorly on identifying acceptable or medium quality (near-native) solutions. In contrast, atomistic long-range contact potentials can increase the average fraction of near-native poses by up to a factor of 2.5 in the best scored 1% of decoys (compared to existing scoring), emphasizing the need for specific docking potentials for different steps in the docking protocol.
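The enrichment figure in this abstract compares the fraction of near-native poses among the best-scored 1% of decoys under two scoring schemes. A minimal sketch of that comparison follows; the function name, the toy decoy sets and the convention that lower scores rank first are assumptions made for illustration, not details of GRADSCOPT.

```python
def near_native_fraction_in_top(decoys, top_fraction=0.01):
    """Fraction of near-native poses among the best-scored share of decoys.

    decoys: list of (score, is_near_native) tuples; lower score ranks first.
    """
    if not decoys:
        raise ValueError("decoy list is empty")
    # Always inspect at least one decoy, even for very small sets.
    k = max(1, int(round(len(decoys) * top_fraction)))
    ranked = sorted(decoys, key=lambda decoy: decoy[0])
    return sum(1 for _, near_native in ranked[:k] if near_native) / k


if __name__ == "__main__":
    # 200 toy decoys per scoring scheme, so the top 1% holds two poses.
    baseline = [(float(i), i in (0, 5)) for i in range(200)]
    improved = [(float(i), i in (0, 1)) for i in range(200)]
    f_base = near_native_fraction_in_top(baseline)
    f_new = near_native_fraction_in_top(improved)
    print(f"baseline: {f_base:.2f}, improved: {f_new:.2f}, "
          f"enrichment factor: {f_new / f_base:.1f}")
```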
Affiliation(s)
- Alexander Sasse
- Physik Department T38, Technische Universität München, James-Franck-Straße, Garching, Germany
- Sjoerd J. de Vries
- Physik Department T38, Technische Universität München, James-Franck-Straße, Garching, Germany
- Martin Zacharias
- Physik Department T38, Technische Universität München, James-Franck-Straße, Garching, Germany
- * E-mail:
29
Pérez-Cano L, Romero-Durana M, Fernández-Recio J. Structural and energy determinants in protein-RNA docking. Methods 2016; 118-119:163-170. [PMID: 27816523 DOI: 10.1016/j.ymeth.2016.11.001] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.9] [Received: 07/20/2016] [Revised: 10/14/2016] [Accepted: 11/01/2016] [Indexed: 01/02/2023] Open
Abstract
Deciphering the structural and energetic determinants of protein-RNA interactions harbors the potential to understand key cell processes at molecular level, such as gene expression and regulation. With this purpose, computational methods like docking aim to complement current biophysical and structural biology efforts. However, the few reported docking algorithms for protein-RNA interactions show limited predictive success rates, mainly due to incomplete sampling of the conformational space of both the protein and the RNA molecules, as well as to the difficulties of the scoring function in identifying the correct docking models. Here, we have tested the predictive value of a variety of knowledge-based and energetic scoring functions on a recently published protein-RNA docking benchmark and developed a scoring function able to efficiently discriminate docking decoys. We first performed docking calculations with the bound conformation, which allowed us to analyze the problem in optimal conditions. We found that geometry-based terms and electrostatics were the most important scoring terms, while binding propensities and desolvation were much less relevant for the scoring of protein-RNA models. This is in contrast with what we observed for protein-protein docking. The results also showed an interesting dependence of the predictive rates on the flexibility of the protein molecule, which arises from the observed higher positive charge of flexible interfaces and provides hints for future development of more efficient protein-RNA docking methods.
Affiliation(s)
- Laura Pérez-Cano
- Joint BSC-CRG-IRB Research Program in Computational Biology, Life Sciences Department, Barcelona Supercomputing Center (BSC), Jordi Girona 29, Barcelona 08034, Spain; Center for Neurobehavioral Genetics and Center for Autism Research and Treatment, Semel Institute, David Geffen School of Medicine, University of California, Los Angeles, CA, USA
- Miguel Romero-Durana
- Joint BSC-CRG-IRB Research Program in Computational Biology, Life Sciences Department, Barcelona Supercomputing Center (BSC), Jordi Girona 29, Barcelona 08034, Spain
- Juan Fernández-Recio
- Joint BSC-CRG-IRB Research Program in Computational Biology, Life Sciences Department, Barcelona Supercomputing Center (BSC), Jordi Girona 29, Barcelona 08034, Spain
30
Zheng J, Kundrotas PJ, Vakser IA, Liu S. Template-Based Modeling of Protein-RNA Interactions. PLoS Comput Biol 2016; 12:e1005120. [PMID: 27662342 PMCID: PMC5035060 DOI: 10.1371/journal.pcbi.1005120] [Citation(s) in RCA: 25] [Impact Index Per Article: 2.8] [Received: 01/25/2016] [Accepted: 08/25/2016] [Indexed: 12/29/2022] Open
Abstract
Protein-RNA complexes formed by specific recognition between RNA and RNA-binding proteins play an important role in biological processes. More than a thousand of such proteins in human are curated and many novel RNA-binding proteins are to be discovered. Due to limitations of experimental approaches, computational techniques are needed for characterization of protein-RNA interactions. Although much progress has been made, adequate methodologies reliably providing atomic resolution structural details are still lacking. Although protein-RNA free docking approaches proved to be useful, in general, the template-based approaches provide higher quality of predictions. Templates are key to building a high quality model. Sequence/structure relationships were studied based on a representative set of binary protein-RNA complexes from PDB. Several approaches were tested for pairwise target/template alignment. The analysis revealed a transition point between random and correct binding modes. The results showed that structural alignment is better than sequence alignment in identifying good templates, suitable for generating protein-RNA complexes close to the native structure, and outperforms free docking, successfully predicting complexes where the free docking fails, including cases of significant conformational change upon binding. A template-based protein-RNA interaction modeling protocol PRIME was developed and benchmarked on a representative set of complexes. Structures of protein-RNA complexes are important for characterization of biological processes. The number of experimentally determined protein-RNA complexes is limited. Thus modeling of these complexes is important. Reliable structural predictions of proteins and their complexes are provided by comparative modeling, which takes advantage of similar complexes with experimentally determined structures. 
Thus, in the case of protein-RNA complexes, it is important to determine if similar proteins and RNAs bind in a similar way. We show that, similarly to the earlier published results on protein-protein complexes, such correlation of the protein-RNA binding mode and the monomers similarity indeed exists, and is stronger when the similarity is determined by structure rather than sequence alignment. The data shows clear transition from random to similar binding mode with the increase of the structural similarity of the monomers. On the basis of the results we designed and implemented a predictive tool, which should be useful for the biological community interested in modeling of protein-RNA interactions.
Affiliation(s)
- Jinfang Zheng
- School of Physics and Key Laboratory of Molecular Biophysics of the Ministry of Education, Huazhong University of Science and Technology, Wuhan, Hubei, China
- Petras J. Kundrotas
- Center for Computational Biology and Department of Molecular Biosciences, The University of Kansas, Lawrence, Kansas, United States of America
- Ilya A. Vakser
- Center for Computational Biology and Department of Molecular Biosciences, The University of Kansas, Lawrence, Kansas, United States of America
- * E-mail: (IAV); (SL)
- Shiyong Liu
- School of Physics and Key Laboratory of Molecular Biophysics of the Ministry of Education, Huazhong University of Science and Technology, Wuhan, Hubei, China
- * E-mail: (IAV); (SL)
31
Iwakiri J, Hamada M, Asai K, Kameda T. Improved Accuracy in RNA-Protein Rigid Body Docking by Incorporating Force Field for Molecular Dynamics Simulation into the Scoring Function. J Chem Theory Comput 2016; 12:4688-97. [PMID: 27494732 DOI: 10.1021/acs.jctc.6b00254] [Citation(s) in RCA: 21] [Impact Index Per Article: 2.3] [Indexed: 02/07/2023]
Abstract
RNA-protein interactions play fundamental roles in many biological processes. To understand these interactions, it is necessary to know the three-dimensional structures of RNA-protein complexes. However, determining the tertiary structure of these complexes is often difficult, suggesting that an accurate rigid body docking for RNA-protein complexes is needed. In general, the rigid body docking process is divided into two steps: generating candidate structures from the individual RNA and protein structures and then narrowing down the candidates. In this study, we focus on the former problem to improve the prediction accuracy in RNA-protein docking. Our method is based on the integration of physicochemical information about RNA into ZDOCK, which is known as one of the most successful computer programs for protein-protein docking. Because recent studies showed that current force fields for molecular dynamics simulation of proteins and nucleic acids are quite accurate, we modeled the physicochemical information about RNA by force fields such as AMBER and CHARMM. A comprehensive benchmark of RNA-protein docking, using three recently developed data sets, reveals the remarkable prediction accuracy of the proposed method compared with existing programs for docking: the highest success rate is 34.7% for the predicted structure of the RNA-protein complex with the best score and 79.2% for 3,600 predicted ones. Three full atomistic force fields for RNA (AMBER94, AMBER99, and CHARMM22) produced almost equally accurate results, which showed that current force fields for nucleic acids are quite accurate. In addition, we found that the electrostatic interaction and the representation of shape complementarity between protein and RNA play important roles in the accurate prediction of the native structures of RNA-protein complexes.
Affiliation(s)
- Junichi Iwakiri
- Graduate School of Frontier Sciences, The University of Tokyo, 5-1-5 Kashiwanoha, Kashiwa, Chiba 277-8562, Japan
- Michiaki Hamada
- Department of Electrical Engineering and Bioscience, Faculty of Science and Engineering, Waseda University, 55N-06-10, 3-4-1, Okubo, Shinjuku-ku, Tokyo 169-8555, Japan; Artificial Intelligence Research Center, National Institute of Advanced Industrial Science and Technology (AIST), 2-4-7 Aomi, Koto-ku, Tokyo 135-0064, Japan
- Kiyoshi Asai
- Graduate School of Frontier Sciences, The University of Tokyo, 5-1-5 Kashiwanoha, Kashiwa, Chiba 277-8562, Japan; Artificial Intelligence Research Center, National Institute of Advanced Industrial Science and Technology (AIST), 2-4-7 Aomi, Koto-ku, Tokyo 135-0064, Japan
- Tomoshi Kameda
- Artificial Intelligence Research Center, National Institute of Advanced Industrial Science and Technology (AIST), 2-4-7 Aomi, Koto-ku, Tokyo 135-0064, Japan
32
Szameit K, Berg K, Kruspe S, Valentini E, Magbanua E, Kwiatkowski M, Chauvot de Beauchêne I, Krichel B, Schamoni K, Uetrecht C, Svergun DI, Schlüter H, Zacharias M, Hahn U. Structure and target interaction of a G-quadruplex RNA-aptamer. RNA Biol 2016; 13:973-987. [PMID: 27471797 DOI: 10.1080/15476286.2016.1212151] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.3] [Indexed: 12/16/2022] Open
Abstract
G-quadruplexes have recently moved into focus of research in nucleic acids, thereby evolving in scientific significance from exceptional secondary structure motifs to complex modulators of gene regulation. Aptamers (nucleic acid based ligands with recognition properties for a specific target) that form G-quadruplexes may have particular potential for therapeutic applications as they combine the characteristics of specific targeting and G-quadruplex-mediated stability and regulation. We have investigated the structure and target interaction properties of one such aptamer: AIR-3 and its truncated form AIR-3A. These RNA aptamers are specific for human interleukin-6 receptor (hIL-6R), a key player in inflammatory diseases and cancer, and have recently been exploited for in vitro drug delivery studies. With the aim to resolve the RNA structure, global shape, RNA:protein interaction site and binding stoichiometry, we now investigated AIR-3 and AIR-3A by different methods including RNA structure probing, Small Angle X-ray scattering and microscale thermophoresis. Our findings suggest a broader spectrum of folding species than assumed so far and remarkable tolerance toward different modifications. Mass spectrometry based binding site analysis, supported by molecular modeling and docking studies propose a general G-quadruplex affinity for the target molecule hIL-6R.
Affiliation(s)
- Kristina Szameit
- Institute for Biochemistry and Molecular Biology, Department of Chemistry, University of Hamburg, Hamburg, Germany
- Katharina Berg
- Institute for Biochemistry and Molecular Biology, Department of Chemistry, University of Hamburg, Hamburg, Germany
- Sven Kruspe
- Institute for Biochemistry and Molecular Biology, Department of Chemistry, University of Hamburg, Hamburg, Germany
- Erica Valentini
- European Molecular Biology Laboratory, Hamburg Unit, Hamburg, Germany
- Eileen Magbanua
- Institute for Biochemistry and Molecular Biology, Department of Chemistry, University of Hamburg, Hamburg, Germany
- Marcel Kwiatkowski
- University Medical Center Hamburg-Eppendorf, Department of Clinical Chemistry, Hamburg, Germany
- Boris Krichel
- Heinrich Pette Institute, Leibniz Institute for Experimental Virology, Hamburg, Germany
- Kira Schamoni
- Heinrich Pette Institute, Leibniz Institute for Experimental Virology, Hamburg, Germany
- Charlotte Uetrecht
- Heinrich Pette Institute, Leibniz Institute for Experimental Virology, Hamburg, Germany; European XFEL GmbH, Hamburg, Germany
- Dmitri I Svergun
- European Molecular Biology Laboratory, Hamburg Unit, Hamburg, Germany
- Hartmut Schlüter
- University Medical Center Hamburg-Eppendorf, Department of Clinical Chemistry, Hamburg, Germany
- Martin Zacharias
- Physics Department, Technical University Munich, Garching, Germany
- Ulrich Hahn
- Institute for Biochemistry and Molecular Biology, Department of Chemistry, University of Hamburg, Hamburg, Germany
33
Kmiecik S, Gront D, Kolinski M, Wieteska L, Dawid AE, Kolinski A. Coarse-Grained Protein Models and Their Applications. Chem Rev 2016; 116:7898-936. [DOI: 10.1021/acs.chemrev.6b00163] [Citation(s) in RCA: 555] [Impact Index Per Article: 61.7] [Indexed: 02/07/2023]
Affiliation(s)
- Sebastian Kmiecik
- Faculty of Chemistry, University of Warsaw, Pasteura 1, 02-093 Warsaw, Poland
- Dominik Gront
- Faculty of Chemistry, University of Warsaw, Pasteura 1, 02-093 Warsaw, Poland
- Michal Kolinski
- Bioinformatics Laboratory, Mossakowski Medical Research Center of the Polish Academy of Sciences, Pawinskiego 5, 02-106 Warsaw, Poland
- Lukasz Wieteska
- Faculty of Chemistry, University of Warsaw, Pasteura 1, 02-093 Warsaw, Poland
- Department of Medical Biochemistry, Medical University of Lodz, Mazowiecka 6/8, 92-215 Lodz, Poland
- Andrzej Kolinski
- Faculty of Chemistry, University of Warsaw, Pasteura 1, 02-093 Warsaw, Poland
34
de Beauchene IC, de Vries SJ, Zacharias M. Fragment-based modelling of single stranded RNA bound to RNA recognition motif containing proteins. Nucleic Acids Res 2016; 44:4565-80. [PMID: 27131381 PMCID: PMC4889956 DOI: 10.1093/nar/gkw328] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.9] [Received: 12/07/2015] [Accepted: 04/12/2016] [Indexed: 12/12/2022] Open
Abstract
Protein-RNA complexes are important for many biological processes. However, structural modeling of such complexes is hampered by the high flexibility of RNA. Particularly challenging is the docking of single-stranded RNA (ssRNA). We have developed a fragment-based approach to model the structure of ssRNA bound to a protein, based on only the protein structure, the RNA sequence and conserved contacts. The conformational diversity of each RNA fragment is sampled by an exhaustive library of trinucleotides extracted from all known experimental protein–RNA complexes. The method was applied to ssRNA with up to 12 nucleotides which bind to dimers of the RNA recognition motifs (RRMs), a highly abundant eukaryotic RNA-binding domain. The fragment based docking allows a precise de novo atomic modeling of protein-bound ssRNA chains. On a benchmark of seven experimental ssRNA–RRM complexes, near-native models (with a mean heavy-atom deviation of <3 Å from experiment) were generated for six out of seven bound RNA chains, and even more precise models (deviation < 2 Å) were obtained for five out of seven cases, a significant improvement compared to the state of the art. The method is not restricted to RRMs but was also successfully applied to Pumilio RNA binding proteins.
Affiliation(s)
- Sjoerd J de Vries
- Physics Department T38, Technical University of Munich, James-Franck-Str. 1, 85748 Garching, Germany
- Martin Zacharias
- Physics Department T38, Technical University of Munich, James-Franck-Str. 1, 85748 Garching, Germany
35
Chauvot de Beauchene I, de Vries SJ, Zacharias M. Binding Site Identification and Flexible Docking of Single Stranded RNA to Proteins Using a Fragment-Based Approach. PLoS Comput Biol 2016; 12:e1004697. [PMID: 26815409 PMCID: PMC4729675 DOI: 10.1371/journal.pcbi.1004697] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.9] [Received: 07/01/2015] [Accepted: 12/08/2015] [Indexed: 11/18/2022] Open
Abstract
Protein-RNA docking is hampered by the high flexibility of RNA, and particularly single-stranded RNA (ssRNA). Yet, ssRNA regions typically carry the specificity of protein recognition. The lack of methodology for modeling such regions limits the accuracy of current protein-RNA docking methods. We developed a fragment-based approach to model protein-bound ssRNA, based on the structure of the protein and the sequence of the RNA, without any prior knowledge of the RNA binding site or the RNA structure. The conformational diversity of each fragment is sampled by an exhaustive RNA fragment library that was created from all the existing experimental structures of protein-ssRNA complexes. A systematic and detailed analysis of fragment-based ssRNA docking was performed which constitutes a proof-of-principle for the fragment-based approach. The method was tested on two 8-homo-nucleotide ssRNA-protein complexes and was able to identify the binding site on the protein within 10 Å. Moreover, a structure of each bound ssRNA could be generated in close agreement with the crystal structure with a mean deviation of ~1.5 Å except for a terminal nucleotide. This is the first time a bound ssRNA could be modeled from sequence with high precision.
Affiliation(s)
- Sjoerd J. de Vries
- Physik-Department T38, Technische Universität München, Garching, Germany
- Martin Zacharias
- Physik-Department T38, Technische Universität München, Garching, Germany
36
Dawson WK, Bujnicki JM. Computational modeling of RNA 3D structures and interactions. Curr Opin Struct Biol 2015; 37:22-8. [PMID: 26689764 DOI: 10.1016/j.sbi.2015.11.007] [Citation(s) in RCA: 39] [Impact Index Per Article: 3.9] [Received: 10/01/2015] [Revised: 11/11/2015] [Accepted: 11/12/2015] [Indexed: 11/25/2022]
Abstract
RNA molecules have key functions in cellular processes beyond being carriers of protein-coding information. These functions are often dependent on the ability to form complex three-dimensional (3D) structures. However, experimental determination of RNA 3D structures is difficult, which has prompted the development of computational methods for structure prediction from sequence. Recent progress in 3D structure modeling of RNA and emerging approaches for predicting RNA interactions with ions, ligands and proteins have been stimulated by successes in protein 3D structure modeling.
Affiliation(s)
- Wayne K Dawson
- Laboratory of Bioinformatics and Protein Engineering, International Institute of Molecular and Cell Biology, ul. Ks. Trojdena 4, 02-109 Warsaw, Poland
- Janusz M Bujnicki
- Laboratory of Bioinformatics and Protein Engineering, International Institute of Molecular and Cell Biology, ul. Ks. Trojdena 4, 02-109 Warsaw, Poland; Bioinformatics Laboratory, Institute of Molecular Biology and Biotechnology, Faculty of Biology, Adam Mickiewicz University, ul. Umultowska 89, 61-614 Poznan, Poland
37
de Vries SJ, Schindler CEM, Chauvot de Beauchêne I, Zacharias M. A web interface for easy flexible protein-protein docking with ATTRACT. Biophys J 2015; 108:462-5. [PMID: 25650913 DOI: 10.1016/j.bpj.2014.12.015] [Citation(s) in RCA: 67] [Impact Index Per Article: 6.7] [Received: 10/14/2014] [Revised: 12/06/2014] [Accepted: 12/10/2014] [Indexed: 01/03/2023] Open
Abstract
Protein-protein docking programs can give valuable insights into the structure of protein complexes in the absence of an experimental complex structure. Web interfaces can facilitate the use of docking programs by structural biologists. Here, we present an easy web interface for protein-protein docking with the ATTRACT program. While aimed at nonexpert users, the web interface still covers a considerable range of docking applications. The web interface supports systematic rigid-body protein docking with the ATTRACT coarse-grained force field, as well as various kinds of protein flexibility. The execution of a docking protocol takes up to a few hours on a standard desktop computer.
Affiliation(s)
- Sjoerd J de Vries
- Physics Department, Technische Universität München, Garching, Germany
- Martin Zacharias
- Physics Department, Technische Universität München, Garching, Germany
38
Li W, Wong WJ, Lim CJ, Ju HP, Li M, Yan J, Wang PY. Complex kinetics of DNA condensation revealed through DNA twist tracing. Phys Rev E Stat Nonlin Soft Matter Phys 2015; 92:022707. [PMID: 26382432 DOI: 10.1103/physreve.92.022707] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.5] [Received: 03/01/2015] [Indexed: 06/05/2023]
Abstract
Toroid formation is an important mechanism for DNA condensation in cells. The length change during DNA condensation was investigated in previous single-molecule experiments. However, DNA twist is key to understanding the topological kinetics of DNA condensation. In this study, DNA twist as well as DNA length was traced during the DNA condensation by the freely orbiting magnetic tweezers and the tilted magnetic tweezers combined with Brownian dynamics simulations. The experimental results disclose the complex relationship between DNA extension and backbone rotation. Brownian dynamics simulations show that the toroid formation follows a wiggling pathway which leads to the complex DNA backbone rotation as revealed in our experiments. These findings provide the complete description of multivalent cation-dependent DNA toroid formation under tension.
Affiliation(s)
- Wei Li
- Beijing National Laboratory for Condensed Matter Physics and Key Laboratory of Soft Matter Physics, Institute of Physics, Chinese Academy of Sciences, Beijing 100190, China
- Wei Juan Wong
- Department of Physics, National University of Singapore, Singapore 117542
- Mechanobiology Institute, National University of Singapore, Singapore 117411
- Ci Ji Lim
- Department of Physics, National University of Singapore, Singapore 117542
- Mechanobiology Institute, National University of Singapore, Singapore 117411
- Hai-Peng Ju
- Beijing National Laboratory for Condensed Matter Physics and Key Laboratory of Soft Matter Physics, Institute of Physics, Chinese Academy of Sciences, Beijing 100190, China
- Ming Li
- Beijing National Laboratory for Condensed Matter Physics and Key Laboratory of Soft Matter Physics, Institute of Physics, Chinese Academy of Sciences, Beijing 100190, China
- Jie Yan
- Department of Physics, National University of Singapore, Singapore 117542
- Mechanobiology Institute, National University of Singapore, Singapore 117411
- Peng-Ye Wang
- Beijing National Laboratory for Condensed Matter Physics and Key Laboratory of Soft Matter Physics, Institute of Physics, Chinese Academy of Sciences, Beijing 100190, China
39
Hall D, Li S, Yamashita K, Azuma R, Carver JA, Standley DM. RNA-LIM: a novel procedure for analyzing protein/single-stranded RNA propensity data with concomitant estimation of interface structure. Anal Biochem 2015; 472:52-61. [PMID: 25479604 DOI: 10.1016/j.ab.2014.11.004] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.5] [Received: 08/19/2014] [Revised: 10/27/2014] [Accepted: 11/10/2014] [Indexed: 11/29/2022]
Abstract
RNA-LIM is a procedure that can analyze various pseudo-potentials describing the affinity between single-stranded RNA (ssRNA) ribonucleotides and surface amino acids to produce a coarse-grained estimate of the structure of the ssRNA at the protein interface. The search algorithm works by evolving an ssRNA chain, of known sequence, as a series of walks between fixed sites on a protein surface. Optimal routes are found by application of a set of minimal "limiting" restraints derived jointly from (i) selective sampling of the ribonucleotide amino acid affinity pseudo-potential data, (ii) limited surface path exploration by prior determination of surface arc lengths, and (iii) RNA structural specification obtained from a statistical potential gathered from a library of experimentally determined ssRNA structures. We describe the general approach using a NAST (Nucleic Acid Simulation Tool)-like approximation of the ssRNA chain and a generalized pseudo-potential reflecting the location of nucleic acid binding residues. Minimum and maximum performance indicators of the methodology are established using both synthetic data, for which the pseudo-potential defining nucleic acid binding affinity is systematically degraded, and a representative real case, where the RNA binding sites are predicted by the amplified antisense RNA (aaRNA) method. Some potential uses and extensions of the routine are discussed. RNA-LIM analysis programs along with detailed instructions for their use are available on request from the authors.
Affiliation(s)
- Damien Hall
- Research School of Chemistry, Australian National University, Canberra ACT 2601, Australia; Immunology Frontier Research Center (IFReC), Section on Systems Immunology, Osaka University, Suita, Osaka 565-0871, Japan
- Songling Li
- Immunology Frontier Research Center (IFReC), Section on Systems Immunology, Osaka University, Suita, Osaka 565-0871, Japan
- Kazuo Yamashita
- Immunology Frontier Research Center (IFReC), Section on Systems Immunology, Osaka University, Suita, Osaka 565-0871, Japan
- Ryuzo Azuma
- Immunology Frontier Research Center (IFReC), Section on Systems Immunology, Osaka University, Suita, Osaka 565-0871, Japan
- John A Carver
- Research School of Chemistry, Australian National University, Canberra ACT 2601, Australia
- Daron M Standley
- Immunology Frontier Research Center (IFReC), Section on Systems Immunology, Osaka University, Suita, Osaka 565-0871, Japan
40
Liu L, Heermann DW. The interaction of DNA with multi-Cys2His2 zinc finger proteins. J Phys Condens Matter 2015; 27:064107. [PMID: 25563438 DOI: 10.1088/0953-8984/27/6/064107] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.7] [Indexed: 06/04/2023]
Abstract
The multi-Cys2His2 (mC2H2) zinc finger protein, like CTCF, plays a central role in the three-dimensional organization of chromatin and gene regulation. The interaction between DNA and mC2H2 zinc finger proteins becomes crucial to better understand how CTCF dynamically shapes the chromatin structure. Here, we study a coarse-grained model of the mC2H2 zinc finger proteins in complexes with DNA, and in particular, we study how a mC2H2 zinc finger protein binds to and searches for its target DNA loci. On the basis of coarse-grained molecular dynamics simulations, we present several interesting kinetic conformational properties of the proteins, such as the rotation-coupled sliding, the asymmetrical roles of different zinc fingers and the partial binding partial dangling mode. In addition, two kinds of studied mC2H2 zinc finger proteins, of CG-rich and AT-rich binding motif each, were able to recognize their target sites and slide away from their non-target sites, which shows a proper sequence specificity in our model and the derived force field for mC2H2-DNA interaction. A further application to CTCF shows that the protein binds to a specific DNA duplex only with its central zinc fingers. The zinc finger domains of CTCF asymmetrically bend the DNA, but do not form a DNA loop alone in our simulations.
Affiliation(s)
- Lei Liu
- Institute for Theoretical Physics, Heidelberg University, 69117 Heidelberg, Germany
41
Abstract
Molecular dynamics (MD) simulations at the atomic scale are a powerful tool to study the structure and dynamics of model biological systems. However, because of their high computational cost, the time and length scales of atomistic simulations are limited. Biologically important processes, such as protein folding, ion channel gating, signal transduction, and membrane remodeling, are difficult to investigate using atomistic simulations. Coarse-graining reduces the computational cost of calculations by reducing the number of degrees of freedom in the model, allowing simulations of larger systems for longer times. In the first part of this chapter we review briefly some of the coarse-grained models available for proteins, focusing on the specific scope of each model. Then we describe in more detail the MARTINI coarse-grained force field, and we illustrate how to set up and run a simulation of a membrane protein using the Gromacs software package. We explain step-by-step the preparation of the protein and the membrane, the insertion of the protein in the membrane, the equilibration of the system, the simulation itself, and the analysis of the trajectory.
42
Guilhot-Gaudeffroy A, Froidevaux C, Azé J, Bernauer J. Protein-RNA complexes and efficient automatic docking: expanding RosettaDock possibilities. PLoS One 2014; 9:e108928. [PMID: 25268579 PMCID: PMC4182525 DOI: 10.1371/journal.pone.0108928] [Citation(s) in RCA: 28] [Impact Index Per Article: 2.5] [Received: 08/01/2014] [Accepted: 09/05/2014] [Indexed: 12/03/2022] Open
Abstract
Protein-RNA complexes provide a wide range of essential functions in the cell. Solving their atomic structures experimentally, although essential to understanding these functions, is often difficult and expensive. Docking approaches that have been developed for proteins are often challenging to adapt for RNA because of its inherent flexibility and the relative scarcity of available structural data. In this study we adapted the RosettaDock protocol for protein-RNA complexes both at the nucleotide and atomic levels. Using a genetic algorithm-based strategy, and a non-redundant protein-RNA dataset, we derived a RosettaDock scoring scheme able not only to discriminate but also to score docking decoys efficiently. The approach proved to be both efficient and robust for generating and identifying suitable structures when applied to two protein-RNA docking benchmarks in both bound and unbound settings. It also compares well to existing strategies. This is the first approach that currently offers a multi-level optimized scoring approach integrated in a full docking suite, leading the way to adaptive fully flexible strategies.
Affiliation(s)
- Adrien Guilhot-Gaudeffroy
  - AMIB Project, Inria Saclay-Île de France, Palaiseau, France
  - Laboratoire de Recherche en Informatique (LRI), CNRS UMR 8623, Université Paris-Sud, Orsay, France
  - Laboratoire d'Informatique de l'École Polytechnique (LIX), CNRS UMR 7161, École Polytechnique, Palaiseau, France
- Christine Froidevaux
  - AMIB Project, Inria Saclay-Île de France, Palaiseau, France
  - Laboratoire de Recherche en Informatique (LRI), CNRS UMR 8623, Université Paris-Sud, Orsay, France
- Jérôme Azé
  - AMIB Project, Inria Saclay-Île de France, Palaiseau, France
  - Laboratoire de Recherche en Informatique (LRI), CNRS UMR 8623, Université Paris-Sud, Orsay, France
  - Laboratoire d'Informatique de Robotique et de Microélectronique de Montpellier (LIRMM), CNRS UMR 5506, Université Montpellier 2, Montpellier, France
- Julie Bernauer
  - AMIB Project, Inria Saclay-Île de France, Palaiseau, France
  - Laboratoire d'Informatique de l'École Polytechnique (LIX), CNRS UMR 7161, École Polytechnique, Palaiseau, France
|
43
|
Yang X, Li H, Huang Y, Liu S. The dataset for protein-RNA binding affinity. Protein Sci 2014; 22:1808-11. [PMID: 24127340 DOI: 10.1002/pro.2383] [Citation(s) in RCA: 26] [Impact Index Per Article: 2.4] [Received: 07/16/2013] [Revised: 10/04/2013] [Accepted: 10/07/2013] [Indexed: 12/16/2022]
Abstract
We have developed a non-redundant protein-RNA binding benchmark dataset derived from the available protein-RNA structures in the Protein Data Bank. It consists of 73 complexes with measured binding affinity. The experimental conditions (pH and temperature) for the binding affinity measurements are also listed in our dataset. This binding affinity dataset can be used to compare and develop protein-RNA scoring functions. The binding free energies predicted for the 73 complexes by three available scoring functions for protein-RNA docking correlate poorly with the binding Gibbs free energy calculated from Kd.
Affiliation(s)
- Xiufeng Yang
- Biomolecular Physics and Modeling Group, Department of Physics, Huazhong University of Science and Technology, Wuhan, 430074, Hubei, China
|
44
|
Abstract
By focusing on essential features, while averaging over less important details, coarse-grained (CG) models provide significant computational and conceptual advantages with respect to more detailed models. Consequently, despite dramatic advances in computational methodologies and resources, CG models enjoy surging popularity and are becoming increasingly equal partners to atomically detailed models. This perspective surveys the rapidly developing landscape of CG models for biomolecular systems. In particular, this review seeks to provide a balanced, coherent, and unified presentation of several distinct approaches for developing CG models, including top-down, network-based, native-centric, knowledge-based, and bottom-up modeling strategies. The review summarizes their basic philosophies, theoretical foundations, typical applications, and recent developments. Additionally, the review identifies fundamental inter-relationships among the diverse approaches and discusses outstanding challenges in the field. When carefully applied and assessed, current CG models provide highly efficient means for investigating the biological consequences of basic physicochemical principles. Moreover, rigorous bottom-up approaches hold great promise for further improving the accuracy and scope of CG models for biomolecular systems.
Affiliation(s)
- W G Noid
- Department of Chemistry, The Pennsylvania State University, University Park, Pennsylvania 16802, USA
|
45
|
Huang SY, Zou X. A knowledge-based scoring function for protein-RNA interactions derived from a statistical mechanics-based iterative method. Nucleic Acids Res 2014; 42:e55. [PMID: 24476917 PMCID: PMC3985650 DOI: 10.1093/nar/gku077] [Citation(s) in RCA: 104] [Impact Index Per Article: 9.5] [Indexed: 01/15/2023] Open
Abstract
Protein-RNA interactions play important roles in many biological processes. Given the high cost and technical difficulties of experimental methods, computational prediction of binding complexes from individual protein and RNA structures is urgently needed, and a reliable scoring function is one of its critical components. Here, we have developed a knowledge-based scoring function, referred to as ITScore-PR, for protein-RNA binding mode prediction by using a statistical mechanics-based iterative method. The pairwise distance-dependent atomic interaction potentials of ITScore-PR were derived from experimentally determined protein–RNA complex structures. For validation, we have compared ITScore-PR with 10 other scoring methods on four diverse test sets. For bound docking, ITScore-PR achieved a success rate of up to 86% if the top prediction was considered and up to 94% if the top 10 predictions were considered. For truly unbound docking, the respective success rates of ITScore-PR were up to 24% and 46%. ITScore-PR can be used stand-alone or easily implemented in other docking programs for protein–RNA recognition.
Affiliation(s)
- Sheng-You Huang
- Department of Physics and Astronomy, Department of Biochemistry, Dalton Cardiovascular Research Center, and Informatics Institute, University of Missouri, Columbia, MO 65211, USA
|
46
|
Yang Y, Zhao H, Wang J, Zhou Y. SPOT-Seq-RNA: predicting protein-RNA complex structure and RNA-binding function by fold recognition and binding affinity prediction. Methods Mol Biol 2014; 1137:119-30. [PMID: 24573478 PMCID: PMC3937850 DOI: 10.1007/978-1-4939-0366-5_9] [Citation(s) in RCA: 29] [Impact Index Per Article: 2.6] [Indexed: 12/13/2022]
Abstract
RNA-binding proteins (RBPs) play key roles in RNA metabolism and post-transcriptional regulation. Computational methods have been developed separately for prediction of RBPs and RNA-binding residues by machine-learning techniques and for prediction of protein-RNA complex structures by rigid or semiflexible structure-to-structure docking. Here, we describe a template-based technique called SPOT-Seq-RNA that integrates prediction of RBPs, RNA-binding residues, and protein-RNA complex structures into a single package. This integration is achieved by combining template-based structure-prediction software, SPARKS X, with binding affinity prediction software, DRNA. This tool yields reasonable sensitivity (46%) and high precision (84%) for an independent test set of 215 RBPs and 5,766 non-RBPs. SPOT-Seq-RNA is computationally efficient for genome-scale prediction of RBPs and protein-RNA complex structures. Its application to the human genome has revealed a similar sensitivity and an ability to uncover hundreds of novel RBPs beyond simple homology. The online server and downloadable version of SPOT-Seq-RNA are available at http://sparks-lab.org/server/SPOT-Seq-RNA/.
Affiliation(s)
- Yuedong Yang
- School of Informatics, Indiana University Purdue University, Indianapolis, IN, USA
|
47
|
Setny P, Zacharias M. Elastic Network Models of Nucleic Acids Flexibility. J Chem Theory Comput 2013; 9:5460-70. [PMID: 26592282 DOI: 10.1021/ct400814n] [Citation(s) in RCA: 26] [Impact Index Per Article: 2.2] [Indexed: 02/03/2023]
Abstract
Elastic network models (ENMs) are a useful tool for describing large-scale motions in protein systems. While they are well validated in the context of proteins, relatively little is known about their applicability to nucleic acids, whose different architecture does not necessarily warrant comparable performance. In this study we thoroughly evaluate and optimize the efficiency of popular ENMs for capturing RNA and DNA flexibility. We also introduce two alternative models in which the strength of elastic connections at a coarse-grained level is governed by the distance distribution at atomic resolution. For each of the considered ENMs we report the optimal length of spring connections as well as the scaling of elastic force constants that provides the best agreement of vibrational frequencies with normal modes based on an atomic force field. To determine the absolute values of the force constants, we introduce a novel method based on the overlap of the pseudoinverses of Hessian matrices.
Affiliation(s)
- Piotr Setny
  - Centre for New Technologies, University of Warsaw, 00-927 Warsaw, Poland
- Martin Zacharias
  - Physics Department T38, Technical University Munich, 85748 Garching, Germany
|
48
|
Zhao H, Yang Y, Zhou Y. Prediction of RNA binding proteins comes of age from low resolution to high resolution. Mol Biosyst 2013; 9:2417-25. [PMID: 23872922 PMCID: PMC3870025 DOI: 10.1039/c3mb70167k] [Citation(s) in RCA: 31] [Impact Index Per Article: 2.6] [Indexed: 11/21/2022]
Abstract
Networks of protein-RNA interactions are likely to be larger than protein-protein and protein-DNA interaction networks because RNA transcripts are encoded tens of times more than proteins (e.g., only 3% of the human genome codes for proteins), have diverse functions and localizations, and are controlled by proteins from birth (transcription) to death (degradation). This massive network is evidenced by several recent experimental discoveries of large numbers of previously unknown RNA-binding proteins (RBPs). Meanwhile, more than 400 non-redundant protein-RNA complex structures (at 25% sequence identity or less) have been deposited in the Protein Data Bank. These sequence and structural resources for RBPs provide ample data for the development of computational techniques dedicated to RBP prediction, as experimentally determining RNA-binding functions is time-consuming and expensive. This review compares traditional machine-learning-based approaches with emerging template-based methods at several levels of prediction resolution, ranging from two-state binding/non-binding prediction to binding residue prediction and protein-RNA complex structure prediction. The analysis indicates that the two approaches are complementary and that their combination may lead to further improvements.
Affiliation(s)
- Huiying Zhao
- School of Informatics, Indiana University Purdue University, Indianapolis, Indiana 46202, USA.
|
49
|
Mapping the Spatial Neighborhood of the Regulatory 6S RNA Bound to Escherichia coli RNA Polymerase Holoenzyme. J Mol Biol 2013; 425:3649-61. [DOI: 10.1016/j.jmb.2013.07.008] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.3] [Received: 03/26/2013] [Revised: 06/28/2013] [Accepted: 07/04/2013] [Indexed: 11/15/2022]
|
50
|
Parisien M, Wang X, Perdrizet G, Lamphear C, Fierke CA, Maheshwari KC, Wilde MJ, Sosnick TR, Pan T. Discovering RNA-protein interactome by using chemical context profiling of the RNA-protein interface. Cell Rep 2013; 3:1703-13. [PMID: 23665222 DOI: 10.1016/j.celrep.2013.04.010] [Citation(s) in RCA: 21] [Impact Index Per Article: 1.8] [Received: 06/12/2012] [Revised: 03/04/2013] [Accepted: 04/12/2013] [Indexed: 02/04/2023] Open
Abstract
RNA-protein (RNP) interactions are generally required for RNA function. At least 5% of human genes code for RNA-binding proteins. Whereas many approaches can identify the RNA partners of a specific protein, finding the protein partners of a specific RNA is difficult. We present a machine-learning method that scores a protein's binding potential for an RNA structure by utilizing the chemical context profiles of the interface from known RNP structures. Our approach is applicable even when only a single RNP structure is available. We examined 801 mammalian proteins and found that 37 (4.6%) potentially bind transfer RNA (tRNA). Most are enzymes involved in cellular processes unrelated to translation and were not known to interact with RNA. We experimentally tested six positive and three negative predictions for tRNA binding in vivo, and all nine predictions were correct. Our computational approach provides a powerful complement to experiments in discovering new RNPs.
Affiliation(s)
- Marc Parisien
- Department of Biochemistry and Molecular Biology, University of Chicago, Chicago, IL 60637, USA
|