1
Sun C, Feng Y. EPDRNA: A Model for Identifying DNA-RNA Binding Sites in Disease-Related Proteins. Protein J 2024; 43:513-521. [PMID: 38491248 DOI: 10.1007/s10930-024-10183-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 02/02/2024] [Indexed: 03/18/2024]
Abstract
Protein-DNA and protein-RNA interactions are involved in many biological processes and regulate many cellular functions. Moreover, they are related to many human diseases. To understand the molecular mechanisms of protein-DNA and protein-RNA binding, it is important to identify which residues in the protein sequence bind to DNA and RNA. At present, there are few methods for specifically identifying the DNA- and RNA-binding sites of disease-related proteins. In this study, we therefore combined four machine learning algorithms into an ensemble classifier (EPDRNA) to predict DNA and RNA binding sites in disease-related proteins. The dataset used in the model was collated from the UniProt and PDB databases, and PSSM, physicochemical properties and amino acid type were used as features. EPDRNA adopted soft voting and achieved a best AUC of 0.73 for DNA binding sites and a best AUC of 0.71 for RNA binding sites in 10-fold cross-validation on the training sets. To further verify the performance of the model, we assessed EPDRNA on the prediction of DNA-binding sites and RNA-binding sites on the independent test dataset. EPDRNA achieved 85% recall and 25% precision on the protein-DNA interaction independent test set, and 82% recall and 27% precision on the protein-RNA interaction independent test set. The online EPDRNA webserver is freely available at http://www.s-bioinformatics.cn/epdrna.
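The soft-voting step named in this abstract can be illustrated with a minimal Python sketch. It assumes each of the four base classifiers has already produced a per-residue binding probability. The function names, the equal model weights and the 0.5 threshold are illustrative assumptions, not details taken from the EPDRNA paper.

```python
from typing import List, Sequence


def soft_vote(prob_per_model: Sequence[Sequence[float]]) -> List[float]:
    """Average the per-residue binding probabilities of several models.

    prob_per_model[m][i] is model m's probability that residue i binds.
    Equal model weights are an assumption made for this sketch.
    """
    n_models = len(prob_per_model)
    n_residues = len(prob_per_model[0])
    if any(len(p) != n_residues for p in prob_per_model):
        raise ValueError("all models must score the same residues")
    return [
        sum(model[i] for model in prob_per_model) / n_models
        for i in range(n_residues)
    ]


def call_binding(averaged: Sequence[float], threshold: float = 0.5) -> List[int]:
    """Label a residue as binding (1) when its averaged probability reaches the threshold."""
    return [1 if p >= threshold else 0 for p in averaged]


# Hypothetical probabilities from four models for three residues.
scores = [
    [0.9, 0.2, 0.6],
    [0.7, 0.4, 0.4],
    [0.6, 0.1, 0.5],
    [0.8, 0.5, 0.3],
]
averaged = soft_vote(scores)
labels = call_binding(averaged)
```

In a scheme like this, lowering the threshold would generally raise recall at the cost of precision, so the threshold is a design choice rather than a fixed constant.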
Affiliation(s)
- CanZhuang Sun
  - College of Science, Inner Mongolia Agriculture University, Hohhot, 010018, People's Republic of China
- YongE Feng
  - College of Science, Inner Mongolia Agriculture University, Hohhot, 010018, People's Republic of China
2
Agarwal A, Kant S, Bahadur RP. Efficient mapping of RNA-binding residues in RNA-binding proteins using local sequence features of binding site residues in protein-RNA complexes. Proteins 2023; 91:1361-1379. [PMID: 37254800 DOI: 10.1002/prot.26528] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/07/2022] [Revised: 04/13/2023] [Accepted: 05/02/2023] [Indexed: 06/01/2023]
Abstract
Protein-RNA interactions play vital roles in a plethora of biological processes such as regulation of gene expression, protein synthesis, mRNA processing and biogenesis. Identification of RNA-binding residues (RBRs) in proteins is essential to understand RNA-mediated protein functioning, to perform site-directed mutagenesis and to develop novel targeted drug therapies. Moreover, the extensive gap between sequence and structural data restricts the identification of binding sites in unsolved structures. However, efficient use of computational methods that require only sequence to identify binding residues can bridge this huge sequence-structure gap. In this study, we have extensively studied the protein-RNA interface in known RNA-binding proteins (RBPs). We find that the interface is highly enriched in basic and polar residues, with Gly being the most common interface neighbor. We investigated several amino acid features and developed a method to predict putative RBRs from amino acid sequence. We have implemented a balanced random forest (BRF) classifier with local residue features of protein sequences for prediction. With 5-fold cross-validation, the sequence-pattern-derived dipeptide composition based BRF model (DCP-BRF) resulted in an accuracy of 87.9%, specificity of 88.8%, sensitivity of 82.2%, Matthews correlation coefficient of 0.60 and AUC of 0.93, performing better than a few existing methods. We further validated our prediction model on known human RBPs through RBR prediction and could map ~54% of them. Further, knowledge of binding site preferences obtained from computational predictions combined with experimental validations of potential RNA binding sites can enhance our understanding of protein-RNA interactions. This may serve to accelerate investigations on functional roles of many novel RBPs.
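The performance measures quoted in this abstract (accuracy, specificity, sensitivity and the Matthews correlation coefficient) all follow from the four confusion-matrix counts. The Python sketch below shows the standard definitions. The counts in the example are hypothetical and are not taken from the paper.

```python
import math


def binary_metrics(tp: int, fp: int, tn: int, fn: int) -> dict:
    """Compute accuracy, sensitivity, specificity and MCC from confusion counts.

    Assumes at least one positive and one negative example are present.
    """
    total = tp + fp + tn + fn
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return {
        "accuracy": (tp + tn) / total,
        "sensitivity": tp / (tp + fn),  # fraction of binding residues recovered
        "specificity": tn / (tn + fp),  # fraction of non-binding residues recovered
        # MCC is conventionally set to 0 when any marginal total is empty.
        "mcc": (tp * tn - fp * fn) / denom if denom else 0.0,
    }


# Hypothetical counts for illustration only.
m = binary_metrics(tp=40, fp=5, tn=45, fn=10)
```

Unlike accuracy, the MCC stays informative when binding residues are much rarer than non-binding ones, which is the usual situation in binding-site prediction.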
Affiliation(s)
- Ankita Agarwal
  - School of Bio Science, Indian Institute of Technology Kharagpur, Kharagpur, India
  - Computational Structural Biology Laboratory, Department of Biotechnology, Indian Institute of Technology Kharagpur, Kharagpur, India
- Shri Kant
  - Computational Structural Biology Laboratory, Department of Biotechnology, Indian Institute of Technology Kharagpur, Kharagpur, India
- Ranjit Prasad Bahadur
  - Computational Structural Biology Laboratory, Department of Biotechnology, Indian Institute of Technology Kharagpur, Kharagpur, India
3
Zhao B, Katuwawala A, Oldfield CJ, Hu G, Wu Z, Uversky VN, Kurgan L. Intrinsic Disorder in Human RNA-Binding Proteins. J Mol Biol 2021; 433:167229. [PMID: 34487791 DOI: 10.1016/j.jmb.2021.167229] [Citation(s) in RCA: 20] [Impact Index Per Article: 6.7] [Received: 06/17/2021] [Revised: 08/30/2021] [Accepted: 08/31/2021] [Indexed: 12/24/2022]
Abstract
Although RNA-binding proteins (RBPs) are known to be enriched in intrinsic disorder, no previous analysis focused on RBPs interacting with specific RNA types. We fill this gap with a comprehensive analysis of the putative disorder in RBPs binding to six common RNA types: messenger RNA (mRNA), transfer RNA (tRNA), small nuclear RNA (snRNA), non-coding RNA (ncRNA), ribosomal RNA (rRNA), and internal ribosome RNA (irRNA). We also analyze the amount of putative intrinsic disorder in the RNA-binding domains (RBDs) and non-RNA-binding-domain regions (non-RBD regions). Consistent with previous studies, we show that in comparison with the human proteome, RBPs are significantly enriched in disorder. However, closer examination finds significant enrichment in predicted disorder for the mRNA-, rRNA- and snRNA-binding proteins, while the proteins that interact with ncRNA and irRNA are not enriched in disorder, and the tRNA-binding proteins are significantly depleted in disorder. We show a consistent pattern of significant disorder enrichment in the non-RBD regions coupled with low levels of disorder in RBDs, which suggests that disorder is relatively rarely utilized in the RNA-binding regions. Our analysis of the non-RBD regions suggests that disorder harbors posttranslational modification sites and is involved in the putative interactions with DNA. Importantly, we utilize experimental data from DisProt and independent data from Pfam to validate the above observations that rely on the disorder predictions. This study provides new insights into the distribution of disorder across proteins that bind different RNA types and the functional role of disorder in the regions where it is enriched.
Affiliation(s)
- Bi Zhao
  - Department of Computer Science, Virginia Commonwealth University, Richmond, VA 23284, USA
- Akila Katuwawala
  - Department of Computer Science, Virginia Commonwealth University, Richmond, VA 23284, USA
- Christopher J Oldfield
  - Department of Microbiology and Immunology, Virginia Commonwealth University, Richmond, VA 23284, USA
- Gang Hu
  - School of Statistics and Data Science, LPMC and KLMDASR, Nankai University, Tianjin 300071, China
- Zhonghua Wu
  - School of Mathematical Sciences and LPMC, Nankai University, Tianjin 300071, China
- Vladimir N Uversky
  - Department of Molecular Medicine and USF Health Byrd Alzheimer's Research Institute, Morsani College of Medicine, University of South Florida, Tampa, FL 33612, USA
- Lukasz Kurgan
  - Department of Computer Science, Virginia Commonwealth University, Richmond, VA 23284, USA
4
Corley M, Burns MC, Yeo GW. How RNA-Binding Proteins Interact with RNA: Molecules and Mechanisms. Mol Cell 2020; 78:9-29. [PMID: 32243832 PMCID: PMC7202378 DOI: 10.1016/j.molcel.2020.03.011] [Citation(s) in RCA: 348] [Impact Index Per Article: 87.0] [Received: 09/24/2019] [Revised: 01/13/2020] [Accepted: 03/09/2020] [Indexed: 12/17/2022]
Abstract
RNA-binding proteins (RBPs) comprise a large class of over 2,000 proteins that interact with transcripts in all manner of RNA-driven processes. The structures and mechanisms that RBPs use to bind and regulate RNA are incredibly diverse. In this review, we take a look at the components of protein-RNA interaction, from the molecular level to multi-component interaction. We first summarize what is known about protein-RNA molecular interactions based on analyses of solved structures. We additionally describe software currently available for predicting protein-RNA interaction and other resources useful for the study of RBPs. We then review the structure and function of seventeen known RNA-binding domains and analyze the hydrogen bonds adopted by protein-RNA structures on a domain-by-domain basis. We conclude with a summary of the higher-level mechanisms that regulate protein-RNA interactions.
Affiliation(s)
- Meredith Corley
  - Department of Cellular and Molecular Medicine, University of California, San Diego, La Jolla, CA, USA
- Margaret C Burns
  - Department of Cellular and Molecular Medicine, University of California, San Diego, La Jolla, CA, USA; Biomedical Sciences Graduate Program, University of California, San Diego, La Jolla, CA, USA
- Gene W Yeo
  - Department of Cellular and Molecular Medicine, University of California, San Diego, La Jolla, CA, USA; Biomedical Sciences Graduate Program, University of California, San Diego, La Jolla, CA, USA; Institute for Genomic Medicine, University of California, San Diego, La Jolla, CA, USA
5
Nithin C, Mukherjee S, Bahadur RP. A structure-based model for the prediction of protein-RNA binding affinity. RNA (New York, N.Y.) 2019; 25:1628-1645. [PMID: 31395671 PMCID: PMC6859855 DOI: 10.1261/rna.071779.119] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.4] [Received: 05/04/2019] [Accepted: 08/05/2019] [Indexed: 05/28/2023]
Abstract
Protein-RNA recognition is highly affinity-driven and regulates a wide array of cellular functions. In this study, we have curated a binding affinity data set of 40 protein-RNA complexes, for which at least one unbound partner is available in the docking benchmark. The data set covers a wide affinity range of eight orders of magnitude as well as four different structural classes. On average, we find that the complexes with single-stranded RNA have the highest affinity, whereas the complexes with duplex RNA have the lowest. Nevertheless, the free energy gain upon binding is the highest for the complexes with ribosomal proteins and the lowest for the complexes with tRNA, with an average of -5.7 cal/mol/Å² in the entire data set. We train regression models to predict the binding affinity from the structural and physicochemical parameters of protein-RNA interfaces. The best-fit model with the lowest maximum error uses three interface parameters: relative hydrophobicity, conformational change upon binding and relative hydration pattern. This model has been used for predicting the binding affinity on a test data set, generated using mutated structures of yeast aspartyl-tRNA synthetase, for which experimentally determined ΔG values of 40 mutations are available. The predicted ΔGempirical values correlate highly with the experimental observations. The data set provided in this study should be useful for further development of binding affinity prediction methods. Moreover, the model developed in this study enhances our understanding of the structural basis of protein-RNA binding affinity and provides a platform to engineer protein-RNA interfaces with desired affinity.
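To give a sense of scale for an affinity range of eight orders of magnitude, the standard relation ΔG = RT ln(Kd) (with Kd relative to a 1 M standard state) converts dissociation constants to binding free energies. The Python sketch below is illustrative only: the temperature of 298.15 K and the two Kd values are assumptions, not values from the paper.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol*K)


def delta_g_from_kd(kd_molar: float, temperature_k: float = 298.15) -> float:
    """Binding free energy (kcal/mol) from a dissociation constant: dG = RT ln(Kd)."""
    return R_KCAL * temperature_k * math.log(kd_molar)


# Two hypothetical complexes whose Kd values differ by eight orders of magnitude.
weak = delta_g_from_kd(1e-4)
tight = delta_g_from_kd(1e-12)
span = weak - tight  # free-energy range covered, in kcal/mol
```

Under these assumptions the span is roughly 11 kcal/mol, since each tenfold change in Kd corresponds to about 1.4 kcal/mol at room temperature.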
Affiliation(s)
- Chandran Nithin
  - Computational Structural Biology Lab, Department of Biotechnology, Indian Institute of Technology Kharagpur, Kharagpur 721302, India
- Sunandan Mukherjee
  - Computational Structural Biology Lab, Department of Biotechnology, Indian Institute of Technology Kharagpur, Kharagpur 721302, India
- Ranjit Prasad Bahadur
  - Computational Structural Biology Lab, Department of Biotechnology, Indian Institute of Technology Kharagpur, Kharagpur 721302, India
6
Pilla SP, Thomas A, Bahadur RP. Dissecting macromolecular recognition sites in ribosome: implication to its self-assembly. RNA Biol 2019; 16:1300-1312. [PMID: 31179876 DOI: 10.1080/15476286.2019.1629767] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.6] [Indexed: 12/27/2022]
Abstract
Interactions between macromolecules play a crucial role in ribosome assembly, which follows a highly coordinated process involving RNA folding and binding of ribosomal proteins (r-proteins). Although extensive studies have been carried out to understand macromolecular interactions in ribosomes, most of them are confined to either the large or the small ribosomal subunit of a few species. A comparative analysis of macromolecular interactions across different domains is still missing. We have analyzed the structural and physicochemical properties of protein-protein (PP), protein-RNA (PR) and RNA-RNA (RR) interfaces in small and large subunits of ribosomes, as well as between the two subunits. Additionally, we have developed a Random Forest (RF) classifier to catalog the r-proteins. We find significant differences as well as similarities in macromolecular recognition sites between ribosomal assemblies of prokaryotes and eukaryotes. PR interfaces are substantially larger and have more ionic interactions than PP and RR interfaces in both prokaryotes and eukaryotes. PP, PR and RR interfaces in eukaryotes are well packed compared to those in prokaryotes. However, the packing density between the large and the small subunit interfaces in the entire assembly is strikingly low in both prokaryotes and eukaryotes, indicating the periodic association and dissociation of the two subunits during translation. The structural and physicochemical properties of PR interfaces are used to classify the r-proteins in the assembly pathway into early, intermediate and late binders using the RF classifier with an accuracy of 80%. The results provide new insights into the classification of r-proteins in the assembly pathway.
Affiliation(s)
- Smita P Pilla
  - Computational Structural Biology Laboratory, Department of Biotechnology, Indian Institute of Technology Kharagpur, Kharagpur, India
- Amal Thomas
  - Computational Structural Biology Laboratory, Department of Biotechnology, Indian Institute of Technology Kharagpur, Kharagpur, India
- Ranjit Prasad Bahadur
  - Computational Structural Biology Laboratory, Department of Biotechnology, Indian Institute of Technology Kharagpur, Kharagpur, India
7
Seo M, Lei L, Egli M. Label-Free Electrophoretic Mobility Shift Assay (EMSA) for Measuring Dissociation Constants of Protein-RNA Complexes. Current Protocols in Nucleic Acid Chemistry 2019; 76:e70. [PMID: 30461222 PMCID: PMC6391183 DOI: 10.1002/cpnc.70] [Citation(s) in RCA: 17] [Impact Index Per Article: 3.4] [Indexed: 12/18/2022]
Abstract
The electrophoretic mobility shift assay (EMSA) is a well-established method to detect formation of complexes between proteins and nucleic acids and to determine, among other parameters, equilibrium constants for the interaction. Mixtures of protein and nucleic acid solutions of various ratios are analyzed via polyacrylamide gel electrophoresis (PAGE) under native conditions. In general, protein-nucleic acid complexes will migrate more slowly than the free nucleic acid. From the distributions of the nucleic acid components in the observed bands in individual gel lanes, quantitative parameters such as the dissociation constant (Kd) of the interaction can be measured. This article describes a simple and rapid EMSA that relies either on precast commercial or handcast polyacrylamide gels and uses unlabeled protein and nucleic acid. Nucleic acids are instead detected with SYBR Gold stain and band intensities established with a standard gel imaging system. We used this protocol specifically to determine Kd values for complexes between the PAZ domain of Argonaute 2 (Ago2) enzyme and native and chemically modified RNA oligonucleotides. EMSA-based equilibrium constants are compared to those determined with isothermal titration calorimetry (ITC). Advantages and limitations of this simple EMSA are discussed by comparing it to other techniques used for determination of equilibrium constants of protein-RNA interactions, and a troubleshooting guide is provided. © 2018 by John Wiley & Sons, Inc.
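As a rough illustration of how band intensities can be turned into a Kd, the Python sketch below computes the fraction of nucleic acid bound in each lane and inverts the simple 1:1 binding isotherm θ = [P] / (Kd + [P]). It assumes single-site binding and that free protein is approximately equal to total protein. The function names and the median-based estimate are choices made for this sketch, not the protocol's own fitting procedure.

```python
from statistics import median
from typing import List, Sequence


def fraction_bound(bound: Sequence[float], free: Sequence[float]) -> List[float]:
    """Fraction of nucleic acid in the shifted (complex) band for each lane."""
    return [b / (b + f) for b, f in zip(bound, free)]


def estimate_kd(protein_conc: Sequence[float], theta: Sequence[float]) -> float:
    """Median of per-lane Kd estimates from theta = [P] / (Kd + [P]).

    Assumes 1:1 binding and free protein ~ total protein.
    Lanes with theta of exactly 0 or 1 carry no Kd information and are skipped.
    """
    estimates = [
        p * (1.0 - t) / t
        for p, t in zip(protein_conc, theta)
        if 0.0 < t < 1.0
    ]
    if not estimates:
        raise ValueError("no lane with a partially bound nucleic acid")
    return median(estimates)


# Synthetic band intensities generated for a hypothetical Kd of 2.0
# (same concentration units as protein_conc).
protein_conc = [0.5, 1.0, 2.0, 4.0, 8.0]
bound = [p / (2.0 + p) for p in protein_conc]
free = [1.0 - b for b in bound]
kd = estimate_kd(protein_conc, fraction_bound(bound, free))
```

With real gel data a nonlinear least-squares fit of the isotherm across all lanes would usually be preferred; the per-lane median is used here only to keep the example dependency-free.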
Affiliation(s)
- Minguk Seo
  - Department of Biochemistry, School of Medicine, Vanderbilt University, Nashville TN 37232
- Li Lei
  - Department of Biochemistry, School of Medicine, Vanderbilt University, Nashville TN 37232
- Martin Egli
  - Department of Biochemistry, School of Medicine, Vanderbilt University, Nashville TN 37232
8
Nithin C, Ghosh P, Bujnicki JM. Bioinformatics Tools and Benchmarks for Computational Docking and 3D Structure Prediction of RNA-Protein Complexes. Genes (Basel) 2018; 9:genes9090432. [PMID: 30149645 PMCID: PMC6162694 DOI: 10.3390/genes9090432] [Citation(s) in RCA: 26] [Impact Index Per Article: 4.3] [Received: 05/14/2018] [Revised: 07/26/2018] [Accepted: 08/21/2018] [Indexed: 12/29/2022]
Abstract
RNA-protein (RNP) interactions play essential roles in many biological processes, such as regulation of co-transcriptional and post-transcriptional gene expression, RNA splicing, transport, storage and stabilization, as well as protein synthesis. An increasing number of RNP structures would aid in a better understanding of these processes. However, due to the technical difficulties associated with experimental determination of macromolecular structures by high-resolution methods, studies on RNP recognition and complex formation present significant challenges. As an alternative, computational prediction of RNP interactions can be carried out. Structural models obtained by theoretical predictive methods are, in general, less reliable than models based on experimental measurements, but they can be sufficiently accurate to be used as a basis for formulating functional hypotheses. In this article, we present an overview of computational methods for 3D structure prediction of RNP complexes. We discuss currently available methods for macromolecular docking and for scoring 3D structural models of RNP complexes in particular. We also review benchmarks that have been developed to assess the accuracy of these methods.
Affiliation(s)
- Chandran Nithin
  - Laboratory of Bioinformatics and Protein Engineering, International Institute of Molecular and Cell Biology in Warsaw, ul. Ks. Trojdena 4, PL-02-109 Warsaw, Poland
- Pritha Ghosh
  - Laboratory of Bioinformatics and Protein Engineering, International Institute of Molecular and Cell Biology in Warsaw, ul. Ks. Trojdena 4, PL-02-109 Warsaw, Poland
- Janusz M Bujnicki
  - Laboratory of Bioinformatics and Protein Engineering, International Institute of Molecular and Cell Biology in Warsaw, ul. Ks. Trojdena 4, PL-02-109 Warsaw, Poland
  - Bioinformatics Laboratory, Institute of Molecular Biology and Biotechnology, Faculty of Biology, Adam Mickiewicz University, ul. Umultowska 89, PL-61-614 Poznan, Poland
9
An account of solvent accessibility in protein-RNA recognition. Sci Rep 2018; 8:10546. [PMID: 30002431 PMCID: PMC6043566 DOI: 10.1038/s41598-018-28373-2] [Citation(s) in RCA: 18] [Impact Index Per Article: 3.0] [Received: 03/01/2018] [Accepted: 06/21/2018] [Indexed: 01/16/2023]
Abstract
Protein–RNA recognition often induces conformational changes in binding partners. Consequently, the solvent accessible surface area (SASA) buried in contact estimated from the co-crystal structures may differ from that calculated using their unbound forms. To evaluate the change in accessibility upon binding, we compare the SASA of 126 protein-RNA complexes between bound and unbound forms. We observe that in the majority of cases the interfaces of both binding partners gain accessibility upon binding, which is often associated with either large domain movements or secondary structural transitions in RNA-binding proteins (RBPs), and binding-induced conformational changes in RNAs. In the non-interface region, the majority of RNAs lose accessibility upon binding; however, no such preference is observed for RBPs. Side chains of RBPs make the major contribution to the change in accessibility. In the case of flexible binding, we find a moderate correlation between the binding free energy and the change in accessibility at the interface. Finally, we introduce a parameter, the ratio of gain to loss of accessibility upon binding, which can be used to identify the native solution among the flexible docking models. Our findings provide fundamental insights into the relationship between flexibility and solvent accessibility, and advance our understanding of binding-induced folding in protein-RNA recognition.
10
Stephen P, Ye S, Zhou M, Song J, Zhang R, Wang ED, Giegé R, Lin SX. Structure of Escherichia coli Arginyl-tRNA Synthetase in Complex with tRNA Arg: Pivotal Role of the D-loop. J Mol Biol 2018; 430:1590-1606. [PMID: 29678554 DOI: 10.1016/j.jmb.2018.04.011] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.0] [Received: 01/12/2018] [Revised: 03/16/2018] [Accepted: 04/10/2018] [Indexed: 10/17/2022]
Abstract
Aminoacyl-tRNA synthetases are essential components in protein biosynthesis. Arginyl-tRNA synthetase (ArgRS) belongs to the small group of aminoacyl-tRNA synthetases requiring cognate tRNA for amino acid activation. The crystal structure of Escherichia coli (Eco) ArgRS has been solved in complex with tRNAArg at 3.0-Å resolution. With this first bacterial tRNA complex, we are attempting to bridge the gap existing in structure-function understanding in prokaryotic tRNAArg recognition. The structure shows a tight binding of tRNA on the synthetase through the identity determinant A20 from the D-loop, a tRNA recognition snapshot never elucidated structurally. This interaction of A20 involves 5 amino acids from the synthetase. Additional contacts via U20a and U16 from the D-loop reinforce the interaction. The importance of D-loop recognition in EcoArgRS functioning is supported by a mutagenesis analysis of critical amino acids that anchor tRNAArg on the synthetase; in particular, mutations at amino acids interacting with A20 affect binding affinity to the tRNA and specificity of arginylation. Altogether the structural and functional data indicate that the unprecedented ArgRS crystal structure represents a snapshot during functioning and suggest that the recognition of the D-loop by ArgRS is an important trigger that anchors tRNAArg on the synthetase. In this process, A20 plays a major role, together with prominent conformational changes in several ArgRS domains that may eventually lead to the mature ArgRS:tRNA complex and the arginine activation. Functional implications that could be idiosyncratic to the arginine identity of bacterial ArgRSs are discussed.
Affiliation(s)
- Preyesh Stephen
  - Laboratory of Molecular Endocrinology, CHU Research Center and Laval University, Québec, Canada
- Sheng Ye
  - National Laboratory of Biomacromolecules, Institute of Biophysics, Chinese Academy of Sciences, Beijing, China
- Ming Zhou
  - Laboratory of Molecular Endocrinology, CHU Research Center and Laval University, Québec, Canada
- Jian Song
  - Laboratory of Molecular Endocrinology, CHU Research Center and Laval University, Québec, Canada
- Rongguang Zhang
  - National Laboratory of Biomacromolecules, Institute of Biophysics, Chinese Academy of Sciences, Beijing, China; Shanghai Institutes of Biochemistry and Cell Biology, SIBS, Shanghai, China
- En-Duo Wang
  - Shanghai Institutes of Biochemistry and Cell Biology, SIBS, Shanghai, China
- Richard Giegé
  - Institut de Biologie Moléculaire et Cellulaire, CNRS and Université de Strasbourg, Strasbourg Cedex, France
- Sheng-Xiang Lin
  - Laboratory of Molecular Endocrinology, CHU Research Center and Laval University, Québec, Canada; Shanghai Institutes of Biochemistry and Cell Biology, SIBS, Shanghai, China
11
Hu W, Qin L, Li M, Pu X, Guo Y. A structural dissection of protein–RNA interactions based on different RNA base areas of interfaces. RSC Adv 2018; 8:10582-10592. [PMID: 35540439 PMCID: PMC9078961 DOI: 10.1039/c8ra00598b] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.2] [Received: 01/20/2018] [Accepted: 03/05/2018] [Indexed: 11/21/2022]
Abstract
Protein–RNA interactions are very common cellular processes, but the mechanisms of interaction are not fully understood, mainly due to the complicated RNA structures. Through a detailed investigation of the RNA structures in protein–RNA complexes, this paper finds for the first time that the RNAs in these complexes can be clearly classified into three classes (high, medium and low) based on their level of Pbase (the percentage of base area buried in the RNA interface). Based on the three RNA classes, more detailed analyses of protein–RNA interactions were performed from various aspects, including interface area, structure, composition and interaction force, so as to achieve a deeper understanding of the recognition specificity for the three classes of protein–RNA interactions. According to our classification strategy, the three complex classes differ significantly in almost all properties. Complexes in the high class have short and extended RNA structures and behave like protein–ssDNA interactions. Their hydrogen bonds and hydrophobic interactions are strong. For complexes in the low class, the RNA structures are mainly double-stranded, like protein–dsDNA interactions, and electrostatic interactions frequently occur. The complexes in the medium class have the longest RNA chains and the largest average interface area. Meanwhile, they do not show any preference for the interaction force. On average, in terms of composition, secondary structures and intermolecular physicochemical properties, significant feature preferences can be observed in high and low complexes, but no highly specific features are found for medium complexes. We found that our proposed Pbase is an important parameter which can be used as a new determinant to distinguish protein–RNA complexes. For high and low complexes, we can more easily understand the specificity of the recognition process from the interface features than for medium complexes. In the future, medium complexes should be the focus of further structural analysis from more feature aspects. Overall, this study may contribute to further understanding of the mechanism of protein–RNA interactions on a more detailed level.
Qualitative and quantitative measurements of the influence of structure and composition of RNA interfaces on protein–RNA interactions.
Affiliation(s)
- Wen Hu
  - College of Chemistry, Sichuan University, Chengdu 610064, People's Republic of China
- Liu Qin
  - College of Chemistry, Sichuan University, Chengdu 610064, People's Republic of China
- Menglong Li
  - College of Chemistry, Sichuan University, Chengdu 610064, People's Republic of China
- Xuemei Pu
  - College of Chemistry, Sichuan University, Chengdu 610064, People's Republic of China
- Yanzhi Guo
  - College of Chemistry, Sichuan University, Chengdu 610064, People's Republic of China
12
Zhang J, Ma Z, Kurgan L. Comprehensive review and empirical analysis of hallmarks of DNA-, RNA- and protein-binding residues in protein chains. Brief Bioinform 2017; 20:1250-1268. [DOI: 10.1093/bib/bbx168] [Citation(s) in RCA: 60] [Impact Index Per Article: 8.6] [Received: 08/30/2017] [Revised: 11/15/2017] [Indexed: 11/13/2022]
Abstract
Proteins interact with a variety of molecules including proteins and nucleic acids. We review a comprehensive collection of over 50 studies that analyze and/or predict these interactions. While the majority of these studies address solely protein–DNA or protein–RNA binding, only a few have a wider scope that covers both protein–protein and protein–nucleic acid binding. Our analysis reveals that binding residues are typically characterized by three hallmarks: relative solvent accessibility (RSA), evolutionary conservation and propensity of amino acids (AAs) for binding. Motivated by drawbacks of the prior studies, we perform a large-scale analysis to quantify and contrast the three hallmarks for DNA-, RNA- and protein-binding residues and (for the first time) for multi-ligand-binding residues that interact with DNA and proteins, and with RNA and proteins. Results generated on a well-annotated data set of over 23 000 proteins show that conservation of binding residues is higher for nucleic acid- than protein-binding residues. Multi-ligand-binding residues are more conserved and have higher RSA than single-ligand-binding residues. We empirically show that each hallmark, even predicted RSA, discriminates between binding and nonbinding residues, and that combining them improves discriminatory power for each of the five types of interactions. Linear scoring functions that combine these hallmarks offer good predictive performance of residue-level propensity for binding and provide intuitive interpretation of predictions. Better understanding of these residue-level interactions will facilitate development of methods that accurately predict binding in the exponentially growing databases of protein sequences.
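A linear scoring function of the kind described in this abstract can be sketched in a few lines of Python. This is only an illustration of the general form: the equal weights, the zero bias and the assumption that each hallmark is scaled to [0, 1] are placeholders, and the study's actual coefficients are not given in the abstract.

```python
from typing import Sequence


def binding_score(
    rsa: float,
    conservation: float,
    propensity: float,
    weights: Sequence[float] = (1.0, 1.0, 1.0),
    bias: float = 0.0,
) -> float:
    """Linear combination of the three hallmarks for one residue.

    Inputs are assumed to be scaled to [0, 1]; the default weights are placeholders.
    """
    w_rsa, w_cons, w_prop = weights
    return bias + w_rsa * rsa + w_cons * conservation + w_prop * propensity


# Two hypothetical residues: one exposed and conserved, one buried and variable.
exposed_conserved = binding_score(rsa=0.8, conservation=0.9, propensity=0.7)
buried_variable = binding_score(rsa=0.1, conservation=0.2, propensity=0.3)
```

Because the score is a weighted sum, each term shows directly how much a hallmark contributes to a residue's predicted propensity, which is the kind of intuitive interpretation the abstract refers to.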
13
Nithin C, Mukherjee S, Bahadur RP. A non-redundant protein-RNA docking benchmark version 2.0. Proteins 2016; 85:256-267. [PMID: 27862282 DOI: 10.1002/prot.25211] [Citation(s) in RCA: 24] [Impact Index Per Article: 3.0] [Received: 09/08/2016] [Revised: 10/27/2016] [Accepted: 11/08/2016] [Indexed: 12/23/2022]
Abstract
We present an updated version of the protein-RNA docking benchmark, which we first published four years ago. The non-redundant protein-RNA docking benchmark version 2.0 consists of 126 test cases, a threefold increase in number compared to its previous version. The present version consists of 21 unbound-unbound cases, of which, in 12 cases, the unbound RNAs are taken from another complex. It also consists of 95 unbound-bound cases where only the protein is available in the unbound state. In addition, we introduce 10 new bound-unbound cases where only the RNA is found in the unbound state. Based on the degree of conformational change of the interface residues upon complex formation, the benchmark is classified into 72 rigid-body cases, 25 semiflexible cases and 19 full flexible cases. It also covers a wide range of conformational flexibility, from small side chain movements to large domain swapping in protein structures, as well as flipping and restacking in RNA bases. This benchmark should provide the docking community with more test cases for evaluating rigid-body as well as flexible docking algorithms. It will also facilitate the development of new algorithms that require large training sets. The protein-RNA docking benchmark version 2.0 can be freely downloaded from http://www.csb.iitkgp.ernet.in/applications/PRDBv2. Proteins 2017; 85:256-267. © 2016 Wiley Periodicals, Inc.
Affiliation(s)
- Chandran Nithin
- Computational Structural Biology Lab, Department of Biotechnology, Indian Institute of Technology Kharagpur, 721302, India
- Sunandan Mukherjee
- Computational Structural Biology Lab, Department of Biotechnology, Indian Institute of Technology Kharagpur, 721302, India
- Ranjit Prasad Bahadur
- Computational Structural Biology Lab, Department of Biotechnology, Indian Institute of Technology Kharagpur, 721302, India
|
14
|
Newo ANS. Molecular modeling of the Plasmodium falciparum pre-mRNA splicing and nuclear export factor PfU52. Protein J 2015; 33:354-68. [PMID: 24861003 DOI: 10.1007/s10930-014-9566-x] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Indexed: 12/17/2022]
Abstract
UAP56/SUB2 is a DExD/H-box RNA helicase that is critically involved in pre-mRNA splicing and mRNA nuclear export. This helicase is broadly conserved and essential in many eukaryotic lineages, including protozoan and metazoan parasites. Previous research suggests that helicases from parasites could be promising drug targets for treating parasitic diseases. Accordingly, characterizing the structure and function of these proteins is of interest for structure-based, de novo design of new lead compounds. Here, we used homology modeling to construct a three-dimensional structure of PfU52 (PMDB ID: PM0079288), the Plasmodium falciparum ortholog of UAP56/SUB2, and explored the detailed architecture of its functional sites. Comparative in silico analysis revealed that although PfU52 shared many physicochemical, structural and dynamic similarities with its human homolog, it also displayed some unique features that could be exploited for drug design.
Affiliation(s)
- Alain N S Newo
- Beckman Research Institute of City of Hope, Duarte, CA, USA
|
15
|
Barik A, C N, Pilla SP, Bahadur RP. Molecular architecture of protein-RNA recognition sites. J Biomol Struct Dyn 2015; 33:2738-51. [PMID: 25562181 DOI: 10.1080/07391102.2015.1004652] [Citation(s) in RCA: 32] [Impact Index Per Article: 3.6] [Indexed: 10/24/2022]
Abstract
The molecular architecture of protein-RNA interfaces is analyzed using a non-redundant dataset of 152 protein-RNA complexes. We find that an average protein-RNA interface is smaller than an average protein-DNA interface but larger than an average protein-protein interface. Among the different classes of protein-RNA complexes, interfaces with tRNA are the largest, while the interfaces with single-stranded RNA are the smallest. Significantly, RNA contributes more to the interface area than its partner protein. Moreover, unlike protein-protein interfaces, where the side chain contributes less to the interface area than the main chain, the main chain and side chain contributions are flipped in protein-RNA interfaces. We find that the protein surface in contact with the RNA in protein-RNA complexes is better packed than that in contact with the DNA in protein-DNA complexes, but more loosely packed than that in contact with the protein in protein-protein complexes. Shape complementarity and electrostatic potential are the two major factors that determine the specificity of the protein-RNA interaction. We find that the H-bond density at protein-RNA interfaces is similar to that of protein-DNA interfaces but higher than that of protein-protein interfaces. Unlike protein-DNA interfaces, where the deoxyribose has little role in intermolecular H-bonds, the ribose in RNA plays a significant role in protein-RNA H-bonds owing to the presence of an oxygen atom at the 2' position. We find that besides H-bonds, salt bridges and stacking interactions also play a significant role in stabilizing protein-nucleic acid interfaces; however, their contribution at protein-protein interfaces is insignificant.
Affiliation(s)
- Amita Barik
- Computational Structural Biology Laboratory, Department of Biotechnology, Indian Institute of Technology Kharagpur, Kharagpur, India
- Nithin C
- Computational Structural Biology Laboratory, Department of Biotechnology, Indian Institute of Technology Kharagpur, Kharagpur, India
- Smita P Pilla
- Computational Structural Biology Laboratory, Department of Biotechnology, Indian Institute of Technology Kharagpur, Kharagpur, India
- Ranjit Prasad Bahadur
- Computational Structural Biology Laboratory, Department of Biotechnology, Indian Institute of Technology Kharagpur, Kharagpur, India
|
16
|
Gupta S, Chavan S, Deobagkar DN, Deobagkar DD. Bio/chemoinformatics in India: an outlook. Brief Bioinform 2014; 16:710-31. [PMID: 25159593 DOI: 10.1093/bib/bbu028] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.4] [Received: 05/21/2014] [Accepted: 07/28/2014] [Indexed: 12/25/2022] Open
Abstract
With the advent of significant establishment and development of Internet facilities and computational infrastructure, an overview of bio/chemoinformatics is presented along with its multidisciplinary facts, promises and challenges. The Government of India has paved the way for more profound research in the biological field through computational facilities and schemes/projects for collaborating with scientists from different disciplines. Simultaneously, the growth of available biomedical data has provided fresh insight into the nature of redundant and compensatory data. Today, bioinformatics research in India is characterized by powerful grid computing systems, a great variety of biological questions addressed and close collaborations between scientists and clinicians, with a full spectrum of focuses ranging from database building and methods development to biological discoveries. This outlook provides a resourceful platform highlighting the funding agencies, institutes and industries working in this direction, which would certainly be of great help to students seeking a career in bioinformatics. In short, this review highlights the current bio/chemoinformatics trends, education, status, diverse applicability and demands for further development.
|
17
|
Abstract
We investigate the role of water molecules in 89 protein–RNA complexes taken from the Protein Data Bank. Those with tRNA and single-stranded RNA are less hydrated than those with duplex RNA or ribosomal proteins. Protein–RNA interfaces are hydrated less than protein–DNA interfaces, but more than protein–protein interfaces. The majority of the waters at protein–RNA interfaces make multiple H-bonds; however, a fraction do not make any. Those making H-bonds prefer the polar groups of the RNA over those of its partner protein. The spatial distribution of waters makes interfaces with ribosomal proteins and single-stranded RNA relatively ‘dry’ compared with interfaces with tRNA and duplex RNA. In contrast to protein–DNA interfaces, mainly due to the presence of the 2′OH, the ribose in protein–RNA interfaces is hydrated more than the phosphate or the bases. The minor groove in protein–RNA interfaces is hydrated more than the major groove, while in protein–DNA interfaces the reverse holds. The strands make the highest number of water-mediated H-bonds per unit interface area, followed by the helices and the non-regular structures. The preserved waters at protein–RNA interfaces make a higher number of H-bonds than the other waters. Preserved waters contribute toward the affinity in protein–RNA recognition and should be treated carefully when engineering protein–RNA interfaces.
Affiliation(s)
- Amita Barik
- Department of Biotechnology, Indian Institute of Technology Kharagpur, Kharagpur-721302, India
- Ranjit Prasad Bahadur
- Department of Biotechnology, Indian Institute of Technology Kharagpur, Kharagpur-721302, India
|