1
Matiz-González JM, Pardo-Rodriguez D, Puerta CJ, Requena JM, Nocua PA, Cuervo C. Exploring the functionality and conservation of Alba proteins in Trypanosoma cruzi: A focus on biological diversity and RNA binding ability. Int J Biol Macromol 2024; 272:132705. [PMID: 38810850 DOI: 10.1016/j.ijbiomac.2024.132705] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/07/2023] [Revised: 05/07/2024] [Accepted: 05/26/2024] [Indexed: 05/31/2024]
Abstract
Trypanosoma cruzi is the causative agent of Chagas disease, as well as a trypanosomatid parasite with a complex biological cycle that requires precise mechanisms for regulating gene expression. In Trypanosomatidae, gene regulation occurs mainly at the mRNA level through the recognition of cis elements by RNA-binding proteins (RBPs). Alba family members are ubiquitous DNA/RNA-binding proteins with representatives in trypanosomatid parasites functionally related to gene expression regulation. Although T. cruzi possesses two groups of Alba proteins (Alba1/2 and Alba30/40), their functional role remains poorly understood. Thus, herein, a characterization of T. cruzi Alba (TcAlba) proteins was undertaken. Physicochemical, structural, and phylogenetic analysis of TcAlba showed features compatible with RBPs, such as hydrophilicity, RBP domains/motifs, and evolutionary conservation of the Alba-domain, mainly regarding other trypanosomatid Alba. However, in silico RNA interaction analysis of T. cruzi Alba proteins showed that TcAlba30/40 proteins, but not TcAlba1/2, would directly interact with the assayed RNA molecules, suggesting that these two groups of TcAlba proteins have different targets. Given the marked differences existing between both T. cruzi Alba groups (TcAlba1/2 and TcAlba30/40), regarding sequence divergence, RNA binding potential, and life-cycle expression patterns, we suggest that they would be involved in different biological processes.
Affiliation(s)
- J Manuel Matiz-González
- Grupo de Enfermedades Infecciosas, Facultad de Ciencias, Pontificia Universidad Javeriana, 110231 Bogotá, Colombia
- Daniel Pardo-Rodriguez
- Grupo de Fitoquímica, Facultad de Ciencias, Pontificia Universidad Javeriana, 110231 Bogotá, Colombia; Metabolomics Core Facility, Vice-Presidency for Research, Universidad de los Andes, 111711 Bogotá, Colombia
- Concepción J Puerta
- Grupo de Enfermedades Infecciosas, Facultad de Ciencias, Pontificia Universidad Javeriana, 110231 Bogotá, Colombia
- José M Requena
- Centro de Biología Molecular Severo Ochoa (CSIC-UAM), Universidad Autónoma de Madrid, 28049 Madrid, Spain
- Paola A Nocua
- Grupo de Enfermedades Infecciosas, Facultad de Ciencias, Pontificia Universidad Javeriana, 110231 Bogotá, Colombia
- Claudia Cuervo
- Grupo de Enfermedades Infecciosas, Facultad de Ciencias, Pontificia Universidad Javeriana, 110231 Bogotá, Colombia
2
Sun C, Feng Y. EPDRNA: A Model for Identifying DNA-RNA Binding Sites in Disease-Related Proteins. Protein J 2024; 43:513-521. [PMID: 38491248 DOI: 10.1007/s10930-024-10183-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 02/02/2024] [Indexed: 03/18/2024]
Abstract
Protein-DNA and protein-RNA interactions are involved in many biological processes and regulate many cellular functions. Moreover, they are related to many human diseases. To understand the molecular mechanism of protein-DNA binding and protein-RNA binding, it is important to identify which residues in the protein sequence bind to DNA and RNA. At present, there are few methods for specifically identifying the DNA- and RNA-binding sites of disease-related proteins. In this study, we combined four machine learning algorithms into an ensemble classifier (EPDRNA) to predict DNA and RNA binding sites in disease-related proteins. The dataset used in the model was collated from the UniProt and PDB databases, and PSSM, physicochemical properties and amino acid type were used as features. EPDRNA adopted soft voting and achieved a best AUC value of 0.73 for DNA binding sites and a best AUC value of 0.71 for RNA binding sites in 10-fold cross-validation on the training sets. To further verify the performance of the model, we assessed EPDRNA for the prediction of DNA-binding sites and RNA-binding sites on the independent test dataset. EPDRNA achieved 85% recall and 25% precision on the protein-DNA interaction independent test set, and 82% recall and 27% precision on the protein-RNA interaction independent test set. The online EPDRNA webserver is freely available at http://www.s-bioinformatics.cn/epdrna.
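The soft-voting step and the recall/precision figures mentioned in this abstract can be illustrated with a minimal sketch. It assumes each base classifier already outputs a per-residue binding probability; the function names, the 0.5 threshold, and the example numbers are hypothetical and are not taken from EPDRNA itself.

```python
from statistics import mean

def soft_vote(prob_lists, threshold=0.5):
    """Average per-residue binding probabilities across classifiers, then threshold.

    prob_lists: one list of probabilities per classifier, all of equal length.
    Returns (averaged probabilities, 0/1 labels).
    """
    n = len(prob_lists[0])
    if any(len(p) != n for p in prob_lists):
        raise ValueError("all classifiers must score the same residues")
    averaged = [mean(p[i] for p in prob_lists) for i in range(n)]
    labels = [1 if a >= threshold else 0 for a in averaged]
    return averaged, labels

def precision_recall(predicted, actual):
    """Precision and recall for 0/1 label lists of equal length."""
    tp = sum(1 for p, a in zip(predicted, actual) if p == 1 and a == 1)
    fp = sum(1 for p, a in zip(predicted, actual) if p == 1 and a == 0)
    fn = sum(1 for p, a in zip(predicted, actual) if p == 0 and a == 1)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall

# Hypothetical outputs of four classifiers for two residues.
scores = [[0.9, 0.2], [0.7, 0.4], [0.8, 0.1], [0.6, 0.3]]
averaged, labels = soft_vote(scores)
print(labels)
```

A high recall paired with a low precision, as reported for the independent test sets, corresponds to a predictor that recovers most true binding residues while also flagging many non-binding ones.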
Affiliation(s)
- CanZhuang Sun
- College of Science, Inner Mongolia Agriculture University, Hohhot, 010018, People's Republic of China
- YongE Feng
- College of Science, Inner Mongolia Agriculture University, Hohhot, 010018, People's Republic of China
3
Sabei A, Hognon C, Martin J, Frezza E. Dynamics of Protein-RNA Interfaces Using All-Atom Molecular Dynamics Simulations. J Phys Chem B 2024; 128:4865-4886. [PMID: 38740056 DOI: 10.1021/acs.jpcb.3c07698] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/16/2024]
Abstract
Facing the current challenges posed by human diseases requires an understanding of cell machinery at the molecular level. The interplay between proteins and RNA, and hence protein-RNA interactions, is key to any physiological phenomenon. To understand these interactions, many experimental techniques have been developed, spanning a very wide range of spatial and temporal resolutions. In particular, knowledge of the three-dimensional structures of protein-RNA complexes provides structural, mechanical, and dynamical information essential to understanding their functions. To gain insight into the dynamics of protein-RNA complexes, we carried out all-atom molecular dynamics simulations in explicit solvent on nine protein-RNA complexes with different functions and interface sizes, taking into account the bound and unbound forms. First, we characterized structural changes upon binding and, for the RNA part, the change in the puckering. Second, we extensively analyzed the interfaces, their dynamics and structural properties, and the structural waters involved in the binding, as well as the contacts mediated by them. Based on our analysis, the interfaces rearranged during the simulation time, showing alternative and stable residue-residue contacts with respect to the experimental structure.
Affiliation(s)
- Afra Sabei
- Université Paris Cité, CiTCoM, CNRS, Paris F-75006, France
- Cécilia Hognon
- Université Paris Cité, CiTCoM, CNRS, Paris F-75006, France
- Juliette Martin
- Univ Lyon, Université Claude Bernard Lyon 1, CNRS, UMR 5086 MMSB, Lyon 69367, France
- Laboratory of Biology and Modeling of the Cell, Université de Lyon, ENS de Lyon, Université Claude Bernard, CNRS UMR 5239, Inserm U1293, Lyon 69367, France
- Elisa Frezza
- Université Paris Cité, CiTCoM, CNRS, Paris F-75006, France
4
Agarwal A, Kant S, Bahadur RP. Efficient mapping of RNA-binding residues in RNA-binding proteins using local sequence features of binding site residues in protein-RNA complexes. Proteins 2023; 91:1361-1379. [PMID: 37254800 DOI: 10.1002/prot.26528] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/07/2022] [Revised: 04/13/2023] [Accepted: 05/02/2023] [Indexed: 06/01/2023]
Abstract
Protein-RNA interactions play vital roles in a plethora of biological processes such as regulation of gene expression, protein synthesis, mRNA processing and biogenesis. Identification of RNA-binding residues (RBRs) in proteins is essential to understand RNA-mediated protein functioning, to perform site-directed mutagenesis and to develop novel targeted drug therapies. Moreover, the extensive gap between sequence and structural data restricts the identification of binding sites in unsolved structures. However, efficient use of computational methods that require only sequence to identify binding residues can bridge this huge sequence-structure gap. In this study, we have extensively studied the protein-RNA interface in known RNA-binding proteins (RBPs). We find that the interface is highly enriched in basic and polar residues, with Gly being the most common interface neighbor. We investigated several amino acid features and developed a method to predict putative RBRs from amino acid sequence. We have implemented a balanced random forest (BRF) classifier with local residue features of protein sequences for prediction. With 5-fold cross-validation, the sequence-pattern-derived dipeptide-composition-based BRF model (DCP-BRF) resulted in an accuracy of 87.9%, specificity of 88.8%, sensitivity of 82.2%, Matthews correlation coefficient of 0.60 and AUC of 0.93, performing better than a few existing methods. We further validated our prediction model on known human RBPs through RBR prediction and could map ~54% of them. Further, knowledge of binding site preferences obtained from computational predictions combined with experimental validations of potential RNA binding sites can enhance our understanding of protein-RNA interactions. This may serve to accelerate investigations on functional roles of many novel RBPs.
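The accuracy, sensitivity, specificity and Matthews correlation coefficient quoted in this abstract are all derived from a confusion matrix. A minimal sketch of the standard formulas follows; the function name and the counts in the usage line are illustrative assumptions, not data from the paper.

```python
import math

def binary_metrics(tp, tn, fp, fn):
    """Accuracy, sensitivity, specificity and MCC from confusion-matrix counts."""
    total = tp + tn + fp + fn
    accuracy = (tp + tn) / total if total else 0.0
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0   # recall on binding residues
    specificity = tn / (tn + fp) if (tn + fp) else 0.0   # recall on non-binding residues
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0  # 0.0 when a margin is empty
    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "mcc": mcc,
    }

# Hypothetical counts, chosen only to exercise the formulas.
print(binary_metrics(tp=40, tn=45, fp=5, fn=10))
```

Because binding residues are usually a minority class, MCC is often reported alongside accuracy: it uses all four cells of the confusion matrix and is less inflated by the majority class.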
Affiliation(s)
- Ankita Agarwal
- School of Bio Science, Indian Institute of Technology Kharagpur, Kharagpur, India
- Computational Structural Biology Laboratory, Department of Biotechnology, Indian Institute of Technology Kharagpur, Kharagpur, India
- Shri Kant
- Computational Structural Biology Laboratory, Department of Biotechnology, Indian Institute of Technology Kharagpur, Kharagpur, India
- Ranjit Prasad Bahadur
- Computational Structural Biology Laboratory, Department of Biotechnology, Indian Institute of Technology Kharagpur, Kharagpur, India
5
Tubiana T, Sillitoe I, Orengo C, Reuter N. Dissecting peripheral protein-membrane interfaces. PLoS Comput Biol 2022; 18:e1010346. [PMID: 36516231 PMCID: PMC9797079 DOI: 10.1371/journal.pcbi.1010346] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 07/01/2022] [Revised: 12/28/2022] [Accepted: 11/24/2022] [Indexed: 12/15/2022] Open
Abstract
Peripheral membrane proteins (PMPs) include a wide variety of proteins that share the ability to bind transiently to the chemically complex interfacial region of membranes through their interfacial binding site (IBS). In contrast to protein-protein or protein-DNA/RNA interfaces, peripheral protein-membrane interfaces are poorly characterized. We collected a dataset of PMP domains representative of the variety of PMP functions: membrane-targeting domains (Annexin, C1, C2, discoidin C2, PH, PX), enzymes (PLA, PLC/D) and lipid-transfer proteins (START). The dataset contains 1328 experimental structures and 1194 AlphaFold models. We mapped the amino acid composition and structural patterns of the IBS of each protein in this dataset, and evaluated which were more likely to be found at the IBS compared to the rest of the domains' accessible surface. In agreement with earlier work, we find that about two thirds of the PMPs in the dataset have protruding hydrophobes (Leu, Ile, Phe, Tyr, Trp and Met) at their IBS. The three aromatic amino acids Trp, Tyr and Phe are a hallmark of PMP IBSs regardless of whether they protrude on loops or not. This is also the case for lysines but not arginines, suggesting that, unlike for Arg-rich membrane-active peptides, the less membrane-disruptive lysine is preferred in PMPs. Another striking observation was the over-representation of glycines at the IBS of PMPs compared to the rest of their surface, possibly providing IBS loops with the much-needed flexibility to insert between membrane lipids. The analysis of the 9 superfamilies revealed amino acid distribution patterns in agreement with their known functions and membrane-binding mechanisms. Besides revealing novel amino acid patterns at protein-membrane interfaces, our work contributes a new PMP dataset and an analysis pipeline that can be further built upon for future studies of PMP properties, or for developing PMP prediction tools using, for example, machine learning approaches.
Affiliation(s)
- Thibault Tubiana
- Department of Chemistry, University of Bergen, Bergen, Norway
- Computational Biology Unit, University of Bergen, Bergen, Norway
- Ian Sillitoe
- Department of Structural and Molecular Biology, University College London, London, United Kingdom
- Christine Orengo
- Department of Structural and Molecular Biology, University College London, London, United Kingdom
- Nathalie Reuter
- Department of Chemistry, University of Bergen, Bergen, Norway
- Computational Biology Unit, University of Bergen, Bergen, Norway
6
Shema Mugisha C, Dinh T, Kumar A, Tenneti K, Eschbach JE, Davis K, Gifford R, Kvaratskhelia M, Kutluay SB. Emergence of Compensatory Mutations Reveals the Importance of Electrostatic Interactions between HIV-1 Integrase and Genomic RNA. mBio 2022; 13:e0043122. [PMID: 35975921 PMCID: PMC9601147 DOI: 10.1128/mbio.00431-22] [Citation(s) in RCA: 7] [Impact Index Per Article: 3.5] [Received: 03/21/2022] [Accepted: 07/27/2022] [Indexed: 01/11/2023] Open
Abstract
HIV-1 integrase (IN) has a noncatalytic function in virion maturation through its binding to the viral RNA genome (gRNA). Class II IN substitutions inhibit IN-gRNA binding and result in the formation of virions with aberrant morphologies marked by mislocalization of the gRNA between the capsid lattice and the lipid envelope. These viruses are noninfectious due to a block at an early reverse transcription stage in target cells. HIV-1 IN utilizes basic residues within its C-terminal domain (CTD) to bind to the gRNA; however, the molecular nature of how these residues mediate gRNA binding and whether other regions of IN are involved remain unknown. To address this, we have isolated compensatory substitutions in the background of a class II IN mutant virus bearing R269A/K273A substitutions within the IN-CTD. We found that the nearby D256N and D270N compensatory substitutions restored the ability of IN to bind gRNA and led to the formation of mature infectious virions. Reinstating the local positive charge of the IN-CTD through individual D256R, D256K, D278R, and D279R substitutions was sufficient to specifically restore IN-gRNA binding and reverse transcription for the IN R269A/K273A as well as the IN R262A/R263A class II mutants. Structural modeling suggested that compensatory substitutions in the D256 residue created an additional interaction interface for gRNA binding, whereas other substitutions acted locally within the unstructured C-terminal tail of IN. Taken together, our findings highlight the essential role of CTD in gRNA binding and reveal the importance of pliable electrostatic interactions between the IN-CTD and the gRNA.
IMPORTANCE In addition to its catalytic function, HIV-1 integrase (IN) binds to the viral RNA genome (gRNA) through positively charged residues (i.e., R262, R263, R269, K273) within its C-terminal domain (CTD) and regulates proper virion maturation. Mutation of these residues results in the formation of morphologically aberrant viruses blocked at an early reverse transcription stage in cells. Here we show that compensatory substitutions in nearby negatively charged aspartic acid residues (i.e., D256N, D270N) restore the ability of IN to bind gRNA for these mutant viruses and result in the formation of accurately matured infectious virions. Similarly, individual charge reversal substitutions at D256 as well as other nearby positions (i.e., D278, D279) are all sufficient to enable the respective IN mutants to bind gRNA, and subsequently restore reverse transcription and virion infectivity. Taken together, our findings reveal the importance of highly pliable electrostatic interactions in IN-gRNA binding.
Affiliation(s)
- Christian Shema Mugisha
- Department of Molecular Microbiology, Washington University School of Medicine, Saint Louis, Missouri, USA
- Tung Dinh
- Division of Infectious Diseases, University of Colorado School of Medicine, Aurora, Colorado, USA
- Abhishek Kumar
- Department of Molecular Microbiology, Washington University School of Medicine, Saint Louis, Missouri, USA
- Kasyap Tenneti
- Department of Molecular Microbiology, Washington University School of Medicine, Saint Louis, Missouri, USA
- Jenna E. Eschbach
- Department of Molecular Microbiology, Washington University School of Medicine, Saint Louis, Missouri, USA
- Keanu Davis
- Department of Molecular Microbiology, Washington University School of Medicine, Saint Louis, Missouri, USA
- Robert Gifford
- MRC-University of Glasgow Centre for Virus Research, Bearsden, Glasgow, United Kingdom
- Mamuka Kvaratskhelia
- Division of Infectious Diseases, University of Colorado School of Medicine, Aurora, Colorado, USA
- Sebla B. Kutluay
- Department of Molecular Microbiology, Washington University School of Medicine, Saint Louis, Missouri, USA
7
杨 爽. Analysis of Residue Interface Preference in Protein-DNA Complexes and Its Application in Recognition of Binding Interface. Biophysics (Nagoya-shi) 2022. [DOI: 10.12677/biphy.2022.104006] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/25/2022] Open
8
A comparative analysis of machine learning classifiers for predicting protein-binding nucleotides in RNA sequences. Comput Struct Biotechnol J 2022; 20:3195-3207. [PMID: 35832617 PMCID: PMC9249596 DOI: 10.1016/j.csbj.2022.06.036] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/23/2022] [Revised: 06/14/2022] [Accepted: 06/14/2022] [Indexed: 11/24/2022] Open
Abstract
RNAs are master players in various cellular and biological processes, and RNA-protein interactions are vital for the proper functioning of cellular machinery. Knowledge of binding sites is crucial to decipher their functional implications. RNA NC-triplet and NC-quartet features could give reasonably high performance. The RF model outperformed other machine learning classifiers with 85% accuracy and 0.93 AUC and performed better than a few existing methods. An online webserver, “Nucpred”, was developed with the trained model and is freely accessible to the scientific community.
RNA-protein interactions play vital roles in driving the cellular machineries. Despite significant involvement in several biological processes, the underlying molecular mechanism of RNA-protein interactions is still elusive. This may be due to the experimental difficulties in solving co-crystallized RNA-protein complexes. Inherent flexibility of RNA molecules to adopt different conformations makes them functionally diverse. Their interactions with protein have implications in RNA disease biology. Thus, study of binding interfaces can provide a mechanistic insight of the molecular functioning and aberrations caused due to altered interactions. Moreover, high-throughput sequencing technologies have generated huge sequence data compared to available structural data of RNA-protein complexes. In such a scenario, efficient computational algorithms are required for identification of protein-binding interfaces of RNA in the absence of known structures. We have investigated several machine learning classifiers and various features derived from nucleotide sequences to identify protein-binding nucleotides in RNA. We achieve best performance with nucleotide-triplet and nucleotide-quartet feature-based random forest models. An overall accuracy of 84.8%, sensitivity of 83.2%, specificity of 86.1%, MCC of 0.70 and AUC of 0.93 is achieved. We have further implemented the developed models in a user-friendly webserver “Nucpred”, which is freely accessible at “http://www.csb.iitkgp.ac.in/applications/Nucpred/index”.
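To make the nucleotide-triplet and nucleotide-quartet features concrete, here is a minimal sketch of k-mer frequency features computed over a window around one nucleotide. The window size, the normalisation, and the function names are assumptions made for illustration; the abstract does not specify how Nucpred encodes its features.

```python
from itertools import product

ALPHABET = "ACGU"

def kmer_counts(window, k):
    """Count every possible k-mer over ACGU in a sequence window."""
    counts = {"".join(p): 0 for p in product(ALPHABET, repeat=k)}
    for i in range(len(window) - k + 1):
        kmer = window[i:i + k]
        if kmer in counts:          # skip k-mers with ambiguous symbols such as N
            counts[kmer] += 1
    return counts

def nucleotide_features(seq, center, flank=3, k=3):
    """k-mer frequency vector for the window centred on position `center`.

    flank: assumed number of neighbours taken on each side (clipped at the ends).
    k=3 gives 64 triplet features, k=4 gives 256 quartet features.
    """
    seq = seq.upper().replace("T", "U")
    start = max(0, center - flank)
    end = min(len(seq), center + flank + 1)
    window = seq[start:end]
    counts = kmer_counts(window, k)
    total = max(1, len(window) - k + 1)
    return [counts[key] / total for key in sorted(counts)]

# Hypothetical RNA fragment; the vector would feed a classifier such as a random forest.
features = nucleotide_features("GGAUCCAUG", center=4)
print(len(features))
```

Each nucleotide thus becomes a fixed-length vector regardless of sequence length, which is what allows sequence-only classifiers to be trained without structural data.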
9
Kagra D, Jangra R, Sharma P. Exploring the Nature of Hydrogen Bonding between RNA and Proteins: A Comprehensive Analysis of RNA : Protein Complexes. Chemphyschem 2021; 23:e202100731. [PMID: 34747094 DOI: 10.1002/cphc.202100731] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/11/2021] [Revised: 11/02/2021] [Indexed: 11/08/2022]
Abstract
A nonredundant dataset of ∼300 high (up to 2.5 Å) resolution X-ray structures of RNA:protein complexes was analyzed for hydrogen bonds between amino-acid residues and canonical ribonucleotides (rNs). The identified 17100 contacts were classified based on the identity (rA, rC, rG or rU) and interacting fragment (base, sugar, or ribose) of the rN, the nature (polar or nonpolar) and interacting moiety (main chain or side chain) of the amino-acid residue, as well as the rN and amino-acid atoms participating in the hydrogen bonding. 80 possible hydrogen-bonding combinations (4 (rNs) × 20 (amino acids)) involve a wide variety of RNA and protein types and are present in multiple occurrences in almost all PDB files. Comparison with the analogously selected DNA:protein complexes reveals that the absence of the 2'-OH group in DNA mainly accounts for the differences in DNA:protein and RNA:protein hydrogen bonding. Search for intrinsically stable base:amino acid pairs containing single or multiple hydrogen bonds reveals 37 unique pairs, which may act as well-defined RNA:protein interaction motifs. Overall, our work collectively analyzes the largest set of nucleic acid-protein hydrogen bonds to date, and therefore highlights several trends that may help frame structural rules governing the physicochemical characteristics of RNA:protein recognition.
Affiliation(s)
- Deepika Kagra
- Computational Biochemistry Laboratory, Department of Chemistry and Centre for Advanced Studies in Chemistry, Panjab University, Chandigarh, 160014, India
- Raman Jangra
- Computational Biochemistry Laboratory, Department of Chemistry and Centre for Advanced Studies in Chemistry, Panjab University, Chandigarh, 160014, India
- Purshotam Sharma
- Computational Biochemistry Laboratory, Department of Chemistry and Centre for Advanced Studies in Chemistry, Panjab University, Chandigarh, 160014, India
10
Jiang Z, Xiao SR, Liu R. Dissecting and predicting different types of binding sites in nucleic acids based on structural information. Brief Bioinform 2021; 23:6384399. [PMID: 34624074 PMCID: PMC8769709 DOI: 10.1093/bib/bbab411] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 07/07/2021] [Revised: 08/26/2021] [Accepted: 09/07/2021] [Indexed: 12/16/2022] Open
Abstract
The biological functions of DNA and RNA generally depend on their interactions with other molecules, such as small ligands, proteins and nucleic acids. However, our knowledge of the nucleic acid binding sites for different interaction partners is very limited, and identification of these critical binding regions is not a trivial work. Herein, we performed a comprehensive comparison between binding and nonbinding sites and among different categories of binding sites in these two nucleic acid classes. From the structural perspective, RNA may interact with ligands through forming binding pockets and contact proteins and nucleic acids using protruding surfaces, while DNA may adopt regions closer to the middle of the chain to make contacts with other molecules. Based on structural information, we established a feature-based ensemble learning classifier to identify the binding sites by fully using the interplay among different machine learning algorithms, feature spaces and sample spaces. Meanwhile, we designed a template-based classifier by exploiting structural conservation. The complementarity between the two classifiers motivated us to build an integrative framework for improving prediction performance. Moreover, we utilized a post-processing procedure based on the random walk algorithm to further correct the integrative predictions. Our unified prediction framework yielded promising results for different binding sites and outperformed existing methods.
Affiliation(s)
- Zheng Jiang
- College of Informatics, Huazhong Agricultural University, Wuhan, P. R. China
- Si-Rui Xiao
- College of Informatics, Huazhong Agricultural University, Wuhan, P. R. China
- Rong Liu
- College of Informatics, Huazhong Agricultural University, Wuhan, P. R. China
11
Wilson KA, Kung RW, D'souza S, Wetmore SD. Anatomy of noncovalent interactions between the nucleobases or ribose and π-containing amino acids in RNA-protein complexes. Nucleic Acids Res 2021; 49:2213-2225. [PMID: 33544852 PMCID: PMC7913691 DOI: 10.1093/nar/gkab008] [Citation(s) in RCA: 14] [Impact Index Per Article: 4.7] [Received: 12/16/2020] [Accepted: 01/22/2021] [Indexed: 01/07/2023] Open
Abstract
A set of >300 nonredundant high-resolution RNA–protein complexes were rigorously searched for π-contacts between an amino acid side chain (W, H, F, Y, R, E and D) and an RNA nucleobase (denoted π–π interaction) or ribose moiety (denoted sugar–π). The resulting dataset of >1500 RNA–protein π-contacts were visually inspected and classified based on the interaction type, and amino acids and RNA components involved. More than 80% of structures searched contained at least one RNA–protein π-interaction, with π–π contacts making up 59% of the identified interactions. RNA–protein π–π and sugar–π contacts exhibit a range in the RNA and protein components involved, relative monomer orientations and quantum mechanically predicted binding energies. Interestingly, π–π and sugar–π interactions occur more frequently with RNA (4.8 contacts/structure) than DNA (2.6). Moreover, the maximum stability is greater for RNA–protein contacts than DNA–protein interactions. In addition to highlighting distinct differences between RNA and DNA–protein binding, this work has generated the largest dataset of RNA–protein π-interactions to date, thereby underscoring that RNA–protein π-contacts are ubiquitous in nature, and key to the stability and function of RNA–protein complexes.
Affiliation(s)
- Katie A Wilson
- Department of Chemistry and Biochemistry, University of Lethbridge, 4401 University Drive West, Lethbridge, Alberta T1K 3M4, Canada
- Ryan W Kung
- Department of Chemistry and Biochemistry, University of Lethbridge, 4401 University Drive West, Lethbridge, Alberta T1K 3M4, Canada
- Simmone D'souza
- Department of Chemistry and Biochemistry, University of Lethbridge, 4401 University Drive West, Lethbridge, Alberta T1K 3M4, Canada
- Stacey D Wetmore
- Department of Chemistry and Biochemistry, University of Lethbridge, 4401 University Drive West, Lethbridge, Alberta T1K 3M4, Canada
12
Kagra D, Prabhakar PS, Sharma KD, Sharma P. Structural Patterns and Stabilities of Hydrogen-Bonded Pairs Involving Ribonucleotide Bases and Arginine, Glutamic Acid, or Glutamine Residues of Proteins from Quantum Mechanical Calculations. ACS Omega 2020; 5:3612-3623. [PMID: 32118177 PMCID: PMC7045552 DOI: 10.1021/acsomega.9b04083] [Citation(s) in RCA: 9] [Impact Index Per Article: 2.3] [Received: 12/01/2019] [Accepted: 01/28/2020] [Indexed: 06/10/2023]
Abstract
Ribonucleotide:protein interactions play crucial roles in a number of biological processes. Unlike the RNA:protein interface where van der Waals contacts are prevalent, the recognition of a single ribonucleotide such as ATP by a protein occurs predominantly through hydrogen-bonding interactions. As a first step toward understanding the role of hydrogen bonding in ribonucleotide:protein recognition, the present work employs density functional theory to provide a detailed quantum-mechanical analysis of the structural and energetic characteristics of 18 unique hydrogen-bonded pairs involving the nucleobase/nucleoside moiety of four canonical ribonucleotides and the side chains of three polar amino-acid residues (arginine, glutamine, and glutamic acid) of proteins. In addition, we model five new pairs that are till now not observed in crystallographically identified ribonucleotide:protein complexes but may be identified in complexes crystallized in the future. We critically examine the characteristics of each pair in its ribonucleotide:protein crystal structure occurrence and (gas phase and water phase) optimized intrinsic structure. We further evaluated the interaction energy of each pair and characterized the associated hydrogen bonds using a number of quantum mechanics-based relationships including natural bond orbital analysis, quantum theory atoms in molecules analysis, Iogansen relationships, Nikolaienko-Bulavin-Hovorun relationships, and noncovalent interaction-reduced density gradient analysis. Our analyses reveal rich variability in hydrogen bonds in the crystallographic as well as intrinsic structure of each pair, which includes conventional O/N-H···N/O and C-H···O hydrogen bonds as well as donor/acceptor-bifurcated hydrogen bonds. Further, we identify five combinations of nucleobase and amino acid moieties; each of which exhibits at least two alternate (i.e., multimodal) structures that interact through the same nucleobase edge. 
In fact, one such pair exhibits four multimodal structures, one of which possesses unconventional "amino-acceptor" hydrogen bonding with comparable (−9.4 kcal mol⁻¹) strength to the corresponding conventional (i.e., amino:donor) structure (−9.2 kcal mol⁻¹). This points to the importance of amino-acceptor hydrogen bonds in RNA:protein interactions and suggests that such interactions must be considered in the future while studying the dynamics in the context of molecular recognition. Overall, our study provides preliminary insights into the intrinsic features of ribonucleotide:amino acid interactions, which may help frame a clearer picture of the molecular basis of RNA:protein recognition and further appreciate the role of such contacts in biology.
Affiliation(s)
- Deepika Kagra
- Computational
Biochemistry Laboratory, Department of Chemistry, and Centre for Advanced
Studies in Chemistry, Panjab University, Chandigarh 160014, India
| | - Preethi Seelam Prabhakar
- Center
for Computational Natural Sciences and Bioinformatics, International Institute of Information Technology
Hyderabad (IIIT-H), Gachibowli, Hyderabad, Telangana 500032, India
| | - Karan Deep Sharma
- Computational Biochemistry Laboratory, Department of Chemistry, and Centre for Advanced Studies in Chemistry, Panjab University, Chandigarh 160014, India
| | - Purshotam Sharma
- Computational Biochemistry Laboratory, Department of Chemistry, and Centre for Advanced Studies in Chemistry, Panjab University, Chandigarh 160014, India
| |
|
13
|
Yang Z, Deng X, Liu Y, Gong W, Li C. Analyses on clustering of the conserved residues at protein-RNA interfaces and its application in binding site identification. BMC Bioinformatics 2020; 21:57. [PMID: 32066366 PMCID: PMC7027071 DOI: 10.1186/s12859-020-3398-9] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Received: 07/19/2019] [Accepted: 02/07/2020] [Indexed: 12/26/2022] Open
Abstract
Background: The maintenance of protein structural stability requires cooperativity among spatially neighboring residues. Previous studies have shown that conserved residues tend to occur clustered together within enzyme active sites and protein-protein/DNA interfaces. It is possible that conserved residues form one or more local clusters in protein tertiary structures, as this can facilitate the formation of functional motifs. In this work, we systematically investigate the spatial distributions of conserved residues as well as hot spot residues within protein-RNA interfaces. Results: The analysis of 191 polypeptide chains from 160 complexes shows that the polypeptides interacting with tRNAs evolve relatively rapidly. A statistical analysis of residues in different regions shows that the interface residues are often more conserved, while the most conserved ones are those occurring in protein interiors, which maintain the stability of folded polypeptide chains. Additionally, we found that 77.8% of the interfaces have the conserved residues clustered within the entire interface regions. Applying the clustering characteristics to the identification of the real interface, there are 31.1% of cases where the real interfaces are ranked in the top 10% of 1000 randomly generated surface patches. In the conserved clusters, the preferred residues are the hydrophobic (Leu, Ile, Met), the aromatic (Tyr, Phe, Trp) and, interestingly, only one positively charged residue, Arg. For the hot spot residues, 51.5% of them are situated in the conserved residue clusters, and they are largely consistent with the preferred residue types in the conserved clusters. Conclusions: The protein-RNA interface residues are often more conserved than non-interface surface ones. The conserved interface residues are more spatially clustered than the interface residues as a whole. The high consistency between hot spot residue types and the preferred residue types in the conserved clusters has important implications for experimental alanine scanning mutagenesis studies. This work deepens the understanding of residue organization at protein-RNA interfaces and has potential applications in the identification of binding sites and hot spot residues.
Affiliation(s)
- Zhen Yang
- College of Life Science and Bioengineering, Beijing University of Technology, Beijing, 100124, China
| | - Xueqing Deng
- College of Life Science and Bioengineering, Beijing University of Technology, Beijing, 100124, China
| | - Yang Liu
- College of Life Science and Bioengineering, Beijing University of Technology, Beijing, 100124, China
| | - Weikang Gong
- College of Life Science and Bioengineering, Beijing University of Technology, Beijing, 100124, China
| | - Chunhua Li
- College of Life Science and Bioengineering, Beijing University of Technology, Beijing, 100124, China.
| |
|
14
|
Nithin C, Mukherjee S, Bahadur RP. A structure-based model for the prediction of protein-RNA binding affinity. RNA (NEW YORK, N.Y.) 2019; 25:1628-1645. [PMID: 31395671 PMCID: PMC6859855 DOI: 10.1261/rna.071779.119] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.4] [Received: 05/04/2019] [Accepted: 08/05/2019] [Indexed: 05/28/2023]
Abstract
Protein-RNA recognition is highly affinity-driven and regulates a wide array of cellular functions. In this study, we have curated a binding affinity data set of 40 protein-RNA complexes, for which at least one unbound partner is available in the docking benchmark. The data set covers a wide affinity range of eight orders of magnitude as well as four different structural classes. On average, we find the complexes with single-stranded RNA have the highest affinity, whereas the complexes with the duplex RNA have the lowest. Nevertheless, free energy gain upon binding is the highest for the complexes with ribosomal proteins and the lowest for the complexes with tRNA with an average of -5.7 cal/mol/Å² in the entire data set. We train regression models to predict the binding affinity from the structural and physicochemical parameters of protein-RNA interfaces. The best fit model with the lowest maximum error is provided with three interface parameters: relative hydrophobicity, conformational change upon binding and relative hydration pattern. This model has been used for predicting the binding affinity on a test data set, generated using mutated structures of yeast aspartyl-tRNA synthetase, for which experimentally determined ΔG values of 40 mutations are available. The predicted ΔGempirical values highly correlate with the experimental observations. The data set provided in this study should be useful for further development of the binding affinity prediction methods. Moreover, the model developed in this study enhances our understanding on the structural basis of protein-RNA binding affinity and provides a platform to engineer protein-RNA interfaces with desired affinity.
Affiliation(s)
- Chandran Nithin
- Computational Structural Biology Lab, Department of Biotechnology, Indian Institute of Technology Kharagpur, Kharagpur 721302, India
| | - Sunandan Mukherjee
- Computational Structural Biology Lab, Department of Biotechnology, Indian Institute of Technology Kharagpur, Kharagpur 721302, India
| | - Ranjit Prasad Bahadur
- Computational Structural Biology Lab, Department of Biotechnology, Indian Institute of Technology Kharagpur, Kharagpur 721302, India
| |
|
15
|
Deng L, Yang W, Liu H. PredPRBA: Prediction of Protein-RNA Binding Affinity Using Gradient Boosted Regression Trees. Front Genet 2019; 10:637. [PMID: 31428122 PMCID: PMC6688581 DOI: 10.3389/fgene.2019.00637] [Citation(s) in RCA: 16] [Impact Index Per Article: 3.2] [Received: 04/03/2019] [Accepted: 06/18/2019] [Indexed: 01/24/2023] Open
Abstract
Protein-RNA interactions play essential roles in many biological processes. Quantifying the binding affinity of protein-RNA complexes helps in understanding protein-RNA recognition mechanisms and in identifying strong binding partners. Because experimentally measured protein-RNA binding affinity data are still limited, there is a pressing demand for accurate and reliable computational approaches. In this paper, we propose a computational approach, PredPRBA, which can effectively predict protein-RNA binding affinity using gradient boosted regression trees. We build a dataset of protein-RNA binding affinity that includes 103 protein-RNA complex structures manually collected from related literature. Then, we generate 37 kinds of sequence and structural features and explore the relationship between the features and protein-RNA binding affinity. We find that the binding affinity mainly depends on the structure of RNA molecules. According to the type of RNA bound by the protein in each complex, we split the 103 protein-RNA complexes into six categories. For each category, we build a gradient boosted regression tree (GBRT) model based on the generated features. We perform a comprehensive evaluation of the proposed method on the binding affinity dataset using leave-one-out cross-validation. We show that PredPRBA achieves correlations ranging from 0.723 to 0.897 among the six categories, which is significantly better than other typical regression methods and the pioneer protein-RNA binding affinity predictor SPOT-Seq-RNA. In addition, a user-friendly web server has been developed to predict the binding affinity of protein-RNA complexes. The PredPRBA webserver is freely available at http://PredPRBA.denglab.org/.
Affiliation(s)
- Lei Deng
- School of Computer Science and Engineering, Central South University, Changsha, China.,School of Software, Xinjiang University, Urumqi, China
| | - Wenyi Yang
- School of Computer Science and Engineering, Central South University, Changsha, China
| | - Hui Liu
- Lab of Information Management, Changzhou University, Changzhou, China
| |
|
16
|
CAPRI enables comparison of evolutionarily conserved RNA interacting regions. Nat Commun 2019; 10:2682. [PMID: 31213602 PMCID: PMC6581911 DOI: 10.1038/s41467-019-10585-3] [Citation(s) in RCA: 25] [Impact Index Per Article: 5.0] [Received: 06/30/2018] [Accepted: 05/21/2019] [Indexed: 12/21/2022] Open
Abstract
RNA-protein complexes play essential regulatory roles at nearly all levels of gene expression. Using in vivo crosslinking and RNA capture, we report a comprehensive RNA-protein interactome in a metazoan at four levels of resolution: single amino acids, domains, proteins and multisubunit complexes. We devise CAPRI, a method to map RNA-binding domains (RBDs) by simultaneous identification of RNA interacting crosslinked peptides and peptides adjacent to such crosslinked sites. CAPRI identifies more than 3000 RNA proximal peptides in Drosophila and human proteins with more than 45% of them forming new interaction interfaces. The comparison of orthologous proteins enables the identification of evolutionary conserved RBDs in globular domains and intrinsically disordered regions (IDRs). By comparing the sequences of IDRs through evolution, we classify them based on the type of motif, accumulation of tandem repeats, conservation of amino acid composition and high sequence divergence. Comprehensive characterisation of RNA-protein interactions requires different levels of resolution. Here, the authors present an integrated mass spectrometry-based approach that allows them to define the Drosophila RNA-protein interactome from the level of multisubunit complexes down to the RNA-binding amino acid.
|
17
|
Pilla SP, Thomas A, Bahadur RP. Dissecting macromolecular recognition sites in ribosome: implication to its self-assembly. RNA Biol 2019; 16:1300-1312. [PMID: 31179876 DOI: 10.1080/15476286.2019.1629767] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.6] [Indexed: 12/27/2022] Open
Abstract
Interactions between macromolecules play a crucial role in ribosome assembly that follows a highly coordinated process involving RNA folding and binding of ribosomal proteins (r-proteins). Although extensive studies have been carried out to understand macromolecular interactions in ribosomes, most of them are confined to either large or small ribosomal-subunit of few species. A comparative analysis of macromolecular interactions across different domains is still missing. We have analyzed the structural and physicochemical properties of protein-protein (PP), protein-RNA (PR) and RNA-RNA (RR) interfaces in small and large subunits of ribosomes, as well as in between the two subunits. Additionally, we have also developed Random Forest (RF) classifier to catalog the r-proteins. We find significant differences as well as similarities in macromolecular recognition sites between ribosomal assemblies of prokaryotes and eukaryotes. PR interfaces are substantially larger and have more ionic interactions than PP and RR interfaces in both prokaryotes and eukaryotes. PP, PR and RR interfaces in eukaryotes are well packed compared to those in prokaryotes. However, the packing density between the large and the small subunit interfaces in the entire assembly is strikingly low in both prokaryotes and eukaryotes, indicating the periodic association and dissociation of the two subunits during the translation. The structural and physicochemical properties of PR interfaces are used to predict the r-proteins in the assembly pathway into early, intermediate and late binders using RF classifier with an accuracy of 80%. The results provide new insights into the classification of r-proteins in the assembly pathway.
Affiliation(s)
- Smita P Pilla
- Computational Structural Biology Laboratory, Department of Biotechnology, Indian Institute of Technology Kharagpur, Kharagpur, India
| | - Amal Thomas
- Computational Structural Biology Laboratory, Department of Biotechnology, Indian Institute of Technology Kharagpur, Kharagpur, India
| | - Ranjit Prasad Bahadur
- Computational Structural Biology Laboratory, Department of Biotechnology, Indian Institute of Technology Kharagpur, Kharagpur, India
| |
|
18
|
Structural mechanism for HIV-1 TAR loop recognition by Tat and the super elongation complex. Proc Natl Acad Sci U S A 2018; 115:12973-12978. [PMID: 30514815 DOI: 10.1073/pnas.1806438115] [Citation(s) in RCA: 45] [Impact Index Per Article: 7.5] [Indexed: 12/17/2022] Open
Abstract
Promoter-proximal pausing by RNA polymerase II (Pol II) is a key regulatory step in human immunodeficiency virus-1 (HIV-1) transcription and thus in the reversal of HIV latency. By binding to the nascent transactivating response region (TAR) RNA, HIV-1 Tat recruits the human super elongation complex (SEC) to the promoter and releases paused Pol II. Structural studies of TAR interactions have been largely focused on interactions between the TAR bulge and the arginine-rich motif (ARM) of Tat. Here, the crystal structure of the TAR loop in complex with Tat and the SEC core was determined at a 3.5-Å resolution. The bound TAR loop is stabilized by cross-loop hydrogen bonds. It makes structure-specific contacts with the side chains of the Cyclin T1 Tat-TAR recognition motif (TRM) and the zinc-coordinating loop of Tat. The TAR loop phosphate backbone forms electrostatic and VDW interactions with positively charged side chains of the CycT1 TRM. Mutational analysis showed that these interactions contribute importantly to binding affinity. The Tat ARM was present in the crystallized construct; however, it was not visualized in the electron density, and the TAR bulge was not formed in the RNA construct used in crystallization. Binding assays showed that TAR bulge-Tat ARM interactions contribute less to TAR binding affinity than TAR loop interactions with the CycT1 TRM and Tat core. Thus, the TAR loop evolved to make high-affinity interactions with the TRM while Tat has three roles: scaffolding and stabilizing the TRM, making specific interactions through its zinc-coordinating loop, and making electrostatic interactions through its ARM.
|
19
|
Hu W, Qin L, Li M, Pu X, Guo Y. Individually double minimum-distance definition of protein-RNA binding residues and application to structure-based prediction. J Comput Aided Mol Des 2018; 32:1363-1373. [PMID: 30478757 DOI: 10.1007/s10822-018-0177-z] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.8] [Received: 05/29/2018] [Accepted: 11/14/2018] [Indexed: 01/01/2023]
Abstract
Identifying protein-RNA binding residues is essential for understanding the mechanism of protein-RNA interactions. So far, rigid distance thresholds have commonly been used to define protein-RNA binding residues. However, after investigating 182 non-redundant protein-RNA complexes, we find that a rigid threshold is unsuitable for a certain number of complexes, since the distances between proteins and RNAs vary widely. In this work, a novel definition method is proposed based on a flexible distance cutoff. This method can fully consider the individual differences among complexes by setting a variable tolerance limit for protein-RNA interactions, i.e. the double minimum-distance, by which different distance thresholds are obtained for different complexes. To validate our method, a comprehensive comparison between our flexible method and traditional rigid methods was carried out in terms of interface structure, amino acid composition, interface area, interaction force, etc. The results indicate that this method is more reasonable because it incorporates the specificity of different complexes by recovering important residues lost by rigid distance methods and discarding some redundant residues. Finally, to further test our double minimum-distance definition strategy, we developed a classifier to predict the binding sites derived from our new method by using structural features and a random forest machine learning algorithm. The model achieved a satisfactory prediction performance, and the accuracy on independent data sets reaches 85.0%. To the best of our knowledge, it is the first prediction model to define positive and negative samples using a flexible cutoff. The comparison analysis and modeling results thus indicate that our method is a very promising strategy for more precisely defining protein-RNA binding sites.
Affiliation(s)
- Wen Hu
- College of Chemistry, Sichuan University, Chengdu, 610064, Sichuan, People's Republic of China
| | - Liu Qin
- College of Chemistry, Sichuan University, Chengdu, 610064, Sichuan, People's Republic of China
| | - Menglong Li
- College of Chemistry, Sichuan University, Chengdu, 610064, Sichuan, People's Republic of China
| | - Xuemei Pu
- College of Chemistry, Sichuan University, Chengdu, 610064, Sichuan, People's Republic of China
| | - Yanzhi Guo
- College of Chemistry, Sichuan University, Chengdu, 610064, Sichuan, People's Republic of China.
| |
|
20
|
Chen F, Sun H, Wang J, Zhu F, Liu H, Wang Z, Lei T, Li Y, Hou T. Assessing the performance of MM/PBSA and MM/GBSA methods. 8. Predicting binding free energies and poses of protein-RNA complexes. RNA (NEW YORK, N.Y.) 2018; 24:1183-1194. [PMID: 29930024 PMCID: PMC6097651 DOI: 10.1261/rna.065896.118] [Citation(s) in RCA: 67] [Impact Index Per Article: 11.2] [Received: 01/29/2018] [Accepted: 06/13/2018] [Indexed: 05/10/2023]
Abstract
Molecular docking provides a computationally efficient way to predict the atomic structural details of protein-RNA interactions (PRI), but accurate prediction of the three-dimensional structures and binding affinities for PRI is still notoriously difficult, partly due to the unreliability of the existing scoring functions for PRI. MM/PBSA and MM/GBSA are more theoretically rigorous than most scoring functions for protein-RNA docking, but their prediction performance for protein-RNA systems remains unclear. Here, we systematically evaluated the capability of MM/PBSA and MM/GBSA to predict the binding affinities and recognize the near-native binding structures for protein-RNA systems with different solvent models and interior dielectric constants (εin). For predicting the binding affinities, the predictions given by MM/GBSA based on the minimized structures in explicit solvent and the GB^GBn1 model with εin = 2 yielded the highest correlation with the experimental data. Moreover, the MM/GBSA calculations based on the minimized structures in implicit solvent and the GB^GBn1 model distinguished the near-native binding structures within the top 10 decoys for 117 out of the 148 protein-RNA systems (79.1%). This performance is better than that of all docking scoring functions studied here. Therefore, MM/GBSA rescoring is an efficient way to improve the prediction capability of scoring functions for protein-RNA systems.
Affiliation(s)
- Fu Chen
- College of Pharmaceutical Sciences, Zhejiang University, Hangzhou, Zhejiang 310058, China
- State Key Lab of CAD&CG, Zhejiang University, Hangzhou, Zhejiang 310058, China
- College of Life and Environmental Sciences, Shanghai Normal University, Shanghai 200234, China
| | - Huiyong Sun
- College of Pharmaceutical Sciences, Zhejiang University, Hangzhou, Zhejiang 310058, China
| | - Junmei Wang
- Department of Pharmaceutical Sciences, University of Pittsburgh, Pittsburgh, Pennsylvania 15261, USA
| | - Feng Zhu
- College of Pharmaceutical Sciences, Zhejiang University, Hangzhou, Zhejiang 310058, China
| | - Hui Liu
- College of Pharmaceutical Sciences, Zhejiang University, Hangzhou, Zhejiang 310058, China
| | - Zhe Wang
- College of Pharmaceutical Sciences, Zhejiang University, Hangzhou, Zhejiang 310058, China
- State Key Lab of CAD&CG, Zhejiang University, Hangzhou, Zhejiang 310058, China
| | - Tailong Lei
- College of Pharmaceutical Sciences, Zhejiang University, Hangzhou, Zhejiang 310058, China
| | - Youyong Li
- Institute of Functional Nano and Soft Materials (FUNSOM), Soochow University, Suzhou, Jiangsu 215123, China
| | - Tingjun Hou
- College of Pharmaceutical Sciences, Zhejiang University, Hangzhou, Zhejiang 310058, China
- State Key Lab of CAD&CG, Zhejiang University, Hangzhou, Zhejiang 310058, China
| |
|
21
|
An account of solvent accessibility in protein-RNA recognition. Sci Rep 2018; 8:10546. [PMID: 30002431 PMCID: PMC6043566 DOI: 10.1038/s41598-018-28373-2] [Citation(s) in RCA: 18] [Impact Index Per Article: 3.0] [Received: 03/01/2018] [Accepted: 06/21/2018] [Indexed: 01/16/2023] Open
Abstract
Protein–RNA recognition often induces conformational changes in binding partners. Consequently, the solvent accessible surface area (SASA) buried in contact estimated from the co-crystal structures may differ from that calculated using their unbound forms. To evaluate the change in accessibility upon binding, we compare SASA of 126 protein-RNA complexes between bound and unbound forms. We observe, in majority of cases the interface of both the binding partners gain accessibility upon binding, which is often associated with either large domain movements or secondary structural transitions in RNA-binding proteins (RBPs), and binding-induced conformational changes in RNAs. At the non-interface region, majority of RNAs lose accessibility upon binding, however, no such preference is observed for RBPs. Side chains of RBPs have major contribution in change in accessibility. In case of flexible binding, we find a moderate correlation between the binding free energy and change in accessibility at the interface. Finally, we introduce a parameter, the ratio of gain to loss of accessibility upon binding, which can be used to identify the native solution among the flexible docking models. Our findings provide fundamental insights into the relationship between flexibility and solvent accessibility, and advance our understanding on binding induced folding in protein-RNA recognition.
|
22
|
Lagarde N, Carbone A, Sacquin-Mora S. Hidden partners: Using cross-docking calculations to predict binding sites for proteins with multiple interactions. Proteins 2018; 86:723-737. [DOI: 10.1002/prot.25506] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.8] [Received: 10/06/2017] [Revised: 03/23/2018] [Accepted: 04/07/2018] [Indexed: 02/06/2023]
Affiliation(s)
- Nathalie Lagarde
- Laboratoire de Biochimie Théorique, CNRS UPR9080, Institut de Biologie Physico-Chimique, University Paris Diderot, Sorbonne Paris Cité, 13 rue Pierre et Marie Curie; Paris 75005 France
| | - Alessandra Carbone
- Laboratoire de Biologie Computationnelle et Quantitative, CNRS UMR7238, UPMC Univ-Paris 6, Sorbonne Université, 4 place Jussieu; Paris 75005 France
- Institut Universitaire de France; Paris 75005 France
| | - Sophie Sacquin-Mora
- Laboratoire de Biochimie Théorique, CNRS UPR9080, Institut de Biologie Physico-Chimique, University Paris Diderot, Sorbonne Paris Cité, 13 rue Pierre et Marie Curie; Paris 75005 France
| |
|
23
|
Flores JK, Ataide SF. Structural Changes of RNA in Complex with Proteins in the SRP. Front Mol Biosci 2018; 5:7. [PMID: 29459899 PMCID: PMC5807370 DOI: 10.3389/fmolb.2018.00007] [Citation(s) in RCA: 17] [Impact Index Per Article: 2.8] [Received: 11/16/2017] [Accepted: 01/17/2018] [Indexed: 12/18/2022] Open
Abstract
The structural flexibility of RNA allows it to exist in several shapes and sizes. Thus, RNA is functionally diverse and is known to be involved in processes such as catalysis, ligand binding, and most importantly, protein recognition. RNA can adopt different structures, which can often dictate its functionality. When RNA binds to a protein to form a ribonucleoprotein complex (RNP), multiple interactions and conformational changes occur in both the RNA and the protein. However, it remains an open question whether these changes follow a specific pattern upon recognition. In particular, when RNP complexity increases with the addition of multiple proteins/RNAs, it becomes difficult to structurally characterize the overall changes using current structure determination techniques. Hence, there is a need to combine biochemical, structural and computational modeling approaches to achieve a better understanding of the processes in which RNPs are involved. Nevertheless, there are well-characterized systems that are evolutionarily conserved [such as the signal recognition particle (SRP)] that give us important information on the structural changes of RNA and protein upon complex formation.
Affiliation(s)
- Janine K Flores
- Ataide Lab, School of Life and Environmental Sciences, University of Sydney, Sydney, NSW, Australia
| | - Sandro F Ataide
- Ataide Lab, School of Life and Environmental Sciences, University of Sydney, Sydney, NSW, Australia
| |
|
24
|
Barradas-Bautista D, Rosell M, Pallara C, Fernández-Recio J. Structural Prediction of Protein–Protein Interactions by Docking: Application to Biomedical Problems. PROTEIN-PROTEIN INTERACTIONS IN HUMAN DISEASE, PART A 2018; 110:203-249. [DOI: 10.1016/bs.apcsb.2017.06.003] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.7] [Indexed: 12/31/2022]
|
25
|
Hu W, Qin L, Li M, Pu X, Guo Y. A structural dissection of protein–RNA interactions based on different RNA base areas of interfaces. RSC Adv 2018; 8:10582-10592. [PMID: 35540439 PMCID: PMC9078961 DOI: 10.1039/c8ra00598b] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.2] [Received: 01/20/2018] [Accepted: 03/05/2018] [Indexed: 11/21/2022] Open
Abstract
Protein–RNA interactions are very common cellular processes, but the mechanisms of interaction are not fully understood, mainly because of the complexity of RNA structures. Through a detailed investigation of the RNA structures in protein–RNA complexes, this paper finds for the first time that the RNAs in these complexes can be clearly classified into three classes (high, medium and low) based on the level of Pbase (the percentage of base area buried in the RNA interface). Based on the three RNA classes, more detailed analyses of protein–RNA interactions were performed from various aspects, including interface area, structure, composition and interaction force, so as to achieve a deeper understanding of the recognition specificity of the three classes of protein–RNA interactions. According to our classification strategy, the three complex classes differ significantly in almost all properties. Complexes in the high class have short and extended RNA structures and behave like protein–ssDNA interactions. Their hydrogen bonds and hydrophobic interactions are strong. For complexes in the low class, the RNA structures are mainly double-stranded, like protein–dsDNA interactions, and electrostatic interactions frequently occur. The complexes in the medium class have the longest RNA chains and the largest average interface area, but they do not show any preference for a particular interaction force. On average, in terms of composition, secondary structures and intermolecular physicochemical properties, significant feature preferences can be observed in high and low complexes, but no highly specific features are found for medium complexes. We found that our proposed Pbase is an important parameter which can be used as a new determinant to distinguish protein–RNA complexes. For high and low complexes, the specificity of the recognition process can be understood more easily from the interface features than for medium complexes. In the future, medium complexes should be the research focus, with further structural analysis from more feature aspects. Overall, this study may contribute to further understanding of the mechanism of protein–RNA interactions on a more detailed level. Qualitative and quantitative measurements of the influence of structure and composition of RNA interfaces on protein–RNA interactions.
Affiliation(s)
- Wen Hu
- College of Chemistry, Sichuan University, Chengdu 610064, People's Republic of China
| | - Liu Qin
- College of Chemistry, Sichuan University, Chengdu 610064, People's Republic of China
| | - Menglong Li
- College of Chemistry, Sichuan University, Chengdu 610064, People's Republic of China
| | - Xuemei Pu
- College of Chemistry, Sichuan University, Chengdu 610064, People's Republic of China
| | - Yanzhi Guo
- College of Chemistry, Sichuan University, Chengdu 610064, People's Republic of China
| |
|
26
|
Zhang J, Ma Z, Kurgan L. Comprehensive review and empirical analysis of hallmarks of DNA-, RNA- and protein-binding residues in protein chains. Brief Bioinform 2017; 20:1250-1268. [DOI: 10.1093/bib/bbx168] [Citation(s) in RCA: 60] [Impact Index Per Article: 8.6] [Received: 08/30/2017] [Revised: 11/15/2017] [Indexed: 11/13/2022] Open
Abstract
Proteins interact with a variety of molecules including proteins and nucleic acids. We review a comprehensive collection of over 50 studies that analyze and/or predict these interactions. While majority of these studies address either solely protein–DNA or protein–RNA binding, only a few have a wider scope that covers both protein–protein and protein–nucleic acid binding. Our analysis reveals that binding residues are typically characterized with three hallmarks: relative solvent accessibility (RSA), evolutionary conservation and propensity of amino acids (AAs) for binding. Motivated by drawbacks of the prior studies, we perform a large-scale analysis to quantify and contrast the three hallmarks for residues that bind DNA-, RNA-, protein- and (for the first time) multi-ligand-binding residues that interact with DNA and proteins, and with RNA and proteins. Results generated on a well-annotated data set of over 23 000 proteins show that conservation of binding residues is higher for nucleic acid- than protein-binding residues. Multi-ligand-binding residues are more conserved and have higher RSA than single-ligand-binding residues. We empirically show that each hallmark discriminates between binding and nonbinding residues, even predicted RSA, and that combining them improves discriminatory power for each of the five types of interactions. Linear scoring functions that combine these hallmarks offer good predictive performance of residue-level propensity for binding and provide intuitive interpretation of predictions. Better understanding of these residue-level interactions will facilitate development of methods that accurately predict binding in the exponentially growing databases of protein sequences.
27
Gali VK, Balint E, Serbyn N, Frittmann O, Stutz F, Unk I. Translesion synthesis DNA polymerase η exhibits a specific RNA extension activity and a transcription-associated function. Sci Rep 2017; 7:13055. [PMID: 29026143 PMCID: PMC5638924 DOI: 10.1038/s41598-017-12915-1] [Citation(s) in RCA: 19] [Impact Index Per Article: 2.7] [Received: 03/15/2017] [Accepted: 09/01/2017] [Indexed: 11/09/2022]
Abstract
Polymerase eta (Polη) is a low fidelity translesion synthesis DNA polymerase that rescues damage-stalled replication by inserting deoxy-ribonucleotides opposite DNA damage sites resulting in error-free or mutagenic damage bypass. In this study we identify a new specific RNA extension activity of Polη of Saccharomyces cerevisiae. We show that Polη is able to extend RNA primers in the presence of ribonucleotides (rNTPs), and that these reactions are an order of magnitude more efficient than the misinsertion of rNTPs into DNA. Moreover, during RNA extension Polη performs error-free bypass of the 8-oxoguanine and thymine dimer DNA lesions, though with a 10³- and 10²-fold lower efficiency, respectively, than it synthesizes opposite undamaged nucleotides. Furthermore, in vivo experiments demonstrate that the transcription of several genes is affected by the lack of Polη, and that Polη is enriched over actively transcribed regions. Moreover, inactivation of its polymerase activity causes similar transcription inhibition as the absence of Polη. In summary, these results suggest that the new RNA synthetic activity of Polη can have in vivo relevance.
Affiliation(s)
- Vamsi K Gali: The Institute of Genetics, Biological Research Centre, Hungarian Academy of Sciences, Szeged, H-6726, Hungary; Institute of Medical Sciences Foresterhill, University of Aberdeen, Aberdeen, United Kingdom
- Eva Balint: The Institute of Genetics, Biological Research Centre, Hungarian Academy of Sciences, Szeged, H-6726, Hungary
- Nataliia Serbyn: Department of Cell Biology, iGE3, University of Geneva, 1211, Geneva, Switzerland
- Orsolya Frittmann: The Institute of Genetics, Biological Research Centre, Hungarian Academy of Sciences, Szeged, H-6726, Hungary
- Francoise Stutz: Department of Cell Biology, iGE3, University of Geneva, 1211, Geneva, Switzerland
- Ildiko Unk: The Institute of Genetics, Biological Research Centre, Hungarian Academy of Sciences, Szeged, H-6726, Hungary

28
Prostova MA, Deviatkin AA, Tcelykh IO, Lukashev AN, Gmyl AP. Independent evolution of tetraloop in enterovirus oriL replicative element and its putative binding partners in virus protein 3C. PeerJ 2017; 5:e3896. [PMID: 29018627 PMCID: PMC5633025 DOI: 10.7717/peerj.3896] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.6] [Received: 08/08/2017] [Accepted: 09/16/2017] [Indexed: 12/18/2022]
Abstract
Background: Enteroviruses are small non-enveloped viruses with a (+) ssRNA genome with one open reading frame. Enterovirus protein 3C (or 3CD for some species) binds the replicative element oriL to initiate replication. The replication of enteroviruses features a low-fidelity process, which allows the virus to adapt to the changing environment on the one hand, and requires additional mechanisms to maintain the genome stability on the other. Structural disturbances in the apical region of oriL domain d can be compensated by amino acid substitutions in positions 154 or 156 of 3C (amino acid numeration corresponds to poliovirus 3C), thus suggesting the co-evolution of these interacting sequences in nature. The aim of this work was to understand co-evolution patterns of two interacting replication machinery elements in enteroviruses, the apical region of oriL domain d and its putative binding partners in the 3C protein. Methods: To evaluate the variability of the domain d loop sequence we retrieved all available full enterovirus sequences (>6,400 nucleotides), which were present in the NCBI database in February 2017 and analysed the variety and abundance of sequences in domain d of the replicative element oriL and in the protein 3C. Results: A total of 2,842 full genome sequences was analysed. The majority of domain d apical loops were tetraloops, which belonged to consensus YNHG (Y = U/C, N = any nucleotide, H = A/C/U). The putative RNA-binding tripeptide 154–156 (Enterovirus C 3C protein numeration) was less diverse than the apical domain d loop region and, in contrast to it, was species-specific. Discussion: Despite the suggestion that the RNA-binding tripeptide interacts with the apical region of domain d, they evolve independently in nature. Together, our data indicate the plastic evolution of both interplayers of 3C-oriL recognition.
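The YNHG consensus given in the abstract (Y = U/C, N = any nucleotide, H = A/C/U, followed by G) maps directly onto a four-position character class. The sketch below shows one way to test candidate loop sequences against it; the function names and the handling of DNA-style `T` are assumptions made for illustration, and this is not the pipeline used in the study.

```python
import re

# YNHG: Y = U/C, N = any nucleotide, H = A/C/U, then a literal G.
YNHG = re.compile(r"[UC][ACGU][ACU]G")


def is_ynhg(loop: str) -> bool:
    """True if a four-nucleotide loop matches the YNHG consensus.

    Input is upper-cased and T is read as U, so sequences written in
    DNA notation are accepted (an assumption of this sketch).
    """
    normalized = loop.upper().replace("T", "U")
    return YNHG.fullmatch(normalized) is not None


def consensus_fraction(loops) -> float:
    """Fraction of the given loops that match YNHG (0.0 for an empty input)."""
    loops = list(loops)
    if not loops:
        return 0.0
    return sum(is_ynhg(loop) for loop in loops) / len(loops)
```

For example, `is_ynhg("UACG")` holds, whereas `is_ynhg("UAGG")` does not, because H excludes G at the third position.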
Affiliation(s)
- Maria A Prostova: Chumakov Institute of Poliomyelitis and Viral Encephalitides, Moscow, Russia
- Andrei A Deviatkin: Chumakov Institute of Poliomyelitis and Viral Encephalitides, Moscow, Russia
- Irina O Tcelykh: Chumakov Institute of Poliomyelitis and Viral Encephalitides, Moscow, Russia; Lomonosov Moscow State University, Moscow, Russia
- Alexander N Lukashev: Chumakov Institute of Poliomyelitis and Viral Encephalitides, Moscow, Russia; Sechenov First Moscow State Medical University, Moscow, Russia
- Anatoly P Gmyl: Chumakov Institute of Poliomyelitis and Viral Encephalitides, Moscow, Russia; Lomonosov Moscow State University, Moscow, Russia; Sechenov First Moscow State Medical University, Moscow, Russia

29
Nygaard R, Romaniuk JAH, Rice DM, Cegelski L. Whole Ribosome NMR: Dipolar Couplings and Contributions to Whole Cells. J Phys Chem B 2017; 121:9331-9335. [PMID: 28901760 DOI: 10.1021/acs.jpcb.7b06736] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.9] [Indexed: 11/29/2022]
Abstract
Solid-state NMR is a powerful tool for quantifying chemical composition and structure in complex assemblies and even whole cells. We employed N{P} REDOR NMR to obtain atomic-level distance propensities in intact 15N-labeled E. coli ribosomes. The experimental REDOR dephasing of shift-resolved lysyl amine nitrogens by phosphorus was comparable to that expected from a calculation of N-P distances involving the lysines included in the crystal structure coordinates. Among the nitrogen contributions to the REDOR spectra, the strongest dephasing emerged from the dipolar couplings to phosphorus involving nitrogen peaks ascribed primarily to rRNA, and the weakest dephasing arose from protein amide nitrogens. This approach is applicable to any macromolecular system and provides quantitative comparisons of distance proximities between shift-resolved nuclei of one type and heteronuclear dephasing spins. Enhanced molecular specificity could be achieved through the use of spectroscopic filters or specific labeling. Furthermore, ribosome 13C and 15N CPMAS spectra were compared with those of whole cells from which the ribosomes were isolated. Whole-cell signatures of ribosomes were identified and should be of value in comparing overall cellular ribosome content in whole-cell samples.
Affiliation(s)
- Rie Nygaard: Department of Chemistry, Stanford University, 380 Roth Way, Stanford, California 94305, United States
- Joseph A H Romaniuk: Department of Chemistry, Stanford University, 380 Roth Way, Stanford, California 94305, United States
- David M Rice: Department of Chemistry, Stanford University, 380 Roth Way, Stanford, California 94305, United States
- Lynette Cegelski: Department of Chemistry, Stanford University, 380 Roth Way, Stanford, California 94305, United States

30
Luo J, Liu L, Venkateswaran S, Song Q, Zhou X. RPI-Bind: a structure-based method for accurate identification of RNA-protein binding sites. Sci Rep 2017; 7:614. [PMID: 28377624 PMCID: PMC5429624 DOI: 10.1038/s41598-017-00795-4] [Citation(s) in RCA: 21] [Impact Index Per Article: 3.0] [Received: 11/24/2016] [Accepted: 03/13/2017] [Indexed: 01/11/2023]
Abstract
RNA and protein interactions play crucial roles in multiple biological processes, while these interactions are significantly influenced by the structures and sequences of protein and RNA molecules. In this study, we first performed an analysis of RNA-protein interacting complexes, and identified interface properties of sequences and structures, which reveal the diverse nature of the binding sites. With the observations, we built a three-step prediction model, namely RPI-Bind, for the identification of RNA-protein binding regions using the sequences and structures of both proteins and RNAs. The three steps include 1) the prediction of RNA binding regions on protein, 2) the prediction of protein binding regions on RNA, and 3) the prediction of interacting regions on both RNA and protein simultaneously, with the results from steps 1) and 2). Compared with existing methods, most of which employ only sequences, our model significantly improves the prediction accuracy at each of the three steps. Especially, our model outperforms the catRAPID by >20% at the 3rd step. All of these results indicate the importance of structures in RNA-protein interactions, and suggest that the RPI-Bind model is a powerful theoretical framework for studying RNA-protein interactions.
Affiliation(s)
- Jiesi Luo: Center for Bioinformatics and Systems Biology and Department of Radiology, Wake Forest School of Medicine, Winston-Salem, NC, 27157, USA
- Liang Liu: Center for Bioinformatics and Systems Biology and Department of Radiology, Wake Forest School of Medicine, Winston-Salem, NC, 27157, USA
- Suresh Venkateswaran: Center for Bioinformatics and Systems Biology and Department of Radiology, Wake Forest School of Medicine, Winston-Salem, NC, 27157, USA
- Qianqian Song: Center for Bioinformatics and Systems Biology and Department of Radiology, Wake Forest School of Medicine, Winston-Salem, NC, 27157, USA
- Xiaobo Zhou: Center for Bioinformatics and Systems Biology and Department of Radiology, Wake Forest School of Medicine, Winston-Salem, NC, 27157, USA

31
Using 3dRPC for RNA-protein complex structure prediction. Biophysics Reports 2017; 2:95-99. [PMID: 28317012 PMCID: PMC5334405 DOI: 10.1007/s41048-017-0034-y] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.3] [Received: 07/19/2016] [Accepted: 01/05/2017] [Indexed: 02/07/2023]
Abstract
3dRPC is a computational method designed for three-dimensional RNA–protein complex structure prediction. Starting from a protein structure and a RNA structure, 3dRPC first generates presumptive complex structures by RPDOCK and then evaluates the structures by RPRANK. RPDOCK is an FFT-based docking algorithm that takes features of RNA–protein interactions into consideration, and RPRANK is a knowledge-based potential using root mean square deviation as a measure. Here we give a detailed description of the usage of 3dRPC. The source code is available at http://biophy.hust.edu.cn/3dRPC.html.
32
Nithin C, Mukherjee S, Bahadur RP. A non-redundant protein-RNA docking benchmark version 2.0. Proteins 2016; 85:256-267. [PMID: 27862282 DOI: 10.1002/prot.25211] [Citation(s) in RCA: 24] [Impact Index Per Article: 3.0] [Received: 09/08/2016] [Revised: 10/27/2016] [Accepted: 11/08/2016] [Indexed: 12/23/2022]
Abstract
We present an updated version of the protein-RNA docking benchmark, which we first published four years back. The non-redundant protein-RNA docking benchmark version 2.0 consists of 126 test cases, a threefold increase in number compared to its previous version. The present version consists of 21 unbound-unbound cases, of which, in 12 cases, the unbound RNAs are taken from another complex. It also consists of 95 unbound-bound cases where only the protein is available in the unbound state. Besides, we introduce 10 new bound-unbound cases where only the RNA is found in the unbound state. Based on the degree of conformational change of the interface residues upon complex formation the benchmark is classified into 72 rigid-body cases, 25 semiflexible cases and 19 full flexible cases. It also covers a wide range of conformational flexibility including small side chain movement to large domain swapping in protein structures as well as flipping and restacking in RNA bases. This benchmark should provide the docking community with more test cases for evaluating rigid-body as well as flexible docking algorithms. Besides, it will also facilitate the development of new algorithms that require large number of training set. The protein-RNA docking benchmark version 2.0 can be freely downloaded from http://www.csb.iitkgp.ernet.in/applications/PRDBv2. Proteins 2017; 85:256-267. © 2016 Wiley Periodicals, Inc.
Affiliation(s)
- Chandran Nithin: Computational Structural Biology Lab, Department of Biotechnology, Indian Institute of Technology Kharagpur, 721302, India
- Sunandan Mukherjee: Computational Structural Biology Lab, Department of Biotechnology, Indian Institute of Technology Kharagpur, 721302, India
- Ranjit Prasad Bahadur: Computational Structural Biology Lab, Department of Biotechnology, Indian Institute of Technology Kharagpur, 721302, India

33
Zheng J, Kundrotas PJ, Vakser IA, Liu S. Template-Based Modeling of Protein-RNA Interactions. PLoS Comput Biol 2016; 12:e1005120. [PMID: 27662342 PMCID: PMC5035060 DOI: 10.1371/journal.pcbi.1005120] [Citation(s) in RCA: 25] [Impact Index Per Article: 3.1] [Received: 01/25/2016] [Accepted: 08/25/2016] [Indexed: 12/29/2022]
Abstract
Protein-RNA complexes formed by specific recognition between RNA and RNA-binding proteins play an important role in biological processes. More than a thousand of such proteins in human are curated and many novel RNA-binding proteins are to be discovered. Due to limitations of experimental approaches, computational techniques are needed for characterization of protein-RNA interactions. Although much progress has been made, adequate methodologies reliably providing atomic resolution structural details are still lacking. Although protein-RNA free docking approaches proved to be useful, in general, the template-based approaches provide higher quality of predictions. Templates are key to building a high quality model. Sequence/structure relationships were studied based on a representative set of binary protein-RNA complexes from PDB. Several approaches were tested for pairwise target/template alignment. The analysis revealed a transition point between random and correct binding modes. The results showed that structural alignment is better than sequence alignment in identifying good templates, suitable for generating protein-RNA complexes close to the native structure, and outperforms free docking, successfully predicting complexes where the free docking fails, including cases of significant conformational change upon binding. A template-based protein-RNA interaction modeling protocol PRIME was developed and benchmarked on a representative set of complexes. Structures of protein-RNA complexes are important for characterization of biological processes. The number of experimentally determined protein-RNA complexes is limited. Thus modeling of these complexes is important. Reliable structural predictions of proteins and their complexes are provided by comparative modeling, which takes advantage of similar complexes with experimentally determined structures. 
Thus, in the case of protein-RNA complexes, it is important to determine if similar proteins and RNAs bind in a similar way. We show that, similarly to the earlier published results on protein-protein complexes, such correlation of the protein-RNA binding mode and the monomers similarity indeed exists, and is stronger when the similarity is determined by structure rather than sequence alignment. The data shows clear transition from random to similar binding mode with the increase of the structural similarity of the monomers. On the basis of the results we designed and implemented a predictive tool, which should be useful for the biological community interested in modeling of protein-RNA interactions.
Affiliation(s)
- Jinfang Zheng: School of Physics and Key Laboratory of Molecular Biophysics of the Ministry of Education, Huazhong University of Science and Technology, Wuhan, Hubei, China
- Petras J. Kundrotas: Center for Computational Biology and Department of Molecular Biosciences, The University of Kansas, Lawrence, Kansas, United States of America
- Ilya A. Vakser (corresponding author): Center for Computational Biology and Department of Molecular Biosciences, The University of Kansas, Lawrence, Kansas, United States of America
- Shiyong Liu (corresponding author): School of Physics and Key Laboratory of Molecular Biophysics of the Ministry of Education, Huazhong University of Science and Technology, Wuhan, Hubei, China

34
Iwakiri J, Hamada M, Asai K, Kameda T. Improved Accuracy in RNA-Protein Rigid Body Docking by Incorporating Force Field for Molecular Dynamics Simulation into the Scoring Function. J Chem Theory Comput 2016; 12:4688-97. [PMID: 27494732 DOI: 10.1021/acs.jctc.6b00254] [Citation(s) in RCA: 19] [Impact Index Per Article: 2.4] [Indexed: 02/07/2023]
Abstract
RNA-protein interactions play fundamental roles in many biological processes. To understand these interactions, it is necessary to know the three-dimensional structures of RNA-protein complexes. However, determining the tertiary structure of these complexes is often difficult, suggesting that an accurate rigid body docking for RNA-protein complexes is needed. In general, the rigid body docking process is divided into two steps: generating candidate structures from the individual RNA and protein structures and then narrowing down the candidates. In this study, we focus on the former problem to improve the prediction accuracy in RNA-protein docking. Our method is based on the integration of physicochemical information about RNA into ZDOCK, which is known as one of the most successful computer programs for protein-protein docking. Because recent studies showed the current force field for molecular dynamics simulation of protein and nucleic acids is quite accurate, we modeled the physicochemical information about RNA by force fields such as AMBER and CHARMM. A comprehensive benchmark of RNA-protein docking, using three recently developed data sets, reveals the remarkable prediction accuracy of the proposed method compared with existing programs for docking: the highest success rate is 34.7% for the predicted structure of the RNA-protein complex with the best score and 79.2% for 3,600 predicted ones. Three full atomistic force fields for RNA (AMBER94, AMBER99, and CHARMM22) produced almost the same accurate result, which showed current force fields for nucleic acids are quite accurate. In addition, we found that the electrostatic interaction and the representation of shape complementary between protein and RNA plays the important roles for accurate prediction of the native structures of RNA-protein complexes.
Affiliation(s)
- Junichi Iwakiri: Graduate School of Frontier Sciences, The University of Tokyo, 5-1-5 Kashiwanoha, Kashiwa, Chiba 277-8562, Japan
- Michiaki Hamada: Department of Electrical Engineering and Bioscience, Faculty of Science and Engineering, Waseda University, 55N-06-10, 3-4-1, Okubo, Shinjuku-ku, Tokyo 169-8555, Japan; Artificial Intelligence Research Center, National Institute of Advanced Industrial Science and Technology (AIST), 2-4-7 Aomi, Koto-ku, Tokyo 135-0064, Japan
- Kiyoshi Asai: Graduate School of Frontier Sciences, The University of Tokyo, 5-1-5 Kashiwanoha, Kashiwa, Chiba 277-8562, Japan; Artificial Intelligence Research Center, National Institute of Advanced Industrial Science and Technology (AIST), 2-4-7 Aomi, Koto-ku, Tokyo 135-0064, Japan
- Tomoshi Kameda: Artificial Intelligence Research Center, National Institute of Advanced Industrial Science and Technology (AIST), 2-4-7 Aomi, Koto-ku, Tokyo 135-0064, Japan

35
Wang J, Dong H, Chionh YH, McBee ME, Sirirungruang S, Cunningham RP, Shi PY, Dedon PC. The role of sequence context, nucleotide pool balance and stress in 2'-deoxynucleotide misincorporation in viral, bacterial and mammalian RNA. Nucleic Acids Res 2016; 44:8962-8975. [PMID: 27365049 PMCID: PMC5062971 DOI: 10.1093/nar/gkw572] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.9] [Received: 04/10/2015] [Accepted: 06/06/2016] [Indexed: 11/16/2022]
Abstract
The misincorporation of 2′-deoxyribonucleotides (dNs) into RNA has important implications for the function of non-coding RNAs, the translational fidelity of coding RNAs and the mutagenic evolution of viral RNA genomes. However, quantitative appreciation for the degree to which dN misincorporation occurs is limited by the lack of analytical tools. Here, we report a method to hydrolyze RNA to release 2′-deoxyribonucleotide-ribonucleotide pairs (dNrN) that are then quantified by chromatography-coupled mass spectrometry (LC-MS). Using this platform, we found misincorporated dNs occurring at 1 per 10³ to 10⁵ ribonucleotide (nt) in mRNA, rRNAs and tRNA in human cells, Escherichia coli, Saccharomyces cerevisiae and, most abundantly, in the RNA genome of dengue virus. The frequency of dNs varied widely among organisms and sequence contexts, and partly reflected the in vitro discrimination efficiencies of different RNA polymerases against 2′-deoxyribonucleoside 5′-triphosphates (dNTPs). Further, we demonstrate a strong link between dN frequencies in RNA and the balance of dNTPs and ribonucleoside 5′-triphosphates (rNTPs) in the cellular pool, with significant stress-induced variation of dN incorporation. Potential implications of dNs in RNA are discussed, including the possibilities of dN incorporation in RNA as a contributing factor in viral evolution and human disease, and as a host immune defense mechanism against viral infections.
Affiliation(s)
- Jin Wang: Infectious Disease Interdisciplinary Research Group, Singapore-MIT Alliance for Research and Technology, Singapore 138602
- Hongping Dong: Novartis Institute for Tropical Diseases, Singapore 138670
- Yok Hian Chionh: Infectious Disease Interdisciplinary Research Group, Singapore-MIT Alliance for Research and Technology, Singapore 138602; Department of Microbiology & Immunology Programme, Center for Life Sciences, National University of Singapore, Singapore 117545
- Megan E McBee: Infectious Disease Interdisciplinary Research Group, Singapore-MIT Alliance for Research and Technology, Singapore 138602
- Sasilada Sirirungruang: Infectious Disease Interdisciplinary Research Group, Singapore-MIT Alliance for Research and Technology, Singapore 138602
- Richard P Cunningham: Department of Biological Sciences, The University at Albany, Albany, NY 12222, USA
- Pei-Yong Shi: Departments of Biochemistry & Molecular Biology and Pharmacology & Toxicology, and Sealy Center for Structural Biology & Molecular Biophysics, University of Texas Medical Branch, Galveston, TX 77555, USA
- Peter C Dedon: Infectious Disease Interdisciplinary Research Group, Singapore-MIT Alliance for Research and Technology, Singapore 138602; Department of Biological Engineering & Center for Environmental Health Sciences, Massachusetts Institute of Technology, Cambridge, MA 02139-4307, USA

36
Sun M, Wang X, Zou C, He Z, Liu W, Li H. Accurate prediction of RNA-binding protein residues with two discriminative structural descriptors. BMC Bioinformatics 2016; 17:231. [PMID: 27266516 PMCID: PMC4897909 DOI: 10.1186/s12859-016-1110-x] [Citation(s) in RCA: 18] [Impact Index Per Article: 2.3] [Received: 02/21/2016] [Accepted: 06/02/2016] [Indexed: 11/10/2022]
Abstract
BACKGROUND RNA-binding proteins participate in many important biological processes concerning RNA-mediated gene regulation, and several computational methods have been recently developed to predict the protein-RNA interactions of RNA-binding proteins. Newly developed discriminative descriptors will help to improve the prediction accuracy of these prediction methods and provide further meaningful information for researchers. RESULTS In this work, we designed two structural features (residue electrostatic surface potential and triplet interface propensity) and according to the statistical and structural analysis of protein-RNA complexes, the two features were powerful for identifying RNA-binding protein residues. Using these two features and other excellent structure- and sequence-based features, a random forest classifier was constructed to predict RNA-binding residues. The area under the receiver operating characteristic curve (AUC) of five-fold cross-validation for our method on training set RBP195 was 0.900, and when applied to the test set RBP68, the prediction accuracy (ACC) was 0.868, and the F-score was 0.631. CONCLUSIONS The good prediction performance of our method revealed that the two newly designed descriptors could be discriminative for inferring protein residues interacting with RNAs. To facilitate the use of our method, a web-server called RNAProSite, which implements the proposed method, was constructed and is freely available at http://lilab.ecust.edu.cn/NABind .
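The abstract reports prediction accuracy (ACC) and F-score on the RBP68 test set. As a reminder of how these two residue-level metrics relate to a confusion matrix, here is a minimal sketch using the standard definitions. It assumes the reported F-score is the balanced F1 measure, which the abstract does not state, and the function names are illustrative rather than taken from the paper or its web server.

```python
def accuracy(tp: int, fp: int, tn: int, fn: int) -> float:
    """Fraction of residues classified correctly (binding or non-binding)."""
    total = tp + fp + tn + fn
    return (tp + tn) / total if total else 0.0


def f1_score(tp: int, fp: int, fn: int) -> float:
    """Harmonic mean of precision and recall, written as 2TP / (2TP + FP + FN)."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0
```

Binding residues are usually a small minority of all residues, so accuracy can be high while the F-score stays moderate; the gap between the two reported values is consistent with that kind of class imbalance.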
Affiliation(s)
- Meijian Sun: State Key Laboratory of Bioreactor Engineering, Shanghai Key Laboratory of New Drug Design, School of Pharmacy, East China University of Science and Technology, 130 Mei Long Road, Shanghai, 200237, China
- Xia Wang: State Key Laboratory of Bioreactor Engineering, Shanghai Key Laboratory of New Drug Design, School of Pharmacy, East China University of Science and Technology, 130 Mei Long Road, Shanghai, 200237, China
- Chuanxin Zou: State Key Laboratory of Bioreactor Engineering, Shanghai Key Laboratory of New Drug Design, School of Pharmacy, East China University of Science and Technology, 130 Mei Long Road, Shanghai, 200237, China
- Zenghui He: State Key Laboratory of Bioreactor Engineering, Shanghai Key Laboratory of New Drug Design, School of Pharmacy, East China University of Science and Technology, 130 Mei Long Road, Shanghai, 200237, China
- Wei Liu: State Key Laboratory of Bioreactor Engineering, Shanghai Key Laboratory of New Drug Design, School of Pharmacy, East China University of Science and Technology, 130 Mei Long Road, Shanghai, 200237, China
- Honglin Li: State Key Laboratory of Bioreactor Engineering, Shanghai Key Laboratory of New Drug Design, School of Pharmacy, East China University of Science and Technology, 130 Mei Long Road, Shanghai, 200237, China

37
Wilson KA, Holland DJ, Wetmore SD. Topology of RNA-protein nucleobase-amino acid π-π interactions and comparison to analogous DNA-protein π-π contacts. RNA (New York, N.Y.) 2016; 22:696-708. [PMID: 26979279 PMCID: PMC4836644 DOI: 10.1261/rna.054924.115] [Citation(s) in RCA: 34] [Impact Index Per Article: 4.3] [Received: 10/18/2015] [Accepted: 02/13/2016] [Indexed: 06/05/2023]
Abstract
The present work analyzed 120 high-resolution X-ray crystal structures and identified 335 RNA-protein π-interactions (154 nonredundant) between a nucleobase and aromatic (W, H, F, or Y) or acyclic (R, E, or D) π-containing amino acid. Each contact was critically analyzed (including using a visual inspection protocol) to determine the most prevalent composition, structure, and strength of π-interactions at RNA-protein interfaces. These contacts most commonly involve F and U, with U:F interactions comprising one-fifth of the total number of contacts found. Furthermore, the RNA and protein π-systems adopt many different relative orientations, although there is a preference for more parallel (stacked) arrangements. Due to the variation in structure, the strength of the intermolecular forces between the RNA and protein components (as determined from accurate quantum chemical calculations) exhibits a significant range, with most of the contacts providing significant stability to the associated RNA-protein complex (up to -65 kJ mol⁻¹). Comparison to the analogous DNA-protein π-interactions emphasizes differences in RNA- and DNA-protein π-interactions at the molecular level, including the greater abundance of RNA contacts and the involvement of different nucleobase/amino acid residues. Overall, our results provide a clearer picture of the molecular basis of nucleic acid-protein binding and underscore the important role of these contacts in biology, including the significant contribution of π-π interactions to the stability of nucleic acid-protein complexes. Nevertheless, more work is still needed in this area in order to further appreciate the properties and roles of RNA nucleobase-amino acid π-interactions in nature.
Affiliation(s)
- Katie A Wilson: Department of Chemistry and Biochemistry, University of Lethbridge, Lethbridge, Alberta T1K 3M4, Canada
- Devany J Holland: Department of Chemistry and Biochemistry, University of Lethbridge, Lethbridge, Alberta T1K 3M4, Canada
- Stacey D Wetmore: Department of Chemistry and Biochemistry, University of Lethbridge, Lethbridge, Alberta T1K 3M4, Canada

38
Abstract
Interactions between protein and RNA play a key role in many biological processes in the gene expression pathway. Those interactions are mediated through a variety of RNA-binding protein domains, among them the highly abundant RNA recognition motif (RRM). Here we studied protein-RNA complexes from different RNA binding domain families solved by NMR and x-ray crystallography. Characterizing the structural properties of the RNA at the binding interfaces revealed an unexpected number of nucleotides with unusual RNA conformations, specifically found in RNA-RRM complexes. Moreover, we observed that the RNA nucleotides that are directly involved in interactions with the RRM domains, via hydrogen bonds and hydrophobic contacts, are significantly enriched with unique RNA conformations. Further examination of the sequences binding the RRM domain showed a preference for G nucleotides in syn conformation to precede or to follow U nucleotides in the anti-conformation, and U nucleotides in C2' endo conformation to precede U and G nucleotides possessing the more common C3' endo conformation. These findings imply a possible mode of RNA recognition by the RRM domains which enables the recognition of a wide variety of different RNA sequences and shapes. Overall, this study suggests an additional way by which the RRM domain recognizes its RNA target, involving a conformational readout.
Affiliation(s)
- Efrat Kligun
- Department of Biology, Technion - Israel Institute of Technology, Haifa, Israel
|
39
|
Barik A, Nithin C, Karampudi NBR, Mukherjee S, Bahadur RP. Probing binding hot spots at protein-RNA recognition sites. Nucleic Acids Res 2015; 44:e9. [PMID: 26365245 PMCID: PMC4737170 DOI: 10.1093/nar/gkv876] [Citation(s) in RCA: 35] [Impact Index Per Article: 3.9] [Received: 01/20/2015] [Accepted: 08/23/2015] [Indexed: 01/30/2023]
Abstract
We use evolutionary conservation derived from structure alignment of polypeptide sequences along with structural and physicochemical attributes of protein–RNA interfaces to probe the binding hot spots at protein–RNA recognition sites. We find that the degree of conservation varies across the RNA-binding proteins; some evolve rapidly compared to others. Additionally, irrespective of the structural class of the complexes, residues at the RNA binding sites are evolutionarily better conserved than those at the solvent-exposed surfaces. For recognition involving duplex RNA, residues interacting with the major groove are better conserved than those interacting with the minor groove. We identify multi-interface residues participating simultaneously in protein–protein and protein–RNA interfaces in complexes where more than one polypeptide is involved in RNA recognition, and show that they are better conserved compared to any other RNA-binding residues. We find that the residues at water preservation sites are better conserved than those at hydrated or dehydrated sites. Finally, we develop a Random Forests model using structural and physicochemical attributes for predicting binding hot spots. The model accurately predicts 80% of the instances of experimental ΔΔG values in a particular class, and provides a stepping-stone towards the engineering of protein–RNA recognition sites with desired affinity.
Affiliation(s)
- Amita Barik
- Computational Structural Biology Laboratory, Department of Biotechnology, Indian Institute of Technology Kharagpur, Kharagpur-721302, India
- Chandran Nithin
- Computational Structural Biology Laboratory, Department of Biotechnology, Indian Institute of Technology Kharagpur, Kharagpur-721302, India
- Sunandan Mukherjee
- Computational Structural Biology Laboratory, Department of Biotechnology, Indian Institute of Technology Kharagpur, Kharagpur-721302, India
- Ranjit Prasad Bahadur
- Computational Structural Biology Laboratory, Department of Biotechnology, Indian Institute of Technology Kharagpur, Kharagpur-721302, India; Advanced Technology Development Centre, Indian Institute of Technology Kharagpur, Kharagpur-721302, India
|
40
|
Pérez-Cano L, Fernández-Recio J. Dissection and prediction of RNA-binding sites on proteins. Biomol Concepts 2015; 1:345-55. [PMID: 25962008 DOI: 10.1515/bmc.2010.037] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.4] [Indexed: 11/15/2022]
Abstract
RNA-binding proteins are involved in many important regulatory processes in cells and their study is essential for a complete understanding of living organisms. They show a large variability from both structural and functional points of view. However, several recent studies performed on protein-RNA crystal structures have revealed interesting common properties. RNA-binding sites usually constitute patches of positively charged or polar residues that make most of the specific and non-specific contacts with RNA. Negatively charged or aliphatic residues are less frequent at protein-RNA interfaces, although they can also be found either forming aliphatic and positive-negative pairs in protein RNA-binding sites or contacting RNA through their main chains. Aromatic residues found within these interfaces are usually involved in specific base recognition at RNA single-strand regions. This specific recognition, in combination with structural complementarity, represents the key source for specificity in protein-RNA association. From all this knowledge, a variety of computational methods for prediction of RNA-binding sites have been developed based either on protein sequence or on protein structure. Some reported methods are really successful in the identification of RNA-binding proteins or the prediction of RNA-binding sites. Given the growing interest in the field, all these studies and prediction methods will undoubtedly contribute to the identification and comprehension of protein-RNA interactions.
|
41
|
Cruz-Gallardo I, Del Conte R, Velázquez-Campoy A, García-Mauriño SM, Díaz-Moreno I. A Non-Invasive NMR Method Based on Histidine Imidazoles to Analyze the pH-Modulation of Protein-Nucleic Acid Interfaces. Chemistry 2015; 21:7588-95. [PMID: 25846236 DOI: 10.1002/chem.201405538] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.8] [Received: 10/06/2014] [Revised: 02/19/2015] [Indexed: 12/20/2022]
Abstract
A useful ²J(N-H) coupling-based NMR spectroscopic approach is proposed to unveil, at the molecular level, the contribution of the imidazole groups of histidines from RNA/DNA-binding proteins to the modulation of binding to nucleic acids by pH. Such protonation/deprotonation events have been monitored on the single His96 located at the second RNA/DNA recognition motif (RRM2) of T-cell intracellular antigen-1 (TIA-1) protein. The pKa values of the His96 ionizable groups were substantially higher in the complexes with short U-rich RNA and T-rich DNA oligonucleotides than those of the isolated TIA-1 RRM2. Herein, the methodology applied to determine changes in pKa of histidine side chains upon DNA/RNA binding gives valuable information to understand the pH effect on multidomain DNA/RNA-binding proteins that shuttle among different cellular compartments.
Affiliation(s)
- Isabel Cruz-Gallardo
- Instituto de Bioquímica Vegetal y Fotosíntesis cicCartuja, Universidad de Sevilla - CSIC, Avenida Américo Vespucio 49, 41092 Sevilla (Spain)
|
42
|
Chattopadhyay A, Dey P, Barik A, Bahadur RP, Maiti MK. A repressor activator protein1 homologue from an oleaginous strain of Candida tropicalis increases storage lipid production in Saccharomyces cerevisiae. FEMS Yeast Res 2015; 15:fov013. [DOI: 10.1093/femsyr/fov013] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Accepted: 03/17/2015] [Indexed: 01/02/2023]
|
43
|
Nagarajan R, Chothani SP, Ramakrishnan C, Sekijima M, Gromiha MM. Structure based approach for understanding organism specific recognition of protein-RNA complexes. Biol Direct 2015; 10:8. [PMID: 25886642 PMCID: PMC4352265 DOI: 10.1186/s13062-015-0039-8] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.4] [Received: 11/19/2014] [Accepted: 02/03/2015] [Indexed: 12/11/2022]
Abstract
Background: Protein-RNA interactions perform diverse functions within the cell. Understanding the recognition mechanism of protein-RNA complexes has been a challenging task in molecular and computational biology. In earlier works, the recognition mechanisms have been studied for a specific complex or using a set of non-redundant complexes. In this work, we have constructed 18 sets of the same protein-RNA complexes belonging to different organisms from the Protein Data Bank (PDB). The similarities and differences in each set of complexes have been revealed in terms of various sequence- and structure-based features such as root mean square deviation, sequence homology, propensity of binding site residues, variance, conservation at binding sites, binding segments, binding motifs of amino acid residues and nucleotides, preferred amino acid-nucleotide pairs and influence of neighboring residues for binding. Results: We found that the proteins of mesophilic organisms have more binding sites than those of thermophiles, and the binding propensities of amino acid residues are distinct in E. coli, H. sapiens, S. cerevisiae, thermophiles and archaea. Proteins prefer to bind with RNA using a single-residue segment in all the organisms, while RNA prefers to use a stretch of up to six nucleotides for binding with proteins. We have developed amino acid residue-nucleotide pair potentials for different organisms, which could be used for predicting the binding specificity. Further, molecular dynamics simulation studies on aspartyl tRNA synthetase complexed with aspartyl tRNA showed specific modes of recognition in E. coli, T. thermophilus and S. cerevisiae. Conclusion: Based on structural analysis and molecular dynamics simulations, we suggest that the mode of recognition depends on the type of the organism in a protein-RNA complex. Reviewers: This article was reviewed by Sandor Pongor, Gajendra Raghava and Narayanaswamy Srinivasan.
Electronic supplementary material: The online version of this article (doi:10.1186/s13062-015-0039-8) contains supplementary material, which is available to authorized users.
Affiliation(s)
- Raju Nagarajan
- Department of Biotechnology, Bhupat Jyoti Metha School of Biosciences, Indian Institute of Technology Madras, Chennai, 600036, Tamilnadu, India.
- Sonia Pankaj Chothani
- Department of Biotechnology, Bhupat Jyoti Metha School of Biosciences, Indian Institute of Technology Madras, Chennai, 600036, Tamilnadu, India; Philips Research North America, 345 Scarborough Road, Briarcliff Manor, NY, 10510, USA.
- Chandrasekaran Ramakrishnan
- Department of Biotechnology, Bhupat Jyoti Metha School of Biosciences, Indian Institute of Technology Madras, Chennai, 600036, Tamilnadu, India.
- Masakazu Sekijima
- Global Scientific Information and Computing Center (GSIC), Tokyo Institute of Technology, 2-12-1 Ookayama, Meguro-ku, Tokyo, 152-8550, Japan.
- M Michael Gromiha
- Department of Biotechnology, Bhupat Jyoti Metha School of Biosciences, Indian Institute of Technology Madras, Chennai, 600036, Tamilnadu, India.
|
44
|
Barik A, C N, Pilla SP, Bahadur RP. Molecular architecture of protein-RNA recognition sites. J Biomol Struct Dyn 2015; 33:2738-51. [PMID: 25562181 DOI: 10.1080/07391102.2015.1004652] [Citation(s) in RCA: 32] [Impact Index Per Article: 3.6] [Indexed: 10/24/2022]
Abstract
The molecular architecture of protein-RNA interfaces is analyzed using a non-redundant dataset of 152 protein-RNA complexes. We find that an average protein-RNA interface is smaller than an average protein-DNA interface but larger than an average protein-protein interface. Among the different classes of protein-RNA complexes, interfaces with tRNA are the largest, while the interfaces with single-stranded RNA are the smallest. Significantly, RNA contributes more to the interface area than its partner protein. Moreover, unlike protein-protein interfaces where the side chain contributes less to the interface area compared to the main chain, the main chain and side chain contributions are flipped in protein-RNA interfaces. We find that the protein surface in contact with the RNA in protein-RNA complexes is better packed than that in contact with the DNA in protein-DNA complexes, but more loosely packed than that in contact with the protein in protein-protein complexes. Shape complementarity and electrostatic potential are the two major factors that determine the specificity of the protein-RNA interaction. We find that the H-bond density at the protein-RNA interfaces is similar to that of protein-DNA interfaces but higher than that of protein-protein interfaces. Unlike protein-DNA interfaces where the deoxyribose has little role in intermolecular H-bonds, due to the presence of an oxygen atom at the 2' position, the ribose in RNA plays a significant role in protein-RNA H-bonds. We find that besides H-bonds, salt bridges and stacking interactions also play a significant role in stabilizing protein-nucleic acid interfaces; however, their contribution at the protein-protein interfaces is insignificant.
Affiliation(s)
- Amita Barik
- Computational Structural Biology Laboratory, Department of Biotechnology, Indian Institute of Technology Kharagpur, Kharagpur, India
- Nithin C
- Computational Structural Biology Laboratory, Department of Biotechnology, Indian Institute of Technology Kharagpur, Kharagpur, India
- Smita P Pilla
- Computational Structural Biology Laboratory, Department of Biotechnology, Indian Institute of Technology Kharagpur, Kharagpur, India
- Ranjit Prasad Bahadur
- Computational Structural Biology Laboratory, Department of Biotechnology, Indian Institute of Technology Kharagpur, Kharagpur, India
|
45
|
Kyne C, Ruhle B, Gautier VW, Crowley PB. Specific ion effects on macromolecular interactions in Escherichia coli extracts. Protein Sci 2014; 24:310-8. [PMID: 25492389 DOI: 10.1002/pro.2615] [Citation(s) in RCA: 34] [Impact Index Per Article: 3.4] [Received: 10/11/2014] [Revised: 11/25/2014] [Accepted: 11/27/2014] [Indexed: 12/16/2022]
Abstract
Protein characterization in situ remains a major challenge for protein science. Here, the interactions of ΔTat-GB1 in Escherichia coli cell extracts were investigated by NMR spectroscopy and size exclusion chromatography (SEC). ΔTat-GB1 was found to participate in high molecular weight complexes that remain intact at physiologically relevant ionic strength. This observation helps to explain why ΔTat-GB1 was not detected by in-cell NMR spectroscopy. Extracts pre-treated with RNase A had a different SEC elution profile, indicating that ΔTat-GB1 predominantly interacted with RNA. The roles of biological and laboratory ions in mediating macromolecular interactions were studied. Interestingly, the interactions of ΔTat-GB1 could be disrupted by biologically relevant multivalent ions. The most effective shielding of interactions occurred in Mg²⁺-containing buffers. Moreover, a combination of RNA digestion and Mg²⁺ greatly enhanced the NMR detection of ΔTat-GB1 in cell extracts.
Affiliation(s)
- Ciara Kyne
- School of Chemistry, National University of Ireland Galway, University Road, Galway, Ireland
|
46
|
Mota É, Sousa F, Queiroz JA, Cruz C. Quantitative analysis of the interaction between l-methionine derivative and oligonucleotides. J Biochem 2014; 157:261-70. [PMID: 25425656 DOI: 10.1093/jb/mvu073] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/13/2022]
Abstract
This study explores the use of an l-methionine derivative as a potential affinity ligand for nucleic acid purification. The l-methionine derivative is synthesized by activation of the carboxylic acid group with 1-ethyl-3-(3-dimethylaminopropyl) carbodiimide/N-hydroxysuccinimide, followed by immobilization on an amine sensor surface, previously activated and treated with ethylenediamine. Its affinity towards oligonucleotides has been determined by a surface plasmon resonance biosensor. The highest affinity is found for cytosine and thymine, followed by adenine, whereas the lowest affinity is found for guanine. For hetero-oligonucleotides the affinity order is CCCTTT > CCCAAA ≈ AAATTT > GGGTTT, showing that nucleotides with cytosine have the highest affinity, and that the presence of guanine reduces the affinity, corroborating the results obtained with homo-oligonucleotides.
Affiliation(s)
- Élia Mota
- CICS-UBI - Centro de Investigação em Ciências da Saúde, Universidade da Beira Interior, Av. Infante D. Henrique, 6200-506 Covilhã, Portugal
- Fani Sousa
- CICS-UBI - Centro de Investigação em Ciências da Saúde, Universidade da Beira Interior, Av. Infante D. Henrique, 6200-506 Covilhã, Portugal
- João A Queiroz
- CICS-UBI - Centro de Investigação em Ciências da Saúde, Universidade da Beira Interior, Av. Infante D. Henrique, 6200-506 Covilhã, Portugal
- Carla Cruz
- CICS-UBI - Centro de Investigação em Ciências da Saúde, Universidade da Beira Interior, Av. Infante D. Henrique, 6200-506 Covilhã, Portugal
|
47
|
The structure, function and evolution of proteins that bind DNA and RNA. Nat Rev Mol Cell Biol 2014; 15:749-60. [PMID: 25269475 DOI: 10.1038/nrm3884] [Citation(s) in RCA: 241] [Impact Index Per Article: 24.1] [Indexed: 12/14/2022]
Abstract
Proteins that bind both DNA and RNA typify the ability of a single gene product to perform multiple functions. Such DNA- and RNA-binding proteins (DRBPs) have unique functional characteristics that stem from their specific structural features; these developed early in evolution and are widely conserved. Proteins that bind RNA have typically been considered as functionally distinct from proteins that bind DNA and studied independently. This practice is becoming outdated, partly owing to the discovery of long non-coding RNAs (lncRNAs) that target DNA-binding proteins. Consequently, DRBPs were found to regulate many cellular processes, including transcription, translation, gene silencing, microRNA biogenesis and telomere maintenance.
|
48
|
Abstract
We investigate the role of water molecules in 89 protein–RNA complexes taken from the Protein Data Bank. Those with tRNA and single-stranded RNA are less hydrated than those with duplex RNA or ribosomal proteins. Protein–RNA interfaces are hydrated less than protein–DNA interfaces, but more than protein–protein interfaces. The majority of the waters at protein–RNA interfaces make multiple H-bonds; however, a fraction do not make any. Those making H-bonds prefer the polar groups of the RNA over those of its partner protein. The spatial distribution of waters makes interfaces with ribosomal proteins and single-stranded RNA ‘drier’ than interfaces with tRNA and duplex RNA. In contrast to protein–DNA interfaces, mainly due to the presence of the 2′OH, the ribose in protein–RNA interfaces is hydrated more than the phosphate or the bases. The minor groove in protein–RNA interfaces is hydrated more than the major groove, while in protein–DNA interfaces the reverse holds. The strands make the highest number of water-mediated H-bonds per unit interface area, followed by the helices and the non-regular structures. The preserved waters at protein–RNA interfaces make a higher number of H-bonds than the other waters. Preserved waters contribute toward the affinity in protein–RNA recognition and should be carefully treated while engineering protein–RNA interfaces.
Affiliation(s)
- Amita Barik
- Department of Biotechnology, Indian Institute of Technology Kharagpur, Kharagpur-721302, India
- Ranjit Prasad Bahadur
- Department of Biotechnology, Indian Institute of Technology Kharagpur, Kharagpur-721302, India
|
49
|
Yang XX, Deng ZL, Liu R. RBRDetector: Improved prediction of binding residues on RNA-binding protein structures using complementary feature- and template-based strategies. Proteins 2014; 82:2455-71. [DOI: 10.1002/prot.24610] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.8] [Received: 02/02/2014] [Revised: 04/28/2014] [Accepted: 05/09/2014] [Indexed: 11/05/2022]
Affiliation(s)
- Xiao-Xia Yang
- Agricultural Bioinformatics Key Laboratory of Hubei Province; College of Informatics; Huazhong Agricultural University; Wuhan 430070 People's Republic of China
- Zhi-Luo Deng
- Agricultural Bioinformatics Key Laboratory of Hubei Province; College of Informatics; Huazhong Agricultural University; Wuhan 430070 People's Republic of China
- Rong Liu
- Agricultural Bioinformatics Key Laboratory of Hubei Province; College of Informatics; Huazhong Agricultural University; Wuhan 430070 People's Republic of China
|
50
|
Nagarajan R, Gromiha MM. Prediction of RNA binding residues: an extensive analysis based on structure and function to select the best predictor. PLoS One 2014; 9:e91140. [PMID: 24658593 PMCID: PMC3962366 DOI: 10.1371/journal.pone.0091140] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.7] [Received: 12/14/2013] [Accepted: 02/08/2014] [Indexed: 11/18/2022]
Abstract
Protein-RNA complexes play key roles in several cellular processes by the interactions of amino acids with RNA. To understand the recognition mechanism, it is important to identify the specific amino acids involved in RNA binding. Various computational methods have been developed for predicting RNA binding residues from protein sequence. However, their performances mainly depend on the training dataset, feature selection for developing a model and learning capacity of the model. Hence, it is important to reveal the correspondence between the performance of methods and properties of RNA-binding proteins (RBPs). In this work, we have collected all available RNA binding residues prediction methods and revealed their performances on unbiased, stringent and diverse datasets for RBPs with less than 25% sequence identity based on structural class, fold, superfamily, family, protein function, RNA type, RNA strand and RNA conformation. The best methods for each type of RBPs and the type of RBPs, which require further refinement in prediction, have been brought out. We also analyzed the performance of these methods for the disordered regions, structures which are not included in the training dataset and recently solved structures. The reliability of prediction is better than randomly choosing any method or combination of methods. This approach would be a valuable resource for biologists to choose the best method based on the type of RBPs for designing their experiments and the tool is freely accessible online at www.iitm.ac.in/bioinfo/RNA-protein/.
Affiliation(s)
- R. Nagarajan
- Department of Biotechnology, Indian Institute of Technology Madras, Chennai, India
- M. Michael Gromiha
- Department of Biotechnology, Indian Institute of Technology Madras, Chennai, India
|