1
Chen YC, Sargsyan K, Wright JD, Chen YH, Huang YS, Lim C. PPI-hotspotID for detecting protein-protein interaction hot spots from the free protein structure. eLife 2024; 13:RP96643. [PMID: 39283314 PMCID: PMC11405013 DOI: 10.7554/elife.96643] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 09/22/2024]
Abstract
Experimental detection of residues critical for protein-protein interactions (PPI) is a time-consuming, costly, and labor-intensive process. Hence, high-throughput PPI-hot spot prediction methods have been developed, but they have been validated using relatively small datasets, which may compromise their predictive reliability. Here, we introduce PPI-hotspotID, a novel method for identifying PPI-hot spots using the free protein structure, and validate it on the largest collection of experimentally confirmed PPI-hot spots to date. We explored the possibility of detecting PPI-hot spots using (i) FTMap in the PPI mode, which identifies hot spots on protein-protein interfaces from the free protein structure, and (ii) the interface residues predicted by AlphaFold-Multimer. PPI-hotspotID yielded better performance than FTMap and SPOTONE, a webserver for predicting PPI-hot spots given the protein sequence. When combined with the AlphaFold-Multimer-predicted interface residues, PPI-hotspotID yielded better performance than either method alone. Furthermore, we experimentally verified several PPI-hotspotID-predicted PPI-hot spots of eukaryotic elongation factor 2. Notably, PPI-hotspotID can reveal PPI-hot spots not obvious from complex structures, including those in indirect contact with binding partners. PPI-hotspotID serves as a valuable tool for understanding PPI mechanisms and aiding drug design. It is available as a web server (https://ppihotspotid.limlab.dnsalias.org/) and open-source code (https://github.com/wrigjz/ppihotspotid/).
Affiliation(s)
- Yao Chi Chen
- Institute of Biomedical Sciences, Academia Sinica, Taipei, Taiwan
- Karen Sargsyan
- Institute of Biomedical Sciences, Academia Sinica, Taipei, Taiwan
- Jon D Wright
- Institute of Biomedical Sciences, Academia Sinica, Taipei, Taiwan
- Yu-Hsien Chen
- Institute of Biomedical Sciences, Academia Sinica, Taipei, Taiwan
- Yi-Shuian Huang
- Institute of Biomedical Sciences, Academia Sinica, Taipei, Taiwan
- Carmay Lim
- Institute of Biomedical Sciences, Academia Sinica, Taipei, Taiwan
2
Blanco JD, Radusky L, Climente-González H, Serrano L. FoldX accurate structural protein-DNA binding prediction using PADA1 (Protein Assisted DNA Assembly 1). Nucleic Acids Res 2018; 46:3852-3863. [PMID: 29608705 PMCID: PMC5934639 DOI: 10.1093/nar/gky228] [Citation(s) in RCA: 16] [Impact Index Per Article: 3.2] [Received: 01/18/2018] [Accepted: 03/20/2018] [Indexed: 12/20/2022]
Abstract
The speed at which new genomes are being sequenced highlights the need for genome-wide methods capable of predicting protein–DNA interactions. Here, we present PADA1, a generic algorithm that accurately models structural complexes and predicts the DNA-binding regions of resolved protein structures. PADA1 relies on a library of protein and double-stranded DNA fragment pairs obtained from a training set of 2103 DNA–protein complexes. It includes a fast statistical force field computed from atom-atom distances, to evaluate and filter the 3D docking models. Using published benchmark validation sets and 212 DNA–protein structures published after 2016, we predicted the DNA-binding regions with an RMSD of <1.8 Å per residue in >95% of the cases. We show that the quality of the docked templates is compatible with the FoldX protein design tool suite to identify the crystallized DNA molecule sequence as the most energetically favorable in 80% of the cases. We highlighted the biological potential of PADA1 by reconstituting DNA and protein conformational changes upon protein mutagenesis of a meganuclease and its variants, and by predicting DNA-binding regions and nucleotide sequences in proteins crystallized without DNA. These results open up new perspectives for the engineering of DNA–protein interfaces.
Affiliation(s)
- Javier Delgado Blanco
- Centre for Genomic Regulation (CRG), The Barcelona Institute for Science and Technology, Dr. Aiguader 88, 08003 Barcelona, Spain
- Leandro Radusky
- Centre for Genomic Regulation (CRG), The Barcelona Institute for Science and Technology, Dr. Aiguader 88, 08003 Barcelona, Spain
- Héctor Climente-González
- Centre for Genomic Regulation (CRG), The Barcelona Institute for Science and Technology, Dr. Aiguader 88, 08003 Barcelona, Spain
- Luis Serrano
- Centre for Genomic Regulation (CRG), The Barcelona Institute for Science and Technology, Dr. Aiguader 88, 08003 Barcelona, Spain; Universitat Pompeu Fabra (UPF), Barcelona, Spain; Institució Catalana de Recerca i Estudis Avançats (ICREA), Pg. Lluis Companys 23, 08010 Barcelona, Spain
3
Abstract
The increasing number of protein structures with uncharacterized function necessitates the development of in silico prediction methods for the functional annotation of proteins. In this chapter, different kinds of computational approaches for predicting DNA-binding residues on the surface of DNA-binding proteins are briefly introduced, and the merits and limitations of these methods are discussed. The chapter focuses on structure-based approaches and mainly discusses the framework of machine learning methods as applied to the DNA-binding prediction task.
4
Chandrasekaran A, Chan J, Lim C, Yang LW. Protein Dynamics and Contact Topology Reveal Protein–DNA Binding Orientation. J Chem Theory Comput 2016; 12:5269-5277. [DOI: 10.1021/acs.jctc.6b00688] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.9] [Indexed: 01/02/2023]
Affiliation(s)
- Lee-Wei Yang
- Physics Division, National Center for Theoretical Sciences, Hsinchu 30013, Taiwan
5
Miao Z, Westhof E. A Large-Scale Assessment of Nucleic Acids Binding Site Prediction Programs. PLoS Comput Biol 2015; 11:e1004639. [PMID: 26681179 PMCID: PMC4683125 DOI: 10.1371/journal.pcbi.1004639] [Citation(s) in RCA: 55] [Impact Index Per Article: 6.1] [Received: 08/25/2015] [Accepted: 10/30/2015] [Indexed: 11/18/2022]
Abstract
Computational prediction of nucleic acid binding sites in proteins is necessary to disentangle functional mechanisms in most biological processes and to explore the binding mechanisms. Several strategies have been proposed, but the state-of-the-art approaches display a great diversity in i) the definition of nucleic acid binding sites; ii) the training and test datasets; iii) the algorithmic methods for the prediction strategies; iv) the performance measures and v) the distribution and availability of the prediction programs. Here we report a large-scale assessment of 19 web servers and 3 stand-alone programs on 41 datasets including more than 5000 proteins derived from 3D structures of protein-nucleic acid complexes. Well-defined binary assessment criteria (specificity, sensitivity, precision, accuracy…) are applied. We found that i) the tools have been greatly improved over the years; ii) some of the approaches suffer from theoretical defects and there is still room for sorting out the essential mechanisms of binding; iii) RNA binding and DNA binding appear to follow similar driving forces and iv) dataset bias may exist in some methods.
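The binary assessment criteria named in this abstract can be illustrated with a minimal sketch. It uses the standard textbook definitions computed from per-residue confusion counts; the `Confusion` class, the `binary_metrics` function, and the example counts are hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Confusion:
    """Per-residue confusion counts for a binary binding-site prediction."""
    tp: int  # binding residues predicted as binding
    fp: int  # non-binding residues predicted as binding
    tn: int  # non-binding residues predicted as non-binding
    fn: int  # binding residues predicted as non-binding


def binary_metrics(c: Confusion) -> dict:
    """Return sensitivity, specificity, precision and accuracy as fractions."""

    def ratio(num: int, den: int) -> float:
        # An undefined ratio (empty denominator) is reported as NaN.
        return num / den if den else float("nan")

    total = c.tp + c.fp + c.tn + c.fn
    return {
        "sensitivity": ratio(c.tp, c.tp + c.fn),
        "specificity": ratio(c.tn, c.tn + c.fp),
        "precision": ratio(c.tp, c.tp + c.fp),
        "accuracy": ratio(c.tp + c.tn, total),
    }


# Hypothetical counts for one protein with 200 residues, 40 of them binding.
example = binary_metrics(Confusion(tp=30, fp=20, tn=140, fn=10))
for name, value in example.items():
    print(f"{name}: {value:.3f}")
```

Because binding residues are usually a small minority, accuracy alone can look high even when sensitivity and precision are modest, which is one reason to report all four quantities together.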
Affiliation(s)
- Zhichao Miao
- Architecture et Réactivité de l'ARN, Université de Strasbourg, Institut de Biologie Moléculaire et Cellulaire du CNRS, Strasbourg, France
- Eric Westhof
- Architecture et Réactivité de l'ARN, Université de Strasbourg, Institut de Biologie Moléculaire et Cellulaire du CNRS, Strasbourg, France
6
Wang W, Liu J, Xiong Y, Zhu L, Zhou X. Analysis and classification of DNA-binding sites in single-stranded and double-stranded DNA-binding proteins using protein information. IET Syst Biol 2014; 8:176-83. [PMID: 25075531 DOI: 10.1049/iet-syb.2013.0048] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.0] [Indexed: 11/19/2022]
Abstract
Single-stranded DNA-binding proteins (SSBs) and double-stranded DNA-binding proteins (DSBs) play different roles in biological processes when they bind to single-stranded DNA (ssDNA) or double-stranded DNA (dsDNA). However, the underlying binding mechanisms of SSBs and DSBs have not yet been fully understood. Here, the authors first constructed two groups of ssDNA- and dsDNA-specific binding sites from two non-redundant sets of SSBs and DSBs. They further analysed the relationship between the two classes of binding sites and a newly proposed set of features (residue charge distribution, secondary structure and spatial shape). To assess and utilise the predictive power of these features, they trained a classification model using a support vector machine to make predictions about the ssDNA and the dsDNA binding sites. The authors' analysis and prediction results indicated that the two classes of binding sites can be distinguished by the three types of features, and the final classifier using all the features achieved satisfactory performance. In conclusion, the proposed features will deepen the understanding of the specificity of proteins which bind to ssDNA or dsDNA.
Affiliation(s)
- Wei Wang
- School of Computer, Wuhan University, Wuhan, Hubei, People's Republic of China
- Juan Liu
- School of Computer, Wuhan University, Wuhan, Hubei, People's Republic of China
- Yi Xiong
- Department of Biological Sciences, Purdue University, West Lafayette, Indiana 47907, USA
- Lida Zhu
- School of Computer, Wuhan University, Wuhan, Hubei, People's Republic of China
- Xionghui Zhou
- School of Computer, Wuhan University, Wuhan, Hubei, People's Republic of China
7
Chen YC, Sargsyan K, Wright JD, Huang YS, Lim C. Identifying RNA-binding residues based on evolutionary conserved structural and energetic features. Nucleic Acids Res 2013; 42:e15. [PMID: 24343026 PMCID: PMC3919582 DOI: 10.1093/nar/gkt1299] [Citation(s) in RCA: 24] [Impact Index Per Article: 2.2] [Indexed: 11/15/2022]
Abstract
Increasing numbers of protein structures are solved each year, but many of these structures belong to proteins whose sequences are homologous to sequences in the Protein Data Bank. Nevertheless, the structures of homologous proteins belonging to the same family contain useful information because functionally important residues are expected to preserve physico-chemical, structural and energetic features. This information forms the basis of our method, which detects RNA-binding residues of a given RNA-binding protein as those residues that preserve physico-chemical, structural and energetic features in its homologs. Tests on 81 RNA-bound and 35 RNA-free protein structures showed that our method yields a higher fraction of true RNA-binding residues (higher precision) than two structure-based and two sequence-based machine-learning methods. Because the method requires no training data set and has no parameters, its precision does not degrade when applied to 'novel' protein sequences unlike methods that are parameterized for a given training data set. It was used to predict the 'unknown' RNA-binding residues in the C-terminal RNA-binding domain of human CPEB3. The two predicted residues, F430 and F474, were experimentally verified to bind RNA, in particular F430, whose mutation to alanine or asparagine nearly abolished RNA binding. The method has been implemented in a webserver called DR_bind1, which is freely available with no login requirement at http://drbind.limlab.ibms.sinica.edu.tw.
Affiliation(s)
- Yao Chi Chen
- Institute of Biomedical Sciences, Academia Sinica, Taipei 115, Taiwan, Genomics Research Center, Academia Sinica, Taipei 115, Taiwan and Department of Chemistry, National Tsing Hua University, Hsinchu 300, Taiwan
8
Chen YC, Wright JD, Lim C. DR_bind: a web server for predicting DNA-binding residues from the protein structure based on electrostatics, evolution and geometry. Nucleic Acids Res 2012; 40:W249-56. [PMID: 22661576 PMCID: PMC3394278 DOI: 10.1093/nar/gks481] [Citation(s) in RCA: 36] [Impact Index Per Article: 3.0] [Indexed: 02/06/2023]
Abstract
DR_bind is a web server that automatically predicts DNA-binding residues, given the respective protein structure based on (i) electrostatics, (ii) evolution and (iii) geometry. In contrast to machine-learning methods, DR_bind does not require a training data set or any parameters. It predicts DNA-binding residues by detecting a cluster of conserved, solvent-accessible residues that are electrostatically stabilized upon mutation to Asp−/Glu−. The server requires as input the DNA-binding protein structure in PDB format and outputs a downloadable text file of the predicted DNA-binding residues, a 3D visualization of the predicted residues highlighted in the given protein structure, and a downloadable PyMol script for visualization of the results. Calibration on 83 and 55 non-redundant DNA-bound and DNA-free protein structures yielded a DNA-binding residue prediction accuracy/precision of 90/47% and 88/42%, respectively. Since DR_bind does not require any training using protein–DNA complex structures, it may predict DNA-binding residues in novel structures of DNA-binding proteins resulting from structural genomics projects with no conservation data. The DR_bind server is freely available with no login requirement at http://dnasite.limlab.ibms.sinica.edu.tw.
Affiliation(s)
- Yao Chi Chen
- Institute of Biomedical Sciences, Genomics Research Center, Academia Sinica, Taipei 115, Taiwan
9
Xiong Y, Xia J, Zhang W, Liu J. Exploiting a reduced set of weighted average features to improve prediction of DNA-binding residues from 3D structures. PLoS One 2011; 6:e28440. [PMID: 22174808 PMCID: PMC3234263 DOI: 10.1371/journal.pone.0028440] [Citation(s) in RCA: 26] [Impact Index Per Article: 2.0] [Received: 09/22/2011] [Accepted: 11/08/2011] [Indexed: 01/29/2023]
Abstract
Predicting DNA-binding residues from a protein three-dimensional structure is a key task of computational structural proteomics. In the present study, based on machine learning technology, we aim to explore a reduced set of weighted average features for improving prediction of DNA-binding residues on protein surfaces. By constructing the spatial environment around a DNA-binding residue, a novel weighting factor is first proposed to quantify the distance-dependent contribution of each neighboring residue in determining the location of a binding residue. Then, a weighted average scheme is introduced to represent the surface patch of the residue under consideration. Finally, the classifier is trained on the reduced set of these weighted average features, consisting of evolutionary profile, interface propensity, betweenness centrality and solvent surface area of side chain. Experimental results on 5-fold cross validation and independent tests indicate that the new feature set is effective in describing DNA-binding residues and that our approach has significantly better performance than two previous methods. Furthermore, a brief case study suggests that the weighted average features are powerful for identifying DNA-binding residues and are promising for further study of protein structure-function relationship. The source code and datasets are available upon request.
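The weighted average scheme described in this abstract can be sketched as follows. The abstract does not give the form of the distance-dependent weighting factor, so the Gaussian weight used here, the `sigma` parameter, and the function name `weighted_average_features` are assumptions chosen for illustration only, not the authors' method.

```python
import math


def weighted_average_features(center, neighbors, sigma=5.0):
    """Average neighbor feature vectors with distance-dependent weights.

    center    -- (x, y, z) coordinates of the residue under consideration
    neighbors -- list of ((x, y, z), feature_vector) pairs for patch residues
    sigma     -- width in angstroms of the ASSUMED Gaussian weight
    """
    if not neighbors:
        raise ValueError("at least one neighbor is required")

    # Assumed weight: closer residues contribute more to the patch average.
    weights = []
    for coords, _ in neighbors:
        d = math.dist(center, coords)
        weights.append(math.exp(-(d * d) / (2.0 * sigma * sigma)))

    total = sum(weights)
    n_features = len(neighbors[0][1])
    return [
        sum(w * feats[i] for w, (_, feats) in zip(weights, neighbors)) / total
        for i in range(n_features)
    ]


# Hypothetical patch: two residues at equal distance from the center,
# each described by two made-up feature values.
patch = [((1.0, 0.0, 0.0), [1.0, 0.0]), ((0.0, 1.0, 0.0), [3.0, 2.0])]
print(weighted_average_features((0.0, 0.0, 0.0), patch))
```

With equidistant neighbors the result reduces to a plain mean; a neighbor far from the center relative to `sigma` contributes almost nothing.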
Affiliation(s)
- Yi Xiong
- School of Computer, Wuhan University, Wuhan, China
- Junfeng Xia
- Department of Biomedical Informatics, School of Medicine, Vanderbilt University, Nashville, Tennessee, United States of America
- Wen Zhang
- School of Computer, Wuhan University, Wuhan, China
- Juan Liu
- School of Computer, Wuhan University, Wuhan, China
10
Shazman S, Elber G, Mandel-Gutfreund Y. From face to interface recognition: a differential geometric approach to distinguish DNA from RNA binding surfaces. Nucleic Acids Res 2011; 39:7390-9. [PMID: 21693557 PMCID: PMC3177183 DOI: 10.1093/nar/gkr395] [Citation(s) in RCA: 21] [Impact Index Per Article: 1.6] [Indexed: 12/15/2022]
Abstract
Protein–nucleic acid interactions play a critical role in all steps of the gene expression pathway. Nucleic acid (NA) binding proteins interact with their partners, DNA or RNA, via distinct regions on their surface that are characterized by an ensemble of chemical, physical and geometrical properties. In this study, we introduce a novel methodology based on differential geometry, commonly used in face recognition, to characterize and predict NA binding surfaces on proteins. Applying the method to experimentally solved three-dimensional structures of proteins, we successfully distinguish double-stranded DNA (dsDNA) from single-stranded RNA (ssRNA) binding proteins, with 83% accuracy. We show that the method is insensitive to conformational changes that occur upon binding and is applicable to de novo protein-function prediction. Remarkably, when concentrating on the zinc finger motif, we distinguish successfully between RNA and DNA binding interfaces possessing the same binding motif even within the same protein, as demonstrated for the RNA polymerase transcription factor TFIIIA. In conclusion, we present a novel methodology to characterize protein surfaces, which can accurately tell apart dsDNA from ssRNA binding interfaces. The strength of our method in recognizing fine-tuned differences on NA binding interfaces makes it applicable to many other molecular recognition problems, with potential implications for drug design.
Affiliation(s)
- Shula Shazman
- Department of Computer Science, Technion-Israel Institute of Technology, Haifa, Israel
11
Wu CY, Chen YC, Lim C. A structural-alphabet-based strategy for finding structural motifs across protein families. Nucleic Acids Res 2010; 38:e150. [PMID: 20525797 PMCID: PMC2919736 DOI: 10.1093/nar/gkq478] [Citation(s) in RCA: 22] [Impact Index Per Article: 1.6] [Indexed: 01/08/2023]
Abstract
Proteins with insignificant sequence and overall structure similarity may still share locally conserved contiguous structural segments; i.e. structural/3D motifs. Most methods for finding 3D motifs require a known motif to search for other similar structures or functionally/structurally crucial residues. Here, without requiring a query motif or essential residues, a fully automated method for discovering 3D motifs of various sizes across protein families with different folds based on a 16-letter structural alphabet is presented. It was applied to structurally non-redundant proteins bound to DNA, RNA, obligate/non-obligate proteins as well as free DNA-binding proteins (DBPs) and proteins with known structures but unknown function. Its usefulness was illustrated by analyzing the 3D motifs found in DBPs. A non-specific motif was found with a ‘corner’ architecture that confers a stable scaffold and enables diverse interactions, making it suitable for binding not only DNA but also RNA and proteins. Furthermore, DNA-specific motifs present ‘only’ in DBPs were discovered. The motifs found can provide useful guidelines in detecting binding sites and computational protein redesign.
Affiliation(s)
- Chih Yuan Wu
- Department of Chemistry, National Tsing Hua University, Hsinchu, Taiwan
12
Sun Y, Liu Z, Zhang S. Tissue distribution, developmental expression and up-regulation of p8 transcripts on stress in zebrafish. FISH & SHELLFISH IMMUNOLOGY 2010; 28:549-554. [PMID: 20036747 DOI: 10.1016/j.fsi.2009.12.010] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.3] [Received: 10/19/2009] [Revised: 12/10/2009] [Accepted: 12/12/2009] [Indexed: 05/28/2023]
Abstract
p8 is a transcription factor with a basic helix-loop-helix (bHLH) motif and a nuclear localization signal. A zebrafish p8 cDNA, which consists of 732 bp and encodes 75 amino acids, was identified in this paper. Sequence alignment showed that the bHLH region of p8 was well conserved during evolution. Phylogenetic analysis revealed that zebrafish p8 was close to its homologous protein in frog, with the two clustering together in the vertebrate clade. The zebrafish p8 mRNA expression levels varied considerably among the adult tissues examined, with clearly higher expression in backbone and liver. During embryogenesis, zebrafish p8 mRNA expression was high at the cleavage stage, decreased from the blastula to the segmentation stage, and rose sharply at the hatching stage. Quantitative real-time PCR assays suggested that zebrafish p8 expression is up-regulated by a wide range of cellular stressors such as starvation, temperature, osmotic pressure and pH value, implying an important role of the p8 gene in the response to stress.
Affiliation(s)
- Yanling Sun
- Key Laboratory of Marine Genetics and Breeding (Ocean University of China), Ministry of Education, Qingdao 266003, PR China
13
Wang YT, Wright JD, Doudeva LG, Jhang HC, Lim C, Yuan HS. Redesign of high-affinity nonspecific nucleases with altered sequence preference. J Am Chem Soc 2010; 131:17345-53. [PMID: 19929021 DOI: 10.1021/ja907160r] [Citation(s) in RCA: 12] [Impact Index Per Article: 0.9] [Indexed: 01/10/2023]
Abstract
It is of crucial importance to elucidate the underlying principles that govern the binding affinity and selectivity between proteins and DNA. Here we use the nuclease domain of Colicin E7 (nColE7) as a model system to generate redesigned nucleases with improved DNA-binding affinities. ColE7 is a bacterial toxin, bearing a nonspecific endonuclease domain with a preference for hydrolyzing DNA phosphodiester bonds at the 3'O-side after thymine and adenine; i.e., it prefers Thy and Ade at the -1 site. Using systematic computational screening, six nColE7 mutants were predicted to bind DNA with high affinity. Five of the redesigned single-point mutants were constructed and purified, and four mutants had a 3- to 5-fold higher DNA binding affinity than wild-type nColE7 as measured by fluorescence kinetic assays. Moreover, three of the designed mutants, D493N, D493Q, and D493R, digested DNA with an increased preference for guanine at +3 sites compared to the wild-type enzyme, as shown by DNA footprint assays. X-ray structure determination of the ColE7 mutant D493Q-DNA complex in conjunction with structural and free energy decomposition analyses provides a physical basis for the improved protein-DNA interactions: Replacing D493 at the protein-DNA interface with an amino acid residue that can maintain the native hydrogen bonds removes the unfavorable electrostatic repulsion between the negatively charged carboxylate and DNA phosphate groups. These results show that computational screening combined with biochemical, structural, and free energy analyses provides a useful means for generating redesigned nucleases with a higher DNA-binding affinity and altered sequence preferences in DNA cleavage.
Affiliation(s)
- Yi-Ting Wang
- Institute of Molecular Biology, Academia Sinica, Taipei, Taiwan, ROC
14
Huang YF, Huang CC, Liu YC, Oyang YJ, Huang CK. DNA-binding residues and binding mode prediction with binding-mechanism concerned models. BMC Genomics 2009; 10 Suppl 3:S23. [PMID: 19958487 PMCID: PMC2788376 DOI: 10.1186/1471-2164-10-s3-s23] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.6] [Indexed: 11/22/2022]
Abstract
Background: Protein-DNA interactions are essential for fundamental biological activities including DNA transcription, replication, packaging, repair and rearrangement. Proteins interacting with DNA can be classified into two categories of binding mechanisms: sequence-specific and non-specific binding. Protein-DNA specific binding provides a mechanism to recognize correct nucleotide base pairs for sequence-specific identification. Protein-DNA non-specific binding is a sequence-independent interaction with the DNA backbone that accelerates targeting. Both sequence-specific and non-specific binding residues contribute to the interaction.
Results: The proposed framework has two predictor stages: DNA-binding residue prediction and binding mode prediction. In the first stage, DNA-binding residue prediction, the predictor for DNA specific binding residues achieves 96.45% accuracy with 50.14% sensitivity, 99.31% specificity, 81.70% precision, and 62.15% F-measure. The predictor for DNA non-specific binding residues achieves 89.14% accuracy with 53.06% sensitivity, 95.25% specificity, 65.47% precision, and 58.62% F-measure. When the prediction results for sequence-specific and non-specific binding residues are combined with an OR operation, the predictor achieves 89.26% accuracy with 56.86% sensitivity, 95.63% specificity, 71.92% precision, and 63.51% F-measure. In the second stage, protein-DNA binding mode prediction achieves 75.83% accuracy using a support vector machine with multi-class prediction.
Conclusion: This article presents the design of a sequence-based predictor that aims to identify sequence-specific and non-specific binding residues in a transcription factor while taking the DNA binding mechanism into account. The protein-DNA binding mode prediction was introduced to help improve DNA-binding residue prediction. In addition, the results of this study will help with the design of binding-mechanism concerned predictors for other families of proteins interacting with DNA.
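The F-measure values quoted in this abstract can be cross-checked against the quoted precision and sensitivity. The sketch below assumes that "F-measure" means the balanced F1 score, i.e. the harmonic mean of precision and sensitivity; the abstract does not state this explicitly, and the function name `f_measure` is hypothetical.

```python
def f_measure(precision: float, sensitivity: float) -> float:
    """Balanced F-measure (F1): harmonic mean of precision and sensitivity.

    Inputs and output share the same scale (percentages here).
    """
    if precision + sensitivity == 0:
        return 0.0
    return 2.0 * precision * sensitivity / (precision + sensitivity)


# (precision %, sensitivity %, reported F-measure %) as quoted in the abstract.
reported = {
    "sequence-specific": (81.70, 50.14, 62.15),
    "non-specific": (65.47, 53.06, 58.62),
    "combined (OR)": (71.92, 56.86, 63.51),
}

for name, (precision, sensitivity, quoted) in reported.items():
    computed = f_measure(precision, sensitivity)
    print(f"{name}: computed {computed:.2f}%, quoted {quoted:.2f}%")
```

Under this assumption the computed values should match the quoted ones to within rounding of the published percentages.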
Affiliation(s)
- Yu-Feng Huang
- Department of Computer Science and Information Engineering, National Taiwan University, Taipei, 106, Taiwan, Republic of China.
15
Tang YL, Shi YH, Zhao W, Hao G, Le GW. Interaction of MDpep9, a novel antimicrobial peptide from Chinese traditional edible larvae of housefly, with Escherichia coli genomic DNA. Food Chem 2009. [DOI: 10.1016/j.foodchem.2008.12.102] [Citation(s) in RCA: 49] [Impact Index Per Article: 3.3] [Indexed: 11/27/2022]
16
In silico cloning and characterization of p8 homolog cDNA from common urchin (Paracentrotus lividus). Mol Biol Rep 2009; 36:2431-7. [DOI: 10.1007/s11033-009-9474-x] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.3] [Received: 11/02/2008] [Accepted: 02/17/2009] [Indexed: 10/21/2022]
17
Abstract
Protein–DNA/RNA/protein interactions play critical roles in many biological functions. Previous studies have focused on the different features characterizing the different macromolecule-binding sites and approaches to detect these sites. However, no common unique signature of these sites had been reported. Thus, this work aims to provide a ‘common’ principle dictating the location of the different macromolecule-binding sites founded upon fundamental principles of binding thermodynamics. To achieve this aim, a comprehensive set of structurally nonhomologous DNA-, RNA-, obligate protein- and nonobligate protein-binding proteins, both free and bound to their respective macromolecules, was created and a novel strategy for detecting clusters of residues with electrostatic or steric strain given the protein structure was developed. The results show that regardless of the macromolecule type, the binding strength and conformational changes upon binding, macromolecule-binding sites are energetically less stable than nonmacromolecule-binding sites. They also reveal new energetic features distinguishing DNA- from RNA-binding sites and obligate protein- from nonobligate protein-binding sites in both free/bound protein structures.
Affiliation(s)
- Yao Chi Chen
- Institute of Biomedical Sciences, Academia Sinica, Taipei, Taiwan
18
Rutledge LR, Wetmore SD. Remarkably Strong T-Shaped Interactions between Aromatic Amino Acids and Adenine: Their Increase upon Nucleobase Methylation and a Comparison to Stacking. J Chem Theory Comput 2008; 4:1768-80. [DOI: 10.1021/ct8002332] [Citation(s) in RCA: 33] [Impact Index Per Article: 2.1] [Indexed: 02/08/2023]
Affiliation(s)
- Lesley R. Rutledge
- Department of Chemistry and Biochemistry, University of Lethbridge, 4401 University Drive, Lethbridge, Alberta, Canada T1K 3M4
- Stacey D. Wetmore
- Department of Chemistry and Biochemistry, University of Lethbridge, 4401 University Drive, Lethbridge, Alberta, Canada T1K 3M4
19
Chen YC, Lim C. Predicting RNA-binding sites from the protein structure based on electrostatics, evolution and geometry. Nucleic Acids Res 2008; 36:e29. [PMID: 18276647 PMCID: PMC2275128 DOI: 10.1093/nar/gkn008] [Citation(s) in RCA: 39] [Impact Index Per Article: 2.4] [Indexed: 11/12/2022]
Abstract
An RNA-binding protein places a surface helix, β-ribbon, or loop in an RNA helix groove and/or uses a cavity to accommodate unstacked bases. Hence, our strategy for predicting RNA-binding residues is based on detecting a surface patch and a disparate cleft. These were generated and scored according to the gas-phase electrostatic energy change upon mutating each residue to Asp−/Glu− and each residue's relative conservation. The method requires as input the protein structure and sufficient homologous sequences to define each residue's relative conservation. It yields as output a priority list of surface patch residues followed by a backup list of surface cleft residues distant from the patch residues for experimental testing of RNA binding. Among the 69 structurally non-homologous proteins tested, 81% possess an RNA-binding site with at least 70% of the maximum number of true positives in randomly generated patches of the same size as the predicted site; only two proteins did not contain any true RNA-binding residues in both predicted regions. Regardless of the protein conformational changes upon RNA binding, the prediction accuracies based on the RNA-free/bound protein structures were found to be comparable and their binding sites overlapped as long as there are no disordered RNA-binding regions in the free structure that are ordered in the corresponding RNA-bound protein structure.
Affiliation(s)
- Yao Chi Chen
- Department of Chemistry, National Tsing Hua University, Hsinchu 300, Taiwan