1
Sun C, Feng Y. EPDRNA: A Model for Identifying DNA-RNA Binding Sites in Disease-Related Proteins. Protein J 2024; 43:513-521. [PMID: 38491248 DOI: 10.1007/s10930-024-10183-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 02/02/2024] [Indexed: 03/18/2024]
Abstract
Protein-DNA and protein-RNA interactions are involved in many biological processes and regulate many cellular functions. Moreover, they are related to many human diseases. To understand the molecular mechanism of protein-DNA binding and protein-RNA binding, it is important to identify which residues in the protein sequence bind to DNA and RNA. At present, there are few methods for specifically identifying the DNA- and RNA-binding sites of disease-related proteins. In this study, we combined four machine learning algorithms into an ensemble classifier (EPDRNA) to predict DNA and RNA binding sites in disease-related proteins. The dataset used in the model was collated from the UniProt and PDB databases, and PSSM, physicochemical properties and amino acid type were used as features. EPDRNA adopted soft voting and achieved a best AUC value of 0.73 for DNA binding sites and a best AUC value of 0.71 for RNA binding sites in 10-fold cross validation on the training sets. To further verify the performance of the model, we assessed EPDRNA for the prediction of DNA-binding sites and RNA-binding sites on the independent test dataset. EPDRNA achieved 85% recall and 25% precision on the protein-DNA interaction independent test set, and 82% recall and 27% precision on the protein-RNA interaction independent test set. The online EPDRNA webserver is freely available at http://www.s-bioinformatics.cn/epdrna .
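The soft-voting step named in this abstract can be illustrated with a minimal sketch. It assumes each base classifier outputs one binding probability per residue; the function name `soft_vote`, the 0.5 threshold, and the example probabilities are hypothetical and are not taken from the EPDRNA paper.

```python
from typing import List, Sequence


def soft_vote(prob_lists: Sequence[Sequence[float]], threshold: float = 0.5) -> List[int]:
    """Average per-residue binding probabilities across models, then threshold."""
    if not prob_lists:
        raise ValueError("need at least one model")
    n = len(prob_lists[0])
    if any(len(p) != n for p in prob_lists):
        raise ValueError("all models must score the same residues")
    # Soft voting: the ensemble probability is the mean of the model probabilities.
    averaged = [sum(p[i] for p in prob_lists) / len(prob_lists) for i in range(n)]
    return [1 if a >= threshold else 0 for a in averaged]


# Hypothetical probabilities from four base classifiers for five residues.
model_probs = [
    [0.9, 0.2, 0.6, 0.1, 0.4],
    [0.8, 0.3, 0.4, 0.2, 0.6],
    [0.7, 0.1, 0.7, 0.3, 0.5],
    [0.6, 0.4, 0.5, 0.2, 0.3],
]
labels = soft_vote(model_probs)  # 1 = predicted binding residue
```

Averaging probabilities, rather than counting hard votes, lets a confident model outweigh several uncertain ones. Lowering the threshold would typically trade precision for recall.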
Affiliation(s)
- CanZhuang Sun: College of Science, Inner Mongolia Agriculture University, Hohhot, 010018, People's Republic of China
- YongE Feng: College of Science, Inner Mongolia Agriculture University, Hohhot, 010018, People's Republic of China
2
Memon AA, Fu X, Fan XY, Xu L, Xiao J, Rahman MU, Yang X, Yao YF, Deng Z, Ma W. Substrate DNA Promoting Binding of Mycobacterium tuberculosis MtrA by Facilitating Dimerization and Interpretation of Affinity by Minor Groove Width. Microorganisms 2023; 11:2505. [PMID: 37894163 PMCID: PMC10609481 DOI: 10.3390/microorganisms11102505] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/07/2023] [Revised: 09/21/2023] [Accepted: 09/28/2023] [Indexed: 10/29/2023] Open
Abstract
To deepen the understanding of the role and regulation mechanisms of prokaryotic global transcription regulators in complex processes, including virulence, the associations between the affinity and binding sequences of Mycobacterium tuberculosis MtrA have been explored extensively. Analysis of 294 diversified 26 bp MtrA binding sequences revealed that the sequence similarity of fragments was not simply associated with affinity. Unique variation patterns of GC content and periodical and sequential fluctuation of affinity contribution curves were observed along the sequence in this study. Furthermore, docking analysis demonstrated that the structure of the dimer MtrA-DNA (high affinity) was generally consistent with other OmpR family members, while Arg 219 and Gly 220 of the wing domain interacted with the minor groove. The results of the binding box replacement experiment proved that box 2 was essential for binding, which implied differential roles of the two boxes in the binding process. Furthermore, the results of the substitution of the nucleotide at the 20th and/or 21st positions indicated that the affinity was negatively associated with the value of minor groove width precisely at the 21st position. The dimerization of the unphosphorylated MtrA facilitated by a low-affinity DNA fragment was observed for the first time. However, the proportion of the dimer was associated with the affinity of substrate DNA, which further suggested that the affinity was actually one characteristic of the stability of dimers. Based on the finding of 17 inter-molecule hydrogen bonds identified in the interface of the MtrA dimer, including 8 symmetric complementary ones in the conserved α4-β5-α5 face, we propose that hydrogen bonds should be considered just as important as salt bridges and the hydrophobic patch in the dimerization.
Our comprehensive study on a large number of binding fragments with quantitative affinity values provided new insight into the molecular mechanism of dimerization, binding specificity and affinity determination of MtrA and clues for solving the puzzle of how global transcription factors regulate a large quantity of target genes.
Affiliation(s)
- Aadil Ahmed Memon: State Key Laboratory of Microbial Metabolism, School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, 800 Dongchuan Road, Shanghai 200240, China
- Xiang Fu: State Key Laboratory of Microbial Metabolism, School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, 800 Dongchuan Road, Shanghai 200240, China
- Xiao-Yong Fan: Shanghai Institute of Infectious Diseases and Biosecurity, Shanghai Public Health Clinical Center, Fudan University, Shanghai 200032, China
- Lingyun Xu: Shanghai Huaxin Biotechnology Co., Ltd., Room 604, Building 1, Tongji Chuangyuan, No. 99 South Changjiang Road, Baoshan District, Shanghai 200441, China
- Jihua Xiao: Shanghai Huaxin Biotechnology Co., Ltd., Room 604, Building 1, Tongji Chuangyuan, No. 99 South Changjiang Road, Baoshan District, Shanghai 200441, China
- Mueed Ur Rahman: State Key Laboratory of Microbial Metabolism, School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, 800 Dongchuan Road, Shanghai 200240, China
- Xiaoqi Yang: Shanghai Huaxin Biotechnology Co., Ltd., Room 604, Building 1, Tongji Chuangyuan, No. 99 South Changjiang Road, Baoshan District, Shanghai 200441, China
- Yu-Feng Yao: Laboratory of Bacterial Pathogenesis, Institutes of Medical Sciences, School of Medicine, Shanghai Jiao Tong University, Shanghai 200025, China
- Zixin Deng: State Key Laboratory of Microbial Metabolism, School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, 800 Dongchuan Road, Shanghai 200240, China
- Wei Ma: State Key Laboratory of Microbial Metabolism, School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, 800 Dongchuan Road, Shanghai 200240, China
3
Kohestani H, Wereszczynski J. The effects of RNA.DNA-DNA triple helices on nucleosome structures and dynamics. Biophys J 2023; 122:1229-1239. [PMID: 36798026 PMCID: PMC10111275 DOI: 10.1016/j.bpj.2023.02.013] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/02/2022] [Revised: 01/22/2023] [Accepted: 02/10/2023] [Indexed: 02/17/2023] Open
Abstract
Noncoding RNAs (ncRNAs) are an emerging epigenetic factor and have been recognized as playing a key role in many gene expression pathways. Structurally, binding of ncRNAs to isolated DNA is strongly dependent on sequence complementarity and results in the formation of an RNA.DNA-DNA (RDD) triple helix. However, in vivo DNA is not isolated but is rather packed in chromatin fibers, the fundamental unit of which is the nucleosome. Biochemical experiments have shown that ncRNA binding to nucleosomal DNA is elevated at DNA entry and exit sites and is dependent on the presence of the H3 N-terminal tails. However, the structural and dynamical bases for these mechanisms remain unknown. Here, we have examined the mechanisms and effects of RDD formation in the context of the nucleosome using a series of all-atom molecular dynamics simulations. Results highlight the importance of DNA sequence for complex stability, elucidate the effects of the H3 tails on RDD structures, show how RDD formation impacts the structure and dynamics of the H3 tails, and show how RNA alters the local and global DNA double-helical structure. Together, our results suggest ncRNAs can modify nucleosome, and potentially higher-order chromatin, structures and dynamics as a means of exerting epigenetic control.
Affiliation(s)
- Havva Kohestani: Department of Biology, Illinois Institute of Technology, Chicago, Illinois
- Jeff Wereszczynski: Departments of Physics & Biology, Illinois Institute of Technology, Chicago, Illinois
4
Boral A, Mitra D. Heterogeneity in winged helix-turn-helix and substrate DNA interactions: Insights from theory and experiments. J Cell Biochem 2023; 124:337-358. [PMID: 36715571 DOI: 10.1002/jcb.30369] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/06/2022] [Revised: 12/29/2022] [Accepted: 01/02/2023] [Indexed: 01/31/2023]
Abstract
Specific interactions between transcription factors (TFs) and substrate DNA constitute the fundamental basis of gene expression. Unlike for TFs such as basic helix-loop-helix or basic leucine zippers, prediction of substrate DNA is extremely challenging for helix-turn-helix (HTH). Experimental techniques like chromatin immunoprecipitation combined with massively parallel DNA sequencing remain a viable option. We characterize the molecular basis of heterogeneity in HTH-DNA interaction using in silico tools and then validate the findings experimentally. Given the profound functional diversity in HTH, we focus primarily on winged-HTH (wHTH). We consider 180 wHTH TFs whose experimental three-dimensional structures are available in DNA bound/unbound conformations. Starting with PDB-wide scanning and curation of data, we construct a phylogenetic tree, which distributes the 180 wHTH sequences under multiple sub-groups. Structure-sequence alignment followed by detailed intra/intergroup analysis, covariation studies and extensive network theory analysis help us to gain deep insight into heterogeneous wHTH-substrate DNA interactions. A central aim of this study is to find a consensus to predict the substrate DNA sequence for wHTH, amidst heterogeneity. The strength of our exhaustive theoretical investigations, including molecular docking, is successfully tested through experimental characterization of a wHTH TF from Sulfurimonas denitrificans.
Affiliation(s)
- Aparna Boral: Department of Life Sciences, Presidency University, Kolkata, West Bengal, India
- Devrani Mitra: Department of Life Sciences, Presidency University, Kolkata, West Bengal, India
5
Chaudhuri S, Srivastava A. Network approach to understand biological systems: From single to multilayer networks. J Biosci 2022. [DOI: 10.1007/s12038-022-00285-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/30/2022]
6
Identifying Important Nodes in Complex Networks Based on Node Propagation Entropy. Entropy 2022; 24:e24020275. [PMID: 35205569 PMCID: PMC8871465 DOI: 10.3390/e24020275] [Citation(s) in RCA: 8] [Impact Index Per Article: 4.0] [Received: 12/14/2021] [Revised: 02/07/2022] [Accepted: 02/12/2022] [Indexed: 02/01/2023]
Abstract
In recent years, the identification of the essential nodes in complex networks has attracted significant attention because of their theoretical and practical significance in many applications, such as preventing and controlling epidemic diseases and discovering essential proteins. Several importance measures have been proposed from diverse perspectives to identify crucial nodes more accurately. In this paper, we propose a novel importance metric called node propagation entropy, which uses a combination of the clustering coefficients of nodes and the influence of the first- and second-order neighbor numbers on node importance to identify essential nodes from an entropy perspective while considering the local and global information of the network. Furthermore, the susceptible–infected–removed and susceptible–infected–removed–susceptible epidemic models along with the Kendall coefficient are used to reveal the relevant correlations among the various importance measures. The results of experiments conducted on several real networks from different domains show that the proposed metric is more accurate and stable in identifying significant nodes than many existing techniques, including degree centrality, betweenness centrality, closeness centrality, eigenvector centrality, and H-index.
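The abstract uses the Kendall coefficient to compare node rankings produced by different importance measures. A minimal sketch of the tau-a variant follows, assuming ties are counted as neither concordant nor discordant; the example scores are hypothetical, and the paper may use a different tie-handling variant.

```python
from itertools import combinations


def kendall_tau(x, y):
    """Kendall tau-a: (concordant - discordant) / number of pairs."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equally long sequences with at least two items")
    concordant = discordant = 0
    for i, j in combinations(range(len(x)), 2):
        sign = (x[i] - x[j]) * (y[i] - y[j])
        if sign > 0:
            concordant += 1   # both measures order the pair the same way
        elif sign < 0:
            discordant += 1   # the measures disagree on this pair
    total_pairs = len(x) * (len(x) - 1) // 2
    return (concordant - discordant) / total_pairs


# Hypothetical data: an importance score per node and the outbreak size
# observed when an epidemic simulation is seeded at that node.
importance = [5, 4, 3, 2, 1]
outbreak_size = [50, 40, 20, 30, 10]
tau = kendall_tau(importance, outbreak_size)
```

A value near 1 means the importance measure orders nodes almost exactly as the simulated spreading does. Here one of the ten pairs is swapped, so tau is 0.8.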
7
Kohestani H, Wereszczynski J. Effects of H2A.B incorporation on nucleosome structures and dynamics. Biophys J 2021; 120:1498-1509. [PMID: 33609493 DOI: 10.1016/j.bpj.2021.01.036] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Received: 06/29/2020] [Revised: 12/31/2020] [Accepted: 01/12/2021] [Indexed: 01/20/2023] Open
Abstract
The H2A.B histone variant is an epigenetic regulator involved in transcriptional upregulation, DNA synthesis, and splicing that functions by replacing the canonical H2A histone in the nucleosome core particle. Introduction of H2A.B results in less compact nucleosome states with increased DNA unwinding and accessibility at the nucleosomal entry and exit sites. Despite being well characterized experimentally, the molecular mechanisms by which H2A.B incorporation alters nucleosome stability and dynamics remain poorly understood. To study the molecular mechanisms of H2A.B, we have performed a series of conventional and enhanced sampling molecular dynamics simulations of H2A.B- and canonical H2A-containing nucleosomes. Results of conventional simulations show that H2A.B weakens protein-protein and protein-DNA interactions at specific locations throughout the nucleosome. These weakened interactions result in significantly more DNA opening from both the entry and exit sites in enhanced sampling simulations. Furthermore, free energy profiles show that H2A.B-containing nucleosomes have significantly broader free energy wells and that H2A.B allows for sampling of states with increased DNA breathing, which are shown to be stable on the hundreds of nanoseconds timescale with further conventional simulations. Together, our results show the molecular mechanisms by which H2A.B creates less compacted nucleosome states as a means of increasing genetic accessibility and gene transcription.
Affiliation(s)
- Havva Kohestani: Department of Biology, Center for Molecular Study of Condensed Soft Matter, Illinois Institute of Technology, Chicago, Illinois
- Jeff Wereszczynski: Department of Physics, Center for Molecular Study of Condensed Soft Matter, Illinois Institute of Technology, Chicago, Illinois
8
Halder A, Anto A, Subramanyan V, Bhattacharyya M, Vishveshwara S, Vishveshwara S. Surveying the Side-Chain Network Approach to Protein Structure and Dynamics: The SARS-CoV-2 Spike Protein as an Illustrative Case. Front Mol Biosci 2020; 7:596945. [PMID: 33392257 PMCID: PMC7775578 DOI: 10.3389/fmolb.2020.596945] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Received: 08/20/2020] [Accepted: 11/04/2020] [Indexed: 02/04/2023] Open
Abstract
Network theory-based approaches provide valuable insights into the variations in global structural connectivity between different dynamical states of proteins. Our objective is to review network-based analyses to elucidate such variations, especially in the context of subtle conformational changes. We present technical details of the construction and analyses of protein structure networks, encompassing both the non-covalent connectivity and dynamics. We examine the selection of optimal criteria for connectivity based on the physical concept of percolation. We highlight the advantages of using side-chain-based network metrics in contrast to backbone measurements. As an illustrative example, we apply the described network approach to investigate the global conformational changes between the closed and partially open states of the SARS-CoV-2 spike protein. These conformational changes in the spike protein are crucial for coronavirus entry and fusion into human cells. Our analysis reveals global structural reorientations between the two states of the spike protein despite small changes between the two states at the backbone level. We also observe some differences at strategic locations in the structures, correlating with their functions, asserting the advantages of the side-chain network analysis. Finally, we present a view of allostery as a subtle synergistic-global change between the ligand and the receptor, the incorporation of which would enhance drug design strategies.
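The percolation-based choice of a connectivity criterion mentioned in this abstract can be sketched as follows. The sketch assumes that each residue pair has a precomputed interaction strength and that an edge is kept when the strength meets a cutoff; the strengths and cutoffs below are hypothetical, and the review's own strength definition is not reproduced here.

```python
def largest_cluster(n_nodes, strengths, cutoff):
    """Size of the largest connected component when only edges with
    strength >= cutoff are kept (union-find over residue indices)."""
    parent = list(range(n_nodes))

    def find(a):
        while parent[a] != a:
            parent[a] = parent[parent[a]]  # path halving
            a = parent[a]
        return a

    for (i, j), s in strengths.items():
        if s >= cutoff:
            parent[find(i)] = find(j)

    sizes = {}
    for node in range(n_nodes):
        root = find(node)
        sizes[root] = sizes.get(root, 0) + 1
    return max(sizes.values())


# Hypothetical pairwise interaction strengths for a six-residue chain.
strengths = {(0, 1): 8.0, (1, 2): 6.0, (2, 3): 3.0, (3, 4): 7.0, (4, 5): 2.0}
profile = {c: largest_cluster(6, strengths, c) for c in (1.0, 2.5, 4.0, 6.5, 9.0)}
```

Plotting the largest cluster size against the cutoff shows where the network breaks apart. The cutoff near that transition is a natural candidate for the connectivity criterion.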
Affiliation(s)
- Anushka Halder: Department of Pharmacology, Yale University, New Haven, CT, United States
- Arinnia Anto: Molecular Biophysics Unit, Indian Institute of Science, Bangalore, India
- Varsha Subramanyan: Department of Physics, University of Illinois at Urbana-Champaign, Champaign, IL, United States
- Smitha Vishveshwara: Department of Physics, University of Illinois at Urbana-Champaign, Champaign, IL, United States
9
Amirkhani A, Kolahdoozi M, Wang C, Kurgan LA. Prediction of DNA-Binding Residues in Local Segments of Protein Sequences with Fuzzy Cognitive Maps. IEEE/ACM Transactions on Computational Biology and Bioinformatics 2020; 17:1372-1382. [PMID: 30602422 DOI: 10.1109/tcbb.2018.2890261] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Indexed: 06/09/2023]
Abstract
While protein-DNA interactions are crucial for a wide range of cellular functions, only a small fraction of these interactions has been annotated to date. One solution to close this annotation gap is to employ computational methods that accurately predict protein-DNA interactions from widely available protein sequences. We present and empirically test a first-of-its-kind predictor of DNA-binding residues in local segments of protein sequences that relies on the Fuzzy Cognitive Map (FCM) model. The FCM model uses information about putative solvent accessibility, evolutionary conservation, and relative propensities of amino acids to interact with DNA to generate putative DNA-binding residues. Empirical tests on a benchmark dataset reveal that the FCM model secures AUC = 0.72 and outperforms the recently released hybridNAP predictor and several popular machine learning methods including Support Vector Machines, Naïve Bayes, and k-Nearest Neighbor. The improvements in the predictive performance result from an intrinsic feature of FCMs, which incorporate relations between the input features in addition to the relations between the inputs and output that are modelled by other algorithms. We also empirically demonstrate that the use of a short sliding window results in further improvements in the predictive quality. The funDNApred webserver that implements the FCM predictor is available at http://biomine.cs.vcu.edu/servers/funDNApred/.
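The short sliding window mentioned in this abstract can be illustrated with a generic sketch. It assumes one numeric score per residue (for example a conservation value) and zero padding at the sequence ends; the function name `window_features`, the window size, and the padding choice are hypothetical and are not taken from funDNApred.

```python
def window_features(scores, half_width=2, pad=0.0):
    """For each residue, return the scores of the residues in a window
    centred on it; positions beyond the sequence ends are padded."""
    padded = [pad] * half_width + list(scores) + [pad] * half_width
    size = 2 * half_width + 1
    return [padded[i:i + size] for i in range(len(scores))]


# Hypothetical per-residue conservation scores for a four-residue segment.
conservation = [0.1, 0.5, 0.9, 0.3]
windows = window_features(conservation, half_width=1)
```

Each residue is then described by its own score plus those of its immediate neighbours, which lets a predictor use local sequence context.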
10
Faltejsková K, Jakubec D, Vondrášek J. Hydrophobic Amino Acids as Universal Elements of Protein-Induced DNA Structure Deformation. Int J Mol Sci 2020; 21:ijms21113986. [PMID: 32498246 PMCID: PMC7312683 DOI: 10.3390/ijms21113986] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 04/30/2020] [Revised: 05/26/2020] [Accepted: 05/29/2020] [Indexed: 12/01/2022] Open
Abstract
Interaction with the DNA minor groove is a significant contributor to specific sequence recognition in selected families of DNA-binding proteins. Based on a statistical analysis of 3D structures of protein–DNA complexes, we propose that distortion of the DNA minor groove resulting from interactions with hydrophobic amino acid residues is a universal element of protein–DNA recognition. We provide evidence to support this by associating each DNA minor groove-binding amino acid residue with the local dimensions of the DNA double helix using a novel algorithm. The widened DNA minor grooves are associated with high GC content. However, some AT-rich sequences contacted by hydrophobic amino acids (e.g., phenylalanine) display extreme values of minor groove width as well. For a number of hydrophobic amino acids, distinct secondary structure preferences could be identified for residues interacting with the widened DNA minor groove. These results hold even after discarding the most populous families of minor groove-binding proteins.
Affiliation(s)
- Kateřina Faltejsková: Institute of Organic Chemistry and Biochemistry, Czech Academy of Sciences, Flemingovo náměstí 542/2, 166 10 Prague 6, Czech Republic; Department of Cell Biology, Faculty of Science, Charles University, Viničná 7, 128 00 Prague 2, Czech Republic
- David Jakubec: Institute of Organic Chemistry and Biochemistry, Czech Academy of Sciences, Flemingovo náměstí 542/2, 166 10 Prague 6, Czech Republic; Department of Physical and Macromolecular Chemistry, Faculty of Science, Charles University, Hlavova 8, 128 40 Prague 2, Czech Republic
- Jiří Vondrášek: Institute of Organic Chemistry and Biochemistry, Czech Academy of Sciences, Flemingovo náměstí 542/2, 166 10 Prague 6, Czech Republic
- Correspondence: (D.J.); (J.V.); Tel.: +420-220183267 (J.V.)
11
Chakrabarty B, Naganathan V, Garg K, Agarwal Y, Parekh N. NAPS update: network analysis of molecular dynamics data and protein-nucleic acid complexes. Nucleic Acids Res 2020; 47:W462-W470. [PMID: 31106363 PMCID: PMC6602509 DOI: 10.1093/nar/gkz399] [Citation(s) in RCA: 35] [Impact Index Per Article: 8.8] [Received: 03/07/2019] [Revised: 04/30/2019] [Accepted: 05/07/2019] [Indexed: 02/04/2023] Open
Abstract
Network theory is now a method of choice for gaining insights into protein structure, folding and function. In combination with molecular dynamics (MD) simulations, it is an invaluable tool with widespread applications such as analyzing subtle conformational changes and flexibility regions in proteins, dynamic correlation analysis across distant regions for allosteric communications, and revealing alternative binding pockets for drugs in drug design. The updated version of NAPS now facilitates network analysis of the complete repertoire of these biomolecules, i.e., proteins, protein–protein/nucleic acid complexes, MD trajectories, and RNA. The options provided for analysis of MD trajectories include individual network construction and analysis of intermediate time-steps, comparative analysis of these networks, construction and analysis of the average network of the ensemble of trajectories, and dynamic cross-correlations. For protein–nucleic acid complexes, networks of the whole complex as well as that of the interface can be constructed and analyzed. For analysis of proteins, protein–protein complexes and MD trajectories, network construction based on inter-residue interaction energies with realistic edge-weights obtained from standard force fields is provided to capture the atomistic details. The updated version of NAPS also provides improved visualization features, interactive plots and bulk execution. URL: http://bioinf.iiit.ac.in/NAPS/
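Dynamic cross-correlation, one of the MD analyses listed in this abstract, is commonly defined for two atoms i and j as C_ij = <Δr_i · Δr_j> / sqrt(<Δr_i²> <Δr_j²>), where Δr is the displacement from the mean position. The sketch below computes that quantity from two short, hypothetical coordinate series. It illustrates the standard formula and does not claim to reproduce the NAPS implementation.

```python
import math


def dynamic_cross_correlation(traj_i, traj_j):
    """Normalised correlation of displacements from the mean position.

    Each trajectory is a list of (x, y, z) tuples, one per frame.
    """
    if len(traj_i) != len(traj_j) or not traj_i:
        raise ValueError("trajectories must be non-empty and equally long")
    n = len(traj_i)
    mean_i = [sum(f[k] for f in traj_i) / n for k in range(3)]
    mean_j = [sum(f[k] for f in traj_j) / n for k in range(3)]
    dot = var_i = var_j = 0.0
    for fi, fj in zip(traj_i, traj_j):
        di = [fi[k] - mean_i[k] for k in range(3)]
        dj = [fj[k] - mean_j[k] for k in range(3)]
        dot += sum(a * b for a, b in zip(di, dj))
        var_i += sum(a * a for a in di)
        var_j += sum(b * b for b in dj)
    if var_i == 0.0 or var_j == 0.0:
        raise ValueError("an atom that never moves has no defined correlation")
    # The 1/n factors of the three averages cancel in the ratio.
    return dot / math.sqrt(var_i * var_j)


# Hypothetical three-frame trajectories (arbitrary length units).
atom_a = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0)]
atom_b = [(5.0, 0.0, 0.0), (6.0, 0.0, 0.0), (7.0, 0.0, 0.0)]  # moves with atom_a
atom_c = [(2.0, 1.0, 0.0), (1.0, 1.0, 0.0), (0.0, 1.0, 0.0)]  # moves against atom_a
c_ab = dynamic_cross_correlation(atom_a, atom_b)
c_ac = dynamic_cross_correlation(atom_a, atom_c)
```

Values close to +1 indicate motion in the same direction and values close to -1 indicate opposite motion, which is how correlated regions are typically spotted in a cross-correlation map.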
Affiliation(s)
- Broto Chakrabarty: Centre for Computational Natural Sciences and Bioinformatics, International Institute of Information Technology - Hyderabad 500032, India
- Varun Naganathan: Centre for Computational Natural Sciences and Bioinformatics, International Institute of Information Technology - Hyderabad 500032, India
- Kanak Garg: Centre for Computational Natural Sciences and Bioinformatics, International Institute of Information Technology - Hyderabad 500032, India
- Yash Agarwal: Centre for Computational Natural Sciences and Bioinformatics, International Institute of Information Technology - Hyderabad 500032, India
- Nita Parekh: Centre for Computational Natural Sciences and Bioinformatics, International Institute of Information Technology - Hyderabad 500032, India
12
Influential Nodes Identification in Complex Networks via Information Entropy. Entropy 2020; 22:e22020242. [PMID: 33286016 PMCID: PMC7516697 DOI: 10.3390/e22020242] [Citation(s) in RCA: 48] [Impact Index Per Article: 12.0] [Received: 11/09/2019] [Revised: 02/17/2020] [Accepted: 02/19/2020] [Indexed: 12/11/2022]
Abstract
Identifying a set of influential nodes is an important topic in complex networks which plays a crucial role in many applications, such as market advertising, rumor controlling, and predicting valuable scientific publications. In regard to this, researchers have developed algorithms ranging from simple degree methods to all kinds of sophisticated approaches. However, a more robust and practical algorithm is required for the task. In this paper, we propose the EnRenew algorithm, which aims to identify a set of influential nodes via information entropy. First, the information entropy of each node is calculated as its initial spreading ability. Then, the node with the largest information entropy is selected and the spreading ability of its l-length reachable nodes is renewed by an attenuation factor. This process is repeated until a specific number of influential nodes have been selected. Compared with the best state-of-the-art benchmark methods, the performance of the proposed algorithm improved by 21.1%, 7.0%, 30.0%, 5.0%, 2.5%, and 9.0% in final affected scale on the CEnew, Email, Hamster, Router, Condmat, and Amazon networks, respectively, under the Susceptible-Infected-Recovered (SIR) simulation model. The proposed algorithm measures the importance of nodes based on information entropy and selects a group of important nodes through a dynamic update strategy. The impressive results on the SIR simulation model shed light on a new method of node mining in complex networks for information spreading and epidemic prevention.
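The select-then-renew loop described in this abstract can be sketched as below. The abstract does not give the formulas, so two parts are stand-in assumptions and not the EnRenew definitions: the node entropy (here the Shannon entropy of the neighbours' degree shares) and the renewal rule (here a node at distance d from a selected seed keeps a fraction 1 - attenuation**d of its ability). The graph and all names are hypothetical.

```python
import math
from collections import deque


def node_entropy(graph, u):
    """Assumed stand-in: Shannon entropy of the degree shares of u's neighbours."""
    degrees = [len(graph[v]) for v in graph[u]]
    total = sum(degrees)
    if total == 0:
        return 0.0
    return -sum((d / total) * math.log(d / total) for d in degrees)


def select_influential(graph, k, reach=2, attenuation=0.5):
    """Greedy loop: pick the node with the largest ability, then damp the
    ability of every node within `reach` hops (assumed renewal rule)."""
    ability = {u: node_entropy(graph, u) for u in graph}
    chosen = []
    for _ in range(min(k, len(graph))):
        best = max((u for u in graph if u not in chosen), key=lambda u: ability[u])
        chosen.append(best)
        # Breadth-first search out to `reach` hops from the new seed.
        dist = {best: 0}
        queue = deque([best])
        while queue:
            cur = queue.popleft()
            if dist[cur] == reach:
                continue
            for nb in graph[cur]:
                if nb not in dist:
                    dist[nb] = dist[cur] + 1
                    queue.append(nb)
        for v, d in dist.items():
            if d > 0:
                ability[v] *= 1 - attenuation ** d
    return chosen


# Hypothetical undirected graph as adjacency lists: two hubs (A, E) joined via D.
graph = {
    "A": ["B", "C", "D", "H"], "B": ["A"], "C": ["A"], "H": ["A"],
    "D": ["A", "E"], "E": ["D", "F", "G"], "F": ["E"], "G": ["E"],
}
seeds = select_influential(graph, k=2)
```

Damping the neighbourhood of each selected seed discourages picking several seeds from the same region, which is the motivation for the renewal step in the abstract.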
13
Oke M, Agbalajobi R, Osifeso M, Muhammad B, Lawal H, Mai M, Adegunle Q. Design and implementation of structural bioinformatics projects for biological sciences undergraduate students. Biochemistry and Molecular Biology Education 2018; 46:547-554. [PMID: 30369034 DOI: 10.1002/bmb.21169] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Received: 06/04/2018] [Revised: 08/21/2018] [Accepted: 09/03/2018] [Indexed: 06/08/2023]
Abstract
Contemporary biology is currently undergoing a revolution, driven by the availability of high-throughput technologies and a wide variety of bioinformatics tools. However, bioinformatics education and practice are still in their infancy in most of the African continent. Consequently, concerted efforts have been made in recent years to incorporate bioinformatics modules into the biological sciences curricula of African universities. Despite this, one aspect of bioinformatics that is yet to be incorporated is structural bioinformatics. In this article, we report on a structural bioinformatics project carried out by final year project students in a Nigerian university. The target protein was the thermoacidophilic Sulfolobus islandicus rod-shaped virus 1 (SIRV1) Rep protein, which was further characterized using various free, user-friendly, online sequence-based and structure-based bioinformatics tools. This exercise gave students the opportunity to generate new data, interpret the data, and acquire collaborative research skills. In this report, emphasis is placed on analysis of the data generated to further encourage analytical skills. By sharing this experience, it is anticipated that other similar institutions would adopt parallel strategies to expose undergraduate students to structural biology, and increase awareness of freely available bioinformatics tools for tackling pertinent biological questions. © 2018 International Union of Biochemistry and Molecular Biology, 46(5):547-554, 2018.
Affiliation(s)
- Muse Oke: Department of Biological Sciences, Fountain University, Oke-Osun, Osogbo, Nigeria
- Ramon Agbalajobi: Department of Biological Sciences, Fountain University, Oke-Osun, Osogbo, Nigeria
- Babagana Muhammad: Department of Biological Sciences, Fountain University, Oke-Osun, Osogbo, Nigeria
- Halimat Lawal: Department of Biological Sciences, Fountain University, Oke-Osun, Osogbo, Nigeria
- Muhammad Mai: Department of Biological Sciences, Fountain University, Oke-Osun, Osogbo, Nigeria
- Quadri Adegunle: Department of Biological Sciences, Fountain University, Oke-Osun, Osogbo, Nigeria
14
Zhang J, Ma Z, Kurgan L. Comprehensive review and empirical analysis of hallmarks of DNA-, RNA- and protein-binding residues in protein chains. Brief Bioinform 2017; 20:1250-1268. [DOI: 10.1093/bib/bbx168] [Citation(s) in RCA: 60] [Impact Index Per Article: 8.6] [Received: 08/30/2017] [Revised: 11/15/2017] [Indexed: 11/13/2022] Open
Abstract
Proteins interact with a variety of molecules including proteins and nucleic acids. We review a comprehensive collection of over 50 studies that analyze and/or predict these interactions. While the majority of these studies address solely protein–DNA or protein–RNA binding, only a few have a wider scope that covers both protein–protein and protein–nucleic acid binding. Our analysis reveals that binding residues are typically characterized by three hallmarks: relative solvent accessibility (RSA), evolutionary conservation and propensity of amino acids (AAs) for binding. Motivated by drawbacks of the prior studies, we perform a large-scale analysis to quantify and contrast the three hallmarks for DNA-, RNA- and protein-binding residues and (for the first time) multi-ligand-binding residues that interact with DNA and proteins, and with RNA and proteins. Results generated on a well-annotated data set of over 23 000 proteins show that conservation of binding residues is higher for nucleic acid- than protein-binding residues. Multi-ligand-binding residues are more conserved and have higher RSA than single-ligand-binding residues. We empirically show that each hallmark, even predicted RSA, discriminates between binding and nonbinding residues, and that combining them improves discriminatory power for each of the five types of interactions. Linear scoring functions that combine these hallmarks offer good predictive performance of residue-level propensity for binding and provide intuitive interpretation of predictions. Better understanding of these residue-level interactions will facilitate development of methods that accurately predict binding in the exponentially growing databases of protein sequences.
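A linear scoring function of the kind this abstract describes can be sketched as a weighted sum of the three hallmarks. The weights, residue labels, and hallmark values below are hypothetical placeholders scaled to the range 0 to 1. The paper's fitted coefficients are not given in the abstract and are not reproduced here.

```python
def binding_score(rsa, conservation, propensity, weights=(1.0, 1.0, 1.0)):
    """Weighted sum of relative solvent accessibility, evolutionary
    conservation and amino-acid binding propensity for one residue."""
    w_rsa, w_cons, w_prop = weights
    return w_rsa * rsa + w_cons * conservation + w_prop * propensity


# Hypothetical residues: (RSA, conservation, propensity).
residues = {
    "R45": (0.8, 0.9, 0.7),
    "L12": (0.1, 0.4, 0.2),
    "K78": (0.6, 0.7, 0.9),
}
ranked = sorted(residues, key=lambda r: binding_score(*residues[r]), reverse=True)
```

Because the score is a plain weighted sum, each hallmark's contribution to a residue's rank can be read off directly, which is the interpretability benefit the abstract points to.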
|
15
|
Fanelli F, Felline A. Uncovering GPCR and G Protein Function by Protein Structure Network Analysis. COMPUTATIONAL TOOLS FOR CHEMICAL BIOLOGY 2017. [DOI: 10.1039/9781788010139-00198] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 01/28/2023]
Abstract
Protein structure network (PSN) analysis is one of the graph theory-based approaches currently used for investigating structural communication in biomolecular systems. Information on the system's dynamics can be provided by atomistic molecular dynamics (MD) simulations or coarse grained elastic network models paired with normal mode analysis (ENM-NMA). This chapter reports on selected applications of PSN analysis to uncover the structural communication in G protein coupled receptors (GPCRs) and G proteins. Strategies to highlight changes in structural communication caused by mutations, ligand and protein binding are described. Conserved amino acids, sites of misfolding mutations, or ligands acting as functional switches tend to behave as hubs in the native structure networks. Densely linked regions in the protein structure graphs could be identified as playing central roles in protein stability and function. Changes in the communication pathway fingerprints depending on the bound ligand or following amino acid mutation could be highlighted as well. A bridge between misfolding and misrouting could be established in rhodopsin mutants linked to inherited blindness. The analysis of native network perturbations by misfolding mutations served to infer key structural elements of protein responsiveness to small chaperones with implications for drug discovery.
Affiliation(s)
- Francesca Fanelli
- Department of Life Sciences, University of Modena and Reggio Emilia, Italy
- Center for Neuroscience and Neurotechnology, University of Modena and Reggio Emilia, Italy
| | - Angelo Felline
- Department of Life Sciences, University of Modena and Reggio Emilia, Italy
| |
|
16
|
Kozuki T, Chikamori K, Surleac MD, Micluta MA, Petrescu AJ, Norris EJ, Elson P, Hoeltge GA, Grabowski DR, Porter ACG, Ganapathi RN, Ganapathi MK. Roles of the C-terminal domains of topoisomerase IIα and topoisomerase IIβ in regulation of the decatenation checkpoint. Nucleic Acids Res 2017; 45:5995-6010. [PMID: 28472494 PMCID: PMC5449615 DOI: 10.1093/nar/gkx325] [Citation(s) in RCA: 19] [Impact Index Per Article: 2.7] [Reference Citation Analysis] [Abstract] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/02/2016] [Accepted: 04/14/2017] [Indexed: 12/13/2022] Open
Abstract
Topoisomerase (topo) IIα and IIβ maintain genome stability and are targets for anti-tumor drugs. In this study, we demonstrate that the decatenation checkpoint is regulated, not only by topo IIα, as previously reported, but also by topo IIβ. The decatenation checkpoint is most efficient when both isoforms are present. Regulation of this checkpoint and sensitivity to topo II-targeted drugs is influenced by the C-terminal domain (CTD) of the topo II isoforms and by a conserved non-catalytic tyrosine, Y640 in topo IIα and Y656 in topo IIβ. Deletion of most of the CTD of topo IIα, while preserving the nuclear localization signal (NLS), enhances the decatenation checkpoint and sensitivity to topo II-targeted drugs. In contrast, deletion of most of the CTD of topo IIβ, while preserving the NLS, and mutation of Y640 in topo IIα and Y656 in topo IIβ inhibits these activities. Structural studies suggest that the differential impact of the CTD on topo IIα and topo IIβ function may be due to differences in CTD charge distribution and differential alignment of the CTD with reference to transport DNA. Together these results suggest that topo IIα and topo IIβ cooperate to maintain genome stability, which may be distinctly modulated by their CTDs.
Affiliation(s)
- Toshiyuki Kozuki
- Taussig Cancer Institute, Cleveland Clinic, 9500 Euclid Avenue, Cleveland, OH 44195, USA
| | - Kenichi Chikamori
- Taussig Cancer Institute, Cleveland Clinic, 9500 Euclid Avenue, Cleveland, OH 44195, USA
| | - Marius D Surleac
- Department of Bioinformatics, Institute of Biochemistry of the Romanian Academy, Bucharest, Romania
| | - Marius A Micluta
- Department of Bioinformatics, Institute of Biochemistry of the Romanian Academy, Bucharest, Romania
| | - Andrei J Petrescu
- Department of Bioinformatics, Institute of Biochemistry of the Romanian Academy, Bucharest, Romania
| | - Eric J Norris
- Department of Cancer Pharmacology, Levine Cancer Institute, Carolinas HealthCare System, 1021 Morehead Medical Drive, Charlotte, NC 28204, USA
| | - Paul Elson
- Taussig Cancer Institute, Cleveland Clinic, 9500 Euclid Avenue, Cleveland, OH 44195, USA
| | - Gerald A Hoeltge
- Clinical Pathology, Cleveland Clinic, 9500 Euclid Avenue, Cleveland, OH 44195, USA
| | - Dale R Grabowski
- Taussig Cancer Institute, Cleveland Clinic, 9500 Euclid Avenue, Cleveland, OH 44195, USA
| | - Andrew C G Porter
- Imperial College Faculty of Medicine, Hammersmith Hospital, London W10 ONN, UK
| | - Ram N Ganapathi
- Department of Cancer Pharmacology, Levine Cancer Institute, Carolinas HealthCare System, 1021 Morehead Medical Drive, Charlotte, NC 28204, USA
| | - Mahrukh K Ganapathi
- Department of Cancer Pharmacology, Levine Cancer Institute, Carolinas HealthCare System, 1021 Morehead Medical Drive, Charlotte, NC 28204, USA
| |
|
17
|
Dissecting intrinsic and ligand-induced structural communication in the β3 headpiece of integrins. Biochim Biophys Acta Gen Subj 2017; 1861:2367-2381. [DOI: 10.1016/j.bbagen.2017.05.018] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.6] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/23/2016] [Revised: 05/20/2017] [Accepted: 05/22/2017] [Indexed: 12/15/2022]
|
18
|
Wilson KA, Wetmore SD. Combining crystallographic and quantum chemical data to understand DNA-protein π-interactions in nature. Struct Chem 2017. [DOI: 10.1007/s11224-017-0954-7] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.6] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/21/2022]
|
19
|
Gardini S, Furini S, Santucci A, Niccolai N. A structural bioinformatics investigation on protein–DNA complexes delineates their modes of interaction. MOLECULAR BIOSYSTEMS 2017; 13:1010-1017. [DOI: 10.1039/c7mb00071e] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 02/01/2023]
Abstract
A non-redundant dataset of 629 protein–DNA complexes has been used to investigate the amino acid composition of protein–DNA interfaces. Structural proteins, transcription factors and DNA-related enzymes show specific patterns accounting for different modes of their interaction with DNA.
Affiliation(s)
- Simone Gardini
- Department of Biotechnology, Chemistry and Pharmacy, University of Siena, Italy
| | - Simone Furini
- Department of Medical Biotechnologies, University of Siena, Siena, Italy
| | - Annalisa Santucci
- Department of Biotechnology, Chemistry and Pharmacy, University of Siena, Italy
| | - Neri Niccolai
- Department of Biotechnology, Chemistry and Pharmacy, University of Siena, Italy
| |
|
20
|
Chakrabarty B, Parekh N. NAPS: Network Analysis of Protein Structures. Nucleic Acids Res 2016; 44:W375-82. [PMID: 27151201 PMCID: PMC4987928 DOI: 10.1093/nar/gkw383] [Citation(s) in RCA: 111] [Impact Index Per Article: 13.9] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/15/2016] [Accepted: 04/25/2016] [Indexed: 12/29/2022] Open
Abstract
Traditionally, protein structures have been analysed by the secondary structure architecture and fold arrangement. An alternative approach that has shown promise is modelling proteins as a network of non-covalent interactions between amino acid residues. The network representation of proteins provides a systems approach to topological analysis of complex three-dimensional structures irrespective of secondary structure and fold type and provides insights into structure-function relationships. We have developed a web server for network-based analysis of protein structures, NAPS, that facilitates quantitative and qualitative (visual) analysis of residue-residue interactions in single chains, protein complexes, modelled protein structures and trajectories (e.g. from molecular dynamics simulations). The user can specify the atom type for network construction, the distance range (in Å) and the minimal amino acid separation along the sequence. NAPS lets users select node(s) and their neighbourhood based on centrality measures, physicochemical properties of amino acids or clusters of well-connected residues (k-cliques) for further analysis. Visual analysis of interacting domains and protein chains, and shortest path lengths between pairs of residues, are additional features that aid in functional analysis. NAPS supports various analyses and visualization views for identifying functional residues and provides insight into mechanisms of protein folding, domain-domain and protein-protein interactions for understanding communication within and between proteins. URL: http://bioinf.iiit.ac.in/NAPS/.
Affiliation(s)
- Broto Chakrabarty
- Centre for Computational Natural Sciences and Bioinformatics, International Institute of Information Technology, Hyderabad 500032, India
| | - Nita Parekh
- Centre for Computational Natural Sciences and Bioinformatics, International Institute of Information Technology, Hyderabad 500032, India
| |
|
21
|
Wilson KA, Holland DJ, Wetmore SD. Topology of RNA-protein nucleobase-amino acid π-π interactions and comparison to analogous DNA-protein π-π contacts. RNA (NEW YORK, N.Y.) 2016; 22:696-708. [PMID: 26979279 PMCID: PMC4836644 DOI: 10.1261/rna.054924.115] [Citation(s) in RCA: 34] [Impact Index Per Article: 4.3] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Received: 10/18/2015] [Accepted: 02/13/2016] [Indexed: 06/05/2023]
Abstract
The present work analyzed 120 high-resolution X-ray crystal structures and identified 335 RNA-protein π-interactions (154 nonredundant) between a nucleobase and aromatic (W, H, F, or Y) or acyclic (R, E, or D) π-containing amino acid. Each contact was critically analyzed (including using a visual inspection protocol) to determine the most prevalent composition, structure, and strength of π-interactions at RNA-protein interfaces. These contacts most commonly involve F and U, with U:F interactions comprising one-fifth of the total number of contacts found. Furthermore, the RNA and protein π-systems adopt many different relative orientations, although there is a preference for more parallel (stacked) arrangements. Due to the variation in structure, the strength of the intermolecular forces between the RNA and protein components (as determined from accurate quantum chemical calculations) exhibits a significant range, with most of the contacts providing significant stability to the associated RNA-protein complex (up to -65 kJ mol⁻¹). Comparison to the analogous DNA-protein π-interactions emphasizes differences in RNA- and DNA-protein π-interactions at the molecular level, including the greater abundance of RNA contacts and the involvement of different nucleobase/amino acid residues. Overall, our results provide a clearer picture of the molecular basis of nucleic acid-protein binding and underscore the important role of these contacts in biology, including the significant contribution of π-π interactions to the stability of nucleic acid-protein complexes. Nevertheless, more work is still needed in this area in order to further appreciate the properties and roles of RNA nucleobase-amino acid π-interactions in nature.
Affiliation(s)
- Katie A Wilson
- Department of Chemistry and Biochemistry, University of Lethbridge, Lethbridge, Alberta T1K 3M4, Canada
| | - Devany J Holland
- Department of Chemistry and Biochemistry, University of Lethbridge, Lethbridge, Alberta T1K 3M4, Canada
| | - Stacey D Wetmore
- Department of Chemistry and Biochemistry, University of Lethbridge, Lethbridge, Alberta T1K 3M4, Canada
| |
|
22
|
González J, Baños I, León I, Contreras-García J, Cocinero EJ, Lesarri A, Fernández JA, Millán J. Unravelling Protein–DNA Interactions at Molecular Level: A DFT and NCI Study. J Chem Theory Comput 2016; 12:523-34. [DOI: 10.1021/acs.jctc.5b00330] [Citation(s) in RCA: 30] [Impact Index Per Article: 3.8] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/13/2022]
Affiliation(s)
- J. González
- Departamento de Química Física, Facultad de Ciencia y Tecnología, Universidad del País Vasco-UPV/EHU, Barrio Sarriena s/n, Leioa, 48940 Spain
| | - I. Baños
- Departamento de Química, Facultad de Ciencias, Estudios Agroalimentarios e Informática, Universidad de La Rioja, Madre de Dios, 53, Logroño, 26006 Spain
| | - I. León
- Departamento de Química Física, Facultad de Ciencia y Tecnología, Universidad del País Vasco-UPV/EHU, Barrio Sarriena s/n, Leioa, 48940 Spain
| | - J. Contreras-García
- Sorbonne Universités, UPMC Univ. Paris 06, UMR7616, Laboratoire de Chimie Théorique, F-75005, Paris, France
- CNRS, UMR 7616, Laboratoire de Chimie Théorique, F-75005, Paris, France
| | - E. J. Cocinero
- Departamento de Química Física, Facultad de Ciencia y Tecnología, Universidad del País Vasco-UPV/EHU, Barrio Sarriena s/n, Leioa, 48940 Spain
| | - A. Lesarri
- Departamento de Química Física y Química Inorgánica, Facultad de Ciencias, Universidad de Valladolid, 47011 Valladolid, Spain
| | - J. A. Fernández
- Departamento de Química Física, Facultad de Ciencia y Tecnología, Universidad del País Vasco-UPV/EHU, Barrio Sarriena s/n, Leioa, 48940 Spain
| | - J. Millán
- Departamento de Química, Facultad de Ciencias, Estudios Agroalimentarios e Informática, Universidad de La Rioja, Madre de Dios, 53, Logroño, 26006 Spain
| |
|
23
|
Zanegina O, Kirsanov D, Baulin E, Karyagina A, Alexeevski A, Spirin S. An updated version of NPIDB includes new classifications of DNA-protein complexes and their families. Nucleic Acids Res 2016; 44:D144-53. [PMID: 26656949 PMCID: PMC4702928 DOI: 10.1093/nar/gkv1339] [Citation(s) in RCA: 16] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/11/2015] [Revised: 11/13/2015] [Accepted: 11/16/2015] [Indexed: 11/13/2022] Open
Abstract
The recent upgrade of nucleic acid-protein interaction database (NPIDB, http://npidb.belozersky.msu.ru/) includes a newly elaborated classification of complexes of protein domains with double-stranded DNA and a classification of families of related complexes. Our classifications are based on contacting structural elements of both DNA: the major groove, the minor groove and the backbone; and protein: helices, beta-strands and unstructured segments. We took into account both hydrogen bonds and hydrophobic interaction. The analyzed material contains 1942 structures of protein domains from 748 PDB entries. We have identified 97 interaction modes of individual protein domain-DNA complexes and 17 DNA-protein interaction classes of protein domain families. We analyzed the sources of diversity of DNA-protein interaction modes in different complexes of one protein domain family. The observed interaction mode is sometimes influenced by artifacts of crystallization or diversity in secondary structure assignment. The interaction classes of domain families are more stable and thus possess more biological sense than a classification of single complexes. Integration of the classification into NPIDB allows the user to browse the database according to the interacting structural elements of DNA and protein molecules. For each family, we present average DNA shape parameters in contact zones with domains of the family.
Affiliation(s)
- Olga Zanegina
- Belozersky Institute of Physico-Chemical Biology, Lomonosov Moscow State University, Moscow 119992, Russia
| | | | - Eugene Baulin
- Laboratory of Applied Mathematics, Institute of Mathematical Problems in Biology, Puschino 142290, Russia
| | - Anna Karyagina
- Belozersky Institute of Physico-Chemical Biology, Lomonosov Moscow State University, Moscow 119992, Russia; Laboratory of Biologically Active Nanostructures, Gamaleya Center of Epidemiology and Microbiology, Moscow 123098, Russia; Laboratory of Genome Analysis, Institute of Agricultural Biotechnology, Moscow 127550, Russia
| | - Andrei Alexeevski
- Belozersky Institute of Physico-Chemical Biology, Lomonosov Moscow State University, Moscow 119992, Russia; Sector of Applied Informatics, Research Institute for System Studies, Moscow 117218, Russia
| | - Sergey Spirin
- Belozersky Institute of Physico-Chemical Biology, Lomonosov Moscow State University, Moscow 119992, Russia; Sector of Applied Informatics, Research Institute for System Studies, Moscow 117218, Russia
| |
|
24
|
Hu G, Xiao F, Li Y, Li Y, Vongsangnak W. Protein-Protein Interface and Disease: Perspective from Biomolecular Networks. ADVANCES IN BIOCHEMICAL ENGINEERING/BIOTECHNOLOGY 2016; 160:57-74. [PMID: 27928579 DOI: 10.1007/10_2016_40] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.6] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 12/17/2022]
Abstract
Protein-protein interactions are involved in many important biological processes and molecular mechanisms of disease association. Structural studies of interfacial residues in protein complexes provide information on protein-protein interactions. Characterizing protein-protein interfaces, including binding sites and allosteric changes, thus poses an imminent challenge. With special focus on protein complexes, approaches based on network theory are proposed to meet this challenge. In this review we pay attention to protein-protein interfaces from the perspective of biomolecular networks and their roles in disease. We first describe the different roles of protein complexes in disease through several structural aspects of interfaces. We then discuss some recent advances in predicting hot spots and communication pathway analysis in terms of amino acid networks. Finally, we highlight possible future aspects of this area with respect to both methodology development and applications for disease treatment.
Affiliation(s)
- Guang Hu
- Center for Systems Biology, School of Electronic and Information Engineering, Soochow University, Suzhou, 215006, China.
| | - Fei Xiao
- School of Basic Medicine and Biological Sciences, Medical College of Soochow University, Suzhou, 215123, China
| | - Yuqian Li
- School of Electronic Engineering, University of Electronic Science and Technology of China, Chengdu, 611731, China
| | - Yuan Li
- Center for Systems Biology, School of Electronic and Information Engineering, Soochow University, Suzhou, 215006, China
| | - Wanwipa Vongsangnak
- Department of Zoology, Faculty of Science, Kasetsart University, Bangkok, 10900, Thailand.
- Computational Biomodelling Laboratory for Agricultural Science and Technology (CBLAST), Faculty of Science, Kasetsart University, Bangkok, 10900, Thailand.
| |
|
25
|
Ghosh S, Chandra N, Vishveshwara S. Mechanism of Iron-Dependent Repressor (IdeR) Activation and DNA Binding: A Molecular Dynamics and Protein Structure Network Study. PLoS Comput Biol 2015; 11:e1004500. [PMID: 26699663 PMCID: PMC4689551 DOI: 10.1371/journal.pcbi.1004500] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.2] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/28/2015] [Accepted: 08/11/2015] [Indexed: 11/19/2022] Open
Abstract
Metalloproteins form a major class of enzymes in the living system that are involved in crucial biological functions such as catalysis, redox reactions and as 'switches' in signal transductions. Iron dependent repressor (IdeR) is a metal-sensing transcription factor that regulates free iron concentration in Mycobacterium tuberculosis. IdeR is also known to promote bacterial virulence, making it an important target in the field of therapeutics. Mechanistic details of how iron ions modulate IdeR such that it dimerizes and binds to DNA are not clearly understood. In this study, we have performed molecular dynamics simulations and integrated them with protein structure networks to study the influence of iron on IdeR structure and function. A significant structural variation between the metallated and the non-metallated system is observed. Our simulations clearly indicate the importance of iron in stabilizing its monomeric subunit, which in turn promotes dimerization. However, the most striking results are obtained from the simulations of IdeR-DNA complex in the absence of metals, where at the end of 100 ns simulations, the protein subunits are seen to rapidly dissociate away from the DNA, thereby forming an excellent resource to investigate the mechanism of DNA binding. We have also investigated the role of iron as an allosteric regulator of IdeR that positively induces IdeR-DNA complex formation. Based on this study, a mechanistic model of IdeR activation and DNA binding has been proposed.
Affiliation(s)
- Soma Ghosh
- Molecular Biophysics Unit, Indian Institute of Science, Bangalore, Karnataka, India
- I.I.Sc. Mathematics Initiative, Indian Institute of Science, Bangalore, Karnataka, India
| | - Nagasuma Chandra
- I.I.Sc. Mathematics Initiative, Indian Institute of Science, Bangalore, Karnataka, India
- Department of Biochemistry, Indian Institute of Science, Bangalore, Karnataka, India
| | - Saraswathi Vishveshwara
- Molecular Biophysics Unit, Indian Institute of Science, Bangalore, Karnataka, India
- I.I.Sc. Mathematics Initiative, Indian Institute of Science, Bangalore, Karnataka, India
| |
|
26
|
Tse A, Verkhivker GM. Molecular Dynamics Simulations and Structural Network Analysis of c-Abl and c-Src Kinase Core Proteins: Capturing Allosteric Mechanisms and Communication Pathways from Residue Centrality. J Chem Inf Model 2015; 55:1645-62. [DOI: 10.1021/acs.jcim.5b00240] [Citation(s) in RCA: 42] [Impact Index Per Article: 4.7] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/03/2023]
Affiliation(s)
- Amanda Tse
- Graduate Program in Computational and Data Sciences, Department of Computational Sciences, Schmid College of Science and Technology, Chapman University, One University Drive, Orange, California 92866, United States
| | - Gennady M. Verkhivker
- Graduate Program in Computational and Data Sciences, Department of Computational Sciences, Schmid College of Science and Technology, Chapman University, One University Drive, Orange, California 92866, United States
- Chapman University School of Pharmacy, Irvine, California 92618, United States
| |
|
27
|
A new hereditary congenital facial palsy case supports arg5 in HOX-DNA binding domain as possible hot spot for mutations. Eur J Med Genet 2015; 58:358-63. [PMID: 26007620 DOI: 10.1016/j.ejmg.2015.05.003] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.9] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/17/2015] [Accepted: 05/18/2015] [Indexed: 11/22/2022]
Abstract
Moebius syndrome (MBS) is a rare congenital disorder characterized by rhombencephalic maldevelopment, mainly presenting with facial palsy with limited gaze abduction. Most cases are sporadic, possibly caused by a combination of environmental and genetic factors; however, no proven specific associations have yet been established. Hereditary congenital facial palsy (HCFP) is an autosomal dominant congenital dysinnervation syndrome, recognizable by the isolated dysfunction of the seventh cranial nerve. Mutant mice for Hoxb1 were reported to present with facial weakness, resembling MBS. Recently a homozygous mutation altering the arg5 residue of the HOXB1 homeodomain into cys5 was identified in two families with HCFP. We screened 95 sporadic patients diagnosed as MBS or HCFP for mutations in HOXB1. A novel homozygous alteration was identified in one HCFP case, affecting the same residue and resulting in his5. In silico protein analysis predicted stronger HOXB1-DNA binding properties for his5 than for cys5, resulting in a milder phenotype. It should be noted that, inclusive of the previous report, the only two HOXB1 mutations found to be associated with HCFP both involve the same amino acid, arg5, which resides in the HOXB1-DNA-PBX1 ternary complex.
|
28
|
Shih ESC, Hwang MJ. NPPD: A Protein-Protein Docking Scoring Function Based on Dyadic Differences in Networks of Hydrophobic and Hydrophilic Amino Acid Residues. BIOLOGY 2015; 4:282-97. [PMID: 25811640 PMCID: PMC4498300 DOI: 10.3390/biology4020282] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Received: 11/27/2014] [Accepted: 03/16/2015] [Indexed: 11/16/2022]
Abstract
Protein-protein docking (PPD) predictions usually rely on the use of a scoring function to rank docking models generated by exhaustive sampling. To rank good models higher than bad ones, a large number of scoring functions have been developed and evaluated, but the methods used for the computation of PPD predictions remain largely unsatisfactory. Here, we report a network-based PPD scoring function, the NPPD, in which the network consists of two types of network nodes, one for hydrophobic and the other for hydrophilic amino acid residues, and the nodes are connected when the residues they represent are within a certain contact distance. We showed that network parameters that compute dyadic interactions and those that compute heterophilic interactions of the amino acid networks thus constructed allowed NPPD to perform well in a benchmark evaluation of 115 PPD scoring functions, most of which, unlike NPPD, are based on some sort of protein-protein interaction energy. We also showed that NPPD was highly complementary to these energy-based scoring functions, suggesting that the combined use of conventional scoring functions and NPPD might significantly improve the accuracy of current PPD predictions.
Affiliation(s)
- Edward S C Shih
- Institute of Biomedical Sciences, Academia Sinica, Nankang, Taipei 115, Taiwan.
| | - Ming-Jing Hwang
- Institute of Biomedical Sciences, Academia Sinica, Nankang, Taipei 115, Taiwan.
| |
|
29
|
Wilson KA, Wells RA, Abendong MN, Anderson CB, Kung RW, Wetmore SD. Landscape of π-π and sugar-π contacts in DNA-protein interactions. J Biomol Struct Dyn 2015; 34:184-200. [PMID: 25723403 DOI: 10.1080/07391102.2015.1013157] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.2] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/03/2023]
Abstract
There were 1765 contacts identified between DNA nucleobases or deoxyribose and cyclic (W, H, F, Y) or acyclic (R, E, D) amino acids in 672 X-ray structures of DNA-protein complexes. In this first study to compare π-interactions between the cyclic and acyclic amino acids, visual inspection was used to categorize amino acid interactions as nucleobase π-π (according to biological edge) or deoxyribose sugar-π (according to sugar edge). Overall, 54% of contacts are nucleobase π-π interactions, which involve all amino acids, but are more common for Y, F, and R, and involve all DNA nucleobases with similar frequencies. Among binding arrangements, cyclic amino acids prefer more planar (stacked) π-systems than the acyclic counterparts. Although sugar-π interactions were only previously identified with the cyclic amino acids and were found to be less common (38%) than nucleobase-cyclic amino acid contacts, sugar-π interactions are more common than nucleobase π-π contacts for the acyclic series (61% of contacts). Similar to DNA-protein π-π interactions, sugar-π contacts most frequently involve Y and R, although all amino acids adopt many binding orientations relative to deoxyribose. These DNA-protein π-interactions stabilize biological systems, by up to approximately -40 kJ mol⁻¹ for neutral nucleobase or sugar-amino acid interactions, but up to approximately -95 kJ mol⁻¹ for positively or negatively charged contacts. The high frequency and strength, despite variation in structure and composition, of these π-interactions point to an important function in biological systems.
Affiliation(s)
- Katie A Wilson
- Department of Chemistry and Biochemistry, University of Lethbridge, 4401 University Drive West, Lethbridge, AB T1K 3M4, Canada
| | - Rachael A Wells
- Department of Chemistry and Biochemistry, University of Lethbridge, 4401 University Drive West, Lethbridge, AB T1K 3M4, Canada
| | - Minette N Abendong
- Department of Chemistry and Biochemistry, University of Lethbridge, 4401 University Drive West, Lethbridge, AB T1K 3M4, Canada
| | - Colin B Anderson
- Department of Chemistry and Biochemistry, University of Lethbridge, 4401 University Drive West, Lethbridge, AB T1K 3M4, Canada
| | - Ryan W Kung
- Department of Chemistry and Biochemistry, University of Lethbridge, 4401 University Drive West, Lethbridge, AB T1K 3M4, Canada
| | - Stacey D Wetmore
- Department of Chemistry and Biochemistry, University of Lethbridge, 4401 University Drive West, Lethbridge, AB T1K 3M4, Canada
| |
|
30
|
Tse A, Verkhivker GM. Small-world networks of residue interactions in the Abl kinase complexes with cancer drugs: topology of allosteric communication pathways can determine drug resistance effects. MOLECULAR BIOSYSTEMS 2015; 11:2082-95. [DOI: 10.1039/c5mb00246j] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.7] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 12/12/2022]
Abstract
Computational modelling of efficiency and robustness of the residue interaction networks and allosteric pathways in kinase structures can characterize protein kinase sensitivity to drug binding and drug resistance effects.
Affiliation(s)
- A. Tse
- Graduate Program in Computational and Data Sciences, Department of Computational Sciences, Schmid College of Science and Technology, Chapman University, Orange
| | - G. M. Verkhivker
- Graduate Program in Computational and Data Sciences, Department of Computational Sciences, Schmid College of Science and Technology, Chapman University, Orange
| |
|
31
|
Wilson KA, Wetmore SD. A Survey of DNA–Protein π–Interactions: A Comparison of Natural Occurrences and Structures, and Computationally Predicted Structures and Strengths. CHALLENGES AND ADVANCES IN COMPUTATIONAL CHEMISTRY AND PHYSICS 2015. [DOI: 10.1007/978-3-319-14163-3_17] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Reference Citation Analysis] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 02/03/2023]
|
32
|
Hussain HB, Wilson KA, Wetmore SD. Serine and Cysteine π-Interactions in Nature: A Comparison of the Frequency, Structure, and Stability of Contacts Involving Oxygen and Sulfur. Aust J Chem 2015. [DOI: 10.1071/ch14598] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.4] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/22/2023]
Abstract
Despite many DNA–protein π-interactions in high-resolution crystal structures, only four X–H···π or X···π interactions were found between serine (Ser) or cysteine (Cys) and DNA nucleobase π-systems in over 100 DNA–protein complexes (where X = O for Ser and X = S for Cys). Nevertheless, 126 non-covalent contacts occur between Ser or Cys and the aromatic amino acids in many binding arrangements within proteins. Furthermore, Ser and Cys protein–protein π-interactions occur with similar frequencies and strengths. Most importantly, due to the great stability that can be provided to biological macromolecules (up to −20 kJ mol⁻¹ for neutral π-systems or −40 kJ mol⁻¹ for cationic π-systems), Ser and Cys π-interactions should be considered when analyzing protein stability and function.
|
33
|
Masoodi HR, Zakarianezhad M, Bagheri S, Makiabadi B, Shool M. Substituent effects on some calculated NMR data in T-shaped configuration of benzene dimer. Chem Phys Lett 2014. [DOI: 10.1016/j.cplett.2014.09.021] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.6] [Indexed: 11/16/2022]
|
34
|
Wilson KA, Kellie JL, Wetmore SD. DNA-protein π-interactions in nature: abundance, structure, composition and strength of contacts between aromatic amino acids and DNA nucleobases or deoxyribose sugar. Nucleic Acids Res 2014; 42:6726-41. [PMID: 24744240 PMCID: PMC4041443 DOI: 10.1093/nar/gku269] [Citation(s) in RCA: 108] [Impact Index Per Article: 10.8] [Indexed: 12/25/2022] Open
Abstract
Four hundred twenty-eight high-resolution DNA-protein complexes were chosen for a bioinformatics study. Although 164 crystal structures (38% of those searched) contained no interactions, 574 discrete π-contacts between the aromatic amino acids and the DNA nucleobases or deoxyribose were identified using strict criteria, including visual inspection. The abundance and structure of the interactions were determined by unequivocally classifying the contacts as either π-π stacking, π-π T-shaped or sugar-π contacts. Three hundred forty-four nucleobase-amino acid π-π contacts (60% of all interactions identified) were identified in 175 of the crystal structures searched. Unprecedented in the literature, 230 DNA-protein sugar-π contacts (40% of all interactions identified) were identified in 137 crystal structures, which involve C-H···π and/or lone-pair···π interactions, contain any amino acid and can be classified according to sugar atoms involved. Both π-π and sugar-π interactions display a range of relative monomer orientations and therefore interaction energies (up to −50 (−70) kJ mol⁻¹ for neutral (charged) interactions as determined using quantum chemical calculations). In general, DNA-protein π-interactions are more prevalent than perhaps currently accepted and the role of such interactions in many biological processes may have yet to be uncovered.
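The counts quoted in this abstract can be cross-checked with a few lines of arithmetic. The sketch below uses only the numbers stated in the abstract; the variable names are illustrative. The last step is an inclusion-exclusion inference that is not reported by the authors, and it assumes that the 164 structures "with no interactions" contain neither contact type while every other structure contains at least one.

```python
# Counts quoted in the abstract of Wilson, Kellie and Wetmore (2014).
total_structures = 428
structures_without_contacts = 164
pi_pi_contacts = 344
sugar_pi_contacts = 230
structures_with_pi_pi = 175
structures_with_sugar_pi = 137

# The two contact classes should add up to the quoted total of 574.
total_contacts = pi_pi_contacts + sugar_pi_contacts


def percent(part, whole):
    """Return part/whole as a percentage rounded to the nearest integer."""
    return round(100 * part / whole)


share_without = percent(structures_without_contacts, total_structures)
share_pi_pi = percent(pi_pi_contacts, total_contacts)
share_sugar_pi = percent(sugar_pi_contacts, total_contacts)

# Assumption (not stated in the abstract): every structure that is not in the
# "no interactions" group contains at least one pi-pi or sugar-pi contact.
structures_with_contacts = total_structures - structures_without_contacts
min_structures_with_both = (
    structures_with_pi_pi + structures_with_sugar_pi - structures_with_contacts
)

print(total_contacts, share_without, share_pi_pi, share_sugar_pi)
print(structures_with_contacts, min_structures_with_both)
```

Under that assumption, the quoted percentages are reproduced, and at least 48 of the 264 structures with contacts would have to contain both π-π and sugar-π contacts.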
Affiliation(s)
- Katie A Wilson
- Department of Chemistry and Biochemistry, University of Lethbridge, 4401 University Drive West, Lethbridge, AB, T1K 3M4, Canada
- Jennifer L Kellie
- Department of Chemistry and Biochemistry, University of Lethbridge, 4401 University Drive West, Lethbridge, AB, T1K 3M4, Canada
- Stacey D Wetmore
- Department of Chemistry and Biochemistry, University of Lethbridge, 4401 University Drive West, Lethbridge, AB, T1K 3M4, Canada
|
35
|
Yan W, Zhou J, Sun M, Chen J, Hu G, Shen B. The construction of an amino acid network for understanding protein structure and function. Amino Acids 2014; 46:1419-39. [PMID: 24623120 DOI: 10.1007/s00726-014-1710-6] [Citation(s) in RCA: 70] [Impact Index Per Article: 7.0] [Received: 04/05/2013] [Accepted: 02/21/2014] [Indexed: 01/08/2023]
Abstract
Amino acid networks (AANs) are undirected networks consisting of amino acid residues and their interactions in three-dimensional protein structures. The analysis of AANs provides novel insight into protein science, and several common amino acid network properties have revealed diverse classes of proteins. In this review, we first summarize methods for the construction and characterization of AANs. We then compare software tools for the construction and analysis of AANs. Finally, we review the application of AANs for understanding protein structure and function, including the identification of functional residues, the prediction of protein folding, analyzing protein stability and protein-protein interactions, and for understanding communication within and between proteins.
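As a rough illustration of the definition above (residues as nodes, interactions as undirected edges), the following minimal sketch builds a contact network from representative-atom coordinates using a distance cutoff. The coordinates, residue labels, the 7.0 Å cutoff and the function name `build_contact_network` are all hypothetical choices made for this example; the review compares several construction schemes, and this is not claimed to be any one of them.

```python
import math
from itertools import combinations


def build_contact_network(coords, cutoff=7.0):
    """Link every pair of residues whose representative atoms lie within
    `cutoff` angstroms; returns an undirected adjacency dict."""
    network = {name: set() for name in coords}
    for a, b in combinations(coords, 2):
        if math.dist(coords[a], coords[b]) <= cutoff:
            network[a].add(b)
            network[b].add(a)
    return network


# Hypothetical C-alpha coordinates (angstroms) for a five-residue toy chain.
toy_coords = {
    "ALA1": (0.0, 0.0, 0.0),
    "GLY2": (3.8, 0.0, 0.0),
    "SER3": (7.6, 0.0, 0.0),
    "LYS4": (7.6, 3.8, 0.0),
    "ASP5": (3.8, 3.8, 0.0),
}

network = build_contact_network(toy_coords)

# Node degree is one of the simplest network properties to read off.
degree = {residue: len(neighbors) for residue, neighbors in network.items()}
for residue in toy_coords:
    print(residue, degree[residue], sorted(network[residue]))
```

Changing the cutoff or the representative atom (for example side-chain centroids instead of C-alpha atoms) changes which edges exist, which is one reason different construction methods can yield different network properties for the same structure.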
Affiliation(s)
- Wenying Yan
- Center for Systems Biology, Soochow University, Suzhou, 215006, Jiangsu, China
|
36
|
Schneider B, Černý J, Svozil D, Čech P, Gelly JC, de Brevern AG. Bioinformatic analysis of the protein/DNA interface. Nucleic Acids Res 2014; 42:3381-94. [PMID: 24335080 PMCID: PMC3950675 DOI: 10.1093/nar/gkt1273] [Citation(s) in RCA: 41] [Impact Index Per Article: 4.1] [Received: 04/30/2013] [Revised: 11/14/2013] [Accepted: 11/14/2013] [Indexed: 01/04/2023] Open
Abstract
To investigate the principles driving recognition between proteins and DNA, we analyzed more than a thousand crystal structures of protein/DNA complexes. We classified protein and DNA conformations by structural alphabets, protein blocks [de Brevern, Etchebest and Hazout (2000) (Bayesian probabilistic approach for predicting backbone structures in terms of protein blocks. Prots. Struct. Funct. Genet., 41:271-287)] and dinucleotide conformers [Svozil, Kalina, Omelka and Schneider (2008) (DNA conformations and their sequence preferences. Nucleic Acids Res., 36:3690-3706)], respectively. Assembling the mutually interacting protein blocks and dinucleotide conformers into 'interaction matrices' revealed their correlations and conformer preferences at the interface relative to their occurrence outside the interface. The analyzed data demonstrated important differences between complexes of various types of proteins such as transcription factors and nucleases, distinct interaction patterns for the DNA minor groove relative to the major groove and phosphate, and the importance of water-mediated contacts. Water molecules mediate proportionally the largest number of contacts in the minor groove and form the largest proportion of contacts in complexes of transcription factors. The generally known induction of A-DNA forms by complexation was more accurately attributed to A-like and intermediate A/B conformers rare in naked DNA molecules.
Affiliation(s)
- Bohdan Schneider
- Institute of Biotechnology AS CR, Videnska 1083, CZ-142 20 Prague, Czech Republic, Laboratory of Informatics and Chemistry, Faculty of Chemical Technology, Institute of Chemical Technology Prague, Technická 5, CZ-166 28 Prague, Czech Republic, INSERM, U665, DSIMB, F-75739 Paris, France, University of Paris Diderot, Sorbonne Paris Cité, UMR_S 665, F-75739 Paris, France, Institut National de la Transfusion Sanguine (INTS), F-75739 Paris, France and Laboratoire d’Excellence GR-Ex, F-75739 Paris, France
- Jiří Černý
- Institute of Biotechnology AS CR, Videnska 1083, CZ-142 20 Prague, Czech Republic, Laboratory of Informatics and Chemistry, Faculty of Chemical Technology, Institute of Chemical Technology Prague, Technická 5, CZ-166 28 Prague, Czech Republic, INSERM, U665, DSIMB, F-75739 Paris, France, University of Paris Diderot, Sorbonne Paris Cité, UMR_S 665, F-75739 Paris, France, Institut National de la Transfusion Sanguine (INTS), F-75739 Paris, France and Laboratoire d’Excellence GR-Ex, F-75739 Paris, France
- Daniel Svozil
- Institute of Biotechnology AS CR, Videnska 1083, CZ-142 20 Prague, Czech Republic, Laboratory of Informatics and Chemistry, Faculty of Chemical Technology, Institute of Chemical Technology Prague, Technická 5, CZ-166 28 Prague, Czech Republic, INSERM, U665, DSIMB, F-75739 Paris, France, University of Paris Diderot, Sorbonne Paris Cité, UMR_S 665, F-75739 Paris, France, Institut National de la Transfusion Sanguine (INTS), F-75739 Paris, France and Laboratoire d’Excellence GR-Ex, F-75739 Paris, France
- Petr Čech
- Institute of Biotechnology AS CR, Videnska 1083, CZ-142 20 Prague, Czech Republic, Laboratory of Informatics and Chemistry, Faculty of Chemical Technology, Institute of Chemical Technology Prague, Technická 5, CZ-166 28 Prague, Czech Republic, INSERM, U665, DSIMB, F-75739 Paris, France, University of Paris Diderot, Sorbonne Paris Cité, UMR_S 665, F-75739 Paris, France, Institut National de la Transfusion Sanguine (INTS), F-75739 Paris, France and Laboratoire d’Excellence GR-Ex, F-75739 Paris, France
- Jean-Christophe Gelly
- Institute of Biotechnology AS CR, Videnska 1083, CZ-142 20 Prague, Czech Republic, Laboratory of Informatics and Chemistry, Faculty of Chemical Technology, Institute of Chemical Technology Prague, Technická 5, CZ-166 28 Prague, Czech Republic, INSERM, U665, DSIMB, F-75739 Paris, France, University of Paris Diderot, Sorbonne Paris Cité, UMR_S 665, F-75739 Paris, France, Institut National de la Transfusion Sanguine (INTS), F-75739 Paris, France and Laboratoire d’Excellence GR-Ex, F-75739 Paris, France
- Alexandre G. de Brevern
- Institute of Biotechnology AS CR, Videnska 1083, CZ-142 20 Prague, Czech Republic, Laboratory of Informatics and Chemistry, Faculty of Chemical Technology, Institute of Chemical Technology Prague, Technická 5, CZ-166 28 Prague, Czech Republic, INSERM, U665, DSIMB, F-75739 Paris, France, University of Paris Diderot, Sorbonne Paris Cité, UMR_S 665, F-75739 Paris, France, Institut National de la Transfusion Sanguine (INTS), F-75739 Paris, France and Laboratoire d’Excellence GR-Ex, F-75739 Paris, France
|
37
|
Ghosh S, Vishveshwara S. Ranking the quality of protein structure models using sidechain based network properties. F1000Res 2014; 3:17. [PMID: 25580218 PMCID: PMC4038323 DOI: 10.12688/f1000research.3-17.v1] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.1] [Accepted: 01/20/2014] [Indexed: 01/31/2023] Open
Abstract
Determining the correct structure of a protein given its sequence still remains an arduous task with many researchers working towards this goal. Most structure prediction methodologies result in the generation of a large number of probable candidates with the final challenge being to select the best amongst these. In this work, we have used Protein Structure Networks of native and modeled proteins in combination with Support Vector Machines to estimate the quality of a protein structure model and finally to provide ranks for these models. Model ranking is performed using regression analysis and helps in model selection from a group of many similar and good quality structures. Our results show that structures with a rank greater than 16 exhibit native protein-like properties while those below 10 are non-native like. The tool is also made available as a web-server (http://vishgraph.mbu.iisc.ernet.in/GraProStr/native_non_native_ranking.html), where five modelled structures can be evaluated at a time.
Affiliation(s)
- Soma Ghosh
- Molecular Biophysics Unit, Indian Institute of Science, Bangalore, 560012, India; I.I.Sc. Mathematics Initiative, Indian Institute of Science, Bangalore, 560012, India
|
38
|
Zou C, Gong J, Li H. An improved sequence based prediction protocol for DNA-binding proteins using SVM and comprehensive feature analysis. BMC Bioinformatics 2013; 14:90. [PMID: 23497329 PMCID: PMC3602657 DOI: 10.1186/1471-2105-14-90] [Citation(s) in RCA: 55] [Impact Index Per Article: 5.0] [Received: 09/29/2012] [Accepted: 03/04/2013] [Indexed: 11/10/2022] Open
Abstract
Background: DNA-binding proteins (DNA-BPs) play a pivotal role in both eukaryotic and prokaryotic proteomes. Several computational methods have been proposed in the literature to deal with DNA-BPs, and many informative features and properties have been used and shown to have a significant impact on this problem. However, the ultimate goal of bioinformatics is to be able to predict DNA-BPs directly from primary sequence. Results: In this work, the focus is on how to transform these informative features into a uniform numeric representation appropriately and improve the prediction accuracy of our SVM-based classifier for DNA-BPs. A systematic representation of some selected features known to perform well is investigated here. First, four kinds of protein properties are obtained and used to describe the protein sequence. Second, three different feature transformation methods (OCTD, AC and SAA) are adopted to obtain numeric feature vectors from three main levels of the protein sequence (global, nonlocal and local), and their performances are exhaustively investigated. Finally, the mRMR-IFS feature selection method and an ensemble learning approach are utilized to determine the best prediction model. In addition, the optimal features selected by mRMR-IFS are illustrated based on the observed results, which may provide useful insights for revealing the mechanisms of protein-DNA interactions. For five-fold cross-validation over the DNAdset and DNAaset, we obtained overall accuracies of 0.940 and 0.811 and MCCs of 0.881 and 0.614, respectively. Conclusions: The good results suggest that an entirely sequence-based protocol can be developed efficiently that transforms and integrates informative features from different scales for use by an SVM to predict DNA-BPs accurately. Moreover, a novel systematic framework for sequence descriptor-based protein function prediction is proposed here.
Affiliation(s)
- Chuanxin Zou
- Shanghai Key Laboratory of New Drug Design, State Key Laboratory of Bioreactor Engineering, School of Pharmacy, East China University of Science and Technology, Shanghai 200237, China
|
39
|
Gromiha MM, Nagarajan R. Computational approaches for predicting the binding sites and understanding the recognition mechanism of protein-DNA complexes. ADVANCES IN PROTEIN CHEMISTRY AND STRUCTURAL BIOLOGY 2013; 91:65-99. [PMID: 23790211 DOI: 10.1016/b978-0-12-411637-5.00003-2] [Citation(s) in RCA: 20] [Impact Index Per Article: 1.8] [Indexed: 12/27/2022]
Abstract
Protein-DNA recognition plays an important role in the regulation of gene expression. Understanding the influence of specific residues for protein-DNA interactions and the recognition mechanism of protein-DNA complexes is a challenging task in molecular and computational biology. Several computational approaches have been put forward to tackle these problems from different perspectives: (i) development of databases for the interactions between protein and DNA and binding specificity of protein-DNA complexes, (ii) structural analysis of protein-DNA complexes, (iii) discriminating DNA-binding proteins from amino acid sequence, (iv) prediction of DNA-binding sites and protein-DNA binding specificity using sequence and/or structural information, and (v) understanding the recognition mechanism of protein-DNA complexes. In this review, we focus on all these issues and extensively discuss the advancements on the development of comprehensive bioinformatics databases for protein-DNA interactions, efficient tools for identifying the binding sites, and plausible mechanisms for understanding the recognition of protein-DNA complexes. Further, the available online resources for understanding protein-DNA interactions are collectively listed, which will serve as ready-to-use information for the research community.
Affiliation(s)
- M Michael Gromiha
- Department of Biotechnology, Indian Institute of Technology Madras, Chennai, Tamil Nadu, India.
|
40
|
Chatterjee S, Ghosh S, Vishveshwara S. Network properties of decoys and CASP predicted models: a comparison with native protein structures. MOLECULAR BIOSYSTEMS 2013; 9:1774-88. [DOI: 10.1039/c3mb70157c] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.5] [Indexed: 12/28/2022]
|
41
|
Low JKK, Wilkins MR. Protein arginine methylation in Saccharomyces cerevisiae. FEBS J 2012; 279:4423-43. [PMID: 23094907 DOI: 10.1111/febs.12039] [Citation(s) in RCA: 43] [Impact Index Per Article: 3.6] [Received: 08/21/2012] [Revised: 10/10/2012] [Accepted: 10/19/2012] [Indexed: 11/27/2022]
Abstract
Recent research has implicated arginine methylation as a major regulator of cellular processes, including transcription, translation, nucleocytoplasmic transport, signalling, DNA repair, RNA processing and splicing. Arginine methylation is evolutionarily conserved, and it is now thought that it may rival other post-translational modifications such as phosphorylation in terms of its occurrence in the proteome. In addition, multiple recent examples demonstrate an exciting new theme: the interplay between methylation and other post-translational modifications such as phosphorylation. In this review, we summarize our current understanding of arginine methylation and the recent advances made, with a focus on the lower eukaryote Saccharomyces cerevisiae. We cover the types of methylated proteins, their responsible methyltransferases, where and how the effects of arginine methylation are seen in the cell, and, finally, discuss the conservation of the biological function of methylarginines between S. cerevisiae and mammals.
Affiliation(s)
- Jason K K Low
- Systems Biology Laboratory, School of Biotechnology and Biomolecular Sciences, The University of New South Wales, Sydney, Australia
|
42
|
Wang DD, Li TH, Sun JM, Li DP, Xiong WW, Wang WY, Tang SN. Shape string: a new feature for prediction of DNA-binding residues. Biochimie 2012; 95:354-8. [PMID: 23116714 DOI: 10.1016/j.biochi.2012.10.006] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.6] [Received: 06/24/2012] [Accepted: 10/08/2012] [Indexed: 10/27/2022]
Abstract
Protein-DNA interactions are involved in many biological processes essential for gene expression and regulation. To understand the molecular mechanisms of protein-DNA recognition, it is crucial to analyze and identify DNA-binding residues of protein-DNA complexes. Here, we proposed a novel descriptor, shape string, and two related features, shape string PSSM and shape string pair composition, to characterize DNA-binding residues. We employed the new features and the position-specific scoring matrix (PSSM) for modeling and prediction. The results on a benchmark dataset showed that our approach significantly improved the accuracy of the predictor. The overall accuracy of our approach reached 85.86% with 85.02% sensitivity and 86.02% specificity. The results also demonstrated that shape string is a powerful descriptor for the prediction of DNA-binding residues. The two additional related features enhanced the predictive value.
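The three performance figures quoted above are linked by a simple identity: overall accuracy is the average of sensitivity and specificity weighted by the class proportions. The sketch below solves that identity for the fraction of binding residues. It assumes all three values were computed on the same pooled set of residues, which the abstract does not state, and the function name is illustrative; the result is therefore only a rough consistency check, sensitive to rounding in the reported percentages.

```python
def implied_positive_fraction(accuracy, sensitivity, specificity):
    """Solve accuracy = p * sensitivity + (1 - p) * specificity for p,
    the fraction of positive (binding) residues in the evaluated set."""
    if sensitivity == specificity:
        raise ValueError("p is undetermined when sensitivity equals specificity")
    return (specificity - accuracy) / (specificity - sensitivity)


# Values quoted in the abstract, expressed as fractions.
p = implied_positive_fraction(accuracy=0.8586, sensitivity=0.8502, specificity=0.8602)
print(f"implied fraction of binding residues: {p:.2f}")
```

With the quoted values the implied binding fraction is about 0.16, which would be consistent with binding residues being a minority class in the benchmark.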
Affiliation(s)
- Duo-Duo Wang
- Department of Chemistry, Tongji University, Shanghai 200092, PR China
|
43
|
Kubrycht J, Sigler K, Souček P. Virtual interactomics of proteins from biochemical standpoint. Mol Biol Int 2012; 2012:976385. [PMID: 22928109 PMCID: PMC3423939 DOI: 10.1155/2012/976385] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.4] [Received: 03/27/2012] [Revised: 05/18/2012] [Accepted: 05/18/2012] [Indexed: 12/24/2022] Open
Abstract
Virtual interactomics represents a rapidly developing scientific area on the boundary line of bioinformatics and interactomics. Protein-related virtual interactomics then comprises instrumental tools for prediction, simulation, and networking of the majority of interactions important for structural and individual reproduction, differentiation, recognition, signaling, regulation, and metabolic pathways of cells and organisms. Here, we describe the main areas of virtual protein interactomics, that is, structurally based comparative analysis and prediction of functionally important interacting sites, mimotope-assisted and combined epitope prediction, molecular (protein) docking studies, and investigation of protein interaction networks. Detailed information about some interesting methodological approaches and online accessible programs or databases is displayed in our tables. A considerable part of the text deals with the searches for common conserved or functionally convergent protein regions and subgraphs of conserved interaction networks, new outstanding trends and clinically interesting results. In agreement with the presented data and relationships, virtual interactomic tools improve our scientific knowledge, help us to formulate working hypotheses, and frequently also mediate variously important in silico simulations.
Affiliation(s)
- Jaroslav Kubrycht
- Department of Physiology, Second Medical School, Charles University, 150 00 Prague, Czech Republic
- Karel Sigler
- Laboratory of Cell Biology, Institute of Microbiology, Academy of Sciences of the Czech Republic, 142 20 Prague, Czech Republic
- Pavel Souček
- Toxicogenomics Unit, National Institute of Public Health, 100 42 Prague, Czech Republic
|
44
|
Raimondi F, Felline A, Portella G, Orozco M, Fanelli F. Light on the structural communication in Ras GTPases. J Biomol Struct Dyn 2012; 31:142-57. [PMID: 22849539 DOI: 10.1080/07391102.2012.698379] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.9] [Indexed: 10/28/2022]
Abstract
The graph theory was combined with fluctuation dynamics to investigate the structural communication in four small G proteins, Arf1, H-Ras, RhoA, and Sec4. The topology of small GTPases is such that it requires the presence of the nucleotide to acquire a persistent structural network. The majority of communication paths involves the nucleotide and does not exist in the unbound forms. The latter are almost devoid of high-frequency paths. Thus, small Ras GTPases acquire the ability to transfer signals in the presence of nucleotide, suggesting that it modifies the intrinsic dynamics of the protein through the establishment of regions of hyperlinked nodes with high occurrence of correlated motions. The analysis of communication paths in the inactive (S(GDP)) and active (S(GTP)) states of the four G proteins strengthened the separation of the Ras-like domain into two dynamically distinct lobes, i.e. lobes 1 and 2, representing, respectively, the N-terminal and C-terminal halves of the domain. In the framework of this separation, interfunctional states and interfamily differences could be inferred. The structure network undergoes a reshaping depending on the bound nucleotide. Nucleotide-dependent divergences in structural communication reach the maximum in Arf1 and the minimum in RhoA. In Arf1, the nucleotide-dependent paths essentially express a communication between the G box 4 (G4) and distal portions of lobe 1. In the S(GDP) state, the G4 communicates with the N-term, while, in the S(GTP) state, the G4 communicates with the switch II. Clear differences could be also found between Arf1 and the other three G proteins. In Arf1, the nucleotide tends to communicate with distal portions of lobe 1, whereas in H-Ras, RhoA, and Sec4 it tends to communicate with a cluster of aromatic/hydrophobic amino acids in lobe 2. 
These differences may be linked, at least in part, to the divergent membrane anchoring modes that would involve the N-term for the Arf family and the C-term for the Rab/Ras/Rho families.
Affiliation(s)
- Francesco Raimondi
- Department of Chemistry, University of Modena and Reggio Emilia, Modena, Italy
|
45
|
Atilgan AR, Atilgan C. Local motifs in proteins combine to generate global functional moves. Brief Funct Genomics 2012; 11:479-88. [PMID: 22811517 DOI: 10.1093/bfgp/els027] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.1] [Indexed: 11/13/2022] Open
Abstract
Literature on the topological properties of folded proteins that has emerged as a field in its own right in the past decade is reviewed. Physics-based construction of coarse-grained models of proteins from knowledge of all-atom coordinates of the average structure is discussed. Once a network is thus obtained with the node and link information, local motifs provide a plethora of information on protein function. The hierarchical structure of the proteins manifested in the interrelations of local motifs is emphasized. Motifs are also related to modularity of the structure, and they quantify shifts in the landscapes upon conformational changes induced by, e.g. ligand binding. Redundancy emerges as a balance between local and global network descriptors and is related to the collectivity of the protein motions. Introducing weights on links followed by sequential removal of the least cohesive contacts allows interactions in proteins to be represented as the superposition of essential and redundant sets. Lack of the former makes the network non-functional, while the latter ensures robust functioning under a wide range of perturbation scenarios.
Affiliation(s)
- Ali Rana Atilgan
- Faculty of Engineering and Natural Sciences, Sabanci University, 34956 Istanbul, Turkey
|
46
|
Szabóová A, Kuželka O, Železný F, Tolar J. Prediction of DNA-binding propensity of proteins by the ball-histogram method using automatic template search. BMC Bioinformatics 2012; 13 Suppl 10:S3. [PMID: 22759427 PMCID: PMC3382442 DOI: 10.1186/1471-2105-13-s10-s3] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.1] [Indexed: 11/10/2022] Open
Abstract
We contribute a novel, ball-histogram approach to DNA-binding propensity prediction of proteins. Unlike state-of-the-art methods based on constructing an ad-hoc set of features describing physicochemical properties of the proteins, the ball-histogram technique enables a systematic, Monte-Carlo exploration of the spatial distribution of amino acids complying with automatically selected properties. This exploration yields a model for the prediction of DNA binding propensity. We validate our method in prediction experiments, improving on state-of-the-art accuracies. Moreover, our method also provides interpretable features involving spatial distributions of selected amino acids.
Affiliation(s)
- Andrea Szabóová
- Czech Technical University, Department of Cybernetics, Prague, 166 27, Czech Republic
- Ondřej Kuželka
- Czech Technical University, Department of Cybernetics, Prague, 166 27, Czech Republic
- Filip Železný
- Czech Technical University, Department of Cybernetics, Prague, 166 27, Czech Republic
- Jakub Tolar
- University of Minnesota, Department of Pediatrics, Blood and Marrow Transplantation, Minneapolis, USA
|
47
|
Dey S, Pal A, Guharoy M, Sonavane S, Chakrabarti P. Characterization and prediction of the binding site in DNA-binding proteins: improvement of accuracy by combining residue composition, evolutionary conservation and structural parameters. Nucleic Acids Res 2012; 40:7150-61. [PMID: 22641851 PMCID: PMC3424558 DOI: 10.1093/nar/gks405] [Citation(s) in RCA: 23] [Impact Index Per Article: 1.9] [Indexed: 12/25/2022] Open
Abstract
We present a set of four parameters that in combination can predict DNA-binding residues on protein structures to a high degree of accuracy. These are the number of evolutionary conserved residues (Ncons) and their spatial clustering (ρe), hydrogen bond donor capability (Dp) and residue propensity (Rp). We first used these parameters to characterize 130 interfaces in a set of 126 DNA-binding proteins (DBPs). The applicability of these parameters, both individually and in combination, to distinguish the true binding region from the rest of the protein surface was then analyzed. Rp shows the best performance, identifying the true interface with the top rank in 83% of cases. Importantly, we also used the unbound-bound test cases of the protein–DNA docking benchmark to test the efficacy of our method. When applied to the unbound form of the DBPs, Rp can distinguish 86% of cases. Finally, we have applied the SVM approach for recognizing the interface region using the above parameters along with the individual amino acid composition as attributes. The accuracy of prediction is 90.5% for the bound structures and 93.6% for the unbound form of the proteins.
Affiliation(s)
- Sucharita Dey
- Bioinformatics Centre, Bose Institute, P-1/12 CIT Scheme VIIM, Kolkata 700 054, India
|
48
|
Xiong Y, Xia J, Zhang W, Liu J. Exploiting a reduced set of weighted average features to improve prediction of DNA-binding residues from 3D structures. PLoS One 2011; 6:e28440. [PMID: 22174808 PMCID: PMC3234263 DOI: 10.1371/journal.pone.0028440] [Citation(s) in RCA: 26] [Impact Index Per Article: 2.0] [Received: 09/22/2011] [Accepted: 11/08/2011] [Indexed: 01/29/2023] Open
Abstract
Predicting DNA-binding residues from a protein three-dimensional structure is a key task of computational structural proteomics. In the present study, based on machine learning technology, we aim to explore a reduced set of weighted average features for improving prediction of DNA-binding residues on protein surfaces. Via constructing the spatial environment around a DNA-binding residue, a novel weighting factor is first proposed to quantify the distance-dependent contribution of each neighboring residue in determining the location of a binding residue. Then, a weighted average scheme is introduced to represent the surface patch of the residue under consideration. Finally, the classifier is trained on the reduced set of these weighted average features, consisting of evolutionary profile, interface propensity, betweenness centrality and solvent surface area of the side chain. Experimental results on 5-fold cross validation and independent tests indicate that the new feature set is effective for describing DNA-binding residues and our approach has significantly better performance than two previous methods. Furthermore, a brief case study suggests that the weighted average features are powerful for identifying DNA-binding residues and are promising for further study of protein structure-function relationship. The source code and datasets are available upon request.
Affiliation(s)
- Yi Xiong
- School of Computer, Wuhan University, Wuhan, China
- Junfeng Xia
- Department of Biomedical Informatics, School of Medicine, Vanderbilt University, Nashville, Tennessee, United States of America
- Wen Zhang
- School of Computer, Wuhan University, Wuhan, China
- Juan Liu
- School of Computer, Wuhan University, Wuhan, China
49
Pons C, Glaser F, Fernandez-Recio J. Prediction of protein-binding areas by small-world residue networks and application to docking. BMC Bioinformatics 2011; 12:378. [PMID: 21943333 PMCID: PMC3189935 DOI: 10.1186/1471-2105-12-378] [Citation(s) in RCA: 42] [Impact Index Per Article: 3.2] [Received: 03/30/2011] [Accepted: 09/26/2011] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND: Protein-protein interactions are involved in most cellular processes, and their detailed physico-chemical and structural characterization is needed in order to understand their function at the molecular level. In-silico docking tools can complement experimental techniques, providing three-dimensional structural models of such interactions at atomic resolution. In several recent studies, protein structures have been modeled as networks (or graphs), where the nodes represent residues and the connecting edges their interactions. From such networks, it is possible to calculate different topology-based values for each of the nodes, and to identify protein regions with high centrality scores, which are known to positively correlate with key functional residues, hot spots, and protein-protein interfaces.
RESULTS: Here we show that this correlation can be efficiently used for the scoring of rigid-body docking poses. When integrated into the pyDock energy-based docking method, the new combined scoring function significantly improved the results of the individual components as shown on a standard docking benchmark. This improvement was particularly remarkable for specific protein complexes, depending on the shape, size, type, or flexibility of the proteins involved.
CONCLUSIONS: The network-based representation of protein structures can be used to identify protein-protein binding regions and to efficiently score docking poses, complementing energy-based approaches.
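The residue-network idea in this abstract can be sketched as follows: residues become graph nodes, spatial contacts become edges, and a centrality score is computed per node. Everything specific here is an assumption for illustration, including the 8 Å contact cutoff, the use of one representative point per residue, the choice of closeness centrality, and the function names; this is not the pyDock implementation.

```python
import math
from collections import deque

def contact_graph(coords, cutoff=8.0):
    """Build an undirected residue graph: an edge joins residues within `cutoff`."""
    graph = {i: set() for i in range(len(coords))}
    for i in range(len(coords)):
        for j in range(i + 1, len(coords)):
            if math.dist(coords[i], coords[j]) <= cutoff:
                graph[i].add(j)
                graph[j].add(i)
    return graph

def closeness_centrality(graph):
    """Closeness of each node: (reachable nodes) / (sum of shortest-path lengths).

    Shortest paths are counted in edges using breadth-first search.
    Isolated nodes get a score of 0.0.
    """
    scores = {}
    for source in graph:
        dist = {source: 0}
        queue = deque([source])
        while queue:
            node = queue.popleft()
            for neighbor in graph[node]:
                if neighbor not in dist:
                    dist[neighbor] = dist[node] + 1
                    queue.append(neighbor)
        total = sum(dist.values())
        scores[source] = (len(dist) - 1) / total if total > 0 else 0.0
    return scores

# Toy example: five residues on a line, 5 angstroms apart, forming a path graph.
coords = [(5.0 * k, 0.0, 0.0) for k in range(5)]
scores = closeness_centrality(contact_graph(coords))
most_central = max(scores, key=scores.get)
```

In the toy chain the middle residue is the most central, which mirrors the intuition that high-centrality residues sit at well-connected positions in the structure.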
Affiliation(s)
- Carles Pons
- Joint BSC-IRB research programme in Computational Biology, Barcelona Supercomputing Center, Barcelona 08034, Spain
50
Rutledge LR, Navarro-Whyte L, Peterson TL, Wetmore SD. Effects of Extending the Computational Model on DNA–Protein T-shaped Interactions: The Case of Adenine–Histidine Dimers. J Phys Chem A 2011; 115:12646-58. [DOI: 10.1021/jp203248j] [Citation(s) in RCA: 12] [Impact Index Per Article: 0.9] [Indexed: 11/29/2022]
Affiliation(s)
- Lesley R. Rutledge
- Department of Chemistry and Biochemistry, University of Lethbridge, 4401 University Drive, Lethbridge, Alberta, Canada T1K 3M4
- Lex Navarro-Whyte
- Department of Chemistry and Biochemistry, University of Lethbridge, 4401 University Drive, Lethbridge, Alberta, Canada T1K 3M4
- Terri L. Peterson
- Department of Chemistry and Biochemistry, University of Lethbridge, 4401 University Drive, Lethbridge, Alberta, Canada T1K 3M4
- Stacey D. Wetmore
- Department of Chemistry and Biochemistry, University of Lethbridge, 4401 University Drive, Lethbridge, Alberta, Canada T1K 3M4