1. Su Z, Dhusia K, Wu Y. Encoding the space of protein-protein binding interfaces by artificial intelligence. Comput Biol Chem 2024; 110:108080. [PMID: 38643609 DOI: 10.1016/j.compbiolchem.2024.108080] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/15/2023] [Revised: 04/03/2024] [Accepted: 04/17/2024] [Indexed: 04/23/2024]
Abstract
The physical interactions between proteins are largely determined by the structural properties at their binding interfaces. It was found that the binding interfaces in distinctive protein complexes are highly similar. The structural properties underlying different binding interfaces could be further captured by artificial intelligence. In order to test this hypothesis, we broke protein-protein binding interfaces into pairs of interacting fragments. We employed a generative model to encode these interface fragment pairs in a low-dimensional latent space. After training, new conformations of interface fragment pairs were generated. We found that, by only using a small number of interface fragment pairs that were generated by artificial intelligence, we were able to guide the assembly of protein complexes into their native conformations. These results demonstrate that the conformational space of fragment pairs at protein-protein binding interfaces is highly degenerate. Features in this degenerate space can be well characterized by artificial intelligence. In summary, our machine learning method will be potentially useful to search for and predict the conformations of unknown protein-protein interactions.
Affiliation(s)
- Zhaoqian Su
  - Data Science Institute, Vanderbilt University, 1001 19th Ave S, Nashville, TN 37212, USA
- Kalyani Dhusia
  - Department of Systems and Computational Biology, Albert Einstein College of Medicine, 1300 Morris Park Avenue, Bronx, NY 10461, USA
- Yinghao Wu
  - Department of Systems and Computational Biology, Albert Einstein College of Medicine, 1300 Morris Park Avenue, Bronx, NY 10461, USA
2. Singh A, Copeland MM, Kundrotas PJ, Vakser IA. GRAMM Web Server for Protein Docking. Methods Mol Biol 2024; 2714:101-112. [PMID: 37676594 DOI: 10.1007/978-1-0716-3441-7_5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 09/08/2023]
Abstract
Prediction of the structure of protein complexes by docking methods is a well-established research field. The intermolecular energy landscapes in protein-protein interactions can be used to refine docking predictions and to detect macro-characteristics, such as the binding funnel. A new GRAMM web server for protein docking predicts a spectrum of docking poses that characterize the intermolecular energy landscape in protein interaction. A user-friendly interface provides options to choose free or template-based docking, as well as other advanced features, such as clustering of the docking poses, and interactive visualization of the docked models.
Affiliation(s)
- Amar Singh
  - Computational Biology Program and Department of Molecular Biosciences, The University of Kansas, Lawrence, KS, USA
- Matthew M Copeland
  - Computational Biology Program and Department of Molecular Biosciences, The University of Kansas, Lawrence, KS, USA
- Petras J Kundrotas
  - Computational Biology Program and Department of Molecular Biosciences, The University of Kansas, Lawrence, KS, USA
- Ilya A Vakser
  - Computational Biology Program and Department of Molecular Biosciences, The University of Kansas, Lawrence, KS, USA
3. Meng Q, Guo F, Wang E, Tang J. ComDock: A novel approach for protein-protein docking with an efficient fusing strategy. Comput Biol Med 2023; 167:107660. [PMID: 37944303 DOI: 10.1016/j.compbiomed.2023.107660] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 09/14/2023] [Revised: 10/08/2023] [Accepted: 10/31/2023] [Indexed: 11/12/2023]
Abstract
Protein-protein interactions play an important role in studying the mechanisms of protein function from a structural perspective. Molecular docking is a powerful computational approach for predicting protein-protein complexes, because traditional experimental methods are costly and time-consuming. Among existing technologies, the template-based method uses the structural information of known homologous 3D complexes as reliable templates to achieve high accuracy at low computational cost. However, its performance depends on the quality and quantity of the available templates. When templates are insufficient or absent, the ab initio docking method becomes necessary and greatly enriches the pool of docking conformations. It is therefore a feasible strategy to fuse the effectiveness of the template-based model with the universality of the ab initio model to improve docking performance. In this study, we construct a new, diverse, comprehensive template library derived from the PDB, containing 77,685 complexes. We propose a template-based method (named TemDock), which retrieves the evolutionary relationship between the target sequence and samples in the template library and transfers similar structural information. The target structure is then built by superposing it on the homologous template complex with TM-align. Moreover, we develop a consensus-based method (named ComDock) to integrate TemDock with an existing ab initio method (ZDOCK). On 105 targets with templates from Benchmark 5.0, TemDock and ComDock achieve success rates of 68.57% and 71.43% in the top 10 conformations, respectively. Compared with HDOCK, ComDock obtains better I-RMSD of hit configurations on 9 targets and more hit models in the top 100 conformations.
As an efficient method for protein-protein docking, ComDock is expected to help study protein-protein recognition and reveal the various biological pathways that are critical for drug discovery. The final results are stored at https://github.com/guofei-tju/mqz_ComDock_docking.
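The top-10 success rates quoted above can be read as the fraction of targets with at least one hit among the first 10 ranked models. The following is a minimal Python sketch of that metric, not the authors' evaluation code; the function name, the toy data, and the hit criterion (each model is simply pre-labelled as hit or non-hit) are assumptions made for illustration. On 105 targets, the reported 68.57% and 71.43% would correspond to 72 and 75 successful targets.

```python
def top_n_success_rate(ranked_hits, n=10):
    """Fraction of targets with at least one hit among the first n ranked models.

    ranked_hits maps a target ID to a list of booleans in rank order, where
    True marks a model judged to be a hit. The hit criterion itself is an
    assumption here and is decided before calling this function.
    """
    if not ranked_hits:
        raise ValueError("no targets given")
    successes = sum(1 for flags in ranked_hits.values() if any(flags[:n]))
    return successes / len(ranked_hits)


# Hypothetical toy data: three targets with five ranked models each.
toy = {
    "target_a": [False, True, False, False, False],
    "target_b": [False, False, False, False, False],
    "target_c": [False, False, False, True, False],
}
rate_top3 = top_n_success_rate(toy, n=3)  # only target_a has a hit in the top 3
rate_top5 = top_n_success_rate(toy, n=5)  # target_a and target_c succeed

# Arithmetic check of the percentages quoted in the abstract (105 targets).
temdock_pct = round(72 / 105 * 100, 2)
comdock_pct = round(75 / 105 * 100, 2)
```

Keeping the hit labelling separate from the counting makes it easy to swap in a different hit definition without touching the metric.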
Affiliation(s)
- Qiaozhen Meng
  - College of Intelligence and Computing, Tianjin University, Tianjin, China
- Fei Guo
  - School of Computer Science and Engineering, Central South University, Changsha, China
- Ercheng Wang
  - College of Pharmaceutical Sciences, Zhejiang University, Hangzhou, Zhejiang, China; Zhejiang Laboratory, Hangzhou, Zhejiang, China
- Jijun Tang
  - Shenzhen Institute of Advanced Technology of Chinese Academy of Sciences, Shenzhen, China
4. Xie P, Zhuang J, Tian G, Yang J. Emvirus: An embedding-based neural framework for human-virus protein-protein interactions prediction. Biosafety and Health 2023; 5:152-158. [PMID: 37362223 PMCID: PMC10166638 DOI: 10.1016/j.bsheal.2023.04.003] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 02/23/2023] [Revised: 04/23/2023] [Accepted: 04/23/2023] [Indexed: 06/28/2023] [Open Access]
Abstract
Human-virus protein-protein interactions (PPIs) play critical roles in viral infection. For example, the spike protein of severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) binds primarily to human angiotensin-converting enzyme 2 (ACE2) protein to infect human cells. Thus, identifying and blocking these PPIs contribute to controlling and preventing viruses. However, wet-lab experiment-based identification of human-virus PPIs is usually expensive, labor-intensive, and time-consuming, which presents the need for computational methods. Many machine-learning methods have been proposed recently and achieved good results in predicting human-virus PPIs. However, most methods are based on protein sequence features and apply manually extracted features, such as statistical characteristics, phylogenetic profiles, and physicochemical properties. In this work, we present an embedding-based neural framework with convolutional neural network (CNN) and bi-directional long short-term memory unit (Bi-LSTM) architecture, named Emvirus, to predict human-virus PPIs (including human-SARS-CoV-2 PPIs). In addition, we conduct cross-viral experiments to explore the generalization ability of Emvirus. Compared to other feature extraction methods, Emvirus achieves better prediction accuracy.
Affiliation(s)
- Pengfei Xie
  - College of Transportation Engineering, Dalian Maritime University, Dalian 116026, China
- Jujuan Zhuang
  - School of Science, Dalian Maritime University, Dalian 116026, China
- Geng Tian
  - Geneis Beijing Co., Ltd., Beijing 100102, China
  - Qingdao Geneis Institute of Big Data Mining and Precision Medicine, Qingdao 266000, China
- Jialiang Yang
  - Geneis Beijing Co., Ltd., Beijing 100102, China
  - Qingdao Geneis Institute of Big Data Mining and Precision Medicine, Qingdao 266000, China
5. Chakraborty A, Mitra S, Bhattacharjee M, De D, Pal AJ. Determining human-coronavirus protein-protein interaction using machine intelligence. Medicine in Novel Technology and Devices 2023; 18:100228. [PMID: 37056696 PMCID: PMC10077817 DOI: 10.1016/j.medntd.2023.100228] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/24/2022] [Revised: 03/29/2023] [Accepted: 04/01/2023] [Indexed: 04/08/2023] [Open Access]
Abstract
Severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) caused the novel coronavirus-19 (nCoV-19) pandemic, resulting in millions of fatalities globally. Recent research demonstrated that protein-protein interactions (PPIs) between SARS-CoV-2 and human proteins are responsible for viral pathogenesis. However, many of these PPIs are poorly understood and unexplored, necessitating a more in-depth investigation to find latent yet critical interactions. This article examines host-viral PPIs through the lens of machine learning (ML) and validates their biological significance using web-based tools. ML classifiers are designed based on comprehensive datasets with five sequence-based features of human proteins, namely Amino Acid Composition, Pseudo Amino Acid Composition, Conjoint Triad, Dipeptide Composition, and Normalized Auto Correlation. A majority-voting ensemble method composed of the Random Forest Model (RFM), AdaBoost, and Bagging is proposed, which delivers encouraging statistical performance compared with the other models employed in this work. The proposed ensemble model predicted a total of 111 possible human target proteins of SARS-CoV-2 with a likelihood factor ≥70%, validated using Gene Ontology (GO) and KEGG pathway enrichment analysis. Consequently, this research can aid in a deeper understanding of the molecular mechanisms underlying viral pathogenesis and provide clues for developing more efficient anti-COVID medications.
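Two of the sequence features named above and the majority-voting rule can be illustrated in a few lines. This is a minimal Python sketch under stated assumptions, not the authors' pipeline: the function names are hypothetical, the three base classifiers are represented only by their already-computed label lists, and the toy sequence is invented.

```python
from collections import Counter
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def amino_acid_composition(seq):
    """Relative frequency of each standard residue (20 values)."""
    counts = Counter(seq)
    total = sum(counts[a] for a in AMINO_ACIDS)
    if total == 0:
        raise ValueError("sequence contains no standard residues")
    return [counts[a] / total for a in AMINO_ACIDS]


def dipeptide_composition(seq):
    """Relative frequency of each ordered residue pair (400 values)."""
    pairs = [seq[i:i + 2] for i in range(len(seq) - 1)]
    valid = [p for p in pairs if p[0] in AMINO_ACIDS and p[1] in AMINO_ACIDS]
    if not valid:
        raise ValueError("sequence contains no valid dipeptides")
    counts = Counter(valid)
    return [counts[a + b] / len(valid) for a, b in product(AMINO_ACIDS, repeat=2)]


def majority_vote(predictions):
    """Per-sample majority label across classifiers.

    predictions is a list of equal-length label lists, one per classifier.
    With three binary classifiers, as assumed here, ties cannot occur.
    """
    return [Counter(labels).most_common(1)[0][0] for labels in zip(*predictions)]


# Hypothetical toy sequence and hypothetical classifier outputs.
aac = amino_acid_composition("ACCA")
dpc = dipeptide_composition("ACCA")
votes = majority_vote([
    [1, 0, 1],  # stand-in for Random Forest predictions
    [1, 1, 0],  # stand-in for AdaBoost predictions
    [0, 0, 1],  # stand-in for Bagging predictions
])
```

In practice the label lists would come from fitted models; hard voting over an odd number of binary classifiers keeps the rule free of tie-breaking choices.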
Affiliation(s)
- Arijit Chakraborty
  - Bachelor of Computer Application Department, The Heritage Academy, Kolkata, India
- Sajal Mitra
  - Department of Computer Science and Engineering, Heritage Institute of Technology, Kolkata, India
- Debashis De
  - Department of Computer Science and Engineering, Maulana Abul Kalam Azad University of Technology, Kolkata, India
6. Dhusia K, Su Z, Wu Y. Computational analyses of the interactome between TNF and TNFR superfamilies. Comput Biol Chem 2023; 103:107823. [PMID: 36682326 DOI: 10.1016/j.compbiolchem.2023.107823] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/06/2022] [Revised: 01/05/2023] [Accepted: 01/18/2023] [Indexed: 01/20/2023]
Abstract
Proteins in the tumor necrosis factor (TNF) superfamily (TNFSF) regulate diverse cellular processes by interacting with their receptors in the TNF receptor (TNFR) superfamily (TNFRSF). Ligands and receptors in these two superfamilies form a complicated network of interactions, in which the same ligand can bind to different receptors and the same receptor can be shared by different ligands. In order to study these interactions on a systematic level, a TNFSF-TNFRSF interactome was constructed in this study by searching the database which consists of both experimentally measured and computationally predicted protein-protein interactions (PPIs). The interactome contains a total number of 194 interactions between 18 TNFSF ligands and 29 TNFRSF receptors in human. We modeled the structure for each ligand-receptor interaction in the network. Their binding affinities were further computationally estimated based on modeled structures. Our computational outputs, which are all publicly accessible, serve as a valuable addition to the currently limited experimental resources to study TNF-mediated cell signaling.
Affiliation(s)
- Kalyani Dhusia
  - Department of Systems and Computational Biology, Albert Einstein College of Medicine, 1300 Morris Park Avenue, Bronx, NY 10461, USA
- Zhaoqian Su
  - Department of Systems and Computational Biology, Albert Einstein College of Medicine, 1300 Morris Park Avenue, Bronx, NY 10461, USA
- Yinghao Wu
  - Department of Systems and Computational Biology, Albert Einstein College of Medicine, 1300 Morris Park Avenue, Bronx, NY 10461, USA
7. Vora DS, Kalakoti Y, Sundar D. Computational Methods and Deep Learning for Elucidating Protein Interaction Networks. Methods Mol Biol 2023; 2553:285-323. [PMID: 36227550 DOI: 10.1007/978-1-0716-2617-7_15] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/16/2023]
Abstract
Protein interactions play a critical role in all biological processes, but experimental identification of protein interactions is a time- and resource-intensive process. The advances in next-generation sequencing and multi-omics technologies have greatly benefited large-scale predictions of protein interactions using machine learning methods. A wide range of tools have been developed to predict protein-protein, protein-nucleic acid, and protein-drug interactions. Here, we discuss the applications, methods, and challenges faced when employing the various prediction methods. We also briefly describe ways to overcome the challenges and prospective future developments in the field of protein interaction biology.
Affiliation(s)
- Dhvani Sandip Vora
  - Department of Biochemical Engineering and Biotechnology, Indian Institute of Technology Delhi, Hauz Khas, New Delhi, India
- Yogesh Kalakoti
  - Department of Biochemical Engineering and Biotechnology, Indian Institute of Technology Delhi, Hauz Khas, New Delhi, India
- Durai Sundar
  - Department of Biochemical Engineering and Biotechnology, Indian Institute of Technology Delhi, Hauz Khas, New Delhi, India
  - School of Artificial Intelligence, Indian Institute of Technology Delhi, Hauz Khas, New Delhi, India
8. Asim MN, Ibrahim MA, Malik MI, Dengel A, Ahmed S. LGCA-VHPPI: A local-global residue context aware viral-host protein-protein interaction predictor. PLoS One 2022; 17:e0270275. [PMID: 35789333 PMCID: PMC9255777 DOI: 10.1371/journal.pone.0270275] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 11/17/2021] [Accepted: 06/07/2022] [Indexed: 11/19/2022] [Open Access]
Abstract
Viral-host protein-protein interaction (PPI) analysis is essential for decoding the molecular mechanisms of viral pathogenesis and host immunity, which eventually helps to control viral diseases and optimize therapeutics. The state-of-the-art viral-host PPI predictor leverages an unsupervised embedding learning technique (doc2vec) to generate statistical representations of viral-host protein sequences and a Random Forest classifier for interaction prediction. However, the doc2vec approach generates these representations by modelling only the local context of residues, which only partially captures residue semantics. This paper proposes a novel technique for generating better statistical representations of viral and host protein sequences by infusing comprehensive local and global contextual information about the residues. Local residue context aware encoding captures semantic relatedness and short-range dependencies of residues, while global residue context aware encoding captures comprehensive long-range residue dependencies, positional invariance of residues, and the unique residue combination distribution important for interaction prediction. Using concatenated rich statistical representations of viral and host protein sequences, a robust machine learning framework, “LGCA-VHPPI”, is developed, which uses a deep forest model to effectively capture the complex non-linearity of viral-host PPI sequences. An in-depth performance comparison of the proposed LGCA-VHPPI framework with existing viral-host PPI predictors based on diverse sequence encoding schemes reveals that LGCA-VHPPI outperforms the state-of-the-art predictor by 6%, 2%, and 2% in terms of Matthews correlation coefficient on 3 different benchmark viral-host PPI prediction datasets.
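The comparison above is reported in terms of the Matthews correlation coefficient (MCC), which for a binary confusion matrix is (TP·TN − FP·FN) / sqrt((TP+FP)(TP+FN)(TN+FP)(TN+FN)). The following is a minimal Python sketch of that standard formula; the function name and the confusion-matrix counts are hypothetical and are not taken from the paper, and returning 0.0 when the denominator is zero is a common convention assumed here.

```python
import math


def matthews_corrcoef_from_counts(tp, tn, fp, fn):
    """Matthews correlation coefficient from binary confusion-matrix counts."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        # Assumed convention: undefined MCC is reported as 0.0.
        return 0.0
    return (tp * tn - fp * fn) / denom


# Hypothetical counts for an interaction / non-interaction classifier.
mcc_example = matthews_corrcoef_from_counts(tp=40, tn=45, fp=5, fn=10)
mcc_perfect = matthews_corrcoef_from_counts(tp=10, tn=10, fp=0, fn=0)
```

Unlike accuracy, MCC uses all four cells of the confusion matrix, which is why it is often preferred when interacting and non-interacting pairs are imbalanced.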
Affiliation(s)
- Muhammad Nabeel Asim
  - Department of Computer Science, Technical University of Kaiserslautern, Kaiserslautern, Germany
  - German Research Center for Artificial Intelligence GmbH, Kaiserslautern, Germany
- Muhammad Ali Ibrahim
  - Department of Computer Science, Technical University of Kaiserslautern, Kaiserslautern, Germany
  - German Research Center for Artificial Intelligence GmbH, Kaiserslautern, Germany
- Muhammad Imran Malik
  - National Center of Artificial Intelligence, National University of Sciences and Technology, Islamabad, Pakistan
- Andreas Dengel
  - Department of Computer Science, Technical University of Kaiserslautern, Kaiserslautern, Germany
  - German Research Center for Artificial Intelligence GmbH, Kaiserslautern, Germany
- Sheraz Ahmed
  - German Research Center for Artificial Intelligence GmbH, Kaiserslautern, Germany
9. Malladi S, Powell HR, David A, Islam SA, Copeland MM, Kundrotas PJ, Sternberg MJ, Vakser IA. GWYRE: A resource for mapping variants onto experimental and modeled structures of human protein complexes. J Mol Biol 2022; 434:167608. [PMID: 35662458 PMCID: PMC9188266 DOI: 10.1016/j.jmb.2022.167608] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 11/30/2021] [Revised: 03/31/2022] [Accepted: 04/20/2022] [Indexed: 02/08/2023]
Abstract
Structure of protein complexes is important for interpreting genetic variation. Data on single amino acid variants is available from high-throughput sequencing. Integrated modeling approach was applied to proteins and their complexes. GWYRE resource incorporates predicted protein complexes with mapped mutations.
Rapid progress in structural modeling of proteins and their interactions is powered by advances in knowledge-based methodologies along with better understanding of physical principles of protein structure and function. The pool of structural data for modeling of proteins and protein–protein complexes is constantly increasing due to the rapid growth of protein interaction databases and Protein Data Bank. The GWYRE (Genome Wide PhYRE) project capitalizes on these developments by advancing and applying new powerful modeling methodologies to structural modeling of protein–protein interactions and genetic variation. The methods integrate knowledge-based tertiary structure prediction using Phyre2 and quaternary structure prediction using template-based docking by a full-structure alignment protocol to generate models for binary complexes. The predictions are incorporated in a comprehensive public resource for structural characterization of the human interactome and the location of human genetic variants. The GWYRE resource facilitates better understanding of principles of protein interaction and structure/function relationships. The resource is available at http://www.gwyre.org.
10. Elhabashy H, Merino F, Alva V, Kohlbacher O, Lupas AN. Exploring protein-protein interactions at the proteome level. Structure 2022; 30:462-475. [DOI: 10.1016/j.str.2022.02.004] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 08/09/2021] [Revised: 10/26/2021] [Accepted: 02/02/2022] [Indexed: 02/08/2023]
11. Prediction and Modeling of Protein–Protein Interactions Using “Spotted” Peptides with a Template-Based Approach. Biomolecules 2022; 12:biom12020201. [PMID: 35204702 PMCID: PMC8961654 DOI: 10.3390/biom12020201] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 12/29/2021] [Revised: 01/20/2022] [Accepted: 01/22/2022] [Indexed: 12/10/2022] [Open Access]
Abstract
Protein–peptide interactions (PpIs) are a subset of the overall protein–protein interaction (PPI) network in the living cell and are pivotal for the majority of cell processes and functions. High-throughput methods to detect PpIs and PPIs usually require time and costs that are not always affordable. Therefore, reliable in silico predictions represent a valid and effective alternative. In this work, a new algorithm is described, implemented in a freely available tool, “PepThreader”, to carry out prediction and analysis of PPIs and PpIs. PepThreader threads multiple fragments derived from a full-length protein sequence (or from a peptide library) onto a second template peptide, in complex with a protein target, “spotting” the potential binding peptides and ranking them according to a sequence-based and structure-based threading score. The threading algorithm first uses a scoring function based on peptide sequence similarity. The initial hits are then reranked according to structure-based scoring functions. PepThreader has been benchmarked on a dataset of 292 protein–peptide complexes collected from existing databases of experimentally determined protein–peptide interactions. An accuracy of 80% was achieved when considering the top 25 predicted hits, which is comparable to other state-of-the-art tools for modeling PPIs and PpIs. Nonetheless, PepThreader is unique in that it can simultaneously spot a binding peptide within a full-length sequence involved in a PPI and model its structure within the receptor. Therefore, PepThreader adds to the already available tools supporting the experimental identification and characterization of PPIs and PpIs.
12. Dhusia K, Madrid C, Su Z, Wu Y. EXCESP: A Structure-Based Online Database for Extracellular Interactome of Cell Surface Proteins in Humans. J Proteome Res 2022; 21:349-359. [PMID: 34978816 DOI: 10.1021/acs.jproteome.1c00612] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/29/2022]
Abstract
The interactions between ectodomains of cell surface proteins are vital players in many important cellular processes, such as regulating immune responses, coordinating cell differentiation, and shaping neural plasticity. However, while the construction of a large-scale protein interactome has been greatly facilitated by the development of high-throughput experimental techniques, little progress has been made to support the discovery of the extracellular interactome for cell surface proteins. Harnessing recent advances in computational modeling of protein-protein interactions, here we present a structure-based online database for the extracellular interactome of cell surface proteins in humans, called EXCESP. The database contains both experimentally determined and computationally predicted interactions among all type-I transmembrane proteins in humans. Structural models for all these interactions and their binding affinities were further computationally modeled. Moreover, information from other online resources, such as the expression level of each protein in different cell types and its relation to various signaling pathways, has also been integrated into the database. In summary, the database serves as a valuable addition to the existing online resources for the study of cell surface proteins. It can contribute to the understanding of the functions of cell surface proteins in the era of systems biology.
Affiliation(s)
- Kalyani Dhusia
  - Department of Systems and Computational Biology, Albert Einstein College of Medicine, 1300 Morris Park Avenue, Bronx, New York 10461, United States
- Carlos Madrid
  - Department of Systems and Computational Biology, Albert Einstein College of Medicine, 1300 Morris Park Avenue, Bronx, New York 10461, United States
  - Laboratory for Macromolecular Analysis and Proteomics, Albert Einstein College of Medicine, 1300 Morris Park Avenue, Bronx, New York 10461, United States
- Zhaoqian Su
  - Department of Systems and Computational Biology, Albert Einstein College of Medicine, 1300 Morris Park Avenue, Bronx, New York 10461, United States
- Yinghao Wu
  - Department of Systems and Computational Biology, Albert Einstein College of Medicine, 1300 Morris Park Avenue, Bronx, New York 10461, United States
13. Xie J, Zheng J, Hong X, Tong X, Liu X, Song Q, Liu S, Liu S. Protein-DNA complex structure modeling based on structural template. Biochem Biophys Res Commun 2021; 577:152-157. [PMID: 34517213 DOI: 10.1016/j.bbrc.2021.09.018] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/29/2021] [Revised: 09/05/2021] [Accepted: 09/06/2021] [Indexed: 10/20/2022]
Abstract
DNA binding is an important feature of proteins, and protein-DNA interaction is involved in many life processes. Various computational methods have been developed to predict protein-DNA complex structures because such structures are difficult to obtain experimentally. However, prediction of protein-DNA complexes is still a challenging problem compared with prediction of protein-RNA complexes; this may be due to the large conformational changes between bound and unbound structures in both protein and DNA. We extend PRIME 2.0 to PRIME 2.0.1 to model protein-DNA complex structures. By comparing sequence and structure alignment methods, we found that structure-based methods can find more templates than sequence-based methods. The results of all-to-all structure alignments showed that DNA structure plays an important role in the prediction of protein-DNA complex structures. By exploring the relationship between sequence and structure, we found that in protein-DNA interactions, numerous complexes with dissimilar sequences have similar 3D structures and perform similar functions.
Affiliation(s)
- Juan Xie
  - School of Physics, Huazhong University of Science and Technology, Wuhan, Hubei, 430074, China
- Jinfang Zheng
  - School of Physics, Huazhong University of Science and Technology, Wuhan, Hubei, 430074, China
- Xu Hong
  - School of Physics, Huazhong University of Science and Technology, Wuhan, Hubei, 430074, China
- Xiaoxue Tong
  - School of Physics, Huazhong University of Science and Technology, Wuhan, Hubei, 430074, China
- Xudong Liu
  - School of Physics, Huazhong University of Science and Technology, Wuhan, Hubei, 430074, China
- Qi Song
  - Key Laboratory of Fermentation Engineering (Ministry of Education), Hubei University of Technology, China
- Sen Liu
  - Key Laboratory of Fermentation Engineering (Ministry of Education), Hubei University of Technology, China
- Shiyong Liu
  - School of Physics, Huazhong University of Science and Technology, Wuhan, Hubei, 430074, China
14. Soltanikazemi E, Quadir F, Roy RS, Guo Z, Cheng J. Distance-based reconstruction of protein quaternary structures from inter-chain contacts. Proteins 2021; 90:720-731. [PMID: 34716620 PMCID: PMC8816881 DOI: 10.1002/prot.26269] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 07/16/2021] [Revised: 09/25/2021] [Accepted: 10/12/2021] [Indexed: 12/21/2022]
Abstract
Predicting the quaternary structure of protein complexes is an important problem. Inter‐chain residue‐residue contact prediction can provide useful information to guide the ab initio reconstruction of quaternary structures. However, few methods have been developed to build quaternary structures from predicted inter‐chain contacts. Here, we develop the first method based on gradient descent optimization (GD) to build quaternary structures of protein dimers utilizing inter‐chain contacts as distance restraints. We evaluate GD on several datasets of homodimers and heterodimers using true/predicted contacts and monomer structures as input. GD consistently performs better than both simulated annealing and Markov Chain Monte Carlo simulation. Starting from an arbitrary quaternary structure randomly initialized from the tertiary structures of protein chains and using true inter‐chain contacts as input, GD can reconstruct high‐quality structural models for homodimers and heterodimers with average TM‐score ranging from 0.92 to 0.99 and average interface root mean square distance from 0.72 Å to 1.64 Å. On a dataset of 115 homodimers, using predicted inter‐chain contacts as restraints, the average TM‐score of the structural models built by GD is 0.76. For 46% of the homodimers, high‐quality structural models with TM‐score ≥ 0.9 are reconstructed from predicted contacts. There is a strong correlation between the quality of the reconstructed models and the precision and recall of predicted contacts. Only a moderate precision or recall of inter‐chain contact prediction is needed to build good structural models for most homodimers. Moreover, GD improves the quality of quaternary structures predicted by AlphaFold2 on a Critical Assessment of Techniques for Protein Structure Prediction–Critical Assessments of Predictions of Interactions dataset.
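The idea of using inter-chain contacts as distance restraints in a gradient descent can be illustrated with a deliberately reduced toy problem. This minimal Python sketch is not the published GD method: it optimizes only a rigid translation of one chain (no rotation), uses a squared-error restraint loss, and assumes an 8.0 Å target distance per contact; all names, coordinates, and hyperparameters are hypothetical.

```python
import math


def restraint_loss(chain_a, chain_b, contacts, shift, target=8.0):
    """Sum of squared deviations of contact distances from the target distance."""
    loss = 0.0
    for i, j in contacts:
        moved = [b + s for b, s in zip(chain_b[j], shift)]
        loss += (math.dist(chain_a[i], moved) - target) ** 2
    return loss


def loss_gradient(chain_a, chain_b, contacts, shift, target=8.0):
    """Analytic gradient of restraint_loss with respect to the translation."""
    grad = [0.0, 0.0, 0.0]
    for i, j in contacts:
        diff = [b + s - a for a, b, s in zip(chain_a[i], chain_b[j], shift)]
        dist = math.sqrt(sum(x * x for x in diff))
        if dist == 0.0:
            continue  # gradient is undefined at zero distance; skip this term
        scale = 2.0 * (dist - target) / dist
        for k in range(3):
            grad[k] += scale * diff[k]
    return grad


def dock_by_translation(chain_a, chain_b, contacts, steps=500, lr=0.05, target=8.0):
    """Plain gradient descent on the translation applied to chain_b."""
    shift = [0.0, 0.0, 0.0]
    for _ in range(steps):
        grad = loss_gradient(chain_a, chain_b, contacts, shift, target)
        shift = [s - lr * g for s, g in zip(shift, grad)]
    return shift


# Hypothetical toy coordinates: two residues per chain, chains 30 Å apart.
chain_a = [(0.0, 0.0, 0.0), (4.0, 0.0, 0.0)]
chain_b = [(30.0, 0.0, 0.0), (34.0, 0.0, 0.0)]
contacts = [(0, 0), (1, 1)]  # (index in chain_a, index in chain_b)

initial_loss = restraint_loss(chain_a, chain_b, contacts, [0.0, 0.0, 0.0])
shift = dock_by_translation(chain_a, chain_b, contacts)
final_loss = restraint_loss(chain_a, chain_b, contacts, shift)
```

In this toy case the descent pulls chain_b toward chain_a until both restrained pairs sit near the target distance; a realistic reconstruction would also need rotational degrees of freedom and a way to avoid steric clashes.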
Affiliation(s)
- Elham Soltanikazemi
- Department of Electrical Engineering and Computer Science, University of Missouri, Columbia, Missouri, USA
| | - Farhan Quadir
- Department of Electrical Engineering and Computer Science, University of Missouri, Columbia, Missouri, USA
| | - Raj S Roy
- Department of Electrical Engineering and Computer Science, University of Missouri, Columbia, Missouri, USA
| | - Zhiye Guo
- Department of Electrical Engineering and Computer Science, University of Missouri, Columbia, Missouri, USA
| | - Jianlin Cheng
- Department of Electrical Engineering and Computer Science, University of Missouri, Columbia, Missouri, USA
|
15
|
Hadarovich A, Chakravarty D, Tuzikov AV, Ben-Tal N, Kundrotas PJ, Vakser IA. Structural motifs in protein cores and at protein-protein interfaces are different. Protein Sci 2020; 30:381-390. [PMID: 33166001 DOI: 10.1002/pro.3996] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 07/17/2020] [Revised: 10/30/2020] [Accepted: 10/31/2020] [Indexed: 11/10/2022]
Abstract
Structures of proteins and protein-protein complexes are determined by the same physical principles and thus share a number of similarities. At the same time, there could be differences because in order to function, proteins interact with other molecules, undergo conformational changes, and so forth, which might impose different restraints on the tertiary versus quaternary structures. This study focuses on structural properties of protein-protein interfaces in comparison with the protein core, based on the wealth of currently available structural data and new structure-based approaches. The results showed that physicochemical characteristics, such as amino acid composition, residue-residue contact preferences, and hydrophilicity/hydrophobicity distributions, are similar in protein core and protein-protein interfaces. On the other hand, characteristics that reflect the evolutionary pressure, such as structural composition and packing, are largely different. The results provide important insight into fundamental properties of protein structure and function. At the same time, the results contribute to better understanding of the ways to dock proteins. Recent progress in predicting structures of individual proteins follows the advancement of deep learning techniques and new approaches to residue coevolution data. Protein core could potentially provide large amounts of data for application of the deep learning to docking. However, our results showed that the core motifs are significantly different from those at protein-protein interfaces, and thus may not be directly useful for docking. At the same time, such difference may help to overcome a major obstacle in application of the coevolutionary data to docking: discrimination of the intramolecular information not directly relevant to docking.
Affiliation(s)
- Anna Hadarovich
- Computational Biology Program, The University of Kansas, Lawrence, Kansas, USA.,United Institute of Informatics Problems, National Academy of Sciences, Minsk, Belarus
| | - Devlina Chakravarty
- Computational Biology Program, The University of Kansas, Lawrence, Kansas, USA.,Department of Chemistry, Rutgers University, Camden, New Jersey, USA
| | - Alexander V Tuzikov
- United Institute of Informatics Problems, National Academy of Sciences, Minsk, Belarus
| | - Nir Ben-Tal
- Department of Biochemistry and Molecular Biology, George S. Wise Faculty of Life Sciences, Tel Aviv University, Tel Aviv, Israel
| | - Petras J Kundrotas
- Computational Biology Program, The University of Kansas, Lawrence, Kansas, USA
| | - Ilya A Vakser
- Computational Biology Program, The University of Kansas, Lawrence, Kansas, USA.,Department of Molecular Biosciences, The University of Kansas, Lawrence, Kansas, USA
|
16
|
Su Z, Dhusia K, Wu Y. Understand the Functions of Scaffold Proteins in Cell Signaling by a Mesoscopic Simulation Method. Biophys J 2020; 119:2116-2126. [PMID: 33113350 DOI: 10.1016/j.bpj.2020.10.002] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Received: 03/18/2020] [Revised: 08/24/2020] [Accepted: 10/07/2020] [Indexed: 02/02/2023] Open
Abstract
Scaffold proteins are central players in regulating the spatial-temporal organization of many important signaling pathways in cells. They offer physical platforms to downstream signaling proteins so that their transient interactions in a crowded and heterogeneous environment of cytosol can be greatly facilitated. However, most scaffold proteins tend to simultaneously bind more than one signaling molecule, which leads to the spatial assembly of multimeric protein complexes. The kinetics of these protein oligomerizations are difficult to quantify by traditional experimental approaches. To understand the functions of scaffold proteins in cell signaling, we developed a, to our knowledge, new hybrid simulation algorithm in which both spatial organization and binding kinetics of proteins were implemented. We applied this new technique to a simple network system that contains three molecules. One molecule in the network is a scaffold protein, whereas the other two are its binding targets in the downstream signaling pathway. Each of the three molecules in the system contains two binding motifs that can interact with each other and are connected by a flexible linker. By applying the new simulation method to the model, we show that the scaffold proteins will promote not only thermodynamics but also kinetics of cell signaling given the premise that the interaction between the two signaling molecules is transient. Moreover, by changing the flexibility of the linker between two binding motifs, our results suggest that the conformational fluctuations in a scaffold protein play a positive role in recruiting downstream signaling molecules. In summary, this study showcases the capability of computational simulation in understanding the general principles of scaffold protein functions.
Affiliation(s)
- Zhaoqian Su
- Department of Systems and Computational Biology, Albert Einstein College of Medicine, Bronx, New York
| | - Kalyani Dhusia
- Department of Systems and Computational Biology, Albert Einstein College of Medicine, Bronx, New York
| | - Yinghao Wu
- Department of Systems and Computational Biology, Albert Einstein College of Medicine, Bronx, New York.
|
17
|
Padhorny D, Porter KA, Ignatov M, Alekseenko A, Beglov D, Kotelnikov S, Ashizawa R, Desta I, Alam N, Sun Z, Brini E, Dill K, Schueler-Furman O, Vajda S, Kozakov D. ClusPro in rounds 38 to 45 of CAPRI: Toward combining template-based methods with free docking. Proteins 2020; 88:1082-1090. [PMID: 32142178 DOI: 10.1002/prot.25887] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 10/06/2019] [Revised: 02/27/2020] [Accepted: 03/04/2020] [Indexed: 01/01/2023]
Abstract
Targets in the protein docking experiment CAPRI (Critical Assessment of Predicted Interactions) generally present new challenges and contribute to new developments in methodology. In rounds 38 to 45 of CAPRI, most targets could be effectively predicted using template-based methods. However, the server ClusPro required structures rather than sequences as input, and hence we had to generate and dock homology models. The available templates also provided distance restraints that were directly used as input to the server. We show here that such an approach has some advantages. Free docking with template-based restraints using ClusPro reproduced some interfaces suggested by weak or ambiguous templates while not reproducing others, resulting in correct server predicted models. More recently we developed the fully automated ClusPro TBM server that performs template-based modeling and thus can use sequences rather than structures of component proteins as input. The performance of the server, freely available for noncommercial use at https://tbm.cluspro.org, is demonstrated by predicting the protein-protein targets of rounds 38 to 45 of CAPRI.
Affiliation(s)
- Dzmitry Padhorny
- Department of Applied Mathematics and Statistics, Stony Brook University, Stony Brook, New York, USA.,Laufer Center for Physical and Quantitative Biology, Stony Brook University, Stony Brook, New York, USA
| | - Kathryn A Porter
- Department of Biomedical Engineering, Boston University, Boston, Massachusetts, USA
| | - Mikhail Ignatov
- Department of Applied Mathematics and Statistics, Stony Brook University, Stony Brook, New York, USA.,Laufer Center for Physical and Quantitative Biology, Stony Brook University, Stony Brook, New York, USA
| | - Andrey Alekseenko
- Department of Applied Mathematics and Statistics, Stony Brook University, Stony Brook, New York, USA.,Laufer Center for Physical and Quantitative Biology, Stony Brook University, Stony Brook, New York, USA.,Institute of Computer Aided Design of the Russian Academy of Sciences, Moscow, Russia
| | - Dmitri Beglov
- Department of Biomedical Engineering, Boston University, Boston, Massachusetts, USA.,Acpharis Inc., Holliston, Massachusetts, USA
| | - Sergei Kotelnikov
- Department of Applied Mathematics and Statistics, Stony Brook University, Stony Brook, New York, USA.,Laufer Center for Physical and Quantitative Biology, Stony Brook University, Stony Brook, New York, USA.,Innopolis University, Innopolis, Russia
| | - Ryota Ashizawa
- Department of Applied Mathematics and Statistics, Stony Brook University, Stony Brook, New York, USA.,Laufer Center for Physical and Quantitative Biology, Stony Brook University, Stony Brook, New York, USA
| | - Israel Desta
- Department of Biomedical Engineering, Boston University, Boston, Massachusetts, USA
| | - Nawsad Alam
- Department of Microbiology and Molecular Genetics, Institute for Medical Research Israel-Canada, Faculty of Medicine, The Hebrew University, Jerusalem, Israel
| | - Zhuyezi Sun
- Department of Biomedical Engineering, Boston University, Boston, Massachusetts, USA
| | - Emiliano Brini
- Laufer Center for Physical and Quantitative Biology, Stony Brook University, Stony Brook, New York, USA
| | - Ken Dill
- Laufer Center for Physical and Quantitative Biology, Stony Brook University, Stony Brook, New York, USA.,Department of Physics and Astronomy, Stony Brook University, Stony Brook, New York, USA.,Department of Chemistry, Stony Brook University, Stony Brook, New York, USA
| | - Ora Schueler-Furman
- Department of Microbiology and Molecular Genetics, Institute for Medical Research Israel-Canada, Faculty of Medicine, The Hebrew University, Jerusalem, Israel
| | - Sandor Vajda
- Department of Biomedical Engineering, Boston University, Boston, Massachusetts, USA.,Department of Chemistry, Boston University, Boston, Massachusetts, USA
| | - Dima Kozakov
- Department of Applied Mathematics and Statistics, Stony Brook University, Stony Brook, New York, USA.,Laufer Center for Physical and Quantitative Biology, Stony Brook University, Stony Brook, New York, USA
|
18
|
Singh A, Dauzhenka T, Kundrotas PJ, Sternberg MJE, Vakser IA. Application of docking methodologies to modeled proteins. Proteins 2020; 88:1180-1188. [PMID: 32170770 DOI: 10.1002/prot.25889] [Citation(s) in RCA: 16] [Impact Index Per Article: 4.0] [Received: 10/07/2019] [Revised: 02/15/2020] [Accepted: 03/07/2020] [Indexed: 12/12/2022]
Abstract
Protein docking is essential for structural characterization of protein interactions. Besides providing the structure of protein complexes, modeling of proteins and their complexes is important for understanding the fundamental principles and specific aspects of protein interactions. The accuracy of protein modeling, in general, is still less than that of the experimental approaches. Thus, it is important to investigate the applicability of docking techniques to modeled proteins. We present new comprehensive benchmark sets of protein models for the development and validation of protein docking, as well as a systematic assessment of free and template-based docking techniques on these sets. As opposed to previous studies, the benchmark sets reflect the real case modeling/docking scenario where the accuracy of the models is assessed by the modeling procedure, without reference to the native structure (which would be unknown in practical applications). We also expanded the analysis to include docking of protein pairs where proteins have different structural accuracy. The results show that, in general, the template-based docking is less sensitive to the structural inaccuracies of the models than the free docking. The near-native docking poses generated by the template-based approach, typically, also have higher ranks than those produced by the free docking (although the free docking is indispensable in modeling the multiplicity of protein interactions in a crowded cellular environment). The results show that docking techniques are applicable to protein models in a broad range of modeling accuracy. The study provides clear guidelines for practical applications of docking to protein models.
Affiliation(s)
- Amar Singh
- Computational Biology Program, The University of Kansas, Lawrence, Kansas, USA
| | - Taras Dauzhenka
- Computational Biology Program, The University of Kansas, Lawrence, Kansas, USA
| | - Petras J Kundrotas
- Computational Biology Program, The University of Kansas, Lawrence, Kansas, USA
| | - Michael J E Sternberg
- Centre for Integrative Systems Biology and Bioinformatics, Department of Life Sciences, Imperial College London, South Kensington, London, UK
| | - Ilya A Vakser
- Computational Biology Program, The University of Kansas, Lawrence, Kansas, USA.,Department of Molecular Biosciences, The University of Kansas, Lawrence, Kansas, USA
|
19
|
Chakravarty D, McElfresh GW, Kundrotas PJ, Vakser IA. How to choose templates for modeling of protein complexes: Insights from benchmarking template-based docking. Proteins 2020; 88:1070-1081. [PMID: 31994759 DOI: 10.1002/prot.25875] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Received: 09/18/2019] [Revised: 01/07/2020] [Accepted: 01/22/2020] [Indexed: 01/01/2023]
Abstract
Comparative docking is based on experimentally determined structures of protein-protein complexes (templates), following the paradigm that proteins with similar sequences and/or structures form similar complexes. Modeling utilizing structure similarity of target monomers to template complexes significantly expands structural coverage of the interactome. Template-based docking by structure alignment can be performed for the entire structures or by aligning targets to the bound interfaces of the experimentally determined complexes. Systematic benchmarking of docking protocols based on full and interface structure alignment showed that both protocols perform similarly, with top 1 docking success rate 26%. However, in terms of the models' quality, the interface-based docking performed marginally better. The interface-based docking is preferable when one would suspect a significant conformational change in the full protein structure upon binding, for example, a rearrangement of the domains in multidomain proteins. Importantly, if the same structure is selected as the top template by both full and interface alignment, the docking success rate increases 2-fold for both top 1 and top 10 predictions. Matching structural annotations of the target and template proteins for template detection, as a computationally less expensive alternative to structural alignment, did not improve the docking performance. Sophisticated remote sequence homology detection added templates to the pool of those identified by structure-based alignment, suggesting that for practical docking, the combination of the structure alignment protocols and the remote sequence homology detection may be useful in order to avoid potential flaws in generation of the structural templates library.
Affiliation(s)
| | - G W McElfresh
- Computational Biology Program, The University of Kansas, Lawrence, Kansas
| | - Petras J Kundrotas
- Computational Biology Program, The University of Kansas, Lawrence, Kansas
| | - Ilya A Vakser
- Computational Biology Program, The University of Kansas, Lawrence, Kansas.,Department of Molecular Biosciences, The University of Kansas, Lawrence, Kansas
|
20
|
Vreven T, Vangaveti S, Borrman TM, Gaines JC, Weng Z. Performance of ZDOCK and IRAD in CAPRI rounds 39-45. Proteins 2020; 88:1050-1054. [PMID: 31994784 DOI: 10.1002/prot.25873] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Received: 10/04/2019] [Revised: 12/15/2019] [Accepted: 01/22/2020] [Indexed: 12/23/2022]
Abstract
We report docking performance on the six targets of Critical Assessment of PRedicted Interactions (CAPRI) rounds 39-45 that involved heteromeric protein-protein interactions and had the solved structures released since the rounds were held. Our general strategy involved protein-protein docking using ZDOCK, reranking using IRAD, and structural refinement using Rosetta. In addition, we made extensive use of experimental data to guide our docking runs. All the experimental information at the amino-acid level proved correct. However, for two targets, we also used protein-complex structures as templates for modeling interfaces. These resulted in incorrect predictions, presumably due to the low sequence identity between the targets and templates. Albeit a small number of targets, the performance described here compared somewhat less favorably with our previous CAPRI reports, which may be due to the CAPRI targets being increasingly challenging.
Affiliation(s)
- Thom Vreven
- Program in Bioinformatics and Integrative Biology, University of Massachusetts Medical School, Worcester, Massachusetts
| | - Sweta Vangaveti
- Program in Bioinformatics and Integrative Biology, University of Massachusetts Medical School, Worcester, Massachusetts
| | - Tyler M Borrman
- Program in Bioinformatics and Integrative Biology, University of Massachusetts Medical School, Worcester, Massachusetts
| | - Jennifer C Gaines
- Program in Bioinformatics and Integrative Biology, University of Massachusetts Medical School, Worcester, Massachusetts
| | - Zhiping Weng
- Program in Bioinformatics and Integrative Biology, University of Massachusetts Medical School, Worcester, Massachusetts
|
21
|
Yang X, Yang S, Li Q, Wuchty S, Zhang Z. Prediction of human-virus protein-protein interactions through a sequence embedding-based machine learning method. Comput Struct Biotechnol J 2019; 18:153-161. [PMID: 31969974 PMCID: PMC6961065 DOI: 10.1016/j.csbj.2019.12.005] [Citation(s) in RCA: 68] [Impact Index Per Article: 13.6] [Received: 10/10/2019] [Revised: 11/29/2019] [Accepted: 12/10/2019] [Indexed: 12/11/2022] Open
Abstract
The identification of human-virus protein-protein interactions (PPIs) is an essential and challenging research topic, potentially providing a mechanistic understanding of viral infection. Given that the experimental determination of human-virus PPIs is time-consuming and labor-intensive, computational methods are playing an important role in providing testable hypotheses, complementing the determination of large-scale interactome between species. In this work, we applied an unsupervised sequence embedding technique (doc2vec) to represent protein sequences as rich feature vectors of low dimensionality. Training a Random Forest (RF) classifier through a training dataset that covers known PPIs between human and all viruses, we obtained excellent predictive accuracy outperforming various combinations of machine learning algorithms and commonly-used sequence encoding schemes. In a rigorous comparison with three existing human-virus PPI prediction methods, our proposed computational framework further provided very competitive and promising performance, suggesting that the doc2vec encoding scheme effectively captures context information of protein sequences, pertaining to corresponding protein-protein interactions. Our approach is freely accessible through our web server as part of our host-pathogen PPI prediction platform (http://zzdlab.com/InterSPPI/). Taken together, we hope the current work not only contributes a useful predictor to accelerate the exploration of human-virus PPIs, but also provides some meaningful insights into human-virus relationships.
Key Words
- AC, Auto Covariance
- ACC, Accuracy
- AUC, area under the ROC curve
- AUPRC, area under the PR curve
- Adaboost, Adaptive Boosting
- CT, Conjoint Triad
- Doc2vec
- Embedding
- Human-virus interaction
- LD, Local Descriptor
- MCC, Matthews correlation coefficient
- ML, machine learning
- MLP, Multiple Layer Perceptron
- MS, mass spectroscopy
- Machine learning
- PPIs, protein-protein interactions
- PR, Precision-Recall
- Prediction
- Protein-protein interaction
- RBF, radial basis function
- RF, Random Forest
- ROC, Receiver Operating Characteristic
- SGD, stochastic gradient descent
- SVM, Support Vector Machine
- Y2H, yeast two-hybrid
Affiliation(s)
- Xiaodi Yang
- State Key Laboratory of Agrobiotechnology, College of Biological Sciences, China Agricultural University, Beijing 100193, China
| | - Shiping Yang
- State Key Laboratory of Plant Physiology and Biochemistry, College of Biological Sciences, China Agricultural University, Beijing 100193, China
| | - Qinmengge Li
- National Demonstration Center for Experimental Biological Sciences Education, College of Biological Sciences, China Agricultural University, Beijing 100193, China
| | - Stefan Wuchty
- Dept. of Computer Science, University of Miami, Miami, FL 33146, USA
- Dept. of Biology, University of Miami, Miami, FL 33146, USA
- Center of Computational Science, University of Miami, Miami, FL 33146, USA
- Sylvester Comprehensive Cancer Center, University of Miami, Miami, FL 33136, USA
| | - Ziding Zhang
- State Key Laboratory of Agrobiotechnology, College of Biological Sciences, China Agricultural University, Beijing 100193, China
|
22
|
Mirabello C, Wallner B. Topology independent structural matching discovers novel templates for protein interfaces. Bioinformatics 2019; 34:i787-i794. [PMID: 30423106 DOI: 10.1093/bioinformatics/bty587] [Citation(s) in RCA: 12] [Impact Index Per Article: 2.4] [Indexed: 12/18/2022] Open
Abstract
Motivation: Protein-protein interactions (PPI) are essential for the function of the cellular machinery. The rapid growth of protein-protein complexes with known 3D structures offers a unique opportunity to study PPI to gain crucial insights into protein function and the causes of many diseases. In particular, it would be extremely useful to compare interaction surfaces of monomers, as this would enable the pinpointing of potential interaction surfaces based solely on the monomer structure, without the need to predict the complete complex structure. While there are many structural alignment algorithms for individual proteins, very few have been developed for protein interfaces, and none that can align only the interface residues to other interfaces or surfaces of interacting monomer subunits in a topology independent (non-sequential) manner. Results: We present InterComp, a method for topology and sequence-order independent structural comparisons. The method is general and can be applied to various structural comparison applications. By representing residues as independent points in space rather than as a sequence of residues, InterComp can be applied to a wide range of problems including interface-surface comparisons and interface-interface comparisons. We demonstrate a use-case by applying InterComp to find similar protein interfaces on the surface of proteins. We show that InterComp pinpoints the correct interface for almost half of the targets (283 of 586) when considering the top 10 hits, and for 24% of the top 1, even when no templates can be found with regular sequence-order dependent structural alignment methods. Availability and implementation: The source code and the datasets are available at: http://wallnerlab.org/InterComp. Supplementary information: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Claudio Mirabello
- Division of Bioinformatics, Department of Physics, Chemistry and Biology, Linköping University, Linköping SE, Sweden
| | - Björn Wallner
- Division of Bioinformatics, Department of Physics, Chemistry and Biology, Linköping University, Linköping SE, Sweden
|
23
|
Dapkūnas J, Olechnovič K, Venclovas Č. Structural modeling of protein complexes: Current capabilities and challenges. Proteins 2019; 87:1222-1232. [PMID: 31294859 DOI: 10.1002/prot.25774] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.0] [Received: 04/30/2019] [Revised: 06/21/2019] [Accepted: 07/06/2019] [Indexed: 12/27/2022]
Abstract
Proteins frequently interact with each other, and the knowledge of structures of the corresponding protein complexes is necessary to understand how they function. Computational methods are increasingly used to provide structural models of protein complexes. Not surprisingly, community-wide Critical Assessment of protein Structure Prediction (CASP) experiments have recently started monitoring the progress in this research area. We participated in CASP13 with the aim to evaluate our current capabilities in modeling of protein complexes and to gain a better understanding of factors that exert the largest impact on these capabilities. To model protein complexes in CASP13, we applied template-based modeling, free docking and hybrid techniques that enabled us to generate models of the topmost quality for 27 of 42 multimers. If templates for protein complexes could be identified, we modeled the structures with reasonable accuracy by straightforward homology modeling. If only partial templates were available, it was nevertheless possible to predict the interaction interfaces correctly or to generate acceptable models for protein complexes by combining template-based modeling with docking. If no templates were available, we used rigid-body docking with limited success. However, in some free docking models, despite the incorrect subunit orientation and missed interface contacts, the approximate location of protein binding sites was identified correctly. Apparently, our overall performance in docking was limited by the quality of monomer models and by the imperfection of scoring methods. The impact of human intervention on our results in modeling of protein complexes was significant indicating the need for improvements of automatic methods.
Affiliation(s)
- Justas Dapkūnas
- Institute of Biotechnology, Life Sciences Center, Vilnius University, Vilnius, Lithuania
| | - Kliment Olechnovič
- Institute of Biotechnology, Life Sciences Center, Vilnius University, Vilnius, Lithuania
| | - Česlovas Venclovas
- Institute of Biotechnology, Life Sciences Center, Vilnius University, Vilnius, Lithuania
|
24
|
Johansson-Åkhe I, Mirabello C, Wallner B. Predicting protein-peptide interaction sites using distant protein complexes as structural templates. Sci Rep 2019; 9:4267. [PMID: 30862810 PMCID: PMC6414505 DOI: 10.1038/s41598-019-38498-7] [Citation(s) in RCA: 23] [Impact Index Per Article: 4.6] [Received: 08/28/2018] [Accepted: 12/31/2018] [Indexed: 01/07/2023] Open
Abstract
Protein-peptide interactions play an important role in major cellular processes, and are associated with several human diseases. To understand and potentially regulate these cellular functions and diseases it is important to know the molecular details of the interactions. However, because of peptide flexibility and the transient nature of protein-peptide interactions, peptides are difficult to study experimentally. Thus, computational methods for predicting structural information about protein-peptide interactions are needed. Here we present InterPep, a pipeline for predicting protein-peptide interaction sites. It is a novel pipeline that, given a protein structure and a peptide sequence, utilizes structural template matches, sequence information, random forest machine learning, and hierarchical clustering to predict what region of the protein structure the peptide is most likely to bind. When tested on its ability to predict binding sites, InterPep successfully pinpointed 255 of 502 (50.7%) binding sites in experimentally determined structures at rank 1 and 348 of 502 (69.3%) among the top five predictions using only structures with no significant sequence similarity as templates. InterPep is a powerful tool for identifying peptide-binding sites; with a precision of 80% at a recall of 20% it should be an excellent starting point for docking protocols or experiments investigating peptide interactions. The source code for InterPep is available at http://wallnerlab.org/InterPep/.
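The evaluation measures quoted in this abstract (success at rank 1 and within the top five, and precision at a given recall) can be made concrete with a small sketch. This is generic bookkeeping, not InterPep's evaluation code; the function names and the toy inputs are hypothetical.

```python
def top_k_success_rate(first_correct_ranks, k):
    """Fraction of targets whose first correct prediction is ranked k or better.

    first_correct_ranks: one entry per target; the 1-based rank of the first
    correct prediction, or None if no prediction for that target was correct.
    """
    hits = sum(1 for rank in first_correct_ranks if rank is not None and rank <= k)
    return hits / len(first_correct_ranks)


def precision_recall(predicted, actual):
    """Precision and recall of a predicted set of binding residues."""
    predicted, actual = set(predicted), set(actual)
    true_positives = len(predicted & actual)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(actual) if actual else 0.0
    return precision, recall


# Toy data (illustrative only): six targets and one residue-level prediction.
ranks = [1, 3, None, 7, 1, 2]
top1 = top_k_success_rate(ranks, 1)
top5 = top_k_success_rate(ranks, 5)

predicted_site = {1, 2, 3, 4, 99}   # 4 of the 5 predicted residues are correct
true_site = set(range(1, 21))       # 20 residues in the true binding site
precision, recall = precision_recall(predicted_site, true_site)
```

In the toy residue-level example, a short, confident prediction gives high precision with low recall, which is the regime the abstract describes as a useful starting point for docking.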
Affiliation(s)
- Isak Johansson-Åkhe
- Division of Bioinformatics, Department of Physics, Chemistry and Biology, Linköping University, SE-581 83, Linköping, Sweden
| | - Claudio Mirabello
- Division of Bioinformatics, Department of Physics, Chemistry and Biology, Linköping University, SE-581 83, Linköping, Sweden
| | - Björn Wallner
- Division of Bioinformatics, Department of Physics, Chemistry and Biology, Linköping University, SE-581 83, Linköping, Sweden.
|
25
|
Hadarovich A, Anishchenko I, Tuzikov AV, Kundrotas PJ, Vakser IA. Gene ontology improves template selection in comparative protein docking. Proteins 2018; 87:245-253. [PMID: 30520123 DOI: 10.1002/prot.25645] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Received: 05/26/2018] [Revised: 10/21/2018] [Accepted: 11/29/2018] [Indexed: 02/06/2023]
Abstract
Structural characterization of protein-protein interactions is essential for our ability to study life processes at the molecular level. Computational modeling of protein complexes (protein docking) is important as the source of their structure and as a way to understand the principles of protein interaction. Rapidly evolving comparative docking approaches utilize target/template similarity metrics, which are often based on the protein structure. Although the structural similarity, generally, yields good performance, other characteristics of the interacting proteins (e.g., function, biological process, and localization) may improve the prediction quality, especially in the case of weak target/template structural similarity. For the ranking of a pool of models for each target, we tested scoring functions that quantify similarity of Gene Ontology (GO) terms assigned to target and template proteins in three ontology domains: biological process, molecular function, and cellular component (GO-score). The scoring functions were tested in docking of bound, unbound, and modeled proteins. The results indicate that the combined structural and GO-terms functions improve the scoring, especially in the twilight zone of structural similarity, typical for protein models of limited accuracy.
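One way to picture a combined structural and GO-term score is the sketch below. The abstract does not specify the form of the GO-score, so everything here is an assumption chosen for illustration: Jaccard overlap of GO term sets per ontology domain, an unweighted mean across the three domains, and a linear blend with a structural similarity score through a hypothetical `weight` parameter.

```python
DOMAINS = ("biological_process", "molecular_function", "cellular_component")


def jaccard(terms_a, terms_b):
    """Jaccard overlap of two GO term sets (0.0 when both are empty)."""
    terms_a, terms_b = set(terms_a), set(terms_b)
    union = terms_a | terms_b
    return len(terms_a & terms_b) / len(union) if union else 0.0


def go_score(target_terms, template_terms):
    """Assumed GO-score: mean Jaccard overlap across the three ontology domains."""
    per_domain = [
        jaccard(target_terms.get(d, ()), template_terms.get(d, ())) for d in DOMAINS
    ]
    return sum(per_domain) / len(DOMAINS)


def combined_score(structural_score, go, weight=0.5):
    """Assumed linear blend of structural similarity and GO-score (both in [0, 1])."""
    return (1.0 - weight) * structural_score + weight * go


# Illustrative annotations for one target/template pair.
target = {
    "biological_process": {"GO:0006955", "GO:0007165"},
    "molecular_function": {"GO:0005515"},
    "cellular_component": {"GO:0005886"},
}
template = {
    "biological_process": {"GO:0007165"},
    "molecular_function": {"GO:0005515"},
    "cellular_component": {"GO:0005634"},
}
go = go_score(target, template)
score = combined_score(structural_score=0.4, go=go)
```

With a weak structural score, as in this toy pair, the GO term overlap is what moves the template up or down the ranking, which matches the regime where the abstract reports the largest benefit.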
Affiliation(s)
- Anna Hadarovich
- Computational Biology Program, The University of Kansas, Lawrence, Kansas.,United Institute of Informatics Problems, National Academy of Sciences, Minsk, Belarus
| | - Ivan Anishchenko
- Computational Biology Program, The University of Kansas, Lawrence, Kansas
| | - Alexander V Tuzikov
- United Institute of Informatics Problems, National Academy of Sciences, Minsk, Belarus
| | - Petras J Kundrotas
- Computational Biology Program, The University of Kansas, Lawrence, Kansas
| | - Ilya A Vakser
- Computational Biology Program, The University of Kansas, Lawrence, Kansas.,Department of Molecular Biosciences, The University of Kansas, Kansas, Lawrence
| |
26
Inhibition of protein interactions: co-crystalized protein-protein interfaces are nearly as good as holo proteins in rigid-body ligand docking. J Comput Aided Mol Des 2018; 32:769-779. [PMID: 30003468 DOI: 10.1007/s10822-018-0124-z] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Received: 12/03/2017] [Accepted: 05/22/2018] [Indexed: 12/15/2022]
Abstract
Modulating protein interaction pathways may lead to the cure of many diseases. Known protein-protein inhibitors bind to large pockets on the protein-protein interface. Such large pockets are also detected in protein-protein complexes without known inhibitors, making such complexes potentially druggable. The inhibitor-binding site is primarily defined by the side chains that form the largest pocket in the protein-bound conformation. Low-resolution ligand docking shows that the success rate for the protein-bound conformation is close to that for the ligand-bound conformation, and significantly higher than for the apo conformation. The conformational change on the protein interface upon binding to the other protein results in a pocket employed by the ligand when it binds to that interface. This proof-of-concept study suggests that rather than using computational pocket-opening procedures, one can opt for an experimentally determined structure of the target co-crystallized protein-protein complex as a starting point for drug design.
27
Kundrotas PJ, Anishchenko I, Badal VD, Das M, Dauzhenka T, Vakser IA. Modeling CAPRI targets 110-120 by template-based and free docking using contact potential and combined scoring function. Proteins 2018; 86 Suppl 1:302-310. [PMID: 28905425 PMCID: PMC5820180 DOI: 10.1002/prot.25380] [Citation(s) in RCA: 17] [Impact Index Per Article: 2.8] [Received: 06/27/2017] [Revised: 08/25/2017] [Accepted: 09/10/2017] [Indexed: 01/12/2023]
Abstract
The paper presents an analysis of our template-based and free docking predictions in the joint CASP12/CAPRI37 round. A new scoring function for template-based docking was developed, benchmarked on the Dockground resource, and applied to the targets. The results showed that the function successfully discriminates incorrect docking predictions. In correctly predicted targets, the scoring function was complemented by other considerations, such as consistency of the oligomeric states among templates, similarity of the biological functions, and biological interface relevance. The scoring function still does not distinguish biological from crystal packing interfaces well, and needs further development for the docking of bundles of α-helices. In the case of the trimeric targets, sequence-based methods did not find common templates, despite similarity of the structures, suggesting complementary use of structure- and sequence-based alignments in comparative docking. The results showed that if a good docking template is found, an accurate model of the interface can be built even from largely inaccurate models of individual subunits. Free docking, however, is very sensitive to the quality of the individual models. Nevertheless, our newly developed contact potential detected approximate locations of the binding sites.
Affiliation(s)
- Petras J. Kundrotas
- Center for Computational Biology and Department of Molecular Biosciences, The University of Kansas, Lawrence, Kansas 66045, USA
- Varsha D. Badal
- Center for Computational Biology and Department of Molecular Biosciences, The University of Kansas, Lawrence, Kansas 66045, USA
- Madhurima Das
- Center for Computational Biology and Department of Molecular Biosciences, The University of Kansas, Lawrence, Kansas 66045, USA
- Taras Dauzhenka
- Center for Computational Biology and Department of Molecular Biosciences, The University of Kansas, Lawrence, Kansas 66045, USA
- Ilya A. Vakser
- Center for Computational Biology and Department of Molecular Biosciences, The University of Kansas, Lawrence, Kansas 66045, USA
28
Yang S, Li H, He H, Zhou Y, Zhang Z. Critical assessment and performance improvement of plant–pathogen protein–protein interaction prediction methods. Brief Bioinform 2017; 20:274-287. [DOI: 10.1093/bib/bbx123] [Citation(s) in RCA: 34] [Impact Index Per Article: 4.9] [Received: 07/26/2017] [Indexed: 01/15/2023]
Affiliation(s)
- Shiping Yang
- State Key Laboratory of Agrobiotechnology, College of Biological Sciences, China Agricultural University
- Hong Li
- State Key Laboratory of Agrobiotechnology, College of Biological Sciences, China Agricultural University
- Huaqin He
- College of Life Sciences, Fujian Agriculture and Forestry University
- Yuan Zhou
- State Key Laboratory of Agrobiotechnology, College of Biological Sciences, China Agricultural University
- Ziding Zhang
- State Key Laboratory of Agrobiotechnology, College of Biological Sciences, China Agricultural University
29
Xue LC, Rodrigues JPGLM, Dobbs D, Honavar V, Bonvin AMJJ. Template-based protein-protein docking exploiting pairwise interfacial residue restraints. Brief Bioinform 2017; 18:458-466. [PMID: 27013645 PMCID: PMC5428999 DOI: 10.1093/bib/bbw027] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.0] [Received: 12/18/2015] [Revised: 02/03/2016] [Indexed: 01/26/2023]
Abstract
Although many advanced and sophisticated ab initio approaches for modeling protein-protein complexes have been proposed in past decades, template-based modeling (TBM) remains the most accurate and widely used approach, given a reliable template is available. However, there are many different ways to exploit template information in the modeling process. Here, we systematically evaluate and benchmark a TBM method that uses conserved interfacial residue pairs as docking distance restraints [referred to as alpha carbon-alpha carbon (CA-CA)-guided docking]. We compare it with two other template-based protein-protein modeling approaches, including a conserved non-pairwise interfacial residue restrained docking approach [referred to as the ambiguous interaction restraint (AIR)-guided docking] and a simple superposition-based modeling approach. Our results show that, for most cases, the CA-CA-guided docking method outperforms both superposition with refinement and the AIR-guided docking method. We emphasize the superiority of the CA-CA-guided docking on cases with medium to large conformational changes, and interactions mediated through loops, tails or disordered regions. Our results also underscore the importance of a proper refinement of superimposition models to reduce steric clashes. In summary, we provide a benchmarked TBM protocol that uses conserved pairwise interface distance as restraints in generating realistic 3D protein-protein interaction models, when reliable templates are available. The described CA-CA-guided docking protocol is based on the HADDOCK platform, which allows users to incorporate additional prior knowledge of the target system to further improve the quality of the resulting models.
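As a rough sketch of the idea behind CA-CA-guided docking, the snippet below derives pairwise Cα-Cα distance restraints from a template interface. The 10 Å cutoff, the dictionary-based input, and the function name are assumptions for illustration. The snippet does not reproduce the HADDOCK restraint format or the authors' protocol, and it assumes that template residues have already been mapped onto the target.

```python
import math


def ca_ca_restraints(chain_a, chain_b, cutoff=10.0):
    """List (residue_a, residue_b, distance) for Cα pairs within `cutoff` Å.

    chain_a and chain_b map a residue number to the (x, y, z) coordinates of
    that residue's Cα atom in the template complex. The cutoff is an assumed
    value, not one taken from the paper.
    """
    restraints = []
    for res_a, xyz_a in chain_a.items():
        for res_b, xyz_b in chain_b.items():
            distance = math.dist(xyz_a, xyz_b)
            if distance <= cutoff:
                restraints.append((res_a, res_b, round(distance, 2)))
    return restraints
```

Each returned triple names a specific residue pair, which is what distinguishes pairwise restraints from ambiguous restraints that only mark residues as belonging to the interface.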
Affiliation(s)
- Li C Xue
- Bijvoet Center for Biomolecular Research, Faculty of Science - Chemistry, Utrecht University, Padualaan 8, CH Utrecht, The Netherlands
- João P G L M Rodrigues
- Bijvoet Center for Biomolecular Research, Faculty of Science - Chemistry, Utrecht University, Padualaan 8, CH Utrecht, The Netherlands
- Drena Dobbs
- Bijvoet Center for Biomolecular Research, Faculty of Science - Chemistry, Utrecht University, Padualaan 8, CH Utrecht, The Netherlands
- Vasant Honavar
- Bijvoet Center for Biomolecular Research, Faculty of Science - Chemistry, Utrecht University, Padualaan 8, CH Utrecht, The Netherlands
- Alexandre M J J Bonvin
- Bijvoet Center for Biomolecular Research, Faculty of Science - Chemistry, Utrecht University, Padualaan 8, CH Utrecht, The Netherlands
30
Mirabello C, Wallner B. InterPred: A pipeline to identify and model protein-protein interactions. Proteins 2017; 85:1159-1170. [DOI: 10.1002/prot.25280] [Citation(s) in RCA: 28] [Impact Index Per Article: 4.0] [Received: 10/07/2016] [Revised: 02/27/2017] [Accepted: 03/01/2017] [Indexed: 12/22/2022]
Affiliation(s)
- Claudio Mirabello
- Division of Bioinformatics, Department of Physics, Chemistry and Biology; Linköping University; Linköping 581 83 Sweden
- Björn Wallner
- Division of Bioinformatics, Department of Physics, Chemistry and Biology; Linköping University; Linköping 581 83 Sweden
31
Anishchenko I, Kundrotas PJ, Vakser IA. Modeling complexes of modeled proteins. Proteins 2017; 85:470-478. [PMID: 27701777 PMCID: PMC5313347 DOI: 10.1002/prot.25183] [Citation(s) in RCA: 24] [Impact Index Per Article: 3.4] [Received: 07/09/2016] [Revised: 09/22/2016] [Accepted: 10/02/2016] [Indexed: 12/21/2022]
Abstract
Structural characterization of proteins is essential for understanding life processes at the molecular level. However, only a fraction of known proteins have experimentally determined structures. This fraction is even smaller for protein-protein complexes. Thus, structural modeling of protein-protein interactions (docking) primarily has to rely on modeled structures of the individual proteins, which typically are less accurate than the experimentally determined ones. Such "double" modeling is the Grand Challenge of structural reconstruction of the interactome. Yet it remains so far largely untested in a systematic way. We present a comprehensive validation of template-based and free docking on a set of 165 complexes, where each protein model has six levels of structural accuracy, from 1 to 6 Å Cα RMSD. Many template-based docking predictions fall into the acceptable quality category, according to the CAPRI criteria, even for highly inaccurate proteins (5-6 Å RMSD), although the number of such models (and, consequently, the docking success rate) drops significantly for models with RMSD > 4 Å. The results show that the existing docking methodologies can be successfully applied to protein models with a broad range of structural accuracy, and that template-based docking is much less sensitive to inaccuracies of protein models than free docking.
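The accuracy levels in this abstract are expressed as Cα RMSD. A minimal sketch of that measure follows, assuming the model and reference are already optimally superimposed and their residues are matched one-to-one. The function name and input layout are hypothetical, and the superposition step (for example, a Kabsch fit) is deliberately left out.

```python
import math


def ca_rmsd(model, reference):
    """Root-mean-square deviation over matched Cα coordinates, in Å.

    model and reference are equal-length sequences of (x, y, z) tuples,
    listed in the same residue order and already superimposed.
    """
    if len(model) != len(reference) or not model:
        raise ValueError("need two non-empty coordinate lists of equal length")
    squared_sum = sum(math.dist(p, q) ** 2 for p, q in zip(model, reference))
    return math.sqrt(squared_sum / len(model))
```

Because the deviations are squared before averaging, a few badly placed residues raise the RMSD more than many small deviations do.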
Affiliation(s)
- Ivan Anishchenko
- Center for Computational Biology, The University of Kansas, Lawrence, Kansas 66047, USA
- Petras J. Kundrotas
- Center for Computational Biology, The University of Kansas, Lawrence, Kansas 66047, USA
- Ilya A. Vakser
- Center for Computational Biology, The University of Kansas, Lawrence, Kansas 66047, USA
- Department of Molecular Biosciences, The University of Kansas, Lawrence, Kansas 66047, USA
32
Anishchenko I, Kundrotas PJ, Vakser IA. Structural quality of unrefined models in protein docking. Proteins 2017; 85:39-45. [PMID: 27756103 PMCID: PMC5167671 DOI: 10.1002/prot.25188] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.6] [Received: 07/18/2016] [Revised: 09/29/2016] [Accepted: 10/11/2016] [Indexed: 11/11/2022]
Abstract
Structural characterization of protein-protein interactions is essential for understanding life processes at the molecular level. However, only a fraction of protein interactions have experimentally resolved structures. Thus, reliable computational methods for structural modeling of protein interactions (protein docking) are important for generating such structures and understanding the principles of protein recognition. Template-based docking techniques that utilize structural similarity between the target protein-protein interaction and cocrystallized protein-protein complexes (templates) are gaining popularity due to their generally higher reliability than that of template-free docking. However, the template-based approach lacks explicit penalties for intermolecular penetration, as opposed to typical free docking, where such a penalty is inherent due to the shape complementarity paradigm. Thus, template-based docking models are commonly assumed to require special treatment to remove large structural penetrations. In this study, we compared clashes in the template-based and free docking of the same proteins, with crystallographically determined and modeled structures. The results show that for the less accurate protein models, free docking produces fewer clashes than the template-based approach. However, contrary to the common expectation, in acceptable and better quality docking models of unbound crystallographically determined proteins, the clashes in the template-based docking are comparable to those in the free docking, due to the overall higher quality of the template-based docking predictions. This suggests that the free docking refinement protocols can in principle be applied to the template-based docking predictions as well.
Affiliation(s)
- Ivan Anishchenko
- Center for Computational Biology and Department of Molecular Biosciences, The University of Kansas, Lawrence, Kansas 66047, USA
- Petras J. Kundrotas
- Center for Computational Biology and Department of Molecular Biosciences, The University of Kansas, Lawrence, Kansas 66047, USA
- Ilya A. Vakser
- Center for Computational Biology and Department of Molecular Biosciences, The University of Kansas, Lawrence, Kansas 66047, USA
33
Zheng J, Kundrotas PJ, Vakser IA, Liu S. Template-Based Modeling of Protein-RNA Interactions. PLoS Comput Biol 2016; 12:e1005120. [PMID: 27662342 PMCID: PMC5035060 DOI: 10.1371/journal.pcbi.1005120] [Citation(s) in RCA: 25] [Impact Index Per Article: 3.1] [Received: 01/25/2016] [Accepted: 08/25/2016] [Indexed: 12/29/2022]
Abstract
Protein-RNA complexes formed by specific recognition between RNA and RNA-binding proteins play an important role in biological processes. More than a thousand such proteins in humans are curated, and many novel RNA-binding proteins are yet to be discovered. Due to limitations of experimental approaches, computational techniques are needed for characterization of protein-RNA interactions. Although much progress has been made, adequate methodologies reliably providing atomic resolution structural details are still lacking. Although protein-RNA free docking approaches have proved to be useful, in general, the template-based approaches provide higher quality predictions. Templates are key to building a high quality model. Sequence/structure relationships were studied based on a representative set of binary protein-RNA complexes from PDB. Several approaches were tested for pairwise target/template alignment. The analysis revealed a transition point between random and correct binding modes. The results showed that structural alignment is better than sequence alignment in identifying good templates, suitable for generating protein-RNA complexes close to the native structure, and outperforms free docking, successfully predicting complexes where the free docking fails, including cases of significant conformational change upon binding. A template-based protein-RNA interaction modeling protocol PRIME was developed and benchmarked on a representative set of complexes.
Structures of protein-RNA complexes are important for characterization of biological processes. The number of experimentally determined protein-RNA complexes is limited. Thus modeling of these complexes is important. Reliable structural predictions of proteins and their complexes are provided by comparative modeling, which takes advantage of similar complexes with experimentally determined structures. Thus, in the case of protein-RNA complexes, it is important to determine if similar proteins and RNAs bind in a similar way. We show that, similarly to the earlier published results on protein-protein complexes, such correlation of the protein-RNA binding mode and the monomers' similarity indeed exists, and is stronger when the similarity is determined by structure rather than sequence alignment. The data show a clear transition from random to similar binding mode with the increase of the structural similarity of the monomers. On the basis of the results we designed and implemented a predictive tool, which should be useful for the biological community interested in modeling of protein-RNA interactions.
Affiliation(s)
- Jinfang Zheng
- School of Physics and Key Laboratory of Molecular Biophysics of the Ministry of Education, Huazhong University of Science and Technology, Wuhan, Hubei, China
- Petras J. Kundrotas
- Center for Computational Biology and Department of Molecular Biosciences, The University of Kansas, Lawrence, Kansas, United States of America
- Ilya A. Vakser
- Center for Computational Biology and Department of Molecular Biosciences, The University of Kansas, Lawrence, Kansas, United States of America
- Shiyong Liu
- School of Physics and Key Laboratory of Molecular Biophysics of the Ministry of Education, Huazhong University of Science and Technology, Wuhan, Hubei, China
34
Li H, Yang S, Wang C, Zhou Y, Zhang Z. AraPPISite: a database of fine-grained protein-protein interaction site annotations for Arabidopsis thaliana. Plant Molecular Biology 2016; 92:105-16. [PMID: 27338257 DOI: 10.1007/s11103-016-0498-z] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.8] [Received: 01/06/2016] [Accepted: 05/26/2016] [Indexed: 05/18/2023]
Abstract
Knowledge about protein interaction sites provides detailed information on protein-protein interactions (PPIs). To date, nearly 20,000 PPIs from Arabidopsis thaliana have been identified. Nevertheless, interaction site information is largely missing from previously published PPI databases. Here, AraPPISite, a database that presents fine-grained interaction details for A. thaliana PPIs, is established. First, the experimentally determined 3D structures of 27 A. thaliana PPIs are collected from the Protein Data Bank database, and the predicted 3D structures of 3023 A. thaliana PPIs are modeled by using two well-established template-based docking methods. For each experimental/predicted complex structure, AraPPISite not only provides an interactive user interface for browsing interaction sites, but also lists detailed evolutionary and physicochemical properties of these sites. Second, AraPPISite assigns domain-domain interactions or domain-motif interactions to 4286 PPIs whose 3D structures cannot be modeled. In this case, users can easily query protein interaction regions at the sequence level. AraPPISite is a free and user-friendly database, which does not require user registration or any configuration on local machines. We anticipate that AraPPISite can serve as a helpful database resource for users with less experience in structural biology or protein bioinformatics to probe the details of PPIs, and thus accelerate studies of plant genetics and functional genomics. AraPPISite is available at http://systbio.cau.edu.cn/arappisite/index.html.
Affiliation(s)
- Hong Li
- State Key Laboratory of Agrobiotechnology, College of Biological Sciences, China Agricultural University, Beijing, 100193, China
- Shiping Yang
- State Key Laboratory of Agrobiotechnology, College of Biological Sciences, China Agricultural University, Beijing, 100193, China
- Chuan Wang
- Department of Plant Biology, Carnegie Institution for Science, Stanford, CA, 94305, USA
- Yuan Zhou
- State Key Laboratory of Agrobiotechnology, College of Biological Sciences, China Agricultural University, Beijing, 100193, China
- Ziding Zhang
- State Key Laboratory of Agrobiotechnology, College of Biological Sciences, China Agricultural University, Beijing, 100193, China
35
Im W, Liang J, Olson A, Zhou HX, Vajda S, Vakser IA. Challenges in structural approaches to cell modeling. J Mol Biol 2016; 428:2943-64. [PMID: 27255863 PMCID: PMC4976022 DOI: 10.1016/j.jmb.2016.05.024] [Citation(s) in RCA: 37] [Impact Index Per Article: 4.6] [Received: 02/21/2016] [Revised: 05/19/2016] [Accepted: 05/24/2016] [Indexed: 11/17/2022]
Abstract
Computational modeling is essential for structural characterization of biomolecular mechanisms across the broad spectrum of scales. Adequate understanding of biomolecular mechanisms inherently involves our ability to model them. Structural modeling of individual biomolecules and their interactions has been rapidly progressing. However, in terms of the broader picture, the focus is shifting toward larger systems, up to the level of a cell. Such modeling involves a more dynamic and realistic representation of the interactomes in vivo, in a crowded cellular environment, as well as membranes and membrane proteins, and other cellular components. Structural modeling of a cell complements computational approaches to cellular mechanisms based on differential equations, graph models, and other techniques to model biological networks, imaging data, etc. Structural modeling along with other computational and experimental approaches will provide a fundamental understanding of life at the molecular level and lead to important applications to biology and medicine. A cross section of diverse approaches presented in this review illustrates the developing shift from the structural modeling of individual molecules to that of cell biology. Studies in several related areas are covered: biological networks; automated construction of three-dimensional cell models using experimental data; modeling of protein complexes; prediction of non-specific and transient protein interactions; thermodynamic and kinetic effects of crowding; cellular membrane modeling; and modeling of chromosomes. The review presents an expert opinion on the current state-of-the-art in these various aspects of structural modeling in cellular biology, and the prospects of future developments in this emerging field.
Affiliation(s)
- Wonpil Im
- Center for Computational Biology and Department of Molecular Biosciences, The University of Kansas, Lawrence, KS 66047, United States.
- Jie Liang
- Department of Bioengineering, University of Illinois at Chicago, Chicago, IL 60607, United States.
- Arthur Olson
- Department of Integrative Structural and Computational Biology, The Scripps Research Institute, La Jolla, CA 92037, United States.
- Huan-Xiang Zhou
- Department of Physics and Institute of Molecular Biophysics, Florida State University, Tallahassee, FL 32306, United States.
- Sandor Vajda
- Department of Biomedical Engineering, Boston University, Boston, MA 02215, United States.
- Ilya A Vakser
- Center for Computational Biology and Department of Molecular Biosciences, The University of Kansas, Lawrence, KS 66047, United States.
36
Dourado DFAR, Flores SC. Modeling and fitting protein-protein complexes to predict change of binding energy. Sci Rep 2016; 6:25406. [PMID: 27173910 PMCID: PMC4865953 DOI: 10.1038/srep25406] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.4] [Received: 01/04/2016] [Accepted: 04/18/2016] [Indexed: 01/18/2023]
Abstract
It is possible to accurately and economically predict the change in protein-protein interaction energy upon mutation (ΔΔG) when a high-resolution structure of the complex is available. This is of growing usefulness for the design of high-affinity or otherwise modified binding proteins for therapeutic, diagnostic, industrial, and basic science applications. Recently the field has begun to pursue ΔΔG prediction for homology modeled complexes, but so far this has worked mostly for cases of high sequence identity. If the interacting proteins have been crystallized in free (uncomplexed) form, in a majority of cases it is possible to find a structurally similar complex which can be used as the basis for template-based modeling. We describe how to use MMB to create such models, and then use them to predict ΔΔG, using a dataset consisting of free target structures, co-crystallized template complexes with sequence identity with respect to the targets as low as 44%, and experimental ΔΔG measurements. We obtain similar results by fitting to a low-resolution cryo-EM density map. Results suggest that other structural constraints may lead to a similar outcome, making the method even more broadly applicable.
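For orientation, experimental ΔΔG values of the kind mentioned in this abstract are often derived from dissociation constants. The sketch below uses the standard relation ΔG = RT ln(Kd) with a 1 M reference state and assumes the sign convention ΔΔG = ΔG(mutant) − ΔG(wild type). The abstract states neither the convention nor how the measurements were obtained, so both, along with the function names, are assumptions. This is not the MMB prediction method.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol*K)


def binding_dg(kd_molar, temperature=298.15):
    """Binding free energy in kcal/mol from a dissociation constant in mol/L."""
    return R_KCAL * temperature * math.log(kd_molar)


def ddg(kd_mutant, kd_wild_type, temperature=298.15):
    """Assumed convention: positive values mean the mutation weakens binding."""
    return binding_dg(kd_mutant, temperature) - binding_dg(kd_wild_type, temperature)
```

Under this convention, a tenfold increase in Kd at room temperature corresponds to a ΔΔG of roughly +1.4 kcal/mol.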
Affiliation(s)
- Daniel F A R Dourado
- Department of Cell and Molecular Biology, Computational and Systems Biology, Uppsala University, Biomedical Center Box 596, 751 24, Uppsala, Sweden
- Samuel Coulbourn Flores
- Department of Cell and Molecular Biology, Computational and Systems Biology, Uppsala University, Biomedical Center Box 596, 751 24, Uppsala, Sweden
37
Snider J, Kotlyar M, Saraon P, Yao Z, Jurisica I, Stagljar I. Fundamentals of protein interaction network mapping. Mol Syst Biol 2015; 11:848. [PMID: 26681426 PMCID: PMC4704491 DOI: 10.15252/msb.20156351] [Citation(s) in RCA: 180] [Impact Index Per Article: 20.0] [Indexed: 12/13/2022]
Abstract
Studying protein interaction networks of all proteins in an organism (“interactomes”) remains one of the major challenges in modern biomedicine. Such information is crucial to understanding cellular pathways and developing effective therapies for the treatment of human diseases. Over the past two decades, diverse biochemical, genetic, and cell biological methods have been developed to map interactomes. In this review, we highlight basic principles of interactome mapping. Specifically, we discuss the strengths and weaknesses of individual assays, how to select a method appropriate for the problem being studied, and provide general guidelines for carrying out the necessary follow‐up analyses. In addition, we discuss computational methods to predict, map, and visualize interactomes, and provide a summary of some of the most important interactome resources. We hope that this review serves as both a useful overview of the field and a guide to help more scientists actively employ these powerful approaches in their research.
Affiliation(s)
- Jamie Snider
- Donnelly Centre, Department of Biochemistry, Department of Molecular Genetics, University of Toronto, Toronto, ON, Canada
- Max Kotlyar
- Princess Margaret Cancer Center, IBM Life Sciences Discovery Centre, University Health Network, Ontario, Canada
- Punit Saraon
- Donnelly Centre, Department of Biochemistry, Department of Molecular Genetics, University of Toronto, Toronto, ON, Canada
- Zhong Yao
- Donnelly Centre, Department of Biochemistry, Department of Molecular Genetics, University of Toronto, Toronto, ON, Canada
- Igor Jurisica
- Princess Margaret Cancer Center, IBM Life Sciences Discovery Centre, University Health Network, Ontario, Canada
- Igor Stagljar
- Donnelly Centre, Department of Biochemistry, Department of Molecular Genetics, University of Toronto, Toronto, ON, Canada
38
Maheshwari S, Brylinski M. Predicted binding site information improves model ranking in protein docking using experimental and computer-generated target structures. BMC Structural Biology 2015; 15:23. [PMID: 26597230 PMCID: PMC4657198 DOI: 10.1186/s12900-015-0050-4] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.8] [Received: 05/21/2015] [Accepted: 10/30/2015] [Indexed: 01/10/2023]
Abstract
Background: Protein-protein interactions (PPIs) mediate the vast majority of biological processes; therefore, significant efforts have been directed to investigating PPIs to fully comprehend cellular functions. Predicting complex structures is critical to revealing the molecular mechanisms by which proteins operate. Despite recent advances in the development of new methods to model macromolecular assemblies, most current methodologies are designed to work with experimentally determined protein structures. However, because only computer-generated models are available for a large number of proteins in a given genome, computational tools should tolerate structural inaccuracies in order to perform genome-wide modeling of PPIs. Results: To address this problem, we developed eRankPPI, an algorithm for the identification of near-native conformations generated by protein docking using experimental structures as well as protein models. The scoring function implemented in eRankPPI employs multiple features including interface probability estimates calculated by eFindSitePPI and a novel contact-based symmetry score. In comparative benchmarks using representative datasets of homo- and hetero-complexes, we show that eRankPPI consistently outperforms state-of-the-art algorithms, improving the success rate by ~10%. Conclusions: eRankPPI was designed to bridge the gap between the volume of sequence data, the evidence of binary interactions, and the atomic details of pharmacologically relevant protein complexes. Tolerating structure imperfections in computer-generated models opens up the possibility of conducting an exhaustive structure-based reconstruction of PPI networks across proteomes. The methods and datasets used in this study are available at www.brylinski.org/erankppi.
Affiliation(s)
- Surabhi Maheshwari
- Department of Biological Sciences, Louisiana State University, Baton Rouge, LA, 70803, USA
- Michal Brylinski
- Department of Biological Sciences, Louisiana State University, Baton Rouge, LA, 70803, USA; Center for Computation & Technology, Louisiana State University, Baton Rouge, LA, 70803, USA
39
Muratcioglu S, Guven-Maiorov E, Keskin Ö, Gursoy A. Advances in template-based protein docking by utilizing interfaces towards completing structural interactome. Curr Opin Struct Biol 2015; 35:87-92. [PMID: 26539658 DOI: 10.1016/j.sbi.2015.10.001] [Citation(s) in RCA: 21] [Impact Index Per Article: 2.3] [Received: 05/30/2015] [Revised: 10/09/2015] [Accepted: 10/13/2015] [Indexed: 11/27/2022]
Abstract
The increase in the number of structurally determined protein complexes strengthens template-based docking (TBD) methods for modelling protein-protein interactions (PPIs). These methods utilize the known structures of protein complexes as templates to predict the quaternary structure of the target proteins. The templates may be partial or complete structures. Interface based (partial) methods have recently gained interest due in part to the observation that the interface regions are reusable. We describe how available template interfaces can be used to obtain the structural models of protein interactions. Despite the agreement that a majority of the protein complexes can be modelled using the available Protein Data Bank (PDB) structures, a handful of studies argue that we need more template proteins to increase the structural coverage of PPIs. We also discuss the performance of the interface TBD methods at large scale, and the significance of capturing multiple conformations for improving accuracy.
Affiliation(s)
- Serena Muratcioglu
  - Department of Chemical and Biological Engineering, Koc University, 34450 Istanbul, Turkey; Center for Computational Biology and Bioinformatics, Koc University, 34450 Istanbul, Turkey
- Emine Guven-Maiorov
  - Department of Chemical and Biological Engineering, Koc University, 34450 Istanbul, Turkey; Center for Computational Biology and Bioinformatics, Koc University, 34450 Istanbul, Turkey
- Özlem Keskin
  - Department of Chemical and Biological Engineering, Koc University, 34450 Istanbul, Turkey; Center for Computational Biology and Bioinformatics, Koc University, 34450 Istanbul, Turkey
- Attila Gursoy
  - Department of Computer Engineering, Koc University, 34450 Istanbul, Turkey; Center for Computational Biology and Bioinformatics, Koc University, 34450 Istanbul, Turkey.
|
40
|
Vreven T, Moal IH, Vangone A, Pierce BG, Kastritis PL, Torchala M, Chaleil R, Jiménez-García B, Bates PA, Fernandez-Recio J, Bonvin AMJJ, Weng Z. Updates to the Integrated Protein-Protein Interaction Benchmarks: Docking Benchmark Version 5 and Affinity Benchmark Version 2. J Mol Biol 2015; 427:3031-41. [PMID: 26231283 PMCID: PMC4677049 DOI: 10.1016/j.jmb.2015.07.016] [Citation(s) in RCA: 248] [Impact Index Per Article: 27.6] [Received: 05/11/2015] [Revised: 07/17/2015] [Accepted: 07/17/2015] [Indexed: 01/31/2023]
Abstract
We present an updated and integrated version of our widely used protein-protein docking and binding affinity benchmarks. The benchmarks consist of non-redundant, high-quality structures of protein-protein complexes along with the unbound structures of their components. Fifty-five new complexes were added to the docking benchmark, 35 of which have experimentally measured binding affinities. These updated docking and affinity benchmarks now contain 230 and 179 entries, respectively. In particular, the number of antibody-antigen complexes has increased significantly, by 67% and 74% in the docking and affinity benchmarks, respectively. We tested previously developed docking and affinity prediction algorithms on the new cases. Considering only the top 10 docking predictions per benchmark case, a prediction accuracy of 38% is achieved on all 55 cases and up to 50% for the 32 rigid-body cases only. Predicted affinity scores are found to correlate with experimental binding energies up to r=0.52 overall and r=0.72 for the rigid complexes.
Affiliation(s)
- Thom Vreven
  - Program in Bioinformatics and Integrative Biology, University of Massachusetts Medical School, Worcester, MA 01605, USA
- Iain H Moal
  - Joint BSC-CRG-IRB Research Program in Computational Biology, Life Sciences Department, Barcelona Supercomputing Center, C/Jordi Girona 29, 08034 Barcelona, Spain
- Anna Vangone
  - Bijvoet Center for Biomolecular Research, Faculty of Science, Utrecht University, 3584CH Utrecht, The Netherlands
- Brian G Pierce
  - Program in Bioinformatics and Integrative Biology, University of Massachusetts Medical School, Worcester, MA 01605, USA
- Panagiotis L Kastritis
  - Bijvoet Center for Biomolecular Research, Faculty of Science, Utrecht University, 3584CH Utrecht, The Netherlands
- Mieczyslaw Torchala
  - Biomolecular Modelling Laboratory, The Francis Crick Institute, Lincoln's Inn Fields Laboratory, London WC2A 3LY, United Kingdom
- Raphael Chaleil
  - Biomolecular Modelling Laboratory, The Francis Crick Institute, Lincoln's Inn Fields Laboratory, London WC2A 3LY, United Kingdom
- Brian Jiménez-García
  - Joint BSC-CRG-IRB Research Program in Computational Biology, Life Sciences Department, Barcelona Supercomputing Center, C/Jordi Girona 29, 08034 Barcelona, Spain
- Paul A Bates
  - Biomolecular Modelling Laboratory, The Francis Crick Institute, Lincoln's Inn Fields Laboratory, London WC2A 3LY, United Kingdom.
- Juan Fernandez-Recio
  - Joint BSC-CRG-IRB Research Program in Computational Biology, Life Sciences Department, Barcelona Supercomputing Center, C/Jordi Girona 29, 08034 Barcelona, Spain.
- Alexandre M J J Bonvin
  - Bijvoet Center for Biomolecular Research, Faculty of Science, Utrecht University, 3584CH Utrecht, The Netherlands.
- Zhiping Weng
  - Program in Bioinformatics and Integrative Biology, University of Massachusetts Medical School, Worcester, MA 01605, USA.
|
41
|
Vakser IA. Protein-protein docking: from interaction to interactome. Biophys J 2014; 107:1785-1793. [PMID: 25418159 DOI: 10.1016/j.bpj.2014.08.033] [Citation(s) in RCA: 184] [Impact Index Per Article: 20.4] [Received: 07/14/2014] [Revised: 08/17/2014] [Accepted: 08/27/2014] [Indexed: 12/29/2022] Open
Abstract
The protein-protein docking problem is one of the focal points of activity in computational biophysics and structural biology. The three-dimensional structure of a protein-protein complex, generally, is more difficult to determine experimentally than the structure of an individual protein. Adequate computational techniques to model protein interactions are important because of the growing number of known protein structures, particularly in the context of structural genomics. Docking offers tools for fundamental studies of protein interactions and provides a structural basis for drug design. Protein-protein docking is the prediction of the structure of the complex, given the structures of the individual proteins. At the heart of the docking methodology is the notion of steric and physicochemical complementarity at the protein-protein interface. Originally, mostly high-resolution, experimentally determined (primarily by x-ray crystallography) protein structures were considered for docking. However, more recently, the focus has been shifting toward lower-resolution modeled structures. Docking approaches have to deal with the conformational changes between unbound and bound structures, as well as the inaccuracies of the interacting modeled structures, often in a high-throughput mode needed for modeling of large networks of protein interactions. The growing number of docking developers is engaged in the community-wide assessments of predictive methodologies. The development of more powerful and adequate docking approaches is facilitated by rapidly expanding information and data resources, growing computational capabilities, and a deeper understanding of the fundamental principles of protein interactions.
Affiliation(s)
- Ilya A Vakser
  - Center for Bioinformatics and Department of Molecular Biosciences, The University of Kansas, Lawrence, Kansas.
|
42
|
Anishchenko I, Kundrotas PJ, Tuzikov AV, Vakser IA. Structural templates for comparative protein docking. Proteins 2015; 83:1563-70. [PMID: 25488330 DOI: 10.1002/prot.24736] [Citation(s) in RCA: 21] [Impact Index Per Article: 2.3] [Received: 09/23/2014] [Revised: 11/15/2014] [Accepted: 11/26/2014] [Indexed: 11/07/2022]
Abstract
Structural characterization of protein-protein interactions is important for understanding life processes. Because of the inherent limitations of experimental techniques, such characterization requires computational approaches. Along with the traditional protein-protein docking (free search for a match between two proteins), comparative (template-based) modeling of protein-protein complexes has been gaining popularity. Its development puts an emphasis on full and partial structural similarity between the target protein monomers and the protein-protein complexes previously determined by experimental techniques (templates). The template-based docking relies on the quality and diversity of the template set. We present a carefully curated, nonredundant library of templates containing 4950 full structures of binary complexes and 5936 protein-protein interfaces extracted from the full structures at 12 Å distance cut-off. Redundancy in the libraries was removed by clustering the PDB structures based on structural similarity. The value of the clustering threshold was determined from the analysis of the clusters and the docking performance on a benchmark set. High structural quality of the interfaces in the template and validation sets was achieved by automated procedures and manual curation. The library is included in the Dockground resource for molecular recognition studies at http://dockground.bioinformatics.ku.edu.
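A minimal sketch of the kind of distance-based interface extraction the abstract above mentions (interfaces taken at a 12 Å cut-off). The abstract does not say which atoms the distance is measured between, so the any-atom criterion, the `interface_residues` name, the dictionary input format, and the toy coordinates are all assumptions made for illustration, not the Dockground procedure.

```python
from math import dist  # Euclidean distance between two points (Python 3.8+)


def interface_residues(chain_a, chain_b, cutoff=12.0):
    """Return the residue ids of each chain that lie within `cutoff` Å of the other chain.

    chain_a, chain_b: dict mapping a residue id to a list of (x, y, z) atom
    coordinates. This input format is hypothetical, chosen to keep the sketch
    free of any structure-parsing library.
    """
    iface_a, iface_b = set(), set()
    for res_a, atoms_a in chain_a.items():
        for res_b, atoms_b in chain_b.items():
            # Assumed criterion: a residue pair is in contact if ANY pair of
            # their atoms is no farther apart than the cut-off.
            if any(dist(p, q) <= cutoff for p in atoms_a for q in atoms_b):
                iface_a.add(res_a)
                iface_b.add(res_b)
    return iface_a, iface_b


# Toy two-chain "complex" with one atom per residue (made-up coordinates).
chain_a = {"A1": [(0.0, 0.0, 0.0)], "A2": [(30.0, 0.0, 0.0)]}
chain_b = {"B1": [(10.0, 0.0, 0.0)], "B2": [(50.0, 0.0, 0.0)]}

iface_a, iface_b = interface_residues(chain_a, chain_b)
print(sorted(iface_a), sorted(iface_b))
```

On this toy input only A1 and B1 are within 12 Å of the opposite chain, so they alone form the interface. The all-pairs loop is quadratic in the number of atoms; a real pipeline would more likely use a spatial index, which is omitted here for brevity.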
Affiliation(s)
- Ivan Anishchenko
  - Center for Bioinformatics, The University of Kansas, Lawrence, Kansas, 66047; United Institute of Informatics Problems, National Academy of Sciences, Minsk, 220012, Belarus
- Petras J Kundrotas
  - Center for Bioinformatics, The University of Kansas, Lawrence, Kansas, 66047
- Alexander V Tuzikov
  - United Institute of Informatics Problems, National Academy of Sciences, Minsk, 220012, Belarus
- Ilya A Vakser
  - Center for Bioinformatics, The University of Kansas, Lawrence, Kansas, 66047; Department of Molecular Biosciences, The University of Kansas, Lawrence, Kansas, 66045
|
43
|
Xie ZR, Chen J, Zhao Y, Wu Y. Decomposing the space of protein quaternary structures with the interface fragment pair library. BMC Bioinformatics 2015; 16:14. [PMID: 25592649 PMCID: PMC4384354 DOI: 10.1186/s12859-014-0437-4] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.9] [Received: 09/10/2014] [Accepted: 12/18/2014] [Indexed: 12/20/2022] Open
Abstract
BACKGROUND The physical interactions between proteins constitute the basis of protein quaternary structures. They dominate many biological processes in living cells. Deciphering the structural features of interacting proteins is essential to understand their cellular functions. Similar to the space of protein tertiary structures in which discrete patterns are clearly observed on fold or sub-fold motif levels, it has been found that the space of protein quaternary structures is highly degenerate due to the packing of compact secondary structure elements at interfaces. Therefore, it is necessary to further decompose the protein quaternary structural space into a more local representation. RESULTS Here we constructed an interface fragment pair library from the current structure database of protein complexes. After structure-based clustering, we found that more than 90% of these interface fragment pairs can be represented by a limited number of highly abundant motifs. These motifs were further used to guide complex assembly. A large-scale benchmark test shows that native-like binding is highly likely to be found in the structural ensemble of modeled protein complexes that were built through the library. CONCLUSIONS Our study therefore presents supporting evidence that the space of protein quaternary structures can be represented by the combination of a small set of secondary-structure-based packing motifs at binding interfaces. Finally, after future improvements such as adding sequence profiles, we expect this new library will be useful to predict structures of unknown protein-protein interactions.
Affiliation(s)
- Zhong-Ru Xie
  - Department of Systems and Computational Biology, Albert Einstein College of Medicine of Yeshiva University, 1300 Morris Park Avenue, Bronx, NY, 10461, USA.
- Jiawen Chen
  - Department of Systems and Computational Biology, Albert Einstein College of Medicine of Yeshiva University, 1300 Morris Park Avenue, Bronx, NY, 10461, USA.
- Yilin Zhao
  - Department of Systems and Computational Biology, Albert Einstein College of Medicine of Yeshiva University, 1300 Morris Park Avenue, Bronx, NY, 10461, USA.
- Yinghao Wu
  - Department of Systems and Computational Biology, Albert Einstein College of Medicine of Yeshiva University, 1300 Morris Park Avenue, Bronx, NY, 10461, USA.
|
44
|
Assessing the applicability of template-based protein docking in the twilight zone. Structure 2014; 22:1356-1362. [PMID: 25156427 DOI: 10.1016/j.str.2014.07.009] [Citation(s) in RCA: 32] [Impact Index Per Article: 3.2] [Received: 05/22/2014] [Revised: 07/24/2014] [Accepted: 07/31/2014] [Indexed: 11/20/2022]
Abstract
The structural modeling of protein interactions in the absence of close homologous templates is a challenging task. Recently, template-based docking methods have emerged to exploit local structural similarities to help ab-initio protocols provide reliable 3D models for protein interactions. In this work, we critically assess the performance of template-based docking in the twilight zone. Our results show that, while it is possible to find templates for nearly all known interactions, the quality of the obtained models is rather limited. We can increase the precision of the models at the expense of coverage, but this drastically reduces the potential applicability of the method, as illustrated by the whole-interactome modeling of nine organisms. Template-based docking is likely to play an important role in the structural characterization of the interaction space, but we still need to improve the repertoire of structural templates onto which we can reliably model protein complexes.
|
45
|
Lua RC, Marciano DC, Katsonis P, Adikesavan AK, Wilkins AD, Lichtarge O. Prediction and redesign of protein-protein interactions. Prog Biophys Mol Biol 2014; 116:194-202. [PMID: 24878423 DOI: 10.1016/j.pbiomolbio.2014.05.004] [Citation(s) in RCA: 22] [Impact Index Per Article: 2.2] [Received: 02/25/2014] [Revised: 05/02/2014] [Accepted: 05/17/2014] [Indexed: 12/14/2022]
Abstract
Understanding the molecular basis of protein function remains a central goal of biology, with the hope to elucidate the role of human genes in health and in disease, and to rationally design therapies through targeted molecular perturbations. We review here some of the computational techniques and resources available for characterizing a critical aspect of protein function, namely the functions mediated by protein-protein interactions (PPIs). We describe several applications and recent successes of the Evolutionary Trace (ET) in identifying molecular events and shapes that underlie protein function and specificity in both eukaryotes and prokaryotes. ET is one of a set of analytical approaches, based on the successes and failures of evolution, that enable the rational control of PPIs.
Affiliation(s)
- Rhonald C Lua
  - Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX 77030, USA
- David C Marciano
  - Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX 77030, USA
- Panagiotis Katsonis
  - Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX 77030, USA
- Anbu K Adikesavan
  - Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX 77030, USA
- Angela D Wilkins
  - Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX 77030, USA; Computational and Integrative Biomedical Research Center, Baylor College of Medicine, Houston, TX 77030, USA
- Olivier Lichtarge
  - Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX 77030, USA; Verna and Marrs McLean Department of Biochemistry and Molecular Biology, Baylor College of Medicine, Houston, TX 77030, USA; Computational and Integrative Biomedical Research Center, Baylor College of Medicine, Houston, TX 77030, USA.
|
46
|
Konc J, Janežič D. ProBiS-ligands: a web server for prediction of ligands by examination of protein binding sites. Nucleic Acids Res 2014; 42:W215-20. [PMID: 24861616 PMCID: PMC4086080 DOI: 10.1093/nar/gku460] [Citation(s) in RCA: 55] [Impact Index Per Article: 5.5] [Indexed: 12/01/2022] Open
Abstract
The ProBiS-ligands web server predicts binding of ligands to a protein structure. Starting with a protein structure or binding site, ProBiS-ligands first identifies template proteins in the Protein Data Bank that share similar binding sites. Based on the superimpositions of the query protein and the similar binding sites found, the server then transposes the ligand structures from those sites to the query protein. Such ligand prediction supports many activities, e.g. drug repurposing. The ProBiS-ligands web server, an extension of the ProBiS web server, is open and free to all users at http://probis.cmm.ki.si/ligands.
Affiliation(s)
- Janez Konc
  - National Institute of Chemistry, Hajdrihova 19, 1000 Ljubljana, Slovenia
- Dušanka Janežič
  - University of Primorska, Faculty of Mathematics, Natural Sciences and Information Technologies, Glagoljaška 8, 6000 Koper, Slovenia
|
47
|
Template-based structure modeling of protein-protein interactions. Curr Opin Struct Biol 2013; 24:10-23. [PMID: 24721449 DOI: 10.1016/j.sbi.2013.11.005] [Citation(s) in RCA: 116] [Impact Index Per Article: 10.5] [Received: 09/14/2013] [Revised: 10/29/2013] [Accepted: 11/21/2013] [Indexed: 01/21/2023]
Abstract
The structure of protein-protein complexes can be constructed by using the known structure of other protein complexes as a template. The complex structure templates are generally detected either by homology-based sequence alignments or, given the structure of monomer components, by structure-based comparisons. Critical improvements have been made in recent years by utilizing interface recognition and by recombining monomer and complex template libraries. Encouraging progress has also been witnessed in genome-wide applications of template-based modeling, with modeling accuracy comparable to high-throughput experimental data. Nevertheless, bottlenecks exist due to the incompleteness of the protein-protein complex structure library and the lack of methods for distant homologous template identification and full-length complex structure refinement.
|
48
|
Lopes A, Sacquin-Mora S, Dimitrova V, Laine E, Ponty Y, Carbone A. Protein-protein interactions in a crowded environment: an analysis via cross-docking simulations and evolutionary information. PLoS Comput Biol 2013; 9:e1003369. [PMID: 24339765 PMCID: PMC3854762 DOI: 10.1371/journal.pcbi.1003369] [Citation(s) in RCA: 44] [Impact Index Per Article: 4.0] [Received: 06/12/2013] [Accepted: 10/15/2013] [Indexed: 12/27/2022] Open
Abstract
Large-scale analyses of protein-protein interactions based on coarse-grain molecular docking simulations and binding site predictions resulting from evolutionary sequence analysis are possible and realizable on hundreds of proteins with varied structures and interfaces. We demonstrated this on the 168 proteins of the Mintseris Benchmark 2.0. On the one hand, we evaluated the quality of the interaction signal and the contribution of docking information compared to evolutionary information, showing that the combination of the two improves partner identification. On the other hand, since protein interactions usually occur in crowded environments with several competing partners, we carried out a thorough analysis of the interactions of proteins with true partners but also with non-partners to evaluate whether proteins in the environment, competing with the true partner, affect its identification. We found three populations of proteins: strongly competing, never competing, and interacting with different levels of strength. Populations and levels of strength are numerically characterized and provide a signature for the behavior of a protein in the crowded environment. We showed that partner identification, to some extent, does not depend on the competing partners present in the environment, that certain biochemical classes of proteins are intrinsically easier to analyze than others, and that small proteins are not more promiscuous than large ones. Our approach brings to light that knowledge of the binding site can be used to reduce the high computational cost of docking simulations with no loss in the quality of the results, demonstrating the possibility of applying coarse-grain docking to datasets made of thousands of proteins. A comparison with all available large-scale analyses aimed at partner prediction is provided. We release the complete decoy set issued by coarse-grain docking simulations of both true and false interacting partners, and their evolutionary sequence analysis leading to binding site predictions. Download site: http://www.lgm.upmc.fr/CCDMintseris/
Protein-protein interactions (PPI) are at the heart of the molecular processes governing life and constitute an increasingly important target for drug design. Given their importance, it is vital to determine which protein interactions have functional relevance and to characterize the protein competition inherent to crowded environments, such as the cytoplasm or the cellular organelles. We show that combining coarse-grain molecular cross-docking simulations and binding site predictions based on evolutionary sequence analysis is a viable route to identify true interacting partners for hundreds of proteins with a varied set of protein structures and interfaces. Also, we carry out a large-scale analysis of protein binding promiscuity and provide a numerical characterization of partner competition and level of interaction strength for about 28000 false-partner interactions. Finally, we demonstrate that binding site prediction is useful to discriminate native partners, but also to scale up the approach to thousands of protein interactions. This study is based on the large computational effort made by thousands of internet users helping World Community Grid over a period of 7 months. The complete dataset issued by the computation and the analysis is released to the scientific community.
Affiliation(s)
- Anne Lopes
  - Université Pierre et Marie Curie, UMR 7238, Equipe de Génomique Analytique, Paris, France
  - CNRS, UMR 7238, Laboratoire de Génomique des Microorganismes, Paris, France
- Sophie Sacquin-Mora
  - Laboratoire de Biochimie Théorique, CNRS UPR 9080, Institut de Biologie Physico-Chimique, Paris, France
- Viktoriya Dimitrova
  - Université Pierre et Marie Curie, UMR 7238, Equipe de Génomique Analytique, Paris, France
  - CNRS, UMR 7238, Laboratoire de Génomique des Microorganismes, Paris, France
- Elodie Laine
  - Université Pierre et Marie Curie, UMR 7238, Equipe de Génomique Analytique, Paris, France
  - CNRS, UMR 7238, Laboratoire de Génomique des Microorganismes, Paris, France
- Yann Ponty
  - Université Pierre et Marie Curie, UMR 7238, Equipe de Génomique Analytique, Paris, France
  - LIX, CNRS UMR 7161 - INRIA AMIB, École polytechnique, Palaiseau, France
- Alessandra Carbone
  - Université Pierre et Marie Curie, UMR 7238, Equipe de Génomique Analytique, Paris, France
  - CNRS, UMR 7238, Laboratoire de Génomique des Microorganismes, Paris, France
|
49
|
Mosca R, Pons T, Céol A, Valencia A, Aloy P. Towards a detailed atlas of protein–protein interactions. Curr Opin Struct Biol 2013; 23:929-40. [DOI: 10.1016/j.sbi.2013.07.005] [Citation(s) in RCA: 87] [Impact Index Per Article: 7.9] [Received: 06/27/2013] [Revised: 07/04/2013] [Accepted: 07/08/2013] [Indexed: 12/30/2022]
|
50
|
Kundrotas PJ, Vakser IA. Global and local structural similarity in protein-protein complexes: implications for template-based docking. Proteins 2013; 81:2137-42. [PMID: 23946125 DOI: 10.1002/prot.24392] [Citation(s) in RCA: 31] [Impact Index Per Article: 2.8] [Received: 06/15/2013] [Revised: 07/23/2013] [Accepted: 08/02/2013] [Indexed: 02/02/2023]
Abstract
The increasing amount of structural information on protein-protein interactions makes it possible to predict the structure of protein-protein complexes by comparison/alignment of the interacting proteins to the ones in cocrystallized complexes. In the predictions based on structure similarity, the template search is performed by structural alignment of the target interactors with the entire structures or with the interface only of the subunits in cocrystallized complexes. This study investigates the scope of the structural similarity that facilitates the detection of a broad range of templates significantly divergent from the targets. The analysis of the target-template similarity is based on models of protein-protein complexes in a large representative set of heterodimers. The similarity of the biological and crystal packing interfaces, dissimilar interface structural motifs in overall similar structures, interface similarity to the full structure, and local similarity away from the interface were analyzed. The structural similarity at the protein-protein interfaces only was observed in ~25% of target-template pairs with sequence identity <20% and primarily homodimeric templates. For ~50% of the target-template pairs, the similarity at the interface was accompanied by the similarity of the whole structure. However, the structural similarity at the interfaces was still stronger than that of the noninterface parts. The study provides insights into structural and functional diversity of protein-protein complexes, and relative performance of the interface and full structure alignment in docking.
|