1
Zhang Y, Wang X, Zhang Z, Huang Y, Kihara D. Assessment of Protein-Protein Docking Models Using Deep Learning. Methods Mol Biol 2024; 2780:149-162. [PMID: 38987469 DOI: 10.1007/978-1-0716-3985-6_10] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 07/12/2024]
Abstract
Protein-protein interactions are involved in almost all processes in a living cell and determine the biological functions of proteins. To obtain mechanistic understandings of protein-protein interactions, the tertiary structures of protein complexes have been determined by biophysical experimental methods, such as X-ray crystallography and cryogenic electron microscopy. However, as experimental methods are costly in resources, many computational methods have been developed that model protein complex structures. One of the difficulties in computational protein complex modeling (protein docking) is to select the most accurate models among many models that are usually generated by a docking method. This article reviews advances in protein docking model assessment methods, focusing on recent developments that apply deep learning to several network architectures.
Affiliation(s)
- Yuanyuan Zhang
- Department of Computer Science, Purdue University, West Lafayette, IN, USA
- Xiao Wang
- Department of Computer Science, Purdue University, West Lafayette, IN, USA
- Zicong Zhang
- Department of Computer Science, Purdue University, West Lafayette, IN, USA
- Yunhan Huang
- Department of Computer Science, Purdue University, West Lafayette, IN, USA
- Daisuke Kihara
- Department of Computer Science, Purdue University, West Lafayette, IN, USA
- Department of Biological Sciences, Purdue University, West Lafayette, IN, USA

2
Kiani YS, Jabeen I. Challenges of Protein-Protein Docking of the Membrane Proteins. Methods Mol Biol 2024; 2780:203-255. [PMID: 38987471 DOI: 10.1007/978-1-0716-3985-6_12] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 07/12/2024]
Abstract
Despite recent advances in the determination of high-resolution membrane protein (MP) structures, the structural and functional characterization of MPs remains extremely challenging, mainly because of their hydrophobic nature, low abundance, and poor expression, and because they are difficult to purify and crystallize. Although there have been significant advances in the experimental determination of MP structures, only a limited number of MP structures (less than approximately 1% of all structures) are available in the Protein Data Bank (PDB). The structures of a large number of MPs therefore remain unresolved, leaving much structural and functional information about MPs unexplored. As a result, recent developments in drug discovery and the significant biological interest in MPs have led to the development of several novel, low-cost, and time-efficient computational methods that overcome the limitations of experimental approaches, supplement experiments, and provide alternatives for the characterization of MPs. The fine-tuning and optimization of these computational approaches remains an ongoing endeavor. Computational methods offer a potential way to elucidate structural features and augment currently available MP information. However, computational modeling can be extremely challenging for MPs, mainly due to insufficient knowledge of (or gaps in) the atomic structures of MPs. Despite the availability of numerous in silico methods for 3D structure determination, the applicability of these methods to MPs remains relatively low, since not all methods are well suited to MPs. However, sophisticated methods for MP structure prediction are constantly being developed and updated to integrate the modifications required for MPs.
Currently, computational methods are extensively used for (1) MP structure prediction, (2) stability analysis of MPs through molecular dynamics simulations, (3) modeling of MP complexes through docking, (4) prediction of interactions between MPs, and (5) prediction of interactions between MPs and their soluble partners. Among these, MP docking is widely used. Notably, MP docking methods, although few in number, may show great potential for filling the knowledge gap. In this chapter, MP docking methods and their associated challenges are reviewed with the aim of improving their applicability, accuracy, and ability to model macromolecular complexes.
Affiliation(s)
- Yusra Sajid Kiani
- School of Interdisciplinary Engineering and Sciences (SINES), National University of Sciences and Technology (NUST), Islamabad, Pakistan
- Ishrat Jabeen
- School of Interdisciplinary Engineering and Sciences (SINES), National University of Sciences and Technology (NUST), Islamabad, Pakistan

3
Pozzati G, Kundrotas P, Elofsson A. Scoring of protein–protein docking models utilizing predicted interface residues. Proteins 2022; 90:1493-1505. [PMID: 35246997 PMCID: PMC9314140 DOI: 10.1002/prot.26330] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/27/2021] [Revised: 02/23/2022] [Accepted: 02/28/2022] [Indexed: 11/08/2022]
Abstract
Scoring docking solutions is a difficult task, and many methods have been developed for this purpose. In docking, only a handful of the hundreds of thousands of models generated by docking algorithms are acceptable, causing difficulties when developing scoring functions. Today's best scoring functions can significantly increase the number of top-ranked models but still fail for most targets. Here, we examine the possibility of utilizing predicted interface residues to score docking models generated during the scan stage of a docking algorithm. Many methods have been developed to infer the regions of a protein surface that interact with another protein, but most have not been benchmarked using docking algorithms. This study systematically tests different interface prediction methods for scoring >300,000 low-resolution, rigid-body, template-free docking decoys. Overall, we find that contact-based interface prediction by BIPSPI is the best method to score docking solutions, with >12% of first-ranked docking models being acceptable. Additional experiments indicated that precision is a high-importance metric when estimating interface prediction quality for the purpose of producing docking constraints. Finally, we discuss several limitations of adopting interface predictions as constraints in a docking protocol.
Affiliation(s)
- Gabriele Pozzati
- Department of Biochemistry and Biophysics and Science for Life Laboratory, Stockholm University, Solna, Sweden
- Petras Kundrotas
- Department of Biochemistry and Biophysics and Science for Life Laboratory, Stockholm University, Solna, Sweden
- Center for Bioinformatics and Department of Molecular Biosciences, University of Kansas, Lawrence, Kansas, USA
- Arne Elofsson
- Department of Biochemistry and Biophysics and Science for Life Laboratory, Stockholm University, Solna, Sweden

4
Roy RS, Quadir F, Soltanikazemi E, Cheng J. A deep dilated convolutional residual network for predicting interchain contacts of protein homodimers. Bioinformatics 2022; 38:1904-1910. [PMID: 35134816 PMCID: PMC8963319 DOI: 10.1093/bioinformatics/btac063] [Citation(s) in RCA: 19] [Impact Index Per Article: 9.5] [Received: 09/18/2021] [Revised: 01/17/2022] [Accepted: 01/31/2022] [Indexed: 11/23/2022]
Abstract
Motivation: Deep learning has revolutionized protein tertiary structure prediction recently. Cutting-edge deep learning methods such as AlphaFold can predict high-accuracy tertiary structures for most individual protein chains. However, the accuracy of predicting quaternary structures of protein complexes consisting of multiple chains is still relatively low due to the lack of advanced deep learning methods in the field. Because interchain residue–residue contacts can be used as distance restraints to guide quaternary structure modeling, here we develop a deep dilated convolutional residual network method (DRCon) to predict interchain residue–residue contacts in homodimers from residue–residue co-evolutionary signals derived from multiple sequence alignments of monomers, intrachain residue–residue contacts of monomers extracted from true/predicted tertiary structures or predicted by deep learning, and other sequence and structural features.
Results: Tested on three homodimer test datasets (Homo_std dataset, DeepHomo dataset and CASP-CAPRI dataset), the precision of DRCon for top L/5 interchain contact predictions (L: length of monomer in a homodimer) is 43.46%, 47.10% and 33.50%, respectively, at a 6 Å contact threshold, which is substantially better than DeepHomo and DNCON2_inter and similar to Glinter. Moreover, our experiments demonstrate that when predicted tertiary structures or intrachain contacts of monomers in the unbound state are used as input, DRCon still performs well, even though its accuracy is lower than when true tertiary structures in the bound state are used as input. Finally, our case study shows that good interchain contact predictions can be used to build high-accuracy quaternary structure models of homodimers.
Availability and implementation: The source code of DRCon is available at https://github.com/jianlin-cheng/DRCon. The datasets are available at https://zenodo.org/record/5998532#.YgF70vXMKsB.
Supplementary information: Supplementary data are available at Bioinformatics online.
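As an illustration of the "top L/5 precision" metric quoted in the abstract above, the following is a minimal sketch and not the DRCon evaluation code. The function name, the nested-list input format, and the strict "less than threshold" comparison are assumptions made for this example; which atoms define the interchain distance is also left to the caller.

```python
def top_l5_precision(pred_prob, true_dist, threshold=6.0):
    """Precision of the L/5 highest-scoring predicted interchain contacts.

    pred_prob[i][j]: predicted contact probability between residue i of one
                     monomer and residue j of the other (L x L for a homodimer).
    true_dist[i][j]: observed interchain distance in angstroms.
    threshold:       distance below which a pair counts as a true contact
                     (6 A in the abstract; the strict '<' is an assumption).
    """
    length = len(pred_prob)          # L: length of the monomer
    k = max(1, length // 5)          # number of predictions evaluated
    scored = [
        (pred_prob[i][j], i, j)
        for i in range(length)
        for j in range(len(pred_prob[i]))
    ]
    # Highest predicted probability first.
    scored.sort(key=lambda item: item[0], reverse=True)
    hits = sum(1 for _, i, j in scored[:k] if true_dist[i][j] < threshold)
    return hits / k


# Toy homodimer with L = 10, so the top 10 // 5 = 2 predictions are scored.
L = 10
pred = [[0.1] * L for _ in range(L)]
dist = [[20.0] * L for _ in range(L)]
pred[0][1], dist[0][1] = 0.9, 4.0    # confident prediction, true contact
pred[2][3], dist[2][3] = 0.8, 9.0    # confident prediction, not a contact
print(top_l5_precision(pred, dist))  # one of the two top predictions is correct
```

Evaluating only the L/5 most confident pairs rewards methods whose highest-scoring predictions are reliable, which is what matters when the contacts are used as distance restraints.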
Affiliation(s)
- Raj S Roy
- Department of Electrical Engineering and Computer Science, University of Missouri, Columbia, MO 65211, USA
- Farhan Quadir
- Department of Electrical Engineering and Computer Science, University of Missouri, Columbia, MO 65211, USA
- Elham Soltanikazemi
- Department of Electrical Engineering and Computer Science, University of Missouri, Columbia, MO 65211, USA

5
Verburgt J, Kihara D. Benchmarking of structure refinement methods for protein complex models. Proteins 2022; 90:83-95. [PMID: 34309909 PMCID: PMC8671191 DOI: 10.1002/prot.26188] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 01/19/2021] [Revised: 06/24/2021] [Accepted: 07/22/2021] [Indexed: 01/03/2023]
Abstract
Protein structure docking is the process in which the quaternary structure of a protein complex is predicted from the individual tertiary structures of the protein subunits. Protein docking is typically performed in two main steps. The subunits are first docked while keeping them rigid to form the complex, which is then followed by structure refinement. Structure refinement is crucial for the practical use of computational protein docking models, as it aims to correct the conformations of interacting residues and atoms at the interface. Here, we benchmarked the performance of eight existing protein structure refinement methods in the refinement of protein complex models. We show that the fraction of native contacts between subunits is by far the most straightforward metric to improve. However, backbone-dependent metrics based on the root mean square deviation proved more difficult to improve via refinement.
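The fraction of native contacts mentioned in the abstract above can be sketched as follows. This is an illustrative implementation, not the benchmark's own code: the dictionary input format, the function names, and the 5 Å any-atom cutoff (a commonly used convention) are assumptions for this example.

```python
import math


def interface_contacts(chain_a, chain_b, cutoff=5.0):
    """Residue pairs (res_a, res_b) with any two atoms within `cutoff` angstroms.

    Each chain maps a residue id to a list of (x, y, z) atom coordinates.
    """
    contacts = set()
    for res_a, atoms_a in chain_a.items():
        for res_b, atoms_b in chain_b.items():
            if any(math.dist(p, q) <= cutoff for p in atoms_a for q in atoms_b):
                contacts.add((res_a, res_b))
    return contacts


def fraction_native_contacts(native, model, cutoff=5.0):
    """Share of native interface contacts that are reproduced in the model.

    `native` and `model` are (chain_a, chain_b) tuples using the same residue ids.
    """
    native_contacts = interface_contacts(native[0], native[1], cutoff)
    if not native_contacts:
        return 0.0
    model_contacts = interface_contacts(model[0], model[1], cutoff)
    return len(native_contacts & model_contacts) / len(native_contacts)


# Toy example with one atom per residue: the native complex has two contacts,
# and the model keeps only one of them.
chain_a = {1: [(0.0, 0.0, 0.0)], 2: [(10.0, 0.0, 0.0)]}
native_b = {1: [(3.0, 0.0, 0.0)], 2: [(13.0, 0.0, 0.0)]}
model_b = {1: [(3.0, 0.0, 0.0)], 2: [(30.0, 0.0, 0.0)]}
print(fraction_native_contacts((chain_a, native_b), (chain_a, model_b)))
```

Because the metric depends only on which residue pairs are close, small local rearrangements at the interface can change it without moving the backbone much, which is consistent with the abstract's observation that it is easier to improve than RMSD-based metrics.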
Affiliation(s)
- Jacob Verburgt
- Department of Biological Sciences, Purdue University, West Lafayette, IN, 47907, USA
- Daisuke Kihara
- Department of Biological Sciences, Purdue University, West Lafayette, IN, 47907, USA
- Department of Computer Science, Purdue University, West Lafayette, IN, 47907, USA
- Purdue University Center for Cancer Research, Purdue University, West Lafayette, IN, 47907, USA

6
Christoffer C, Chen S, Bharadwaj V, Aderinwale T, Kumar V, Hormati M, Kihara D. LZerD webserver for pairwise and multiple protein-protein docking. Nucleic Acids Res 2021; 49:W359-W365. [PMID: 33963854 PMCID: PMC8262708 DOI: 10.1093/nar/gkab336] [Citation(s) in RCA: 34] [Impact Index Per Article: 11.3] [Received: 02/25/2021] [Revised: 04/13/2021] [Accepted: 04/19/2021] [Indexed: 12/13/2022]
Abstract
Protein complexes are involved in many important processes in living cells. To understand the mechanisms of these processes, it is necessary to solve the 3D structures of the protein complexes. When protein complex structures have not yet been determined by experiment, protein-protein docking tools can be used to computationally model the structures of these complexes. Here, we present a webserver which provides access to the LZerD and Multi-LZerD protein docking tools. The protocol provided by the server has consistently performed among the top in the CAPRI blind evaluation. LZerD docks pairs of structures, while Multi-LZerD can dock three or more structures simultaneously. LZerD uses a soft protein surface representation with 3D Zernike descriptors and explores the binding pose space using geometric hashing. Multi-LZerD performs multi-chain docking by combining pairwise solutions from LZerD. Both methods output full-atom docked models of the input proteins. Users can also input distance constraints between interacting or non-interacting residues, as well as residues that are located at the interface or far from the interface. The webserver is equipped with a user-friendly panel that visualizes the distribution and structures of binding poses of top-scoring models. The LZerD webserver is available at https://lzerd.kiharalab.org.
Affiliation(s)
- Charles Christoffer
- Department of Computer Science, Purdue University, West Lafayette, IN 47907, USA
- Siyang Chen
- Department of Computer Science, Purdue University, West Lafayette, IN 47907, USA
- Vijay Bharadwaj
- Department of Computer Science, Purdue University, West Lafayette, IN 47907, USA
- Tunde Aderinwale
- Department of Computer Science, Purdue University, West Lafayette, IN 47907, USA
- Vidhur Kumar
- Department of Computer Science, Purdue University, West Lafayette, IN 47907, USA
- Matin Hormati
- Department of Computer Science, Purdue University, West Lafayette, IN 47907, USA
- Daisuke Kihara
- Department of Computer Science, Purdue University, West Lafayette, IN 47907, USA
- Department of Biological Sciences, Purdue University, West Lafayette, IN 47907, USA
- Purdue University Center for Cancer Research, Purdue University, West Lafayette, IN 47907, USA

7
Mahdizadeh SJ, Thomas M, Eriksson LA. Reconstruction of the Fas-Based Death-Inducing Signaling Complex (DISC) Using a Protein-Protein Docking Meta-Approach. J Chem Inf Model 2021; 61:3543-3558. [PMID: 34196179 PMCID: PMC8389534 DOI: 10.1021/acs.jcim.1c00301] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Indexed: 01/08/2023]
Abstract
The death-inducing signaling complex (DISC) is a fundamental multiprotein complex, which triggers the extrinsic apoptosis pathway through stimulation by death ligands. DISC consists of different death domain (DD) and death effector domain (DED) containing proteins such as the death receptor Fas (CD95) in complex with FADD, procaspase-8, and cFLIP. Despite many experimental and theoretical studies in this area, there is no global agreement on either the DISC architecture or the mechanism of action of the involved species. In the current work, we have tried to reconstruct the DISC structure by identifying key protein interactions using a new protein-protein docking meta-approach. We combined the benefits of five of the most employed protein-protein docking engines, HADDOCK, ClusPro, HDOCK, GRAMM-X, and ZDOCK, in order to improve the accuracy of the predicted docking complexes. The free energy of binding and the hot spot interacting residues were determined for each protein-protein interaction using molecular mechanics generalized Born surface area and alanine scanning techniques, respectively. In addition, a series of in-cellulo protein-fragment complementation assays were conducted to validate the protein-protein docking procedure. The results show that DISC formation is initiated by dimerization of adjacent FasDD trimers, followed by recruitment of FADD through homotypic DD interactions with the oligomerized death receptor. Furthermore, the in-silico outcomes indicate that cFLIP cannot bind directly to FADD; instead, cFLIP recruitment to the DISC is a hierarchical and cooperative process where FADD initially recruits procaspase-8, which in turn recruits and heterodimerizes with cFLIP. Finally, a possible structure of the entire DISC is proposed based on the docking results.
Affiliation(s)
- Sayyed Jalil Mahdizadeh
- Department of Chemistry and Molecular Biology, University of Gothenburg, 405 30 Göteborg, Sweden
- Melissa Thomas
- Department of Chemistry and Molecular Biology, University of Gothenburg, 405 30 Göteborg, Sweden
- Leif A Eriksson
- Department of Chemistry and Molecular Biology, University of Gothenburg, 405 30 Göteborg, Sweden

8
Mishra SK, Cooper CJ, Parks JM, Mitchell JC. Hotspot Coevolution Is a Key Identifier of Near-Native Protein Complexes. J Phys Chem B 2021; 125:6058-6067. [PMID: 34077660 DOI: 10.1021/acs.jpcb.0c11525] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/30/2022]
Abstract
Protein-protein interactions play a key role in mediating numerous biological functions, with more than half the proteins in living organisms existing as either homo- or hetero-oligomeric assemblies. Protein subunits that form oligomers minimize the free energy of the complex, but exhaustive computational search-based docking methods have not comprehensively addressed the challenge of distinguishing a natively bound complex from non-native forms. Current protein docking approaches address this problem by sampling multiple binding modes in proteins and scoring each mode, with the lowest-energy (or highest scoring) binding mode being regarded as a near-native complex. However, high-scoring modes often match poorly with the true bound form, suggesting a need for improvement of the scoring function. In this study, we propose a scoring function, KFC-E, that accounts for both conservation and coevolution of putative binding hotspot residues at protein-protein interfaces. We tested KFC-E on four benchmark sets of unbound examples and two benchmark sets of bound examples, with the results demonstrating a clear improvement over scores that examine conservation and coevolution across the entire interface.
Affiliation(s)
- Sambit K Mishra
- Biosciences Division, Oak Ridge National Laboratory, 1 Bethel Valley Road, Oak Ridge, Tennessee 37831-6038, United States
- Connor J Cooper
- Biosciences Division, Oak Ridge National Laboratory, 1 Bethel Valley Road, Oak Ridge, Tennessee 37831-6038, United States
- Jerry M Parks
- Biosciences Division, Oak Ridge National Laboratory, 1 Bethel Valley Road, Oak Ridge, Tennessee 37831-6038, United States
- Julie C Mitchell
- Biosciences Division, Oak Ridge National Laboratory, 1 Bethel Valley Road, Oak Ridge, Tennessee 37831-6038, United States

9
Slater O, Miller B, Kontoyianni M. Decoding Protein-protein Interactions: An Overview. Curr Top Med Chem 2021; 20:855-882. [PMID: 32101126 DOI: 10.2174/1568026620666200226105312] [Citation(s) in RCA: 11] [Impact Index Per Article: 3.7] [Received: 04/13/2019] [Revised: 11/27/2019] [Accepted: 11/27/2019] [Indexed: 12/24/2022]
Abstract
Drug discovery has focused on the paradigm "one drug, one target" for a long time. However, small molecules can act at multiple macromolecular targets, which serves as the basis for drug repurposing. In an effort to expand the target space, and given advances in X-ray crystallography, protein-protein interactions have become an emerging focus area of drug discovery enterprises. Proteins interact with other biomolecules and it is this intricate network of interactions that determines the behavior of the system and its biological processes. In this review, we briefly discuss networks in disease, followed by computational methods for protein-protein complex prediction. Computational methodologies and techniques employed towards objectives such as protein-protein docking, protein-protein interactions, and interface predictions are described extensively. Docking aims at producing a complex between proteins, while interface predictions identify a subset of residues on one protein that could interact with a partner, and protein-protein interaction sites address whether two proteins interact. In addition, approaches to predict hot spots and binding sites are presented along with a representative example of our internal project on the chemokine CXC receptor 3 B-isoform and predictive modeling with IP10 and PF4.
Affiliation(s)
- Olivia Slater
- Department of Pharmaceutical Sciences, Southern Illinois University, Edwardsville, IL 62026, United States
- Bethany Miller
- Department of Pharmaceutical Sciences, Southern Illinois University, Edwardsville, IL 62026, United States
- Maria Kontoyianni
- Department of Pharmaceutical Sciences, Southern Illinois University, Edwardsville, IL 62026, United States

10
Aderinwale T, Christoffer CW, Sarkar D, Alnabati E, Kihara D. Computational structure modeling for diverse categories of macromolecular interactions. Curr Opin Struct Biol 2020; 64:1-8. [PMID: 32599506 PMCID: PMC7665979 DOI: 10.1016/j.sbi.2020.05.017] [Citation(s) in RCA: 21] [Impact Index Per Article: 5.3] [Received: 01/30/2020] [Revised: 05/06/2020] [Accepted: 05/21/2020] [Indexed: 01/23/2023]
Abstract
Computational protein-protein docking is one of the most intensively studied topics in structural bioinformatics. The field has made substantial progress through over three decades of development. The development began with methods for rigid-body docking of two proteins, which have now been extended in different directions to cover the various macromolecular interactions observed in a cell. Here, we overview the recent developments of the variations of docking methods, including multiple protein docking, peptide-protein docking, and disordered protein docking methods.
Affiliation(s)
- Tunde Aderinwale
- Department of Computer Science, Purdue University, West Lafayette, IN, 47907, USA
- Daipayan Sarkar
- Department of Biological Sciences, Purdue University, West Lafayette, IN, 47907, USA
- Eman Alnabati
- Department of Computer Science, Purdue University, West Lafayette, IN, 47907, USA
- Daisuke Kihara
- Department of Computer Science, Purdue University, West Lafayette, IN, 47907, USA
- Department of Biological Sciences, Purdue University, West Lafayette, IN, 47907, USA

11
Badaczewska-Dawid AE, Kolinski A, Kmiecik S. Computational reconstruction of atomistic protein structures from coarse-grained models. Comput Struct Biotechnol J 2019; 18:162-176. [PMID: 31969975 PMCID: PMC6961067 DOI: 10.1016/j.csbj.2019.12.007] [Citation(s) in RCA: 34] [Impact Index Per Article: 6.8] [Received: 12/10/2019] [Accepted: 12/10/2019] [Indexed: 01/02/2023]
Abstract
Three-dimensional protein structures, whether determined experimentally or theoretically, are often of too low a resolution. In this mini-review, we outline the computational methods for protein structure reconstruction from incomplete coarse-grained models to all-atom models. Typical reconstruction schemes can be divided into four major steps. Usually, the first step is reconstruction of the protein backbone chain starting from the C-alpha trace. This is followed by side-chain rebuilding based on the protein backbone geometry. Subsequently, hydrogen atoms can be reconstructed. Finally, the resulting all-atom models may require structure optimization. Many methods are available to perform each of these tasks. We discuss the available tools and their potential applications in integrative modeling pipelines that can transfer coarse-grained information from computational predictions, or experiment, to all-atom structures.
Affiliation(s)
- Sebastian Kmiecik
- Faculty of Chemistry, Biological and Chemical Research Center, University of Warsaw, Pasteura 1, 02-093 Warsaw, Poland

12
Nilofer C, Sukhwal A, Mohanapriya A, Sakharkar MK, Kangueane P. Small protein-protein interfaces rich in electrostatic are often linked to regulatory function. J Biomol Struct Dyn 2019; 38:3260-3279. [PMID: 31495333 DOI: 10.1080/07391102.2019.1657040] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Indexed: 02/07/2023]
Abstract
Protein-protein interaction (PPI) is critical for several biological functions in living cells through the formation of an interface. Therefore, it is of interest to characterize protein-protein interfaces using an updated non-redundant structural dataset of 2557 homo (identical subunits) and 393 hetero (different subunits) dimer protein complexes determined by X-ray crystallography. We analyzed the interfaces using van der Waals (vdW), hydrogen bonding and electrostatic energies. Results show that, on average, homo and hetero interfaces are similar. Hence, we further grouped the 2950 interfaces, based on the percentage of vdW energy in the total energy, into dominant (≥60%) and sub-dominant (<60%) vdW interfaces. The majority (92%) of interfaces have dominant vdW energy with large interface size (146 ± 87 (homo) and 137 ± 76 (hetero) residues) and interface area (1622 ± 1135 Å² (homo) and 1579 ± 1060 Å² (hetero)). However, a proportion (8%) of interfaces have sub-dominant vdW energy with small interface size (85 ± 46 (homo) and 88 ± 36 (hetero) residues) and interface area (823 ± 538 Å² (homo) and 881 ± 377 Å² (hetero)). Large interfaces have two-fold more interface area and interface size than small interfaces, and their hydrogen bonding energy increases with interface size. However, small interfaces have three-fold more electrostatic energy than large interfaces, and their electrostatic energy increases with interface size. Thus, the 8% of complexes having small interfaces with limited interface area and sub-dominant vdW energy are rich in electrostatics. It is interesting to observe that complexes having small interfaces are often associated with regulatory function. Hence, the observed structural features with known molecular function provide insights for the better understanding of PPI. Communicated by Ramaswamy H. Sarma.
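The grouping rule described in the abstract above (dominant if vdW energy is at least 60% of the total, sub-dominant otherwise) can be sketched as below. This is an illustration only: the function name is hypothetical, and treating the total as the sum of the vdW, hydrogen bonding and electrostatic terms is an assumption based on the three energies the abstract lists.

```python
def classify_interface(vdw, hbond, electrostatic, threshold=60.0):
    """Label an interface by the share of vdW energy in the total energy.

    Energies are expected in the same units and with the same sign
    (for example, all negative for favorable interactions). The total is
    assumed to be vdw + hbond + electrostatic; mixed signs are not handled.
    Returns (label, vdw_percentage).
    """
    total = vdw + hbond + electrostatic
    if total == 0:
        raise ValueError("total interface energy is zero")
    vdw_percentage = 100.0 * vdw / total
    label = "dominant" if vdw_percentage >= threshold else "sub-dominant"
    return label, vdw_percentage


# Hypothetical energies, chosen only to show both outcomes.
print(classify_interface(-70.0, -20.0, -10.0))  # vdW share of 70%
print(classify_interface(-30.0, -20.0, -50.0))  # vdW share of 30%
```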
Affiliation(s)
- Christina Nilofer
- Biomedical Informatics (P) Ltd., Pondicherry, India
- School of Biosciences & Technology, VIT University, Vellore, Tamil Nadu, India
- Anshul Sukhwal
- National Centre for Biological Sciences (NCBS), Bangalore, India

13

14
Macalino SJY, Basith S, Clavio NAB, Chang H, Kang S, Choi S. Evolution of In Silico Strategies for Protein-Protein Interaction Drug Discovery. Molecules 2018; 23:E1963. [PMID: 30082644 PMCID: PMC6222862 DOI: 10.3390/molecules23081963] [Citation(s) in RCA: 62] [Impact Index Per Article: 10.3] [Received: 07/17/2018] [Revised: 08/03/2018] [Accepted: 08/04/2018] [Indexed: 12/14/2022]
Abstract
The advent of advanced molecular modeling software, big data analytics, and high-speed processing units has led to the exponential evolution of modern drug discovery and better insights into complex biological processes and disease networks. This has progressively steered current research interests to understanding protein-protein interaction (PPI) systems that are related to a number of relevant diseases, such as cancer, neurological illnesses, and metabolic disorders. However, targeting PPIs is challenging due to their "undruggable" binding interfaces. In this review, we focus on the current obstacles that impede PPI drug discovery, and how recent discoveries and advances in in silico approaches can alleviate these barriers to expedite the search for potential leads, as shown in several exemplary studies. We also discuss currently available information on PPI compounds and systems, along with their usefulness in molecular modeling. Finally, we conclude by presenting the limits of in silico application in drug discovery and offer a perspective on the field of computer-aided PPI drug discovery.
Affiliation(s)
- Stephani Joy Y Macalino
- College of Pharmacy and Graduate School of Pharmaceutical Sciences, Ewha Womans University, Seoul 03760, Korea
- Shaherin Basith
- College of Pharmacy and Graduate School of Pharmaceutical Sciences, Ewha Womans University, Seoul 03760, Korea
- Nina Abigail B Clavio
- College of Pharmacy and Graduate School of Pharmaceutical Sciences, Ewha Womans University, Seoul 03760, Korea
- Hyerim Chang
- College of Pharmacy and Graduate School of Pharmaceutical Sciences, Ewha Womans University, Seoul 03760, Korea
- Soosung Kang
- College of Pharmacy and Graduate School of Pharmaceutical Sciences, Ewha Womans University, Seoul 03760, Korea
- Sun Choi
- College of Pharmacy and Graduate School of Pharmaceutical Sciences, Ewha Womans University, Seoul 03760, Korea

15
Peterson LX, Shin WH, Kim H, Kihara D. Improved performance in CAPRI round 37 using LZerD docking and template-based modeling with combined scoring functions. Proteins 2018; 86 Suppl 1:311-320. [PMID: 28845596 PMCID: PMC5820220 DOI: 10.1002/prot.25376] [Citation(s) in RCA: 18] [Impact Index Per Article: 3.0] [Received: 05/09/2017] [Revised: 08/09/2017] [Accepted: 08/24/2017] [Indexed: 12/12/2022]
Abstract
We report our group's performance for protein-protein complex structure prediction and scoring in Round 37 of the Critical Assessment of PRediction of Interactions (CAPRI), an objective assessment of protein-protein complex modeling. We demonstrated noticeable improvement in both prediction and scoring compared to previous rounds of CAPRI, with our human predictor group near the top of the rankings and our server scorer group at the top. This is the first time in CAPRI that a server has been the top scorer group. To predict protein-protein complex structures, we used both multi-chain template-based modeling (TBM) and our protein-protein docking program, LZerD. LZerD represents protein surfaces using 3D Zernike descriptors (3DZD), which are based on a mathematical series expansion of a 3D function. Because 3DZD are a soft representation of the protein surface, LZerD is tolerant to small conformational changes, making it well suited to docking unbound and TBM structures. The key to our improved performance in CAPRI Round 37 was to combine multi-chain TBM and docking. As opposed to our previous strategy of performing docking for all target complexes, we used TBM when multi-chain templates were available and docking otherwise. We also describe the combination of multiple scoring functions used by our server scorer group, which achieved the top rank for the scorer phase.
Affiliation(s)
- Lenna X. Peterson
- Department of Biological Sciences, Purdue University, West Lafayette, IN, 47907, USA
| | - Woong-Hee Shin
- Department of Biological Sciences, Purdue University, West Lafayette, IN, 47907, USA
| | - Hyungrae Kim
- Department of Biological Sciences, Purdue University, West Lafayette, IN, 47907, USA
| | - Daisuke Kihara
- Department of Biological Sciences, Purdue University, West Lafayette, IN, 47907, USA
- Department of Computer Science, Purdue University, West Lafayette, IN, 47907, USA
| |
|
16
|
Daberdaku S, Ferrari C. Exploring the potential of 3D Zernike descriptors and SVM for protein-protein interface prediction. BMC Bioinformatics 2018; 19:35. [PMID: 29409446 PMCID: PMC5802066 DOI: 10.1186/s12859-018-2043-3] [Citation(s) in RCA: 27] [Impact Index Per Article: 4.5] [Received: 08/29/2017] [Accepted: 01/24/2018] [Indexed: 12/22/2022] Open
Abstract
Background: The correct determination of protein–protein interaction interfaces is important for understanding disease mechanisms and for rational drug design. To date, several computational methods for the prediction of protein interfaces have been developed, but the interface prediction problem is still not fully understood. Experimental evidence suggests that the location of binding sites is imprinted in the protein structure, but there are major differences among the interfaces of the various protein types: the characterising properties can vary a lot depending on the interaction type and function. The selection of an optimal set of features characterising the protein interface and the development of an effective method to represent and capture the complex protein recognition patterns are of paramount importance for this task. Results: In this work we investigate the potential of a novel local surface descriptor based on 3D Zernike moments for the interface prediction task. Descriptors invariant to roto-translations are extracted from circular patches of the protein surface enriched with physico-chemical properties from the HQI8 amino acid index set, and are used as samples for a binary classification problem. Support Vector Machines are used as a classifier to distinguish interface local surface patches from non-interface ones. The proposed method was validated on 16 classes of proteins extracted from the Protein–Protein Docking Benchmark 5.0 and compared to other state-of-the-art protein interface predictors (SPPIDER, PrISE and NPS-HomPPI). Conclusions: The 3D Zernike descriptors are able to capture the similarity among patterns of physico-chemical and biochemical properties mapped on the protein surface arising from the various spatial arrangements of the underlying residues, and their usage can be easily extended to other sets of amino acid properties.
The results suggest that the choice of a proper set of features characterising the protein interface is crucial for the interface prediction task, and that optimality strongly depends on the class of proteins whose interface we want to characterise. We postulate that different protein classes should be treated separately and that it is necessary to identify an optimal set of features for each protein class.
Affiliation(s)
- Sebastian Daberdaku
- Department of Information Engineering, University of Padova, via Gradenigo 6/A, Padova, 35131, Italy.
| | - Carlo Ferrari
- Department of Information Engineering, University of Padova, via Gradenigo 6/A, Padova, 35131, Italy
| |
|
17
|
Peterson LX, Togawa Y, Esquivel-Rodriguez J, Terashi G, Christoffer C, Roy A, Shin WH, Kihara D. Modeling the assembly order of multimeric heteroprotein complexes. PLoS Comput Biol 2018; 14:e1005937. [PMID: 29329283 PMCID: PMC5785014 DOI: 10.1371/journal.pcbi.1005937] [Citation(s) in RCA: 23] [Impact Index Per Article: 3.8] [Received: 06/01/2017] [Revised: 01/25/2018] [Accepted: 12/19/2017] [Indexed: 12/31/2022] Open
Abstract
Protein-protein interactions are the cornerstone of numerous biological processes. Although an increasing number of protein complex structures have been determined using experimental methods, relatively few studies have been performed to determine the assembly order of complexes. In addition to the insights into the molecular mechanisms of biological function provided by the structure of a complex, knowing the assembly order is important for understanding the process of complex formation. Assembly order is also practically useful for constructing subcomplexes as a step toward solving the entire complex experimentally, designing artificial protein complexes, and developing drugs that interrupt a critical step in the complex assembly. There are several experimental methods for determining the assembly order of complexes; however, these techniques are resource-intensive. Here, we present a computational method that predicts the assembly order of protein complexes by building the complex structure. The method, named Path-LZerD, uses a multimeric protein docking algorithm that assembles a protein complex structure from individual subunit structures and predicts assembly order by observing the simulated assembly process of the complex. Benchmarked on a dataset of complexes with experimental evidence of assembly order, Path-LZerD was successful in predicting the assembly pathway for the majority of the cases. Moreover, when compared with a simple approach that infers the assembly path from the buried surface area of subunits in the native complex, Path-LZerD has the strong advantage that it can be used for cases where the complex structure is not known. The path prediction accuracy decreased when starting from unbound monomers, particularly for larger complexes of five or more subunits, for which only a part of the assembly path was correctly identified.
As the first method of its kind, Path-LZerD opens a new area of computational protein structure modeling and will be an indispensable approach for studying protein complexes.
Affiliation(s)
- Lenna X. Peterson
- Department of Biological Sciences, Purdue University, West Lafayette, Indiana, United States of America
| | - Yoichiro Togawa
- Department of Biological Sciences, Purdue University, West Lafayette, Indiana, United States of America
| | - Juan Esquivel-Rodriguez
- Department of Computer Science, Purdue University, West Lafayette, Indiana, United States of America
| | - Genki Terashi
- Department of Biological Sciences, Purdue University, West Lafayette, Indiana, United States of America
| | - Charles Christoffer
- Department of Computer Science, Purdue University, West Lafayette, Indiana, United States of America
| | - Amitava Roy
- Department of Biological Sciences, Purdue University, West Lafayette, Indiana, United States of America
- Department of Medicinal Chemistry and Molecular Pharmacology, Purdue University, West Lafayette, Indiana, United States of America
- Bioinformatics and Computational Biosciences Branch, Rocky Mountain Laboratories, NIAID, National Institutes of Health, Hamilton, Montana, United States of America
| | - Woong-Hee Shin
- Department of Biological Sciences, Purdue University, West Lafayette, Indiana, United States of America
| | - Daisuke Kihara
- Department of Biological Sciences, Purdue University, West Lafayette, Indiana, United States of America
- Department of Computer Science, Purdue University, West Lafayette, Indiana, United States of America
| |
|
18
|
Han X, Wei Q, Kihara D. Protein 3D Structure and Electron Microscopy Map Retrieval Using 3D-SURFER2.0 and EM-SURFER. Curr Protoc Bioinformatics 2017; 60:3.14.1-3.14.15. [PMID: 29220075 DOI: 10.1002/cpbi.37] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.9] [Indexed: 11/06/2022]
Abstract
With the rapid growth in the number of solved protein structures stored in the Protein Data Bank (PDB) and the Electron Microscopy Data Bank (EMDB), it is essential to develop tools to perform real-time structure similarity searches against the entire structure database. Since conventional structure alignment methods need to sample different orientations of proteins in the three-dimensional space, they are time consuming and unsuitable for rapid, real-time database searches. To this end, we have developed 3D-SURFER and EM-SURFER, which utilize 3D Zernike descriptors (3DZD) to conduct high-throughput protein structure comparison, visualization, and analysis. Taking an atomic structure or an electron microscopy map of a protein or a protein complex as input, the 3DZD of a query protein is computed and compared with the 3DZD of all other proteins in PDB or EMDB. In addition, local geometrical characteristics of a query protein can be analyzed using VisGrid and LIGSITECSC in 3D-SURFER. This article describes how to use 3D-SURFER and EM-SURFER to carry out protein surface shape similarity searches, local geometric feature analysis, and interpretation of the search results. © 2017 by John Wiley & Sons, Inc.
Affiliation(s)
- Xusi Han
- Department of Biological Sciences, Purdue University, West Lafayette, Indiana
| | - Qing Wei
- Department of Computer Science, Purdue University, West Lafayette, Indiana
| | - Daisuke Kihara
- Department of Biological Sciences, Purdue University, West Lafayette, Indiana
- Department of Computer Science, Purdue University, West Lafayette, Indiana
| |
|
19
|
Peterson LX, Kim H, Esquivel-Rodriguez J, Roy A, Han X, Shin WH, Zhang J, Terashi G, Lee M, Kihara D. Human and server docking prediction for CAPRI round 30-35 using LZerD with combined scoring functions. Proteins 2017; 85:513-527. [PMID: 27654025 PMCID: PMC5313330 DOI: 10.1002/prot.25165] [Citation(s) in RCA: 16] [Impact Index Per Article: 2.3] [Received: 07/17/2016] [Revised: 09/09/2016] [Accepted: 09/15/2016] [Indexed: 12/12/2022]
Abstract
We report the performance of protein-protein docking predictions by our group for recent rounds of the Critical Assessment of Prediction of Interactions (CAPRI), a community-wide assessment of state-of-the-art docking methods. Our prediction procedure uses a protein-protein docking program named LZerD developed in our group. LZerD represents a protein surface with 3D Zernike descriptors (3DZD), which are based on a mathematical series expansion of a 3D function. The appropriate soft representation of protein surface with 3DZD makes the method more tolerant to conformational change of proteins upon docking, which adds an advantage for unbound docking. Docking was guided by interface residue prediction performed with BindML and cons-PPISP as well as literature information when available. The generated docking models were ranked by a combination of scoring functions, including PRESCO, which evaluates the native-likeness of residues' spatial environments in structure models. First, we discuss the overall performance of our group in the CAPRI prediction rounds and investigate the reasons for unsuccessful cases. Then, we examine the performance of several knowledge-based scoring functions and their combinations for ranking docking models. It was found that the quality of a pool of docking models generated by LZerD, that is whether or not the pool includes near-native models, can be predicted by the correlation of multiple scores. Although the current analysis used docking models generated by LZerD, findings on scoring functions are expected to be universally applicable to other docking methods. Proteins 2017; 85:513-527. © 2016 Wiley Periodicals, Inc.
Affiliation(s)
- Lenna X. Peterson
- Department of Biological Sciences, Purdue University, West Lafayette, IN, 47907, USA
| | - Hyungrae Kim
- Department of Biological Sciences, Purdue University, West Lafayette, IN, 47907, USA
| | | | - Amitava Roy
- Department of Biological Sciences, Purdue University, West Lafayette, IN, 47907, USA
- Department of Medicinal Chemistry and Molecular Pharmacology, Purdue University, West Lafayette, IN, 47907, USA
- Bioinformatics and Computational Biosciences Branch, Rocky Mountain Laboratories, NIAID, National Institutes of Health, Hamilton, Montana 59840, USA
| | - Xusi Han
- Department of Biological Sciences, Purdue University, West Lafayette, IN, 47907, USA
| | - Woong-Hee Shin
- Department of Biological Sciences, Purdue University, West Lafayette, IN, 47907, USA
| | - Jian Zhang
- Department of Biological Sciences, Purdue University, West Lafayette, IN, 47907, USA
| | - Genki Terashi
- Department of Biological Sciences, Purdue University, West Lafayette, IN, 47907, USA
- School of Pharmacy, Kitasato University, Minato-Ku, Tokyo, 108-8641, Japan
| | - Matt Lee
- Lilly Biotechnology Center San Diego, 10300 Campus Point Drive, San Diego, CA, 92121, USA
| | - Daisuke Kihara
- Department of Biological Sciences, Purdue University, West Lafayette, IN, 47907, USA
- Department of Computer Science, Purdue University, West Lafayette, IN, 47907, USA
| |
|
20
|
Wei Q, La D, Kihara D. BindML/BindML+: Detecting Protein-Protein Interaction Interface Propensity from Amino Acid Substitution Patterns. Methods Mol Biol 2017; 1529:279-289. [PMID: 27914057 DOI: 10.1007/978-1-4939-6637-0_14] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.4] [Indexed: 06/06/2023]
Abstract
Prediction of protein-protein interaction sites in a protein structure provides important information for elucidating the mechanism of protein function and can also be useful in guiding modeling or design procedures of protein complex structures. Since prediction methods essentially assess the propensity of amino acids that are likely to be part of a protein docking interface, they can help in designing protein-protein interactions. Here, we introduce BindML and BindML+, methods for predicting protein-protein interaction sites. BindML predicts protein-protein interaction sites by identifying mutation patterns found in known protein-protein complexes using phylogenetic substitution models. BindML+ is an extension of BindML for distinguishing permanent and transient types of protein-protein interaction sites. We developed an interactive web server that provides a convenient interface to assist in structural visualization of protein-protein interaction site predictions. The input to the web server is a tertiary structure of interest. BindML and BindML+ are available at http://kiharalab.org/bindml/ and http://kiharalab.org/bindml/plus/.
Affiliation(s)
- Qing Wei
- Department of Computer Science, Purdue University, West Lafayette, IN, 47907, USA
| | - David La
- Department of Biochemistry, University of Washington, Seattle, WA, 98195, USA
| | - Daisuke Kihara
- Department of Computer Science, Purdue University, West Lafayette, IN, 47907, USA.
- Department of Biological Sciences, Purdue University, West Lafayette, IN, 47907, USA.
| |
|
21
|
de Vries SJ, Chauvot de Beauchêne I, Schindler CEM, Zacharias M. Cryo-EM Data Are Superior to Contact and Interface Information in Integrative Modeling. Biophys J 2016; 110:785-97. [PMID: 26846888 DOI: 10.1016/j.bpj.2015.12.038] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.6] [Received: 07/02/2015] [Revised: 11/18/2015] [Accepted: 12/14/2015] [Indexed: 12/29/2022] Open
Abstract
Protein-protein interactions carry out a large variety of essential cellular processes. Cryo-electron microscopy (cryo-EM) is a powerful technique for the modeling of protein-protein interactions at a wide range of resolutions, and recent developments have caused a revolution in the field. At low resolution, cryo-EM maps can drive integrative modeling of the interaction, assembling existing structures into the map. Other experimental techniques can provide information on the interface or on the contacts between the monomers in the complex. This inevitably raises the question regarding which type of data is best suited to drive integrative modeling approaches. Systematic comparison of the prediction accuracy and specificity of the different integrative modeling paradigms is unavailable to date. Here, we compare EM-driven, interface-driven, and contact-driven integrative modeling paradigms. Models were generated for the protein docking benchmark using the ATTRACT docking engine and evaluated using the CAPRI two-star criterion. At 20 Å resolution, EM-driven modeling achieved a success rate of 100%, outperforming the other paradigms even with perfect interface and contact information. Therefore, even very low resolution cryo-EM data is superior in predicting heterodimeric and heterotrimeric protein assemblies. Our study demonstrates that a force field is not necessary; cryo-EM data alone is sufficient to accurately guide the monomers into place. The resulting rigid models successfully identify regions of conformational change, opening up perspectives for targeted flexible remodeling.
Affiliation(s)
- Sjoerd J de Vries
- Physik-Department T38, Technische Universität München, Garching, Germany.
| | | | - Christina E M Schindler
- Physik-Department T38, Technische Universität München, Garching, Germany; Center for Integrated Protein Science Munich (CIPSM) at the Physics Department, Technische Universität München, Garching, Germany
| | - Martin Zacharias
- Physik-Department T38, Technische Universität München, Garching, Germany; Center for Integrated Protein Science Munich (CIPSM) at the Physics Department, Technische Universität München, Garching, Germany
| |
|
22
|
Computing Discrete Fine-Grained Representations of Protein Surfaces. COMPUTATIONAL INTELLIGENCE METHODS FOR BIOINFORMATICS AND BIOSTATISTICS 2016. [DOI: 10.1007/978-3-319-44332-4_14] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.8] [Indexed: 12/12/2022]
|
23
|
Xue LC, Dobbs D, Bonvin AMJJ, Honavar V. Computational prediction of protein interfaces: A review of data driven methods. FEBS Lett 2015; 589:3516-26. [PMID: 26460190 PMCID: PMC4655202 DOI: 10.1016/j.febslet.2015.10.003] [Citation(s) in RCA: 101] [Impact Index Per Article: 11.2] [Received: 07/06/2015] [Revised: 10/01/2015] [Accepted: 10/02/2015] [Indexed: 01/06/2023]
Abstract
Reliably pinpointing which specific amino acid residues form the interface(s) between a protein and its binding partner(s) is critical for understanding the structural and physicochemical determinants of protein recognition and binding affinity, and has wide applications in modeling and validating protein interactions predicted by high-throughput methods, in engineering proteins, and in prioritizing drug targets. Here, we review the basic concepts, principles and recent advances in computational approaches to the analysis and prediction of protein-protein interfaces. We point out caveats for objectively evaluating interface predictors, and discuss various applications of data-driven interface predictors for improving energy model-driven protein-protein docking. Finally, we stress the importance of exploiting binding partner information in reliably predicting interfaces and highlight recent advances in this emerging direction.
Affiliation(s)
- Li C Xue
- Faculty of Science - Chemistry, Bijvoet Center for Biomolecular Research, Utrecht Univ., Utrecht 3584 CH, The Netherlands.
| | - Drena Dobbs
- Department of Genetics, Development & Cell Biology, Iowa State Univ., Ames, IA 50011, USA; Bioinformatics & Computational Biology Program, Iowa State Univ., Ames, IA 50011, USA
| | - Alexandre M J J Bonvin
- Faculty of Science - Chemistry, Bijvoet Center for Biomolecular Research, Utrecht Univ., Utrecht 3584 CH, The Netherlands
| | - Vasant Honavar
- College of Information Sciences & Technology, Pennsylvania State Univ., University Park, PA 16802, USA; Genomics & Bioinformatics Program, Pennsylvania State Univ., University Park, PA 16802, USA; Neuroscience Program, Pennsylvania State Univ., University Park, PA 16802, USA; The Huck Institutes of the Life Sciences, Pennsylvania State Univ., University Park, PA 16802, USA; Center for Big Data Analytics & Discovery Informatics, Pennsylvania State Univ., University Park, PA 16802, USA; Institute for Cyberscience, Pennsylvania State Univ., University Park, PA 16802, USA
| |
|
24
|
Capitani G, Duarte JM, Baskaran K, Bliven S, Somody JC. Understanding the fabric of protein crystals: computational classification of biological interfaces and crystal contacts. Bioinformatics 2015; 32:481-9. [PMID: 26508758 PMCID: PMC4743631 DOI: 10.1093/bioinformatics/btv622] [Citation(s) in RCA: 30] [Impact Index Per Article: 3.3] [Received: 07/17/2015] [Accepted: 10/16/2015] [Indexed: 11/20/2022] Open
Abstract
Modern structural biology still draws the vast majority of information from crystallography, a technique where the objects being investigated are embedded in a crystal lattice. Given the complexity and variety of those objects, it becomes fundamental to computationally assess which of the interfaces in the lattice are biologically relevant and which are simply crystal contacts. Since the mid-1990s, several approaches have been applied to obtain high-accuracy classification of crystal contacts and biological protein–protein interfaces. This review provides an overview of the concepts and main approaches to protein interface classification: thermodynamic estimation of interface stability, evolutionary approaches based on conservation of interface residues, and co-occurrence of the interface across different crystal forms. Among the three categories, evolutionary approaches offer the strongest promise for improvement, thanks to the incessant growth in sequence knowledge. Importantly, protein interface classification algorithms can also be used on multimeric structures obtained using other high-resolution techniques or for protein assembly design or validation purposes. A key issue linked to protein interface classification is the identification of the biological assembly of a crystal structure and the analysis of its symmetry. Here, we highlight the most important concepts and problems to be overcome in assembly prediction. Over the next few years, tools and concepts of interface classification will probably become more frequently used and integrated in several areas of structural biology and structural bioinformatics. Among the main challenges for the future are better addressing of weak interfaces and the application of interface classification concepts to prediction problems like protein–protein docking. Supplementary information: Supplementary data are available at Bioinformatics online. Contact: guido.capitani@psi.ch
Affiliation(s)
- Guido Capitani
- Laboratory of Biomolecular Research, Paul Scherrer Institute, OFLC/110, 5232 Villigen PSI; Department of Biology, ETH Zurich, 8093 Zurich, Switzerland
| | - Jose M Duarte
- Laboratory of Biomolecular Research, Paul Scherrer Institute, OFLC/110, 5232 Villigen PSI; Department of Biology, ETH Zurich, 8093 Zurich, Switzerland
| | - Kumaran Baskaran
- Laboratory of Biomolecular Research, Paul Scherrer Institute, OFLC/110, 5232 Villigen PSI
| | - Spencer Bliven
- Laboratory of Biomolecular Research, Paul Scherrer Institute, OFLC/110, 5232 Villigen PSI; Bioinformatics and Systems Biology Program, UC San Diego, La Jolla, CA 92093; National Center for Biotechnology Information, NIH, Bethesda, MD 20894, USA
| | - Joseph C Somody
- Laboratory of Biomolecular Research, Paul Scherrer Institute, OFLC/110, 5232 Villigen PSI; Department of Computer Science, ETH Zurich, 8092 Zurich, Switzerland
| |
|
25
|
Hashmi I, Shehu A. idDock+: Integrating Machine Learning in Probabilistic Search for Protein–Protein Docking. J Comput Biol 2015. [DOI: 10.1089/cmb.2015.0108] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.6] [Indexed: 12/20/2022] Open
Affiliation(s)
- Irina Hashmi
- Department of Computer Science, George Mason University, Fairfax, Virginia
| | - Amarda Shehu
- Department of Computer Science, George Mason University, Fairfax, Virginia
- Department of Bioengineering, George Mason University, Fairfax, Virginia
- School of Systems Biology, George Mason University, Fairfax, Virginia
| |
|
26
|
Zheng H, Mandal A, Shumilin IA, Chordia MD, Panneerdoss S, Herr JC, Minor W. Sperm Lysozyme-Like Protein 1 (SLLP1), an intra-acrosomal oolemmal-binding sperm protein, reveals filamentous organization in protein crystal form. Andrology 2015; 3:756-71. [PMID: 26198801 PMCID: PMC5040164 DOI: 10.1111/andr.12057] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.8] [Received: 03/04/2015] [Revised: 04/29/2015] [Accepted: 04/30/2015] [Indexed: 01/07/2023]
Abstract
Sperm lysozyme-like protein 1 (SLLP1) is one of the lysozyme-like proteins predominantly expressed in mammalian testes that lacks bacteriolytic activity, localizes in the sperm acrosome, and exhibits high affinity for an oolemmal receptor, SAS1B. The crystal structure of mouse SLLP1 (mSLLP1) was determined at 2.15 Å resolution. mSLLP1 monomer adopts a structural fold similar to that of chicken/mouse lysozymes retaining all four canonical disulfide bonds. mSLLP1 is distinct from c-lysozyme by substituting two essential catalytic residues (E35T/D52N), exhibiting different surface charge distribution, and by forming helical filaments approximately 75 Å in diameter with a 25 Å central pore comprised of six monomers per helix turn repeating every 33 Å. Cross-species alignment of all reported SLLP1 sequences revealed a set of invariant surface regions comprising a characteristic fingerprint uniquely identifying SLLP1 from other c-lysozyme family members. The fingerprint surface regions reside around the lips of the putative glycan-binding groove including three polar residues (Y33/E46/H113). A flexible salt bridge (E46-R61) was observed covering the glycan-binding groove. The conservation of these regions may be linked to their involvement in oolemmal protein binding. Interaction between SLLP1 monomer and its oolemmal receptor SAS1B was modeled using protein-protein docking algorithms, utilizing the SLLP1 fingerprint regions along with the SAS1B conserved surface regions. This computational model revealed complementarity between the conserved SLLP1/SAS1B interacting surfaces supporting the experimentally observed SLLP1/SAS1B interaction involved in fertilization.
Affiliation(s)
- Heping Zheng
- Department of Molecular Physiology and Biological Physics, University of Virginia, Charlottesville, VA 22908, USA
| | - Arabinda Mandal
- Department of Cell Biology, Center for Research in Contraceptive and Reproductive Health, University of Virginia, Charlottesville, VA 22908, USA
| | - Igor A. Shumilin
- Department of Molecular Physiology and Biological Physics, University of Virginia, Charlottesville, VA 22908, USA
| | - Mahendra D. Chordia
- Department of Molecular Physiology and Biological Physics, University of Virginia, Charlottesville, VA 22908, USA
| | - Subbarayalu Panneerdoss
- Department of Cell Biology, Center for Research in Contraceptive and Reproductive Health, University of Virginia, Charlottesville, VA 22908, USA
| | - John C. Herr
- Department of Cell Biology, Center for Research in Contraceptive and Reproductive Health, University of Virginia, Charlottesville, VA 22908, USA
- Correspondence: Wladek Minor, Ph.D., Department of Molecular Physiology and Biological Physics, University of Virginia, P.O. Box 800736, Charlottesville, Virginia 22908-0736, USA. Ph: +1 434 243-6865; Fax: +1 434 982-1616. John C. Herr, Ph.D., Department of Cell Biology, University of Virginia, P.O. Box 800732, Charlottesville, Virginia 22908, USA. Ph: +1 434 924-2007; Fax: +1 434 982-3912.
| | - Wladek Minor
- Department of Molecular Physiology and Biological Physics, University of Virginia, Charlottesville, VA 22908, USA
| |
|
27
|
Esquivel-Rodríguez J, Xiong Y, Han X, Guang S, Christoffer C, Kihara D. Navigating 3D electron microscopy maps with EM-SURFER. BMC Bioinformatics 2015; 16:181. [PMID: 26025554 PMCID: PMC4448178 DOI: 10.1186/s12859-015-0580-6] [Citation(s) in RCA: 18] [Impact Index Per Article: 2.0] [Received: 07/24/2014] [Accepted: 04/20/2015] [Indexed: 03/18/2023] Open
Abstract
Background: The Electron Microscopy DataBank (EMDB) is growing rapidly, accumulating biological structural data obtained mainly by electron microscopy and tomography, which are emerging techniques for determining large biomolecular complex and subcellular structures. Together with the Protein Data Bank (PDB), EMDB is becoming a fundamental resource of the tertiary structures of biological macromolecules. To take full advantage of this indispensable resource, the ability to search the database by structural similarity is essential. However, unlike high-resolution structures stored in PDB, methods for comparing low-resolution electron microscopy (EM) density maps in EMDB are not well established. Results: We developed a computational method for efficiently searching low-resolution EM maps. The method uses a compact fingerprint representation of EM maps based on the 3D Zernike descriptor, which is derived from a mathematical series expansion for EM maps that are considered as 3D functions. The method is implemented in a web server named EM-SURFER, which allows users to search against the entire EMDB in real-time. EM-SURFER compares the global shapes of EM maps. Examples of search results from different types of query structures are discussed. Conclusions: We developed EM-SURFER, which retrieves structurally relevant matches for query EM maps from EMDB within seconds. The unique capability of EM-SURFER to detect 3D shape similarity of low-resolution EM maps should prove invaluable in structural biology.
Affiliation(s)
| | - Yi Xiong
- Department of Biological Sciences, Purdue University, West Lafayette, IN, 47907, USA.
| | - Xusi Han
- Department of Biological Sciences, Purdue University, West Lafayette, IN, 47907, USA.
| | - Shuomeng Guang
- Department of Biological Sciences, Purdue University, West Lafayette, IN, 47907, USA.
| | - Charles Christoffer
- Department of Computer Science, Purdue University, West Lafayette, IN, 47907, USA.
- Department of Mathematics, Purdue University, West Lafayette, IN, 47907, USA.
| | - Daisuke Kihara
- Department of Computer Science, Purdue University, West Lafayette, IN, 47907, USA.
- Department of Biological Sciences, Purdue University, West Lafayette, IN, 47907, USA.
| |
Collapse
|
28
|
Maheshwari S, Brylinski M. Predicting protein interface residues using easily accessible on-line resources. Brief Bioinform 2015; 16:1025-34. [PMID: 25797794 DOI: 10.1093/bib/bbv009] [Citation(s) in RCA: 35] [Impact Index Per Article: 3.9] [Received: 12/01/2014] [Indexed: 01/20/2023] Open
Abstract
It has been more than a decade since the completion of the Human Genome Project that provided us with a complete list of human proteins. The next obvious task is to figure out how various parts interact with each other. On that account, we review 10 methods for protein interface prediction, which are freely available as web servers. In addition, we comparatively evaluate their performance on a common data set comprising different quality target structures. We find that using experimental structures and high-quality homology models, structure-based methods outperform those using only protein sequences, with global template-based approaches providing the best performance. For moderate-quality models, sequence-based methods often perform better than those structure-based techniques that rely on fine atomic details. We note that post-processing protocols implemented in several methods quantitatively improve the results only for experimental structures, suggesting that these procedures should be tuned up for computer-generated models. Finally, we anticipate that advanced meta-prediction protocols are likely to enhance interface residue prediction. Notwithstanding further improvements, easily accessible web servers already provide the scientific community with convenient resources for the identification of protein-protein interaction sites.
|
29
|
Wierschin T, Wang K, Welter M, Waack S, Stanke M. Combining features in a graphical model to predict protein binding sites. Proteins 2015; 83:844-52. [PMID: 25663045 DOI: 10.1002/prot.24775] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.6] [Received: 07/15/2014] [Revised: 01/16/2015] [Accepted: 01/26/2015] [Indexed: 11/08/2022]
Abstract
Large efforts have been made in classifying residues as binding sites in proteins using machine learning methods. The prediction task can be translated into the computational challenge of assigning each residue the label binding site or non-binding site. Observational data comes from various possibly highly correlated sources. It includes the structure of the protein but not the structure of the complex. The model class of conditional random fields (CRFs) has previously successfully been used for protein binding site prediction. Here, a new CRF-approach is presented that models the dependencies of residues using a general graphical structure defined as a neighborhood graph and thus our model makes fewer independence assumptions on the labels than sequential labeling approaches. A novel node feature "change in free energy" is introduced into the model, which is then denoted by ΔF-CRF. Parameters are trained with an online large-margin algorithm. Using the standard feature class relative accessible surface area alone, the general graph-structure CRF already achieves higher prediction accuracy than the linear chain CRF of Li et al. ΔF-CRF performs significantly better on a large range of false positive rates than the support-vector-machine-based program PresCont of Zellner et al. on a homodimer set containing 128 chains. ΔF-CRF has a broader scope than PresCont since it is not constrained to protein subgroups and requires no multiple sequence alignment. The improvement is attributed to the advantageous combination of the novel node feature with the standard feature and to the adopted parameter training method.
Affiliation(s)
- Torsten Wierschin
- Institute of Mathematics and Computer Science, University of Greifswald, 17487, Greifswald, Germany
|
30
|
Segura J, Marín-López MA, Jones PF, Oliva B, Fernandez-Fuentes N. VORFFIP-driven dock: V-D2OCK, a fast and accurate protein docking strategy. PLoS One 2015; 10:e0118107. [PMID: 25763838 PMCID: PMC4357426 DOI: 10.1371/journal.pone.0118107] [Citation(s) in RCA: 18] [Impact Index Per Article: 2.0] [Received: 09/16/2014] [Accepted: 12/27/2014] [Indexed: 12/24/2022] Open
Abstract
The experimental determination of the structure of protein complexes cannot keep pace with the generation of interactomic data, hence resulting in an ever-expanding gap. As the structural details of protein complexes are central to a full understanding of the function and dynamics of the cell machinery, alternative strategies are needed to circumvent the bottleneck in structure determination. Computational protein docking is a valid and valuable approach to model the structure of protein complexes. In this work, we describe a novel computational strategy to predict the structure of protein complexes based on data-driven docking: VORFFIP-driven dock (V-D2OCK). This new approach makes use of our newly described method to predict functional sites in protein structures, VORFFIP, to define the region to be sampled during docking, and of structural clustering to reduce the number of models to be examined by users. V-D2OCK has been benchmarked using a validated and diverse set of protein complexes and compared to a state-of-the-art docking method. The speed and accuracy compared to contemporary tools justify the potential use of V-D2OCK for high-throughput, genome-wide protein docking. Finally, we have developed a web interface that allows users to browse and visualize V-D2OCK predictions from the convenience of their web browsers.
Affiliation(s)
- Joan Segura
- Leeds Institute of Molecular Medicine, School of Medicine, University of Leeds, Leeds, LS9 7TF, United Kingdom
- Manuel Alejandro Marín-López
- Structural Bioinformatics Lab (GRIB-IMIM), Department of Experimental and Health Sciences, Universitat Pompeu Fabra, 08003 Barcelona, Catalonia, Spain
- Pamela F. Jones
- Leeds Institute of Molecular Medicine, School of Medicine, University of Leeds, Leeds, LS9 7TF, United Kingdom
- Baldo Oliva
- Structural Bioinformatics Lab (GRIB-IMIM), Department of Experimental and Health Sciences, Universitat Pompeu Fabra, 08003 Barcelona, Catalonia, Spain
- Narcis Fernandez-Fuentes
- Leeds Institute of Molecular Medicine, School of Medicine, University of Leeds, Leeds, LS9 7TF, United Kingdom
|
31
|
Krippahl L, Barahona P. Protein docking with predicted constraints. Algorithms Mol Biol 2015; 10:9. [PMID: 25722738 PMCID: PMC4340843 DOI: 10.1186/s13015-015-0036-6] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.8] [Received: 04/01/2014] [Accepted: 01/29/2015] [Indexed: 11/30/2022] Open
Abstract
This paper presents a constraint-based method for improving protein docking results. Efficient constraint propagation cuts over 95% of the search time for finding the configurations with the largest contact surface, provided a contact is specified between two amino acid residues. This makes it possible to scan a large number of potentially correct constraints, lowering the requirements for useful contact predictions. While other approaches are very dependent on accurate contact predictions, ours requires only that at least one correct contact be retained in a set of, for example, one hundred constraints to test. It is this feature that makes it feasible to use readily available sequence data to predict specific potential contacts. Although such prediction is too inaccurate for most purposes, we demonstrate with a Naïve Bayes Classifier that it is accurate enough to more than double the average number of acceptable models retained during the crucial filtering stage of protein docking when combined with our constrained docking algorithm. All software developed in this work is freely available as part of the Open Chemera Library.
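The requirement that only one of roughly a hundred candidate constraints be correct can be illustrated with a small calculation. The sketch below assumes, purely for illustration, that each predicted contact is independently correct with the same probability p; neither this independence assumption nor the example value p = 0.05 comes from the paper.

```python
def prob_at_least_one_correct(p, n):
    """Probability that at least one of n independent predictions is correct.

    p is the (assumed) per-prediction probability of being correct.
    """
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be between 0 and 1")
    if n < 0:
        raise ValueError("n must be non-negative")
    return 1.0 - (1.0 - p) ** n


# Even a predictor that is right only 5% of the time (hypothetical value)
# almost surely yields one correct contact in a set of 100 constraints.
chance = prob_at_least_one_correct(0.05, 100)
```

Under these assumptions the chance is about 0.994, which suggests why a contact predictor that is too inaccurate for most purposes can still be useful when many constraints can be scanned cheaply.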
|
32
|
Maheshwari S, Brylinski M. Prediction of protein-protein interaction sites from weakly homologous template structures using meta-threading and machine learning. J Mol Recognit 2015; 28:35-48. [DOI: 10.1002/jmr.2410] [Citation(s) in RCA: 21] [Impact Index Per Article: 2.3] [Received: 02/26/2014] [Revised: 06/19/2014] [Accepted: 06/27/2014] [Indexed: 11/11/2022]
Affiliation(s)
- Surabhi Maheshwari
- Department of Biological Sciences; Louisiana State University; Baton Rouge LA 70803 USA
- Michal Brylinski
- Department of Biological Sciences; Louisiana State University; Baton Rouge LA 70803 USA
- Center for Computation & Technology; Louisiana State University; Baton Rouge LA 70803 USA
|
33
|
Peterson LX, Kang X, Kihara D. Assessment of protein side-chain conformation prediction methods in different residue environments. Proteins 2014; 82:1971-84. [PMID: 24619909 PMCID: PMC5007623 DOI: 10.1002/prot.24552] [Citation(s) in RCA: 26] [Impact Index Per Article: 2.6] [Received: 01/15/2014] [Revised: 03/02/2014] [Accepted: 03/07/2014] [Indexed: 11/09/2022]
Abstract
Computational prediction of side-chain conformation is an important component of protein structure prediction. Accurate side-chain prediction is crucial for practical applications of protein structure models that need atomic-detailed resolution such as protein and ligand design. We evaluated the accuracy of eight side-chain prediction methods in reproducing the side-chain conformations of experimentally solved structures deposited to the Protein Data Bank. Prediction accuracy was evaluated for a total of four different structural environments (buried, surface, interface, and membrane-spanning) in three different protein types (monomeric, multimeric, and membrane). Overall, the highest accuracy was observed for buried residues in monomeric and multimeric proteins. Notably, side-chains at protein interfaces and membrane-spanning regions were better predicted than surface residues even though the methods did not all use multimeric and membrane proteins for training. Thus, we conclude that the current methods are as practically useful for modeling protein docking interfaces and membrane-spanning regions as for modeling monomers.
Affiliation(s)
- Lenna X. Peterson
- Department of Biological Sciences, Purdue University, West Lafayette IN, 47907, USA
- Xuejiao Kang
- Department of Computer Science, Purdue University, West Lafayette, IN, 47907, USA
- Daisuke Kihara
- Department of Biological Sciences, Purdue University, West Lafayette IN, 47907, USA
- Department of Computer Science, Purdue University, West Lafayette, IN, 47907, USA
|
34
|
Esmaielbeiki R, Nebel JC. Scoring docking conformations using predicted protein interfaces. BMC Bioinformatics 2014; 15:171. [PMID: 24906633 PMCID: PMC4057934 DOI: 10.1186/1471-2105-15-171] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.9] [Received: 12/11/2012] [Accepted: 05/29/2014] [Indexed: 12/22/2022] Open
Abstract
Background Since proteins function by interacting with other molecules, analysis of protein-protein interactions is essential for comprehending biological processes. Whereas understanding of atomic interactions within a complex is especially useful for drug design, limitations of experimental techniques have restricted their practical use. Despite progress in docking predictions, there is still room for improvement. In this study, we contribute to this topic by proposing T-PioDock, a framework for detection of a native-like docked complex 3D structure. T-PioDock supports the identification of near-native conformations among the 3D models produced by docking software by scoring those models using binding interfaces predicted by the interface predictor, Template based Protein Interface Prediction (T-PIP). Results First, exhaustive evaluation of interface predictors demonstrates that T-PIP, whose predictions are customised to target complexity, is a state-of-the-art method. Second, comparative study between T-PioDock and other state-of-the-art scoring methods establishes T-PioDock as the best performing approach. Moreover, there is good correlation between T-PioDock performance and quality of docking models, which suggests that progress in docking will lead to even better results at recognising near-native conformations. Conclusion Accurate identification of near-native conformations remains a challenging task. Although availability of 3D complexes will benefit template-based methods such as T-PioDock, we have identified specific limitations which need to be addressed. First, docking software is still not able to produce native-like models for every target. Second, current interface predictors do not explicitly consider pairwise residue interactions between proteins and their interacting partners, which leaves ambiguity when assessing quality of complex conformations.
Affiliation(s)
- Reyhaneh Esmaielbeiki
- Department of Statistics, University of Oxford, 1 South Parks Road, Oxford OX1 3TG, UK.
|
35
|
Saccà C, Teso S, Diligenti M, Passerini A. Improved multi-level protein-protein interaction prediction with semantic-based regularization. BMC Bioinformatics 2014; 15:103. [PMID: 24725682 PMCID: PMC4004462 DOI: 10.1186/1471-2105-15-103] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.2] [Received: 08/08/2013] [Accepted: 03/03/2014] [Indexed: 11/24/2022] Open
Abstract
Background Protein–protein interactions can be seen as a hierarchical process occurring at three related levels: proteins bind by means of specific domains, which in turn form interfaces through patches of residues. Detailed knowledge about which domains and residues are involved in a given interaction has extensive applications to biology, including better understanding of the binding process and more efficient drug/enzyme design. Alas, most current interaction prediction methods do not identify which parts of a protein actually instantiate an interaction. Furthermore, they also fail to leverage the hierarchical nature of the problem, ignoring otherwise useful information available at the lower levels; when they do, they do not generate predictions that are guaranteed to be consistent between levels. Results Inspired by earlier ideas of Yip et al. (BMC Bioinformatics 10:241, 2009), in the present paper we view the problem as a multi-level learning task, with one task per level (proteins, domains and residues), and propose a machine learning method that collectively infers the binding state of all object pairs. Our method is based on Semantic Based Regularization (SBR), a flexible and theoretically sound machine learning framework that uses First Order Logic constraints to tie the learning tasks together. We introduce a set of biologically motivated rules that enforce consistent predictions between the hierarchy levels. Conclusions We study the empirical performance of our method using a standard validation procedure, and compare its performance against the only other existing multi-level prediction technique. We present results showing that our method substantially outperforms the competitor in several experimental settings, indicating that exploiting the hierarchical nature of the problem can lead to better predictions. In addition, our method is also guaranteed to produce interactions that are consistent with respect to the protein–domain–residue hierarchy.
Affiliation(s)
- Andrea Passerini
- Dipartimento di Ingegneria e Scienza dell'Informazione, University of Trento, Trento, Italy.
|
36
|
Xue LC, Jordan RA, EL-Manzalawy Y, Dobbs D, Honavar V. DockRank: ranking docked conformations using partner-specific sequence homology-based protein interface prediction. Proteins 2014; 82:250-67. [PMID: 23873600 PMCID: PMC4417613 DOI: 10.1002/prot.24370] [Citation(s) in RCA: 26] [Impact Index Per Article: 2.6] [Received: 09/08/2012] [Revised: 06/27/2013] [Accepted: 07/09/2013] [Indexed: 12/11/2022]
Abstract
Selecting near-native conformations from the immense number of conformations generated by docking programs remains a major challenge in molecular docking. We introduce DockRank, a novel approach to scoring docked conformations based on the degree to which the interface residues of the docked conformation match a set of predicted interface residues. DockRank uses interface residues predicted by partner-specific sequence homology-based protein-protein interface predictor (PS-HomPPI), which predicts the interface residues of a query protein with a specific interaction partner. We compared the performance of DockRank with several state-of-the-art docking scoring functions using Success Rate (the percentage of cases that have at least one near-native conformation among the top m conformations) and Hit Rate (the percentage of near-native conformations that are included among the top m conformations). In cases where it is possible to obtain partner-specific (PS) interface predictions from PS-HomPPI, DockRank consistently outperforms both (i) ZRank and IRAD, two state-of-the-art energy-based scoring functions (improving Success Rate by up to 4-fold); and (ii) variants of DockRank that use predicted interface residues obtained from several protein interface predictors that do not take into account the binding partner in making interface predictions (improving Success Rate by up to 39-fold). The latter result underscores the importance of using partner-specific interface residues in scoring docked conformations. We show that DockRank, when used to re-rank the conformations returned by ClusPro, improves upon the original ClusPro rankings in terms of both Success Rate and Hit Rate. DockRank is available as a server at http://einstein.cs.iastate.edu/DockRank/.
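The two evaluation measures defined in this abstract can be sketched directly from their wording. The code below is an illustrative reading, not the authors' implementation: each docking case is assumed to be a list of booleans ordered from best-ranked to worst-ranked, where True marks a near-native conformation, and the function names and toy data are hypothetical.

```python
def success_rate(cases, m):
    """Percentage of cases with at least one near-native model in the top m."""
    if not cases:
        raise ValueError("at least one docking case is required")
    successes = sum(1 for ranked in cases if any(ranked[:m]))
    return 100.0 * successes / len(cases)


def hit_rate(ranked, m):
    """Percentage of a case's near-native models that fall within the top m."""
    total_near_native = sum(ranked)
    if total_near_native == 0:
        return 0.0  # assumed convention when a case has no near-native model
    return 100.0 * sum(ranked[:m]) / total_near_native


# Toy rankings for three docking cases (True = near-native conformation).
toy_cases = [
    [False, True, False, True],
    [False, False, False, True],
    [True, False, False, False],
]
sr_top2 = success_rate(toy_cases, 2)
hr_case0_top2 = hit_rate(toy_cases[0], 2)
```

Success Rate is computed across cases, while Hit Rate is computed within a single case, so the two can diverge: a scoring function may place one near-native model in the top m for every case yet still leave most near-native models far down the list.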
Affiliation(s)
- Li C. Xue
- Bioinformatics and Computational Biology program, Iowa State University, Ames, Iowa
- Rafael A. Jordan
- Department of Computer Science, Iowa State University, Ames, Iowa
- Department of Systems and Computer Engineering, Pontificia Universidad Javeriana, Cali, Colombia
- Yasser EL-Manzalawy
- Department of Computer Science, Iowa State University, Ames, Iowa
- Department of Systems and Computer Engineering, Al-Azhar University, Cairo, Egypt
- Drena Dobbs
- Bioinformatics and Computational Biology program, Iowa State University, Ames, Iowa
- Department of Genetics, Development and Cell Biology, Iowa State University, Ames, Iowa
- Vasant Honavar
- Bioinformatics and Computational Biology program, Iowa State University, Ames, Iowa
- Department of Computer Science, Iowa State University, Ames, Iowa
|
37
|
Esquivel-Rodriguez J, Filos-Gonzalez V, Li B, Kihara D. Pairwise and multimeric protein-protein docking using the LZerD program suite. Methods Mol Biol 2014; 1137:209-34. [PMID: 24573484 DOI: 10.1007/978-1-4939-0366-5_15] [Citation(s) in RCA: 27] [Impact Index Per Article: 2.7] [Indexed: 12/11/2022]
Abstract
Physical interactions between proteins are involved in many important cell functions and are key for understanding the mechanisms of biological processes. Protein-protein docking programs provide a means to computationally construct three-dimensional (3D) models of a protein complex structure from its component protein units. A protein docking program takes two or more individual 3D protein structures, which are either experimentally solved or computationally modeled, and outputs a series of probable complex structures. In this chapter we present the LZerD protein docking suite, which includes programs for pairwise docking, LZerD and PI-LZerD, and multiple protein docking, Multi-LZerD, developed by our group. PI-LZerD takes protein docking interface residues as additional input information. The methods use a combination of shape-based protein surface features as well as physics-based scoring terms to generate protein complex models. The programs are provided as stand-alone programs and can be downloaded from http://kiharalab.org/proteindocking.
|
38
|
Krawczyk K, Baker T, Shi J, Deane CM. Antibody i-Patch prediction of the antibody binding site improves rigid local antibody-antigen docking. Protein Eng Des Sel 2013; 26:621-9. [PMID: 24006373 DOI: 10.1093/protein/gzt043] [Citation(s) in RCA: 60] [Impact Index Per Article: 5.5] [Indexed: 11/12/2022] Open
Abstract
Antibodies are a class of proteins indispensable for the vertebrate immune system. The general architecture of all antibodies is very similar, but they contain a hypervariable region which allows millions of antibody variants to exist, each of which can bind to different molecules. This binding malleability means that antibodies are an increasingly important category of biopharmaceuticals and biomarkers. We present Antibody i-Patch, a method that annotates the most likely antibody residues to be in contact with the antigen. We show that our predictions correlate with energetic importance and thus we argue that they may be useful in guiding mutations in the artificial affinity maturation process. Using our predictions as constraints for a rigid-body docking algorithm, we are able to obtain high-quality results in minutes. Our annotation method and re-scoring system for docking achieve their predictive power by using antibody-specific statistics. Antibody i-Patch is available from http://www.stats.ox.ac.uk/research/proteins/resources.
Affiliation(s)
- Konrad Krawczyk
- Department of Statistics, University of Oxford, 1 South Parks Road, Oxford OX1 3TG, UK
|
39
|
Shi Z, Wedd AG, Gras SL. Parallel in vivo DNA assembly by recombination: experimental demonstration and theoretical approaches. PLoS One 2013; 8:e56854. [PMID: 23468883 PMCID: PMC3585241 DOI: 10.1371/journal.pone.0056854] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.6] [Received: 10/24/2012] [Accepted: 01/17/2013] [Indexed: 01/10/2023] Open
Abstract
The development of synthetic biology requires rapid batch construction of large gene networks from combinations of smaller units. Despite the availability of computational predictions for well-characterized enzymes, the optimization of most synthetic biology projects requires combinational constructions and tests. A new building-brick-style parallel DNA assembly framework for simple and flexible batch construction is presented here. It is based on robust recombination steps and allows a variety of DNA assembly techniques to be organized for complex constructions (with or without scars). The assembly of five DNA fragments into a host genome was performed as an experimental demonstration.
Affiliation(s)
- Zhenyu Shi
- School of Chemistry, University of Melbourne, Parkville, Victoria, Australia.
|
40
|
Low-resolution structural modeling of protein interactome. Curr Opin Struct Biol 2013; 23:198-205. [PMID: 23294579 DOI: 10.1016/j.sbi.2012.12.003] [Citation(s) in RCA: 51] [Impact Index Per Article: 4.6] [Received: 11/07/2012] [Accepted: 12/03/2012] [Indexed: 11/23/2022]
Abstract
Structural characterization of protein-protein interactions across the broad spectrum of scales is key to our understanding of life at the molecular level. A low-resolution approach to protein interactions is needed for modeling large interaction networks, given the significant level of uncertainty in large biomolecular systems and the high-throughput nature of the task. Since only a fraction of protein structures in the interactome are determined experimentally, protein docking approaches are increasingly focusing on modeled proteins. The current rapid advancement of template-based modeling of protein-protein complexes follows a long-standing trend in structure prediction of individual proteins. Protein-protein templates are already available for almost all interactions of structurally characterized proteins, and about one third of such templates are likely correct.
|
41
|
Shih ESC, Hwang MJ. A critical assessment of information-guided protein-protein docking predictions. Mol Cell Proteomics 2012; 12:679-86. [PMID: 23242549 DOI: 10.1074/mcp.m112.020198] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.6] [Indexed: 11/06/2022] Open
Abstract
The structures of protein complexes are increasingly predicted via protein-protein docking (PPD) using ambiguous interaction data to help guide the docking. These data often are incomplete and contain errors and therefore could lead to incorrect docking predictions. In this study, we performed a series of PPD simulations to examine the effects of incompletely and incorrectly assigned interface residues on the success rate of PPD predictions. The results for a widely used PPD benchmark dataset obtained using a new interface information-driven PPD (IPPD) method developed in this work showed that the success rate for an acceptable top-ranked model varied, depending on the information content used, from as high as 95% when contact relationships (though not contact distances) were known for all residues to 78% when only the interface/non-interface state of the residues was known. However, the success rates decreased rapidly to ∼40% when the interface/non-interface state of 20% of the residues was assigned incorrectly, and to less than 5% for a 40% incorrect assignment. Comparisons with results obtained by re-ranking a global search and with those reported for other data-guided PPD methods showed that, in general, IPPD performed better than re-ranking when the information used was more complete and more accurate, but worse when it was not, and that when using bioinformatics-predicted information on interface residues, IPPD and other data-guided PPD methods performed poorly, at a level similar to simulations with a 40% incorrect assignment. These results provide guidelines for using information about interface residues to improve PPD predictions and reveal a bottleneck for such improvement imposed by the low accuracy of current bioinformatic interface residue predictions.
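The kind of perturbation studied in this abstract, where the interface/non-interface state of a fixed fraction of residues is assigned incorrectly, can be mimicked with a short sketch. This is only an assumed way of generating such noisy labels; the abstract does not describe how the incorrect assignments were produced, and the function name, seed, and toy protein below are hypothetical.

```python
import random


def corrupt_interface_labels(labels, error_fraction, seed=0):
    """Flip the interface state of a fixed fraction of residues.

    labels is a list of booleans (True = interface residue). Exactly
    round(error_fraction * len(labels)) randomly chosen entries are inverted.
    """
    if not 0.0 <= error_fraction <= 1.0:
        raise ValueError("error_fraction must be between 0 and 1")
    rng = random.Random(seed)  # fixed seed keeps the example reproducible
    n_flip = round(error_fraction * len(labels))
    flip_idx = set(rng.sample(range(len(labels)), n_flip))
    return [(not lab) if i in flip_idx else lab for i, lab in enumerate(labels)]


# Toy protein: 10 interface residues followed by 40 non-interface residues.
true_labels = [True] * 10 + [False] * 40
noisy_labels = corrupt_interface_labels(true_labels, 0.20, seed=1)
n_wrong = sum(a != b for a, b in zip(true_labels, noisy_labels))
```

Feeding labels corrupted at several error fractions into a data-guided docking run is one way to trace how the success rate degrades as the interface information becomes less reliable.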
Affiliation(s)
- Edward S C Shih
- Institute of Biomedical Sciences, Academia Sinica, Nankang, Taipei 115, Taiwan
|
42
|
Zaman A. Docking studies and network analyses reveal capacity of compounds from Kandelia rheedii to strengthen cellular immunity by interacting with host proteins during tuberculosis infection. Bioinformation 2012; 8:1012-20. [PMID: 23275699 PMCID: PMC3524883 DOI: 10.6026/97320630081012] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.3] [Received: 10/07/2012] [Accepted: 10/15/2012] [Indexed: 01/09/2023] Open
Abstract
Kandelia rheedii (locally known as Guria or Rasunia), widely found and used in the Indian subcontinent, is a well-known herbal cure for tuberculosis. However, neither the mechanism nor the active components of the plant extract responsible for mediating this action have yet been confirmed. In this study, molecular interactions of three compounds (emodin, fusaric acid and skyrin) from the plant extract with the host protein targets (casein kinase (CSNK), estrogen receptor (ERBB), dopamine β-hydroxylase (DBH) and glucagon receptor (Gcgr)) were identified. These protein targets are known to be responsible for strengthening cellular immunity against Mycobacterium tuberculosis. The specific interactions of these three compounds with the respective protein targets are discussed. The insights from this study should further help in designing molecular medicines against tuberculosis.
Affiliation(s)
- Aubhishek Zaman
- Molecular Biology Laboratory, Department of Biochemistry and Molecular Biology and Department of Genetic Engineering and Biotechnology, University of Dhaka, Dhaka-1000, Bangladesh
|
43
|
Esquivel-Rodríguez J, Kihara D. Fitting multimeric protein complexes into electron microscopy maps using 3D Zernike descriptors. J Phys Chem B 2012; 116:6854-61. [PMID: 22417139 PMCID: PMC3376205 DOI: 10.1021/jp212612t] [Citation(s) in RCA: 37] [Impact Index Per Article: 3.1] [Indexed: 11/28/2022]
Abstract
A novel computational method for fitting high-resolution structures of multiple proteins into a cryoelectron microscopy map is presented. The method, named EMLZerD, generates a pool of candidate multiple protein docking conformations of component proteins, which are later compared with a provided electron microscopy (EM) density map to select the ones that fit well into the EM map. The comparison of docking conformations and the EM map is performed using the 3D Zernike descriptor (3DZD), a mathematical series expansion of three-dimensional functions. The 3DZD provides a unified representation of the surface shape of multimeric protein complex models and EM maps, which allows a convenient, fast quantitative comparison of the three-dimensional structural data. Out of 19 multimeric complexes tested, near-native complex structures with a root-mean-square deviation of less than 2.5 Å were obtained for 14 cases, while medium-range resolution structures with correct topology were computed for the additional 5 cases.
Affiliation(s)
- Daisuke Kihara
- Department of Biological Sciences, Purdue University, West Lafayette, IN, 47907, USA
- Department of Computer Science, Purdue University, West Lafayette, IN, 47907, USA
- Markey Center for Structural Biology, Purdue University, West Lafayette, IN, 47907, USA
|