1
Kousaka S, Ishikawa T. Quantum Chemistry-Based Protein-Protein Docking without Empirical Parameters. J Chem Theory Comput 2024; 20:5164-5175. [PMID: 38845143 DOI: 10.1021/acs.jctc.4c00531] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/26/2024]
Abstract
This study developed a novel protein-protein docking approach based on quantum chemistry. To judge the appropriateness of complex structures, we introduced two criterion values, EV1 and EV2, computed using the fragment molecular orbital method without any empirical parameters. These criterion values enable us to search for complex structures in which the electrostatic potential patterns of the two proteins are optimally aligned at their interface. The performance of our method was validated using 53 complexes in a benchmark set provided for protein-protein docking. With bound-state structures, docking success rates reached 64% for EV1 and 76% for EV2. With unbound-state structures, docking success rates reached 13% for EV1 and 17% for EV2.
Affiliation(s)
- Sumire Kousaka
- Department of Chemistry, Biotechnology, and Chemical Engineering, Graduate School of Science and Engineering, Kagoshima University, 1-21-40 Korimoto, Kagoshima 890-0065, Japan
- Takeshi Ishikawa
- Department of Chemistry, Biotechnology, and Chemical Engineering, Graduate School of Science and Engineering, Kagoshima University, 1-21-40 Korimoto, Kagoshima 890-0065, Japan
2
Graef J, Ehrt C, Reim T, Rarey M. Database-Driven Identification of Structurally Similar Protein-Protein Interfaces. J Chem Inf Model 2024; 64:3332-3349. [PMID: 38470439 DOI: 10.1021/acs.jcim.3c01462] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/13/2024]
Abstract
Analyzing the similarity of protein interfaces in protein-protein interactions gives new insights into protein function and assists in discovering new drugs. Usually, tools that assess the similarity focus on the interactions between two protein interfaces, while sometimes we only have one predicted interface. Herein, we present PiMine, a database-driven protein interface similarity search. It compares interface residues of one or two interacting chains by calculating and searching tetrahedral geometric patterns of α-carbon atoms and calculating physicochemical and shape-based similarity. On a dedicated, tailor-made dataset, we show that PiMine outperforms commonly used comparison tools in terms of early enrichment when considering interfaces of sequentially and structurally unrelated proteins. In an application example, we demonstrate its usability for protein interaction partner prediction by comparing predicted interfaces to known protein-protein interfaces.
Affiliation(s)
- Joel Graef
- Universität Hamburg, ZBH - Center for Bioinformatics, Albert-Einstein-Ring 8-10, 22761 Hamburg, Germany
- Christiane Ehrt
- Universität Hamburg, ZBH - Center for Bioinformatics, Albert-Einstein-Ring 8-10, 22761 Hamburg, Germany
- Thorben Reim
- Universität Hamburg, ZBH - Center for Bioinformatics, Albert-Einstein-Ring 8-10, 22761 Hamburg, Germany
- Matthias Rarey
- Universität Hamburg, ZBH - Center for Bioinformatics, Albert-Einstein-Ring 8-10, 22761 Hamburg, Germany
3
Zhang C, Zhang C, Shang T, Zhu N, Wu X, Duan H. HighFold: accurately predicting structures of cyclic peptides and complexes with head-to-tail and disulfide bridge constraints. Brief Bioinform 2024; 25:bbae215. [PMID: 38706323 PMCID: PMC11070728 DOI: 10.1093/bib/bbae215] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/29/2023] [Revised: 04/12/2024] [Accepted: 04/18/2024] [Indexed: 05/07/2024] Open
Abstract
In recent years, cyclic peptides have emerged as a promising therapeutic modality due to their diverse biological activities. Understanding the structures of these cyclic peptides and their complexes is crucial for gaining insight into protein target-cyclic peptide interactions, which can facilitate the development of related novel drugs. However, experimental structure determination is time-consuming and expensive, and computer-aided drug design methods are not practical enough in real-world applications. To tackle this challenge, we introduce HighFold, an AlphaFold-derived model. By integrating specific details about the head-to-tail circle and disulfide bridge structures, the HighFold model can accurately predict the structures of cyclic peptides and their complexes. Our model demonstrates superior predictive performance compared to other existing approaches, representing a significant advancement in structure-activity research. The HighFold model is openly accessible at https://github.com/hongliangduan/HighFold.
Affiliation(s)
- Chenhao Zhang
- College of Pharmaceutical Sciences, Zhejiang University of Technology, Hangzhou, 310014, China
- Chengyun Zhang
- College of Pharmaceutical Sciences, Zhejiang University of Technology, Hangzhou, 310014, China
- AI department, Shanghai Highslab Therapeutics. Inc, Shanghai, 201203, China
- Tianfeng Shang
- AI department, Shanghai Highslab Therapeutics. Inc, Shanghai, 201203, China
- Ning Zhu
- China Pharmaceutical University, Nanjing, Jiangsu, 211198, China
- Xinyi Wu
- College of Pharmaceutical Sciences, Zhejiang University of Technology, Hangzhou, 310014, China
- Hongliang Duan
- Faculty of Applied Sciences, Macao Polytechnic University, R. de Luís Gonzaga Gomes, Macao, 999078, China
4
Jastrzębski MK. Computational Methods to Target Protein-Protein Interactions. Methods Mol Biol 2024; 2780:327-343. [PMID: 38987476 DOI: 10.1007/978-1-0716-3985-6_17] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 07/12/2024]
Abstract
The chapter emphasizes the importance of understanding protein-protein interactions in cellular mechanisms and highlights the role of computational modeling in predicting these interactions. It discusses sequence-based approaches such as evolutionary trace (ET), correlated mutation analysis (CMA), and subtractive correlated mutation (SCM) for identifying crucial amino acid residues, considering interface conservation or evolutionary changes. The chapter also explores methods like differential ET, hidden-site class model, and spatial cluster detection (SCD) for interface specificity and spatial clustering. Furthermore, it examines approaches combining structural and sequential methodologies and evaluates modeled predictions through initiatives like critical assessment of prediction of interactions (CAPRI). Additionally, the chapter provides an overview of various software programs used for molecular docking, detailing their search, sampling, refinement and scoring stages, along with innovative techniques and tools like normal mode analysis (NMA) and adaptive Poisson-Boltzmann solver (APBS) for electrostatic calculations. These computational and experimental approaches are crucial for unraveling protein-protein interactions and aid in developing potential therapeutics for various diseases.
Affiliation(s)
- Michał K Jastrzębski
- Department of Synthesis and Chemical Technology of Pharmaceutical Substances with Computer Modeling Laboratory, Faculty of Pharmacy, Medical University of Lublin, Lublin, Poland.
5
Robin X, Studer G, Durairaj J, Eberhardt J, Schwede T, Walters WP. Assessment of protein-ligand complexes in CASP15. Proteins 2023; 91:1811-1821. [PMID: 37795762 DOI: 10.1002/prot.26601] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Received: 04/11/2023] [Revised: 09/14/2023] [Accepted: 09/19/2023] [Indexed: 10/06/2023]
Abstract
CASP15 introduced a new category, ligand prediction, where participants were provided with a protein or nucleic acid sequence, SMILES line notation, and stoichiometry for ligands and tasked with generating computational models for the three-dimensional structure of the corresponding protein-ligand complex. These models were subsequently compared with experimental structures determined by x-ray crystallography or cryoEM. To assess these predictions, two novel scores were developed. The Binding-Site Superposed, Symmetry-Corrected Pose Root Mean Square Deviation (BiSyRMSD) evaluated the absolute deviations of the models from the experimental structures. At the same time, the Local Distance Difference Test for Protein-Ligand Interactions (lDDT-PLI) assessed the ability of models to reproduce the protein-ligand interactions in the experimental structures. The ligands evaluated in this challenge range from single-atom ions to large flexible organic molecules. More than 1800 submissions were evaluated for their ability to predict 23 different protein-ligand complexes. Overall, the best models could faithfully reproduce the geometries of more than half of the prediction targets. The ligands' size and flexibility were the primary factors influencing the predictions' quality. Small ions and organic molecules with limited flexibility were predicted with high fidelity, while reproducing the binding poses of larger, flexible ligands proved more challenging.
Affiliation(s)
- Xavier Robin
- Biozentrum, University of Basel, Basel, Switzerland
- SIB Swiss Institute of Bioinformatics, Basel, Switzerland
- Gabriel Studer
- Biozentrum, University of Basel, Basel, Switzerland
- SIB Swiss Institute of Bioinformatics, Basel, Switzerland
- Janani Durairaj
- Biozentrum, University of Basel, Basel, Switzerland
- SIB Swiss Institute of Bioinformatics, Basel, Switzerland
- Jerome Eberhardt
- Biozentrum, University of Basel, Basel, Switzerland
- SIB Swiss Institute of Bioinformatics, Basel, Switzerland
- Torsten Schwede
- Biozentrum, University of Basel, Basel, Switzerland
- SIB Swiss Institute of Bioinformatics, Basel, Switzerland
6
Kilian M, Bischofs IB. Co-evolution at protein-protein interfaces guides inference of stoichiometry of oligomeric protein complexes by de novo structure prediction. Mol Microbiol 2023; 120:763-782. [PMID: 37777474 DOI: 10.1111/mmi.15169] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 08/31/2023] [Revised: 09/10/2023] [Accepted: 09/11/2023] [Indexed: 10/02/2023]
Abstract
The quaternary structure with specific stoichiometry is pivotal to the specific function of protein complexes. However, determining the structure of many protein complexes experimentally remains a major bottleneck. Structural bioinformatics approaches, such as the deep learning algorithm Alphafold2-multimer (AF2-multimer), leverage the co-evolution of amino acids and sequence-structure relationships for accurate de novo structure and contact prediction. Pseudo-likelihood maximization direct coupling analysis (plmDCA) has been used to detect co-evolving residue pairs by statistical modeling. Here, we provide evidence that combining both methods can be used for de novo prediction of the quaternary structure and stoichiometry of a protein complex. We achieve this by augmenting the existing AF2-multimer confidence metrics with an interpretable score to identify the complex with an optimal fraction of native contacts of co-evolving residue pairs at intermolecular interfaces. We use this strategy to predict the quaternary structure and non-trivial stoichiometries of Bacillus subtilis spore germination protein complexes with unknown structures. Co-evolution at intermolecular interfaces may therefore synergize with AI-based de novo quaternary structure prediction of structurally uncharacterized bacterial protein complexes.
Affiliation(s)
- Max Kilian
- Max-Planck-Institute for Terrestrial Microbiology, Marburg, Germany
- BioQuant Center for Quantitative Analysis of Molecular and Cellular Biosystems, Heidelberg University, Heidelberg, Germany
- Center for Molecular Biology of Heidelberg University (ZMBH), Heidelberg, Germany
- Ilka B Bischofs
- Max-Planck-Institute for Terrestrial Microbiology, Marburg, Germany
- BioQuant Center for Quantitative Analysis of Molecular and Cellular Biosystems, Heidelberg University, Heidelberg, Germany
- Center for Molecular Biology of Heidelberg University (ZMBH), Heidelberg, Germany
7
McFee M, Kim PM. GDockScore: a graph-based protein-protein docking scoring function. Bioinformatics Advances 2023; 3:vbad072. [PMID: 37359726 PMCID: PMC10290236 DOI: 10.1093/bioadv/vbad072] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/14/2023] [Revised: 05/30/2023] [Accepted: 06/10/2023] [Indexed: 06/28/2023]
Abstract
Summary: Protein complexes play vital roles in a variety of biological processes, such as mediating biochemical reactions, the immune response and cell signalling, with 3D structure specifying function. Computational docking methods provide a means to determine the interface between two complexed polypeptide chains without using time-consuming experimental techniques. The docking process requires the optimal solution to be selected with a scoring function. Here, we propose a novel graph-based deep learning model that utilizes mathematical graph representations of proteins to learn a scoring function (GDockScore). GDockScore was pre-trained on docking outputs generated with the Protein Data Bank biounits and the RosettaDock protocol, and then fine-tuned on HADDOCK decoys generated on the ZDOCK Protein Docking Benchmark. GDockScore performs similarly to the Rosetta scoring function on docking decoys generated using the RosettaDock protocol. Furthermore, state-of-the-art performance is achieved on the CAPRI score set, a challenging dataset for developing docking scoring functions.
Availability and implementation: The model implementation is available at https://gitlab.com/mcfeemat/gdockscore.
Supplementary information: Supplementary data are available at Bioinformatics Advances online.
Affiliation(s)
- Matthew McFee
- Department of Molecular Genetics, The University of Toronto, Toronto, ON M5S 1A8, Canada
- Donnelly Centre for Cellular and Biomolecular Research, The University of Toronto, Toronto, ON M5S 3E1, Canada
8
Shanker S, Sanner MF. Predicting Protein-Peptide Interactions: Benchmarking Deep Learning Techniques and a Comparison with Focused Docking. J Chem Inf Model 2023; 63:3158-3170. [PMID: 37167566 DOI: 10.1021/acs.jcim.3c00602] [Citation(s) in RCA: 5] [Impact Index Per Article: 5.0] [Indexed: 05/13/2023]
Abstract
The accurate prediction of protein structures achieved by deep learning (DL) methods is a significant milestone and has deeply impacted structural biology. Shortly after its release, AlphaFold2 has been evaluated for predicting protein-peptide interactions and shown to significantly outperform RoseTTAfold as well as a conventional blind docking method: PIPER-FlexPepDock. Since then, new AlphaFold2 models, trained specifically to predict multimeric assemblies, have been released and a new ab initio folding model OmegaFold has become available. Here, we assess docking success rates for these new DL folding models and compare their performance with our state-of-the-art, focused peptide-docking software AutoDock CrankPep (ADCP). The evaluation is done using the same dataset and performance metric for all methods. We show that, for a set of 99 nonredundant protein-peptide complexes, the new AlphaFold2 model outperforms other Deep Learning approaches and achieves remarkable docking success rates for peptides. While the docking success rate of ADCP is more modest when considering the top-ranking solution only, it samples correct solutions for around 62% of the complexes. Interestingly, different methods succeed on different complexes, and we describe a consensus docking approach using ADCP and AlphaFold2, which achieves a remarkable 60% for the top-ranking results and 66% for the top 5 results for this set of 99 protein-peptide complexes.
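The top-ranking and top-5 success rates quoted in the abstract above can be illustrated with a minimal sketch. It assumes each complex is represented by a ranked list of booleans marking whether each pose counts as correct; the function name `top_n_success_rate` and the toy data are hypothetical and are not outputs of ADCP or AlphaFold2.

```python
def top_n_success_rate(ranked_hits, n):
    """Fraction of complexes with at least one correct pose among the top n.

    ranked_hits: list of per-complex lists of booleans, best-ranked pose first.
    """
    if not ranked_hits:
        raise ValueError("need at least one complex")
    successes = sum(1 for hits in ranked_hits if any(hits[:n]))
    return successes / len(ranked_hits)


# Hypothetical toy data for four complexes (not taken from the paper).
toy_results = [
    [True, False, False],                 # correct pose ranked first
    [False, True, False],                 # correct pose ranked second
    [False, False, False],                # no correct pose sampled
    [False, False, False, False, True],   # correct pose ranked fifth
]

top1 = top_n_success_rate(toy_results, 1)
top5 = top_n_success_rate(toy_results, 5)
```

A complex counts as a success at top-N as soon as one correct pose appears in its first N ranks, which is why the top-5 rate can never be lower than the top-1 rate.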
Affiliation(s)
- Sudhanshu Shanker
- Department of Integrative Structural and Computational Biology, The Scripps Research Institute, La Jolla, California 92037, United States
- Michel F Sanner
- Department of Integrative Structural and Computational Biology, The Scripps Research Institute, La Jolla, California 92037, United States
9
Esmaeeli R, Bauzá A, Perez A. Structural predictions of protein-DNA binding: MELD-DNA. Nucleic Acids Res 2023; 51:1625-1636. [PMID: 36727436 PMCID: PMC9976882 DOI: 10.1093/nar/gkad013] [Citation(s) in RCA: 8] [Impact Index Per Article: 8.0] [Received: 11/11/2022] [Revised: 12/27/2022] [Accepted: 01/30/2023] [Indexed: 02/03/2023] Open
Abstract
Structural, regulatory and enzymatic proteins interact with DNA to maintain a healthy and functional genome. Yet, our structural understanding of how proteins interact with DNA is limited. We present MELD-DNA, a novel computational approach to predict the structures of protein-DNA complexes. The method combines molecular dynamics simulations with general knowledge or experimental information through Bayesian inference. The physical model is sensitive to sequence-dependent properties and conformational changes required for binding, while information accelerates sampling of bound conformations. MELD-DNA can: (i) sample multiple binding modes; (ii) identify the preferred binding mode from the ensembles; and (iii) provide qualitative binding preferences between DNA sequences. We first assess performance on a dataset of 15 protein-DNA complexes and compare it with state-of-the-art methodologies. Furthermore, for three selected complexes, we show sequence dependence effects of binding in MELD predictions. We expect that the results presented herein, together with the freely available software, will impact structural biology (by complementing DNA structural databases) and molecular recognition (by bringing new insights into aspects governing protein-DNA interactions).
Affiliation(s)
- Reza Esmaeeli
- Department of Chemistry, Quantum theory project, University of Florida, Gainesville, FL 32611, USA
- Antonio Bauzá
- Department of Chemistry, Universitat de les Illes Balears, Palma de Mallorca (Baleares), 07122, Spain
- Alberto Perez
- Department of Chemistry, Quantum theory project, University of Florida, Gainesville, FL 32611, USA
10
Kim HY, Kim S, Park WY, Kim D. G-RANK: an equivariant graph neural network for the scoring of protein-protein docking models. Bioinformatics Advances 2023; 3:vbad011. [PMID: 36818727 PMCID: PMC9927558 DOI: 10.1093/bioadv/vbad011] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/07/2022] [Revised: 01/25/2023] [Accepted: 02/01/2023] [Indexed: 02/05/2023]
Abstract
Motivation: Protein complex structure prediction is important for many applications in bioengineering. A widely used method for predicting the structure of protein complexes is computational docking. Although many tools for scoring protein-protein docking models have been developed, it is still a challenge to accurately identify near-native models for unknown protein complexes. A recently proposed model called the geometric vector perceptron-graph neural network (GVP-GNN), a subtype of equivariant graph neural networks, has demonstrated success in various 3D molecular structure modeling tasks.
Results: Herein, we present G-RANK, a GVP-GNN-based method for the scoring of protein-protein docking models. When evaluated on two different test datasets, G-RANK achieved a performance competitive with or better than the state-of-the-art scoring functions. We expect G-RANK to be a useful tool for various applications in biological engineering.
Availability and implementation: Source code is available at https://github.com/ha01994/grank.
Contact: kds@kaist.ac.kr.
Supplementary information: Supplementary data are available at Bioinformatics Advances online.
Affiliation(s)
- Ha Young Kim
- Department of Bio and Brain Engineering, Korea Advanced Institute of Science and Technology, Daejeon 34141, South Korea
- Woong-Yang Park
- GENINUS Inc., Seoul 05836, South Korea
- Samsung Genome Institute, Samsung Medical Center, Seoul 06351, South Korea
- Department of Molecular Cell Biology, Sungkyunkwan University School of Medicine, Suwon 16419, South Korea
11
Cunha AES, Loureiro RJS, Simões CJV, Brito RMM. Unveiling New Druggable Pockets in Influenza Non-Structural Protein 1: NS1-Host Interactions as Antiviral Targets for Flu. Int J Mol Sci 2023; 24:ijms24032977. [PMID: 36769298 PMCID: PMC9918223 DOI: 10.3390/ijms24032977] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/10/2022] [Revised: 01/19/2023] [Accepted: 02/01/2023] [Indexed: 02/05/2023] Open
Abstract
Influenza viruses are responsible for significant morbidity and mortality worldwide in winter seasonal outbreaks and in flu pandemics. Influenza viruses have a high rate of evolution, requiring annual vaccine updates and severely diminishing the effectiveness of the available antivirals. Identifying novel viral targets and developing new effective antivirals is an urgent need. One of the most promising new targets for influenza antiviral therapy is non-structural protein 1 (NS1), a highly conserved protein exclusively expressed in virus-infected cells that mediates essential functions in virus replication and pathogenesis. Interaction of NS1 with the host proteins PI3K and TRIM25 is paramount for NS1's role in infection and pathogenesis by promoting viral replication through the inhibition of apoptosis and suppressing interferon production, respectively. We, therefore, conducted an analysis of the druggability of this viral protein by performing molecular dynamics simulations on full-length NS1 coupled with ligand pocket detection. We identified several druggable pockets that are partially conserved throughout most of the simulation time. Moreover, we found out that some of these druggable pockets co-localize with the most stable binding regions of the protein-protein interaction (PPI) sites of NS1 with PI3K and TRIM25, which suggests that these NS1 druggable pockets are promising new targets for antiviral development.
Affiliation(s)
- Andreia E. S. Cunha
- Coimbra Chemistry Center—Institute of Molecular Sciences (CQC-IMS), Department of Chemistry, University of Coimbra, 3004-535 Coimbra, Portugal
- Rui J. S. Loureiro
- Coimbra Chemistry Center—Institute of Molecular Sciences (CQC-IMS), Department of Chemistry, University of Coimbra, 3004-535 Coimbra, Portugal
- Correspondence: (R.J.S.L.); (R.M.M.B.)
- Carlos J. V. Simões
- Coimbra Chemistry Center—Institute of Molecular Sciences (CQC-IMS), Department of Chemistry, University of Coimbra, 3004-535 Coimbra, Portugal
- BSIM Therapeutics, Instituto Pedro Nunes, 3030-199 Coimbra, Portugal
- Rui M. M. Brito
- Coimbra Chemistry Center—Institute of Molecular Sciences (CQC-IMS), Department of Chemistry, University of Coimbra, 3004-535 Coimbra, Portugal
- BSIM Therapeutics, Instituto Pedro Nunes, 3030-199 Coimbra, Portugal
- Correspondence: (R.J.S.L.); (R.M.M.B.)
12
Barradas-Bautista D, Almajed A, Oliva R, Kalnis P, Cavallo L. Improving classification of correct and incorrect protein-protein docking models by augmenting the training set. Bioinformatics Advances 2023; 3:vbad012. [PMID: 36789292 PMCID: PMC9923443 DOI: 10.1093/bioadv/vbad012] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/27/2022] [Revised: 01/20/2023] [Accepted: 02/01/2023] [Indexed: 02/04/2023]
Abstract
Motivation: Protein-protein interactions drive many relevant biological events, such as infection, replication and recognition. To control or engineer such events, we need to access the molecular details of the interaction provided by experimental 3D structures. However, such experiments take time and are expensive; moreover, the current technology cannot keep up with the high discovery rate of new interactions. Computational modeling, like protein-protein docking, can help to fill this gap by generating docking poses. Protein-protein docking generally consists of two parts, sampling and scoring. The sampling is an exhaustive search of the three-dimensional space. The caveat of the sampling is that it generates a large number of incorrect poses, producing a highly unbalanced dataset. This limits the utility of the data to train machine learning classifiers.
Results: Using weak supervision, we developed a data augmentation method that we named hAIkal. Using hAIkal, we increased the labeled training data to train several algorithms. We trained and obtained different classifiers; the best classifier has 81% accuracy and 0.51 Matthews' correlation coefficient on the test set, surpassing the state-of-the-art scoring functions.
Availability and implementation: Docking models from Benchmark 5 are available at https://doi.org/10.5281/zenodo.4012018. Processed tabular data are available at https://repository.kaust.edu.sa/handle/10754/666961. A Google Colab notebook is available at https://colab.research.google.com/drive/1vbVrJcQSf6_C3jOAmZzgQbTpuJ5zC1RP?usp=sharing.
Supplementary information: Supplementary data are available at Bioinformatics Advances online.
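The abstract above reports both accuracy and Matthews' correlation coefficient (MCC) for a highly unbalanced set of docking poses. As a minimal sketch of why the two are usually quoted together, the following computes both from confusion-matrix counts. The counts are hypothetical illustrations and are not results from the paper.

```python
import math


def accuracy(tp, tn, fp, fn):
    """Fraction of all poses that were classified correctly."""
    return (tp + tn) / (tp + tn + fp + fn)


def mcc(tp, tn, fp, fn):
    """Matthews' correlation coefficient; returns 0.0 when it is undefined."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0
    return (tp * tn - fp * fn) / denom


# Hypothetical set: 1000 poses, of which only 50 are correct.
# A trivial classifier that labels every pose "incorrect":
trivial = dict(tp=0, tn=950, fp=0, fn=50)
# A classifier that actually finds some of the correct poses:
useful = dict(tp=30, tn=900, fp=50, fn=20)

trivial_acc, trivial_mcc = accuracy(**trivial), mcc(**trivial)
useful_acc, useful_mcc = accuracy(**useful), mcc(**useful)
```

In this toy case the trivial classifier has the higher accuracy but an MCC of zero, while the useful classifier has a clearly positive MCC, which is the reason accuracy alone says little on unbalanced pose sets.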
Affiliation(s)
- Ali Almajed
- Computer, Electrical and Mathematical Science and Engineering Division, Kaust Extreme Computing Center, King Abdullah University of Science and Technology (KAUST), Thuwal 23955-6900, Saudi Arabia
- Romina Oliva
- Department of Sciences and Technologies, University of Naples “Parthenope”, I-80143 Naples, Italy
- Panos Kalnis
- Computer, Electrical and Mathematical Science and Engineering Division, Kaust Extreme Computing Center, King Abdullah University of Science and Technology (KAUST), Thuwal 23955-6900, Saudi Arabia
- Luigi Cavallo
- Physical Sciences and Engineering Division, Kaust Catalysis Center, King Abdullah University of Science and Technology (KAUST), Thuwal 23955-6900, Saudi Arabia
13
Ambrosetti F, Jandova Z, Bonvin AMJJ. Information-Driven Antibody-Antigen Modelling with HADDOCK. Methods Mol Biol 2023; 2552:267-282. [PMID: 36346597 DOI: 10.1007/978-1-0716-2609-2_14] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/16/2023]
Abstract
In recent years, the therapeutic use of antibodies has seen huge growth, due to their inherent properties and technological advances in the methods used to study and characterize them. Effective design and engineering of antibodies for therapeutic purposes are heavily dependent on knowledge of the structural principles that regulate antibody-antigen interactions. Several experimental techniques such as X-ray crystallography, cryo-electron microscopy, NMR, or mutagenesis analysis can be applied, but these are usually expensive and time-consuming. Therefore, computational approaches like molecular docking may offer a valuable alternative for the characterization of antibody-antigen complexes. Here we describe a protocol for the prediction of the 3D structure of antibody-antigen complexes using the integrative modelling platform HADDOCK. The protocol consists of (1) the identification of the antibody residues belonging to the hypervariable loops, which are known to be crucial for binding and can be used to guide the docking, and (2) the detailed steps to perform docking with the HADDOCK 2.4 webserver following different strategies depending on the availability of information about epitope residues.
Affiliation(s)
- Francesco Ambrosetti
- Computational Structural Biology Group, Bijvoet Centre for Biomolecular Research, Faculty of Science - Chemistry, Utrecht University, Utrecht, The Netherlands
- Zuzana Jandova
- Computational Structural Biology Group, Bijvoet Centre for Biomolecular Research, Faculty of Science - Chemistry, Utrecht University, Utrecht, The Netherlands
- Alexandre M J J Bonvin
- Computational Structural Biology Group, Bijvoet Centre for Biomolecular Research, Faculty of Science - Chemistry, Utrecht University, Utrecht, The Netherlands.
14
Drake ZC, Seffernick JT, Lindert S. Protein complex prediction using Rosetta, AlphaFold, and mass spectrometry covalent labeling. Nat Commun 2022; 13:7846. [PMID: 36543826 PMCID: PMC9772387 DOI: 10.1038/s41467-022-35593-8] [Citation(s) in RCA: 8] [Impact Index Per Article: 4.0] [Received: 07/11/2022] [Accepted: 12/09/2022] [Indexed: 12/24/2022] Open
Abstract
Covalent labeling (CL) in combination with mass spectrometry can be used as an analytical tool to study and determine structural properties of protein-protein complexes. However, data from these experiments is sparse and does not unambiguously elucidate protein structure. Thus, computational algorithms are needed to deduce structure from the CL data. In this work, we present a hybrid method that combines models of protein complex subunits generated with AlphaFold with differential CL data via a CL-guided protein-protein docking in Rosetta. In a benchmark set, the RMSD (root-mean-square deviation) of the best-scoring models was below 3.6 Å for 5/5 complexes with inclusion of CL data, whereas the same quality was only achieved for 1/5 complexes without CL data. This study suggests that our integrated approach can successfully use data obtained from CL experiments to distinguish between nativelike and non-nativelike models.
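The RMSD criterion used in the abstract above can be sketched as follows. This is a minimal illustration that assumes the model and reference coordinates are already superimposed and listed in the same atom order; the function name `rmsd` and the toy coordinates are hypothetical, and only the 3.6 Å cutoff comes from the abstract.

```python
import math


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two matched lists of (x, y, z) points.

    Assumes both structures are already superimposed and atom-matched.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and equal in length")
    squared = sum(
        (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
        for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b)
    )
    return math.sqrt(squared / len(coords_a))


# Hypothetical two-atom toy structures, displaced by 3 Å along z.
model = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
reference = [(0.0, 0.0, 3.0), (1.0, 0.0, 3.0)]

value = rmsd(model, reference)
within_cutoff = value < 3.6  # cutoff in Å quoted in the abstract
```

In practice the superposition step (for example a least-squares fit) is carried out before this calculation; it is omitted here to keep the sketch short.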
Affiliation(s)
- Zachary C Drake
- Department of Chemistry and Biochemistry, Ohio State University, Columbus, OH, 43210, US
- Justin T Seffernick
- Department of Chemistry and Biochemistry, Ohio State University, Columbus, OH, 43210, US
- Steffen Lindert
- Department of Chemistry and Biochemistry, Ohio State University, Columbus, OH, 43210, US.
15
Cohen T, Halfon M, Carter L, Sharkey B, Jain T, Sivasubramanian A, Schneidman-Duhovny D. Multi-state modeling of antibody-antigen complexes with SAXS profiles and deep-learning models. Methods Enzymol 2022; 678:237-262. [PMID: 36641210 DOI: 10.1016/bs.mie.2022.11.003] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/13/2022]
Abstract
Antibodies are an established class of human therapeutics. Epitope characterization is an important part of therapeutic antibody discovery. However, structural characterization of antibody-antigen complexes remains challenging. On the one hand, X-ray crystallography or cryo-electron microscopy provide atomic resolution characterization of the epitope, but the data collection process is typically long and the success rate is low. On the other hand, computational methods for modeling antibody-antigen structures from the individual components frequently suffer from a high false positive rate, rarely resulting in a unique solution. Recent deep learning models for structure prediction are also successful in predicting protein-protein complexes. However, they do not perform well for antibody-antigen complexes. Small Angle X-ray Scattering (SAXS) is a reliable technique for rapid structural characterization of protein samples in solution albeit at low resolution. Here, we present an integrative approach for modeling antigen-antibody complexes using the antibody sequence, antigen structure, and experimentally determined SAXS profiles of the antibody, antigen, and the complex. The method models antibody structures using a novel deep-learning approach, NanoNet. The structures of the antibodies and antigens are represented using multiple 3D conformations to account for compositional and conformational heterogeneity of the protein samples that are used to collect the SAXS data. The complexes are predicted by integrating the SAXS profiles with scoring functions for protein-protein interfaces that are based on statistical potentials and antibody-specific deep-learning models. We validated the method via application to four Fab:EGFR and one Fab:PCSK9 antibody:antigen complexes with experimentally available SAXS datasets. 
The integrative approach returns accurate predictions (interface RMSD < 4 Å) in the top five predictions for four out of five complexes (respective interface RMSD values of 1.95, 2.18, 2.66 and 3.87 Å), providing support for the utility of such a computational pipeline for epitope characterization during therapeutic antibody discovery.
Affiliation(s)
- Tomer Cohen
  - The Rachel and Selim Benin School of Computer Science and Engineering, The Hebrew University of Jerusalem, Jerusalem, Israel
- Matan Halfon
  - The Rachel and Selim Benin School of Computer Science and Engineering, The Hebrew University of Jerusalem, Jerusalem, Israel
- Lester Carter
  - Stanford Synchrotron Radiation Lightsource, SLAC National Accelerator Laboratory, Menlo Park, CA, United States
- Beth Sharkey
  - High-Throughput Expression, Adimab LLC, Lebanon, NH, United States
- Tushar Jain
  - Computational Biology, Adimab LLC, Palo Alto, CA, United States
- Dina Schneidman-Duhovny
  - The Rachel and Selim Benin School of Computer Science and Engineering, The Hebrew University of Jerusalem, Jerusalem, Israel
16
Ranaudo A, Cosentino U, Greco C, Moro G, Bonardi A, Maiocchi A, Moroni E. Evaluation of docking procedures reliability in affitins-partners interactions. Front Chem 2022; 10:1074249. [DOI: 10.3389/fchem.2022.1074249] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/19/2022] [Accepted: 11/17/2022] [Indexed: 12/02/2022]
Abstract
Affitins constitute a class of small proteins belonging to Sul7d family, which, in microorganisms such as Sulfolobus acidocaldarius, bind DNA preventing its denaturation. Thanks to their stability and small size (60–66 residues in length) they have been considered as ideal candidates for engineering and have been used for more than 10 years now, for different applications. The individuation of a mutant able to recognize a specific target does not imply the knowledge of the binding geometry between the two proteins. However, its identification is of undoubted importance but not always experimentally accessible. For this reason, computational approaches such as protein-protein docking can be helpful for an initial structural characterization of the complex. This method, which produces tens of putative binding geometries ordered according to a binding score, needs to be followed by a further reranking procedure for finding the most plausible one. In the present paper, we use the server ClusPro for generating docking models of affitins with different protein partners whose experimental structures are available in the Protein Data Bank. Then, we apply two protocols for reranking the docking models. The first one investigates their stability by means of Molecular Dynamics simulations; the second one, instead, compares the docking models with the interacting residues predicted by the Matrix of Local Coupling Energies method. Results show that the more efficient way to deal with the reranking problem is to consider the information given by the two protocols together, i.e. employing a consensus approach.
17
Shao N, Fan Y, Chou CW, Yavari S, Williams RV, Amster IJ, Brown SM, Drake IJ, Duin EC, Whitman WB, Liu Y. Expression of divergent methyl/alkyl coenzyme M reductases from uncultured archaea. Commun Biol 2022; 5:1113. [PMID: 36266535 PMCID: PMC9584954 DOI: 10.1038/s42003-022-04057-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/06/2022] [Accepted: 09/30/2022] [Indexed: 11/08/2022]
Abstract
Methanogens and anaerobic methane-oxidizing archaea (ANME) are important players in the global carbon cycle. Methyl-coenzyme M reductase (MCR) is a key enzyme in methane metabolism, catalyzing the last step in methanogenesis and the first step in anaerobic methane oxidation. Divergent mcr and mcr-like genes have recently been identified in uncultured archaeal lineages. However, the assembly and biochemistry of MCRs from uncultured archaea remain largely unknown. Here we present an approach to study MCRs from uncultured archaea by heterologous expression in a methanogen, Methanococcus maripaludis. Promoter, operon structure, and temperature were important determinants for MCR production. Both recombinant methanococcal and ANME-2 MCR assembled with the host MCR forming hybrid complexes, whereas tested ANME-1 MCR and ethyl-coenzyme M reductase only formed homogenous complexes. Together with structural modeling, this suggests that ANME-2 and methanogen MCRs are structurally similar and their reaction directions are likely regulated by thermodynamics rather than intrinsic structural differences.
Affiliation(s)
- Nana Shao
  - Department of Microbiology, University of Georgia, Athens, GA, USA
- Yu Fan
  - EMTEC IT, ExxonMobil Technical Computing Company, Annandale, NJ, USA
- Chau-Wen Chou
  - Department of Chemistry, University of Georgia, Athens, GA, USA
- Shadi Yavari
  - Department of Chemistry and Biochemistry, Auburn University, Auburn, AL, USA
- Stuart M Brown
  - Energy Sciences, ExxonMobil Technology & Engineering Company, Annandale, NJ, USA
- Ian J Drake
  - Biomedical Sciences, ExxonMobil Technology & Engineering Company, Annandale, NJ, USA
- Evert C Duin
  - Department of Chemistry and Biochemistry, Auburn University, Auburn, AL, USA
- Yuchen Liu
  - Energy Sciences, ExxonMobil Technology & Engineering Company, Annandale, NJ, USA
18
Dixon T, MacPherson D, Mostofian B, Dauzhenka T, Lotz S, McGee D, Shechter S, Shrestha UR, Wiewiora R, McDargh ZA, Pei F, Pal R, Ribeiro JV, Wilkerson T, Sachdeva V, Gao N, Jain S, Sparks S, Li Y, Vinitsky A, Zhang X, Razavi AM, Kolossváry I, Imbriglio J, Evdokimov A, Bergeron L, Zhou W, Adhikari J, Ruprecht B, Dickson A, Xu H, Sherman W, Izaguirre JA. Predicting the structural basis of targeted protein degradation by integrating molecular dynamics simulations with structural mass spectrometry. Nat Commun 2022; 13:5884. [PMID: 36202813 PMCID: PMC9537307 DOI: 10.1038/s41467-022-33575-4] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 02/01/2022] [Accepted: 09/20/2022] [Indexed: 11/09/2022]
Abstract
Targeted protein degradation (TPD) is a promising approach in drug discovery for degrading proteins implicated in diseases. A key step in this process is the formation of a ternary complex where a heterobifunctional molecule induces proximity of an E3 ligase to a protein of interest (POI), thus facilitating ubiquitin transfer to the POI. In this work, we characterize 3 steps in the TPD process. (1) We simulate the ternary complex formation of SMARCA2 bromodomain and VHL E3 ligase by combining hydrogen-deuterium exchange mass spectrometry with weighted ensemble molecular dynamics (MD). (2) We characterize the conformational heterogeneity of the ternary complex using Hamiltonian replica exchange simulations and small-angle X-ray scattering. (3) We assess the ubiquitination of the POI in the context of the full Cullin-RING Ligase, confirming experimental ubiquitinomics results. Differences in degradation efficiency can be explained by the proximity of lysine residues on the POI relative to ubiquitin.
Affiliation(s)
- Tom Dixon
  - Roivant Discovery, New York City, NY, 10036, USA
  - Department of Computational Mathematics, Science, and Engineering, Michigan State University, East Lansing, MI, 48824, USA
  - Department of Biochemistry and Molecular Biology, Michigan State University, East Lansing, MI, 48824, USA
- Samuel Lotz
  - Roivant Discovery, New York City, NY, 10036, USA
- Dwight McGee
  - Roivant Discovery, New York City, NY, 10036, USA
- Fen Pei
  - Roivant Discovery, New York City, NY, 10036, USA
- Rajat Pal
  - Roivant Discovery, New York City, NY, 10036, USA
- Ning Gao
  - Roivant Discovery, New York City, NY, 10036, USA
- Shourya Jain
  - Roivant Discovery, New York City, NY, 10036, USA
- Yunxing Li
  - Roivant Discovery, New York City, NY, 10036, USA
- Xin Zhang
  - Roivant Discovery, New York City, NY, 10036, USA
- Alex Dickson
  - Department of Computational Mathematics, Science, and Engineering, Michigan State University, East Lansing, MI, 48824, USA
  - Department of Biochemistry and Molecular Biology, Michigan State University, East Lansing, MI, 48824, USA
- Huafeng Xu
  - Roivant Discovery, New York City, NY, 10036, USA
19
Zhang W, Roy Burman SS, Chen J, Donovan KA, Cao Y, Shu C, Zhang B, Zeng Z, Gu S, Zhang Y, Li D, Fischer ES, Tokheim C, Shirley Liu X. Machine Learning Modeling of Protein-intrinsic Features Predicts Tractability of Targeted Protein Degradation. Genomics Proteomics Bioinformatics 2022; 20:882-898. [PMID: 36494034 PMCID: PMC10025769 DOI: 10.1016/j.gpb.2022.11.008] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 07/15/2022] [Revised: 10/25/2022] [Accepted: 11/04/2022] [Indexed: 12/12/2022]
Abstract
Targeted protein degradation (TPD) has rapidly emerged as a therapeutic modality to eliminate previously undruggable proteins by repurposing the cell's endogenous protein degradation machinery. However, the susceptibility of proteins for targeting by TPD approaches, termed "degradability", is largely unknown. Here, we developed a machine learning model, model-free analysis of protein degradability (MAPD), to predict degradability from features intrinsic to protein targets. MAPD shows accurate performance in predicting kinases that are degradable by TPD compounds [with an area under the precision-recall curve (AUPRC) of 0.759 and an area under the receiver operating characteristic curve (AUROC) of 0.775] and is likely generalizable to independent non-kinase proteins. We found five features with statistical significance to achieve optimal prediction, with ubiquitination potential being the most predictive. By structural modeling, we found that E2-accessible ubiquitination sites, but not lysine residues in general, are particularly associated with kinase degradability. Finally, we extended MAPD predictions to the entire proteome to find 964 disease-causing proteins (including proteins encoded by 278 cancer genes) that may be tractable to TPD drug development.
Affiliation(s)
- Wubing Zhang
  - Department of Data Science, Dana-Farber Cancer Institute, Boston, MA 02215, USA; Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, MA 02115, USA
- Shourya S Roy Burman
  - Department of Cancer Biology, Dana-Farber Cancer Institute, Boston, MA 02215, USA; Department of Biological Chemistry and Molecular Pharmacology, Harvard Medical School, Boston, MA 02115, USA
- Jiaye Chen
  - Department of Biomedical Informatics, Harvard Medical School, Boston, MA 02115, USA
- Katherine A Donovan
  - Department of Cancer Biology, Dana-Farber Cancer Institute, Boston, MA 02215, USA; Department of Biological Chemistry and Molecular Pharmacology, Harvard Medical School, Boston, MA 02115, USA
- Yang Cao
  - Center of Growth, Metabolism, and Aging, Key Laboratory of Bio-resource and Eco-environment, Ministry of Education, College of Life Sciences, Sichuan University, Chengdu 610064, China
- Chelsea Shu
  - Department of Cancer Biology, Dana-Farber Cancer Institute, Boston, MA 02215, USA; Research Scholar Initiative, Graduate School of Arts and Sciences, Harvard University, Cambridge, MA 02138, USA
- Boning Zhang
  - Department of Data Science, Dana-Farber Cancer Institute, Boston, MA 02215, USA; Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, MA 02115, USA
- Zexian Zeng
  - Department of Data Science, Dana-Farber Cancer Institute, Boston, MA 02215, USA; Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, MA 02115, USA
- Shengqing Gu
  - Department of Data Science, Dana-Farber Cancer Institute, Boston, MA 02215, USA; Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, MA 02115, USA
- Yi Zhang
  - Department of Data Science, Dana-Farber Cancer Institute, Boston, MA 02215, USA; Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, MA 02115, USA
- Dian Li
  - Department of Data Science, Dana-Farber Cancer Institute, Boston, MA 02215, USA; Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, MA 02115, USA
- Eric S Fischer
  - Department of Cancer Biology, Dana-Farber Cancer Institute, Boston, MA 02215, USA; Department of Biological Chemistry and Molecular Pharmacology, Harvard Medical School, Boston, MA 02115, USA
- Collin Tokheim
  - Department of Data Science, Dana-Farber Cancer Institute, Boston, MA 02215, USA; Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, MA 02115, USA
- X Shirley Liu
  - Department of Data Science, Dana-Farber Cancer Institute, Boston, MA 02215, USA; Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, MA 02115, USA
20
Tao H, Zhao X, Zhang K, Lin P, Huang SY. Docking cyclic peptides formed by a disulfide bond through a hierarchical strategy. Bioinformatics 2022; 38:4109-4116. [PMID: 35801933 DOI: 10.1093/bioinformatics/btac486] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/11/2022] [Revised: 05/06/2022] [Accepted: 07/07/2022] [Indexed: 12/24/2022]
Abstract
MOTIVATION: Cyclization is a common strategy to enhance the therapeutic potential of peptides. Many cyclic peptide drugs have been approved for clinical use, in which the disulfide-driven cyclic peptide is one of the most prevalent categories. Molecular docking is a powerful computational method to predict the binding modes of molecules. For protein-cyclic peptide docking, a big challenge is considering the flexibility of peptides with conformers constrained by cyclization. RESULTS: Integrating our efficient peptide 3D conformation sampling algorithm MODPEP2.0 and knowledge-based scoring function ITScorePP, we have proposed an extended version of our hierarchical peptide docking algorithm, named HPEPDOCK2.0, to predict the binding modes of the peptide cyclized through a disulfide against a protein. Our HPEPDOCK2.0 approach was extensively evaluated on diverse test sets and compared with the state-of-the-art cyclic peptide docking program AutoDock CrankPep (ADCP). On a benchmark dataset of 18 cyclic peptide-protein complexes, HPEPDOCK2.0 obtained a native contact fraction of above 0.5 for 61% of the cases when the top prediction was considered, compared with 39% for ADCP. On a larger test set of 25 cyclic peptide-protein complexes, HPEPDOCK2.0 yielded a success rate of 44% for the top prediction, compared with 20% for ADCP. In addition, HPEPDOCK2.0 was also validated on two other test sets of 10 and 11 complexes with apo and predicted receptor structures, respectively. HPEPDOCK2.0 is computationally efficient and the average running time for docking a cyclic peptide is about 34 min on a single CPU core, compared with 496 min for ADCP. HPEPDOCK2.0 will facilitate the study of the interaction between cyclic peptides and proteins and the development of therapeutic cyclic peptide drugs. AVAILABILITY AND IMPLEMENTATION: http://huanglab.phys.hust.edu.cn/hpepdock/. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
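The native contact fraction used as a success measure above can be sketched as the share of native interface contacts that a predicted model reproduces. This is a minimal illustration, not the evaluation code of HPEPDOCK2.0: the contact representation (pairs of hypothetical residue identifiers) and the function name are assumptions, and the distance cutoff that defines a contact is left to whoever builds the contact sets.

```python
def native_contact_fraction(native_contacts, predicted_contacts):
    """Fraction of native receptor-peptide contacts recovered by a prediction.

    Each contact is a (receptor_residue, peptide_residue) pair; how contacts
    are detected (e.g. which distance cutoff) is decided by the caller.
    """
    native = set(native_contacts)
    if not native:
        raise ValueError("the native complex must have at least one contact")
    recovered = native & set(predicted_contacts)
    return len(recovered) / len(native)


# Toy example with hypothetical residue labels.
native = {("A45", "P2"), ("A47", "P3"), ("A90", "P5"), ("A91", "P6")}
predicted = {("A45", "P2"), ("A47", "P3"), ("A90", "P5"), ("A12", "P1")}
print(native_contact_fraction(native, predicted))
```

Under the threshold quoted in the abstract, a prediction like this toy one would count as a success because its fraction exceeds 0.5.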
Affiliation(s)
- Huanyu Tao
  - School of Physics, Huazhong University of Science and Technology, Wuhan, Hubei 430074, China
- Xuejun Zhao
  - School of Physics, Huazhong University of Science and Technology, Wuhan, Hubei 430074, China
- Keqiong Zhang
  - School of Physics, Huazhong University of Science and Technology, Wuhan, Hubei 430074, China
- Peicong Lin
  - School of Physics, Huazhong University of Science and Technology, Wuhan, Hubei 430074, China
- Sheng-You Huang
  - School of Physics, Huazhong University of Science and Technology, Wuhan, Hubei 430074, China
21
Aderinwale T, Christoffer C, Kihara D. RL-MLZerD: Multimeric protein docking using reinforcement learning. Front Mol Biosci 2022; 9:969394. [PMID: 36090027 PMCID: PMC9459051 DOI: 10.3389/fmolb.2022.969394] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/15/2022] [Accepted: 08/08/2022] [Indexed: 11/24/2022]
Abstract
Numerous biological processes in a cell are carried out by protein complexes. To understand the molecular mechanisms of such processes, it is crucial to know the quaternary structures of the complexes. Although the structures of protein complexes have been determined by biophysical experiments at a rapid pace, there are still many important complex structures that are yet to be determined. To supplement experimental structure determination of complexes, many computational protein docking methods have been developed; however, most of these docking methods are designed only for docking with two chains. Here, we introduce a novel method, RL-MLZerD, which builds multiple protein complexes using reinforcement learning (RL). In RL-MLZerD a multi-chain assembly process is considered as a series of episodes of selecting and integrating pre-computed pairwise docking models in a RL framework. RL is effective in correctly selecting plausible pairwise models that fit well with other subunits in a complex. When tested on a benchmark dataset of protein complexes with three to five chains, RL-MLZerD showed better modeling performance than other existing multiple docking methods under different evaluation criteria, except against AlphaFold-Multimer in unbound docking. Also, it emerged that the docking order of multi-chain complexes can be naturally predicted by examining preferred paths of episodes in the RL computation.
Affiliation(s)
- Tunde Aderinwale
  - Department of Computer Science, Purdue University, West Lafayette, IN, United States
- Charles Christoffer
  - Department of Computer Science, Purdue University, West Lafayette, IN, United States
- Daisuke Kihara
  - Department of Computer Science, Purdue University, West Lafayette, IN, United States
  - Department of Biological Sciences, Purdue University, West Lafayette, IN, United States
  - *Correspondence: Daisuke Kihara,
22
Verburgt J, Zhang Z, Kihara D. Multi-level analysis of intrinsically disordered protein docking methods. Methods 2022; 204:55-63. [PMID: 35609776 PMCID: PMC9701586 DOI: 10.1016/j.ymeth.2022.05.006] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 03/31/2022] [Revised: 05/17/2022] [Accepted: 05/19/2022] [Indexed: 12/29/2022]
Abstract
Intrinsically Disordered Proteins (IDPs) are a class of proteins in which at least some region of the protein does not possess any stable structure in solution in the physiological condition but may adopt an ordered structure upon binding to a globular receptor. These IDP-receptor complexes are thus subject to protein complex modeling in which computational techniques are applied to accurately reproduce the IDP ligand-receptor interactions. This often exists in the form of protein docking, in which the 3D structures of both the subunits are known, but the position of the ligand relative to the receptor is not. Here, we evaluate the performance of three IDP-receptor modeling tools with metrics that characterize the IDP-receptor interface at various resolutions. We show that all three methods are able to properly identify the general binding site, as identified by lower resolution metrics, but begin to struggle with higher resolution metrics that capture biophysical interactions.
Affiliation(s)
- Jacob Verburgt
  - Department of Biological Sciences, Purdue University, West Lafayette, IN, 47907, USA
- Zicong Zhang
  - Department of Computer Science, Purdue University, West Lafayette, IN, 47907, USA
- Daisuke Kihara
  - Department of Biological Sciences, Purdue University, West Lafayette, IN, 47907, USA; Department of Computer Science, Purdue University, West Lafayette, IN, 47907, USA; Purdue University Center for Cancer Research, Purdue University, West Lafayette, IN, 47907, USA; Corresponding Author
23
Zhang W, Meng Q, Wang J, Guo F. HDIContact: a novel predictor of residue-residue contacts on hetero-dimer interfaces via sequential information and transfer learning strategy. Brief Bioinform 2022; 23:6599074. [PMID: 35653713 DOI: 10.1093/bib/bbac169] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Received: 01/17/2022] [Revised: 03/07/2022] [Accepted: 04/16/2022] [Indexed: 11/12/2022]
Abstract
Proteins maintain the functional order of cell in life by interacting with other proteins. Determination of protein complex structural information gives biological insights for the research of diseases and drugs. Recently, a breakthrough has been made in protein monomer structure prediction. However, due to the limited number of the known protein structure and homologous sequences of complexes, the prediction of residue-residue contacts on hetero-dimer interfaces is still a challenge. In this study, we have developed a deep learning framework for inferring inter-protein residue contacts from sequential information, called HDIContact. We utilized transfer learning strategy to produce Multiple Sequence Alignment (MSA) two-dimensional (2D) embedding based on patterns of concatenated MSA, which could reduce the influence of noise on MSA caused by mismatched sequences or less homology. For MSA 2D embedding, HDIContact took advantage of Bi-directional Long Short-Term Memory (BiLSTM) with two-channel to capture 2D context of residue pairs. Our comprehensive assessment on the Escherichia coli (E. coli) test dataset showed that HDIContact outperformed other state-of-the-art methods, with top precision of 65.96%, the Area Under the Receiver Operating Characteristic curve (AUROC) of 83.08% and the Area Under the Precision Recall curve (AUPR) of 25.02%. In addition, we analyzed the potential of HDIContact for human-virus protein-protein complexes, by achieving top five precision of 80% on O75475-P04584 related to Human Immunodeficiency Virus. All experiments indicated that our method was a valuable technical tool for predicting inter-protein residue contacts, which would be helpful for understanding protein-protein interaction mechanisms.
Affiliation(s)
- Wei Zhang
  - School of Computer Science and Technology, College of Intelligence and Computing, Tianjin University, Tianjin, China
- Qiaozhen Meng
  - School of Computer Science and Technology, College of Intelligence and Computing, Tianjin University, Tianjin, China
- Jianxin Wang
  - School of Computer Science and Engineering, Central South University, Changsha 410083, China
- Fei Guo
  - School of Computer Science and Engineering, Central South University, Changsha 410083, China
24
Induced fit with replica exchange improves protein complex structure prediction. PLoS Comput Biol 2022; 18:e1010124. [PMID: 35658008 PMCID: PMC9200320 DOI: 10.1371/journal.pcbi.1010124] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 12/22/2021] [Revised: 06/15/2022] [Accepted: 04/20/2022] [Indexed: 11/19/2022]
Abstract
Despite the progress in prediction of protein complexes over the last decade, recent blind protein complex structure prediction challenges revealed limited success rates (less than 20% models with DockQ score > 0.4) on targets that exhibit significant conformational change upon binding. To overcome limitations in capturing backbone motions, we developed a new, aggressive sampling method that incorporates temperature replica exchange Monte Carlo (T-REMC) and conformational sampling techniques within docking protocols in Rosetta. Our method, ReplicaDock 2.0, mimics induced-fit mechanism of protein binding to sample backbone motions across putative interface residues on-the-fly, thereby recapitulating binding-partner induced conformational changes. Furthermore, ReplicaDock 2.0 clocks in at 150-500 CPU hours per target (protein-size dependent); a runtime that is significantly faster than Molecular Dynamics based approaches. For a benchmark set of 88 proteins with moderate to high flexibility (unbound-to-bound iRMSD over 1.2 Å), ReplicaDock 2.0 successfully docks 61% of moderately flexible complexes and 35% of highly flexible complexes. Additionally, we demonstrate that by biasing backbone sampling particularly towards residues comprising flexible loops or hinge domains, highly flexible targets can be predicted to under 2 Å accuracy. This indicates that additional gains are possible when mobile protein segments are known.
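To make the temperature replica exchange idea above more concrete, the sketch below shows the standard Metropolis criterion for swapping configurations between two replicas held at different temperatures. It is a generic textbook form, not the ReplicaDock 2.0 code in Rosetta; the function name, the use of reduced units (energies and `kT` in the same units), and the example numbers are all assumptions for illustration.

```python
import math


def swap_probability(energy_i, energy_j, kt_i, kt_j):
    """Acceptance probability for exchanging the configurations of two replicas.

    Uses the standard criterion min(1, exp[(beta_i - beta_j) * (E_i - E_j)]),
    where beta = 1 / kT and E is the energy of each replica's configuration.
    """
    if kt_i <= 0 or kt_j <= 0:
        raise ValueError("temperatures must be positive")
    beta_i = 1.0 / kt_i
    beta_j = 1.0 / kt_j
    delta = (beta_i - beta_j) * (energy_i - energy_j)
    # A non-negative exponent always gives acceptance; this also avoids overflow.
    return 1.0 if delta >= 0 else math.exp(delta)


# The cold replica (kT = 1) holds the higher-energy state, so the swap is always accepted.
print(swap_probability(-10.0, -12.0, 1.0, 2.0))
# The cold replica already holds the lower-energy state, so the swap is accepted only sometimes.
print(swap_probability(-12.0, -10.0, 1.0, 2.0))
```

Swaps of this kind let low-energy conformations found at high temperature migrate to the low-temperature replica, which is the general motivation for replica exchange sampling.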
25
Charitou V, van Keulen SC, Bonvin AMJJ. Cyclization and Docking Protocol for Cyclic Peptide-Protein Modeling Using HADDOCK2.4. J Chem Theory Comput 2022; 18:4027-4040. [PMID: 35652781 PMCID: PMC9202357 DOI: 10.1021/acs.jctc.2c00075] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Indexed: 11/28/2022]
Abstract
An emerging class of therapeutic molecules are cyclic peptides with over 40 cyclic peptide drugs currently in clinical use. Their mode of action is, however, not fully understood, impeding rational drug design. Computational techniques could positively impact their design, but modeling them and their interactions remains challenging due to their cyclic nature and their flexibility. This study presents a step-by-step protocol for generating cyclic peptide conformations and docking them to their protein target using HADDOCK2.4. A dataset of 30 cyclic peptide-protein complexes was used to optimize both cyclization and docking protocols. It supports peptides cyclized via an N- and C-terminus peptide bond and/or a disulfide bond. An ensemble of cyclic peptide conformations is then used in HADDOCK to dock them onto their target protein using knowledge of the binding site on the protein side to drive the modeling. The presented protocol predicts at least one acceptable model according to the critical assessment of prediction of interaction criteria for each complex of the dataset when the top 10 HADDOCK-ranked single structures are considered (100% success rate top 10) both in the bound and unbound docking scenarios. Moreover, its performance in both bound and fully unbound docking is similar to the state-of-the-art software in the field, Autodock CrankPep. The presented cyclization and docking protocol should make HADDOCK a valuable tool for rational cyclic peptide-based drug design and high-throughput screening.
Affiliation(s)
- Vicky Charitou
  - Computational Structural Biology Group, Bijvoet Centre for Biomolecular Research, Science for Life, Faculty of Science - Chemistry, Utrecht University, Padualaan 8, Utrecht 3584 CH, The Netherlands
- Siri C van Keulen
  - Computational Structural Biology Group, Bijvoet Centre for Biomolecular Research, Science for Life, Faculty of Science - Chemistry, Utrecht University, Padualaan 8, Utrecht 3584 CH, The Netherlands
- Alexandre M J J Bonvin
  - Computational Structural Biology Group, Bijvoet Centre for Biomolecular Research, Science for Life, Faculty of Science - Chemistry, Utrecht University, Padualaan 8, Utrecht 3584 CH, The Netherlands
26
Marzella DF, Parizi FM, van Tilborg D, Renaud N, Sybrandi D, Buzatu R, Rademaker DT, 't Hoen PAC, Xue LC. PANDORA: A Fast, Anchor-Restrained Modelling Protocol for Peptide: MHC Complexes. Front Immunol 2022; 13:878762. [PMID: 35619705 PMCID: PMC9127323 DOI: 10.3389/fimmu.2022.878762] [Citation(s) in RCA: 12] [Impact Index Per Article: 6.0] [Received: 02/18/2022] [Accepted: 04/07/2022] [Indexed: 11/21/2022]
Abstract
Deeper understanding of T-cell-mediated adaptive immune responses is important for the design of cancer immunotherapies and antiviral vaccines against pandemic outbreaks. T-cells are activated when they recognize foreign peptides that are presented on the cell surface by Major Histocompatibility Complexes (MHC), forming peptide:MHC (pMHC) complexes. 3D structures of pMHC complexes provide fundamental insight into T-cell recognition mechanism and aids immunotherapy design. High MHC and peptide diversities necessitate efficient computational modelling to enable whole proteome structural analysis. We developed PANDORA, a generic modelling pipeline for pMHC class I and II (pMHC-I and pMHC-II), and present its performance on pMHC-I here. Given a query, PANDORA searches for structural templates in its extensive database and then applies anchor restraints to the modelling process. This restrained energy minimization ensures one of the fastest pMHC modelling pipelines so far. On a set of 835 pMHC-I complexes over 78 MHC types, PANDORA generated models with a median RMSD of 0.70 Å and achieved a 93% success rate in top 10 models. PANDORA performs competitively with three pMHC-I modelling state-of-the-art approaches and outperforms AlphaFold2 in terms of accuracy while being superior to it in speed. PANDORA is a modularized and user-configurable python package with easy installation. We envision PANDORA to fuel deep learning algorithms with large-scale high-quality 3D models to tackle long-standing immunology challenges.
Affiliation(s)
- Dario F Marzella
- Center for Molecular and Biomolecular Informatics, Radboud Institute for Molecular Life Sciences, Radboudumc, Nijmegen, Netherlands
| | - Farzaneh M Parizi
- Center for Molecular and Biomolecular Informatics, Radboud Institute for Molecular Life Sciences, Radboudumc, Nijmegen, Netherlands
| | - Derek van Tilborg
- Center for Molecular and Biomolecular Informatics, Radboud Institute for Molecular Life Sciences, Radboudumc, Nijmegen, Netherlands; Department of Biomedical Engineering, Institute for Complex Molecular Systems, Eindhoven University of Technology, Eindhoven, Netherlands
| | - Nicolas Renaud
- Natural Sciences and Engineering section, Netherlands eScience Center, Amsterdam, Netherlands
| | - Daan Sybrandi
- Bijvoet Centre for Biomolecular Research, Faculty of Science - Chemistry, Utrecht University, Utrecht, Netherlands
| | - Rafaella Buzatu
- Center for Molecular and Biomolecular Informatics, Radboud Institute for Molecular Life Sciences, Radboudumc, Nijmegen, Netherlands
| | - Daniel T Rademaker
- Center for Molecular and Biomolecular Informatics, Radboud Institute for Molecular Life Sciences, Radboudumc, Nijmegen, Netherlands
| | - Peter A C 't Hoen
- Center for Molecular and Biomolecular Informatics, Radboud Institute for Molecular Life Sciences, Radboudumc, Nijmegen, Netherlands
| | - Li C Xue
- Center for Molecular and Biomolecular Informatics, Radboud Institute for Molecular Life Sciences, Radboudumc, Nijmegen, Netherlands
| |
|
27
|
Karaca E, Prévost C, Sacquin-Mora S. Modeling the Dynamics of Protein–Protein Interfaces, How and Why? Molecules 2022; 27:1841. [PMID: 35335203 PMCID: PMC8950966 DOI: 10.3390/molecules27061841] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/29/2022] [Revised: 03/06/2022] [Accepted: 03/08/2022] [Indexed: 12/07/2022]
Abstract
Protein–protein assemblies act as a key component in numerous cellular processes. Their accurate modeling at the atomic level remains a challenge for structural biology. To address this challenge, several docking and a handful of deep learning methodologies focus on modeling protein–protein interfaces. Although the outcome of these methods has been assessed using static reference structures, more and more data point to the fact that interaction stability and specificity are encoded in the dynamics of these interfaces. Therefore, this dynamic information must be taken into account when modeling and assessing protein interactions at the atomistic scale. Expanding on this, our review initially focuses on the recent computational strategies aiming at investigating protein–protein interfaces in a dynamic fashion using enhanced sampling, multi-scale modeling, and experimental data integration. Then, we discuss how interface dynamics report on the function of protein assemblies in globular complexes, in fuzzy complexes containing intrinsically disordered proteins, as well as in active complexes, where chemical reactions take place across the protein–protein interface.
Affiliation(s)
- Ezgi Karaca
- Izmir Biomedicine and Genome Center, Izmir 35340, Turkey
- Izmir International Biomedicine and Genome Institute, Dokuz Eylul University, Izmir 35340, Turkey
| | - Chantal Prévost
- CNRS, Laboratoire de Biochimie Théorique, UPR9080, Université de Paris, 13 rue Pierre et Marie Curie, 75005 Paris, France
- Institut de Biologie Physico-Chimique, Fondation Edmond de Rothschild, PSL Research University, 75006 Paris, France
| | - Sophie Sacquin-Mora
- CNRS, Laboratoire de Biochimie Théorique, UPR9080, Université de Paris, 13 rue Pierre et Marie Curie, 75005 Paris, France
- Institut de Biologie Physico-Chimique, Fondation Edmond de Rothschild, PSL Research University, 75006 Paris, France
| |
|
28
|
Wang J, Miao Y. Protein-Protein Interaction-Gaussian Accelerated Molecular Dynamics (PPI-GaMD): Characterization of Protein Binding Thermodynamics and Kinetics. J Chem Theory Comput 2022; 18:1275-1285. [PMID: 35099970 DOI: 10.1021/acs.jctc.1c00974] [Citation(s) in RCA: 18] [Impact Index Per Article: 9.0] [Indexed: 01/11/2023]
Abstract
Protein-protein interactions (PPIs) play key roles in many fundamental biological processes such as cellular signaling and immune responses. However, it has proven challenging to simulate repetitive protein association and dissociation in order to calculate binding free energies and kinetics of PPIs due to long biological timescales and complex protein dynamics. To address this challenge, we have developed a new computational approach to all-atom simulations of PPIs based on a robust Gaussian accelerated molecular dynamics (GaMD) technique. The method, termed "PPI-GaMD", selectively boosts interaction potential energy between protein partners to facilitate their slow dissociation. Meanwhile, another boost potential is applied to the remaining potential energy of the entire system to effectively model the protein's flexibility and rebinding. PPI-GaMD has been demonstrated on a model system of the ribonuclease barnase interactions with its inhibitor barstar. Six independent 2 μs PPI-GaMD simulations have captured repetitive barstar dissociation and rebinding events, which enable calculations of the protein binding thermodynamics and kinetics simultaneously. The calculated binding free energies and kinetic rate constants agree well with the experimental data. Furthermore, PPI-GaMD simulations have provided mechanistic insights into barstar binding to barnase, which involves long-range electrostatic interactions and multiple binding pathways, being consistent with previous experimental and computational findings of this model system. In summary, PPI-GaMD provides a highly efficient and easy-to-use approach for binding free energy and kinetics calculations of PPIs.
Affiliation(s)
- Jinan Wang
- Center for Computational Biology and Department of Molecular Biosciences, University of Kansas, Lawrence, Kansas 66047, United States
| | - Yinglong Miao
- Center for Computational Biology and Department of Molecular Biosciences, University of Kansas, Lawrence, Kansas 66047, United States
| |
|
29
|
Capturing a Crucial ‘Disorder-to-Order Transition’ at the Heart of the Coronavirus Molecular Pathology—Triggered by Highly Persistent, Interchangeable Salt-Bridges. Vaccines (Basel) 2022; 10:301. [PMID: 35214759 PMCID: PMC8875383 DOI: 10.3390/vaccines10020301] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 12/29/2021] [Revised: 01/27/2022] [Accepted: 02/05/2022] [Indexed: 02/05/2023]
Abstract
The COVID-19 origin debate has been greatly influenced by recent genome comparison studies, which revealed the emergence of the Furin-like cleavage site at the S1/S2 junction of the SARS-CoV-2 Spike (FLCSSpike) containing its 681PRRAR685 motif, absent in other related respiratory viruses. Being the rate-limiting (i.e., the slowest) step, the host Furin cleavage is instrumental in the abrupt increase in transmissibility in COVID-19, compared to earlier onsets of respiratory viral diseases. In this context, the current paper captures a ‘disorder-to-order transition’ of the FLCSSpike (concomitant to an entropy arrest) upon binding to Furin. The interaction clearly seems to be optimized for a more efficient proteolytic cleavage in SARS-CoV-2. The study further shows the formation of dynamically interchangeable and persistent networks of salt-bridges at the Spike–Furin interface in SARS-CoV-2 involving the three arginines (R682, R683, R685) of the FLCSSpike with several anionic residues (E230, E236, D259, D264, D306) coming from Furin, strategically distributed around its catalytic triad. Multiplicity and structural degeneracy of plausible salt-bridge network archetypes seem to be the other key characteristic features of the Spike–Furin binding in SARS-CoV-2, allowing the system to breathe, a trademark of protein disorder transitions. Interestingly, with respect to the homologous interaction in SARS-CoV (2002/2003) taken as a baseline, the Spike–Furin binding events in the coronavirus lineage generally seem to have a preference for ionic bond formation, even with a lesser number of cationic residues at their potentially polybasic FLCSSpike patches. The interaction energies are suggestive of characteristic metastabilities attributed to Spike–Furin interactions, generally in the coronavirus lineage, which appear to be favorable for proteolytic cleavages targeted at flexible protein loops. The current findings not only offer novel mechanistic insights into the coronavirus molecular pathology and evolution, but also add substantially to the existing theories of proteolytic cleavages.
|
30
|
Verburgt J, Kihara D. Benchmarking of structure refinement methods for protein complex models. Proteins 2022; 90:83-95. [PMID: 34309909 PMCID: PMC8671191 DOI: 10.1002/prot.26188] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 01/19/2021] [Revised: 06/24/2021] [Accepted: 07/22/2021] [Indexed: 01/03/2023]
Abstract
Protein structure docking is the process in which the quaternary structure of a protein complex is predicted from individual tertiary structures of the protein subunits. Protein docking is typically performed in two main steps. The subunits are first docked while keeping them rigid to form the complex, which is then followed by structure refinement. Structure refinement is crucial for practical use of computational protein docking models, as it aims to correct the conformations of interacting residues and atoms at the interface. Here, we benchmarked the performance of eight existing protein structure refinement methods in the refinement of protein complex models. We show that the fraction of native contacts between subunits is by far the most straightforward metric to improve. However, backbone-dependent metrics based on the root mean square deviation proved more difficult to improve via refinement.
Affiliation(s)
- Jacob Verburgt
- Department of Biological Sciences, Purdue University, West Lafayette, IN, 47907, USA
| | - Daisuke Kihara
- Department of Biological Sciences, Purdue University, West Lafayette, IN, 47907, USA
- Department of Computer Science, Purdue University, West Lafayette, IN, 47907, USA
- Purdue University Center for Cancer Research, Purdue University, West Lafayette, IN, 47907, USA
| |
|
31
|
Barradas-Bautista D, Cao Z, Vangone A, Oliva R, Cavallo L. A random forest classifier for protein-protein docking models. Bioinformatics Advances 2021; 2:vbab042. [PMID: 36699405 PMCID: PMC9710594 DOI: 10.1093/bioadv/vbab042] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 11/05/2021] [Revised: 11/11/2021] [Accepted: 12/06/2021] [Indexed: 01/28/2023]
Abstract
Herein, we present the results of a machine learning approach we developed to single out correct 3D docking models of protein-protein complexes obtained by popular docking software. To this aim, we generated 3 × 10^4 docking models for each of the 230 complexes in the protein-protein benchmark, version 5, using three different docking programs (HADDOCK, FTDock and ZDOCK), for a cumulative set of ≈ 7 × 10^6 docking models. Three different machine learning approaches (Random Forest, Support Vector Machine and Perceptron) were used to train classifiers with 158 different scoring functions (features). The Random Forest algorithm outperformed the other two algorithms and was selected for further optimization. Using a feature selection algorithm and optimizing the random forest hyperparameters allowed us to train and validate a random forest classifier, named COnservation Driven Expert System (CoDES). Testing of CoDES on independent datasets, as well as results of its comparative performance with machine learning methods recently developed in the field for the scoring of docking decoys, confirm its state-of-the-art ability to discriminate correct from incorrect decoys both in terms of global parameters and in terms of decoys ranked at the top positions. Supplementary information: Supplementary data are available at Bioinformatics Advances online. Software and data availability statement: The docking models are available at https://doi.org/10.5281/zenodo.4012018. The programs underlying this article will be shared on request to the corresponding authors.
Affiliation(s)
- Didier Barradas-Bautista
- Kaust Catalysis Center, Physical Sciences and Engineering Division, King Abdullah University of Science and Technology (KAUST), 23955-6900 Thuwal, Saudi Arabia
| | - Zhen Cao
- Kaust Catalysis Center, Physical Sciences and Engineering Division, King Abdullah University of Science and Technology (KAUST), 23955-6900 Thuwal, Saudi Arabia
| | - Anna Vangone
- Pharma Research and Early Development, Therapeutic Modalities, Roche Innovation Center Munich Large Molecule Research, 82377 Penzberg, Germany
| | - Romina Oliva
- Department of Sciences and Technologies, University Parthenope of Naples, Centro Direzionale Isola C4, I-80143 Naples, Italy
| | - Luigi Cavallo
- Kaust Catalysis Center, Physical Sciences and Engineering Division, King Abdullah University of Science and Technology (KAUST), 23955-6900 Thuwal, Saudi Arabia
| |
|
32
|
Renaud N, Geng C, Georgievska S, Ambrosetti F, Ridder L, Marzella DF, Réau MF, Bonvin AMJJ, Xue LC. DeepRank: a deep learning framework for data mining 3D protein-protein interfaces. Nat Commun 2021; 12:7068. [PMID: 34862392 PMCID: PMC8642403 DOI: 10.1038/s41467-021-27396-0] [Citation(s) in RCA: 40] [Impact Index Per Article: 13.3] [Received: 01/13/2021] [Accepted: 11/12/2021] [Indexed: 11/08/2022]
Abstract
Three-dimensional (3D) structures of protein complexes provide fundamental information to decipher biological processes at the molecular scale. The vast amount of experimentally and computationally resolved protein-protein interfaces (PPIs) offers the possibility of training deep learning models to aid the predictions of their biological relevance. We present here DeepRank, a general, configurable deep learning framework for data mining PPIs using 3D convolutional neural networks (CNNs). DeepRank maps features of PPIs onto 3D grids and trains a user-specified CNN on these 3D grids. DeepRank allows for efficient training of 3D CNNs with data sets containing millions of PPIs and supports both classification and regression. We demonstrate the performance of DeepRank on two distinct challenges: The classification of biological versus crystallographic PPIs, and the ranking of docking models. For both problems DeepRank is competitive with, or outperforms, state-of-the-art methods, demonstrating the versatility of the framework for research in structural biology.
Affiliation(s)
- Nicolas Renaud
- Netherlands eScience Center, Science Park 140, 1098 XG Amsterdam, The Netherlands
| | - Cunliang Geng
- Netherlands eScience Center, Science Park 140, 1098 XG Amsterdam, The Netherlands
- Bijvoet Centre for Biomolecular Research, Faculty of Science - Chemistry, Utrecht University, Padualaan 8, 3584 CH Utrecht, The Netherlands
| | - Sonja Georgievska
- Netherlands eScience Center, Science Park 140, 1098 XG Amsterdam, The Netherlands
| | - Francesco Ambrosetti
- Bijvoet Centre for Biomolecular Research, Faculty of Science - Chemistry, Utrecht University, Padualaan 8, 3584 CH Utrecht, The Netherlands
| | - Lars Ridder
- Netherlands eScience Center, Science Park 140, 1098 XG Amsterdam, The Netherlands
| | - Dario F Marzella
- Center for Molecular and Biomolecular Informatics, Radboudumc, Geert Grooteplein 26-28, 6525 GA Nijmegen, The Netherlands
| | - Manon F Réau
- Bijvoet Centre for Biomolecular Research, Faculty of Science - Chemistry, Utrecht University, Padualaan 8, 3584 CH Utrecht, The Netherlands
| | - Alexandre M J J Bonvin
- Bijvoet Centre for Biomolecular Research, Faculty of Science - Chemistry, Utrecht University, Padualaan 8, 3584 CH Utrecht, The Netherlands
| | - Li C Xue
- Bijvoet Centre for Biomolecular Research, Faculty of Science - Chemistry, Utrecht University, Padualaan 8, 3584 CH Utrecht, The Netherlands
- Center for Molecular and Biomolecular Informatics, Radboudumc, Geert Grooteplein 26-28, 6525 GA Nijmegen, The Netherlands
| |
|
33
|
Depetris RS, Lu D, Polonskaya Z, Zhang Z, Luna X, Tankard A, Kolahi P, Drummond M, Williams C, Ebert MCCJC, Patel JP, Poyurovsky MV. Functional antibody characterization via direct structural analysis and information-driven protein-protein docking. Proteins 2021; 90:919-935. [PMID: 34773424 PMCID: PMC9544432 DOI: 10.1002/prot.26280] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/22/2021] [Revised: 08/28/2021] [Accepted: 11/07/2021] [Indexed: 12/02/2022]
Abstract
A detailed description of the mechanism of action of therapeutic antibodies is essential for the functional characterization and future optimization of potential clinical agents. We recently developed KD035, a fully human antibody targeting vascular endothelial growth factor receptor 2 (VEGFR2). KD035 blocked VEGF-A- and VEGF-C-mediated VEGFR2 activation, as demonstrated by in vitro binding and competition assays and functional cellular assays. Here, we report a computational model of the complex between the variable fragment of KD035 (KD035(Fv)) and domains 2 and 3 of the extracellular portion of VEGFR2 (VEGFR2(D2-3)). Our modeling was guided by a priori experimental information including the X-ray structures of KD035 and related antibodies, binding assays, target domain mapping and comparison of KD035 affinity for VEGFR2 from different species. The accuracy of the model was assessed by molecular dynamics simulations, and subsequently validated by mutagenesis and binding analysis. Importantly, the steps followed during the generation of this model can set a precedent for future in silico efforts aimed at the accurate description of antibody–antigen and, more broadly, protein–protein complexes.
Affiliation(s)
- Dan Lu
- Kadmon Corporation, LLC, New York, New York, USA
| | - Zhikai Zhang
- Kadmon Corporation, LLC, New York, New York, USA
| | - Xenia Luna
- Kadmon Corporation, LLC, New York, New York, USA
| | - Pegah Kolahi
- Kadmon Corporation, LLC, New York, New York, USA
| |
|
34
|
Jandova Z, Vargiu AV, Bonvin AMJJ. Native or Non-Native Protein-Protein Docking Models? Molecular Dynamics to the Rescue. J Chem Theory Comput 2021; 17:5944-5954. [PMID: 34342983 PMCID: PMC8444332 DOI: 10.1021/acs.jctc.1c00336] [Citation(s) in RCA: 21] [Impact Index Per Article: 7.0] [Received: 04/06/2021] [Indexed: 11/29/2022]
Abstract
Molecular docking excels at creating a plethora of potential models of protein-protein complexes. To correctly distinguish the favorable, native-like models from the remaining ones remains, however, a challenge. We assessed here if a protocol based on molecular dynamics (MD) simulations would allow distinguishing native from non-native models to complement scoring functions used in docking. To this end, the first models for 25 protein-protein complexes were generated using HADDOCK. Next, MD simulations complemented with machine learning were used to discriminate between native and non-native complexes based on a combination of metrics reporting on the stability of the initial models. Native models showed higher stability in almost all measured properties, including the key ones used for scoring in the Critical Assessment of PRedicted Interaction (CAPRI) competition, namely the positional root mean square deviations and fraction of native contacts from the initial docked model. A random forest classifier was trained, reaching a 0.85 accuracy in correctly distinguishing native from non-native complexes. Reasonably modest simulation lengths of the order of 50-100 ns are sufficient to reach this accuracy, which makes this approach applicable in practice.
Affiliation(s)
- Zuzana Jandova
- Computational Structural Biology Group, Bijvoet Centre for Biomolecular Research, Faculty of Science - Chemistry, Utrecht University, Padualaan 8, 3584 CH Utrecht, the Netherlands
| | - Attilio Vittorio Vargiu
- Physics Department, University of Cagliari, Cittadella Universitaria, S.P. 8 km 0.700, 09042 Monserrato, Italy
| | - Alexandre M. J. J. Bonvin
- Computational Structural Biology Group, Bijvoet Centre for Biomolecular Research, Faculty of Science - Chemistry, Utrecht University, Padualaan 8, 3584 CH Utrecht, the Netherlands
| |
|
35
|
Dapkūnas J, Olechnovič K, Venclovas Č. Modeling of protein complexes in CASP14 with emphasis on the interaction interface prediction. Proteins 2021; 89:1834-1843. [PMID: 34176161 PMCID: PMC9292421 DOI: 10.1002/prot.26167] [Citation(s) in RCA: 9] [Impact Index Per Article: 3.0] [Received: 05/01/2021] [Revised: 06/21/2021] [Accepted: 06/23/2021] [Indexed: 01/08/2023]
Abstract
The goal of CASP experiments is to monitor the progress in the protein structure prediction field. During the 14th CASP edition we aimed to test our capabilities of predicting structures of protein complexes. Our protocol for modeling protein assemblies included both template‐based modeling and free docking. Structural templates were identified using sensitive sequence‐based searches. If sequence‐based searches failed, we performed structure‐based template searches using selected CASP server models. In the absence of reliable templates we applied free docking starting from monomers generated by CASP servers. We evaluated and ranked models of protein complexes using an improved version of our protein structure quality assessment method, VoroMQA, taking into account both interaction interface and global structure scores. If reliable templates could be identified, generally accurate models of protein assemblies were generated with the exception of an antibody‐antigen interaction. The success of free docking mainly depended on the accuracy of initial subunit models and on the scoring of docking solutions. To put our overall results in perspective, we analyzed our performance in the context of other CASP groups. Although the subunits in our assembly models often were not of the top quality, these models had, overall, the best‐predicted intersubunit interfaces according to several accuracy measures. We attribute our relative success primarily to the emphasis on the interaction interface when modeling and scoring.
Affiliation(s)
- Justas Dapkūnas
- Institute of Biotechnology, Life Sciences Center, Vilnius University, Vilnius, Lithuania
| | - Kliment Olechnovič
- Institute of Biotechnology, Life Sciences Center, Vilnius University, Vilnius, Lithuania
| | - Česlovas Venclovas
- Institute of Biotechnology, Life Sciences Center, Vilnius University, Vilnius, Lithuania
| |
|
36
|
Peacock T, Chain B. Information-Driven Docking for TCR-pMHC Complex Prediction. Front Immunol 2021; 12:686127. [PMID: 34177934 PMCID: PMC8219952 DOI: 10.3389/fimmu.2021.686127] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Received: 03/26/2021] [Accepted: 05/07/2021] [Indexed: 12/16/2022]
Abstract
T cell receptor (TCR) recognition of peptides presented by major histocompatibility complex (MHC) molecules is a fundamental process in the adaptive immune system. An understanding of this recognition process at the molecular level is crucial for TCR based therapeutics and vaccine design. The broad nature of TCR diversity and cross-reactivity presents a challenge for traditional structural resolution. Computational modelling of TCR-pMHC complexes offers an efficient alternative. This study compares the ability of four general-purpose docking platforms (ClusPro, LightDock, ZDOCK and HADDOCK) to make use of varying levels of binding interface information for accurate TCR-pMHC modelling. Each platform was tested on an expanded benchmark set of 44 TCR-pMHC docking cases. In general, HADDOCK is shown to be the best performer. Docking strategy guidance is provided to obtain the best models for each platform for future research. The TCR-pMHC docking cases used in this study can be downloaded from https://github.com/innate2adaptive/ExpandedBenchmark.
Affiliation(s)
- Thomas Peacock
- Division of Infection and Immunity, University College London, London, United Kingdom; The UCL Centre for Computation, Mathematics and Physics in the Life Sciences and Experimental Biology (CoMPLEX), Department of Computer Science, University College London, London, United Kingdom
| | - Benny Chain
- Division of Infection and Immunity, University College London, London, United Kingdom
| |
|
37
|
Prévost C, Sacquin-Mora S. Moving pictures: Reassessing docking experiments with a dynamic view of protein interfaces. Proteins 2021; 89:1315-1323. [PMID: 34038009 DOI: 10.1002/prot.26152] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 01/12/2021] [Revised: 03/22/2021] [Accepted: 05/19/2021] [Indexed: 11/06/2022]
Abstract
The modeling of protein assemblies at the atomic level remains a central issue in structural biology, as protein interactions play a key role in numerous cellular processes. This problem is traditionally addressed using docking tools, where the quality of the models is based on their similarity to a single reference experimental structure. However, using a static reference does not take into account the dynamic nature of the protein interface. Here, we used all-atom classical Molecular Dynamics simulations to investigate the stability of the reference interface for three complexes that previously served as targets in the CAPRI competition. For each one of these targets, we also ran MD simulations for ten models that are distributed over the High, Medium and Acceptable accuracy categories. To assess the quality of these models from a dynamic perspective, we set up new criteria which take into account the stability of the reference experimental protein interface. We show that, when the protein interfaces are allowed to evolve over time, the original ranking based on the static CAPRI criteria no longer holds, as over 50% of the docking models undergo a category change (toward either a higher or a lower accuracy group) when their quality is reassessed using dynamic information.
Affiliation(s)
- Chantal Prévost
- CNRS, Laboratoire de Biochimie Théorique, UPR9080, Université de Paris, Paris, France; Institut de Biologie Physico-Chimique, Fondation Edmond de Rothschild, PSL Research University, Paris, France
| | - Sophie Sacquin-Mora
- CNRS, Laboratoire de Biochimie Théorique, UPR9080, Université de Paris, Paris, France; Institut de Biologie Physico-Chimique, Fondation Edmond de Rothschild, PSL Research University, Paris, France
| |
|
38
|
Zhang W, Meng Q, Tang J, Guo F. Exploring effectiveness of ab-initio protein-protein docking methods on a novel antibacterial protein complex dataset. Brief Bioinform 2021; 22:6265196. [PMID: 33959764 DOI: 10.1093/bib/bbab150] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 12/21/2020] [Revised: 03/12/2021] [Accepted: 03/27/2021] [Indexed: 12/27/2022]
Abstract
Diseases caused by bacterial infections have become a critical problem in public health. Antibiotics, the traditional treatment, are gradually losing their effectiveness due to resistance. Meanwhile, antibacterial proteins attract more attention because of their broad spectrum and little harm to host cells. Therefore, exploring new effective antibacterial proteins is urgent and necessary. In this paper, we evaluate the effectiveness of ab-initio docking methods in antibacterial protein-protein docking. For this purpose, we constructed a three-dimensional (3D) structure dataset of antibacterial protein complexes, called APCset, which contained 19 protein complexes whose receptors or ligands are homologous to antibacterial peptides from the Antimicrobial Peptide Database. Then we selected five representative ab-initio protein-protein docking tools, including ZDOCK3.0.2, FRODOCK3.0, ATTRACT, PatchDock and Rosetta, to predict the structures of these complexes, and compared their performance in five aspects: top/best pose, first hit, success rate, average hit count and running time. Finally, according to different requirements, we assessed and recommended relatively efficient protein-protein docking tools. In terms of computational efficiency and performance, ZDOCK was more suitable as the preferred computational tool, with an average running time of 6.144 minutes, an average Fnat of the best pose of 0.953 and an average rank of the best pose of 4.158. Meanwhile, ZDOCK still yielded better performance on Benchmark 5.0, which showed that ZDOCK was effective in performing docking on a large-scale dataset. Our survey can offer insights into research on the treatment of bacterial infections by utilizing appropriate docking methods.
Affiliation(s)
- Wei Zhang
- School of Computer Science and Technology, College of Intelligence and Computing, Tianjin University, Tianjin, China
| | - Qiaozhen Meng
- School of Computer Science and Technology, College of Intelligence and Computing, Tianjin University, Tianjin, China
| | - Jijun Tang
- School of Computer Science and Technology, College of Intelligence and Computing, Tianjin University, Tianjin, China; School of Computational Science and Engineering, University of South Carolina, Columbia, U.S.; Key Laboratory of Systems Bioengineering (Ministry of Education), Tianjin University, Tianjin, China
| | - Fei Guo
- School of Computer Science and Technology, College of Intelligence and Computing, Tianjin University, Tianjin, China
| |
|
39
|
Quignot C, Granger P, Chacón P, Guerois R, Andreani J. Atomic-level evolutionary information improves protein-protein interface scoring. Bioinformatics 2021; 37:3175-3181. [PMID: 33901284 DOI: 10.1093/bioinformatics/btab254] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 10/26/2020] [Revised: 03/20/2021] [Accepted: 04/19/2021] [Indexed: 12/11/2022]
Abstract
MOTIVATION: The crucial role of protein interactions and the difficulty in characterising them experimentally strongly motivate the development of computational approaches for structural prediction. Even when protein-protein docking samples correct models, current scoring functions struggle to discriminate them from incorrect decoys. The previous incorporation of conservation and coevolution information has shown promise for improving protein-protein scoring. Here, we present a novel strategy to integrate atomic-level evolutionary information into different types of scoring functions to improve their docking discrimination. RESULTS: We applied this general strategy to our residue-level statistical potential from InterEvScore and to two atomic-level scores, SOAP-PP and Rosetta interface score (ISC). Including evolutionary information from as few as ten homologous sequences improves the top 10 success rates of individual atomic-level scores SOAP-PP and Rosetta ISC by respectively 6 and 13.5 percentage points, on a large benchmark of 752 docking cases. The best individual homology-enriched score reaches a top 10 success rate of 34.4%. A consensus approach based on the complementarity between different homology-enriched scores further increases the top 10 success rate to 40%. AVAILABILITY: All data used for benchmarking and scoring results, as well as a Singularity container of the pipeline, are available at http://biodev.cea.fr/interevol/interevdata/. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
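The "top 10 success rate" reported above is usually read as the share of docking cases for which at least one acceptable model appears among the ten best-scored decoys. The sketch below illustrates that reading only. The case names, the boolean labels and the function name `top_n_success_rate` are hypothetical and are not taken from the InterEvScore pipeline.

```python
from typing import Dict, List


def top_n_success_rate(ranked: Dict[str, List[bool]], n: int = 10) -> float:
    """Share of docking cases with at least one acceptable model in the top n.

    `ranked` maps a case name to its decoys in score order (best first);
    each entry is True if that decoy is an acceptable model.
    """
    if not ranked:
        raise ValueError("no docking cases given")
    hits = sum(1 for labels in ranked.values() if any(labels[:n]))
    return hits / len(ranked)


# Toy data (assumed): four cases with decoys already sorted by score.
cases = {
    "case_a": [False, True, False],    # hit at rank 2
    "case_b": [False, False, False],   # no acceptable decoy at all
    "case_c": [False] * 10 + [True],   # first hit at rank 11, outside the top 10
    "case_d": [True],                  # hit at rank 1
}

rate = top_n_success_rate(cases, n=10)  # 2 of 4 cases succeed
```

Comparing two scoring functions in percentage points, as the abstract does, then amounts to subtracting two such rates computed on the same set of cases.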
Affiliation(s)
- Chloé Quignot
- Université Paris-Saclay, CEA, CNRS, Institute for Integrative Biology of the Cell (I2BC), 91198, Gif-sur-Yvette, France
| | - Pierre Granger
- Université Paris-Saclay, CEA, CNRS, Institute for Integrative Biology of the Cell (I2BC), 91198, Gif-sur-Yvette, France
| | - Pablo Chacón
- Department of Biological Chemical Physics, Rocasolano Institute of Physical Chemistry C.S.I.C, Madrid, Spain
| | - Raphael Guerois
- Université Paris-Saclay, CEA, CNRS, Institute for Integrative Biology of the Cell (I2BC), 91198, Gif-sur-Yvette, France
| | - Jessica Andreani
- Université Paris-Saclay, CEA, CNRS, Institute for Integrative Biology of the Cell (I2BC), 91198, Gif-sur-Yvette, France
| |
|
40
|
Liang T, Chen H, Yuan J, Jiang C, Hao Y, Wang Y, Feng Z, Xie XQ. IsAb: a computational protocol for antibody design. Brief Bioinform 2021; 22:6238584. [PMID: 33876197 DOI: 10.1093/bib/bbab143] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 01/13/2021] [Revised: 02/24/2021] [Accepted: 03/23/2021] [Indexed: 12/15/2022] Open
Abstract
The design of therapeutic antibodies has attracted a large amount of attention over the years. Antibodies are widely used to treat many diseases due to their high efficiency and low risk of adverse events. However, the experimental methods of antibody design are time-consuming and expensive. Although computational antibody design techniques have advanced significantly in the past years, there are still some challenges that need to be solved, such as the flexibility of antigen structure, the lack of antibody structural data and the absence of a standard antibody design protocol. In the present work, we elaborated on an in silico antibody design protocol for users to easily perform computer-aided antibody design. First, the Rosetta web server will be applied to generate the 3D structure of query antibodies if there is no structural information available. Then, two-step docking will be used to identify the binding pose of an antibody-antigen complex when the binding information is unknown. ClusPro is used first to conduct the global docking, and SnugDock is then applied for the local docking. Subsequently, based on the predicted binding poses, in silico alanine scanning will be used to predict the potential hotspots (or key residues). Finally, a computational affinity maturation protocol will be used to modify the structure of antibodies to theoretically increase their affinity and stability, which will be further validated by bioassays in the future. As a proof of concept, we redesigned antibody D44.1 and compared it with previously reported data in order to validate the IsAb protocol. To further illustrate our proposed protocol, we used the cemiplimab antibody, a PD-1 checkpoint inhibitor, as an example to showcase a step-by-step tutorial.
Affiliation(s)
- Tianjian Liang
- School of Pharmacy, University of Pittsburgh, Pittsburgh, PA 15261, USA
| | - Hui Chen
- School of Pharmacy, University of Pittsburgh, Pittsburgh, PA 15261, USA
| | - Jiayi Yuan
- School of Pharmacy, University of Pittsburgh, Pittsburgh, PA 15261, USA
| | - Chen Jiang
- School of Pharmacy, University of Pittsburgh, Pittsburgh, PA 15261, USA
| | - Yixuan Hao
- School of Pharmacy, University of Pittsburgh, Pittsburgh, PA 15261, USA
| | - Yuanqiang Wang
- School of Pharmacy and Bioengineering, Chongqing University of Technology, Pittsburgh, PA 15261, USA
| | - Zhiwei Feng
- School of Pharmacy, University of Pittsburgh, Pittsburgh, PA 15261, USA
| | - Xiang-Qun Xie
- Computational Drug Abuse Research and Computational Chemogenomics Screening Center at the University of Pittsburgh, Pittsburgh, PA 15261, USA
| |
|
41
|
Robustification of RosettaAntibody and Rosetta SnugDock. PLoS One 2021; 16:e0234282. [PMID: 33764990 PMCID: PMC7993800 DOI: 10.1371/journal.pone.0234282] [Citation(s) in RCA: 9] [Impact Index Per Article: 3.0] [Received: 05/18/2020] [Accepted: 01/11/2021] [Indexed: 11/19/2022] Open
Abstract
In recent years, the observed antibody sequence space has grown exponentially due to advances in high-throughput sequencing of immune receptors. The rise in sequences has not been mirrored by a rise in structures, as experimental structure determination techniques have remained low-throughput. Computational modeling, however, has the potential to close the sequence–structure gap. To achieve this goal, computational methods must be robust, fast, easy to use, and accurate. Here we report on the latest advances made in RosettaAntibody and Rosetta SnugDock—methods for antibody structure prediction and antibody–antigen docking. We simplified the user interface, expanded and automated the template database, generalized the kinematics of antibody–antigen docking (which enabled modeling of single-domain antibodies) and incorporated new loop modeling techniques. To evaluate the effects of our updates on modeling accuracy, we developed rigorous tests under a new scientific benchmarking framework within Rosetta. Benchmarking revealed that more structurally similar templates could be identified in the updated database and that SnugDock broadened its applicability without losing accuracy. However, there are further advances to be made, including increasing the accuracy and speed of CDR-H3 loop modeling, before computational approaches can accurately model any antibody.
|
42
|
Rosell M, Rodríguez-Lumbreras LA, Fernández-Recio J. Modeling of Protein Complexes and Molecular Assemblies with pyDock. Methods Mol Biol 2021; 2165:175-198. [PMID: 32621225 DOI: 10.1007/978-1-0716-0708-4_10] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 12/11/2022]
Abstract
The study of the 3D structural details of protein interactions is essential to understand biomolecular functions at the molecular level. In this context, the limited availability of experimental structures of protein-protein complexes at atomic resolution is propelling the development of computational docking methods that aim to complement the current structural coverage of protein interactions. One of these docking approaches is pyDock, which uses van der Waals, electrostatics, and desolvation energy to score docking poses generated by a variety of sampling methods, typically FTDock or ZDOCK. The method has shown a consistently good prediction performance in community-wide assessment experiments like CAPRI or CASP, and has provided biological insights and insightful interpretation of experiments by modeling many biomolecular interactions of biomedical and biotechnological interest. Here, we describe in detail how to perform structural modeling of protein assemblies with pyDock, and the application of its modules to different biomolecular recognition phenomena, such as modeling of binding mode, interface, and hot-spot prediction, use of restraints based on experimental data, inclusion of low-resolution structural data, binding affinity estimation, or modeling of homo- and hetero-oligomeric assemblies.
Affiliation(s)
- Mireia Rosell
- Barcelona Supercomputing Center (BSC), Barcelona, Spain; Instituto de Ciencias de la Vid y del Vino (ICVV), Consejo Superior de Investigaciones Científicas (CSIC) - Universidad de La Rioja - Gobierno de La Rioja, Logroño, Spain
| | - Luis Angel Rodríguez-Lumbreras
- Barcelona Supercomputing Center (BSC), Barcelona, Spain; Instituto de Ciencias de la Vid y del Vino (ICVV), Consejo Superior de Investigaciones Científicas (CSIC) - Universidad de La Rioja - Gobierno de La Rioja, Logroño, Spain
| | - Juan Fernández-Recio
- Barcelona Supercomputing Center (BSC), Barcelona, Spain; Instituto de Ciencias de la Vid y del Vino (ICVV), Consejo Superior de Investigaciones Científicas (CSIC) - Universidad de La Rioja - Gobierno de La Rioja, Logroño, Spain; Institut de Biologia Molecular de Barcelona (IBMB), Consejo Superior de Investigaciones Científicas (CSIC), Barcelona, Spain
| |
|
43
|
Guest JD, Vreven T, Zhou J, Moal I, Jeliazkov JR, Gray JJ, Weng Z, Pierce BG. An expanded benchmark for antibody-antigen docking and affinity prediction reveals insights into antibody recognition determinants. Structure 2021; 29:606-621.e5. [PMID: 33539768 DOI: 10.1016/j.str.2021.01.005] [Citation(s) in RCA: 51] [Impact Index Per Article: 17.0] [Received: 03/08/2020] [Revised: 11/15/2020] [Accepted: 01/11/2021] [Indexed: 01/04/2023]
Abstract
Accurate predictive modeling of antibody-antigen complex structures and structure-based antibody design remain major challenges in computational biology, with implications for biotherapeutics, immunity, and vaccines. Through a systematic search for high-resolution structures of antibody-antigen complexes and unbound antibody and antigen structures, in conjunction with identification of experimentally determined binding affinities, we have assembled a non-redundant set of test cases for antibody-antigen docking and affinity prediction. This benchmark more than doubles the number of antibody-antigen complexes and corresponding affinities available in our previous benchmarks, providing an unprecedented view of the determinants of antibody recognition and insights into molecular flexibility. Initial assessments of docking and affinity prediction tools highlight the challenges posed by this diverse set of cases, which includes camelid nanobodies, therapeutic monoclonal antibodies, and broadly neutralizing antibodies targeting viral glycoproteins. This dataset will enable development of advanced predictive modeling and design methods for this therapeutically relevant class of protein-protein interactions.
Affiliation(s)
- Johnathan D Guest
- University of Maryland Institute for Bioscience and Biotechnology Research, Rockville, MD 20850, USA; Department of Cell Biology and Molecular Genetics, University of Maryland, College Park, MD 20742, USA
| | - Thom Vreven
- Program in Bioinformatics and Integrative Biology, University of Massachusetts Medical School, Worcester, MA 01605, USA
| | - Jing Zhou
- Department of Chemical and Biomolecular Engineering, Johns Hopkins University, Baltimore, MD 21218, USA
| | - Iain Moal
- Computational Sciences, GlaxoSmithKline Research and Development, Stevenage SG1 2NY, UK
| | - Jeliazko R Jeliazkov
- Program in Molecular Biophysics, Johns Hopkins University, Baltimore, MD 21218, USA
| | - Jeffrey J Gray
- Department of Chemical and Biomolecular Engineering, Johns Hopkins University, Baltimore, MD 21218, USA; Program in Molecular Biophysics, Johns Hopkins University, Baltimore, MD 21218, USA
| | - Zhiping Weng
- Program in Bioinformatics and Integrative Biology, University of Massachusetts Medical School, Worcester, MA 01605, USA.
| | - Brian G Pierce
- University of Maryland Institute for Bioscience and Biotechnology Research, Rockville, MD 20850, USA; Department of Cell Biology and Molecular Genetics, University of Maryland, College Park, MD 20742, USA.
| |
|
44
|
Slater O, Miller B, Kontoyianni M. Decoding Protein-protein Interactions: An Overview. Curr Top Med Chem 2021; 20:855-882. [PMID: 32101126 DOI: 10.2174/1568026620666200226105312] [Citation(s) in RCA: 11] [Impact Index Per Article: 3.7] [Received: 04/13/2019] [Revised: 11/27/2019] [Accepted: 11/27/2019] [Indexed: 12/24/2022]
Abstract
Drug discovery has focused on the paradigm "one drug, one target" for a long time. However, small molecules can act at multiple macromolecular targets, which serves as the basis for drug repurposing. In an effort to expand the target space, and given advances in X-ray crystallography, protein-protein interactions have become an emerging focus area of drug discovery enterprises. Proteins interact with other biomolecules and it is this intricate network of interactions that determines the behavior of the system and its biological processes. In this review, we briefly discuss networks in disease, followed by computational methods for protein-protein complex prediction. Computational methodologies and techniques employed towards objectives such as protein-protein docking, protein-protein interactions, and interface predictions are described extensively. Docking aims at producing a complex between proteins, while interface predictions identify a subset of residues on one protein that could interact with a partner, and protein-protein interaction sites address whether two proteins interact. In addition, approaches to predict hot spots and binding sites are presented along with a representative example of our internal project on the chemokine CXC receptor 3 B-isoform and predictive modeling with IP10 and PF4.
Affiliation(s)
- Olivia Slater
- Department of Pharmaceutical Sciences, Southern Illinois University, Edwardsville, IL 62026, United States
| | - Bethany Miller
- Department of Pharmaceutical Sciences, Southern Illinois University, Edwardsville, IL 62026, United States
| | - Maria Kontoyianni
- Department of Pharmaceutical Sciences, Southern Illinois University, Edwardsville, IL 62026, United States
| |
|
45
|
Bux K, Hofer TS, Moin ST. Exploring interfacial dynamics in homodimeric S-ribosylhomocysteine lyase (LuxS) from Vibrio cholerae through molecular dynamics simulations. RSC Adv 2021; 11:1700-1714. [PMID: 35424088 PMCID: PMC8693604 DOI: 10.1039/d0ra08809a] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/16/2020] [Accepted: 12/22/2020] [Indexed: 11/21/2022] Open
Abstract
To the best of our knowledge, this is the first molecular dynamics simulation study on the dimeric form of the LuxS enzyme from Vibrio cholerae to evaluate its structural and dynamical properties, including the dynamics of the interface formed by the two monomeric chains of the enzyme. The dynamics of the interfacial region were investigated in terms of inter-residue contacts and the associated interface area of the enzyme in its ligand-free and ligand-bound states, which showed a characteristic contrast in the interfacial dynamics. Moreover, the binding patterns of the two inhibitors (RHC and KRI) to the enzyme, forming two different enzyme–ligand complexes, were analyzed, which pointed towards a varying inhibition potential of the inhibitors, as also revealed by the free energies of ligand binding. It is shown that KRI is a more potent inhibitor than RHC, a substrate analogue, in agreement with experimental data. Moreover, a loop in chain B of the enzyme was found to facilitate the binding of RHC in a manner similar to that of the substrate, while KRI demonstrates a differing binding pattern. The free energy of binding for the two ligands was also computed via thermodynamic integration, which ultimately served to correlate the dynamical properties with the inhibition potential of the two ligands against the enzyme. Furthermore, this study provides a rationale for suggesting novel LuxS inhibitors, which could become promising candidates to treat diseases caused by a broad variety of bacterial species.
Affiliation(s)
- Khair Bux
- H.E.J. Research Institute of Chemistry
- International Center for Chemical and Biological Sciences
- University of Karachi
- Karachi-75270
- Pakistan
| | - Thomas S. Hofer
- Theoretical Chemistry Division
- Institute of General, Inorganic and Theoretical Chemistry
- University of Innsbruck
- A-6020 Innsbruck
- Austria
| | - Syed Tarique Moin
- H.E.J. Research Institute of Chemistry
- International Center for Chemical and Biological Sciences
- University of Karachi
- Karachi-75270
- Pakistan
| |
|
46
|
Pestana-Nobles R, Leyva-Rojas JA, Yosa J. Searching Hit Potential Antimicrobials in Natural Compounds Space against Biofilm Formation. Molecules 2020; 25:E5334. [PMID: 33207596 PMCID: PMC7696173 DOI: 10.3390/molecules25225334] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 09/01/2020] [Revised: 10/10/2020] [Accepted: 10/20/2020] [Indexed: 01/06/2023] Open
Abstract
Biofilms are communities of microorganisms that can colonize biotic and abiotic surfaces and thus play a significant role in the persistence of bacterial infection and resistance to antimicrobials. About 65% of microbial infections and 80% of chronic infections are associated with biofilm formation. The increase in infections by multi-resistant bacteria calls for the discovery of novel natural-based drugs that act as inhibitory molecules. The inhibition of diguanylate cyclases (DGCs), the enzymes that synthesize the second messenger cyclic diguanylate (c-di-GMP), which is involved in biofilm formation, represents a potential approach for preventing biofilm development. The PleD protein has been extensively used as a model DGC, both for in silico studies such as virtual screening and for in vitro studies of biofilm formation. This study aimed to search for natural products capable of inhibiting the Caulobacter crescentus enzyme PleD. For this purpose, 224,205 molecules from the natural products ZINC15 database were evaluated through molecular docking and molecular dynamics simulation. Our results suggest trans-Aconitic acid (TAA) as a possible starting point for hit-to-lead methodologies to obtain new inhibitors of the PleD protein and hence block biofilm formation.
|
47
|
Barradas-Bautista D, Cao Z, Cavallo L, Oliva R. The CASP13-CAPRI targets as case studies to illustrate a novel scoring pipeline integrating CONSRANK with clustering and interface analyses. BMC Bioinformatics 2020; 21:262. [PMID: 32938371 PMCID: PMC7493188 DOI: 10.1186/s12859-020-03600-8] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Received: 06/07/2020] [Accepted: 06/10/2020] [Indexed: 08/27/2023] Open
Abstract
Background: Properly scoring protein-protein docking models to single out the correct ones is an open challenge, also object of assessment in CAPRI (Critical Assessment of PRedicted Interactions), a community-wide blind docking experiment. We introduced in the field CONSRANK (CONSensus RANKing), the first pure consensus method. Also available as a web server, CONSRANK ranks docking models in an ensemble based on their ability to match the most frequent inter-residue contacts in it. We have been blindly testing CONSRANK in all the latest CAPRI rounds, where we showed it to perform competitively with the state-of-the-art energy and knowledge-based scoring functions. More recently, we developed Clust-CONSRANK, an algorithm introducing a contact-based clustering of the models as a preliminary step of the CONSRANK scoring process. In the latest CASP13-CAPRI joint experiment, we participated as scorers with a novel pipeline, combining both our scoring tools, CONSRANK and Clust-CONSRANK, with our interface analysis tool COCOMAPS. Selection of the 10 models for submission was guided by the strength of the emerging consensus, and their final ranking was assisted by results of the interface analysis. Results: As a result of the above approach, we were by far the first scorer in the CASP13-CAPRI top-1 ranking, having high/medium quality models ranked at the top-1 position for the majority of targets (11 out of the total 19). We were also the first scorer in the top-10 ranking, on a par with another group, and the second scorer in the top-5 ranking. Further, we topped the ranking relative to the prediction of binding interfaces, among all the scorers and predictors. Using the CASP13-CAPRI targets as case studies, we illustrate here in detail the approach we adopted.
Conclusions: Introducing some flexibility in the final model selection and ranking, as well as differentiating the adopted scoring approach depending on the targets were the key assets for our highly successful performance, as compared to previous CAPRI rounds. The approach we propose is entirely based on methods made available to the community and could thus be reproduced by any user.
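The consensus idea described in this abstract, ranking each model by how well its inter-residue contacts match the most frequent contacts in the ensemble, can be sketched as below. This is a simplified illustration and not the CONSRANK implementation. The assumption made here is that a model's score is the mean ensemble frequency of its contacts; the model names, contact labels and the function name `consensus_rank` are all hypothetical.

```python
from collections import Counter
from typing import Dict, List, Set, Tuple

# A contact is a (residue on partner 1, residue on partner 2) pair; labels are hypothetical.
Contact = Tuple[str, str]


def consensus_rank(models: Dict[str, Set[Contact]]) -> List[Tuple[str, float]]:
    """Rank models by the mean ensemble frequency of their contacts (assumed score)."""
    n_models = len(models)
    if n_models == 0:
        raise ValueError("empty ensemble")
    # Count in how many models each contact occurs.
    counts = Counter(c for contacts in models.values() for c in contacts)
    scores = {}
    for name, contacts in models.items():
        if contacts:
            scores[name] = sum(counts[c] / n_models for c in contacts) / len(contacts)
        else:
            scores[name] = 0.0  # a model without contacts cannot match any consensus
    # Highest score first; ties are broken by name so the order is deterministic.
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))


# Toy ensemble (assumed): m1 and m2 share an interface, m3 is an outlier.
ensemble = {
    "m1": {("A1", "B1"), ("A2", "B2")},
    "m2": {("A1", "B1"), ("A2", "B2"), ("A3", "B3")},
    "m3": {("A9", "B9")},
}

ranking = consensus_rank(ensemble)
```

In this toy ensemble the outlier `m3` ranks last because its only contact occurs in a single model, while `m1` ranks first because all of its contacts are shared with `m2`.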
|
48
|
Burman SSR, Nance ML, Jeliazkov JR, Labonte JW, Lubin JH, Biswas N, Gray JJ. Novel sampling strategies and a coarse-grained score function for docking homomers, flexible heteromers, and oligosaccharides using Rosetta in CAPRI rounds 37-45. Proteins 2020; 88:973-985. [PMID: 31742764 PMCID: PMC8589291 DOI: 10.1002/prot.25855] [Citation(s) in RCA: 12] [Impact Index Per Article: 3.0] [Received: 09/02/2019] [Revised: 11/04/2019] [Accepted: 11/13/2019] [Indexed: 02/06/2023]
Abstract
Critical Assessment of PRediction of Interactions (CAPRI) rounds 37 through 45 introduced larger complexes, new macromolecules, and multistage assemblies. For these rounds, we used and expanded docking methods in Rosetta to model 23 target complexes. We successfully predicted 14 target complexes and recognized and refined near-native models generated by other groups for two further targets. Notably, for targets T110 and T136, we achieved the closest prediction of any CAPRI participant. We created several innovative approaches during these rounds. Since round 39 (target 122), we have used the new RosettaDock 4.0, which has a revamped coarse-grained energy function and the ability to perform conformer selection during docking with hundreds of pregenerated protein backbones. Ten of the complexes had some degree of symmetry in their interactions, so we tested Rosetta SymDock, realized its shortcomings, and developed the next-generation symmetric docking protocol, SymDock2, which includes docking of multiple backbones and induced-fit refinement. Since the last CAPRI assessment, we also developed methods for modeling and designing carbohydrates in Rosetta, and we used them to successfully model oligosaccharide-protein complexes in round 41. Although the results were broadly encouraging, they also highlighted the pressing need to invest in (a) flexible docking algorithms with the ability to model loop and linker motions and in (b) new sampling and scoring methods for oligosaccharide-protein interactions.
Affiliation(s)
- Shourya S. Roy Burman
- Department of Chemical and Biomolecular Engineering, Johns Hopkins University, Baltimore, Maryland
| | - Morgan L. Nance
- Program in Molecular Biophysics, Johns Hopkins University, Baltimore, Maryland
| | - Jason W. Labonte
- Department of Chemical and Biomolecular Engineering, Johns Hopkins University, Baltimore, Maryland
| | - Joseph H. Lubin
- Department of Chemical and Biomolecular Engineering, Johns Hopkins University, Baltimore, Maryland
| | - Naireeta Biswas
- Department of Chemical and Biomolecular Engineering, Johns Hopkins University, Baltimore, Maryland
| | - Jeffrey J. Gray
- Department of Chemical and Biomolecular Engineering, Johns Hopkins University, Baltimore, Maryland
- Program in Molecular Biophysics, Johns Hopkins University, Baltimore, Maryland
- Institute for NanoBioTechnology, Johns Hopkins University, Baltimore, Maryland
- Sidney Kimmel Comprehensive Cancer Center, Johns Hopkins School of Medicine, Baltimore, Maryland
| |
|
49
|
Torchala M, Gerguri T, Chaleil RAG, Gordon P, Russell F, Keshani M, Bates PA. Enhanced sampling of protein conformational states for dynamic cross-docking within the protein-protein docking server SwarmDock. Proteins 2020; 88:962-972. [PMID: 31697436 PMCID: PMC7496321 DOI: 10.1002/prot.25851] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.5] [Received: 07/31/2019] [Revised: 10/02/2019] [Accepted: 11/03/2019] [Indexed: 12/12/2022]
Abstract
The formation of specific protein-protein interactions is often key to a protein's function. During complex formation, each protein component undergoes a change in conformational state; for some, these changes are relatively small and reside primarily at the sidechain level, whereas others may display notable backbone adjustments. One of the classic problems in the protein-docking field is to predict a priori the extent of such conformational changes. In this work, we investigated three protocols to find the most suitable input structure conformations for cross-docking, including a robust sampling approach in normal mode space. Counterintuitively, knowledge of the theoretically best combination of normal modes for unbound-bound transitions does not always lead to the best results. We used a novel spatial partitioning library, Aether Engine (see Supplementary Materials), to efficiently search the conformational states of 56 receptor/ligand pairs, including a recent CAPRI target, in a systematic manner and selected diverse conformations as input to our automated docking server, SwarmDock, a server that allows moderate conformational adjustments during the docking process. In essence, here we present a dynamic cross-docking protocol, which, when benchmarked against the simpler approach of just docking the unbound components, shows a 10% uplift in the quality of the top docking pose.
Affiliation(s)
- Mieczyslaw Torchala
- Biomolecular Modelling Laboratory, The Francis Crick Institute, London, UK
- Hadean Supercomputing Ltd, London, UK
| | - Tereza Gerguri
- Biomolecular Modelling Laboratory, The Francis Crick Institute, London, UK
| | - Paul A. Bates
- Biomolecular Modelling Laboratory, The Francis Crick Institute, London, UK
| |
|
50
|
Tanemura KA, Pei J, Merz KM. Refinement of pairwise potentials via logistic regression to score protein-protein interactions. Proteins 2020; 88:1559-1568. [PMID: 32729132 DOI: 10.1002/prot.25973] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 02/25/2020] [Revised: 05/17/2020] [Accepted: 06/14/2020] [Indexed: 12/20/2022]
Abstract
Protein-protein interactions (PPIs) are ubiquitous and functionally of great importance in biological systems. Hence, the accurate prediction of PPIs by protein-protein docking and scoring tools is highly desirable in order to characterize their structure and biological function. Ab initio docking protocols are divided into the sampling of docking poses to produce at least one near-native structure, and then to evaluate the vast candidate structures by scoring. Concurrent development in both sampling and scoring is crucial for the deployment of protein-protein docking software. In the present work, we apply a machine learning model on pairwise potentials to refine the task of protein quaternary structure native structure detection among decoys. A decoy set was featurized using the Knowledge and Empirical Combined Scoring Algorithm 2 (KECSA2) pairwise potential. The highly unbalanced decoy set was then balanced using a comparison concept between native and decoy structures. The resultant comparison descriptors were used to train a logistic regression (LR) classifier. The LR model yielded the optimal performance for native detection among decoys compared with conventional scoring functions, while exhibiting lesser performance for the detection of low root mean square deviation decoy structures. Its deployment on an independent benchmark set confirms that the scoring function performs competitively relative to other scoring functions. The scripts used are available at https://github.com/TanemuraKiyoto/PPI-native-detection-via-LR.
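One way to read the "comparison concept" in this abstract is as follows: each native/decoy pair yields a difference vector labelled 1 and its negation labelled 0, which balances the two classes by construction, and a logistic regression is then trained on these descriptors. The sketch below illustrates that reading only; it is not the authors' code (their scripts are at the URL above). The two-component feature vectors, the gradient-descent settings and all function names are hypothetical, and the real method uses KECSA2 pairwise-potential features.

```python
import math
from typing import List, Sequence, Tuple

Vector = Sequence[float]


def comparison_descriptors(native: Vector, decoys: List[Vector]) -> Tuple[List[List[float]], List[int]]:
    """Build a balanced set: (native - decoy) labelled 1, (decoy - native) labelled 0."""
    X: List[List[float]] = []
    y: List[int] = []
    for decoy in decoys:
        diff = [a - b for a, b in zip(native, decoy)]
        X.append(diff)
        y.append(1)
        X.append([-v for v in diff])
        y.append(0)
    return X, y


def sigmoid(z: float) -> float:
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def fit_logistic(X: List[List[float]], y: List[int], lr: float = 0.1, epochs: int = 200) -> List[float]:
    """Plain batch gradient descent; no bias term because the descriptors are antisymmetric."""
    w = [0.0] * len(X[0])
    for _ in range(epochs):
        grad = [0.0] * len(w)
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi))) - yi
            for j, xj in enumerate(xi):
                grad[j] += err * xj
        w = [wj - lr * gj / len(X) for wj, gj in zip(w, grad)]
    return w


def prefers_first(w: List[float], a: Vector, b: Vector) -> bool:
    """True if the model judges structure a to be more native-like than structure b."""
    z = sum(wj * (aj - bj) for wj, aj, bj in zip(w, a, b))
    return sigmoid(z) > 0.5


# Toy features (assumed): lower values stand for more favourable energy terms.
native = [-5.0, -3.0]
decoys = [[-1.0, -2.0], [-2.0, 0.0], [0.0, -1.0]]

X, y = comparison_descriptors(native, decoys)
weights = fit_logistic(X, y)
```

A candidate structure can then be scored by counting how many pairwise comparisons it wins against the other structures in the set, for example with repeated calls to `prefers_first`.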
Affiliation(s)
- Kiyoto A Tanemura
- Department of Chemistry, Michigan State University, East Lansing, Michigan, USA
| | - Jun Pei
- Department of Chemistry, Michigan State University, East Lansing, Michigan, USA
| | - Kenneth M Merz
- Department of Chemistry, Michigan State University, East Lansing, Michigan, USA
| |
|