1
Li P, Liu ZP. MuToN Quantifies Binding Affinity Changes upon Protein Mutations by Geometric Deep Learning. Adv Sci (Weinh) 2024:e2402918. [PMID: 38995072 DOI: 10.1002/advs.202402918] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/20/2024] [Revised: 06/04/2024] [Indexed: 07/13/2024]
Abstract
Assessing changes in protein-protein binding affinity due to mutations helps in understanding a wide range of crucial biological processes within cells. Despite significant efforts to create accurate computational models, predicting how mutations affect affinity remains challenging due to the complexity of the biological mechanisms involved. In the present work, a geometric deep learning framework called MuToN is introduced for quantifying protein binding affinity changes upon residue mutations. The method, designed with geometric attention networks, is mechanism-aware. It captures changes in the protein binding interfaces of mutated complexes and assesses the allosteric effects of amino acids. Experimental results highlight MuToN's superiority compared to existing methods. Additionally, MuToN's flexibility and effectiveness are illustrated by its precise predictions of binding affinity changes between SARS-CoV-2 variants and the ACE2 complex.
Affiliation(s)
- Pengpai Li
  - Department of Biomedical Engineering, School of Control Science and Engineering, Shandong University, Jinan, Shandong, 250061, China
- Zhi-Ping Liu
  - Department of Biomedical Engineering, School of Control Science and Engineering, Shandong University, Jinan, Shandong, 250061, China
2
Zhou Y, Myung Y, Rodrigues CM, Ascher D. DDMut-PPI: predicting effects of mutations on protein-protein interactions using graph-based deep learning. Nucleic Acids Res 2024; 52:W207-W214. [PMID: 38783112 PMCID: PMC11223791 DOI: 10.1093/nar/gkae412] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/08/2024] [Revised: 04/30/2024] [Accepted: 05/02/2024] [Indexed: 05/25/2024] Open
Abstract
Protein-protein interactions (PPIs) play a vital role in cellular functions and are essential for therapeutic development and understanding diseases. However, current predictive tools often struggle to balance efficiency and precision in predicting the effects of mutations on these complex interactions. To address this, we present DDMut-PPI, a deep learning model that efficiently and accurately predicts changes in PPI binding free energy upon single and multiple point mutations. Building on the robust Siamese network architecture with graph-based signatures from our prior work, DDMut, the DDMut-PPI model was enhanced with a graph convolutional network operating on the protein interaction interface. We used residue-specific embeddings from the ProtT5 protein language model as node features, and a variety of molecular interactions as edge features. By integrating evolutionary context with spatial information, this framework enables DDMut-PPI to achieve a robust Pearson correlation of up to 0.75 (root mean squared error: 1.33 kcal/mol) in our evaluations, outperforming most existing methods. Importantly, the model demonstrated consistent performance across mutations that increase or decrease binding affinity. DDMut-PPI offers a significant advancement in the field and will serve as a valuable tool for researchers probing the complexities of protein interactions. DDMut-PPI is freely available as a web server and an application programming interface at https://biosig.lab.uq.edu.au/ddmut_ppi.
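The two metrics quoted in this abstract, Pearson correlation and root mean squared error, can be computed as in the minimal Python sketch below. The function names and the ΔΔG values are hypothetical and chosen only for illustration; they are not taken from DDMut-PPI or its evaluation data.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def rmse(xs, ys):
    """Root mean squared error between two equal-length sequences."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(xs, ys)) / len(xs))


# Hypothetical experimental and predicted binding free energy changes (kcal/mol).
experimental = [0.5, 1.2, -0.3, 2.1, 0.0]
predicted = [0.7, 0.9, -0.1, 1.6, 0.4]

print(f"Pearson r = {pearson_r(experimental, predicted):.2f}")
print(f"RMSE = {rmse(experimental, predicted):.2f} kcal/mol")
```

Reporting both values is useful because correlation measures how well the ranking of mutations is reproduced, while RMSE measures the size of the error in kcal/mol.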
Affiliation(s)
- Yunzhuo Zhou
  - The Australian Centre for Ecogenomics, School of Chemistry and Molecular Biosciences, University of Queensland, St Lucia, Queensland 4072, Australia
  - Computational Biology and Clinical Informatics, Baker Heart and Diabetes Institute, Melbourne, Victoria 3004, Australia
- YooChan Myung
  - The Australian Centre for Ecogenomics, School of Chemistry and Molecular Biosciences, University of Queensland, St Lucia, Queensland 4072, Australia
  - Computational Biology and Clinical Informatics, Baker Heart and Diabetes Institute, Melbourne, Victoria 3004, Australia
- Carlos H M Rodrigues
  - The Australian Centre for Ecogenomics, School of Chemistry and Molecular Biosciences, University of Queensland, St Lucia, Queensland 4072, Australia
- David B Ascher
  - The Australian Centre for Ecogenomics, School of Chemistry and Molecular Biosciences, University of Queensland, St Lucia, Queensland 4072, Australia
  - Computational Biology and Clinical Informatics, Baker Heart and Diabetes Institute, Melbourne, Victoria 3004, Australia
3
Yu G, Zhao Q, Bi X, Wang J. DDAffinity: predicting the changes in binding affinity of multiple point mutations using protein 3D structure. Bioinformatics 2024; 40:i418-i427. [PMID: 38940145 PMCID: PMC11211828 DOI: 10.1093/bioinformatics/btae232] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/29/2024] Open
Abstract
MOTIVATION: Mutations are the crucial driving force for biological evolution as they can disrupt protein stability and protein-protein interactions, which have notable impacts on protein structure, function, and expression. However, existing computational methods for predicting protein mutation effects are generally limited to single point mutations with global dependencies, and do not systematically take into account the local and global synergistic epistasis inherent in multiple point mutations.
RESULTS: To this end, we propose a novel spatial and sequential message passing neural network, named DDAffinity, to predict the changes in binding affinity caused by multiple point mutations based on protein 3D structures. Specifically, instead of operating on the whole protein, we perform message passing on the k-nearest neighbor residue graphs to extract pocket features of the protein 3D structures. Furthermore, to learn global topological features, a two-step additive Gaussian noising strategy is applied during training to blur out local details of protein geometry. We evaluate DDAffinity on benchmark datasets and external validation datasets. Overall, the predictive performance of DDAffinity is significantly improved compared with state-of-the-art baselines on multiple point mutations, including end-to-end and pre-training based methods. The ablation studies indicate the reasonable design of all components of DDAffinity. In addition, applications in nonredundant blind testing, predicting mutation effects of SARS-CoV-2 RBD variants, and optimizing human antibody against SARS-CoV-2 illustrate the effectiveness of DDAffinity.
AVAILABILITY AND IMPLEMENTATION: DDAffinity is available at https://github.com/ak422/DDAffinity.
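The k-nearest neighbor residue graph mentioned in this abstract can be illustrated with the short Python sketch below. It assumes each residue is represented by a single 3D coordinate (for example its Cα atom), which the abstract does not specify; the function name and the coordinates are hypothetical, and the sketch does not reproduce DDAffinity's message passing or its Gaussian noising step.

```python
import math


def knn_residue_graph(coords, k):
    """Map each residue index to the indices of its k nearest residues.

    coords: list of (x, y, z) tuples, one per residue (assumed representation).
    Neighbors are ordered from nearest to farthest.
    """
    graph = {}
    for i, ci in enumerate(coords):
        # Distance from residue i to every other residue, sorted ascending.
        dists = sorted(
            (math.dist(ci, cj), j) for j, cj in enumerate(coords) if j != i
        )
        graph[i] = [j for _, j in dists[:k]]
    return graph


# Hypothetical residue coordinates in angstroms.
coords = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (3.0, 0.0, 0.0), (10.0, 0.0, 0.0)]
graph = knn_residue_graph(coords, k=2)
for residue, neighbors in graph.items():
    print(residue, neighbors)
```

Restricting edges to the k nearest residues keeps the graph sparse, so messages are exchanged only within a local spatial neighborhood rather than across the whole protein.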
Affiliation(s)
- Guanglei Yu
  - School of Computer Science and Engineering, Central South University, Changsha 410083, China
  - Hunan Provincial Key Lab on Bioinformatics, Central South University, Changsha 410083, China
  - Medical Engineering and Technology College, Xinjiang Medical University, Urumqi 830017, China
- Qichang Zhao
  - School of Computer Science and Engineering, Central South University, Changsha 410083, China
  - Hunan Provincial Key Lab on Bioinformatics, Central South University, Changsha 410083, China
- Xuehua Bi
  - School of Computer Science and Engineering, Central South University, Changsha 410083, China
  - Hunan Provincial Key Lab on Bioinformatics, Central South University, Changsha 410083, China
  - Medical Engineering and Technology College, Xinjiang Medical University, Urumqi 830017, China
- Jianxin Wang
  - School of Computer Science and Engineering, Central South University, Changsha 410083, China
  - Hunan Provincial Key Lab on Bioinformatics, Central South University, Changsha 410083, China
4
Gurusinghe SNS, Wu Y, DeGrado W, Shifman JM. ProBASS - a language model with sequence and structural features for predicting the effect of mutations on binding affinity. bioRxiv 2024:2024.06.21.600041. [PMID: 38979193 PMCID: PMC11230163 DOI: 10.1101/2024.06.21.600041] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 07/10/2024]
Abstract
Protein-protein interactions (PPIs) govern virtually all cellular processes. Even a single mutation within a PPI can significantly influence overall protein functionality and potentially lead to various types of diseases. To date, numerous approaches have emerged for predicting the change in free energy of binding (ΔΔGbind) resulting from mutations, yet the majority of these methods lack precision. In recent years, protein language models (PLMs) have been developed and have shown powerful predictive capabilities by leveraging both sequence and structural data from protein-protein complexes. Yet, PLMs have not been optimized specifically for predicting ΔΔGbind. We developed an approach to predict effects of mutations on PPI binding affinity based on two of the most advanced protein language models, ESM2 and ESM-IF1, which incorporate PPI sequence and structural features, respectively. We used the two models to generate embeddings for each PPI mutant and subsequently fine-tuned our model by training on a large dataset of experimental ΔΔGbind values. Our model, ProBASS (Protein Binding Affinity from Structure and Sequence), achieved a correlation with experimental ΔΔGbind values of 0.83 ± 0.05 for single mutations and 0.69 ± 0.04 for double mutations when model training and testing were done on the same PDB. Moreover, ProBASS exhibited very high correlation (0.81 ± 0.02) between prediction and experiment when training and testing were performed on a dataset containing 2325 single mutations in 132 PPIs. ProBASS surpasses the state-of-the-art methods in correlation with experimental data and could be further trained as more experimental data become available. Our results demonstrate that the integration of extensive datasets containing ΔΔGbind values across multiple PPIs to refine the pre-trained PLMs represents a successful approach for achieving a precise and broadly applicable model for ΔΔGbind prediction, greatly facilitating future protein engineering and design studies.
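For readers unfamiliar with the quantity being predicted, ΔΔGbind is conventionally defined as the difference between the binding free energies of the mutant and wild-type complexes. The abstract does not state its sign convention, so the form below is an assumption based on the most common usage, in which a positive value means the mutation weakens binding.

```latex
\Delta\Delta G_{\mathrm{bind}}
  = \Delta G_{\mathrm{bind}}^{\mathrm{mut}} - \Delta G_{\mathrm{bind}}^{\mathrm{wt}},
\qquad
\Delta G_{\mathrm{bind}} = RT \ln K_D
```

Here $K_D$ is the dissociation constant (relative to the standard concentration), $R$ is the gas constant, and $T$ is the absolute temperature.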
Affiliation(s)
- Sagara N S Gurusinghe
  - Department of Biological Chemistry, The Alexander Silberman Institute of Life Sciences, The Hebrew University of Jerusalem, Jerusalem, Israel
- Yibing Wu
  - Department of Pharmaceutical Chemistry, School of Pharmacy, University of California San Francisco, CA, USA
- William DeGrado
  - Department of Pharmaceutical Chemistry, School of Pharmacy, University of California San Francisco, CA, USA
- Julia M Shifman
  - Department of Biological Chemistry, The Alexander Silberman Institute of Life Sciences, The Hebrew University of Jerusalem, Jerusalem, Israel
5
Joubbi S, Micheli A, Milazzo P, Maccari G, Ciano G, Cardamone D, Medini D. Antibody design using deep learning: from sequence and structure design to affinity maturation. Brief Bioinform 2024; 25:bbae307. [PMID: 38960409 PMCID: PMC11221890 DOI: 10.1093/bib/bbae307] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/03/2024] [Revised: 05/20/2024] [Accepted: 06/12/2024] [Indexed: 07/05/2024] Open
Abstract
Deep learning has achieved impressive results in various fields such as computer vision and natural language processing, making it a powerful tool in biology. Its applications now encompass cellular image classification, genomic studies and drug discovery. While drug development traditionally focused deep learning applications on small molecules, recent innovations have incorporated it in the discovery and development of biological molecules, particularly antibodies. Researchers have devised novel techniques to streamline antibody development, combining in vitro and in silico methods. In particular, computational power expedites lead candidate generation, scaling and potential antibody development against complex antigens. This survey highlights significant advancements in protein design and optimization, specifically focusing on antibodies. This includes various aspects such as design, folding, antibody-antigen interactions docking and affinity maturation.
Affiliation(s)
- Sara Joubbi
  - Department of Computer Science, University of Pisa, Largo B. Pontecorvo, 3, 56127, Pisa, Italy
  - Data Science for Health (DaScH) Lab, Fondazione Toscana Life Sciences, Via Fiorentina, 1, 53100, Siena, Italy
- Alessio Micheli
  - Department of Computer Science, University of Pisa, Largo B. Pontecorvo, 3, 56127, Pisa, Italy
- Paolo Milazzo
  - Department of Computer Science, University of Pisa, Largo B. Pontecorvo, 3, 56127, Pisa, Italy
- Giuseppe Maccari
  - Data Science for Health (DaScH) Lab, Fondazione Toscana Life Sciences, Via Fiorentina, 1, 53100, Siena, Italy
- Giorgio Ciano
  - Data Science for Health (DaScH) Lab, Fondazione Toscana Life Sciences, Via Fiorentina, 1, 53100, Siena, Italy
- Dario Cardamone
  - Data Science for Health (DaScH) Lab, Fondazione Toscana Life Sciences, Via Fiorentina, 1, 53100, Siena, Italy
- Duccio Medini
  - Data Science for Health (DaScH) Lab, Fondazione Toscana Life Sciences, Via Fiorentina, 1, 53100, Siena, Italy
6
Jing H, Gao Z, Xu S, Shen T, Peng Z, He S, You T, Ye S, Lin W, Sun S. Accurate prediction of antibody function and structure using bio-inspired antibody language model. Brief Bioinform 2024; 25:bbae245. [PMID: 38797969 PMCID: PMC11128484 DOI: 10.1093/bib/bbae245] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/12/2023] [Revised: 04/08/2024] [Accepted: 05/07/2024] [Indexed: 05/29/2024] Open
Abstract
In recent decades, antibodies have emerged as indispensable therapeutics for combating diseases, particularly viral infections. However, their development has been hindered by limited structural information and labor-intensive engineering processes. Fortunately, significant advancements in deep learning methods have facilitated the precise prediction of protein structure and function by leveraging co-evolution information from homologous proteins. Despite these advances, predicting the conformation of antibodies remains challenging due to their unique evolution and the high flexibility of their antigen-binding regions. Here, to address this challenge, we present the Bio-inspired Antibody Language Model (BALM). This model is trained on a vast dataset comprising 336 million 40% nonredundant unlabeled antibody sequences, capturing both unique and conserved properties specific to antibodies. Notably, BALM showcases exceptional performance across four antigen-binding prediction tasks. Moreover, we introduce BALMFold, an end-to-end method derived from BALM, capable of swiftly predicting full atomic antibody structures from individual sequences. Remarkably, BALMFold outperforms well-established methods such as AlphaFold2, IgFold, ESMFold and OmegaFold in the antibody benchmark, demonstrating significant potential to advance innovative engineering and streamline therapeutic antibody development by reducing the need for unnecessary trials. The BALMFold structure prediction server is freely available at https://beamlab-sh.com/models/BALMFold.
Affiliation(s)
- Hongtai Jing
  - Research Institute of Intelligent Complex Systems, Fudan University, Shanghai 200433, China
  - Institute of Science and Technology for Brain-Inspired Intelligence, Fudan University, Shanghai 200433, China
  - MOE Frontiers Center for Brain Science, Fudan University, Shanghai 200032, China
- Zhengtao Gao
  - Research Institute of Intelligent Complex Systems, Fudan University, Shanghai 200433, China
- Sheng Xu
  - Shanghai AI Laboratory, Shanghai 200232, China
- Tao Shen
  - Research Institute of Intelligent Complex Systems, Fudan University, Shanghai 200433, China
  - Zelixir Biotech, Shanghai 201206, China
- Zhangzhi Peng
  - Research Institute of Intelligent Complex Systems, Fudan University, Shanghai 200433, China
- Shwai He
  - Research Institute of Intelligent Complex Systems, Fudan University, Shanghai 200433, China
- Tao You
  - Research Institute of Intelligent Complex Systems, Fudan University, Shanghai 200433, China
- Shuang Ye
  - Department of Gynecologic Oncology, Fudan University Shanghai Cancer Center, Shanghai 200032, China
  - Department of Oncology, Shanghai Medical College, Fudan University, Shanghai 200032, China
- Wei Lin
  - Research Institute of Intelligent Complex Systems, Fudan University, Shanghai 200433, China
  - Institute of Science and Technology for Brain-Inspired Intelligence, Fudan University, Shanghai 200433, China
  - MOE Frontiers Center for Brain Science, Fudan University, Shanghai 200032, China
  - Shanghai AI Laboratory, Shanghai 200232, China
  - School of Mathematical Sciences and Shanghai Center for Mathematical Sciences, Fudan University, Shanghai 200433, China
- Siqi Sun
  - Research Institute of Intelligent Complex Systems, Fudan University, Shanghai 200433, China
  - Shanghai AI Laboratory, Shanghai 200232, China
7
Didiasova M, Cesaro S, Feldhoff S, Bettin I, Tiegel N, Füssgen V, Bertoldi M, Tikkanen R. Functional Characterization of a Spectrum of Genetic Variants in a Family with Succinic Semialdehyde Dehydrogenase Deficiency. Int J Mol Sci 2024; 25:5237. [PMID: 38791277 PMCID: PMC11121183 DOI: 10.3390/ijms25105237] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/04/2024] [Revised: 05/08/2024] [Accepted: 05/09/2024] [Indexed: 05/26/2024] Open
Abstract
Succinic semialdehyde dehydrogenase (SSADH) is a mitochondrial enzyme involved in the catabolism of the neurotransmitter γ-amino butyric acid. Pathogenic variants in the gene encoding this enzyme cause SSADH deficiency, a developmental disease that manifests as hypotonia, autism, and epilepsy. SSADH deficiency patients usually have family-specific gene variants. Here, we describe a family exhibiting four different SSADH variants: Val90Ala, Cys93Phe, and His180Tyr/Asn255Asp (a double variant). We provide a structural and functional characterization of these variants and show that Cys93Phe and Asn255Asp are pathogenic variants that affect the stability of the SSADH protein. Due to the impairment of the cofactor NAD+ binding, these variants show a highly reduced enzyme activity. However, Val90Ala and His180Tyr exhibit normal activity and expression. The His180Tyr/Asn255Asp variant exhibits a highly reduced activity as a recombinant species, is inactive, and shows a very low expression in eukaryotic cells. A treatment with substances that support protein folding by either increasing chaperone protein expression or by chemical means did not increase the expression of the pathogenic variants of the SSADH deficiency patient. However, stabilization of the folding of pathogenic SSADH variants by other substances may provide a treatment option for this disease.
Affiliation(s)
- Miroslava Didiasova
  - Institute of Biochemistry, Medical Faculty, University of Giessen, Friedrichstrasse 24, DE-35390 Giessen, Germany
- Samuele Cesaro
  - Department of Neuroscience, Biomedicine and Movement Sciences, University of Verona, Strada Le Grazie, 8, 37134 Verona, Italy
- Simon Feldhoff
  - Institute of Biochemistry, Medical Faculty, University of Giessen, Friedrichstrasse 24, DE-35390 Giessen, Germany
- Ilaria Bettin
  - Department of Neuroscience, Biomedicine and Movement Sciences, University of Verona, Strada Le Grazie, 8, 37134 Verona, Italy
- Nana Tiegel
  - Institute of Biochemistry, Medical Faculty, University of Giessen, Friedrichstrasse 24, DE-35390 Giessen, Germany
- Vera Füssgen
  - Institute of Biochemistry, Medical Faculty, University of Giessen, Friedrichstrasse 24, DE-35390 Giessen, Germany
- Mariarita Bertoldi
  - Department of Neuroscience, Biomedicine and Movement Sciences, University of Verona, Strada Le Grazie, 8, 37134 Verona, Italy
- Ritva Tikkanen
  - Institute of Biochemistry, Medical Faculty, University of Giessen, Friedrichstrasse 24, DE-35390 Giessen, Germany
8
Nikam R, Jemimah S, Gromiha MM. DeepPPAPredMut: deep ensemble method for predicting the binding affinity change in protein-protein complexes upon mutation. Bioinformatics 2024; 40:btae309. [PMID: 38718170 PMCID: PMC11112046 DOI: 10.1093/bioinformatics/btae309] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/13/2023] [Revised: 04/08/2024] [Accepted: 05/08/2024] [Indexed: 05/24/2024] Open
Abstract
MOTIVATION: Protein-protein interactions underpin many cellular processes and their disruption due to mutations can lead to diseases. With the evolution of protein structure prediction methods like AlphaFold2 and the availability of extensive experimental affinity data, there is a pressing need for updated computational tools that can efficiently predict changes in binding affinity caused by mutations in protein-protein complexes.
RESULTS: We developed a deep ensemble model that leverages protein sequences, predicted structure-based features, and protein functional classes to accurately predict the change in binding affinity due to mutations. The model achieved a correlation of 0.97 and a mean absolute error (MAE) of 0.35 kcal/mol on the training dataset, and maintained robust performance on the test set with a correlation of 0.72 and a MAE of 0.83 kcal/mol. Further validation using Leave-One-Out Complex (LOOC) cross-validation exhibited a correlation of 0.83 and a MAE of 0.51 kcal/mol, indicating consistent performance.
AVAILABILITY AND IMPLEMENTATION: https://web.iitm.ac.in/bioinfo2/DeepPPAPredMut/index.html.
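The Leave-One-Out Complex (LOOC) scheme named in this abstract can be sketched in Python as follows: every mutation belonging to one complex is held out together, so a model is never tested on a complex it saw during training. This is a generic illustration of grouped cross-validation, not the authors' code; the function names, complex identifiers, mutation labels, and ΔΔG values are all hypothetical.

```python
from collections import defaultdict


def leave_one_complex_out(records):
    """Yield (complex_id, train_indices, test_indices) for each complex.

    records: list of (complex_id, mutation, ddg) tuples.
    """
    by_complex = defaultdict(list)
    for idx, (complex_id, _mutation, _ddg) in enumerate(records):
        by_complex[complex_id].append(idx)
    for complex_id, test_idx in by_complex.items():
        held_out = set(test_idx)
        train_idx = [i for i in range(len(records)) if i not in held_out]
        yield complex_id, train_idx, test_idx


def mean_absolute_error(y_true, y_pred):
    """Mean absolute error between two equal-length sequences."""
    return sum(abs(a - b) for a, b in zip(y_true, y_pred)) / len(y_true)


# Hypothetical records: (complex identifier, mutation label, ddG in kcal/mol).
records = [
    ("1ABC", "A10G", 0.5),
    ("1ABC", "L22A", 1.1),
    ("2XYZ", "K5E", -0.2),
]

for complex_id, train_idx, test_idx in leave_one_complex_out(records):
    print(complex_id, "train:", train_idx, "test:", test_idx)
```

Grouping by complex gives a stricter estimate of generalization than a random split, in which mutations from the same complex can appear on both sides.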
Affiliation(s)
- Rahul Nikam
  - Department of Biotechnology, Bhupat and Jyoti Mehta School of Biosciences, Indian Institute of Technology Madras, Chennai 600036, India
- Sherlyn Jemimah
  - Department of Biotechnology, Bhupat and Jyoti Mehta School of Biosciences, Indian Institute of Technology Madras, Chennai 600036, India
  - Department of Biomedical Engineering, Khalifa University, P.O. Box: 127788, Abu Dhabi, United Arab Emirates
- M Michael Gromiha
  - Department of Biotechnology, Bhupat and Jyoti Mehta School of Biosciences, Indian Institute of Technology Madras, Chennai 600036, India
  - Department of Computer Science, Tokyo Tech World Research Hub Initiative (WRHI), Institute of Innovative Research, Tokyo Institute of Technology, 4259 Nagatsutacho, Midori-ku, Yokohama, Kanagawa 226-8501, Japan
9
Gurusinghe SNS, Shifman JM. Cold Spot SCANNER: Colab Notebook for predicting cold spots in protein-protein interfaces. BMC Bioinformatics 2024; 25:172. [PMID: 38689238 PMCID: PMC11061940 DOI: 10.1186/s12859-024-05796-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/31/2023] [Accepted: 04/22/2024] [Indexed: 05/02/2024] Open
Abstract
BACKGROUND: Protein-protein interactions (PPIs) are conveyed through binding interfaces or surface patches on proteins that become buried upon binding. Structural and biophysical analysis of many protein-protein interfaces revealed certain unique features of these surfaces that determine the energetics of interactions and play a critical role in protein evolution. One of the significant aspects of binding interfaces is the presence of binding hot spots, where mutations are highly deleterious for binding. Conversely, binding cold spots are positions occupied by suboptimal amino acids, and several mutations in such positions could lead to affinity enhancement. While there are many software programs for identification of hot spot positions, there is currently a lack of software for cold spot detection.
RESULTS: In this paper, we present Cold Spot SCANNER, a Colab Notebook, which scans a PPI binding interface and identifies cold spots resulting from cavities, unfavorable charge-charge, and unfavorable charge-hydrophobic interactions. The software offers a Py3DMOL-based interface that allows users to visualize cold spots in the context of the protein structure and generates a zip file containing the results for easy download.
CONCLUSIONS: Cold spot identification is of great importance to protein engineering studies and provides a useful insight into protein evolution. Cold Spot SCANNER is open to all users without login requirements and can be accessed at https://colab.research.google.com/github/sagagugit/Cold-Spot-Scanner/blob/main/Cold_Spot_Scanner.ipynb.
Affiliation(s)
- Sagara N S Gurusinghe
  - Department of Biological Chemistry, The Alexander Silberman Institute of Life Sciences, The Hebrew University of Jerusalem, Jerusalem, Israel
- Julia M Shifman
  - Department of Biological Chemistry, The Alexander Silberman Institute of Life Sciences, The Hebrew University of Jerusalem, Jerusalem, Israel
10
Raisinghani N, Alshahrani M, Gupta G, Xiao S, Tao P, Verkhivker G. AlphaFold2-Enabled Atomistic Modeling of Structure, Conformational Ensembles, and Binding Energetics of the SARS-CoV-2 Omicron BA.2.86 Spike Protein with ACE2 Host Receptor and Antibodies: Compensatory Functional Effects of Binding Hotspots in Modulating Mechanisms of Receptor Binding and Immune Escape. J Chem Inf Model 2024; 64:1657-1681. [PMID: 38373700 DOI: 10.1021/acs.jcim.3c01857] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/21/2024]
Abstract
The latest wave of SARS-CoV-2 Omicron variants displayed a growth advantage and increased viral fitness through convergent evolution of functional hotspots that work synchronously to balance fitness requirements for productive receptor binding and efficient immune evasion. In this study, we combined AlphaFold2-based structural modeling approaches with atomistic simulations and mutational profiling of binding energetics and stability for prediction and comprehensive analysis of the structure, dynamics, and binding of the SARS-CoV-2 Omicron BA.2.86 spike variant with ACE2 host receptor and distinct classes of antibodies. We adapted several AlphaFold2 approaches to predict both the structure and conformational ensembles of the Omicron BA.2.86 spike protein in the complex with the host receptor. The results showed that the AlphaFold2-predicted structural ensemble of the BA.2.86 spike protein complex with ACE2 can accurately capture the main conformational states of the Omicron variant. Complementary to AlphaFold2 structural predictions, microsecond molecular dynamics simulations reveal the details of the conformational landscape and produced equilibrium ensembles of the BA.2.86 structures that are used to perform mutational scanning of spike residues and characterize structural stability and binding energy hotspots. The ensemble-based mutational profiling of the receptor binding domain residues in the BA.2 and BA.2.86 spike complexes with ACE2 revealed a group of conserved hydrophobic hotspots and critical variant-specific contributions of the BA.2.86 convergent mutational hotspots R403K, F486P, and R493Q. To examine the immune evasion properties of BA.2.86 in atomistic detail, we performed structure-based mutational profiling of the spike protein binding interfaces with distinct classes of antibodies that displayed significantly reduced neutralization against the BA.2.86 variant. 
The results revealed the molecular basis of compensatory functional effects of the binding hotspots, showing that the BA.2.86 lineage may have evolved to outcompete other Omicron subvariants by improving immune evasion while preserving binding affinity with ACE2 through a compensatory effect of the R493Q and F486P convergent mutational hotspots. This study demonstrated that an integrative approach combining AlphaFold2 predictions with complementary atomistic molecular dynamics simulations and robust ensemble-based mutational profiling of spike residues can enable accurate and comprehensive characterization of structure, dynamics, and binding mechanisms of newly emerging Omicron variants.
Affiliation(s)
- Nishank Raisinghani
  - Keck Center for Science and Engineering, Graduate Program in Computational and Data Sciences, Schmid College of Science and Technology, Chapman University, Orange, California 92866, United States of America
- Mohammed Alshahrani
  - Keck Center for Science and Engineering, Graduate Program in Computational and Data Sciences, Schmid College of Science and Technology, Chapman University, Orange, California 92866, United States of America
- Grace Gupta
  - Keck Center for Science and Engineering, Graduate Program in Computational and Data Sciences, Schmid College of Science and Technology, Chapman University, Orange, California 92866, United States of America
- Sian Xiao
  - Department of Chemistry, Center for Research Computing, Center for Drug Discovery, Design, and Delivery (CD4), Southern Methodist University, Dallas, Texas 75275, United States of America
- Peng Tao
  - Department of Chemistry, Center for Research Computing, Center for Drug Discovery, Design, and Delivery (CD4), Southern Methodist University, Dallas, Texas 75275, United States of America
- Gennady Verkhivker
  - Keck Center for Science and Engineering, Graduate Program in Computational and Data Sciences, Schmid College of Science and Technology, Chapman University, Orange, California 92866, United States of America
  - Department of Biomedical and Pharmaceutical Sciences, Chapman University School of Pharmacy, Irvine, California 92618, United States of America
11
Jarończyk M. Software for Predicting Binding Free Energy of Protein-Protein Complexes and Their Mutants. Methods Mol Biol 2024; 2780:139-147. [PMID: 38987468 DOI: 10.1007/978-1-0716-3985-6_9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 07/12/2024]
Abstract
Protein-protein binding affinity prediction is important for understanding complex biochemical pathways and for uncovering protein interaction networks. Quantitative estimation of the binding affinity changes caused by mutations can provide critical information for protein function annotation and genetic disease diagnoses. The binding free energies of protein-protein complexes can be predicted using several computational tools. This chapter is a summary of software developed for the prediction of binding free energies for protein-protein complexes and their mutants.
12
Pei J, Zhang J, Cong Q. Computational analysis of protein-protein interactions of cancer drivers in renal cell carcinoma. FEBS Open Bio 2024; 14:112-126. [PMID: 37964489 PMCID: PMC10761929 DOI: 10.1002/2211-5463.13732] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/16/2023] [Revised: 10/30/2023] [Accepted: 11/06/2023] [Indexed: 11/16/2023] Open
Abstract
Renal cell carcinoma (RCC) is the most common type of kidney cancer with rising cases in recent years. Extensive research has identified various cancer driver proteins associated with different subtypes of RCC. Most RCC drivers are encoded by tumor suppressor genes and exhibit enrichment in functional categories such as protein degradation, chromatin remodeling, and transcription. To further our understanding of RCC, we utilized powerful deep-learning methods based on AlphaFold to predict protein-protein interactions (PPIs) involving RCC drivers. We predicted high-confidence complexes formed by various RCC drivers, including TCEB1, KMT2C/D and KDM6A of the COMPASS-related complexes, TSC1 of the MTOR pathway, and TRRAP. These predictions provide valuable structural insights into the interaction interfaces, some of which are promising targets for cancer drug design, such as the NRF2-MAFK interface. Cancer somatic missense mutations from large datasets of genome sequencing of RCCs were mapped to the interfaces of predicted and experimental structures of PPIs involving RCC drivers, and their effects on the binding affinity were evaluated. We observed more than 100 cancer somatic mutations affecting the binding affinity of complexes formed by key RCC drivers such as VHL and TCEB1. These findings emphasize the importance of these mutations in RCC pathogenesis and potentially offer new avenues for targeted therapies.
Affiliation(s)
- Jimin Pei
- Eugene McDermott Center for Human Growth and Development, University of Texas Southwestern Medical Center, Dallas, TX, USA
- Department of Biophysics, University of Texas Southwestern Medical Center, Dallas, TX, USA
- Harold C. Simmons Comprehensive Cancer Center, University of Texas Southwestern Medical Center, Dallas, TX, USA
| | - Jing Zhang
- Eugene McDermott Center for Human Growth and Development, University of Texas Southwestern Medical Center, Dallas, TX, USA
- Department of Biophysics, University of Texas Southwestern Medical Center, Dallas, TX, USA
- Harold C. Simmons Comprehensive Cancer Center, University of Texas Southwestern Medical Center, Dallas, TX, USA
| | - Qian Cong
- Eugene McDermott Center for Human Growth and Development, University of Texas Southwestern Medical Center, Dallas, TX, USA
- Department of Biophysics, University of Texas Southwestern Medical Center, Dallas, TX, USA
- Harold C. Simmons Comprehensive Cancer Center, University of Texas Southwestern Medical Center, Dallas, TX, USA
| |
|
13
|
Rana MM, Nguyen DD. Geometric Graph Learning to Predict Changes in Binding Free Energy and Protein Thermodynamic Stability upon Mutation. J Phys Chem Lett 2023; 14:10870-10879. [PMID: 38032742 DOI: 10.1021/acs.jpclett.3c02679] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/02/2023]
Abstract
Accurate prediction of binding free energy changes upon mutations is vital for optimizing drugs, designing proteins, understanding genetic diseases, and cost-effective virtual screening. While machine learning methods show promise in this domain, achieving accuracy and generalization across diverse data sets remains a challenge. This study introduces Geometric Graph Learning for Protein-Protein Interactions (GGL-PPI), a novel approach integrating geometric graph representation and machine learning to forecast mutation-induced binding free energy changes. GGL-PPI leverages atom-level graph coloring and multiscale weighted colored geometric subgraphs to capture structural features of biomolecules, demonstrating superior performance on three standard data sets: AB-Bind, SKEMPI 1.0, and SKEMPI 2.0. The model's efficacy extends to predicting protein thermodynamic stability in a blind test set, providing unbiased predictions for both direct and reverse mutations and showcasing notable generalization. GGL-PPI's precision in predicting changes in binding free energy and stability due to mutations enhances our comprehension of protein complexes, offering valuable insights for drug design endeavors.
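The claim of unbiased predictions for direct and reverse mutations refers to antisymmetry: because free energy is a state function, the change for a reverse mutation should be the negative of the change for the direct one. A minimal sketch of one common way to check this on paired predictions follows. The function name and the numbers are hypothetical; they are not GGL-PPI outputs or its evaluation code.

```python
from statistics import mean

def antisymmetry_report(direct, reverse):
    """Summarize how well paired predictions satisfy ddG(reverse) = -ddG(direct).

    Returns (pearson_r, mean_bias). An ideal antisymmetric predictor gives
    pearson_r == -1 and mean_bias == 0, where the per-pair bias is
    (direct + reverse) / 2.
    """
    if len(direct) != len(reverse) or len(direct) < 2:
        raise ValueError("need two equally long lists with at least two pairs")
    md, mr = mean(direct), mean(reverse)
    cov = sum((d - md) * (r - mr) for d, r in zip(direct, reverse))
    var_d = sum((d - md) ** 2 for d in direct)
    var_r = sum((r - mr) ** 2 for r in reverse)
    pearson_r = cov / (var_d * var_r) ** 0.5
    mean_bias = mean([(d + r) / 2 for d, r in zip(direct, reverse)])
    return pearson_r, mean_bias

# Made-up predictions in kcal/mol, for illustration only.
direct_pred = [1.2, 0.4, -0.3, 2.1]
reverse_pred = [-1.0, -0.5, 0.4, -1.8]
r_value, bias = antisymmetry_report(direct_pred, reverse_pred)
```

A correlation close to -1 together with a mean bias close to 0 is what "unbiased" means in this sense; a predictor that systematically favors destabilizing values would show a positive mean bias.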
Affiliation(s)
- Md Masud Rana
- Department of Mathematics, University of Kentucky, Lexington, Kentucky 40506, United States
| | - Duc Duy Nguyen
- Department of Mathematics, University of Kentucky, Lexington, Kentucky 40506, United States
| |
|
14
|
Tsishyn M, Pucci F, Rooman M. Quantification of biases in predictions of protein-protein binding affinity changes upon mutations. Brief Bioinform 2023; 25:bbad491. [PMID: 38197311 PMCID: PMC10777193 DOI: 10.1093/bib/bbad491] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/30/2023] [Revised: 10/02/2023] [Accepted: 12/05/2023] [Indexed: 01/11/2024] [Open Access]
Abstract
Understanding the impact of mutations on protein-protein binding affinity is a key objective for a wide range of biotechnological applications and for shedding light on disease-causing mutations, which are often located at protein-protein interfaces. Over the past decade, many computational methods using physics-based and/or machine learning approaches have been developed to predict how protein binding affinity changes upon mutations. They all claim to achieve astonishing accuracy on both training and test sets, with performances on standard benchmarks such as SKEMPI 2.0 that seem overly optimistic. Here we benchmarked eight well-known and well-used predictors and identified their biases and dataset dependencies, using not only SKEMPI 2.0 as a test set but also deep mutagenesis data on the severe acute respiratory syndrome coronavirus 2 spike protein in complex with the human angiotensin-converting enzyme 2. We showed that, even though most of the tested methods reach a significant degree of robustness and accuracy, they suffer from limited generalizability properties and struggle to predict unseen mutations. Interestingly, the generalizability problems are more severe for pure machine learning approaches, while physics-based methods are less affected by this issue. Moreover, undesirable prediction biases toward specific mutation properties, the most marked being toward destabilizing mutations, are also observed and should be carefully considered by method developers. We conclude from our analyses that there is room for improvement in the prediction models and suggest ways to check, assess and improve their generalizability and robustness.
Affiliation(s)
- Matsvei Tsishyn
- Computational Biology and Bioinformatics, Université Libre de Bruxelles, Roosevelt Ave, 1050, Brussels, Belgium
- Interuniversity Institute of Bioinformatics in Brussels, Brussels, Belgium
| | - Fabrizio Pucci
- Computational Biology and Bioinformatics, Université Libre de Bruxelles, Roosevelt Ave, 1050, Brussels, Belgium
- Interuniversity Institute of Bioinformatics in Brussels, Brussels, Belgium
| | - Marianne Rooman
- Computational Biology and Bioinformatics, Université Libre de Bruxelles, Roosevelt Ave, 1050, Brussels, Belgium
- Interuniversity Institute of Bioinformatics in Brussels, Brussels, Belgium
| |
|
15
|
Yue Y, Li S, Wang L, Liu H, Tong HHY, He S. MpbPPI: a multi-task pre-training-based equivariant approach for the prediction of the effect of amino acid mutations on protein-protein interactions. Brief Bioinform 2023; 24:bbad310. [PMID: 37651610 PMCID: PMC10516393 DOI: 10.1093/bib/bbad310] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/27/2023] [Revised: 07/12/2023] [Accepted: 08/04/2023] [Indexed: 09/02/2023] [Open Access]
Abstract
The accurate prediction of the effect of amino acid mutations for protein-protein interactions (PPI $\Delta \Delta G$) is a crucial task in protein engineering, as it provides insight into the relevant biological processes underpinning protein binding and provides a basis for further drug discovery. In this study, we propose MpbPPI, a novel multi-task pre-training-based geometric equivariance-preserving framework to predict PPI $\Delta \Delta G$. Pre-training on a strictly screened pre-training dataset is employed to address the scarcity of protein-protein complex structures annotated with PPI $\Delta \Delta G$ values. MpbPPI employs a multi-task pre-training technique, forcing the framework to learn comprehensive backbone and side chain geometric regulations of protein-protein complexes at different scales. After pre-training, MpbPPI can generate high-quality representations capturing the effective geometric characteristics of labeled protein-protein complexes for downstream $\Delta \Delta G$ predictions. MpbPPI serves as a scalable framework supporting different sources of mutant-type (MT) protein-protein complexes for flexible application. Experimental results on four benchmark datasets demonstrate that MpbPPI is a state-of-the-art framework for PPI $\Delta \Delta G$ predictions. The data and source code are available at https://github.com/arantir123/MpbPPI.
Affiliation(s)
- Yang Yue
- School of Computer Science, University of Birmingham, UK
| | - Shu Li
- Centre for Artificial Intelligence Driven Drug Discovery at Macao Polytechnic University
| | - Lingling Wang
- Centre for Artificial Intelligence Driven Drug Discovery at Macao Polytechnic University
| | - Huanxiang Liu
- Centre for Artificial Intelligence Driven Drug Discovery at Macao Polytechnic University
| | - Henry H Y Tong
- Centre for Artificial Intelligence Driven Drug Discovery at Macao Polytechnic University
| | - Shan He
- School of Computer Science, the University of Birmingham, Edgbaston, Birmingham, B15 2TT, UK
| |
|
16
|
Shirvanizadeh N, Vihinen M. VariBench, new variation benchmark categories and data sets. Front Bioinform 2023; 3:1248732. [PMID: 37795169 PMCID: PMC10546188 DOI: 10.3389/fbinf.2023.1248732] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/27/2023] [Accepted: 09/08/2023] [Indexed: 10/06/2023] [Open Access]
Affiliation(s)
| | - Mauno Vihinen
- Department of Experimental Medical Science, Lund University, Lund, Sweden
| |
|
17
|
Peka M, Balatsky V. Analysis of RBD-ACE2 interactions in livestock species as a factor in the spread of SARS-CoV-2 among animals. Vet Anim Sci 2023; 21:100303. [PMID: 37521409 PMCID: PMC10372456 DOI: 10.1016/j.vas.2023.100303] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Indexed: 08/01/2023] [Open Access]
Abstract
The high mutation rate of SARS-CoV-2, which has led to the emergence of a number of virus variants, creates risks of transmission from humans to animal species and the emergence of new animal reservoirs of COVID-19. This study aimed to identify livestock species susceptible to infection and to develop an approach that could be used to assess the hazards that new SARS-CoV-2 variants pose to animals. Bioinformatic analysis was used to evaluate the ability of receptor-binding domains (RBDs) of different SARS-CoV-2 variants to interact with ACE2 receptors of livestock species. The results indicated that the stability of RBD-ACE2 complexes depends both on amino acid residues in the ACE2 sequences of animal species and on mutations in the RBDs of SARS-CoV-2 variants, with the residues in the interface of the RBD-ACE2 complex being the most important. All studied SARS-CoV-2 variants had high affinity for ferret and American mink receptors, while the affinity for horse, donkey, and bird species' receptors significantly increased in the highly mutated Omicron variant. The hazard that future SARS-CoV-2 variants may acquire specificity for new animal species remains high given the mutability of the virus. The continued use and expansion of the bioinformatic approach presented in this study may be relevant for monitoring transmission risks and preventing the emergence of new reservoirs of COVID-19 among animals.
Affiliation(s)
- Mykyta Peka
- V. N. Karazin Kharkiv National University, 4 Svobody Sq, Kharkiv, 61022, Ukraine
- Institute of Pig Breeding and Agroindustrial Production, National Academy of Agrarian Sciences of Ukraine, 1 Shvedska Mohyla St, Poltava, 36013, Ukraine
| | - Viktor Balatsky
- V. N. Karazin Kharkiv National University, 4 Svobody Sq, Kharkiv, 61022, Ukraine
- Institute of Pig Breeding and Agroindustrial Production, National Academy of Agrarian Sciences of Ukraine, 1 Shvedska Mohyla St, Poltava, 36013, Ukraine
| |
|
18
|
Wang G, Liu X, Wang K, Gao Y, Li G, Baptista-Hon DT, Yang XH, Xue K, Tai WH, Jiang Z, Cheng L, Fok M, Lau JYN, Yang S, Lu L, Zhang P, Zhang K. Deep-learning-enabled protein-protein interaction analysis for prediction of SARS-CoV-2 infectivity and variant evolution. Nat Med 2023; 29:2007-2018. [PMID: 37524952 DOI: 10.1038/s41591-023-02483-5] [Citation(s) in RCA: 6] [Impact Index Per Article: 6.0] [Received: 09/13/2022] [Accepted: 06/28/2023] [Indexed: 08/02/2023]
Abstract
Host-pathogen interactions and pathogen evolution are underpinned by protein-protein interactions between viral and host proteins. An understanding of how viral variants affect protein-protein binding is important for predicting viral-host interactions, such as the emergence of new pathogenic SARS-CoV-2 variants. Here we propose an artificial intelligence-based framework called UniBind, in which proteins are represented as a graph at the residue and atom levels. UniBind integrates protein three-dimensional structure and binding affinity and is capable of multi-task learning for heterogeneous biological data integration. In systematic tests on benchmark datasets and further experimental validation, UniBind effectively and scalably predicted the effects of SARS-CoV-2 spike protein variants on their binding affinities to the human ACE2 receptor, as well as to SARS-CoV-2 neutralizing monoclonal antibodies. Furthermore, in a cross-species analysis, UniBind could be applied to predict host susceptibility to SARS-CoV-2 variants and to predict future viral variant evolutionary trends. This in silico approach has the potential to serve as an early warning system for problematic emerging SARS-CoV-2 variants, as well as to facilitate research on protein-protein interactions in general.
Affiliation(s)
- Guangyu Wang
- State Key Laboratory of Networking and Switching Technology, Beijing University of Posts and Telecommunications, Beijing, China.
| | - Xiaohong Liu
- Institute for Artificial Intelligence in Medicine and Faculty of Medicine, Macau University of Science and Technology, Macau, China
- UCL Cancer Institute, University College London, London, UK
| | - Kai Wang
- Department of Big Data and Biomedical Artificial Intelligence, National Biomedical Imaging Center, College of Future Technology, Peking University and Peking-Tsinghua Center for Life Sciences, Beijing, China
| | - Yuanxu Gao
- Guangzhou National Laboratory, Guangzhou, China
| | - Gen Li
- Guangzhou National Laboratory, Guangzhou, China
- Guangzhou Women and Children's Medical Center, Guangzhou, China
| | - Daniel T Baptista-Hon
- Institute for Artificial Intelligence in Medicine and Faculty of Medicine, Macau University of Science and Technology, Macau, China
- Zhuhai International Eye Center and Provincial Key Laboratory of Tumor Interventional Diagnosis and Treatment, Zhuhai People's Hospital and the First Affiliated Hospital of Faculty of Medicine, Macau University of Science and Technology, Guangdong, China
| | - Xiaohong Helena Yang
- Institute for Artificial Intelligence in Medicine and Faculty of Medicine, Macau University of Science and Technology, Macau, China
| | - Kanmin Xue
- Nuffield Laboratory of Ophthalmology, Department of Clinical Neurosciences, University of Oxford, Oxford, UK
| | - Wa Hou Tai
- Institute for Artificial Intelligence in Medicine and Faculty of Medicine, Macau University of Science and Technology, Macau, China
| | - Zeyu Jiang
- State Key Laboratory of Networking and Switching Technology, Beijing University of Posts and Telecommunications, Beijing, China
| | - Linling Cheng
- Institute for Artificial Intelligence in Medicine and Faculty of Medicine, Macau University of Science and Technology, Macau, China
- Zhuhai International Eye Center and Provincial Key Laboratory of Tumor Interventional Diagnosis and Treatment, Zhuhai People's Hospital and the First Affiliated Hospital of Faculty of Medicine, Macau University of Science and Technology, Guangdong, China
| | - Manson Fok
- Institute for Artificial Intelligence in Medicine and Faculty of Medicine, Macau University of Science and Technology, Macau, China
| | - Johnson Yiu-Nam Lau
- Departments of Biology and Chinese Medicine, Hong Kong Baptist University, Hong Kong SAR, China
| | - Shengyong Yang
- State Key Laboratory of Biotherapy, West China Hospital, Sichuan University, Chengdu, China
| | - Ligong Lu
- Institute for Artificial Intelligence in Medicine and Faculty of Medicine, Macau University of Science and Technology, Macau, China
- Zhuhai International Eye Center and Provincial Key Laboratory of Tumor Interventional Diagnosis and Treatment, Zhuhai People's Hospital and the First Affiliated Hospital of Faculty of Medicine, Macau University of Science and Technology, Guangdong, China
| | - Ping Zhang
- State Key Laboratory of Networking and Switching Technology, Beijing University of Posts and Telecommunications, Beijing, China
| | - Kang Zhang
- Institute for Artificial Intelligence in Medicine and Faculty of Medicine, Macau University of Science and Technology, Macau, China.
- Department of Big Data and Biomedical Artificial Intelligence, National Biomedical Imaging Center, College of Future Technology, Peking University and Peking-Tsinghua Center for Life Sciences, Beijing, China.
- Guangzhou National Laboratory, Guangzhou, China.
- Zhuhai International Eye Center and Provincial Key Laboratory of Tumor Interventional Diagnosis and Treatment, Zhuhai People's Hospital and the First Affiliated Hospital of Faculty of Medicine, Macau University of Science and Technology, Guangdong, China.
| |
|
19
|
Pandey P, Panday SK, Rimal P, Ancona N, Alexov E. Predicting the Effect of Single Mutations on Protein Stability and Binding with Respect to Types of Mutations. Int J Mol Sci 2023; 24:12073. [PMID: 37569449 PMCID: PMC10418460 DOI: 10.3390/ijms241512073] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 07/11/2023] [Revised: 07/24/2023] [Accepted: 07/26/2023] [Indexed: 08/13/2023] [Open Access]
Abstract
The development of methods and algorithms to predict the effect of mutations on protein stability, protein-protein interaction, and protein-DNA/RNA binding is driven by the needs of protein engineering and by the need to understand the molecular mechanism of disease-causing variants. The vast majority of the leading methods require a database of experimentally measured folding and binding free energy changes for training. These databases are collections of experimental data taken from scientific investigations typically aimed at probing the role of particular residues in the above-mentioned thermodynamic characteristics, i.e., the mutations are not introduced at random and do not necessarily represent mutations originating from single nucleotide variants (SNVs). Thus, the reported performance of the leading algorithms assessed on these databases or other limited cases may not be applicable for predicting the effect of SNVs seen in the human population. Indeed, we demonstrate that SNVs and non-SNVs are not equally represented in the corresponding databases, and the distribution of the free energy changes is not the same. It is shown that the Pearson correlation coefficients (PCCs) of folding and binding free energy changes obtained in cases involving SNVs are smaller than for non-SNVs, indicating that caution should be used in applying these methods to reveal the effect of human SNVs. Furthermore, it is demonstrated that some methods are sensitive to the chemical nature of the mutations, resulting in PCCs that differ by a factor of four across chemically different mutations. All methods are found to underestimate the energy changes by roughly a factor of 2.
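Two of the findings above, a lower PCC for SNVs and systematic underestimation by about a factor of 2, describe separate properties of a predictor: the PCC measures linear association, while the regression slope of predicted on experimental values measures scale. The sketch below computes both with the standard formulas. The data are synthetic and the function name is an assumption; this is not the authors' analysis code.

```python
def pcc_and_slope(experimental, predicted):
    """Return (Pearson r, least-squares slope of predicted vs experimental)."""
    n = len(experimental)
    if n != len(predicted) or n < 2:
        raise ValueError("need two equally long lists with at least two points")
    mx = sum(experimental) / n
    my = sum(predicted) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(experimental, predicted))
    sxx = sum((x - mx) ** 2 for x in experimental)
    syy = sum((y - my) ** 2 for y in predicted)
    return sxy / (sxx * syy) ** 0.5, sxy / sxx

# Synthetic ddG values (kcal/mol): predictions are exactly half the experimental ones.
exp_ddg = [-1.0, 0.5, 1.5, 3.0]
pred_ddg = [0.5 * x for x in exp_ddg]
r, slope = pcc_and_slope(exp_ddg, pred_ddg)
```

In this toy case the correlation is 1 (up to rounding) while the slope is 0.5, which shows that a perfectly correlated predictor can still underestimate magnitudes by a factor of 2. Reporting both numbers separately for SNV and non-SNV subsets would mirror the kind of comparison the abstract describes.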
Affiliation(s)
- Preeti Pandey
- Department of Physics and Astronomy, Clemson University, Clemson, SC 29634, USA
| | - Shailesh Kumar Panday
- Department of Physics and Astronomy, Clemson University, Clemson, SC 29634, USA
| | - Prawin Rimal
- Department of Physics and Astronomy, Clemson University, Clemson, SC 29634, USA
| | - Nicolas Ancona
- Department of Biological Sciences, Clemson University, Clemson, SC 29634, USA
| | - Emil Alexov
- Department of Physics and Astronomy, Clemson University, Clemson, SC 29634, USA
| |
|
20
|
Mohseni Behbahani Y, Laine E, Carbone A. Deep Local Analysis deconstructs protein-protein interfaces and accurately estimates binding affinity changes upon mutation. Bioinformatics 2023; 39:i544-i552. [PMID: 37387162 DOI: 10.1093/bioinformatics/btad231] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Indexed: 07/01/2023] [Open Access]
Abstract
Motivation: The spectacular recent advances in protein and protein complex structure prediction hold promise for reconstructing interactomes at large scale and residue resolution. Beyond determining the 3D arrangement of interacting partners, modeling approaches should be able to unravel the impact of sequence variations on the strength of the association. Results: In this work, we report on Deep Local Analysis (DLA), a novel and efficient deep learning framework that relies on a strikingly simple deconstruction of protein interfaces into small locally oriented residue-centered cubes and on 3D convolutions recognizing patterns within cubes. Merely based on the two cubes associated with the wild-type and the mutant residues, DLA accurately estimates the binding affinity change for the associated complexes. It achieves a Pearson correlation coefficient of 0.735 on about 400 mutations on unseen complexes. Its generalization capability on blind datasets of complexes is higher than that of state-of-the-art methods. We show that taking into account the evolutionary constraints on residues contributes to predictions. We also discuss the influence of conformational variability on performance. Beyond the predictive power on the effects of mutations, DLA is a general framework for transferring the knowledge gained from the available non-redundant set of complex protein structures to various tasks. For instance, given a single partially masked cube, it recovers the identity and physicochemical class of the central residue. Given an ensemble of cubes representing an interface, it predicts the function of the complex. Availability and implementation: Source code and models are available at http://gitlab.lcqb.upmc.fr/DLA/DLA.git.
Affiliation(s)
- Yasser Mohseni Behbahani
- Laboratory of Computational and Quantitative Biology (LCQB), UMR 7238, Sorbonne Université, CNRS, IBPS, Paris 75005, France
| | - Elodie Laine
- Laboratory of Computational and Quantitative Biology (LCQB), UMR 7238, Sorbonne Université, CNRS, IBPS, Paris 75005, France
| | - Alessandra Carbone
- Laboratory of Computational and Quantitative Biology (LCQB), UMR 7238, Sorbonne Université, CNRS, IBPS, Paris 75005, France
| |
|
21
|
Pandey P, Ghimire S, Wu B, Alexov E. On the linkage of thermodynamics and pathogenicity. Curr Opin Struct Biol 2023; 80:102572. [PMID: 36965249 PMCID: PMC10239362 DOI: 10.1016/j.sbi.2023.102572] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/25/2023] [Revised: 02/16/2023] [Accepted: 02/21/2023] [Indexed: 03/27/2023]
Abstract
This review outlines the effect of disease-causing mutations on protein thermodynamics. Two major thermodynamic quantities that are essential for structural integrity, the folding and binding free energy changes caused by missense mutations, are considered. It is emphasized that disease effects in the case of complex diseases may originate from several mutations over several genes, while monogenic diseases are caused by mutation in a single gene. Nevertheless, in both cases it is shown that pathogenic mutations cause larger perturbations of these thermodynamic quantities than benign mutations do. Recent works demonstrating the effect of pathogenic mutations on these thermodynamic quantities, as well as on structural dynamics and allosteric pathways, are reviewed.
Affiliation(s)
- Preeti Pandey
- Department of Physics and Astronomy, Clemson University, Clemson, SC 29634, USA
| | - Sanjeev Ghimire
- Department of Physics and Astronomy, Clemson University, Clemson, SC 29634, USA
| | - Bohua Wu
- Department of Physics and Astronomy, Clemson University, Clemson, SC 29634, USA
| | - Emil Alexov
- Department of Physics and Astronomy, Clemson University, Clemson, SC 29634, USA.
| |
|
22
|
Durham J, Zhang J, Humphreys IR, Pei J, Cong Q. Recent advances in predicting and modeling protein-protein interactions. Trends Biochem Sci 2023; 48:527-538. [PMID: 37061423 DOI: 10.1016/j.tibs.2023.03.003] [Citation(s) in RCA: 6] [Impact Index Per Article: 6.0] [Received: 08/01/2022] [Revised: 03/03/2023] [Accepted: 03/17/2023] [Indexed: 04/17/2023]
Abstract
Protein-protein interactions (PPIs) drive biological processes, and disruption of PPIs can cause disease. With recent breakthroughs in structure prediction and a deluge of genomic sequence data, computational methods to predict PPIs and model spatial structures of protein complexes are now approaching the accuracy of experimental approaches for permanent interactions and show promise for elucidating transient interactions. As we describe here, the key to this success is rich evolutionary information deciphered from thousands of homologous sequences that coevolve in interacting partners. This covariation signal, revealed by sophisticated statistical and machine learning (ML) algorithms, predicts physiological interactions. Accurate artificial intelligence (AI)-based modeling of protein structures promises to provide accurate 3D models of PPIs at a proteome-wide scale.
Affiliation(s)
- Jesse Durham
- Eugene McDermott Center for Human Growth and Development, University of Texas Southwestern Medical Center, Dallas, TX, USA; Department of Biophysics, University of Texas Southwestern Medical Center, Dallas, TX, USA; Harold C. Simmons Comprehensive Cancer Center, University of Texas Southwestern Medical Center, Dallas, TX, USA
| | - Jing Zhang
- Eugene McDermott Center for Human Growth and Development, University of Texas Southwestern Medical Center, Dallas, TX, USA; Department of Biophysics, University of Texas Southwestern Medical Center, Dallas, TX, USA; Harold C. Simmons Comprehensive Cancer Center, University of Texas Southwestern Medical Center, Dallas, TX, USA
| | - Ian R Humphreys
- Department of Biochemistry, University of Washington, Seattle, WA, USA; Institute for Protein Design, University of Washington, Seattle, WA, USA
| | - Jimin Pei
- Eugene McDermott Center for Human Growth and Development, University of Texas Southwestern Medical Center, Dallas, TX, USA; Department of Biophysics, University of Texas Southwestern Medical Center, Dallas, TX, USA; Harold C. Simmons Comprehensive Cancer Center, University of Texas Southwestern Medical Center, Dallas, TX, USA
| | - Qian Cong
- Eugene McDermott Center for Human Growth and Development, University of Texas Southwestern Medical Center, Dallas, TX, USA; Department of Biophysics, University of Texas Southwestern Medical Center, Dallas, TX, USA; Harold C. Simmons Comprehensive Cancer Center, University of Texas Southwestern Medical Center, Dallas, TX, USA.
| |
|
23
|
Peka M, Balatsky V. The impact of mutation sets in receptor-binding domain of SARS-CoV-2 variants on the stability of RBD–ACE2 complex. Future Virol 2023. [PMID: 37064325 PMCID: PMC10089296 DOI: 10.2217/fvl-2022-0152] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/01/2022] [Accepted: 02/01/2023] [Indexed: 04/08/2023]
Abstract
Aim: Bioinformatic analysis of mutation sets in receptor-binding domain (RBD) of currently and previously circulating SARS-CoV-2 variants of concern (VOCs) and interest (VOIs) to assess their ability to bind the ACE2 receptor. Methods: In silico sequence and structure-oriented approaches were used to evaluate the impact of single and multiple mutations. Results: Mutations detected in VOCs and VOIs led to the reduction of binding free energy of the RBD–ACE2 complex, forming additional chemical bonds with ACE2, and to an increase of RBD–ACE2 complex stability. Conclusion: Mutation sets characteristic of SARS-CoV-2 variants have complex effects on the ACE2 receptor-binding affinity associated with amino acid interactions at mutation sites, as well as on the acquisition of other viral adaptive advantages.
Affiliation(s)
- Mykyta Peka
- V. N. Karazin Kharkiv National University, Kharkiv, 61022, Ukraine
- Institute of Pig Breeding & Agroindustrial Production, National Academy of Agrarian Sciences of Ukraine, Poltava, 36013, Ukraine
| | - Viktor Balatsky
- V. N. Karazin Kharkiv National University, Kharkiv, 61022, Ukraine
- Institute of Pig Breeding & Agroindustrial Production, National Academy of Agrarian Sciences of Ukraine, Poltava, 36013, Ukraine
| |
|
24
|
Biochemical and Bioinformatic Studies of Mutations of Residues at the Monomer-Monomer Interface of Human Ornithine Aminotransferase Leading to Gyrate Atrophy of Choroid and Retina. Int J Mol Sci 2023; 24:ijms24043369. [PMID: 36834788 PMCID: PMC9967328 DOI: 10.3390/ijms24043369] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/22/2022] [Revised: 02/03/2023] [Accepted: 02/06/2023] [Indexed: 02/10/2023] [Open Access]
Abstract
Deficit of human ornithine aminotransferase (hOAT), a mitochondrial tetrameric pyridoxal-5'-phosphate (PLP) enzyme, leads to gyrate atrophy of the choroid and retina (GA). Although 70 pathogenic mutations have been identified, only a few enzymatic phenotypes are known. Here, we report biochemical and bioinformatic analyses of the G51D, G121D, R154L, Y158S, T181M, and P199Q pathogenic variants involving residues located at the monomer-monomer interface. All mutations cause a shift toward a dimeric structure, and changes in tertiary structure, thermal stability, and PLP microenvironment. The impact on these features is less pronounced for the mutations of Gly51 and Gly121, which map to the N-terminal segment of the enzyme, than for those of Arg154, Tyr158, Thr181, and Pro199, which belong to the large domain. These data, together with the predicted ΔΔG values of monomer-monomer binding for the variants, suggest that proper monomer-monomer interactions are correlated with the thermal stability, the PLP binding site, and the tetrameric structure of hOAT. The different impact of these mutations on the catalytic activity is also reported and discussed on the basis of the computational information. Together, these results allow the identification of the molecular defects of these variants, thus extending the knowledge of enzymatic phenotypes of GA patients.
25
Zhang J, Pei J, Durham J, Bos T, Cong Q. Computed cancer interactome explains the effects of somatic mutations in cancers. Protein Sci 2022; 31:e4479. [PMID: 36261849 PMCID: PMC9667826 DOI: 10.1002/pro.4479] [Citation(s) in RCA: 9] [Impact Index Per Article: 4.5] [Received: 06/14/2022] [Revised: 09/28/2022] [Accepted: 10/13/2022] [Indexed: 12/13/2022]
Abstract
Protein-protein interactions (PPIs) are involved in almost all essential cellular processes. Perturbation of PPI networks plays critical roles in tumorigenesis, cancer progression, and metastasis. While numerous high-throughput experiments have produced a vast amount of data for PPIs, these data sets suffer from high false positive rates and exhibit a high degree of discrepancy. Coevolution of amino acid positions between protein pairs has proven to be useful in identifying interacting proteins and providing structural details of the interaction interfaces with the help of deep learning methods like AlphaFold (AF). In this study, we applied AF to investigate the cancer protein-protein interactome. We predicted 1,798 PPIs for cancer driver proteins involved in diverse cellular processes such as transcription regulation, signal transduction, DNA repair, and cell cycle. We modeled the spatial structures for the predicted binary protein complexes, 1,087 of which lacked previous 3D structure information. Our predictions offer novel structural insight into many cancer-related processes such as the MAP kinase cascade and Fanconi anemia pathway. We further investigated the cancer mutation landscape by mapping somatic missense mutations (SMMs) in cancer to the predicted PPI interfaces and performing enrichment and depletion analyses. Interfaces enriched or depleted with SMMs exhibit different preferences for functional categories. Interfaces enriched in mutations tend to function in pathways that are deregulated in cancers and they may help explain the molecular mechanisms of cancers in patients; interfaces lacking mutations appear to be essential for the survival of cancer cells and thus may be future targets for PPI modulating drugs.
Affiliation(s)
- Jing Zhang
- Eugene McDermott Center for Human Growth and Development, University of Texas Southwestern Medical Center, Dallas, Texas, USA
- Department of Biophysics, University of Texas Southwestern Medical Center, Dallas, Texas, USA
- Jimin Pei
- Eugene McDermott Center for Human Growth and Development, University of Texas Southwestern Medical Center, Dallas, Texas, USA
- Department of Biophysics, University of Texas Southwestern Medical Center, Dallas, Texas, USA
- Jesse Durham
- Eugene McDermott Center for Human Growth and Development, University of Texas Southwestern Medical Center, Dallas, Texas, USA
- Department of Biophysics, University of Texas Southwestern Medical Center, Dallas, Texas, USA
- Tasia Bos
- Eugene McDermott Center for Human Growth and Development, University of Texas Southwestern Medical Center, Dallas, Texas, USA
- Department of Biophysics, University of Texas Southwestern Medical Center, Dallas, Texas, USA
- Qian Cong
- Eugene McDermott Center for Human Growth and Development, University of Texas Southwestern Medical Center, Dallas, Texas, USA
- Department of Biophysics, University of Texas Southwestern Medical Center, Dallas, Texas, USA
26
Chen J, Qiu Y, Wang R, Wei GW. Persistent Laplacian projected Omicron BA.4 and BA.5 to become new dominating variants. Comput Biol Med 2022; 151:106262. [PMID: 36379191 PMCID: PMC10754203 DOI: 10.1016/j.compbiomed.2022.106262] [Citation(s) in RCA: 11] [Impact Index Per Article: 5.5] [Received: 10/13/2022] [Revised: 10/21/2022] [Accepted: 10/30/2022] [Indexed: 11/15/2022]
Abstract
Due to its high transmissibility, Omicron BA.1 ousted the Delta variant to become a dominating variant in late 2021 and was replaced by more transmissible Omicron BA.2 in March 2022. An important question is which new variants will dominate in the future. Topology-based deep learning models have had tremendous success in forecasting emerging variants in the past. However, topology is insensitive to homotopic shape evolution in virus-human protein-protein binding, which is crucial to viral evolution and transmission. This challenge is tackled with persistent Laplacian, which is able to capture both the topological change and homotopic shape evolution of data. Persistent Laplacian-based deep learning models are developed to systematically evaluate variant infectivity. Our comparative analysis of Alpha, Beta, Gamma, Delta, Lambda, Mu, and Omicron BA.1, BA.1.1, BA.2, BA.2.11, BA.2.12.1, BA.3, BA.4, and BA.5 unveils that Omicron BA.2.11, BA.2.12.1, BA.3, BA.4, and BA.5 are more contagious than BA.2. In particular, BA.4 and BA.5 are about 36% more infectious than BA.2 and are projected to become new dominant variants by natural selection. Moreover, the proposed models outperform the state-of-the-art methods on three major benchmark datasets for mutation-induced protein-protein binding free energy changes. Our key projection about BA.4 and BA.5's dominance made on May 1, 2022 (see arXiv:2205.00532) became a reality in late June 2022.
Affiliation(s)
- Jiahui Chen
- Department of Mathematics, Michigan State University, East Lansing, MI 48824, USA
- Yuchi Qiu
- Department of Mathematics, Michigan State University, East Lansing, MI 48824, USA
- Rui Wang
- Department of Mathematics, Michigan State University, East Lansing, MI 48824, USA
- Guo-Wei Wei
- Department of Mathematics, Michigan State University, East Lansing, MI 48824, USA; Department of Electrical and Computer Engineering, Michigan State University, East Lansing, MI 48824, USA; Department of Biochemistry and Molecular Biology, Michigan State University, East Lansing, MI 48824, USA.
27
Liu J, Xia KL, Wu J, Yau SST, Wei GW. Biomolecular Topology: Modelling and Analysis. ACTA MATHEMATICA SINICA, ENGLISH SERIES 2022; 38:1901-1938. [PMID: 36407804 PMCID: PMC9640850 DOI: 10.1007/s10114-022-2326-5] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 05/27/2022] [Accepted: 07/12/2022] [Indexed: 05/25/2023]
Abstract
With the great advancement of experimental tools, a tremendous amount of biomolecular data has been generated and accumulated in various databases. The high dimensionality, structural complexity, nonlinearity, and entanglements of biomolecular data, ranging from DNA knots, RNA secondary structures, protein folding configurations, chromosomes, DNA origami, molecular assembly, to others at the macromolecular level, pose a severe challenge in their analysis and characterization. In the past few decades, mathematical concepts, models, algorithms, and tools from algebraic topology, combinatorial topology, computational topology, and topological data analysis have demonstrated great power and begun to play an essential role in tackling the biomolecular data challenge. In this work, we introduce biomolecular topology, which concerns the topological problems and models originating from biomolecular systems. More specifically, biomolecular topology encompasses topological structures, properties and relations that emerge from biomolecular structures, dynamics, interactions, and functions. We discuss the various types of biomolecular topology from structures (of proteins, DNAs, and RNAs), protein folding, and protein assembly. A brief discussion of databanks (and databases), theoretical models, and computational algorithms is presented. Further, we systematically review related topological models, including graphs, simplicial complexes, persistent homology, persistent Laplacians, de Rham-Hodge theory, Yau-Hausdorff distance, and the topology-based machine learning models.
Affiliation(s)
- Jian Liu
- School of Mathematical Sciences, Hebei Normal University, Shijiazhuang, 050024 P. R. China
- Yanqi Lake Beijing Institute of Mathematical Sciences and Applications, Beijing, 101408 P. R. China
- Ke-Lin Xia
- School of Physical and Mathematical Sciences, Nanyang Technological University, Singapore, 639798 Singapore
- Jie Wu
- Yanqi Lake Beijing Institute of Mathematical Sciences and Applications, Beijing, 101408 P. R. China
- Department of Mathematical Sciences, Tsinghua University, Beijing, 100084 P. R. China
- Stephen Shing-Toung Yau
- Yanqi Lake Beijing Institute of Mathematical Sciences and Applications, Beijing, 101408 P. R. China
- Department of Mathematical Sciences, Tsinghua University, Beijing, 100084 P. R. China
- Guo-Wei Wei
- Department of Mathematics & Department of Biochemistry and Molecular Biology & Department of Electrical and Computer Engineering, Michigan State University, Wells Hall 619 Red Cedar Road, East Lansing, MI 48824-1027 USA
28
Protein Function Analysis through Machine Learning. Biomolecules 2022; 12:biom12091246. [PMID: 36139085 PMCID: PMC9496392 DOI: 10.3390/biom12091246] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Received: 07/16/2022] [Revised: 08/22/2022] [Accepted: 08/31/2022] [Indexed: 11/16/2022] Open
Abstract
Machine learning (ML) has been an important arsenal in computational biology used to elucidate protein function for decades. With the recent burgeoning of novel ML methods and applications, new ML approaches have been incorporated into many areas of computational biology dealing with protein function. We examine how ML has been integrated into a wide range of computational models to improve prediction accuracy and gain a better understanding of protein function. The applications discussed are protein structure prediction, protein engineering using sequence modifications to achieve stability and druggability characteristics, molecular docking in terms of protein–ligand binding, including allosteric effects, protein–protein interactions and protein-centric drug discovery. To quantify the mechanisms underlying protein function, a holistic approach that takes structure, flexibility, stability, and dynamics into account is required, as these aspects become inseparable through their interdependence. Another key component of protein function is conformational dynamics, which often manifest as protein kinetics. Computational methods that use ML to generate representative conformational ensembles and quantify differences in conformational ensembles important for function are included in this review. Future opportunities are highlighted for each of these topics.
29
Liu X, Feng H, Wu J, Xia K. Hom-Complex-Based Machine Learning (HCML) for the Prediction of Protein-Protein Binding Affinity Changes upon Mutation. J Chem Inf Model 2022; 62:3961-3969. [PMID: 36040839 DOI: 10.1021/acs.jcim.2c00580] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/28/2022]
Abstract
Protein-protein interactions (PPIs) are involved in almost all biological processes in the cell. Understanding protein-protein interactions holds the key for the understanding of biological functions, diseases and the development of therapeutics. Recently, artificial intelligence (AI) models have demonstrated great power in PPIs. However, a key issue for all AI-based PPI models is efficient molecular representations and featurization. Here, we propose Hom-complex-based PPI representation, and Hom-complex-based machine learning models for the prediction of PPI binding affinity changes upon mutation, for the first time. In our model, various Hom complexes Hom(G1, G) can be generated for the graph representation G of protein-protein complex by using different graphs G1, which reveal G1-related inner connections within the graph representation G of protein-protein complex. Further, for a specific graph G1, a series of nested Hom complexes are generated to give a multiscale characterization of the PPIs. Its persistent homology and persistent Euler characteristic are used as molecular descriptors and further combined with the machine learning model, in particular, gradient boosting tree (GBT). We systematically test our model on the two most-commonly used data sets, that is, SKEMPI and AB-Bind. It has been found that our model outperforms all the existing models as far as we know, which demonstrates the great potential of our model for the analysis of PPIs. Our model can be used for the analysis and design of efficient antibodies for SARS-CoV-2.
Affiliation(s)
- Xiang Liu
- Chern Institute of Mathematics and LPMC, Nankai University, Tianjin, China, 300071; Division of Mathematical Sciences, School of Physical and Mathematical Sciences, Nanyang Technological University, Singapore 637371
- Huitao Feng
- Division of Mathematical Sciences, School of Physical and Mathematical Sciences, Nanyang Technological University, Singapore 637371; Mathematical Science Research Center, Chongqing University of Technology, Chongqing, China, 400054
- Jie Wu
- Yanqi Lake Beijing Institute of Mathematical Sciences and Applications (BIMSA), Beijing, China, 101408
- Kelin Xia
- Division of Mathematical Sciences, School of Physical and Mathematical Sciences, Nanyang Technological University, Singapore 637371
30
Probing the Immune System Dynamics of the COVID-19 Disease for Vaccine Designing and Drug Repurposing Using Bioinformatics Tools. IMMUNO 2022. [DOI: 10.3390/immuno2020022] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/17/2022] Open
Abstract
The pathogenesis of COVID-19 is complicated by immune dysfunction. The impact of immune-based therapy in COVID-19 patients has been well documented, with some notable studies on the use of anti-cytokine medicines. However, the complexity of disease phenotypes, patient heterogeneity and the varying quality of evidence from immunotherapy studies provide problems in clinical decision-making. This review seeks to aid therapeutic decision-making by giving an overview of the immunological responses against COVID-19 disease that may contribute to the severity of the disease. We have extensively discussed theranostic methods for COVID-19 detection. With advancements in technology, bioinformatics has taken studies to a higher level. The paper also discusses the application of bioinformatics and machine learning tools for the diagnosis, vaccine design and drug repurposing against SARS-CoV-2.
31
Gupta S, Azadvari N, Hosseinzadeh P. Design of Protein Segments and Peptides for Binding to Protein Targets. BIODESIGN RESEARCH 2022; 2022:9783197. [PMID: 37850124 PMCID: PMC10521657 DOI: 10.34133/2022/9783197] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Received: 12/29/2021] [Accepted: 03/16/2022] [Indexed: 10/19/2023] Open
Abstract
Recent years have witnessed a rise in methods for accurate prediction of structure and design of novel functional proteins. Design of functional protein fragments and peptides occupies a small, albeit unique, space within the general field of protein design. While the smaller size of these peptides allows for more exhaustive computational methods, flexibility in their structure and sparsity of data compared to proteins, as well as the presence of noncanonical building blocks, add challenges to their design. This review summarizes the current advances in the design of protein fragments and peptides for binding to targets and discusses the challenges in the field, with an eye toward future directions.
Affiliation(s)
- Suchetana Gupta
- Knight Campus Center for Accelerating Scientific Impact, University of Oregon, Eugene OR 97403, USA
- Noora Azadvari
- Knight Campus Center for Accelerating Scientific Impact, University of Oregon, Eugene OR 97403, USA
- Parisa Hosseinzadeh
- Knight Campus Center for Accelerating Scientific Impact, University of Oregon, Eugene OR 97403, USA
32
Ghadie MA, Xia Y. Are transient protein-protein interactions more dispensable? PLoS Comput Biol 2022; 18:e1010013. [PMID: 35404956 PMCID: PMC9000134 DOI: 10.1371/journal.pcbi.1010013] [Citation(s) in RCA: 7] [Impact Index Per Article: 3.5] [Received: 06/13/2021] [Accepted: 03/11/2022] [Indexed: 12/12/2022] Open
Abstract
Protein-protein interactions (PPIs) are key drivers of cell function and evolution. While it is widely assumed that most permanent PPIs are important for cellular function, it remains unclear whether transient PPIs are equally important. Here, we estimate and compare dispensable content among transient PPIs and permanent PPIs in human. Starting with a human reference interactome mapped by experiments, we construct a human structural interactome by building three-dimensional structural models for PPIs, and then distinguish transient PPIs from permanent PPIs using several structural and biophysical properties. We map common mutations from healthy individuals and disease-causing mutations onto the structural interactome, and perform structure-based calculations of the probabilities for common mutations (assumed to be neutral) and disease mutations (assumed to be mildly deleterious) to disrupt transient PPIs and permanent PPIs. Using Bayes' theorem we estimate that a similarly small fraction (<~20%) of both transient and permanent PPIs are completely dispensable, i.e., effectively neutral upon disruption. Hence, transient and permanent interactions are subject to similarly strong selective constraints in the human interactome.
Affiliation(s)
- Yu Xia
- Department of Bioengineering, McGill University, Montreal, Canada
33
Flagellin outer domain dimerization modulates motility in pathogenic and soil bacteria from viscous environments. Nat Commun 2022; 13:1422. [PMID: 35301306 PMCID: PMC8931119 DOI: 10.1038/s41467-022-29069-y] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Received: 10/15/2021] [Accepted: 02/24/2022] [Indexed: 12/01/2022] Open
Abstract
Flagellar filaments function as the propellers of the bacterial flagellum and their supercoiling is key to motility. The outer domains on the surface of the filament are non-critical for motility in many bacteria and their structures and functions are not conserved. Here, we show the atomic cryo-electron microscopy structures for flagellar filaments from enterohemorrhagic Escherichia coli O157:H7, enteropathogenic E. coli O127:H6, Achromobacter, and Sinorhizobium meliloti, where the outer domains dimerize or tetramerize to form either a sheath or a screw-like surface. These dimers are formed by 180° rotations of half of the outer domains. The outer domain sheath (ODS) plays a role in bacterial motility by stabilizing an intermediate waveform and prolonging the tumbling of E. coli cells. Bacteria with these ODS and screw-like flagellar filaments are commonly found in soil and human intestinal environments of relatively high viscosity suggesting a role for the dimerization in these environments. It has been suggested that the outer domains of bacterial flagellins are not needed for motility. Here, the authors show that flagellar filament outer domains from some bacteria have unique structures which can alter the motility of the bacteria.
34
Deep learning guided optimization of human antibody against SARS-CoV-2 variants with broad neutralization. Proc Natl Acad Sci U S A 2022; 119:e2122954119. [PMID: 35238654 PMCID: PMC8931377 DOI: 10.1073/pnas.2122954119] [Citation(s) in RCA: 28] [Impact Index Per Article: 14.0] [Indexed: 12/16/2022] Open
Abstract
Significance: SARS-CoV-2 continues to evolve through emerging variants, more frequently observed with higher transmissibility. Despite the wide application of vaccines and antibodies, the selection pressure on the Spike protein may lead to further evolution of variants that include mutations that can evade immune response. To catch up with the virus's evolution, we introduced a deep learning approach to redesign the complementarity-determining regions (CDRs) to target multiple virus variants and obtained an antibody that broadly neutralizes SARS-CoV-2 variants.
35
Wee J, Xia K. Persistent spectral based ensemble learning (PerSpect-EL) for protein-protein binding affinity prediction. Brief Bioinform 2022; 23:6533501. [PMID: 35189639 DOI: 10.1093/bib/bbac024] [Citation(s) in RCA: 14] [Impact Index Per Article: 7.0] [Received: 11/15/2021] [Revised: 12/30/2021] [Accepted: 01/17/2022] [Indexed: 12/14/2022] Open
Abstract
Protein-protein interactions (PPIs) play a significant role in nearly all cellular and biological activities. Data-driven machine learning models have demonstrated great power in PPIs. However, the design of efficient molecular featurization poses a great challenge for all learning models for PPIs. Here, we propose persistent spectral (PerSpect) based PPI representation and featurization, and PerSpect-based ensemble learning (PerSpect-EL) models for PPI binding affinity prediction, for the first time. In our model, a sequence of Hodge (or combinatorial) Laplacian (HL) matrices at various different scales are generated from a specially designed filtration process. PerSpect attributes, which are statistical and combinatorial properties of spectrum information from these HL matrices, are used as features for PPI characterization. Each PerSpect attribute is input into a 1D convolutional neural network (CNN), and these CNN networks are stacked together in our PerSpect-based ensemble learning models. We systematically test our model on the two most commonly used datasets, i.e. SKEMPI and AB-Bind. It has been found that our model can achieve state-of-the-art results and outperform all existing models to the best of our knowledge.
Affiliation(s)
- JunJie Wee
- Division of Mathematical Sciences, School of Physical and Mathematical Sciences, Nanyang Technological University, Singapore 637371
- Kelin Xia
- Division of Mathematical Sciences, School of Physical and Mathematical Sciences, Nanyang Technological University, Singapore 637371
36
Xiong D, Lee D, Li L, Zhao Q, Yu H. Implications of disease-related mutations at protein-protein interfaces. Curr Opin Struct Biol 2022; 72:219-225. [PMID: 34959033 PMCID: PMC8863207 DOI: 10.1016/j.sbi.2021.11.012] [Citation(s) in RCA: 8] [Impact Index Per Article: 4.0] [Received: 08/02/2021] [Revised: 11/01/2021] [Accepted: 11/18/2021] [Indexed: 02/03/2023]
Abstract
Protein-protein interfaces have been attracting great attention owing to their critical roles in protein-protein interactions and the fact that human disease-related mutations are generally enriched in them. Recently, substantial research progress has been made in this field, which has significantly promoted the understanding and treatment of various human diseases. For example, many studies have discovered the properties of disease-related mutations. Besides, as more large-scale experimental data become available, various computational approaches have been proposed to advance our understanding of disease mutations from the data. Here, we overview recent advances in characteristics of disease-related mutations at protein-protein interfaces, mutation effects on protein interactions, and investigation of mutations on specific diseases.
Affiliation(s)
- Dapeng Xiong
- Department of Computational Biology, Cornell University, Ithaca, NY 14853, USA; Weill Institute for Cell and Molecular Biology, Cornell University, Ithaca, NY 14853, USA
- Dongjin Lee
- Department of Computational Biology, Cornell University, Ithaca, NY 14853, USA; Weill Institute for Cell and Molecular Biology, Cornell University, Ithaca, NY 14853, USA
- Le Li
- Department of Computational Biology, Cornell University, Ithaca, NY 14853, USA; Weill Institute for Cell and Molecular Biology, Cornell University, Ithaca, NY 14853, USA
- Qiuye Zhao
- Department of Computational Biology, Cornell University, Ithaca, NY 14853, USA; Weill Institute for Cell and Molecular Biology, Cornell University, Ithaca, NY 14853, USA
- Haiyuan Yu
- Department of Computational Biology, Cornell University, Ithaca, NY 14853, USA; Weill Institute for Cell and Molecular Biology, Cornell University, Ithaca, NY 14853, USA
37
Statistical potentials from the Gaussian scaling behaviour of chain fragments buried within protein globules. PLoS One 2022; 17:e0254969. [PMID: 35085247 PMCID: PMC8794220 DOI: 10.1371/journal.pone.0254969] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/02/2021] [Accepted: 10/28/2021] [Indexed: 11/19/2022] Open
Abstract
Knowledge-based approaches use the statistics collected from protein data-bank structures to estimate effective interaction potentials between amino acid pairs. Empirical relations are typically employed that are based on the crucial choice of a reference state associated with the null interaction case. Despite their significant effectiveness, the physical interpretation of knowledge-based potentials has been repeatedly questioned, with no consensus on the choice of the reference state. Here we use the fact that the Flory theorem, originally derived for chains in a dense polymer melt, holds also for chain fragments within the core of globular proteins, if the average over buried fragments collected from different non-redundant native structures is considered. After verifying that the ensuing Gaussian statistics, a hallmark of effectively non-interacting polymer chains, holds for a wide range of fragment lengths, although with significant deviations at short spatial scales, we use it to define a ‘bona fide’ reference state. Notably, although the latter does depend on fragment length, deviations from it do not. This allows us to estimate an effective interaction potential which is not biased by the presence of correlations due to the connectivity of the protein chain. We show how different sequence-independent effective statistical potentials can be derived using this approach by coarse-graining the protein representation at varying levels. The possibility of defining sequence-dependent potentials is explored.
38
Baek KT, Kepp KP. Data set and fitting dependencies when estimating protein mutant stability: Toward simple, balanced, and interpretable models. J Comput Chem 2022; 43:504-518. [PMID: 35040492 DOI: 10.1002/jcc.26810] [Citation(s) in RCA: 10] [Impact Index Per Article: 5.0] [Received: 10/06/2021] [Revised: 12/13/2021] [Accepted: 01/03/2022] [Indexed: 12/27/2022]
Abstract
Accurate prediction of protein stability changes upon mutation (ΔΔG) is increasingly important to evolution studies, protein engineering, and screening of disease-causing gene variants but is challenged by biases in training data. We investigated 45 linear regression models trained on data sets that account systematically for destabilization bias and mutation-type bias BM. The models were externally validated on three test data sets probing different pathologies and for internal consistency (symmetry and neutrality). Model structure and performance substantially depended on training data and even fitting method. We developed two final models: SimBa-IB for typical natural mutations and SimBa-SYM for situations where stabilizing and destabilizing mutations occur to a similar extent. SimBa-SYM, despite its simplicity, is essentially non-biased (vs. the Ssym data set) while still performing well for all data sets (R ~ 0.46-0.54, MAE = 1.16-1.24 kcal/mol). The simple models provide advantages in terms of interpretability, use and future improvement, and are freely available on GitHub.
Affiliation(s)
- Kasper P Kepp
- DTU Chemistry, Technical University of Denmark, Lyngby, Denmark
39
Dhusia K, Madrid C, Su Z, Wu Y. EXCESP: A Structure-Based Online Database for Extracellular Interactome of Cell Surface Proteins in Humans. J Proteome Res 2022; 21:349-359. [PMID: 34978816 DOI: 10.1021/acs.jproteome.1c00612] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/29/2022]
Abstract
The interactions between ectodomains of cell surface proteins are vital players in many important cellular processes, such as regulating immune responses, coordinating cell differentiation, and shaping neural plasticity. However, while the construction of a large-scale protein interactome has been greatly facilitated by the development of high-throughput experimental techniques, little progress has been made to support the discovery of extracellular interactome for cell surface proteins. Harnessed by the recent advances in computational modeling of protein-protein interactions, here we present a structure-based online database for the extracellular interactome of cell surface proteins in humans, called EXCESP. The database contains both experimentally determined and computationally predicted interactions among all type-I transmembrane proteins in humans. All structural models for these interactions and their binding affinities were further computationally modeled. Moreover, information such as expression levels of each protein in different cell types and its relation to various signaling pathways from other online resources has also been integrated into the database. In summary, the database serves as a valuable addition to the existing online resources for the study of cell surface proteins. It can contribute to the understanding of the functions of cell surface proteins in the era of systems biology.
Affiliation(s)
- Kalyani Dhusia
- Department of Systems and Computational Biology, Albert Einstein College of Medicine, 1300 Morris Park Avenue, Bronx, New York 10461, United States
- Carlos Madrid
- Department of Systems and Computational Biology, Albert Einstein College of Medicine, 1300 Morris Park Avenue, Bronx, New York 10461, United States; Laboratory for Macromolecular Analysis and Proteomics, Albert Einstein College of Medicine, 1300 Morris Park Avenue, Bronx, New York 10461, United States
- Zhaoqian Su
- Department of Systems and Computational Biology, Albert Einstein College of Medicine, 1300 Morris Park Avenue, Bronx, New York 10461, United States
- Yinghao Wu
- Department of Systems and Computational Biology, Albert Einstein College of Medicine, 1300 Morris Park Avenue, Bronx, New York 10461, United States
| |
|
40
|
Sun T, Chen Y, Wen Y, Zhu Z, Li M. PremPLI: a machine learning model for predicting the effects of missense mutations on protein-ligand interactions. Commun Biol 2021; 4:1311. [PMID: 34799678 PMCID: PMC8604987 DOI: 10.1038/s42003-021-02826-3] [Citation(s) in RCA: 17] [Impact Index Per Article: 5.7] [Received: 04/12/2021] [Accepted: 10/26/2021] [Indexed: 02/07/2023] Open
Abstract
Resistance to small-molecule drugs is the main cause of the failure of therapeutic drugs in clinical practice. Missense mutations altering the binding of ligands to proteins are one of the critical mechanisms that result in genetic disease and drug resistance. Computational methods have made considerable progress in predicting binding affinity changes and identifying resistance mutations, but their prediction accuracy and speed are still unsatisfactory and need further improvement. To address these issues, we introduce a structure-based machine learning method, named PremPLI, for quantitatively estimating the effects of single mutations on ligand binding affinity changes. A comprehensive comparison of the predictive performance of PremPLI with other available methods on two benchmark datasets confirms that our approach performs robustly and presents similar or even higher predictive accuracy than approaches relying on first-principle statistical mechanics and mixed physics- and knowledge-based potentials, while requiring far fewer computational resources. PremPLI can be used for guiding the design of ligand-binding proteins, identifying and understanding disease driver mutations, and finding potential resistance mutations for different drugs. PremPLI is freely available at https://lilab.jysw.suda.edu.cn/research/PremPLI/ and supports large-scale mutational scanning. Sun et al. present PremPLI, a machine learning approach and web tool to predict how missense mutations in a drug’s target protein will affect the drug’s binding affinity. PremPLI can be applied to identify drug resistance mechanisms in cancer cells and microorganisms and develop drugs to combat drug resistance.
Affiliation(s)
- Tingting Sun
- Center for Systems Biology, Department of Bioinformatics, School of Biology and Basic Medical Sciences, Soochow University, 215123, Suzhou, China
| | - Yuting Chen
- Center for Systems Biology, Department of Bioinformatics, School of Biology and Basic Medical Sciences, Soochow University, 215123, Suzhou, China
| | - Yuhao Wen
- Center for Systems Biology, Department of Bioinformatics, School of Biology and Basic Medical Sciences, Soochow University, 215123, Suzhou, China
| | - Zefeng Zhu
- Center for Systems Biology, Department of Bioinformatics, School of Biology and Basic Medical Sciences, Soochow University, 215123, Suzhou, China
| | - Minghui Li
- Center for Systems Biology, Department of Bioinformatics, School of Biology and Basic Medical Sciences, Soochow University, 215123, Suzhou, China
| |
|
41
|
Engineered Fully Human Single-Chain Monoclonal Antibodies to PIM2 Kinase. Molecules 2021; 26:molecules26216436. [PMID: 34770845 PMCID: PMC8588357 DOI: 10.3390/molecules26216436] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 10/12/2021] [Accepted: 10/24/2021] [Indexed: 11/17/2022] Open
Abstract
Proviral integration site of Moloney virus-2 (PIM2) is overexpressed in multiple human cancer cells, and a high level is related to poor prognosis; thus, PIM2 kinase is a rational target of anti-cancer therapeutics. Several chemical inhibitors targeting PIMs/PIM2 or their downstream signaling molecules have been developed for treatment of different cancers. However, their off-target toxicity is common in clinical trials, so they could not be advanced to official approval for clinical application. Here, we produced human single-chain antibody fragments (HuscFvs) to PIM2 by using a phage display library, which was constructed in a way that a portion of phages in the library carried HuscFvs against human own proteins on their surface with the respective antibody genes in the phage genome. Bacterially derived recombinant PIM2 (rPIM2) was used as an antigenic bait to fish out the rPIM2-bound phages from the library. Three E. coli clones transfected with the HuscFv genes derived from the rPIM2-bound phages expressed HuscFvs that also bound to native PIM2 from cancer cells. The HuscFvs presumptively interact with PIM2 at the ATP binding pocket and kinase active loop. They were as effective as a small chemical drug inhibitor (AZD1208, which is an ATP competitive inhibitor of all PIM isoforms for ex vivo use) in inhibiting PIM kinase activity. The HuscFvs should be engineered into a cell-penetrating format and tested further towards clinical application as novel and safe pan-anti-cancer therapeutics.
|
42
|
Chen C, Boorla VS, Banerjee D, Chowdhury R, Cavener VS, Nissly RH, Gontu A, Boyle NR, Vandegrift K, Nair MS, Kuchipudi SV, Maranas CD. Computational prediction of the effect of amino acid changes on the binding affinity between SARS-CoV-2 spike RBD and human ACE2. Proc Natl Acad Sci U S A 2021; 118:e2106480118. [PMID: 34588290 PMCID: PMC8594574 DOI: 10.1073/pnas.2106480118] [Citation(s) in RCA: 50] [Impact Index Per Article: 16.7] [Accepted: 08/23/2021] [Indexed: 01/22/2023] Open
Abstract
The association of the receptor binding domain (RBD) of severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) spike protein with human angiotensin-converting enzyme 2 (hACE2) represents the first required step for cellular entry. SARS-CoV-2 has continued to evolve with the emergence of several novel variants, and amino acid changes in the RBD have been implicated in increased fitness and potential for immune evasion. Reliably predicting the effect of amino acid changes on the ability of the RBD to interact more strongly with hACE2 can help assess the implications for public health and the potential for spillover and adaptation into other animals. Here, we introduce a two-step framework that first relies on 48 independent 4-ns molecular dynamics (MD) trajectories of RBD-hACE2 variants to collect binding energy terms decomposed into Coulombic, covalent, van der Waals, lipophilic, generalized Born solvation, hydrogen bonding, π-π packing, and self-contact correction terms. The second step implements a neural network to classify and quantitatively predict binding affinity changes using the decomposed energy terms as descriptors. The computational base achieves a validation accuracy of 82.8% for classifying single-amino acid substitution variants of the RBD as worsening or improving binding affinity for hACE2 and a correlation coefficient of 0.73 between predicted and experimentally calculated changes in binding affinities. Both metrics are calculated using a fivefold cross-validation test. Our method thus sets up a framework for screening binding affinity changes caused by unknown single- and multiple-amino acid changes, offering a valuable tool to predict host adaptation of SARS-CoV-2 variants toward tighter hACE2 binding.
Affiliation(s)
- Chen Chen
- Department of Chemical Engineering, The Pennsylvania State University, University Park, PA 16802
| | - Veda Sheersh Boorla
- Department of Chemical Engineering, The Pennsylvania State University, University Park, PA 16802
| | - Deepro Banerjee
- The Bioinformatics and Genomics Program, Huck Institutes of the Life Sciences, The Pennsylvania State University, University Park, PA 16802
| | - Ratul Chowdhury
- Department of Chemical Engineering, The Pennsylvania State University, University Park, PA 16802
| | - Victoria S Cavener
- Department of Veterinary and Biomedical Sciences, The Pennsylvania State University, University Park, PA 16802
| | - Ruth H Nissly
- Animal Diagnostic Laboratory, Department of Veterinary and Biomedical Sciences, The Pennsylvania State University, University Park, PA 16802
| | - Abhinay Gontu
- Department of Veterinary and Biomedical Sciences, The Pennsylvania State University, University Park, PA 16802
| | - Nina R Boyle
- Department of Veterinary and Biomedical Sciences, The Pennsylvania State University, University Park, PA 16802
| | - Kurt Vandegrift
- Center for Infectious Disease Dynamics, The Pennsylvania State University, University Park, PA 16802
| | - Meera Surendran Nair
- Department of Veterinary and Biomedical Sciences, The Pennsylvania State University, University Park, PA 16802
- Animal Diagnostic Laboratory, Department of Veterinary and Biomedical Sciences, The Pennsylvania State University, University Park, PA 16802
| | - Suresh V Kuchipudi
- Department of Veterinary and Biomedical Sciences, The Pennsylvania State University, University Park, PA 16802
- Animal Diagnostic Laboratory, Department of Veterinary and Biomedical Sciences, The Pennsylvania State University, University Park, PA 16802
- Center for Infectious Disease Dynamics, The Pennsylvania State University, University Park, PA 16802
| | - Costas D Maranas
- Department of Chemical Engineering, The Pennsylvania State University, University Park, PA 16802
| |
|
43
|
Verkhivker GM, Agajanian S, Oztas DY, Gupta G. Atomistic Simulations and In Silico Mutational Profiling of Protein Stability and Binding in the SARS-CoV-2 Spike Protein Complexes with Nanobodies: Molecular Determinants of Mutational Escape Mechanisms. ACS OMEGA 2021; 6:26354-26371. [PMID: 34660995 PMCID: PMC8515575 DOI: 10.1021/acsomega.1c03558] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 07/06/2021] [Accepted: 09/10/2021] [Indexed: 05/11/2023]
Abstract
Structure-functional studies have recently revealed a spectrum of diverse high-affinity nanobodies with efficient neutralizing capacity against SARS-CoV-2 virus and resilience against mutational escape. In this study, we combine atomistic simulations with the ensemble-based mutational profiling of binding for the SARS-CoV-2 S-RBD complexes with a wide range of nanobodies to identify dynamic and binding affinity fingerprints and characterize the energetic determinants of nanobody-escaping mutations. Using an in silico mutational profiling approach for probing the protein stability and binding, we examine dynamics and energetics of the SARS-CoV-2 complexes with single nanobodies Nb6 and Nb20, VHH E, a pair combination VHH E + U, a biparatopic nanobody VHH VE, and a combination of the CC12.3 antibody and VHH V/W nanobodies. This study characterizes the binding energy hotspots in the SARS-CoV-2 protein and complexes with nanobodies providing a quantitative analysis of the effects of circulating variants and escaping mutations on binding that is consistent with a broad range of biochemical experiments. The results suggest that mutational escape may be controlled through structurally adaptable binding hotspots in the receptor-accessible binding epitope that are dynamically coupled to the stability centers in the distant binding epitope targeted by VHH U/V/W nanobodies. This study offers a plausible mechanism in which through cooperative dynamic changes, nanobody combinations and biparatopic nanobodies can elicit the increased binding affinity response and yield resilience to common escape mutants.
Affiliation(s)
- Gennady M. Verkhivker
- Keck Center for Science and Engineering, Schmid College of Science and Technology, Chapman University, One University Drive, Orange, California 92866, United States
- Department of Biomedical and Pharmaceutical Sciences, Chapman University School of Pharmacy, Irvine, California 92618, United States
| | - Steve Agajanian
- Keck Center for Science and Engineering, Schmid College of Science and Technology, Chapman University, One University Drive, Orange, California 92866, United States
| | - Deniz Yasar Oztas
- Keck Center for Science and Engineering, Schmid College of Science and Technology, Chapman University, One University Drive, Orange, California 92866, United States
| | - Grace Gupta
- Keck Center for Science and Engineering, Schmid College of Science and Technology, Chapman University, One University Drive, Orange, California 92866, United States
| |
|
44
|
Liu X, Luo Y, Li P, Song S, Peng J. Deep geometric representations for modeling effects of mutations on protein-protein binding affinity. PLoS Comput Biol 2021; 17:e1009284. [PMID: 34347784 PMCID: PMC8366979 DOI: 10.1371/journal.pcbi.1009284] [Citation(s) in RCA: 33] [Impact Index Per Article: 11.0] [Received: 11/05/2020] [Revised: 08/16/2021] [Accepted: 07/17/2021] [Indexed: 11/19/2022] Open
Abstract
Modeling the impact of amino acid mutations on protein-protein interaction plays a crucial role in protein engineering and drug design. In this study, we develop GeoPPI, a novel structure-based deep-learning framework to predict the change of binding affinity upon mutations. Based on the three-dimensional structure of a protein, GeoPPI first learns a geometric representation that encodes topology features of the protein structure via a self-supervised learning scheme. These representations are then used as features for training gradient-boosting trees to predict the changes of protein-protein binding affinity upon mutations. We find that GeoPPI is able to learn meaningful features that characterize interactions between atoms in protein structures. In addition, through extensive experiments, we show that GeoPPI achieves new state-of-the-art performance in predicting the binding affinity changes upon both single- and multi-point mutations on six benchmark datasets. Moreover, we show that GeoPPI can accurately estimate the difference of binding affinities between a few recently identified SARS-CoV-2 antibodies and the receptor-binding domain (RBD) of the S protein. These results demonstrate the potential of GeoPPI as a powerful and useful computational tool in protein design and engineering. Our code and datasets are available at: https://github.com/Liuxg16/GeoPPI.
Estimating the binding affinities of protein-protein interactions (PPIs) is crucial to understand protein function and design new functional proteins. Since the experimental measurement in wet-labs is labor-intensive and time-consuming, fast and accurate in silico approaches have received much attention. Although considerable efforts have been made in this direction, predicting the effects of mutations on the protein-protein binding affinity is still a challenging research problem. In this work, we introduce GeoPPI, a novel computational approach that uses deep geometric representations of protein complexes to predict the effects of mutations on the binding affinity. The geometric representations are first learned via a self-supervised learning scheme and then integrated with gradient-boosting trees to accomplish the prediction. We find that the learned representations encode meaningful patterns underlying the interactions between atoms in protein structures. Also, extensive tests on major benchmark datasets show that GeoPPI has made an important improvement over the existing methods in predicting the effects of mutations on the binding affinity.
Affiliation(s)
- Xianggen Liu
- Laboratory for Brain and Intelligence and Department of Biomedical Engineering, Tsinghua University, Beijing, China
- School of Computing and Artificial Intelligence, Southwest Jiaotong University, Chengdu, China
- Beijing Innovation Center for Future Chip, Tsinghua University, Beijing, China
| | - Yunan Luo
- Department of Computer Science, University of Illinois at Urbana-Champaign, Urbana, Illinois, United States of America
| | - Pengyong Li
- Laboratory for Brain and Intelligence and Department of Biomedical Engineering, Tsinghua University, Beijing, China
- Beijing Innovation Center for Future Chip, Tsinghua University, Beijing, China
| | - Sen Song
- Laboratory for Brain and Intelligence and Department of Biomedical Engineering, Tsinghua University, Beijing, China
- Beijing Innovation Center for Future Chip, Tsinghua University, Beijing, China
| | - Jian Peng
- Department of Computer Science, University of Illinois at Urbana-Champaign, Urbana, Illinois, United States of America
| |
|
45
|
Rangel-Chávez CP, Galán-Vásquez E, Pescador-Tapia A, Delaye L, Martínez-Antonio A. RNA polymerases in strict endosymbiont bacteria with extreme genome reduction show distinct erosions that might result in limited and differential promoter recognition. PLoS One 2021; 16:e0239350. [PMID: 34324516 PMCID: PMC8321222 DOI: 10.1371/journal.pone.0239350] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/02/2020] [Accepted: 06/22/2021] [Indexed: 11/26/2022] Open
Abstract
Strict endosymbiont bacteria present a high degree of genome reduction, retain smaller proteins, and in some instances, lack complete functional domains compared to free-living counterparts. Until now, the mechanisms underlying these genetic reductions have not been well understood. In this study, the conservation of RNA polymerases, the essential machinery for gene expression, is analyzed in endosymbiont bacteria with extreme genome reductions. We analyzed the RNA polymerase subunits to identify and define domains, subdomains, and specific amino acids involved in precise biological functions known in Escherichia coli. We also performed phylogenetic analysis and built three-dimensional models over four lineages of endosymbiotic proteobacteria with the smallest genomes known to date: Candidatus Hodgkinia cicadicola, Candidatus Tremblaya phenacola, Candidatus Tremblaya princeps, Candidatus Nasuia deltocephalinicola, and Candidatus Carsonella ruddii. We found that some Hodgkinia strains do not encode the RNA polymerase α subunit. The rest encode genes for the α, β, β', and σ subunits to form the RNA polymerase. However, these subunits are, on average, 16% shorter than their orthologs in E. coli. In the α subunit, the amino-terminal domain is the most conserved. Regarding the β and β' subunits, both the catalytic core and the assembly domains are the most conserved. However, they showed compensatory amino acid substitutions to adapt to changes in the σ subunit. Precisely, the most erosive diversity occurs within the σ subunit. We identified broad amino acid substitution even in the residues that recognize and bind the -10-box promoter element. In an overall conceptual image, the RNA polymerase from Candidatus Nasuia conserved the highest similarity with the Escherichia coli RNA polymerase and its σ70. It might recognize the two main promoter elements (-10 and -35) and the two promoter accessory elements (-10 extended and UP-element). In Candidatus Carsonella, the RNA polymerase could recognize all the promoter elements except the extended -10-box. In Candidatus Tremblaya and Hodgkinia, due to the absence of the α carboxyl-terminal domain, the RNA polymerase might not recognize the UP-promoter element. We also identified the lack of the β flap-tip helix domain in most Hodgkinia strains, which suggests an inability to bind the -35-box promoter element.
Affiliation(s)
- Cynthia Paola Rangel-Chávez
- Biological Engineering Laboratory, Genetic Engineering Department, Center for Research and Advanced Studies of the National Polytechnic Institute, Irapuato Gto, México
| | - Edgardo Galán-Vásquez
- Departamento de Ingeniería de Sistemas Computacionales y Automatización, Instituto de Investigaciones en Matemáticas Aplicadas y en Sistemas, Universidad Nacional Autónoma de México, Coyoacán, Ciudad de México, CDMX, México
| | - Azucena Pescador-Tapia
- Biological Engineering Laboratory, Genetic Engineering Department, Center for Research and Advanced Studies of the National Polytechnic Institute, Irapuato Gto, México
| | - Luis Delaye
- Evolutionary Genomics Laboratory, Genetic Engineering Department, Center for Research and Advanced Studies of the National Polytechnic Institute, Irapuato Gto, México
| | - Agustino Martínez-Antonio
- Biological Engineering Laboratory, Genetic Engineering Department, Center for Research and Advanced Studies of the National Polytechnic Institute, Irapuato Gto, México
| |
|
46
|
Li G, Pahari S, Murthy AK, Liang S, Fragoza R, Yu H, Alexov E. SAAMBE-SEQ: a sequence-based method for predicting mutation effect on protein-protein binding affinity. Bioinformatics 2021; 37:992-999. [PMID: 32866236 PMCID: PMC8128451 DOI: 10.1093/bioinformatics/btaa761] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Received: 06/27/2020] [Revised: 08/17/2020] [Accepted: 08/24/2020] [Indexed: 01/04/2023] Open
Abstract
MOTIVATION The vast majority of human genetic disorders are associated with mutations that affect protein-protein interactions by altering wild-type binding affinity. Therefore, it is extremely important to assess the effect of mutations on protein-protein binding free energy to assist the development of therapeutic solutions. Currently, the most popular approaches use structural information to deliver the predictions, which precludes their application to genome-scale investigations. Indeed, with the progress of genomic sequencing, researchers frequently need to assess the effect of mutations for which there is no structure available. RESULTS Here, we report a Gradient Boosting Decision Tree machine learning algorithm, SAAMBE-SEQ, which is completely sequence-based and does not require structural information at all. SAAMBE-SEQ utilizes 80 features representing evolutionary information, sequence-based features and change of physical properties upon mutation at the mutation site. The approach is shown to achieve a Pearson correlation coefficient (PCC) of 0.83 in 5-fold cross validation in a benchmarking test against experimentally determined binding free energy change (ΔΔG). Further, a blind test set (no-STRUC) is compiled by collecting experimental ΔΔG upon mutation for protein complexes for which no structure is available, and is used to benchmark SAAMBE-SEQ, resulting in a PCC in the range of 0.37-0.46. The accuracy of the SAAMBE-SEQ method is found to be either better than or comparable to that of the most advanced structure-based methods. SAAMBE-SEQ is very fast, available as a webserver and stand-alone code, and indeed utilizes only sequence information, and thus it is applicable for genome-scale investigations to study the effect of mutations on protein-protein interactions. AVAILABILITY AND IMPLEMENTATION SAAMBE-SEQ is available at http://compbio.clemson.edu/saambe_webserver/indexSEQ.php#started. SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
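As a reading aid for the evaluation metric quoted in this abstract, the following is a minimal sketch of how a Pearson correlation coefficient can be computed from k-fold cross-validated predictions. It is not the SAAMBE-SEQ code: the function names, the contiguous unshuffled folds, the pooling of out-of-fold predictions before computing a single PCC, and the one-variable least-squares `fit_line` stand-in model are all assumptions made for illustration.

```python
import math
from typing import Callable, List, Sequence


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(vx * vy)


def kfold_indices(n: int, k: int) -> List[List[int]]:
    """Split range(n) into k contiguous, near-equal folds (no shuffling)."""
    folds, start = [], 0
    for i in range(k):
        size = n // k + (1 if i < n % k else 0)
        folds.append(list(range(start, start + size)))
        start += size
    return folds


def fit_line(xs: Sequence[float], ys: Sequence[float]) -> Callable[[float], float]:
    """Hypothetical stand-in model: one-variable least-squares line."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    slope = sum((a - mx) * (b - my) for a, b in zip(xs, ys)) / sum(
        (a - mx) ** 2 for a in xs
    )
    intercept = my - slope * mx
    return lambda x: slope * x + intercept


def cross_validated_pcc(features, ddg, fit, k: int = 5) -> float:
    """Pool out-of-fold predictions, then compute one PCC (an assumption;
    averaging per-fold PCCs is another common convention)."""
    n = len(ddg)
    preds = [0.0] * n
    for fold in kfold_indices(n, k):
        held_out = set(fold)
        train = [i for i in range(n) if i not in held_out]
        model = fit([features[i] for i in train], [ddg[i] for i in train])
        for i in fold:
            preds[i] = model(features[i])
    return pearson(preds, ddg)
```

In practice the stand-in `fit_line` would be replaced by whatever regressor is being benchmarked, and the folds would normally be shuffled or grouped by protein complex to avoid leakage between training and held-out mutations.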
Affiliation(s)
- Gen Li
- Department of Physics and Astronomy, Clemson University, Clemson, SC 29634, USA
| | - Swagata Pahari
- Department of Physics and Astronomy, Clemson University, Clemson, SC 29634, USA
| | | | - Siqi Liang
- Department of Computational Biology, Cornell University, Ithaca, NY 14850, USA
| | - Robert Fragoza
- Department of Computational Biology, Cornell University, Ithaca, NY 14850, USA
| | - Haiyuan Yu
- Department of Computational Biology, Cornell University, Ithaca, NY 14850, USA
| | - Emil Alexov
- Department of Physics and Astronomy, Clemson University, Clemson, SC 29634, USA
| |
|
47
|
Sequeiros-Borja CE, Surpeta B, Brezovsky J. Recent advances in user-friendly computational tools to engineer protein function. Brief Bioinform 2021; 22:bbaa150. [PMID: 32743637 PMCID: PMC8138880 DOI: 10.1093/bib/bbaa150] [Citation(s) in RCA: 29] [Impact Index Per Article: 9.7] [Received: 04/24/2020] [Revised: 06/03/2020] [Accepted: 06/16/2020] [Indexed: 12/14/2022] Open
Abstract
Progress in technology and algorithms throughout the past decade has transformed the field of protein design and engineering. Computational approaches have become well-engrained in the processes of tailoring proteins for various biotechnological applications. Many tools and methods are developed and upgraded each year to satisfy the increasing demands and challenges of protein engineering. To help protein engineers and bioinformaticians navigate this emerging wave of dedicated software, we have critically evaluated recent additions to the toolbox regarding their application for semi-rational and rational protein engineering. These newly developed tools identify and prioritize hotspots and analyze the effects of mutations for a variety of properties, comprising ligand binding, protein-protein and protein-nucleic acid interactions, and electrostatic potential. We also discuss notable progress to target elusive protein dynamics and associated properties like ligand-transport processes and allosteric communication. Finally, we discuss several challenges these tools face and provide our perspectives on the further development of readily applicable methods to guide protein engineering efforts.
Affiliation(s)
- Carlos Eduardo Sequeiros-Borja
- Laboratory of Biomolecular Interactions and Transport, Department of Gene Expression, Institute of Molecular Biology and Biotechnology, Faculty of Biology, Adam Mickiewicz University and the International Institute of Molecular and Cell Biology in Warsaw, Warsaw, Poland
| | - Bartłomiej Surpeta
- Laboratory of Biomolecular Interactions and Transport, Department of Gene Expression, Institute of Molecular Biology and Biotechnology, Faculty of Biology, Adam Mickiewicz University and the International Institute of Molecular and Cell Biology in Warsaw, Warsaw, Poland
| | - Jan Brezovsky
- Laboratory of Biomolecular Interactions and Transport, Department of Gene Expression, Institute of Molecular Biology and Biotechnology, Faculty of Biology, Adam Mickiewicz University and the International Institute of Molecular and Cell Biology in Warsaw
| |
|
48
|
Wang B, Su Z, Wu Y. Computational Assessment of Protein-Protein Binding Affinity by Reverse Engineering the Energetics in Protein Complexes. GENOMICS PROTEOMICS & BIOINFORMATICS 2021; 19:1012-1022. [PMID: 33838354 PMCID: PMC9403033 DOI: 10.1016/j.gpb.2021.03.004] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 09/10/2018] [Revised: 03/07/2019] [Accepted: 05/17/2019] [Indexed: 11/29/2022]
Abstract
The cellular functions of proteins are maintained by forming diverse complexes. The stability of these complexes is quantified by the measurement of binding affinity, and mutations that alter the binding affinity can cause various diseases such as cancer and diabetes. As a result, accurate estimation of the binding stability and the effects of mutations on changes of binding affinity is a crucial step to understanding the biological functions of proteins and their dysfunctional consequences. It has been hypothesized that the stability of a protein complex is dependent not only on the residues at its binding interface by pairwise interactions but also on all other remaining residues that do not appear at the binding interface. Here, we computationally reconstruct the binding affinity by decomposing it into the contributions of interfacial residues and other non-interfacial residues in a protein complex. We further assume that the contributions of both interfacial and non-interfacial residues to the binding affinity depend on their local structural environments such as solvent-accessible surfaces and secondary structural types. The weights of all corresponding parameters are optimized by Monte-Carlo simulations. After cross-validation against a large-scale dataset, we show that the model not only shows a strong correlation between the absolute values of the experimental and calculated binding affinities, but can also be an effective approach to predict the relative changes of binding affinity from mutations. Moreover, we have found that the optimized weights of many parameters can capture the first-principle chemical and physical features of molecular recognition, therefore reversely engineering the energetics of protein complexes. These results suggest that our method can serve as a useful addition to current computational approaches for predicting binding affinity and understanding the molecular mechanism of protein–protein interactions.
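The idea of reconstructing an affinity as a weighted sum of residue contributions, with the weights tuned by Monte Carlo simulation, can be sketched roughly as follows. This is an illustrative toy, not the authors' model: the linear scoring form, the mean-squared-error objective, the Metropolis acceptance rule, and every name and default value here (`predicted_affinity`, `monte_carlo_fit`, `step_size`, `temperature`) are assumptions introduced for the example.

```python
import math
import random
from typing import List, Sequence, Tuple

# Each sample: (feature counts for one complex, experimental affinity).
# The features are hypothetical, e.g. counts of residues per environment class.
Sample = Tuple[Sequence[float], float]


def predicted_affinity(weights: Sequence[float], counts: Sequence[float]) -> float:
    """Affinity modeled as a weighted sum of per-class residue counts."""
    return sum(w * c for w, c in zip(weights, counts))


def mse(weights: Sequence[float], data: Sequence[Sample]) -> float:
    """Mean squared error between modeled and experimental affinities."""
    return sum((predicted_affinity(weights, c) - y) ** 2 for c, y in data) / len(data)


def monte_carlo_fit(
    data: Sequence[Sample],
    n_weights: int,
    steps: int = 5000,
    step_size: float = 0.1,
    temperature: float = 0.01,
    seed: int = 0,
) -> Tuple[List[float], float]:
    """Metropolis-style search over weights; returns the best weights seen."""
    rng = random.Random(seed)
    current = [0.0] * n_weights
    current_err = mse(current, data)
    best, best_err = list(current), current_err
    for _ in range(steps):
        # Perturb one randomly chosen weight.
        trial = list(current)
        i = rng.randrange(n_weights)
        trial[i] += rng.uniform(-step_size, step_size)
        trial_err = mse(trial, data)
        delta = trial_err - current_err
        # Always accept improvements; accept worse moves with Boltzmann probability.
        if delta <= 0 or rng.random() < math.exp(-delta / temperature):
            current, current_err = trial, trial_err
            if current_err < best_err:
                best, best_err = list(current), current_err
    return best, best_err
```

Called on a list of `(counts, affinity)` pairs, `monte_carlo_fit` returns the lowest-error weight vector visited together with its error; a held-out set would be needed to judge whether such weights generalize, as in the cross-validation the abstract describes.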
Affiliation(s)
- Bo Wang
- Department of Systems and Computational Biology, Albert Einstein College of Medicine, 1300 Morris Park Avenue, Bronx, NY 10461, USA
| | - Zhaoqian Su
- Department of Systems and Computational Biology, Albert Einstein College of Medicine, 1300 Morris Park Avenue, Bronx, NY 10461, USA
| | - Yinghao Wu
- Department of Systems and Computational Biology, Albert Einstein College of Medicine, 1300 Morris Park Avenue, Bronx, NY 10461, USA
| |
|
49
|
Wu FX, Yang JF, Mei LC, Wang F, Hao GF, Yang GF. PIIMS Server: A Web Server for Mutation Hotspot Scanning at the Protein-Protein Interface. J Chem Inf Model 2021; 61:14-20. [PMID: 33400510 DOI: 10.1021/acs.jcim.0c00966] [Citation(s) in RCA: 9] [Impact Index Per Article: 3.0] [Indexed: 12/21/2022]
Abstract
Protein-protein interactions (PPIs) play vital roles in regulating biological processes, such as cellular and signaling pathways. Hotspots are certain residues located at protein-protein interfaces that contribute more to protein-protein binding than other residues. Research on the mutational effects of hotspots is important for understanding basic aspects of protein association. Hence, various computational tools have been developed to explore the impact of mutation hotspots, which will allow a better understanding of the forces that drive PPIs. However, tools that provide comprehensive substitutions at hotspots are still rare. Hence, there is a strong need for a new free web server to explore mutational effects of hotspots. Herein we introduce a web server named PIIMS that integrates molecular dynamics simulation and one-step free energy perturbation. It contains two main computational functions: (1) computational alanine scanning analysis to identify hotspots and (2) full mutation scanning analysis to evaluate the effects of hotspot mutations. We rigorously validated its ability to predict binding free energy changes by using large and diverse datasets including 1,341 mutations from 50 PPIs, with a correlation coefficient of R = 0.75. Unlike existing tools, PIIMS can further evaluate hotspot residues with regard to their different mutations. The PIIMS web server (accessible at http://chemyang.ccnu.edu.cn/ccb/server/PIIMS/index.php) is free and open to all users without login requirements.
Affiliation(s)
- Feng-Xu Wu
- Key Laboratory of Pesticide & Chemical Biology, Ministry of Education, College of Chemistry, Central China Normal University, Wuhan 430079, P. R. China
- International Joint Research Center for Intelligent Biosensor Technology and Health, Central China Normal University, Wuhan 430079, P. R. China
| | - Jing-Fang Yang
- Key Laboratory of Pesticide & Chemical Biology, Ministry of Education, College of Chemistry, Central China Normal University, Wuhan 430079, P. R. China
- International Joint Research Center for Intelligent Biosensor Technology and Health, Central China Normal University, Wuhan 430079, P. R. China
| | - Long-Can Mei
- Key Laboratory of Pesticide & Chemical Biology, Ministry of Education, College of Chemistry, Central China Normal University, Wuhan 430079, P. R. China
- International Joint Research Center for Intelligent Biosensor Technology and Health, Central China Normal University, Wuhan 430079, P. R. China
| | - Fan Wang
- Key Laboratory of Pesticide & Chemical Biology, Ministry of Education, College of Chemistry, Central China Normal University, Wuhan 430079, P. R. China
- International Joint Research Center for Intelligent Biosensor Technology and Health, Central China Normal University, Wuhan 430079, P. R. China
| | - Ge-Fei Hao
- Key Laboratory of Pesticide & Chemical Biology, Ministry of Education, College of Chemistry, Central China Normal University, Wuhan 430079, P. R. China
- International Joint Research Center for Intelligent Biosensor Technology and Health, Central China Normal University, Wuhan 430079, P. R. China
- State Key Laboratory Breeding Base of Green Pesticide and Agricultural Bioengineering, Key Laboratory of Green Pesticide and Agricultural Bioengineering, Ministry of Education, Research and Development Center for Fine Chemicals, Guizhou University, Guiyang 550025, P. R. China
| | - Guang-Fu Yang
- Key Laboratory of Pesticide & Chemical Biology, Ministry of Education, College of Chemistry, Central China Normal University, Wuhan 430079, P. R. China
- International Joint Research Center for Intelligent Biosensor Technology and Health, Central China Normal University, Wuhan 430079, P. R. China
- Collaborative Innovation Center of Chemical Science and Engineering, Tianjin 300072, P. R. China
| |
|
50
|
Gonzalez TR, Martin KP, Barnes JE, Patel JS, Ytreberg FM. Assessment of software methods for estimating protein-protein relative binding affinities. PLoS One 2020; 15:e0240573. [PMID: 33347442 PMCID: PMC7751979 DOI: 10.1371/journal.pone.0240573] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Received: 09/26/2020] [Accepted: 12/07/2020] [Indexed: 11/19/2022] Open
Abstract
A growing number of computational tools have been developed to accurately and rapidly predict the impact of amino acid mutations on protein-protein relative binding affinities. Such tools have many applications, for example, designing new drugs and studying evolutionary mechanisms. In the search for accuracy, many of these methods employ expensive yet rigorous molecular dynamics simulations. By contrast, non-rigorous methods use less exhaustive statistical mechanics, allowing for more efficient calculations. However, it is unclear if such methods retain enough accuracy to replace rigorous methods in binding affinity calculations. This trade-off between accuracy and computational expense makes it difficult to determine the best method for a particular system or study. Here, eight non-rigorous computational methods were assessed using eight antibody-antigen and eight non-antibody-antigen complexes for their ability to accurately predict relative binding affinities (ΔΔG) for 654 single mutations. In addition to assessing accuracy, we analyzed the CPU cost and performance for each method using a variety of physico-chemical structural features. This allowed us to posit scenarios in which each method may be best utilized. Most methods performed worse when applied to antibody-antigen complexes compared to non-antibody-antigen complexes. Rosetta-based JayZ and EasyE methods classified mutations as destabilizing (ΔΔG < -0.5 kcal/mol) with high (83-98%) accuracy and a relatively low computational cost for non-antibody-antigen complexes. Some of the most accurate results for antibody-antigen systems came from combining molecular dynamics with FoldX with a correlation coefficient (r) of 0.46, but this was also the most computationally expensive method. Overall, our results suggest these methods can be used to quickly and accurately predict stabilizing versus destabilizing mutations but are less accurate at predicting actual binding affinities. 
This study highlights the need for continued development of reliable, accessible, and reproducible methods for predicting binding affinities in antibody-antigen proteins and provides a recipe for using current methods.
Affiliation(s)
- Tawny R. Gonzalez
- Institute for Modeling Collaboration and Innovation, University of Idaho, Moscow, Idaho, United States of America
| | - Kyle P. Martin
- Institute for Modeling Collaboration and Innovation, University of Idaho, Moscow, Idaho, United States of America
- Department of Physics, University of Idaho, Moscow, Idaho, United States of America
| | - Jonathan E. Barnes
- Institute for Modeling Collaboration and Innovation, University of Idaho, Moscow, Idaho, United States of America
- Department of Physics, University of Idaho, Moscow, Idaho, United States of America
| | - Jagdish Suresh Patel
- Institute for Modeling Collaboration and Innovation, University of Idaho, Moscow, Idaho, United States of America
- Department of Biological Sciences, University of Idaho, Moscow, Idaho, United States of America
| | - F. Marty Ytreberg
- Institute for Modeling Collaboration and Innovation, University of Idaho, Moscow, Idaho, United States of America
- Department of Physics, University of Idaho, Moscow, Idaho, United States of America
| |
|