1
Hribar-Lee B, Lukšič M. Biophysical Principles Emerging from Experiments on Protein-Protein Association and Aggregation. Annu Rev Biophys 2024; 53:1-18. [PMID: 37906740 DOI: 10.1146/annurev-biophys-030722-111729] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/02/2023]
Abstract
Protein-protein association and aggregation are fundamental processes that play critical roles in various biological phenomena, from cellular signaling to disease progression. Understanding the underlying biophysical principles governing these processes is crucial for elucidating their mechanisms and developing strategies for therapeutic intervention. In this review, we provide an overview of recent experimental studies focused on protein-protein association and aggregation. We explore the key biophysical factors that influence these processes, including protein structure, conformational dynamics, and intermolecular interactions. We discuss the effects of environmental conditions such as temperature, pH and related buffer-specific effects, and ionic strength and related ion-specific effects on protein aggregation. The effects of polymer crowders and sugars are also addressed. We list the techniques used to study aggregation. We analyze emerging trends and challenges in the field, including the development of computational models and the integration of multidisciplinary approaches for a comprehensive understanding of protein-protein association and aggregation.
Affiliation(s)
- Barbara Hribar-Lee
  - Faculty of Chemistry and Chemical Technology, University of Ljubljana, Ljubljana, Slovenia
- Miha Lukšič
  - Faculty of Chemistry and Chemical Technology, University of Ljubljana, Ljubljana, Slovenia
2
Kamal H, Zafar MM, Parvaiz A, Razzaq A, Elhindi KM, Ercisli S, Qiao F, Jiang X. Gossypium hirsutum calmodulin-like protein (CML 11) interaction with geminivirus encoded protein using bioinformatics and molecular techniques. Int J Biol Macromol 2024; 269:132095. [PMID: 38710255 DOI: 10.1016/j.ijbiomac.2024.132095] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/09/2023] [Revised: 03/24/2024] [Accepted: 05/03/2024] [Indexed: 05/08/2024]
Abstract
Plant viruses are the most abundant destructive agents that exist in every ecosystem, causing severe diseases in multiple crops worldwide. Currently, a major gap exists in computational biology in determining how plant viruses interact with their hosts. We lay out a strategy to extract virus-host protein interactions using various protein binding and interface methods for Geminiviridae, the second-largest virus family. Using this approach, the transcriptional activator protein (TrAP/C2) encoded by Cotton leaf curl Kokhran virus (CLCuKoV) and Cotton leaf curl Multan virus (CLCuMV) showed strong binding affinity with the calmodulin-like (CML) protein of Gossypium hirsutum (Gh-CML11). A more negative change in Gibbs free energy between TrAP and Gh-CML11 indicated strong binding affinity. Consensus from a gene ontology database and in silico nuclear localization signal (NLS) tools identified the subcellular localization of TrAP in the nucleus, associated with Gh-CML11 for virus infection. Data based on interaction prediction and docking methods provide evidence that full-length and truncated C2 bind strongly to Gh-CML11. These computational data were further validated with molecular results collected from yeast two-hybrid, bimolecular fluorescence complementation, and pull-down assays. In this work, we also show the effects of full-length and truncated TrAP on plant machinery. This is the first extensive report delineating a role of a CML protein from cotton in relation to the begomovirus-encoded transcription activator protein.
Affiliation(s)
- Hira Kamal
  - Department of Plant Pathology, Washington State University, Pullman, WA, USA
- Muhammad Mubashar Zafar
  - Sanya Institute of Breeding and Multiplication/School of Tropical Agriculture and Forestry, Hainan University, Sanya, China
- Aqsa Parvaiz
  - Department of Biochemistry and Biotechnology, The Women University Multan, Multan, Pakistan
- Abdul Razzaq
  - Institute of Molecular Biology and Biotechnology, The University of Lahore, Lahore, Pakistan
- Khalid M Elhindi
  - Plant Production Department, College of Food & Agriculture Sciences, King Saud University, P.O. Box 2460, Riyadh 11451, Saudi Arabia
- Sezai Ercisli
  - Department of Horticulture, Faculty of Agriculture, Ataturk University, 25240 Erzurum, Turkey
- Fei Qiao
  - Sanya Institute of Breeding and Multiplication/School of Tropical Agriculture and Forestry, Hainan University, Sanya, China
- Xuefei Jiang
  - Sanya Institute of Breeding and Multiplication/School of Tropical Agriculture and Forestry, Hainan University, Sanya, China
3
Parvathy J, Yazhini A, Srinivasan N, Sowdhamini R. Interfacial residues in protein-protein complexes are in the eyes of the beholder. Proteins 2024; 92:509-528. [PMID: 37982321 DOI: 10.1002/prot.26628] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/13/2023] [Revised: 10/14/2023] [Accepted: 10/17/2023] [Indexed: 11/21/2023]
Abstract
Interactions between proteins are vital in almost all biological processes. The characterization of protein-protein interactions helps us understand the mechanistic basis of biological processes, thereby enabling the manipulation of proteins for biotechnological and clinical purposes. The interface residues of a protein-protein complex are assumed to have the following two properties: (a) they always interact with a residue of a partner protein, which forms the basis for distance-based interface residue identification methods, and (b) they are solvent-exposed in the isolated form of the protein and become buried in the complex form, which forms the basis for Accessible Surface Area (ASA)-based methods. The study interrogates this popular assumption by recognizing interface residues in protein-protein complexes through these two methods. The results show that a few residues are identified uniquely by each method, and the extent of conservation, propensities, and their contribution to the stability of protein-protein interaction varies substantially between these residues. The case study analyses showed that interface residues, unique to distance, participate in crucial interactions that hold the proteins together, whereas the interface residues unique to the ASA method have a potential role in the recognition, dynamics, and specificity of the complex and can also be a hotspot. Overall, the study recommends applying both distance and ASA methods so that some interface residues missed by either method but crucial to the stability, recognition, dynamics, and function of protein-protein complexes are identified in a complementary manner.
Affiliation(s)
- Jayadevan Parvathy
  - Interdisciplinary Mathematical Sciences Initiative (IMI), Indian Institute of Science, Bangalore, India
  - Molecular Biophysics Unit (MBU), Indian Institute of Science, Bangalore, India
- Ramanathan Sowdhamini
  - Molecular Biophysics Unit (MBU), Indian Institute of Science, Bangalore, India
  - National Center for Biological Sciences (NCBS), Bangalore, India
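As a rough illustration of the two interface definitions contrasted in the Parvathy et al. abstract above, the sketch below labels interface residues once by inter-chain distance and once by loss of accessible surface area (ASA), then lists the residues unique to each definition. The coordinates, ASA values, the 5 Å distance cutoff, and the 1 Å² burial threshold are all invented for illustration; the abstract does not state the cutoffs the study used, and a real analysis would work on all atoms of a solved structure rather than one point per residue.

```python
from math import dist

# Hypothetical toy data: one representative coordinate per residue, in angstroms.
chain_a = {"A1": (0.0, 0.0, 0.0), "A2": (2.0, 0.0, 0.0), "A3": (12.0, 0.0, 0.0)}
chain_b = {"B1": (0.0, 4.0, 0.0), "B2": (20.0, 0.0, 0.0)}

# Hypothetical ASA values (square angstroms) for chain A alone and in the complex.
asa_isolated = {"A1": 80.0, "A2": 60.0, "A3": 90.0}
asa_complex = {"A1": 20.0, "A2": 60.0, "A3": 70.0}


def interface_by_distance(chain, partner, cutoff=5.0):
    """Residues of `chain` within `cutoff` of any residue of `partner`."""
    return {
        res for res, xyz in chain.items()
        if any(dist(xyz, other) <= cutoff for other in partner.values())
    }


def interface_by_asa(isolated, complexed, min_loss=1.0):
    """Residues that lose at least `min_loss` of ASA on complex formation."""
    return {res for res in isolated if isolated[res] - complexed[res] >= min_loss}


by_distance = interface_by_distance(chain_a, chain_b)
by_asa = interface_by_asa(asa_isolated, asa_complex)

print("distance only:", sorted(by_distance - by_asa))
print("ASA only:", sorted(by_asa - by_distance))
print("both:", sorted(by_distance & by_asa))
```

Taking the union of the two sets corresponds to the complementary use of both methods that the abstract recommends.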
4
Kiani YS, Jabeen I. Challenges of Protein-Protein Docking of the Membrane Proteins. Methods Mol Biol 2024; 2780:203-255. [PMID: 38987471 DOI: 10.1007/978-1-0716-3985-6_12] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 07/12/2024]
Abstract
Despite recent advances in the determination of high-resolution membrane protein (MP) structures, the structural and functional characterization of MPs remains extremely challenging, mainly because of their hydrophobic nature, low abundance, and poor expression, and because of the difficulties of purifying and crystallizing them. The major hurdles for MP structure determination are associated with the expression, purification, and crystallization procedures. Although there have been significant advances in the experimental determination of MP structures, only a limited number of MP structures (less than approximately 1% of all structures) are available in the Protein Data Bank (PDB). Therefore, the structures of a large number of MPs remain unresolved, leaving much of the structural and functional information related to MPs unexplored. As a result, recent developments in drug discovery and significant biological interest have led to the development of several novel, low-cost, and time-efficient computational methods that overcome the limitations of experimental approaches, supplement experiments, and provide alternatives for the characterization of MPs. The fine-tuning and optimization of these computational approaches remain an ongoing endeavor. Computational methods offer a potential way to elucidate structural features and augment currently available MP information. However, computational modeling can be extremely challenging for MPs, mainly because of insufficient knowledge of (or gaps in) the atomic structures of MPs. Despite the availability of numerous in silico methods for 3D structure determination, the applicability of these methods to MPs remains relatively low, since not all methods are well suited or adequate for MPs. However, sophisticated methods for MP structure prediction are constantly being developed and updated to integrate the modifications required for MPs.
Currently, different computational methods for (1) MP structure prediction, (2) stability analysis of MPs through molecular dynamics simulations, (3) modeling of MP complexes through docking, (4) prediction of interactions between MPs, and (5) interactions of MPs with their soluble partners are extensively used. Toward this end, MP docking is widely used. Notably, MP docking methods, although few in number, might show greater potential for filling the knowledge gap. In this chapter, MP docking methods and associated challenges are reviewed with the aim of improving the applicability, accuracy, and ability to model macromolecular complexes.
Affiliation(s)
- Yusra Sajid Kiani
  - School of Interdisciplinary Engineering and Sciences (SINES), National University of Sciences and Technology (NUST), Islamabad, Pakistan
- Ishrat Jabeen
  - School of Interdisciplinary Engineering and Sciences (SINES), National University of Sciences and Technology (NUST), Islamabad, Pakistan
5
Jain A, Begum T, Ahmad S. Analysis and Prediction of Pathogen Nucleic Acid Specificity for Toll-like Receptors in Vertebrates. J Mol Biol 2023; 435:168208. [PMID: 37479078 DOI: 10.1016/j.jmb.2023.168208] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/27/2023] [Revised: 06/20/2023] [Accepted: 07/13/2023] [Indexed: 07/23/2023]
Abstract
Identification of key sequence, expression and function related features of nucleic acid-sensing host proteins is of fundamental importance to understand the dynamics of pathogen-specific host responses. To meet this objective, we considered toll-like receptors (TLRs), a representative class of membrane-bound sensor proteins, from 17 vertebrate species covering mammals, birds, reptiles, amphibians, and fishes in this comparative study. We identified the molecular signatures of host TLRs that are responsible for sensing pathogen nucleic acids or other pathogen-associated molecular patterns (PAMPs), and potentially play important roles in host defence mechanism. Interestingly, our findings reveal that such host-specific features are directly related to the strand (single or double) specificity of nucleic acid from pathogens. However, during host-pathogen interactions, such features were unable to explain the pathogenic PAMP (i.e., DNA, RNA or other) selectivity, suggesting a more complex mechanism. Using these features, we developed a number of machine learning models, of which Random Forest achieved a high performance (94.57% accuracy) to predict strand specificity of TLRs from protein-derived features. We applied the trained model to propose strand specificity of some previously uncharacterized distinct fish-specific novel TLRs (TLR18, TLR23, TLR24, TLR25, TLR27).
Affiliation(s)
- Anuja Jain
  - School of Computational and Integrative Sciences, Jawaharlal Nehru University, New Delhi 110067, India. https://twitter.com/@Anuja334
- Tina Begum
  - School of Computational and Integrative Sciences, Jawaharlal Nehru University, New Delhi 110067, India
- Shandar Ahmad
  - School of Computational and Integrative Sciences, Jawaharlal Nehru University, New Delhi 110067, India
6
Walder M, Edelstein E, Carroll M, Lazarev S, Fajardo JE, Fiser A, Viswanathan R. Integrated structure-based protein interface prediction. BMC Bioinformatics 2022; 23:301. [PMID: 35879651 PMCID: PMC9316365 DOI: 10.1186/s12859-022-04852-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/12/2022] [Accepted: 07/18/2022] [Indexed: 11/29/2022]
Abstract
Background
Identifying protein interfaces can inform how proteins interact with their binding partners, uncover the regulatory mechanisms that control biological functions and guide the development of novel therapeutic agents. A variety of computational approaches have been developed for predicting a protein’s interfacial residues from its known sequence and structure. Methods using the known three-dimensional structures of proteins can be template-based or template-free. Template-based methods have limited success in predicting interfaces when homologues with known complex structures are not available to use as templates. The prediction performance of template-free methods that rely only upon proteins’ intrinsic properties is limited by the number of biologically relevant features that can be included in an interface prediction model.
Results
We describe the development of an integrated method for protein interface prediction (ISPIP) to explore the hypothesis that the efficacy of a computational prediction method of protein binding sites can be enhanced by using a combination of methods that rely on orthogonal structure-based properties of a query protein, combining and balancing both template-free and template-based features. ISPIP is a method that integrates these approaches through simple linear or logistic regression models and more complex decision tree models. On a diverse test set of 156 query proteins, ISPIP outperforms each of its individual classifiers in identifying protein binding interfaces.
Conclusions
The integrated method captures the best performance of individual classifiers and delivers an improved interface prediction. The method is robust and performs well even when one of the individual classifiers performs poorly on a particular query protein. This work demonstrates that integrating orthogonal methods that depend on different structural properties of proteins performs better at interface prediction than any individual classifier alone.
Supplementary Information
The online version contains supplementary material available at 10.1186/s12859-022-04852-2.
Affiliation(s)
- M Walder
  - Department of Chemistry, Yeshiva College, Yeshiva University, New York, NY, 10033, USA
- E Edelstein
  - Department of Chemistry, Yeshiva College, Yeshiva University, New York, NY, 10033, USA
- M Carroll
  - Department of Chemistry, Yeshiva College, Yeshiva University, New York, NY, 10033, USA
- S Lazarev
  - Department of Chemistry, Yeshiva College, Yeshiva University, New York, NY, 10033, USA
- J E Fajardo
  - Department of Systems and Computational Biology, Albert Einstein College of Medicine, Bronx, NY, 10461, USA
- A Fiser
  - Department of Systems and Computational Biology, Albert Einstein College of Medicine, Bronx, NY, 10461, USA
- R Viswanathan
  - Department of Chemistry, Yeshiva College, Yeshiva University, New York, NY, 10033, USA
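The Walder et al. abstract above describes integrating individual classifiers through simple regression models. The sketch below shows one way such an integration could look in principle: a logistic regression, fitted by plain gradient descent, that combines two per-residue scores into one interface probability. It is not the ISPIP implementation; the scores, labels, learning rate, and epoch count are hypothetical values chosen only to make the example run.

```python
from math import exp

# Hypothetical per-residue scores from two classifiers
# (template-free score, template-based score) and hypothetical
# interface labels (1 = interface, 0 = not interface).
scores = [(0.9, 0.8), (0.8, 0.3), (0.2, 0.9), (0.1, 0.2), (0.3, 0.1), (0.7, 0.9)]
labels = [1, 1, 1, 0, 0, 1]


def sigmoid(z):
    return 1.0 / (1.0 + exp(-z))


def fit_logistic(xs, ys, lr=0.5, epochs=5000):
    """Fit weights and bias by full-batch gradient descent on the log loss."""
    w = [0.0] * len(xs[0])
    b = 0.0
    n = len(xs)
    for _ in range(epochs):
        grad_w = [0.0] * len(w)
        grad_b = 0.0
        for x, y in zip(xs, ys):
            # Prediction error for this residue under the current model.
            err = sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b) - y
            for j, xj in enumerate(x):
                grad_w[j] += err * xj / n
            grad_b += err / n
        w = [wi - lr * g for wi, g in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b


weights, bias = fit_logistic(scores, labels)
combined = [sigmoid(sum(w * s for w, s in zip(weights, x)) + bias) for x in scores]
for x, p in zip(scores, combined):
    print(x, round(p, 2))
```

In this toy data the second and third residues are each scored highly by only one classifier, yet the combined model still ranks both above the non-interface residues, which is the kind of robustness to a single weak classifier that the abstract reports.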
7
Zhang W, Meng Q, Wang J, Guo F. HDIContact: a novel predictor of residue-residue contacts on hetero-dimer interfaces via sequential information and transfer learning strategy. Brief Bioinform 2022; 23:6599074. [PMID: 35653713 DOI: 10.1093/bib/bbac169] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Received: 01/17/2022] [Revised: 03/07/2022] [Accepted: 04/16/2022] [Indexed: 11/12/2022]
Abstract
Proteins maintain the functional order of the cell by interacting with other proteins. Determining the structures of protein complexes gives biological insight for research on diseases and drugs. Recently, a breakthrough has been made in protein monomer structure prediction. However, because the number of known structures and homologous sequences of complexes is limited, predicting residue-residue contacts on hetero-dimer interfaces is still a challenge. In this study, we have developed a deep learning framework, called HDIContact, for inferring inter-protein residue contacts from sequence information. We used a transfer learning strategy to produce a two-dimensional (2D) Multiple Sequence Alignment (MSA) embedding based on patterns of the concatenated MSA, which could reduce the influence of MSA noise caused by mismatched sequences or low homology. For the MSA 2D embedding, HDIContact used a two-channel Bi-directional Long Short-Term Memory (BiLSTM) network to capture the 2D context of residue pairs. Our comprehensive assessment on the Escherichia coli (E. coli) test dataset showed that HDIContact outperformed other state-of-the-art methods, with a top precision of 65.96%, an Area Under the Receiver Operating Characteristic curve (AUROC) of 83.08% and an Area Under the Precision Recall curve (AUPR) of 25.02%. In addition, we analyzed the potential of HDIContact for human-virus protein-protein complexes, achieving a top-five precision of 80% on O75475-P04584, a complex related to Human Immunodeficiency Virus. All experiments indicated that our method is a valuable tool for predicting inter-protein residue contacts, which should help in understanding protein-protein interaction mechanisms.
Affiliation(s)
- Wei Zhang
  - School of Computer Science and Technology, College of Intelligence and Computing, Tianjin University, Tianjin, China
- Qiaozhen Meng
  - School of Computer Science and Technology, College of Intelligence and Computing, Tianjin University, Tianjin, China
- Jianxin Wang
  - School of Computer Science and Engineering, Central South University, Changsha 410083, China
- Fei Guo
  - School of Computer Science and Engineering, Central South University, Changsha 410083, China
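The HDIContact abstract above reports "top precision" figures, including a top-five precision of 80%. A common reading of such a metric, assumed here because the abstract does not define it, is the fraction of the k highest-scoring predicted residue pairs that are true contacts. The sketch below computes it on invented data; the residue indices, scores, and the set of observed contacts are hypothetical.

```python
def top_k_precision(scored_pairs, true_contacts, k):
    """Fraction of the k highest-scoring residue pairs that are true contacts."""
    ranked = sorted(scored_pairs, key=lambda item: item[1], reverse=True)[:k]
    hits = sum(1 for pair, _ in ranked if pair in true_contacts)
    return hits / k


# Hypothetical predictions:
# ((residue index in protein A, residue index in protein B), predicted score).
predictions = [
    ((12, 40), 0.97), ((13, 41), 0.91), ((5, 77), 0.88),
    ((30, 8), 0.80), ((14, 42), 0.76), ((2, 3), 0.10),
]
# Hypothetical set of experimentally observed inter-protein contacts.
observed = {(12, 40), (13, 41), (30, 8), (14, 42)}

print(top_k_precision(predictions, observed, k=5))
```

With these numbers, four of the five top-ranked pairs are observed contacts, so the top-five precision is 4/5. The function assumes at least k predictions are supplied.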
8
Quadrini M, Daberdaku S, Ferrari C. Hierarchical representation for PPI sites prediction. BMC Bioinformatics 2022; 23:96. [PMID: 35307006 PMCID: PMC8934516 DOI: 10.1186/s12859-022-04624-y] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 05/19/2021] [Accepted: 02/23/2022] [Indexed: 01/06/2023]
Abstract
Background
Protein–protein interactions have pivotal roles in life processes, and aberrant interactions are associated with various disorders. Interaction site identification is key for understanding disease mechanisms and designing new drugs. Effective and efficient computational methods for PPI prediction are of great value given the overall cost of experimental methods. Promising results have been obtained using machine learning methods and deep learning techniques, but their effectiveness depends on protein representation and feature selection.
Results
We define a new abstraction of the protein structure, called hierarchical representations, considering and quantifying spatial and sequential neighboring among amino acids. We also investigate the effect of molecular abstractions using the Graph Convolutional Networks technique to classify amino acids as interface or non-interface residues. Our study takes into account three abstractions (hierarchical representations, contact map, and the residue sequence) and considers the eight functional classes of proteins extracted from the Protein–Protein Docking Benchmark 5.0. The performance of our method, evaluated using standard metrics, is compared with that of some state-of-the-art protein interface predictors. The analysis of the performance values shows that our method outperforms the considered competitors when the considered molecules are structurally similar.
Conclusions
The hierarchical representation can capture the structural properties that promote the interactions and can be used to represent proteins with unknown structures by codifying only their sequential neighboring. Analyzing the results, we conclude that classes should be arranged according to their architectures rather than functions.
9
BIPSPI+: Mining Type-Specific Datasets of Protein Complexes to Improve Protein Binding Site Prediction. J Mol Biol 2022; 434:167556. [DOI: 10.1016/j.jmb.2022.167556] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 11/27/2021] [Revised: 03/12/2022] [Accepted: 03/16/2022] [Indexed: 11/20/2022]
10
Mahbub S, Bayzid MS. EGRET: edge aggregated graph attention networks and transfer learning improve protein-protein interaction site prediction. Brief Bioinform 2022; 23:6518045. [PMID: 35106547 DOI: 10.1093/bib/bbab578] [Citation(s) in RCA: 11] [Impact Index Per Article: 5.5] [Received: 06/13/2021] [Revised: 11/25/2021] [Accepted: 12/16/2021] [Indexed: 12/18/2022]
Abstract
MOTIVATION Protein-protein interactions (PPIs) are central to most biological processes. However, reliable identification of PPI sites using conventional experimental methods is slow and expensive. Therefore, great efforts are being put into computational methods to identify PPI sites. RESULTS We present Edge Aggregated GRaph Attention NETwork (EGRET), a highly accurate deep learning-based method for PPI site prediction, where we have used an edge aggregated graph attention network to effectively leverage the structural information. We, for the first time, have used transfer learning in PPI site prediction. Our proposed edge aggregated network, together with transfer learning, has achieved notable improvement over the best alternate methods. Furthermore, we systematically investigated EGRET's network behavior to provide insights about the causes of its decisions. AVAILABILITY EGRET is freely available as an open source project at https://github.com/Sazan-Mahbub/EGRET. CONTACT shams_bayzid@cse.buet.ac.bd.
Affiliation(s)
- Sazan Mahbub
  - Department of Computer Science, University of Maryland, College Park, Maryland 20742, USA
- Md Shamsuzzoha Bayzid
  - Department of Computer Science and Engineering, Bangladesh University of Engineering and Technology, Dhaka-1205, Bangladesh
11
Tahir S, Bourquard T, Musnier A, Jullian Y, Corde Y, Omahdi Z, Mathias L, Reiter E, Crépieux P, Bruneau G, Poupon A. Accurate determination of epitope for antibodies with unknown 3D structures. MAbs 2021; 13:1961349. [PMID: 34432559 PMCID: PMC8405158 DOI: 10.1080/19420862.2021.1961349] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Indexed: 12/14/2022]
Abstract
MAbTope is a docking-based method for the determination of epitopes. It has been used to successfully determine the epitopes of antibodies with known 3D structures. However, during the antibody discovery process, this structural information is rarely available. Although we already have evidence that homology models of antibodies could be used instead of their 3D structure, the choice of the template, the methodology for homology modeling and the resulting performance still have to be clarified. Here, we show that MAbTope has the same performance when working with homology models of the antibodies as compared to crystallographic structures. Moreover, we show that even low-quality models can be used. We applied MAbTope to determine the epitope of dupilumab, an anti-interleukin 4 receptor alpha subunit therapeutic antibody of unknown 3D structure, that we validated experimentally. Finally, we show how the MAbTope-determined epitopes for a series of antibodies targeting the same protein can be used to predict competitions, and demonstrate the accuracy with an experimentally validated example.
Abbreviations: 3D: three-dimensional; RMSD: root mean square deviation; CDR: complementarity-determining region; CPU: central processing units; VH: heavy chain variable region; VL: light chain variable region; scFv: single-chain variable fragments; VHH: single-chain antibody variable region; IL4Rα: Interleukin 4 receptor alpha chain; SPR: surface plasmon resonance; PDB: protein data bank; HEK293: Human embryonic kidney 293 cells; EDTA: Ethylenediaminetetraacetic acid; FBS: Fetal bovine serum; ANOVA: Analysis of variance; EGFR: Epidermal growth factor receptor; PE: Phycoerythrin; APC: Allophycocyanin; FSC: forward scatter; SSC: side scatter; WT: wild type
Keywords: MAbTope, Epitope Mapping, Molecular docking, Antibody modeling, Antibody-antigen docking
Affiliation(s)
- Shifa Tahir
  - PRC, INRAE, CNRS, Université De Tours, Nouzilly, France
- Thomas Bourquard
  - PRC, INRAE, CNRS, Université De Tours, Nouzilly, France
  - MAbSilico SAS, 1 Impasse Du Palais
- Astrid Musnier
  - PRC, INRAE, CNRS, Université De Tours, Nouzilly, France
  - MAbSilico SAS, 1 Impasse Du Palais
- Yann Jullian
  - MAbSilico SAS, 1 Impasse Du Palais
  - CaSciModOT, UFR De Sciences Et Techniques, Université De Tours
- Eric Reiter
  - PRC, INRAE, CNRS, Université De Tours, Nouzilly, France
  - France Inria, Inria Saclay-Île-de-France, Palaiseau, France
  - Université Paris-Saclay, INRAE, MaIAGE, Jouy-en-Josas, France
- Pascale Crépieux
  - PRC, INRAE, CNRS, Université De Tours, Nouzilly, France
  - France Inria, Inria Saclay-Île-de-France, Palaiseau, France
  - Université Paris-Saclay, INRAE, MaIAGE, Jouy-en-Josas, France
- Anne Poupon
  - PRC, INRAE, CNRS, Université De Tours, Nouzilly, France
  - MAbSilico SAS, 1 Impasse Du Palais
  - France Inria, Inria Saclay-Île-de-France, Palaiseau, France
  - Université Paris-Saclay, INRAE, MaIAGE, Jouy-en-Josas, France
12
Chen KH, Hu YJ. Residue-Residue Interaction Prediction via Stacked Meta-Learning. Int J Mol Sci 2021; 22:ijms22126393. [PMID: 34203772 PMCID: PMC8232778 DOI: 10.3390/ijms22126393] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/30/2021] [Revised: 06/06/2021] [Accepted: 06/13/2021] [Indexed: 11/16/2022]
Abstract
Protein-protein interactions (PPIs) are the basis of most biological functions and are determined by residue-residue interactions (RRIs). Predicting the residue pairs responsible for an interaction is crucial for understanding the cause of a disease and for drug design. Computational approaches, considered inexpensive and fast solutions for RRI prediction, have been widely used to predict protein interfaces for further analysis. This study presents RRI-Meta, an ensemble meta-learning-based method for RRI prediction. Its hierarchical learning structure comprises four base classifiers and one meta-classifier to integrate predictive strengths from different classifiers. It considers multiple feature types, including sequence-, structure-, and neighbor-based features, for characterizing other properties of a residue interaction environment to better distinguish between noninteracting and interacting residues. We conducted the same experiments using the same data as previously reported in the literature to demonstrate RRI-Meta's performance. Experimental results show that RRI-Meta is superior to several current prediction tools. Additionally, to analyze the factors that affect the performance of RRI-Meta, we conducted a comparative case study using different protein complexes.
Affiliation(s)
- Kuan-Hsi Chen
  - College of Computer Science, National Yang Ming Chiao Tung University, Hsinchu 300093, Taiwan
- Yuh-Jyh Hu
  - Institute of Biomedical Engineering, National Yang Ming Chiao Tung University, Hsinchu 300093, Taiwan
  - Correspondence: Tel.: +886-3-571-2121
13
Pattern Discovery and Disentanglement for Aligned Pattern Cluster Analysis and Protein Binding Complexes Detection. Bioinformatics 2021. [DOI: 10.36255/exonpublications.bioinformatics.2021.ch10] [Citation(s) in RCA: 0] [Impact Index Per Article: 0]
14
Simončič M, Lukšič M. Mechanistic differences in the effects of sucrose and sucralose on the phase stability of lysozyme solutions. J Mol Liq 2021; 326. [PMID: 35082450 DOI: 10.1016/j.molliq.2020.115245] [Citation(s) in RCA: 10] [Impact Index Per Article: 3.3] [Indexed: 11/15/2022]
Abstract
The effect of two disaccharide analogues, sucrose and sucralose, on the phase stability of aqueous lysozyme solutions has been addressed from a mechanistic viewpoint by a combination of experiment and molecular dynamics (MD) simulations. The influence of the added low molecular weight salts (NaBr, NaI and NaNO3) was considered as well. The cloud-point temperature measurements revealed a larger stabilizing effect of sucralose. Upon increasing sugar concentration, the protein solutions became more stable and the differences between the effects of sucralose and sucrose were amplified. It was confirmed that the addition of either of the two sugars imposed no secondary structure changes on the lysozyme. Enthalpies of lysozyme-sugar mixing were exothermic and a larger effect was recorded for sucralose. MD simulations indicated that acidic, basic and polar amino acid residues play predominant roles in the sugar-protein interactions, mainly through hydrogen bonding. Such sugar-mediated protein-protein interactions are thought to be responsible for the biopreservative nature of sugars. Our observations hint at mechanistic differences in sugar-lysozyme interactions: while sucrose does not interact directly with the protein's surface for the most part (in line with the preferential hydration hypothesis), sucralose forms hydrogen bonds with acidic, basic and polar amino acid residues at the lysozyme's surface (in line with the water replacement hypothesis).
Affiliation(s)
- Matjaž Simončič
  - Faculty of Chemistry and Chemical Technology, University of Ljubljana, Večna pot 113, SI-1000 Ljubljana, Slovenia
- Miha Lukšič
  - Faculty of Chemistry and Chemical Technology, University of Ljubljana, Večna pot 113, SI-1000 Ljubljana, Slovenia
|
15
|
Akbar R, Robert PA, Pavlović M, Jeliazkov JR, Snapkov I, Slabodkin A, Weber CR, Scheffer L, Miho E, Haff IH, Haug DTT, Lund-Johansen F, Safonova Y, Sandve GK, Greiff V. A compact vocabulary of paratope-epitope interactions enables predictability of antibody-antigen binding. Cell Rep 2021; 34:108856. [PMID: 33730590 DOI: 10.1016/j.celrep.2021.108856] [Citation(s) in RCA: 75] [Impact Index Per Article: 25.0] [Received: 03/12/2020] [Revised: 11/29/2020] [Accepted: 02/22/2021] [Indexed: 12/16/2022] Open
Abstract
Antibody-antigen binding relies on the specific interaction of amino acids at the paratope-epitope interface. The predictability of antibody-antigen binding is a prerequisite for de novo antibody and (neo-)epitope design. A fundamental premise for the predictability of antibody-antigen binding is the existence of paratope-epitope interaction motifs that are universally shared among antibody-antigen structures. In a dataset of non-redundant antibody-antigen structures, we identify structural interaction motifs, which together compose a commonly shared structure-based vocabulary of paratope-epitope interactions. We show that this vocabulary enables the machine learnability of antibody-antigen binding on the paratope-epitope level using generative machine learning. The vocabulary (1) is compact, less than 10^4 motifs; (2) distinct from non-immune protein-protein interactions; and (3) mediates specific oligo- and polyreactive interactions between paratope-epitope pairs. Our work leverages combined structure- and sequence-based learning to demonstrate that machine-learning-driven predictive paratope and epitope engineering is feasible.
Affiliation(s)
- Rahmad Akbar
  - Department of Immunology, University of Oslo, Oslo, Norway
- Milena Pavlović
  - Department of Informatics, University of Oslo, Oslo, Norway; Centre for Bioinformatics, University of Oslo, Norway; K.G. Jebsen Centre for Coeliac Disease Research, Institute of Clinical Medicine, University of Oslo, Oslo, Norway
- Igor Snapkov
  - Department of Immunology, University of Oslo, Oslo, Norway
- Cédric R Weber
  - Department of Biosystems Science and Engineering, ETH Zürich, Basel, Switzerland
- Lonneke Scheffer
  - Department of Informatics, University of Oslo, Oslo, Norway; Centre for Bioinformatics, University of Oslo, Norway
- Enkelejda Miho
  - Institute of Medical Engineering and Medical Informatics, School of Life Sciences, FHNW University of Applied Sciences and Arts Northwestern Switzerland, Muttenz, Switzerland
- Yana Safonova
  - Computer Science and Engineering Department, University of California, San Diego, La Jolla, CA, USA
- Geir K Sandve
  - Department of Informatics, University of Oslo, Oslo, Norway; Centre for Bioinformatics, University of Oslo, Norway; K.G. Jebsen Centre for Coeliac Disease Research, Institute of Clinical Medicine, University of Oslo, Oslo, Norway
- Victor Greiff
  - Department of Immunology, University of Oslo, Oslo, Norway
|
16
|
McCafferty CL, Marcotte EM, Taylor DW. Simplified geometric representations of protein structures identify complementary interaction interfaces. Proteins 2021; 89:348-360. [PMID: 33140424 PMCID: PMC7855953 DOI: 10.1002/prot.26020] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 05/21/2020] [Revised: 09/22/2020] [Accepted: 10/25/2020] [Indexed: 12/12/2022]
Abstract
Protein-protein interactions are critical to protein function, but three-dimensional (3D) arrangements of interacting proteins have proven hard to predict, even given the identities and 3D structures of the interacting partners. Specifically, identifying the relevant pairwise interaction surfaces remains difficult, often relying on shape complementarity with molecular docking while accounting for molecular motions to optimize rigid 3D translations and rotations. However, such approaches can be computationally expensive, and faster, less accurate approximations may prove useful for large-scale prediction and assembly of 3D structures of multi-protein complexes. We asked if a reduced representation of protein geometry retains enough information about molecular properties to predict pairwise protein interaction interfaces that are tolerant of limited structural rearrangements. Here, we describe a reduced representation of 3D protein accessible surfaces on which molecular properties such as charge, hydrophobicity, and evolutionary rate can be easily mapped, implemented in the MorphProt package. Pairs of surfaces are compared to rapidly assess partner-specific potential surface complementarity. On two available benchmarks of 185 overall known protein complexes, we observe predictions comparable to other structure-based tools at correctly identifying protein interaction surfaces. Furthermore, we examined the effect of molecular motion through normal mode simulation on a benchmark receptor-ligand pair and observed no marked loss of predictive accuracy for distortions of up to 6 Å Cα-RMSD. Thus, a shape reduction of protein surfaces retains considerable information about surface complementarity, offers enhanced speed of comparison relative to more complex geometric representations, and exhibits tolerance to conformational changes.
Affiliation(s)
- Caitlyn L. McCafferty
  - Department of Molecular Biosciences, University of Texas at Austin, Austin, Texas, USA
  - Center for Systems and Synthetic Biology, University of Texas at Austin, Austin, Texas, USA
  - Institute for Cellular and Molecular Biology, University of Texas at Austin, Austin, Texas, USA
- Edward M. Marcotte
  - Department of Molecular Biosciences, University of Texas at Austin, Austin, Texas, USA
  - Center for Systems and Synthetic Biology, University of Texas at Austin, Austin, Texas, USA
  - Institute for Cellular and Molecular Biology, University of Texas at Austin, Austin, Texas, USA
- David W. Taylor
  - Department of Molecular Biosciences, University of Texas at Austin, Austin, Texas, USA
  - Center for Systems and Synthetic Biology, University of Texas at Austin, Austin, Texas, USA
  - Institute for Cellular and Molecular Biology, University of Texas at Austin, Austin, Texas, USA
  - LIVESTRONG Cancer Institutes, Dell Medical School, Austin, Texas, USA
|
17
|
Slater O, Miller B, Kontoyianni M. Decoding Protein-protein Interactions: An Overview. Curr Top Med Chem 2021; 20:855-882. [PMID: 32101126 DOI: 10.2174/1568026620666200226105312] [Citation(s) in RCA: 11] [Impact Index Per Article: 3.7] [Received: 04/13/2019] [Revised: 11/27/2019] [Accepted: 11/27/2019] [Indexed: 12/24/2022]
Abstract
Drug discovery has focused on the paradigm "one drug, one target" for a long time. However, small molecules can act at multiple macromolecular targets, which serves as the basis for drug repurposing. In an effort to expand the target space, and given advances in X-ray crystallography, protein-protein interactions have become an emerging focus area of drug discovery enterprises. Proteins interact with other biomolecules and it is this intricate network of interactions that determines the behavior of the system and its biological processes. In this review, we briefly discuss networks in disease, followed by computational methods for protein-protein complex prediction. Computational methodologies and techniques employed towards objectives such as protein-protein docking, protein-protein interactions, and interface predictions are described extensively. Docking aims at producing a complex between proteins, while interface predictions identify a subset of residues on one protein that could interact with a partner, and protein-protein interaction sites address whether two proteins interact. In addition, approaches to predict hot spots and binding sites are presented along with a representative example of our internal project on the chemokine CXC receptor 3 B-isoform and predictive modeling with IP10 and PF4.
Affiliation(s)
- Olivia Slater
  - Department of Pharmaceutical Sciences, Southern Illinois University, Edwardsville, IL 62026, United States
- Bethany Miller
  - Department of Pharmaceutical Sciences, Southern Illinois University, Edwardsville, IL 62026, United States
- Maria Kontoyianni
  - Department of Pharmaceutical Sciences, Southern Illinois University, Edwardsville, IL 62026, United States
|
18
|
Abbasi WA, Yaseen A, Hassan FU, Andleeb S, Minhas FUAA. ISLAND: in-silico proteins binding affinity prediction using sequence information. BioData Min 2020; 13:20. [PMID: 33292419 PMCID: PMC7688004 DOI: 10.1186/s13040-020-00231-w] [Citation(s) in RCA: 26] [Impact Index Per Article: 6.5] [Received: 04/03/2020] [Accepted: 11/15/2020] [Indexed: 12/30/2022] Open
Abstract
BACKGROUND: Determining binding affinity in protein-protein interactions is important in the discovery and design of novel therapeutics and in mutagenesis studies. Determining the binding affinity of proteins in the formation of protein complexes requires sophisticated, expensive and time-consuming experimentation, which can be replaced with computational methods. Most computational prediction techniques require protein structures, which limits their applicability to protein complexes with known structures. In this work, we explore sequence-based protein binding affinity prediction using machine learning. METHOD: We have used protein sequence information instead of protein structures, along with machine learning techniques, to accurately predict protein binding affinity. RESULTS: We present our findings that the true generalization performance of even the state-of-the-art sequence-only predictor is far from satisfactory and that the development of machine learning methods for binding affinity prediction with improved generalization performance is still an open problem. We have also proposed a novel sequence-based protein binding affinity predictor called ISLAND, which gives better accuracy than existing methods over the same validation set as well as on an external independent test dataset. A cloud-based webserver implementation of ISLAND and its Python code are available at https://sites.google.com/view/wajidarshad/software. CONCLUSION: This paper highlights the fact that the true generalization performance of even the state-of-the-art sequence-only predictor of binding affinity is far from satisfactory and that the development of effective and practical methods in this domain is still an open problem.
Affiliation(s)
- Wajid Arshad Abbasi
  - Computational Biology and Data Analysis Laboratory, Department of Computer Science and Information Technology, King Abdullah Campus, University of Azad Jammu & Kashmir, Muzaffarabad, Pakistan; Biomedical Informatics Research Laboratory, Department of Computer and Information Sciences, Pakistan Institute of Engineering and Applied Sciences (PIEAS), Nilore, Islamabad, Pakistan
- Adiba Yaseen
  - Biomedical Informatics Research Laboratory, Department of Computer and Information Sciences, Pakistan Institute of Engineering and Applied Sciences (PIEAS), Nilore, Islamabad, Pakistan
- Fahad Ul Hassan
  - Biomedical Informatics Research Laboratory, Department of Computer and Information Sciences, Pakistan Institute of Engineering and Applied Sciences (PIEAS), Nilore, Islamabad, Pakistan
- Saiqa Andleeb
  - Biotechnology Laboratory, Department of Zoology, King Abdullah Campus, University of Azad Jammu & Kashmir, Muzaffarabad, Pakistan
|
19
|
Zhang J, Kurgan L. SCRIBER: accurate and partner type-specific prediction of protein-binding residues from proteins sequences. Bioinformatics 2020; 35:i343-i353. [PMID: 31510679 PMCID: PMC6612887 DOI: 10.1093/bioinformatics/btz324] [Citation(s) in RCA: 70] [Impact Index Per Article: 17.5] [Indexed: 01/11/2023] Open
Abstract
Motivation: Accurate predictions of protein-binding residues (PBRs) enhance understanding of the molecular-level rules governing protein–protein interactions, help protein–protein docking and facilitate annotation of protein functions. Recent studies show that current sequence-based predictors of PBRs severely cross-predict residues that interact with other types of partners (e.g. RNA and DNA) as PBRs. Moreover, these methods are relatively slow, prohibiting genome-scale use. Results: We propose a novel, accurate and fast sequence-based predictor of PBRs that minimizes the cross-predictions. Our SCRIBER (SeleCtive pRoteIn-Binding rEsidue pRedictor) method takes advantage of three innovations: a comprehensive dataset that covers multiple types of binding residues, novel types of inputs that are relevant to the prediction of PBRs, and an architecture that is tailored to reduce the cross-predictions. The dataset includes complete protein chains and offers improved coverage of binding annotations that are transferred from multiple protein–protein complexes. We utilize an innovative two-layer architecture where the first layer generates a prediction of protein-binding, RNA-binding, DNA-binding and small ligand-binding residues. The second layer re-predicts PBRs by reducing overlap between PBRs and the other types of binding residues produced in the first layer. Empirical tests on an independent test dataset reveal that SCRIBER significantly outperforms current predictors and that all three innovations contribute to its high predictive performance. SCRIBER reduces cross-predictions by between 41% and 69% and our conservative estimates show that it is at least 3 times faster. We provide putative PBRs produced by SCRIBER for the entire human proteome and use these results to hypothesize that about 14% of currently known human protein domains bind proteins. Availability and implementation: The SCRIBER webserver is available at http://biomine.cs.vcu.edu/servers/SCRIBER/.
Supplementary information: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Jian Zhang
  - School of Computer and Information Technology, Xinyang Normal University, Xinyang, China; Department of Computer Science, Virginia Commonwealth University, Richmond, VA, USA
- Lukasz Kurgan
  - Department of Computer Science, Virginia Commonwealth University, Richmond, VA, USA
|
20
|
Lyu Y, Huang H, Gong X. A Novel Index of Contact Frequency from Noise Protein-Protein Interaction Data Help for Accurate Interface Residue Pair Prediction. Interdiscip Sci 2020; 12:204-216. [PMID: 32185690 DOI: 10.1007/s12539-020-00364-w] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 08/26/2019] [Revised: 01/23/2020] [Accepted: 02/24/2020] [Indexed: 11/24/2022]
Abstract
Protein-protein interactions are important for most biological processes and have been studied for decades. However, the detailed formation mechanism of protein-protein interaction interface is still ambiguous, which makes it difficult to accurately predict the protein-protein interaction interface residue pairs. Here, we extract the interface residue-residue contacts from the decoys in the ZDOCK protein-protein complex decoy set with RMSD mostly larger than 3 Å. To accurately compute the interface residue-residue contacts, we define a new constant called interface residue pairs frequency, which counts the atom contact numbers between two interface residues. We normalize interface residue pairs frequency to pick out the top residue-residue pairs from all the possible pairs preferential to be on correct protein-protein interaction interface. When tested on 37 protein dimers from the decoy set where most decoys are incorrect, our method successfully predicts 30 protein dimers with a success rate of up to 81.1%. Higher accuracy than some other state-of-the-art methods confirmed the performance of our method.
Affiliation(s)
- Yanfen Lyu
  - Mathematical Intelligence Application Lab, Institute for Mathematical Sciences, School of Math, Renmin University of China, Beijing, 100872, China
- He Huang
  - Mathematical Intelligence Application Lab, Institute for Mathematical Sciences, School of Math, Renmin University of China, Beijing, 100872, China
- Xinqi Gong
  - Mathematical Intelligence Application Lab, Institute for Mathematical Sciences, School of Math, Renmin University of China, Beijing, 100872, China
|
21
|
Xie Z, Deng X, Shu K. Prediction of Protein-Protein Interaction Sites Using Convolutional Neural Network and Improved Data Sets. Int J Mol Sci 2020; 21:E467. [PMID: 31940793 PMCID: PMC7013409 DOI: 10.3390/ijms21020467] [Citation(s) in RCA: 29] [Impact Index Per Article: 7.3] [Received: 11/24/2019] [Revised: 12/23/2019] [Accepted: 01/08/2020] [Indexed: 12/20/2022] Open
Abstract
Protein-protein interaction (PPI) sites play a key role in the formation of protein complexes, which is the basis of a variety of biological processes. Experimental methods to solve PPI sites are expensive and time-consuming, which has led to the development of different kinds of prediction algorithms. We propose a convolutional neural network for PPI site prediction and use residue binding propensity to improve the positive samples. Our method obtains a remarkable result of the area under the curve (AUC) = 0.912 on the improved data set. In addition, it yields much better results on samples with high binding propensity than on randomly selected samples. This suggests that there are considerable false-positive PPI sites in the positive samples defined by the distance between residue atoms.
Affiliation(s)
- Zengyan Xie
  - Chongqing Key Laboratory of Big Data for Bio Intelligence, Chongqing University of Posts and Telecommunications, Chongqing 400065, China
- Kunxian Shu
  - Chongqing Key Laboratory of Big Data for Bio Intelligence, Chongqing University of Posts and Telecommunications, Chongqing 400065, China
|
22
|
Barreto CAV, Baptista SJ, Preto AJ, Matos-Filipe P, Mourão J, Melo R, Moreira I. Prediction and targeting of GPCR oligomer interfaces. Progress in Molecular Biology and Translational Science 2020; 169:105-149. [PMID: 31952684 DOI: 10.1016/bs.pmbts.2019.11.007] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.0] [Indexed: 12/18/2022]
Abstract
GPCR oligomerization has emerged as a hot topic in the GPCR field in recent years. Receptors that are part of these oligomers can influence each other's function, although it is not yet entirely understood how these interactions work. The existence of such a highly complex network of interactions between GPCRs generates the possibility of alternative targets for new therapeutic approaches. However, challenges still exist in the characterization of these complexes, especially at the interface level. Different experimental approaches, such as FRET or BRET, are usually combined to study GPCR oligomer interactions. Computational methods have been applied as a useful tool for retrieving information from GPCR sequences and the few X-ray-resolved oligomeric structures that are accessible, as well as for predicting new and trustworthy GPCR oligomeric interfaces. Machine-learning (ML) approaches have recently helped with some hindrances of other methods. By joining and evaluating multiple structure-, sequence- and co-evolution-based features in the same algorithm, it is possible to dilute the issues of particular structures and residues that arise from the experimental methodology into all-encompassing algorithms capable of accurately predicting GPCR-GPCR interfaces. All these methods, used as a single or a combined approach, provide useful information about GPCR oligomerization and its role in GPCR function and dynamics. Altogether, we present experimental, computational and machine-learning methods used to study oligomer interfaces, as well as strategies that have been used to target these dynamic complexes.
Affiliation(s)
- Carlos A V Barreto
  - Center for Neuroscience and Cell Biology, University of Coimbra, Coimbra, Portugal
- Salete J Baptista
  - Center for Neuroscience and Cell Biology, University of Coimbra, Coimbra, Portugal; Centro de Ciências e Tecnologias Nucleares, Instituto Superior Técnico, Universidade de Lisboa, CTN, LRS, Portugal
- António José Preto
  - Center for Neuroscience and Cell Biology, University of Coimbra, Coimbra, Portugal
- Pedro Matos-Filipe
  - Center for Neuroscience and Cell Biology, University of Coimbra, Coimbra, Portugal
- Joana Mourão
  - Center for Neuroscience and Cell Biology, University of Coimbra, Coimbra, Portugal; Institute for Interdisciplinary Research, University of Coimbra, Coimbra, Portugal
- Rita Melo
  - Center for Neuroscience and Cell Biology, University of Coimbra, Coimbra, Portugal; Centro de Ciências e Tecnologias Nucleares, Instituto Superior Técnico, Universidade de Lisboa, CTN, LRS, Portugal
- Irina Moreira
  - Center for Neuroscience and Cell Biology, University of Coimbra, Coimbra, Portugal; Science and Technology Faculty, University of Coimbra, Coimbra, Portugal
|
23
|
Kamal H, Minhas FUAA, Tripathi D, Abbasi WA, Hamza M, Mustafa R, Khan MZ, Mansoor S, Pappu HR, Amin I. βC1, pathogenicity determinant encoded by Cotton leaf curl Multan betasatellite, interacts with calmodulin-like protein 11 (Gh-CML11) in Gossypium hirsutum. PLoS One 2019; 14:e0225876. [PMID: 31794580 PMCID: PMC6890265 DOI: 10.1371/journal.pone.0225876] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.6] [Received: 12/06/2018] [Accepted: 11/14/2019] [Indexed: 01/14/2023] Open
Abstract
Begomoviruses interfere with host plant machinery to evade host defense mechanisms by interacting with plant proteins. In the Old World, this group of viruses is usually associated with a betasatellite that induces severe disease symptoms by encoding a protein, βC1, which is a pathogenicity determinant. Here, we show that βC1 encoded by Cotton leaf curl Multan betasatellite (CLCuMB) requires Gossypium hirsutum calmodulin-like protein 11 (Gh-CML11) to infect cotton. First, we used an in silico approach to predict the interaction of CLCuMB-βC1 with Gh-CML11. A number of sequence- and structure-based in silico interaction prediction techniques suggested a strong putative binding of CLCuMB-βC1 with Gh-CML11 in a Ca2+-dependent manner. The in silico interaction prediction was then confirmed by three different experimental approaches: the Gh-CML11 interaction was confirmed using CLCuMB-βC1 in a yeast two-hybrid system and a pull-down assay. These results were further validated using a bimolecular fluorescence complementation system showing the interaction in cytoplasmic veins of Nicotiana benthamiana. Bioinformatics and molecular studies suggested that CLCuMB-βC1 induces the overexpression of Gh-CML11 protein and ultimately provides calcium as a nutrient source for virus movement and transmission. This is the first comprehensive study on the interaction between CLCuMB-βC1 and Gh-CML11 proteins, which provided insights into our understanding of the role of βC1 in cotton leaf curl disease.
Affiliation(s)
- Hira Kamal
  - National Institute for Biotechnology and Genetic Engineering, Faisalabad, Pakistan
  - Pakistan Institute of Engineering and Applied Sciences (PIEAS), Nilore, Islamabad, Pakistan
  - Department of Plant Pathology, Washington State University, Pullman, WA, United States of America
- Diwaker Tripathi
  - Department of Biology, University of Washington, Seattle, WA, United States of America
- Wajid Arshad Abbasi
  - Pakistan Institute of Engineering and Applied Sciences (PIEAS), Nilore, Islamabad, Pakistan
- Muhammad Hamza
  - National Institute for Biotechnology and Genetic Engineering, Faisalabad, Pakistan
- Roma Mustafa
  - National Institute for Biotechnology and Genetic Engineering, Faisalabad, Pakistan
- Muhammad Zuhaib Khan
  - National Institute for Biotechnology and Genetic Engineering, Faisalabad, Pakistan
- Shahid Mansoor
  - National Institute for Biotechnology and Genetic Engineering, Faisalabad, Pakistan
- Hanu R. Pappu
  - Department of Plant Pathology, Washington State University, Pullman, WA, United States of America
- Imran Amin
  - National Institute for Biotechnology and Genetic Engineering, Faisalabad, Pakistan
|
24
|
Liu J, Gong X. Attention mechanism enhanced LSTM with residual architecture and its application for protein-protein interaction residue pairs prediction. BMC Bioinformatics 2019; 20:609. [PMID: 31775612 PMCID: PMC6882172 DOI: 10.1186/s12859-019-3199-1] [Citation(s) in RCA: 20] [Impact Index Per Article: 4.0] [Received: 12/30/2018] [Accepted: 11/06/2019] [Indexed: 11/25/2022] Open
Abstract
Background: A recurrent neural network (RNN) is a good way to process sequential data, but it handles long sequences inefficiently. As a variant of the RNN, long short-term memory (LSTM) solved this problem to some extent. Here we improved LSTM for big data application in protein-protein interaction interface residue pair prediction, for the following two reasons. On the one hand, LSTM has some deficiencies, such as shallow layers and gradient explosion or vanishing. With the dramatic increase in data, the imbalance between algorithm innovation and big data processing has become more serious and urgent. On the other hand, protein-protein interaction interface residue pair prediction is an important problem in biology, but the low prediction accuracy compels us to propose new computational methods. Results: To surmount the aforementioned problems of LSTM, we adopt the residual architecture and add an attention mechanism to LSTM. In detail, we redefine the block, and add a connection from front to back in every two layers, together with an attention mechanism, to strengthen the capability of mining information. We then use it to predict protein-protein interaction interface residue pairs and achieve an accuracy of over 72%. Furthermore, we compare our method with random experiments, PPiPP, standard LSTM, and some other machine learning methods. Our method shows better performance than the methods mentioned above. Conclusion: We present an attention mechanism enhanced LSTM with residual architecture, and make the network deeper without gradient vanishing or explosion to a certain extent. We then apply it to a significant problem, protein-protein interaction interface residue pair prediction, and obtain a better accuracy than other methods. Our method provides a new approach for protein-protein interaction computation, which will be helpful for related biomedical research.
Affiliation(s)
- Jiale Liu
  - Mathematics Intelligence Application Lab, Institute for Mathematical Sciences, Renmin University of China, No. 59 Zhongguancun Street, Haidian District, Beijing, China
- Xinqi Gong
  - Mathematics Intelligence Application Lab, Institute for Mathematical Sciences, Renmin University of China, No. 59 Zhongguancun Street, Haidian District, Beijing, China; Center for Mathematical Sciences and Applications, Harvard University, Boston, MA 02138, USA
|
25
|
Sanchez-Garcia R, Sorzano COS, Carazo JM, Segura J. BIPSPI: a method for the prediction of partner-specific protein-protein interfaces. Bioinformatics 2019; 35:470-477. [PMID: 30020406 PMCID: PMC6361243 DOI: 10.1093/bioinformatics/bty647] [Citation(s) in RCA: 56] [Impact Index Per Article: 11.2] [Received: 01/15/2018] [Accepted: 07/17/2018] [Indexed: 11/15/2022] Open
Abstract
Motivation: Protein-protein interactions (PPIs) are essential for most cellular processes; thus, unveiling how proteins interact is a crucial question that can be better understood by identifying which residues are responsible for the interaction. Computational approaches are orders of magnitude cheaper and faster than experimental ones, leading to a proliferation of methods that aim to predict which residues belong to the interface of an interaction. Results: We present BIPSPI, a new machine learning-based method for the prediction of partner-specific PPI sites. Contrary to most binding site prediction methods, the proposed approach takes into account a pair of interacting proteins rather than a single one in order to predict partner-specific binding sites. BIPSPI has been trained employing sequence-based and structural features from both protein partners of each complex compiled in the Protein-Protein Docking Benchmark version 5.0 and in an additional, independently compiled set. A version trained only on sequences has also been developed. The performance of our approach has been assessed by a leave-one-out cross-validation over different benchmarks, outperforming state-of-the-art methods. Availability and implementation: The BIPSPI web server is freely available at http://bipspi.cnb.csic.es. BIPSPI code is available at https://github.com/bioinsilico/BIPSPI. A Docker image is available at https://hub.docker.com/r/bioinsilico/bipspi/. Supplementary information: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Ruben Sanchez-Garcia
  - GN7 of the Spanish National Institute for Bioinformatics (INB), Biocomputing Unit, National Center of Biotechnology (CSIC), Instruct Image Processing Center, Madrid, Spain
- C O S Sorzano
  - GN7 of the Spanish National Institute for Bioinformatics (INB), Biocomputing Unit, National Center of Biotechnology (CSIC), Instruct Image Processing Center, Madrid, Spain
- J M Carazo
  - GN7 of the Spanish National Institute for Bioinformatics (INB), Biocomputing Unit, National Center of Biotechnology (CSIC), Instruct Image Processing Center, Madrid, Spain
- Joan Segura
  - GN7 of the Spanish National Institute for Bioinformatics (INB), Biocomputing Unit, National Center of Biotechnology (CSIC), Instruct Image Processing Center, Madrid, Spain
|
26
|
Zhao Z, Gong X. Protein-Protein Interaction Interface Residue Pair Prediction Based on Deep Learning Architecture. IEEE/ACM Transactions on Computational Biology and Bioinformatics 2019; 16:1753-1759. [PMID: 28541224 DOI: 10.1109/tcbb.2017.2706682] [Citation(s) in RCA: 16] [Impact Index Per Article: 3.2] [Indexed: 06/07/2023]
Abstract
Motivation: Proteins usually fulfill their biological functions by interacting with other proteins. Although some methods have been developed to predict the binding sites of a monomer protein, these are not sufficient for predicting the interaction between two monomer proteins. The correct prediction of interface residue pairs from two monomer proteins is still an open question and has great significance for practical experimental applications in the life sciences. We hope to build a method for the prediction of interface residue pairs that is suitable for those applications. Results: Here, we developed a novel deep network architecture, the multi-layered Long-Short Term Memory networks (LSTMs) approach, for the prediction of protein interface residue pairs. First, we created three new descriptors and used six other previously used characterizations to describe an amino acid; we then employed these features to discriminate between interface residue pairs and non-interface residue pairs. Second, we used two thresholds to select residue pairs that are more likely to be interface residue pairs. This step increases the proportion of interface residue pairs and reduces the influence of imbalanced data. Third, we built deep network architectures based on the Long-Short Term Memory networks algorithm to organize and refine the prediction of interface residue pairs using the features mentioned above. We trained the deep networks on dimers in the unbound state in the international Protein-Protein Docking Benchmark version 3.0. The updated data sets in versions 4.0 and 5.0 were used as the validation set and test set, respectively. For our best model, the accuracy was over 62 percent when we chose the top 0.2 percent of pairs of every dimer in the test set as predictions, which will be very helpful for understanding protein-protein interaction mechanisms and for guiding biological experiments.
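The evaluation mentioned at the end of this abstract (taking the top 0.2 percent of scored residue pairs of a dimer as predictions and measuring how many are true interface pairs) can be illustrated with a minimal Python sketch. The function name, the toy scores and labels, and the rule for rounding the cutoff are assumptions made for illustration; this is not the authors' implementation.

```python
def top_fraction_accuracy(scores, labels, fraction=0.002):
    """Share of true interface pairs among the top-scoring `fraction` of pairs.

    scores: predicted interface scores, one per residue pair (higher = more likely).
    labels: 1 if the pair is a true interface pair, else 0.
    The cutoff is rounded to the nearest integer and is at least 1
    (an assumption of this sketch, not stated in the abstract).
    """
    if len(scores) != len(labels) or not scores:
        raise ValueError("scores and labels must be non-empty and equal in length")
    k = max(1, round(len(scores) * fraction))
    # Rank pair indices by descending score and keep the first k.
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return sum(labels[i] for i in ranked[:k]) / k


# Toy dimer with 1000 candidate residue pairs, so the top 0.2% is 2 pairs.
scores = [0.0] * 1000
labels = [0] * 1000
scores[10], labels[10] = 0.9, 1  # highest score, true interface pair
scores[20], labels[20] = 0.8, 0  # second highest, not an interface pair
scores[30], labels[30] = 0.1, 1  # true pair ranked too low to be selected
print(top_fraction_accuracy(scores, labels))  # one of the two selected pairs is correct
```

Because the cutoff is taken per dimer, a dataset-level figure would be obtained by applying the function to each dimer separately and then aggregating, for example by averaging.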
|
27
|
Ahmad S, Prathipati P, Tripathi LP, Chen YA, Arya A, Murakami Y, Mizuguchi K. Integrating sequence and gene expression information predicts genome-wide DNA-binding proteins and suggests a cooperative mechanism. Nucleic Acids Res 2019; 46:54-70. [PMID: 29186632 PMCID: PMC5758906 DOI: 10.1093/nar/gkx1166] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.0] [Received: 12/03/2016] [Accepted: 11/15/2017] [Indexed: 12/29/2022] [Open Access]
Abstract
DNA-binding proteins (DBPs) perform diverse biological functions ranging from transcription to pathogen sensing. Machine learning methods can not only identify DBPs de novo but also provide insights into their DNA-recognition dynamics. However, it remains unclear whether available methods that can accurately predict DNA-binding sites in known DBPs can also identify novel DBPs. Moreover, sequence information is blind to the cellular- and disease-specific contexts of DBP activities, whereas the under-utilized knowledge from public gene expression data offers great promise. To address these issues, we have developed novel methods for predicting DBPs by integrating sequence and gene expression-derived features and applied them to explore human, mouse and Arabidopsis proteomes. While our sequence-based models outperformed the gene expression-based ones, some proteins with weaker DBP-like sequence features were correctly predicted by gene expression-based features, suggesting that these proteins acquire a tangible DBP functionality in a conducive gene expression environment. Analysis of motif enrichment among the co-expressed genes of the top 100 candidate DBPs from hitherto unannotated genes provides further avenues to explore their functional associations.
Affiliation(s)
- Shandar Ahmad
  - School of Computational and Integrative Sciences, Jawaharlal Nehru University, New Delhi 110067, India
  - Laboratory of Bioinformatics, National Institutes of Biomedical Innovation, Health and Nutrition, 7-6-8 Saito-asagi, Ibaraki, Osaka 5670085, Japan
- Philip Prathipati
  - Laboratory of Bioinformatics, National Institutes of Biomedical Innovation, Health and Nutrition, 7-6-8 Saito-asagi, Ibaraki, Osaka 5670085, Japan
- Lokesh P Tripathi
  - Laboratory of Bioinformatics, National Institutes of Biomedical Innovation, Health and Nutrition, 7-6-8 Saito-asagi, Ibaraki, Osaka 5670085, Japan
- Yi-An Chen
  - Laboratory of Bioinformatics, National Institutes of Biomedical Innovation, Health and Nutrition, 7-6-8 Saito-asagi, Ibaraki, Osaka 5670085, Japan
- Ajay Arya
  - School of Computational and Integrative Sciences, Jawaharlal Nehru University, New Delhi 110067, India
- Yoichi Murakami
  - Laboratory of Bioinformatics, National Institutes of Biomedical Innovation, Health and Nutrition, 7-6-8 Saito-asagi, Ibaraki, Osaka 5670085, Japan
- Kenji Mizuguchi
  - Laboratory of Bioinformatics, National Institutes of Biomedical Innovation, Health and Nutrition, 7-6-8 Saito-asagi, Ibaraki, Osaka 5670085, Japan
|
28
|
Kamal H, Minhas FUAA, Farooq M, Tripathi D, Hamza M, Mustafa R, Khan MZ, Mansoor S, Pappu HR, Amin I. In silico Prediction and Validations of Domains Involved in Gossypium hirsutum SnRK1 Protein Interaction With Cotton Leaf Curl Multan Betasatellite Encoded βC1. Frontiers in Plant Science 2019; 10:656. [PMID: 31191577 PMCID: PMC6546731 DOI: 10.3389/fpls.2019.00656] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.8] [Received: 12/27/2018] [Accepted: 05/01/2019] [Indexed: 05/19/2023]
Abstract
Cotton leaf curl disease (CLCuD), caused by viruses of the genus Begomovirus, is a major constraint to cotton (Gossypium hirsutum) production in many cotton-growing regions of the world. Symptoms of the disease are caused by Cotton leaf curl Multan betasatellite (CLCuMB), which encodes a pathogenicity determinant protein, βC1. Here, we report the identification of interacting regions in the βC1 protein using computational approaches, including sequence recognition and binding site and interface prediction methods. We show the domain-level interactions based on structural analysis of the G. hirsutum SnRK1 protein and its domains with CLCuMB-βC1. To verify and validate the in silico predictions, three different experimental approaches (yeast two-hybrid, bimolecular fluorescence complementation, and pull-down assays) were used. Our results showed that the ubiquitin-associated (UBA) and autoinhibitory sequence (AIS) domains of G. hirsutum-encoded SnRK1 are involved in the CLCuMB-βC1 interaction. This is the first comprehensive investigation that combined in silico interaction prediction with experimental validation of the interaction between CLCuMB-βC1 and a host protein. We demonstrated that data from computational biology could provide binding site information between CLCuD-associated viruses/satellites and new hosts that lack known binding site information for protein-protein interaction studies. Implications of these findings are discussed.
Affiliation(s)
- Hira Kamal
  - National Institute for Biotechnology and Genetic Engineering, Faisalabad, Pakistan
  - Pakistan Institute of Engineering and Applied Sciences, Islamabad, Pakistan
  - Department of Plant Pathology, Washington State University, Pullman, WA, United States
- Muhammad Farooq
  - National Institute for Biotechnology and Genetic Engineering, Faisalabad, Pakistan
- Diwaker Tripathi
  - Department of Biology, University of Washington, Seattle, WA, United States
- Muhammad Hamza
  - National Institute for Biotechnology and Genetic Engineering, Faisalabad, Pakistan
- Roma Mustafa
  - National Institute for Biotechnology and Genetic Engineering, Faisalabad, Pakistan
- Muhammad Zuhaib Khan
  - National Institute for Biotechnology and Genetic Engineering, Faisalabad, Pakistan
- Shahid Mansoor
  - National Institute for Biotechnology and Genetic Engineering, Faisalabad, Pakistan
- Hanu R. Pappu
  - Department of Plant Pathology, Washington State University, Pullman, WA, United States
- Imran Amin
  - National Institute for Biotechnology and Genetic Engineering, Faisalabad, Pakistan
|
29
|
Jung Y, El-Manzalawy Y, Dobbs D, Honavar VG. Partner-specific prediction of RNA-binding residues in proteins: A critical assessment. Proteins 2018; 87:198-211. [PMID: 30536635 PMCID: PMC6389706 DOI: 10.1002/prot.25639] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.7] [Received: 05/01/2018] [Revised: 10/10/2018] [Accepted: 11/29/2018] [Indexed: 01/06/2023]
Abstract
RNA-protein interactions play essential roles in regulating gene expression. While some RNA-protein interactions are "specific", that is, the RNA-binding proteins preferentially bind to particular RNA sequence or structural motifs, others are "non-RNA specific." Deciphering the protein-RNA recognition code is essential for comprehending the functional implications of these interactions and for developing new therapies for many diseases. Because of the high cost of experimental determination of protein-RNA interfaces, there is a need for computational methods to identify RNA-binding residues in proteins. While most of the existing computational methods for predicting RNA-binding residues in RNA-binding proteins are oblivious to the characteristics of the partner RNA, there is growing interest in methods for partner-specific prediction of RNA binding sites in proteins. In this work, we assess the performance of two recently published partner-specific protein-RNA interface prediction tools, PS-PRIP, and PRIdictor, along with our own new tools. Specifically, we introduce a novel metric, RNA-specificity metric (RSM), for quantifying the RNA-specificity of the RNA binding residues predicted by such tools. Our results show that the RNA-binding residues predicted by previously published methods are oblivious to the characteristics of the putative RNA binding partner. Moreover, when evaluated using partner-agnostic metrics, RNA partner-specific methods are outperformed by the state-of-the-art partner-agnostic methods. We conjecture that either (a) the protein-RNA complexes in PDB are not representative of the protein-RNA interactions in nature, or (b) the current methods for partner-specific prediction of RNA-binding residues in proteins fail to account for the differences in RNA partner-specific versus partner-agnostic protein-RNA interactions, or both.
Affiliation(s)
- Yong Jung
  - Bioinformatics and Genomics Graduate Program, Pennsylvania State University, University Park, Pennsylvania
  - Artificial Intelligence Research Laboratory, Pennsylvania State University, University Park, Pennsylvania
  - The Huck Institutes of the Life Sciences, Pennsylvania State University, University Park, Pennsylvania
- Yasser El-Manzalawy
  - Artificial Intelligence Research Laboratory, Pennsylvania State University, University Park, Pennsylvania
  - Clinical and Translational Sciences Institute, Pennsylvania State University, University Park, Pennsylvania
  - College of Information Sciences and Technology, Pennsylvania State University, Pennsylvania
- Drena Dobbs
  - Bioinformatics and Computational Biology Program, Iowa State University, Ames, Iowa
  - Department of Genetics, Development, and Cell Biology, Iowa State University, Ames, Iowa
- Vasant G Honavar
  - Bioinformatics and Genomics Graduate Program, Pennsylvania State University, University Park, Pennsylvania
  - Artificial Intelligence Research Laboratory, Pennsylvania State University, University Park, Pennsylvania
  - Institute for Cyberscience, Pennsylvania State University, University Park, Pennsylvania
  - Clinical and Translational Sciences Institute, Pennsylvania State University, University Park, Pennsylvania
  - The Huck Institutes of the Life Sciences, Pennsylvania State University, University Park, Pennsylvania
  - College of Information Sciences and Technology, Pennsylvania State University, Pennsylvania
|
30
|
Abbasi WA, Asif A, Ben-Hur A, Minhas FUAA. Learning protein binding affinity using privileged information. BMC Bioinformatics 2018; 19:425. [PMID: 30442086 PMCID: PMC6238365 DOI: 10.1186/s12859-018-2448-z] [Citation(s) in RCA: 24] [Impact Index Per Article: 4.0] [Received: 07/17/2018] [Accepted: 10/25/2018] [Indexed: 01/04/2023] [Open Access]
Abstract
Background: Determining protein-protein interactions and their binding affinity is important in understanding cellular biological processes, discovery and design of novel therapeutics, protein engineering, and mutagenesis studies. Due to the time and effort required in wet lab experiments, computational prediction of binding affinity from sequence or structure is an important area of research. Structure-based methods, though more accurate than sequence-based techniques, are limited in their applicability due to the limited availability of protein structure data. Results: In this study, we propose a novel machine learning method for predicting binding affinity that uses protein 3D structure as privileged information at training time while expecting only protein sequence information during testing. Using the method, which is based on the framework of learning using privileged information (LUPI), we have achieved improved performance over corresponding sequence-based binding affinity prediction methods that do not have access to privileged information during training. Our experiments show that with the proposed framework, which uses structure only during training, it is possible to achieve classification performance comparable to that obtained using structure-based features. Evaluation on an independent test set shows improved performance over the PPA-Pred2 method as well. Conclusions: The proposed method outperforms several baseline learners and a state-of-the-art binding affinity predictor not only in cross-validation, but also on an additional validation dataset, demonstrating the utility of the LUPI framework for problems that would benefit from classification using structure-based features. The implementation of LUPI developed for this work is expected to be useful in other areas of bioinformatics as well.
Affiliation(s)
- Wajid Arshad Abbasi
  - Biomedical Informatics Research Laboratory (BIRL), Department of Computer and Information Sciences (DCIS), Pakistan Institute of Engineering and Applied Sciences (PIEAS), Nilore, ISL, 45650, Pakistan
  - Information Technology Center (ITC), University of Azad Jammu & Kashmir, Muzaffarabad, Azad Kashmir, 13100, Pakistan
  - Department of Computer Science, Colorado State University (CSU), Fort Collins, CO, 80523, USA
- Amina Asif
  - Biomedical Informatics Research Laboratory (BIRL), Department of Computer and Information Sciences (DCIS), Pakistan Institute of Engineering and Applied Sciences (PIEAS), Nilore, ISL, 45650, Pakistan
- Asa Ben-Hur
  - Department of Computer Science, Colorado State University (CSU), Fort Collins, CO, 80523, USA
- Fayyaz Ul Amir Afsar Minhas
  - Biomedical Informatics Research Laboratory (BIRL), Department of Computer and Information Sciences (DCIS), Pakistan Institute of Engineering and Applied Sciences (PIEAS), Nilore, ISL, 45650, Pakistan
|
31
|
Bourquard T, Musnier A, Puard V, Tahir S, Ayoub MA, Jullian Y, Boulo T, Gallay N, Watier H, Bruneau G, Reiter E, Crépieux P, Poupon A. MAbTope: A Method for Improved Epitope Mapping. The Journal of Immunology 2018; 201:3096-3105. [PMID: 30322966 DOI: 10.4049/jimmunol.1701722] [Citation(s) in RCA: 20] [Impact Index Per Article: 3.3] [Received: 12/12/2017] [Accepted: 09/13/2018] [Indexed: 11/19/2022]
Abstract
Abs are very efficient drugs: ∼70 of them are already approved for medical use, over 500 are in clinical development, and many more are in preclinical development. One important step in the characterization and protection of a therapeutic Ab is the determination of its cognate epitope. The gold standard is the three-dimensional structure of the Ab/Ag complex, determined by crystallography or nuclear magnetic resonance spectroscopy. However, this remains a tedious task, and its outcome is uncertain. We have developed MAbTope, a docking-based method for epitope prediction that is associated with straightforward experimental validation procedures. We show that MAbTope predicts the correct epitope for each of 129 tested examples of Ab/Ag complexes of known structure. We further validated this method through the successful determination, and experimental validation (using human embryonic kidney 293 cells), of the epitopes recognized by two therapeutic Abs targeting TNF-α: certolizumab and golimumab.
Affiliation(s)
- Thomas Bourquard
  - Unité de Physiologie de la Reproduction et des Comportements, Institut National de la Recherche Agronomique, Université François Rabelais-Tours, CNRS, 37380 Nouzilly, France
  - Department of Human and Molecular Genetics, Baylor College of Medicine, Houston, TX 77030
- Astrid Musnier
  - Unité de Physiologie de la Reproduction et des Comportements, Institut National de la Recherche Agronomique, Université François Rabelais-Tours, CNRS, 37380 Nouzilly, France
  - MAbSilico Société par Actions Simplifiée, Domaine de l'Orfrasière, 37380 Nouzilly, France
- Vincent Puard
  - MAbSilico Société par Actions Simplifiée, Domaine de l'Orfrasière, 37380 Nouzilly, France
- Shifa Tahir
  - Unité de Physiologie de la Reproduction et des Comportements, Institut National de la Recherche Agronomique, Université François Rabelais-Tours, CNRS, 37380 Nouzilly, France
- Mohammed Akli Ayoub
  - Unité de Physiologie de la Reproduction et des Comportements, Institut National de la Recherche Agronomique, Université François Rabelais-Tours, CNRS, 37380 Nouzilly, France
  - Biology Department, College of Science, United Arab Emirates University, Al Ain, United Arab Emirates
- Yann Jullian
  - Calcul Scientifique et Modélisation Orléans Tours, l'Unité de Formation et de Recherche Sciences et Techniques, Université François-Rabelais, 37041 Tours, France
- Thomas Boulo
  - Unité de Physiologie de la Reproduction et des Comportements, Institut National de la Recherche Agronomique, Université François Rabelais-Tours, CNRS, 37380 Nouzilly, France
- Nathalie Gallay
  - Unité de Physiologie de la Reproduction et des Comportements, Institut National de la Recherche Agronomique, Université François Rabelais-Tours, CNRS, 37380 Nouzilly, France
  - Centre Hospitalier Régional Universitaire de Tours, Université François-Rabelais de Tours, CNRS, UMR 7292, 37041 Tours, France
- Hervé Watier
  - Centre Hospitalier Régional Universitaire de Tours, Université François-Rabelais de Tours, CNRS, UMR 7292, 37041 Tours, France
- Gilles Bruneau
  - Unité de Physiologie de la Reproduction et des Comportements, Institut National de la Recherche Agronomique, Université François Rabelais-Tours, CNRS, 37380 Nouzilly, France
- Eric Reiter
  - Unité de Physiologie de la Reproduction et des Comportements, Institut National de la Recherche Agronomique, Université François Rabelais-Tours, CNRS, 37380 Nouzilly, France
- Pascale Crépieux
  - Unité de Physiologie de la Reproduction et des Comportements, Institut National de la Recherche Agronomique, Université François Rabelais-Tours, CNRS, 37380 Nouzilly, France
- Anne Poupon
  - Unité de Physiologie de la Reproduction et des Comportements, Institut National de la Recherche Agronomique, Université François Rabelais-Tours, CNRS, 37380 Nouzilly, France
|
32
|
Wong AKC, Sze-To HY, Johanning GL. Pattern to Knowledge: Deep Knowledge-Directed Machine Learning for Residue-Residue Interaction Prediction. Sci Rep 2018; 8:14841. [PMID: 30287904 PMCID: PMC6172270 DOI: 10.1038/s41598-018-32834-z] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.7] [Received: 05/31/2018] [Accepted: 09/17/2018] [Indexed: 11/21/2022] [Open Access]
Abstract
Residue-residue close contact (R2R-C) data procured from three-dimensional protein-protein interaction (PPI) experiments is currently used for predicting residue-residue interaction (R2R-I) in PPI. However, due to complex physicochemical environments, R2R-I incidences, facilitated by multiple factors, are usually entangled in the source environment and masked in the acquired data. Here we present a novel method, P2K (Pattern to Knowledge), to disentangle R2R-I patterns and render much more succinct discriminative information expressed in different specific R2R-I statistical/functional spaces. Since such knowledge is not visible in the acquired data, we refer to it as deep knowledge. We leveraged the discovered deep knowledge to construct machine learning models for sequence-based R2R-I prediction, without trial-and-error combination of features over external knowledge of sequences. Our R2R-I predictor was validated under stringent leave-one-complex-out-alone cross-validation on a benchmark dataset and was, surprisingly, shown to perform better than an existing sequence-based R2R-I predictor by 28% (p: 1.9E-08). P2K is accessible via our web server at https://p2k.uwaterloo.ca.
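The leave-one-complex-out validation scheme named in this abstract can be sketched in Python as follows. This is a generic illustration of the splitting idea only; the function name and the complex identifiers are hypothetical, and nothing here reflects the P2K code itself.

```python
from collections import defaultdict


def leave_one_complex_out(complex_ids):
    """Yield (held_out_complex, train_indices, test_indices) for each complex.

    complex_ids: one complex identifier per sample (e.g. per residue pair).
    All samples from the held-out complex form the test fold, so no complex
    contributes to both training and testing within a fold.
    """
    by_complex = defaultdict(list)
    for idx, cid in enumerate(complex_ids):
        by_complex[cid].append(idx)
    for held_out, test_idx in by_complex.items():
        train_idx = [i for i, cid in enumerate(complex_ids) if cid != held_out]
        yield held_out, train_idx, test_idx


# Hypothetical complex labels for six residue-pair samples.
sample_complexes = ["1ABC", "1ABC", "2XYZ", "2XYZ", "2XYZ", "3DEF"]
folds = list(leave_one_complex_out(sample_complexes))
for held_out, train_idx, test_idx in folds:
    print(held_out, train_idx, test_idx)
```

Grouping by complex rather than by individual sample keeps residue pairs from the same structure out of the training data for that fold, which is what makes this kind of cross-validation stringent.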
Affiliation(s)
- Andrew K C Wong
  - Department of Systems Design Engineering, University of Waterloo, 200 University Avenue West, Waterloo, N2L 3G1, Ontario, Canada
- Ho Yin Sze-To
  - Department of Systems Design Engineering, University of Waterloo, 200 University Avenue West, Waterloo, N2L 3G1, Ontario, Canada
- Gary L Johanning
  - Biosciences Division, SRI International, 333 Ravenswood Ave, Menlo Park, CA, USA
|
33
|
Macalino SJY, Basith S, Clavio NAB, Chang H, Kang S, Choi S. Evolution of In Silico Strategies for Protein-Protein Interaction Drug Discovery. Molecules 2018; 23:E1963. [PMID: 30082644 PMCID: PMC6222862 DOI: 10.3390/molecules23081963] [Citation(s) in RCA: 62] [Impact Index Per Article: 10.3] [Received: 07/17/2018] [Revised: 08/03/2018] [Accepted: 08/04/2018] [Indexed: 12/14/2022] [Open Access]
Abstract
The advent of advanced molecular modeling software, big data analytics, and high-speed processing units has led to the exponential evolution of modern drug discovery and better insights into complex biological processes and disease networks. This has progressively steered current research interests toward understanding protein-protein interaction (PPI) systems that are related to a number of relevant diseases, such as cancer, neurological illnesses, and metabolic disorders. However, targeting PPIs is challenging due to their "undruggable" binding interfaces. In this review, we focus on the current obstacles that impede PPI drug discovery, and on how recent discoveries and advances in in silico approaches can alleviate these barriers to expedite the search for potential leads, as shown in several exemplary studies. We also discuss currently available information on PPI compounds and systems, along with their usefulness in molecular modeling. Finally, we conclude by presenting the limits of in silico application in drug discovery and offer a perspective on the field of computer-aided PPI drug discovery.
Affiliation(s)
- Stephani Joy Y Macalino
  - College of Pharmacy and Graduate School of Pharmaceutical Sciences, Ewha Womans University, Seoul 03760, Korea
- Shaherin Basith
  - College of Pharmacy and Graduate School of Pharmaceutical Sciences, Ewha Womans University, Seoul 03760, Korea
- Nina Abigail B Clavio
  - College of Pharmacy and Graduate School of Pharmaceutical Sciences, Ewha Womans University, Seoul 03760, Korea
- Hyerim Chang
  - College of Pharmacy and Graduate School of Pharmaceutical Sciences, Ewha Womans University, Seoul 03760, Korea
- Soosung Kang
  - College of Pharmacy and Graduate School of Pharmaceutical Sciences, Ewha Womans University, Seoul 03760, Korea
- Sun Choi
  - College of Pharmacy and Graduate School of Pharmaceutical Sciences, Ewha Womans University, Seoul 03760, Korea
|
34
|
Daberdaku S, Ferrari C. Exploring the potential of 3D Zernike descriptors and SVM for protein-protein interface prediction. BMC Bioinformatics 2018; 19:35. [PMID: 29409446 PMCID: PMC5802066 DOI: 10.1186/s12859-018-2043-3] [Citation(s) in RCA: 27] [Impact Index Per Article: 4.5] [Received: 08/29/2017] [Accepted: 01/24/2018] [Indexed: 12/22/2022] [Open Access]
Abstract
Background: The correct determination of protein–protein interaction interfaces is important for understanding disease mechanisms and for rational drug design. To date, several computational methods for the prediction of protein interfaces have been developed, but the interface prediction problem is still not fully understood. Experimental evidence suggests that the location of binding sites is imprinted in the protein structure, but there are major differences among the interfaces of the various protein types: the characterising properties can vary a lot depending on the interaction type and function. The selection of an optimal set of features characterising the protein interface and the development of an effective method to represent and capture the complex protein recognition patterns are of paramount importance for this task. Results: In this work we investigate the potential of a novel local surface descriptor based on 3D Zernike moments for the interface prediction task. Descriptors invariant to roto-translations are extracted from circular patches of the protein surface enriched with physico-chemical properties from the HQI8 amino acid index set, and are used as samples for a binary classification problem. Support Vector Machines are used as a classifier to distinguish interface local surface patches from non-interface ones. The proposed method was validated on 16 classes of proteins extracted from the Protein–Protein Docking Benchmark 5.0 and compared to other state-of-the-art protein interface predictors (SPPIDER, PrISE and NPS-HomPPI). Conclusions: The 3D Zernike descriptors are able to capture the similarity among patterns of physico-chemical and biochemical properties mapped on the protein surface arising from the various spatial arrangements of the underlying residues, and their usage can be easily extended to other sets of amino acid properties.
The results suggest that the choice of a proper set of features characterising the protein interface is crucial for the interface prediction task, and that optimality strongly depends on the class of proteins whose interface we want to characterise. We postulate that different protein classes should be treated separately and that it is necessary to identify an optimal set of features for each protein class. Electronic supplementary material: The online version of this article (10.1186/s12859-018-2043-3) contains supplementary material, which is available to authorized users.
Affiliation(s)
- Sebastian Daberdaku
  - Department of Information Engineering, University of Padova, via Gradenigo 6/A, Padova, 35131, Italy
- Carlo Ferrari
  - Department of Information Engineering, University of Padova, via Gradenigo 6/A, Padova, 35131, Italy
|
35
|
Yang Y, Gong X. A new probability method to understand protein-protein interface formation mechanism at amino acid level. J Theor Biol 2018; 436:18-25. [DOI: 10.1016/j.jtbi.2017.09.026] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.7] [Received: 05/04/2017] [Revised: 09/21/2017] [Accepted: 09/27/2017] [Indexed: 10/18/2022]
|
36
|
Different protein-protein interface patterns predicted by different machine learning methods. Sci Rep 2017; 7:16023. [PMID: 29167570 PMCID: PMC5700192 DOI: 10.1038/s41598-017-16397-z] [Citation(s) in RCA: 16] [Impact Index Per Article: 2.3] [Received: 07/05/2017] [Accepted: 11/13/2017] [Indexed: 12/02/2022] [Open Access]
Abstract
Different types of protein-protein interactions produce different protein-protein interface patterns, and different machine learning methods are suited to different types of data. Is it then also the case that different machine learning methods favor different interface patterns in prediction? Here, four different machine learning methods were employed to predict protein-protein interface residue pairs on different interface patterns. The performance of the methods differs across protein types, which suggests that different machine learning methods tend to predict different protein-protein interface patterns. We used ANOVA and variable selection to confirm our result. Our proposed methods, which combine the advantages of different single methods, also achieved good prediction results compared with the single methods. In addition to the prediction of protein-protein interactions, this idea can be extended to other research areas such as protein structure prediction and design.
|
37
|
Membrane proteins structures: A review on computational modeling tools. Biochimica et Biophysica Acta - Biomembranes 2017; 1859:2021-2039. [DOI: 10.1016/j.bbamem.2017.07.008] [Citation(s) in RCA: 62] [Impact Index Per Article: 8.9] [Received: 03/23/2017] [Revised: 07/04/2017] [Accepted: 07/13/2017] [Indexed: 01/02/2023]
|
38
|
Murakami Y, Tripathi LP, Prathipati P, Mizuguchi K. Network analysis and in silico prediction of protein-protein interactions with applications in drug discovery. Curr Opin Struct Biol 2017; 44:134-142. [PMID: 28364585 DOI: 10.1016/j.sbi.2017.02.005] [Citation(s) in RCA: 61] [Impact Index Per Article: 8.7] [Received: 11/26/2016] [Revised: 02/05/2017] [Accepted: 02/23/2017] [Indexed: 11/29/2022]
Abstract
Protein-protein interactions (PPIs) are vital to maintaining cellular homeostasis. Several PPI dysregulations have been implicated in the etiology of various diseases and hence PPIs have emerged as promising targets for drug discovery. Surface residues and hotspot residues at the interface of PPIs form the core regions, which play a key role in modulating cellular processes such as signal transduction and are used as starting points for drug design. In this review, we briefly discuss how PPI networks (PPINs) inferred from experimentally characterized PPI data have been utilized for knowledge discovery and how in silico approaches to PPI characterization can contribute to PPIN-based biological research. Next, we describe the principles of in silico PPI prediction and survey the existing PPI and PPI site prediction servers that are useful for drug discovery. Finally, we discuss the potential of in silico PPI prediction in drug discovery.
Collapse
Affiliation(s)
- Yoichi Murakami
- National Institutes of Biomedical Innovation, Health and Nutrition, 7-6-8 Saito Asagi, Ibaraki, Osaka 567-0085, Japan.
| | - Lokesh P Tripathi
- National Institutes of Biomedical Innovation, Health and Nutrition, 7-6-8 Saito Asagi, Ibaraki, Osaka 567-0085, Japan.
| | - Philip Prathipati
- National Institutes of Biomedical Innovation, Health and Nutrition, 7-6-8 Saito Asagi, Ibaraki, Osaka 567-0085, Japan
| | - Kenji Mizuguchi
- National Institutes of Biomedical Innovation, Health and Nutrition, 7-6-8 Saito Asagi, Ibaraki, Osaka 567-0085, Japan.
| |
|
39
|
Zhang J, Kurgan L. Review and comparative assessment of sequence-based predictors of protein-binding residues. Brief Bioinform 2017; 19:821-837. [DOI: 10.1093/bib/bbx022] [Citation(s) in RCA: 45] [Impact Index Per Article: 6.4] [Received: 12/05/2016] [Indexed: 12/31/2022] Open
Affiliation(s)
- Jian Zhang
- School of Computer and Information Technology, Xinyang Normal University
| | - Lukasz Kurgan
- Department of Computer Science, Virginia Commonwealth University, Richmond, VA, USA
| |
|
40
|
Garcia-Garcia J, Valls-Comamala V, Guney E, Andreu D, Muñoz FJ, Fernandez-Fuentes N, Oliva B. iFrag: A Protein–Protein Interface Prediction Server Based on Sequence Fragments. J Mol Biol 2017; 429:382-389. [DOI: 10.1016/j.jmb.2016.11.034] [Citation(s) in RCA: 29] [Impact Index Per Article: 4.1] [Received: 06/29/2016] [Revised: 11/27/2016] [Accepted: 11/30/2016] [Indexed: 01/08/2023]
|
41
|
Integrating computational methods and experimental data for understanding the recognition mechanism and binding affinity of protein-protein complexes. PROGRESS IN BIOPHYSICS AND MOLECULAR BIOLOGY 2017; 128:33-38. [PMID: 28069340 DOI: 10.1016/j.pbiomolbio.2017.01.001] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.0] [Received: 03/30/2016] [Revised: 01/04/2017] [Accepted: 01/05/2017] [Indexed: 01/09/2023]
Abstract
Protein-protein interactions perform several functions inside the cell. Understanding the recognition mechanism and binding affinity of protein-protein complexes is a challenging problem in experimental and computational biology. In this review, we focus on two aspects: (i) understanding the recognition mechanism and (ii) predicting the binding affinity. The first part deals with computational techniques for identifying the binding site residues and the contribution of important interactions for understanding the recognition mechanism of protein-protein complexes, in comparison with experimental observations. The second part is devoted to methods developed for discriminating high and low affinity complexes, and for predicting the binding affinity of protein-protein complexes either from three-dimensional structural information or from the amino acid sequence alone. The overall view enhances our understanding of the integration of experimental data and computational methods, the recognition mechanism of protein-protein complexes, and the binding affinity.
|
42
|
Important amino acid residues involved in folding and binding of protein–protein complexes. Int J Biol Macromol 2017; 94:438-444. [DOI: 10.1016/j.ijbiomac.2016.10.045] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.9] [Received: 07/06/2016] [Revised: 10/07/2016] [Accepted: 10/15/2016] [Indexed: 01/12/2023]
|
43
|
Computational Approaches for Predicting Binding Partners, Interface Residues, and Binding Affinity of Protein-Protein Complexes. Methods Mol Biol 2017; 1484:237-253. [PMID: 27787830 DOI: 10.1007/978-1-4939-6406-2_16] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.7] [Indexed: 12/20/2022]
Abstract
Studying protein-protein interactions leads to a better understanding of the underlying principles of several biological pathways. Costly and labor-intensive experimental techniques suggest the need for computational methods to complement them. Several such state-of-the-art methods have been reported for analyzing diverse aspects such as predicting binding partners, interface residues, and binding affinity for protein-protein complexes with reliable performance. However, there are specific drawbacks for different methods that indicate the need for their improvement. This review highlights various available computational algorithms for analyzing diverse aspects of protein-protein interactions and endorses the necessity of developing new robust methods for gaining deep insights into protein-protein interactions.
|
44
|
Laine E, Carbone A. Protein social behavior makes a stronger signal for partner identification than surface geometry. Proteins 2016; 85:137-154. [PMID: 27802579 PMCID: PMC5242317 DOI: 10.1002/prot.25206] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.9] [Received: 06/23/2016] [Revised: 10/10/2016] [Accepted: 10/20/2016] [Indexed: 01/26/2023]
Abstract
Cells are interactive living systems where protein movements, interactions and regulation are substantially free from centralized management. How protein physico‐chemical and geometrical properties determine who interacts with whom remains far from fully understood. We show that characterizing how a protein behaves with many potential interactors in a complete cross‐docking study leads to a sharp identification of its cellular/true/native partner(s). We define a sociability index, or S‐index, reflecting whether or not a protein likes to pair with other proteins. Formally, we propose a suitable normalization function that accounts for protein sociability and we combine it with a simple interface‐based (ranking) score to discriminate partners from non‐interactors. We show that sociability is an important factor and that the normalization achieves a much higher discriminative power than shape complementarity docking scores. The social effect is also observed with more sophisticated docking algorithms. Docking conformations are evaluated using experimental binding sites. These approximate, in the best possible way, binding site predictions, which have reached high accuracy in recent years. This makes our analysis helpful for a global understanding of partner identification and for suggesting discriminating strategies. These results contradict previous findings claiming that the partner identification problem is solvable solely with geometrical docking. Proteins 2016; 85:137–154. © 2016 Wiley Periodicals, Inc.
Affiliation(s)
- Elodie Laine
- Sorbonne Universités, UPMC-Univ P6, CNRS, Laboratoire de Biologie Computationnelle et Quantitative - UMR 7238, Paris, 75005, France
| | - Alessandra Carbone
- Sorbonne Universités, UPMC-Univ P6, CNRS, Laboratoire de Biologie Computationnelle et Quantitative - UMR 7238, Paris, 75005, France.,Institut Universitaire de France, Paris, 75005, France
| |
|
45
|
Esmaielbeiki R, Krawczyk K, Knapp B, Nebel JC, Deane CM. Progress and challenges in predicting protein interfaces. Brief Bioinform 2016; 17:117-31. [PMID: 25971595 PMCID: PMC4719070 DOI: 10.1093/bib/bbv027] [Citation(s) in RCA: 100] [Impact Index Per Article: 12.5] [Received: 01/29/2015] [Revised: 03/18/2015] [Indexed: 12/31/2022] Open
Abstract
The majority of biological processes are mediated via protein-protein interactions. Determination of residues participating in such interactions improves our understanding of molecular mechanisms and facilitates the development of therapeutics. Experimental approaches to identifying interacting residues, such as mutagenesis, are costly and time-consuming; computational methods for this purpose could therefore streamline conventional pipelines. Here we review the field of computational protein interface prediction. We make a distinction between methods which address proteins in general and those targeted at antibodies, owing to the radically different binding mechanism of antibodies. We organize the multitude of currently available methods hierarchically based on required input and prediction principles to provide an overview of the field.
|
46
|
Srinivasulu YS, Wang JR, Hsu KT, Tsai MJ, Charoenkwan P, Huang WL, Huang HL, Ho SY. Characterizing informative sequence descriptors and predicting binding affinities of heterodimeric protein complexes. BMC Bioinformatics 2015; 16 Suppl 18:S14. [PMID: 26681483 PMCID: PMC4682391 DOI: 10.1186/1471-2105-16-s18-s14] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.3] [Indexed: 11/14/2022] Open
Abstract
Background: Protein-protein interactions (PPIs) are involved in various biological processes, and the underlying mechanism of the interactions plays a crucial role in therapeutics and protein engineering. Most machine learning approaches have been developed for predicting the binding affinity of protein-protein complexes based on structure and functional information. This work aims to predict the binding affinity of heterodimeric protein complexes from sequences only.
Results: This work proposes a support vector machine (SVM) based binding affinity classifier, called SVM-BAC, to classify heterodimeric protein complexes based on the prediction of their binding affinity. SVM-BAC identified 14 of 580 sequence descriptors (physicochemical, energetic and conformational properties of the 20 amino acids) to classify 216 heterodimeric protein complexes into low and high binding affinity. SVM-BAC yielded a training accuracy, sensitivity, specificity, AUC and test accuracy of 85.80%, 0.89, 0.83, 0.86 and 83.33%, respectively, better than existing machine learning algorithms. The 14 features and support vector regression were further used to estimate the binding affinities (pKd) of 200 heterodimeric protein complexes. In a jackknife test, prediction performance was a correlation coefficient of 0.34 and a mean absolute error of 1.4. We further analyze three informative physicochemical properties according to their contribution to prediction performance. Results reveal that the following properties are effective in predicting the binding affinity of heterodimeric protein complexes: apparent partition energy based on buried molar fractions, relations between chemical structure and biological activity in principal component analysis IV, and normalized frequency of beta turn.
Conclusions: The proposed sequence-based prediction method SVM-BAC uses an optimal feature selection method to identify 14 informative features to classify and predict binding affinity of heterodimeric protein complexes. The characterization analysis revealed that the average numbers of beta turns and hydrogen bonds at protein-protein interfaces are higher in high binding affinity complexes than in low binding affinity complexes.
|
47
|
Xue LC, Dobbs D, Bonvin AMJJ, Honavar V. Computational prediction of protein interfaces: A review of data driven methods. FEBS Lett 2015; 589:3516-26. [PMID: 26460190 PMCID: PMC4655202 DOI: 10.1016/j.febslet.2015.10.003] [Citation(s) in RCA: 101] [Impact Index Per Article: 11.2] [Received: 07/06/2015] [Revised: 10/01/2015] [Accepted: 10/02/2015] [Indexed: 01/06/2023]
Abstract
Reliably pinpointing which specific amino acid residues form the interface(s) between a protein and its binding partner(s) is critical for understanding the structural and physicochemical determinants of protein recognition and binding affinity, and has wide applications in modeling and validating protein interactions predicted by high-throughput methods, in engineering proteins, and in prioritizing drug targets. Here, we review the basic concepts, principles and recent advances in computational approaches to the analysis and prediction of protein-protein interfaces. We point out caveats for objectively evaluating interface predictors, and discuss various applications of data-driven interface predictors for improving energy model-driven protein-protein docking. Finally, we stress the importance of exploiting binding partner information in reliably predicting interfaces and highlight recent advances in this emerging direction.
Affiliation(s)
- Li C Xue
- Faculty of Science - Chemistry, Bijvoet Center for Biomolecular Research, Utrecht Univ., Utrecht 3584 CH, The Netherlands.
| | - Drena Dobbs
- Department of Genetics, Development & Cell Biology, Iowa State Univ., Ames, IA 50011, USA; Bioinformatics & Computational Biology Program, Iowa State Univ., Ames, IA 50011, USA
| | - Alexandre M J J Bonvin
- Faculty of Science - Chemistry, Bijvoet Center for Biomolecular Research, Utrecht Univ., Utrecht 3584 CH, The Netherlands
| | - Vasant Honavar
- College of Information Sciences & Technology, Pennsylvania State Univ., University Park, PA 16802, USA; Genomics & Bioinformatics Program, Pennsylvania State Univ., University Park, PA 16802, USA; Neuroscience Program, Pennsylvania State Univ., University Park, PA 16802, USA; The Huck Institutes of the Life Sciences, Pennsylvania State Univ., University Park, PA 16802, USA; Center for Big Data Analytics & Discovery Informatics, Pennsylvania State Univ., University Park, PA 16802, USA; Institute for Cyberscience, Pennsylvania State Univ., University Park, PA 16802, USA
| |
|
48
|
Sriwastava BK, Basu S, Maulik U. Predicting Protein-Protein Interaction Sites with a Novel Membership Based Fuzzy SVM Classifier. IEEE/ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS 2015; 12:1394-1404. [PMID: 26684462 DOI: 10.1109/tcbb.2015.2401018] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.0] [Indexed: 06/05/2023]
Abstract
Predicting residues that participate in protein-protein interactions (PPI) helps to identify which amino acids are located at the interface. In this paper, we show that the performance of the classical support vector machine (SVM) algorithm can be further improved with the use of a custom-designed fuzzy membership function, for the partner-specific PPI interface prediction problem. We evaluated the performances of both classical SVM and fuzzy SVM (F-SVM) on the PPI databases of three different model proteomes, Homo sapiens, Escherichia coli and Saccharomyces cerevisiae, and calculated the statistical significance of the developed F-SVM over the classical SVM algorithm. We also compared our performance with the available state-of-the-art fuzzy methods in this domain and observed significant performance improvements. To predict interaction sites in protein complexes, the local composition of amino acids together with their physico-chemical characteristics is used, where the F-SVM based prediction method exploits the membership function for each pair of sequence fragments. The average F-SVM performance (area under ROC curve) on the test samples in a 10-fold cross-validation experiment is measured as 77.07, 78.39, and 74.91 percent for the aforementioned organisms, respectively. Performances on independent test sets are obtained as 72.09, 73.24 and 82.74 percent, respectively. The software is available for free download from http://code.google.com/p/cmater-bioinfo.
|
49
|
Tuvshinjargal N, Lee W, Park B, Han K. Predicting protein-binding RNA nucleotides with consideration of binding partners. COMPUTER METHODS AND PROGRAMS IN BIOMEDICINE 2015; 120:3-15. [PMID: 25907142 DOI: 10.1016/j.cmpb.2015.03.010] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Received: 11/21/2014] [Revised: 03/30/2015] [Accepted: 03/30/2015] [Indexed: 06/04/2023]
Abstract
In recent years several computational methods have been developed to predict RNA-binding sites in proteins. Most of these methods do not consider interacting partners of a protein, so they predict the same RNA-binding sites for a given protein sequence even if the protein binds to different RNAs. Unlike the problem of predicting RNA-binding sites in proteins, the problem of predicting protein-binding sites in RNA has received little attention, mainly because it is much more difficult and shows a lower accuracy on average. In our previous study, we developed a method that predicts protein-binding nucleotides from an RNA sequence. In an effort to improve the prediction accuracy and usefulness of the previous method, we developed a new method that uses both RNA and protein sequence data. In this study, we identified effective features of RNA and protein molecules and developed a new support vector machine (SVM) model to predict protein-binding nucleotides from RNA and protein sequence data. The new model that used both protein and RNA sequence data achieved a sensitivity of 86.5%, a specificity of 86.2%, a positive predictive value (PPV) of 72.6%, a negative predictive value (NPV) of 93.8% and a Matthews correlation coefficient (MCC) of 0.69 in a 10-fold cross-validation; it achieved a sensitivity of 58.8%, a specificity of 87.4%, a PPV of 65.1%, an NPV of 84.2% and an MCC of 0.48 in independent testing. For comparison, we built another prediction model that used RNA sequence data alone and ran it on the same dataset. In a 10-fold cross-validation it achieved a sensitivity of 85.7%, a specificity of 80.5%, a PPV of 67.7%, an NPV of 92.2% and an MCC of 0.63; in independent testing it achieved a sensitivity of 67.7%, a specificity of 78.8%, a PPV of 57.6%, an NPV of 85.2% and an MCC of 0.45.
In both cross-validations and independent testing, the new model that used both RNA and protein sequences showed a better performance than the model that used RNA sequence data alone in most performance measures. To the best of our knowledge, this is the first sequence-based prediction of protein-binding nucleotides in RNA which considers the binding partner of RNA. The new model will provide valuable information for designing biochemical experiments to find putative protein-binding sites in RNA with unknown structure.
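For readers comparing the figures quoted in this abstract, the five measures (sensitivity, specificity, PPV, NPV and MCC) all follow from the four cells of a binary confusion matrix. The following is a minimal sketch of the standard definitions; the function name `classification_metrics` and the counts in the usage example are hypothetical illustrations, not data or code from the cited study.

```python
import math


def classification_metrics(tp: int, fp: int, tn: int, fn: int) -> dict:
    """Standard binary-classification measures from confusion-matrix counts.

    tp/fn: binding nucleotides predicted as binding / non-binding.
    tn/fp: non-binding nucleotides predicted as non-binding / binding.
    """
    sensitivity = tp / (tp + fn)   # fraction of true binders recovered
    specificity = tn / (tn + fp)   # fraction of non-binders recovered
    ppv = tp / (tp + fp)           # precision of positive predictions
    npv = tn / (tn + fn)           # precision of negative predictions
    # Matthews correlation coefficient: balanced even when classes are skewed.
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "ppv": ppv,
        "npv": npv,
        "mcc": mcc,
    }


# Hypothetical counts, chosen only to illustrate the arithmetic.
metrics = classification_metrics(tp=80, fp=30, tn=170, fn=20)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Because PPV and NPV depend on the ratio of binding to non-binding nucleotides in the test set, they can shift between cross-validation and independent testing even when sensitivity and specificity are similar; MCC summarizes all four cells in a single value.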
Affiliation(s)
| | - Wook Lee
- Department of Computer Science and Engineering, Inha University, Incheon, South Korea
| | - Byungkyu Park
- Department of Computer Science and Engineering, Inha University, Incheon, South Korea
| | - Kyungsook Han
- Department of Computer Science and Engineering, Inha University, Incheon, South Korea.
| |
|
50
|
Chen YA, Murakami Y, Ahmad S, Yoshimaru T, Katagiri T, Mizuguchi K. Brefeldin A-inhibited guanine nucleotide-exchange protein 3 (BIG3) is predicted to interact with its partner through an ARM-type α-helical structure. BMC Res Notes 2014; 7:435. [PMID: 24997568 PMCID: PMC4096751 DOI: 10.1186/1756-0500-7-435] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.9] [Received: 04/23/2014] [Accepted: 06/30/2014] [Indexed: 12/21/2022] Open
Abstract
Background: Brefeldin A-inhibited guanine nucleotide-exchange protein 3 (BIG3) has been identified recently as a novel regulator of estrogen signalling in breast cancer cells. Despite being a potential target for new breast cancer treatment, its amino acid sequence suggests no association with any well-characterized protein family and provides few clues as to its molecular function. In this paper, we predicted the structure, function and interactions of BIG3 using a range of bioinformatic tools.
Results: Homology search results showed that BIG3 had distinct features from its paralogues, BIG1 and BIG2, with a unique region between the two shared domains, Sec7 and DUF1981. Although BIG3 contains a Sec7 domain, the lack of the conserved motif and the critical glutamate residue suggested no potential guaninyl-exchange factor (GEF) activity. Fold recognition tools predicted BIG3 to adopt an α-helical repeat structure similar to that of the armadillo (ARM) family. Using state-of-the-art methods, we predicted interaction sites between BIG3 and its partner PHB2.
Conclusions: The combined results of the structure and interaction prediction led to a novel hypothesis that one of the predicted helices of BIG3 might play an important role in binding to PHB2 and thereby preventing its translocation to the nucleus. This hypothesis has been subsequently verified experimentally.
Affiliation(s)
| | | | | | | | | | - Kenji Mizuguchi
- National Institute of Biomedical Innovation, 7-6-8 Saito-asagi, Ibaraki city, Osaka 567-0085, Japan.
| |
|