1
Nandigrami P, Fiser A. Assessing the functional impact of protein binding site definition. Protein Sci 2024; 33:e5026. [PMID: 38757384 PMCID: PMC11099757 DOI: 10.1002/pro.5026] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/06/2023] [Revised: 05/01/2024] [Accepted: 05/03/2024] [Indexed: 05/18/2024]
Abstract
Many biomedical applications, such as classification of binding specificities or bioengineering, depend on the accurate definition of protein binding interfaces. Depending on the choice of method used, substantially different sets of residues can be classified as belonging to the interface of a protein. A typical approach used to verify these definitions is to mutate residues and measure the impact of these changes on binding. Besides the lack of exhaustive data, this approach also suffers from the fundamental problem that a mutation introduces an unknown amount of alteration into an interface, which potentially alters the binding characteristics of the interface. In this study we explore the impact of alternative binding site definitions on the ability of a protein to recognize its cognate ligand using a pharmacophore approach, which does not affect the interface. The study also shows that methods for protein binding interface predictions should perform above approximately F-score = 0.7 accuracy level to capture the biological function of a protein.
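The F-score threshold mentioned in this abstract can be illustrated with a minimal sketch. It assumes that two interface definitions are compared as plain sets of residue identifiers and that "F-score" means the usual F1 (harmonic mean of precision and recall); the function name and the residue labels are hypothetical and not taken from the paper.

```python
def interface_f_score(predicted, reference):
    """F1 score between two sets of interface residue identifiers."""
    predicted, reference = set(predicted), set(reference)
    true_pos = len(predicted & reference)
    if true_pos == 0:
        return 0.0
    precision = true_pos / len(predicted)   # fraction of predicted residues that are correct
    recall = true_pos / len(reference)      # fraction of reference residues that are recovered
    return 2 * precision * recall / (precision + recall)


# Hypothetical residue labels (chain + residue number), for illustration only.
reference = {"A45", "A46", "A50", "A72", "A73"}
predicted = {"A45", "A46", "A50", "A74"}
score = interface_f_score(predicted, reference)
print(f"F-score = {score:.3f}")
```

In this toy case three of four predicted residues are correct and three of five reference residues are recovered, so the definition falls just short of the approximate 0.7 level discussed in the abstract.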
Affiliation(s)
- Prithviraj Nandigrami
  - Departments of Systems and Computational Biology, and Biochemistry, Albert Einstein College of Medicine, Bronx, New York, USA
- Andras Fiser
  - Departments of Systems and Computational Biology, and Biochemistry, Albert Einstein College of Medicine, Bronx, New York, USA
2
Nikam R, Yugandhar K, Gromiha MM. DeepBSRPred: deep learning-based binding site residue prediction for proteins. Amino Acids 2023; 55:1305-1316. [PMID: 36574037 DOI: 10.1007/s00726-022-03228-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/09/2022] [Accepted: 12/15/2022] [Indexed: 12/28/2022]
Abstract
MOTIVATION Protein-protein interactions (PPIs) govern several cellular activities. Amino acid residues located at the interface are known as binding sites, and information about binding sites helps in understanding the binding affinities and functions of protein-protein complexes. RESULTS We have developed a deep neural network-based method, DeepBSRPred, for predicting the binding sites using protein sequence information and predicted structures from AlphaFold2. Specific sequence and structure-based features include position-specific scoring matrix (PSSM), solvent accessible surface area, conservation score and amino acid properties, and residue depth, respectively. Our method predicted the binding sites with an average F1 score of 0.73 in a dataset of 1236 proteins. Further, we compared the performance with other existing methods in the literature using four benchmark datasets, and our method outperformed those methods. AVAILABILITY AND IMPLEMENTATION The DeepBSRPred web server can be found at https://web.iitm.ac.in/bioinfo2/deepbsrpred/index.html, along with all datasets used in this study. The trained models, the DeepBSRPred standalone source code, and the feature computation pipeline are freely available at https://web.iitm.ac.in/bioinfo2/deepbsrpred/download.html.
Affiliation(s)
- Rahul Nikam
  - Department of Biotechnology, Bhupat and Jyoti Mehta School of Biosciences, Indian Institute of Technology Madras, Chennai, Tamil Nadu, 600036, India
- Kumar Yugandhar
  - Department of Biotechnology, Bhupat and Jyoti Mehta School of Biosciences, Indian Institute of Technology Madras, Chennai, Tamil Nadu, 600036, India
  - Department of Computational Biology, Cornell University, New York, NY, USA
- M Michael Gromiha
  - Department of Biotechnology, Bhupat and Jyoti Mehta School of Biosciences, Indian Institute of Technology Madras, Chennai, Tamil Nadu, 600036, India
  - Department of Computer Science, Tokyo Institute of Technology, Yokohama, Japan
3
Xavier JAM, Fuentes I, Nuez-Martínez M, Viñas C, Teixidor F. Single stop analysis of a protein surface using molecular probe electrochemistry. J Mater Chem B 2023; 11:8422-8432. [PMID: 37563960 DOI: 10.1039/d3tb00816a] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 08/12/2023]
Abstract
Visualization of a protein in its native form and environment without any interference has always been a challenging task. Contrary to the assumption that protein surfaces are smooth, they are in fact highly irregular with undulating surfaces. Hence, in this study, we have tackled this ambiguous nature of the 'surface' of a protein by considering the 'effective' protein surface (EPS) with respect to its interaction with the geometrically well-defined and structurally inert anionic molecule [3,3'-Co(1,2-C2B9H11)2]-, abbreviated as [o-COSAN]-, whose stability, propensity for amine residues, and self-assembling abilities are well reported. This study demonstrates the intricacies of protein surfaces exploiting simple electrochemical measurements using a 'small molecule' redox-active probe. This technique offers the advantage of not utilizing any harsh experimental conditions that could alter the native structure of the protein and hence the protein integrity is retained. Identification of the amino acid residues which are most involved in the interactions with [3,3'-Co(1,2-C2B9H11)2]- and how a protein's environment affects these interactions can help in gaining insights into how to modify proteins to optimize their interactions particularly in the fields of drug design and biotechnology. In this research, we have demonstrated that [3,3'-Co(1,2-C2B9H11)2]- anionic small molecules are excellent candidates for studying and visualizing protein surfaces in their natural environment and allow proteins to be classified according to the surface composition, which imparts their properties. [3,3'-Co(1,2-C2B9H11)2]- 'viewed' each protein surface differently and hence has the potential to act as a simple and easy to handle cantilever for measuring and picturing protein surfaces.
Affiliation(s)
- Jewel Ann Maria Xavier
  - Institut de Ciencia de Materials de Barcelona (ICMAB-CSIC), Campus de la UAB, Bellaterra, Spain
- Isabel Fuentes
  - Institut de Ciencia de Materials de Barcelona (ICMAB-CSIC), Campus de la UAB, Bellaterra, Spain
- Miquel Nuez-Martínez
  - Institut de Ciencia de Materials de Barcelona (ICMAB-CSIC), Campus de la UAB, Bellaterra, Spain
- Clara Viñas
  - Institut de Ciencia de Materials de Barcelona (ICMAB-CSIC), Campus de la UAB, Bellaterra, Spain
- Francesc Teixidor
  - Institut de Ciencia de Materials de Barcelona (ICMAB-CSIC), Campus de la UAB, Bellaterra, Spain
4
Mohseni Behbahani Y, Laine E, Carbone A. Deep Local Analysis deconstructs protein-protein interfaces and accurately estimates binding affinity changes upon mutation. Bioinformatics 2023; 39:i544-i552. [PMID: 37387162 DOI: 10.1093/bioinformatics/btad231] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Indexed: 07/01/2023]
Abstract
MOTIVATION The spectacular recent advances in protein and protein complex structure prediction hold promise for reconstructing interactomes at large-scale and residue resolution. Beyond determining the 3D arrangement of interacting partners, modeling approaches should be able to unravel the impact of sequence variations on the strength of the association. RESULTS In this work, we report on Deep Local Analysis, a novel and efficient deep learning framework that relies on a strikingly simple deconstruction of protein interfaces into small locally oriented residue-centered cubes and on 3D convolutions recognizing patterns within cubes. Merely based on the two cubes associated with the wild-type and the mutant residues, DLA accurately estimates the binding affinity change for the associated complexes. It achieves a Pearson correlation coefficient of 0.735 on about 400 mutations on unseen complexes. Its generalization capability on blind datasets of complexes is higher than the state-of-the-art methods. We show that taking into account the evolutionary constraints on residues contributes to predictions. We also discuss the influence of conformational variability on performance. Beyond the predictive power on the effects of mutations, DLA is a general framework for transferring the knowledge gained from the available non-redundant set of complex protein structures to various tasks. For instance, given a single partially masked cube, it recovers the identity and physicochemical class of the central residue. Given an ensemble of cubes representing an interface, it predicts the function of the complex. AVAILABILITY AND IMPLEMENTATION Source code and models are available at http://gitlab.lcqb.upmc.fr/DLA/DLA.git.
Affiliation(s)
- Yasser Mohseni Behbahani
  - Laboratory of Computational and Quantitative Biology (LCQB), UMR 7238, Sorbonne Université, CNRS, IBPS, Paris 75005, France
- Elodie Laine
  - Laboratory of Computational and Quantitative Biology (LCQB), UMR 7238, Sorbonne Université, CNRS, IBPS, Paris 75005, France
- Alessandra Carbone
  - Laboratory of Computational and Quantitative Biology (LCQB), UMR 7238, Sorbonne Université, CNRS, IBPS, Paris 75005, France
5
Huang Y, Wuchty S, Zhou Y, Zhang Z. SGPPI: structure-aware prediction of protein-protein interactions in rigorous conditions with graph convolutional network. Brief Bioinform 2023; 24:6995378. [PMID: 36682013 DOI: 10.1093/bib/bbad020] [Citation(s) in RCA: 9] [Impact Index Per Article: 9.0] [Received: 09/09/2022] [Revised: 11/17/2022] [Accepted: 01/05/2023] [Indexed: 01/23/2023]
Abstract
While deep learning (DL)-based models have emerged as powerful approaches to predict protein-protein interactions (PPIs), the reliance on explicit similarity measures (e.g. sequence similarity and network neighborhood) to known interacting proteins makes these methods ineffective in dealing with novel proteins. The advent of AlphaFold2 presents a significant opportunity and also a challenge to predict PPIs in a straightforward way based on monomer structures while controlling bias from protein sequences. In this work, we established Structure and Graph-based Predictions of Protein Interactions (SGPPI), a structure-based DL framework for predicting PPIs, using the graph convolutional network. In particular, SGPPI focused on protein patches on the protein-protein binding interfaces and extracted the structural, geometric and evolutionary features from the residue contact map to predict PPIs. We demonstrated that our model outperforms traditional machine learning methods and state-of-the-art DL-based methods using non-representation-bias benchmark datasets. Moreover, our model trained on human dataset can be reliably transferred to predict yeast PPIs, indicating that SGPPI can capture converging structural features of protein interactions across various species. The implementation of SGPPI is available at https://github.com/emerson106/SGPPI.
Affiliation(s)
- Yan Huang
  - State Key Laboratory of Livestock and Poultry Biotechnology Breeding, College of Biological Sciences, China Agricultural University, Beijing 100193, China
  - Department of Biomedical Informatics, Ministry of Education Key Laboratory of Molecular Cardiovascular Sciences, Center for Non-Coding RNA Medicine, School of Basic Medical Sciences, Peking University, Beijing 100191, China
- Stefan Wuchty
  - Department of Computer Science, University of Miami, Coral Gables, FL 33146, USA
  - Department of Biology, University of Miami, Coral Gables, FL 33146, USA
  - Sylvester Comprehensive Cancer Center, University of Miami, Miami, FL 33136, USA
  - Institute of Data Science and Computing, University of Miami, Coral Gables, FL 33146, USA
- Yuan Zhou
  - Department of Biomedical Informatics, Ministry of Education Key Laboratory of Molecular Cardiovascular Sciences, Center for Non-Coding RNA Medicine, School of Basic Medical Sciences, Peking University, Beijing 100191, China
- Ziding Zhang
  - State Key Laboratory of Livestock and Poultry Biotechnology Breeding, College of Biological Sciences, China Agricultural University, Beijing 100193, China
6
Mohseni Behbahani Y, Crouzet S, Laine E, Carbone A. Deep Local Analysis evaluates protein docking conformations with locally oriented cubes. Bioinformatics 2022; 38:4505-4512. [PMID: 35962985 PMCID: PMC9525006 DOI: 10.1093/bioinformatics/btac551] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 04/05/2022] [Revised: 07/04/2022] [Accepted: 08/08/2022] [Indexed: 12/24/2022]
Abstract
MOTIVATION With the recent advances in protein 3D structure prediction, protein interactions are becoming more central than ever before. Here, we address the problem of determining how proteins interact with one another. More specifically, we investigate the possibility of discriminating near-native protein complex conformations from incorrect ones by exploiting local environments around interfacial residues. RESULTS Deep Local Analysis (DLA)-Ranker is a deep learning framework applying 3D convolutions to a set of locally oriented cubes representing the protein interface. It explicitly considers the local geometry of the interfacial residues along with their neighboring atoms and the regions of the interface with different solvent accessibility. We assessed its performance on three docking benchmarks made of half a million acceptable and incorrect conformations. We show that DLA-Ranker successfully identifies near-native conformations from ensembles generated by molecular docking. It surpasses or competes with other deep learning-based scoring functions. We also showcase its usefulness to discover alternative interfaces. AVAILABILITY AND IMPLEMENTATION http://gitlab.lcqb.upmc.fr/dla-ranker/DLA-Ranker.git. SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Yasser Mohseni Behbahani
  - Sorbonne Université, CNRS, IBPS, Laboratory of Computational and Quantitative Biology (LCQB), UMR 7238, Paris 75005, France
- Simon Crouzet
  - Sorbonne Université, CNRS, IBPS, Laboratory of Computational and Quantitative Biology (LCQB), UMR 7238, Paris 75005, France
7
Schweke H, Mucchielli MH, Chevrollier N, Gosset S, Lopes A. SURFMAP: A Software for Mapping in Two Dimensions Protein Surface Features. J Chem Inf Model 2022; 62:1595-1601. [DOI: 10.1021/acs.jcim.1c01269] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Indexed: 11/28/2022]
Affiliation(s)
- Hugo Schweke
  - Université Paris-Saclay, CEA, CNRS, Institute for Integrative Biology of the Cell (I2BC), Gif-sur-Yvette 91198, France
  - Department of Chemical and Structural Biology, Weizmann Institute of Science, Rehovot 7610001, Israel
- Marie-Hélène Mucchielli
  - Université Paris-Saclay, CNRS, INRAE, Université Evry, Institute of Plant Sciences Paris-Saclay (IPS2), Gif-sur-Yvette 91190, France
  - Université de Paris, Institute of Plant Sciences Paris-Saclay (IPS2), Gif-sur-Yvette 91190, France
- Simon Gosset
  - Université Paris-Saclay, CNRS, INRAE, Université Evry, Institute of Plant Sciences Paris-Saclay (IPS2), Gif-sur-Yvette 91190, France
  - Université de Paris, Institute of Plant Sciences Paris-Saclay (IPS2), Gif-sur-Yvette 91190, France
- Anne Lopes
  - Université Paris-Saclay, CEA, CNRS, Institute for Integrative Biology of the Cell (I2BC), Gif-sur-Yvette 91198, France
8
From complete cross-docking to partners identification and binding sites predictions. PLoS Comput Biol 2022; 18:e1009825. [PMID: 35089918 PMCID: PMC8827487 DOI: 10.1371/journal.pcbi.1009825] [Citation(s) in RCA: 7] [Impact Index Per Article: 3.5] [Received: 08/29/2021] [Revised: 02/09/2022] [Accepted: 01/11/2022] [Indexed: 11/19/2022]
Abstract
Proteins ensure their biological functions by interacting with each other. Hence, characterising protein interactions is fundamental for our understanding of the cellular machinery, and for improving medicine and bioengineering. Over the past years, a large body of experimental data has been accumulated on who interacts with whom and in what manner. However, these data are highly heterogeneous and sometimes contradictory, noisy, and biased. Ab initio methods provide a means to a "blind" protein-protein interaction network reconstruction. Here, we report on a molecular cross-docking-based approach for the identification of protein partners. The docking algorithm uses a coarse-grained representation of the protein structures and treats them as rigid bodies. We applied the approach to a few hundred of proteins, in the unbound conformations, and we systematically investigated the influence of several key ingredients, such as the size and quality of the interfaces, and the scoring function. We achieved some significant improvement compared to previous works, and a very high discriminative power on some specific functional classes. We provide a readout of the contributions of shape and physico-chemical complementarity, interface matching, and specificity, in the predictions. In addition, we assessed the ability of the approach to account for protein surface multiple usages, and we compared it with a sequence-based deep learning method. This work may contribute to guiding the exploitation of the large amounts of protein structural models now available toward the discovery of unexpected partners and their complex structure characterisation.
9
Xie Z, Xu J. Deep graph learning of inter-protein contacts. Bioinformatics 2021; 38:947-953. [PMID: 34755837 PMCID: PMC8796373 DOI: 10.1093/bioinformatics/btab761] [Citation(s) in RCA: 25] [Impact Index Per Article: 8.3] [Received: 08/13/2021] [Revised: 10/06/2021] [Accepted: 11/04/2021] [Indexed: 02/03/2023]
Abstract
MOTIVATION Inter-protein (interfacial) contact prediction is very useful for in silico structural characterization of protein-protein interactions. Although deep learning has been applied to this problem, its accuracy is not as good as intra-protein contact prediction. RESULTS We propose a new deep learning method GLINTER (Graph Learning of INTER-protein contacts) for interfacial contact prediction of dimers, leveraging a rotational invariant representation of protein tertiary structures and a pretrained language model of multiple sequence alignments. Tested on the 13th and 14th CASP-CAPRI datasets, the average top L/10 precision achieved by GLINTER is 54% on the homodimers and 52% on all the dimers, much higher than 30% obtained by the latest deep learning method DeepHomo on the homodimers and 15% obtained by BIPSPI on all the dimers. Our experiments show that GLINTER-predicted contacts help improve selection of docking decoys. AVAILABILITY AND IMPLEMENTATION The software is available at https://github.com/zw2x/glinter. The datasets are available at https://github.com/zw2x/glinter/data. SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
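The "top L/10 precision" reported in this abstract can be sketched as follows. This is an illustration only: it assumes predicted contacts arrive as a mapping from residue pairs to scores, and that L is a chain length supplied by the caller (the abstract does not say which length is used for a dimer). All names and numbers below are hypothetical.

```python
def top_k_precision(scores, true_contacts, k):
    """Fraction of the k highest-scoring residue pairs that are true contacts."""
    ranked = sorted(scores, key=scores.get, reverse=True)[:k]
    if not ranked:
        return 0.0
    hits = sum(1 for pair in ranked if pair in true_contacts)
    return hits / len(ranked)


# Hypothetical predicted scores for (residue in chain A, residue in chain B) pairs.
scores = {(1, 5): 0.9, (2, 7): 0.8, (3, 9): 0.4}
true_contacts = {(1, 5), (3, 9)}

chain_length = 20                    # assumed value of L for this toy example
k = max(1, chain_length // 10)       # top L/10 predictions
precision = top_k_precision(scores, true_contacts, k)
print(f"top L/10 precision = {precision:.2f}")
```

Only the highest-ranked pairs count, so a method is rewarded for placing true contacts at the top of its list rather than for overall coverage.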
Affiliation(s)
- Ziwei Xie
  - Toyota Technological Institute at Chicago, Chicago, IL 60637, USA
- Jinbo Xu
  - To whom correspondence should be addressed.
10
Li Y, Golding GB, Ilie L. DELPHI: accurate deep ensemble model for protein interaction sites prediction. Bioinformatics 2021; 37:896-904. [PMID: 32840562 DOI: 10.1093/bioinformatics/btaa750] [Citation(s) in RCA: 51] [Impact Index Per Article: 17.0] [Received: 03/30/2020] [Revised: 08/14/2020] [Accepted: 08/19/2020] [Indexed: 12/15/2022]
Abstract
MOTIVATION Proteins usually perform their functions by interacting with other proteins, which is why accurately predicting protein-protein interaction (PPI) binding sites is a fundamental problem. Experimental methods are slow and expensive. Therefore, great efforts are being made towards increasing the performance of computational methods. RESULTS We propose DEep Learning Prediction of Highly probable protein Interaction sites (DELPHI), a new sequence-based deep learning suite for PPI-binding sites prediction. DELPHI has an ensemble structure which combines a CNN and an RNN component with a fine-tuning technique. Three novel features, HSP, position information and ProtVec, are used in addition to nine existing ones. We comprehensively compare DELPHI to nine state-of-the-art programmes on five datasets, and DELPHI outperforms the competing methods in all metrics even though its training dataset shares the least similarities with the testing datasets. In the most important metrics, AUPRC and MCC, it surpasses the second best programmes by as much as 18.5% and 27.7%, respectively. We also demonstrated that the improvement is essentially due to using the ensemble model and, especially, the three new features. Using DELPHI, it is shown that there is a strong correlation between protein-binding residues (PBRs) and sites with strong evolutionary conservation. In addition, DELPHI's predicted PBR sites closely match known data from Pfam. DELPHI is available as open-sourced standalone software and web server. AVAILABILITY AND IMPLEMENTATION The DELPHI web server can be found at delphi.csd.uwo.ca/, with all datasets and results in this study. The trained models, the DELPHI standalone source code, and the feature computation pipeline are freely available at github.com/lucian-ilie/DELPHI. SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
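Of the two metrics highlighted in this abstract, MCC (Matthews correlation coefficient) has a closed form that is easy to show. The sketch below uses the standard textbook definition from confusion-matrix counts; the counts are invented for illustration and are not results from the paper.

```python
import math


def matthews_corrcoef_from_counts(tp, fp, tn, fn):
    """Matthews correlation coefficient from confusion-matrix counts."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0  # common convention when any marginal total is zero
    return (tp * tn - fp * fn) / denom


# Invented counts for a 200-residue protein with 50 true binding residues.
mcc = matthews_corrcoef_from_counts(tp=30, fp=20, tn=130, fn=20)
print(f"MCC = {mcc:.3f}")
```

MCC uses all four cells of the confusion matrix, which is why it is often preferred over accuracy when binding residues are a small minority of all residues.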
Affiliation(s)
- Yiwei Li
  - Department of Computer Science, The University of Western Ontario, London, ON N6A 5B7, Canada
- G Brian Golding
  - Department of Biology, McMaster University, Hamilton, ON L8S 4K1, Canada
- Lucian Ilie
  - Department of Computer Science, The University of Western Ontario, London, ON N6A 5B7, Canada
11
Beytur S. Marker residue types at the structural regions of transmembrane alpha-helical and beta-barrel interfaces. Proteins 2021; 89:1145-1157. [PMID: 33890696 DOI: 10.1002/prot.26087] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/05/2020] [Revised: 04/13/2021] [Accepted: 04/16/2021] [Indexed: 11/11/2022]
Abstract
Membrane proteins perform a variety of biological functions essential to the survival of organisms, and the functionality of these proteins often depends on their homo- or hetero-complexation. Encoded by ~30% of the genome in most organisms, they represent the target of over half of current drugs. Spanning the entirety of the cell membrane, transmembrane proteins are the most common type of membrane protein and can be classified by secondary structure: alpha-helical and beta-barrel. Protein-protein interactions (PPIs) have been widely studied for globular proteins, and many computational tools are available for predicting PPI sites and constructing models of complexes. Here, the structural regions of a non-redundant set of 232 alpha-helical and 37 beta-barrel transmembrane complexes and their interfaces are analyzed. Using residue composition, frequency and propensity, this study sheds light on the marker residue types located at the structural regions of alpha-helical and beta-barrel transmembrane homomeric protein complexes and of their interfaces. This study also shows the need to relate frequency to composition as a ratio in order to immediately identify residue types that present high frequencies at the interface and/or at one of its structural regions despite being minor contributors, compared to other residue types, to that location's residue composition.
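The idea of relating frequency to composition as a ratio can be sketched in a few lines. The exact definitions used in the paper are not given in this abstract, so the sketch assumes one common form of propensity: the share of a residue type among interface residues divided by its share among all residues. The sequences below are toy data.

```python
from collections import Counter


def interface_propensities(interface_residues, all_residues):
    """Ratio of each residue type's share at the interface to its overall share."""
    if not interface_residues or not all_residues:
        return {}
    at_interface = Counter(interface_residues)
    overall = Counter(all_residues)
    n_interface = len(interface_residues)
    n_total = len(all_residues)
    return {
        aa: (at_interface[aa] / n_interface) / (overall[aa] / n_total)
        for aa in overall
    }


# Toy data: one-letter residue codes for the whole protein and for its interface.
all_residues = list("LLLLLLGGWA")
interface_residues = list("LLWG")
propensity = interface_propensities(interface_residues, all_residues)
for aa, value in sorted(propensity.items()):
    print(aa, round(value, 2))
```

In this toy case leucine makes up half of the interface yet has a propensity below 1, while tryptophan contributes a single residue and has the highest propensity. That is the kind of minor contributor with a strong interface preference that a ratio makes visible.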
Affiliation(s)
- Sercan Beytur
  - Faculty of Engineering and Natural Sciences, Department of Bioinformatics and Genetics, Kadir Has University, Istanbul, Turkey
12
Chang RL, Stanley JA, Robinson MC, Sher JW, Li Z, Chan YA, Omdahl AR, Wattiez R, Godzik A, Matallana-Surget S. Protein structure, amino acid composition and sequence determine proteome vulnerability to oxidation-induced damage. EMBO J 2020; 39:e104523. [PMID: 33073387 PMCID: PMC7705453 DOI: 10.15252/embj.2020104523] [Citation(s) in RCA: 22] [Impact Index Per Article: 5.5] [Received: 03/09/2020] [Revised: 09/16/2020] [Accepted: 09/22/2020] [Indexed: 02/05/2023]
Abstract
Oxidative stress alters cell viability, from microorganism irradiation sensitivity to human aging and neurodegeneration. Deleterious effects of protein carbonylation by reactive oxygen species (ROS) make understanding molecular properties determining ROS susceptibility essential. The radiation‐resistant bacterium Deinococcus radiodurans accumulates less carbonylation than sensitive organisms, making it a key model for deciphering properties governing oxidative stress resistance. We integrated shotgun redox proteomics, structural systems biology, and machine learning to resolve properties determining protein damage by γ‐irradiation in Escherichia coli and D. radiodurans at multiple scales. Local accessibility, charge, and lysine enrichment accurately predict ROS susceptibility. Lysine, methionine, and cysteine usage also contribute to ROS resistance of the D. radiodurans proteome. Our model predicts proteome maintenance machinery, and proteins protecting against ROS are more resistant in D. radiodurans. Our findings substantiate that protein‐intrinsic protection impacts oxidative stress resistance, identifying causal molecular properties.
Affiliation(s)
- Roger L Chang
  - Department of Systems Biology, Blavatnik Institute at Harvard Medical School, Boston, MA, USA
  - Wyss Institute for Biologically Inspired Engineering, Harvard University, Boston, MA, USA
- Julian A Stanley
  - Department of Systems Biology, Blavatnik Institute at Harvard Medical School, Boston, MA, USA
- Matthew C Robinson
  - Department of Systems Biology, Blavatnik Institute at Harvard Medical School, Boston, MA, USA
- Joel W Sher
  - Department of Systems Biology, Blavatnik Institute at Harvard Medical School, Boston, MA, USA
- Zhanwen Li
  - Division of Biomedical Sciences, University of California Riverside School of Medicine, Riverside, CA, USA
- Yujia A Chan
  - Department of Systems Biology, Blavatnik Institute at Harvard Medical School, Boston, MA, USA
  - Wyss Institute for Biologically Inspired Engineering, Harvard University, Boston, MA, USA
- Ashton R Omdahl
  - Department of Systems Biology, Blavatnik Institute at Harvard Medical School, Boston, MA, USA
- Ruddy Wattiez
  - Department of Proteomics and Microbiology, Research Institute for Biosciences, University of Mons, Mons, Belgium
- Adam Godzik
  - Division of Biomedical Sciences, University of California Riverside School of Medicine, Riverside, CA, USA
- Sabine Matallana-Surget
  - Division of Biological and Environmental Sciences, Faculty of Natural Sciences, University of Stirling, Stirling, UK
13
Ait-Hamlat A, Zea DJ, Labeeuw A, Polit L, Richard H, Laine E. Transcripts' Evolutionary History and Structural Dynamics Give Mechanistic Insights into the Functional Diversity of the JNK Family. J Mol Biol 2020; 432:2121-2140. [PMID: 32067951 DOI: 10.1016/j.jmb.2020.01.032] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Received: 08/26/2019] [Revised: 01/03/2020] [Accepted: 01/28/2020] [Indexed: 12/14/2022]
Abstract
Alternative splicing and alternative initiation/termination transcription sites have the potential to greatly expand the proteome in eukaryotes by producing several transcript isoforms from the same gene. Although these mechanisms are well described at the genomic level, little is known about their contribution to protein evolution and their impact at the protein structure level. Here, we address both issues by reconstructing the evolutionary history of transcripts and by modeling the tertiary structures of the corresponding protein isoforms. We reconstruct phylogenetic forests relating 60 protein-coding transcripts from the c-Jun N-terminal kinase (JNK) family observed in seven species. We identify two alternative splicing events of ancient origin and show that they induce subtle changes in the protein's structural dynamics. We highlight a previously uncharacterized transcript whose predicted structure seems stable in solution. We further demonstrate that orphan transcripts, for which no phylogeny could be reconstructed, display peculiar sequence and structural properties. Our approach is implemented in PhyloSofS (Phylogenies of Splicing Isoforms Structures), a fully automated computational tool freely available at https://github.com/PhyloSofS-Team/PhyloSofS.
Affiliation(s)
- Adel Ait-Hamlat
  - Sorbonne Université, CNRS, IBPS, Laboratoire de Biologie Computationnelle et Quantitative (LCQB), Paris, 75005, France
- Diego Javier Zea
  - Sorbonne Université, CNRS, IBPS, Laboratoire de Biologie Computationnelle et Quantitative (LCQB), Paris, 75005, France
- Antoine Labeeuw
  - Sorbonne Université, CNRS, IBPS, Laboratoire de Biologie Computationnelle et Quantitative (LCQB), Paris, 75005, France
- Lélia Polit
  - Sorbonne Université, CNRS, IBPS, Laboratoire de Biologie Computationnelle et Quantitative (LCQB), Paris, 75005, France
- Hugues Richard
  - Sorbonne Université, CNRS, IBPS, Laboratoire de Biologie Computationnelle et Quantitative (LCQB), Paris, 75005, France
- Elodie Laine
  - Sorbonne Université, CNRS, IBPS, Laboratoire de Biologie Computationnelle et Quantitative (LCQB), Paris, 75005, France
14
Multiple protein-DNA interfaces unravelled by evolutionary information, physico-chemical and geometrical properties. PLoS Comput Biol 2020; 16:e1007624. [PMID: 32012150 PMCID: PMC7018136 DOI: 10.1371/journal.pcbi.1007624] [Citation(s) in RCA: 14] [Impact Index Per Article: 3.5] [Received: 09/13/2019] [Revised: 02/13/2020] [Accepted: 12/20/2019] [Indexed: 02/06/2023]
Abstract
Interactions between proteins and nucleic acids are at the heart of many essential biological processes. Despite increasing structural information about how these interactions may take place, our understanding of the usage made of protein surfaces by nucleic acids is still very limited. This is in part due to the inherent complexity associated with protein surface deformability and evolution. In this work, we present a method that helps decipher such complexity by predicting protein-DNA interfaces and characterizing their properties. It relies on three biologically and physically meaningful descriptors, namely evolutionary conservation, physico-chemical properties and surface geometry. We carefully assessed its performance on several hundred protein structures and compared it to several state-of-the-art machine-learning methods. Our approach achieves a higher sensitivity compared to the other methods, with a similar precision. Importantly, we show that it is able to unravel ‘hidden’ binding sites by applying it to unbound protein structures and to proteins binding to DNA via multiple sites and in different conformations. It is also applicable to the detection of RNA-binding sites, without significant loss of performance. This confirms that DNA and RNA-binding sites share similar properties. Our method is implemented as a fully automated tool, JET2DNA, freely accessible at: http://www.lcqb.upmc.fr/JET2DNA. We also provide a new dataset of 187 protein-DNA complex structures, along with a subset of 82 associated unbound structures. The set represents the largest body of high-resolution crystallographic structures of protein-DNA complexes, uses biological protein assemblies as DNA-binding units, and covers all major types of protein-DNA interactions. It is available at: http://www.lcqb.upmc.fr/PDNAbenchmarks.

Protein-DNA interactions are essential to living organisms and their impairment is associated with many diseases. For these reasons, they have become increasingly important therapeutic targets. Experimental structure determination has revealed different binding motifs and modes, associated with different functions. Yet, the available structural data gives us only a glimpse of the multiplicity and complexity of protein surface usage by DNA. In this work, we use a three-layer model to describe and predict DNA-binding sites at protein surfaces. Given a protein, we consider the way its residues are conserved through evolution, their physico-chemical properties and geometrical shapes to decrypt its surface. We are able to detect a large portion of interacting residues with good precision, even when they are ‘hidden’ by conformational changes. We highlight cases where one protein binds DNA via distinct regions to perform different functions. We are able to uncover the alternative binding sites and relate their properties to their specific roles. Our work can help guide mutagenesis experiments and the development of new drugs specifically targeting one site while limiting possible side effects.
|
15
|
Abstract
The conformation of water around proteins is of paramount importance, as it determines protein interactions. Although the average water properties around the surface of proteins have been characterized experimentally and computationally, protein surfaces are highly heterogeneous. Therefore, it is crucial to determine the correlations of water to the local distributions of polar and nonpolar protein surface domains to understand functions such as aggregation, mutations, and delivery. By using atomistic simulations, we investigate the orientation and dynamics of water molecules next to 4 types of protein surface domains: negatively charged, positively charged, and charge-neutral polar and nonpolar amino acids. The negatively charged amino acids orient around 98% of the neighboring water dipoles toward the protein surface, and such correlation persists up to around 16 Å from the protein surface. The positively charged amino acids orient around 94% of the nearest water dipoles against the protein surface, and the correlation persists up to around 12 Å. The charge-neutral polar and nonpolar amino acids also orient the neighboring water molecules, but in a quantitatively weaker manner. A similar trend was observed in the residence time of the nearest water neighbors. These findings hold true for 3 technically important enzymes (PETase, cytochrome P450, and organophosphorus hydrolase). Our results demonstrate that the degree of water-amino acid correlation follows the same trend as the amino acid contribution to protein solubility, namely, the negatively charged amino acids are the most beneficial for protein solubility, then the positively charged amino acids, and finally the charge-neutral amino acids.
|
16
|
Laine E, Karami Y, Carbone A. GEMME: a simple and fast global epistatic model predicting mutational effects. Mol Biol Evol 2019; 36:2604-2619. [PMID: 31406981 PMCID: PMC6805226 DOI: 10.1093/molbev/msz179] [Citation(s) in RCA: 57] [Impact Index Per Article: 11.4] [Received: 02/14/2019] [Revised: 06/03/2019] [Accepted: 08/02/2019] [Indexed: 12/15/2022]
Abstract
The systematic and accurate description of protein mutational landscapes is a question of utmost importance in biology, bioengineering, and medicine. Recent progress has been achieved by leveraging the increasing wealth of genomic data and by modeling intersite dependencies within biological sequences. However, state-of-the-art methods remain time consuming. Here, we present Global Epistatic Model for predicting Mutational Effects (GEMME) (www.lcqb.upmc.fr/GEMME), an original and fast method that predicts mutational outcomes by explicitly modeling the evolutionary history of natural sequences. This allows accounting for all positions in a sequence when estimating the effect of a given mutation. GEMME uses only a few biologically meaningful and interpretable parameters. Assessed against 50 high- and low-throughput mutational experiments, it overall performs similarly to or better than existing methods. It accurately predicts the mutational landscapes of a wide range of protein families, including viral ones and, more generally, highly conserved families. Given an input alignment, it generates the full mutational landscape of a protein in a matter of minutes. It is freely available as a package and a webserver at www.lcqb.upmc.fr/GEMME/.
Affiliation(s)
- Elodie Laine
- Sorbonne Université, UPMC University Paris 06, CNRS, IBPS, UMR 7238, Laboratoire de Biologie Computationnelle et Quantitative (LCQB), 75005 Paris, France
| | - Yasaman Karami
- Sorbonne Université, UPMC University Paris 06, CNRS, IBPS, UMR 7238, Laboratoire de Biologie Computationnelle et Quantitative (LCQB), 75005 Paris, France; Sorbonne Université, UPMC-Univ P6, Institut du Calcul et de la Simulation
| | - Alessandra Carbone
- Sorbonne Université, UPMC University Paris 06, CNRS, IBPS, UMR 7238, Laboratoire de Biologie Computationnelle et Quantitative (LCQB), 75005 Paris, France; Institut Universitaire de France
| |
|
17
|
Reille S, Garnier M, Robert X, Gouet P, Martin J, Launay G. Identification and visualization of protein binding regions with the ArDock server. Nucleic Acids Res 2019; 46:W417-W422. [PMID: 29905873 PMCID: PMC6031020 DOI: 10.1093/nar/gky472] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.8] [Received: 01/31/2018] [Accepted: 05/28/2018] [Indexed: 12/21/2022]
Abstract
ArDock (ardock.ibcp.fr) is a structural bioinformatics web server for the prediction and the visualization of potential interaction regions at protein surfaces. ArDock ranks the surface residues of a protein according to their tendency to form interfaces in a set of predefined docking experiments between the query protein and a set of arbitrary protein probes. The ArDock methodology is derived from large-scale cross-docking studies where it was observed that randomly chosen proteins tend to dock in a non-random way at protein surfaces. The method predicts the interaction site of the protein, or alternate interfaces in the case of proteins with multiple interaction modes. The server takes a protein structure as input and computes a score for each surface residue. Its output focuses on the interactive visualization of results and on interoperability with other services.
Affiliation(s)
- Sébastien Reille
- Molecular Microbiology and Structural Biochemistry, Unité Mixte de Recherche, Université Claude Bernard Lyon 1, Centre National de la Recherche Scientifique, 69367 Lyon Cedex 07, France
| | - Mélanie Garnier
- Molecular Microbiology and Structural Biochemistry, Unité Mixte de Recherche, Université Claude Bernard Lyon 1, Centre National de la Recherche Scientifique, 69367 Lyon Cedex 07, France
| | - Xavier Robert
- Molecular Microbiology and Structural Biochemistry, Unité Mixte de Recherche, Université Claude Bernard Lyon 1, Centre National de la Recherche Scientifique, 69367 Lyon Cedex 07, France
| | - Patrice Gouet
- Molecular Microbiology and Structural Biochemistry, Unité Mixte de Recherche, Université Claude Bernard Lyon 1, Centre National de la Recherche Scientifique, 69367 Lyon Cedex 07, France
| | - Juliette Martin
- Molecular Microbiology and Structural Biochemistry, Unité Mixte de Recherche, Université Claude Bernard Lyon 1, Centre National de la Recherche Scientifique, 69367 Lyon Cedex 07, France
| | - Guillaume Launay
- Molecular Microbiology and Structural Biochemistry, Unité Mixte de Recherche, Université Claude Bernard Lyon 1, Centre National de la Recherche Scientifique, 69367 Lyon Cedex 07, France
- To whom correspondence should be addressed. Tel: +33 437 652 936; Fax: +33 472 722 601;
| |
|
18
|
e Silva KSF, Lima RM, Baeza LC, Lima PDS, Cordeiro TDM, Charneau S, da Silva RA, Soares CMDA, Pereira M. Interactome of Glyceraldehyde-3-Phosphate Dehydrogenase Points to the Existence of Metabolons in Paracoccidioides lutzii. Front Microbiol 2019; 10:1537. [PMID: 31338083 PMCID: PMC6629890 DOI: 10.3389/fmicb.2019.01537] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.2] [Received: 04/16/2019] [Accepted: 06/20/2019] [Indexed: 11/13/2022]
Abstract
Paracoccidioides is a dimorphic fungus, the causative agent of paracoccidioidomycosis. The disease is endemic within Latin America and prevalent in Brazil. The treatment is based on azoles, sulfonamides and amphotericin B. The search for new treatment approaches is a real necessity for neglected infections. Glyceraldehyde-3-phosphate dehydrogenase (GAPDH) is an essential glycolytic enzyme, well known for its multitude of functions within cells and therefore categorized as a moonlighting protein. To our knowledge, this is the first study of the Paracoccidioides genus describing PPIs with GAPDH as a target. Here, we show an overview of the experimental GAPDH interactome in different phases of Paracoccidioides lutzii and an in silico analysis of 18 protein partners. GAPDH interacted with 207 proteins in P. lutzii. Several proteins bound to GAPDH in mycelium, transition and yeast phases are common to important pathways such as glycolysis and TCA. We performed a co-immunoprecipitation assay to validate the complex formed by GAPDH with triose phosphate isomerase, enolase, isocitrate lyase and 2-methylcitrate synthase. We found GAPDH participating in complexes with proteins of specific pathways, indicating the existence of a glycolytic and a TCA metabolon in P. lutzii. GAPDH interacted with several proteins that undergo regulation by nitrosylation. In addition, we modeled the GAPDH 3-D structure and performed molecular dynamics and molecular docking in order to identify the interface between GAPDH and the interacting proteins. Despite the large number of interacting proteins, GAPDH has only four main regions of contact with interacting proteins, reflecting its ancestry and conservation over evolution.
Affiliation(s)
| | - Raisa Melo Lima
- Laboratório de Biologia Molecular, Instituto de Ciências Biológicas, Universidade Federal de Goiás, Goiânia, Brazil
| | - Lilian Cristiane Baeza
- Laboratório de Biologia Molecular, Instituto de Ciências Biológicas, Universidade Federal de Goiás, Goiânia, Brazil
| | - Patrícia de Sousa Lima
- Laboratório de Biologia Molecular, Instituto de Ciências Biológicas, Universidade Federal de Goiás, Goiânia, Brazil
| | - Thuany de Moura Cordeiro
- Laboratório de Bioquímica e Química de Proteínas, Departamento de Biologia Celular, Universidade de Brasília, Brasília, Brazil
| | - Sébastien Charneau
- Laboratório de Bioquímica e Química de Proteínas, Departamento de Biologia Celular, Universidade de Brasília, Brasília, Brazil
| | - Roosevelt Alves da Silva
- Núcleo Colaborativo de Biossistemas, Instituto de Ciências Exatas, Universidade Federal de Jataí, Goiás, Brazil
| | | | - Maristela Pereira
- Laboratório de Biologia Molecular, Instituto de Ciências Biológicas, Universidade Federal de Goiás, Goiânia, Brazil
| |
|
19
|
Dequeker C, Laine E, Carbone A. Decrypting protein surfaces by combining evolution, geometry, and molecular docking. Proteins 2019; 87:952-965. [PMID: 31199528 PMCID: PMC6852240 DOI: 10.1002/prot.25757] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.8] [Received: 01/17/2019] [Revised: 05/09/2019] [Accepted: 06/07/2019] [Indexed: 01/30/2023]
Abstract
The growing body of experimental and computational data describing how proteins interact with each other has emphasized the multiplicity of protein interactions and the complexity underlying protein surface usage and deformability. In this work, we propose new concepts and methods toward deciphering such complexity. We introduce the notion of interacting region to account for the multiple usage of a protein's surface residues by several partners and for the variability of protein interfaces coming from molecular flexibility. We predict interacting patches by crossing evolutionary, physicochemical and geometrical properties of the protein surface with information coming from complete cross-docking (CC-D) simulations. We show that our predictions match interacting regions well and that the different sources of information are complementary. We further propose an indicator of whether a protein has a few or many partners. Our prediction strategies are implemented in the dynJET2 algorithm and assessed on a new dataset of 262 proteins on which we performed CC-D. The code and the data are available at: http://www.lcqb.upmc.fr/dynJET2/.
Affiliation(s)
- Chloé Dequeker
- Sorbonne Université, CNRS, IBPS, Laboratoire de Biologie Computationnelle et Quantitative (LCQB), Paris, France
| | - Elodie Laine
- Sorbonne Université, CNRS, IBPS, Laboratoire de Biologie Computationnelle et Quantitative (LCQB), Paris, France
| | - Alessandra Carbone
- Sorbonne Université, CNRS, IBPS, Laboratoire de Biologie Computationnelle et Quantitative (LCQB), Paris, France; Institut Universitaire de France (IUF), Paris, France
| |
|
20
|
Nadalin F, Carbone A. Protein-protein interaction specificity is captured by contact preferences and interface composition. Bioinformatics 2018; 34:459-468. [PMID: 29028884 PMCID: PMC5860360 DOI: 10.1093/bioinformatics/btx584] [Citation(s) in RCA: 25] [Impact Index Per Article: 4.2] [Received: 10/31/2016] [Accepted: 09/18/2017] [Indexed: 12/24/2022]
Abstract
Motivation: Large-scale computational docking will be increasingly used in future years to discriminate protein–protein interactions at the residue resolution. Complete cross-docking experiments make in silico reconstruction of protein–protein interaction networks a feasible goal. They call for efficient and accurate screening of the millions of structural conformations produced by the calculations. Results: We propose CIPS (Combined Interface Propensity for decoy Scoring), a new pair potential combining interface composition with residue–residue contact preference. CIPS outperforms several other methods on screening docking solutions obtained either with all-atom or with coarse-grain rigid docking. Further testing on 28 CAPRI targets corroborates the predictive power of CIPS over existing methods. By combining CIPS with atomic potentials, discrimination of correct conformations in all-atom structures reaches optimal accuracy. The drastic reduction of candidate solutions produced by thousands of proteins docked against each other makes large-scale docking accessible to analysis. Availability and implementation: CIPS source code is freely available at http://www.lcqb.upmc.fr/CIPS. Supplementary information: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Francesca Nadalin
- Sorbonne Universités, UPMC-Univ P6, CNRS, IBPS, Laboratoire de Biologie Computationnelle et Quantitative-UMR 7238, 75005 Paris, France
| | - Alessandra Carbone
- Sorbonne Universités, UPMC-Univ P6, CNRS, IBPS, Laboratoire de Biologie Computationnelle et Quantitative-UMR 7238, 75005 Paris, France; Institut Universitaire de France, 75005 Paris, France
| |
|
21
|
Raucci R, Laine E, Carbone A. Local Interaction Signal Analysis Predicts Protein-Protein Binding Affinity. Structure 2018; 26:905-915.e4. [PMID: 29779789 DOI: 10.1016/j.str.2018.04.006] [Citation(s) in RCA: 19] [Impact Index Per Article: 3.2] [Received: 10/26/2017] [Revised: 02/06/2018] [Accepted: 04/10/2018] [Indexed: 12/27/2022]
Abstract
Several models estimating the strength of the interaction between proteins in a complex have been proposed. By exploring the geometry of contact distribution at protein-protein interfaces, we provide an improved model of binding energy. Local interaction signal analysis (LISA) is a radial function based on terms describing favorable and non-favorable contacts obtained by density functional theory, the support-core-rim interface residue distribution, non-interacting charged residues and secondary structures contribution. The three-dimensional organization of the contacts and their contribution on localized hot-sites over the entire interaction surface were numerically evaluated. LISA achieves a correlation of 0.81 (and a root-mean-square error of 2.35 ± 0.38 kcal/mol) when tested on 125 complexes for which experimental measurements were realized. LISA's performance is stable for subsets defined by functional composition and extent of conformational changes upon complex formation. A large-scale comparison with 17 other functions demonstrated the power of the geometrical model in the understanding of complex binding.
Affiliation(s)
- Raffaele Raucci
- Sorbonne Université, CNRS, IBPS, Laboratoire de Biologie Computationnelle et Quantitative (LCQB), 4 Place Jussieu, 75005 Paris, France; Sorbonne Université, Institut des Sciences du Calcul et des Données (ISCD), 75005 Paris, France
| | - Elodie Laine
- Sorbonne Université, CNRS, IBPS, Laboratoire de Biologie Computationnelle et Quantitative (LCQB), 4 Place Jussieu, 75005 Paris, France
| | - Alessandra Carbone
- Sorbonne Université, CNRS, IBPS, Laboratoire de Biologie Computationnelle et Quantitative (LCQB), 4 Place Jussieu, 75005 Paris, France; Institut Universitaire de France, 75005 Paris, France.
| |
|
22
|
Lagarde N, Carbone A, Sacquin-Mora S. Hidden partners: Using cross-docking calculations to predict binding sites for proteins with multiple interactions. Proteins 2018; 86:723-737. [DOI: 10.1002/prot.25506] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.8] [Received: 10/06/2017] [Revised: 03/23/2018] [Accepted: 04/07/2018] [Indexed: 02/06/2023]
Affiliation(s)
- Nathalie Lagarde
- Laboratoire de Biochimie Théorique, CNRS UPR9080, Institut de Biologie Physico-Chimique, University Paris Diderot, Sorbonne Paris Cité, 13 rue Pierre et Marie Curie; Paris 75005 France
| | - Alessandra Carbone
- Laboratoire de Biologie Computationnelle et Quantitative, CNRS UMR7238, UPMC Univ-Paris 6, Sorbonne Université, 4 place Jussieu; Paris 75005 France
- Institut Universitaire de France; Paris 75005 France
| | - Sophie Sacquin-Mora
- Laboratoire de Biochimie Théorique, CNRS UPR9080, Institut de Biologie Physico-Chimique, University Paris Diderot, Sorbonne Paris Cité, 13 rue Pierre et Marie Curie; Paris 75005 France
| |
|
23
|
Abdollahi N, Albani A, Anthony E, Baud A, Cardon M, Clerc R, Czernecki D, Conte R, David L, Delaune A, Djerroud S, Fourgoux P, Guiglielmoni N, Laurentie J, Lehmann N, Lochard C, Montagne R, Myrodia V, Opuu V, Parey E, Polit L, Privé S, Quignot C, Ruiz-Cuevas M, Sissoko M, Sompairac N, Vallerix A, Verrecchia V, Delarue M, Guérois R, Ponty Y, Sacquin-Mora S, Carbone A, Froidevaux C, Le Crom S, Lespinet O, Weigt M, Abboud S, Bernardes J, Bouvier G, Dequeker C, Ferré A, Fuchs P, Lelandais G, Poulain P, Richard H, Schweke H, Laine E, Lopes A. Meet-U: Educating through research immersion. PLoS Comput Biol 2018; 14:e1005992. [PMID: 29543809 PMCID: PMC5854232 DOI: 10.1371/journal.pcbi.1005992] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.5] [Indexed: 11/18/2022]
Abstract
We present a new educational initiative called Meet-U that aims to train students for collaborative work in computational biology and to bridge the gap between education and research. Meet-U mimics the setup of collaborative research projects and takes advantage of the most popular tools for collaborative work and of cloud computing. Students are grouped in teams of 4–5 people and have to complete a project from A to Z that answers a challenging question in biology. Meet-U promotes "coopetition," as the students collaborate within and across the teams and are also in competition with each other to develop the best final product. Meet-U fosters interactions between different actors of education and research through the organization of a meeting day, open to everyone, where the students present their work to a jury of researchers and jury members give research seminars. This unique combination of education and research is strongly motivating for the students and provides a formidable opportunity for a scientific community to unite and increase its visibility. We report on our experience with Meet-U in two French universities with master’s students in bioinformatics and modeling, with protein–protein docking as the subject of the course. Meet-U is easy to implement and can be straightforwardly transferred to other fields and/or universities. All the information and data are available at www.meet-u.org.
Affiliation(s)
- Nika Abdollahi
- Departments of Computer Science and of Life Sciences, Sorbonne Université (SU) / UPMC, Paris, France
| | - Alexandre Albani
- Departments of Computer Science and of Life Sciences, Sorbonne Université (SU) / UPMC, Paris, France
| | - Eric Anthony
- Department of Biology and of Computer Science, Univ. Paris-Sud, Université Paris-Saclay (UPSay), Orsay, France
| | - Agnes Baud
- Departments of Computer Science and of Life Sciences, Sorbonne Université (SU) / UPMC, Paris, France
| | - Mélissa Cardon
- Departments of Computer Science and of Life Sciences, Sorbonne Université (SU) / UPMC, Paris, France
| | - Robert Clerc
- Departments of Computer Science and of Life Sciences, Sorbonne Université (SU) / UPMC, Paris, France
| | - Dariusz Czernecki
- Departments of Computer Science and of Life Sciences, Sorbonne Université (SU) / UPMC, Paris, France
| | - Romain Conte
- Departments of Computer Science and of Life Sciences, Sorbonne Université (SU) / UPMC, Paris, France
| | - Laurent David
- Departments of Computer Science and of Life Sciences, Sorbonne Université (SU) / UPMC, Paris, France
| | - Agathe Delaune
- Departments of Computer Science and of Life Sciences, Sorbonne Université (SU) / UPMC, Paris, France
| | - Samia Djerroud
- Departments of Computer Science and of Life Sciences, Sorbonne Université (SU) / UPMC, Paris, France
| | - Pauline Fourgoux
- Department of Biology and of Computer Science, Univ. Paris-Sud, Université Paris-Saclay (UPSay), Orsay, France
| | - Nadège Guiglielmoni
- Department of Biology and of Computer Science, Univ. Paris-Sud, Université Paris-Saclay (UPSay), Orsay, France
| | - Jeanne Laurentie
- Departments of Computer Science and of Life Sciences, Sorbonne Université (SU) / UPMC, Paris, France
| | - Nathalie Lehmann
- Departments of Computer Science and of Life Sciences, Sorbonne Université (SU) / UPMC, Paris, France
| | - Camille Lochard
- Department of Biology and of Computer Science, Univ. Paris-Sud, Université Paris-Saclay (UPSay), Orsay, France
| | - Rémi Montagne
- Department of Biology and of Computer Science, Univ. Paris-Sud, Université Paris-Saclay (UPSay), Orsay, France
| | - Vasiliki Myrodia
- Departments of Computer Science and of Life Sciences, Sorbonne Université (SU) / UPMC, Paris, France
| | - Vaitea Opuu
- Department of Biology and of Computer Science, Univ. Paris-Sud, Université Paris-Saclay (UPSay), Orsay, France
| | - Elise Parey
- Departments of Computer Science and of Life Sciences, Sorbonne Université (SU) / UPMC, Paris, France
| | - Lélia Polit
- Departments of Computer Science and of Life Sciences, Sorbonne Université (SU) / UPMC, Paris, France
| | - Sylvain Privé
- Department of Biology and of Computer Science, Univ. Paris-Sud, Université Paris-Saclay (UPSay), Orsay, France
| | - Chloé Quignot
- Departments of Computer Science and of Life Sciences, Sorbonne Université (SU) / UPMC, Paris, France
| | - Maria Ruiz-Cuevas
- Departments of Computer Science and of Life Sciences, Sorbonne Université (SU) / UPMC, Paris, France
| | - Mariam Sissoko
- Departments of Computer Science and of Life Sciences, Sorbonne Université (SU) / UPMC, Paris, France
| | - Nicolas Sompairac
- Departments of Computer Science and of Life Sciences, Sorbonne Université (SU) / UPMC, Paris, France
| | - Audrey Vallerix
- Department of Biology and of Computer Science, Univ. Paris-Sud, Université Paris-Saclay (UPSay), Orsay, France
| | - Violaine Verrecchia
- Department of Biology and of Computer Science, Univ. Paris-Sud, Université Paris-Saclay (UPSay), Orsay, France
| | - Marc Delarue
- Unit of Structural Dynamics of Macromolecules, CNRS, Institut Pasteur, Paris, France
| | - Raphael Guérois
- Institute for Integrative Biology of the Cell (I2BC), IBITECS, CEA, CNRS, Univ. Paris-Sud, UPSay, Gif-sur-Yvette cedex, France
| | - Yann Ponty
- AMIBio team, Laboratoire d’informatique de l’École polytechnique (LIX, UMR 7161) / Inria Saclay, UPSay, Palaiseau, France
| | - Sophie Sacquin-Mora
- Laboratoire de Biochimie Théorique, UPR 9080 CNRS Institut de Biologie Physico-Chimique, Paris, France
| | - Alessandra Carbone
- Sorbonne Université / UPMC, CNRS, IBPS, Laboratoire de Biologie Computationnelle et Quantitative (LCQB), UMR 7238, Paris, France
- Institut Universitaire de France
| | | | - Stéphane Le Crom
- Sorbonne Université / UPMC, Univ. Antilles, Univ. Nice Sophia Antipolis, CNRS, Evolution Paris Seine - Institut de Biologie Paris Seine (EPS - IBPS), Paris, France
| | - Olivier Lespinet
- Institute for Integrative Biology of the Cell (I2BC), CEA, CNRS, Univ. Paris-Sud, UPSay, Gif-sur-Yvette cedex, France
| | - Martin Weigt
- Sorbonne Université / UPMC, CNRS, IBPS, Laboratoire de Biologie Computationnelle et Quantitative (LCQB), UMR 7238, Paris, France
| | - Samer Abboud
- Institute for Integrative Biology of the Cell (I2BC), CEA, CNRS, Univ. Paris-Sud, UPSay, Gif-sur-Yvette cedex, France
| | - Juliana Bernardes
- Sorbonne Université / UPMC, CNRS, IBPS, Laboratoire de Biologie Computationnelle et Quantitative (LCQB), UMR 7238, Paris, France
| | - Guillaume Bouvier
- Department of Structural Biology and Chemistry (CNRS UMR3528) - Center of Bioinformatics, Biostatistics and Integrative Biology (CNRS USR3756) - Structural Bioinformatics Unit, Institut Pasteur, Paris, France
| | - Chloé Dequeker
- Sorbonne Université / UPMC, CNRS, IBPS, Laboratoire de Biologie Computationnelle et Quantitative (LCQB), UMR 7238, Paris, France
| | - Arnaud Ferré
- MaIAGE, INRA, UPSay, Jouy-en-Josas, France and LIMSI, CNRS, UPSay, Orsay, France
| | - Patrick Fuchs
- Sorbonne Université / UPMC, Ecole Normale Supérieure - PLS Research University, Département de Chimie, CNRS, Laboratoire des Biomolécules, UMR 7203 - Univ. Paris Diderot, Sorbonne Paris Cité, Paris, France
| | - Gaëlle Lelandais
- Institute for Integrative Biology of the Cell (I2BC), CEA, CNRS, Univ. Paris-Sud, UPSay, Gif-sur-Yvette cedex, France
| | - Pierre Poulain
- Mitochondria, Metals and Oxidative Stress Group, Institut Jacques Monod, UMR 7592, Univ. Paris Diderot, CNRS, Sorbonne Paris Cité, Paris, France
| | - Hugues Richard
- Sorbonne Université / UPMC, CNRS, IBPS, Laboratoire de Biologie Computationnelle et Quantitative (LCQB), UMR 7238, Paris, France
| | - Hugo Schweke
- Institute for Integrative Biology of the Cell (I2BC), CEA, CNRS, Univ. Paris-Sud, UPSay, Gif-sur-Yvette cedex, France
| | - Elodie Laine
- Sorbonne Université / UPMC, CNRS, IBPS, Laboratoire de Biologie Computationnelle et Quantitative (LCQB), UMR 7238, Paris, France
- * E-mail: (EL); (AL)
| | - Anne Lopes
- Institute for Integrative Biology of the Cell (I2BC), CEA, CNRS, Univ. Paris-Sud, UPSay, Gif-sur-Yvette cedex, France
- * E-mail: (EL); (AL)
| |
|
24
|
Zhang J, Ma Z, Kurgan L. Comprehensive review and empirical analysis of hallmarks of DNA-, RNA- and protein-binding residues in protein chains. Brief Bioinform 2017; 20:1250-1268. [DOI: 10.1093/bib/bbx168] [Citation(s) in RCA: 60] [Impact Index Per Article: 8.6] [Received: 08/30/2017] [Revised: 11/15/2017] [Indexed: 11/13/2022]
Abstract
Proteins interact with a variety of molecules including proteins and nucleic acids. We review a comprehensive collection of over 50 studies that analyze and/or predict these interactions. While the majority of these studies address solely protein–DNA or protein–RNA binding, only a few have a wider scope that covers both protein–protein and protein–nucleic acid binding. Our analysis reveals that binding residues are typically characterized by three hallmarks: relative solvent accessibility (RSA), evolutionary conservation and propensity of amino acids (AAs) for binding. Motivated by drawbacks of the prior studies, we perform a large-scale analysis to quantify and contrast the three hallmarks for DNA-, RNA- and protein-binding residues and (for the first time) for multi-ligand-binding residues that interact with DNA and proteins, or with RNA and proteins. Results generated on a well-annotated data set of over 23 000 proteins show that conservation of binding residues is higher for nucleic acid-binding than for protein-binding residues. Multi-ligand-binding residues are more conserved and have higher RSA than single-ligand-binding residues. We empirically show that each hallmark, even predicted RSA, discriminates between binding and nonbinding residues, and that combining them improves discriminatory power for each of the five types of interactions. Linear scoring functions that combine these hallmarks offer good predictive performance of residue-level propensity for binding and provide intuitive interpretation of predictions. Better understanding of these residue-level interactions will facilitate development of methods that accurately predict binding in the exponentially growing databases of protein sequences.
|
25
|
Ripoche H, Laine E, Ceres N, Carbone A. JET2 Viewer: a database of predicted multiple, possibly overlapping, protein-protein interaction sites for PDB structures. Nucleic Acids Res 2017; 45:D236-D242. [PMID: 27899675 PMCID: PMC5210541 DOI: 10.1093/nar/gkw1053] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.6] [Received: 07/23/2016] [Revised: 10/18/2016] [Accepted: 10/20/2016] [Indexed: 11/13/2022]
Abstract
The database JET2 Viewer, openly accessible at http://www.jet2viewer.upmc.fr/, reports putative protein binding sites for all three-dimensional (3D) structures available in the Protein Data Bank (PDB). This knowledge base was generated by applying the computational method JET2 at large scale on more than 20 000 chains. The JET2 strategy yields very precise predictions of interacting surfaces and unravels their evolutionary process and complexity. JET2 Viewer provides an online intelligent display, including interactive 3D visualization of the binding sites mapped onto PDB structures and suitable files recording JET2 analyses. Predictions were evaluated on more than 15 000 experimentally characterized protein interfaces. This is, to our knowledge, the largest evaluation of a protein binding site prediction method. The overall performance of JET2 on all interfaces is: Sen = 52.52, PPV = 51.24, Spe = 80.05, Acc = 75.89. The data can be used to foster new strategies for protein-protein interaction modulation and interaction surface redesign.
Affiliation(s)
- Hugues Ripoche
  - Sorbonne Universités, UPMC University Paris 06, CNRS, IBPS, UMR 7238, Laboratoire de Biologie Computationnelle et Quantitative (LCQB), 75005 Paris, France
- Elodie Laine
  - Sorbonne Universités, UPMC University Paris 06, CNRS, IBPS, UMR 7238, Laboratoire de Biologie Computationnelle et Quantitative (LCQB), 75005 Paris, France
- Nicoletta Ceres
  - CNRS UMR 5086/University Lyon I, Institut de Biologie et Chimie des Proteines, 69367 Lyon, France
- Alessandra Carbone
  - Sorbonne Universités, UPMC University Paris 06, CNRS, IBPS, UMR 7238, Laboratoire de Biologie Computationnelle et Quantitative (LCQB), 75005 Paris, France
  - Institut Universitaire de France, 75005 Paris, France
|
26
|
Laine E, Carbone A. Protein social behavior makes a stronger signal for partner identification than surface geometry. Proteins 2016; 85:137-154. [PMID: 27802579 PMCID: PMC5242317 DOI: 10.1002/prot.25206] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.9] [Received: 06/23/2016] [Revised: 10/10/2016] [Accepted: 10/20/2016] [Indexed: 01/26/2023]
Abstract
Cells are interactive living systems where protein movements, interactions and regulation are substantially free from centralized management. How protein physico‐chemical and geometrical properties determine who interacts with whom remains far from fully understood. We show that characterizing how a protein behaves with many potential interactors in a complete cross‐docking study leads to a sharp identification of its cellular/true/native partner(s). We define a sociability index, or S‐index, reflecting whether or not a protein tends to pair with other proteins. Formally, we propose a suitable normalization function that accounts for protein sociability and we combine it with a simple interface‐based (ranking) score to discriminate partners from non‐interactors. We show that sociability is an important factor and that the normalization makes it possible to reach a much higher discriminative power than shape complementarity docking scores. The social effect is also observed with more sophisticated docking algorithms. Docking conformations are evaluated using experimental binding sites. These approximate, in the best possible way, binding site predictions, which have reached high accuracy in recent years. This makes our analysis helpful for a global understanding of partner identification and for suggesting discriminating strategies. These results contradict previous findings claiming that the partner identification problem is solvable solely with geometrical docking. Proteins 2016; 85:137–154. © 2016 Wiley Periodicals, Inc.
Affiliation(s)
- Elodie Laine
  - Sorbonne Universités, UPMC-Univ P6, CNRS, Laboratoire de Biologie Computationnelle et Quantitative - UMR 7238, Paris, 75005, France
- Alessandra Carbone
  - Sorbonne Universités, UPMC-Univ P6, CNRS, Laboratoire de Biologie Computationnelle et Quantitative - UMR 7238, Paris, 75005, France
  - Institut Universitaire de France, Paris, 75005, France
|
27
|
Vamparys L, Laurent B, Carbone A, Sacquin-Mora S. Great interactions: How binding incorrect partners can teach us about protein recognition and function. Proteins 2016; 84:1408-21. [PMID: 27287388 PMCID: PMC5516155 DOI: 10.1002/prot.25086] [Citation(s) in RCA: 16] [Impact Index Per Article: 2.0] [Received: 01/29/2016] [Revised: 06/01/2016] [Accepted: 06/02/2016] [Indexed: 12/29/2022]
Abstract
Protein–protein interactions play a key part in most biological processes and understanding their mechanism is a fundamental problem leading to numerous practical applications. The prediction of protein binding sites in particular is of paramount importance since proteins now represent a major class of therapeutic targets. Among other methods, docking simulations between two proteins known to interact can be a useful tool for the prediction of likely binding patches on a protein surface. From the analysis of the protein interfaces generated by a massive cross‐docking experiment using the 168 proteins of the Docking Benchmark 2.0, where all possible protein pairs, and not only experimental ones, have been docked together, we show that it is also possible to predict a protein's binding residues without having any prior knowledge regarding its potential interaction partners. Evaluating the performance of cross‐docking predictions using the area under the specificity‐sensitivity ROC curve (AUC) leads to an AUC value of 0.77 for the complete benchmark (compared to the 0.5 AUC value obtained for random predictions). Furthermore, a new clustering analysis performed on the binding patches that are scattered on the protein surface shows that their distribution and growth depend on the protein's functional group. Finally, in several cases, the binding‐site predictions resulting from the cross‐docking simulations lead to the identification of an alternate interface, which corresponds to the interaction with a biomolecular partner that is not included in the original benchmark. Proteins 2016; 84:1408–1421. © 2016 The Authors Proteins: Structure, Function, and Bioinformatics Published by Wiley Periodicals, Inc.
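To make the AUC figures in this abstract concrete: the area under the ROC curve equals the probability that a randomly chosen binding residue receives a higher score than a randomly chosen non-binding residue, with ties counted as one half. The Python sketch below computes it that way. The per-residue scores, the binary labels and the helper name `roc_auc` are assumptions for illustration, not the authors' evaluation code.

```python
def roc_auc(scores, labels):
    """Area under the ROC curve via pairwise comparison (Mann-Whitney statistic).

    scores: one numeric score per residue (higher = more likely binding).
    labels: one truthy/falsy value per residue (truthy = binding residue).
    """
    if len(scores) != len(labels):
        raise ValueError("scores and labels must have the same length")

    positives = [s for s, y in zip(scores, labels) if y]
    negatives = [s for s, y in zip(scores, labels) if not y]
    if not positives or not negatives:
        raise ValueError("need at least one binding and one non-binding residue")

    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0   # binding residue correctly ranked above
            elif p == n:
                wins += 0.5   # a tie counts as half a win
    return wins / (len(positives) * len(negatives))


if __name__ == "__main__":
    scores = [0.9, 0.8, 0.7, 0.4, 0.3, 0.1]
    labels = [1, 1, 0, 1, 0, 0]
    print(roc_auc(scores, labels))
```

A predictor that assigns the same score to every residue yields 0.5 under this definition, which matches the random baseline mentioned in the abstract. The pairwise loop is quadratic in the number of residues, which is acceptable for a single protein but not for large benchmarks.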
Affiliation(s)
- Lydie Vamparys
  - Laboratoire De Biochimie Théorique, CNRS UPR 9080, Institut De Biologie Physico-Chimique, 13 Rue Pierre Et Marie Curie, Paris, 75005, France
- Benoist Laurent
  - Laboratoire De Biochimie Théorique, CNRS UPR 9080, Institut De Biologie Physico-Chimique, 13 Rue Pierre Et Marie Curie, Paris, 75005, France
- Alessandra Carbone
  - Sorbonne Universités, UPMC Univ-Paris 6, CNRS UMR7238, Laboratoire De Biologie Computationnelle Et Quantitative, 15 Rue De L'Ecole De Médecine, Paris, 75006, France
  - Institut Universitaire De France, Paris, 75005, France
- Sophie Sacquin-Mora
  - Laboratoire De Biochimie Théorique, CNRS UPR 9080, Institut De Biologie Physico-Chimique, 13 Rue Pierre Et Marie Curie, Paris, 75005, France
|
28
|
Champeimont R, Laine E, Hu SW, Penin F, Carbone A. Coevolution analysis of Hepatitis C virus genome to identify the structural and functional dependency network of viral proteins. Sci Rep 2016; 6:26401. [PMID: 27198619 PMCID: PMC4873791 DOI: 10.1038/srep26401] [Citation(s) in RCA: 23] [Impact Index Per Article: 2.9] [Received: 10/02/2015] [Accepted: 05/03/2016] [Indexed: 12/20/2022]
Abstract
A novel computational approach to coevolution analysis allowed us to reconstruct the protein-protein interaction network of the Hepatitis C Virus (HCV) at residue resolution. For the first time, coevolution analysis of an entire viral genome was carried out, based on a limited set of protein sequences with high sequence identity within genotypes. The identified coevolving residues constitute highly relevant predictions of protein-protein interactions for further experimental identification of HCV protein complexes. The method can be used to analyse other viral genomes and to predict the associated protein interaction networks.
Affiliation(s)
- Raphaël Champeimont
  - Sorbonne Universités, UPMC-Univ P6, CNRS, Laboratoire de Biologie Computationnelle et Quantitative - UMR 7238, 15 rue de l’Ecole de Médecine, 75006 Paris, France
- Elodie Laine
  - Sorbonne Universités, UPMC-Univ P6, CNRS, Laboratoire de Biologie Computationnelle et Quantitative - UMR 7238, 15 rue de l’Ecole de Médecine, 75006 Paris, France
- Shuang-Wei Hu
  - Sorbonne Universités, UPMC-Univ P6, CNRS, Laboratoire de Biologie Computationnelle et Quantitative - UMR 7238, 15 rue de l’Ecole de Médecine, 75006 Paris, France
- Francois Penin
  - CNRS, UMR5086, Bases Moléculaires et Structurales des Systèmes Infectieux, Institut de Biologie et Chimie des Protéines, 7 Passage du Vercors, Cedex 07, F-69367 Lyon, France
  - LABEX Ecofect, Université de Lyon, Lyon, France
- Alessandra Carbone
  - Sorbonne Universités, UPMC-Univ P6, CNRS, Laboratoire de Biologie Computationnelle et Quantitative - UMR 7238, 15 rue de l’Ecole de Médecine, 75006 Paris, France
  - Institut Universitaire de France, 75005, Paris, France
|