1
Bret H, Gao J, Zea DJ, Andreani J, Guerois R. From interaction networks to interfaces, scanning intrinsically disordered regions using AlphaFold2. Nat Commun 2024; 15:597. [PMID: 38238291 PMCID: PMC10796318 DOI: 10.1038/s41467-023-44288-7] [Received: 04/19/2023] [Accepted: 12/07/2023] [Indexed: 01/22/2024]
Abstract
The revolution brought about by AlphaFold2 opens promising perspectives for unraveling the complexity of protein-protein interaction networks. The analysis of interaction networks obtained from proteomics experiments does not systematically provide the boundaries of the interaction regions. This is of particular concern for interactions mediated by intrinsically disordered regions, in which the interaction site is generally small. Using a dataset of protein-peptide complexes involving intrinsically disordered regions that are non-redundant with the structures used in AlphaFold2 training, we show that, when using the full sequences of the proteins, AlphaFold2-Multimer achieves only a 40% success rate in identifying the correct site and structure of the interface. By delineating the interaction region into fragments of decreasing size and combining different strategies for integrating evolutionary information, we raise this success rate to 90%. We obtain similar success rates using a much larger dataset of protein complexes taken from the ELM database. Beyond the correct identification of the interaction site, our study also explores specificity issues. We show the advantages and limitations of using the AlphaFold2 confidence score to discriminate between alternative binding partners, a task that can be particularly challenging in the case of small interaction motifs.
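The fragment scan mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not state the fragment sizes, the overlap, or how fragments are fed to AlphaFold2-Multimer, so the function names (`fragment_windows`, `scan_plan`), the sizes, and the 50% overlap below are all hypothetical choices made only for illustration.

```python
def fragment_windows(seq_len, size, step):
    """Return half-open (start, end) windows of length `size` covering 0..seq_len."""
    if size >= seq_len:
        return [(0, seq_len)]
    starts = list(range(0, seq_len - size + 1, step))
    if starts[-1] != seq_len - size:
        # Add a final window so the C-terminal end is always covered.
        starts.append(seq_len - size)
    return [(s, s + size) for s in starts]


def scan_plan(seq_len, sizes, overlap=0.5):
    """Map each fragment size (largest first) to its list of windows.

    `overlap` is the assumed fraction shared by consecutive windows.
    """
    plan = {}
    for size in sorted(sizes, reverse=True):
        step = max(1, int(size * (1 - overlap)))
        plan[size] = fragment_windows(seq_len, size, step)
    return plan


# Hypothetical 250-residue disordered region scanned at two fragment sizes.
plan = scan_plan(250, sizes=[200, 100])
```

Each window would then be paired with the partner protein and submitted as a separate prediction; that step depends on the prediction pipeline in use and is left out here.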
Affiliation(s)
- Hélène Bret
  - Université Paris-Saclay, CEA, CNRS, Institute for Integrative Biology of the Cell (I2BC), 91198, Gif-sur-Yvette, France
- Jinmei Gao
  - Université Paris-Saclay, CEA, CNRS, Institute for Integrative Biology of the Cell (I2BC), 91198, Gif-sur-Yvette, France
- Diego Javier Zea
  - Université Paris-Saclay, CEA, CNRS, Institute for Integrative Biology of the Cell (I2BC), 91198, Gif-sur-Yvette, France
- Jessica Andreani
  - Université Paris-Saclay, CEA, CNRS, Institute for Integrative Biology of the Cell (I2BC), 91198, Gif-sur-Yvette, France
- Raphaël Guerois
  - Université Paris-Saclay, CEA, CNRS, Institute for Integrative Biology of the Cell (I2BC), 91198, Gif-sur-Yvette, France
2
Karamanos TK. Chasing long-range evolutionary couplings in the AlphaFold era. Biopolymers 2023; 114:e23530. [PMID: 36752285 PMCID: PMC10909459 DOI: 10.1002/bip.23530] [Received: 11/24/2022] [Revised: 01/26/2023] [Accepted: 01/27/2023] [Indexed: 02/09/2023]
Abstract
Coevolution between protein residues is normally interpreted as direct contact. However, the evolutionary record of a protein sequence contains rich information that may include long-range functional couplings, couplings that report on homo-oligomeric states, or even conformational changes. Due to the complexity of the sequence space and the lack of structural information on various members of a protein family, it has been difficult to effectively mine the additional information encoded in a multiple sequence alignment (MSA). Here, taking advantage of the recent release of the AlphaFold (AF) database, we attempt to identify coevolutionary couplings that cannot be explained simply by spatial proximity. We propose a simple computational method that performs direct coupling analysis on an MSA and searches for couplings that are not satisfied in any of the AF models of members of the identified protein family. Application of this method to 2012 protein families suggests that ~12% of the total identified coevolving residue pairs are spatially distant and more likely to be disordered than their contacting counterparts. We expect that this analysis will help improve the quality of coevolutionary distance restraints used for structure determination and will be useful in identifying potentially functional/allosteric cross-talk between distant residues.
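The filtering step described in the abstract (keeping couplings that are not satisfied in any model of the family) can be sketched as follows. This is a toy illustration, not the author's implementation: the coupled pairs are assumed to come from a separate direct coupling analysis run, the residue indices are assumed to be already mapped to a common alignment numbering, and the 8 Å contact cutoff and the function name `distant_couplings` are hypothetical.

```python
import numpy as np


def distant_couplings(pairs, distance_maps, cutoff=8.0):
    """Return coupled pairs (i, j) that are farther apart than `cutoff`
    in every model, i.e. not satisfied as a contact in any structure.

    pairs         -- iterable of (i, j) residue index pairs (assumed DCA output)
    distance_maps -- list of square arrays of inter-residue distances in Å,
                     one per model, sharing the same residue numbering
    cutoff        -- assumed contact threshold in Å (hypothetical value)
    """
    distant = []
    for i, j in pairs:
        # Smallest distance observed for this pair across all models.
        min_dist = min(d[i, j] for d in distance_maps)
        if min_dist > cutoff:
            distant.append((i, j))
    return distant


# Toy example: two three-residue "models" with made-up distances.
maps = [
    np.array([[0, 5, 20], [5, 0, 12], [20, 12, 0]], dtype=float),
    np.array([[0, 6, 18], [6, 0, 7], [18, 7, 0]], dtype=float),
]
pairs = [(0, 1), (0, 2), (1, 2)]
result = distant_couplings(pairs, maps)
```

In this toy case only the pair (0, 2) stays beyond the cutoff in both models; pair (1, 2) is dropped because one model brings it within 7 Å.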
3
Roca-Martinez J, Lazar T, Gavalda-Garcia J, Bickel D, Pancsa R, Dixit B, Tzavella K, Ramasamy P, Sanchez-Fornaris M, Grau I, Vranken WF. Challenges in describing the conformation and dynamics of proteins with ambiguous behavior. Front Mol Biosci 2022; 9:959956. [PMID: 35992270 PMCID: PMC9382080 DOI: 10.3389/fmolb.2022.959956] [Received: 06/02/2022] [Accepted: 06/27/2022] [Indexed: 11/13/2022]
Abstract
Traditionally, our understanding of how proteins operate and how evolution shapes them is based on two main data sources: the overall protein fold and the protein amino acid sequence. However, a significant part of the proteome shows highly dynamic and/or structurally ambiguous behavior, which cannot be correctly represented by the traditional fixed set of static coordinates. Representing such protein behaviors remains challenging and necessarily involves a complex interpretation of conformational states, including probabilistic descriptions. Relating protein dynamics and multiple conformations to their function as well as their physiological context (e.g., post-translational modifications and subcellular localization), therefore, remains elusive for much of the proteome, with studies to investigate the effect of protein dynamics relying heavily on computational models. We here investigate the possibility of delineating three classes of protein conformational behavior: order, disorder, and ambiguity. These definitions are explored based on three different datasets, using interpretable machine learning from a set of features, from AlphaFold2 to sequence-based predictions, to understand the overlap and differences between these datasets. This forms the basis for a discussion on the current limitations in describing the behavior of dynamic and ambiguous proteins.
Affiliation(s)
- Joel Roca-Martinez
  - Structural Biology Brussels, Vrije Universiteit Brussel, Brussels, Belgium
  - Interuniversity Institute of Bioinformatics in Brussels, VUB/ULB, Brussels, Belgium
- Tamas Lazar
  - Structural Biology Brussels, Vrije Universiteit Brussel, Brussels, Belgium
  - VIB-VUB Center for Structural Biology, Brussels, Belgium
- Jose Gavalda-Garcia
  - Structural Biology Brussels, Vrije Universiteit Brussel, Brussels, Belgium
  - Interuniversity Institute of Bioinformatics in Brussels, VUB/ULB, Brussels, Belgium
- David Bickel
  - Structural Biology Brussels, Vrije Universiteit Brussel, Brussels, Belgium
  - Interuniversity Institute of Bioinformatics in Brussels, VUB/ULB, Brussels, Belgium
- Rita Pancsa
  - Research Centre for Natural Sciences, Institute of Enzymology, Budapest, Hungary
- Bhawna Dixit
  - Structural Biology Brussels, Vrije Universiteit Brussel, Brussels, Belgium
  - Interuniversity Institute of Bioinformatics in Brussels, VUB/ULB, Brussels, Belgium
  - IBiTech-Biommeda, Universiteit Gent, Gent, Belgium
- Konstantina Tzavella
  - Structural Biology Brussels, Vrije Universiteit Brussel, Brussels, Belgium
  - Interuniversity Institute of Bioinformatics in Brussels, VUB/ULB, Brussels, Belgium
- Pathmanaban Ramasamy
  - Structural Biology Brussels, Vrije Universiteit Brussel, Brussels, Belgium
  - Interuniversity Institute of Bioinformatics in Brussels, VUB/ULB, Brussels, Belgium
  - VIB-UGent Center for Medical Biotechnology, Universiteit Gent, Gent, Belgium
- Maite Sanchez-Fornaris
  - Structural Biology Brussels, Vrije Universiteit Brussel, Brussels, Belgium
  - Interuniversity Institute of Bioinformatics in Brussels, VUB/ULB, Brussels, Belgium
  - Department of Computer Sciences, University of Camagüey, Camagüey, Cuba
- Isel Grau
  - Information Systems, Eindhoven University of Technology, Eindhoven, Netherlands
- Wim F. Vranken
  - Structural Biology Brussels, Vrije Universiteit Brussel, Brussels, Belgium
  - Interuniversity Institute of Bioinformatics in Brussels, VUB/ULB, Brussels, Belgium
4
Sigüeiro R, Bianchetti L, Peluso-Iltis C, Chalhoub S, Dejaegere A, Osz J, Rochel N. Advances in Vitamin D Receptor Function and Evolution Based on the 3D Structure of the Lamprey Ligand-Binding Domain. J Med Chem 2022; 65:5821-5829. [PMID: 35302785 DOI: 10.1021/acs.jmedchem.2c00171] [Indexed: 01/08/2023]
Abstract
1α,25-dihydroxyvitamin D3 (1,25D3) regulates many physiological processes in vertebrates by binding to the vitamin D receptor (VDR). Phylogenetic analysis indicates that jawless fishes are the most basal vertebrates exhibiting a VDR gene. To elucidate the mechanism driving VDR activation during evolution, we determined the crystal structure of the VDR ligand-binding domain (LBD) complex from the basal vertebrate Petromyzon marinus, the sea lamprey (lVDR). Comparison of the three-dimensional crystal structure of the lVDR-1,25D3 complex with higher vertebrate VDR-1,25D3 structures suggests that 1,25D3 binds to lVDR similarly to human VDR, but with features unique to lVDR around the linker regions between H11 and H12 and between H9 and H10. These structural differences may contribute to the marked species differences in transcriptional responses. Furthermore, residue co-evolution analysis of VDR across vertebrates identifies correlated amino acid positions in H9 and in the large insertion domain specific to the VDR LBD.
Affiliation(s)
- Rita Sigüeiro
  - Institut de Génétique et de Biologie Moléculaire et Cellulaire (IGBMC), 67400 Illkirch, France; Institut National de La Santé et de La Recherche Médicale (INSERM), U1258, 67400 Illkirch, France; Centre National de Recherche Scientifique (CNRS), UMR7104, 67400 Illkirch, France; Université de Strasbourg, 67400 Illkirch, France
- Laurent Bianchetti
  - Institut de Génétique et de Biologie Moléculaire et Cellulaire (IGBMC), 67400 Illkirch, France; Institut National de La Santé et de La Recherche Médicale (INSERM), U1258, 67400 Illkirch, France; Centre National de Recherche Scientifique (CNRS), UMR7104, 67400 Illkirch, France; Université de Strasbourg, 67400 Illkirch, France
- Carole Peluso-Iltis
  - Institut de Génétique et de Biologie Moléculaire et Cellulaire (IGBMC), 67400 Illkirch, France; Institut National de La Santé et de La Recherche Médicale (INSERM), U1258, 67400 Illkirch, France; Centre National de Recherche Scientifique (CNRS), UMR7104, 67400 Illkirch, France; Université de Strasbourg, 67400 Illkirch, France
- Sandra Chalhoub
  - Institut de Génétique et de Biologie Moléculaire et Cellulaire (IGBMC), 67400 Illkirch, France; Institut National de La Santé et de La Recherche Médicale (INSERM), U1258, 67400 Illkirch, France; Centre National de Recherche Scientifique (CNRS), UMR7104, 67400 Illkirch, France; Université de Strasbourg, 67400 Illkirch, France
- Annick Dejaegere
  - Institut de Génétique et de Biologie Moléculaire et Cellulaire (IGBMC), 67400 Illkirch, France; Institut National de La Santé et de La Recherche Médicale (INSERM), U1258, 67400 Illkirch, France; Centre National de Recherche Scientifique (CNRS), UMR7104, 67400 Illkirch, France; Université de Strasbourg, 67400 Illkirch, France
- Judit Osz
  - Institut de Génétique et de Biologie Moléculaire et Cellulaire (IGBMC), 67400 Illkirch, France; Institut National de La Santé et de La Recherche Médicale (INSERM), U1258, 67400 Illkirch, France; Centre National de Recherche Scientifique (CNRS), UMR7104, 67400 Illkirch, France; Université de Strasbourg, 67400 Illkirch, France
- Natacha Rochel
  - Institut de Génétique et de Biologie Moléculaire et Cellulaire (IGBMC), 67400 Illkirch, France; Institut National de La Santé et de La Recherche Médicale (INSERM), U1258, 67400 Illkirch, France; Centre National de Recherche Scientifique (CNRS), UMR7104, 67400 Illkirch, France; Université de Strasbourg, 67400 Illkirch, France