1. Förster D, Idier J, Liberti L, Mucherino A, Lin JH, Malliavin TE. Low-resolution description of the conformational space for intrinsically disordered proteins. Sci Rep 2022; 12:19057. [PMID: 36352011 PMCID: PMC9646904 DOI: 10.1038/s41598-022-21648-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/01/2022] [Accepted: 09/29/2022] [Indexed: 11/11/2022] Open
Abstract
Intrinsically disordered proteins (IDPs) are at the center of numerous biological processes and consequently attract great interest in structural biology. Numerous approaches have been developed for generating sets of IDP conformations that satisfy a given set of experimental measurements. We propose here a systematic enumeration of protein conformations, carried out using the TAiBP approach based on distance geometry. This enumeration was performed on two proteins, Sic1 and pSic1, corresponding to the unphosphorylated and phosphorylated states of an IDP. The relative populations of the enumerated conformations were then obtained by fitting SAXS curves as well as Ramachandran probability maps; an original finite mixture approach, RamaMix, was developed for this second task. The similarity between profiles of local gyration radii provides, to a certain extent, a converged view of the Sic1 and pSic1 conformational space. Profiles and populations are thus proposed for describing IDP conformations. Different variations of the resulting gyration radius between the phosphorylated and unphosphorylated states are observed, depending on the set of enumerated conformations as well as on the methods used for obtaining the populations.
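To illustrate what estimating populations with a finite mixture can look like, the sketch below runs a generic expectation-maximization (EM) update for the weights of a mixture whose components are held fixed. It is not the RamaMix method itself, whose details are not given in this abstract; the function name `fit_mixture_weights` and the `likelihoods` table (probability of each observation under each enumerated conformation) are hypothetical and introduced only for illustration.

```python
def fit_mixture_weights(likelihoods, n_iter=200, tol=1e-12):
    """Estimate mixture weights (populations) by EM with fixed components.

    likelihoods[i][k] is the (assumed, precomputed) probability of
    observation i under component k, e.g. one enumerated conformation.
    """
    n_comp = len(likelihoods[0])
    weights = [1.0 / n_comp] * n_comp  # start from uniform populations
    for _ in range(n_iter):
        totals = [0.0] * n_comp
        for row in likelihoods:
            mix = sum(w * p for w, p in zip(weights, row))
            if mix <= 0.0:
                continue  # observation not explained by any component
            # E-step: responsibility of each component for this observation
            for k in range(n_comp):
                totals[k] += weights[k] * row[k] / mix
        norm = sum(totals)
        if norm == 0.0:
            raise ValueError("no observation is explained by the components")
        # M-step: new weights are the normalized summed responsibilities
        new_weights = [t / norm for t in totals]
        converged = max(abs(a - b) for a, b in zip(new_weights, weights)) < tol
        weights = new_weights
        if converged:
            break
    return weights


# Hypothetical toy data: three observations fit only component 0,
# one observation fits only component 1.
populations = fit_mixture_weights([[1.0, 0.0], [1.0, 0.0], [1.0, 0.0], [0.0, 1.0]])
```

In this toy case the components do not overlap, so the estimated populations are simply the fractions of observations each component explains.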
Affiliation(s)
- Daniel Förster
  - UMR7374 Interfaces, Confinement, Matériaux et Nanostructures, Université d’Orléans, Orléans, France (GRID grid.112485.b, ISNI 0000 0001 0217 6921)
- Jérôme Idier
  - UMR6004 Laboratoire des Sciences du Numérique de Nantes, Nantes, France (GRID grid.503212.7, ISNI 0000 0000 9563 6044)
- Leo Liberti
  - LIX UMR 7161 CNRS École Polytechnique, Institut Polytechnique de Paris, 91128 Palaiseau, France (GRID grid.508893.f)
- Antonio Mucherino
  - IRISA, University of Rennes 1, Rennes, France (GRID grid.420225.3, ISNI 0000 0001 2298 7270)
- Jung-Hsin Lin
  - Biomedical Translation Research Center, Academia Sinica, Taipei, Taiwan (GRID grid.509455.8)
- Thérèse E. Malliavin
  - Institut Pasteur, Université Paris Cité, CNRS UMR3528, Unité de Bioinformatique Structurale, F-75015 Paris, France (GRID grid.428999.7, ISNI 0000 0001 2353 6535)
  - Université de Lorraine, CNRS UMR7019, LPCT, F-54000 Nancy, France (GRID grid.29172.3f, ISNI 0000 0001 2194 6418)

2. Malliavin TE. Tandem domain structure determination based on a systematic enumeration of conformations. Sci Rep 2021; 11:16925. [PMID: 34413388 PMCID: PMC8376923 DOI: 10.1038/s41598-021-96370-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/23/2021] [Accepted: 08/04/2021] [Indexed: 12/03/2022] Open
Abstract
Protein structure determination is undergoing a change of perspective due to the growing importance in biology of the disordered regions of biomolecules. In such cases, the convergence criterion is more difficult to set up and the size of the conformational space is an obstacle to exhaustive exploration. A pipeline is proposed here to exhaustively sample protein conformations using backbone angle limits obtained by nuclear magnetic resonance (NMR), and then to determine the populations of conformations. The pipeline is applied to a tandem domain of the protein whirlin. An original approach, derived from a reformulation of the Distance Geometry Problem, is used to enumerate the conformations of the linker connecting the two domains. Specifically designed procedures then make it possible to assemble the domains with the linker conformations and to optimize the tandem domain conformations with respect to two sets of NMR measurements: residual dipolar couplings and paramagnetic resonance enhancements. The relative populations of optimized conformations are finally determined by fitting small angle X-ray scattering (SAXS) data. The most populated conformation of the tandem domain is a semi-closed one, with fully closed and more extended conformations in the minority, in agreement with previous observations. The SAXS and NMR data show different influences on the determination of populations.
Affiliation(s)
- Thérèse E Malliavin
  - Unité de Bioinformatique Structurale, Institut Pasteur, UMR 3528, CNRS, Paris, France.
  - Center of Bioinformatics, Biostatistics and Integrative Biology, Institut Pasteur, USR 3756, CNRS, Paris, France.

3. Malliavin TE, Mucherino A, Lavor C, Liberti L. Systematic Exploration of Protein Conformational Space Using a Distance Geometry Approach. J Chem Inf Model 2019; 59:4486-4503. [PMID: 31442036 DOI: 10.1021/acs.jcim.9b00215] [Citation(s) in RCA: 19] [Impact Index Per Article: 3.8] [Indexed: 11/28/2022]
Abstract
The optimization approaches classically used during the determination of protein structure encounter various difficulties, especially when the size of the conformational space is large. Indeed, in such a case, algorithmic convergence criteria are more difficult to set up. Moreover, the size of the search space makes it difficult to achieve a complete exploration. The interval branch-and-prune (iBP) approach, based on the reformulation of the distance geometry problem (DGP), provides a theoretical frame for the generation of protein conformations by systematically sampling the conformational space. When an appropriate subset of interatomic distances is known exactly, this worst-case exponential-time algorithm is provably complete and fixed-parameter tractable. These guarantees, however, immediately disappear as distance measurement errors are introduced. Here we propose an improvement of this approach: threading-augmented interval branch-and-prune (TAiBP), where the combinatorial explosion of the original iBP approach arising from its exponential complexity is alleviated by partitioning the input instances into consecutive peptide fragments and by using self-organizing maps (SOMs) to obtain clusters of similar solutions. A validation of the TAiBP approach is presented here on a set of proteins of various sizes and structures. The calculation inputs are a uniform covalent geometry extracted from force field covalent terms, the backbone dihedral angles with error intervals, and a few long-range distances. For most of the proteins smaller than 50 residues and interval widths of 20°, the TAiBP approach yielded solutions with RMSD values smaller than 3 Å with respect to the initial protein conformation. The efficiency of the TAiBP approach for proteins larger than 50 residues will require the use of nonuniform covalent geometry and may benefit from the recent development of residue-specific force fields.
Affiliation(s)
- Thérèse E Malliavin
  - Unité de Bioinformatique Structurale, UMR 3528, CNRS, and Departement de Bioinformatique, Biostatistique et Biologie Intégrative, USR 3756, CNRS, Institut Pasteur, 75015 Paris, France
- Carlile Lavor
  - Applied Math Department, IMECC-University of Campinas, Campinas, SP 13083-970, Brazil
- Leo Liberti
  - LIX CNRS, Ecole Polytechnique, Institut Polytechnique de Paris, Route de Saclay, 91128 Palaiseau, France

4. Xu C, Bouvier G, Bardiaux B, Nilges M, Malliavin T, Lisser A. Ordering Protein Contact Matrices. Comput Struct Biotechnol J 2018; 16:140-156. [PMID: 29632657 PMCID: PMC5889711 DOI: 10.1016/j.csbj.2018.03.001] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/01/2017] [Revised: 02/28/2018] [Accepted: 03/01/2018] [Indexed: 11/29/2022] Open
Abstract
Numerous biophysical approaches provide information about the spatial proximity of residues in proteins. However, correct assignment of the protein fold from this proximity information is not straightforward if the spatially close protein residues are not assigned to residues in the primary sequence. Here, we propose an algorithm to assign such residue numbers by ordering the columns and rows of the raw protein contact matrix directly obtained from proximity information between unassigned amino acids. The ordering problem is formulated as the search for a trail within a graph connecting protein residues through the nonzero contact values. The algorithm proceeds in two steps: (i) finding the longest trail of the graph using an original dynamic programming algorithm, (ii) clustering the individual ordered matrices using a self-organizing map (SOM) approach. The combination of the dynamic programming and self-organizing map approaches is the main innovation of the present work. The algorithm was validated on a set of about 900 proteins, representative of the sizes and proportions of secondary structures observed in the Protein Data Bank. The algorithm proved efficient for noise levels up to 40%, obtaining average gaps of at most about 20% between ordered and initial matrices. The proposed approach paves the way toward a method of fold prediction from noisy proximity information, as TM scores larger than 0.5 have been obtained for ten randomly chosen proteins, in the case of a noise level of 10%. The method has also been validated on two experimental cases, on which it performed satisfactorily.
Affiliation(s)
- Chuan Xu
  - Laboratoire de Recherche en Informatique, Université Paris-Sud and CNRS UMR8623, France
- Guillaume Bouvier
  - Unité de Bioinformatique Structurale, Institut Pasteur and CNRS UMR3528, France
  - Centre de Bioinformatique, Biostatistique et Biologie Intégrative, Institut Pasteur and CNRS USR3756, France
- Benjamin Bardiaux
  - Unité de Bioinformatique Structurale, Institut Pasteur and CNRS UMR3528, France
  - Centre de Bioinformatique, Biostatistique et Biologie Intégrative, Institut Pasteur and CNRS USR3756, France
- Michael Nilges
  - Unité de Bioinformatique Structurale, Institut Pasteur and CNRS UMR3528, France
  - Centre de Bioinformatique, Biostatistique et Biologie Intégrative, Institut Pasteur and CNRS USR3756, France
- Thérèse Malliavin
  - Unité de Bioinformatique Structurale, Institut Pasteur and CNRS UMR3528, France
  - Centre de Bioinformatique, Biostatistique et Biologie Intégrative, Institut Pasteur and CNRS USR3756, France
- Abdel Lisser
  - Laboratoire de Recherche en Informatique, Université Paris-Sud and CNRS UMR8623, France

5. Harigua-Souiai E, Abdelkrim YZ, Bassoumi-Jamoussi I, Zakraoui O, Bouvier G, Essafi-Benkhadir K, Banroques J, Desdouits N, Munier-Lehmann H, Barhoumi M, Tanner NK, Nilges M, Blondel A, Guizani I. Identification of novel leishmanicidal molecules by virtual and biochemical screenings targeting Leishmania eukaryotic translation initiation factor 4A. PLoS Negl Trop Dis 2018; 12:e0006160. [PMID: 29346371 PMCID: PMC5790279 DOI: 10.1371/journal.pntd.0006160] [Citation(s) in RCA: 15] [Impact Index Per Article: 2.5] [Received: 09/20/2016] [Revised: 01/30/2018] [Accepted: 12/11/2017] [Indexed: 01/25/2023] Open
Abstract
Leishmaniases are neglected parasitic diseases in spite of the major burden they inflict on public health. The identification of novel drugs and targets constitutes a research priority. For that purpose we used Leishmania infantum initiation factor 4A (LieIF), an essential translation initiation factor that belongs to the DEAD-box protein family, as a potential drug target. We modeled its structure and identified two potential binding sites. A virtual screening of a diverse chemical library was performed for both sites. The results were analyzed with an in-house version of the Self-Organizing Maps algorithm combined with multiple filters, which led to the selection of 305 molecules. Effects of these molecules on the ATPase activity of LieIF permitted the identification of a promising hit (208) having a half maximal inhibitory concentration (IC50) of 150 ± 15 μM for 1 μM of protein. Ten chemical analogues of compound 208 were identified and two additional inhibitors were selected (20 and 48). These compounds inhibited the mammalian eIF4I with IC50 values within the same range. All three hits affected the viability of the extra-cellular form of L. infantum parasites with IC50 values at low micromolar concentrations. These molecules showed non-significant toxicity toward THP-1 macrophages. Furthermore, their anti-leishmanial activity was validated with experimental assays on L. infantum intramacrophage amastigotes showing IC50 values lower than 4.2 μM. Selected compounds exhibited selectivity indexes between 19 and 38, which reflects their potential as promising anti-Leishmania molecules.
Affiliation(s)
- Emna Harigua-Souiai
  - Laboratory of Molecular Epidemiology and Experimental Pathology – LR11IPT04, Institut Pasteur de Tunis, Université de Tunis el Manar, Tunis, Tunisia
  - Institut Pasteur, Unité de Bioinformatique Structurale, CNRS UMR 3528, Département de Biologie Structurale et Chimie, Paris, France
- Yosser Zina Abdelkrim
  - Laboratory of Molecular Epidemiology and Experimental Pathology – LR11IPT04, Institut Pasteur de Tunis, Université de Tunis el Manar, Tunis, Tunisia
  - Laboratory of Microbial Gene Expression (EGM), CNRS UMR8261/Université Paris Diderot P7, Sorbonne Paris Cité & PSL, Institut de Biologie Physico-Chimique, Paris, France
  - Faculté des Sciences de Bizerte, Université de Carthage, Tunis, Tunisia
- Imen Bassoumi-Jamoussi
  - Laboratory of Molecular Epidemiology and Experimental Pathology – LR11IPT04, Institut Pasteur de Tunis, Université de Tunis el Manar, Tunis, Tunisia
- Ons Zakraoui
  - Laboratory of Molecular Epidemiology and Experimental Pathology – LR11IPT04, Institut Pasteur de Tunis, Université de Tunis el Manar, Tunis, Tunisia
- Guillaume Bouvier
  - Institut Pasteur, Unité de Bioinformatique Structurale, CNRS UMR 3528, Département de Biologie Structurale et Chimie, Paris, France
- Khadija Essafi-Benkhadir
  - Laboratory of Molecular Epidemiology and Experimental Pathology – LR11IPT04, Institut Pasteur de Tunis, Université de Tunis el Manar, Tunis, Tunisia
- Josette Banroques
  - Laboratory of Microbial Gene Expression (EGM), CNRS UMR8261/Université Paris Diderot P7, Sorbonne Paris Cité & PSL, Institut de Biologie Physico-Chimique, Paris, France
- Nathan Desdouits
  - Institut Pasteur, Unité de Bioinformatique Structurale, CNRS UMR 3528, Département de Biologie Structurale et Chimie, Paris, France
- Hélène Munier-Lehmann
  - Institut Pasteur, Unité de Chimie et Biocatalyse, Département de Biologie Structurale et Chimie, Paris, France
  - Unité Mixte de Recherche 3523, Centre National de la Recherche Scientifique, Paris, France
- Mourad Barhoumi
  - Laboratory of Molecular Epidemiology and Experimental Pathology – LR11IPT04, Institut Pasteur de Tunis, Université de Tunis el Manar, Tunis, Tunisia
- N. Kyle Tanner
  - Laboratory of Microbial Gene Expression (EGM), CNRS UMR8261/Université Paris Diderot P7, Sorbonne Paris Cité & PSL, Institut de Biologie Physico-Chimique, Paris, France
- Michael Nilges
  - Institut Pasteur, Unité de Bioinformatique Structurale, CNRS UMR 3528, Département de Biologie Structurale et Chimie, Paris, France
- Arnaud Blondel
  - Institut Pasteur, Unité de Bioinformatique Structurale, CNRS UMR 3528, Département de Biologie Structurale et Chimie, Paris, France
- Ikram Guizani
  - Laboratory of Molecular Epidemiology and Experimental Pathology – LR11IPT04, Institut Pasteur de Tunis, Université de Tunis el Manar, Tunis, Tunisia

6. Duclert-Savatier N, Bouvier G, Nilges M, Malliavin TE. Building Graphs To Describe Dynamics, Kinetics, and Energetics in the d-ALa:d-Lac Ligase VanA. J Chem Inf Model 2016; 56:1762-75. [PMID: 27579990 PMCID: PMC5039762 DOI: 10.1021/acs.jcim.6b00211] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.9] [Indexed: 02/02/2023]
Abstract
The d-Ala:d-Lac ligase, VanA, plays a critical role in the resistance to vancomycin. Indeed, it is involved in the synthesis of a peptidoglycan precursor, to which vancomycin cannot bind. The reaction catalyzed by VanA requires the opening of the so-called “ω-loop”, so that the substrates can enter the active site. Here, the conformational landscape of VanA is explored by an enhanced sampling approach: the temperature-accelerated molecular dynamics (TAMD). Analysis of the molecular dynamics (MD) and TAMD trajectories recorded on VanA permits a graphical description of the structural and kinetic aspects of the conformational space of VanA, where the internal mobility and various opening modes of the ω-loop play a major role. The other important feature is the correlation of the ω-loop motion with the movements of the opposite domain, defined as containing the residues A149–Q208. Conformational and kinetic clusters have been determined and a path describing the ω-loop opening was extracted from these clusters. The determination of this opening path, as well as the relative importance of hydrogen bonds along the path, permits one to propose some key residue interactions for the kinetics of the ω-loop opening.
Affiliation(s)
- Nathalie Duclert-Savatier
  - Département de Biologie Structurale et Chimie, Institut Pasteur, Unité de Bioinformatique Structurale, CNRS UMR 3528, 25, rue du Dr Roux, 75015 Paris, France
- Guillaume Bouvier
  - Département de Biologie Structurale et Chimie, Institut Pasteur, Unité de Bioinformatique Structurale, CNRS UMR 3528, 25, rue du Dr Roux, 75015 Paris, France
- Michael Nilges
  - Département de Biologie Structurale et Chimie, Institut Pasteur, Unité de Bioinformatique Structurale, CNRS UMR 3528, 25, rue du Dr Roux, 75015 Paris, France
- Thérèse E Malliavin
  - Département de Biologie Structurale et Chimie, Institut Pasteur, Unité de Bioinformatique Structurale, CNRS UMR 3528, 25, rue du Dr Roux, 75015 Paris, France

7. Cortés-Ciriano I, van Westen GJP, Bouvier G, Nilges M, Overington JP, Bender A, Malliavin TE. Improved large-scale prediction of growth inhibition patterns using the NCI60 cancer cell line panel. Bioinformatics 2015; 32:85-95. [PMID: 26351271 PMCID: PMC4681992 DOI: 10.1093/bioinformatics/btv529] [Citation(s) in RCA: 59] [Impact Index Per Article: 6.6] [Received: 02/27/2015] [Accepted: 08/26/2015] [Indexed: 01/28/2023] Open
Abstract
MOTIVATION Recent large-scale omics initiatives have catalogued the somatic alterations of cancer cell line panels along with their pharmacological response to hundreds of compounds. In this study, we have explored these data to advance computational approaches that enable more effective and targeted use of current and future anticancer therapeutics. RESULTS We modelled the 50% growth inhibition bioassay end-point (GI50) of 17,142 compounds screened against 59 cancer cell lines from the NCI60 panel (941,831 data-points, matrix 93.08% complete) by integrating the chemical and biological (cell line) information. We determine that the protein, gene transcript and miRNA abundance provide the highest predictive signal when modelling the GI50 endpoint, which significantly outperformed the DNA copy-number variation or exome sequencing data (Tukey's Honestly Significant Difference, P <0.05). We demonstrate that, within the limits of the data, our approach exhibits the ability to both interpolate and extrapolate compound bioactivities to new cell lines and tissues and, although to a lesser extent, to dissimilar compounds. Moreover, our approach outperforms previous models generated on the GDSC dataset. Finally, we determine that in the cases investigated in more detail, the predicted drug-pathway associations and growth inhibition patterns are mostly consistent with the experimental data, which also suggests the possibility of identifying genomic markers of drug sensitivity for novel compounds on novel cell lines. CONTACT terez@pasteur.fr; ab454@ac.cam.uk SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Isidro Cortés-Ciriano
  - Unité de Bioinformatique Structurale, Institut Pasteur and CNRS UMR 3825, Structural Biology and Chemistry Department, 75 724 Paris, France
- Gerard J P van Westen
  - Medicinal Chemistry, Leiden Academic Centre for Drug Research, Einsteinweg 55, 2333CC, Leiden
- Guillaume Bouvier
  - Unité de Bioinformatique Structurale, Institut Pasteur and CNRS UMR 3825, Structural Biology and Chemistry Department, 75 724 Paris, France
- Michael Nilges
  - Unité de Bioinformatique Structurale, Institut Pasteur and CNRS UMR 3825, Structural Biology and Chemistry Department, 75 724 Paris, France
- John P Overington
  - European Molecular Biology Laboratory European Bioinformatics Institute, Wellcome Trust Genome Campus, CB10 1SD, Hinxton, Cambridge, UK
- Andreas Bender
  - Centre for Molecular Science Informatics, Department of Chemistry, University of Cambridge, CB2 1EW Cambridge, UK
- Thérèse E Malliavin
  - Unité de Bioinformatique Structurale, Institut Pasteur and CNRS UMR 3825, Structural Biology and Chemistry Department, 75 724 Paris, France

8. Cortes-Ciriano I, Bouvier G, Nilges M, Maragliano L, Malliavin TE. Temperature Accelerated Molecular Dynamics with Soft-Ratcheting Criterion Orients Enhanced Sampling by Low-Resolution Information. J Chem Theory Comput 2015; 11:3446-54. [PMID: 26575778 DOI: 10.1021/acs.jctc.5b00153] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.1] [Indexed: 01/29/2023]
Abstract
Many proteins exhibit an equilibrium between multiple conformations, some of them being characterized only by low-resolution information. Visiting all conformations is a demanding task for computational techniques performing enhanced but unfocused exploration of collective variable (CV) space. Otherwise, pulling a structure toward a target condition biases the exploration in a way difficult to assess. To address this problem, we introduce here the soft-ratcheting temperature-accelerated molecular dynamics (sr-TAMD), where the exploration of CV space by TAMD is coupled to a soft-ratcheting algorithm that filters the evolving CV values according to a predefined criterion. Any low resolution or even qualitative information can be used to orient the exploration. We validate this technique by exploring the conformational space of the inactive state of the catalytic domain of the adenyl cyclase AC from Bordetella pertussis. The domain AC gets activated by association with calmodulin (CaM), and the available crystal structure shows that in the complex the protein has an elongated shape. High-resolution data are not available for the inactive, CaM-free protein state, but hydrodynamic measurements have shown that the inactive AC displays a more globular conformation. Here, using as CVs several geometric centers, we use sr-TAMD to enhance CV space sampling while filtering for CV values that correspond to centers moving close to each other, and we thus rapidly visit regions of conformational space that correspond to globular structures. The set of conformations sampled using sr-TAMD provides the most extensive description of the inactive state of AC up to now, consistent with available experimental information.
Affiliation(s)
- Isidro Cortes-Ciriano
  - Unité de Bioinformatique Structurale, CNRS UMR 3528, Structural Biology and Chemistry Department, Institut Pasteur, 25-28, rue Dr. Roux, 75 724 Paris, France
- Guillaume Bouvier
  - Unité de Bioinformatique Structurale, CNRS UMR 3528, Structural Biology and Chemistry Department, Institut Pasteur, 25-28, rue Dr. Roux, 75 724 Paris, France
- Michael Nilges
  - Unité de Bioinformatique Structurale, CNRS UMR 3528, Structural Biology and Chemistry Department, Institut Pasteur, 25-28, rue Dr. Roux, 75 724 Paris, France
- Luca Maragliano
  - Department of Neuroscience and Brain Technologies, Istituto Italiano di Tecnologia, Genoa, Italy
- Thérèse E Malliavin
  - Unité de Bioinformatique Structurale, CNRS UMR 3528, Structural Biology and Chemistry Department, Institut Pasteur, 25-28, rue Dr. Roux, 75 724 Paris, France

9. Harigua-Souiai E, Cortes-Ciriano I, Desdouits N, Malliavin TE, Guizani I, Nilges M, Blondel A, Bouvier G. Identification of binding sites and favorable ligand binding moieties by virtual screening and self-organizing map analysis. BMC Bioinformatics 2015; 16:93. [PMID: 25888251 PMCID: PMC4381396 DOI: 10.1186/s12859-015-0518-z] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.1] [Received: 09/01/2014] [Accepted: 02/24/2015] [Indexed: 11/24/2022] Open
Abstract
Background Identifying druggable cavities on a protein surface is a crucial step in structure based drug design. The cavities have to present suitable size and shape, as well as appropriate chemical complementarity with ligands. Results We present a novel cavity prediction method that analyzes results of virtual screening of specific ligands or fragment libraries by means of Self-Organizing Maps. We demonstrate the method with two thoroughly studied proteins where it successfully identified their active sites (AS) and relevant secondary binding sites (BS). Moreover, known active ligands mapped the AS better than inactive ones. Interestingly, docking a naive fragment library brought even more insight. We then systematically applied the method to the 102 targets from the DUD-E database, where it showed a 90% identification rate of the AS among the first three consensual clusters of the SOM, and in 82% of the cases as the first one. Further analysis by chemical decomposition of the fragments improved BS prediction. Chemical substructures that are representative of the active ligands preferentially mapped in the AS. Conclusion The new approach provides valuable information both on relevant BSs and on chemical features promoting bioactivity. Electronic supplementary material The online version of this article (doi:10.1186/s12859-015-0518-z) contains supplementary material, which is available to authorized users.
Affiliation(s)
- Emna Harigua-Souiai
  - Institut Pasteur, Unité de Bioinformatique Structurale, CNRS UMR 3528, Département de Biologie Structurale et Chimie, 25, rue du Dr Roux, Paris, 75015, France.
  - Laboratory of Molecular Epidemiology and Experimental Pathology - LR11IPT04, Institut Pasteur de Tunis, Université Tunis el Manar - Tunisia, 13, Place Pasteur, Tunis, 1002, Tunisia.
  - University of Carthage, Faculty of sciences of Bizerte - Tunisia, Jarzouna, 7021, Tunisia.
- Isidro Cortes-Ciriano
  - Institut Pasteur, Unité de Bioinformatique Structurale, CNRS UMR 3528, Département de Biologie Structurale et Chimie, 25, rue du Dr Roux, Paris, 75015, France.
- Nathan Desdouits
  - Institut Pasteur, Unité de Bioinformatique Structurale, CNRS UMR 3528, Département de Biologie Structurale et Chimie, 25, rue du Dr Roux, Paris, 75015, France.
- Thérèse E Malliavin
  - Institut Pasteur, Unité de Bioinformatique Structurale, CNRS UMR 3528, Département de Biologie Structurale et Chimie, 25, rue du Dr Roux, Paris, 75015, France.
- Ikram Guizani
  - Laboratory of Molecular Epidemiology and Experimental Pathology - LR11IPT04, Institut Pasteur de Tunis, Université Tunis el Manar - Tunisia, 13, Place Pasteur, Tunis, 1002, Tunisia.
- Michael Nilges
  - Institut Pasteur, Unité de Bioinformatique Structurale, CNRS UMR 3528, Département de Biologie Structurale et Chimie, 25, rue du Dr Roux, Paris, 75015, France.
- Arnaud Blondel
  - Institut Pasteur, Unité de Bioinformatique Structurale, CNRS UMR 3528, Département de Biologie Structurale et Chimie, 25, rue du Dr Roux, Paris, 75015, France.
- Guillaume Bouvier
  - Institut Pasteur, Unité de Bioinformatique Structurale, CNRS UMR 3528, Département de Biologie Structurale et Chimie, 25, rue du Dr Roux, Paris, 75015, France.

10. Cassioli A, Bardiaux B, Bouvier G, Mucherino A, Alves R, Liberti L, Nilges M, Lavor C, Malliavin TE. An algorithm to enumerate all possible protein conformations verifying a set of distance constraints. BMC Bioinformatics 2015; 16:23. [PMID: 25627244 PMCID: PMC4384350 DOI: 10.1186/s12859-015-0451-1] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.0] [Received: 07/09/2014] [Accepted: 01/05/2015] [Indexed: 11/15/2023] Open
Abstract
BACKGROUND The determination of protein structures satisfying distance constraints is an important problem in structural biology. Whereas the most common method currently employed is simulated annealing, other methods have previously been proposed in the literature. Most of them, however, are designed to find one solution only. RESULTS In order to explore exhaustively the feasible conformational space, we propose here an interval Branch-and-Prune algorithm (iBP) to solve the Distance Geometry Problem (DGP) associated with protein structure determination. This algorithm is based on a discretization of the problem obtained by recursively constructing a search space having the structure of a tree, and by verifying whether the generated atomic positions are feasible or not by making use of pruning devices. The pruning devices used here are directly related to features of protein conformations. CONCLUSIONS We described the new algorithm iBP to generate protein conformations satisfying distance constraints, which could potentially allow a systematic exploration of the conformational space. The algorithm iBP has been applied to three α-helical peptides.
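The tree search with pruning described above can be illustrated with a deliberately simplified sketch: a discretized branch-and-prune search in 2D with exact distances. This is a toy under stated assumptions, not the authors' iBP, which works in 3D and handles distance intervals. The function names, the `dist` dictionary keyed by index pairs, and the unit-square example are hypothetical.

```python
import math


def circle_intersections(p1, r1, p2, r2, tol=1e-9):
    """Return the 0, 1 or 2 intersection points of two circles in the plane."""
    dx, dy = p2[0] - p1[0], p2[1] - p1[1]
    d = math.hypot(dx, dy)
    if d < tol:
        return []  # coincident centers: no discrete set of candidates
    a = (r1 * r1 - r2 * r2 + d * d) / (2.0 * d)
    h2 = r1 * r1 - a * a
    if h2 < -tol:
        return []  # circles do not meet
    h = math.sqrt(max(h2, 0.0))
    mx, my = p1[0] + a * dx / d, p1[1] + a * dy / d
    if h < tol:
        return [(mx, my)]  # tangent circles: a single candidate
    ox, oy = -dy * h / d, dx * h / d
    return [(mx + ox, my + oy), (mx - ox, my - oy)]


def branch_and_prune(n, dist, tol=1e-6):
    """Enumerate all 2D placements of n points consistent with `dist`.

    dist[(j, i)] with j < i holds a known exact distance; the pairs
    (i-1, i) and (i-2, i) must be present so that each new point has at
    most two candidate positions (the branching step).
    """
    solutions = []
    start = [(0.0, 0.0), (dist[(0, 1)], 0.0)]  # fix translation and rotation

    def extend(points):
        i = len(points)
        if i == n:
            solutions.append(list(points))
            return
        candidates = circle_intersections(
            points[i - 2], dist[(i - 2, i)], points[i - 1], dist[(i - 1, i)]
        )
        for cand in candidates:
            # Pruning step: reject candidates violating any other known distance.
            feasible = all(
                abs(math.dist(points[j], cand) - dist[(j, i)]) <= tol
                for j in range(i - 2)
                if (j, i) in dist
            )
            if feasible:
                points.append(cand)
                extend(points)
                points.pop()

    extend(start)
    return solutions


# Hypothetical example: the four corners of a unit square.
s2 = math.sqrt(2.0)
square = {(0, 1): 1.0, (1, 2): 1.0, (0, 2): s2, (2, 3): 1.0, (1, 3): s2, (0, 3): 1.0}
conformations = branch_and_prune(4, square)
```

In the square example, the extra distance between points 0 and 3 serves as the pruning device: without it the tree has four leaves, and with it only the square and its mirror image remain.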
Affiliation(s)
- Benjamin Bardiaux
  - Institut Pasteur, Structural Bioinformatics Unit, 25, rue du Dr Roux, Paris, 75015, France.
  - CNRS UMR3528, 25, rue du Dr Roux, Paris, 75015, France.
- Guillaume Bouvier
  - Institut Pasteur, Structural Bioinformatics Unit, 25, rue du Dr Roux, Paris, 75015, France.
  - CNRS UMR3528, 25, rue du Dr Roux, Paris, 75015, France.
- Rafael Alves
  - LIX, Ecole Polytechnique, Palaiseau, 91128, France.
- Leo Liberti
  - LIX, Ecole Polytechnique, Palaiseau, 91128, France.
  - IBM TJ Watson Research Center, NY Yorktown Heights, 10598, USA.
- Michael Nilges
  - Institut Pasteur, Structural Bioinformatics Unit, 25, rue du Dr Roux, Paris, 75015, France.
  - CNRS UMR3528, 25, rue du Dr Roux, Paris, 75015, France.
- Carlile Lavor
  - University of Campinas (IMECC-UNICAMP), Campinas-SP, 13083-859, Brasil.
- Thérèse E Malliavin
  - Institut Pasteur, Structural Bioinformatics Unit, 25, rue du Dr Roux, Paris, 75015, France.
  - CNRS UMR3528, 25, rue du Dr Roux, Paris, 75015, France.

11. Bouvier G, Desdouits N, Ferber M, Blondel A, Nilges M. An automatic tool to analyze and cluster macromolecular conformations based on self-organizing maps. Bioinformatics 2014; 31:1490-2. [PMID: 25543048 DOI: 10.1093/bioinformatics/btu849] [Citation(s) in RCA: 23] [Impact Index Per Article: 2.3] [Received: 09/26/2014] [Accepted: 12/21/2014] [Indexed: 11/12/2022]
Abstract
MOTIVATION Sampling the conformational space of biological macromolecules generates large, highly complex data sets. Data-mining techniques such as clustering can extract meaningful information from them. Among these techniques, the self-organizing map (SOM) algorithm has shown great promise, in particular because its computation time rises only linearly with the size of the data set. Whereas SOMs are generally used with few neurons, we investigate here their behavior with large numbers of neurons. RESULTS We present a Python library implementing the full SOM analysis workflow. Large SOMs can readily be applied to heavy data sets and, coupled with visualization tools, have very interesting properties. Descriptors for each conformation of a trajectory are calculated and mapped onto a 3D landscape, the U-matrix, which reports the distance between neighboring neurons. To delineate clusters, we developed the flooding algorithm, which hierarchically identifies local basins of the U-matrix from the global minimum to the maximum. AVAILABILITY AND IMPLEMENTATION The Python implementation of the SOM library is freely available on GitHub: https://github.com/bougui505/SOM. CONTACT michael.nilges@pasteur.fr or guillaume.bouvier@pasteur.fr SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
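As an illustration of the U-matrix mentioned in this abstract, the following is a minimal NumPy sketch. It does not use the API of the SOM library cited above, and it makes several assumptions: a small rectangular, non-periodic grid, a basic online training rule, and a U-matrix defined as the mean distance from each neuron to its 4-connected grid neighbors. The functions `train_som` and `u_matrix` and the two-blob toy data are hypothetical, and the flooding algorithm is not reproduced.

```python
import numpy as np

def train_som(data, shape=(8, 8), n_iter=2000, lr0=0.5, sigma0=3.0, seed=0):
    """Train a small SOM with a basic online update (illustrative, not the library's)."""
    rng = np.random.default_rng(seed)
    rows, cols = shape
    weights = rng.random((rows, cols, data.shape[1]))
    # Grid coordinates of every neuron, shape (rows, cols, 2).
    grid = np.stack(np.meshgrid(np.arange(rows), np.arange(cols), indexing="ij"),
                    axis=-1)
    for step in range(n_iter):
        frac = step / n_iter
        lr = lr0 * (1.0 - frac)                  # learning rate decays linearly
        sigma = sigma0 * (1.0 - frac) + 0.5      # neighborhood radius shrinks
        x = data[rng.integers(len(data))]
        dist = np.linalg.norm(weights - x, axis=-1)
        bmu = np.unravel_index(np.argmin(dist), dist.shape)   # best matching unit
        grid_d2 = np.sum((grid - np.array(bmu)) ** 2, axis=-1)
        h = np.exp(-grid_d2 / (2.0 * sigma**2))
        weights += lr * h[..., None] * (x - weights)
    return weights

def u_matrix(weights):
    """Mean distance between each neuron and its 4-connected grid neighbors."""
    rows, cols, _ = weights.shape
    total = np.zeros((rows, cols))
    count = np.zeros((rows, cols))
    dv = np.linalg.norm(weights[1:] - weights[:-1], axis=-1)        # vertical pairs
    dh = np.linalg.norm(weights[:, 1:] - weights[:, :-1], axis=-1)  # horizontal pairs
    total[1:] += dv
    total[:-1] += dv
    count[1:] += 1
    count[:-1] += 1
    total[:, 1:] += dh
    total[:, :-1] += dh
    count[:, 1:] += 1
    count[:, :-1] += 1
    return total / count

# Toy descriptors (illustrative only): two well-separated Gaussian blobs.
rng = np.random.default_rng(1)
data = np.vstack([rng.normal(0.0, 0.1, (200, 3)), rng.normal(1.0, 0.1, (200, 3))])
som = train_som(data)
umat = u_matrix(som)
print(umat.shape)
```

High U-matrix values mark neurons whose neighbors are far away in descriptor space, so ridges of high values separate basins that can be read as clusters.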
Affiliation(s)
- Guillaume Bouvier
- Institut Pasteur, Unité de Bioinformatique Structurale; CNRS UMR 3528; Département de Biologie Structurale et Chimie; F-75015, Paris, France
- Nathan Desdouits
- Institut Pasteur, Unité de Bioinformatique Structurale; CNRS UMR 3528; Département de Biologie Structurale et Chimie; F-75015, Paris, France
- Mathias Ferber
- Institut Pasteur, Unité de Bioinformatique Structurale; CNRS UMR 3528; Département de Biologie Structurale et Chimie; F-75015, Paris, France
- Arnaud Blondel
- Institut Pasteur, Unité de Bioinformatique Structurale; CNRS UMR 3528; Département de Biologie Structurale et Chimie; F-75015, Paris, France
- Michael Nilges
- Institut Pasteur, Unité de Bioinformatique Structurale; CNRS UMR 3528; Département de Biologie Structurale et Chimie; F-75015, Paris, France
12
Distinct docking and stabilization steps of the Pseudopilus conformational transition path suggest rotational assembly of type IV pilus-like fibers. Structure 2014; 22:685-96. [PMID: 24685147 DOI: 10.1016/j.str.2014.03.001] [Citation(s) in RCA: 41] [Impact Index Per Article: 4.1] [Received: 11/04/2013] [Revised: 02/28/2014] [Accepted: 03/04/2014] [Indexed: 01/07/2023]
Abstract
The closely related bacterial type II secretion (T2S) and type IV pilus (T4P) systems are sophisticated machines that assemble dynamic fibers promoting protein transport, motility, or adhesion. Despite their essential role in virulence, the molecular mechanisms underlying helical fiber assembly remain unknown. Here, we use electron microscopy and flexible modeling to study conformational changes of PulG pili assembled by the Klebsiella oxytoca T2SS. Neural network analysis of 3,900 pilus models suggested a transition path toward low-energy conformations driven by progressive increase in fiber helical twist. Detailed predictions of interprotomer contacts along this path were tested by site-directed mutagenesis, pilus assembly, and protein secretion analyses. We demonstrate that electrostatic interactions between adjacent protomers (P-P+1) in the membrane drive pseudopilin docking, while P-P+3 and P-P+4 contacts determine downstream fiber stabilization steps. These results support a model of a spool-like assembly mechanism for fibers of the T2SS-T4P superfamily.