1
Sirugue L, Langenfeld F, Lagarde N, Montes M. PLO3S: Protein LOcal Surficial Similarity Screening. Comput Struct Biotechnol J 2024; 26:1-10. [PMID: 38189058 PMCID: PMC10770625 DOI: 10.1016/j.csbj.2023.12.002] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/14/2022] [Revised: 12/01/2023] [Accepted: 12/03/2023] [Indexed: 01/09/2024] [Open Access]
Abstract
The study of protein molecular surfaces makes it possible to better understand and predict protein interactions. Different methods developed in computer vision to compare surfaces can be applied to protein molecular surfaces. The present work proposes a method using the Wave Kernel Signature: Protein LOcal Surficial Similarity Screening (PLO3S). The descriptor of the PLO3S method is a local surface shape descriptor projected on a unit sphere, mapped onto a 2D plane, and called Surface Wave Interpolated Maps (SWIM). PLO3S allows rapid comparison of protein surface shapes through local comparisons, in order to filter large protein surface datasets in protein structure virtual screening protocols.
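The abstract does not say which sphere-to-plane mapping SWIM uses. As a minimal sketch of what "a unit sphere mapped onto a 2D plane" can look like, the example below assumes a plain equirectangular (longitude/colatitude) mapping. The function name `sphere_to_plane` and the choice of mapping are illustrative assumptions, not the PLO3S implementation.

```python
import math


def sphere_to_plane(x, y, z):
    """Map a direction (x, y, z) to map coordinates (u, v) in [0, 1] x [0, 1].

    Assumption: an equirectangular mapping, chosen only for illustration;
    the mapping actually used by SWIM is not given in the abstract.
    """
    norm = math.sqrt(x * x + y * y + z * z)
    if norm == 0.0:
        raise ValueError("the origin has no direction on the sphere")
    # Normalise so that the point lies on the unit sphere.
    x, y, z = x / norm, y / norm, z / norm
    longitude = math.atan2(y, x)                      # in [-pi, pi]
    colatitude = math.acos(max(-1.0, min(1.0, z)))    # in [0, pi]
    u = (longitude + math.pi) / (2.0 * math.pi)
    v = colatitude / math.pi
    return u, v


# A point on the equator and the north pole.
equator_uv = sphere_to_plane(1.0, 0.0, 0.0)
pole_uv = sphere_to_plane(0.0, 0.0, 2.0)
```

Once every surface point has a (u, v) position, a per-point descriptor value can be written into a regular 2D grid, which is what makes image-style comparisons between maps possible.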
Affiliation(s)
- Léa Sirugue
- Laboratoire GBCM, EA7528, Conservatoire National des Arts et Métiers, Hesam Université, 2, rue Conté, Paris, 75003, France
- Florent Langenfeld
- Laboratoire GBCM, EA7528, Conservatoire National des Arts et Métiers, Hesam Université, 2, rue Conté, Paris, 75003, France
- Nathalie Lagarde
- Laboratoire GBCM, EA7528, Conservatoire National des Arts et Métiers, Hesam Université, 2, rue Conté, Paris, 75003, France
- Matthieu Montes
- Laboratoire GBCM, EA7528, Conservatoire National des Arts et Métiers, Hesam Université, 2, rue Conté, Paris, 75003, France
2
Shin WH, Kihara D. PL-PatchSurfer3: Improved Structure-Based Virtual Screening for Structure Variation Using 3D Zernike Descriptors. bioRxiv 2024:2024.02.22.581511. [PMID: 38464318 PMCID: PMC10925112 DOI: 10.1101/2024.02.22.581511] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/12/2024]
Abstract
Structure-based virtual screening (SBVS) is a widely used method in in silico drug discovery. It requires a receptor structure or binding site to predict the binding pose and fitness of a ligand, so the performance of SBVS is affected by the protein conformation. The most frequently used SBVS methods are protein-ligand docking programs, which use atomic distance-based scoring functions. These programs are therefore highly sensitive to variation in receptor structure, and conformational change is reported to significantly reduce docking performance. To address the problem, we have introduced a novel SBVS program named PL-PatchSurfer. This program makes use of molecular surface patches and the Zernike descriptor. The surfaces of the pocket and ligand are segmented into several patches. These patches are then mapped with physico-chemical properties such as shape and electrostatic potential before being converted into the Zernike descriptor, which is rotationally invariant. Complementarity between the protein and the ligand is assessed by comparing the descriptors and the geometric distribution of the patches in the molecules. A benchmarking study showed that PL-PatchSurfer2 was able to screen active molecules quickly regardless of changes in receptor structure. However, the program could not achieve high performance for targets in which hydrogen bonding is important, such as nuclear hormone receptors. In this paper, we present the newer version of PL-PatchSurfer, PL-PatchSurfer3, which incorporates two new features: a change in the definition of hydrogen bond complementarity and consideration of visibility, which contains curvature information of a patch. Our evaluation demonstrates that the new program outperforms its predecessor and other SBVS methods while retaining its characteristic tolerance to receptor structure changes. The program is available at kiharalab.org/plps3.
Affiliation(s)
- Woong-Hee Shin
- Department of Biomedical Informatics, Korea University College of Medicine, Seoul, Republic of Korea
- Daisuke Kihara
- Department of Biological Science, Purdue University, West Lafayette, IN, USA
- Department of Computer Science, Purdue University, West Lafayette, IN, USA
- Center for Cancer Research, Purdue University, West Lafayette, IN, USA
3
McCafferty CL, Klumpe S, Amaro RE, Kukulski W, Collinson L, Engel BD. Integrating cellular electron microscopy with multimodal data to explore biology across space and time. Cell 2024; 187:563-584. [PMID: 38306982 DOI: 10.1016/j.cell.2024.01.005] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/04/2023] [Revised: 01/03/2024] [Accepted: 01/03/2024] [Indexed: 02/04/2024]
Abstract
Biology spans a continuum of length and time scales. Individual experimental methods only glimpse discrete pieces of this spectrum but can be combined to construct a more holistic view. In this Review, we detail the latest advancements in volume electron microscopy (vEM) and cryo-electron tomography (cryo-ET), which together can visualize biological complexity across scales from the organization of cells in large tissues to the molecular details inside native cellular environments. In addition, we discuss emerging methodologies for integrating three-dimensional electron microscopy (3DEM) imaging with multimodal data, including fluorescence microscopy, mass spectrometry, single-particle analysis, and AI-based structure prediction. This multifaceted approach fills gaps in the biological continuum, providing functional context, spatial organization, molecular identity, and native interactions. We conclude with a perspective on incorporating diverse data into computational simulations that further bridge and extend length scales while integrating the dimension of time.
Affiliation(s)
- Sven Klumpe
- Research Group CryoEM Technology, Max-Planck-Institute of Biochemistry, Am Klopferspitz 18, 82152 Martinsried, Germany.
- Rommie E Amaro
- Department of Molecular Biology, University of California, San Diego, La Jolla, CA 92093, USA.
- Wanda Kukulski
- Institute of Biochemistry and Molecular Medicine, University of Bern, Bühlstrasse 28, 3012 Bern, Switzerland.
- Lucy Collinson
- Electron Microscopy Science Technology Platform, Francis Crick Institute, 1 Midland Road, London NW1 1AT, UK.
- Benjamin D Engel
- Biozentrum, University of Basel, Spitalstrasse 41, 4056 Basel, Switzerland.
4
Niemann M, Matern BM, Spierings E. Repeated local ellipsoid protrusion supplements HLA surface characterization. HLA 2024; 103:e15260. [PMID: 37853578 DOI: 10.1111/tan.15260] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/26/2023] [Revised: 08/11/2023] [Accepted: 10/05/2023] [Indexed: 10/20/2023]
Abstract
Allorecognition of donor HLA is a major risk factor for long-term kidney graft survival. Although several molecular matching algorithms have been proposed that compare physicochemical and structural features of the donors' and recipients' HLA proteins in order to predict their compatibility, the exact underlying mechanisms are still not fully understood. We hypothesized that the ElliPro approach of single ellipsoid fitting and protrusion ranking lacks sensitivity for the characteristic shape of HLA molecules, and developed a prediction pipeline named Snowball that iteratively fits smaller ellipsoids to substructures. Aggregated protrusion ranks of locally fitted ellipsoids were calculated for 712 publicly available HLA structures and 78 structures predicted using AlphaFold 2. Amino-acid sequences and protrusion ranks were used to train deep neural network predictors to infer protrusion ranks for all known HLA sequences. Snowball protrusion ranks appear to be more sensitive than ElliPro scores in fine parts of the HLA, such as the helix structures forming the HLA binding groove, in particular when the ellipsoids are fitted to substructures considering atoms within a 15 Å radius. A cloud-based web service was implemented based on amino-acid matching that considers both protein- and position-specific surface area and protrusion ranks, extending the previously presented Snowflake prediction pipeline.
Affiliation(s)
- Benedict M Matern
- Research and Development, PIRCHE AG, Berlin, Germany
- Center for Translational Immunology, University Medical Center, Utrecht, Netherlands
- Eric Spierings
- Center for Translational Immunology, University Medical Center, Utrecht, Netherlands
- Central Diagnostic Laboratory, University Medical Center, Utrecht, Netherlands
5
Banach M. Structural Outlier Detection and Zernike-Canterakis Moments for Molecular Surface Meshes-Fast Implementation in Python. Molecules 2023; 29:52. [PMID: 38202635 PMCID: PMC10779519 DOI: 10.3390/molecules29010052] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/23/2023] [Revised: 12/06/2023] [Accepted: 12/12/2023] [Indexed: 01/12/2024] [Open Access]
Abstract
Object retrieval systems measure the degree of similarity of the shape of 3D models. They search for the elements of 3D model databases that resemble the query model. In structural bioinformatics, the query model is a protein tertiary/quaternary structure and the objective is to find similarly shaped molecules in the Protein Data Bank (PDB). With the ever-growing size of the PDB, a direct atomic coordinate comparison with all its members is impractical. To overcome this problem, the shape of the molecules can be encoded by fixed-length feature vectors. The distance of a protein to the entire PDB can be measured in this low-dimensional domain in linear time. The state-of-the-art approaches utilize Zernike-Canterakis (ZC) moments for the shape encoding and supply the retrieval process with geometric data of the input structures. The BioZernike descriptors have been a standard utility of the PDB since 2020. However, when trying to calculate the ZC moments locally, one encounters a shortage of libraries readily available for use in custom programs (i.e., without relying on external binaries), in particular programs written in Python. Here, a fast and well-documented Python implementation of the Pozo-Koehl (PK) algorithm is presented. In contrast to the more popular algorithm by Novotni and Klein, which is based on the voxelized volume, the PK algorithm produces ZC moments directly from the triangular surface meshes of 3D models. In particular, it can accept the molecular surfaces of proteins as its input. In the presented PK-Zernike library, owing to Numba's just-in-time compilation, a mesh with 50,000 facets is processed by a single thread in a second at moment order 20. Since this is the first time the PK algorithm is used in structural bioinformatics, it is employed in a novel, simple, but efficient protein structure retrieval pipeline. The elimination of the outlying chain fragments via a fast PCA-based subroutine improves the discrimination ability, allowing this pipeline to achieve an area under the ROC curve of 0.961 in the BioZernike validation suite (0.997 for the assemblies). The correlation between the results of the proposed approach and of the 3D Surfer program attains values up to 0.99.
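The idea of measuring the distance of one protein to a whole database in linear time, using fixed-length feature vectors, can be sketched as follows. This is an illustrative example only: the function names, the toy four-component vectors, and the use of plain Euclidean distance are assumptions, and nothing here calls the PK-Zernike library or reproduces its API.

```python
import math


def descriptor_distance(a, b):
    """Euclidean distance between two equal-length descriptor vectors."""
    if len(a) != len(b):
        raise ValueError("descriptors must have the same length")
    return math.dist(a, b)


def rank_database(query, database):
    """Return (name, distance) pairs sorted from most to least similar.

    One distance is computed per database entry, so the distance scan
    is linear in the number of entries.
    """
    scored = [(name, descriptor_distance(query, vec))
              for name, vec in database.items()]
    return sorted(scored, key=lambda pair: pair[1])


# Hypothetical toy descriptors; real ZC descriptors are much longer.
toy_db = {
    "entry_a": [1.0, 0.0, 0.0, 0.0],
    "entry_b": [0.9, 0.1, 0.0, 0.0],
    "entry_c": [0.0, 0.0, 1.0, 1.0],
}
ranking = rank_database([1.0, 0.0, 0.0, 0.0], toy_db)
```

Because every structure is reduced to a vector of the same length, the cost of one comparison does not depend on the size of the protein, which is what makes a scan over a large archive practical.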
Affiliation(s)
- Mateusz Banach
- Department of Bioinformatics and Telemedicine, Faculty of Medicine, Jagiellonian University Medical College, Medyczna 7, 30-688 Kraków, Poland
6
Emonts J, Buyel J. An overview of descriptors to capture protein properties - Tools and perspectives in the context of QSAR modeling. Comput Struct Biotechnol J 2023; 21:3234-3247. [PMID: 38213891 PMCID: PMC10781719 DOI: 10.1016/j.csbj.2023.05.022] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 03/13/2023] [Revised: 05/23/2023] [Accepted: 05/23/2023] [Indexed: 01/13/2024] [Open Access]
Abstract
Proteins are important ingredients in food and feed, they are the active components of many pharmaceutical products, and they are necessary, in the form of enzymes, for the success of many technical processes. However, production can be challenging, especially when using heterologous host cells such as bacteria to express and assemble recombinant mammalian proteins. The manufacturability of proteins can be hindered by low solubility, a tendency to aggregate, or inefficient purification. Tools such as in silico protein engineering and models that predict separation criteria can overcome these issues but usually require the complex shape and surface properties of proteins to be represented by a small number of quantitative numeric values known as descriptors, as similarly used to capture the features of small molecules. Here, we review the current status of protein descriptors, especially for application in quantitative structure activity relationship (QSAR) models. First, we describe the complexity of proteins and the properties that descriptors must accommodate. Then we introduce descriptors of shape and surface properties that quantify the global and local features of proteins. Finally, we highlight the current limitations of protein descriptors and propose strategies for the derivation of novel protein descriptors that are more informative.
Affiliation(s)
- J. Emonts
- Fraunhofer Institute for Molecular Biology and Applied Ecology IME, Germany
- J.F. Buyel
- University of Natural Resources and Life Sciences, Vienna (BOKU), Department of Biotechnology (DBT), Institute of Bioprocess Science and Engineering (IBSE), Muthgasse 18, 1190 Vienna, Austria
- Institute for Molecular Biotechnology, Worringerweg 1, RWTH Aachen University, 52074 Aachen, Germany
7
Bruley A, Bitard-Feildel T, Callebaut I, Duprat E. A sequence-based foldability score combined with AlphaFold2 predictions to disentangle the protein order/disorder continuum. Proteins 2023; 91:466-484. [PMID: 36306150 DOI: 10.1002/prot.26441] [Citation(s) in RCA: 6] [Impact Index Per Article: 6.0] [Received: 08/01/2022] [Revised: 10/14/2022] [Accepted: 10/18/2022] [Indexed: 11/11/2022]
Abstract
Order and disorder govern protein functions, but there is a great diversity in disorder, from regions that are, and stay, fully disordered to conditional order. This diversity is still difficult to decipher even though it is encoded in the amino acid sequences. Here, we developed an analytic Python package, named pyHCA, to estimate the foldability of a protein segment from its amino acid sequence alone, based on a measure of its density in regular secondary structures associated with hydrophobic clusters, as defined by the hydrophobic cluster analysis (HCA) approach. The tool was designed by optimizing the separation between foldable segments from databases of disorder (DisProt) and order (SCOPe [soluble domains] and OPM [transmembrane domains]). It allows the ratio between order, embodied by regular secondary structures (either participating in the hydrophobic core of well-folded 3D structures or conditionally formed in intrinsically disordered regions), and disorder to be specified. We illustrated the relevance of pyHCA with several examples and applied it to the sequences of the proteomes of 21 species ranging from prokaryotes and archaea to unicellular and multicellular eukaryotes, for which structure models are provided in the AlphaFold protein structure database. Cases of low-confidence scores related to disorder were distinguished from those of sequences that we identified as foldable but that are still excluded from accurate modeling by AlphaFold2 due to a lack of sequence homologs or to compositional biases. Overall, our approach is complementary to AlphaFold2, providing guides to map structural innovations through evolutionary processes, at proteome and gene scales.
Affiliation(s)
- Apolline Bruley
- Sorbonne Université, Muséum National d'Histoire Naturelle, UMR CNRS 7590, Institut de Minéralogie, de Physique des Matériaux et de Cosmochimie, IMPMC, Paris, France
- Tristan Bitard-Feildel
- Sorbonne Université, Muséum National d'Histoire Naturelle, UMR CNRS 7590, Institut de Minéralogie, de Physique des Matériaux et de Cosmochimie, IMPMC, Paris, France
- Isabelle Callebaut
- Sorbonne Université, Muséum National d'Histoire Naturelle, UMR CNRS 7590, Institut de Minéralogie, de Physique des Matériaux et de Cosmochimie, IMPMC, Paris, France
- Elodie Duprat
- Sorbonne Université, Muséum National d'Histoire Naturelle, UMR CNRS 7590, Institut de Minéralogie, de Physique des Matériaux et de Cosmochimie, IMPMC, Paris, France
8
Shin WH, Kumazawa K, Imai K, Hirokawa T, Kihara D. Quantitative comparison of protein-protein interaction interface using physicochemical feature-based descriptors of surface patches. Front Mol Biosci 2023; 10:1110567. [PMID: 36814641 PMCID: PMC9939524 DOI: 10.3389/fmolb.2023.1110567] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/29/2022] [Accepted: 01/24/2023] [Indexed: 02/09/2023] [Open Access]
Abstract
Driving mechanisms of many biological functions in a cell include physical interactions of proteins. As protein-protein interactions (PPIs) are also important in disease development, PPIs have been highlighted in the pharmaceutical industry as possible therapeutic targets in recent years. To understand the variety of PPIs in a proteome, it is essential to establish a method that can identify similarity and dissimilarity between PPIs for inferring the binding of similar molecules, including drugs and other proteins. In this study, we developed a novel method, PPI-Surfer, which compares and quantifies similarity of local surface regions of PPIs. PPI-Surfer represents a PPI surface with overlapping surface patches, each of which is described with a three-dimensional Zernike descriptor (3DZD), a compact mathematical representation of a 3D function. 3DZD captures both the 3D shape and physicochemical properties of the protein surface. The performance of PPI-Surfer was benchmarked on datasets of PPIs, where we were able to show that PPI-Surfer finds similar potential drug binding regions that do not share sequence and structure similarity. PPI-Surfer is available at https://kiharalab.org/ppi-surfer.
Affiliation(s)
- Woong-Hee Shin
- Department of Chemistry Education, Sunchon National University, Suncheon, South Korea
- Department of Advanced Components and Materials Engineering, Sunchon National University, Suncheon, South Korea
- Keiko Kumazawa
- Pharmaceutical Discovery Research Laboratories, Teijin Pharma Limited, Tokyo, Japan
- Kenichiro Imai
- Cellular and Molecular Biotechnology Research Institute, National Institute of Advanced Industrial Science and Technology, Tokyo, Japan
- Takatsugu Hirokawa
- Division of Biomedical Science, Faculty of Medicine, University of Tsukuba, Tsukuba, Japan
- Transborder Medical Research Center, University of Tsukuba, Tsukuba, Japan
- Daisuke Kihara
- Department of Biological Sciences, Purdue University, West Lafayette, IN, United States
- Department of Computer Science, Purdue University, West Lafayette, IN, United States
- Center for Cancer Research, Purdue University, West Lafayette, IN, United States
- *Correspondence: Daisuke Kihara
9
De Lauro A, Di Rienzo L, Miotto M, Olimpieri PP, Milanetti E, Ruocco G. Shape Complementarity Optimization of Antibody–Antigen Interfaces: The Application to SARS-CoV-2 Spike Protein. Front Mol Biosci 2022; 9:874296. [PMID: 35669567 PMCID: PMC9163568 DOI: 10.3389/fmolb.2022.874296] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Received: 02/11/2022] [Accepted: 04/07/2022] [Indexed: 11/17/2022] [Open Access]
Abstract
Many factors influence biomolecule binding, and its assessment constitutes an elusive challenge in computational structural biology. In this respect, the evaluation of shape complementarity at molecular interfaces is one of the main factors to be considered. We focus on the particular case of antibody–antigen complexes to quantify the complementarities occurring at molecular interfaces. We relied on a method we recently developed, which employs the 2D Zernike descriptors, to characterize the investigated regions with an ordered set of numbers summarizing the local shape properties. Collecting a structural dataset of antibody–antigen complexes, we applied this method and statistically distinguished, in terms of shape complementarity, pairs of interacting regions from non-interacting ones. We then set up a novel computational strategy based on in silico mutagenesis of antibody-binding site residues. We developed a Monte Carlo procedure to increase the shape complementarity between the antibody paratope and a given epitope on a target protein surface. We applied our protocol against several molecular targets in the SARS-CoV-2 spike protein, known to be indispensable for viral cell invasion, and thereby optimized the shape of template antibodies for the interaction with such regions. As the last step of our procedure, we performed an independent molecular docking validation of the results of our Monte Carlo simulations.
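A Monte Carlo procedure that mutates residues to raise a score can be sketched with a generic Metropolis loop. Everything in this example is an assumption made for illustration: `toy_score` is a hypothetical stand-in that counts positions matching a made-up target string and is not the 2D Zernike complementarity measure, and the temperature, step count, and acceptance rule are not taken from the paper.

```python
import math
import random

ALPHABET = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard amino acids


def toy_score(sequence, target):
    """Hypothetical stand-in score: number of positions matching `target`."""
    return sum(1 for s, t in zip(sequence, target) if s == t)


def metropolis_optimize(start, target, steps=2000, temperature=0.5, seed=0):
    """Propose single-residue mutations and accept them by the Metropolis rule."""
    rng = random.Random(seed)
    current = list(start)
    current_score = toy_score(current, target)
    best, best_score = current[:], current_score
    for _ in range(steps):
        pos = rng.randrange(len(current))
        old = current[pos]
        current[pos] = rng.choice(ALPHABET)          # in silico point mutation
        new_score = toy_score(current, target)
        delta = new_score - current_score
        # Always accept improvements; accept worse moves with probability
        # exp(delta / temperature) so the search can leave local optima.
        if delta >= 0 or rng.random() < math.exp(delta / temperature):
            current_score = new_score
            if current_score > best_score:
                best, best_score = current[:], current_score
        else:
            current[pos] = old                       # reject: undo the mutation
    return "".join(best), best_score


optimized, optimized_score = metropolis_optimize("AAAAAAAA", "ACDEFGHI")
```

In a real protocol the score would be recomputed from the mutated structure's surface, which is far more expensive than this toy function, so the number of steps and the mutation proposal strategy become the main design choices.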
Affiliation(s)
- Lorenzo Di Rienzo
- Center for Life Nano & Neuro-Science, Istituto Italiano di Tecnologia, Rome, Italy
- *Correspondence: Lorenzo Di Rienzo
- Mattia Miotto
- Center for Life Nano & Neuro-Science, Istituto Italiano di Tecnologia, Rome, Italy
- Edoardo Milanetti
- Center for Life Nano & Neuro-Science, Istituto Italiano di Tecnologia, Rome, Italy
- Department of Physics, Sapienza University, Rome, Italy
- Giancarlo Ruocco
- Center for Life Nano & Neuro-Science, Istituto Italiano di Tecnologia, Rome, Italy
- Department of Physics, Sapienza University, Rome, Italy
10
Abstract
Proteins have dynamic structures that undergo chain motions on time scales spanning from picoseconds to seconds. Resolving the resultant conformational heterogeneity is essential for gaining accurate insight into fundamental mechanistic aspects of the protein folding reaction. The use of high-resolution structural probes, sensitive to population distributions, has begun to enable the resolution of site-specific conformational heterogeneity at different stages of the folding reaction. Different states populated during protein folding, including the unfolded state, collapsed intermediate states, and even the native state, are found to possess significant conformational heterogeneity. Heterogeneity in protein folding and unfolding reactions originates from the reduced cooperativity of various kinds of physicochemical interactions between various structural elements of a protein, and between a protein and solvent. Heterogeneity may arise because of functional or evolutionary constraints. Conformational substates within the unfolded state and the collapsed intermediates that exchange at rates slower than the subsequent folding steps give rise to heterogeneity on the protein folding pathways. Multiple folding pathways are likely to represent distinct sequences of structure formation. Insight into the nature of the energy barriers separating different conformational states populated during (un)folding can also be obtained by resolving heterogeneity.
Affiliation(s)
- Sandhya Bhatia
- National Centre for Biological Sciences, Tata Institute of Fundamental Research, Bengaluru 560065, India
- Indian Institute of Science Education and Research, Pune 411008, India
- Jayant B Udgaonkar
- National Centre for Biological Sciences, Tata Institute of Fundamental Research, Bengaluru 560065, India
- Indian Institute of Science Education and Research, Pune 411008, India
11
Karaca E, Prévost C, Sacquin-Mora S. Modeling the Dynamics of Protein–Protein Interfaces, How and Why? Molecules 2022; 27:1841. [PMID: 35335203 PMCID: PMC8950966 DOI: 10.3390/molecules27061841] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/29/2022] [Revised: 03/06/2022] [Accepted: 03/08/2022] [Indexed: 12/07/2022] [Open Access]
Abstract
Protein–protein assemblies act as a key component in numerous cellular processes. Their accurate modeling at the atomic level remains a challenge for structural biology. To address this challenge, several docking methodologies and a handful of deep learning methodologies focus on modeling protein–protein interfaces. Although the outcome of these methods has been assessed using static reference structures, more and more data point to the fact that interaction stability and specificity are encoded in the dynamics of these interfaces. Therefore, this dynamics information must be taken into account when modeling and assessing protein interactions at the atomistic scale. Expanding on this, our review first focuses on recent computational strategies that investigate protein–protein interfaces in a dynamic fashion using enhanced sampling, multi-scale modeling, and experimental data integration. Then, we discuss how interface dynamics report on the function of protein assemblies in globular complexes, in fuzzy complexes containing intrinsically disordered proteins, as well as in active complexes, where chemical reactions take place across the protein–protein interface.
Affiliation(s)
- Ezgi Karaca
- Izmir Biomedicine and Genome Center, Izmir 35340, Turkey
- Izmir International Biomedicine and Genome Institute, Dokuz Eylul University, Izmir 35340, Turkey
- Chantal Prévost
- CNRS, Laboratoire de Biochimie Théorique, UPR9080, Université de Paris, 13 rue Pierre et Marie Curie, 75005 Paris, France
- Institut de Biologie Physico-Chimique, Fondation Edmond de Rothschild, PSL Research University, 75006 Paris, France
- Sophie Sacquin-Mora
- CNRS, Laboratoire de Biochimie Théorique, UPR9080, Université de Paris, 13 rue Pierre et Marie Curie, 75005 Paris, France
- Institut de Biologie Physico-Chimique, Fondation Edmond de Rothschild, PSL Research University, 75006 Paris, France
- Correspondence:
12
Machat M, Langenfeld F, Craciun D, Sirugue L, Labib T, Lagarde N, Maria M, Montes M. Comparative evaluation of shape retrieval methods on macromolecular surfaces: an application of computer vision methods in structural bioinformatics. Bioinformatics 2021; 37:4375-4382. [PMID: 34247232 PMCID: PMC8652110 DOI: 10.1093/bioinformatics/btab511] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 09/29/2020] [Revised: 05/18/2021] [Accepted: 07/08/2021] [Indexed: 11/24/2022] [Open Access]
Abstract
MOTIVATION: The investigation of the structure of biological systems at the molecular level gives insights into their functions and dynamics. The shape and surface of biomolecules are fundamental to molecular recognition events. Characterizing their geometry can lead to more adequate predictions of their interactions. In the present work, we assess the performance of reference shape retrieval methods from the computer vision community on protein shapes. RESULTS: Shape retrieval methods are efficient in identifying orthologous proteins and tracking large conformational changes. This work illustrates the value of the protein surface shape as a higher-level representation of the protein structure that (i) abstracts the underlying protein sequence, structure or fold, (ii) allows the use of shape retrieval methods to screen large databases of protein structures to identify surficial homologs and possible interacting partners and (iii) opens an extension of the protein structure-function paradigm toward a protein structure-surface(s)-function paradigm. AVAILABILITY AND IMPLEMENTATION: All data are available online at http://datasetmachat.drugdesign.fr. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Mohamed Machat
- Laboratoire GBCM, EA 7528, Conservatoire National des Arts et Métiers, Hesam Université, Paris 75003, France
- Florent Langenfeld
- Laboratoire GBCM, EA 7528, Conservatoire National des Arts et Métiers, Hesam Université, Paris 75003, France
- Daniela Craciun
- Laboratoire GBCM, EA 7528, Conservatoire National des Arts et Métiers, Hesam Université, Paris 75003, France
- Léa Sirugue
- Laboratoire GBCM, EA 7528, Conservatoire National des Arts et Métiers, Hesam Université, Paris 75003, France
- Taoufik Labib
- Laboratoire GBCM, EA 7528, Conservatoire National des Arts et Métiers, Hesam Université, Paris 75003, France
- Nathalie Lagarde
- Laboratoire GBCM, EA 7528, Conservatoire National des Arts et Métiers, Hesam Université, Paris 75003, France
- Maxime Maria
- Laboratoire GBCM, EA 7528, Conservatoire National des Arts et Métiers, Hesam Université, Paris 75003, France
- Laboratoire XLIM, UMR CNRS 7252, Université de Limoges, Limoges 87000, France
- Matthieu Montes
- Laboratoire GBCM, EA 7528, Conservatoire National des Arts et Métiers, Hesam Université, Paris 75003, France
13
Ljung F, André I. ZEAL: protein structure alignment based on shape similarity. Bioinformatics 2021; 37:2874-2881. [PMID: 33772587 DOI: 10.1093/bioinformatics/btab205] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 09/11/2020] [Revised: 02/02/2021] [Accepted: 03/25/2021] [Indexed: 02/02/2023] [Open Access]
Abstract
MOTIVATION: Most protein-structure superimposition tools consider only Cartesian coordinates. Yet, much of biology happens on the surface of proteins, which is why proteins with shared ancestry and similar function often have comparable surface shapes. Superposition of proteins based on surface shape can enable comparison of highly divergent proteins, identify convergent evolution and enable detailed comparison of surface features and binding sites. RESULTS: We present ZEAL, an interactive tool to superpose global and local protein structures based on their shape resemblance using 3D (Zernike-Canterakis) functions to represent the molecular surface. In a benchmark study of structures with the same fold, we show that ZEAL outperforms two other methods for shape-based superposition. In addition, alignments from ZEAL were of comparable quality to the coordinate-based superpositions provided by TM-align. For comparisons of proteins with limited sequence and backbone-fold similarity, where coordinate-based methods typically fail, ZEAL can often find alignments with substantial surface-shape correspondence. In combination with shape-based matching, ZEAL can be used as a general tool to study relationships between shape and protein function. We identify several categories of protein functions where global shape similarity is significantly more likely than expected by random chance, when comparing proteins with little similarity on the fold level. In particular, we find that global surface shape similarity is particularly common among DNA binding proteins. AVAILABILITY AND IMPLEMENTATION: ZEAL can be used online at https://andrelab.org/zeal or as a standalone program with command line or graphical user interface. Source files and installers are available at https://github.com/Andre-lab/ZEAL. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Filip Ljung
- Division of Biochemistry and Structural Biology, Department of Chemistry, Lund University, Lund SE-22100, Sweden
| | - Ingemar André
- Division of Biochemistry and Structural Biology, Department of Chemistry, Lund University, Lund SE-22100, Sweden
| |
|
14
|
Yang Z, Liu M, Wang B, Wang B. Classification of protein domains based on their three-dimensional shapes (CPD3DS). Synth Syst Biotechnol 2021; 6:224-230. [PMID: 34541344 PMCID: PMC8429105 DOI: 10.1016/j.synbio.2021.08.003] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 06/28/2021] [Revised: 08/23/2021] [Accepted: 08/30/2021] [Indexed: 11/13/2022] Open
Abstract
Protein design has become a powerful method to expand the number of natural proteins and design customized proteins according to demands. Domain-based protein design spares the need to create novel elements from scratch, which makes it a more efficient strategy than scratch-based protein design in designing multi-domain proteins, protein complexes and biomaterials. As the surface shape plays a central role in domain-domain and protein-protein interactions, a global map of the surface shapes of all domains should be very beneficial for domain-based protein design. Therefore, in this study, we characterized the surface shapes of protein domains, collected from CATH and SCOP databases, with their 3D-Zernike descriptors (3DZDs). Then similarities of domain shape features were identified, and all domains were classified accordingly. The preferences of the combinations of domains between different clusters were analyzed in natural proteins from the Protein Data Bank. A user-friendly website, termed CPD3DS, was also developed for storage, retrieval, analyses and visualization of our results. This work not only provides an overall view of protein domain shapes by showing their variety and similarities, but also opens up a new avenue to understand the properties of protein structural domains, and design principles of protein architectures.
Affiliation(s)
- Zhaochang Yang
- School of Life Science and Technology, University of Electronic Science and Technology of China, China
| | - Mingkang Liu
- School of Life Science and Technology, University of Electronic Science and Technology of China, China
| | - Bin Wang
- School of Information and Software Engineering, University of Electronic Science and Technology of China, China
| | - Beibei Wang
- School of Life Science and Technology, University of Electronic Science and Technology of China, China
- Centre for Informational Biology, University of Electronic Science and Technology of China, 2006 Xiyuan Road, Chengdu, Sichuan, 611731, China
| |
|
15
|
|
16
|
Zhao B, Katuwawala A, Uversky VN, Kurgan L. IDPology of the living cell: intrinsic disorder in the subcellular compartments of the human cell. Cell Mol Life Sci 2021; 78:2371-2385. [PMID: 32997198 PMCID: PMC11071772 DOI: 10.1007/s00018-020-03654-0] [Citation(s) in RCA: 16] [Impact Index Per Article: 5.3] [Received: 06/08/2020] [Revised: 09/09/2020] [Accepted: 09/22/2020] [Indexed: 12/11/2022]
Abstract
Intrinsic disorder can be found in all proteomes of all kingdoms of life and in viruses, being particularly prevalent in the eukaryotes. We conduct a comprehensive analysis of the intrinsic disorder in the human proteins while mapping them into 24 compartments of the human cell. In agreement with previous studies, we show that human proteins are significantly enriched in disorder relative to a generic protein set that represents the protein universe. In fact, the fraction of proteins with long disordered regions and the average protein-level disorder content in the human proteome are about 3 times higher than in the protein universe. Furthermore, levels of intrinsic disorder in the majority of human subcellular compartments significantly exceed the average disorder content in the protein universe. Relative to the overall amount of disorder in the human proteome, proteins localized in the nucleus and cytoskeleton have significantly increased amounts of disorder, measured by both high disorder content and presence of multiple long intrinsically disordered regions. We empirically demonstrate that, on average, human proteins are assigned to 2.3 subcellular compartments, with proteins localized to few subcellular compartments being more disordered than the proteins that are localized to many compartments. Functionally, the disordered proteins localized in the most disorder-enriched subcellular compartments are primarily responsible for interactions with nucleic acids and protein partners. This is the first time disorder is comprehensively mapped into the human cell. Our observations add a missing piece to the puzzle of functional disorder and its organization inside the cell.
Affiliation(s)
- Bi Zhao
- Department of Computer Science, Virginia Commonwealth University, 401 West Main Street, Room E4225, Richmond, VA, 23284, USA
| | - Akila Katuwawala
- Department of Computer Science, Virginia Commonwealth University, 401 West Main Street, Room E4225, Richmond, VA, 23284, USA
| | - Vladimir N Uversky
- Department of Molecular Medicine, USF Health Byrd Alzheimer's Research Institute, Morsani College of Medicine, University of South Florida, 12901 Bruce B. Downs Blvd. MDC07, Tampa, FL, 33612, USA.
- Laboratory of New Methods in Biology, Institute for Biological Instrumentation of the Russian Academy of Sciences, Federal Research Center "Pushchino Scientific Center for Biological Research of the Russian Academy of Sciences", Pushchino, Russia.
| | - Lukasz Kurgan
- Department of Computer Science, Virginia Commonwealth University, 401 West Main Street, Room E4225, Richmond, VA, 23284, USA.
| |
|
17
|
Landreh M, Sahin C, Gault J, Sadeghi S, Drum CL, Uzdavinys P, Drew D, Allison TM, Degiacomi MT, Marklund EG. Predicting the Shapes of Protein Complexes through Collision Cross Section Measurements and Database Searches. Anal Chem 2020; 92:12297-12303. [PMID: 32660238 DOI: 10.1021/acs.analchem.0c01940] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.5] [Indexed: 02/02/2023]
Abstract
In structural biology, collision cross sections (CCSs) from ion mobility mass spectrometry (IM-MS) measurements are routinely compared to computationally or experimentally derived protein structures. Here, we investigate whether CCS data can inform about the shape of a protein in the absence of specific reference structures. Analysis of the proteins in the CCS database shows that protein complexes with low apparent densities are structurally more diverse than those with a high apparent density. Although assigning protein shapes purely on CCS data is not possible, we find that we can distinguish oblate- and prolate-shaped protein complexes by using the CCS, molecular weight, and oligomeric states to mine the Protein Data Bank (PDB) for potentially similar protein structures. Furthermore, comparing the CCS of a ferritin cage to the solution structures in the PDB reveals significant deviations caused by structural collapse in the gas phase. We then apply the strategy to an integral membrane protein by comparing the shapes of a prokaryotic and a eukaryotic sodium/proton antiporter homologue. We conclude that mining the PDB with IM-MS data is a time-effective way to derive low-resolution structural models.
Affiliation(s)
- Michael Landreh
- Department of Microbiology, Tumor and Cell Biology, Karolinska Institutet, Solnavägen 9, 171 65, Stockholm, Sweden
| | - Cagla Sahin
- Department of Microbiology, Tumor and Cell Biology, Karolinska Institutet, Solnavägen 9, 171 65, Stockholm, Sweden
- Department of Biology, University of Copenhagen, Ole Maaløes Vej 5, 2200 Copenhagen N, Denmark
| | - Joseph Gault
- Department of Chemistry, University of Oxford, South Parks Road, Oxford OX1 3QZ, United Kingdom
| | - Samira Sadeghi
- Department of Medicine, Yong Loo Lin School of Medicine, National University of Singapore, 10 Medical Dr, Singapore 119228, Singapore
| | - Chester L Drum
- Department of Medicine, Yong Loo Lin School of Medicine, National University of Singapore, 10 Medical Dr, Singapore 119228, Singapore
| | - Povilas Uzdavinys
- Department of Biochemistry and Biophysics, Stockholm University, Stockholm 114 19, Sweden
| | - David Drew
- Department of Biochemistry and Biophysics, Stockholm University, Stockholm 114 19, Sweden
| | - Timothy M Allison
- Biomolecular Interaction Centre and School of Physical and Chemical Sciences, University of Canterbury, Private Bag 4800, Christchurch 8140, New Zealand
| | - Matteo T Degiacomi
- Department of Physics, Durham University, South Road, Durham DH1 3LE, United Kingdom
| | - Erik G Marklund
- Department of Chemistry - BMC, Uppsala University, Box 576, Uppsala 751 23, Sweden
| |
|
18
|
McCafferty CL, Verbeke EJ, Marcotte EM, Taylor DW. Structural Biology in the Multi-Omics Era. J Chem Inf Model 2020; 60:2424-2429. [PMID: 32129623 PMCID: PMC7254829 DOI: 10.1021/acs.jcim.9b01164] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.0] [Received: 12/17/2019] [Indexed: 12/12/2022]
Abstract
Rapid developments in cryogenic electron microscopy have opened new avenues to probe the structures of protein assemblies in their near-native states. Recent studies have begun applying single-particle analysis to heterogeneous mixtures, revealing the potential of structural-omics approaches that combine the power of mass spectrometry and electron microscopy. Here we highlight advances and challenges in sample preparation, data processing, and molecular modeling for handling increasingly complex mixtures. Such advances will help structural-omics methods extend to cellular-level models of structural biology.
Affiliation(s)
- Caitlyn L. McCafferty
- Department of Molecular Biosciences, University of Texas at Austin, Austin, Texas 78712, United States
| | - Eric J. Verbeke
- Department of Molecular Biosciences, University of Texas at Austin, Austin, Texas 78712, United States
| | - Edward M. Marcotte
- Department of Molecular Biosciences, University of Texas at Austin, Austin, Texas 78712, United States
- Institute for Cellular and Molecular Biology, University of Texas at Austin, Austin, Texas 78712, United States
- Center for Systems and Synthetic Biology, University of Texas at Austin, Austin, Texas 78712, United States
| | - David W. Taylor
- Department of Molecular Biosciences, University of Texas at Austin, Austin, Texas 78712, United States
- Institute for Cellular and Molecular Biology, University of Texas at Austin, Austin, Texas 78712, United States
- Center for Systems and Synthetic Biology, University of Texas at Austin, Austin, Texas 78712, United States
- LIVESTRONG Cancer Institutes, Dell Medical School, Austin, Texas 78712, United States
| |
|
19
|
DeVille JS, Kihara D, Sit A. 2DKD: a toolkit for content-based local image search. Source Code for Biology and Medicine 2020; 15:1. [PMID: 32064000 PMCID: PMC7011505 DOI: 10.1186/s13029-020-0077-1] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 06/25/2019] [Accepted: 01/14/2020] [Indexed: 11/10/2022]
Abstract
BACKGROUND: Direct comparison of 2D images is computationally inefficient due to the need for translation, rotation, and scaling of the images to evaluate their similarity. In many biological applications, such as digital pathology and cryo-EM, identifying specific local regions of images is often of particular interest. Therefore, finding invariant descriptors that can efficiently retrieve local image patches or subimages becomes necessary. RESULTS: We present a software package called Two-Dimensional Krawtchouk Descriptors that allows users to perform local subimage search in 2D images. The new toolkit uses only a small number of invariant descriptors per image for efficient local image retrieval. This enables querying an image and comparing similar patterns locally across a potentially large database. We show that these descriptors appear to be useful for searching local patterns or small particles in images and demonstrate some test cases that can be helpful for both assembly software developers and their users. CONCLUSIONS: Local image comparison and subimage search can prove cumbersome in both computational complexity and runtime, due to factors such as the rotation, scaling, and translation of the object in question. By using the 2DKD toolkit, relatively few descriptors are developed to describe a given image, and this can be achieved with minimal memory usage.
Affiliation(s)
- Julian S. DeVille
- Department of Mathematics and Statistics, Eastern Kentucky University, 521 Lancaster Ave., Richmond, 40475 KY USA
| | - Daisuke Kihara
- Department of Biology Sciences, Purdue University, 249 S Martin Jischke Dr, West Lafayette, 47907 IN USA
- Department of Computer Science, Purdue University, 305 N University Street, West Lafayette, 47907 IN USA
- Department of Pediatrics, Cincinnati Children’s Hospital Medical Care, University of Cincinnati, Cincinnati, 45229 OH USA
| | - Atilla Sit
- Department of Mathematics and Statistics, Eastern Kentucky University, 521 Lancaster Ave., Richmond, 40475 KY USA
| |
|
20
|
Trisolini L, Gambacorta N, Gorgoglione R, Montaruli M, Laera L, Colella F, Volpicella M, De Grassi A, Pierri CL. FAD/NADH Dependent Oxidoreductases: From Different Amino Acid Sequences to Similar Protein Shapes for Playing an Ancient Function. J Clin Med 2019; 8:jcm8122117. [PMID: 31810296 PMCID: PMC6947548 DOI: 10.3390/jcm8122117] [Citation(s) in RCA: 29] [Impact Index Per Article: 5.8] [Received: 10/11/2019] [Revised: 11/11/2019] [Accepted: 11/18/2019] [Indexed: 12/29/2022] Open
Abstract
Flavoprotein oxidoreductases are members of a large protein family of specialized dehydrogenases, which include type II NADH dehydrogenase, pyridine nucleotide-disulphide oxidoreductases, ferredoxin-NAD+ reductases, NADH oxidases, and NADH peroxidases, playing a crucial role in the metabolism of several prokaryotes and eukaryotes. Although several studies have been performed on single members or protein subgroups of flavoprotein oxidoreductases, a comprehensive analysis on structure-function relationships among the different members and subgroups of this great dehydrogenase family is still missing. Here, we present a structural comparative analysis showing that the investigated flavoprotein oxidoreductases have a highly similar overall structure, although the investigated dehydrogenases are quite different in functional annotations and global amino acid composition. The different functional annotation is ascribed to their participation in species-specific metabolic pathways based on the same biochemical reaction, i.e., the oxidation of specific cofactors, like NADH and FADH2. Notably, the performed comparative analysis sheds light on conserved sequence features that reflect very similar oxidation mechanisms, conserved among flavoprotein oxidoreductases belonging to phylogenetically distant species, as the bacterial type II NADH dehydrogenases and the mammalian apoptosis-inducing factor protein, until now retained as unique protein entities in Bacteria/Fungi or Animals, respectively. Furthermore, the presented computational analyses will allow consideration of FAD/NADH oxidoreductases as a possible target of new small molecules to be used as modulators of mitochondrial respiration for patients affected by rare diseases or cancer showing mitochondrial dysfunction, or antibiotics for treating bacterial/fungal/protista infections.
Affiliation(s)
- Anna De Grassi
- Correspondence: (A.D.G.); or (C.L.P.); Tel.: +39-080-544-3614 (A.D.G. & C.L.P.); Fax: +39-080-544-2770 (A.D.G. & C.L.P.)
| | - Ciro Leonardo Pierri
- Correspondence: (A.D.G.); or (C.L.P.); Tel.: +39-080-544-3614 (A.D.G. & C.L.P.); Fax: +39-080-544-2770 (A.D.G. & C.L.P.)
| |
|
21
|
Shin WH, Kihara D. Predicting binding poses and affinity ranking in D3R Grand Challenge using PL-PatchSurfer2.0. J Comput Aided Mol Des 2019; 33:1083-1094. [PMID: 31506789 DOI: 10.1007/s10822-019-00222-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/12/2019] [Accepted: 08/28/2019] [Indexed: 10/26/2022]
Abstract
Computational prediction of protein-ligand interactions is a useful approach that aids the drug discovery process. Two major tasks of computational approaches are to predict the docking pose of a compound in a known binding pocket and to rank compounds in a library according to their predicted binding affinities. There are many computational tools developed in the past decades both in academia and industry. To objectively assess the performance of existing tools, the community has held a blind assessment of computational predictions, the Drug Design Data Resource Grand Challenge. This round, Grand Challenge 4 (GC4), focused on two targets, protein beta-secretase 1 (BACE-1) and cathepsin S (CatS). We participated in GC4 in both BACE-1 and CatS challenges using our molecular surface-based virtual screening method, PL-PatchSurfer2.0. A unique feature of PL-PatchSurfer2.0 is that it uses the three-dimensional Zernike descriptor, a mathematical moment-based shape descriptor, to quantify local shape complementarity between a ligand and a receptor, which properly incorporates molecular flexibility and provides stable affinity assessment for a bound ligand-receptor complex. Since PL-PatchSurfer2.0 does not explicitly build a bound pose of a ligand, we used an external docking program, such as AutoDock Vina, to provide an ensemble of poses, which were then evaluated by PL-PatchSurfer2.0. Here, we provide an overview of our method and report the performance in GC4.
Affiliation(s)
- Woong-Hee Shin
- Department of Biological Science, Purdue University, West Lafayette, IN, 47907, USA
- Department of Chemistry Education, Sunchon National University, Suncheon, 57922, Republic of Korea
| | - Daisuke Kihara
- Department of Biological Science, Purdue University, West Lafayette, IN, 47907, USA
- Department of Computer Science, Purdue University, West Lafayette, IN, 47907, USA
- Purdue University Center for Cancer Research, Purdue University, West Lafayette, IN, 47907, USA
- Department of Pediatrics, University of Cincinnati, Cincinnati, OH, 45229, USA
| |
|