1
Lai JS, Burley SK, Duarte JM. ZMPY3D: accelerating protein structure volume analysis through vectorized 3D Zernike moments and Python-based GPU integration. BIOINFORMATICS ADVANCES 2024; 4:vbae111. [PMID: 39100546 PMCID: PMC11297494 DOI: 10.1093/bioadv/vbae111] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/14/2024] [Revised: 07/12/2024] [Accepted: 07/25/2024] [Indexed: 08/06/2024]
Abstract
Motivation: Volumetric 3D object analyses are being applied in research fields such as structural bioinformatics, biophysics, and structural biology, with potential integration of artificial intelligence/machine learning (AI/ML) techniques. One such method, 3D Zernike moments, has proven valuable in analyzing protein structures (e.g., protein fold classification, protein-protein interaction analysis, and molecular dynamics simulations). Their compactness and efficiency make them amenable to large-scale analyses. Established methods for deriving 3D Zernike moments, however, can be inefficient, particularly when higher order terms are required, hindering broader applications. As the volume of experimental and computationally-predicted protein structure information continues to increase, structural biology has become a "big data" science requiring more efficient analysis tools.
Results: This application note presents a Python-based software package, ZMPY3D, to accelerate computation of 3D Zernike moments by vectorizing the mathematical formulae and using graphics processing units (GPUs). The package offers popular GPU-supported libraries such as CuPy and TensorFlow together with NumPy implementations, aiming to improve computational efficiency, adaptability, and flexibility in future algorithm development. The ZMPY3D package can be installed via PyPI, and the source code is available from GitHub. Volumetric-based protein 3D structural similarity scores and superposition transform matrices have both been implemented, creating a powerful computational tool that will allow the research community to amalgamate 3D Zernike moments with existing AI/ML tools, to advance research and education in protein structure bioinformatics.
Availability and implementation: ZMPY3D, implemented in Python, is available on GitHub (https://github.com/tawssie/ZMPY3D) and PyPI, released under the GPL License.
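The vectorization idea mentioned in the abstract can be illustrated with a minimal sketch. It is not ZMPY3D's API: the function names, the scaling of the grid to [-1, 1], and the use of geometric moments (from which 3D Zernike moments can be assembled as linear combinations) are assumptions made for illustration. The first function loops over moment orders in Python; the second obtains the same array with a single `numpy.einsum` contraction.

```python
import numpy as np


def _axis_coords(n):
    # Grid coordinates scaled to [-1, 1] along one axis (illustrative choice).
    return np.linspace(-1.0, 1.0, n)


def geometric_moments_loop(voxels, max_order):
    """Reference version: one Python-level iteration per (p, q, r) triple."""
    nx, ny, nz = voxels.shape
    x = _axis_coords(nx)[:, None, None]
    y = _axis_coords(ny)[None, :, None]
    z = _axis_coords(nz)[None, None, :]
    n = max_order + 1
    moments = np.zeros((n, n, n))
    for p in range(n):
        for q in range(n):
            for r in range(n):
                if p + q + r <= max_order:
                    moments[p, q, r] = np.sum(voxels * x**p * y**q * z**r)
    return moments


def geometric_moments_vectorized(voxels, max_order):
    """Vectorized version: all orders computed in one tensor contraction."""
    nx, ny, nz = voxels.shape
    powers = np.arange(max_order + 1)
    # Each table has shape (max_order + 1, axis_length): coordinate ** power.
    xp = _axis_coords(nx)[None, :] ** powers[:, None]
    yp = _axis_coords(ny)[None, :] ** powers[:, None]
    zp = _axis_coords(nz)[None, :] ** powers[:, None]
    moments = np.einsum("pi,qj,rk,ijk->pqr", xp, yp, zp, voxels, optimize=True)
    # Discard orders above max_order so both versions return the same array.
    p, q, r = np.meshgrid(powers, powers, powers, indexing="ij")
    moments[p + q + r > max_order] = 0.0
    return moments
```

Because the contraction is expressed as array operations, the same pattern could in principle be moved to a GPU array library with a NumPy-like interface; whether and how ZMPY3D does this is not described in the abstract.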
Affiliation(s)
- Jhih-Siang Lai
  - Research Collaboratory for Structural Bioinformatics Protein Data Bank, San Diego Supercomputer Center, University of California, La Jolla, CA 92093, United States
- Stephen K Burley
  - Research Collaboratory for Structural Bioinformatics Protein Data Bank, San Diego Supercomputer Center, University of California, La Jolla, CA 92093, United States
  - Research Collaboratory for Structural Bioinformatics Protein Data Bank, Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, United States
  - Department of Chemistry and Chemical Biology, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, United States
  - Cancer Institute of New Jersey, Rutgers, The State University of New Jersey, New Brunswick, NJ 08901, United States
- Jose M Duarte
  - Research Collaboratory for Structural Bioinformatics Protein Data Bank, San Diego Supercomputer Center, University of California, La Jolla, CA 92093, United States
2
Qi J, Feng C, Shi Y, Yang J, Zhang F, Li G, Han R. FP-Zernike: An Open-source Structural Database Construction Toolkit for Fast Structure Retrieval. GENOMICS, PROTEOMICS & BIOINFORMATICS 2024; 22:qzae007. [PMID: 38894604 DOI: 10.1093/gpbjnl/qzae007] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/23/2022] [Revised: 08/16/2023] [Accepted: 09/20/2023] [Indexed: 06/21/2024]
Abstract
The release of AlphaFold2 has sparked a rapid expansion in protein model databases. Efficient protein structure retrieval is crucial for the analysis of structure models, while measuring the similarity between structures is the key challenge in structural retrieval. Although existing structure alignment algorithms can address this challenge, they are often time-consuming. Currently, the state-of-the-art approach involves converting protein structures into three-dimensional (3D) Zernike descriptors and assessing similarity using Euclidean distance. However, the methods for computing 3D Zernike descriptors mainly rely on structural surfaces and are predominantly web-based, thus limiting their application in studying custom datasets. To overcome this limitation, we developed FP-Zernike, a user-friendly toolkit for computing different types of Zernike descriptors based on feature points. Users simply need to enter a single line of command to calculate the Zernike descriptors of all structures in customized datasets. FP-Zernike outperforms the leading method in terms of retrieval accuracy and binary classification accuracy across diverse benchmark datasets. In addition, we showed the application of FP-Zernike in the construction of the descriptor database and the protocol used for the Protein Data Bank (PDB) dataset to facilitate the local deployment of this tool for interested readers. Our demonstration contained 590,685 structures, and at this scale, our system required only 4-9 s to complete a retrieval. The experiments confirmed that it achieved the state-of-the-art accuracy level. FP-Zernike is an open-source toolkit, with the source code and related data accessible at https://ngdc.cncb.ac.cn/biocode/tools/BT007365/releases/0.1, as well as through a webserver at http://www.structbioinfo.cn/.
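The retrieval step described in the abstract, comparing descriptor vectors by Euclidean distance, can be sketched as follows. This is an illustrative brute-force search, not FP-Zernike's code: the function name `retrieve_nearest`, the array layout (one descriptor per row), and the parameter `k` are assumptions.

```python
import numpy as np


def retrieve_nearest(database, query, k=5):
    """Return indices and distances of the k descriptors closest to `query`.

    `database` holds one descriptor vector per row; `query` is one vector
    of the same length. Both are hypothetical inputs for illustration.
    """
    database = np.asarray(database, dtype=float)
    query = np.asarray(query, dtype=float)
    # Euclidean distance from the query to every stored descriptor.
    distances = np.linalg.norm(database - query, axis=1)
    k = min(k, len(distances))
    # Stable sort keeps the original order among equally distant entries.
    order = np.argsort(distances, kind="stable")[:k]
    return order, distances[order]
```

A linear scan like this touches every stored descriptor per query; the abstract does not say which search strategy the toolkit itself uses at the 590,685-structure scale.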
Affiliation(s)
- Junhai Qi
  - Research Center for Mathematics and Interdisciplinary Sciences, Shandong University, Qingdao 266237, China
  - BioMap Research, Menlo Park, CA 94025, USA
- Chenjie Feng
  - Research Center for Mathematics and Interdisciplinary Sciences, Shandong University, Qingdao 266237, China
  - College of Medical Information and Engineering, Ningxia Medical University, Yinchuan 750004, China
- Yulin Shi
  - Research Center for Mathematics and Interdisciplinary Sciences, Shandong University, Qingdao 266237, China
- Jianyi Yang
  - Research Center for Mathematics and Interdisciplinary Sciences, Shandong University, Qingdao 266237, China
- Fa Zhang
  - Institute of Engineering Medicine, Beijing Institute of Technology, Beijing 100081, China
- Guojun Li
  - Research Center for Mathematics and Interdisciplinary Sciences, Shandong University, Qingdao 266237, China
- Renmin Han
  - Research Center for Mathematics and Interdisciplinary Sciences, Shandong University, Qingdao 266237, China
3
Parisi G, Piacentini R, Incocciati A, Bonamore A, Macone A, Rupert J, Zacco E, Miotto M, Milanetti E, Tartaglia GG, Ruocco G, Boffi A, Di Rienzo L. Design of protein-binding peptides with controlled binding affinity: the case of SARS-CoV-2 receptor binding domain and angiotensin-converting enzyme 2 derived peptides. Front Mol Biosci 2024; 10:1332359. [PMID: 38250735 PMCID: PMC10797010 DOI: 10.3389/fmolb.2023.1332359] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/02/2023] [Accepted: 12/14/2023] [Indexed: 01/23/2024]
Abstract
The development of methods able to modulate the binding affinity between proteins and peptides is of paramount biotechnological interest in view of a vast range of applications that involve designed polypeptides capable of impairing or favouring protein-protein interactions. Here, we applied a peptide design algorithm based on shape complementarity optimization and electrostatic compatibility and provided the first experimental in vitro proof of the efficacy of the design algorithm. Focusing on the interaction between the SARS-CoV-2 Spike Receptor-Binding Domain (RBD) and the human angiotensin-converting enzyme 2 (ACE2) receptor, we extracted a 23-residue-long peptide that structurally mimics the major interacting portion of the ACE2 receptor and designed in silico five mutants of this peptide with modulated affinity. Remarkably, experimental KD measurements, conducted using biolayer interferometry, matched the in silico predictions. Moreover, we investigated the molecular determinants that govern the variation in binding affinity through molecular dynamics simulation, identifying the mechanisms driving the different values of binding affinity at a single-residue level. Finally, the peptide sequence with the highest affinity, in comparison with the wild-type peptide, was expressed as a fusion protein with human H ferritin (HFt) 24-mer. Solution measurements performed on the latter constructs confirmed that peptides still exhibited the expected trend, thereby enhancing their efficacy in RBD binding. Altogether, these results indicate the high potential of this general method in developing potent high-affinity vectors for hindering/enhancing protein-protein associations.
Affiliation(s)
- Giacomo Parisi
  - Department of Basic and Applied Sciences for Engineering (SBAI), Università “Sapienza”, Roma, Italy
- Roberta Piacentini
  - Department of Biochemical Sciences “Alessandro Rossi Fanelli”, Università “Sapienza”, Roma, Italy
- Alessio Incocciati
  - Department of Biochemical Sciences “Alessandro Rossi Fanelli”, Università “Sapienza”, Roma, Italy
- Alessandra Bonamore
  - Department of Biochemical Sciences “Alessandro Rossi Fanelli”, Università “Sapienza”, Roma, Italy
- Alberto Macone
  - Department of Biochemical Sciences “Alessandro Rossi Fanelli”, Università “Sapienza”, Roma, Italy
- Jakob Rupert
  - Department of Biology and Biotechnologies “Charles Darwin”, Università “Sapienza”, Roma, Italy
  - Centre for Human Technologies (CHT), Istituto Italiano di Tecnologia (IIT), Genova, Italy
- Elsa Zacco
  - Centre for Human Technologies (CHT), Istituto Italiano di Tecnologia (IIT), Genova, Italy
- Mattia Miotto
  - Center for Life Nano and Neuro Science, Istituto Italiano di Tecnologia (IIT), Roma, Italy
- Edoardo Milanetti
  - Center for Life Nano and Neuro Science, Istituto Italiano di Tecnologia (IIT), Roma, Italy
  - Department of Physics, Università “Sapienza”, Roma, Italy
- Gian Gaetano Tartaglia
  - Department of Biology and Biotechnologies “Charles Darwin”, Università “Sapienza”, Roma, Italy
  - Centre for Human Technologies (CHT), Istituto Italiano di Tecnologia (IIT), Genova, Italy
- Giancarlo Ruocco
  - Center for Life Nano and Neuro Science, Istituto Italiano di Tecnologia (IIT), Roma, Italy
  - Department of Physics, Università “Sapienza”, Roma, Italy
- Alberto Boffi
  - Department of Biochemical Sciences “Alessandro Rossi Fanelli”, Università “Sapienza”, Roma, Italy
- Lorenzo Di Rienzo
  - Center for Life Nano and Neuro Science, Istituto Italiano di Tecnologia (IIT), Roma, Italy
4
Emonts J, Buyel J. An overview of descriptors to capture protein properties - Tools and perspectives in the context of QSAR modeling. Comput Struct Biotechnol J 2023; 21:3234-3247. [PMID: 38213891 PMCID: PMC10781719 DOI: 10.1016/j.csbj.2023.05.022] [Citation(s) in RCA: 4] [Impact Index Per Article: 4.0] [Received: 03/13/2023] [Revised: 05/23/2023] [Accepted: 05/23/2023] [Indexed: 01/13/2024]
Abstract
Proteins are important ingredients in food and feed, they are the active components of many pharmaceutical products, and they are necessary, in the form of enzymes, for the success of many technical processes. However, production can be challenging, especially when using heterologous host cells such as bacteria to express and assemble recombinant mammalian proteins. The manufacturability of proteins can be hindered by low solubility, a tendency to aggregate, or inefficient purification. Tools such as in silico protein engineering and models that predict separation criteria can overcome these issues but usually require the complex shape and surface properties of proteins to be represented by a small number of quantitative numeric values known as descriptors, as similarly used to capture the features of small molecules. Here, we review the current status of protein descriptors, especially for application in quantitative structure activity relationship (QSAR) models. First, we describe the complexity of proteins and the properties that descriptors must accommodate. Then we introduce descriptors of shape and surface properties that quantify the global and local features of proteins. Finally, we highlight the current limitations of protein descriptors and propose strategies for the derivation of novel protein descriptors that are more informative.
Affiliation(s)
- J. Emonts
  - Fraunhofer Institute for Molecular Biology and Applied Ecology IME, Germany
- J.F. Buyel
  - University of Natural Resources and Life Sciences, Vienna (BOKU), Department of Biotechnology (DBT), Institute of Bioprocess Science and Engineering (IBSE), Muthgasse 18, 1190 Vienna, Austria
  - Institute for Molecular Biotechnology, Worringerweg 1, RWTH Aachen University, 52074 Aachen, Germany
5
Di Rienzo L, Miotto M, Milanetti E, Ruocco G. Computational structural-based GPCR optimization for user-defined ligand: Implications for the development of biosensors. Comput Struct Biotechnol J 2023; 21:3002-3009. [PMID: 37249971 PMCID: PMC10220229 DOI: 10.1016/j.csbj.2023.05.004] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 01/03/2023] [Revised: 04/17/2023] [Accepted: 05/04/2023] [Indexed: 05/31/2023]
Abstract
Organisms have developed effective mechanisms to sense the external environment. Human-designed biosensors exploit this natural optimization, adapting different biological machineries to detect the presence of user-defined molecules. Specifically, the pheromone pathway in the model organism Saccharomyces cerevisiae represents a suitable candidate as a synthetic signaling system. Indeed, it expresses just one G-Protein Coupled Receptor (GPCR), Ste2, able to recognize pheromone and initiate the expression of pheromone-dependent genes. To date, the standard procedure to engineer this system relies on the substitution of the yeast GPCR with another one and on the modification of the yeast G-protein to bind the inserted receptor. Here, we propose an innovative computational procedure, based on geometrical and chemical optimization of protein binding pockets, to select the amino acid substitutions required to make the native yeast GPCR able to recognize a user-defined ligand. This procedure would allow the yeast to recognize a wide range of ligands, without a priori knowledge about a GPCR recognizing them or the corresponding G protein. We used Monte Carlo simulations to design on Ste2 a binding pocket able to recognize epinephrine, selected as a test ligand. We validated Ste2 mutants via molecular docking and molecular dynamics. We verified that the amino acid substitutions we identified make Ste2 able to accommodate and remain firmly bound to epinephrine. Our results indicate that we efficiently sampled the huge space of possible mutants, suggesting that this strategy is a promising starting point for the development of a new kind of S. cerevisiae-based biosensor.
Affiliation(s)
- Lorenzo Di Rienzo
  - Center for Life Nano- & Neuro-Science, Istituto Italiano di Tecnologia, Viale Regina Elena 291, 00161 Rome, Italy
- Mattia Miotto
  - Center for Life Nano- & Neuro-Science, Istituto Italiano di Tecnologia, Viale Regina Elena 291, 00161 Rome, Italy
- Edoardo Milanetti
  - Center for Life Nano- & Neuro-Science, Istituto Italiano di Tecnologia, Viale Regina Elena 291, 00161 Rome, Italy
  - Department of Physics, Sapienza University of Rome, Piazzale Aldo Moro 5, 00185 Rome, Italy
- Giancarlo Ruocco
  - Center for Life Nano- & Neuro-Science, Istituto Italiano di Tecnologia, Viale Regina Elena 291, 00161 Rome, Italy
  - Department of Physics, Sapienza University of Rome, Piazzale Aldo Moro 5, 00185 Rome, Italy
6
Milanetti E, Miotto M, Bo' L, Di Rienzo L, Ruocco G. Investigating the competition between ACE2 natural molecular interactors and SARS-CoV-2 candidate inhibitors. Chem Biol Interact 2023; 374:110380. [PMID: 36822303 PMCID: PMC9942480 DOI: 10.1016/j.cbi.2023.110380] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/20/2022] [Revised: 01/22/2023] [Accepted: 02/01/2023] [Indexed: 02/23/2023]
Abstract
The SARS-CoV-2 pandemic still poses a threat to global health as the virus continues spreading in most countries. Therefore, the identification of molecules capable of inhibiting the binding between the ACE2 receptor and the SARS-CoV-2 spike protein is of paramount importance. Recently, two DNA aptamers were designed with the aim of inhibiting the interaction between the ACE2 receptor and the spike protein of SARS-CoV-2. Indeed, the two molecules interact with the ACE2 receptor in the region around the K353 residue, preventing its binding to the spike protein. While this inhibition process hinders the entry of the virus into the host cell, it could also lead to a series of side effects, in both physiological and pathological conditions, by preventing the correct functioning of the ACE2 receptor. Here, we discuss through a computational study the possible effect of these two very promising DNA aptamers, investigating all possible interactions between ACE2 and its experimentally known molecular partners. Our in silico predictions show that some of the 10 known molecular partners of ACE2 could interact, physiologically or pathologically, in a region adjacent to the K353 residue. Thus, the curative action of the proposed DNA aptamers could divert ACE2 from its biological functions.
Affiliation(s)
- Edoardo Milanetti
  - Department of Physics, Sapienza University, Piazzale Aldo Moro 5, 00185, Rome, Italy; Center for Life Nanoscience, Istituto Italiano di Tecnologia, Viale Regina Elena 291, 00161, Rome, Italy
- Mattia Miotto
  - Center for Life Nanoscience, Istituto Italiano di Tecnologia, Viale Regina Elena 291, 00161, Rome, Italy
- Leonardo Bo'
  - Center for Life Nanoscience, Istituto Italiano di Tecnologia, Viale Regina Elena 291, 00161, Rome, Italy
- Lorenzo Di Rienzo
  - Center for Life Nanoscience, Istituto Italiano di Tecnologia, Viale Regina Elena 291, 00161, Rome, Italy
- Giancarlo Ruocco
  - Department of Physics, Sapienza University, Piazzale Aldo Moro 5, 00185, Rome, Italy; Center for Life Nanoscience, Istituto Italiano di Tecnologia, Viale Regina Elena 291, 00161, Rome, Italy
7
Zhu DC, Gwo C, Deng A, Scheel N, Dowling MA, Zhang R. Hippocampus shape characterization with 3D Zernike transformation in clinical Alzheimer's disease progression. Hum Brain Mapp 2023; 44:1432-1444. [PMID: 36346203 PMCID: PMC9921247 DOI: 10.1002/hbm.26130] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/14/2022] [Revised: 07/30/2022] [Accepted: 10/05/2022] [Indexed: 11/11/2022]
Abstract
Alzheimer's disease (AD) is a neurodegenerative disease and the most common cause of dementia among older adults. Mild cognitive impairment (MCI) is considered a transitional phase between healthy cognitive aging and dementia. Progressive brain volume reduction/atrophy, particularly of the hippocampus, is associated with the transition from normal to MCI, and then to AD. We aimed to develop methods to characterize the shape of the hippocampus and explore its potential as an imaging marker to monitor clinical AD progression. We implemented a 3D Zernike transformation to characterize the shape changes of the hippocampus in 428 older subjects with high-quality T1-weighted volumetric brain scans from the Alzheimer's Disease Neuroimaging Initiative data set (151 normal, 258 MCI, and 19 AD). Over 2 years, 15 cognitively normal subjects converted to MCI, and 42 subjects with MCI converted to AD. We found a significant correlation between hippocampal volume changes and Zernike shape metrics. Before a clinical diagnosis of AD, the shapes of the left and right hippocampi changed slowly. After AD diagnosis, both volume and shape changed rapidly but were uncorrelated with each other. During the transition from a clinical diagnosis of MCI to AD, the shape of the left and right hippocampi changed in a correlated manner but became uncorrelated after AD diagnosis. Finally, the pace of hippocampus shape change was associated with its shape and the subject's age and disease condition. In conclusion, the hippocampus shape features characterized with 3D Zernike transformation, in complement to volume measures, may serve as a novel imaging marker to monitor clinical AD progression.
Affiliation(s)
- David C. Zhu
  - Department of Radiology and Cognitive Imaging Research Center, Michigan State University, East Lansing, Michigan, USA
- Chih‐Ying Gwo
  - Department of Information Management, Chien Hsin University of Science and Technology, Taoyuan City, Taiwan
- An‐Wen Deng
  - Department of Information Management, Chien Hsin University of Science and Technology, Taoyuan City, Taiwan
- Norman Scheel
  - Department of Radiology and Cognitive Imaging Research Center, Michigan State University, East Lansing, Michigan, USA
- Mari A. Dowling
  - Department of Radiology and Cognitive Imaging Research Center, Michigan State University, East Lansing, Michigan, USA
- Rong Zhang
  - Departments of Neurology and Internal Medicine, University of Texas Southwestern Medical Center, Dallas, Texas, USA
  - Institute for Exercise and Environmental Medicine, Texas Health Presbyterian Hospital Dallas, Dallas, Texas, USA
8
Gwo CY, Zhu DC, Zhang R. Brain white matter hyperintensity lesion characterization in 3D T2 fluid-attenuated inversion recovery magnetic resonance images: Shape, texture, and their correlations with potential growth. Front Neurosci 2022; 16:1028929. [PMID: 36507337 PMCID: PMC9731131 DOI: 10.3389/fnins.2022.1028929] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/26/2022] [Accepted: 11/07/2022] [Indexed: 11/25/2022]
Abstract
Analyses of age-related white matter hyperintensity (WMH) lesions manifested in T2 fluid-attenuated inversion recovery (FLAIR) magnetic resonance images (MRI) have focused mostly on the size and location of the WMH lesions and rarely on the morphological characterization of the lesions. This work extends our prior analyses of the morphological characteristics and texture of WMH from 2D to 3D based on 3D T2 FLAIR images. 3D Zernike transformation was used to characterize WMH shape; a fuzzy logic method was used to characterize the lesion texture. We then clustered 3D WMH lesions into groups based on their 3D shape and texture features. A potential growth index (PGI) to assess dynamic changes in WMH lesions was developed based on the image texture features of the WMH lesion penumbra. WMH lesions of various sizes were segmented from brain images of 32 cognitively normal older adults. The WMH lesions were divided into two groups based on their size. Analyses of variance (ANOVAs) showed significant differences in PGI among WMH shape clusters (P = 1.57 × 10⁻³ for small lesions; P = 3.14 × 10⁻² for large lesions). Significant differences in PGI were also found among WMH texture group clusters (P = 1.79 × 10⁻⁶). In conclusion, we presented a novel approach to characterize the morphology of 3D WMH lesions and explored the potential to assess the dynamic morphological changes of WMH lesions using PGI.
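The group comparison reported in the abstract relies on one-way ANOVA. As a reminder of what that test computes, here is a minimal sketch of the F statistic (between-group mean square divided by within-group mean square). The function name and any input values are hypothetical; the study's PGI data are not reproduced here.

```python
import numpy as np


def one_way_anova_f(groups):
    """Return (F statistic, df_between, df_within) for a list of samples."""
    groups = [np.asarray(g, dtype=float) for g in groups]
    all_values = np.concatenate(groups)
    grand_mean = all_values.mean()
    # Variation of group means around the grand mean, weighted by group size.
    ss_between = sum(len(g) * (g.mean() - grand_mean) ** 2 for g in groups)
    # Variation of observations around their own group mean.
    ss_within = sum(((g - g.mean()) ** 2).sum() for g in groups)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within
```

The P value is the upper-tail probability of the F distribution with these degrees of freedom; if SciPy is available, `scipy.stats.f.sf(f_stat, df_between, df_within)` gives it.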
Affiliation(s)
- Chih-Ying Gwo
  - Department of Information Management, Chien Hsin University of Science and Technology, Taoyuan City, Taiwan
- David C. Zhu
  - Department of Radiology, Cognitive Imaging Research Center, Michigan State University, East Lansing, MI, United States
  - Department of Psychology, Cognitive Imaging Research Center, Michigan State University, East Lansing, MI, United States
- Rong Zhang
  - Department of Neurology and Internal Medicine, University of Texas Southwestern Medical Center, Dallas, TX, United States
  - Institute for Exercise and Environmental Medicine, Texas Health Presbyterian Hospital Dallas, Dallas, TX, United States
9
Avery C, Patterson J, Grear T, Frater T, Jacobs DJ. Protein Function Analysis through Machine Learning. Biomolecules 2022; 12:1246. [PMID: 36139085 PMCID: PMC9496392 DOI: 10.3390/biom12091246] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Received: 07/16/2022] [Revised: 08/22/2022] [Accepted: 08/31/2022] [Indexed: 11/16/2022]
Abstract
Machine learning (ML) has been an important arsenal in computational biology used to elucidate protein function for decades. With the recent burgeoning of novel ML methods and applications, new ML approaches have been incorporated into many areas of computational biology dealing with protein function. We examine how ML has been integrated into a wide range of computational models to improve prediction accuracy and gain a better understanding of protein function. The applications discussed are protein structure prediction, protein engineering using sequence modifications to achieve stability and druggability characteristics, molecular docking in terms of protein-ligand binding, including allosteric effects, protein-protein interactions and protein-centric drug discovery. To quantify the mechanisms underlying protein function, a holistic approach that takes structure, flexibility, stability, and dynamics into account is required, as these aspects become inseparable through their interdependence. Another key component of protein function is conformational dynamics, which often manifest as protein kinetics. Computational methods that use ML to generate representative conformational ensembles and quantify differences in conformational ensembles important for function are included in this review. Future opportunities are highlighted for each of these topics.
Affiliation(s)
- Chris Avery
  - Department of Bioinformatics and Genomics, University of North Carolina at Charlotte, Charlotte, NC 28223, USA
- John Patterson
  - Department of Bioinformatics and Genomics, University of North Carolina at Charlotte, Charlotte, NC 28223, USA
- Tyler Grear
  - Department of Bioinformatics and Genomics, University of North Carolina at Charlotte, Charlotte, NC 28223, USA
  - Department of Physics and Optical Science, University of North Carolina at Charlotte, Charlotte, NC 28223, USA
- Theodore Frater
  - Department of Bioinformatics and Genomics, University of North Carolina at Charlotte, Charlotte, NC 28223, USA
- Donald J. Jacobs
  - Department of Physics and Optical Science, University of North Carolina at Charlotte, Charlotte, NC 28223, USA
10
Walder M, Edelstein E, Carroll M, Lazarev S, Fajardo JE, Fiser A, Viswanathan R. Integrated structure-based protein interface prediction. BMC Bioinformatics 2022; 23:301. [PMID: 35879651 PMCID: PMC9316365 DOI: 10.1186/s12859-022-04852-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/12/2022] [Accepted: 07/18/2022] [Indexed: 11/29/2022]
Abstract
Background: Identifying protein interfaces can inform how proteins interact with their binding partners, uncover the regulatory mechanisms that control biological functions and guide the development of novel therapeutic agents. A variety of computational approaches have been developed for predicting a protein’s interfacial residues from its known sequence and structure. Methods using the known three-dimensional structures of proteins can be template-based or template-free. Template-based methods have limited success in predicting interfaces when homologues with known complex structures are not available to use as templates. The prediction performance of template-free methods that rely only upon proteins’ intrinsic properties is limited by the number of biologically relevant features that can be included in an interface prediction model.
Results: We describe the development of an integrated method for protein interface prediction (ISPIP) to explore the hypothesis that the efficacy of a computational prediction method of protein binding sites can be enhanced by using a combination of methods that rely on orthogonal structure-based properties of a query protein, combining and balancing both template-free and template-based features. ISPIP is a method that integrates these approaches through simple linear or logistic regression models and more complex decision tree models. On a diverse test set of 156 query proteins, ISPIP outperforms each of its individual classifiers in identifying protein binding interfaces.
Conclusions: The integrated method captures the best performance of individual classifiers and delivers an improved interface prediction. The method is robust and performs well even when one of the individual classifiers performs poorly on a particular query protein. This work demonstrates that integrating orthogonal methods that depend on different structural properties of proteins performs better at interface prediction than any individual classifier alone.
Supplementary Information: The online version contains supplementary material available at 10.1186/s12859-022-04852-2.
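The integration strategy named in the abstract, combining per-residue scores from several classifiers through a logistic regression model, can be sketched in a few lines. This is not the ISPIP implementation: the function names, the plain gradient-descent fit, and the hyperparameters `lr` and `n_iter` are assumptions chosen to keep the example dependency-free.

```python
import numpy as np


def _design_matrix(scores):
    # Prepend a column of ones so the model has an intercept term.
    scores = np.asarray(scores, dtype=float)
    return np.column_stack([np.ones(len(scores)), scores])


def fit_logistic(scores, labels, lr=0.5, n_iter=2000):
    """Fit weights that combine classifier scores into one probability.

    `scores` has one row per residue and one column per individual
    classifier; `labels` is 1 for interface residues and 0 otherwise.
    """
    X = _design_matrix(scores)
    y = np.asarray(labels, dtype=float)
    w = np.zeros(X.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-(X @ w)))
        # Gradient of the mean log-loss with respect to the weights.
        w -= lr * (X.T @ (p - y)) / len(y)
    return w


def predict_proba(w, scores):
    """Return the combined interface probability for each residue."""
    X = _design_matrix(scores)
    return 1.0 / (1.0 + np.exp(-(X @ w)))
```

The fitted weights indicate how much each individual classifier contributes to the combined score, which is one way such a model can balance template-based and template-free inputs.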
Affiliation(s)
- M Walder
  - Department of Chemistry, Yeshiva College, Yeshiva University, New York, NY, 10033, USA
- E Edelstein
  - Department of Chemistry, Yeshiva College, Yeshiva University, New York, NY, 10033, USA
- M Carroll
  - Department of Chemistry, Yeshiva College, Yeshiva University, New York, NY, 10033, USA
- S Lazarev
  - Department of Chemistry, Yeshiva College, Yeshiva University, New York, NY, 10033, USA
- J E Fajardo
  - Department of Systems and Computational Biology, Albert Einstein College of Medicine, Bronx, NY, 10461, USA
- A Fiser
  - Department of Systems and Computational Biology, Albert Einstein College of Medicine, Bronx, NY, 10461, USA
- R Viswanathan
  - Department of Chemistry, Yeshiva College, Yeshiva University, New York, NY, 10033, USA
11
Pozzati G, Kundrotas P, Elofsson A. Scoring of protein–protein docking models utilizing predicted interface residues. Proteins 2022; 90:1493-1505. [PMID: 35246997 PMCID: PMC9314140 DOI: 10.1002/prot.26330] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/27/2021] [Revised: 02/23/2022] [Accepted: 02/28/2022] [Indexed: 11/08/2022]
Abstract
Scoring docking solutions is a difficult task, and many methods have been developed for this purpose. In docking, only a handful of the hundreds of thousands of models generated by docking algorithms are acceptable, causing difficulties when developing scoring functions. Today's best scoring functions can significantly increase the number of top‐ranked models but still fail for most targets. Here, we examine the possibility of utilizing predicted interface residues to score docking models generated during the scan stage of a docking algorithm. Many methods have been developed to infer the regions of a protein surface that interact with another protein, but most have not been benchmarked using docking algorithms. This study systematically tests different interface prediction methods for scoring >300,000 low‐resolution rigid‐body template‐free docking decoys. Overall, we find that contact‐based interface prediction by BIPSPI is the best method to score docking solutions, with >12% of first‐ranked docking models being acceptable. Additional experiments indicated precision as a high‐importance metric when estimating interface prediction quality, focusing on docking constraints production. Finally, we discuss several limitations of adopting interface predictions as constraints in a docking protocol.
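One simple way to turn predicted interface residues into a docking score is to average the predicted interface propensity over the residues that a given model places at its interface. The sketch below illustrates only that general idea; it is not the scoring function evaluated in the study, and the function names, the residue-number keys, and the default of 0.0 for residues without a prediction are all assumptions.

```python
def score_docking_model(model_interface, predicted_scores):
    """Mean predicted interface score over a model's interface residues.

    `model_interface` is a set of residue identifiers in contact in the
    docking model; `predicted_scores` maps residue identifiers to a
    predicted interface propensity between 0 and 1.
    """
    if not model_interface:
        return 0.0
    total = sum(predicted_scores.get(res, 0.0) for res in model_interface)
    return total / len(model_interface)


def rank_models(models, predicted_scores):
    """Return model names sorted from best to worst score."""
    return sorted(
        models,
        key=lambda name: score_docking_model(models[name], predicted_scores),
        reverse=True,
    )
```

For example, with hypothetical predictions `{10: 0.9, 11: 0.8, 50: 0.1}`, a model whose interface is `{10, 11}` would be ranked above one whose interface is `{50, 51}`.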
Affiliation(s)
- Gabriele Pozzati
- Department of Biochemistry and Biophysics and Science for Life Laboratory, Stockholm University, Solna, Sweden
- Petras Kundrotas
- Department of Biochemistry and Biophysics and Science for Life Laboratory, Stockholm University, Solna, Sweden
- Center for Bioinformatics and Department of Molecular Biosciences, University of Kansas, Lawrence, Kansas, USA
- Arne Elofsson
- Department of Biochemistry and Biophysics and Science for Life Laboratory, Stockholm University, Solna, Sweden
|
12
|
De Lauro A, Di Rienzo L, Miotto M, Olimpieri PP, Milanetti E, Ruocco G. Shape Complementarity Optimization of Antibody–Antigen Interfaces: The Application to SARS-CoV-2 Spike Protein. Front Mol Biosci 2022; 9:874296. [PMID: 35669567 PMCID: PMC9163568 DOI: 10.3389/fmolb.2022.874296] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Received: 02/11/2022] [Accepted: 04/07/2022] [Indexed: 11/17/2022]
Abstract
Many factors influence biomolecule binding, and its assessment constitutes an elusive challenge in computational structural biology. In this respect, the evaluation of shape complementarity at molecular interfaces is one of the main factors to be considered. We focus on the particular case of antibody–antigen complexes to quantify the complementarities occurring at molecular interfaces. We relied on a method we recently developed, which employs the 2D Zernike descriptors, to characterize the investigated regions with an ordered set of numbers summarizing the local shape properties. Collecting a structural dataset of antibody–antigen complexes, we applied this method and statistically distinguished, in terms of shape complementarity, pairs of interacting regions from non-interacting ones. Thus, we set up a novel computational strategy based on in silico mutagenesis of antibody-binding site residues. We developed a Monte Carlo procedure to increase the shape complementarity between the antibody paratope and a given epitope on a target protein surface. We applied our protocol against several molecular targets in SARS-CoV-2 spike protein, known to be indispensable for viral cell invasion. We, therefore, optimized the shape of template antibodies for the interaction with such regions. As the last step of our procedure, we performed an independent molecular docking validation of the results of our Monte Carlo simulations.
Affiliation(s)
- Lorenzo Di Rienzo
- Center for Life Nano & Neuro-Science, Istituto Italiano di Tecnologia, Rome, Italy
- *Correspondence: Lorenzo Di Rienzo,
- Mattia Miotto
- Center for Life Nano & Neuro-Science, Istituto Italiano di Tecnologia, Rome, Italy
- Edoardo Milanetti
- Center for Life Nano & Neuro-Science, Istituto Italiano di Tecnologia, Rome, Italy
- Department of Physics, Sapienza University, Rome, Italy
- Giancarlo Ruocco
- Center for Life Nano & Neuro-Science, Istituto Italiano di Tecnologia, Rome, Italy
- Department of Physics, Sapienza University, Rome, Italy
|
13
|
Casadio R, Martelli PL, Savojardo C. Machine learning solutions for predicting protein–protein interactions. WIRES COMPUTATIONAL MOLECULAR SCIENCE 2022. [DOI: 10.1002/wcms.1618] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/18/2022]
Affiliation(s)
- Rita Casadio
- Biocomputing Group, University of Bologna, Bologna, Italy
|
14
|
Quadrini M, Daberdaku S, Ferrari C. Hierarchical representation for PPI sites prediction. BMC Bioinformatics 2022; 23:96. [PMID: 35307006 PMCID: PMC8934516 DOI: 10.1186/s12859-022-04624-y] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 05/19/2021] [Accepted: 02/23/2022] [Indexed: 01/06/2023]
Abstract
Background
Protein–protein interactions have pivotal roles in life processes, and aberrant interactions are associated with various disorders. Interaction site identification is key for understanding disease mechanisms and designing new drugs. Effective and efficient computational methods for PPI prediction are of great value due to the overall cost of experimental methods. Promising results have been obtained using machine learning methods and deep learning techniques, but their effectiveness depends on protein representation and feature selection.
Results
We define a new abstraction of the protein structure, called hierarchical representations, considering and quantifying spatial and sequential neighboring among amino acids. We also investigate the effect of molecular abstractions using the Graph Convolutional Networks technique to classify amino acids as interface or non-interface residues. Our study takes into account three abstractions (hierarchical representations, contact map, and the residue sequence) and considers the eight functional classes of proteins extracted from the Protein–Protein Docking Benchmark 5.0. The performance of our method, evaluated using standard metrics, is compared to that obtained with some state-of-the-art protein interface predictors. The analysis of the performance values shows that our method outperforms the considered competitors when the considered molecules are structurally similar.
Conclusions
The hierarchical representation can capture the structural properties that promote the interactions and can be used to represent proteins with unknown structures by codifying only their sequential neighboring. Analyzing the results, we conclude that classes should be arranged according to their architectures rather than functions.
|
15
|
Ray A. Machine learning in postgenomic biology and personalized medicine. WILEY INTERDISCIPLINARY REVIEWS. DATA MINING AND KNOWLEDGE DISCOVERY 2022; 12:e1451. [PMID: 35966173 PMCID: PMC9371441 DOI: 10.1002/widm.1451] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/23/2020] [Accepted: 12/22/2021] [Indexed: 06/15/2023]
Abstract
In recent years, artificial intelligence in the form of machine learning has been revolutionizing biology, biomedical sciences, and gene-based agricultural technology capabilities. Massive data generated in biological sciences by rapid and deep gene sequencing and protein or other molecular structure determination, on the one hand, require data analysis capabilities using machine learning that are distinctly different from classical statistical methods; on the other, these large datasets are enabling the adoption of novel data-intensive machine learning algorithms for the solution of biological problems that until recently had relied on mechanistic model-based approaches that are computationally expensive. This review provides a bird's eye view of the applications of machine learning in post-genomic biology. An attempt is also made to indicate, as far as possible, the areas of research that are poised to make further impacts in these areas, including the importance of explainable artificial intelligence (XAI) in human health. Further contributions of machine learning are expected to transform medicine, public health, and agricultural technology, as well as to provide invaluable gene-based guidance for the management of complex environments in this age of global warming.
Affiliation(s)
- Animesh Ray
- Riggs School of Applied Life Sciences, Keck Graduate Institute, 535 Watson Drive, Claremont, CA 91711, USA
- Division of Biology and Biological Engineering, California Institute of Technology, Pasadena, California, USA
|
16
|
Hot spots-making directed evolution easier. Biotechnol Adv 2022; 56:107926. [DOI: 10.1016/j.biotechadv.2022.107926] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 08/24/2021] [Revised: 01/04/2022] [Accepted: 02/07/2022] [Indexed: 01/20/2023]
|
17
|
Miotto M, Di Rienzo L, Gosti G, Bo' L, Parisi G, Piacentini R, Boffi A, Ruocco G, Milanetti E. Inferring the stabilization effects of SARS-CoV-2 variants on the binding with ACE2 receptor. Commun Biol 2022; 5:20221. [PMID: 34992214 PMCID: PMC8738749 DOI: 10.1038/s42003-021-02946-w] [Citation(s) in RCA: 15] [Impact Index Per Article: 7.5] [Received: 04/27/2021] [Accepted: 11/26/2021] [Indexed: 12/18/2022]
Abstract
As the SARS-CoV-2 (severe acute respiratory syndrome coronavirus 2) pandemic continues to spread, several variants of the virus, with mutations distributed all over the viral genome, are emerging. While most of the variants present mutations having little to no effect at the phenotypic level, some of these variants are spreading at a rate that suggests they may present a selective advantage. In particular, these rapidly spreading variants present specific mutations on the spike protein. These observations call for an urgent need to characterize the effects of these variants’ mutations on phenotype features like contagiousness and antigenicity. With this aim, we performed molecular dynamics simulations on a selected set of possible spike variants in order to assess the stabilizing effect of particular amino acid substitutions on the molecular complex. We specifically focused on the mutations that are both characteristic of the top three most worrying variants at the moment, i.e., the English, South African, and Amazonian ones, and that occur at the molecular interface between SARS-CoV-2 spike protein and its human ACE2 receptor. We characterize these variants’ effect in terms of (i) residue mobility, (ii) compactness, studying the network of interactions at the interface, and (iii) variation of shape complementarity via expanding the molecular surfaces in the Zernike basis. Overall, our analyses highlighted greater stability of the three variant complexes with respect to both the wild type and two negative control systems, especially for the English and Amazonian variants. In addition, in the three variants, we investigate the effects that a not-yet-observed mutation at position 501 could have on complex stability. We found that a phenylalanine mutation behaves similarly to the English variant and may cooperate in further increasing the stability of the South African one, hinting at the need for careful surveillance for the emergence of these mutations in the population. Ultimately, we show that the proposed observables describe key features for the stability of the ACE2-spike complex and can help to monitor further possible spike variants. Miotto et al. perform molecular dynamics simulations on a selected set of possible SARS-CoV-2 spike variants in order to assess the stabilizing effect of particular amino acid substitutions on the molecular complex. Their analysis can help to monitor further possible spike variants.
Affiliation(s)
- Mattia Miotto
- Center for Life Nano & Neuroscience, Istituto Italiano di Tecnologia, Viale Regina Elena 291, 00161, Rome, Italy
- Lorenzo Di Rienzo
- Center for Life Nano & Neuroscience, Istituto Italiano di Tecnologia, Viale Regina Elena 291, 00161, Rome, Italy
- Giorgio Gosti
- Center for Life Nano & Neuroscience, Istituto Italiano di Tecnologia, Viale Regina Elena 291, 00161, Rome, Italy
- Leonardo Bo'
- Center for Life Nano & Neuroscience, Istituto Italiano di Tecnologia, Viale Regina Elena 291, 00161, Rome, Italy
- Giacomo Parisi
- Center for Life Nano & Neuroscience, Istituto Italiano di Tecnologia, Viale Regina Elena 291, 00161, Rome, Italy
- Roberta Piacentini
- Center for Life Nano & Neuroscience, Istituto Italiano di Tecnologia, Viale Regina Elena 291, 00161, Rome, Italy
- Department of Biochemical Sciences "Alessandro Rossi Fanelli", Sapienza University of Rome, P.Le A. Moro 5, 00185, Rome, Italy
- Alberto Boffi
- Center for Life Nano & Neuroscience, Istituto Italiano di Tecnologia, Viale Regina Elena 291, 00161, Rome, Italy
- Department of Biochemical Sciences "Alessandro Rossi Fanelli", Sapienza University of Rome, P.Le A. Moro 5, 00185, Rome, Italy
- Giancarlo Ruocco
- Center for Life Nano & Neuroscience, Istituto Italiano di Tecnologia, Viale Regina Elena 291, 00161, Rome, Italy
- Department of Physics, Sapienza University, Piazzale Aldo Moro 5, 00185, Rome, Italy
- Edoardo Milanetti
- Center for Life Nano & Neuroscience, Istituto Italiano di Tecnologia, Viale Regina Elena 291, 00161, Rome, Italy
- Department of Physics, Sapienza University, Piazzale Aldo Moro 5, 00185, Rome, Italy
|
18
|
Binding site identification of G protein-coupled receptors through a 3D Zernike polynomials-based method: application to C. elegans olfactory receptors. J Comput Aided Mol Des 2022; 36:11-24. [PMID: 34977999 PMCID: PMC8831295 DOI: 10.1007/s10822-021-00434-1] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 06/15/2021] [Accepted: 11/18/2021] [Indexed: 11/01/2022]
Abstract
Studying the binding processes of G protein-coupled receptors (GPCRs) is of particular interest both to better understand the molecular mechanisms that regulate the signaling between the extracellular and intracellular environment and for drug design purposes. In this study, we propose a new computational approach for the identification of the binding site for a specific ligand on a GPCR. The method is based on the Zernike polynomials and performs the ligand-GPCR association through a shape complementarity analysis of the local molecular surfaces. The method is parameter-free and, working on hundreds of experimentally determined GPCR-ligand complexes, it can distinguish binding pockets from randomly sampled regions on the receptor surface, obtaining an area under the ROC curve of 0.77. Given its importance both as a model organism and in terms of applications, we then investigated the olfactory receptors of C. elegans, building a list of associations between 21 GPCRs belonging to its olfactory neurons and a set of possible ligands. Thus, not only can we carry out rapid and efficient screening of drugs proposed for GPCRs, key targets in many pathologies, but we have also laid the groundwork for computational mutagenesis processes aimed at increasing or decreasing the binding affinity between ligands and receptors.
|
19
|
Di Rienzo L, Milanetti E, Ruocco G, Lepore R. Quantitative Description of Surface Complementarity of Antibody-Antigen Interfaces. Front Mol Biosci 2021; 8:749784. [PMID: 34660699 PMCID: PMC8514621 DOI: 10.3389/fmolb.2021.749784] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 07/29/2021] [Accepted: 09/14/2021] [Indexed: 11/29/2022]
Abstract
Antibodies have the remarkable ability to recognise their cognate antigens with extraordinary affinity and specificity. Discerning the rules that define antibody-antigen recognition is a fundamental step in the rational design and engineering of functional antibodies with desired properties. In this study we apply the 3D Zernike formalism to the analysis of the surface properties of the antibody complementarity-determining regions (CDRs). Our results show that shape and electrostatic 3D Zernike descriptors (3DZD) of the surface of the CDRs are predictive of antigen specificity, with a classification accuracy of 81% and an area under the receiver operating characteristic curve (AUC) of 0.85. Additionally, while in terms of surface size, solvent accessibility and amino acid composition, antibody epitopes are typically not distinguishable from non-epitope, solvent-exposed regions of the antigen, the 3DZD detect significantly higher surface complementarity to the paratope, and are able to predict correct paratope-epitope interaction with an AUC of 0.75.
Affiliation(s)
- Lorenzo Di Rienzo
- Center for Life Nano and Neuro-Science, Istituto Italiano di Tecnologia, Rome, Italy
- Edoardo Milanetti
- Center for Life Nano and Neuro-Science, Istituto Italiano di Tecnologia, Rome, Italy
- Department of Physics, Sapienza University, Rome, Italy
- Giancarlo Ruocco
- Center for Life Nano and Neuro-Science, Istituto Italiano di Tecnologia, Rome, Italy
- Department of Physics, Sapienza University, Rome, Italy
- Rosalba Lepore
- Department of Biomedicine, Basel University Hospital and University of Basel, Basel, Switzerland
|
20
|
Yang S, Huang J, He B. CASPredict: a web service for identifying Cas proteins. PeerJ 2021; 9:e11887. [PMID: 34395100 PMCID: PMC8327967 DOI: 10.7717/peerj.11887] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Received: 11/26/2020] [Accepted: 07/09/2021] [Indexed: 12/16/2022]
Abstract
Clustered regularly interspaced short palindromic repeats (CRISPR) and their associated (Cas) proteins constitute the CRISPR-Cas systems, which play a key role in the prokaryotic adaptive immune system against invasive foreign elements. In recent years, the CRISPR-Cas systems have also been designed to facilitate target gene editing in eukaryotic genomes. As one of the important components of the CRISPR-Cas system, the Cas protein plays an irreplaceable role. The effector module composed of Cas proteins is used to distinguish the type of CRISPR-Cas system. Effective prediction and identification of Cas proteins can help biologists further infer the type of CRISPR-Cas system. Moreover, the class 2 CRISPR-Cas systems are gradually being applied in the field of genome editing. The discovery of Cas proteins will help provide more candidates for genome editing. In this paper, we describe a web service named CASPredict (http://i.uestc.edu.cn/caspredict/cgi-bin/CASPredict.pl) for identifying Cas proteins. CASPredict first predicts Cas proteins based on a support vector machine (SVM) by using the optimal dipeptide composition and then annotates the function of Cas proteins based on the hmmscan search algorithm. The ten-fold cross-validation results showed that 84.84% of Cas proteins were correctly classified. CASPredict will be a useful tool for the identification of Cas proteins, or at least can play a complementary role to the existing methods in this area.
Affiliation(s)
- Shanshan Yang
- Medical College, Guizhou University, Guiyang, Guizhou Province, China
- Jian Huang
- Center for Informational Biology, University of Electronic Science and Technology of China, Chengdu, Sichuan Province, China
- Bifang He
- Medical College, Guizhou University, Guiyang, Guizhou Province, China
- Center for Informational Biology, University of Electronic Science and Technology of China, Chengdu, Sichuan Province, China
|
21
|
Insights into the Interaction Mechanism of DTP3 with MKK7 by Using STD-NMR and Computational Approaches. Biomedicines 2020; 9:biomedicines9010020. [PMID: 33396582 PMCID: PMC7824710 DOI: 10.3390/biomedicines9010020] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.0] [Received: 11/29/2020] [Revised: 12/21/2020] [Accepted: 12/23/2020] [Indexed: 01/18/2023]
Abstract
GADD45β/MKK7 complex is a non-redundant, cancer cell-restricted survival module downstream of the NF-kB survival pathway, and it has a pathogenically critical role in multiple myeloma, an incurable malignancy of plasma cells. The first-in-class GADD45β/MKK7 inhibitor DTP3 effectively kills MM cells expressing its molecular target, both in vitro and in vivo, by inducing MKK7/JNK-dependent apoptosis with no apparent toxicity to normal cells. DTP3 combines favorable drug-like properties, with on-target-specific pharmacology, resulting in a safe and cancer-selective therapeutic effect; however, its mode of action is only partially understood. In this work, we have investigated the molecular determinants underlying the MKK7 interaction with DTP3 by combining computational, NMR, and spectroscopic methods. Data gathered by fluorescence quenching and computational approaches consistently indicate that the N-terminal region of MKK7 is the optimal binding site explored by DTP3. These findings further the understanding of the selective mode of action of GADD45β/MKK7 inhibitors and inform potential mechanisms of drug resistance. Notably, upon validation of the safety and efficacy of DTP3 in human trials, our results could also facilitate the development of novel DTP3-like therapeutics with improved bioavailability or the capacity to bypass drug resistance.
|
22
|
Di Rienzo L, Milanetti E, Testi C, Montemiglio LC, Baiocco P, Boffi A, Ruocco G. A novel strategy for molecular interfaces optimization: The case of Ferritin-Transferrin receptor interaction. Comput Struct Biotechnol J 2020; 18:2678-2686. [PMID: 33101606 PMCID: PMC7548301 DOI: 10.1016/j.csbj.2020.09.020] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Received: 06/16/2020] [Revised: 09/10/2020] [Accepted: 09/11/2020] [Indexed: 11/24/2022]
Abstract
Protein-protein interactions regulate almost all cellular functions and rely on the fine-tuning of the properties of the surface amino acids involved on both molecular partners. The disruption of a molecular association can be caused even by a single residue mutation, often leading to a pathological modification of a biochemical pathway. Therefore, the evaluation of the effects of amino acid substitutions on binding, and the ad hoc design of protein-protein interfaces, are among the biggest challenges in computational biology. Here, we present a novel strategy for computational mutation and optimization of protein-protein interfaces. Modeling the interaction surface properties using the Zernike polynomials, we describe the shape and electrostatics of binding sites with an ordered set of descriptors, making possible the evaluation of complementarity between interacting surfaces. With a Monte Carlo approach, we obtain protein mutants with controlled molecular complementarities. Applying this strategy to the relevant case of the interaction between Ferritin and Transferrin Receptor, we obtain a set of Ferritin mutants with increased or decreased complementarity. The extensive molecular dynamics validation of the method's results confirms its efficacy, showing that this strategy represents a very promising approach in designing correct molecular interfaces.
Affiliation(s)
- Lorenzo Di Rienzo
- Center for Life Nanoscience, Istituto Italiano di Tecnologia, Viale Regina Elena 291, 00161 Rome, Italy
- Edoardo Milanetti
- Center for Life Nanoscience, Istituto Italiano di Tecnologia, Viale Regina Elena 291, 00161 Rome, Italy
- Department of Physics, Sapienza University, Piazzale Aldo Moro 5, 00185 Rome, Italy
- Claudia Testi
- Center for Life Nanoscience, Istituto Italiano di Tecnologia, Viale Regina Elena 291, 00161 Rome, Italy
- Paola Baiocco
- Center for Life Nanoscience, Istituto Italiano di Tecnologia, Viale Regina Elena 291, 00161 Rome, Italy
- Department of Biochemical Sciences ‘A. Rossi Fanelli’ Sapienza University, Piazzale Aldo Moro 5, 00185 Rome, Italy
- Alberto Boffi
- Department of Biochemical Sciences ‘A. Rossi Fanelli’ Sapienza University, Piazzale Aldo Moro 5, 00185 Rome, Italy
- Giancarlo Ruocco
- Center for Life Nanoscience, Istituto Italiano di Tecnologia, Viale Regina Elena 291, 00161 Rome, Italy
- Department of Physics, Sapienza University, Piazzale Aldo Moro 5, 00185 Rome, Italy
|
23
|
Savojardo C, Martelli PL, Casadio R. Protein–Protein Interaction Methods and Protein Phase Separation. Annu Rev Biomed Data Sci 2020. [DOI: 10.1146/annurev-biodatasci-011720-104428] [Citation(s) in RCA: 13] [Impact Index Per Article: 3.3] [Indexed: 11/09/2022]
Abstract
In the last decade, newly developed experimental methods have made it possible to highlight that macromolecules in the cell milieu physically interact to support physiology. This has shifted the problem of protein–protein interaction from a microscopic, electron-density scale to a mesoscopic one. Further, there is now increasing evidence that proteins in the nucleus and in the cytoplasm can aggregate in membraneless organelles for different physiological reasons. In this scenario, it is urgent to face the problem of biomolecule functional annotation with efficient computational methods, suited to extract knowledge from reliable data and transfer information across different domains of investigation. Here, we review the present state of the art of our knowledge of protein–protein interaction and the computational methods that differently implement it. Furthermore, we explore experimental and computational features of a set of proteins involved in phase separation.
Affiliation(s)
- Castrense Savojardo
- Biocomputing Group, Department of Pharmacy and Biotechnology and Interdepartmental Center “Luigi Galvani” for Integrated Studies of Bioinformatics, Biophysics, and Biocomplexity, University of Bologna, 40126 Bologna, Italy
- Pier Luigi Martelli
- Biocomputing Group, Department of Pharmacy and Biotechnology and Interdepartmental Center “Luigi Galvani” for Integrated Studies of Bioinformatics, Biophysics, and Biocomplexity, University of Bologna, 40126 Bologna, Italy
- Rita Casadio
- Biocomputing Group, Department of Pharmacy and Biotechnology and Interdepartmental Center “Luigi Galvani” for Integrated Studies of Bioinformatics, Biophysics, and Biocomplexity, University of Bologna, 40126 Bologna, Italy
- Institute of Biomembranes, Bioenergetics, and Molecular Biotechnologies (IBIOM), Italian National Research Council (CNR), 70126 Bari, Italy
|
24
|
Daberdaku S, Ferrari C. Antibody interface prediction with 3D Zernike descriptors and SVM. Bioinformatics 2020; 35:1870-1876. [PMID: 30395191 DOI: 10.1093/bioinformatics/bty918] [Citation(s) in RCA: 38] [Impact Index Per Article: 9.5] [Received: 12/15/2017] [Revised: 09/21/2018] [Accepted: 11/01/2018] [Indexed: 12/23/2022]
Abstract
MOTIVATION Antibodies are a class of proteins capable of specifically recognizing and binding to a virtually infinite number of antigens. This binding malleability makes them the most valuable category of biopharmaceuticals for both diagnostic and therapeutic applications. The correct identification of the antigen-binding residues in the antibody is crucial for all antibody design and engineering techniques and could also help to understand the complex antigen binding mechanisms. However, the antibody-binding interface prediction field appears to be still rather underdeveloped. RESULTS We present a novel method for antibody interface prediction from their experimentally solved structures based on 3D Zernike Descriptors. Roto-translationally invariant descriptors are computed from circular patches of the antibody surface enriched with a chosen subset of physico-chemical properties from the AAindex1 amino acid index set, and are used as samples for a binary classification problem. An SVM classifier is used to distinguish interface surface patches from non-interface ones. The proposed method was shown to outperform other antigen-binding interface prediction software. AVAILABILITY AND IMPLEMENTATION Linux binaries and Python scripts are available at https://github.com/sebastiandaberdaku/AntibodyInterfacePrediction. The datasets generated and/or analyzed during the current study are available at https://doi.org/10.6084/m9.figshare.5442229. SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Sebastian Daberdaku
- Department of Comparative Biomedicine and Food Science, University of Padova, Legnaro, Italy
- Carlo Ferrari
- Department of Information Engineering, University of Padova, Padova, Italy
|
25
|
Deng A, Zhang H, Wang W, Zhang J, Fan D, Chen P, Wang B. Developing Computational Model to Predict Protein-Protein Interaction Sites Based on the XGBoost Algorithm. Int J Mol Sci 2020; 21:E2274. [PMID: 32218345 PMCID: PMC7178137 DOI: 10.3390/ijms21072274] [Citation(s) in RCA: 31] [Impact Index Per Article: 7.8] [Received: 02/03/2020] [Revised: 03/10/2020] [Accepted: 03/23/2020] [Indexed: 12/27/2022]
Abstract
The study of protein-protein interaction is of great biological significance, and the prediction of protein-protein interaction sites can promote the understanding of cell biological activity and will be helpful for drug development. However, uneven distribution between interaction and non-interaction sites is common because only a small number of protein interactions have been confirmed by experimental techniques, which greatly affects the predictive capability of computational methods. In this work, two imbalanced data processing strategies based on the XGBoost algorithm were proposed to re-balance the original dataset using the inherent relationship between positive and negative samples for the prediction of protein-protein interaction sites. Herein, a feature extraction method was applied to represent the protein interaction sites based on the evolutionary conservation of proteins, and the influence of overlapping regions of positive and negative samples on prediction performance was considered. Our method showed good prediction performance, such as a prediction accuracy of 0.807 and an MCC of 0.614, on an original dataset with 10,455 surface residues but only 2297 interface residues. Experimental results demonstrated the effectiveness of our XGBoost-based method.
Affiliation(s)
- Aijun Deng
  - Key Laboratory of Metallurgical Emission Reduction & Resources Recycling (Anhui University of Technology), Ministry of Education, Ma'anshan 243002, China
  - School of Metallurgical Engineering, Anhui University of Technology, Ma'anshan 243032, China
  - Department of Engineering, University of Leicester, Leicester LE1 7RH, UK
- Huan Zhang
  - School of Electrical and Information Engineering, Anhui University of Technology, Ma'anshan 243032, China
- Wenyan Wang
  - School of Electrical and Information Engineering, Anhui University of Technology, Ma'anshan 243032, China
- Jun Zhang
  - Co-Innovation Center for Information Supply & Assurance Technology, Anhui University, Hefei 230032, China
- Dingdong Fan
  - School of Metallurgical Engineering, Anhui University of Technology, Ma'anshan 243032, China
- Peng Chen
  - Co-Innovation Center for Information Supply & Assurance Technology, Anhui University, Hefei 230032, China
- Bing Wang
  - Key Laboratory of Metallurgical Emission Reduction & Resources Recycling (Anhui University of Technology), Ministry of Education, Ma'anshan 243002, China
  - School of Electrical and Information Engineering, Anhui University of Technology, Ma'anshan 243032, China
  - Co-Innovation Center for Information Supply & Assurance Technology, Anhui University, Hefei 230032, China
|
26
|
Di Rienzo L, Milanetti E, Alba J, D'Abramo M. Quantitative Characterization of Binding Pockets and Binding Complementarity by Means of Zernike Descriptors. J Chem Inf Model 2020; 60:1390-1398. [PMID: 32050068 PMCID: PMC7997106 DOI: 10.1021/acs.jcim.9b01066] [Citation(s) in RCA: 13] [Impact Index Per Article: 3.3] [Indexed: 11/28/2022]
Abstract
In this work, we describe the application of the Zernike formalism to quantitatively characterize the binding pockets of two sets of biologically relevant systems. Such an approach, when applied to molecular dynamics trajectories, is able to pinpoint the subtle differences between very similar molecular regions and their impact on the local propensity to ligand binding, allowing us to quantify such differences. The statistical robustness of our procedure suggests that it is very suitable to describe protein binding sites and protein-ligand interactions within a rigorous and well-defined framework.
Affiliation(s)
- Lorenzo Di Rienzo
  - Department of Physics, Sapienza University of Rome, Piazzale Aldo Moro, 5, 00185 Rome, Italy
- Edoardo Milanetti
  - Department of Physics, Sapienza University of Rome, Piazzale Aldo Moro, 5, 00185 Rome, Italy
  - Center for Life Nano Science@Sapienza, Italian Institute of Technology, Viale Regina Elena 291, 00161 Rome, Italy
- Josephine Alba
  - Department of Chemistry, Sapienza University of Rome, Piazzale Aldo Moro, 5, 00185 Rome, Italy
- Marco D'Abramo
  - Department of Chemistry, Sapienza University of Rome, Piazzale Aldo Moro, 5, 00185 Rome, Italy
|
27
|
Nilofer C, Sukhwal A, Mohanapriya A, Sakharkar MK, Kangueane P. Small protein-protein interfaces rich in electrostatic are often linked to regulatory function. J Biomol Struct Dyn 2019; 38:3260-3279. [PMID: 31495333 DOI: 10.1080/07391102.2019.1657040] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Key Words] [Indexed: 02/07/2023]
Abstract
Protein-protein interaction (PPI) is critical for several biological functions in living cells through the formation of an interface. Therefore, it is of interest to characterize protein-protein interfaces using an updated non-redundant structural dataset of 2557 homo (identical subunits) and 393 hetero (different subunits) dimer protein complexes determined by X-ray crystallography. We analyzed the interfaces using van der Waals (vdW), hydrogen bonding and electrostatic energies. Results show that on average homo and hetero interfaces are similar. Hence, we further grouped the 2950 interfaces based on percentage vdW to total energies into dominant (≥60%) and sub-dominant (<60%) vdW interfaces. The majority (92%) of interfaces have dominant vdW energy with large interface size (146 ± 87 (homo) and 137 ± 76 (hetero) residues) and interface area (1622 ± 1135 Å² (homo) and 1579 ± 1060 Å² (hetero)). However, a proportion (8%) of interfaces have sub-dominant vdW energy with small interface size (85 ± 46 (homo) and 88 ± 36 (hetero) residues) and interface area (823 ± 538 Å² (homo) and 881 ± 377 Å² (hetero)). It is found that large interfaces have two-fold more interface area and interface size than small interfaces with increasing hydrogen bonding energy to interface size. However, small interfaces have three-fold more electrostatic energy than large interfaces with increasing electrostatics to interface size. Thus, the 8% of complexes having small interfaces with limited interface area and sub-dominant vdW energy are rich in electrostatics. It is interesting to observe that complexes having small interfaces are often associated with regulatory function. Hence, the observed structural features with known molecular function provide insights for the better understanding of PPI. Communicated by Ramaswamy H. Sarma.
Affiliation(s)
- Christina Nilofer
  - Biomedical Informatics (P) Ltd., Pondicherry, India
  - School of Biosciences & Technology, VIT University, Vellore, Tamil Nadu, India
- Anshul Sukhwal
  - National Centre for Biological Sciences (NCBS), Bangalore, India
|
28
|
Tian B, Wu X, Chen C, Qiu W, Ma Q, Yu B. Predicting protein–protein interactions by fusing various Chou's pseudo components and using wavelet denoising approach. J Theor Biol 2019; 462:329-346. [DOI: 10.1016/j.jtbi.2018.11.011] [Citation(s) in RCA: 25] [Impact Index Per Article: 5.0] [Received: 09/13/2018] [Revised: 11/08/2018] [Accepted: 11/15/2018] [Indexed: 12/26/2022]
|
29
|
Song D, Chen Y, Min Q, Sun Q, Ye K, Zhou C, Yuan S, Sun Z, Liao J. Similarity-based machine learning support vector machine predictor of drug-drug interactions with improved accuracies. J Clin Pharm Ther 2018; 44:268-275. [PMID: 30565313 DOI: 10.1111/jcpt.12786] [Citation(s) in RCA: 19] [Impact Index Per Article: 3.2] [Received: 05/29/2018] [Revised: 10/29/2018] [Accepted: 11/18/2018] [Indexed: 12/12/2022]
Affiliation(s)
- Dalong Song
  - Guizhou University, Guiyang, China
  - Department of Urology, GuiZhou Provincial People’s Hospital, Guiyang, China
- Yao Chen
  - School of Science, China Pharmaceutical University, Nanjing, China
- Qian Min
  - School of Science, China Pharmaceutical University, Nanjing, China
- Qingrong Sun
  - School of Science, China Pharmaceutical University, Nanjing, China
- Kai Ye
  - MandalaT Software Corporation, F5, Wuxi, China
- Changjiang Zhou
  - School of Science, China Pharmaceutical University, Nanjing, China
- Shengyue Yuan
  - School of Science, China Pharmaceutical University, Nanjing, China
- Zhaolin Sun
  - Department of Urology, GuiZhou Provincial People’s Hospital, Guiyang, China
- Jun Liao
  - School of Science, China Pharmaceutical University, Nanjing, China
  - Key Laboratory of Drug Quality Control and Pharmacovigilance (China Pharmaceutical University), Ministry of Education, Nanjing, China
|