1. Qi J, Feng C, Shi Y, Yang J, Zhang F, Li G, Han R. FP-Zernike: An Open-source Structural Database Construction Toolkit for Fast Structure Retrieval. Genomics, Proteomics & Bioinformatics 2024; 22:qzae007. [PMID: 38894604 DOI: 10.1093/gpbjnl/qzae007] [Received: 11/23/2022] [Revised: 08/16/2023] [Accepted: 09/20/2023] [Indexed: 06/21/2024]
Abstract
The release of AlphaFold2 has sparked a rapid expansion in protein model databases. Efficient protein structure retrieval is crucial for the analysis of structure models, while measuring the similarity between structures is the key challenge in structural retrieval. Although existing structure alignment algorithms can address this challenge, they are often time-consuming. Currently, the state-of-the-art approach involves converting protein structures into three-dimensional (3D) Zernike descriptors and assessing similarity using Euclidean distance. However, the methods for computing 3D Zernike descriptors mainly rely on structural surfaces and are predominantly web-based, thus limiting their application in studying custom datasets. To overcome this limitation, we developed FP-Zernike, a user-friendly toolkit for computing different types of Zernike descriptors based on feature points. Users simply need to enter a single line of command to calculate the Zernike descriptors of all structures in customized datasets. FP-Zernike outperforms the leading method in terms of retrieval accuracy and binary classification accuracy across diverse benchmark datasets. In addition, we showed the application of FP-Zernike in the construction of the descriptor database and the protocol used for the Protein Data Bank (PDB) dataset to facilitate the local deployment of this tool for interested readers. Our demonstration contained 590,685 structures, and at this scale, our system required only 4-9 s to complete a retrieval. The experiments confirmed that it achieved the state-of-the-art accuracy level. FP-Zernike is an open-source toolkit, with the source code and related data accessible at https://ngdc.cncb.ac.cn/biocode/tools/BT007365/releases/0.1, as well as through a webserver at http://www.structbioinfo.cn/.
Affiliation(s)
- Junhai Qi
  - Research Center for Mathematics and Interdisciplinary Sciences, Shandong University, Qingdao 266237, China
  - BioMap Research, Menlo Park, CA 94025, USA
- Chenjie Feng
  - Research Center for Mathematics and Interdisciplinary Sciences, Shandong University, Qingdao 266237, China
  - College of Medical Information and Engineering, Ningxia Medical University, Yinchuan 750004, China
- Yulin Shi
  - Research Center for Mathematics and Interdisciplinary Sciences, Shandong University, Qingdao 266237, China
- Jianyi Yang
  - Research Center for Mathematics and Interdisciplinary Sciences, Shandong University, Qingdao 266237, China
- Fa Zhang
  - Institute of Engineering Medicine, Beijing Institute of Technology, Beijing 100081, China
- Guojun Li
  - Research Center for Mathematics and Interdisciplinary Sciences, Shandong University, Qingdao 266237, China
- Renmin Han
  - Research Center for Mathematics and Interdisciplinary Sciences, Shandong University, Qingdao 266237, China
2. Parisi G, Piacentini R, Incocciati A, Bonamore A, Macone A, Rupert J, Zacco E, Miotto M, Milanetti E, Tartaglia GG, Ruocco G, Boffi A, Di Rienzo L. Design of protein-binding peptides with controlled binding affinity: the case of SARS-CoV-2 receptor binding domain and angiotensin-converting enzyme 2 derived peptides. Front Mol Biosci 2024; 10:1332359. [PMID: 38250735 PMCID: PMC10797010 DOI: 10.3389/fmolb.2023.1332359] [Received: 11/02/2023] [Accepted: 12/14/2023] [Indexed: 01/23/2024]
Abstract
The development of methods able to modulate the binding affinity between proteins and peptides is of paramount biotechnological interest, given the vast range of applications involving designed polypeptides capable of impairing or favouring protein-protein interactions. Here, we applied a peptide design algorithm based on shape complementarity optimization and electrostatic compatibility and provided the first experimental in vitro proof of the efficacy of the design algorithm. Focusing on the interaction between the SARS-CoV-2 Spike Receptor-Binding Domain (RBD) and the human angiotensin-converting enzyme 2 (ACE2) receptor, we extracted a 23-residue-long peptide that structurally mimics the major interacting portion of the ACE2 receptor and designed in silico five mutants of this peptide with modulated affinity. Remarkably, experimental KD measurements, conducted using biolayer interferometry, matched the in silico predictions. Moreover, we investigated the molecular determinants that govern the variation in binding affinity through molecular dynamics simulation, identifying the mechanisms driving the different values of binding affinity at the single-residue level. Finally, the peptide sequence with the highest affinity relative to the wild-type peptide was expressed as a fusion protein with the human H ferritin (HFt) 24-mer. Solution measurements performed on the latter constructs confirmed that the peptides still exhibited the expected trend, thereby enhancing their efficacy in RBD binding. Altogether, these results indicate the high potential of this general method for developing potent high-affinity vectors for hindering or enhancing protein-protein associations.
Affiliation(s)
- Giacomo Parisi
  - Department of Basic and Applied Sciences for Engineering (SBAI), Università “Sapienza”, Roma, Italy
- Roberta Piacentini
  - Department of Biochemical Sciences “Alessandro Rossi Fanelli”, Università “Sapienza”, Roma, Italy
- Alessio Incocciati
  - Department of Biochemical Sciences “Alessandro Rossi Fanelli”, Università “Sapienza”, Roma, Italy
- Alessandra Bonamore
  - Department of Biochemical Sciences “Alessandro Rossi Fanelli”, Università “Sapienza”, Roma, Italy
- Alberto Macone
  - Department of Biochemical Sciences “Alessandro Rossi Fanelli”, Università “Sapienza”, Roma, Italy
- Jakob Rupert
  - Department of Biology and Biotechnologies “Charles Darwin”, Università “Sapienza”, Roma, Italy
  - Centre for Human Technologies (CHT), Istituto Italiano di Tecnologia (IIT), Genova, Italy
- Elsa Zacco
  - Centre for Human Technologies (CHT), Istituto Italiano di Tecnologia (IIT), Genova, Italy
- Mattia Miotto
  - Center for Life Nano and Neuro Science, Istituto Italiano di Tecnologia (IIT), Roma, Italy
- Edoardo Milanetti
  - Center for Life Nano and Neuro Science, Istituto Italiano di Tecnologia (IIT), Roma, Italy
  - Department of Physics, Università “Sapienza”, Roma, Italy
- Gian Gaetano Tartaglia
  - Department of Biology and Biotechnologies “Charles Darwin”, Università “Sapienza”, Roma, Italy
  - Centre for Human Technologies (CHT), Istituto Italiano di Tecnologia (IIT), Genova, Italy
- Giancarlo Ruocco
  - Center for Life Nano and Neuro Science, Istituto Italiano di Tecnologia (IIT), Roma, Italy
  - Department of Physics, Università “Sapienza”, Roma, Italy
- Alberto Boffi
  - Department of Biochemical Sciences “Alessandro Rossi Fanelli”, Università “Sapienza”, Roma, Italy
- Lorenzo Di Rienzo
  - Center for Life Nano and Neuro Science, Istituto Italiano di Tecnologia (IIT), Roma, Italy
3. Yamane H, Ishida T. Helix encoder: a compound-protein interaction prediction model specifically designed for class A GPCRs. Frontiers in Bioinformatics 2023; 3:1193025. [PMID: 37304403 PMCID: PMC10250622 DOI: 10.3389/fbinf.2023.1193025] [Received: 03/24/2023] [Accepted: 05/15/2023] [Indexed: 06/13/2023]
Abstract
Class A G protein-coupled receptors (GPCRs) represent the largest class of GPCRs. They are essential targets of drug discovery, and thus various computational approaches have been applied to predict their ligands. However, many class A GPCRs are orphan receptors, which makes it difficult to use a general protein-specific supervised prediction scheme. Therefore, the compound-protein interaction (CPI) prediction approach has been considered one of the most suitable for class A GPCRs. However, the accuracy of CPI prediction is still insufficient. Current CPI prediction models generally employ the whole protein sequence as the input because it is difficult to identify the important regions in general proteins. In contrast, it is well known that only a few transmembrane helices of class A GPCRs play a critical role in ligand binding. Therefore, using such domain knowledge, the CPI prediction performance could be improved by developing an encoding method that is specifically designed for this family. In this study, we developed a protein sequence encoder called the Helix encoder, which takes only the sequences of the transmembrane regions of class A GPCRs as input. The performance evaluation showed that the proposed model achieved a higher prediction accuracy than a prediction model using the entire protein sequence. Additionally, our analysis indicated that several extracellular loops are also important for the prediction, as reported in several biological studies.
4. Di Rienzo L, Miotto M, Milanetti E, Ruocco G. Computational structural-based GPCR optimization for user-defined ligand: Implications for the development of biosensors. Comput Struct Biotechnol J 2023; 21:3002-3009. [PMID: 37249971 PMCID: PMC10220229 DOI: 10.1016/j.csbj.2023.05.004] [Received: 01/03/2023] [Revised: 04/17/2023] [Accepted: 05/04/2023] [Indexed: 05/31/2023]
Abstract
Organisms have developed effective mechanisms to sense the external environment. Human-designed biosensors exploit this natural optimization by adapting different biological machinery to detect the presence of user-defined molecules. Specifically, the pheromone pathway in the model organism Saccharomyces cerevisiae represents a suitable candidate as a synthetic signaling system. Indeed, it expresses just one G-Protein Coupled Receptor (GPCR), Ste2, able to recognize pheromone and initiate the expression of pheromone-dependent genes. To date, the standard procedure to engineer this system relies on the substitution of the yeast GPCR with another one and on the modification of the yeast G-protein to bind the inserted receptor. Here, we propose an innovative computational procedure, based on geometrical and chemical optimization of protein binding pockets, to select the amino acid substitutions required to make the native yeast GPCR able to recognize a user-defined ligand. This procedure would allow the yeast to recognize a wide range of ligands, without a priori knowledge about a GPCR recognizing them or the corresponding G protein. We used Monte Carlo simulations to design on Ste2 a binding pocket able to recognize epinephrine, selected as a test ligand. We validated Ste2 mutants via molecular docking and molecular dynamics. We verified that the amino acid substitutions we identified make Ste2 able to accommodate and remain firmly bound to epinephrine. Our results indicate that we efficiently sampled the huge space of possible mutants, proposing such a strategy as a promising starting point for the development of a new kind of S. cerevisiae-based biosensor.
Affiliation(s)
- Lorenzo Di Rienzo
  - Center for Life Nano- & Neuro-Science, Istituto Italiano di Tecnologia, Viale Regina Elena 291, 00161 Rome, Italy
- Mattia Miotto
  - Center for Life Nano- & Neuro-Science, Istituto Italiano di Tecnologia, Viale Regina Elena 291, 00161 Rome, Italy
- Edoardo Milanetti
  - Center for Life Nano- & Neuro-Science, Istituto Italiano di Tecnologia, Viale Regina Elena 291, 00161 Rome, Italy
  - Department of Physics, Sapienza University of Rome, Piazzale Aldo Moro 5, 00185 Rome, Italy
- Giancarlo Ruocco
  - Center for Life Nano- & Neuro-Science, Istituto Italiano di Tecnologia, Viale Regina Elena 291, 00161 Rome, Italy
  - Department of Physics, Sapienza University of Rome, Piazzale Aldo Moro 5, 00185 Rome, Italy