1
Li P, Liu ZP. Structure-Based Prediction of lncRNA-Protein Interactions by Deep Learning. Methods Mol Biol 2025; 2883:363-376. [PMID: 39702717 DOI: 10.1007/978-1-0716-4290-0_16] [Indexed: 12/21/2024]
Abstract
The interactions between long noncoding RNA (lncRNA) and protein play crucial roles in various biological processes. Computational methods are essential for predicting lncRNA-protein interactions and deciphering their mechanisms. In this chapter, we aim to introduce the fundamental framework for predicting lncRNA-protein interactions based on three-dimensional structure information. With the increasing availability of lncRNA and protein molecular tertiary structures, the feasibility of using deep learning methods for automatic representation and learning has become evident. This chapter outlines the key steps in predicting lncRNA-protein interactions using deep learning, including three common non-Euclidean data representations for lncRNA and proteins, as well as neural networks tailored to these specific data characteristics. We also highlight the advantages and challenges of structure-based prediction of lncRNA-protein interactions with geometric deep learning methods.
Affiliation(s)
- Pengpai Li
  - Department of Biomedical Engineering, School of Control Science and Engineering, Shandong University, Jinan, Shandong, China
- Zhi-Ping Liu
  - Department of Biomedical Engineering, School of Control Science and Engineering, Shandong University, Jinan, Shandong, China
2
Madsen AV, Mejias-Gomez O, Pedersen LE, Preben Morth J, Kristensen P, Jenkins TP, Goletz S. Structural trends in antibody-antigen binding interfaces: a computational analysis of 1833 experimentally determined 3D structures. Comput Struct Biotechnol J 2024; 23:199-211. [PMID: 38161735 PMCID: PMC10755492 DOI: 10.1016/j.csbj.2023.11.056] [Received: 10/08/2023] [Revised: 11/27/2023] [Accepted: 11/28/2023] [Indexed: 01/03/2024]
Abstract
Antibodies are attractive therapeutic candidates due to their ability to bind cognate antigens with high affinity and specificity. Still, the underlying molecular rules governing the antibody-antigen interface remain poorly understood, making in silico antibody design inherently difficult and keeping the discovery and design of novel antibodies a costly and laborious process. This study investigates the characteristics of antibody-antigen binding interfaces through a computational analysis of more than 850,000 atom-atom contacts from the largest reported set of antibody-antigen complexes with 1833 nonredundant, experimentally determined structures. The analysis compares binding characteristics of conventional antibodies and single-domain antibodies (sdAbs) targeting both protein and peptide antigens. We find clear patterns in the number of antibody-antigen contacts and amino acid frequencies in the paratope. The direct comparison of sdAbs and conventional antibodies helps elucidate the mechanisms employed by sdAbs to compensate for their smaller size and the fact that they harbor only half the number of complementarity-determining regions compared to conventional antibodies. Furthermore, we pinpoint antibody interface hotspot residues that are often found at the binding interface and the amino acid frequencies at these positions. These findings have direct potential applications in antibody engineering and the design of improved antibody libraries.
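As a rough illustration of how atom-atom contacts across an interface can be enumerated, the sketch below flags antibody-antigen atom pairs that lie within a distance cutoff. The 4.0 Å cutoff, the function name `find_contacts`, and the toy coordinates are assumptions made for illustration; the abstract does not state the contact definition used in the study.

```python
import numpy as np

def find_contacts(antibody_xyz, antigen_xyz, cutoff=4.0):
    """Return (antibody_index, antigen_index) pairs closer than `cutoff`.

    The cutoff value is a hypothetical choice, not taken from the study.
    """
    # Pairwise difference vectors, shape (n_antibody, n_antigen, 3).
    diff = antibody_xyz[:, None, :] - antigen_xyz[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    # argwhere lists the index pairs in row-major order.
    return np.argwhere(dist <= cutoff)

# Toy coordinates in angstroms (illustrative only).
antibody = np.array([[0.0, 0.0, 0.0], [10.0, 0.0, 0.0]])
antigen = np.array([[3.0, 0.0, 0.0], [0.0, 0.0, 20.0], [12.0, 0.0, 0.0]])
contacts = find_contacts(antibody, antigen)
print(contacts.tolist())
```

For full complexes the dense distance matrix becomes large, so a spatial index such as a k-d tree would likely be a better fit than this all-pairs computation.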
Affiliation(s)
- Andreas V. Madsen
  - Department of Biotechnology and Biomedicine, Technical University of Denmark, Kgs. Lyngby, Denmark
- Oscar Mejias-Gomez
  - Department of Biotechnology and Biomedicine, Technical University of Denmark, Kgs. Lyngby, Denmark
- Lasse E. Pedersen
  - Department of Biotechnology and Biomedicine, Technical University of Denmark, Kgs. Lyngby, Denmark
- J. Preben Morth
  - Department of Biotechnology and Biomedicine, Technical University of Denmark, Kgs. Lyngby, Denmark
- Peter Kristensen
  - Department of Chemistry and Bioscience, Aalborg University, Aalborg, Denmark
- Timothy P. Jenkins
  - Department of Biotechnology and Biomedicine, Technical University of Denmark, Kgs. Lyngby, Denmark
- Steffen Goletz
  - Department of Biotechnology and Biomedicine, Technical University of Denmark, Kgs. Lyngby, Denmark
3
Kalemati M, Noroozi A, Shahbakhsh A, Koohi S. ParaAntiProt provides paratope prediction using antibody and protein language models. Sci Rep 2024; 14:29141. [PMID: 39587231 PMCID: PMC11589832 DOI: 10.1038/s41598-024-80940-y] [Received: 07/18/2024] [Accepted: 11/22/2024] [Indexed: 11/27/2024]
Abstract
Efficiently predicting the paratope holds immense potential for enhancing antibody design, treating cancers and other serious diseases, and advancing personalized medicine. Although traditional methods are highly accurate, they are often time-consuming, labor-intensive, and reliant on 3D structures, restricting their broader use. On the other hand, machine learning-based methods, besides relying on structural data, entail descriptor computation, consideration of diverse physicochemical properties, and feature engineering. Here, we develop a deep learning-assisted prediction method for paratope identification, relying solely on amino acid sequences and being antigen-agnostic. Built on the ProtTrans architecture, and utilizing pre-trained protein and antibody language models, we extract efficient embeddings for predicting paratope. By incorporating positional encoding for Complementarity Determining Regions, our model gains a deeper structural understanding, achieving remarkable performance with a 0.904 ROC AUC, 0.701 F1-score, and 0.585 MCC on benchmark datasets. In addition to yielding accurate antibody paratope predictions, our method exhibits strong performance in predicting nanobody paratope, achieving a ROC AUC of 0.912 and a PR AUC of 0.665 on the nanobody dataset. Notably, our approach outperforms structure-based prediction methods, boasting a PR AUC of 0.731. Various conducted ablation studies, which elaborate on the impact of each part of the model on the prediction task, show that the improvement in prediction performance by applying CDR positional encoding together with CNNs depends on the specific protein and antibody language models used. These results highlight the potential of our method to advance disease understanding and aid in the discovery of new diagnostics and antibody therapies.
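The F1-score and Matthews correlation coefficient (MCC) reported above are both derived from the confusion matrix of binary per-residue predictions (paratope vs. non-paratope). The sketch below applies the standard definitions to a toy example; the function name and the labels are illustrative and are not taken from the paper or its code.

```python
import math

def f1_and_mcc(y_true, y_pred):
    """Compute F1 and MCC from binary labels (1 = paratope residue)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)

    # F1 is the harmonic mean of precision and recall.
    f1_denom = 2 * tp + fp + fn
    f1 = 2 * tp / f1_denom if f1_denom else 0.0

    # MCC uses all four confusion-matrix cells, including true negatives.
    mcc_denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / mcc_denom if mcc_denom else 0.0
    return f1, mcc

# Toy sequence of 8 residues, 3 of which are true paratope residues.
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 1, 0, 0, 0, 0]
f1, mcc = f1_and_mcc(y_true, y_pred)
print(round(f1, 3), round(mcc, 3))
```

Because paratope residues are a minority of each sequence, MCC is often reported alongside F1: it also accounts for true negatives.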
Affiliation(s)
- Mahmood Kalemati
  - Department of Computer Engineering, Sharif University of Technology, Tehran, Iran
- Alireza Noroozi
  - Department of Computer Engineering, Sharif University of Technology, Tehran, Iran
- Aref Shahbakhsh
  - Department of Computer Engineering, Sharif University of Technology, Tehran, Iran
- Somayyeh Koohi
  - Department of Computer Engineering, Sharif University of Technology, Tehran, Iran
4
Su Y, Zeng X, Zhang L, Bian Y, Wang Y, Ma B. ABTrans: A Transformer-based Model for Predicting Interaction between Anti-Aβ Antibodies and Peptides. Interdiscip Sci 2024:10.1007/s12539-024-00664-5. [PMID: 39466358 DOI: 10.1007/s12539-024-00664-5] [Received: 03/16/2024] [Revised: 09/29/2024] [Accepted: 10/04/2024] [Indexed: 10/30/2024]
Abstract
Antibodies against Aβ peptide have been recently approved to treat Alzheimer's disease, underscoring the importance of understanding their interactions for developing more potent treatments. Here we investigated the interaction between anti-Aβ antibodies and various peptides using a deep learning model. Our model, ABTrans, was trained on dodecapeptide sequences from phage display experiments and known anti-Aβ antibody sequences sourced from public sources. It classified the binding ability between anti-Aβ antibodies and dodecapeptides into four levels: not binding, weak binding, medium binding, and strong binding, achieving an accuracy of 0.83. Using ABTrans, we examined the cross-reaction of anti-Aβ antibodies with other human amyloidogenic proteins, revealing that Aducanumab and Donanemab exhibited the least cross-reactivity. Additionally, we systematically screened interactions between eleven selected anti-Aβ antibodies and all human proteins to identify potential off-target candidates.
Affiliation(s)
- Yuhong Su
  - Engineering Research Center of Cell & Therapeutic Antibody (MOE), School of Pharmacy, Shanghai Jiao Tong University, Shanghai, 200240, China
- Xincheng Zeng
  - Engineering Research Center of Cell & Therapeutic Antibody (MOE), School of Pharmacy, Shanghai Jiao Tong University, Shanghai, 200240, China
- Lingfeng Zhang
  - School of Electrical Engineering and Computer Science, University of Ottawa, 75 Laurier Ave, Ottawa, K1N 6N5, Canada
- Yanlin Bian
  - Engineering Research Center of Cell & Therapeutic Antibody (MOE), School of Pharmacy, Shanghai Jiao Tong University, Shanghai, 200240, China
- Yangjing Wang
  - Engineering Research Center of Cell & Therapeutic Antibody (MOE), School of Pharmacy, Shanghai Jiao Tong University, Shanghai, 200240, China
- Buyong Ma
  - Engineering Research Center of Cell & Therapeutic Antibody (MOE), School of Pharmacy, Shanghai Jiao Tong University, Shanghai, 200240, China
  - Shanghai Digiwiser Biological, Inc, Shanghai, 200240, China
5
Lai JS, Burley SK, Duarte JM. ZMPY3D: accelerating protein structure volume analysis through vectorized 3D Zernike moments and Python-based GPU integration. Bioinformatics Advances 2024; 4:vbae111. [PMID: 39100546 PMCID: PMC11297494 DOI: 10.1093/bioadv/vbae111] [Received: 05/14/2024] [Revised: 07/12/2024] [Accepted: 07/25/2024] [Indexed: 08/06/2024]
Abstract
Motivation: Volumetric 3D object analyses are being applied in research fields such as structural bioinformatics, biophysics, and structural biology, with potential integration of artificial intelligence/machine learning (AI/ML) techniques. One such method, 3D Zernike moments, has proven valuable in analyzing protein structures (e.g., protein fold classification, protein-protein interaction analysis, and molecular dynamics simulations). Their compactness and efficiency make them amenable to large-scale analyses. Established methods for deriving 3D Zernike moments, however, can be inefficient, particularly when higher order terms are required, hindering broader applications. As the volume of experimental and computationally-predicted protein structure information continues to increase, structural biology has become a "big data" science requiring more efficient analysis tools.
Results: This application note presents a Python-based software package, ZMPY3D, to accelerate computation of 3D Zernike moments by vectorizing the mathematical formulae and using graphical processing units (GPUs). The package offers popular GPU-supported libraries such as CuPy and TensorFlow together with NumPy implementations, aiming to improve computational efficiency, adaptability, and flexibility in future algorithm development. The ZMPY3D package can be installed via PyPI, and the source code is available from GitHub. Volumetric-based protein 3D structural similarity scores and transform matrix of superposition functionalities have both been implemented, creating a powerful computational tool that will allow the research community to amalgamate 3D Zernike moments with existing AI/ML tools, to advance research and education in protein structure bioinformatics.
Availability and implementation: ZMPY3D, implemented in Python, is available on GitHub (https://github.com/tawssie/ZMPY3D) and PyPI, released under the GPL License.
Affiliation(s)
- Jhih-Siang Lai
  - Research Collaboratory for Structural Bioinformatics Protein Data Bank, San Diego Supercomputer Center, University of California, La Jolla, CA 92093, United States
- Stephen K Burley
  - Research Collaboratory for Structural Bioinformatics Protein Data Bank, San Diego Supercomputer Center, University of California, La Jolla, CA 92093, United States
  - Research Collaboratory for Structural Bioinformatics Protein Data Bank, Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, United States
  - Department of Chemistry and Chemical Biology, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, United States
  - Cancer Institute of New Jersey, Rutgers, The State University of New Jersey, New Brunswick, NJ 08901, United States
- Jose M Duarte
  - Research Collaboratory for Structural Bioinformatics Protein Data Bank, San Diego Supercomputer Center, University of California, La Jolla, CA 92093, United States
6
Richardson E, Trevizani R, Greenbaum JA, Carter H, Nielsen M, Peters B. The receiver operating characteristic curve accurately assesses imbalanced datasets. Patterns (New York, N.Y.) 2024; 5:100994. [PMID: 39005487 PMCID: PMC11240176 DOI: 10.1016/j.patter.2024.100994] [Received: 12/01/2023] [Revised: 03/05/2024] [Accepted: 05/03/2024] [Indexed: 07/16/2024]
Abstract
Many problems in biology require looking for a "needle in a haystack," corresponding to a binary classification where there are a few positives within a much larger set of negatives, which is referred to as a class imbalance. The receiver operating characteristic (ROC) curve and the associated area under the curve (AUC) have been reported as ill-suited to evaluate prediction performance on imbalanced problems where there is more interest in performance on the positive minority class, while the precision-recall (PR) curve is preferable. We show via simulation and a real case study that this is a misinterpretation of the difference between the ROC and PR spaces, showing that the ROC curve is robust to class imbalance, while the PR curve is highly sensitive to class imbalance. Furthermore, we show that class imbalance cannot be easily disentangled from classifier performance measured via PR-AUC.
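The claim that ROC-AUC is robust to class imbalance while PR-AUC is not can be checked with a small simulation. The sketch below is not the authors' simulation: it assumes a hypothetical classifier whose scores for positives and negatives are drawn from two unit-variance normal distributions one standard deviation apart, and it changes only the number of negatives between runs. The function names, sample sizes, and the use of average precision as the PR-AUC estimate are all illustrative choices.

```python
import numpy as np

def roc_auc(labels, scores):
    """ROC AUC via the Mann-Whitney rank statistic (assumes no tied scores)."""
    labels = np.asarray(labels, dtype=bool)
    ranks = np.empty(len(scores))
    ranks[np.argsort(scores)] = np.arange(1, len(scores) + 1)
    n_pos = int(labels.sum())
    n_neg = len(labels) - n_pos
    return (ranks[labels].sum() - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)

def average_precision(labels, scores):
    """PR-AUC estimate: mean of the precision at the rank of each positive."""
    labels = np.asarray(labels, dtype=bool)
    hits = labels[np.argsort(-scores)]  # labels sorted by descending score
    precision_at_k = np.cumsum(hits) / np.arange(1, len(hits) + 1)
    return precision_at_k[hits].mean()

def simulate(n_pos, n_neg, seed=0):
    """Same score distributions every time; only the class ratio changes."""
    rng = np.random.default_rng(seed)
    scores = np.concatenate([rng.normal(1.0, 1.0, n_pos),
                             rng.normal(0.0, 1.0, n_neg)])
    labels = np.concatenate([np.ones(n_pos, dtype=bool),
                             np.zeros(n_neg, dtype=bool)])
    return roc_auc(labels, scores), average_precision(labels, scores)

balanced = simulate(1000, 1000)        # 1:1 positives to negatives
imbalanced = simulate(1000, 100_000)   # 1:100 positives to negatives
print("balanced   ROC-AUC %.3f  PR-AUC %.3f" % balanced)
print("imbalanced ROC-AUC %.3f  PR-AUC %.3f" % imbalanced)
```

Under these assumptions the two ROC-AUC values should agree closely, while PR-AUC drops sharply in the imbalanced run even though the classifier itself is unchanged.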
Affiliation(s)
- Eve Richardson
  - Center for Infectious Disease and Vaccine Research, La Jolla Institute for Immunology, La Jolla, CA, USA
- Raphael Trevizani
  - Center for Infectious Disease and Vaccine Research, La Jolla Institute for Immunology, La Jolla, CA, USA
  - Fiocruz Ceará, Fundação Oswaldo Cruz, Rua São José s/n, Precabura, Eusébio/CE, Brazil
- Jason A Greenbaum
  - Center for Infectious Disease and Vaccine Research, La Jolla Institute for Immunology, La Jolla, CA, USA
- Hannah Carter
  - Department of Medicine, University of California, La Jolla, CA, USA
- Morten Nielsen
  - Department of Health Technology, Section for Bioinformatics, Technical University of Denmark, Lyngby, Denmark
- Bjoern Peters
  - Center for Infectious Disease and Vaccine Research, La Jolla Institute for Immunology, La Jolla, CA, USA
7
Joubbi S, Micheli A, Milazzo P, Maccari G, Ciano G, Cardamone D, Medini D. Antibody design using deep learning: from sequence and structure design to affinity maturation. Brief Bioinform 2024; 25:bbae307. [PMID: 38960409 PMCID: PMC11221890 DOI: 10.1093/bib/bbae307] [Received: 03/03/2024] [Revised: 05/20/2024] [Accepted: 06/12/2024] [Indexed: 07/05/2024]
Abstract
Deep learning has achieved impressive results in various fields such as computer vision and natural language processing, making it a powerful tool in biology. Its applications now encompass cellular image classification, genomic studies and drug discovery. While drug development traditionally focused deep learning applications on small molecules, recent innovations have incorporated it in the discovery and development of biological molecules, particularly antibodies. Researchers have devised novel techniques to streamline antibody development, combining in vitro and in silico methods. In particular, computational power expedites lead candidate generation, scaling and potential antibody development against complex antigens. This survey highlights significant advancements in protein design and optimization, specifically focusing on antibodies. This includes various aspects such as design, folding, antibody-antigen interactions docking and affinity maturation.
Affiliation(s)
- Sara Joubbi
  - Department of Computer Science, University of Pisa, Largo B. Pontecorvo, 3, 56127, Pisa, Italy
  - Data Science for Health (DaScH) Lab, Fondazione Toscana Life Sciences, Via Fiorentina, 1, 53100, Siena, Italy
- Alessio Micheli
  - Department of Computer Science, University of Pisa, Largo B. Pontecorvo, 3, 56127, Pisa, Italy
- Paolo Milazzo
  - Department of Computer Science, University of Pisa, Largo B. Pontecorvo, 3, 56127, Pisa, Italy
- Giuseppe Maccari
  - Data Science for Health (DaScH) Lab, Fondazione Toscana Life Sciences, Via Fiorentina, 1, 53100, Siena, Italy
- Giorgio Ciano
  - Data Science for Health (DaScH) Lab, Fondazione Toscana Life Sciences, Via Fiorentina, 1, 53100, Siena, Italy
- Dario Cardamone
  - Data Science for Health (DaScH) Lab, Fondazione Toscana Life Sciences, Via Fiorentina, 1, 53100, Siena, Italy
- Duccio Medini
  - Data Science for Health (DaScH) Lab, Fondazione Toscana Life Sciences, Via Fiorentina, 1, 53100, Siena, Italy
8
Qi J, Feng C, Shi Y, Yang J, Zhang F, Li G, Han R. FP-Zernike: An Open-source Structural Database Construction Toolkit for Fast Structure Retrieval. Genomics, Proteomics & Bioinformatics 2024; 22:qzae007. [PMID: 38894604 PMCID: PMC11423855 DOI: 10.1093/gpbjnl/qzae007] [Received: 11/23/2022] [Revised: 08/16/2023] [Accepted: 09/20/2023] [Indexed: 06/21/2024]
Abstract
The release of AlphaFold2 has sparked a rapid expansion in protein model databases. Efficient protein structure retrieval is crucial for the analysis of structure models, while measuring the similarity between structures is the key challenge in structural retrieval. Although existing structure alignment algorithms can address this challenge, they are often time-consuming. Currently, the state-of-the-art approach involves converting protein structures into three-dimensional (3D) Zernike descriptors and assessing similarity using Euclidean distance. However, the methods for computing 3D Zernike descriptors mainly rely on structural surfaces and are predominantly web-based, thus limiting their application in studying custom datasets. To overcome this limitation, we developed FP-Zernike, a user-friendly toolkit for computing different types of Zernike descriptors based on feature points. Users simply need to enter a single line of command to calculate the Zernike descriptors of all structures in customized datasets. FP-Zernike outperforms the leading method in terms of retrieval accuracy and binary classification accuracy across diverse benchmark datasets. In addition, we showed the application of FP-Zernike in the construction of the descriptor database and the protocol used for the Protein Data Bank (PDB) dataset to facilitate the local deployment of this tool for interested readers. Our demonstration contained 590,685 structures, and at this scale, our system required only 4-9 s to complete a retrieval. The experiments confirmed that it achieved the state-of-the-art accuracy level. FP-Zernike is an open-source toolkit, with the source code and related data accessible at https://ngdc.cncb.ac.cn/biocode/tools/BT007365/releases/0.1, as well as through a webserver at http://www.structbioinfo.cn/.
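Once each structure has been reduced to a fixed-length descriptor vector, retrieval by Euclidean distance becomes a nearest-neighbour search. The sketch below illustrates only that search step. It does not compute Zernike descriptors and does not use FP-Zernike's interface: the random vectors, the descriptor length of 121, and the function name `retrieve` are placeholders assumed for illustration.

```python
import numpy as np

def retrieve(query, database, top_k=5):
    """Return indices and distances of the top_k closest descriptors."""
    dists = np.linalg.norm(database - query, axis=1)  # Euclidean distance
    top_k = min(top_k, len(dists))
    # argpartition finds the top_k smallest without sorting everything.
    nearest = np.argpartition(dists, top_k - 1)[:top_k]
    nearest = nearest[np.argsort(dists[nearest])]     # order by distance
    return nearest, dists[nearest]

rng = np.random.default_rng(0)
# Placeholder descriptors: random numbers, NOT real Zernike output.
database = rng.random((10_000, 121))
# The query is a slightly perturbed copy of entry 42.
query = database[42] + rng.normal(0.0, 1e-3, 121)

idx, dist = retrieve(query, database, top_k=3)
print(idx, dist)
```

A brute-force scan like this is linear in the database size; at the scale of hundreds of thousands of structures mentioned above, an approximate nearest-neighbour index could be substituted without changing the calling code.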
Affiliation(s)
- Junhai Qi
  - Research Center for Mathematics and Interdisciplinary Sciences, Shandong University, Qingdao 266237, China
  - BioMap Research, Menlo Park, CA 94025, USA
- Chenjie Feng
  - Research Center for Mathematics and Interdisciplinary Sciences, Shandong University, Qingdao 266237, China
  - College of Medical Information and Engineering, Ningxia Medical University, Yinchuan 750004, China
- Yulin Shi
  - Research Center for Mathematics and Interdisciplinary Sciences, Shandong University, Qingdao 266237, China
- Jianyi Yang
  - Research Center for Mathematics and Interdisciplinary Sciences, Shandong University, Qingdao 266237, China
- Fa Zhang
  - Institute of Engineering Medicine, Beijing Institute of Technology, Beijing 100081, China
- Guojun Li
  - Research Center for Mathematics and Interdisciplinary Sciences, Shandong University, Qingdao 266237, China
- Renmin Han
  - Research Center for Mathematics and Interdisciplinary Sciences, Shandong University, Qingdao 266237, China
9
Gallo E. Revolutionizing Synthetic Antibody Design: Harnessing Artificial Intelligence and Deep Sequencing Big Data for Unprecedented Advances. Mol Biotechnol 2024:10.1007/s12033-024-01064-2. [PMID: 38308755 DOI: 10.1007/s12033-024-01064-2] [Received: 11/03/2023] [Accepted: 01/02/2024] [Indexed: 02/05/2024]
Abstract
Synthetic antibodies (Abs) represent a category of engineered proteins meticulously crafted to replicate the functions of their natural counterparts. Such Abs are generated in vitro, enabling advanced molecular alterations associated with antigen recognition, paratope site engineering, and biochemical refinements. In a parallel realm, deep sequencing has brought about a paradigm shift in molecular biology. It facilitates the prompt and cost-effective high-throughput sequencing of DNA and RNA molecules, enabling the comprehensive big data analysis of Ab transcriptomes, including specific regions of interest. Significantly, the integration of artificial intelligence (AI), based on machine- and deep- learning approaches, has fundamentally transformed our capacity to discern patterns hidden within deep sequencing big data, including distinctive Ab features and protein folding free energy landscapes. Ultimately, current AI advances can generate approximations of the most stable Ab structural configurations, enabling the prediction of de novo synthetic Abs. As a result, this manuscript comprehensively examines the latest and relevant literature concerning the intersection of deep sequencing big data and AI methodologies for the design and development of synthetic Abs. Together, these advancements have accelerated the exploration of antibody repertoires, contributing to the refinement of synthetic Ab engineering and optimizations, and facilitating advancements in the lead identification process.
Affiliation(s)
- Eugenio Gallo
  - Avance Biologicals, Department of Medicinal Chemistry, 950 Dupont Street, Toronto, ON, M6H 1Z2, Canada
  - RevivAb, Department of Protein Engineering, Av. Ipiranga, 6681, Partenon, Porto Alegre, RS, 90619-900, Brazil
10
Bravi B. Development and use of machine learning algorithms in vaccine target selection. NPJ Vaccines 2024; 9:15. [PMID: 38242890 PMCID: PMC10798987 DOI: 10.1038/s41541-023-00795-8] [Received: 08/04/2023] [Accepted: 12/07/2023] [Indexed: 01/21/2024]
Abstract
Computer-aided discovery of vaccine targets has become a cornerstone of rational vaccine design. In this article, I discuss how Machine Learning (ML) can inform and guide key computational steps in rational vaccine design concerned with the identification of B and T cell epitopes and correlates of protection. I provide examples of ML models, as well as types of data and predictions for which they are built. I argue that interpretable ML has the potential to improve the identification of immunogens also as a tool for scientific discovery, by helping elucidate the molecular processes underlying vaccine-induced immune responses. I outline the limitations and challenges in terms of data availability and method development that need to be addressed to bridge the gap between advances in ML predictions and their translational application to vaccine design.
Affiliation(s)
- Barbara Bravi
  - Department of Mathematics, Imperial College London, London, SW7 2AZ, UK
11
Parisi G, Piacentini R, Incocciati A, Bonamore A, Macone A, Rupert J, Zacco E, Miotto M, Milanetti E, Tartaglia GG, Ruocco G, Boffi A, Di Rienzo L. Design of protein-binding peptides with controlled binding affinity: the case of SARS-CoV-2 receptor binding domain and angiotensin-converting enzyme 2 derived peptides. Front Mol Biosci 2024; 10:1332359. [PMID: 38250735 PMCID: PMC10797010 DOI: 10.3389/fmolb.2023.1332359] [Received: 11/02/2023] [Accepted: 12/14/2023] [Indexed: 01/23/2024]
Abstract
The development of methods able to modulate the binding affinity between proteins and peptides is of paramount biotechnological interest in view of a vast range of applications that involve designed polypeptides capable of impairing or favouring Protein-Protein Interactions. Here, we applied a peptide design algorithm based on shape complementarity optimization and electrostatic compatibility and provided the first experimental in vitro proof of the efficacy of the design algorithm. Focusing on the interaction between the SARS-CoV-2 Spike Receptor-Binding Domain (RBD) and the human angiotensin-converting enzyme 2 (ACE2) receptor, we extracted a 23-residue-long peptide that structurally mimics the major interacting portion of the ACE2 receptor and designed in silico five mutants of this peptide with a modulated affinity. Remarkably, experimental KD measurements, conducted using biolayer interferometry, matched the in silico predictions. Moreover, we investigated the molecular determinants that govern the variation in binding affinity through molecular dynamics simulation, by identifying the mechanisms driving the different values of binding affinity at a single residue level. Finally, the peptide sequence with the highest affinity, in comparison with the wild type peptide, was expressed as a fusion protein with human H ferritin (HFt) 24-mer. Solution measurements performed on the latter constructs confirmed that peptides still exhibited the expected trend, thereby enhancing their efficacy in RBD binding. Altogether, these results indicate the high potential of this general method in developing potent high-affinity vectors for hindering/enhancing protein-protein associations.
Affiliation(s)
- Giacomo Parisi
  - Department of Basic and Applied Sciences for Engineering (SBAI), Università “Sapienza”, Roma, Italy
- Roberta Piacentini
  - Department of Biochemical Sciences “Alessandro Rossi Fanelli”, Università “Sapienza”, Roma, Italy
- Alessio Incocciati
  - Department of Biochemical Sciences “Alessandro Rossi Fanelli”, Università “Sapienza”, Roma, Italy
- Alessandra Bonamore
  - Department of Biochemical Sciences “Alessandro Rossi Fanelli”, Università “Sapienza”, Roma, Italy
- Alberto Macone
  - Department of Biochemical Sciences “Alessandro Rossi Fanelli”, Università “Sapienza”, Roma, Italy
- Jakob Rupert
  - Department of Biology and Biotechnologies “Charles Darwin”, Università “Sapienza”, Roma, Italy
  - Centre for Human Technologies (CHT), Istituto Italiano di Tecnologia (IIT), Genova, Italy
- Elsa Zacco
  - Centre for Human Technologies (CHT), Istituto Italiano di Tecnologia (IIT), Genova, Italy
- Mattia Miotto
  - Center for Life Nano and Neuro Science, Istituto Italiano di Tecnologia (IIT), Roma, Italy
- Edoardo Milanetti
  - Center for Life Nano and Neuro Science, Istituto Italiano di Tecnologia (IIT), Roma, Italy
  - Department of Physics, Università “Sapienza”, Roma, Italy
- Gian Gaetano Tartaglia
  - Department of Biology and Biotechnologies “Charles Darwin”, Università “Sapienza”, Roma, Italy
  - Centre for Human Technologies (CHT), Istituto Italiano di Tecnologia (IIT), Genova, Italy
- Giancarlo Ruocco
  - Center for Life Nano and Neuro Science, Istituto Italiano di Tecnologia (IIT), Roma, Italy
  - Department of Physics, Università “Sapienza”, Roma, Italy
- Alberto Boffi
  - Department of Biochemical Sciences “Alessandro Rossi Fanelli”, Università “Sapienza”, Roma, Italy
- Lorenzo Di Rienzo
  - Center for Life Nano and Neuro Science, Istituto Italiano di Tecnologia (IIT), Roma, Italy
12
Quadrini M, Ferrari C. Exploiting the Role of Features for Antigens-Antibodies Interaction Site Prediction. Methods Mol Biol 2024; 2780:303-325. [PMID: 38987475 DOI: 10.1007/978-1-0716-3985-6_16] [Indexed: 07/12/2024]
Abstract
Antibodies are a class of proteins that recognize and neutralize pathogens by binding to their antigens. They are the most significant category of biopharmaceuticals for both diagnostic and therapeutic applications. Understanding how antibodies interact with their antigens plays a fundamental role in drug and vaccine design and helps to comprehend the complex antigen binding mechanisms. Computational methods for predicting antibody-antigen interaction sites are of great value due to the overall cost of experimental methods. Machine learning methods and deep learning techniques have obtained promising results. In this work, we predict antibody interaction interface sites by applying HSS-PPI, a hybrid method defined to predict the interface sites of general proteins. The approach abstracts the proteins in terms of hierarchical representation and uses a graph convolutional network to classify the amino acids as interface or non-interface. Moreover, we also equipped the amino acids with different sets of physicochemical features together with structural ones to describe the residues. Analyzing the results, we observe that the structural features play a fundamental role in the amino acid descriptions. We compare the obtained performances, evaluated using standard metrics, with the ones obtained with SVM with 3D Zernike descriptors, Parapred, Paratome, and Antibody i-Patch.
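To make the graph convolutional step concrete, the sketch below implements a single generic graph convolution layer over a residue contact graph, using the widely used symmetric normalisation with self-loops. It is not the HSS-PPI implementation: the three-residue graph, the one-hot features, the all-ones weight matrix, and the function name `gcn_layer` are assumptions chosen only to keep the example small.

```python
import numpy as np

def gcn_layer(adj, feats, weight):
    """One graph convolution: ReLU(D^-1/2 (A + I) D^-1/2 X W)."""
    a_hat = adj + np.eye(adj.shape[0])   # add self-loops
    deg = a_hat.sum(axis=1)              # node degrees (>= 1 due to self-loops)
    d_inv_sqrt = 1.0 / np.sqrt(deg)
    norm_adj = a_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]
    return np.maximum(norm_adj @ feats @ weight, 0.0)

# Toy residue graph: a chain of three residues (0-1 and 1-2 in contact).
adj = np.array([[0.0, 1.0, 0.0],
                [1.0, 0.0, 1.0],
                [0.0, 1.0, 0.0]])
feats = np.eye(3)            # placeholder per-residue features
weight = np.ones((3, 2))     # placeholder weights (learned in practice)

hidden = gcn_layer(adj, feats, weight)
print(hidden)
```

In a residue classifier, several such layers would typically be stacked and followed by a per-node output layer that scores each residue as interface or non-interface.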
Affiliation(s)
- Michela Quadrini
  - School of Science and Technology, University of Camerino, Camerino, Italy
- Carlo Ferrari
  - Department of Information Engineering, University of Padua, Padua, Italy
13
Bachmann Salvy M, Santuari L, Schmid-Siegert E, Lykoskoufis N, Xenarios I, Arpat B. Seq2scFv: a toolkit for the comprehensive analysis of display libraries from long-read sequencing platforms. MAbs 2024; 16:2408344. [PMID: 39379324 PMCID: PMC11469439 DOI: 10.1080/19420862.2024.2408344] [Received: 07/04/2024] [Revised: 09/15/2024] [Accepted: 09/19/2024] [Indexed: 10/10/2024]
Abstract
Antibodies have emerged as the leading class of biotherapeutics, yet traditional screening methods face significant time and resource challenges in identifying lead candidates. Integrating high-throughput sequencing with computational approaches marks a pivotal advancement in antibody discovery, expanding the antibody space to explore. In this context, a major breakthrough has been the full-length sequencing of single-chain variable fragments (scFvs) used in in vitro display libraries. However, few tools address the task of annotating the paired heavy and light chain variable domains (VH and VL), which is the primary advantage of full-scFv sequencing. To address this methodological gap, we introduce Seq2scFv, a novel open-source toolkit designed for analyzing in vitro display libraries from long-read sequencing platforms. Seq2scFv facilitates the identification and thorough characterization of V(D)J recombination in both VH and VL regions. In addition to providing annotated scFvs, translated sequences and numbered chains, Seq2scFv enables linker inference and characterization, sequence encoding with unique identifiers and quantification of identical sequences across selection rounds, thereby simplifying enrichment identification. With its versatile and standalone functionality, we anticipate that the implementation of Seq2scFv tools in antibody discovery pipelines will efficiently expedite the full characterization of display libraries and potentially facilitate the identification of high-affinity antibody candidates.
Affiliation(s)
| | - Luca Santuari
- NGS-AI Division, JSR Life Sciences, Epalinges, Switzerland
| | | | | | | | - Bulak Arpat
- NGS-AI Division, JSR Life Sciences, Epalinges, Switzerland
|
14
|
Yuan M, Shen A, Fu K, Guan J, Ma Y, Qiao Q, Wang M. ProteinMAE: masked autoencoder for protein surface self-supervised learning. Bioinformatics 2023; 39:btad724. [PMID: 38019955 PMCID: PMC10713117 DOI: 10.1093/bioinformatics/btad724] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/30/2023] [Revised: 10/27/2023] [Accepted: 11/28/2023] [Indexed: 12/01/2023] Open
Abstract
SUMMARY The biological functions of proteins are determined by the chemical and geometric properties of their surfaces. Recently, with the booming progress of deep learning, a series of learning-based surface descriptors have been proposed and achieved inspirational performance in many tasks such as protein design, protein-protein interaction prediction, etc. However, they are still limited by the problem of label scarcity, since the labels are typically obtained through wet experiments. Inspired by the great success of self-supervised learning in natural language processing and computer vision, we introduce ProteinMAE, a self-supervised framework specifically designed for protein surface representation to mitigate label scarcity. Specifically, we propose an efficient network and utilize a large number of accessible unlabeled protein data to pretrain it by self-supervised learning. Then we use the pretrained weights as initialization and fine-tune the network on downstream tasks. To demonstrate the effectiveness of our method, we conduct experiments on three different downstream tasks including binding site identification in protein surface, ligand-binding protein pocket classification, and protein-protein interaction prediction. The extensive experiments show that our method not only successfully improves the network's performance on all downstream tasks, but also achieves competitive performance with state-of-the-art methods. Moreover, our proposed network also exhibits significant advantages in terms of computational cost, which only requires less than a tenth of memory cost of previous methods. AVAILABILITY AND IMPLEMENTATION https://github.com/phdymz/ProteinMAE.
Affiliation(s)
- Mingzhi Yuan
- Digital Medical Research Center, School of Basic Medical Sciences, Fudan University, Shanghai 200032, China
- Shanghai Key Laboratory of Medical Image Computing and Computer Assisted Intervention, Fudan University, Shanghai 200032, China
| | - Ao Shen
- Digital Medical Research Center, School of Basic Medical Sciences, Fudan University, Shanghai 200032, China
- Shanghai Key Laboratory of Medical Image Computing and Computer Assisted Intervention, Fudan University, Shanghai 200032, China
| | - Kexue Fu
- Digital Medical Research Center, School of Basic Medical Sciences, Fudan University, Shanghai 200032, China
- Shanghai Key Laboratory of Medical Image Computing and Computer Assisted Intervention, Fudan University, Shanghai 200032, China
| | - Jiaming Guan
- Digital Medical Research Center, School of Basic Medical Sciences, Fudan University, Shanghai 200032, China
- Shanghai Key Laboratory of Medical Image Computing and Computer Assisted Intervention, Fudan University, Shanghai 200032, China
| | - Yingfan Ma
- Digital Medical Research Center, School of Basic Medical Sciences, Fudan University, Shanghai 200032, China
- Shanghai Key Laboratory of Medical Image Computing and Computer Assisted Intervention, Fudan University, Shanghai 200032, China
| | - Qin Qiao
- Digital Medical Research Center, School of Basic Medical Sciences, Fudan University, Shanghai 200032, China
- Shanghai Key Laboratory of Medical Image Computing and Computer Assisted Intervention, Fudan University, Shanghai 200032, China
| | - Manning Wang
- Digital Medical Research Center, School of Basic Medical Sciences, Fudan University, Shanghai 200032, China
- Shanghai Key Laboratory of Medical Image Computing and Computer Assisted Intervention, Fudan University, Shanghai 200032, China
|
15
|
Bai G, Sun C, Guo Z, Wang Y, Zeng X, Su Y, Zhao Q, Ma B. Accelerating antibody discovery and design with artificial intelligence: Recent advances and prospects. Semin Cancer Biol 2023; 95:13-24. [PMID: 37355214 DOI: 10.1016/j.semcancer.2023.06.005] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/23/2023] [Revised: 06/09/2023] [Accepted: 06/18/2023] [Indexed: 06/26/2023]
Abstract
Therapeutic antibodies are the largest class of biotherapeutics and have been successful in treating human diseases. However, the design and discovery of antibody drugs remains challenging and time-consuming. Recently, artificial intelligence technology has had an incredible impact on antibody design and discovery, resulting in significant advances in antibody discovery, optimization, and developability. This review summarizes major machine learning (ML) methods and their applications for computational predictors of antibody structure and antigen interface/interaction, as well as the evaluation of antibody developability. Additionally, this review addresses the current status of ML-based therapeutic antibodies under preclinical and clinical phases. While many challenges remain, ML may offer a new therapeutic option for the future direction of fully computational antibody design.
Affiliation(s)
- Ganggang Bai
- Engineering Research Center of Cell & Therapeutic Antibody (MOE), School of Pharmacy, Shanghai Jiao Tong University, Shanghai 200240, China
| | - Chuance Sun
- Engineering Research Center of Cell & Therapeutic Antibody (MOE), School of Pharmacy, Shanghai Jiao Tong University, Shanghai 200240, China
| | - Ziang Guo
- Cancer Center, Institute of Translational Medicine, Faculty of Health Sciences, University of Macau, Taipa, Macao Special Administrative Region of China
| | - Yangjing Wang
- Engineering Research Center of Cell & Therapeutic Antibody (MOE), School of Pharmacy, Shanghai Jiao Tong University, Shanghai 200240, China
| | - Xincheng Zeng
- Engineering Research Center of Cell & Therapeutic Antibody (MOE), School of Pharmacy, Shanghai Jiao Tong University, Shanghai 200240, China
| | - Yuhong Su
- Engineering Research Center of Cell & Therapeutic Antibody (MOE), School of Pharmacy, Shanghai Jiao Tong University, Shanghai 200240, China
| | - Qi Zhao
- Cancer Center, Institute of Translational Medicine, Faculty of Health Sciences, University of Macau, Taipa, Macao Special Administrative Region of China; MoE Frontiers Science Center for Precision Oncology, University of Macau, Taipa, Macao Special Administrative Region of China.
| | - Buyong Ma
- Engineering Research Center of Cell & Therapeutic Antibody (MOE), School of Pharmacy, Shanghai Jiao Tong University, Shanghai 200240, China; Shanghai Digiwiser BioTechnolgy, Limited, Shanghai 201203, China.
|
16
|
Sunny S, Prakash PB, Gopakumar G, Jayaraj PB. DeepBindPPI: Protein-Protein Binding Site Prediction Using Attention Based Graph Convolutional Network. Protein J 2023; 42:276-287. [PMID: 37198346 PMCID: PMC10191823 DOI: 10.1007/s10930-023-10121-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Accepted: 04/25/2023] [Indexed: 05/19/2023]
Abstract
Due to the importance of protein-protein interactions in the defence mechanisms of the living body, attempts have been made to investigate their attributes, including, but not limited to, binding affinity and binding region. Contemporary strategies for binding site prediction largely resort to deep learning techniques but have turned out to be low-precision models. As laboratory experiments for drug discovery tasks utilize this information, increased false positives devalue the computational methods. This emphasizes the need to develop enhanced strategies. DeepBindPPI employs a deep learning technique to predict the binding regions of proteins, particularly antigen-antibody interaction sites. The results obtained are applied in a docking environment to confirm their correctness. An integration of a graph convolutional network with an attention mechanism predicts interacting amino acids with improved precision. The model learns the determining factors in interaction from a general pool of proteins and is then fine-tuned using antigen-antibody data. Comparison of the proposed method with existing techniques shows that the developed model has comparable performance. The use of a separate spatial network clearly improved the precision of the proposed method from 0.4 to 0.5. An attempt to utilize the interface information for docking using the HDOCK server gives promising results, with high-quality structures appearing in the top 10 ranks.
Affiliation(s)
- Sharon Sunny
- Department of CSE, National Institute of Technology, Calicut, Kerala 673601 India
| | | | - G. Gopakumar
- Department of CSE, National Institute of Technology, Calicut, Kerala 673601 India
| | - P. B. Jayaraj
- Department of CSE, National Institute of Technology, Calicut, Kerala 673601 India
|
17
|
Grassmann G, Di Rienzo L, Gosti G, Leonetti M, Ruocco G, Miotto M, Milanetti E. Electrostatic complementarity at the interface drives transient protein-protein interactions. Sci Rep 2023; 13:10207. [PMID: 37353566 PMCID: PMC10290103 DOI: 10.1038/s41598-023-37130-z] [Citation(s) in RCA: 7] [Impact Index Per Article: 3.5] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/14/2023] [Accepted: 06/16/2023] [Indexed: 06/25/2023] Open
Abstract
Understanding the mechanisms driving biomolecule binding and determining the resulting complexes' stability is fundamental for the prediction of binding regions, which is the starting point for drug-ability and design. Characteristics like the preferentially hydrophobic composition of the binding interfaces, the role of van der Waals interactions, and the consequent shape complementarity between the interacting molecular surfaces are well established. However, no consensus has yet been reached on the role of electrostatics. Here, we perform extensive analyses on a large dataset of protein complexes for which both experimental binding affinity and pH data were available. Probing the amino acid composition, the disposition of the charges, and the electrostatic potential they generated on the protein molecular surfaces, we found that (i) although different classes of dimers do not present marked differences in the amino acid composition and charge disposition in the binding region, (ii) homodimers with identical binding regions show higher electrostatic compatibility with respect to both homodimers with non-identical binding regions and heterodimers. Interestingly, (iii) shape and electrostatic complementarity, for patches defined on short-range interactions, behave oppositely when one stratifies the complexes by their binding affinity: complexes with higher binding affinity present high values of shape complementarity (the role of the Lennard-Jones potential predominates) while electrostatic complementarity tends to be randomly distributed. Conversely, complexes with low values of binding affinity exploit Coulombic complementarity to acquire specificity, suggesting that electrostatic complementarity may play a greater role in transient (or less stable) complexes. In light of these results, (iv) we provide a novel, fast, and efficient method, based on the 2D Zernike polynomial formalism, to measure electrostatic complementarity without needing to know the complex structure. Expanding the electrostatic potential on a basis of 2D orthogonal polynomials, we can discriminate between transient and permanent protein complexes with an AUC of the ROC of [Formula: see text] 0.8. Ultimately, our work helps shed light on the non-trivial relationship between the hydrophobic and electrostatic contributions in the binding interfaces, thus favoring the development of new predictive methods for binding affinity characterization.
Affiliation(s)
- Greta Grassmann
- Department of Biochemical Sciences "Alessandro Rossi Fanelli", Sapienza University of Rome, Piazzale Aldo Moro 5, 00185, Rome, Italy
- Center for Life Nano & Neuro Science, Istituto Italiano di Tecnologia, Viale Regina Elena 291, 00161, Rome, Italy
| | - Lorenzo Di Rienzo
- Center for Life Nano & Neuro Science, Istituto Italiano di Tecnologia, Viale Regina Elena 291, 00161, Rome, Italy
| | - Giorgio Gosti
- Center for Life Nano & Neuro Science, Istituto Italiano di Tecnologia, Viale Regina Elena 291, 00161, Rome, Italy
- Soft and Living Matter Laboratory, Institute of Nanotechnology, Consiglio Nazionale delle Ricerche, 00185, Rome, Italy
| | - Marco Leonetti
- Center for Life Nano & Neuro Science, Istituto Italiano di Tecnologia, Viale Regina Elena 291, 00161, Rome, Italy
- Soft and Living Matter Laboratory, Institute of Nanotechnology, Consiglio Nazionale delle Ricerche, 00185, Rome, Italy
| | - Giancarlo Ruocco
- Center for Life Nano & Neuro Science, Istituto Italiano di Tecnologia, Viale Regina Elena 291, 00161, Rome, Italy
- Department of Physics, Sapienza University of Rome, Piazzale Aldo Moro 5, 00185, Rome, Italy
| | - Mattia Miotto
- Center for Life Nano & Neuro Science, Istituto Italiano di Tecnologia, Viale Regina Elena 291, 00161, Rome, Italy.
| | - Edoardo Milanetti
- Center for Life Nano & Neuro Science, Istituto Italiano di Tecnologia, Viale Regina Elena 291, 00161, Rome, Italy.
- Department of Physics, Sapienza University of Rome, Piazzale Aldo Moro 5, 00185, Rome, Italy.
|
18
|
Li P, Liu ZP. GeoBind: segmentation of nucleic acid binding interface on protein surface with geometric deep learning. Nucleic Acids Res 2023; 51:e60. [PMID: 37070217 PMCID: PMC10250245 DOI: 10.1093/nar/gkad288] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/09/2023] [Revised: 03/21/2023] [Accepted: 04/06/2023] [Indexed: 04/19/2023] Open
Abstract
Unveiling the nucleic acid binding sites of a protein helps reveal its regulatory functions in vivo. Current methods encode protein sites from the handcrafted features of their local neighbors and recognize them via classification, an approach that is limited in expressive ability. Here, we present GeoBind, a geometric deep learning method for predicting nucleic acid binding sites on the protein surface in a segmentation manner. GeoBind takes the whole point cloud of the protein surface as input and learns a high-level representation based on the aggregation of neighbors in local reference frames. Testing GeoBind on benchmark datasets, we demonstrate that GeoBind is superior to state-of-the-art predictors. Specific case studies are performed to show the powerful ability of GeoBind to explore molecular surfaces when deciphering proteins with multimer formation. To show the versatility of GeoBind, we further extend GeoBind to five other types of ligand binding site prediction tasks and achieve competitive performances.
Affiliation(s)
- Pengpai Li
- Department of Biomedical Engineering, School of Control Science and Engineering, Shandong University, Jinan, Shandong 250061, China
| | - Zhi-Ping Liu
- Department of Biomedical Engineering, School of Control Science and Engineering, Shandong University, Jinan, Shandong 250061, China
|
19
|
Di Rienzo L, Miotto M, Milanetti E, Ruocco G. Computational structural-based GPCR optimization for user-defined ligand: Implications for the development of biosensors. Comput Struct Biotechnol J 2023; 21:3002-3009. [PMID: 37249971 PMCID: PMC10220229 DOI: 10.1016/j.csbj.2023.05.004] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/03/2023] [Revised: 04/17/2023] [Accepted: 05/04/2023] [Indexed: 05/31/2023] Open
Abstract
Organisms have developed effective mechanisms to sense the external environment. Human-designed biosensors exploit this natural optimization, where different biological machineries have been adapted to detect the presence of user-defined molecules. Specifically, the pheromone pathway in the model organism Saccharomyces cerevisiae represents a suitable candidate as a synthetic signaling system. Indeed, it expresses just one G-Protein Coupled Receptor (GPCR), Ste2, able to recognize pheromone and initiate the expression of pheromone-dependent genes. To date, the standard procedure to engineer this system relies on the substitution of the yeast GPCR with another one and on the modification of the yeast G-protein to bind the inserted receptor. Here, we propose an innovative computational procedure, based on geometrical and chemical optimization of protein binding pockets, to select the amino acid substitutions required to make the native yeast GPCR able to recognize a user-defined ligand. This procedure would allow the yeast to recognize a wide range of ligands, without a priori knowledge about a GPCR recognizing them or the corresponding G protein. We used Monte Carlo simulations to design on Ste2 a binding pocket able to recognize epinephrine, selected as a test ligand. We validated Ste2 mutants via molecular docking and molecular dynamics. We verified that the amino acid substitutions we identified make Ste2 able to accommodate and remain firmly bound to epinephrine. Our results indicate that we efficiently sampled the huge space of possible mutants, proposing such a strategy as a promising starting point for the development of a new kind of S. cerevisiae-based biosensor.
Affiliation(s)
- Lorenzo Di Rienzo
- Center for Life Nano- & Neuro-Science, Istituto Italiano di Tecnologia, Viale Regina Elena 291, 00161 Rome, Italy
| | - Mattia Miotto
- Center for Life Nano- & Neuro-Science, Istituto Italiano di Tecnologia, Viale Regina Elena 291, 00161 Rome, Italy
| | - Edoardo Milanetti
- Center for Life Nano- & Neuro-Science, Istituto Italiano di Tecnologia, Viale Regina Elena 291, 00161 Rome, Italy
- Department of Physics, Sapienza University of Rome, Piazzale Aldo Moro 5, 00185 Rome, Italy
| | - Giancarlo Ruocco
- Center for Life Nano- & Neuro-Science, Istituto Italiano di Tecnologia, Viale Regina Elena 291, 00161 Rome, Italy
- Department of Physics, Sapienza University of Rome, Piazzale Aldo Moro 5, 00185 Rome, Italy
|
20
|
Chinery L, Wahome N, Moal I, Deane CM. Paragraph-antibody paratope prediction using graph neural networks with minimal feature vectors. Bioinformatics 2023; 39:6825310. [PMID: 36370083 DOI: 10.1093/bioinformatics/btac732] [Citation(s) in RCA: 7] [Impact Index Per Article: 3.5] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Received: 06/10/2022] [Revised: 10/25/2022] [Accepted: 11/11/2022] [Indexed: 11/13/2022]
Abstract
SUMMARY The development of new vaccines and antibody therapeutics typically takes several years and requires over $1bn in investment. Accurate knowledge of the paratope (antibody binding site) can speed up and reduce the cost of this process by improving our understanding of antibody-antigen binding. We present Paragraph, a structure-based paratope prediction tool that outperforms current state-of-the-art tools using simpler feature vectors and no antigen information. AVAILABILITY AND IMPLEMENTATION Source code is freely available at www.github.com/oxpig/Paragraph. SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Lewis Chinery
- Department of Statistics, University of Oxford, Oxford OX1 3LB, UK
|
21
|
Robert PA, Akbar R, Frank R, Pavlović M, Widrich M, Snapkov I, Slabodkin A, Chernigovskaya M, Scheffer L, Smorodina E, Rawat P, Mehta BB, Vu MH, Mathisen IF, Prósz A, Abram K, Olar A, Miho E, Haug DTT, Lund-Johansen F, Hochreiter S, Haff IH, Klambauer G, Sandve GK, Greiff V. Unconstrained generation of synthetic antibody-antigen structures to guide machine learning methodology for antibody specificity prediction. Nat Comput Sci 2022; 2:845-865. [PMID: 38177393 DOI: 10.1038/s43588-022-00372-4] [Citation(s) in RCA: 12] [Impact Index Per Article: 4.0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Received: 07/16/2021] [Accepted: 11/09/2022] [Indexed: 01/06/2024]
Abstract
Machine learning (ML) is a key technology for accurate prediction of antibody-antigen binding. Two orthogonal problems hinder the application of ML to antibody-specificity prediction and the benchmarking thereof: the lack of a unified ML formalization of immunological antibody-specificity prediction problems and the unavailability of large-scale synthetic datasets to benchmark real-world relevant ML methods and dataset design. Here we developed the Absolut! software suite that enables parameter-based unconstrained generation of synthetic lattice-based three-dimensional antibody-antigen-binding structures with ground-truth access to conformational paratope, epitope and affinity. We formalized common immunological antibody-specificity prediction problems as ML tasks and confirmed that for both sequence- and structure-based tasks, accuracy-based rankings of ML methods trained on experimental data hold for ML methods trained on Absolut!-generated data. The Absolut! framework has the potential to enable real-world relevant development and benchmarking of ML strategies for biotherapeutics design.
Affiliation(s)
- Philippe A Robert
- Department of Immunology, University of Oslo and Oslo University Hospital, Oslo, Norway.
| | - Rahmad Akbar
- Department of Immunology, University of Oslo and Oslo University Hospital, Oslo, Norway
| | - Robert Frank
- Department of Immunology, University of Oslo and Oslo University Hospital, Oslo, Norway
| | | | - Michael Widrich
- ELLIS Unit Linz and LIT AI Lab, Institute for Machine Learning, Johannes Kepler University Linz, Linz, Austria
| | - Igor Snapkov
- Department of Immunology, University of Oslo and Oslo University Hospital, Oslo, Norway
| | - Andrei Slabodkin
- Department of Immunology, University of Oslo and Oslo University Hospital, Oslo, Norway
| | - Maria Chernigovskaya
- Department of Immunology, University of Oslo and Oslo University Hospital, Oslo, Norway
| | | | - Eva Smorodina
- Department of Immunology, University of Oslo and Oslo University Hospital, Oslo, Norway
| | - Puneet Rawat
- Department of Immunology, University of Oslo and Oslo University Hospital, Oslo, Norway
| | - Brij Bhushan Mehta
- Department of Immunology, University of Oslo and Oslo University Hospital, Oslo, Norway
| | - Mai Ha Vu
- Department of Linguistics and Scandinavian Studies, University of Oslo, Oslo, Norway
| | | | - Aurél Prósz
- Danish Cancer Society Research Center, Translational Cancer Genomics, Copenhagen, Denmark
| | - Krzysztof Abram
- The Novo Nordisk Foundation Center for Biosustainability, Autoflow, DTU Biosustain and IT University of Copenhagen, Copenhagen, Denmark
| | - Alex Olar
- Department of Complex Systems in Physics, Eötvös Loránd University, Budapest, Hungary
| | - Enkelejda Miho
- Institute of Medical Engineering and Medical Informatics, School of Life Sciences, FHNW University of Applied Sciences and Arts Northwestern Switzerland, Muttenz, Switzerland
- aiNET GmbH, Basel, Switzerland
- Swiss Institute of Bioinformatics, Lausanne, Switzerland
| | | | | | - Sepp Hochreiter
- ELLIS Unit Linz and LIT AI Lab, Institute for Machine Learning, Johannes Kepler University Linz, Linz, Austria
- Institute of Advanced Research in Artificial Intelligence (IARAI), Vienna, Austria
| | | | - Günter Klambauer
- ELLIS Unit Linz and LIT AI Lab, Institute for Machine Learning, Johannes Kepler University Linz, Linz, Austria
| | | | - Victor Greiff
- Department of Immunology, University of Oslo and Oslo University Hospital, Oslo, Norway.
|
22
|
Tsuchiya Y, Yamamori Y, Tomii K. Protein-protein interaction prediction methods: from docking-based to AI-based approaches. Biophys Rev 2022; 14:1341-1348. [PMID: 36570321 PMCID: PMC9759050 DOI: 10.1007/s12551-022-01032-7] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/30/2022] [Accepted: 11/30/2022] [Indexed: 12/23/2022] Open
Abstract
Protein-protein interactions (PPIs), such as protein-protein inhibitor, antibody-antigen complex, and supercomplexes play diverse and important roles in cells. Recent advances in structural analysis methods, including cryo-EM, for the determination of protein complex structures are remarkable. Nevertheless, much room remains for improvement and utilization of computational methods to predict PPIs because of the large number and great diversity of unresolved complex structures. This review introduces a wide array of computational methods, including our own, for estimating PPIs including antibody-antigen interactions, offering both historical and forward-looking perspectives.
Affiliation(s)
- Yuko Tsuchiya
- Artificial Intelligence Research Center (AIRC), National Institute of Advanced Industrial Science and Technology (AIST), 2-4-7 Aomi, Koto-Ku, Tokyo, 135-0064 Japan
| | - Yu Yamamori
- Artificial Intelligence Research Center (AIRC), National Institute of Advanced Industrial Science and Technology (AIST), 2-4-7 Aomi, Koto-Ku, Tokyo, 135-0064 Japan
| | - Kentaro Tomii
- Artificial Intelligence Research Center (AIRC), National Institute of Advanced Industrial Science and Technology (AIST), 2-4-7 Aomi, Koto-Ku, Tokyo, 135-0064 Japan
|
23
|
Lou H, Cao X. Antibody variable region engineering for improving cancer immunotherapy. Cancer Commun (Lond) 2022; 42:804-827. [PMID: 35822503 PMCID: PMC9456695 DOI: 10.1002/cac2.12330] [Citation(s) in RCA: 21] [Impact Index Per Article: 7.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/02/2022] [Revised: 04/25/2022] [Accepted: 06/22/2022] [Indexed: 04/09/2023] Open
Abstract
The efficacy and specificity of conventional monoclonal antibody (mAb) drugs in the clinic require further improvement. Currently, the development and application of novel antibody formats for improving cancer immunotherapy have attracted much attention. Variable region-retaining antibody fragments, such as antigen-binding fragment (Fab), single-chain variable fragment (scFv), bispecific antibody, and bi/trispecific cell engagers, are engineered with humanization, multivalent antibody construction, affinity optimization and antibody masking for targeting tumor cells and killer cells to improve antibody-based therapy potency, efficacy and specificity. In this review, we summarize the application of antibody variable region engineering and discuss the future direction of antibody engineering for improving cancer therapies.
Affiliation(s)
- Hantao Lou
- Ludwig Institute of Cancer Research, University of Oxford, Oxford, OX3 7DR, UK
- Chinese Academy for Medical Sciences Oxford Institute, Nuffield Department of Medicine, University of Oxford, Oxford, OX3 7FZ, UK
| | - Xuetao Cao
- Chinese Academy for Medical Sciences Oxford Institute, Nuffield Department of Medicine, University of Oxford, Oxford, OX3 7FZ, UK
- Department of Immunology, Centre for Immunotherapy, Institute of Basic Medical Sciences, Chinese Academy of Medical Sciences, Beijing, 100005, P. R. China
|
24
|
De Lauro A, Di Rienzo L, Miotto M, Olimpieri PP, Milanetti E, Ruocco G. Shape Complementarity Optimization of Antibody–Antigen Interfaces: The Application to SARS-CoV-2 Spike Protein. Front Mol Biosci 2022; 9:874296. [PMID: 35669567 PMCID: PMC9163568 DOI: 10.3389/fmolb.2022.874296] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/11/2022] [Accepted: 04/07/2022] [Indexed: 11/17/2022] Open
Abstract
Many factors influence biomolecule binding, and its assessment constitutes an elusive challenge in computational structural biology. In this aspect, the evaluation of shape complementarity at molecular interfaces is one of the main factors to be considered. We focus on the particular case of antibody–antigen complexes to quantify the complementarities occurring at molecular interfaces. We relied on a method we recently developed, which employs the 2D Zernike descriptors, to characterize the investigated regions with an ordered set of numbers summarizing the local shape properties. Collecting a structural dataset of antibody–antigen complexes, we applied this method and we statistically distinguished, in terms of shape complementarity, pairs of the interacting regions from the non-interacting ones. Thus, we set up a novel computational strategy based on in silico mutagenesis of antibody-binding site residues. We developed a Monte Carlo procedure to increase the shape complementarity between the antibody paratope and a given epitope on a target protein surface. We applied our protocol against several molecular targets in SARS-CoV-2 spike protein, known to be indispensable for viral cell invasion. We, therefore, optimized the shape of template antibodies for the interaction with such regions. As the last step of our procedure, we performed an independent molecular docking validation of the results of our Monte Carlo simulations.
Affiliation(s)
| | - Lorenzo Di Rienzo
- Center for Life Nano & Neuro-Science, Istituto Italiano di Tecnologia, Rome, Italy
- *Correspondence: Lorenzo Di Rienzo,
| | - Mattia Miotto
- Center for Life Nano & Neuro-Science, Istituto Italiano di Tecnologia, Rome, Italy
| | | | - Edoardo Milanetti
- Center for Life Nano & Neuro-Science, Istituto Italiano di Tecnologia, Rome, Italy
- Department of Physics, Sapienza University, Rome, Italy
| | - Giancarlo Ruocco
- Center for Life Nano & Neuro-Science, Istituto Italiano di Tecnologia, Rome, Italy
- Department of Physics, Sapienza University, Rome, Italy
| |
Collapse
|
25
|
Grassmann G, Miotto M, Di Rienzo L, Gosti G, Ruocco G, Milanetti E. A novel computational strategy for defining the minimal protein molecular surface representation. PLoS One 2022; 17:e0266004. [PMID: 35421111 PMCID: PMC9009619 DOI: 10.1371/journal.pone.0266004] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 07/30/2021] [Accepted: 03/12/2022] [Indexed: 11/18/2022] Open
Abstract
Most proteins perform their biological function by interacting with one or more molecular partners. In this respect, characterizing local features of the molecular surface, that can potentially be involved in the interaction with other molecules, represents a step forward in the investigation of the mechanisms of recognition and binding between molecules. Predictive methods often rely on extensive samplings of molecular patches with the aim to identify hot spots on the surface. In this framework, analysis of large proteins and/or many molecular dynamics frames is often unfeasible due to the high computational cost. Thus, finding optimal ways to reduce the number of points to be sampled maintaining the biological information (including the surface shape) carried by the molecular surface is pivotal. In this perspective, we here present a new theoretical and computational algorithm with the aim of defining a set of molecular surfaces composed of points not uniformly distributed in space, in such a way as to maximize the information of the overall shape of the molecule by minimizing the number of total points. We test our procedure’s ability in recognizing hot-spots by describing the local shape properties of portions of molecular surfaces through a recently developed method based on the formalism of 2D Zernike polynomials. The results of this work show the ability of the proposed algorithm to preserve the key information of the molecular surface using a reduced number of points compared to the complete surface, where all points of the surface are used for the description. In fact, the methodology shows a significant gain of the information stored in the sampling procedure compared to uniform random sampling.
Affiliation(s)
- Mattia Miotto
  - Center for Life Nano & Neuroscience, Italian Institute of Technology, Rome, Italy
- Lorenzo Di Rienzo
  - Center for Life Nano & Neuroscience, Italian Institute of Technology, Rome, Italy
- Giorgio Gosti
  - Center for Life Nano & Neuroscience, Italian Institute of Technology, Rome, Italy
- Giancarlo Ruocco
  - Center for Life Nano & Neuroscience, Italian Institute of Technology, Rome, Italy
  - Department of Physics, Sapienza University, Rome, Italy
- Edoardo Milanetti
  - Center for Life Nano & Neuroscience, Italian Institute of Technology, Rome, Italy
  - Department of Physics, Sapienza University, Rome, Italy
  - * E-mail:

26
Quadrini M, Daberdaku S, Ferrari C. Hierarchical representation for PPI sites prediction. BMC Bioinformatics 2022; 23:96. [PMID: 35307006 PMCID: PMC8934516 DOI: 10.1186/s12859-022-04624-y] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 05/19/2021] [Accepted: 02/23/2022] [Indexed: 01/06/2023] Open
Abstract
Background
Protein–protein interactions have pivotal roles in life processes, and aberrant interactions are associated with various disorders. Interaction site identification is key for understanding disease mechanisms and designing new drugs. Effective and efficient computational methods for PPI prediction are of great value given the overall cost of experimental methods. Promising results have been obtained using machine learning methods and deep learning techniques, but their effectiveness depends on protein representation and feature selection.
Results
We define a new abstraction of the protein structure, called hierarchical representations, considering and quantifying spatial and sequential neighboring among amino acids. We also investigate the effect of molecular abstractions using Graph Convolutional Networks to classify amino acids as interface or non-interface residues. Our study takes into account three abstractions (hierarchical representations, the contact map, and the residue sequence) and considers the eight functional classes of proteins extracted from the Protein–Protein Docking Benchmark 5.0. The performance of our method, evaluated using standard metrics, is compared to that obtained with some state-of-the-art protein interface predictors. The analysis of the performance values shows that our method outperforms the considered competitors when the considered molecules are structurally similar.
Conclusions
The hierarchical representation can capture the structural properties that promote the interactions and can be used to represent proteins with unknown structures by codifying only their sequential neighboring. Analyzing the results, we conclude that classes should be arranged according to their architectures rather than functions.
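To make the graph-convolution step concrete, here is a minimal sketch of one propagation layer over a residue graph. It assumes the widely used symmetric-normalization rule H' = ReLU(D^(-1/2) (A + I) D^(-1/2) H W); the abstract does not specify the layer design, so the rule, the helper names, and the toy inputs are illustrative assumptions rather than the authors' architecture.

```python
import math


def matmul(a, b):
    """Plain-Python matrix product for lists of lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]


def gcn_layer(adj, features, weights):
    """One graph-convolution step with self-loops and symmetric normalization.

    adj      : n x n adjacency matrix (e.g. a thresholded residue contact map)
    features : n x f per-residue feature matrix
    weights  : f x h weight matrix (learned in practice, fixed here)
    """
    n = len(adj)
    # Add self-loops so each residue keeps its own features.
    a_hat = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)]
             for i in range(n)]
    deg = [sum(row) for row in a_hat]
    # Symmetric normalization: a_hat[i][j] / sqrt(deg_i * deg_j).
    norm = [[a_hat[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)]
            for i in range(n)]
    out = matmul(matmul(norm, features), weights)
    return [[max(0.0, v) for v in row] for row in out]  # ReLU
```

A classifier for interface versus non-interface residues would stack such layers and end with a per-node output, for example a sigmoid over a single hidden unit.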

27
Ding W, Wu L, Li X, Chang L, Liu G, Du H. Comprehensive analysis of competitive endogenous RNAs network: Identification and validation of prediction model composed of mRNA signature and miRNA signature in gastric cancer. Oncol Lett 2022; 23:150. [PMID: 35350591 PMCID: PMC8941526 DOI: 10.3892/ol.2022.13270] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/08/2021] [Accepted: 02/22/2022] [Indexed: 11/18/2022] Open
Abstract
Gastric cancer (GC), one of the most lethal malignant tumors, is highly aggressive with a poor prognosis, while the molecular mechanisms underlying it remain largely unknown. Although advanced imaging techniques and comprehensive treatment facilitate the diagnosis and survival of some GC patients, precise diagnosis and prognosis are still a challenge. The present study used publicly available gene expression profiles from The Cancer Genome Atlas and Gene Expression Omnibus datasets, including mRNA, micro (mi)RNA and circular (circ)RNA of GC, to establish a competing endogenous RNA (ceRNA) network. Further, the present study performed least absolute shrinkage and selection operator regression analysis on the hub RNAs to establish a prediction model with mRNA and miRNA. The ceRNA network contained 109 edges and 56 nodes, comprising 13 miRNAs, 9 circRNAs and 34 mRNAs. The five-mRNA signature comprised CTF1, FKBP5, RNF128, GSTM2 and ADAMTS1. The area under the curve (AUC) value of the diagnosis training cohort was 0.9975. The prognosis of the high-risk group (RiskScore >4.664) was worse compared with that of the low-risk group (RiskScore ≤4.664; P<0.05) in the training cohort. The five-miRNA signature comprised miR-145-5p, miR-615-3p, miR-6507-5p, miR-937-3p and miR-99a-3p. The AUC value of the diagnosis training cohort was 0.9975. The prognosis of the high-risk group (RiskScore >1.621) was worse compared with that of the low-risk group (RiskScore ≤1.621; P<0.05) in the training cohort. The validation cohorts indicated that both the five-mRNA and five-miRNA signatures had strong predictive power in diagnosis and prognosis for GC. In conclusion, a ceRNA network was established for GC, and a five-mRNA signature and a five-miRNA signature were identified that enabled diagnosis and prognosis of GC by assigning patients to a high-risk or low-risk group.
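The risk-group assignment reported in this abstract can be illustrated with a short sketch. It assumes the usual form of a LASSO-derived score, a weighted sum of expression values; the abstract gives the cutoffs (4.664 for the mRNA signature, 1.621 for the miRNA signature) but not the coefficients, so the coefficient and expression values used with these functions are hypothetical.

```python
MRNA_CUTOFF = 4.664   # cutoff reported in the abstract for the mRNA signature
MIRNA_CUTOFF = 1.621  # cutoff reported in the abstract for the miRNA signature


def risk_score(expression, coefficients):
    """Weighted sum of expression values over the signature genes.

    `coefficients` maps gene name -> regression coefficient. The real values
    are not given in the abstract; callers must supply their own.
    """
    return sum(coefficients[g] * expression[g] for g in coefficients)


def risk_group(score, cutoff):
    """High-risk if the score is strictly above the cutoff, else low-risk."""
    return "high" if score > cutoff else "low"
```

For example, `risk_group(risk_score(sample, coefs), MRNA_CUTOFF)` labels one patient, where `sample` and `coefs` are dictionaries keyed by the five signature genes.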
Affiliation(s)
- Wenshuang Ding
  - Department of Pathology, Guangzhou First People's Hospital, School of Medicine, South China University of Technology, Guangzhou, Guangdong 510030, P.R. China
- Liqiong Wu
  - Department of Pathology, Guangzhou First People's Hospital, School of Medicine, South China University of Technology, Guangzhou, Guangdong 510030, P.R. China
- Xiubo Li
  - Department of Pathology, Guangzhou First People's Hospital, School of Medicine, South China University of Technology, Guangzhou, Guangdong 510030, P.R. China
- Lijun Chang
  - Department of Pathology, Guangzhou First People's Hospital, School of Medicine, South China University of Technology, Guangzhou, Guangdong 510030, P.R. China
- Guorong Liu
  - Department of Pathology, Guangzhou First People's Hospital, School of Medicine, South China University of Technology, Guangzhou, Guangdong 510030, P.R. China
- Hong Du
  - Department of Pathology, Guangzhou First People's Hospital, School of Medicine, South China University of Technology, Guangzhou, Guangdong 510030, P.R. China

28
Ray A. Machine learning in postgenomic biology and personalized medicine. Wiley Interdiscip Rev Data Min Knowl Discov 2022; 12:e1451. [PMID: 35966173 PMCID: PMC9371441 DOI: 10.1002/widm.1451] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/23/2020] [Accepted: 12/22/2021] [Indexed: 06/15/2023]
Abstract
In recent years, artificial intelligence in the form of machine learning has been revolutionizing biology, biomedical sciences, and gene-based agricultural technology capabilities. Massive data generated in biological sciences by rapid and deep gene sequencing and protein or other molecular structure determination, on the one hand, requires data analysis capabilities using machine learning that are distinctly different from classical statistical methods; on the other, these large datasets are enabling the adoption of novel data-intensive machine learning algorithms for the solution of biological problems that until recently had relied on mechanistic model-based approaches that are computationally expensive. This review provides a bird's eye view of the applications of machine learning in post-genomic biology. An attempt is also made to indicate, as far as possible, the areas of research that are poised to make further impacts in these areas, including the importance of explainable artificial intelligence (XAI) in human health. Further contributions of machine learning are expected to transform medicine, public health, and agricultural technology, as well as to provide invaluable gene-based guidance for the management of complex environments in this age of global warming.
Affiliation(s)
- Animesh Ray
  - Riggs School of Applied Life Sciences, Keck Graduate Institute, 535 Watson Drive, Claremont, CA 91711, USA
  - Division of Biology and Biological Engineering, California Institute of Technology, Pasadena, California, USA

29
Akbar R, Bashour H, Rawat P, Robert PA, Smorodina E, Cotet TS, Flem-Karlsen K, Frank R, Mehta BB, Vu MH, Zengin T, Gutierrez-Marcos J, Lund-Johansen F, Andersen JT, Greiff V. Progress and challenges for the machine learning-based design of fit-for-purpose monoclonal antibodies. MAbs 2022; 14:2008790. [PMID: 35293269 PMCID: PMC8928824 DOI: 10.1080/19420862.2021.2008790] [Citation(s) in RCA: 52] [Impact Index Per Article: 17.3] [Received: 08/22/2021] [Revised: 11/04/2021] [Accepted: 11/17/2021] [Indexed: 12/15/2022] Open
Abstract
Although the therapeutic efficacy and commercial success of monoclonal antibodies (mAbs) are tremendous, the design and discovery of new candidates remain a time- and cost-intensive endeavor. In this regard, progress in the generation of data describing antigen binding and developability, computational methodology, and artificial intelligence may pave the way for a new era of in silico on-demand immunotherapeutics design and discovery. Here, we argue that the main necessary machine learning (ML) components for an in silico mAb sequence generator are: understanding of the rules of mAb-antigen binding, capacity to modularly combine mAb design parameters, and algorithms for unconstrained parameter-driven in silico mAb sequence synthesis. We review the current progress toward the realization of these necessary components and discuss the challenges that must be overcome to allow the on-demand ML-based discovery and design of fit-for-purpose mAb therapeutic candidates.
Affiliation(s)
- Rahmad Akbar
  - Department of Immunology, University of Oslo and Oslo University Hospital, Oslo, Norway
- Habib Bashour
  - School of Life Sciences, University of Warwick, Coventry, UK
- Puneet Rawat
  - Department of Immunology, University of Oslo and Oslo University Hospital, Oslo, Norway
  - Department of Biotechnology, Bhupat and Jyoti Mehta School of Biosciences, Indian Institute of Technology Madras, Chennai, India
- Philippe A. Robert
  - Department of Immunology, University of Oslo and Oslo University Hospital, Oslo, Norway
- Eva Smorodina
  - Faculty of Bioengineering and Bioinformatics, Lomonosov Moscow State University, Russia
- Karine Flem-Karlsen
  - Department of Immunology, University of Oslo and Oslo University Hospital, Oslo, Norway
  - Institute of Clinical Medicine, Department of Pharmacology, University of Oslo and Oslo University Hospital, Norway
- Robert Frank
  - Department of Immunology, University of Oslo and Oslo University Hospital, Oslo, Norway
- Brij Bhushan Mehta
  - Department of Immunology, University of Oslo and Oslo University Hospital, Oslo, Norway
- Mai Ha Vu
  - Department of Linguistics and Scandinavian Studies, University of Oslo, Norway
- Talip Zengin
  - Department of Immunology, University of Oslo and Oslo University Hospital, Oslo, Norway
  - Department of Bioinformatics, Mugla Sitki Kocman University, Turkey
- Jan Terje Andersen
  - Department of Immunology, University of Oslo and Oslo University Hospital, Oslo, Norway
  - Institute of Clinical Medicine, Department of Pharmacology, University of Oslo and Oslo University Hospital, Norway
- Victor Greiff
  - Department of Immunology, University of Oslo and Oslo University Hospital, Oslo, Norway

30
Lu S, Li Y, Wang F, Nan X, Zhang S. Leveraging Sequential and Spatial Neighbors Information by Using CNNs Linked With GCNs for Paratope Prediction. IEEE/ACM Trans Comput Biol Bioinform 2022; 19:68-74. [PMID: 34029193 DOI: 10.1109/tcbb.2021.3083001] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Indexed: 06/12/2023]
Abstract
Antibodies, which consist of variable and constant regions, are a special type of protein that plays a vital role in the vertebrate immune system. They have the remarkable ability to bind a large range of diverse antigens with extraordinary affinity and specificity. This malleability of binding makes antibodies an important class of biological drugs and biomarkers. In this article, we propose a method to identify which amino acid residues of an antibody directly interact with its associated antigen, based on features from sequence and structure. Our algorithm uses convolutional neural networks (CNNs) linked with graph convolutional networks (GCNs) to make use of information from both sequential and spatial neighbors and so capture more of the local environment of the target amino acid residue. Furthermore, we process the antigen partner of an antibody by employing an attention layer. Our method improves on the state-of-the-art methodology.

31
Mattox DE, Bailey-Kellogg C. Comprehensive analysis of lectin-glycan interactions reveals determinants of lectin specificity. PLoS Comput Biol 2021; 17:e1009470. [PMID: 34613971 PMCID: PMC8523061 DOI: 10.1371/journal.pcbi.1009470] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.5] [Received: 06/13/2021] [Revised: 10/18/2021] [Accepted: 09/22/2021] [Indexed: 12/23/2022] Open
Abstract
Lectin-glycan interactions facilitate inter- and intracellular communication in many processes including protein trafficking, host-pathogen recognition, and tumorigenesis promotion. Specific recognition of glycans by lectins is also the basis for a wide range of applications in areas including glycobiology research, cancer screening, and antiviral therapeutics. To provide a better understanding of the determinants of lectin-glycan interaction specificity and support such applications, this study comprehensively investigates specificity-conferring features of all available lectin-glycan complex structures. Systematic characterization, comparison, and predictive modeling of a set of 221 complementary physicochemical and geometric features representing these interactions highlighted specificity-conferring features with potential mechanistic insight. Univariable comparative analyses with weighted Wilcoxon-Mann-Whitney tests revealed strong statistical associations between binding site features and specificity that are conserved across unrelated lectin binding sites. Multivariable modeling with random forests demonstrated the utility of these features for predicting the identity of bound glycans based on generalized patterns learned from non-homologous lectins. These analyses revealed global determinants of lectin specificity, such as sialic acid glycan recognition in deep, concave binding sites enriched for positively charged residues, in contrast to high mannose glycan recognition in fairly shallow but well-defined pockets enriched for non-polar residues. Focused fine specificity analysis of hemagglutinin interactions with human-like and avian-like glycans uncovered features representing both known and novel mutations related to shifts in influenza tropism from avian to human tissues. 
As the approach presented here relies on co-crystallized lectin-glycan pairs for studying specificity, it is limited in its inferences by the quantity, quality, and diversity of the structural data available. Regardless, the systematic characterization of lectin binding sites presented here provides a novel approach to studying lectin specificity and is a step towards confidently predicting new lectin-glycan interactions.
Affiliation(s)
- Daniel E. Mattox
  - Program in Quantitative Biomedical Sciences, Geisel School of Medicine at Dartmouth College, Hanover, New Hampshire, United States of America
- Chris Bailey-Kellogg
  - Program in Quantitative Biomedical Sciences, Geisel School of Medicine at Dartmouth College, Hanover, New Hampshire, United States of America
  - Department of Computer Science, Dartmouth College, Hanover, New Hampshire, United States of America

32
Di Rienzo L, Milanetti E, Ruocco G, Lepore R. Quantitative Description of Surface Complementarity of Antibody-Antigen Interfaces. Front Mol Biosci 2021; 8:749784. [PMID: 34660699 PMCID: PMC8514621 DOI: 10.3389/fmolb.2021.749784] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 07/29/2021] [Accepted: 09/14/2021] [Indexed: 11/29/2022] Open
Abstract
Antibodies have the remarkable ability to recognise their cognate antigens with extraordinary affinity and specificity. Discerning the rules that define antibody-antigen recognition is a fundamental step in the rational design and engineering of functional antibodies with desired properties. In this study we apply the 3D Zernike formalism to the analysis of the surface properties of the antibody complementarity-determining regions (CDRs). Our results show that shape and electrostatic 3D Zernike descriptors (3DZDs) of the surface of the CDRs are predictive of antigen specificity, with a classification accuracy of 81% and an area under the receiver operating characteristic curve (AUC) of 0.85. Additionally, while in terms of surface size, solvent accessibility and amino acid composition antibody epitopes are typically not distinguishable from non-epitope, solvent-exposed regions of the antigen, the 3DZDs detect significantly higher surface complementarity to the paratope and are able to predict the correct paratope-epitope interaction with an AUC of 0.75.
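Since this abstract reports its results as AUC values, a short sketch of how an AUC can be computed from labels and scores may help. It uses the pairwise (Mann-Whitney) definition: the fraction of positive/negative pairs in which the positive example scores higher, with ties counted as one half. The function name and the example data are illustrative; nothing here reproduces the study's data or pipeline.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via the pairwise (Mann-Whitney) definition.

    labels : iterable of 0/1 class labels (1 = positive, e.g. true epitope)
    scores : iterable of predicted scores, higher meaning "more positive"
    """
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # ties count as half
    return wins / (len(pos) * len(neg))
```

This quadratic-time form is fine for small sets; for large datasets a rank-based computation gives the same value more efficiently.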
Affiliation(s)
- Lorenzo Di Rienzo
  - Center for Life Nano and Neuro-Science, Istituto Italiano di Tecnologia, Rome, Italy
- Edoardo Milanetti
  - Center for Life Nano and Neuro-Science, Istituto Italiano di Tecnologia, Rome, Italy
  - Department of Physics, Sapienza University, Rome, Italy
- Giancarlo Ruocco
  - Center for Life Nano and Neuro-Science, Istituto Italiano di Tecnologia, Rome, Italy
  - Department of Physics, Sapienza University, Rome, Italy
- Rosalba Lepore
  - Department of Biomedicine, Basel University Hospital and University of Basel, Basel, Switzerland

33
Zhang Y, Chen T, Zeng H, Yang X, Xu Q, Zhang Y, Chen Y, Wang M, Zhu Y, Lan C, Wang Q, Tang H, Zhang Y, Wang C, Xie W, Ma C, Guan J, Guo S, Chen S, Yang W, Wei L, Ren J, Yu X, Zhang Z. RAPID: A Rep-Seq Dataset Analysis Platform With an Integrated Antibody Database. Front Immunol 2021; 12:717496. [PMID: 34484220 PMCID: PMC8414647 DOI: 10.3389/fimmu.2021.717496] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 05/31/2021] [Accepted: 07/27/2021] [Indexed: 12/12/2022] Open
Abstract
The antibody repertoire is a critical component of the adaptive immune system and is believed to reflect an individual’s immune history and current immune status. Delineating the antibody repertoire has advanced our understanding of humoral immunity, facilitated antibody discovery, and shown great potential for improving the diagnosis and treatment of disease. However, no tool to date has effectively integrated big Rep-seq data and prior knowledge of functional antibodies to elucidate the remarkably diverse antibody repertoire. We developed a Rep-seq dataset Analysis Platform with an Integrated antibody Database (RAPID; https://rapid.zzhlab.org/), a free and web-based tool that allows researchers to process and analyse Rep-seq datasets. RAPID consolidates 521 WHO-recognized therapeutic antibodies, 88,059 antigen- or disease-specific antibodies, and 306 million clones extracted from 2,449 human IGH Rep-seq datasets generated from individuals with 29 different health conditions. RAPID also integrates a standardized Rep-seq dataset analysis pipeline to enable users to upload and analyse their datasets. In the process, users can also select a set of existing repertoires for comparison. RAPID automatically annotates clones based on integrated therapeutic and known antibodies, and users can easily query antibodies or repertoires based on sequence or optional keywords. With its powerful analysis functions and rich set of antibody and antibody repertoire information, RAPID will benefit researchers in adaptive immune studies.
Affiliation(s)
- Yanfang Zhang
  - State Key Laboratory of Organ Failure Research, National Clinical Research Center for Kidney Disease, Division of Nephrology, Nanfang Hospital, Southern Medical University, Guangzhou, China
  - Department of Bioinformatics, School of Basic Medical Sciences, Southern Medical University, Guangzhou, China
  - Center for Precision Medicine, Guangdong Provincial People's Hospital, Guangdong Academy of Medical Sciences, Guangzhou, China
  - Key Laboratory of Mental Health of the Ministry of Education, Guangdong-Hong Kong-Macao Greater Bay Area Center for Brain Science and Brain-Inspired Intelligence, Southern Medical University, Guangzhou, China
  - Guangdong-Hong Kong Joint Laboratory on Immunological and Genetic Kidney Diseases, Guangdong Provincial People's Hospital, Guangdong Academy of Medical Sciences, Guangzhou, China
- Tianjian Chen
  - State Key Laboratory of Oncology in South China, Cancer Center, Collaborative Innovation Center for Cancer Medicine, School of Life Sciences, Sun Yat-sen University, Guangzhou, China
- Huikun Zeng
  - State Key Laboratory of Organ Failure Research, National Clinical Research Center for Kidney Disease, Division of Nephrology, Nanfang Hospital, Southern Medical University, Guangzhou, China
  - Department of Bioinformatics, School of Basic Medical Sciences, Southern Medical University, Guangzhou, China
  - Center for Precision Medicine, Guangdong Provincial People's Hospital, Guangdong Academy of Medical Sciences, Guangzhou, China
  - Key Laboratory of Mental Health of the Ministry of Education, Guangdong-Hong Kong-Macao Greater Bay Area Center for Brain Science and Brain-Inspired Intelligence, Southern Medical University, Guangzhou, China
  - Guangdong-Hong Kong Joint Laboratory on Immunological and Genetic Kidney Diseases, Guangdong Provincial People's Hospital, Guangdong Academy of Medical Sciences, Guangzhou, China
- Xiujia Yang
  - State Key Laboratory of Organ Failure Research, National Clinical Research Center for Kidney Disease, Division of Nephrology, Nanfang Hospital, Southern Medical University, Guangzhou, China
  - Department of Bioinformatics, School of Basic Medical Sciences, Southern Medical University, Guangzhou, China
  - Center for Precision Medicine, Guangdong Provincial People's Hospital, Guangdong Academy of Medical Sciences, Guangzhou, China
  - Key Laboratory of Mental Health of the Ministry of Education, Guangdong-Hong Kong-Macao Greater Bay Area Center for Brain Science and Brain-Inspired Intelligence, Southern Medical University, Guangzhou, China
  - Guangdong-Hong Kong Joint Laboratory on Immunological and Genetic Kidney Diseases, Guangdong Provincial People's Hospital, Guangdong Academy of Medical Sciences, Guangzhou, China
- Qingxian Xu
  - State Key Laboratory of Oncology in South China, Cancer Center, Collaborative Innovation Center for Cancer Medicine, School of Life Sciences, Sun Yat-sen University, Guangzhou, China
- Yanxia Zhang
  - State Key Laboratory of Organ Failure Research, National Clinical Research Center for Kidney Disease, Division of Nephrology, Nanfang Hospital, Southern Medical University, Guangzhou, China
  - Department of Bioinformatics, School of Basic Medical Sciences, Southern Medical University, Guangzhou, China
- Yuan Chen
  - Center for Precision Medicine, Guangdong Provincial People's Hospital, Guangdong Academy of Medical Sciences, Guangzhou, China
- Minhui Wang
  - State Key Laboratory of Organ Failure Research, National Clinical Research Center for Kidney Disease, Division of Nephrology, Nanfang Hospital, Southern Medical University, Guangzhou, China
  - Department of Nephrology, Hainan General Hospital, Haikou, China
  - Hainan Affiliated Hospital of Hainan Medical College, Haikou, China
- Yan Zhu
  - State Key Laboratory of Organ Failure Research, National Clinical Research Center for Kidney Disease, Division of Nephrology, Nanfang Hospital, Southern Medical University, Guangzhou, China
  - Department of Bioinformatics, School of Basic Medical Sciences, Southern Medical University, Guangzhou, China
- Chunhong Lan
  - State Key Laboratory of Organ Failure Research, National Clinical Research Center for Kidney Disease, Division of Nephrology, Nanfang Hospital, Southern Medical University, Guangzhou, China
  - Center for Precision Medicine, Guangdong Provincial People's Hospital, Guangdong Academy of Medical Sciences, Guangzhou, China
- Qilong Wang
  - Center for Precision Medicine, Guangdong Provincial People's Hospital, Guangdong Academy of Medical Sciences, Guangzhou, China
- Haipei Tang
  - Center for Precision Medicine, Guangdong Provincial People's Hospital, Guangdong Academy of Medical Sciences, Guangzhou, China
- Yan Zhang
  - Department of Bioinformatics, School of Basic Medical Sciences, Southern Medical University, Guangzhou, China
- Chengrui Wang
  - Department of Bioinformatics, School of Basic Medical Sciences, Southern Medical University, Guangzhou, China
- Wenxi Xie
  - State Key Laboratory of Organ Failure Research, National Clinical Research Center for Kidney Disease, Division of Nephrology, Nanfang Hospital, Southern Medical University, Guangzhou, China
  - Department of Bioinformatics, School of Basic Medical Sciences, Southern Medical University, Guangzhou, China
- Cuiyu Ma
  - State Key Laboratory of Organ Failure Research, National Clinical Research Center for Kidney Disease, Division of Nephrology, Nanfang Hospital, Southern Medical University, Guangzhou, China
  - Department of Bioinformatics, School of Basic Medical Sciences, Southern Medical University, Guangzhou, China
- Junjie Guan
  - State Key Laboratory of Organ Failure Research, National Clinical Research Center for Kidney Disease, Division of Nephrology, Nanfang Hospital, Southern Medical University, Guangzhou, China
  - Department of Bioinformatics, School of Basic Medical Sciences, Southern Medical University, Guangzhou, China
- Shixin Guo
  - State Key Laboratory of Ophthalmology, Zhongshan Ophthalmic Center, Sun Yat-Sen University, Guangzhou, China
- Sen Chen
  - Department of Bioinformatics, School of Basic Medical Sciences, Southern Medical University, Guangzhou, China
- Wei Yang
  - Department of Pathology, School of Basic Medical Sciences, Southern Medical University, Guangzhou, China
- Lai Wei
  - State Key Laboratory of Ophthalmology, Zhongshan Ophthalmic Center, Sun Yat-Sen University, Guangzhou, China
- Jian Ren
  - State Key Laboratory of Oncology in South China, Cancer Center, Collaborative Innovation Center for Cancer Medicine, School of Life Sciences, Sun Yat-sen University, Guangzhou, China
- Xueqing Yu
  - Guangdong-Hong Kong Joint Laboratory on Immunological and Genetic Kidney Diseases, Guangdong Provincial People's Hospital, Guangdong Academy of Medical Sciences, Guangzhou, China
  - Division of Nephrology, Guangdong Provincial People's Hospital, Guangdong Academy of Medical Sciences, Guangzhou, China
- Zhenhai Zhang
  - State Key Laboratory of Organ Failure Research, National Clinical Research Center for Kidney Disease, Division of Nephrology, Nanfang Hospital, Southern Medical University, Guangzhou, China
  - Department of Bioinformatics, School of Basic Medical Sciences, Southern Medical University, Guangzhou, China
  - Center for Precision Medicine, Guangdong Provincial People's Hospital, Guangdong Academy of Medical Sciences, Guangzhou, China
  - Key Laboratory of Mental Health of the Ministry of Education, Guangdong-Hong Kong-Macao Greater Bay Area Center for Brain Science and Brain-Inspired Intelligence, Southern Medical University, Guangzhou, China
  - Guangdong-Hong Kong Joint Laboratory on Immunological and Genetic Kidney Diseases, Guangdong Provincial People's Hospital, Guangdong Academy of Medical Sciences, Guangzhou, China

34
Milanetti E, Miotto M, Di Rienzo L, Nagaraj M, Monti M, Golbek TW, Gosti G, Roeters SJ, Weidner T, Otzen DE, Ruocco G. In-Silico Evidence for a Two Receptor Based Strategy of SARS-CoV-2. Front Mol Biosci 2021; 8:690655. [PMID: 34179095 PMCID: PMC8219949 DOI: 10.3389/fmolb.2021.690655] [Citation(s) in RCA: 44] [Impact Index Per Article: 11.0] [Received: 04/03/2021] [Accepted: 05/19/2021] [Indexed: 01/04/2023] Open
Abstract
We propose a computational investigation on the interaction mechanisms between SARS-CoV-2 spike protein and possible human cell receptors. In particular, we make use of our newly developed numerical method able to determine efficiently and effectively the relationship of complementarity between portions of protein surfaces. This innovative and general procedure, based on the representation of the molecular isoelectronic density surface in terms of 2D Zernike polynomials, allows the rapid and quantitative assessment of the geometrical shape complementarity between interacting proteins, which was unfeasible with previous methods. Our results indicate that SARS-CoV-2 uses a dual strategy: in addition to the known interaction with angiotensin-converting enzyme 2, the viral spike protein can also interact with sialic-acid receptors of the cells in the upper airways.
Affiliation(s)
- Edoardo Milanetti
- Department of Physics, Sapienza University, Rome, Italy
- Center for Life Nano and Neuro Science, Italian Institute of Technology, Rome, Italy
| | - Mattia Miotto
- Department of Physics, Sapienza University, Rome, Italy
- Center for Life Nano and Neuro Science, Italian Institute of Technology, Rome, Italy
| | - Lorenzo Di Rienzo
- Center for Life Nano and Neuro Science, Italian Institute of Technology, Rome, Italy
| | - Madhu Nagaraj
- Interdisciplinary Nanoscience Center (iNANO), Aarhus University, Aarhus, Denmark
| | - Michele Monti
- Centre for Genomic Regulation (CRG), the Barcelona Institute for Science and Technology, Barcelona, Spain
- RNA System Biology Lab, Department of Neuroscience and Brain Technologies, Istituto Italiano di Tecnologia, Genoa, Italy
| | | | - Giorgio Gosti
- Center for Life Nano and Neuro Science, Italian Institute of Technology, Rome, Italy
| | | | - Tobias Weidner
- Department of Chemistry, Aarhus University, Aarhus, Denmark
| | - Daniel E. Otzen
- Interdisciplinary Nanoscience Center (iNANO), Aarhus University, Aarhus, Denmark
| | - Giancarlo Ruocco
- Department of Physics, Sapienza University, Rome, Italy
- Center for Life Nano and Neuro Science, Italian Institute of Technology, Rome, Italy
| |
|
35
|
Liu Q, Wang PS, Zhu C, Gaines BB, Zhu T, Bi J, Song M. OctSurf: Efficient hierarchical voxel-based molecular surface representation for protein-ligand affinity prediction. J Mol Graph Model 2021; 105:107865. [PMID: 33640787 DOI: 10.1016/j.jmgm.2021.107865] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 11/17/2020] [Revised: 02/03/2021] [Accepted: 02/04/2021] [Indexed: 10/22/2022]
Abstract
Voxel-based 3D convolutional neural networks (CNNs) have been applied to predict protein-ligand binding affinity. However, the memory usage and computation cost of these voxel-based approaches increase cubically with respect to spatial resolution and sometimes make volumetric CNNs intractable at higher resolutions. Therefore, it is necessary to develop memory-efficient alternatives that can accelerate the convolutional operation on 3D volumetric representations of the protein-ligand interaction. In this study, we implement a novel volumetric representation, OctSurf, to characterize the 3D molecular surface of protein binding pockets and bound ligands. The OctSurf surface representation is built based on the octree data structure, which has been widely used in computer graphics to efficiently represent and store 3D object data. Vanilla 3D-CNN approaches often divide the 3D space of objects into equal-sized voxels. In contrast, OctSurf recursively partitions the 3D space containing the protein-ligand pocket into eight subspaces called octants. Only those octants containing van der Waals surface points of protein or ligand atoms undergo the recursive subdivision process until they reach the predefined octree depth, whereas unoccupied octants are kept intact to reduce the memory cost. Resulting non-empty leaf octants approximate molecular surfaces of the protein pocket and bound ligands. These surface octants, along with their chemical and geometric features, are used as the input to 3D-CNNs. Two kinds of CNN architectures, VGG and ResNet, are applied to the OctSurf representation to predict binding affinity. The OctSurf representation consumes much less memory than the conventional voxel representation at the same resolution. By restricting the convolution operation to only octants of the smallest size, our method also alleviates the overall computational overhead of CNN. 
A series of experiments are performed to demonstrate the disk storage and computational efficiency of the proposed learning method. Our code is available at the following GitHub repository: https://github.uconn.edu/mldrugdiscovery/OctSurf.
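The recursive subdivision described in this abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the OctSurf implementation: the function name `occupied_leaves`, the half-open octant bounds, and the toy points are hypothetical, and the published method additionally attaches chemical and geometric features to each leaf octant.

```python
from typing import List, Tuple

Point = Tuple[float, float, float]
Octant = Tuple[Point, float]  # (minimum corner, edge length)


def occupied_leaves(
    points: List[Point],
    origin: Point,
    size: float,
    max_depth: int,
    depth: int = 0,
) -> List[Octant]:
    """Return the non-empty leaf octants at max_depth (hypothetical helper).

    Only octants that contain at least one surface point are subdivided;
    empty octants are left intact and contribute nothing to the output.
    """
    if not points:
        return []  # unoccupied octant: stop here, no further subdivision
    if depth == max_depth:
        return [(origin, size)]  # non-empty leaf approximating the surface

    half = size / 2.0
    leaves: List[Octant] = []
    # Visit the eight child octants of the current cube.
    for ix in (0, 1):
        for iy in (0, 1):
            for iz in (0, 1):
                child = (
                    origin[0] + ix * half,
                    origin[1] + iy * half,
                    origin[2] + iz * half,
                )
                # Half-open bounds so each point falls in exactly one child.
                inside = [
                    p for p in points
                    if all(child[k] <= p[k] < child[k] + half for k in range(3))
                ]
                leaves.extend(
                    occupied_leaves(inside, child, half, max_depth, depth + 1)
                )
    return leaves
```

For two isolated points in a unit cube at depth 3, the sketch returns two leaf octants, whereas a dense grid at the same resolution would hold 8**3 = 512 voxels; this is the memory saving the abstract refers to.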
Affiliation(s)
- Qinqing Liu
- Department of Computer Science and Engineering, University of Connecticut, Storrs, CT 06279, USA
| | | | - Chunjiang Zhu
- Department of Computer Science and Engineering, University of Connecticut, Storrs, CT 06279, USA
| | - Blake Blumenfeld Gaines
- Department of Computer Science and Engineering, University of Connecticut, Storrs, CT 06279, USA
| | - Tan Zhu
- Department of Computer Science and Engineering, University of Connecticut, Storrs, CT 06279, USA
| | - Jinbo Bi
- Department of Computer Science and Engineering, University of Connecticut, Storrs, CT 06279, USA; Department of Biomedical Engineering, University of Connecticut, Storrs, CT 06279, USA
| | - Minghu Song
- Department of Biomedical Engineering, University of Connecticut, Storrs, CT 06279, USA.
| |
|
36
|
Narayanan H, Dingfelder F, Butté A, Lorenzen N, Sokolov M, Arosio P. Machine Learning for Biologics: Opportunities for Protein Engineering, Developability, and Formulation. Trends Pharmacol Sci 2021; 42:151-165. [DOI: 10.1016/j.tips.2020.12.004] [Citation(s) in RCA: 27] [Impact Index Per Article: 6.8] [Received: 08/19/2020] [Revised: 12/10/2020] [Accepted: 12/16/2020] [Indexed: 12/19/2022]
|
37
|
Miotto M, Di Rienzo L, Bò L, Boffi A, Ruocco G, Milanetti E. Molecular Mechanisms Behind Anti SARS-CoV-2 Action of Lactoferrin. Front Mol Biosci 2021; 8:607443. [PMID: 33659275 PMCID: PMC7917183 DOI: 10.3389/fmolb.2021.607443] [Citation(s) in RCA: 39] [Impact Index Per Article: 9.8] [Received: 09/17/2020] [Accepted: 01/11/2021] [Indexed: 12/15/2022] Open
Abstract
Despite the huge effort to contain the infection, the novel SARS-CoV-2 coronavirus has rapidly become pandemic, mainly due to its extremely high human-to-human transmission capability and a surprisingly high viral load in asymptomatic people. While the search for a vaccine is still ongoing, promising results have been obtained with antiviral compounds. In particular, lactoferrin is regarded to have beneficial effects both in preventing and soothing the infection. Here, we explore the possible molecular mechanisms by which lactoferrin interferes with SARS-CoV-2 cell invasion, preventing attachment and/or entry of the virus. To this aim, we search for possible interactions lactoferrin may have with virus structural proteins and host receptors. Representing the molecular iso-electron surface of proteins in terms of 2D-Zernike descriptors, we 1) identified putative regions on the lactoferrin surface able to bind sialic acid present on the host cell membrane, sheltering the cell from the virus attachment; 2) showed that no significant shape complementarity is present between lactoferrin and the ACE2 receptor, while 3) two high complementarity regions are found on the N- and C-terminal domains of the SARS-CoV-2 spike protein, hinting at a possible competition between lactoferrin and ACE2 for the binding to the spike protein.
Affiliation(s)
- Mattia Miotto
- Department of Physics, University of Rome 'La Sapienza', Rome, Italy
- Istituto Italiano di Tecnologia (IIT), Center for Life Nano Science, Rome, Italy
| | - Lorenzo Di Rienzo
- Istituto Italiano di Tecnologia (IIT), Center for Life Nano Science, Rome, Italy
| | - Leonardo Bò
- Istituto Italiano di Tecnologia (IIT), Center for Life Nano Science, Rome, Italy
| | - Alberto Boffi
- Istituto Italiano di Tecnologia (IIT), Center for Life Nano Science, Rome, Italy
- Department of Biochemical Sciences “A. Rossi Fanelli” Sapienza University, Rome, Italy
| | - Giancarlo Ruocco
- Department of Physics, University of Rome 'La Sapienza', Rome, Italy
- Istituto Italiano di Tecnologia (IIT), Center for Life Nano Science, Rome, Italy
| | - Edoardo Milanetti
- Department of Physics, University of Rome 'La Sapienza', Rome, Italy
- Istituto Italiano di Tecnologia (IIT), Center for Life Nano Science, Rome, Italy
| |
|
38
|
Lai PK, Fernando A, Cloutier TK, Kingsbury JS, Gokarn Y, Halloran KT, Calero-Rubio C, Trout BL. Machine Learning Feature Selection for Predicting High Concentration Therapeutic Antibody Aggregation. J Pharm Sci 2020; 110:1583-1591. [PMID: 33346034 DOI: 10.1016/j.xphs.2020.12.014] [Citation(s) in RCA: 18] [Impact Index Per Article: 3.6] [Received: 10/26/2020] [Revised: 11/25/2020] [Accepted: 12/11/2020] [Indexed: 02/03/2023]
Abstract
Protein aggregation can hinder the development, safety and efficacy of therapeutic antibody-based drugs. Developing a predictive model that evaluates aggregation behaviors during early stage development is therefore desirable. Machine learning is a widely used tool to train models that predict data with different attributes. However, most machine learning techniques require more data than is typically available in antibody development. In this work, we describe a rational feature selection framework to develop accurate models with a small number of features. We applied this framework to predict aggregation behaviors of 21 approved monospecific monoclonal antibodies at high concentration (150 mg/mL), yielding a correlation coefficient of 0.71 on validation tests with only two features using a linear model. The nearest neighbors and support vector regression models further improved the performance, with correlation coefficients of 0.86 and 0.80, respectively. This framework can be extended to train other models that predict different physical properties.
Affiliation(s)
- Pin-Kuang Lai
- Department of Chemical Engineering, Massachusetts Institute of Technology, Cambridge, MA 02139, USA
| | - Amendra Fernando
- Department of Chemical Engineering, Massachusetts Institute of Technology, Cambridge, MA 02139, USA
| | - Theresa K Cloutier
- Department of Chemical Engineering, Massachusetts Institute of Technology, Cambridge, MA 02139, USA
| | | | - Yatin Gokarn
- Biologics Development, Sanofi, Framingham, MA, USA
| | | | | | - Bernhardt L Trout
- Department of Chemical Engineering, Massachusetts Institute of Technology, Cambridge, MA 02139, USA.
| |
|
39
|
Milanetti E, Miotto M, Di Rienzo L, Monti M, Gosti G, Ruocco G. 2D Zernike polynomial expansion: Finding the protein-protein binding regions. Comput Struct Biotechnol J 2020; 19:29-36. [PMID: 33363707 PMCID: PMC7750141 DOI: 10.1016/j.csbj.2020.11.051] [Citation(s) in RCA: 21] [Impact Index Per Article: 4.2] [Received: 09/04/2020] [Revised: 11/26/2020] [Accepted: 11/28/2020] [Indexed: 01/26/2023] Open
Abstract
We present a method for efficiently and effectively assessing whether and where two proteins can interact with each other to form a complex. This is still largely an open problem, even for those relatively few cases where the 3D structure of both proteins is known. In fact, even if much of the information about the interaction is encoded in the chemical and geometric features of the structures, the set of possible contact patches and of their relative orientations is too large to be explored in a reasonable time, thus preventing the compilation of a reliable interactome. Our method rapidly and quantitatively measures the geometrical shape complementarity between interacting proteins by comparing their molecular iso-electron density surfaces, with the surface patches expanded in terms of 2D Zernike polynomials. We first test the method against the real binding region of a large dataset of known protein complexes, reaching a success rate of 0.72. We then apply the method for the blind recognition of binding sites, identifying the real region of interaction in about 60% of the analyzed cases. Finally, we investigate how the efficiency in finding the right binding region depends on the surface roughness as a function of the expansion order.
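As a rough illustration of what a 2D Zernike expansion involves, the sketch below evaluates the standard Zernike radial polynomial and numerically estimates one expansion coefficient of a function defined on the unit disk. It is a sketch under stated assumptions rather than the authors' code: the names `radial` and `zernike_coefficient`, the midpoint-rule integration, and the grid sizes are hypothetical, and how a protein surface patch is projected onto the disk is not covered here. The magnitude of each coefficient does not change when the patch is rotated about the disk centre, which is why such magnitudes are commonly used as orientation-independent shape descriptors.

```python
import cmath
import math
from typing import Callable


def radial(n: int, m: int, r: float) -> float:
    """Standard Zernike radial polynomial R_n^m(r) for 0 <= r <= 1."""
    m = abs(m)
    if (n - m) % 2:
        return 0.0  # R_n^m vanishes when n - m is odd
    total = 0.0
    for k in range((n - m) // 2 + 1):
        numerator = (-1) ** k * math.factorial(n - k)
        denominator = (
            math.factorial(k)
            * math.factorial((n + m) // 2 - k)
            * math.factorial((n - m) // 2 - k)
        )
        total += numerator / denominator * r ** (n - 2 * k)
    return total


def zernike_coefficient(
    f: Callable[[float, float], float],
    n: int,
    m: int,
    n_r: int = 64,
    n_theta: int = 64,
) -> complex:
    """Midpoint-rule estimate of c_nm = (n+1)/pi * integral of f * conj(Z_nm).

    f(r, theta) is any real function on the unit disk (hypothetical input,
    e.g. a height map of a surface patch); Z_nm = R_n^m(r) * exp(i*m*theta).
    """
    dr = 1.0 / n_r
    dtheta = 2.0 * math.pi / n_theta
    total = 0j
    for i in range(n_r):
        r = (i + 0.5) * dr
        rad = radial(n, m, r)
        for j in range(n_theta):
            theta = (j + 0.5) * dtheta
            # Area element in polar coordinates is r * dr * dtheta.
            total += f(r, theta) * rad * cmath.exp(-1j * m * theta) * r * dr * dtheta
    return (n + 1) / math.pi * total
```

Two patches could then be compared through the distance between their vectors of coefficient magnitudes; whether a small distance signals similarity or complementarity depends on how the two patches are oriented before projection, which this sketch leaves out.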
Affiliation(s)
- Edoardo Milanetti
- Department of Physics, Sapienza University, Piazzale Aldo Moro 5, 00185 Rome, Italy.,Center for Life Nanoscience, Istituto Italiano di Tecnologia, Viale Regina Elena 291, 00161 Rome, Italy
| | - Mattia Miotto
- Department of Physics, Sapienza University, Piazzale Aldo Moro 5, 00185 Rome, Italy.,Center for Life Nanoscience, Istituto Italiano di Tecnologia, Viale Regina Elena 291, 00161 Rome, Italy
| | - Lorenzo Di Rienzo
- Center for Life Nanoscience, Istituto Italiano di Tecnologia, Viale Regina Elena 291, 00161 Rome, Italy
| | - Michele Monti
- Centre for Genomic Regulation (CRG), The Barcelona Institute for Science and Technology, Dr. Aiguader 88, 08003 Barcelona, Spain.,RNA System Biology Lab, Department of Neuroscience and Brain Technologies, Istituto Italiano di Tecnologia, Via Morego 30, 16163 Genoa, Italy
| | - Giorgio Gosti
- Center for Life Nanoscience, Istituto Italiano di Tecnologia, Viale Regina Elena 291, 00161 Rome, Italy
| | - Giancarlo Ruocco
- Department of Physics, Sapienza University, Piazzale Aldo Moro 5, 00185 Rome, Italy.,Center for Life Nanoscience, Istituto Italiano di Tecnologia, Viale Regina Elena 291, 00161 Rome, Italy
| |
|
40
|
Norman RA, Ambrosetti F, Bonvin AMJJ, Colwell LJ, Kelm S, Kumar S, Krawczyk K. Computational approaches to therapeutic antibody design: established methods and emerging trends. Brief Bioinform 2020; 21:1549-1567. [PMID: 31626279 PMCID: PMC7947987 DOI: 10.1093/bib/bbz095] [Citation(s) in RCA: 121] [Impact Index Per Article: 24.2] [Received: 04/25/2019] [Revised: 06/07/2019] [Accepted: 07/05/2019] [Indexed: 12/31/2022] Open
Abstract
Antibodies are proteins that recognize the molecular surfaces of potentially noxious molecules to mount an adaptive immune response or, in the case of autoimmune diseases, molecules that are part of healthy cells and tissues. Due to their binding versatility, antibodies are currently the largest class of biotherapeutics, with five monoclonal antibodies ranked in the top 10 blockbuster drugs. Computational advances in protein modelling and design can have a tangible impact on antibody-based therapeutic development. Antibody-specific computational protocols currently benefit from an increasing volume of data provided by next generation sequencing and application to related drug modalities based on traditional antibodies, such as nanobodies. Here we present a structured overview of available databases, methods and emerging trends in computational antibody analysis and contextualize them towards the engineering of candidate antibody therapeutics.
|
41
|
Cloutier TK, Sudrik C, Mody N, Sathish HA, Trout BL. Machine Learning Models of Antibody–Excipient Preferential Interactions for Use in Computational Formulation Design. Mol Pharm 2020; 17:3589-3599. [DOI: 10.1021/acs.molpharmaceut.0c00629] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.0] [Indexed: 11/28/2022]
Affiliation(s)
- Theresa K. Cloutier
- Department of Chemical Engineering, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, United States
| | - Chaitanya Sudrik
- Department of Chemical Engineering, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, United States
| | - Neil Mody
- Dosage Form Design and Development, AstraZeneca, Gaithersburg, Maryland 20878, United States
| | - Hasige A. Sathish
- Dosage Form Design and Development, AstraZeneca, Gaithersburg, Maryland 20878, United States
| | - Bernhardt L. Trout
- Department of Chemical Engineering, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, United States
| |
|
42
|
Fernández-Quintero ML, Loeffler JR, Bacher LM, Waibl F, Seidler CA, Liedl KR. Local and Global Rigidification Upon Antibody Affinity Maturation. Front Mol Biosci 2020; 7:182. [PMID: 32850970 PMCID: PMC7426445 DOI: 10.3389/fmolb.2020.00182] [Citation(s) in RCA: 16] [Impact Index Per Article: 3.2] [Received: 05/20/2020] [Accepted: 07/13/2020] [Indexed: 01/03/2023] Open
Abstract
During the affinity maturation process the immune system produces antibodies with higher specificity and activity through various rounds of somatic hypermutations in response to an antigen. Elucidating the affinity maturation process is fundamental in understanding immunity and in the development of biotherapeutics. Therefore, we analyzed 10 pairs of antibody fragments differing in their specificity and in distinct stages of affinity maturation using metadynamics in combination with molecular dynamics (MD) simulations. We investigated differences in flexibility of the CDR-H3 loop and global changes in plasticity upon affinity maturation. Among all antibody pairs we observed a substantial rigidification in flexibility and plasticity reflected in a substantial decrease of conformational diversity. To visualize and characterize these findings we used Markov-states models to reconstruct the kinetics of CDR-H3 loop dynamics and for the first time provide a method to define and localize surface plasticity upon affinity maturation.
Affiliation(s)
| | | | | | | | | | - Klaus R. Liedl
- Center for Molecular Biosciences Innsbruck, Institute of General, Inorganic and Theoretical Chemistry, University of Innsbruck, Innsbruck, Austria
| |
|
43
|
Teraguchi S, Saputri DS, Llamas-Covarrubias MA, Davila A, Diez D, Nazlica SA, Rozewicki J, Ismanto HS, Wilamowski J, Xie J, Xu Z, Loza-Lopez MDJ, van Eerden FJ, Li S, Standley DM. Methods for sequence and structural analysis of B and T cell receptor repertoires. Comput Struct Biotechnol J 2020; 18:2000-2011. [PMID: 32802272 PMCID: PMC7366105 DOI: 10.1016/j.csbj.2020.07.008] [Citation(s) in RCA: 22] [Impact Index Per Article: 4.4] [Received: 04/30/2020] [Revised: 07/08/2020] [Accepted: 07/08/2020] [Indexed: 02/07/2023] Open
Abstract
B cell receptors (BCRs) and T cell receptors (TCRs) make up an essential network of defense molecules that, collectively, can distinguish self from non-self and facilitate destruction of antigen-bearing cells such as pathogens or tumors. The analysis of BCR and TCR repertoires plays an important role in both basic immunology and biotechnology. Because the repertoires are highly diverse, specialized software methods are needed to extract meaningful information from BCR and TCR sequence data. Here, we review recent developments in bioinformatics tools for analysis of BCR and TCR repertoires, with an emphasis on those that incorporate structural features. After describing the recent sequencing technologies for immune receptor repertoires, we survey structural modeling methods for BCRs and TCRs, along with methods for clustering such models. We review downstream analyses, including BCR and TCR epitope prediction, antibody-antigen docking and TCR-peptide-MHC modeling. We also briefly discuss molecular dynamics in this context.
Affiliation(s)
- Shunsuke Teraguchi
- Immunology Frontier Research Center, Osaka University, 3-1 Yamadaoka, Suita, Japan
- Research Institute for Microbial Diseases, Osaka University, 3-1 Yamadaoka, Suita, Japan
| | - Dianita S. Saputri
- Research Institute for Microbial Diseases, Osaka University, 3-1 Yamadaoka, Suita, Japan
| | - Mara Anais Llamas-Covarrubias
- Research Institute for Microbial Diseases, Osaka University, 3-1 Yamadaoka, Suita, Japan
- Departamento de Biología Molecular y Genómica, Centro Universitario de Ciencias de la Salud, Universidad de Guadalajara, Mexico
| | - Ana Davila
- Research Institute for Microbial Diseases, Osaka University, 3-1 Yamadaoka, Suita, Japan
| | - Diego Diez
- Immunology Frontier Research Center, Osaka University, 3-1 Yamadaoka, Suita, Japan
| | - Sedat Aybars Nazlica
- Immunology Frontier Research Center, Osaka University, 3-1 Yamadaoka, Suita, Japan
| | - John Rozewicki
- Immunology Frontier Research Center, Osaka University, 3-1 Yamadaoka, Suita, Japan
- Research Institute for Microbial Diseases, Osaka University, 3-1 Yamadaoka, Suita, Japan
| | - Hendra S. Ismanto
- Research Institute for Microbial Diseases, Osaka University, 3-1 Yamadaoka, Suita, Japan
| | - Jan Wilamowski
- Research Institute for Microbial Diseases, Osaka University, 3-1 Yamadaoka, Suita, Japan
| | - Jiaqi Xie
- Research Institute for Microbial Diseases, Osaka University, 3-1 Yamadaoka, Suita, Japan
| | - Zichang Xu
- Research Institute for Microbial Diseases, Osaka University, 3-1 Yamadaoka, Suita, Japan
| | | | - Floris J. van Eerden
- Immunology Frontier Research Center, Osaka University, 3-1 Yamadaoka, Suita, Japan
| | - Songling Li
- Research Institute for Microbial Diseases, Osaka University, 3-1 Yamadaoka, Suita, Japan
| | - Daron M. Standley
- Immunology Frontier Research Center, Osaka University, 3-1 Yamadaoka, Suita, Japan
- Research Institute for Microbial Diseases, Osaka University, 3-1 Yamadaoka, Suita, Japan
| |
|
44
|
Myung Y, Pires DEV, Ascher DB. mmCSM-AB: guiding rational antibody engineering through multiple point mutations. Nucleic Acids Res 2020; 48:W125-W131. [PMID: 32432715 PMCID: PMC7319589 DOI: 10.1093/nar/gkaa389] [Citation(s) in RCA: 26] [Impact Index Per Article: 5.2] [Received: 03/10/2020] [Revised: 04/18/2020] [Accepted: 05/16/2020] [Indexed: 12/15/2022] Open
Abstract
While antibodies are becoming an increasingly important therapeutic class, especially in personalized medicine, their development and optimization have been largely through experimental exploration. While there have been many efforts to develop computational tools to guide rational antibody engineering, most approaches are of limited accuracy when applied to antibody design, and have largely been limited to analysing a single point mutation at a time. To overcome this gap, we have curated a dataset of 242 experimentally determined changes in binding affinity upon multiple point mutations in antibody-target complexes (89 increasing and 153 decreasing binding affinity). Here, we have shown that by using our graph-based signatures and atomic interaction information, we can accurately analyse the consequence of multi-point mutations on antigen binding affinity. Our approach outperformed other available tools across cross-validation and two independent blind tests, achieving Pearson's correlations of up to 0.95. We have implemented our new approach, mmCSM-AB, as a web-server that can help guide the process of affinity maturation in antibody design. mmCSM-AB is freely available at http://biosig.unimelb.edu.au/mmcsm_ab/.
Affiliation(s)
- Yoochan Myung
- Computational Biology and Clinical Informatics, Baker Institute, Melbourne, VIC 3004, Australia
- Structural Biology and Bioinformatics, Department of Biochemistry and Molecular Biology, Bio21 Institute, University of Melbourne, Parkville, VIC 3052, Australia
| | - Douglas E V Pires
- Computational Biology and Clinical Informatics, Baker Institute, Melbourne, VIC 3004, Australia
- Structural Biology and Bioinformatics, Department of Biochemistry and Molecular Biology, Bio21 Institute, University of Melbourne, Parkville, VIC 3052, Australia
- School of Computing and Information Systems, University of Melbourne, Parkville, VIC 3052, Australia
| | - David B Ascher
- Computational Biology and Clinical Informatics, Baker Institute, Melbourne, VIC 3004, Australia
- Structural Biology and Bioinformatics, Department of Biochemistry and Molecular Biology, Bio21 Institute, University of Melbourne, Parkville, VIC 3052, Australia
- Department of Biochemistry, University of Cambridge, Cambridge CB2 1GA, UK
| |
|
45
|
Pittala S, Bailey-Kellogg C. Learning context-aware structural representations to predict antigen and antibody binding interfaces. Bioinformatics 2020; 36:3996-4003. [PMID: 32321157 PMCID: PMC7332568 DOI: 10.1093/bioinformatics/btaa263] [Citation(s) in RCA: 67] [Impact Index Per Article: 13.4] [Received: 10/20/2019] [Revised: 04/10/2020] [Accepted: 04/15/2020] [Indexed: 01/19/2023] Open
Abstract
MOTIVATION Understanding how antibodies specifically interact with their antigens can enable better drug and vaccine design, as well as provide insights into natural immunity. Experimental structural characterization can detail the 'ground truth' of antibody-antigen interactions, but computational methods are required to efficiently scale to large-scale studies. To increase prediction accuracy as well as to provide a means to gain new biological insights into these interactions, we have developed a unified deep learning-based framework to predict binding interfaces on both antibodies and antigens. RESULTS Our framework leverages three key aspects of antibody-antigen interactions to learn predictive structural representations: (i) since interfaces are formed from multiple residues in spatial proximity, we employ graph convolutions to aggregate properties across local regions in a protein; (ii) since interactions are specific between antibody-antigen pairs, we employ an attention layer to explicitly encode the context of the partner; (iii) since more data are available for general protein-protein interactions, we employ transfer learning to leverage this data as a prior for the specific case of antibody-antigen interactions. We show that this single framework achieves state-of-the-art performance at predicting binding interfaces on both antibodies and antigens, and that each of its three aspects drives additional improvement in the performance. We further show that the attention layer not only improves performance, but also provides a biologically interpretable perspective into the mode of interaction. AVAILABILITY AND IMPLEMENTATION The source code is freely available on github at https://github.com/vamships/PECAN.git.
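The first two ingredients named in this abstract, graph convolution over spatially close residues and attention over the binding partner, can be illustrated with a toy sketch. It is not the PECAN architecture: there are no learned weights, the mean aggregation and dot-product attention are assumptions chosen for brevity, and the names `graph_convolve` and `attend` are hypothetical.

```python
import math
from typing import Dict, List

Vector = List[float]


def graph_convolve(features: List[Vector], neighbors: Dict[int, List[int]]) -> List[Vector]:
    """Replace each residue's features with the mean over itself and its neighbors.

    `neighbors` maps a residue index to the indices of residues in spatial
    proximity (an assumed, precomputed contact graph).
    """
    smoothed = []
    for i, own in enumerate(features):
        group = [own] + [features[j] for j in neighbors.get(i, [])]
        smoothed.append([sum(column) / len(group) for column in zip(*group)])
    return smoothed


def attend(query: Vector, partner: List[Vector]) -> List[float]:
    """Softmax attention weights of one residue over all partner residues."""
    scores = [sum(q * p for q, p in zip(query, vec)) for vec in partner]
    peak = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]
```

In a trained model the aggregation and the attention scores would pass through learned weight matrices; the attention weights are the quantity that the abstract describes as giving a biologically interpretable view of the interaction.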
Affiliation(s)
- Srivamshi Pittala
- Department of Computer Science, Dartmouth College, Hanover, NH 03755, USA
| | | |
|
46
|
Marks C, Deane CM. How repertoire data are changing antibody science. J Biol Chem 2020; 295:9823-9837. [PMID: 32409582 DOI: 10.1074/jbc.rev120.010181] [Citation(s) in RCA: 34] [Impact Index Per Article: 6.8] [Received: 02/14/2020] [Revised: 04/28/2020] [Indexed: 12/13/2022] Open
Abstract
Antibodies are vital proteins of the immune system that recognize potentially harmful molecules and initiate their removal. Mammals can efficiently create vast numbers of antibodies with different sequences capable of binding to any antigen with high affinity and specificity. Because they can be developed to bind to many disease agents, antibodies can be used as therapeutics. In an organism, after antigen exposure, antibodies specific to that antigen are enriched through clonal selection, expansion, and somatic hypermutation. The antibodies present in an organism therefore report on its immune status, describe its innate ability to deal with harmful substances, and reveal how it has previously responded. Next-generation sequencing technologies are being increasingly used to query the antibody, or B-cell receptor (BCR), sequence repertoire, and the amount of BCR data in public repositories is growing. The Observed Antibody Space database, for example, currently contains over a billion sequences from 68 different studies. Repertoires are available that represent both the naive state (i.e. antigen-inexperienced) and that after immunization. This wealth of data has created opportunities to learn more about our immune system. In this review, we discuss the many ways in which BCR repertoire data have been or could be exploited. We highlight its utility for providing insights into how the naive immune repertoire is generated and how it responds to antigens. We also consider how structural information can be used to enhance these data and may lead to more accurate depictions of the sequence space and to applications in the discovery of new therapeutics.
Affiliation(s)
- Claire Marks
- Department of Statistics, University of Oxford, Oxford, United Kingdom
| | - Charlotte M Deane
- Department of Statistics, University of Oxford, Oxford, United Kingdom
| |
|
47
|
Deciphering interaction fingerprints from protein molecular surfaces using geometric deep learning. Nat Methods 2019; 17:184-192. [DOI: 10.1038/s41592-019-0666-6] [Citation(s) in RCA: 172] [Impact Index Per Article: 28.7] [Received: 04/11/2019] [Accepted: 10/28/2019] [Indexed: 02/05/2023]
|