1
Kalejaye L, Wu IE, Terry T, Lai PK. DeepSP: Deep learning-based spatial properties to predict monoclonal antibody stability. Comput Struct Biotechnol J 2024; 23:2220-2229. [PMID: 38827232 PMCID: PMC11140563 DOI: 10.1016/j.csbj.2024.05.029] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/02/2024] [Revised: 05/15/2024] [Accepted: 05/16/2024] [Indexed: 06/04/2024] Open
Abstract
Therapeutic antibody development faces challenges due to high viscosities and aggregation tendencies. The spatial charge map (SCM) and spatial aggregation propensity (SAP) are computational techniques that aid in predicting viscosity and aggregation, respectively. These methods rely on structural data derived from molecular dynamics (MD) simulations, which are computationally demanding. DeepSCM, a deep learning surrogate model based on sequence information to predict SCM, was recently developed to screen high-concentration antibody viscosity. This study further utilized a dataset of 20,530 antibody sequences to train a convolutional neural network deep learning surrogate model called Deep Spatial Properties (DeepSP). DeepSP directly predicts SAP and SCM scores in different domains of antibody variable regions based solely on their sequences without performing MD simulations. The linear correlation coefficient between DeepSP scores and MD-derived scores for 30 properties achieved values between 0.76 and 0.96 with an average of 0.87. DeepSP descriptors were employed as features to build machine learning models to predict the aggregation rate of 21 antibodies, and the performance is similar to the results obtained from the previous study using MD simulations. This result demonstrates that the DeepSP approach significantly reduces the computational time required compared to MD simulations. The DeepSP model enables the rapid generation of 30 structural properties that can also be used as features in other research to train machine learning models for predicting various antibody stability properties from sequences alone. DeepSP is freely available as an online tool via https://deepspwebapp.onrender.com and the code and parameters are freely available at https://github.com/Lailabcode/DeepSP.
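The abstract above evaluates the surrogate model by the linear correlation coefficient between predicted and MD-derived scores. As a minimal sketch of that metric (not the authors' evaluation code), the Pearson coefficient can be computed as follows; the function name `pearson_r` and the example values are hypothetical and chosen only for illustration.

```python
import math


def pearson_r(xs, ys):
    """Pearson linear correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of co-deviations and sums of squared deviations from the means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical scores for five antibodies (not data from the paper).
md_scores = [1.0, 2.0, 3.0, 4.0, 5.0]         # reference values, e.g. from MD
predicted_scores = [1.2, 1.9, 3.2, 3.8, 5.1]  # surrogate-model predictions

r = pearson_r(md_scores, predicted_scores)
print(f"Pearson r = {r:.3f}")
```

Under this reading, one such coefficient would be computed per property, giving the reported range over the 30 properties.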
Affiliation(s)
- Lateefat Kalejaye
- Department of Chemical Engineering and Materials Science, Stevens Institute of Technology, Hoboken 07030, NJ, United States
- I-En Wu
- Department of Chemical Engineering and Materials Science, Stevens Institute of Technology, Hoboken 07030, NJ, United States
- Taylor Terry
- Department of Chemical Engineering and Materials Science, Stevens Institute of Technology, Hoboken 07030, NJ, United States
- Pin-Kuang Lai
- Department of Chemical Engineering and Materials Science, Stevens Institute of Technology, Hoboken 07030, NJ, United States
2
Kim DN, McNaughton AD, Kumar N. Leveraging Artificial Intelligence to Expedite Antibody Design and Enhance Antibody-Antigen Interactions. Bioengineering (Basel) 2024; 11:185. [PMID: 38391671 PMCID: PMC10886287 DOI: 10.3390/bioengineering11020185] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/30/2023] [Revised: 01/30/2024] [Accepted: 02/06/2024] [Indexed: 02/24/2024] Open
Abstract
This perspective sheds light on the transformative impact of recent computational advancements in the field of protein therapeutics, with a particular focus on the design and development of antibodies. Cutting-edge computational methods have revolutionized our understanding of protein-protein interactions (PPIs), enhancing the efficacy of protein therapeutics in preclinical and clinical settings. Central to these advancements is the application of machine learning and deep learning, which offers unprecedented insights into the intricate mechanisms of PPIs and facilitates precise control over protein functions. Despite these advancements, the complex structural nuances of antibodies pose ongoing challenges in their design and optimization. Our review provides a comprehensive exploration of the latest deep learning approaches, including language models and diffusion techniques, and their role in surmounting these challenges. We also present a critical analysis of these methods, offering insights to drive further progress in this rapidly evolving field. The paper includes practical recommendations for the application of these computational techniques, supplemented with independent benchmark studies. These studies focus on key performance metrics such as accuracy and the ease of program execution, providing a valuable resource for researchers engaged in antibody design and development. Through this detailed perspective, we aim to contribute to the advancement of antibody design, equipping researchers with the tools and knowledge to navigate the complexities of this field.
Affiliation(s)
- Doo Nam Kim
- Pacific Northwest National Laboratory, 902 Battelle Blvd., Richland, WA 99352, USA
- Andrew D McNaughton
- Pacific Northwest National Laboratory, 902 Battelle Blvd., Richland, WA 99352, USA
- Neeraj Kumar
- Pacific Northwest National Laboratory, 902 Battelle Blvd., Richland, WA 99352, USA
3
Kumar N, Bajiya N, Patiyal S, Raghava GPS. Multi-perspectives and challenges in identifying B-cell epitopes. Protein Sci 2023; 32:e4785. [PMID: 37733481 PMCID: PMC10578127 DOI: 10.1002/pro.4785] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/26/2023] [Revised: 09/11/2023] [Accepted: 09/16/2023] [Indexed: 09/23/2023]
Abstract
The identification of B-cell epitopes (BCEs) in antigens is a crucial step in developing recombinant vaccines or immunotherapies for various diseases. Over the past four decades, numerous in silico methods have been developed for predicting BCEs. However, existing reviews have only covered specific aspects, such as the progress in predicting conformational or linear BCEs. Therefore, in this paper, we have undertaken a systematic approach to provide a comprehensive review covering all aspects associated with the identification of BCEs. First, we have covered the experimental techniques developed over the years for identifying linear and conformational epitopes, including the limitations and challenges associated with these techniques. Second, we have briefly described the historical perspectives and resources that maintain experimentally validated information on BCEs. Third, we have extensively reviewed the computational methods developed for predicting conformational BCEs from the structure of the antigen, as well as the methods for predicting conformational epitopes from the sequence. Fourth, we have systematically reviewed the in silico methods developed in the last four decades for predicting linear or continuous BCEs. Finally, we have discussed the overall challenge of identifying continuous or conformational BCEs. In this review, we only listed major computational resources; a complete list with the URL is available from the BCinfo website (https://webs.iiitd.edu.in/raghava/bcinfo/).
Affiliation(s)
- Nishant Kumar
- Department of Computational Biology, Indraprastha Institute of Information Technology, New Delhi, India
- Nisha Bajiya
- Department of Computational Biology, Indraprastha Institute of Information Technology, New Delhi, India
- Sumeet Patiyal
- Department of Computational Biology, Indraprastha Institute of Information Technology, New Delhi, India
- Gajendra P. S. Raghava
- Department of Computational Biology, Indraprastha Institute of Information Technology, New Delhi, India
4
Zhou Y, Huang Z, Li W, Wei J, Jiang Q, Yang W, Huang J. Deep learning in preclinical antibody drug discovery and development. Methods 2023; 218:57-71. [PMID: 37454742 DOI: 10.1016/j.ymeth.2023.07.003] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 11/22/2022] [Revised: 03/20/2023] [Accepted: 07/10/2023] [Indexed: 07/18/2023] Open
Abstract
Antibody drugs have become a key part of biotherapeutics. Patients suffering from various diseases have benefited from antibody therapies. However, their development process is rather long, expensive and risky. To speed up the process, reduce cost and improve the success rate, artificial intelligence, especially deep learning, has been widely used in all aspects of preclinical antibody drug development, from library generation to hit identification, developability screening, lead selection and optimization. In this review, we systematically summarize antibody encodings, deep learning architectures and models used in preclinical antibody drug discovery and development. We also critically discuss challenges and opportunities, problems and possible solutions, current applications and future directions of deep learning in antibody drug development.
Affiliation(s)
- Yuwei Zhou
- School of Life Science and Technology, University of Electronic Science and Technology of China, Chengdu 611731, China
- Ziru Huang
- School of Life Science and Technology, University of Electronic Science and Technology of China, Chengdu 611731, China
- Wenzhen Li
- School of Life Science and Technology, University of Electronic Science and Technology of China, Chengdu 611731, China
- Jinyi Wei
- School of Life Science and Technology, University of Electronic Science and Technology of China, Chengdu 611731, China
- Qianhu Jiang
- School of Life Science and Technology, University of Electronic Science and Technology of China, Chengdu 611731, China
- Wei Yang
- School of Computer Science and Engineering, University of Electronic Science and Technology of China, Chengdu 611731, China
- Jian Huang
- School of Life Science and Technology, University of Electronic Science and Technology of China, Chengdu 611731, China.
5
Hederman AP, Ackerman ME. Leveraging deep learning to improve vaccine design. Trends Immunol 2023; 44:333-344. [PMID: 37003949 PMCID: PMC10485910 DOI: 10.1016/j.it.2023.03.002] [Citation(s) in RCA: 4] [Impact Index Per Article: 4.0] [Received: 02/22/2023] [Revised: 03/05/2023] [Accepted: 03/05/2023] [Indexed: 04/03/2023]
Abstract
Deep learning has led to incredible breakthroughs in areas of research, from self-driving vehicles to solutions, to formal mathematical proofs. In the biomedical sciences, however, the revolutionary results seen in other fields are only now beginning to be realized. Vaccine research and development efforts represent an application with high public health significance. Protein structure prediction, immune repertoire analysis, and phylogenetics are three principal areas in which deep learning is poised to provide key advances. Here, we opine on some of the current challenges with deep learning and how they are being addressed. Despite the nascent stage of deep learning applications in immunological studies, there is ample opportunity to utilize this new technology to address the most challenging and burdensome infectious diseases confronting global populations.
Affiliation(s)
- Margaret E Ackerman
- Thayer School of Engineering, Dartmouth College, Hanover, NH, USA; Department of Microbiology and Immunology, Geisel School of Medicine, Hanover, NH, USA.
6
Zheng D, Liang S, Zhang C. B-Cell Epitope Predictions Using Computational Methods. Methods Mol Biol 2023; 2552:239-254. [PMID: 36346595 DOI: 10.1007/978-1-0716-2609-2_12] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/16/2023]
Abstract
Identifying protein antigenic epitopes that are recognizable by antibodies is a key step in immunologic research. This type of research has broad medical applications, such as new immunodiagnostic reagent discovery, vaccine design, and antibody design. However, due to the countless possibilities of potential epitopes, the experimental search through trial and error would be too costly and time-consuming to be practical. To facilitate this process and improve its efficiency, computational methods were developed to predict both linear epitopes and discontinuous antigenic epitopes. For linear B-cell epitope prediction, many methods were developed, including PREDITOP, PEOPLE, BEPITOPE, BepiPred, COBEpro, ABCpred, AAP, BCPred, BayesB, BEOracle/BROracle, BEST, LBEEP, DRREP, iBCE-EL, SVMTriP, etc. For the more challenging yet important task of discontinuous epitope prediction, methods were also developed, including CEP, DiscoTope, PEPITO, ElliPro, SEPPA, EPITOPIA, PEASE, EpiPred, SEPIa, EPCES, EPSVR, etc. In this chapter, we will discuss computational methods for B-cell epitope predictions of both linear and discontinuous epitopes. SVMTriP and EPCES/EPSVR, the most successful among the methods for each type of prediction, will be used as model methods to detail the standard protocols. For linear epitope prediction, SVMTriP was reported to achieve a sensitivity of 80.1% and a precision of 55.2% with a fivefold cross-validation based on a large dataset, yielding an AUC of 0.702. For discontinuous or conformational B-cell epitope prediction, EPCES and EPSVR were both benchmarked on a curated independent test dataset in which none of the antigens had complex structures with the antibody. The epitopes identified by these methods were later independently validated by various biochemical experiments. For these three model methods, webservers and all datasets are publicly available at http://sysbio.unl.edu/SVMTriP, http://sysbio.unl.edu/EPCES/, and http://sysbio.unl.edu/EPSVR/.
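The sensitivity and precision figures quoted in the abstract above are both ratios over confusion-matrix counts. The following is a minimal sketch of how they are defined for binary epitope labels; the helper names and the toy labels are hypothetical and are not taken from SVMTriP.

```python
def confusion_counts(y_true, y_pred):
    """Return (tp, fp, fn, tn) for binary labels where 1 = epitope, 0 = non-epitope."""
    if len(y_true) != len(y_pred):
        raise ValueError("label lists must have equal length")
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    return tp, fp, fn, tn


def sensitivity(tp, fn):
    """Fraction of true epitopes that were predicted as epitopes (recall)."""
    return tp / (tp + fn)


def precision(tp, fp):
    """Fraction of predicted epitopes that are true epitopes."""
    return tp / (tp + fp)


# Toy labels for eight candidate peptides (illustrative only).
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 1, 0, 0, 0]

tp, fp, fn, tn = confusion_counts(y_true, y_pred)
print("sensitivity:", sensitivity(tp, fn))
print("precision:", precision(tp, fp))
```

A high sensitivity paired with a lower precision, as reported for SVMTriP, means most true epitopes are recovered at the cost of a sizeable share of false positives.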
Affiliation(s)
- Dandan Zheng
- Department of Radiation Oncology, University of Rochester School of Medicine and Dentistry, Rochester, NY, USA
- Shide Liang
- Department of Research and Development, Bio-Thera Solutions, Guangzhou, China.
- Chi Zhang
- School of Biological Sciences, University of Nebraska, Lincoln, NE, USA.
7
Xu Z, Ismanto HS, Zhou H, Saputri DS, Sugihara F, Standley DM. Advances in antibody discovery from human BCR repertoires. Frontiers in Bioinformatics 2022; 2:1044975. [PMID: 36338807 PMCID: PMC9631452 DOI: 10.3389/fbinf.2022.1044975] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/15/2022] [Accepted: 10/11/2022] [Indexed: 11/06/2022] Open
Abstract
Antibodies make up an important and growing class of compounds used for the diagnosis or treatment of disease. While traditional antibody discovery utilized immunization of animals to generate lead compounds, technological innovations have made it possible to search for antibodies targeting a given antigen within the repertoires of B cells in humans. Here we group these innovations into four broad categories: cell sorting allows the collection of cells enriched in specificity to one or more antigens; BCR sequencing can be performed on bulk mRNA, genomic DNA or on paired (heavy-light) mRNA; BCR repertoire analysis generally involves clustering BCRs into specificity groups or more in-depth modeling of antibody-antigen interactions, such as antibody-specific epitope predictions; validation of antibody-antigen interactions requires expression of antibodies, followed by antigen binding assays or epitope mapping. Together with innovations in deep learning, these technologies will contribute to the future discovery of diagnostic and therapeutic antibodies directly from humans.
Affiliation(s)
- Zichang Xu
- Department of Genome Informatics, Research Institute for Microbial Diseases, Osaka University, Suita, Japan
- Hendra S. Ismanto
- Department of Genome Informatics, Research Institute for Microbial Diseases, Osaka University, Suita, Japan
- Hao Zhou
- Department of Genome Informatics, Research Institute for Microbial Diseases, Osaka University, Suita, Japan
- Dianita S. Saputri
- Department of Genome Informatics, Research Institute for Microbial Diseases, Osaka University, Suita, Japan
- Fuminori Sugihara
- Core Instrumentation Facility, Immunology Frontier Research Center, Osaka University, Suita, Japan
- Daron M. Standley
- Department of Genome Informatics, Research Institute for Microbial Diseases, Osaka University, Suita, Japan
- Department Systems Immunology, Immunology Frontier Research Center, Osaka University, Suita, Japan
- *Correspondence: Daron M. Standley,
8
Xu H, Zhao Z. NetBCE: An Interpretable Deep Neural Network for Accurate Prediction of Linear B-cell Epitopes. Genomics, Proteomics & Bioinformatics 2022; 20:1002-1012. [PMID: 36526218 PMCID: PMC10025766 DOI: 10.1016/j.gpb.2022.11.009] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 03/05/2022] [Revised: 10/27/2022] [Accepted: 11/11/2022] [Indexed: 12/15/2022]
Abstract
Identification of B-cell epitopes (BCEs) plays an essential role in the development of peptide vaccines and immuno-diagnostic reagents, as well as antibody design and production. In this work, we generated a large benchmark dataset comprising 124,879 experimentally supported linear epitope-containing regions in 3567 protein clusters from over 1.3 million B cell assays. Analysis of this curated dataset showed large pathogen diversity covering 176 different families. The accuracy in linear BCE prediction was found to strongly vary with different features, while all sequence-derived and structural features were informative. To search for more efficient and interpretable feature representations, a ten-layer deep learning framework for linear BCE prediction, namely NetBCE, was developed. NetBCE achieved high accuracy and robust performance with an average area under the curve (AUC) value of 0.8455 in five-fold cross-validation through automatically learning the informative classification features. NetBCE substantially outperformed the conventional machine learning algorithms and other tools, with more than 22.06% improvement of AUC value compared to other tools using an independent dataset. Through investigating the output of important network modules in NetBCE, epitopes and non-epitopes tended to be presented in distinct regions with efficient feature representation along the network layer hierarchy. NetBCE is freely available at https://github.com/bsml320/NetBCE.
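The abstract above reports an average AUC over five cross-validation folds. As a minimal sketch of that metric (not code from NetBCE), AUC can be computed as the probability that a randomly chosen positive scores higher than a randomly chosen negative, with ties counted as one half, and then averaged across folds. The function names and the per-fold scores below are hypothetical.

```python
def roc_auc(labels, scores):
    """AUC via pairwise comparison of positive and negative scores (ties count 0.5)."""
    positives = [s for l, s in zip(labels, scores) if l == 1]
    negatives = [s for l, s in zip(labels, scores) if l == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


def mean_auc(folds):
    """Average AUC over an iterable of (labels, scores) pairs, one pair per fold."""
    aucs = [roc_auc(labels, scores) for labels, scores in folds]
    return sum(aucs) / len(aucs)


# Two illustrative folds with made-up held-out predictions.
folds = [
    ([1, 1, 0, 0], [0.9, 0.8, 0.3, 0.1]),  # perfectly ranked fold
    ([1, 0, 1, 0], [0.9, 0.8, 0.3, 0.1]),  # one positive ranked below a negative
]
print("mean AUC:", mean_auc(folds))
```

The pairwise form is quadratic in the number of examples; it is used here for clarity, and a rank-based formulation would be preferable on large benchmark sets.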
Affiliation(s)
- Haodong Xu
- Center for Precision Health, School of Biomedical Informatics, The University of Texas Health Science Center at Houston, Houston, TX 77030, USA
- Zhongming Zhao
- Center for Precision Health, School of Biomedical Informatics, The University of Texas Health Science Center at Houston, Houston, TX 77030, USA; Human Genetics Center, School of Public Health, The University of Texas Health Science Center at Houston, Houston, TX 77030, USA; The University of Texas MD Anderson Cancer Center UTHealth Houston Graduate School of Biomedical Sciences, Houston, TX 77030, USA; Department of Biomedical Informatics, Vanderbilt University Medical Center, Nashville, TN 37203, USA.
9
Lai PK. DeepSCM: An efficient convolutional neural network surrogate model for the screening of therapeutic antibody viscosity. Comput Struct Biotechnol J 2022; 20:2143-2152. [PMID: 35832619 PMCID: PMC9092385 DOI: 10.1016/j.csbj.2022.04.035] [Citation(s) in RCA: 8] [Impact Index Per Article: 4.0] [Received: 03/12/2022] [Revised: 04/26/2022] [Accepted: 04/26/2022] [Indexed: 11/24/2022] Open
Abstract
Predicting high concentration antibody viscosity is essential for developing subcutaneous administration. Computer simulations provide promising tools to reach this aim. One such model is the spatial charge map (SCM) proposed by Agrawal and coworkers (mAbs. 2015, 8(1):43-48). SCM applies molecular dynamics simulations to calculate a score for the screening of antibody viscosity at high concentrations. However, molecular dynamics simulations are computationally costly and require structural information, a significant application bottleneck. In this work, high throughput computing was performed to calculate the SCM scores for 6596 nonredundant antibody variable regions. A convolutional neural network surrogate model, DeepSCM, requiring only sequence information, was then developed based on this dataset. The linear correlation coefficient of the DeepSCM and SCM scores achieved 0.9 on the test set (N = 1320). The DeepSCM model was applied to screen the viscosity of 38 therapeutic antibodies that SCM correctly classified and resulted in only one misclassification. The DeepSCM model will facilitate high concentration antibody viscosity screening. The code and parameters are freely available at https://github.com/Lailabcode/DeepSCM.
Affiliation(s)
- Pin-Kuang Lai
- Department of Chemical Engineering and Materials Science, Stevens Institute of Technology, Hoboken 07030, NJ, United States
10
Bukhari SNH, Jain A, Haq E, Mehbodniya A, Webber J. Machine Learning Techniques for the Prediction of B-Cell and T-Cell Epitopes as Potential Vaccine Targets with a Specific Focus on SARS-CoV-2 Pathogen: A Review. Pathogens 2022; 11:146. [PMID: 35215090 PMCID: PMC8879824 DOI: 10.3390/pathogens11020146] [Citation(s) in RCA: 19] [Impact Index Per Article: 9.5] [Received: 12/13/2021] [Revised: 01/19/2022] [Accepted: 01/21/2022] [Indexed: 02/01/2023] Open
Abstract
Only the part of an antigen (a protein molecule found on the surface of a pathogen) that is composed of epitopes specific to T and B cells is recognized by the human immune system (HIS). Identification of epitopes is considered critical for designing an epitope-based peptide vaccine (EBPV). Although there are a number of vaccine types, EBPVs have received less attention thus far. It is important to mention that EBPVs have a great deal of untapped potential for boosting vaccination safety; they are also less expensive and take a short time to produce. Thus, in order to quickly contain global pandemics such as the ongoing outbreak of coronavirus disease 2019 (COVID-19) caused by the severe acute respiratory syndrome coronavirus-2 (SARS-CoV-2), as well as epidemics and endemics, EBPVs are considered promising vaccine types. The high mutation rate of SARS-CoV-2 has posed a great challenge to public health worldwide because either the composition of existing vaccines has to be changed or a new vaccine has to be developed to protect against its different variants. In such scenarios, time being the critical factor, EBPVs can be a promising alternative. To design an effective and viable EBPV against different strains of a pathogen, it is important to identify the putative T- and B-cell epitopes. Using the wet-lab experimental approach to identify these epitopes is time-consuming and costly because the experimental screening of a vast number of potential epitope candidates is required. Fortunately, various available machine learning (ML)-based prediction methods have reduced the burden related to the epitope mapping process by decreasing the potential epitope candidate list for experimental trials. Moreover, these methods are also cost-effective, scalable, and fast. This paper presents a systematic review of various state-of-the-art and relevant ML-based methods and tools for predicting T- and B-cell epitopes. Special emphasis is placed on highlighting and analyzing various models for predicting epitopes of SARS-CoV-2, the causative agent of COVID-19. Based on the various methods and tools discussed, future research directions for epitope prediction are presented.
Affiliation(s)
- Syed Nisar Hussain Bukhari
- University Institute of Computing, Chandigarh University, NH-95, Chandigarh-Ludhiana Highway, Mohali 140413, India
- Correspondence:
- Amit Jain
- University Institute of Computing, Chandigarh University, NH-95, Chandigarh-Ludhiana Highway, Mohali 140413, India
- Ehtishamul Haq
- Department of Biotechnology, University of Kashmir, Srinagar 190006, India
- Abolfazl Mehbodniya
- Department of Electronics and Communication Engineering, Kuwait College of Science and Technology, Kuwait City 20185145, Kuwait
- Julian Webber
- Graduate School of Engineering Science, Osaka University, Osaka 560-8531, Japan
11
Tarasova O, Poroikov V. Machine Learning in Discovery of New Antivirals and Optimization of Viral Infections Therapy. Curr Med Chem 2021; 28:7840-7861. [PMID: 33949929 DOI: 10.2174/0929867328666210504114351] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 11/10/2020] [Revised: 02/13/2021] [Accepted: 02/24/2021] [Indexed: 11/22/2022]
Abstract
Nowadays, computational approaches play an important role in the design of new drug-like compounds and optimization of pharmacotherapeutic treatment of diseases. The emerging growth of viral infections, including those caused by the Human Immunodeficiency Virus (HIV), Ebola virus, recently detected coronavirus, and some others, leads to many newly infected people with a high risk of death or severe complications. A huge amount of chemical, biological, clinical data is at the disposal of the researchers. Therefore, there are many opportunities to find the relationships between the particular features of chemical data and the antiviral activity of biologically active compounds based on machine learning approaches. Biological and clinical data can also be used for building models to predict relationships between viral genotype and drug resistance, which might help determine the clinical outcome of treatment. In the current study, we consider machine-learning approaches in the antiviral research carried out during the past decade. We overview in detail the application of machine-learning methods for the design of new potential antiviral agents and vaccines, drug resistance prediction, and analysis of virus-host interactions. Our review also covers the perspectives of using the machine-learning approaches for antiviral research, including Dengue, Ebola viruses, Influenza A, Human Immunodeficiency Virus, coronaviruses, and some others.
Affiliation(s)
- Olga Tarasova
- Department of Bioinformatics, Institute of Biomedical Chemistry, Moscow, Russian Federation
- Vladimir Poroikov
- Department of Bioinformatics, Institute of Biomedical Chemistry, Moscow, Russian Federation
12
Galanis KA, Nastou KC, Papandreou NC, Petichakis GN, Pigis DG, Iconomidou VA. Linear B-Cell Epitope Prediction for In Silico Vaccine Design: A Performance Review of Methods Available via Command-Line Interface. Int J Mol Sci 2021; 22:3210. [PMID: 33809918 PMCID: PMC8004178 DOI: 10.3390/ijms22063210] [Citation(s) in RCA: 49] [Impact Index Per Article: 16.3] [Received: 01/12/2021] [Revised: 03/15/2021] [Accepted: 03/19/2021] [Indexed: 12/17/2022] Open
Abstract
Linear B-cell epitope prediction research has received a steadily growing interest ever since the first method was developed in 1981. B-cell epitope identification with the help of an accurate prediction method can lead to an overall faster and cheaper vaccine design process, a crucial necessity in the COVID-19 era. Consequently, several B-cell epitope prediction methods have been developed over the past few decades, but without significant success. In this study, we review the current performance and methodology of some of the most widely used linear B-cell epitope predictors which are available via a command-line interface, namely, BcePred, BepiPred, ABCpred, COBEpro, SVMTriP, LBtope, and LBEEP. Additionally, we attempted to remedy performance issues of the individual methods by developing a consensus classifier, which combines the separate predictions of these methods into a single output, accelerating the epitope-based vaccine design. While the method comparison was performed with some necessary caveats and individual methods might perform much better for specialized datasets, we hope that this update in performance can aid researchers towards the choice of a predictor, for the development of biomedical applications such as designed vaccines, diagnostic kits, immunotherapeutics, immunodiagnostic tests, antibody production, and disease diagnosis and therapy.
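The abstract above describes a consensus classifier that combines the separate predictions of several tools into a single output, but it does not state the combination rule. The sketch below assumes simple per-residue majority voting purely for illustration; it is not the authors' method, and the predictor names and calls are hypothetical.

```python
def consensus(predictions, min_votes=None):
    """Combine per-residue binary calls from several predictors by vote counting.

    predictions: dict mapping a predictor name to a list of 0/1 calls,
                 one call per residue of the same antigen sequence.
    min_votes:   votes needed to call a residue an epitope; defaults to a
                 strict majority of the predictors (an assumed rule).
    """
    lengths = {len(calls) for calls in predictions.values()}
    if len(lengths) != 1:
        raise ValueError("all predictors must cover the same residues")
    n_methods = len(predictions)
    if min_votes is None:
        min_votes = n_methods // 2 + 1
    length = lengths.pop()
    # Count how many predictors flag each residue position.
    votes = [sum(calls[i] for calls in predictions.values()) for i in range(length)]
    return [1 if v >= min_votes else 0 for v in votes]


# Hypothetical calls from three predictors over a five-residue stretch.
calls = {
    "method_a": [1, 1, 0, 0, 1],
    "method_b": [1, 0, 0, 1, 1],
    "method_c": [0, 1, 0, 0, 1],
}
print(consensus(calls))
```

Raising `min_votes` trades sensitivity for precision: requiring all predictors to agree keeps only the positions every tool flags.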
Affiliation(s)
- Vassiliki A. Iconomidou
- Section of Cell Biology and Biophysics, Department of Biology, School of Sciences, National and Kapodistrian University of Athens, 15701 Athens, Greece; (K.A.G.); (K.C.N.); (N.C.P.); (G.N.P.); (D.G.P.)
13
Ward D, Higgins M, Phelan JE, Hibberd ML, Campino S, Clark TG. An integrated in silico immuno-genetic analytical platform provides insights into COVID-19 serological and vaccine targets. Genome Med 2021; 13:4. [PMID: 33413610 PMCID: PMC7790334 DOI: 10.1186/s13073-020-00822-6] [Citation(s) in RCA: 14] [Impact Index Per Article: 4.7] [Received: 05/12/2020] [Accepted: 12/14/2020] [Indexed: 12/31/2022] Open
Abstract
During COVID-19, diagnostic serological tools and vaccines have been developed. To inform control activities in a post-vaccine surveillance setting, we have developed an online “immuno-analytics” resource that combines epitope, sequence, protein and SARS-CoV-2 mutation analysis. SARS-CoV-2 spike and nucleocapsid proteins are both vaccine and serological diagnostic targets. Using the tool, the nucleocapsid protein appears to be a sub-optimal target for use in serological platforms. Spike D614G (and nsp12 L314P) mutations were most frequent (> 86%), whilst spike A222V/L18F have recently increased. Also, Orf3a proteins may be a suitable target for serology. The tool can be accessed from: http://genomics.lshtm.ac.uk/immuno (online); https://github.com/dan-ward-bio/COVID-immunoanalytics (source code).
Collapse
Affiliation(s)
- Daniel Ward
- Department of Infection Biology, Faculty of Infectious and Tropical Diseases, London School of Hygiene and Tropical Medicine, Keppel Street, London, WC1E 7HT, UK.
| | - Matthew Higgins
- Department of Infection Biology, Faculty of Infectious and Tropical Diseases, London School of Hygiene and Tropical Medicine, Keppel Street, London, WC1E 7HT, UK
| | - Jody E Phelan
- Department of Infection Biology, Faculty of Infectious and Tropical Diseases, London School of Hygiene and Tropical Medicine, Keppel Street, London, WC1E 7HT, UK
| | - Martin L Hibberd
- Department of Infection Biology, Faculty of Infectious and Tropical Diseases, London School of Hygiene and Tropical Medicine, Keppel Street, London, WC1E 7HT, UK
| | - Susana Campino
- Department of Infection Biology, Faculty of Infectious and Tropical Diseases, London School of Hygiene and Tropical Medicine, Keppel Street, London, WC1E 7HT, UK
| | - Taane G Clark
- Department of Infection Biology, Faculty of Infectious and Tropical Diseases, London School of Hygiene and Tropical Medicine, Keppel Street, London, WC1E 7HT, UK; Faculty of Epidemiology and Population Health, London School of Hygiene and Tropical Medicine, Keppel Street, London, WC1E 7HT, UK.
| |
|
14
|
Zinsli LV, Stierlin N, Loessner MJ, Schmelcher M. Deimmunization of protein therapeutics - Recent advances in experimental and computational epitope prediction and deletion. Comput Struct Biotechnol J 2020; 19:315-329. [PMID: 33425259 PMCID: PMC7779837 DOI: 10.1016/j.csbj.2020.12.024] [Citation(s) in RCA: 28] [Impact Index Per Article: 7.0] [Received: 10/02/2020] [Revised: 12/15/2020] [Accepted: 12/16/2020] [Indexed: 12/11/2022] Open
Abstract
Biotherapeutics, and antimicrobial proteins in particular, are of increasing interest for human medicine. An important challenge in the development of such therapeutics is their potential immunogenicity, which can induce production of anti-drug-antibodies, resulting in altered pharmacokinetics, reduced efficacy, and potentially severe anaphylactic or hypersensitivity reactions. For this reason, the development and application of effective deimmunization methods for protein drugs is of utmost importance. Deimmunization may be achieved by unspecific shielding approaches, which include PEGylation, fusion to polypeptides (e.g., XTEN or PAS), reductive methylation, glycosylation, and polysialylation. Alternatively, the identification of epitopes for T cells or B cells and their subsequent deletion through site-directed mutagenesis represent promising deimmunization strategies and can be accomplished through either experimental or computational approaches. This review highlights the most recent advances and current challenges in the deimmunization of protein therapeutics, with a special focus on computational epitope prediction and deletion tools.
Key Words
- ABR, Antigen-binding region
- ADA, Anti-drug antibody
- ANN, Artificial neural network
- APC, Antigen-presenting cell
- Anti-drug-antibody
- B cell epitope
- BCR, B cell receptor
- Bab, Binding antibody
- CDR, Complementarity determining region
- CRISPR, Clustered regularly interspaced short palindromic repeats
- DC, Dendritic cell
- ELP, Elastin-like polypeptide
- EPO, Erythropoietin
- ER, Endoplasmatic reticulum
- GLK, Gelatin-like protein
- HAP, Homo-amino-acid polymer
- HLA, Human leukocyte antigen
- HMM, Hidden Markov model
- IL, Interleukin
- Ig, Immunoglobulin
- Immunogenicity
- LPS, Lipopolysaccharide
- MHC, Major histocompatibility complex
- NMR, Nuclear magnetic resonance
- Nab, Neutralizing antibody
- PAMP, Pathogen-associated molecular pattern
- PAS, Polypeptide composed of proline, alanine, and/or serine
- PBMC, Peripheral blood mononuclear cell
- PD, Pharmacodynamics
- PEG, Polyethylene glycol
- PK, Pharmacokinetics
- PRR, Pattern recognition receptor
- PSA, Sialic acid polymers
- Protein therapeutic
- RNN, Recurrent artificial neural network
- SVM, Support vector machine
- T cell epitope
- TAP, Transporter associated with antigen processing
- TCR, T cell receptor
- TLR, Toll-like receptor
- XTEN, “Xtended” recombinant polypeptide
Affiliation(s)
- Léa V. Zinsli
- Institute of Food, Nutrition and Health, ETH Zurich, Zurich, Switzerland
| | - Noël Stierlin
- Institute of Food, Nutrition and Health, ETH Zurich, Zurich, Switzerland
| | - Martin J. Loessner
- Institute of Food, Nutrition and Health, ETH Zurich, Zurich, Switzerland
| | - Mathias Schmelcher
- Institute of Food, Nutrition and Health, ETH Zurich, Zurich, Switzerland
| |
|
15
|
Keshavarzi Arshadi A, Webb J, Salem M, Cruz E, Calad-Thomson S, Ghadirian N, Collins J, Diez-Cecilia E, Kelly B, Goodarzi H, Yuan JS. Artificial Intelligence for COVID-19 Drug Discovery and Vaccine Development. Front Artif Intell 2020; 3:65. [PMID: 33733182 PMCID: PMC7861281 DOI: 10.3389/frai.2020.00065] [Citation(s) in RCA: 97] [Impact Index Per Article: 24.3] [Received: 05/09/2020] [Accepted: 07/17/2020] [Indexed: 12/31/2022] Open
Abstract
SARS-COV-2 has roused the scientific community with a call to action to combat the growing pandemic. At the time of this writing, there are as yet no novel antiviral agents or approved vaccines available for deployment as a frontline defense. Understanding the pathobiology of COVID-19 could aid scientists in their discovery of potent antivirals by elucidating unexplored viral pathways. One method for accomplishing this is the leveraging of computational methods to discover new candidate drugs and vaccines in silico. In the last decade, machine learning-based models, trained on specific biomolecules, have offered inexpensive and rapid implementation methods for the discovery of effective viral therapies. Given a target biomolecule, these models are capable of predicting inhibitor candidates in a structural-based manner. If enough data are presented to a model, it can aid the search for a drug or vaccine candidate by identifying patterns within the data. In this review, we focus on the recent advances of COVID-19 drug and vaccine development using artificial intelligence and the potential of intelligent training for the discovery of COVID-19 therapeutics. To facilitate applications of deep learning for SARS-COV-2, we highlight multiple molecular targets of COVID-19, inhibition of which may increase patient survival. Moreover, we present CoronaDB-AI, a dataset of compounds, peptides, and epitopes discovered either in silico or in vitro that can be potentially used for training models in order to extract COVID-19 treatment. The information and datasets provided in this review can be used to train deep learning-based models and accelerate the discovery of effective viral therapies.
Affiliation(s)
- Arash Keshavarzi Arshadi
- Burnett School of Biomedical Sciences, University of Central Florida, Orlando, FL, United States
| | - Julia Webb
- Burnett School of Biomedical Sciences, University of Central Florida, Orlando, FL, United States
| | - Milad Salem
- Department of Electrical and Computer Engineering, University of Central Florida, Orlando, FL, United States
| | | | | | - Niloofar Ghadirian
- Department of Chemistry and Biochemistry, University of Arizona, Tucson, AZ, United States
| | - Jennifer Collins
- Burnett School of Biomedical Sciences, University of Central Florida, Orlando, FL, United States
| | | | | | - Hani Goodarzi
- Department of Biochemistry and Biophysics, Helen Diller Family Comprehensive Cancer Center, University of California, San Francisco, San Francisco, CA, United States
| | - Jiann Shiun Yuan
- Department of Electrical and Computer Engineering, University of Central Florida, Orlando, FL, United States
| |
|
16
|
Identification and Analysis of Unstructured, Linear B-Cell Epitopes in SARS-CoV-2 Virion Proteins for Vaccine Development. Vaccines (Basel) 2020; 8:vaccines8030397. [PMID: 32698423 PMCID: PMC7564417 DOI: 10.3390/vaccines8030397] [Citation(s) in RCA: 14] [Impact Index Per Article: 3.5] [Received: 05/18/2020] [Revised: 07/14/2020] [Accepted: 07/17/2020] [Indexed: 12/13/2022] Open
Abstract
The efficacy of SARS-CoV-2 nucleic acid-based vaccines may be limited by proteolysis of the translated product due to anomalous protein folding. This may be the case for vaccines employing linear SARS-CoV-2 B-cell epitopes identified in previous studies since most of them participate in secondary structure formation. In contrast, we have employed a consensus of predictors for epitopic zones plus a structural filter for identifying 20 unstructured B-cell epitope-containing loops (uBCELs) in S, M, and N proteins. Phylogenetic comparison suggests epitope switching with respect to SARS-CoV in some of the identified uBCELs. Such events may be associated with the reported lack of serum cross-protection between the 2003 and 2019 pandemic strains. Incipient variability within a sample of 1639 SARS-CoV-2 isolates was also detected for 10 uBCELs which could cause vaccine failure. Intermediate stages of the putative epitope switch events were observed in bat coronaviruses in which additive mutational processes possibly facilitating evasion of the bat immune system appear to have taken place prior to transfer to humans. While there was some overlap between uBCELs and previously validated SARS-CoV B-cell epitopes, multiple uBCELs had not been identified in prior studies. Overall, these uBCELs may facilitate the development of biomedical products for SARS-CoV-2.
|
17
|
Graves J, Byerly J, Priego E, Makkapati N, Parish SV, Medellin B, Berrondo M. A Review of Deep Learning Methods for Antibodies. Antibodies (Basel) 2020; 9:E12. [PMID: 32354020 PMCID: PMC7344881 DOI: 10.3390/antib9020012] [Citation(s) in RCA: 26] [Impact Index Per Article: 6.5] [Received: 04/01/2020] [Revised: 04/15/2020] [Accepted: 04/16/2020] [Indexed: 01/09/2023] Open
Abstract
Driven by its successes across domains such as computer vision and natural language processing, deep learning has recently entered the field of biology by aiding in cellular image classification, finding genomic connections, and advancing drug discovery. In drug discovery and protein engineering, a major goal is to design a molecule that will perform a useful function as a therapeutic drug. Typically, the focus has been on small molecules, but new approaches have been developed to apply these same principles of deep learning to biologics, such as antibodies. Here we give a brief background of deep learning as it applies to antibody drug development, and an in-depth explanation of several deep learning algorithms that have been proposed to solve aspects of both protein design in general, and antibody design in particular.
Affiliation(s)
| | | | | | | | | | | | - Monica Berrondo
- Macromoltek, Inc, 2500 W William Cannon Dr, Suite 204, Austin, TX 78745, USA
| |
|
18
|
McDermott JE, Cort JR, Nakayasu ES, Pruneda JN, Overall C, Adkins JN. Prediction of bacterial E3 ubiquitin ligase effectors using reduced amino acid peptide fingerprinting. PeerJ 2019; 7:e7055. [PMID: 31211016 PMCID: PMC6557245 DOI: 10.7717/peerj.7055] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Received: 10/11/2018] [Accepted: 05/02/2019] [Indexed: 11/20/2022] Open
Abstract
Background: Although pathogenic Gram-negative bacteria lack their own ubiquitination machinery, they have evolved or acquired virulence effectors that can manipulate the host ubiquitination process through structural and/or functional mimicry of host machinery. Many such effectors have been identified in a wide variety of bacterial pathogens that share little sequence similarity amongst themselves or with eukaryotic ubiquitin E3 ligases. Methods: To allow identification of novel bacterial E3 ubiquitin ligase effectors from protein sequences we have developed a machine learning approach, the SVM-based Identification and Evaluation of Virulence Effector Ubiquitin ligases (SIEVE-Ub). We extend the string kernel approach used previously to sequence classification by introducing reduced amino acid (RED) alphabet encoding for protein sequences. Results: We found that 14mer peptides with amino acids represented as simply either hydrophobic or hydrophilic provided the best models for discrimination of E3 ligases from other effector proteins with a receiver-operator characteristic area under the curve (AUC) of 0.90. When considering a subset of E3 ubiquitin ligase effectors that do not fall into known sequence based families we found that the AUC was 0.82, demonstrating the effectiveness of our method at identifying novel functional family members. Feature selection was used to identify a parsimonious set of 10 RED peptides that provided good discrimination, and these peptides were found to be located in functionally important regions of the proteins involved in E2 and host target protein binding. Our general approach enables construction of models based on other effector functions. We used SIEVE-Ub to predict nine potential novel E3 ligases from a large set of bacterial genomes. SIEVE-Ub is available for download at https://doi.org/10.6084/m9.figshare.7766984.v1 or https://github.com/biodataganache/SIEVE-Ub for the most current version.
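The reduced-alphabet peptide features described in this abstract can be illustrated with a minimal Python sketch. It is not the SIEVE-Ub implementation: the `HYDROPHOBIC` residue set, the `H`/`P` symbols, and the function names `reduce_sequence` and `kmer_counts` are assumptions made for illustration, since the abstract does not state which residues fall into each class.

```python
from collections import Counter

# Assumed hydrophobic residue set; the abstract does not give the actual grouping.
HYDROPHOBIC = set("AVLIMFWC")


def reduce_sequence(seq: str) -> str:
    """Map each residue to 'H' (hydrophobic) or 'P' (everything else)."""
    return "".join("H" if aa in HYDROPHOBIC else "P" for aa in seq.upper())


def kmer_counts(seq: str, k: int = 14) -> Counter:
    """Count overlapping k-mers of the reduced sequence (k=14 as in the abstract)."""
    reduced = reduce_sequence(seq)
    return Counter(reduced[i:i + k] for i in range(len(reduced) - k + 1))


if __name__ == "__main__":
    example = "MKTAYIAKQR"  # arbitrary toy peptide, not from the paper
    print(reduce_sequence(example))
    print(kmer_counts(example, k=3).most_common(3))
```

Counts of this kind could serve as a feature vector for a classifier such as an SVM; a shorter `k` is used in the usage example only because the toy sequence is short.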
Affiliation(s)
- Jason E McDermott
- Biological Sciences Division, Pacific Northwest National Laboratory, Richland, WA, United States of America; Department of Molecular Microbiology and Immunology, Oregon Health & Science University, Portland, OR, United States of America
| | - John R Cort
- Biological Sciences Division, Pacific Northwest National Laboratory, Richland, WA, United States of America
| | - Ernesto S Nakayasu
- Biological Sciences Division, Pacific Northwest National Laboratory, Richland, WA, United States of America
| | - Jonathan N Pruneda
- Department of Molecular Microbiology and Immunology, Oregon Health & Science University, Portland, OR, United States of America
| | - Christopher Overall
- Center for Brain Immunology and Glia, University of Virginia, Charlottesville, United States of America
| | - Joshua N Adkins
- Biological Sciences Division, Pacific Northwest National Laboratory, Richland, WA, United States of America
| |
|
19
|
Ringel O, Vieillard V, Debré P, Eichler J, Büning H, Dietrich U. The Hard Way towards an Antibody-Based HIV-1 Env Vaccine: Lessons from Other Viruses. Viruses 2018; 10:v10040197. [PMID: 29662026 PMCID: PMC5923491 DOI: 10.3390/v10040197] [Citation(s) in RCA: 15] [Impact Index Per Article: 2.5] [Received: 03/16/2018] [Revised: 04/05/2018] [Accepted: 04/13/2018] [Indexed: 12/13/2022] Open
Abstract
Although effective antibody-based vaccines have been developed against multiple viruses, such approaches have so far failed for the human immunodeficiency virus type 1 (HIV-1). Despite the success of anti-retroviral therapy (ART) that has turned HIV-1 infection into a chronic disease and has reduced the number of new infections worldwide, a vaccine against HIV-1 is still urgently needed. We discuss here the major reasons for the failure of “classical” vaccine approaches, which are mostly due to the biological properties of the virus itself. HIV-1 has developed multiple mechanisms of immune escape, which also account for vaccine failure. So far, no vaccine candidate has been able to induce broadly neutralizing antibodies (bnAbs) against primary patient viruses from different clades. However, such antibodies were identified in a subset of patients during chronic infection and were shown to protect from infection in animal models and to reduce viremia in first clinical trials. Their detailed characterization has guided structure-based reverse vaccinology approaches to design better HIV-1 envelope (Env) immunogens. Furthermore, conserved Env epitopes have been identified, which are promising candidates in view of clinical applications. Together with new vector-based technologies, considerable progress has been achieved in recent years towards the development of an effective antibody-based HIV-1 vaccine.
Affiliation(s)
- Oliver Ringel
- Georg-Speyer-Haus, Institute for Tumor Biology and Experimental Therapy, 60596 Frankfurt, Germany.
| | - Vincent Vieillard
- Centre d'Immunologie et des Maladies Infectieuses (CIMI-Paris), Sorbonne Université, UPMC Univ Paris 06, INSERM U1135, CNRS ERL8255, 75013 Paris, France.
| | - Patrice Debré
- Centre d'Immunologie et des Maladies Infectieuses (CIMI-Paris), Sorbonne Université, UPMC Univ Paris 06, INSERM U1135, CNRS ERL8255, 75013 Paris, France.
| | - Jutta Eichler
- Department of Chemistry and Pharmacy, University of Erlangen-Nurnberg, 91058 Erlangen, Germany.
| | - Hildegard Büning
- Laboratory for Infection Biology & Gene Transfer, Institute of Experimental Hematology, Hannover Medical School, 30625 Hannover, Germany.
- German Center for Infection Research (DZIF), Partner Site Hannover-Braunschweig, 38124 Braunschweig, Germany.
| | - Ursula Dietrich
- Georg-Speyer-Haus, Institute for Tumor Biology and Experimental Therapy, 60596 Frankfurt, Germany.
| |
|
20
|
The International Conference on Intelligent Biology and Medicine (ICIBM) 2016: summary and innovation in genomics. BMC Genomics 2017; 18:703. [PMID: 28984207 PMCID: PMC5629612 DOI: 10.1186/s12864-017-4018-6] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.6] [Indexed: 01/06/2023] Open
Abstract
In this editorial, we first summarize the 2016 International Conference on Intelligent Biology and Medicine (ICIBM 2016) that was held on December 8–10, 2016 in Houston, Texas, USA, and then briefly introduce the ten research articles included in this supplement issue. ICIBM 2016 included four workshops or tutorials, four keynote lectures, four conference invited talks, eight concurrent scientific sessions and a poster session for 53 accepted abstracts, covering current topics in bioinformatics, systems biology, intelligent computing, and biomedical informatics. Through our call for papers, a total of 77 original manuscripts were submitted to ICIBM 2016. After peer review, 11 articles were selected in this special issue, covering topics such as single cell RNA-seq analysis method, genome sequence and variation analysis, bioinformatics method for vaccine development, and cancer genomics.
|