1. Planas-Iglesias J, Borko S, Swiatkowski J, Elias M, Havlasek M, Salamon O, Grakova E, Kunka A, Martinovic T, Damborsky J, Martinovic J, Bednar D. AggreProt: a web server for predicting and engineering aggregation prone regions in proteins. Nucleic Acids Res 2024; 52:W159-W169. [PMID: 38801076 PMCID: PMC11223854 DOI: 10.1093/nar/gkae420] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/10/2024] [Revised: 04/23/2024] [Accepted: 05/13/2024] [Indexed: 05/29/2024] Open
Abstract
Recombinant proteins play pivotal roles in numerous applications including industrial biocatalysts or therapeutics. Despite the recent progress in computational protein structure prediction, protein solubility and reduced aggregation propensity remain challenging attributes to design. Identification of aggregation-prone regions is essential for understanding misfolding diseases or designing efficient protein-based technologies, and as such has a great socio-economic impact. Here, we introduce AggreProt, a user-friendly webserver that automatically exploits an ensemble of deep neural networks to predict aggregation-prone regions (APRs) in protein sequences. Trained on experimentally evaluated hexapeptides, AggreProt compares to or outperforms state-of-the-art algorithms on two independent benchmark datasets. The server provides per-residue aggregation profiles along with information on solvent accessibility and transmembrane propensity within an intuitive interface with interactive sequence and structure viewers for comprehensive analysis. We demonstrate AggreProt efficacy in predicting differential aggregation behaviours in proteins on several use cases, which emphasize its potential for guiding protein engineering strategies towards decreased aggregation propensity and improved solubility. The webserver is freely available and accessible at https://loschmidt.chemi.muni.cz/aggreprot/.
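The abstract above describes a predictor trained on hexapeptides that reports per-residue aggregation profiles. One common way to turn window-level scores into a per-residue profile is to slide a six-residue window along the sequence and average the scores of all windows covering each residue. The sketch below illustrates only that aggregation step. It is not AggreProt's method: the scorer `toy_hexapeptide_score` (mean Kyte-Doolittle hydropathy) is a hypothetical stand-in for the neural network ensemble, and the function names, the threshold, and the example sequence are assumptions made for illustration.

```python
from typing import Callable, List, Tuple

# Kyte-Doolittle hydropathy scale, used here only as a placeholder scorer.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def toy_hexapeptide_score(peptide: str) -> float:
    """Hypothetical stand-in scorer: mean hydropathy of the window.

    Raises KeyError for residues outside the 20 standard amino acids.
    """
    return sum(KYTE_DOOLITTLE[aa] for aa in peptide) / len(peptide)


def per_residue_profile(
    sequence: str,
    score_fn: Callable[[str], float] = toy_hexapeptide_score,
    window: int = 6,
) -> List[float]:
    """Average the scores of all windows that cover each residue."""
    n = len(sequence)
    if n < window:
        raise ValueError("sequence is shorter than the window")
    totals = [0.0] * n
    counts = [0] * n
    for start in range(n - window + 1):
        score = score_fn(sequence[start:start + window])
        for i in range(start, start + window):
            totals[i] += score
            counts[i] += 1
    return [t / c for t, c in zip(totals, counts)]


def find_regions(profile: List[float], threshold: float) -> List[Tuple[int, int]]:
    """Return 0-based inclusive (start, end) runs where profile > threshold."""
    regions = []
    start = None
    for i, value in enumerate(profile):
        if value > threshold and start is None:
            start = i
        elif value <= threshold and start is not None:
            regions.append((start, i - 1))
            start = None
    if start is not None:
        regions.append((start, len(profile) - 1))
    return regions


if __name__ == "__main__":
    # Made-up sequence with a hydrophobic stretch in the middle.
    seq = "DDKKSSVVIILLVVDDKKSS"
    profile = per_residue_profile(seq)
    print(find_regions(profile, threshold=1.0))
```

Averaging over overlapping windows smooths the profile, so a single high-scoring window does not mark an isolated residue; a real predictor may combine window scores differently.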
Affiliation(s)
- Joan Planas-Iglesias
- Loschmidt Laboratories, Department of Experimental Biology and RECETOX, Faculty of Science, Masaryk University, Brno, Czech Republic
- International Clinical Research Center, St. Anne's University Hospital Brno, Brno, Czech Republic
- Simeon Borko
- Loschmidt Laboratories, Department of Experimental Biology and RECETOX, Faculty of Science, Masaryk University, Brno, Czech Republic
- International Clinical Research Center, St. Anne's University Hospital Brno, Brno, Czech Republic
- Jan Swiatkowski
- IT4Innovations, VSB – Technical University of Ostrava, 17. listopadu 2172/15, 708 00 Ostrava-Poruba, Czech Republic
- Matej Elias
- IT4Innovations, VSB – Technical University of Ostrava, 17. listopadu 2172/15, 708 00 Ostrava-Poruba, Czech Republic
- Martin Havlasek
- Loschmidt Laboratories, Department of Experimental Biology and RECETOX, Faculty of Science, Masaryk University, Brno, Czech Republic
- International Clinical Research Center, St. Anne's University Hospital Brno, Brno, Czech Republic
- Ondrej Salamon
- IT4Innovations, VSB – Technical University of Ostrava, 17. listopadu 2172/15, 708 00 Ostrava-Poruba, Czech Republic
- Ekaterina Grakova
- IT4Innovations, VSB – Technical University of Ostrava, 17. listopadu 2172/15, 708 00 Ostrava-Poruba, Czech Republic
- Antonín Kunka
- Loschmidt Laboratories, Department of Experimental Biology and RECETOX, Faculty of Science, Masaryk University, Brno, Czech Republic
- International Clinical Research Center, St. Anne's University Hospital Brno, Brno, Czech Republic
- Tomas Martinovic
- IT4Innovations, VSB – Technical University of Ostrava, 17. listopadu 2172/15, 708 00 Ostrava-Poruba, Czech Republic
- Jiri Damborsky
- Loschmidt Laboratories, Department of Experimental Biology and RECETOX, Faculty of Science, Masaryk University, Brno, Czech Republic
- International Clinical Research Center, St. Anne's University Hospital Brno, Brno, Czech Republic
- Jan Martinovic
- IT4Innovations, VSB – Technical University of Ostrava, 17. listopadu 2172/15, 708 00 Ostrava-Poruba, Czech Republic
- David Bednar
- Loschmidt Laboratories, Department of Experimental Biology and RECETOX, Faculty of Science, Masaryk University, Brno, Czech Republic
- International Clinical Research Center, St. Anne's University Hospital Brno, Brno, Czech Republic
2. Ghosh D, Biswas A, Radhakrishna M. Advanced computational approaches to understand protein aggregation. Biophysics Reviews 2024; 5:021302. [PMID: 38681860 PMCID: PMC11045254 DOI: 10.1063/5.0180691] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/11/2023] [Accepted: 03/18/2024] [Indexed: 05/01/2024]
Abstract
Protein aggregation is a widespread phenomenon implicated in debilitating diseases like Alzheimer's, Parkinson's, and cataracts, presenting complex hurdles for the field of molecular biology. In this review, we explore the evolving realm of computational methods and bioinformatics tools that have revolutionized our comprehension of protein aggregation. Beginning with a discussion of the multifaceted challenges associated with understanding this process and emphasizing the critical need for precise predictive tools, we highlight how computational techniques have become indispensable for understanding protein aggregation. We focus on molecular simulations, notably molecular dynamics (MD) simulations, spanning from atomistic to coarse-grained levels, which have emerged as pivotal tools in unraveling the complex dynamics governing protein aggregation in diseases such as cataracts, Alzheimer's, and Parkinson's. MD simulations provide microscopic insights into protein interactions and the subtleties of aggregation pathways, with advanced techniques like replica exchange molecular dynamics, Metadynamics (MetaD), and umbrella sampling enhancing our understanding by probing intricate energy landscapes and transition states. We delve into specific applications of MD simulations, elucidating the chaperone mechanism underlying cataract formation using Markov state modeling and the intricate pathways and interactions driving the toxic aggregate formation in Alzheimer's and Parkinson's disease. Transitioning, we highlight how computational techniques, including bioinformatics, sequence analysis, structural data, machine learning algorithms, and artificial intelligence, have become indispensable for predicting protein aggregation propensity and locating aggregation-prone regions within protein sequences. Throughout our exploration, we underscore the symbiotic relationship between computational approaches and empirical data, which has paved the way for potential therapeutic strategies against protein aggregation-related diseases. In conclusion, this review offers a comprehensive overview of advanced computational methodologies and bioinformatics tools that have catalyzed breakthroughs in unraveling the molecular basis of protein aggregation, with significant implications for clinical interventions, standing at the intersection of computational biology and experimental research.
Affiliation(s)
- Deepshikha Ghosh
- Department of Biological Sciences and Engineering, Indian Institute of Technology (IIT) Gandhinagar, Palaj, Gujarat 382355, India
- Anushka Biswas
- Department of Chemical Engineering, Indian Institute of Technology (IIT) Gandhinagar, Palaj, Gujarat 382355, India
3. Khalili K, Farzam F, Dabirmanesh B, Khajeh K. Prediction of protein aggregation. Progress in Molecular Biology and Translational Science 2024; 206:229-263. [PMID: 38811082 DOI: 10.1016/bs.pmbts.2024.03.005] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/31/2024]
Abstract
The scientific community is very interested in protein aggregation because of its involvement in several neurodegenerative diseases and its significance in industry. Remarkably, fibrillar aggregates are utilized naturally for constructing structural scaffolds or creating biological switches and may be intentionally designed to construct versatile nanomaterials. Consequently, there is a significant need to rationalize and predict protein aggregation. Researchers have developed various computational methodologies and algorithms to predict protein aggregation and understand its underlying mechanics. This chapter aims to summarize the significant advancements in computational methods, accessible resources, and prospective developments in the field of in silico research. We assess the existing computational tools for predicting protein aggregation propensities, detecting areas that are prone to sequential and structural aggregation, analyzing the effects of mutations on protein aggregation, or identifying prion-like domains.
Affiliation(s)
- Kavyan Khalili
- Department of Biochemistry, Faculty of Biological Sciences, Tarbiat Modares University, Tehran, Iran
- Farnoosh Farzam
- Department of Biochemistry, Faculty of Biological Sciences, Tarbiat Modares University, Tehran, Iran
- Bahareh Dabirmanesh
- Department of Biochemistry, Faculty of Biological Sciences, Tarbiat Modares University, Tehran, Iran
- Khosro Khajeh
- Department of Biochemistry, Faculty of Biological Sciences, Tarbiat Modares University, Tehran, Iran
4. Liao S, Zhang Y, Han X, Wang T, Wang X, Yan Q, Li Q, Qi Y, Zhang Z. A sequence-based model for identifying proteins undergoing liquid-liquid phase separation/forming fibril aggregates via machine learning. Protein Sci 2024; 33:e4927. [PMID: 38380794 PMCID: PMC10880426 DOI: 10.1002/pro.4927] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/15/2023] [Revised: 01/27/2024] [Accepted: 01/30/2024] [Indexed: 02/22/2024]
Abstract
Liquid-liquid phase separation (LLPS) and the formation of solid aggregates (also referred to as amyloid aggregates) by proteins have gained significant attention in recent years due to their associations with various physiological and pathological processes in living organisms. The differences and connections between proteins undergoing LLPS and those forming amyloid fibrils have not yet been systematically investigated at the sequence level. In this research, we aim to address this gap by comparing the two types of proteins across 36 features using currently available data. The statistical comparison indicates that 24 of the 36 selected features differ significantly between the two protein groups. An LLPS-Fibrils binary classification model built on these 24 features using random forest identifies the fraction of intrinsically disordered residues (FIDR) as the most crucial feature. In the further three-class LLPS-Fibrils-Background classification model built on the same screened features, by contrast, the compositions of cysteine and leucine contribute more significantly than the other features. Through feature ablation analysis, we finally constructed a model, FLFB (Feature-based LLPS-Fibrils-Background protein predictor), using six refined features, with an average area under the receiver operating characteristic curve of 0.83. This work indicates that proteins undergoing LLPS or forming amyloid fibrils can be identified using sequence features and a machine learning model.
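Two quantities named in the abstract above are simple to compute: amino acid composition (such as the cysteine and leucine fractions) and the area under the receiver operating characteristic curve. The sketch below shows both under stated assumptions. It does not reproduce the FLFB model, its feature set, or its data; the function names and the toy inputs are hypothetical, and the AUC is computed for a binary case only, using the pairwise-ranking definition.

```python
from typing import Dict, Sequence

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def composition(sequence: str) -> Dict[str, float]:
    """Fraction of each of the 20 standard amino acids in a sequence."""
    seq = sequence.upper()
    if not seq:
        raise ValueError("empty sequence")
    return {aa: seq.count(aa) / len(seq) for aa in AMINO_ACIDS}


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Binary ROC AUC as the probability that a positive outranks a negative.

    Ties between a positive and a negative score count as half a win.
    """
    positives = [s for label, s in zip(labels, scores) if label == 1]
    negatives = [s for label, s in zip(labels, scores) if label == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


if __name__ == "__main__":
    # Made-up sequence and scores, for illustration only.
    features = composition("MCLLKCAL")
    print(features["C"], features["L"])
    print(roc_auc([1, 1, 0, 0], [0.9, 0.4, 0.6, 0.1]))
```

The pairwise form is quadratic in the number of examples, which is fine for small sets; larger sets are usually handled with a rank-based formula or a library routine.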
Affiliation(s)
- Shaofeng Liao
- College of Life Sciences, University of Chinese Academy of Sciences, Beijing, China
- Yujun Zhang
- College of Life Sciences, University of Chinese Academy of Sciences, Beijing, China
- Xinchen Han
- College of Life Sciences, University of Chinese Academy of Sciences, Beijing, China
- Tinglan Wang
- College of Life Sciences, University of Chinese Academy of Sciences, Beijing, China
- Xi Wang
- College of Life Sciences, University of Chinese Academy of Sciences, Beijing, China
- Qinglin Yan
- College of Life Sciences, University of Chinese Academy of Sciences, Beijing, China
- Qian Li
- College of Life Sciences, University of Chinese Academy of Sciences, Beijing, China
- Yifei Qi
- School of Pharmacy, Fudan University, Shanghai, China
- Zhuqing Zhang
- College of Life Sciences, University of Chinese Academy of Sciences, Beijing, China
5. Kang S, Kim M, Sun J, Lee M, Min K. Prediction of Protein Aggregation Propensity via Data-Driven Approaches. ACS Biomater Sci Eng 2023; 9:6451-6463. [PMID: 37844262 DOI: 10.1021/acsbiomaterials.3c01001] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/18/2023]
Abstract
Protein aggregation occurs when misfolded or unfolded proteins physically bind together and can promote the development of various amyloid diseases. This study aimed to construct surrogate models for predicting protein aggregation via data-driven methods using two types of databases. First, an aggregation propensity score database was constructed by calculating the scores for protein structures in the Protein Data Bank using Aggrescan3D 2.0. Moreover, feature- and graph-based models for predicting protein aggregation have been developed by using this database. The graph-based model outperformed the feature-based model, resulting in an R² of 0.95, although it intrinsically required protein structures. Second, for the experimental data, a feature-based model was built using the Curated Protein Aggregation Database 2.0 to predict the aggregated intensity curves. In summary, this study suggests approaches that are more effective in predicting protein aggregation, depending on the type of descriptor and the database.
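The abstract above reports model quality as a coefficient of determination. As a reminder of what that number measures, the sketch below computes it in the usual way, as one minus the ratio of the residual sum of squares to the total sum of squares. The function name and the example values are assumptions for illustration and are unrelated to the study's data or models.

```python
from typing import Sequence


def r_squared(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    if ss_tot == 0:
        raise ValueError("y_true has zero variance; R^2 is undefined")
    return 1.0 - ss_res / ss_tot


if __name__ == "__main__":
    # Made-up reference scores and surrogate predictions.
    reference = [0.2, 0.5, 0.9, 1.4]
    predicted = [0.25, 0.45, 1.0, 1.3]
    print(r_squared(reference, predicted))
```

A value of 1 means the predictions match the reference exactly, and 0 means they do no better than always predicting the mean of the reference values.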
Affiliation(s)
- Seungpyo Kang
- School of Mechanical Engineering, Soongsil University, 369 Sangdo-ro, Dongjak-gu 06978, Seoul, Republic of Korea
- Minseon Kim
- School of Mechanical Engineering, Soongsil University, 369 Sangdo-ro, Dongjak-gu 06978, Seoul, Republic of Korea
- Jiwon Sun
- School of Mechanical Engineering, Soongsil University, 369 Sangdo-ro, Dongjak-gu 06978, Seoul, Republic of Korea
- Myeonghun Lee
- School of Systems Biomedical Science, Soongsil University, 369 Sangdo-ro, Dongjak-gu 06978, Seoul, Republic of Korea
- Kyoungmin Min
- School of Mechanical Engineering, Soongsil University, 369 Sangdo-ro, Dongjak-gu 06978, Seoul, Republic of Korea
6. Kouba P, Kohout P, Haddadi F, Bushuiev A, Samusevich R, Sedlar J, Damborsky J, Pluskal T, Sivic J, Mazurenko S. Machine Learning-Guided Protein Engineering. ACS Catal 2023; 13:13863-13895. [PMID: 37942269 PMCID: PMC10629210 DOI: 10.1021/acscatal.3c02743] [Citation(s) in RCA: 13] [Impact Index Per Article: 13.0] [Received: 06/15/2023] [Revised: 09/20/2023] [Indexed: 11/10/2023]
Abstract
Recent progress in engineering highly promising biocatalysts has increasingly involved machine learning methods. These methods leverage existing experimental and simulation data to aid in the discovery and annotation of promising enzymes, as well as in suggesting beneficial mutations for improving known targets. The field of machine learning for protein engineering is gathering steam, driven by recent success stories and notable progress in other areas. It already encompasses ambitious tasks such as understanding and predicting protein structure and function, catalytic efficiency, enantioselectivity, protein dynamics, stability, solubility, aggregation, and more. Nonetheless, the field is still evolving, with many challenges to overcome and questions to address. In this Perspective, we provide an overview of ongoing trends in this domain, highlight recent case studies, and examine the current limitations of machine learning-based methods. We emphasize the crucial importance of thorough experimental validation of emerging models before their use for rational protein design. We present our opinions on the fundamental problems and outline the potential directions for future research.
Affiliation(s)
- Petr Kouba
- Loschmidt Laboratories, Department of Experimental Biology and RECETOX, Faculty of Science, Masaryk University, Kamenice 5, 625 00 Brno, Czech Republic
- Czech Institute of Informatics, Robotics and Cybernetics, Czech Technical University in Prague, Jugoslavskych partyzanu 1580/3, 160 00 Prague 6, Czech Republic
- Faculty of Electrical Engineering, Czech Technical University in Prague, Technicka 2, 166 27 Prague 6, Czech Republic
- Pavel Kohout
- Loschmidt Laboratories, Department of Experimental Biology and RECETOX, Faculty of Science, Masaryk University, Kamenice 5, 625 00 Brno, Czech Republic
- International Clinical Research Center, St. Anne’s University Hospital Brno, Pekarska 53, 656 91 Brno, Czech Republic
- Faraneh Haddadi
- Loschmidt Laboratories, Department of Experimental Biology and RECETOX, Faculty of Science, Masaryk University, Kamenice 5, 625 00 Brno, Czech Republic
- International Clinical Research Center, St. Anne’s University Hospital Brno, Pekarska 53, 656 91 Brno, Czech Republic
- Anton Bushuiev
- Czech Institute of Informatics, Robotics and Cybernetics, Czech Technical University in Prague, Jugoslavskych partyzanu 1580/3, 160 00 Prague 6, Czech Republic
- Raman Samusevich
- Czech Institute of Informatics, Robotics and Cybernetics, Czech Technical University in Prague, Jugoslavskych partyzanu 1580/3, 160 00 Prague 6, Czech Republic
- Institute of Organic Chemistry and Biochemistry of the Czech Academy of Sciences, Flemingovo nám. 2, 160 00 Prague 6, Czech Republic
- Jiri Sedlar
- Czech Institute of Informatics, Robotics and Cybernetics, Czech Technical University in Prague, Jugoslavskych partyzanu 1580/3, 160 00 Prague 6, Czech Republic
- Jiri Damborsky
- Loschmidt Laboratories, Department of Experimental Biology and RECETOX, Faculty of Science, Masaryk University, Kamenice 5, 625 00 Brno, Czech Republic
- International Clinical Research Center, St. Anne’s University Hospital Brno, Pekarska 53, 656 91 Brno, Czech Republic
- Tomas Pluskal
- Institute of Organic Chemistry and Biochemistry of the Czech Academy of Sciences, Flemingovo nám. 2, 160 00 Prague 6, Czech Republic
- Josef Sivic
- Czech Institute of Informatics, Robotics and Cybernetics, Czech Technical University in Prague, Jugoslavskych partyzanu 1580/3, 160 00 Prague 6, Czech Republic
- Stanislav Mazurenko
- Loschmidt Laboratories, Department of Experimental Biology and RECETOX, Faculty of Science, Masaryk University, Kamenice 5, 625 00 Brno, Czech Republic
- International Clinical Research Center, St. Anne’s University Hospital Brno, Pekarska 53, 656 91 Brno, Czech Republic
7. Cao S, Song Z, Rong J, Andrikopoulos N, Liang X, Wang Y, Peng G, Ding F, Ke PC. Spike Protein Fragments Promote Alzheimer's Amyloidogenesis. ACS Applied Materials & Interfaces 2023; 15:40317-40329. [PMID: 37585091 PMCID: PMC10480042 DOI: 10.1021/acsami.3c09815] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 08/17/2023]
Abstract
Alzheimer's disease (AD) is a major cause of dementia inducing memory loss, cognitive decline, and mortality among the aging population. While the amyloid aggregation of peptide Aβ has long been implicated in neurodegeneration in AD, primarily through the production of toxic polymorphic aggregates and reactive oxygen species, viral infection has a less explicit role in the etiology of the brain disease. On the other hand, while the COVID-19 pandemic is known to harm human organs and function, its adverse effects on AD pathobiology and other human conditions remain unclear. Here we first identified the amyloidogenic potential of ¹⁰⁵⁸HGVVFLHVTYV¹⁰⁶⁸, a short fragment of the spike protein of SARS-CoV-2 coronavirus. The peptide fragment was found to be toxic and displayed a high binding propensity for the amyloidogenic segments of Aβ, thereby promoting the aggregation and toxicity of the peptide in vitro and in silico, while retarding the hatching and survival of zebrafish embryos upon exposure. Our study implicated SARS-CoV-2 viral infection as a potential contributor to AD pathogenesis, a little-explored area in our quest for understanding and overcoming Long Covid.
Affiliation(s)
- Sujian Cao
- Nanomedicine Center, The Great Bay Area National Institute for Nanotechnology Innovation, 136 Kaiyuan Avenue, Guangzhou, 510700, China
- Zhiyuan Song
- Department of Physics and Astronomy, Clemson University, Clemson, SC 29634, USA
- Jinyu Rong
- College of Environmental Science and Engineering, Tongji University, 1239 Siping Road, Shanghai 200092, China
- Nicholas Andrikopoulos
- Nanomedicine Center, The Great Bay Area National Institute for Nanotechnology Innovation, 136 Kaiyuan Avenue, Guangzhou, 510700, China
- Drug Delivery, Disposition and Dynamics, Monash Institute of Pharmaceutical Sciences, Monash University, 381 Royal Parade, Parkville, VIC 3052, Australia
- Xiufang Liang
- Nanomedicine Center, The Great Bay Area National Institute for Nanotechnology Innovation, 136 Kaiyuan Avenue, Guangzhou, 510700, China
- School of Biomedical Sciences and Engineering, Guangzhou International Campus, South China University of Technology, Guangzhou, 510006, China
- Yue Wang
- Nanomedicine Center, The Great Bay Area National Institute for Nanotechnology Innovation, 136 Kaiyuan Avenue, Guangzhou, 510700, China
- School of Biomedical Sciences and Engineering, Guangzhou International Campus, South China University of Technology, Guangzhou, 510006, China
- Guotao Peng
- College of Environmental Science and Engineering, Tongji University, 1239 Siping Road, Shanghai 200092, China
- Feng Ding
- Department of Physics and Astronomy, Clemson University, Clemson, SC 29634, USA
- Pu Chun Ke
- Nanomedicine Center, The Great Bay Area National Institute for Nanotechnology Innovation, 136 Kaiyuan Avenue, Guangzhou, 510700, China
- Drug Delivery, Disposition and Dynamics, Monash Institute of Pharmaceutical Sciences, Monash University, 381 Royal Parade, Parkville, VIC 3052, Australia
8. Bauer J, Rajagopal N, Gupta P, Gupta P, Nixon AE, Kumar S. How can we discover developable antibody-based biotherapeutics? Front Mol Biosci 2023; 10:1221626. [PMID: 37609373 PMCID: PMC10441133 DOI: 10.3389/fmolb.2023.1221626] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/12/2023] [Accepted: 07/10/2023] [Indexed: 08/24/2023] Open
Abstract
Antibody-based biotherapeutics have emerged as a successful class of pharmaceuticals despite significant challenges and risks to their discovery and development. This review discusses the most frequently encountered hurdles in the research and development (R&D) of antibody-based biotherapeutics and proposes a conceptual framework called biopharmaceutical informatics. Our vision advocates for the syncretic use of computation and experimentation at every stage of biologic drug discovery, considering developability (manufacturability, safety, efficacy, and pharmacology) of potential drug candidates from the earliest stages of the drug discovery phase. The computational advances in recent years allow for more precise formulation of disease concepts, rapid identification, and validation of targets suitable for therapeutic intervention and discovery of potential biotherapeutics that can agonize or antagonize them. Furthermore, computational methods for de novo and epitope-specific antibody design are increasingly being developed, opening novel computationally driven opportunities for biologic drug discovery. Here, we review the opportunities and limitations of emerging computational approaches for optimizing antigens to generate robust immune responses, in silico generation of antibody sequences, discovery of potential antibody binders through virtual screening, assessment of hits, identification of lead drug candidates and their affinity maturation, and optimization for developability. The adoption of biopharmaceutical informatics across all aspects of drug discovery and development cycles should help bring affordable and effective biotherapeutics to patients more quickly.
Affiliation(s)
- Joschka Bauer
- Early Stage Pharmaceutical Development Biologicals, Boehringer Ingelheim Pharma GmbH & Co. KG, Biberach/Riss, Germany
- In Silico Team, Boehringer Ingelheim, Hannover, Germany
- Nandhini Rajagopal
- In Silico Team, Boehringer Ingelheim, Hannover, Germany
- Biotherapeutics Discovery, Boehringer Ingelheim Pharmaceuticals Inc., Ridgefield, CT, United States
- Priyanka Gupta
- In Silico Team, Boehringer Ingelheim, Hannover, Germany
- Biotherapeutics Discovery, Boehringer Ingelheim Pharmaceuticals Inc., Ridgefield, CT, United States
- Pankaj Gupta
- In Silico Team, Boehringer Ingelheim, Hannover, Germany
- Biotherapeutics Discovery, Boehringer Ingelheim Pharmaceuticals Inc., Ridgefield, CT, United States
- Andrew E. Nixon
- Biotherapeutics Discovery, Boehringer Ingelheim Pharmaceuticals Inc., Ridgefield, CT, United States
- Sandeep Kumar
- In Silico Team, Boehringer Ingelheim, Hannover, Germany
- Biotherapeutics Discovery, Boehringer Ingelheim Pharmaceuticals Inc., Ridgefield, CT, United States
9. Mills BJ, Godamudunage MP, Ren S, Laha M. Predictive Nature of High-Throughput Assays in ADC Formulation Screening. J Pharm Sci 2023; 112:1821-1831. [PMID: 37037342 DOI: 10.1016/j.xphs.2023.03.021] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/19/2022] [Revised: 03/28/2023] [Accepted: 03/29/2023] [Indexed: 04/12/2023]
Abstract
Utilization of high-throughput biophysical screening techniques during early screening studies is warranted due to the limited amount of material and large number of samples. However, how well these data predict longer-term storage stability is critical, as the high-throughput methods assist in defining the design space for the longer-term studies. In this study, the biophysical properties of two ADCs in 16 formulation conditions were evaluated using high-throughput techniques. Conformational stability and colloidal stability were evaluated by determining Tm values, kD, B22, and Tagg. In addition, the samples were placed on stability and the extent of aggregate formation over the 8-week interval was determined. The rank order of the 16 different formulations in the high-throughput assays was compared to the rank order observed during the stability studies to assess the predictive capabilities of the screening methods. It was demonstrated that similar rank orders can be expected between high-throughput physical-stability-indicating assays such as Tagg and B22 and traditional aggregation data from SEC, whereas conformational stability read-outs (Tm) are less predictive. In addition, the high-throughput assays appropriately identified the poorly performing formulation conditions, which is ultimately what is desired of screening assays.
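The abstract above compares the rank order of formulations in screening assays with the rank order seen in stability studies. A standard way to quantify agreement between two rank orders is the Spearman rank correlation; the abstract does not say which statistic the authors used, so the sketch below is an assumption, and the function names and example values are hypothetical rather than data from the study.

```python
import math
from typing import List, Sequence


def average_ranks(values: Sequence[float]) -> List[float]:
    """1-based ranks in ascending order; tied values share their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)


def spearman(x: Sequence[float], y: Sequence[float]) -> float:
    """Spearman rank correlation: Pearson correlation of the ranks."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("inputs must have equal length of at least 2")
    return pearson(average_ranks(x), average_ranks(y))


if __name__ == "__main__":
    # Made-up values for five formulations: an onset temperature from a
    # screening assay and percent aggregate after storage.
    onset_temperature = [62.0, 58.5, 65.1, 60.2, 55.0]
    percent_aggregate = [1.2, 2.5, 0.8, 1.9, 3.4]
    print(spearman(onset_temperature, percent_aggregate))
```

When a higher screening value indicates better stability and the stability readout is percent aggregate, good agreement shows up as a strongly negative coefficient, so the sign convention should be fixed before comparing assays.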
Affiliation(s)
- Brittney J Mills
- Biologics CMC Drug Product Development, AbbVie Inc., 1 N Waukegan Road, North Chicago, IL 60064, United States
- Malika P Godamudunage
- Biologics CMC Drug Product Development, AbbVie Inc., 1 N Waukegan Road, North Chicago, IL 60064, United States
- Siyuan Ren
- Biologics CMC Drug Product Development, AbbVie Inc., 1 N Waukegan Road, North Chicago, IL 60064, United States
- Malabika Laha
- Biologics CMC Drug Product Development, AbbVie Inc., 1 N Waukegan Road, North Chicago, IL 60064, United States
10. Chen Z, Wang X, Chen X, Huang J, Wang C, Wang J, Wang Z. Accelerating therapeutic protein design with computational approaches toward the clinical stage. Comput Struct Biotechnol J 2023; 21:2909-2926. [PMID: 38213894 PMCID: PMC10781723 DOI: 10.1016/j.csbj.2023.04.027] [Citation(s) in RCA: 4] [Impact Index Per Article: 4.0] [Received: 02/15/2023] [Revised: 04/11/2023] [Accepted: 04/27/2023] [Indexed: 01/13/2024] Open
Abstract
Therapeutic proteins, typified by antibodies, are of increasing interest in human medicine. However, clinical translation of therapeutic proteins is still largely hindered by different aspects of developability, including affinity and selectivity, stability and aggregation prevention, solubility and viscosity reduction, and deimmunization. Conventional optimization of developability with widely used methods, like display technologies and library screening approaches, is a time- and cost-intensive endeavor, and its efficiency in finding suitable solutions is still not enough to meet clinical needs. In recent years, the accelerated advancement of computational methodologies has ushered in a transformative era in the field of therapeutic protein design. Owing to their remarkable capabilities in feature extraction and modeling, the integration of cutting-edge computational strategies with conventional techniques presents a promising avenue to accelerate the progression of therapeutic protein design and optimization toward clinical implementation. Here, we compare therapeutic proteins and small molecules in terms of developability and provide an overview of the computational approaches applicable to the design or optimization of therapeutic proteins for several developability issues.
Affiliation(s)
- Zhidong Chen
- Department of Pathology, The Eighth Affiliated Hospital, Sun Yat-sen University, Shenzhen 518033, China
- School of Pharmaceutical Sciences, Shenzhen Campus of Sun Yat-sen University, Shenzhen 518107, China
- Xinpei Wang
- School of Pharmaceutical Sciences, Shenzhen Campus of Sun Yat-sen University, Shenzhen 518107, China
- Xu Chen
- School of Pharmaceutical Sciences, Shenzhen Campus of Sun Yat-sen University, Shenzhen 518107, China
- Juyang Huang
- School of Pharmaceutical Sciences, Shenzhen Campus of Sun Yat-sen University, Shenzhen 518107, China
- Chenglin Wang
- Shenzhen Qiyu Biotechnology Co., Ltd, Shenzhen 518107, China
- Junqing Wang
- School of Pharmaceutical Sciences, Shenzhen Campus of Sun Yat-sen University, Shenzhen 518107, China
- Zhe Wang
- Department of Pathology, The Eighth Affiliated Hospital, Sun Yat-sen University, Shenzhen 518033, China
11. Fernández-Quintero ML, Ljungars A, Waibl F, Greiff V, Andersen JT, Gjølberg TT, Jenkins TP, Voldborg BG, Grav LM, Kumar S, Georges G, Kettenberger H, Liedl KR, Tessier PM, McCafferty J, Laustsen AH. Assessing developability early in the discovery process for novel biologics. MAbs 2023; 15:2171248. [PMID: 36823021 PMCID: PMC9980699 DOI: 10.1080/19420862.2023.2171248] [Citation(s) in RCA: 12] [Impact Index Per Article: 12.0] [Received: 10/27/2022] [Accepted: 01/18/2023] [Indexed: 02/25/2023] Open
Abstract
Beyond potency, a good developability profile is a key attribute of a biological drug. Selecting and screening for such attributes early in the drug development process can save resources and avoid costly late-stage failures. Here, we review some of the most important developability properties that can be assessed early on for biologics. These include the influence of the source of the biologic, its biophysical and pharmacokinetic properties, and how well it can be expressed recombinantly. We furthermore present in silico, in vitro, and in vivo methods and techniques that can be exploited at different stages of the discovery process to identify molecules with liabilities and thereby facilitate the selection of the most optimal drug leads. Finally, we reflect on the most relevant developability parameters for injectable versus orally delivered biologics and provide an outlook toward what general trends are expected to rise in the development of biologics.
Affiliation(s)
- Monica L. Fernández-Quintero
- Center for Molecular Biosciences Innsbruck (CMBI), Department of General, Inorganic and Theoretical Chemistry, University of Innsbruck, Innsbruck, Austria
| | - Anne Ljungars
- Department of Biotechnology and Biomedicine, Technical University of Denmark, Kongens Lyngby, Denmark
| | - Franz Waibl
- Center for Molecular Biosciences Innsbruck (CMBI), Department of General, Inorganic and Theoretical Chemistry, University of Innsbruck, Innsbruck, Austria
| | - Victor Greiff
- Department of Immunology, University of Oslo, Oslo, Norway
| | - Jan Terje Andersen
- Department of Immunology, University of Oslo, Oslo University Hospital Rikshospitalet, Oslo, Norway
- Institute of Clinical Medicine and Department of Pharmacology, University of Oslo, Oslo, Norway
| | | | - Timothy P. Jenkins
- Department of Biotechnology and Biomedicine, Technical University of Denmark, Kongens Lyngby, Denmark
| | - Bjørn Gunnar Voldborg
- National Biologics Facility, Department of Biotechnology and Biomedicine, Technical University of Denmark, Kongens Lyngby, Denmark
| | - Lise Marie Grav
- Department of Biotechnology and Biomedicine, Technical University of Denmark, Kongens Lyngby, Denmark
| | - Sandeep Kumar
- Biotherapeutics Discovery, Boehringer Ingelheim Pharmaceuticals Inc, Ridgefield, CT, USA
| | - Guy Georges
- Roche Pharma Research and Early Development, Large Molecule Research, Roche Innovation Center Munich, Penzberg, Germany
| | - Hubert Kettenberger
- Roche Pharma Research and Early Development, Large Molecule Research, Roche Innovation Center Munich, Penzberg, Germany
| | - Klaus R. Liedl
- Center for Molecular Biosciences Innsbruck (CMBI), Department of General, Inorganic and Theoretical Chemistry, University of Innsbruck, Innsbruck, Austria
| | - Peter M. Tessier
- Department of Chemical Engineering, Pharmaceutical Sciences and Biomedical Engineering, Biointerfaces Institute, University of Michigan, Ann Arbor, Michigan, USA
| | - John McCafferty
- Department of Medicine, Cambridge Institute of Therapeutic Immunology and Infectious Disease, University of Cambridge, Cambridge, UK
- Maxion Therapeutics, Babraham Research Campus, Cambridge, UK
| | - Andreas H. Laustsen
- Department of Biotechnology and Biomedicine, Technical University of Denmark, Kongens Lyngby, Denmark
| |
|
12
|
Miller ME, Li MH, Baghai A, Peetz VH, Zhyvoloup A, Raleigh DP. Analysis of Sheep and Goat IAPP Provides Insight into IAPP Amyloidogenicity and Cytotoxicity. Biochemistry 2022; 61:2531-2545. [PMID: 36286531 PMCID: PMC11132794 DOI: 10.1021/acs.biochem.2c00470] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/29/2022]
Abstract
Human islet amyloid polypeptide (hIAPP) plays a role in glucose regulation but forms pancreatic amyloid deposits in type 2 diabetes, and that process contributes to β-cell dysfunction. Not all species develop diabetes, and not all secrete an IAPP that is amyloidogenic in vitro under normal conditions; a strong correlation exists between the two. Studies of IAPPs from such organisms can provide clues about the high amyloidogenicity of hIAPP and can inform the design of soluble analogues of hIAPP. Sheep and goat IAPP are among the most divergent from hIAPP, with 13 and 11 substitutions, respectively, including an unusual Tyr to His substitution at the C-terminus. The properties of sheep and goat IAPP were examined in solution and in the presence of anionic vesicles, and no amyloid formation was observed, even at increased concentrations. Furthermore, both peptides are considerably less toxic to cultured β-cells than hIAPP. The effect of the Y37H replacement was studied in the context of hIAPP, as was a Y37R substitution. Buffer- and salt-dependent effects were observed. There was little impact on the time to form amyloid in phosphate-buffered saline; however, a significant deceleration was observed in Tris buffer, and amyloid formation was slower in the absence of added salt. The Y37H substitution had little impact on toxicity, while the Y37R replacement led to a 30% decrease in toxicity compared with that of hIAPP. The implications for the amyloidogenicity of hIAPP and the design of soluble analogues of the human peptide are discussed.
Affiliation(s)
- Matthew E.T. Miller
- Department of Chemistry, Stony Brook University, Nicolls Road, Stony Brook, New York 11790, United States
| | - Ming-Hao Li
- Graduate Program in Biochemistry and Structural Biology, Stony Brook University, Stony Brook, New York 11790, United States
| | - Aria Baghai
- Institute of Structural and Molecular Biology, University College London, Gower Street, London WC1E 6BT, United Kingdom
| | - Vincent H. Peetz
- Department of Chemistry, Stony Brook University, Nicolls Road, Stony Brook, New York 11790, United States
| | - Alexander Zhyvoloup
- Institute of Structural and Molecular Biology, University College London, Gower Street, London WC1E 6BT, United Kingdom
| | - Daniel P. Raleigh
- Department of Chemistry, Stony Brook University, Nicolls Road, Stony Brook, New York 11790, United States
- Graduate Program in Biochemistry and Structural Biology, Stony Brook University, Stony Brook, New York 11790, United States
- Institute of Structural and Molecular Biology, University College London, Gower Street, London WC1E 6BT, United Kingdom
- Laufer Center for Physical and Quantitative Biology, Stony Brook University, Stony Brook, New York 11794, United States
| |
|
13
|
Qing R, Hao S, Smorodina E, Jin D, Zalevsky A, Zhang S. Protein Design: From the Aspect of Water Solubility and Stability. Chem Rev 2022; 122:14085-14179. [PMID: 35921495 PMCID: PMC9523718 DOI: 10.1021/acs.chemrev.1c00757] [Citation(s) in RCA: 54] [Impact Index Per Article: 27.0] [Received: 08/31/2021] [Indexed: 12/13/2022]
Abstract
Water solubility and structural stability are key merits for proteins defined by the primary sequence and 3D-conformation. Their manipulation represents important aspects of the protein design field that relies on the accurate placement of amino acids and molecular interactions, guided by underlying physicochemical principles. Emulated designer proteins with well-defined properties both fuel the knowledge-base for more precise computational design models and are used in various biomedical and nanotechnological applications. The continuous developments in protein science, increasing computing power, new algorithms, and characterization techniques provide sophisticated toolkits for solubility design beyond guess work. In this review, we summarize recent advances in the protein design field with respect to water solubility and structural stability. After introducing fundamental design rules, we discuss the transmembrane protein solubilization and de novo transmembrane protein design. Traditional strategies to enhance protein solubility and structural stability are introduced. The designs of stable protein complexes and high-order assemblies are covered. Computational methodologies behind these endeavors, including structure prediction programs, machine learning algorithms, and specialty software dedicated to the evaluation of protein solubility and aggregation, are discussed. The findings and opportunities for Cryo-EM are presented. This review provides an overview of significant progress and prospects in accurate protein design for solubility and stability.
Affiliation(s)
- Rui Qing
- State Key Laboratory of Microbial Metabolism, School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, Shanghai 200240, China
- Media Lab, Massachusetts Institute of Technology, 77 Massachusetts Avenue, Cambridge, Massachusetts 02139, United States
- The David H. Koch Institute for Integrative Cancer Research, Massachusetts Institute of Technology, 77 Massachusetts Avenue, Cambridge, Massachusetts 02139, United States
| | - Shilei Hao
- Media Lab, Massachusetts Institute of Technology, 77 Massachusetts Avenue, Cambridge, Massachusetts 02139, United States
- Key Laboratory of Biorheological Science and Technology, Ministry of Education, College of Bioengineering, Chongqing University, Chongqing 400030, China
| | - Eva Smorodina
- Department of Immunology, University of Oslo and Oslo University Hospital, Oslo 0424, Norway
| | - David Jin
- Avalon GloboCare Corp., Freehold, New Jersey 07728, United States
| | - Arthur Zalevsky
- Laboratory of Bioinformatics Approaches in Combinatorial Chemistry and Biology, Shemyakin-Ovchinnikov Institute of Bioorganic Chemistry RAS, Moscow 117997, Russia
| | - Shuguang Zhang
- Media Lab, Massachusetts Institute of Technology, 77 Massachusetts Avenue, Cambridge, Massachusetts 02139, United States
| |
|
14
|
Nordquist EB, Clerico EM, Chen J, Gierasch LM. Computationally-Aided Modeling of Hsp70-Client Interactions: Past, Present, and Future. J Phys Chem B 2022; 126:6780-6791. [PMID: 36040440 PMCID: PMC10309085 DOI: 10.1021/acs.jpcb.2c03806] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Indexed: 01/13/2023]
Abstract
Hsp70 molecular chaperones play central roles in maintaining a healthy cellular proteome. Hsp70s function by binding to short peptide sequences in incompletely folded client proteins, thus preventing them from misfolding and/or aggregating, and in many cases holding them in a state that is competent for subsequent processes like translocation across membranes. There is considerable interest in predicting the sites where Hsp70s may bind their clients, as the ability to do so sheds light on the cellular functions of the chaperone. In addition, the capacity of the Hsp70 chaperone family to bind to a broad array of clients and to identify accessible sequences that enable discrimination of those that are folded from those that are not fully folded, which is essential to their cellular roles, is a fascinating puzzle in molecular recognition. In this article we discuss efforts to harness computational modeling with input from experimental data to develop a predictive understanding of the promiscuous yet selective binding of Hsp70 molecular chaperones to accessible sequences within their client proteins. We trace how an increasing understanding of the complexities of Hsp70-client interactions has led computational modeling to new underlying assumptions and design features. We describe the trend from purely data-driven analysis toward increased reliance on physics-based modeling that deeply integrates structural information and sequence-based functional data with physics-based binding energies. Notably, new experimental insights are adding to our understanding of the molecular origins of "selective promiscuity" in substrate binding by Hsp70 chaperones and challenging the underlying assumptions and design used in earlier predictive models. 
Taking the new experimental findings together with exciting progress in computational modeling of protein structures leads us to foresee a bright future for a predictive understanding of selective-yet-promiscuous binding exploited by Hsp70 molecular chaperones; the resulting new insights will also apply to substrate binding by other chaperones and by signaling proteins.
Affiliation(s)
- Erik B. Nordquist
- Department of Chemistry, University of Massachusetts, Amherst, Massachusetts, 01003, United States
| | - Eugenia M. Clerico
- Department of Biochemistry and Molecular Biology, University of Massachusetts, Amherst, Massachusetts, 01003, United States
| | - Jianhan Chen
- Department of Chemistry, University of Massachusetts, Amherst, Massachusetts, 01003, United States
- Department of Biochemistry and Molecular Biology, University of Massachusetts, Amherst, Massachusetts, 01003, United States
| | - Lila M. Gierasch
- Department of Chemistry, University of Massachusetts, Amherst, Massachusetts, 01003, United States
- Department of Biochemistry and Molecular Biology, University of Massachusetts, Amherst, Massachusetts, 01003, United States
| |
|
15
|
Bhagavatula H, Sarkar A, Santra B, Das A. Scan-Find-Scan-Model: Discrete Site-Targeted Suppressor Design Strategy for Amyloid-β. ACS Chem Neurosci 2022; 13:2191-2208. [PMID: 35767676 DOI: 10.1021/acschemneuro.2c00272] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Indexed: 12/18/2022]
Abstract
Alzheimer's disease is undoubtedly the most well-studied neurodegenerative disease. Consequently, the amyloid-β (Aβ) protein ranks at the top in terms of getting attention from the scientific community for structural property-based characterization. Even after decades of extensive research, there is existing volatility in terms of understanding and hence the effective tackling procedures against the disease that arises due to the lack of knowledge of both specific target- and site-specific drugs. Here, we develop a multidimensional approach based on the characterization of the common static-dynamic-thermodynamic trait of the monomeric protein, which efficiently identifies a small target sequence that contains an inherent tendency to misfold and consequently aggregate. The robustness of the identification of the target sequence comes with an abundance of a priori knowledge about the length and sequence of the target and hence guides toward effective designing of the target-specific drug with a very low probability of bottleneck and failure. Based on the target sequence information, we further identified a specific mutant that showed the maximum potential to act as a destabilizer of the monomeric protein as well as enormous success as an aggregation suppressor. We eventually tested the drug efficacy by estimating the extent of modulation of binding affinity existing within the fibrillar form of the Aβ protein due to a single-point mutation and hence provided a proof of concept of the entire protocol.
Affiliation(s)
- Hasathi Bhagavatula
- Department of Biotechnology, Progressive Education Society's Modern College of Arts Science and Commerce, Shivajinagar, Pune 411005, India
| | - Archishman Sarkar
- School of Applied and Interdisciplinary Sciences, Indian Association for the Cultivation of Science, 2A & 2B, Raja Subodh Chandra Mallick Road, Kolkata, West Bengal 700032, India
| | - Binit Santra
- Department of Chemistry, Indian Institute of Technology Kanpur, Kalyanpur, Kanpur, Uttar Pradesh 208016, India
| | - Atanu Das
- Physical and Materials Chemistry Division, CSIR-National Chemical Laboratory, Dr. Homi Bhabha Road, Pune, Maharashtra 411008, India; Academy of Scientific and Innovative Research (AcSIR), Ghaziabad 201002, India
| |
|
16
|
Charoenkwan P, Ahmed S, Nantasenamat C, Quinn JMW, Moni MA, Lio' P, Shoombuatong W. AMYPred-FRL is a novel approach for accurate prediction of amyloid proteins by using feature representation learning. Sci Rep 2022; 12:7697. [PMID: 35546347 PMCID: PMC9095707 DOI: 10.1038/s41598-022-11897-z] [Citation(s) in RCA: 23] [Impact Index Per Article: 11.5] [Received: 09/11/2021] [Accepted: 05/03/2022] [Indexed: 12/13/2022]
Abstract
Amyloid proteins have the ability to form insoluble fibril aggregates that have important pathogenic effects in many tissues. Such amyloidoses are prominently associated with common diseases such as type 2 diabetes, Alzheimer's disease, and Parkinson's disease. There are many types of amyloid proteins, and some proteins that form amyloid aggregates when in a misfolded state. It is difficult to identify such amyloid proteins and their pathogenic properties, but a new and effective approach is by developing effective bioinformatics tools. While several machine learning (ML)-based models for in silico identification of amyloid proteins have been proposed, their predictive performance is limited. In this study, we present AMYPred-FRL, a novel meta-predictor that uses a feature representation learning approach to achieve more accurate amyloid protein identification. AMYPred-FRL combined six well-known ML algorithms (extremely randomized tree, extreme gradient boosting, k-nearest neighbor, logistic regression, random forest, and support vector machine) with ten different sequence-based feature descriptors to generate 60 probabilistic features (PFs), as opposed to state-of-the-art methods developed by a single feature-based approach. A logistic regression recursive feature elimination (LR-RFE) method was used to find the optimal m number of 60 PFs in order to improve the predictive performance. Finally, using the meta-predictor approach, the 20 selected PFs were fed into a logistic regression method to create the final hybrid model (AMYPred-FRL). Both cross-validation and independent tests showed that AMYPred-FRL achieved superior predictive performance than its constituent baseline models. In an extensive independent test, AMYPred-FRL outperformed the existing methods by 5.5% and 16.1%, respectively, with accuracy and MCC of 0.873 and 0.710. 
To expedite high-throughput prediction, a user-friendly web server of AMYPred-FRL is freely available at http://pmlabstack.pythonanywhere.com/AMYPred-FRL. It is anticipated that AMYPred-FRL will be a useful tool in helping researchers to identify new amyloid proteins.
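The meta-predictor layout described in this abstract (six algorithms crossed with ten descriptors giving 60 probabilistic features, of which 20 are selected and passed to a logistic regression) can be sketched roughly as follows. This is an illustrative sketch only, not the authors' code: the descriptor names, the choice of which 20 features are kept, and all weights are hypothetical placeholders.

```python
import math
from itertools import product

# Algorithm abbreviations follow the abstract; descriptor names are hypothetical.
ALGORITHMS = ["ET", "XGB", "KNN", "LR", "RF", "SVM"]
DESCRIPTORS = [f"descriptor_{i}" for i in range(1, 11)]


def probabilistic_feature_names():
    """Return one probabilistic feature (PF) name per (algorithm, descriptor) pair."""
    return [f"{algo}:{desc}" for algo, desc in product(ALGORITHMS, DESCRIPTORS)]


def meta_predict(pf_values, weights, bias=0.0):
    """Logistic-regression meta-model over the selected PFs.

    pf_values maps PF name -> baseline-model probability;
    weights maps each selected PF name -> coefficient (hypothetical values here).
    """
    z = bias + sum(weights[name] * pf_values[name] for name in weights)
    return 1.0 / (1.0 + math.exp(-z))


names = probabilistic_feature_names()
selected = names[:20]  # stand-in for the 20 PFs chosen by LR-RFE
weights = {name: 0.1 for name in selected}  # hypothetical coefficients
pf_values = {name: 0.5 for name in names}  # hypothetical baseline outputs
score = meta_predict(pf_values, weights, bias=-1.0)
```

In a real pipeline the PF values would come from cross-validated baseline models and the coefficients from fitting the meta-model; here they are fixed constants so the structure is easy to follow.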
Affiliation(s)
- Phasit Charoenkwan
- Modern Management and Information Technology, College of Arts, Media and Technology, Chiang Mai University, Chiang Mai, 50200, Thailand
| | - Saeed Ahmed
- Center of Data Mining and Biomedical Informatics, Faculty of Medical Technology, Mahidol University, Bangkok, 10700, Thailand
| | - Chanin Nantasenamat
- Center of Data Mining and Biomedical Informatics, Faculty of Medical Technology, Mahidol University, Bangkok, 10700, Thailand
| | - Julian M W Quinn
- Bone Biology Division, Garvan Institute of Medical Research, 384 Victoria Street, Darlinghurst, NSW, 2010, Australia
| | - Mohammad Ali Moni
- Artificial Intelligence and Digital Health Data Science, School of Health and Rehabilitation Sciences, Faculty of Health and Behavioural Sciences, The University of Queensland, St Lucia, QLD, 4072, Australia
| | - Pietro Lio'
- Department of Computer Science and Technology, University of Cambridge, Cambridge, CB3 0FD, UK
| | - Watshara Shoombuatong
- Center of Data Mining and Biomedical Informatics, Faculty of Medical Technology, Mahidol University, Bangkok, 10700, Thailand.
| |
|
17
|
Badaczewska-Dawid AE, Garcia-Pardo J, Kuriata A, Pujols J, Ventura S, Kmiecik S. A3D database: structure-based predictions of protein aggregation for the human proteome. Bioinformatics 2022; 38:3121-3123. [PMID: 35445695 PMCID: PMC9746890 DOI: 10.1093/bioinformatics/btac215] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 11/17/2021] [Revised: 03/10/2022] [Accepted: 04/19/2022] [Indexed: 12/16/2022]
Abstract
SUMMARY Protein aggregation is associated with many human disorders and constitutes a major bottleneck for producing therapeutic proteins. Our knowledge of the human protein structures repertoire has dramatically increased with the recent development of the AlphaFold (AF) deep-learning method. This structural information can be used to understand better protein aggregation properties and the rational design of protein solubility. This article uses the Aggrescan3D (A3D) tool to compute the structure-based aggregation predictions for the human proteome and make the predictions available in a database form. In the A3D database, we analyze the AF-predicted human protein structures (for over 20.5 thousand unique Uniprot IDs) in terms of their aggregation properties using the A3D tool. Each entry of the A3D database provides a detailed analysis of the structure-based aggregation propensity computed with A3D. The A3D database implements simple but useful graphical tools for visualizing and interpreting protein structure datasets. It also enables testing the influence of user-selected mutations on protein solubility and stability, all integrated into a user-friendly interface. AVAILABILITY AND IMPLEMENTATION A3D database is freely available at: http://biocomp.chem.uw.edu.pl/A3D2/hproteome. The data underlying this article are available in the article and in its online supplementary material. SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
Affiliation(s)
| | - Javier Garcia-Pardo
- Institut de Biotecnologia i de Biomedicina (IBB) and Departament de Bioquímica i Biologia Molecular, Universitat Autònoma de Barcelona, 08193 Barcelona, Spain
| | - Aleksander Kuriata
- Biological and Chemical Research Center, Faculty of Chemistry, University of Warsaw, Warsaw 02-093, Poland
| | - Jordi Pujols
- Institut de Biotecnologia i de Biomedicina (IBB) and Departament de Bioquímica i Biologia Molecular, Universitat Autònoma de Barcelona, 08193 Barcelona, Spain
| | - Salvador Ventura
- Institut de Biotecnologia i de Biomedicina (IBB) and Departament de Bioquímica i Biologia Molecular, Universitat Autònoma de Barcelona, 08193 Barcelona, Spain (corresponding author)
| | - Sebastian Kmiecik
- Biological and Chemical Research Center, Faculty of Chemistry, University of Warsaw, Warsaw 02-093, Poland (corresponding author)
| |
|
18
|
Computational methods to predict protein aggregation. Curr Opin Struct Biol 2022; 73:102343. [PMID: 35240456 DOI: 10.1016/j.sbi.2022.102343] [Citation(s) in RCA: 19] [Impact Index Per Article: 9.5] [Received: 08/16/2021] [Revised: 12/20/2021] [Accepted: 01/17/2022] [Indexed: 01/13/2023]
Abstract
In most cases, protein aggregation stems from the establishment of non-native intermolecular contacts. The formation of insoluble protein aggregates is associated with many human diseases and is a major bottleneck for the industrial production of protein-based therapeutics. Strikingly, fibrillar aggregates are naturally exploited for structural scaffolding or to generate molecular switches and can be artificially engineered to build up multi-functional nanomaterials. Thus, there is a high interest in rationalizing and forecasting protein aggregation. Here, we review the available computational toolbox to predict protein aggregation propensities, identify sequential or structural aggregation-prone regions, evaluate the impact of mutations on aggregation or recognize prion-like domains. We discuss the strengths and limitations of these algorithms and how they can evolve in the near future.
|
19
|
Lai PK, Gallegos A, Mody N, Sathish HA, Trout BL. Machine learning prediction of antibody aggregation and viscosity for high concentration formulation development of protein therapeutics. MAbs 2022; 14:2026208. [PMID: 35075980 PMCID: PMC8794240 DOI: 10.1080/19420862.2022.2026208] [Citation(s) in RCA: 21] [Impact Index Per Article: 10.5] [Indexed: 12/14/2022]
Abstract
Machine learning has been recently used to predict therapeutic antibody aggregation rates and viscosity at high concentrations (150 mg/ml). These works focused on commercially available antibodies, which may have been optimized for stability. In this study, we measured accelerated aggregation rates at 45°C and viscosity at 150 mg/ml for 20 preclinical and clinical-stage antibodies. Features obtained from molecular dynamics simulations of the full-length antibody and sequences were used for machine learning model construction. We found a k-nearest neighbors regression model with two features, spatial positive charge map on the CDRH2 and solvent-accessible surface area of hydrophobic residues on the variable fragment, gives the best performance for predicting antibody aggregation rates (r = 0.89). For the viscosity classification model, the model with the highest accuracy is a logistic regression model with two features, spatial negative charge map on the heavy chain variable region and spatial negative charge map on the light chain variable region. The accuracy and the area under precision recall curve of the classification model from validation tests are 0.86 and 0.70, respectively. In addition, we combined data from another 27 commercial mAbs to develop a viscosity predictive model. The best model is a logistic regression model with two features, number of hydrophobic residues on the light chain variable region and net charges on the light chain variable region. The accuracy and the area under precision recall curve of the classification model are 0.85 and 0.6, respectively. The aggregation rates and viscosity models can be used to predict antibody stability to facilitate pharmaceutical development.
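As a rough illustration of the two-feature k-nearest neighbors regression mentioned in this abstract, the sketch below averages the target values of the k closest training points. It is a generic textbook version, not the authors' model: the feature values, the targets, and the choice of k are hypothetical, and real features would normally be scaled before distances are computed.

```python
import math


def knn_regress(train_X, train_y, query, k=3):
    """Predict a value for `query` as the mean target of its k nearest neighbors.

    train_X is a list of feature tuples (two features per sample in this sketch),
    train_y the matching list of measured values.
    """
    # Pair each Euclidean distance with its target, then keep the k smallest.
    by_distance = sorted(
        (math.dist(x, query), y) for x, y in zip(train_X, train_y)
    )
    nearest = by_distance[:k]
    return sum(y for _, y in nearest) / len(nearest)


# Hypothetical data: (charge-map feature, hydrophobic SASA feature) -> aggregation rate
train_X = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (5.0, 5.0)]
train_y = [1.0, 2.0, 3.0, 10.0]
prediction = knn_regress(train_X, train_y, query=(0.1, 0.1), k=3)
```

With these made-up numbers the three nearest points are the first three samples, so the distant fourth sample does not influence the prediction.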
Affiliation(s)
- Pin-Kuang Lai
- Department of Chemical Engineering, Massachusetts Institute of Technology, Cambridge, Massachusetts, USA; Department of Chemical Engineering and Materials Science, Stevens Institute of Technology, Hoboken, New Jersey, USA
| | - Austin Gallegos
- Dosage Form Design and Development, AstraZeneca, Gaithersburg, Maryland, USA
| | - Neil Mody
- Dosage Form Design and Development, AstraZeneca, Gaithersburg, Maryland, USA
| | - Hasige A Sathish
- Dosage Form Design and Development, AstraZeneca, Gaithersburg, Maryland, USA
| | - Bernhardt L Trout
- Department of Chemical Engineering, Massachusetts Institute of Technology, Cambridge, Massachusetts, USA
| |
|
20
|
Rawat P, Prabakaran R, Kumar S, Gromiha MM. Exploring the sequence features determining amyloidosis in human antibody light chains. Sci Rep 2021; 11:13785. [PMID: 34215782 PMCID: PMC8253744 DOI: 10.1038/s41598-021-93019-9] [Citation(s) in RCA: 11] [Impact Index Per Article: 3.7] [Received: 04/09/2021] [Accepted: 06/18/2021] [Indexed: 02/06/2023]
Abstract
The light chain (AL) amyloidosis is caused by the aggregation of light chain of antibodies into amyloid fibrils. There are plenty of computational resources available for the prediction of short aggregation-prone regions within proteins. However, it is still a challenging task to predict the amyloidogenic nature of the whole protein using sequence/structure information. In the case of antibody light chains, common architecture and known binding sites can provide vital information for the prediction of amyloidogenicity at physiological conditions. Here, in this work, we have compared classical sequence-based, aggregation-related features (such as hydrophobicity, presence of gatekeeper residues, disorderness, β-propensity, etc.) calculated for the CDR, FR or VL regions of amyloidogenic and non-amyloidogenic antibody light chains and implemented the insights gained in a machine learning-based webserver called "VLAmY-Pred" ( https://web.iitm.ac.in/bioinfo2/vlamy-pred/ ). The model shows prediction accuracy of 79.7% (sensitivity: 78.7% and specificity: 79.9%) with a ROC value of 0.88 on a dataset of 1828 variable region sequences of the antibody light chains. This model will be helpful towards improved prognosis for patients that may likely suffer from diseases caused by light chain amyloidosis, understanding origins of aggregation in antibody-based biotherapeutics, large-scale in-silico analysis of antibody sequences generated by next generation sequencing, and finally towards rational engineering of aggregation resistant antibodies.
Affiliation(s)
- Puneet Rawat
- Protein Bioinformatics Lab, Department of Biotechnology, Bhupat and Jyoti Mehta School of Biosciences, Indian Institute of Technology Madras, Chennai, 600036 Tamil Nadu, India
| | - R. Prabakaran
- Protein Bioinformatics Lab, Department of Biotechnology, Bhupat and Jyoti Mehta School of Biosciences, Indian Institute of Technology Madras, Chennai, 600036 Tamil Nadu, India
| | - Sandeep Kumar
- Biotherapeutics Discovery, Boehringer-Ingelheim Inc., 5571 R & D Building, 175 Briar Ridge Road, Ridgefield, CT 06877, USA
| | - M. Michael Gromiha
- Protein Bioinformatics Lab, Department of Biotechnology, Bhupat and Jyoti Mehta School of Biosciences, Indian Institute of Technology Madras, Chennai, 600036 Tamil Nadu, India; Advanced Computational Drug Discovery Unit (ACDD), Institute of Innovative Research, Tokyo Institute of Technology, 4259 Nagatsutacho, Midori-ku, Yokohama, Kanagawa 226-8501, Japan
| |
|
21
|
Casadio R, Lenhard B, Sternberg MJE. Computational Resources for Molecular Biology 2021. J Mol Biol 2021; 433:166962. [PMID: 33774035 DOI: 10.1016/j.jmb.2021.166962] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/21/2022]
Affiliation(s)
- Rita Casadio
- Biocomputing Group, FABIT-University of Bologna, Italy
| | - Boris Lenhard
- Institute of Clinical Sciences, Faculty of Medicine. Imperial College London, Hammersmith Campus, Du Cane Road, London W12 0NN, UK; Computational Regulatory Genomics, MRC London Institute of Medical Sciences, Du Cane Road, London W12 0NN, UK
| | - Michael J E Sternberg
- Structural Bioinformatics Group, Centre for Integrative Systems Biology and Bioinformatics, Department of Life Sciences, Imperial College London, London SW7 2AZ, UK.
| |
|
22
|
Prabakaran R, Rawat P, Thangakani AM, Kumar S, Gromiha MM. Protein aggregation: in silico algorithms and applications. Biophys Rev 2021; 13:71-89. [PMID: 33747245 PMCID: PMC7930180 DOI: 10.1007/s12551-021-00778-w] [Citation(s) in RCA: 36] [Impact Index Per Article: 12.0] [Received: 11/06/2020] [Accepted: 01/01/2021] [Indexed: 01/08/2023]
Abstract
Protein aggregation is a topic of immense interest to the scientific community due to its role in several neurodegenerative diseases/disorders and industrial importance. Several in silico techniques, tools, and algorithms have been developed to predict aggregation in proteins and understand the aggregation mechanisms. This review attempts to provide an essence of the vast developments in in silico approaches, resources available, and future perspectives. It reviews aggregation-related databases, mechanistic models (aggregation-prone region and aggregation propensity prediction), kinetic models (aggregation rate prediction), and molecular dynamics studies related to aggregation. With a multitude of prediction models related to aggregation already available to the scientific community, the field of protein aggregation is rapidly maturing to tackle new applications.
Affiliation(s)
- R. Prabakaran
- Department of Biotechnology, Indian Institute of Technology Madras, Chennai, Tamil Nadu India
| | - Puneet Rawat
- Department of Biotechnology, Indian Institute of Technology Madras, Chennai, Tamil Nadu India
| | - A. Mary Thangakani
- Department of Biotechnology, Indian Institute of Technology Madras, Chennai, Tamil Nadu India
| | - Sandeep Kumar
- Biotherapeutics Discovery, Boehringer Ingelheim Pharmaceutical Inc., Ridgefield, CT USA
| | - M. Michael Gromiha
- Department of Biotechnology, Indian Institute of Technology Madras, Chennai, Tamil Nadu India
- School of Computing, Institute of Innovative Research, Tokyo Institute of Technology, Yokohama, Kanagawa Japan
| |
|
23
|
Ebrahim-Habibi A, Kashani-Amin E, Larijani B. Modeling and simulation in medical sciences: an overview of specific applications based on research experience in EMRI (Endocrinology and Metabolism Research Institute of Tehran University of Medical Sciences). J Diabetes Metab Disord 2021:1-7. [PMID: 33500880 PMCID: PMC7821172 DOI: 10.1007/s40200-020-00706-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/28/2020] [Revised: 11/26/2020] [Accepted: 12/07/2020] [Indexed: 01/31/2023]
Abstract
The concomitant use of various types of models (in silico, in vitro, and in vivo) has been exemplified here within the context of biomedical research performed in the Endocrinology and Metabolism Research Institute (EMRI) of Tehran University of Medical Sciences. Two main research areas have been discussed: the search for new small molecules as therapeutics for diabetes and related metabolic conditions, and diseases related to protein aggregation. Due to their multidisciplinary nature, the majority of these studies have needed the collaboration of different specialties. In both cases, a brief overview of the subject is provided through literature examples, and sequential use of these methods is described.
Affiliation(s)
- Azadeh Ebrahim-Habibi
- Biosensor Research Center, Endocrinology and Metabolism Molecular-Cellular Sciences Institute, Tehran University of Medical Sciences, Jalal-al-Ahmad Street, Chamran Highway, 1411713137 Tehran, Iran
| | - Elaheh Kashani-Amin
- Biosensor Research Center, Endocrinology and Metabolism Molecular-Cellular Sciences Institute, Tehran University of Medical Sciences, Jalal-al-Ahmad Street, Chamran Highway, 1411713137 Tehran, Iran
| | - Bagher Larijani
- Endocrinology and Metabolism Research Center, Endocrinology and Metabolism Clinical Sciences Institute, Tehran University of Medical Sciences, Tehran, Iran
| |
|