1
Ghosh D, Biswas A, Radhakrishna M. Advanced computational approaches to understand protein aggregation. BIOPHYSICS REVIEWS 2024; 5:021302. [PMID: 38681860 PMCID: PMC11045254 DOI: 10.1063/5.0180691] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/11/2023] [Accepted: 03/18/2024] [Indexed: 05/01/2024]
Abstract
Protein aggregation is a widespread phenomenon implicated in debilitating diseases like Alzheimer's, Parkinson's, and cataracts, presenting complex hurdles for the field of molecular biology. In this review, we explore the evolving realm of computational methods and bioinformatics tools that have revolutionized our comprehension of protein aggregation. Beginning with a discussion of the multifaceted challenges associated with understanding this process and emphasizing the critical need for precise predictive tools, we highlight how computational techniques have become indispensable for understanding protein aggregation. We focus on molecular simulations, notably molecular dynamics (MD) simulations, spanning from atomistic to coarse-grained levels, which have emerged as pivotal tools in unraveling the complex dynamics governing protein aggregation in diseases such as cataracts, Alzheimer's, and Parkinson's. MD simulations provide microscopic insights into protein interactions and the subtleties of aggregation pathways, with advanced techniques like replica exchange molecular dynamics, Metadynamics (MetaD), and umbrella sampling enhancing our understanding by probing intricate energy landscapes and transition states. We delve into specific applications of MD simulations, elucidating the chaperone mechanism underlying cataract formation using Markov state modeling and the intricate pathways and interactions driving the toxic aggregate formation in Alzheimer's and Parkinson's disease. Transitioning, we highlight how computational techniques, including bioinformatics, sequence analysis, structural data, machine learning algorithms, and artificial intelligence, have become indispensable for predicting protein aggregation propensity and locating aggregation-prone regions within protein sequences.
Throughout our exploration, we underscore the symbiotic relationship between computational approaches and empirical data, which has paved the way for potential therapeutic strategies against protein aggregation-related diseases. In conclusion, this review offers a comprehensive overview of advanced computational methodologies and bioinformatics tools that have catalyzed breakthroughs in unraveling the molecular basis of protein aggregation, with significant implications for clinical interventions, standing at the intersection of computational biology and experimental research.
Affiliation(s)
- Deepshikha Ghosh
  - Department of Biological Sciences and Engineering, Indian Institute of Technology (IIT) Gandhinagar, Palaj, Gujarat 382355, India
- Anushka Biswas
  - Department of Chemical Engineering, Indian Institute of Technology (IIT) Gandhinagar, Palaj, Gujarat 382355, India
2
Gallo E. Revolutionizing Synthetic Antibody Design: Harnessing Artificial Intelligence and Deep Sequencing Big Data for Unprecedented Advances. Mol Biotechnol 2024:10.1007/s12033-024-01064-2. [PMID: 38308755 DOI: 10.1007/s12033-024-01064-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/03/2023] [Accepted: 01/02/2024] [Indexed: 02/05/2024]
Abstract
Synthetic antibodies (Abs) represent a category of engineered proteins meticulously crafted to replicate the functions of their natural counterparts. Such Abs are generated in vitro, enabling advanced molecular alterations associated with antigen recognition, paratope site engineering, and biochemical refinements. In a parallel realm, deep sequencing has brought about a paradigm shift in molecular biology. It facilitates the prompt and cost-effective high-throughput sequencing of DNA and RNA molecules, enabling the comprehensive big data analysis of Ab transcriptomes, including specific regions of interest. Significantly, the integration of artificial intelligence (AI), based on machine- and deep- learning approaches, has fundamentally transformed our capacity to discern patterns hidden within deep sequencing big data, including distinctive Ab features and protein folding free energy landscapes. Ultimately, current AI advances can generate approximations of the most stable Ab structural configurations, enabling the prediction of de novo synthetic Abs. As a result, this manuscript comprehensively examines the latest and relevant literature concerning the intersection of deep sequencing big data and AI methodologies for the design and development of synthetic Abs. Together, these advancements have accelerated the exploration of antibody repertoires, contributing to the refinement of synthetic Ab engineering and optimizations, and facilitating advancements in the lead identification process.
Affiliation(s)
- Eugenio Gallo
  - Avance Biologicals, Department of Medicinal Chemistry, 950 Dupont Street, Toronto, ON, M6H 1Z2, Canada.
  - RevivAb, Department of Protein Engineering, Av. Ipiranga, 6681, Partenon, Porto Alegre, RS, 90619-900, Brazil.
3
Das P, Majumder R, Sen N, Nandi SK, Ghosh A, Mandal M, Basak P. A computational analysis to evaluate deleterious SNPs of GSK3β, a multifunctional and regulatory protein, for metabolism, wound healing, and migratory processes. Int J Biol Macromol 2024; 256:128262. [PMID: 37989431 DOI: 10.1016/j.ijbiomac.2023.128262] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/18/2023] [Revised: 10/04/2023] [Accepted: 11/17/2023] [Indexed: 11/23/2023]
Abstract
This study focused on GSK-3β, a critical serine/threonine kinase with diverse cellular functions. However, there is limited understanding of the impact of non-synonymous single nucleotide polymorphisms (nsSNPs) on its structure and function. Through an exhaustive in-silico investigation, 12 harmful nsSNPs were predicted from a pool of 172 acquired from the NCBI dbSNP database using 12 established tools that detect deleterious SNPs. Consistently, these nsSNPs were found in highly conserved locations. Notably, the three harmful nsSNPs F67C, A83T, and T138I were situated in the active/binding site of GSK-3β, which may affect the protein's capacity to bind to substrates and other proteins. Molecular dynamics simulations revealed that the F67C and T138I mutants had stable structures, indicating rigidity, whereas the A83T mutant was unstable. Analysis of secondary structures revealed different modifications in all mutant forms, which may affect the stability, functioning, and interactions of the protein. These mutations appear to alter the structural dynamics of GSK-3β, which may have functional ramifications, such as the formation of novel secondary structures and variations in coil-to-helix transitions. In conclusion, this study illuminates the possible structural and functional ramifications of these GSK-3β nsSNPs, revealing how protein compactness, stiffness, and interactions may affect biological activities.
Affiliation(s)
- Pratik Das
  - School of Bioscience and Engineering, Jadavpur University, Kolkata, India
- Ranabir Majumder
  - Cancer Biology Lab, School of Medical Science & Technology, Indian Institute of Technology Kharagpur, India
- Nandita Sen
  - Molecular biology wing, Dept of Biotechnology, PES University, Bangalore, India
- Samit Kumar Nandi
  - Department of Veterinary Surgery & Radiology, West Bengal University of Animal and Fishery Sciences, Kolkata, India
- Arabinda Ghosh
  - Department of Computational Biology and Biotechnology, Mahapurusha Srimanta Sankaradeva Viswavidyalaya, Guwahati Unit, Guwahati, Assam, India
- Mahitosh Mandal
  - Cancer Biology Lab, School of Medical Science & Technology, Indian Institute of Technology Kharagpur, India
- Piyali Basak
  - School of Bioscience and Engineering, Jadavpur University, Kolkata, India.
4
Kang S, Kim M, Sun J, Lee M, Min K. Prediction of Protein Aggregation Propensity via Data-Driven Approaches. ACS Biomater Sci Eng 2023; 9:6451-6463. [PMID: 37844262 DOI: 10.1021/acsbiomaterials.3c01001] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/18/2023]
Abstract
Protein aggregation occurs when misfolded or unfolded proteins physically bind together and can promote the development of various amyloid diseases. This study aimed to construct surrogate models for predicting protein aggregation via data-driven methods using two types of databases. First, an aggregation propensity score database was constructed by calculating the scores for protein structures in the Protein Data Bank using Aggrescan3D 2.0. Moreover, feature- and graph-based models for predicting protein aggregation have been developed by using this database. The graph-based model outperformed the feature-based model, resulting in an R² of 0.95, although it intrinsically required protein structures. Second, for the experimental data, a feature-based model was built using the Curated Protein Aggregation Database 2.0 to predict the aggregated intensity curves. In summary, this study suggests approaches that are more effective in predicting protein aggregation, depending on the type of descriptor and the database.
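The R² figure quoted above is the coefficient of determination. As a minimal sketch of how such a score is typically computed (the function name `r_squared` and the toy values are illustrative assumptions, not taken from the study):

```python
def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    mean_y = sum(y_true) / len(y_true)
    # Residual sum of squares: error left after the model's predictions.
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    # Total sum of squares: error of always predicting the mean.
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    if ss_tot == 0:
        raise ValueError("R^2 is undefined when all reference values are equal")
    return 1.0 - ss_res / ss_tot


# Hypothetical reference scores and predictions, for illustration only.
reference = [1.0, 2.0, 3.0]
print(r_squared(reference, [1.0, 2.0, 3.0]))  # perfect predictions
print(r_squared(reference, [2.0, 2.0, 2.0]))  # no better than the mean
```

A value near 1 means the surrogate model reproduces almost all of the variance in the reference scores; a value of 0 means it does no better than predicting the mean.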
Affiliation(s)
- Seungpyo Kang
  - School of Mechanical Engineering, Soongsil University, 369 Sangdo-ro, Dongjak-gu 06978, Seoul, Republic of Korea
- Minseon Kim
  - School of Mechanical Engineering, Soongsil University, 369 Sangdo-ro, Dongjak-gu 06978, Seoul, Republic of Korea
- Jiwon Sun
  - School of Mechanical Engineering, Soongsil University, 369 Sangdo-ro, Dongjak-gu 06978, Seoul, Republic of Korea
- Myeonghun Lee
  - School of Systems Biomedical Science, Soongsil University, 369 Sangdo-ro, Dongjak-gu 06978, Seoul, Republic of Korea
- Kyoungmin Min
  - School of Mechanical Engineering, Soongsil University, 369 Sangdo-ro, Dongjak-gu 06978, Seoul, Republic of Korea
5
Bauer J, Rajagopal N, Gupta P, Gupta P, Nixon AE, Kumar S. How can we discover developable antibody-based biotherapeutics? Front Mol Biosci 2023; 10:1221626. [PMID: 37609373 PMCID: PMC10441133 DOI: 10.3389/fmolb.2023.1221626] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/12/2023] [Accepted: 07/10/2023] [Indexed: 08/24/2023] Open
Abstract
Antibody-based biotherapeutics have emerged as a successful class of pharmaceuticals despite significant challenges and risks to their discovery and development. This review discusses the most frequently encountered hurdles in the research and development (R&D) of antibody-based biotherapeutics and proposes a conceptual framework called biopharmaceutical informatics. Our vision advocates for the syncretic use of computation and experimentation at every stage of biologic drug discovery, considering developability (manufacturability, safety, efficacy, and pharmacology) of potential drug candidates from the earliest stages of the drug discovery phase. The computational advances in recent years allow for more precise formulation of disease concepts, rapid identification, and validation of targets suitable for therapeutic intervention and discovery of potential biotherapeutics that can agonize or antagonize them. Furthermore, computational methods for de novo and epitope-specific antibody design are increasingly being developed, opening novel computationally driven opportunities for biologic drug discovery. Here, we review the opportunities and limitations of emerging computational approaches for optimizing antigens to generate robust immune responses, in silico generation of antibody sequences, discovery of potential antibody binders through virtual screening, assessment of hits, identification of lead drug candidates and their affinity maturation, and optimization for developability. The adoption of biopharmaceutical informatics across all aspects of drug discovery and development cycles should help bring affordable and effective biotherapeutics to patients more quickly.
Affiliation(s)
- Joschka Bauer
  - Early Stage Pharmaceutical Development Biologicals, Boehringer Ingelheim Pharma GmbH & Co. KG, Biberach/Riss, Germany
  - In Silico Team, Boehringer Ingelheim, Hannover, Germany
- Nandhini Rajagopal
  - In Silico Team, Boehringer Ingelheim, Hannover, Germany
  - Biotherapeutics Discovery, Boehringer Ingelheim Pharmaceuticals Inc., Ridgefield, CT, United States
- Priyanka Gupta
  - In Silico Team, Boehringer Ingelheim, Hannover, Germany
  - Biotherapeutics Discovery, Boehringer Ingelheim Pharmaceuticals Inc., Ridgefield, CT, United States
- Pankaj Gupta
  - In Silico Team, Boehringer Ingelheim, Hannover, Germany
  - Biotherapeutics Discovery, Boehringer Ingelheim Pharmaceuticals Inc., Ridgefield, CT, United States
- Andrew E. Nixon
  - Biotherapeutics Discovery, Boehringer Ingelheim Pharmaceuticals Inc., Ridgefield, CT, United States
- Sandeep Kumar
  - In Silico Team, Boehringer Ingelheim, Hannover, Germany
  - Biotherapeutics Discovery, Boehringer Ingelheim Pharmaceuticals Inc., Ridgefield, CT, United States
6
Chen Z, Wang X, Chen X, Huang J, Wang C, Wang J, Wang Z. Accelerating therapeutic protein design with computational approaches toward the clinical stage. Comput Struct Biotechnol J 2023; 21:2909-2926. [PMID: 38213894 PMCID: PMC10781723 DOI: 10.1016/j.csbj.2023.04.027] [Citation(s) in RCA: 4] [Impact Index Per Article: 4.0] [Received: 02/15/2023] [Revised: 04/11/2023] [Accepted: 04/27/2023] [Indexed: 01/13/2024] Open
Abstract
Therapeutic protein, represented by antibodies, is of increasing interest in human medicine. However, clinical translation of therapeutic protein is still largely hindered by different aspects of developability, including affinity and selectivity, stability and aggregation prevention, solubility and viscosity reduction, and deimmunization. Conventional optimization of the developability with widely used methods, like display technologies and library screening approaches, is a time and cost-intensive endeavor, and the efficiency in finding suitable solutions is still not enough to meet clinical needs. In recent years, the accelerated advancement of computational methodologies has ushered in a transformative era in the field of therapeutic protein design. Owing to their remarkable capabilities in feature extraction and modeling, the integration of cutting-edge computational strategies with conventional techniques presents a promising avenue to accelerate the progression of therapeutic protein design and optimization toward clinical implementation. Here, we compared the differences between therapeutic protein and small molecules in developability and provided an overview of the computational approaches applicable to the design or optimization of therapeutic protein in several developability issues.
Affiliation(s)
- Zhidong Chen
  - Department of Pathology, The Eighth Affiliated Hospital, Sun Yat-sen University, Shenzhen 518033, China
  - School of Pharmaceutical Sciences, Shenzhen Campus of Sun Yat-sen University, Shenzhen 518107, China
- Xinpei Wang
  - School of Pharmaceutical Sciences, Shenzhen Campus of Sun Yat-sen University, Shenzhen 518107, China
- Xu Chen
  - School of Pharmaceutical Sciences, Shenzhen Campus of Sun Yat-sen University, Shenzhen 518107, China
- Juyang Huang
  - School of Pharmaceutical Sciences, Shenzhen Campus of Sun Yat-sen University, Shenzhen 518107, China
- Chenglin Wang
  - Shenzhen Qiyu Biotechnology Co., Ltd, Shenzhen 518107, China
- Junqing Wang
  - School of Pharmaceutical Sciences, Shenzhen Campus of Sun Yat-sen University, Shenzhen 518107, China
- Zhe Wang
  - Department of Pathology, The Eighth Affiliated Hospital, Sun Yat-sen University, Shenzhen 518033, China
7
Machine Learning Approaches in Diagnosis, Prognosis and Treatment Selection of Cardiac Amyloidosis. Int J Mol Sci 2023; 24:ijms24065680. [PMID: 36982754 PMCID: PMC10051237 DOI: 10.3390/ijms24065680] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/22/2023] [Revised: 03/12/2023] [Accepted: 03/14/2023] [Indexed: 03/18/2023] Open
Abstract
Cardiac amyloidosis is an uncommon restrictive cardiomyopathy featuring an unregulated amyloid protein deposition that impairs organic function. Early cardiac amyloidosis diagnosis is generally delayed by indistinguishable clinical findings of more frequent hypertrophic diseases. Furthermore, amyloidosis is divided into various groups, according to a generally accepted taxonomy, based on the proteins that make up the amyloid deposits; a careful differentiation between the various forms of amyloidosis is necessary to undertake an adequate therapeutic treatment. Thus, cardiac amyloidosis is thought to be underdiagnosed, which delays necessary therapeutic procedures, diminishing quality of life and impairing clinical prognosis. The diagnostic work-up for cardiac amyloidosis begins with the identification of clinical features, electrocardiographic and imaging findings suggestive or compatible with cardiac amyloidosis, and often requires the histological demonstration of amyloid deposition. One approach to overcome the difficulty of an early diagnosis is the use of automated diagnostic algorithms. Machine learning enables the automatic extraction of salient information from “raw data” without the need for pre-processing methods based on the a priori knowledge of the human operator. This review attempts to assess the various diagnostic approaches and artificial intelligence computational techniques in the detection of cardiac amyloidosis.
8
Understanding the mutational frequency in SARS-CoV-2 proteome using structural features. Comput Biol Med 2022; 147:105708. [PMID: 35714506 PMCID: PMC9173821 DOI: 10.1016/j.compbiomed.2022.105708] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/06/2022] [Revised: 04/26/2022] [Accepted: 06/04/2022] [Indexed: 01/18/2023]
Abstract
The prolonged transmission of the severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) virus in the human population has led to demographic divergence and the emergence of several location-specific clusters of viral strains. Although the effect of mutation(s) on severity and survival of the virus is still unclear, it is evident that certain sites in the viral proteome are more/less prone to mutations. In fact, millions of SARS-CoV-2 sequences collected all over the world have provided us a unique opportunity to understand viral protein mutations and develop novel computational approaches to predict mutational patterns. In this study, we have classified the mutation sites into low and high mutability classes based on viral isolates count containing mutations. The physicochemical features and structural analysis of the SARS-CoV-2 proteins showed that features including residue type, surface accessibility, residue bulkiness, stability and sequence conservation at the mutation site were able to classify the low and high mutability sites. We further developed machine learning models using above-mentioned features, to predict low and high mutability sites at different selection thresholds (ranging 5-30% of topmost and bottommost mutated sites) and observed the improvement in performance as the selection threshold is reduced (prediction accuracy ranging from 65 to 77%). The analysis will be useful for early detection of variants of concern for the SARS-CoV-2, which can also be applied to other existing and emerging viruses for another pandemic prevention.
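The class construction described above (keeping only the topmost and bottommost fraction of sites, ranked by the number of isolates carrying a mutation) can be sketched as follows. This is an illustration under stated assumptions: the function name `label_mutability`, the dictionary input, and the tie handling are hypothetical and not taken from the study.

```python
def label_mutability(site_counts, fraction):
    """Label the top and bottom `fraction` of sites by mutation count.

    site_counts: dict mapping a site identifier to the number of isolates
                 carrying a mutation at that site (hypothetical input format).
    fraction:    selection threshold, e.g. 0.05 to 0.30 for 5-30%.
    Sites in the middle of the ranking are left unlabeled.
    """
    if not 0 < fraction <= 0.5:
        raise ValueError("fraction must be in (0, 0.5]")
    if len(site_counts) < 2:
        raise ValueError("need at least two sites")
    # Rank sites from least to most mutated; ties keep insertion order.
    ranked = sorted(site_counts, key=lambda site: site_counts[site])
    n = max(1, int(len(ranked) * fraction))
    labels = {site: "low" for site in ranked[:n]}
    labels.update({site: "high" for site in ranked[-n:]})
    return labels


# Toy counts for ten hypothetical sites.
counts = {f"s{i}": i for i in range(10)}
print(label_mutability(counts, 0.2))
```

A smaller threshold keeps only the most extreme sites, which is one plausible reason the reported accuracy improves as the threshold is reduced: the two classes become easier to separate, at the cost of fewer training examples.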
9
Bhosale H, Ramakrishnan V, Jayaraman VK. Support vector machine-based prediction of pore-forming toxins (PFT) using distributed representation of reduced alphabets. J Bioinform Comput Biol 2021; 19:2150028. [PMID: 34693886 DOI: 10.1142/s0219720021500281] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Indexed: 12/20/2022]
Abstract
Bacterial virulence can be attributed to a wide variety of factors including toxins that harm the host. Pore-forming toxins are one class of toxins that confer virulence to the bacteria and are one of the promising targets for therapeutic intervention. In this work, we develop a sequence-based machine learning framework for the prediction of pore-forming toxins. For this, we have used distributed representation of the protein sequence encoded by reduced alphabet schemes based on conformational similarity and hydropathy index as input features to Support Vector Machines (SVMs). The choice of conformational similarity and hydropathy indices is based on the functional mechanism of pore-forming toxins. Our methodology achieves about 81% accuracy indicating that conformational similarity, an indicator of the flexibility of amino acids, along with hydrophobic index can capture the intrinsic features of pore-forming toxins that distinguish it from other types of transporter proteins. Increased understanding of the mechanisms of pore-forming toxins can further contribute to the use of such "mechanism-informed" features that may increase the prediction accuracy further.
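To make the idea of a reduced alphabet concrete, the sketch below maps each residue to one of three hydropathy classes and then summarizes the reduced sequence as k-mer frequencies. Several things here are assumptions rather than the authors' method: the use of the Kyte-Doolittle scale, the two cut-offs, the three-letter alphabet, and the k-mer frequency features, which stand in for the distributed (embedding-style) representation the abstract describes.

```python
from collections import Counter

# Kyte-Doolittle hydropathy values for the 20 standard amino acids.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def reduce_sequence(seq, hi=1.5, lo=-3.0):
    """Map residues to H (hydrophobic), P (polar) or N (intermediate).

    The cut-offs `hi` and `lo` are hypothetical choices for illustration.
    Non-standard residue codes raise KeyError.
    """
    reduced = []
    for aa in seq.upper():
        h = KYTE_DOOLITTLE[aa]
        reduced.append("H" if h > hi else "P" if h < lo else "N")
    return "".join(reduced)


def kmer_frequencies(reduced, k=2):
    """Relative frequency of each overlapping k-mer in a reduced sequence."""
    total = len(reduced) - k + 1
    if total <= 0:
        return {}
    counts = Counter(reduced[i:i + k] for i in range(total))
    return {kmer: c / total for kmer, c in counts.items()}


print(reduce_sequence("AKG"))          # A -> H, K -> P, G -> N
print(kmer_frequencies("HPNHP", k=2))  # four overlapping 2-mers
```

Feature vectors of this kind could then be passed to any classifier, including an SVM. The smaller alphabet keeps the feature space compact, which is the main practical appeal of reduced alphabets.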
Affiliation(s)
- Hrushikesh Bhosale
  - Department of Computer Science, FLAME University, Pune, Maharashtra, India
- Vigneshwar Ramakrishnan
  - School of Chemical & Biotechnology, SASTRA Deemed-to-be University, Thanjavur, Tamilnadu, India
- Valadi K Jayaraman
  - Department of Computer Science, FLAME University, Pune, Maharashtra, India
10
Bauer J, Mathias S, Kube S, Otte K, Garidel P, Gamer M, Blech M, Fischer S, Karow-Zwick AR. Rational optimization of a monoclonal antibody improves the aggregation propensity and enhances the CMC properties along the entire pharmaceutical process chain. MAbs 2021; 12:1787121. [PMID: 32658605 PMCID: PMC7531517 DOI: 10.1080/19420862.2020.1787121] [Citation(s) in RCA: 17] [Impact Index Per Article: 5.7] [Indexed: 12/20/2022] Open
Abstract
The discovery of therapeutic monoclonal antibodies (mAbs) primarily focuses on their biological activity favoring the selection of highly potent drug candidates. These candidates, however, may have physical or chemical attributes that lead to unfavorable chemistry, manufacturing, and control (CMC) properties, such as low product titers, conformational and colloidal instabilities, or poor solubility, which can hamper or even prevent development and manufacturing. Hence, there is an urgent need to consider the developability of mAb candidates during lead identification and optimization. This work provides a comprehensive proof of concept study for the significantly improved developability of a mAb variant that was optimized with the help of sophisticated in silico tools relative to its difficult-to-develop parental counterpart. Interestingly, a single amino acid substitution in the variable domain of the light chain resulted in a three-fold increased product titer after stable expression in Chinese hamster ovary cells. Microscopic investigations revealed that wild type mAb-producing cells displayed potential antibody inclusions, while the in silico optimized variant-producing cells showed a rescued phenotype. Notably, the drug substance of the in silico optimized variant contained substantially reduced levels of aggregates and fragments after downstream process purification. Finally, formulation studies unraveled a significantly enhanced colloidal stability of the in silico optimized variant while its folding stability and potency were maintained. This study emphasizes that implementation of bioinformatics early in lead generation and optimization of biotherapeutics reduces failures during subsequent development activities and supports the reduction of project timelines and resources.
Affiliation(s)
- Joschka Bauer
  - Early Stage Pharmaceutical Development, Pharmaceutical Development Biologicals, Boehringer Ingelheim Pharma GmbH & Co. KG, Biberach/Riss, Germany
- Sven Mathias
  - Institute of Applied Biotechnology, University of Applied Sciences Biberach, Biberach/Riss, Germany
  - Early Stage Bioprocess Development, Bioprocess Development Biologicals, Boehringer Ingelheim Pharma GmbH & Co. KG, Biberach/Riss, Germany
- Sebastian Kube
  - Early Stage Pharmaceutical Development, Pharmaceutical Development Biologicals, Boehringer Ingelheim Pharma GmbH & Co. KG, Biberach/Riss, Germany
- Kerstin Otte
  - Institute of Applied Biotechnology, University of Applied Sciences Biberach, Biberach/Riss, Germany
- Patrick Garidel
  - Early Stage Pharmaceutical Development, Pharmaceutical Development Biologicals, Boehringer Ingelheim Pharma GmbH & Co. KG, Biberach/Riss, Germany
- Martin Gamer
  - Early Stage Bioprocess Development, Bioprocess Development Biologicals, Boehringer Ingelheim Pharma GmbH & Co. KG, Biberach/Riss, Germany
- Michaela Blech
  - Early Stage Pharmaceutical Development, Pharmaceutical Development Biologicals, Boehringer Ingelheim Pharma GmbH & Co. KG, Biberach/Riss, Germany
- Simon Fischer
  - Cell Line Development, Bioprocess Development Biologicals, Boehringer Ingelheim Pharma GmbH & Co. KG, Biberach/Riss, Germany
- Anne R Karow-Zwick
  - Early Stage Pharmaceutical Development, Pharmaceutical Development Biologicals, Boehringer Ingelheim Pharma GmbH & Co. KG, Biberach/Riss, Germany
11
Rawat P, Prabakaran R, Kumar S, Gromiha MM. Exploring the sequence features determining amyloidosis in human antibody light chains. Sci Rep 2021; 11:13785. [PMID: 34215782 PMCID: PMC8253744 DOI: 10.1038/s41598-021-93019-9] [Citation(s) in RCA: 11] [Impact Index Per Article: 3.7] [Received: 04/09/2021] [Accepted: 06/18/2021] [Indexed: 02/06/2023] Open
Abstract
The light chain (AL) amyloidosis is caused by the aggregation of light chain of antibodies into amyloid fibrils. There are plenty of computational resources available for the prediction of short aggregation-prone regions within proteins. However, it is still a challenging task to predict the amyloidogenic nature of the whole protein using sequence/structure information. In the case of antibody light chains, common architecture and known binding sites can provide vital information for the prediction of amyloidogenicity at physiological conditions. Here, in this work, we have compared classical sequence-based, aggregation-related features (such as hydrophobicity, presence of gatekeeper residues, disorderness, β-propensity, etc.) calculated for the CDR, FR or VL regions of amyloidogenic and non-amyloidogenic antibody light chains and implemented the insights gained in a machine learning-based webserver called "VLAmY-Pred" ( https://web.iitm.ac.in/bioinfo2/vlamy-pred/ ). The model shows prediction accuracy of 79.7% (sensitivity: 78.7% and specificity: 79.9%) with a ROC value of 0.88 on a dataset of 1828 variable region sequences of the antibody light chains. This model will be helpful towards improved prognosis for patients that may likely suffer from diseases caused by light chain amyloidosis, understanding origins of aggregation in antibody-based biotherapeutics, large-scale in-silico analysis of antibody sequences generated by next generation sequencing, and finally towards rational engineering of aggregation resistant antibodies.
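The accuracy, sensitivity and specificity quoted above are all derived from a binary confusion matrix. A minimal sketch of the definitions follows; the counts in the example are hypothetical and are not the study's confusion matrix, which the abstract does not report.

```python
def classification_metrics(tp, fn, tn, fp):
    """Sensitivity, specificity and accuracy from confusion-matrix counts.

    tp/fn: amyloidogenic sequences predicted correctly / incorrectly.
    tn/fp: non-amyloidogenic sequences predicted correctly / incorrectly.
    """
    positives = tp + fn
    negatives = tn + fp
    if positives == 0 or negatives == 0:
        raise ValueError("both classes must be present")
    return {
        "sensitivity": tp / positives,  # true positive rate
        "specificity": tn / negatives,  # true negative rate
        "accuracy": (tp + tn) / (positives + negatives),
    }


# Hypothetical counts for illustration only.
print(classification_metrics(tp=8, fn=2, tn=45, fp=5))
```

Accuracy is the class-size-weighted average of sensitivity and specificity, so it always lies between the two. That is consistent with the reported 79.7% falling between 78.7% and 79.9%.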
Affiliation(s)
- Puneet Rawat
  - Protein Bioinformatics Lab, Department of Biotechnology, Bhupat and Jyoti Mehta School of Biosciences, Indian Institute of Technology Madras, Chennai, 600036 Tamil Nadu, India
- R. Prabakaran
  - Protein Bioinformatics Lab, Department of Biotechnology, Bhupat and Jyoti Mehta School of Biosciences, Indian Institute of Technology Madras, Chennai, 600036 Tamil Nadu, India
- Sandeep Kumar
  - Biotherapeutics Discovery, Boehringer-Ingelheim Inc., 5571 R & D Building, 175 Briar Ridge Road, Ridgefield, CT 06877, USA
- M. Michael Gromiha
  - Protein Bioinformatics Lab, Department of Biotechnology, Bhupat and Jyoti Mehta School of Biosciences, Indian Institute of Technology Madras, Chennai, 600036 Tamil Nadu, India
  - Advanced Computational Drug Discovery Unit (ACDD), Institute of Innovative Research, Tokyo Institute of Technology, 4259 Nagatsutacho, Midori-ku, Yokohama, Kanagawa 226-8501, Japan
12
AbsoluRATE: An in-silico method to predict the aggregation kinetics of native proteins. BIOCHIMICA ET BIOPHYSICA ACTA-PROTEINS AND PROTEOMICS 2021; 1869:140682. [PMID: 34102324 DOI: 10.1016/j.bbapap.2021.140682] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Received: 02/24/2021] [Revised: 05/12/2021] [Accepted: 06/04/2021] [Indexed: 12/12/2022]
Abstract
Protein aggregation has two aspects, namely, mechanism and kinetics. Understanding protein aggregation kinetics is critical for predicting the progression of diseases caused by amyloidosis, the accumulation of aggregates in biotherapeutics during storage, and for engineering commercial nano-biomaterials. In this work, we have collected experimentally determined absolute protein aggregation rates and developed an SVM-based regression model to predict absolute rates of protein and peptide aggregation under near-physiological conditions. The regression model achieved a correlation coefficient of 0.72 with an MAE of 0.91 (natural log of kapp, where kapp is in hour⁻¹) using leave-one-out cross-validation on a dataset of 82 non-redundant proteins/peptides. The model accounts for the experimental conditions (such as temperature, pH, ionic and protein concentration) and sequence-based properties. The amino acid sequence features revealed by this model as being important for aggregation kinetics are also associated with the aggregation mechanism. In particular, the inherent aggregation propensity of the protein/peptide sequence and the number of aggregation prone regions (APRs) unpunctuated by gatekeeping residues were found to play important roles in the prediction of the absolute aggregation rates. This analysis shows that the mechanism and kinetics of protein aggregation are coupled via common sequence attributes. The aggregation kinetic prediction method developed in this work is available at https://web.iitm.ac.in/bioinfo2/absolurate-pred/index.html.
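Leave-one-out cross-validation, the evaluation scheme named above, holds out each sample in turn, fits the model on the rest, and averages the held-out errors. The sketch below illustrates that loop with a mean absolute error on a log-rate target. Note the substitution: a one-feature least-squares line stands in for the SVM regression and multi-feature inputs of the study, and all names and data values are hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("feature values must not all be identical")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def loo_mae(xs, ys):
    """Mean absolute error under leave-one-out cross-validation.

    xs: list of feature values (one hypothetical descriptor per sample).
    ys: list of targets, e.g. ln(kapp) with kapp in 1/hour.
    """
    errors = []
    for i in range(len(xs)):
        # Fit on every sample except the i-th, then predict the i-th.
        slope, intercept = fit_line(xs[:i] + xs[i + 1:], ys[:i] + ys[i + 1:])
        errors.append(abs(ys[i] - (slope * xs[i] + intercept)))
    return sum(errors) / len(errors)


# Toy, exactly linear data: the held-out error should be essentially zero.
features = [0.0, 1.0, 2.0, 3.0]
log_rates = [1.0, 3.0, 5.0, 7.0]
print(loo_mae(features, log_rates))
```

Because the target is a natural logarithm, an MAE of 0.91 corresponds to a typical multiplicative error of about e^0.91, roughly 2.5-fold, in the rate itself.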
|
13
|
Khodaparast L, Wu G, Khodaparast L, Schmidt BZ, Rousseau F, Schymkowitz J. Bacterial Protein Homeostasis Disruption as a Therapeutic Intervention. Front Mol Biosci 2021; 8:681855. [PMID: 34150852 PMCID: PMC8206779 DOI: 10.3389/fmolb.2021.681855] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 03/17/2021] [Accepted: 05/04/2021] [Indexed: 12/15/2022]
Abstract
Cells have evolved a complex molecular network, collectively called the protein homeostasis (proteostasis) network, to produce and maintain proteins in the appropriate conformation, concentration and subcellular localization. Loss of proteostasis leads to a reduction in cell viability, which occurs to some degree during healthy ageing, but is also the root cause of a group of diverse human pathologies. The accumulation of proteins in aberrant conformations and their aggregation into specific beta-rich assemblies are particularly detrimental to cell viability and challenging to the protein homeostasis network. This is especially true for bacteria; it can be argued that the need to adapt to their changing environments and their high protein turnover rates render bacteria particularly vulnerable to the disruption of protein homeostasis in general, as well as to protein misfolding and aggregation. Targeting bacterial proteostasis could therefore be an attractive strategy for the development of novel antibacterial therapeutics. This review highlights advances with an antibacterial strategy that is based on deliberately inducing aggregation of target proteins in bacterial cells, with the aim of inducing a lethal collapse of protein homeostasis. The approach exploits the intrinsic aggregation propensity of sequence regions residing in the hydrophobic core of proteins, which are genetically conserved because of their essential role in protein folding and stability. Moreover, the molecules were designed to target multiple proteins, to slow down the build-up of resistance. Although more research is required, results thus far give hope that this strategy may one day contribute to the arsenal to combat multidrug-resistant bacterial infections.
Affiliation(s)
- Laleh Khodaparast: Switch Laboratory, VIB Center for Brain and Disease Research, Leuven, Belgium; Switch Laboratory, Department of Cellular and Molecular Medicine, Leuven, Belgium
- Guiqin Wu: Switch Laboratory, VIB Center for Brain and Disease Research, Leuven, Belgium; Switch Laboratory, Department of Cellular and Molecular Medicine, Leuven, Belgium
- Ladan Khodaparast: Switch Laboratory, VIB Center for Brain and Disease Research, Leuven, Belgium; Switch Laboratory, Department of Cellular and Molecular Medicine, Leuven, Belgium
- Béla Z Schmidt: Switch Laboratory, VIB Center for Brain and Disease Research, Leuven, Belgium; Switch Laboratory, Department of Cellular and Molecular Medicine, Leuven, Belgium
- Frederic Rousseau: Switch Laboratory, VIB Center for Brain and Disease Research, Leuven, Belgium; Switch Laboratory, Department of Cellular and Molecular Medicine, Leuven, Belgium
- Joost Schymkowitz: Switch Laboratory, VIB Center for Brain and Disease Research, Leuven, Belgium; Switch Laboratory, Department of Cellular and Molecular Medicine, Leuven, Belgium
|
14
|
Prabakaran R, Rawat P, Thangakani AM, Kumar S, Gromiha MM. Protein aggregation: in silico algorithms and applications. Biophys Rev 2021; 13:71-89. [PMID: 33747245 PMCID: PMC7930180 DOI: 10.1007/s12551-021-00778-w] [Citation(s) in RCA: 36] [Impact Index Per Article: 12.0] [Received: 11/06/2020] [Accepted: 01/01/2021] [Indexed: 01/08/2023]
Abstract
Protein aggregation is a topic of immense interest to the scientific community due to its role in several neurodegenerative diseases/disorders and its industrial importance. Several in silico techniques, tools, and algorithms have been developed to predict aggregation in proteins and understand the aggregation mechanisms. This review attempts to capture the essence of the vast developments in in silico approaches, the resources available, and future perspectives. It reviews aggregation-related databases, mechanistic models (aggregation-prone region and aggregation propensity prediction), kinetic models (aggregation rate prediction), and molecular dynamics studies related to aggregation. With a multitude of prediction models related to aggregation already available to the scientific community, the field of protein aggregation is rapidly maturing to tackle new applications.
Affiliation(s)
- R. Prabakaran: Department of Biotechnology, Indian Institute of Technology Madras, Chennai, Tamil Nadu, India
- Puneet Rawat: Department of Biotechnology, Indian Institute of Technology Madras, Chennai, Tamil Nadu, India
- A. Mary Thangakani: Department of Biotechnology, Indian Institute of Technology Madras, Chennai, Tamil Nadu, India
- Sandeep Kumar: Biotherapeutics Discovery, Boehringer Ingelheim Pharmaceutical Inc., Ridgefield, CT, USA
- M. Michael Gromiha: Department of Biotechnology, Indian Institute of Technology Madras, Chennai, Tamil Nadu, India; School of Computing, Institute of Innovative Research, Tokyo Institute of Technology, Yokohama, Kanagawa, Japan
|
15
|
Norman RA, Ambrosetti F, Bonvin AMJJ, Colwell LJ, Kelm S, Kumar S, Krawczyk K. Computational approaches to therapeutic antibody design: established methods and emerging trends. Brief Bioinform 2020; 21:1549-1567. [PMID: 31626279 PMCID: PMC7947987 DOI: 10.1093/bib/bbz095] [Citation(s) in RCA: 106] [Impact Index Per Article: 26.5] [Received: 04/25/2019] [Revised: 06/07/2019] [Accepted: 07/05/2019] [Indexed: 12/31/2022]
Abstract
Antibodies are proteins that recognize the molecular surfaces of potentially noxious molecules to mount an adaptive immune response or, in the case of autoimmune diseases, molecules that are part of healthy cells and tissues. Due to their binding versatility, antibodies are currently the largest class of biotherapeutics, with five monoclonal antibodies ranked in the top 10 blockbuster drugs. Computational advances in protein modelling and design can have a tangible impact on antibody-based therapeutic development. Antibody-specific computational protocols currently benefit from an increasing volume of data provided by next generation sequencing and application to related drug modalities based on traditional antibodies, such as nanobodies. Here we present a structured overview of available databases, methods and emerging trends in computational antibody analysis and contextualize them towards the engineering of candidate antibody therapeutics.
|
16
|
Rawat P, Prabakaran R, Kumar S, Gromiha MM. AggreRATE-Pred: a mathematical model for the prediction of change in aggregation rate upon point mutation. Bioinformatics 2020; 36:1439-1444. [PMID: 31599925 DOI: 10.1093/bioinformatics/btz764] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Received: 11/09/2018] [Revised: 09/30/2019] [Accepted: 10/05/2019] [Indexed: 01/09/2023]
Abstract
MOTIVATION: Protein aggregation is a major unsolved problem in biochemistry with implications for several human diseases, biotechnology and biomaterial sciences. A majority of sequence-structural properties known for their mechanistic roles in protein aggregation do not correlate well with the aggregation kinetics. This limits the practical utility of predictive algorithms. RESULTS: We analyzed experimental data on 183 unique single point mutations that lead to change in aggregation rates for 23 polypeptides and proteins. Our initial mathematical model obtained a correlation coefficient of 0.43 between predicted and experimental change in aggregation rate upon mutation (P-value <0.0001). However, when the dataset was classified based on protein length and conformation at the mutation sites, the average correlation coefficient almost doubled to 0.82 (range: 0.74-0.87; P-value <0.0001). We observed that distinct sequence and structure-based properties determine protein aggregation kinetics in each class. In conclusion, the protein aggregation kinetics are impacted by local factors and not by global ones, such as overall three-dimensional protein fold, or mechanistic factors such as the presence of aggregation-prone regions. AVAILABILITY AND IMPLEMENTATION: The web server is available at http://www.iitm.ac.in/bioinfo/aggrerate-pred/. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Puneet Rawat: Protein Bioinformatics Lab, Department of Biotechnology, Indian Institute of Technology Madras, Chennai, Tamil Nadu 600036, India
- R Prabakaran: Protein Bioinformatics Lab, Department of Biotechnology, Indian Institute of Technology Madras, Chennai, Tamil Nadu 600036, India
- Sandeep Kumar: Biotherapeutics Discovery, Boehringer-Ingelheim Pharmaceutical Inc., Ridgefield, CT, USA
- M Michael Gromiha: Protein Bioinformatics Lab, Department of Biotechnology, Indian Institute of Technology Madras, Chennai, Tamil Nadu 600036, India; Advanced Computational Drug Discovery Unit (ACDD), Tokyo Tech World Research Hub Initiative (WRHI), Institute of Innovative Research, Tokyo Institute of Technology, 4259 Nagatsuta-cho, Midori-ku, Yokohama, Japan
|
17
|
Rawat P, Prabakaran R, Sakthivel R, Mary Thangakani A, Kumar S, Gromiha MM. CPAD 2.0: a repository of curated experimental data on aggregating proteins and peptides. Amyloid 2020; 27:128-133. [PMID: 31979981 DOI: 10.1080/13506129.2020.1715363] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.8] [Indexed: 12/12/2022]
Abstract
The Curated Protein Aggregation Database (CPAD) is a manually curated and open-access database dedicated to providing comprehensive information related to mechanistic, kinetic and structural aspects of protein and peptide aggregation. The database has been updated to CPAD 2.0 by significantly expanding datasets and improving the user-interface. Key features of CPAD 2.0 are (i) 83,098 data points on aggregation kinetics experiments, (ii) 565 structures related to aggregation, which are classified into proteins, fibrils, and protein-ligand complexes, (iii) 2031 aggregating/non-aggregating peptides with pre-calculated aggregation properties, and (iv) 912 aggregation-prone regions in amyloidogenic proteins. This database will help the scientific community (a) by facilitating research leading to improved understanding of protein aggregation, (b) by helping develop, validate and benchmark mechanistic and kinetic models of protein aggregation, and (c) by assisting experimentalists with design of their investigations and dissemination of data generated by their studies. CPAD 2.0 can be accessed at https://web.iitm.ac.in/bioinfo2/cpad2/index.html.
Affiliation(s)
- Puneet Rawat: Protein Bioinformatics Lab, Department of Biotechnology, Bhupat and Jyoti Mehta School of Biosciences, Indian Institute of Technology Madras, Chennai, India
- R Prabakaran: Protein Bioinformatics Lab, Department of Biotechnology, Bhupat and Jyoti Mehta School of Biosciences, Indian Institute of Technology Madras, Chennai, India
- R Sakthivel: Protein Bioinformatics Lab, Department of Biotechnology, Bhupat and Jyoti Mehta School of Biosciences, Indian Institute of Technology Madras, Chennai, India
- A Mary Thangakani: Protein Bioinformatics Lab, Department of Biotechnology, Bhupat and Jyoti Mehta School of Biosciences, Indian Institute of Technology Madras, Chennai, India
- Sandeep Kumar: Biotherapeutics Discovery, Boehringer-Ingelheim Inc, Ridgefield, CT, USA
- M Michael Gromiha: Protein Bioinformatics Lab, Department of Biotechnology, Bhupat and Jyoti Mehta School of Biosciences, Indian Institute of Technology Madras, Chennai, India; Advanced Computational Drug Discovery Unit (ACDD), Tokyo Tech World Research Hub Initiative (WRHI), Institute of Innovative Research, Tokyo Institute of Technology, Yokohama, Japan
|
18
|
Zhou Y, Cui Q, Zhou Y. NmSEER V2.0: a prediction tool for 2'-O-methylation sites based on random forest and multi-encoding combination. BMC Bioinformatics 2019; 20:690. [PMID: 31874624 PMCID: PMC6929462 DOI: 10.1186/s12859-019-3265-8] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.0] [Indexed: 12/03/2022]
Abstract
Background: 2′-O-methylation (2′-O-me or Nm) is a post-transcriptional RNA modification in which the 2′-hydroxyl is methylated; it is common in mRNAs and various non-coding RNAs. Previous studies revealed the significance of Nm in multiple biological processes. With Nm receiving increasing attention, a technique termed Nm-seq was developed to profile Nm sites, mainly in mRNA, with single-nucleotide resolution and high sensitivity. In a recent work, supported by the Nm-seq data, we reported an in silico method for predicting Nm sites, which relies on nucleotide sequence information, and established an online server named NmSEER. More recently, a more reliable dataset produced by refined Nm-seq became available. Therefore, in this work, we redesigned the prediction model to achieve more robust performance on the new data. Results: We redesigned the prediction model from two perspectives: the machine learning algorithm and the combination of multiple encoding schemes. After optimization by 5-fold cross-validation and evaluation on an independent test set, random forest was selected as the most robust algorithm. Meanwhile, one-hot encoding, together with position-specific dinucleotide sequence profile and K-nucleotide frequency encoding, were collectively applied to build the final predictor. Conclusions: The updated predictor, named NmSEER V2.0, achieves accurate prediction performance (AUROC = 0.862) and has been deployed on a new server, which is freely available at http://www.rnanut.net/nmseer-v2/.
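Two of the encodings named in this abstract can be illustrated with a minimal sketch. It is not the NmSEER V2.0 implementation: the function names, the example window and the choice k = 2 are hypothetical, and the position-specific dinucleotide profile and the random forest itself are left out.

```python
from itertools import product

ALPHABET = "ACGU"  # RNA nucleotides; the alphabet is an assumption of this sketch


def one_hot(seq):
    """Concatenate one 4-element 0/1 vector per position.

    Characters outside ALPHABET (for example 'N') become all zeros.
    """
    vec = []
    for base in seq:
        vec.extend(1 if base == ref else 0 for ref in ALPHABET)
    return vec


def kmer_frequencies(seq, k=2):
    """Frequency of every possible k-mer over overlapping windows of seq."""
    kmers = ["".join(p) for p in product(ALPHABET, repeat=k)]
    counts = dict.fromkeys(kmers, 0)
    n_windows = len(seq) - k + 1
    if n_windows <= 0:
        return [0.0] * len(kmers)
    for i in range(n_windows):
        window = seq[i:i + k]
        if window in counts:  # skip windows containing unknown characters
            counts[window] += 1
    return [counts[m] / n_windows for m in kmers]


def encode(seq, k=2):
    """Combine both encodings into a single feature vector."""
    return one_hot(seq) + kmer_frequencies(seq, k)


# Hypothetical sequence window around a candidate site.
features = encode("ACGUACGGA")
print(len(features))
```

One-hot encoding preserves the position of each nucleotide, while k-mer frequencies summarize composition regardless of position, which is one reason the two are often concatenated before being passed to a classifier.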
Affiliation(s)
- Yiran Zhou: Department of Biomedical Informatics, Department of Physiology and Pathophysiology, Center for Noncoding RNA Medicine, MOE Key Lab of Cardiovascular Sciences, School of Basic Medical Sciences, Peking University, 38 Xueyuan Rd, Beijing, 100191, China
- Qinghua Cui: Department of Biomedical Informatics, Department of Physiology and Pathophysiology, Center for Noncoding RNA Medicine, MOE Key Lab of Cardiovascular Sciences, School of Basic Medical Sciences, Peking University, 38 Xueyuan Rd, Beijing, 100191, China; Center of Bioinformatics, Key Laboratory for Neuro-Information of Ministry of Education, School of Life Science and Technology, University of Electronic Science and Technology of China, Chengdu, 610054, China
- Yuan Zhou: Department of Biomedical Informatics, Department of Physiology and Pathophysiology, Center for Noncoding RNA Medicine, MOE Key Lab of Cardiovascular Sciences, School of Basic Medical Sciences, Peking University, 38 Xueyuan Rd, Beijing, 100191, China
|