1. Huynh AT, Nguyen TTN, Villegas CA, Montemorso S, Strauss B, Pearson RA, Graham JG, Oribello J, Suresh R, Lustig B, Wang N. Prediction and confirmation of a switch-like region within the N-terminal domain of hSIRT1. Biochem Biophys Rep 2022; 30:101275. [PMID: 35592613; PMCID: PMC9112024; DOI: 10.1016/j.bbrep.2022.101275] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 01/13/2022] [Revised: 05/02/2022] [Accepted: 05/04/2022] [Indexed: 11/28/2022]
Abstract
Many proteins display conformational changes resulting from allosteric regulation. Often only a few residues are crucial in conveying these structural and functional allosteric changes. These regions that undergo a significant change in structure upon receiving an input signal, such as molecular recognition, are defined as switch-like regions. Identifying these key residues within switch-like regions can help elucidate the mechanism of allosteric regulation and provide guidance for synthetic regulation. In this study, we combine a novel computational workflow with biochemical methods to identify a switch-like region in the N-terminal domain of human SIRT1 (hSIRT1), a lysine deacetylase that plays important roles in regulating cellular pathways. Based on primary sequence, computational methods predicted a region between residues 186-193 in hSIRT1 to exhibit switch-like behavior. Mutations were then introduced in this region and the resulting mutants were tested for allosteric reactions to resveratrol, a known hSIRT1 allosteric regulator. After fine-tuning the mutations based on comparison of known secondary structures, we were able to pinpoint M193 as the residue essential for allosteric regulation, likely by communicating the allosteric signal. Mutation of this residue maintained enzyme activity but abolished allosteric regulation by resveratrol. Our findings suggest a method to predict switch-like regions in allosterically regulated enzymes based on the primary sequence. If further validated, this could be an efficient way to identify key residues in enzymes for therapeutic drug targeting and other applications.
Affiliation(s)
- Angelina T. Huynh
  - Department of Chemistry, San José State University, San José, California, 95192, USA
- Thi-Tina N. Nguyen
  - Department of Biological Sciences, San José State University, San José, California, 95192, USA
- Carina A. Villegas
  - Department of Biological Sciences, San José State University, San José, California, 95192, USA
- Saira Montemorso
  - Department of Chemistry, San José State University, San José, California, 95192, USA
- Benjamin Strauss
  - Department of Computer Science, San José State University, San José, California, 95192, USA
- Richard A. Pearson
  - Department of Chemistry, San José State University, San José, California, 95192, USA
- Jason G. Graham
  - Department of Biomedical, Chemical, and Materials Engineering, San José State University, San José, California, 95192, USA
- Jonathan Oribello
  - Department of Chemistry, San José State University, San José, California, 95192, USA
- Rohit Suresh
  - Department of Chemistry, San José State University, San José, California, 95192, USA
- Brooke Lustig
  - Department of Chemistry, San José State University, San José, California, 95192, USA
- Ningkun Wang
  - Department of Chemistry, San José State University, San José, California, 95192, USA
2. Khetan R, Curtis R, Deane CM, Hadsund JT, Kar U, Krawczyk K, Kuroda D, Robinson SA, Sormanni P, Tsumoto K, Warwicker J, Martin ACR. Current advances in biopharmaceutical informatics: guidelines, impact and challenges in the computational developability assessment of antibody therapeutics. MAbs 2022; 14:2020082. [PMID: 35104168; PMCID: PMC8812776; DOI: 10.1080/19420862.2021.2020082] [Citation(s) in RCA: 23] [Impact Index Per Article: 11.5] [Indexed: 01/03/2023]
Abstract
Therapeutic monoclonal antibodies and their derivatives are key components of clinical pipelines in the global biopharmaceutical industry. The availability of large datasets of antibody sequences, structures, and biophysical properties is increasingly enabling the development of predictive models and computational tools for the "developability assessment" of antibody drug candidates. Here, we provide an overview of the antibody informatics tools applicable to the prediction of developability issues such as stability, aggregation, immunogenicity, and chemical degradation. We further evaluate the opportunities and challenges of using biopharmaceutical informatics for drug discovery and optimization. Finally, we discuss the potential of developability guidelines based on in silico metrics that can be used for the assessment of antibody stability and manufacturability.
Affiliation(s)
- Rahul Khetan
  - Manchester Institute of Biotechnology, University of Manchester, Manchester, UK
- Robin Curtis
  - Manchester Institute of Biotechnology, University of Manchester, Manchester, UK
- Uddipan Kar
  - Department of Biological Engineering, Massachusetts Institute of Technology (MIT), Cambridge, MA, USA
- Daisuke Kuroda
  - Department of Bioengineering, School of Engineering, The University of Tokyo, Tokyo, Japan
  - Medical Device Development and Regulation Research Center, School of Engineering, The University of Tokyo, Tokyo, Japan
  - Department of Chemistry and Biotechnology, School of Engineering, The University of Tokyo, Tokyo, Japan
- Pietro Sormanni
  - Chemistry of Health, Yusuf Hamied Department of Chemistry, University of Cambridge
- Kouhei Tsumoto
  - Department of Bioengineering, School of Engineering, The University of Tokyo, Tokyo, Japan
  - Medical Device Development and Regulation Research Center, School of Engineering, The University of Tokyo, Tokyo, Japan
  - Department of Chemistry and Biotechnology, School of Engineering, The University of Tokyo, Tokyo, Japan
  - The Institute of Medical Science, The University of Tokyo, Tokyo, Japan
- Jim Warwicker
  - Manchester Institute of Biotechnology, University of Manchester, Manchester, UK
- Andrew C R Martin
  - Institute of Structural and Molecular Biology, Division of Biosciences, University College London, London, UK
3. Raimondi D, Orlando G, Vranken WF, Moreau Y. Exploring the limitations of biophysical propensity scales coupled with machine learning for protein sequence analysis. Sci Rep 2019; 9:16932. [PMID: 31729443; PMCID: PMC6858301; DOI: 10.1038/s41598-019-53324-w] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.8] [Received: 07/05/2019] [Accepted: 10/25/2019] [Indexed: 11/21/2022]
Abstract
Machine learning (ML) is ubiquitous in bioinformatics, due to its versatility. One of the most crucial aspects to consider while training an ML model is to carefully select the optimal feature encoding for the problem at hand. Biophysical propensity scales are widely adopted in structural bioinformatics because they describe amino acid properties that are intuitively relevant for many structural and functional aspects of proteins, and are thus commonly used as input features for ML methods. In this paper we reproduce three classical structural bioinformatics prediction tasks to investigate the main assumptions about the use of propensity scales as input features for ML methods. We investigate their usefulness with different randomization experiments and we show that their effectiveness varies among the ML methods used and the tasks. We show that while linear methods are more dependent on the feature encoding, the specific biophysical meaning of the features is less relevant for non-linear methods. Moreover, we show that even among linear ML methods, the simpler one-hot encoding can surprisingly outperform the “biologically meaningful” scales. We also show that feature selection performed with non-linear ML methods may not be able to distinguish between randomized and “real” propensity scales by properly prioritizing the latter. Finally, we show that learning problem-specific embeddings could be a simple, assumption-free and optimal way to perform feature learning/engineering for structural bioinformatics tasks.
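To make the encodings named in this abstract concrete, the following is a minimal Python sketch, not the authors' code. It encodes a sequence three ways: one-hot, a single propensity scale (the Kyte-Doolittle hydropathy scale is used here purely as an illustrative choice; the abstract does not say which scales were tested), and a randomized copy of that scale with the values shuffled among the amino acids. The function names and the example sequence are hypothetical.

```python
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

# Kyte-Doolittle hydropathy values, chosen only as an example scale.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def one_hot(seq):
    """Encode each residue as a 20-dimensional indicator vector."""
    return [[1.0 if aa == ref else 0.0 for ref in AMINO_ACIDS] for aa in seq]


def scale_encode(seq, scale):
    """Encode each residue as a single value looked up in a propensity scale."""
    return [[scale[aa]] for aa in seq]


def randomized_scale(scale, seed=0):
    """Shuffle the scale's values among amino acids (hypothetical control).

    The set of values is preserved, but the link between a residue and
    its biophysical property is destroyed.
    """
    rng = random.Random(seed)
    values = list(scale.values())
    rng.shuffle(values)
    return dict(zip(scale.keys(), values))


if __name__ == "__main__":
    seq = "MKVLA"  # hypothetical example sequence
    print(one_hot(seq)[0])
    print(scale_encode(seq, KYTE_DOOLITTLE))
    print(scale_encode(seq, randomized_scale(KYTE_DOOLITTLE)))
```

Feeding each of the three encodings to the same model and comparing scores is one plausible way to run the kind of randomization comparison the abstract describes; the actual experimental protocol is given in the paper itself.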
Affiliation(s)
- Gabriele Orlando
  - Interuniversity Institute of Bioinformatics in Brussels, ULB-VUB, 1050, Brussels, Belgium
- Wim F Vranken
  - Interuniversity Institute of Bioinformatics in Brussels, ULB-VUB, 1050, Brussels, Belgium
  - Structural Biology Brussels, Vrije Universiteit Brussel, Brussels, 1050, Belgium
- Yves Moreau
  - ESAT-STADIUS, KU Leuven, 3001, Leuven, Belgium
4. Zhang B, Li L, Lü Q. Protein Solvent-Accessibility Prediction by a Stacked Deep Bidirectional Recurrent Neural Network. Biomolecules 2018; 8:biom8020033. [PMID: 29799510; PMCID: PMC6023031; DOI: 10.3390/biom8020033] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.7] [Received: 04/21/2018] [Revised: 05/18/2018] [Accepted: 05/22/2018] [Indexed: 12/12/2022]
Abstract
Residue solvent accessibility is closely related to the spatial arrangement and packing of residues. Predicting the solvent accessibility of a protein is an important step to understand its structure and function. In this work, we present a deep learning method to predict residue solvent accessibility, which is based on a stacked deep bidirectional recurrent neural network applied to sequence profiles. To capture more long-range sequence information, a merging operator was proposed when bidirectional information from hidden nodes was merged for outputs. Three types of merging operators were used in our improved model, with a long short-term memory network performing as a hidden computing node. The trained database was constructed from 7361 proteins extracted from the PISCES server using a cut-off of 25% sequence identity. Sequence-derived features including position-specific scoring matrix, physical properties, physicochemical characteristics, conservation score and protein coding were used to represent a residue. Using this method, predictive values of continuous relative solvent-accessible area were obtained, and then, these values were transformed into binary states with predefined thresholds. Our experimental results showed that our deep learning method improved prediction quality relative to current methods, with mean absolute error and Pearson’s correlation coefficient values of 8.8% and 74.8%, respectively, on the CB502 dataset and 8.2% and 78%, respectively, on the Manesh215 dataset.
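The post-processing and evaluation steps mentioned in this abstract can be sketched in a few lines of Python. This is an illustrative sketch under stated assumptions, not the authors' implementation: the 0.25 threshold, the "B"/"E" (buried/exposed) labels, the function names, and the sample values are all hypothetical, since the abstract only says that "predefined thresholds" were used.

```python
import math


def to_binary_states(rsa_values, threshold=0.25):
    """Map continuous relative solvent accessibility to buried/exposed.

    The threshold is an assumed example value, not one taken from the paper.
    """
    return ["E" if v >= threshold else "B" for v in rsa_values]


def mean_absolute_error(predicted, observed):
    """Average absolute difference between predicted and observed values."""
    return sum(abs(p - o) for p, o in zip(predicted, observed)) / len(observed)


def pearson(x, y):
    """Pearson's correlation coefficient between two equal-length lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (norm_x * norm_y)


if __name__ == "__main__":
    predicted = [0.05, 0.30, 0.62, 0.18]  # hypothetical model outputs
    observed = [0.10, 0.25, 0.70, 0.12]   # hypothetical reference values
    print(to_binary_states(predicted))
    print(mean_absolute_error(predicted, observed))
    print(pearson(predicted, observed))
```

Reporting both metrics is complementary: the mean absolute error measures how far the predicted values are from the reference on average, while Pearson's coefficient measures whether they rise and fall together.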
Affiliation(s)
- Buzhong Zhang
  - School of Computer Science and Technology, Soochow University, Suzhou 215006, China
  - School of Computer and Information, Anqing Normal University, Anqing 246011, China
- Linqing Li
  - School of Computer Science and Technology, Soochow University, Suzhou 215006, China
- Qiang Lü
  - School of Computer Science and Technology, Soochow University, Suzhou 215006, China