1
Velloso JPL, de Sá AGC, Pires DEV, Ascher DB. Engineering G protein-coupled receptors for stabilization. Protein Sci 2024; 33:e5000. [PMID: 38747401 PMCID: PMC11094779 DOI: 10.1002/pro.5000] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/17/2023] [Revised: 03/21/2024] [Accepted: 04/10/2024] [Indexed: 05/19/2024]
Abstract
G protein-coupled receptors (GPCRs) are one of the most important families of targets for drug discovery. One of the limiting steps in the study of GPCRs has been their stability, with significant and time-consuming protein engineering often used to stabilize GPCRs for structural characterization and drug screening. Unfortunately, computational methods developed using globular soluble proteins have translated poorly to the rational engineering of GPCRs. To fill this gap, we propose GPCR-tm, a novel and personalized structurally driven web-based machine learning tool to study the impacts of mutations on GPCR stability. We show that GPCR-tm performs as well as or better than alternative methods, and that it can accurately rank the stability changes of a wide range of mutations occurring in various types of class A GPCRs. GPCR-tm achieved Pearson's correlation coefficients of 0.74 and 0.46 on 10-fold cross-validation and blind test sets, respectively. We observed that the (structural) graph-based signatures were the most important set of features for predicting destabilizing mutations, which points out that these signatures properly describe the changes in the environment where the mutations occur. More specifically, GPCR-tm was able to accurately rank mutations based on their effect on protein stability, guiding their rational stabilization. GPCR-tm is available through a user-friendly web server at https://biosig.lab.uq.edu.au/gpcr_tm/.
Affiliation(s)
- João Paulo L. Velloso
- School of Chemistry and Molecular Biosciences, The Australian Centre for Ecogenomics, The University of Queensland, Brisbane, Queensland, Australia
- Computational Biology and Clinical Informatics, Baker Heart and Diabetes Institute, Melbourne, Victoria, Australia
- Baker Department of Cardiometabolic Health, The University of Melbourne, Parkville, Victoria, Australia
- Alex G. C. de Sá
- School of Chemistry and Molecular Biosciences, The Australian Centre for Ecogenomics, The University of Queensland, Brisbane, Queensland, Australia
- Computational Biology and Clinical Informatics, Baker Heart and Diabetes Institute, Melbourne, Victoria, Australia
- Baker Department of Cardiometabolic Health, The University of Melbourne, Parkville, Victoria, Australia
- Douglas E. V. Pires
- School of Computing and Information Systems, The University of Melbourne, Parkville, Victoria, Australia
- David B. Ascher
- School of Chemistry and Molecular Biosciences, The Australian Centre for Ecogenomics, The University of Queensland, Brisbane, Queensland, Australia
- Computational Biology and Clinical Informatics, Baker Heart and Diabetes Institute, Melbourne, Victoria, Australia
- Baker Department of Cardiometabolic Health, The University of Melbourne, Parkville, Victoria, Australia
2
Ramakrishna Reddy P, Kulandaisamy A, Michael Gromiha M. TMH Stab-pred: Predicting the stability of α-helical membrane proteins using sequence and structural features. Methods 2023; 218:118-124. [PMID: 37572768 DOI: 10.1016/j.ymeth.2023.08.005] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/08/2023] [Revised: 08/02/2023] [Accepted: 08/04/2023] [Indexed: 08/14/2023]
Abstract
The folding and stability of transmembrane proteins (TMPs) are governed by the insertion of secondary structural elements into the cell membrane followed by their assembly. Understanding the important features that dictate the stability of TMPs is important for elucidating their functions. In this work, we related sequence and structure-based parameters with free energy (ΔG⁰) of α-helical membrane proteins. Our results showed that the free energy transfer of hydrophobic peptides, relative contact order, total interaction energy, number of hydrogen bonds and lipid accessibility of transmembrane regions are important for stability. Further, we have developed multiple-regression models to predict the stability of α-helical membrane proteins using these features, and our method can predict the stability with a correlation and mean absolute error (MAE) of 0.89 and 1.21 kcal/mol, respectively, on a jack-knife test. The method was validated on a blind test set of three recently reported experimental ΔG⁰ values, for which it predicted stability with an average MAE of 0.51 kcal/mol. Further, we developed a webserver for predicting the stability, which is freely available at https://web.iitm.ac.in/bioinfo2/TMHS/. The importance of selected parameters and limitations are discussed.
Affiliation(s)
- P Ramakrishna Reddy
- Department of Biotechnology, Bhupat and Jyoti Mehta School of Biosciences, Indian Institute of Technology Madras, Chennai 600036, Tamil Nadu, India
- A Kulandaisamy
- Department of Biotechnology, Bhupat and Jyoti Mehta School of Biosciences, Indian Institute of Technology Madras, Chennai 600036, Tamil Nadu, India; Basic and Translational Research Division, Department of Cardiology, Boston Children's Hospital, Boston, MA 02115, USA
- M Michael Gromiha
- Department of Biotechnology, Bhupat and Jyoti Mehta School of Biosciences, Indian Institute of Technology Madras, Chennai 600036, Tamil Nadu, India; Department of Computer Science, Tokyo Institute of Technology, Yokohama, Japan; Department of Computer Science, National University of Singapore, Singapore
3
Sun J, Kulandaisamy A, Liu J, Hu K, Gromiha MM, Zhang Y. Machine learning in computational modelling of membrane protein sequences and structures: From methodologies to applications. Comput Struct Biotechnol J 2023; 21:1205-1226. [PMID: 36817959 PMCID: PMC9932300 DOI: 10.1016/j.csbj.2023.01.036] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 11/16/2022] [Revised: 01/16/2023] [Accepted: 01/25/2023] [Indexed: 01/29/2023]
Abstract
Membrane proteins mediate a wide spectrum of biological processes, such as signal transduction and cell communication. Due to the arduous and costly nature inherent to the experimental process, membrane proteins have long been devoid of well-resolved atomic-level tertiary structures and, consequently, the understanding of their functional roles underlying a multitude of life activities has been hampered. Currently, computational tools dedicated to furthering the structure-function understanding are primarily focused on utilizing intelligent algorithms to address a variety of site-wise prediction problems (e.g., topology and interaction sites), but are scattered across different computing sources. Moreover, the recent advent of deep learning techniques has immensely expedited the development of computational tools for membrane protein-related prediction problems. Given the growing number of applications optimized particularly by manifold deep neural networks, we herein provide a review on the current status of computational strategies mainly in membrane protein type classification, topology identification, interaction site detection, and pathogenic effect prediction. Meanwhile, we provide an overview of how the entire prediction process proceeds, including database collection, data pre-processing, feature extraction, and method selection. This review is expected to be useful for developing more extendable computational tools specific to membrane proteins.
Affiliation(s)
- Jianfeng Sun
- Botnar Research Centre, Nuffield Department of Orthopedics, Rheumatology, and Musculoskeletal Sciences, University of Oxford, Headington, Oxford OX3 7LD, UK
- Arulsamy Kulandaisamy
- Department of Biotechnology, Bhupat and Jyoti Mehta School of BioSciences, Indian Institute of Technology Madras, Chennai 600 036, Tamilnadu, India
- Jacklyn Liu
- UCL Cancer Institute, University College London, 72 Huntley Street, London WC1E 6BT, UK
- Kai Hu
- Key Laboratory of Intelligent Computing and Information Processing of Ministry of Education, Xiangtan University, Xiangtan 411105, China
- M. Michael Gromiha
- Department of Biotechnology, Bhupat and Jyoti Mehta School of BioSciences, Indian Institute of Technology Madras, Chennai 600 036, Tamilnadu, India (corresponding author)
- Yuan Zhang
- Key Laboratory of Intelligent Computing and Information Processing of Ministry of Education, Xiangtan University, Xiangtan 411105, China (corresponding author)
4
Pandey M, Anoosha P, Yesudhas D, Gromiha MM. Identification of potential driver mutations in glioblastoma using machine learning. Brief Bioinform 2022; 23:6764546. [PMID: 36266243 DOI: 10.1093/bib/bbac451] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/07/2022] [Revised: 09/13/2022] [Accepted: 09/22/2022] [Indexed: 12/14/2022]
Abstract
Glioblastoma is a fast and aggressively growing tumor in the brain and spinal cord. Mutation of amino acid residues in target proteins, which are involved in glioblastoma, alters the structure and function and may lead to disease. In this study, we collected a set of 9386 disease-causing (driver) mutations, based on their recurrence in patient samples and experimental annotation as pathogenic, and 8728 neutral (passenger) mutations. We observed that Arg is highly preferred at the mutant sites of drivers, whereas Met and Ile showed preferences in passengers. Inspecting neighboring residues at the mutant sites revealed that the motifs YP, CP and GRH are preferred in drivers, whereas SI, IQ and TVI are dominant in neutral mutations. In addition, we have computed other sequence-based features such as conservation scores, Position Specific Scoring Matrices (PSSM) and physicochemical properties, and developed a machine learning-based method, GBMDriver (GlioBlastoma Multiforme Drivers), for distinguishing between driver and passenger mutations. Our method showed an accuracy and AUC of 73.59% and 0.82, respectively, on 10-fold cross-validation and 81.99% and 0.87 in a blind set of 1809 mutants. The tool is available at https://web.iitm.ac.in/bioinfo2/GBMDriver/index.html. We envisage that the present method is helpful to prioritize driver mutations in glioblastoma and assist in identifying therapeutic targets.
Affiliation(s)
- Medha Pandey
- Department of Biotechnology, Bhupat and Jyoti Mehta School of Biosciences, Indian Institute of Technology Madras, Chennai 600036, India
- P Anoosha
- Division of Medical Oncology, Department of Internal Medicine, Comprehensive Cancer Center, The Ohio State University, Columbus, Ohio, USA
- Dhanusha Yesudhas
- Department of Biotechnology, Bhupat and Jyoti Mehta School of Biosciences, Indian Institute of Technology Madras, Chennai 600036, India
- M Michael Gromiha
- Department of Biotechnology, Bhupat and Jyoti Mehta School of Biosciences, Indian Institute of Technology Madras, Chennai 600036, India
5
MPAD: A Database for Binding Affinity of Membrane Protein–protein Complexes and their Mutants. J Mol Biol 2022:167870. [DOI: 10.1016/j.jmb.2022.167870] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/22/2022] [Revised: 10/20/2022] [Accepted: 10/20/2022] [Indexed: 11/06/2022]
6
Pancotti C, Benevenuta S, Birolo G, Alberini V, Repetto V, Sanavia T, Capriotti E, Fariselli P. Predicting protein stability changes upon single-point mutation: a thorough comparison of the available tools on a new dataset. Brief Bioinform 2022; 23:6502552. [PMID: 35021190 PMCID: PMC8921618 DOI: 10.1093/bib/bbab555] [Citation(s) in RCA: 38] [Impact Index Per Article: 19.0] [Received: 07/28/2021] [Revised: 11/29/2021] [Accepted: 12/05/2021] [Indexed: 12/13/2022]
Abstract
Predicting the difference in thermodynamic stability between protein variants is crucial for protein design and understanding the genotype-phenotype relationships. So far, several computational tools have been created to address this task. Nevertheless, most of them have been trained or optimized on the same and ‘all’ available data, making a fair comparison unfeasible. Here, we introduce a novel dataset, collected and manually cleaned from the latest version of the ThermoMutDB database, consisting of 669 variants not included in the most widely used training datasets. The prediction performance and the ability to satisfy the antisymmetry property by considering both direct and reverse variants were evaluated across 21 different tools. The Pearson correlations of the tested tools were in the ranges of 0.21–0.5 and 0–0.45 for the direct and reverse variants, respectively. When both direct and reverse variants are considered, the antisymmetric methods perform better, achieving a Pearson correlation in the range of 0.51–0.62. The tested methods seem relatively insensitive to the physiological conditions, performing well also on the variants measured with more extreme pH and temperature values. A common issue with all the tested methods is the compression of the ΔΔG predictions toward zero. Furthermore, the thermodynamic stability of the most significantly stabilizing variants was found to be more challenging to predict. This study is the most extensive comparison of prediction methods using an entirely novel set of variants never tested before.
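The antisymmetry property mentioned in this abstract means that the predicted ΔΔG of a reverse variant should be the negative of the prediction for the direct variant. The following is a minimal Python sketch of how such a check could be scored, not the benchmark's actual code. The two summary measures (the Pearson correlation between direct and reverse predictions, ideally -1, and the mean bias, ideally 0) are assumptions drawn from common practice, the function names are hypothetical, and all numbers are made-up toy values.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def antisymmetry_report(pred_direct, pred_reverse):
    """Return (r_dr, bias) for paired direct/reverse predictions.

    r_dr : correlation between direct and reverse predictions
           (a perfectly antisymmetric predictor gives -1).
    bias : mean of (direct + reverse) / 2
           (a perfectly antisymmetric predictor gives 0).
    """
    r_dr = pearson(pred_direct, pred_reverse)
    bias = sum(d + r for d, r in zip(pred_direct, pred_reverse)) / (
        2 * len(pred_direct)
    )
    return r_dr, bias


# Toy data (hypothetical, in kcal/mol): predictions for five direct variants.
pred_direct = [-0.8, 0.1, -1.6, 0.5, -0.4]

# An antisymmetric predictor returns the exact negative for each reverse variant.
symmetric_reverse = [-v for v in pred_direct]

# A biased predictor calls almost everything destabilizing in both directions.
biased_reverse = [-0.6, -0.3, -0.2, -0.5, -0.4]

r_sym, bias_sym = antisymmetry_report(pred_direct, symmetric_reverse)
r_bad, bias_bad = antisymmetry_report(pred_direct, biased_reverse)
```

Comparing the two reports separates a method that merely correlates with experiment on direct variants from one that also respects the thermodynamic sign flip on reverse variants.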
Affiliation(s)
- Corrado Pancotti
- Department of Medical Sciences, University of Torino, Via Santena 19, 10126 Torino, Italy
- Silvia Benevenuta
- Department of Medical Sciences, University of Torino, Via Santena 19, 10126 Torino, Italy
- Giovanni Birolo
- Department of Medical Sciences, University of Torino, Via Santena 19, 10126 Torino, Italy
- Virginia Alberini
- Department of Medical Sciences, University of Torino, Via Santena 19, 10126 Torino, Italy
- Valeria Repetto
- Department of Medical Sciences, University of Torino, Via Santena 19, 10126 Torino, Italy
- Tiziana Sanavia
- Department of Medical Sciences, University of Torino, Via Santena 19, 10126 Torino, Italy
- Emidio Capriotti
- Department of Pharmacy and Biotechnology (FaBiT), University of Bologna, Bologna, Italy
- Piero Fariselli
- Department of Medical Sciences, University of Torino, Via Santena 19, 10126 Torino, Italy
7
Kulandaisamy A, Nikam R, Harini K, Sharma D, Gromiha MM. Illustrative Tutorials for ProThermDB: Thermodynamic Database for Proteins and Mutants. Curr Protoc 2021; 1:e306. [PMID: 34826364 DOI: 10.1002/cpz1.306] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/07/2022]
Abstract
ProThermDB (https://web.iitm.ac.in/bioinfo2/prothermdb/index.html) is a primary resource for protein stability, which contains experimentally determined thermodynamic data for proteins and their mutants. The most recent version of ProThermDB accumulates the data obtained from both high- and low-throughput experimental biophysical methods. It includes comprehensive information at four different levels, i.e.: (i) protein sequence and structure; (ii) experimental conditions; (iii) thermodynamic parameters such as Gibbs free energy, melting temperature, enthalpy, etc.; and (iv) literature. In the following protocols, we present detailed tutorials for retrieving data using different search, display and sorting options, interpretation of search results, description of each entry-level information category, data upload and download, cross-links with other databases, and visualization options. This protocol consists of six pictorial exercises, which are useful for biologists/users to understand the contents and organization of data in ProThermDB. Further, potential applications of ProThermDB in protein engineering are discussed. © 2021 Wiley Periodicals LLC. Basic Protocol 1: Retrieval of experimental thermodynamic data for wild-type and mutants of a specific protein using a simple query Basic Protocol 2: Retrieval of stabilizing point mutations, which are located at the interior of α-helical regions, and obtaining data by thermal denaturation methods Basic Protocol 3: Retrieval of destabilizing point mutations, which are in β-sheets of exposed regions, and obtaining data by chemical denaturation methods (urea and GdnHCl) Basic Protocol 4: Retrieval of stabilizing and destabilizing point mutations in a range of physiological conditions (pH: 6-9 and T: 20°C-25°C) and publication years (2010-2020) Support Protocol: Downloading the entire data of the database for academic research purposes and submission of new data in ProThermDB.
Affiliation(s)
- A Kulandaisamy
- Department of Biotechnology, Bhupat and Jyoti Mehta School of BioSciences, Indian Institute of Technology Madras, Chennai, Tamil Nadu, India
- Rahul Nikam
- Department of Biotechnology, Bhupat and Jyoti Mehta School of BioSciences, Indian Institute of Technology Madras, Chennai, Tamil Nadu, India
- K Harini
- Department of Biotechnology, Bhupat and Jyoti Mehta School of BioSciences, Indian Institute of Technology Madras, Chennai, Tamil Nadu, India
- Divya Sharma
- Department of Biotechnology, Bhupat and Jyoti Mehta School of BioSciences, Indian Institute of Technology Madras, Chennai, Tamil Nadu, India
- M Michael Gromiha
- Department of Biotechnology, Bhupat and Jyoti Mehta School of BioSciences, Indian Institute of Technology Madras, Chennai, Tamil Nadu, India
8
Samaga YBL, Raghunathan S, Priyakumar UD. SCONES: Self-Consistent Neural Network for Protein Stability Prediction Upon Mutation. J Phys Chem B 2021; 125:10657-10671. [PMID: 34546056 DOI: 10.1021/acs.jpcb.1c04913] [Citation(s) in RCA: 10] [Impact Index Per Article: 3.3] [Indexed: 01/17/2023]
Abstract
Engineering proteins to have desired properties by mutating amino acids at specific sites is commonplace. Such engineered proteins must be stable to function. Experimental methods used to determine stability at throughputs required to scan the protein sequence space thoroughly are laborious. To this end, many machine learning based methods have been developed to predict thermodynamic stability changes upon mutation. These methods have been evaluated for symmetric consistency by testing with hypothetical reverse mutations. In this work, we propose transitive data augmentation, evaluating transitive consistency with our new Stransitive data set, and a new machine learning based method, the first of its kind, that incorporates both symmetric and transitive properties into the architecture. Our method, called SCONES, is an interpretable neural network that predicts small relative protein stability changes for missense mutations that do not significantly alter the structure. It estimates a residue's contributions toward protein stability (ΔG) in its local structural environment, and the difference between independently predicted contributions of the reference and mutant residues is reported as ΔΔG. We show that this self-consistent machine learning architecture is immune to many common biases in data sets, relies less on data than existing methods, is robust to overfitting, and can explain a substantial portion of the variance in experimental data.
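As a brief illustration of why reporting ΔΔG as a difference of independently predicted contributions yields the symmetric and transitive consistency described in this abstract, the relations can be written out as below. This is a sketch under stated assumptions, not the authors' formulation: ΔG(X) denotes the predicted contribution of residue X at a fixed site and local environment, and the sign convention (mutant minus reference) is assumed here for illustration.

```latex
\begin{align}
\Delta\Delta G_{A \to B} &= \Delta G(B) - \Delta G(A) \\
\Delta\Delta G_{B \to A} &= \Delta G(A) - \Delta G(B) = -\Delta\Delta G_{A \to B} \\
\Delta\Delta G_{A \to B} + \Delta\Delta G_{B \to C} &= \Delta G(C) - \Delta G(A) = \Delta\Delta G_{A \to C}
\end{align}
```

The second line is the symmetric (reverse-mutation) property and the third is the transitive property. Both hold by construction for any choice of the per-residue predictor, provided the local environment is treated as unchanged by the mutation.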
Affiliation(s)
- Yashas B L Samaga
- Center for Computational Natural Sciences and Bioinformatics, International Institute of Information Technology, Hyderabad 500 032, India
- Shampa Raghunathan
- Center for Computational Natural Sciences and Bioinformatics, International Institute of Information Technology, Hyderabad 500 032, India
- U Deva Priyakumar
- Center for Computational Natural Sciences and Bioinformatics, International Institute of Information Technology, Hyderabad 500 032, India
9
Casadio R, Lenhard B, Sternberg MJE. Computational Resources for Molecular Biology 2021. J Mol Biol 2021; 433:166962. [PMID: 33774035 DOI: 10.1016/j.jmb.2021.166962] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/21/2022]
Affiliation(s)
- Rita Casadio
- Biocomputing Group, FABIT-University of Bologna, Italy
- Boris Lenhard
- Institute of Clinical Sciences, Faculty of Medicine, Imperial College London, Hammersmith Campus, Du Cane Road, London W12 0NN, UK; Computational Regulatory Genomics, MRC London Institute of Medical Sciences, Du Cane Road, London W12 0NN, UK
- Michael J E Sternberg
- Structural Bioinformatics Group, Centre for Integrative Systems Biology and Bioinformatics, Department of Life Sciences, Imperial College London, London SW7 2AZ, UK
10
Nikam R, Kulandaisamy A, Harini K, Sharma D, Gromiha MM. ProThermDB: thermodynamic database for proteins and mutants revisited after 15 years. Nucleic Acids Res 2021; 49:D420-D424. [PMID: 33196841 PMCID: PMC7778892 DOI: 10.1093/nar/gkaa1035] [Citation(s) in RCA: 85] [Impact Index Per Article: 28.3] [Received: 09/14/2020] [Revised: 10/14/2020] [Accepted: 10/26/2020] [Indexed: 11/12/2022]
Abstract
ProThermDB is an updated version of the thermodynamic database for proteins and mutants (ProTherm), which has ∼31 500 data on protein stability, an increase of 84% from the previous version. It contains several thermodynamic parameters such as melting temperature, free energy obtained with thermal and denaturant denaturation, enthalpy change and heat capacity change along with experimental methods and conditions, sequence, structure and literature information. Besides, the current version of the database includes about 120 000 thermodynamic data obtained for different organisms and cell lines, which are determined by recent high throughput proteomics techniques using whole-cell approaches. In addition, we provided a graphical interface for visualization of mutations at sequence and structure levels. ProThermDB is cross-linked with other relevant databases, PDB, UniProt, PubMed etc. It is freely available at https://web.iitm.ac.in/bioinfo2/prothermdb/index.html without any login requirements. It is implemented in Python, HTML and JavaScript, and supports the latest versions of major browsers, such as Firefox, Chrome and Safari.
Affiliation(s)
- Rahul Nikam
- Department of Biotechnology, Bhupat and Jyoti Mehta School of BioSciences, Indian Institute of Technology Madras, Chennai 600 036, Tamilnadu, India
- A Kulandaisamy
- Department of Biotechnology, Bhupat and Jyoti Mehta School of BioSciences, Indian Institute of Technology Madras, Chennai 600 036, Tamilnadu, India
- K Harini
- Department of Biotechnology, Bhupat and Jyoti Mehta School of BioSciences, Indian Institute of Technology Madras, Chennai 600 036, Tamilnadu, India
- Divya Sharma
- Department of Biotechnology, Bhupat and Jyoti Mehta School of BioSciences, Indian Institute of Technology Madras, Chennai 600 036, Tamilnadu, India
- M Michael Gromiha
- Department of Biotechnology, Bhupat and Jyoti Mehta School of BioSciences, Indian Institute of Technology Madras, Chennai 600 036, Tamilnadu, India