1. Li C, Luo Y, Xie Y, Zhang Z, Liu Y, Zou L, Xiao F. Structural and functional prediction, evaluation, and validation in the post-sequencing era. Comput Struct Biotechnol J 2024; 23:446-451. [PMID: 38223342 PMCID: PMC10787220 DOI: 10.1016/j.csbj.2023.12.031] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/20/2023] [Revised: 12/20/2023] [Accepted: 12/22/2023] [Indexed: 01/16/2024] Open
Abstract
The surge of genome sequencing data has highlighted a substantial number of genetic variants of uncertain significance (VUS). Deciphering the VUS discovered by sequencing poses a major challenge in the post-sequencing era. Although experimental assays have progressed in classifying VUS, only a tiny fraction of human genes have been explored experimentally. Thus, state-of-the-art in silico functional predictors of VUS are urgently needed. Artificial intelligence (AI) is an invaluable tool to assist in the identification of VUS with high efficiency and accuracy. An increasing number of studies indicate that AI has accelerated the interpretation of VUS, and our group has already used AI to develop protein structure-based prediction models. In this review, we provide an overview of previous research on AI-based prediction of missense variants and elucidate the challenges and opportunities for protein structure-based variant prediction in the post-sequencing era.
Affiliation(s)
- Chang Li
- Clinical Biobank, Beijing Hospital, National Center of Gerontology, National Health Commission, Institute of Geriatric Medicine, Chinese Academy of Medical Sciences, Beijing, China
- The Key Laboratory of Geriatrics, Beijing Institute of Geriatrics, Beijing Hospital, National Center of Gerontology, National Health Commission, Institute of Geriatric Medicine, Chinese Academy of Medical Sciences, Beijing, China
- Yixuan Luo
- Beijing Normal University, Beijing, China
- Yibo Xie
- Information Center, Beijing Hospital, National Center of Gerontology, National Health Commission, Institute of Geriatric Medicine, Chinese Academy of Medical Sciences, Beijing, China
- Zaifeng Zhang
- The Key Laboratory of Geriatrics, Beijing Institute of Geriatrics, Beijing Hospital, National Center of Gerontology, National Health Commission, Institute of Geriatric Medicine, Chinese Academy of Medical Sciences, Beijing, China
- Ye Liu
- The Key Laboratory of Geriatrics, Beijing Institute of Geriatrics, Beijing Hospital, National Center of Gerontology, National Health Commission, Institute of Geriatric Medicine, Chinese Academy of Medical Sciences, Beijing, China
- Lihui Zou
- The Key Laboratory of Geriatrics, Beijing Institute of Geriatrics, Beijing Hospital, National Center of Gerontology, National Health Commission, Institute of Geriatric Medicine, Chinese Academy of Medical Sciences, Beijing, China
- Fei Xiao
- Clinical Biobank, Beijing Hospital, National Center of Gerontology, National Health Commission, Institute of Geriatric Medicine, Chinese Academy of Medical Sciences, Beijing, China
- The Key Laboratory of Geriatrics, Beijing Institute of Geriatrics, Beijing Hospital, National Center of Gerontology, National Health Commission, Institute of Geriatric Medicine, Chinese Academy of Medical Sciences, Beijing, China
- Beijing Normal University, Beijing, China
2. Qiu Y, Huang T, Cai YD. Review of predicting protein stability changes upon variations. Proteomics 2024; 24:e2300371. [PMID: 38643379 DOI: 10.1002/pmic.202300371] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/30/2024] [Revised: 04/07/2024] [Accepted: 04/08/2024] [Indexed: 04/22/2024]
Abstract
Forecasting alterations in protein stability caused by variations holds immense importance. Improving the thermal stability of proteins is important for biomedical and industrial applications. This review discusses the latest methods for predicting the effects of mutations on protein stability, databases containing protein mutations and thermodynamic parameters, and experimental techniques for efficiently assessing protein stability in high-throughput settings. Various publicly available databases for protein stability prediction are introduced. Furthermore, state-of-the-art computational approaches for anticipating protein stability changes due to variants are reviewed. Each method's types of features, base algorithm, and prediction results are also detailed. Additionally, some experimental approaches for verifying the prediction results of computational methods are introduced. Finally, the review summarizes the progress and challenges of protein stability prediction and discusses potential models for future research directions.
Affiliation(s)
- Yiling Qiu
- Bio-Med Big Data Center, CAS Key Laboratory of Computational Biology, Shanghai Institute of Nutrition and Health, University of Chinese Academy of Sciences, Chinese Academy of Sciences, Shanghai, China
- School of Mathematics and Statistics, Guangdong University of Technology, Guangzhou, China
- Tao Huang
- Bio-Med Big Data Center, CAS Key Laboratory of Computational Biology, Shanghai Institute of Nutrition and Health, University of Chinese Academy of Sciences, Chinese Academy of Sciences, Shanghai, China
- Yu-Dong Cai
- School of Life Sciences, Shanghai University, Shanghai, China
3. Li G, Yao S, Fan L. ProSTAGE: Predicting Effects of Mutations on Protein Stability by Using Protein Embeddings and Graph Convolutional Networks. J Chem Inf Model 2024; 64:340-347. [PMID: 38166383 PMCID: PMC10806799 DOI: 10.1021/acs.jcim.3c01697] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/21/2023] [Revised: 12/11/2023] [Accepted: 12/12/2023] [Indexed: 01/04/2024]
Abstract
Protein thermodynamic stability is essential to clarify the relationships among structure, function, and interaction. Therefore, developing a faster and more accurate method to predict the impact of the mutations on protein stability is helpful for protein design and understanding the phenotypic variation. Recent studies have shown that protein embedding will be particularly powerful at modeling sequence information with context dependence, such as subcellular localization, variant effect, and secondary structure prediction. Herein, we introduce a novel method, ProSTAGE, which is a deep learning method that fuses structure and sequence embedding to predict protein stability changes upon single point mutations. Our model combines graph-based techniques and language models to predict stability changes. Moreover, ProSTAGE is trained on a larger data set, which is almost twice as large as the most used S2648 data set. It consistently outperforms all existing state-of-the-art methods on mutation-affected problems as benchmarked on several independent data sets. The protein embedding as the prediction input achieves better results than the previous results, which shows the potential of protein language models in predicting the effect of mutations on proteins. ProSTAGE is implemented as a user-friendly web server.
Affiliation(s)
- Gen Li
- Production and R&D Center I of LSS, GenScript (Shanghai) Biotech Co., Ltd., Shanghai 200131, China
- Sijie Yao
- Production and R&D Center I of LSS, GenScript (Shanghai) Biotech Co., Ltd., Shanghai 200131, China
- Long Fan
- Production and R&D Center I of LSS, GenScript (Shanghai) Biotech Co., Ltd., Shanghai 200131, China
4. Selvasingh JA, McDonald EF, Neufer PD, McKinney JR, Meiler J, Ledwitch KV. Dark nanodiscs for evaluating membrane protein thermostability by differential scanning fluorimetry. Biophys J 2024; 123:68-79. [PMID: 37978799 PMCID: PMC10808023 DOI: 10.1016/j.bpj.2023.11.019] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/24/2023] [Revised: 10/27/2023] [Accepted: 11/16/2023] [Indexed: 11/19/2023] Open
Abstract
Measuring protein thermostability provides valuable information on the biophysical rules that govern the structure-energy relationships of proteins. However, such measurements remain a challenge for membrane proteins. Here, we introduce a new experimental system to evaluate membrane protein thermostability. This system leverages a recently developed nonfluorescent membrane scaffold protein to reconstitute proteins into nanodiscs and is coupled with a nano-format of differential scanning fluorimetry (nanoDSF). This approach offers a label-free and direct measurement of the intrinsic tryptophan fluorescence of the membrane protein as it unfolds in solution without signal interference from the "dark" nanodisc. In this work, we demonstrate the application of this method using the disulfide bond formation protein B (DsbB) as a test membrane protein. NanoDSF measurements of DsbB reconstituted in dark nanodiscs loaded with 1,2-dimyristoyl-sn-glycero-3-phosphocholine (DMPC) and 1,2-dimyristoyl-sn-glycero-3-phosphorylglycerol (DMPG) lipids show a complex biphasic thermal unfolding pattern with a minor unfolding transition followed by a major transition. The inflection points of the thermal denaturation curve reveal two distinct unfolding midpoint melting temperatures (Tm) of 70.5°C and 77.5°C, consistent with a three-state unfolding model. Further, we show that the catalytically conserved disulfide bond between residues C41 and C130 drives the intermediate state of the unfolding pathway for DsbB in a DMPC and DMPG nanodisc. To extend the utility of this method, we evaluate and compare the thermostability of DsbB in different lipid environments. We introduce this method as a new tool that can be used to understand how compositionally and biophysically complex lipid environments drive membrane protein stability.
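A common way to read melting temperatures off a thermal denaturation curve is to locate maxima of its first derivative, which mark the inflection points. The sketch below is illustrative only and is not the authors' analysis pipeline: the synthetic two-sigmoid curve, the helper names `sigmoid` and `find_tm_candidates`, the transition widths and amplitudes, and the peak threshold are all assumptions. Only the two Tm values (70.5 °C and 77.5 °C) are taken from the abstract.

```python
import numpy as np


def sigmoid(temps, tm, width):
    """Two-state unfolding transition centred at tm (hypothetical curve shape)."""
    return 1.0 / (1.0 + np.exp(-(temps - tm) / width))


def find_tm_candidates(temps, signal, min_rel_height=0.2):
    """Return temperatures where d(signal)/dT has a local maximum.

    Peaks lower than min_rel_height times the tallest peak are ignored,
    which suppresses numerical noise in the flat baselines.
    """
    deriv = np.gradient(signal, temps)
    threshold = min_rel_height * deriv.max()
    peaks = [
        i
        for i in range(1, len(deriv) - 1)
        if deriv[i] > deriv[i - 1]
        and deriv[i] >= deriv[i + 1]
        and deriv[i] >= threshold
    ]
    return [float(temps[i]) for i in peaks]


# Synthetic biphasic curve: a minor transition followed by a major one.
# Amplitudes (0.3, 0.7) and width (0.6 °C) are made-up illustration values.
temps = np.arange(40.0, 95.0, 0.1)
signal = 0.3 * sigmoid(temps, 70.5, 0.6) + 0.7 * sigmoid(temps, 77.5, 0.6)

tm_values = find_tm_candidates(temps, signal)
print([round(tm, 1) for tm in tm_values])
```

On real nanoDSF traces the derivative is usually smoothed first, and overlapping transitions shift the apparent peaks, so a fit to an explicit three-state model may be preferable to simple peak picking.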
Affiliation(s)
- Jazlyn A Selvasingh
- Center for Structural Biology, Vanderbilt University, Nashville, Tennessee; Department of Chemistry, Vanderbilt University, Nashville, Tennessee
- Eli F McDonald
- Center for Structural Biology, Vanderbilt University, Nashville, Tennessee; Department of Chemistry, Vanderbilt University, Nashville, Tennessee
- Preston D Neufer
- Center for Structural Biology, Vanderbilt University, Nashville, Tennessee; Department of Chemistry, Vanderbilt University, Nashville, Tennessee
- Jacob R McKinney
- Center for Structural Biology, Vanderbilt University, Nashville, Tennessee; Department of Chemistry, Vanderbilt University, Nashville, Tennessee
- Jens Meiler
- Center for Structural Biology, Vanderbilt University, Nashville, Tennessee; Department of Chemistry, Vanderbilt University, Nashville, Tennessee; Institute of Drug Discovery, Faculty of Medicine, University of Leipzig, Leipzig, Germany.
- Kaitlyn V Ledwitch
- Center for Structural Biology, Vanderbilt University, Nashville, Tennessee; Department of Chemistry, Vanderbilt University, Nashville, Tennessee.
5. Li M, Wang H, Yang Z, Zhang L, Zhu Y. DeepTM: A deep learning algorithm for prediction of melting temperature of thermophilic proteins directly from sequences. Comput Struct Biotechnol J 2023; 21:5544-5560. [PMID: 38034401 PMCID: PMC10681957 DOI: 10.1016/j.csbj.2023.11.006] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/23/2023] [Revised: 11/02/2023] [Accepted: 11/02/2023] [Indexed: 12/02/2023] Open
Abstract
Thermally stable proteins find extensive applications in industrial production and pharmaceutical development, and serve as a highly evolved starting point in protein engineering. The thermal stability of proteins is commonly characterized by their melting temperature (Tm). However, due to the limited availability of experimentally determined Tm data and the insufficient accuracy of existing computational methods in predicting Tm, there is an urgent need for a computational approach to accurately forecast the Tm values of thermophilic proteins. Here, we present a deep learning-based model, called DeepTM, which exclusively utilizes protein sequences as input and accurately predicts the Tm values of target thermophilic proteins on a dataset consisting of 7790 thermophilic protein entries. On a test set of 1550 samples, DeepTM demonstrates excellent performance with a coefficient of determination (R²) of 0.75, Pearson correlation coefficient (P) of 0.87, and root mean square error (RMSE) of 6.24 ℃. We further analyzed the sequence features that determine the thermal stability of thermophilic proteins and found that dipeptide frequency, optimal growth temperature (OGT) of the host organisms, and the evolutionary information of the protein significantly affect its melting temperature. We compared the performance of DeepTM with recently reported methods, ProTstab2 and DeepSTABp, in predicting the Tm values on two blind test datasets. One dataset comprised 22 PET plastic-degrading enzymes, while the other included 29 thermally stable proteins of broader classification. In the PET plastic-degrading enzyme dataset, DeepTM achieved an RMSE of 8.25 ℃. Compared to ProTstab2 (20.05 ℃) and DeepSTABp (20.97 ℃), DeepTM demonstrated a reduction in RMSE of 58.85% and 60.66%, respectively. In the dataset of thermally stable proteins, DeepTM (RMSE=7.66 ℃) demonstrated a 51.73% reduction in RMSE compared to ProTstab2 (RMSE=15.87 ℃).
DeepTM, with the sole requirement of protein sequence information, accurately predicts the melting temperature and achieves a fully end-to-end prediction process, thus providing enhanced convenience and expediency for further protein engineering.
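The percentage reductions quoted in this abstract can be reproduced from the reported RMSE values, assuming they are computed as the relative difference to the baseline method. The helper name `rmse_reduction_percent` is hypothetical; the RMSE figures are the ones stated in the abstract.

```python
def rmse_reduction_percent(baseline_rmse: float, new_rmse: float) -> float:
    """Relative RMSE reduction of new_rmse versus baseline_rmse, in percent."""
    if baseline_rmse <= 0:
        raise ValueError("baseline_rmse must be positive")
    return (baseline_rmse - new_rmse) / baseline_rmse * 100.0


# RMSE values in degrees Celsius, as quoted in the abstract.
pet_vs_protstab2 = rmse_reduction_percent(20.05, 8.25)
pet_vs_deepstabp = rmse_reduction_percent(20.97, 8.25)
stable_vs_protstab2 = rmse_reduction_percent(15.87, 7.66)

print(f"{pet_vs_protstab2:.2f}%")     # 58.85%
print(f"{pet_vs_deepstabp:.2f}%")     # 60.66%
print(f"{stable_vs_protstab2:.2f}%")  # 51.73%
```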
Affiliation(s)
- Mengyu Li
- College of Life Science and Technology, Beijing University of Chemical Technology, Beijing 100029, China
- Hongzhao Wang
- College of Life Science and Technology, Beijing University of Chemical Technology, Beijing 100029, China
- Zhenwu Yang
- College of Life Science and Technology, Beijing University of Chemical Technology, Beijing 100029, China
- Longgui Zhang
- SINOPEC Beijing Research Institute of Chemical Industry, Beijing 100013, China
- Yushan Zhu
- College of Life Science and Technology, Beijing University of Chemical Technology, Beijing 100029, China
- National Energy R&D Center for Biorefinery, Beijing University of Chemical Technology, Beijing 100029, China
6. Ramakrishna Reddy P, Kulandaisamy A, Michael Gromiha M. TMH Stab-pred: Predicting the stability of α-helical membrane proteins using sequence and structural features. Methods 2023; 218:118-124. [PMID: 37572768 DOI: 10.1016/j.ymeth.2023.08.005] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/08/2023] [Revised: 08/02/2023] [Accepted: 08/04/2023] [Indexed: 08/14/2023] Open
Abstract
The folding and stability of transmembrane proteins (TMPs) are governed by the insertion of secondary structural elements into the cell membrane followed by their assembly. Understanding the important features that dictate the stability of TMPs is important for elucidating their functions. In this work, we related sequence and structure-based parameters with free energy (ΔG0) of α-helical membrane proteins. Our results showed that the free energy transfer of hydrophobic peptides, relative contact order, total interaction energy, number of hydrogen bonds and lipid accessibility of transmembrane regions are important for stability. Further, we have developed multiple-regression models to predict the stability of α-helical membrane proteins using these features and our method can predict the stability with a correlation and mean absolute error (MAE) of 0.89 and 1.21 kcal/mol, respectively, on jack-knife test. The method was validated with a blind test set of three recently reported experimental ΔG0, which could predict the stability within an average MAE of 0.51 kcal/mol. Further, we developed a webserver for predicting the stability and it is freely available at (https://web.iitm.ac.in/bioinfo2/TMHS/). The importance of selected parameters and limitations are discussed.
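The evaluation scheme mentioned in this abstract (multiple regression scored by correlation and MAE under a jack-knife test) can be sketched as a leave-one-out loop. This is a minimal illustration under stated assumptions, not the TMH Stab-pred implementation: the data are random stand-ins, the five columns only mirror the number of features named in the abstract, and `jackknife_predictions` is a hypothetical helper.

```python
import numpy as np


def jackknife_predictions(features, targets):
    """Leave-one-out predictions from an ordinary least-squares linear model."""
    n = len(targets)
    design = np.column_stack([np.ones(n), features])  # prepend intercept column
    preds = np.empty(n)
    for i in range(n):
        keep = np.arange(n) != i  # train on every sample except sample i
        coef, *_ = np.linalg.lstsq(design[keep], targets[keep], rcond=None)
        preds[i] = design[i] @ coef
    return preds


# Synthetic stand-in data: 30 proteins, 5 features, made-up coefficients.
rng = np.random.default_rng(0)
X = rng.normal(size=(30, 5))
true_coef = np.array([1.5, -0.8, 0.5, 0.3, -1.0])
y = 2.0 + X @ true_coef + rng.normal(scale=0.5, size=30)

preds = jackknife_predictions(X, y)
mae = float(np.mean(np.abs(preds - y)))
r = float(np.corrcoef(preds, y)[0, 1])
print(f"jack-knife MAE = {mae:.2f}, Pearson r = {r:.2f}")
```

Because each prediction comes from a model that never saw the held-out sample, the resulting MAE and correlation are less optimistic than values computed on the training data itself.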
Affiliation(s)
- P Ramakrishna Reddy
- Department of Biotechnology, Bhupat and Jyoti Mehta School of Biosciences, Indian Institute of Technology Madras, Chennai 600036, Tamil Nadu, India
- A Kulandaisamy
- Department of Biotechnology, Bhupat and Jyoti Mehta School of Biosciences, Indian Institute of Technology Madras, Chennai 600036, Tamil Nadu, India; Basic and Translational Research Division, Department of Cardiology, Boston Children's Hospital, Boston, MA 02115, USA
- M Michael Gromiha
- Department of Biotechnology, Bhupat and Jyoti Mehta School of Biosciences, Indian Institute of Technology Madras, Chennai 600036, Tamil Nadu, India; Department of Computer Science, Tokyo Institute of Technology, Yokohama, Japan; Department of Computer Science, National University of Singapore, Singapore.
7. Selvasingh JA, McDonald EF, Mckinney JR, Meiler J, Ledwitch KV. Dark nanodiscs as a model membrane for evaluating membrane protein thermostability by differential scanning fluorimetry. bioRxiv 2023:2023.05.08.539917. [PMID: 37214798 PMCID: PMC10197605 DOI: 10.1101/2023.05.08.539917] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/24/2023]
Abstract
Measuring protein thermostability provides valuable information on the biophysical rules that govern structure-energy relationships of proteins. However, such measurements remain a challenge for membrane proteins. Here, we introduce a new experimental system to evaluate membrane protein thermostability. This system leverages a recently-developed non-fluorescent membrane scaffold protein (MSP) to reconstitute proteins into nanodiscs and is coupled with a nano-format of differential scanning fluorimetry (nanoDSF). This approach offers a label-free and direct measurement of the intrinsic tryptophan fluorescence of the membrane protein as it unfolds in solution without signal interference from the "dark" nanodisc. In this work, we demonstrate the application of this method using the disulfide bond formation protein B (DsbB) as a test membrane protein. NanoDSF measurements of DsbB reconstituted in dark nanodiscs show a complex biphasic thermal unfolding pattern in the presence of lipids with a minor unfolding transition followed by a major transition. The inflection points of the thermal denaturation curve reveal two distinct unfolding midpoint melting temperatures (Tm) of 70.5 °C and 77.5 °C, consistent with a three-state unfolding model. Further, we show that the catalytically conserved disulfide bond between residues C41 and C130 drives the intermediate state of the unfolding pathway for DsbB in a nanodisc. We introduce this method as a new tool that can be used to understand how compositionally, and biophysically complex lipid environments drive membrane protein stability.
Affiliation(s)
- Jazlyn A. Selvasingh
- Center for Structural Biology, Vanderbilt University, Nashville, TN 37240, USA
- Department of Chemistry, Vanderbilt University, Nashville, TN 37235, USA
- Eli Fritz McDonald
- Center for Structural Biology, Vanderbilt University, Nashville, TN 37240, USA
- Department of Chemistry, Vanderbilt University, Nashville, TN 37235, USA
- Jacob R. Mckinney
- Center for Structural Biology, Vanderbilt University, Nashville, TN 37240, USA
- Department of Chemistry, Vanderbilt University, Nashville, TN 37235, USA
- Jens Meiler
- Center for Structural Biology, Vanderbilt University, Nashville, TN 37240, USA
- Department of Chemistry, Vanderbilt University, Nashville, TN 37235, USA
- Institute of Drug Discovery, Faculty of Medicine, University of Leipzig, 04103 Leipzig, Germany
- Kaitlyn V. Ledwitch
- Center for Structural Biology, Vanderbilt University, Nashville, TN 37240, USA
- Department of Chemistry, Vanderbilt University, Nashville, TN 37235, USA
- Lead contact
8. David A, Sternberg MJE. Protein structure-based evaluation of missense variants: Resources, challenges and future directions. Curr Opin Struct Biol 2023; 80:102600. [PMID: 37126977 DOI: 10.1016/j.sbi.2023.102600] [Citation(s) in RCA: 5] [Impact Index Per Article: 5.0] [Received: 01/30/2023] [Revised: 03/30/2023] [Accepted: 03/31/2023] [Indexed: 05/03/2023]
Abstract
We provide an overview of the methods that can be used for protein structure-based evaluation of missense variants. The algorithms can be broadly divided into those that calculate the difference in free energy (ΔΔG) between the wild type and variant structures and those that use structural features to predict the damaging effect of a variant without providing a ΔΔG. A wide range of machine learning approaches have been employed to develop those algorithms. We also discuss challenges and opportunities for variant interpretation in view of the recent breakthrough in three-dimensional structural modelling using deep learning.
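For the first class of algorithms, the quantity being estimated is a simple difference of free energies. The snippet below shows that arithmetic under one assumed sign convention (variant minus wild type, using folding free energies, so a positive ΔΔG means the variant is less stable). Conventions differ between tools, and both the function name `ddg` and the example energies are hypothetical.

```python
def ddg(dg_wild_type: float, dg_variant: float) -> float:
    """ΔΔG = ΔG(variant) - ΔG(wild type), in the units of the inputs.

    Assumed convention: inputs are folding free energies (more negative
    means more stable), so a positive result indicates destabilization.
    """
    return dg_variant - dg_wild_type


# Made-up folding free energies in kcal/mol, for illustration only.
example = ddg(dg_wild_type=-8.0, dg_variant=-6.5)
print(f"ddG = {example:+.1f} kcal/mol")  # ddG = +1.5 kcal/mol
```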
Affiliation(s)
- Alessia David
- Centre for Integrative Systems Biology and Bioinformatics, Department of Life Sciences, Imperial College London, London, SW7 2AZ, UK.
- Michael J E Sternberg
- Centre for Integrative Systems Biology and Bioinformatics, Department of Life Sciences, Imperial College London, London, SW7 2AZ, UK
9. Yuan Q, Chu Y, Li X, Shi Y, Chen Y, Zhao J, Lu J, Liu K, Guo Y. CAFrgDB: a database for cancer-associated fibroblasts related genes and their functions in cancer. Cancer Gene Ther 2023:10.1038/s41417-023-00603-4. [PMID: 36922546 DOI: 10.1038/s41417-023-00603-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/11/2022] [Revised: 02/03/2023] [Accepted: 02/23/2023] [Indexed: 03/17/2023]
Abstract
As one of the most essential components of the tumor microenvironment (TME), cancer-associated fibroblasts (CAFs) interact extensively with cancer cells and other stromal cells to remodel the TME and participate in the pathogenesis of cancer, which marks them as promising new targets for cancer therapy. Numerous studies have highlighted the heterogeneity and versatility of CAFs in most cancer types. Thus, the identification and appropriate use of CAF-related genes (CAFGenes) in the context of specific cancer types will provide critical insights into disease mechanisms and CAF-related therapeutic targets. In this study, we collected and curated 5421 CAFGenes identified from small- or large-scale experiments, encompassing 4982 responsors that directly or indirectly participate in cancer malignant behaviors managed by CAFs, 1069 secretions that are secreted by CAFs, and 281 regulators that contribute to modulating CAFs in human and mouse, covering 24 cancer types. For the human CAFGenes, we performed gene expression and prognostic marker-based analyses across 24 cancer types using TCGA data. Furthermore, we provided annotations for CAF-associated proteins by integrating knowledge of protein-protein interactions, drug-target relations, and basic annotations from 9 public databases. CAFrgDB (CAF related Gene DataBase) is free for academic research at http://caf.zbiolab.cn, and we anticipate that CAFrgDB can be a useful resource for further study of CAFs.
Affiliation(s)
- Qiang Yuan
- Department of Pathophysiology, State Key Laboratory of Esophageal Cancer Prevention and Treatment, School of Basic Medical Sciences, Zhengzhou University, Zhengzhou, 450001, China
- Yi Chu
- Department of Pathophysiology, State Key Laboratory of Esophageal Cancer Prevention and Treatment, School of Basic Medical Sciences, Zhengzhou University, Zhengzhou, 450001, China
- Xiaoyu Li
- Department of Pathophysiology, State Key Laboratory of Esophageal Cancer Prevention and Treatment, School of Basic Medical Sciences, Zhengzhou University, Zhengzhou, 450001, China
- Yunshu Shi
- Department of Pathophysiology, State Key Laboratory of Esophageal Cancer Prevention and Treatment, School of Basic Medical Sciences, Zhengzhou University, Zhengzhou, 450001, China
- Yingying Chen
- Department of Pathophysiology, State Key Laboratory of Esophageal Cancer Prevention and Treatment, School of Basic Medical Sciences, Zhengzhou University, Zhengzhou, 450001, China
- Jimin Zhao
- Department of Pathophysiology, State Key Laboratory of Esophageal Cancer Prevention and Treatment, School of Basic Medical Sciences, Zhengzhou University, Zhengzhou, 450001, China
- Jing Lu
- Department of Pathophysiology, State Key Laboratory of Esophageal Cancer Prevention and Treatment, School of Basic Medical Sciences, Zhengzhou University, Zhengzhou, 450001, China
- Kangdong Liu
- Department of Pathophysiology, State Key Laboratory of Esophageal Cancer Prevention and Treatment, School of Basic Medical Sciences, Zhengzhou University, Zhengzhou, 450001, China; China-US (Henan) Hormel Cancer Institute, Zhengzhou, Henan, 450001, China.
- Yaping Guo
- Department of Pathophysiology, State Key Laboratory of Esophageal Cancer Prevention and Treatment, School of Basic Medical Sciences, Zhengzhou University, Zhengzhou, 450001, China.
10. Sun J, Kulandaisamy A, Liu J, Hu K, Gromiha MM, Zhang Y. Machine learning in computational modelling of membrane protein sequences and structures: From methodologies to applications. Comput Struct Biotechnol J 2023; 21:1205-1226. [PMID: 36817959 PMCID: PMC9932300 DOI: 10.1016/j.csbj.2023.01.036] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 11/16/2022] [Revised: 01/16/2023] [Accepted: 01/25/2023] [Indexed: 01/29/2023] Open
Abstract
Membrane proteins mediate a wide spectrum of biological processes, such as signal transduction and cell communication. Due to the arduous and costly nature inherent to the experimental process, membrane proteins have long been devoid of well-resolved atomic-level tertiary structures and, consequently, the understanding of their functional roles underlying a multitude of life activities has been hampered. Currently, computational tools dedicated to furthering the structure-function understanding are primarily focused on utilizing intelligent algorithms to address a variety of site-wise prediction problems (e.g., topology and interaction sites), but are scattered across different computing sources. Moreover, the recent advent of deep learning techniques has immensely expedited the development of computational tools for membrane protein-related prediction problems. Given the growing number of applications optimized particularly by manifold deep neural networks, we herein provide a review on the current status of computational strategies mainly in membrane protein type classification, topology identification, interaction site detection, and pathogenic effect prediction. Meanwhile, we provide an overview of how the entire prediction process proceeds, including database collection, data pre-processing, feature extraction, and method selection. This review is expected to be useful for developing more extendable computational tools specific to membrane proteins.
Affiliation(s)
- Jianfeng Sun
- Botnar Research Centre, Nuffield Department of Orthopedics, Rheumatology, and Musculoskeletal Sciences, University of Oxford, Headington, Oxford OX3 7LD, UK
- Arulsamy Kulandaisamy
- Department of Biotechnology, Bhupat and Jyoti Mehta School of BioSciences, Indian Institute of Technology Madras, Chennai 600 036, Tamilnadu, India
- Jacklyn Liu
- UCL Cancer Institute, University College London, 72 Huntley Street, London WC1E 6BT, UK
- Kai Hu
- Key Laboratory of Intelligent Computing and Information Processing of Ministry of Education, Xiangtan University, Xiangtan 411105, China
- M. Michael Gromiha
- Department of Biotechnology, Bhupat and Jyoti Mehta School of BioSciences, Indian Institute of Technology Madras, Chennai 600 036, Tamilnadu, India (corresponding author)
- Yuan Zhang
- Key Laboratory of Intelligent Computing and Information Processing of Ministry of Education, Xiangtan University, Xiangtan 411105, China (corresponding author)
11. BioMThermDB 1.0: Thermophysical Database of Proteins in Solutions. Int J Mol Sci 2022; 23:ijms232315371. [PMID: 36499696 PMCID: PMC9741033 DOI: 10.3390/ijms232315371] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/17/2022] [Revised: 12/01/2022] [Accepted: 12/03/2022] [Indexed: 12/12/2022] Open
Abstract
We present here a freely available web-based database, called BioMThermDB 1.0, of thermophysical and dynamic properties of various proteins and their aqueous solutions. It contains the hydrodynamic radius, electrophoretic mobility, zeta potential, self-diffusion coefficient, solution viscosity, and cloud-point temperature, as well as the conditions for those determinations and details of the experimental method. It can facilitate the meta-analysis and visualization of data, can enable comparisons, and may be useful for comparing theoretical model predictions with experiments.
12. Velecký J, Hamsikova M, Stourac J, Musil M, Damborsky J, Bednar D, Mazurenko S. SoluProtMutDB: a manually curated database of protein solubility changes upon mutations. Comput Struct Biotechnol J 2022; 20:6339-6347. [DOI: 10.1016/j.csbj.2022.11.009] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/24/2022] [Revised: 11/04/2022] [Accepted: 11/04/2022] [Indexed: 11/11/2022] Open
13. MPAD: A Database for Binding Affinity of Membrane Protein–protein Complexes and their Mutants. J Mol Biol 2022:167870. [DOI: 10.1016/j.jmb.2022.167870] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/22/2022] [Revised: 10/20/2022] [Accepted: 10/20/2022] [Indexed: 11/06/2022]
14. Qing R, Hao S, Smorodina E, Jin D, Zalevsky A, Zhang S. Protein Design: From the Aspect of Water Solubility and Stability. Chem Rev 2022; 122:14085-14179. [PMID: 35921495 PMCID: PMC9523718 DOI: 10.1021/acs.chemrev.1c00757] [Citation(s) in RCA: 28] [Impact Index Per Article: 14.0] [Received: 08/31/2021] [Indexed: 12/13/2022]
Abstract
Water solubility and structural stability are key merits for proteins defined by the primary sequence and 3D-conformation. Their manipulation represents important aspects of the protein design field that relies on the accurate placement of amino acids and molecular interactions, guided by underlying physicochemical principles. Emulated designer proteins with well-defined properties both fuel the knowledge-base for more precise computational design models and are used in various biomedical and nanotechnological applications. The continuous developments in protein science, increasing computing power, new algorithms, and characterization techniques provide sophisticated toolkits for solubility design beyond guesswork. In this review, we summarize recent advances in the protein design field with respect to water solubility and structural stability. After introducing fundamental design rules, we discuss the transmembrane protein solubilization and de novo transmembrane protein design. Traditional strategies to enhance protein solubility and structural stability are introduced. The designs of stable protein complexes and high-order assemblies are covered. Computational methodologies behind these endeavors, including structure prediction programs, machine learning algorithms, and specialty software dedicated to the evaluation of protein solubility and aggregation, are discussed. The findings and opportunities for Cryo-EM are presented. This review provides an overview of significant progress and prospects in accurate protein design for solubility and stability.
Affiliation(s)
- Rui Qing
- State Key Laboratory of Microbial Metabolism, School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, Shanghai 200240, China; Media Lab, Massachusetts Institute of Technology, 77 Massachusetts Avenue, Cambridge, Massachusetts 02139, United States; The David H. Koch Institute for Integrative Cancer Research, Massachusetts Institute of Technology, 77 Massachusetts Avenue, Cambridge, Massachusetts 02139, United States
- Shilei Hao
- Media Lab, Massachusetts Institute of Technology, 77 Massachusetts Avenue, Cambridge, Massachusetts 02139, United States; Key Laboratory of Biorheological Science and Technology, Ministry of Education, College of Bioengineering, Chongqing University, Chongqing 400030, China
- Eva Smorodina
- Department of Immunology, University of Oslo and Oslo University Hospital, Oslo 0424, Norway
- David Jin
- Avalon GloboCare Corp., Freehold, New Jersey 07728, United States
- Arthur Zalevsky
- Laboratory of Bioinformatics Approaches in Combinatorial Chemistry and Biology, Shemyakin-Ovchinnikov Institute of Bioorganic Chemistry RAS, Moscow 117997, Russia
- Shuguang Zhang
- Media Lab, Massachusetts Institute of Technology, 77 Massachusetts Avenue, Cambridge, Massachusetts 02139, United States
15
Vasina M, Velecký J, Planas-Iglesias J, Marques SM, Skarupova J, Damborsky J, Bednar D, Mazurenko S, Prokop Z. Tools for computational design and high-throughput screening of therapeutic enzymes. Adv Drug Deliv Rev 2022; 183:114143. [PMID: 35167900 DOI: 10.1016/j.addr.2022.114143] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 12/26/2021] [Revised: 02/04/2022] [Accepted: 02/09/2022] [Indexed: 12/16/2022]
Abstract
Therapeutic enzymes are valuable biopharmaceuticals in various biomedical applications. They have been successfully applied for fibrinolysis, cancer treatment, enzyme replacement therapies, and the treatment of rare diseases. Still, there is a permanent demand to find new or better therapeutic enzymes, which would be sufficiently soluble, stable, and active to meet specific medical needs. Here, we highlight the benefits of coupling computational approaches with high-throughput experimental technologies, which significantly accelerate the identification and engineering of catalytic therapeutic agents. New enzymes can be identified in genomic and metagenomic databases, which grow thanks to next-generation sequencing technologies exponentially. Computational design and machine learning methods are being developed to improve catalytically potent enzymes and predict their properties to guide the selection of target enzymes. High-throughput experimental pipelines, increasingly relying on microfluidics, ensure functional screening and biochemical characterization of target enzymes to reach efficient therapeutic enzymes.
Affiliation(s)
- Michal Vasina
- Loschmidt Laboratories, Department of Experimental Biology, Faculty of Science, Masaryk University, Kotlarska 2, Brno, Czech Republic; Loschmidt Laboratories, RECETOX, Faculty of Science, Masaryk University, Kotlarska 2, Brno, Czech Republic; International Clinical Research Centre, St. Anne's University Hospital, Pekarska 53, Brno, Czech Republic
- Jan Velecký
- Loschmidt Laboratories, Department of Experimental Biology, Faculty of Science, Masaryk University, Kotlarska 2, Brno, Czech Republic; Loschmidt Laboratories, RECETOX, Faculty of Science, Masaryk University, Kotlarska 2, Brno, Czech Republic
- Joan Planas-Iglesias
- Loschmidt Laboratories, Department of Experimental Biology, Faculty of Science, Masaryk University, Kotlarska 2, Brno, Czech Republic; Loschmidt Laboratories, RECETOX, Faculty of Science, Masaryk University, Kotlarska 2, Brno, Czech Republic; International Clinical Research Centre, St. Anne's University Hospital, Pekarska 53, Brno, Czech Republic
- Sergio M Marques
- Loschmidt Laboratories, Department of Experimental Biology, Faculty of Science, Masaryk University, Kotlarska 2, Brno, Czech Republic; Loschmidt Laboratories, RECETOX, Faculty of Science, Masaryk University, Kotlarska 2, Brno, Czech Republic; International Clinical Research Centre, St. Anne's University Hospital, Pekarska 53, Brno, Czech Republic
- Jana Skarupova
- Loschmidt Laboratories, Department of Experimental Biology, Faculty of Science, Masaryk University, Kotlarska 2, Brno, Czech Republic; Loschmidt Laboratories, RECETOX, Faculty of Science, Masaryk University, Kotlarska 2, Brno, Czech Republic
- Jiri Damborsky
- Loschmidt Laboratories, Department of Experimental Biology, Faculty of Science, Masaryk University, Kotlarska 2, Brno, Czech Republic; Loschmidt Laboratories, RECETOX, Faculty of Science, Masaryk University, Kotlarska 2, Brno, Czech Republic; International Clinical Research Centre, St. Anne's University Hospital, Pekarska 53, Brno, Czech Republic; Enantis, INBIT, Kamenice 34, Brno, Czech Republic
- David Bednar
- Loschmidt Laboratories, Department of Experimental Biology, Faculty of Science, Masaryk University, Kotlarska 2, Brno, Czech Republic; Loschmidt Laboratories, RECETOX, Faculty of Science, Masaryk University, Kotlarska 2, Brno, Czech Republic; International Clinical Research Centre, St. Anne's University Hospital, Pekarska 53, Brno, Czech Republic
- Stanislav Mazurenko
- Loschmidt Laboratories, Department of Experimental Biology, Faculty of Science, Masaryk University, Kotlarska 2, Brno, Czech Republic; Loschmidt Laboratories, RECETOX, Faculty of Science, Masaryk University, Kotlarska 2, Brno, Czech Republic; International Clinical Research Centre, St. Anne's University Hospital, Pekarska 53, Brno, Czech Republic
- Zbynek Prokop
- Loschmidt Laboratories, Department of Experimental Biology, Faculty of Science, Masaryk University, Kotlarska 2, Brno, Czech Republic; Loschmidt Laboratories, RECETOX, Faculty of Science, Masaryk University, Kotlarska 2, Brno, Czech Republic; International Clinical Research Centre, St. Anne's University Hospital, Pekarska 53, Brno, Czech Republic
16
Kulandaisamy A, Nikam R, Harini K, Sharma D, Gromiha MM. Illustrative Tutorials for ProThermDB: Thermodynamic Database for Proteins and Mutants. Curr Protoc 2021; 1:e306. [PMID: 34826364 DOI: 10.1002/cpz1.306] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/07/2022]
Abstract
ProThermDB (https://web.iitm.ac.in/bioinfo2/prothermdb/index.html) is a primary resource for protein stability, which contains experimentally determined thermodynamic data for proteins and their mutants. The most recent version of ProThermDB accumulates the data obtained from both high- and low-throughput experimental biophysical methods. It includes comprehensive information at four different levels, i.e.: (i) protein sequence and structure; (ii) experimental conditions; (iii) thermodynamic parameters such as Gibbs free energy, melting temperature, enthalpy, etc.; and (iv) literature. In the following protocols, we present detailed tutorials for retrieving data using different search, display and sorting options, interpretation of search results, description of each entry-level information category, data upload and download, cross-links with other databases, and visualization options. This protocol consists of six pictorial exercises, which are useful for biologists/users to understand the contents and organization of data in ProThermDB. Further, potential applications of ProThermDB in protein engineering are discussed. © 2021 Wiley Periodicals LLC.
Basic Protocol 1: Retrieval of experimental thermodynamic data for wild-type and mutants of a specific protein using a simple query
Basic Protocol 2: Retrieval of stabilizing point mutations, which are located at the interior of α-helical regions, and obtaining data by thermal denaturation methods
Basic Protocol 3: Retrieval of destabilizing point mutations, which are in β-sheets of exposed regions, and obtaining data by chemical denaturation methods (urea and GdnHCl)
Basic Protocol 4: Retrieval of stabilizing and destabilizing point mutations in a range of physiological conditions (pH: 6-9 and T: 20°C-25°C) and publication years (2010-2020)
Support Protocol: Downloading the entire data of the database for academic research purposes and submission of new data in ProThermDB.
Affiliation(s)
- A Kulandaisamy
- Department of Biotechnology, Bhupat and Jyoti Mehta School of BioSciences, Indian Institute of Technology Madras, Chennai, Tamil Nadu, India
- Rahul Nikam
- Department of Biotechnology, Bhupat and Jyoti Mehta School of BioSciences, Indian Institute of Technology Madras, Chennai, Tamil Nadu, India
- K Harini
- Department of Biotechnology, Bhupat and Jyoti Mehta School of BioSciences, Indian Institute of Technology Madras, Chennai, Tamil Nadu, India
- Divya Sharma
- Department of Biotechnology, Bhupat and Jyoti Mehta School of BioSciences, Indian Institute of Technology Madras, Chennai, Tamil Nadu, India
- M Michael Gromiha
- Department of Biotechnology, Bhupat and Jyoti Mehta School of BioSciences, Indian Institute of Technology Madras, Chennai, Tamil Nadu, India
17
Nikam R, Kulandaisamy A, Harini K, Sharma D, Gromiha MM. ProThermDB: thermodynamic database for proteins and mutants revisited after 15 years. Nucleic Acids Res 2021; 49:D420-D424. [PMID: 33196841 PMCID: PMC7778892 DOI: 10.1093/nar/gkaa1035] [Citation(s) in RCA: 85] [Impact Index Per Article: 28.3] [Received: 09/14/2020] [Revised: 10/14/2020] [Accepted: 10/26/2020] [Indexed: 11/12/2022] Open
Abstract
ProThermDB is an updated version of the thermodynamic database for proteins and mutants (ProTherm), which has ∼31 500 data on protein stability, an increase of 84% from the previous version. It contains several thermodynamic parameters such as melting temperature, free energy obtained with thermal and denaturant denaturation, enthalpy change and heat capacity change along with experimental methods and conditions, sequence, structure and literature information. Besides, the current version of the database includes about 120 000 thermodynamic data obtained for different organisms and cell lines, which are determined by recent high throughput proteomics techniques using whole-cell approaches. In addition, we provided a graphical interface for visualization of mutations at sequence and structure levels. ProThermDB is cross-linked with other relevant databases, PDB, UniProt, PubMed etc. It is freely available at https://web.iitm.ac.in/bioinfo2/prothermdb/index.html without any login requirements. It is implemented in Python, HTML and JavaScript, and supports the latest versions of major browsers, such as Firefox, Chrome and Safari.
Affiliation(s)
- Rahul Nikam
- Department of Biotechnology, Bhupat and Jyoti Mehta School of BioSciences, Indian Institute of Technology Madras, Chennai 600 036, Tamilnadu, India
- A Kulandaisamy
- Department of Biotechnology, Bhupat and Jyoti Mehta School of BioSciences, Indian Institute of Technology Madras, Chennai 600 036, Tamilnadu, India
- K Harini
- Department of Biotechnology, Bhupat and Jyoti Mehta School of BioSciences, Indian Institute of Technology Madras, Chennai 600 036, Tamilnadu, India
- Divya Sharma
- Department of Biotechnology, Bhupat and Jyoti Mehta School of BioSciences, Indian Institute of Technology Madras, Chennai 600 036, Tamilnadu, India
- M Michael Gromiha
- Department of Biotechnology, Bhupat and Jyoti Mehta School of BioSciences, Indian Institute of Technology Madras, Chennai 600 036, Tamilnadu, India