1. Xu R, Pan Q, Zhu G, Ye Y, Xin M, Wang Z, Wang S, Li W, Wei Y, Guo J, Zheng L. ThermoLink: Bridging disulfide bonds and enzyme thermostability through database construction and machine learning prediction. Protein Sci 2024; 33:e5097. [PMID: 39145402 PMCID: PMC11325166 DOI: 10.1002/pro.5097] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/20/2024] [Revised: 05/27/2024] [Accepted: 06/15/2024] [Indexed: 08/16/2024]
Abstract
Disulfide bonds, covalently formed by sulfur atoms in cysteine residues, play a crucial role in protein folding and structural stability. Given their significance, artificial disulfide bonds are often introduced to enhance protein thermostability. Although an increasing number of tools can assist with this task, significant amounts of time and resources are often wasted owing to inadequate consideration. To enhance the accuracy and efficiency of designing disulfide bonds for protein thermostability improvement, we first collected disulfide bond and protein thermostability data from extensive literature sources. We then extracted various sequence- and structure-based features and constructed machine-learning models to predict whether disulfide bonds can improve protein thermostability. Among all models, the neighborhood context model based on the AdaBoost-DT algorithm performed the best, yielding an area under the receiver operating characteristic curve of 0.773 and an accuracy of 0.714. We also found that AlphaFold2 performs very well in predicting disulfide bonds and that, to some extent, the coevolutionary relationship between residue pairs can potentially guide artificial disulfide bond design. Moreover, several mutants of imine reductase 89 (IR89) with artificially designed thermostable disulfide bonds were experimentally shown to be considerably efficient for substrate catalysis. The SS-bond data have been integrated into an online server, ThermoLink, available at guolab.mpu.edu.mo/thermoLink.
Affiliation(s)
- Ran Xu
  - Centre in Artificial Intelligence Driven Drug Discovery, Faculty of Applied Sciences, Macao Polytechnic University, Macao, China
- Qican Pan
  - Zelixir Biotech Company Ltd, Shanghai, China
- Yilin Ye
  - Centre in Artificial Intelligence Driven Drug Discovery, Faculty of Applied Sciences, Macao Polytechnic University, Macao, China
- Minghui Xin
  - School of Physics, Shandong University, Jinan, China
- Zechen Wang
  - School of Physics, Shandong University, Jinan, China
- Sheng Wang
  - Zelixir Biotech Company Ltd, Shanghai, China
- Weifeng Li
  - School of Physics, Shandong University, Jinan, China
- Yanjie Wei
  - Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Shenzhen, China
- Jingjing Guo
  - Centre in Artificial Intelligence Driven Drug Discovery, Faculty of Applied Sciences, Macao Polytechnic University, Macao, China
- Liangzhen Zheng
  - Zelixir Biotech Company Ltd, Shanghai, China
  - Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Shenzhen, China
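
The ThermoLink abstract above reports two evaluation metrics for its best classifier: area under the receiver operating characteristic curve (0.773) and accuracy (0.714). As a rough illustration of what those two numbers measure, the sketch below computes both from a list of true labels and predicted scores. It is not the authors' code and does not reproduce their AdaBoost-DT model; the function names, the 0.5 decision threshold, and the toy labels and scores are all assumptions made up for this example.

```python
from typing import Sequence


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Area under the ROC curve, computed as the fraction of
    (positive, negative) pairs in which the positive example receives
    the higher score. Tied pairs count as half."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


def accuracy(labels: Sequence[int], scores: Sequence[float],
             threshold: float = 0.5) -> float:
    """Fraction of examples classified correctly when a score at or
    above `threshold` is read as the positive class (assumed cutoff)."""
    predictions = [1 if s >= threshold else 0 for s in scores]
    correct = sum(1 for y, p in zip(labels, predictions) if y == p)
    return correct / len(labels)


# Hypothetical toy data (not from the paper): 1 = the disulfide bond
# improved thermostability, 0 = it did not.
labels = [1, 1, 1, 0, 0, 0, 1, 0]
scores = [0.9, 0.8, 0.4, 0.6, 0.3, 0.2, 0.7, 0.5]

print(f"AUC = {roc_auc(labels, scores):.3f}")
print(f"accuracy = {accuracy(labels, scores):.3f}")
```

The pairwise definition used here is quadratic in the number of examples, which is fine for a small illustration; a rank-based or library implementation would be the usual choice for real datasets. Note that accuracy depends on the chosen threshold while the area under the curve does not, which is one reason the two are commonly reported together.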
2. Wang Z, Zhou F, Wang Z, Hu Q, Li YQ, Wang S, Wei Y, Zheng L, Li W, Peng X. Fully Flexible Molecular Alignment Enables Accurate Ligand Structure Modeling. J Chem Inf Model 2024; 64:6205-6215. [PMID: 39074901 DOI: 10.1021/acs.jcim.4c00669] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 07/31/2024]
Abstract
Accurate protein-ligand binding poses are the prerequisites of structure-based binding affinity prediction and provide the structural basis for in-depth lead optimization in small molecule drug design. However, it is challenging to provide reasonable predictions of binding poses for different molecules due to the complexity and diversity of the chemical space of small molecules. Similarity-based molecular alignment techniques can effectively narrow the search range, as structurally similar molecules are likely to have similar binding modes, with higher similarity usually correlated with higher success rates. However, molecular similarity is not consistently high because molecules often require changes to achieve specific purposes, leading to reduced alignment precision. To address this issue, we propose a new alignment method, Z-align. This method uses topological structural information as a criterion for evaluating similarity, reducing the reliance on molecular fingerprint similarity. Our method has achieved success rates significantly higher than those of other methods at moderate levels of similarity. Additionally, our approach can comprehensively and flexibly optimize bond lengths and angles of molecules, maintaining a high accuracy even when dealing with larger molecules. Consequently, our proposed solution helps in achieving more accurate binding poses in protein-ligand docking problems, facilitating the development of small molecule drugs. Z-align is freely available as a web server at https://cloud.zelixir.com/zalign/home.
Affiliation(s)
- Zhihao Wang
  - School of Physics, Shandong University, Jinan, 250100, China
- Fan Zhou
  - Shanghai Zelixir Biotech, Shanghai, 200030, China
- Zechen Wang
  - School of Physics, Shandong University, Jinan, 250100, China
- Qiuyue Hu
  - Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Shenzhen, 518055, China
- Yong-Qiang Li
  - School of Physics, Shandong University, Jinan, 250100, China
- Sheng Wang
  - Shanghai Zelixir Biotech, Shanghai, 200030, China
- Yanjie Wei
  - Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Shenzhen, 518055, China
- Liangzhen Zheng
  - Shanghai Zelixir Biotech, Shanghai, 200030, China
  - Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Shenzhen, 518055, China
- Weifeng Li
  - School of Physics, Shandong University, Jinan, 250100, China
- Xiangda Peng
  - Shanghai Zelixir Biotech, Shanghai, 200030, China
3. Wuyun Q, Chen Y, Shen Y, Cao Y, Hu G, Cui W, Gao J, Zheng W. Recent Progress of Protein Tertiary Structure Prediction. Molecules 2024; 29:832. [PMID: 38398585 PMCID: PMC10893003 DOI: 10.3390/molecules29040832] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/30/2023] [Revised: 02/06/2024] [Accepted: 02/08/2024] [Indexed: 02/25/2024]
Abstract
The prediction of three-dimensional (3D) protein structure from amino acid sequences has stood as a significant challenge in computational and structural bioinformatics for decades. Recently, the widespread integration of artificial intelligence (AI) algorithms has substantially expedited advancements in protein structure prediction, yielding numerous significant milestones. In particular, the end-to-end deep learning method AlphaFold2 has raised structure prediction performance to new heights, regularly competitive with experimental structures in the 14th Critical Assessment of Protein Structure Prediction (CASP14). To provide a comprehensive understanding and guide future research in protein structure prediction, this review describes various methodologies, assessments, and databases in the field, including traditionally used protein structure prediction methods, such as template-based modeling (TBM) and template-free modeling (FM) approaches; recently developed deep learning-based methods, such as contact/distance-guided methods, end-to-end folding methods, and protein language model (PLM)-based methods; multi-domain protein structure prediction methods; the CASP experiments and related assessments; and the recently released AlphaFold Protein Structure Database (AlphaFold DB). We discuss their advantages, disadvantages, and application scopes, aiming to help researchers understand the limitations, contexts, and effective selection of protein structure prediction methods in protein-related fields.
Affiliation(s)
- Qiqige Wuyun
  - Department of Computer Science and Engineering, Michigan State University, East Lansing, MI 48824, USA
- Yihan Chen
  - School of Mathematical Sciences and LPMC, Nankai University, Tianjin 300071, China
- Yifeng Shen
  - Faculty of Environment and Information Studies, Keio University, Fujisawa 252-0882, Kanagawa, Japan
- Yang Cao
  - College of Life Sciences, Sichuan University, Chengdu 610065, China
- Gang Hu
  - NITFID, School of Statistics and Data Science, LPMC and KLMDASR, Nankai University, Tianjin 300071, China
- Wei Cui
  - School of Mathematical Sciences and LPMC, Nankai University, Tianjin 300071, China
- Jianzhao Gao
  - School of Mathematical Sciences and LPMC, Nankai University, Tianjin 300071, China
- Wei Zheng
  - Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI 48109, USA