1. Dahlström KM, Salminen TA. Apprehensions and emerging solutions in ML-based protein structure prediction. Curr Opin Struct Biol 2024; 86:102819. [PMID: 38631107 DOI: 10.1016/j.sbi.2024.102819] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/19/2024] [Revised: 03/05/2024] [Accepted: 03/31/2024] [Indexed: 04/19/2024]
Abstract
The three-dimensional structure of proteins determines their function in vital biological processes. Thus, when the structure is known, the molecular mechanism of protein function can be understood in more detail and the information obtained can be used in biotechnological, diagnostic, and therapeutic applications. Over the past five years, machine learning (ML)-based modeling has pushed protein structure prediction to the next level, with AlphaFold at the forefront, predicting the structures of hundreds of millions of proteins. Recent advances include promising ML-based approaches that address the remaining challenges by incorporating functionally important metals, co-factors, post-translational modifications, structural dynamics, and interdomain and multimer interactions into the structure prediction process.
Affiliation(s)
- Käthe M Dahlström
  - Structural Bioinformatics Laboratory, Biochemistry, Faculty of Science and Engineering, Åbo Akademi University, Tykistökatu 6A, 20520 Turku, Finland; InFLAMES Research Flagship Center, Åbo Akademi University, 20520 Turku, Finland
- Tiina A Salminen
  - Structural Bioinformatics Laboratory, Biochemistry, Faculty of Science and Engineering, Åbo Akademi University, Tykistökatu 6A, 20520 Turku, Finland; InFLAMES Research Flagship Center, Åbo Akademi University, 20520 Turku, Finland

2. Zheng L, Shi S, Sun X, Lu M, Liao Y, Zhu S, Zhang H, Pan Z, Fang P, Zeng Z, Li H, Li Z, Xue W, Zhu F. MoDAFold: a strategy for predicting the structure of missense mutant protein based on AlphaFold2 and molecular dynamics. Brief Bioinform 2024; 25:bbae006. [PMID: 38305456 PMCID: PMC10835750 DOI: 10.1093/bib/bbae006] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/22/2023] [Revised: 12/26/2023] [Accepted: 01/01/2024] [Indexed: 02/03/2024] [Open Access]
Abstract
Protein structure prediction is a longstanding issue crucial for identifying new drug targets and providing a mechanistic understanding of protein functions. To enhance the progress in this field, a spectrum of computational methodologies has been cultivated. AlphaFold2 has exhibited exceptional precision in predicting wild-type protein structures, with performance exceeding that of other methods. However, predicting the structures of missense mutant proteins using AlphaFold2 remains challenging due to the intricate and substantial structural alterations caused by minor sequence variations in the mutant proteins. Molecular dynamics (MD) has been validated for precisely capturing changes in amino acid interactions attributed to protein mutations. Therefore, for the first time, a strategy entitled 'MoDAFold' was proposed to improve the accuracy and reliability of missense mutant protein structure prediction by combining AlphaFold2 with MD. Multiple case studies have confirmed the superior performance of MoDAFold compared to other methods, particularly AlphaFold2.
Affiliation(s)
- Lingyan Zheng
  - College of Pharmaceutical Sciences, The Second Affiliated Hospital, Zhejiang University School of Medicine, Zhejiang University, Hangzhou 310058, China
  - Industry Solutions Research and Development, Alibaba Cloud Computing, Hangzhou 330110, China
- Shuiyang Shi
  - College of Pharmaceutical Sciences, The Second Affiliated Hospital, Zhejiang University School of Medicine, Zhejiang University, Hangzhou 310058, China
- Xiuna Sun
  - College of Pharmaceutical Sciences, The Second Affiliated Hospital, Zhejiang University School of Medicine, Zhejiang University, Hangzhou 310058, China
  - Industry Solutions Research and Development, Alibaba Cloud Computing, Hangzhou 330110, China
- Mingkun Lu
  - College of Pharmaceutical Sciences, The Second Affiliated Hospital, Zhejiang University School of Medicine, Zhejiang University, Hangzhou 310058, China
  - Industry Solutions Research and Development, Alibaba Cloud Computing, Hangzhou 330110, China
- Yang Liao
  - College of Pharmaceutical Sciences, The Second Affiliated Hospital, Zhejiang University School of Medicine, Zhejiang University, Hangzhou 310058, China
- Sisi Zhu
  - Key Laboratory of Elemene Class Anti-cancer Chinese Medicines, School of Pharmacy, Hangzhou Normal University, Hangzhou 311121, China
- Hongning Zhang
  - College of Pharmaceutical Sciences, The Second Affiliated Hospital, Zhejiang University School of Medicine, Zhejiang University, Hangzhou 310058, China
- Ziqi Pan
  - College of Pharmaceutical Sciences, The Second Affiliated Hospital, Zhejiang University School of Medicine, Zhejiang University, Hangzhou 310058, China
- Pan Fang
  - Industry Solutions Research and Development, Alibaba Cloud Computing, Hangzhou 330110, China
  - Innovation Institute for Artificial Intelligence in Medicine of Zhejiang University, Alibaba-Zhejiang University Joint Research Center of Future Digital Healthcare, Hangzhou 330110, China
- Zhenyu Zeng
  - Industry Solutions Research and Development, Alibaba Cloud Computing, Hangzhou 330110, China
  - Innovation Institute for Artificial Intelligence in Medicine of Zhejiang University, Alibaba-Zhejiang University Joint Research Center of Future Digital Healthcare, Hangzhou 330110, China
- Honglin Li
  - School of Pharmacy, East China University of Science and Technology, Shanghai 200237, China
- Zhaorong Li
  - Industry Solutions Research and Development, Alibaba Cloud Computing, Hangzhou 330110, China
  - Innovation Institute for Artificial Intelligence in Medicine of Zhejiang University, Alibaba-Zhejiang University Joint Research Center of Future Digital Healthcare, Hangzhou 330110, China
- Weiwei Xue
  - School of Pharmaceutical Sciences, Chongqing University, Chongqing 401331, China
- Feng Zhu
  - College of Pharmaceutical Sciences, The Second Affiliated Hospital, Zhejiang University School of Medicine, Zhejiang University, Hangzhou 310058, China
  - Industry Solutions Research and Development, Alibaba Cloud Computing, Hangzhou 330110, China
  - Innovation Institute for Artificial Intelligence in Medicine of Zhejiang University, Alibaba-Zhejiang University Joint Research Center of Future Digital Healthcare, Hangzhou 330110, China

3. Feltes BC, Pinto ÉSM, Mangini AT, Dorn M. NIAS-Server 2.0: A versatile complementary tool for structural biology studies. J Comput Chem 2023; 44:1610-1623. [PMID: 37040476 DOI: 10.1002/jcc.27112] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/13/2023] [Revised: 03/21/2023] [Accepted: 03/24/2023] [Indexed: 04/13/2023]
Abstract
Increasing the repertoire of available complementary tools to advance the knowledge of protein structures is fundamental for structural biology. The Neighbors Influence of Amino Acids and Secondary Structures (NIAS) is a server that analyzes the conformational preferences of a protein's amino acids. NIAS is based on the Angle Probability List, representing the normalized frequency of empirical conformational preferences, such as torsion angles, of different amino acid pairs and their corresponding secondary structure information, as available in the Protein Data Bank. In this work, we announce the updated NIAS server with the data comprising all structures deposited until Sep 2022, 7 years after the initial release. Unlike the original publication, which accounted only for studies conducted with X-ray crystallography, we added data from solid-state nuclear magnetic resonance (NMR), solution NMR, CullPDB, Electron Microscopy, and Electron Crystallography using multiple filtering parameters. We also provide examples of how NIAS can be applied as a complementary analysis tool for different structural biology works and describe its limitations.
Affiliation(s)
- Bruno César Feltes
  - Institute of Informatics, Federal University of Rio Grande do Sul, Porto Alegre, RS, Brazil
- Márcio Dorn
  - Institute of Informatics, Federal University of Rio Grande do Sul, Porto Alegre, RS, Brazil
  - Center for Biotechnology, Federal University of Rio Grande do Sul, Porto Alegre, RS, Brazil
  - National Institute of Forensic Science and Technology, Federal University of Rio Grande do Sul, Porto Alegre, RS, Brazil

4. Yang Z, Zeng X, Zhao Y, Chen R. AlphaFold2 and its applications in the fields of biology and medicine. Signal Transduct Target Ther 2023; 8:115. [PMID: 36918529 PMCID: PMC10011802 DOI: 10.1038/s41392-023-01381-z] [Citation(s) in RCA: 78] [Impact Index Per Article: 78.0] [Received: 10/29/2022] [Revised: 12/27/2022] [Accepted: 02/16/2023] [Indexed: 03/16/2023] [Open Access]
Abstract
AlphaFold2 (AF2) is an artificial intelligence (AI) system developed by DeepMind that can predict three-dimensional (3D) structures of proteins from amino acid sequences with atomic-level accuracy. Protein structure prediction is one of the most challenging problems in computational biology and chemistry, and has puzzled scientists for 50 years. The advent of AF2 represents unprecedented progress in protein structure prediction and has attracted much attention. The subsequent release of structures of more than 200 million proteins predicted by AF2 further aroused great enthusiasm in the science community, especially in the fields of biology and medicine. AF2 is thought to have a significant impact on structural biology and research areas that need protein structure information, such as drug discovery, protein design, and prediction of protein function. Although AF2 was developed only recently, there are already quite a few application studies of AF2 in the fields of biology and medicine, many of which have provided preliminary evidence of its potential. To better understand AF2 and promote its applications, in this article we summarize the principle and system architecture of AF2 as well as the recipe for its success, and particularly focus on reviewing its applications in the fields of biology and medicine. Limitations of current AF2 prediction are also discussed.
Affiliation(s)
- Zhenyu Yang
  - West China Biomedical Big Data Center, West China Hospital, Sichuan University, Chengdu, 610041, China
- Xiaoxi Zeng
  - West China Biomedical Big Data Center, West China Hospital, Sichuan University, Chengdu, 610041, China
- Yi Zhao
  - West China Biomedical Big Data Center, West China Hospital, Sichuan University, Chengdu, 610041, China
  - Key Laboratory of Intelligent Information Processing, Advanced Computer Research Center, Institute of Computing Technology, Chinese Academy of Sciences, Beijing, 100190, China
- Runsheng Chen
  - West China Biomedical Big Data Center, West China Hospital, Sichuan University, Chengdu, 610041, China
  - Key Laboratory of RNA Biology, Center for Big Data Research in Health, Institute of Biophysics, Chinese Academy of Sciences, Beijing, 100101, China
  - Pingshan Translational Medicine Center, Shenzhen Bay Laboratory, Shenzhen, 518118, China

5. Wang L, Song Y, Wang H, Zhang X, Wang M, He J, Li S, Zhang L, Li K, Cao L. Advances of Artificial Intelligence in Anti-Cancer Drug Design: A Review of the Past Decade. Pharmaceuticals (Basel) 2023; 16:253. [PMID: 37259400 PMCID: PMC9963982 DOI: 10.3390/ph16020253] [Citation(s) in RCA: 12] [Impact Index Per Article: 12.0] [Received: 12/29/2022] [Revised: 01/25/2023] [Accepted: 02/06/2023] [Indexed: 10/13/2023] [Open Access]
Abstract
Anti-cancer drug design has been acknowledged as a complicated, expensive, time-consuming, and challenging task. How to reduce the research costs and speed up the development process of anti-cancer drug designs has become a challenging and urgent question for the pharmaceutical industry. Computer-aided drug design methods have played a major role in the development of cancer treatments for over three decades. Recently, artificial intelligence has emerged as a powerful and promising technology for faster, cheaper, and more effective anti-cancer drug designs. This study is a narrative review that reviews a wide range of applications of artificial intelligence-based methods in anti-cancer drug design. We further clarify the fundamental principles of these methods, along with their advantages and disadvantages. Furthermore, we collate a large number of databases, including the omics database, the epigenomics database, the chemical compound database, and drug databases. Other researchers can consider them and adapt them to their own requirements.
Affiliation(s)
- Kang Li
  - Department of Biostatistics, School of Public Health, Harbin Medical University, Harbin 150081, China
- Lei Cao
  - Department of Biostatistics, School of Public Health, Harbin Medical University, Harbin 150081, China

6. Verschoor JA, Kusumawardhani H, Ram AFJ, de Winde JH. Toward Microbial Recycling and Upcycling of Plastics: Prospects and Challenges. Front Microbiol 2022; 13:821629. [PMID: 35401461 PMCID: PMC8985596 DOI: 10.3389/fmicb.2022.821629] [Citation(s) in RCA: 12] [Impact Index Per Article: 6.0] [Received: 11/24/2021] [Accepted: 02/15/2022] [Indexed: 12/12/2022] [Open Access]
Abstract
Annually, 400 Mt of plastics are produced of which roughly 40% is discarded within a year. Current plastic waste management approaches focus on applying physical, thermal, and chemical treatments of plastic polymers. However, these methods have severe limitations leading to the loss of valuable materials and resources. Another major drawback is the rapid accumulation of plastics into the environment causing one of the biggest environmental threats of the twenty-first century. Therefore, to complement current plastic management approaches novel routes toward plastic degradation and upcycling need to be developed. Enzymatic degradation and conversion of plastics present a promising approach toward sustainable recycling of plastics and plastics building blocks. However, the quest for novel enzymes that efficiently operate in cost-effective, large-scale plastics degradation poses many challenges. To date, a wide range of experimental set-ups has been reported, in many cases lacking a detailed investigation of microbial species exhibiting plastics degrading properties as well as of their corresponding plastics degrading enzymes. The apparent lack of consistent approaches compromises the necessary discovery of a wide range of novel enzymes. In this review, we discuss prospects and possibilities for efficient enzymatic degradation, recycling, and upcycling of plastics, in correlation with their wide diversity and broad utilization. Current methods for the identification and optimization of plastics degrading enzymes are compared and discussed. We present a framework for a standardized workflow, allowing transparent discovery and optimization of novel enzymes for efficient and sustainable plastics degradation in the future.
Affiliation(s)
- Jo-Anne Verschoor
  - Molecular Biotechnology, Institute of Biology, Leiden University, Leiden, Netherlands
- Arthur F. J. Ram
  - Molecular Biotechnology, Institute of Biology, Leiden University, Leiden, Netherlands
- Johannes H. de Winde
  - Molecular Biotechnology, Institute of Biology, Leiden University, Leiden, Netherlands

7. Wang X, Li F, Qiu W, Xu B, Li Y, Lian X, Yu H, Zhang Z, Wang J, Li Z, Xue W, Zhu F. SYNBIP: synthetic binding proteins for research, diagnosis and therapy. Nucleic Acids Res 2021; 50:D560-D570. [PMID: 34664670 PMCID: PMC8728148 DOI: 10.1093/nar/gkab926] [Citation(s) in RCA: 49] [Impact Index Per Article: 16.3] [Received: 08/10/2021] [Revised: 09/13/2021] [Accepted: 10/14/2021] [Indexed: 12/11/2022] [Open Access]
Abstract
The success of protein engineering and design has extensively expanded the protein space, which presents a promising strategy for creating next-generation proteins of diverse functions. Among these proteins, the synthetic binding proteins (SBPs) are smaller, more stable, less immunogenic, and better at tissue penetration than others, which has made SBP-related data the subject of extensive interest from scientists worldwide. However, no database has yet been developed to systematically provide this valuable information on SBPs. In this study, a database named ‘Synthetic Binding Proteins for Research, Diagnosis, and Therapy (SYNBIP)’ was thus introduced. This database is unique in (a) comprehensively describing thousands of SBPs from the perspectives of scaffolds, biophysical & functional properties, etc.; (b) panoramically illustrating the binding targets & the broad application of each SBP; and (c) enabling a similarity search against the sequences of all SBPs and their binding targets. Since an SBP is a human-made protein that has not been found in nature, the discovery of novel SBPs has relied heavily on experimental protein engineering and could be greatly facilitated by in-silico studies (such as AI and computational modeling). Thus, the data provided in SYNBIP could lay a solid foundation for the future development of novel SBPs. SYNBIP is accessible without login requirement at both official (https://idrblab.org/synbip/) and mirror (http://synbip.idrblab.net/) sites.
Affiliation(s)
- Xiaona Wang
  - School of Pharmaceutical Sciences, Chongqing University, Chongqing 401331, China
- Fengcheng Li
  - College of Pharmaceutical Sciences, Zhejiang University, Hangzhou, Zhejiang 310058, China
- Wenqi Qiu
  - Department of Surgery, HKU-SZH & Faculty of Medicine, The University of Hong Kong, Hong Kong, China
- Binbin Xu
  - School of Pharmaceutical Sciences, Chongqing University, Chongqing 401331, China
- Yanlin Li
  - School of Pharmaceutical Sciences, Chongqing University, Chongqing 401331, China
- Xichen Lian
  - College of Pharmaceutical Sciences, Zhejiang University, Hangzhou, Zhejiang 310058, China
- Hongyan Yu
  - School of Pharmaceutical Sciences, Chongqing University, Chongqing 401331, China
- Zhao Zhang
  - School of Pharmaceutical Sciences, Chongqing University, Chongqing 401331, China
- Jianxin Wang
  - School of Computer Science and Engineering, Central South University, Changsha 410083, China
- Zhaorong Li
  - Innovation Institute for Artificial Intelligence in Medicine of Zhejiang University, Alibaba-Zhejiang University Joint Research Center of Future Digital Healthcare, Hangzhou 330110, China
- Weiwei Xue
  - School of Pharmaceutical Sciences, Chongqing University, Chongqing 401331, China
- Feng Zhu
  - School of Pharmaceutical Sciences, Chongqing University, Chongqing 401331, China
  - College of Pharmaceutical Sciences, Zhejiang University, Hangzhou, Zhejiang 310058, China
  - Innovation Institute for Artificial Intelligence in Medicine of Zhejiang University, Alibaba-Zhejiang University Joint Research Center of Future Digital Healthcare, Hangzhou 330110, China

8. Gala M, Žoldák G. Classifying Residues in Mechanically Stable and Unstable Substructures Based on a Protein Sequence: The Case Study of the DnaK Hsp70 Chaperone. Nanomaterials (Basel) 2021; 11:2198. [PMID: 34578514 PMCID: PMC8467864 DOI: 10.3390/nano11092198] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 07/21/2021] [Revised: 08/16/2021] [Accepted: 08/24/2021] [Indexed: 12/17/2022]
Abstract
Artificial proteins can be constructed from stable substructures, whose stability is encoded in their protein sequence. Identifying stable protein substructures experimentally is the only available option at the moment because no suitable method exists to extract this information from a protein sequence. In previous research, we examined the mechanics of E. coli Hsp70 and found four mechanically stable (S class) and three unstable substructures (U class). Of the total 603 residues in the folded domains of Hsp70, 234 residues belong to one of four mechanically stable substructures, and 369 residues belong to one of three unstable substructures. Here our goal is to develop a machine learning model to categorize Hsp70 residues using sequence information. We applied three supervised methods: logistic regression (LR), random forest, and support vector machine. The LR method showed the highest accuracy, 0.925, to predict the correct class of a particular residue only when context-dependent physico-chemical features were included. The cross-validation of the LR model yielded a prediction accuracy of 0.879 and revealed that most of the misclassified residues lie at the borders between substructures. We foresee machine learning models being used to identify stable substructures as candidates for building blocks to engineer new proteins.
Affiliation(s)
- Michal Gala
  - Department of Biophysics, Faculty of Science, P. J. Šafárik University, Jesena 5, 040 01 Košice, Slovakia
- Gabriel Žoldák
  - Center for Interdisciplinary Biosciences, Technology and Innovation Park, P. J. Šafárik University, Trieda SNP 1, 040 11 Košice, Slovakia

9. 3D architecture and structural flexibility revealed in the subfamily of large glutamate dehydrogenases by a mycobacterial enzyme. Commun Biol 2021; 4:684. [PMID: 34083757 PMCID: PMC8175468 DOI: 10.1038/s42003-021-02222-x] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 12/13/2020] [Accepted: 05/14/2021] [Indexed: 11/16/2022] [Open Access]
Abstract
Glutamate dehydrogenases (GDHs) are widespread metabolic enzymes that play key roles in nitrogen homeostasis. Large glutamate dehydrogenases composed of 180 kDa subunits (L-GDHs180) contain long N- and C-terminal segments flanking the catalytic core. Despite the relevance of L-GDHs180 in bacterial physiology, the lack of structural data for these enzymes has limited the progress of functional studies. Here we show that the mycobacterial L-GDH180 (mL-GDH180) adopts a quaternary structure that is radically different from that of related low molecular weight enzymes. Intersubunit contacts in mL-GDH180 involve a C-terminal domain that we propose as a new fold and a flexible N-terminal segment comprising ACT-like and PAS-type domains that could act as metabolic sensors for allosteric regulation. These findings uncover unique aspects of the structure-function relationship in the subfamily of L-GDHs. Lázaro et al. report the first 3D structure of a large glutamate dehydrogenase (L-GDH), the one corresponding to the Mycobacterium smegmatis enzyme composed of 180 kDa subunits (mL-GDH180), obtained by X-ray crystallography and cryo-electron microscopy. This structure reveals that mL-GDH180 assembles as tetramers with the N- and C-terminal domains being involved in inter-subunit contacts and unveils unique features of the subfamily of L-GDHs.