1
Wu J, Lv J, Zhao L, Zhao R, Gao T, Xu Q, Liu D, Yu Q, Ma F. Exploring the role of microbial proteins in controlling environmental pollutants based on molecular simulation. The Science of the Total Environment 2023; 905:167028. [PMID: 37704131 DOI: 10.1016/j.scitotenv.2023.167028] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/02/2023] [Revised: 09/03/2023] [Accepted: 09/10/2023] [Indexed: 09/15/2023]
Abstract
Molecular simulation has been widely used to study the structural composition and dynamic properties of microbial proteins, such as volatility, flexibility, and stability, at the microscopic scale. This review describes the key elements of molecular docking and molecular dynamics (MD) simulations; reviews the techniques combined with molecular simulation, such as crystallography, spectroscopy, molecular biology, and machine learning, to validate simulation results and bridge information gaps in structure, microenvironmental changes, expression mechanisms, and intensity quantification; and illustrates the application of molecular simulation in characterizing the molecular mechanisms by which microbial proteins interact with four types of contaminants: heavy metals (HMs), pesticides, dyes, and emerging contaminants (ECs). Finally, the review outlines the important role of molecular simulation in the study of microbial proteins for controlling environmental contamination and provides ideas for applying molecular simulation to screen microbial proteins and incorporate targeted mutagenesis to obtain more effective contaminant control proteins.
Affiliation(s)
- Jieting Wu
  - School of Environmental Science, Liaoning University, Shenyang 110036, China
- Jin Lv
  - School of Environmental Science, Liaoning University, Shenyang 110036, China
- Lei Zhao
  - State Key Laboratory of Urban Water Resources & Environment, Harbin Institute of Technology, Harbin 150090, China
- Ruofan Zhao
  - School of Environment, Beijing Normal University, Beijing 100875, China
- Tian Gao
  - Key Laboratory of Integrated Regulation and Resource Development of Shallow Lakes, Ministry of Education, College of Environment, Hohai University, Xikang Road #1, Nanjing 210098, China
- Qi Xu
  - PetroChina Fushun Petrochemical Company, Fushun 113000, China
- Dongbo Liu
  - School of Environmental Science, Liaoning University, Shenyang 110036, China
- Qiqi Yu
  - School of Environmental Science, Liaoning University, Shenyang 110036, China
- Fang Ma
  - State Key Laboratory of Urban Water Resources & Environment, Harbin Institute of Technology, Harbin 150090, China
2
Wang Z, Zheng L, Wang S, Lin M, Wang Z, Kong AWK, Mu Y, Wei Y, Li W. A fully differentiable ligand pose optimization framework guided by deep learning and a traditional scoring function. Brief Bioinform 2023; 24:6887112. [PMID: 36502369 DOI: 10.1093/bib/bbac520] [Citation(s) in RCA: 13] [Impact Index Per Article: 13.0] [Received: 08/09/2022] [Revised: 10/17/2022] [Accepted: 10/31/2022] [Indexed: 12/14/2022] Open
Abstract
Recently reported machine learning- and deep learning-based scoring functions (SFs) have shown exciting performance in predicting protein-ligand binding affinities, with promising application prospects. However, differentiating between highly similar ligand conformations, including the native binding pose (the global energy minimum state), remains challenging, and solving this problem could greatly enhance docking. In this work, we propose a fully differentiable, end-to-end framework for ligand pose optimization based on a hybrid SF called DeepRMSD+Vina, which combines a multi-layer perceptron (DeepRMSD) with the traditional AutoDock Vina SF. DeepRMSD+Vina, which combines (1) the root mean square deviation (RMSD) of the docking pose with respect to the native pose and (2) the AutoDock Vina score, is fully differentiable and is thus capable of optimizing the ligand binding pose toward the lowest-energy conformation. Evaluated on the CASF-2016 docking power dataset, DeepRMSD+Vina reaches a success rate of 94.4%, which outperforms most SFs reported to date. We evaluated the ligand conformation optimization framework in practical molecular docking scenarios (redocking and cross-docking tasks), revealing the high potential of this framework in drug design and discovery. Structural analysis shows that this framework can identify key physical interactions in protein-ligand binding, such as hydrogen bonding. Our work provides a paradigm for optimizing ligand conformations based on deep learning algorithms. The DeepRMSD+Vina model and the optimization framework are available in the GitHub repository https://github.com/zchwang/DeepRMSD-Vina_Optimization.
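As a hedged illustration of the RMSD term mentioned in the abstract above, the sketch below computes the root mean square deviation between a docking pose and a native pose, each given as an equally ordered list of atomic coordinates. The function name `rmsd` and the coordinate layout are assumptions made for this example; this is not code from the DeepRMSD+Vina repository, where, as the abstract suggests, the RMSD is estimated by the DeepRMSD model rather than computed from a known native pose.

```python
import math


def rmsd(pose, native):
    """Root mean square deviation between two poses.

    Both arguments are lists of (x, y, z) tuples in the same atom order
    (hypothetical layout chosen for this sketch). No superposition is done,
    since docking poses share the receptor's coordinate frame.
    """
    if len(pose) != len(native) or not pose:
        raise ValueError("poses must be non-empty and have the same number of atoms")
    squared = sum(
        (a - b) ** 2
        for atom_p, atom_n in zip(pose, native)
        for a, b in zip(atom_p, atom_n)
    )
    return math.sqrt(squared / len(pose))


# Toy example: every atom is shifted by 1 unit along z.
native_pose = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
docked_pose = [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0)]
print(rmsd(docked_pose, native_pose))
```

Because the quantity is a smooth function of the ligand coordinates (away from zero deviation), a score built on it can in principle be optimized by gradient descent, which is the property the abstract emphasizes.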
Affiliation(s)
- Zechen Wang
  - School of Physics, Shandong University, Jinan, Shandong 250100, China
- Liangzhen Zheng
  - Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Shenzhen, Guangdong 518055, China
  - Shanghai Zelixir Biotech Company Ltd., Shanghai 200030, China
- Sheng Wang
  - Shanghai Zelixir Biotech Company Ltd., Shanghai 200030, China
- Mingzhi Lin
  - Shanghai Zelixir Biotech Company Ltd., Shanghai 200030, China
- Zhihao Wang
  - School of Physics, Shandong University, Jinan, Shandong 250100, China
- Adams Wai-Kin Kong
  - Rolls-Royce Corporate Lab, Nanyang Technological University, Singapore 637551, Singapore
- Yuguang Mu
  - School of Biological Sciences, Nanyang Technological University, Singapore 637551, Singapore
- Yanjie Wei
  - Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Shenzhen, Guangdong 518055, China
- Weifeng Li
  - School of Physics, Shandong University, Jinan, Shandong 250100, China
3
Protein-ligand binding affinity prediction with edge awareness and supervised attention. iScience 2022; 26:105892. [PMID: 36691617 PMCID: PMC9860494 DOI: 10.1016/j.isci.2022.105892] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 06/21/2022] [Revised: 11/12/2022] [Accepted: 12/23/2022] [Indexed: 12/29/2022] Open
Abstract
Accurate prediction of protein-ligand binding affinity is crucial in structure-based drug design but remains challenging even with recent advances in deep learning: (1) existing methods neglect the edge information in protein and ligand structure data; (2) current attention mechanisms struggle to capture true binding interactions in small datasets. Herein, we propose SEGSA_DTA, a SuperEdge Graph convolution-based and Supervised Attention-based Drug-Target Affinity prediction method, in which the super edge graph convolution comprehensively utilizes node and edge information and the multi-supervised attention module efficiently learns an attention distribution consistent with real protein-ligand interactions. Results on multiple datasets show that SEGSA_DTA outperforms current state-of-the-art methods. We also applied SEGSA_DTA to repurposing FDA-approved drugs to identify potential coronavirus disease 2019 (COVID-19) treatments. In addition, using SHapley Additive exPlanations (SHAP), we found that SEGSA_DTA is interpretable and provides a new quantitative analytical solution for structure-based lead optimization.
4
Developing a Naïve Bayesian Classification Model with PI3Kγ structural features for virtual screening against PI3Kγ: Combining molecular docking and pharmacophore based on multiple PI3Kγ conformations. Eur J Med Chem 2022; 244:114824. [DOI: 10.1016/j.ejmech.2022.114824] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/04/2022] [Revised: 09/28/2022] [Accepted: 10/01/2022] [Indexed: 11/21/2022]
5
Avery C, Patterson J, Grear T, Frater T, Jacobs DJ. Protein Function Analysis through Machine Learning. Biomolecules 2022; 12:1246. [PMID: 36139085 PMCID: PMC9496392 DOI: 10.3390/biom12091246] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Received: 07/16/2022] [Revised: 08/22/2022] [Accepted: 08/31/2022] [Indexed: 11/16/2022] Open
Abstract
Machine learning (ML) has been an important part of the computational biology arsenal used to elucidate protein function for decades. With the recent burgeoning of novel ML methods and applications, new ML approaches have been incorporated into many areas of computational biology dealing with protein function. We examine how ML has been integrated into a wide range of computational models to improve prediction accuracy and gain a better understanding of protein function. The applications discussed are protein structure prediction; protein engineering using sequence modifications to achieve stability and druggability characteristics; molecular docking in terms of protein-ligand binding, including allosteric effects; protein-protein interactions; and protein-centric drug discovery. To quantify the mechanisms underlying protein function, a holistic approach that takes structure, flexibility, stability, and dynamics into account is required, as these aspects become inseparable through their interdependence. Another key component of protein function is conformational dynamics, which often manifest as protein kinetics. Computational methods that use ML to generate representative conformational ensembles and to quantify the differences between conformational ensembles that are important for function are included in this review. Future opportunities are highlighted for each of these topics.
Affiliation(s)
- Chris Avery
  - Department of Bioinformatics and Genomics, University of North Carolina at Charlotte, Charlotte, NC 28223, USA
- John Patterson
  - Department of Bioinformatics and Genomics, University of North Carolina at Charlotte, Charlotte, NC 28223, USA
- Tyler Grear
  - Department of Bioinformatics and Genomics, University of North Carolina at Charlotte, Charlotte, NC 28223, USA
  - Department of Physics and Optical Science, University of North Carolina at Charlotte, Charlotte, NC 28223, USA
- Theodore Frater
  - Department of Bioinformatics and Genomics, University of North Carolina at Charlotte, Charlotte, NC 28223, USA
- Donald J. Jacobs
  - Department of Physics and Optical Science, University of North Carolina at Charlotte, Charlotte, NC 28223, USA
6
Zhang J. Atom typing using graph representation learning: How do models learn chemistry? J Chem Phys 2022; 156:204108. [DOI: 10.1063/5.0095008] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/14/2022] Open
Abstract
Atom typing is the first step in simulating molecules using a force field. Automatic atom typing for an arbitrary molecule is often realized by rule-based algorithms, which require rules to be manually encoded for every type defined in the force field. Such algorithms are time-consuming to build and force field-specific. In this study, a method based on graph representation learning that is independent of any specific force field is established for automatic atom typing. The topology adaptive graph convolution network (TAGCN) is found to be an optimal model. The model does not need manual enumeration of rules but can learn the rules through training on typed molecules prepared during the development of a force field. A test on the CHARMM general force field gives a typing correctness of 91%. A systematic typing error of TAGCN is its inability to distinguish types in rings from those in acyclic chains. It originates from the fundamental structure of graph neural networks and can be fixed in a trivial way. More importantly, analysis of the rationalization processes of these models using layer-wise relation propagation reveals how TAGCN encodes rules learned during training. Our model is found to type atoms using their local chemical environments, in a way highly in accordance with chemists’ intuition.
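One plausible reading of the ring-versus-chain limitation described above can be sketched with Weisfeiler-Lehman colour refinement, used here as a stand-in for neighbourhood aggregation in a graph neural network. This is an assumption made for illustration and is not the TAGCN model from the paper; the function `wl_colors` and the toy graphs are hypothetical. After a fixed number of aggregation rounds, an atom in a ring and an atom deep inside a sufficiently long acyclic chain receive identical labels, because their local neighbourhoods look the same.

```python
def wl_colors(adjacency, rounds):
    """Weisfeiler-Lehman colour refinement on an unlabeled graph.

    adjacency maps each node to a list of its neighbours. Every node starts
    with the same colour (all atoms treated as the same element, an
    assumption of this sketch). Each round combines a node's colour with the
    sorted colours of its neighbours, mimicking one message-passing step.
    """
    colors = {node: () for node in adjacency}
    for _ in range(rounds):
        colors = {
            node: (colors[node], tuple(sorted(colors[nbr] for nbr in adjacency[node])))
            for node in adjacency
        }
    return colors


# An 8-membered ring and a 9-atom acyclic chain (toy graphs).
ring = {i: [(i - 1) % 8, (i + 1) % 8] for i in range(8)}
chain = {i: [j for j in (i - 1, i + 1) if 0 <= j < 9] for i in range(9)}

# Compare a ring atom with the middle atom of the chain.
same_after_3 = wl_colors(ring, 3)[0] == wl_colors(chain, 3)[4]
same_after_5 = wl_colors(ring, 5)[0] == wl_colors(chain, 5)[4]
print(same_after_3, same_after_5)
```

In this toy case the two atoms are indistinguishable after three rounds and only separate after five, when information about the chain ends reaches the middle atom. Adding an explicit ring-membership feature to each node is one simple remedy, which may be the kind of trivial fix the abstract alludes to.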
Affiliation(s)
- Jun Zhang
  - Institute of Systems and Physical Biology, Shenzhen Bay Laboratory, Shenzhen 518055, People’s Republic of China
7
Affinity prediction using deep learning based on SMILES input for D3R grand challenge 4. J Comput Aided Mol Des 2022; 36:225-235. [PMID: 35314897 DOI: 10.1007/s10822-022-00448-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/14/2021] [Accepted: 03/08/2022] [Indexed: 10/18/2022]
Abstract
Modern molecular docking comprises the prediction of pose and affinity. Prediction of docking poses is required for affinity prediction when three-dimensional coordinates of the ligand have not been provided. However, existing methods require a large amount of feature engineering. In addition, a robust model is needed for the sequential combination of pose and affinity prediction, owing to the issue of probabilistic deviation in the ligand position. We propose a pipeline using a bipartite graph neural network and transfer learning, trained on a re-docking dataset. We evaluated our model on the released data from Drug Design Data Resource Grand Challenge 4 (D3R GC4). The data for the two target proteins provided by the challenge have different patterns. The model outperformed the best participant by 9% on the BACE target protein from stage 2. Further, our model showed competitive performance on the CatS target protein.
8
Niu Y, Ji H. Current developments in extracellular-regulated protein kinase (ERK1/2) inhibitors. Drug Discov Today 2022; 27:1464-1473. [DOI: 10.1016/j.drudis.2022.01.012] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 10/28/2021] [Revised: 12/19/2021] [Accepted: 01/25/2022] [Indexed: 12/22/2022]
9
Yang L, Yang G, Bing Z, Tian Y, Niu Y, Huang L, Yang L. Transformer-Based Generative Model Accelerating the Development of Novel BRAF Inhibitors. ACS Omega 2021; 6:33864-33873. [PMID: 34926933 PMCID: PMC8674994 DOI: 10.1021/acsomega.1c05145] [Citation(s) in RCA: 11] [Impact Index Per Article: 3.7] [Received: 09/16/2021] [Accepted: 11/08/2021] [Indexed: 06/14/2023]
Abstract
De novo drug design based on the SMILES format is a typical sequence-processing problem. Previous methods based on recurrent neural networks (RNNs) are limited in capturing long-range dependencies, resulting in a high percentage of invalid generated molecules. Recent studies have shown the potential of the Transformer architecture to improve the handling of sequence data. In this work, the encoder module of the Transformer is used to build a generative model. First, we train a Transformer-encoder-based generative model to learn the grammatical rules of known drug molecules and a predictive model to predict the activity of the molecules. Subsequently, transfer learning and reinforcement learning are used to fine-tune and optimize the generative model, respectively, to design new molecules with desirable activity. Compared with previous RNN-based methods, our method improves the percentage of chemically valid generated molecules (from 95.6 to 98.2%), the structural diversity of the generated molecules, and the feasibility of molecular synthesis. The pipeline is validated by designing inhibitors against the human BRAF protein. Molecular docking and binding mode analysis showed that our method can generate small molecules with higher activity than the ligands present in the crystal structure and with interaction sites similar to those of these ligands, which can provide new ideas and suggestions for pharmaceutical chemists.
Affiliation(s)
- Lijuan Yang
  - Institute of Modern Physics, Chinese Academy of Sciences, Lanzhou 730000, China
  - School of Physics and Technology, Lanzhou University, Lanzhou 730000, China
  - School of Physics, University of Chinese Academy of Sciences, Beijing 100049, China
  - Advanced Energy Science and Technology, Guangdong Laboratory, Huizhou 516000, China
- Guanghui Yang
  - Institute of Modern Physics, Chinese Academy of Sciences, Lanzhou 730000, China
  - Advanced Energy Science and Technology, Guangdong Laboratory, Huizhou 516000, China
- Zhitong Bing
  - Institute of Modern Physics, Chinese Academy of Sciences, Lanzhou 730000, China
  - Advanced Energy Science and Technology, Guangdong Laboratory, Huizhou 516000, China
- Yuan Tian
  - Institute of Modern Physics, Chinese Academy of Sciences, Lanzhou 730000, China
  - School of Information Science and Engineering, Lanzhou University, Lanzhou 730000, China
- Yuzhen Niu
  - Shandong Provincial Research Center for Bioinformatic Engineering and Technique, School of Life Sciences, Shandong University of Technology, Zibo 255000, China
- Liang Huang
  - School of Physics and Technology, Lanzhou University, Lanzhou 730000, China
- Lei Yang
  - Institute of Modern Physics, Chinese Academy of Sciences, Lanzhou 730000, China
  - Advanced Energy Science and Technology, Guangdong Laboratory, Huizhou 516000, China
10
Mordalski S, Wojtuch A, Podolak I, Kurczab R, Bojarski AJ. 2D SIFt: a matrix of ligand-receptor interactions. J Cheminform 2021; 13:66. [PMID: 34496955 PMCID: PMC8424890 DOI: 10.1186/s13321-021-00545-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/04/2021] [Accepted: 08/21/2021] [Indexed: 11/10/2022] Open
Abstract
Depicting a ligand-receptor complex via Interaction Fingerprints has been shown to be both a viable data visualization tool and an analysis tool. The spectrum of its applications ranges from simple visualization of the binding site, through analysis of molecular dynamics runs, to the evaluation of homology models and virtual screening. Here we present a novel tool derived from the Structural Interaction Fingerprints that provides detailed and unique insight into the interactions between the receptor and specific regions of the ligand (grouped into pharmacophore features) in the form of a matrix, a 2D-SIFt descriptor. The provided implementation is easy to use and extends the Python library, allowing the generation of interaction matrices and their manipulation (reading and writing as well as producing the average 2D-SIFt). The library for handling the interaction matrices is available via the repository http://bitbucket.org/zchl/sift2d.
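To make the idea of an averaged interaction matrix more concrete, the following minimal sketch averages several binary matrices element-wise. The layout (rows as ligand pharmacophore features, columns as interaction types) and the function name `average_interaction_matrix` are assumptions made for this example; this is not the API of the sift2d library, whose actual matrix dimensions and functions should be taken from its repository.

```python
def average_interaction_matrix(matrices):
    """Element-wise mean of equally shaped binary interaction matrices.

    Each matrix is a list of rows; entry [r][c] is 1 if, in that complex,
    ligand feature r forms interaction type c with the receptor residue
    (hypothetical layout chosen for this sketch), and 0 otherwise.
    """
    if not matrices:
        raise ValueError("at least one matrix is required")
    n_rows, n_cols = len(matrices[0]), len(matrices[0][0])
    for m in matrices:
        if len(m) != n_rows or any(len(row) != n_cols for row in m):
            raise ValueError("all matrices must have the same shape")
    count = len(matrices)
    return [
        [sum(m[r][c] for m in matrices) / count for c in range(n_cols)]
        for r in range(n_rows)
    ]


# Two toy 2x2 matrices, e.g. from two docked ligands or two MD frames.
complex_a = [[1, 0], [0, 1]]
complex_b = [[1, 1], [0, 0]]
profile = average_interaction_matrix([complex_a, complex_b])
print(profile)
```

Each entry of the result is the fraction of complexes in which that feature-interaction pair occurs, so values near 1 point to contacts that are conserved across the set.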
Affiliation(s)
- Stefan Mordalski
  - Department of Medicinal Chemistry, Maj Institute of Pharmacology Polish Academy of Sciences, Krakow, Poland
- Agnieszka Wojtuch
  - Faculty of Mathematics and Computer Science, Jagiellonian University, Krakow, Poland
- Igor Podolak
  - Faculty of Mathematics and Computer Science, Jagiellonian University, Krakow, Poland
- Rafał Kurczab
  - Department of Medicinal Chemistry, Maj Institute of Pharmacology Polish Academy of Sciences, Krakow, Poland
- Andrzej J Bojarski
  - Department of Medicinal Chemistry, Maj Institute of Pharmacology Polish Academy of Sciences, Krakow, Poland