1. Shan M, Jiang C, Qin L, Cheng G. A Review of Computational Methods in Predicting hERG Channel Blockers. ChemistrySelect 2022. [DOI: 10.1002/slct.202201221]
Affiliation(s)
- Mengyi Shan
  - School of Pharmaceutical Sciences, Zhejiang Chinese Medical University, Hangzhou 310053, People's Republic of China
- Chen Jiang
  - QuanMin RenZheng (HangZhou) Technology Co. Ltd., China
- Lu‐Ping Qin
  - School of Pharmaceutical Sciences, Zhejiang Chinese Medical University, Hangzhou 310053, People's Republic of China
- Gang Cheng
  - School of Pharmaceutical Sciences, Zhejiang Chinese Medical University, Hangzhou 310053, People's Republic of China

2. Mao J, Akhtar J, Zhang X, Sun L, Guan S, Li X, Chen G, Liu J, Jeon HN, Kim MS, No KT, Wang G. Comprehensive strategies of machine-learning-based quantitative structure-activity relationship models. iScience 2021; 24:103052. [PMID: 34553136 PMCID: PMC8441174 DOI: 10.1016/j.isci.2021.103052]
Abstract
Early quantitative structure-activity relationship (QSAR) technologies have unsatisfactory versatility and accuracy in fields such as drug discovery because they are based on traditional machine learning and interpretive expert features. The development of Big Data and deep learning technologies has significantly improved the processing of unstructured data and unleashed the great potential of QSAR. Here we discuss the integration of wet experiments (which provide experimental data and reliable verification), molecular dynamics simulation (which provides mechanistic interpretation at the atomic/molecular levels), and machine learning (including deep learning) techniques to improve QSAR models. We first review the history of traditional QSAR and point out its problems. We then propose a better QSAR model characterized by a new iterative framework to integrate machine learning with disparate data input. Finally, we discuss the application of QSAR and machine learning to many practical research fields, including drug development and clinical trials.
Affiliation(s)
- Jiashun Mao
  - The Interdisciplinary Graduate Program in Integrative Biotechnology and Translational Medicine, Yonsei University, Incheon 21983, Republic of Korea
  - Department of Biology, School of Life Sciences, Southern University of Science and Technology, 1088 Xueyuan Avenue, Shenzhen, Guangdong 518055, China
  - Guangdong Provincial Key Laboratory of Computational Science and Material Design, Shenzhen, Guangdong 518055, China
- Javed Akhtar
  - Department of Biology, School of Life Sciences, Southern University of Science and Technology, 1088 Xueyuan Avenue, Shenzhen, Guangdong 518055, China
  - Guangdong Provincial Key Laboratory of Cell Microenvironment and Disease Research, Shenzhen, Guangdong 518055, China
- Xiao Zhang
  - Shanghai Rural Commercial Bank Co., Ltd, Shanghai 200002, China
- Liang Sun
  - Department of Physics, City University of Hong Kong, 83 Tat Chee Avenue, Kowloon, Hong Kong, China
- Shenghui Guan
  - Department of Biology, School of Life Sciences, Southern University of Science and Technology, 1088 Xueyuan Avenue, Shenzhen, Guangdong 518055, China
  - Guangdong Provincial Key Laboratory of Computational Science and Material Design, Shenzhen, Guangdong 518055, China
- Xinyu Li
  - School of Life and Health Sciences and Warshel Institute for Computational Biology, The Chinese University of Hong Kong, Shenzhen 518172, China
- Guangming Chen
  - Department of Biology, School of Life Sciences, Southern University of Science and Technology, 1088 Xueyuan Avenue, Shenzhen, Guangdong 518055, China
  - Guangdong Provincial Key Laboratory of Cell Microenvironment and Disease Research, Shenzhen, Guangdong 518055, China
- Jiaxin Liu
  - Biotechnology, College of Life Science and Biotechnology, Yonsei University, Seoul 03722, Republic of Korea
- Hyeon-Nae Jeon
  - Biotechnology, College of Life Science and Biotechnology, Yonsei University, Seoul 03722, Republic of Korea
- Min Sung Kim
  - Biotechnology, College of Life Science and Biotechnology, Yonsei University, Seoul 03722, Republic of Korea
- Kyoung Tai No
  - The Interdisciplinary Graduate Program in Integrative Biotechnology and Translational Medicine, Yonsei University, Incheon 21983, Republic of Korea
- Guanyu Wang
  - Department of Biology, School of Life Sciences, Southern University of Science and Technology, 1088 Xueyuan Avenue, Shenzhen, Guangdong 518055, China
  - Guangdong Provincial Key Laboratory of Computational Science and Material Design, Shenzhen, Guangdong 518055, China
  - Guangdong Provincial Key Laboratory of Cell Microenvironment and Disease Research, Shenzhen, Guangdong 518055, China

3. D’Souza S, Prema K, Balaji S. Machine learning models for drug–target interactions: current knowledge and future directions. Drug Discov Today 2020; 25:748-756. [DOI: 10.1016/j.drudis.2020.03.003] [Received: 11/30/2019] [Revised: 02/28/2020] [Accepted: 03/05/2020]

4. Continuous molecular fields and the concept of molecular co-fields in structure-activity studies. Future Med Chem 2019; 11:2701-2713. [PMID: 31596146 DOI: 10.4155/fmc-2018-0360]
Abstract
The analysis of information on the spatial structure of molecules and the physical fields of their interactions with biological targets is extremely important for solving various problems in drug discovery. This mini-review article surveys the main features of the continuous molecular fields approach and its use for analyzing structure-activity relationships in 3D space, building 3D quantitative structure-activity models and conducting similarity based virtual screening. Particular attention is paid to the consideration of the concept of molecular co-fields and their use for the interpretation of 3D structure-activity models. The principles of molecular design based on the overlapping and the similarity of molecular fields with corresponding co-fields are formulated.

5. Cleves AE, Jain AN. Quantitative surface field analysis: learning causal models to predict ligand binding affinity and pose. J Comput Aided Mol Des 2018; 32:731-757. [PMID: 29934750 PMCID: PMC6096883 DOI: 10.1007/s10822-018-0126-x] [Received: 03/23/2018] [Accepted: 06/14/2018]
Abstract
We introduce the QuanSA method for inducing physically meaningful field-based models of ligand binding pockets based on structure-activity data alone. The method is closely related to the QMOD approach, substituting a learned scoring field for a pocket constructed of molecular fragments. The problem of mutual ligand alignment is addressed in a general way, and optimal model parameters and ligand poses are identified through multiple-instance machine learning. We provide algorithmic details along with performance results on sixteen structure-activity data sets covering many pharmaceutically relevant targets. In particular, we show how models initially induced from small data sets can extrapolatively identify potent new ligands with novel underlying scaffolds with very high specificity. Further, we show that combining predictions from QuanSA models with those from physics-based simulation approaches is synergistic. QuanSA predictions yield binding affinities, explicit estimates of ligand strain, associated ligand pose families, and estimates of structural novelty and confidence. The method is applicable for fine-grained lead optimization as well as potent new lead identification.
Affiliation(s)
- Ann E Cleves
  - Helen Diller Family Comprehensive Cancer Center, University of California, San Francisco, USA
- Ajay N Jain
  - Department of Bioengineering and Therapeutic Sciences, University of California, San Francisco, USA

6. Lo YC, Rensi SE, Torng W, Altman RB. Machine learning in chemoinformatics and drug discovery. Drug Discov Today 2018; 23:1538-1546. [PMID: 29750902 DOI: 10.1016/j.drudis.2018.05.010] [Received: 12/15/2017] [Revised: 03/29/2018] [Accepted: 05/02/2018]
Abstract
Chemoinformatics is an established discipline focusing on extracting, processing and extrapolating meaningful data from chemical structures. With the rapid explosion of chemical 'big' data from HTS and combinatorial synthesis, machine learning has become an indispensable tool for drug designers to mine chemical information from large compound databases to design drugs with important biological properties. To process the chemical data, we first reviewed multiple processing layers in the chemoinformatics pipeline followed by the introduction of commonly used machine learning models in drug discovery and QSAR analysis. Here, we present basic principles and recent case studies to demonstrate the utility of machine learning techniques in chemoinformatics analyses; and we discuss limitations and future directions to guide further development in this evolving field.
Affiliation(s)
- Yu-Chen Lo
  - Department of Bioengineering, Stanford University, Stanford, CA, USA
- Stefano E Rensi
  - Department of Bioengineering, Stanford University, Stanford, CA, USA
- Wen Torng
  - Department of Bioengineering, Stanford University, Stanford, CA, USA
- Russ B Altman
  - Department of Bioengineering, Stanford University, Stanford, CA, USA

7. Dreher J, Scheiber J, Stiefl N, Baumann K. xMaP-An Interpretable Alignment-Free Four-Dimensional Quantitative Structure-Activity Relationship Technique Based on Molecular Surface Properties and Conformer Ensembles. J Chem Inf Model 2018; 58:165-181. [PMID: 29172519 DOI: 10.1021/acs.jcim.7b00419]
Abstract
A novel alignment-free molecular descriptor called xMaP (flexible MaP descriptor) is introduced. The descriptor is the advancement of the previously published translationally and rotationally invariant three-dimensional (3D) descriptor MaP (mapping property distributions onto the molecular surface) to the fourth dimension (4D). In addition to MaP, xMaP is independent of the chosen starting conformation of the encoded molecules and is therefore entirely alignment-free. This is achieved by using ensembles of conformers, which are generated by conformational searches. This step of the procedure is similar to Hopfinger's 4D quantitative structure-activity relationship (QSAR). A five-step procedure is used to compute the xMaP descriptor. First, a conformational search for each molecule is carried out. Next, for each of the conformers an approximation to the molecular surface with equally distributed surface points is computed. Third, molecular properties are projected onto this surface. Fourth, areas of identical properties are clustered to so-called patches. Fifth, the spatial distribution of the patches is converted into an alignment-free descriptor that is based on the entire conformer ensemble. The resulting descriptor can be interpreted by superimposing the most important descriptor variables and the molecules of the data set. The most important descriptor variables are identified with chemometric regression tools. The novel descriptor was applied to several benchmark data sets and was compared to other descriptors and QSAR techniques comprising a binary fingerprint, a topological pharmacophore descriptor (Cats2D), and the field-based 3D-QSAR technique GRID/PLS which is alignment-dependent. The use of conformer ensembles renders xMaP very robust. It turns out that xMaP performs very well on (almost) all data sets and that the statistical results are comparable to GRID/PLS. In addition to that, xMaP can also be used to efficiently visualize the derived quantitative structure-activity relationships.
Affiliation(s)
- Jan Dreher
  - Institute of Medicinal and Pharmaceutical Chemistry, University of Technology Braunschweig, Beethovenstrasse 55, D 38106 Braunschweig, Germany
- Josef Scheiber
  - Institute of Medicinal and Pharmaceutical Chemistry, University of Technology Braunschweig, Beethovenstrasse 55, D 38106 Braunschweig, Germany
- Nikolaus Stiefl
  - Institute of Medicinal and Pharmaceutical Chemistry, University of Technology Braunschweig, Beethovenstrasse 55, D 38106 Braunschweig, Germany
- Knut Baumann
  - Institute of Medicinal and Pharmaceutical Chemistry, University of Technology Braunschweig, Beethovenstrasse 55, D 38106 Braunschweig, Germany

8. Recent Developments in 3D QSAR and Molecular Docking Studies of Organic and Nanostructures. Handbook of Computational Chemistry 2017. [PMCID: PMC7123761 DOI: 10.1007/978-3-319-27282-5_54]
Abstract
The development of quantitative structure–activity relationship (QSAR) methods has progressed rapidly over the last decades. The QSAR approach already plays an important role in lead structure optimization, and nowadays, with the development of big data approaches and computing power, it can even handle the huge amounts of data associated with combinatorial chemistry. One of the recent developments is three-dimensional QSAR, i.e., 3D QSAR. Over the last two decades, 3D QSAR has been successfully applied to many datasets, especially of enzyme and receptor ligands. Moreover, 3D QSAR investigations are quite often combined with protein–ligand docking studies, and this combination works synergistically. In this review, we outline recent advances in the development and applications of 3D QSAR and protein–ligand docking approaches, as well as combined approaches for conventional organic compounds and for nanostructured materials, such as fullerenes and carbon nanotubes.

9. Hamzeh-Mivehroud M, Sokouti B, Dastmalchi S. An Introduction to the Basic Concepts in QSAR-Aided Drug Design. Oncology 2017. [DOI: 10.4018/978-1-5225-0549-5.ch002]
Abstract
The need for the development of new drugs to combat existing and newly identified conditions is unavoidable. One of the important tools used in the advanced drug development pipeline is computer-aided drug design. Traditionally, to find a drug many ligands were synthesized and evaluated for their effectiveness using suitable bioassays and if all other drug-likeness features were met, the candidate(s) would possibly reach the market. Although this approach is still in use in advanced format, computational methods are an indispensable component of modern drug development projects. One of the methods used from very early days of rationalizing the drug design approaches is Quantitative Structure-Activity Relationship (QSAR). This chapter overviews QSAR modeling steps by introducing molecular descriptors, mathematical model development for relating biological activities to molecular structures, and model validation. At the end, several successful cases where QSAR studies were used extensively are presented.
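The modeling steps named in this abstract (molecular descriptors, a mathematical model relating activity to structure, and model validation) can be illustrated with a toy sketch. Everything specific here is an assumption made for illustration and is not taken from the chapter: the descriptor and activity values are made-up numbers, the model is a single-descriptor least-squares line, the helper names `fit_line` and `loo_q2` are hypothetical, and validation uses one common definition of the leave-one-out statistic, q² = 1 − PRESS / SS_total.

```python
from statistics import mean


def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept (one descriptor)."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar


def loo_q2(xs, ys):
    """Leave-one-out cross-validated q^2 = 1 - PRESS / SS_total."""
    press = 0.0
    for i in range(len(xs)):
        # Refit the model without compound i, then predict compound i.
        train_x = xs[:i] + xs[i + 1:]
        train_y = ys[:i] + ys[i + 1:]
        slope, intercept = fit_line(train_x, train_y)
        press += (ys[i] - (slope * xs[i] + intercept)) ** 2
    y_bar = mean(ys)  # simplification: mean of the full set, not each training fold
    ss_total = sum((y - y_bar) ** 2 for y in ys)
    return 1.0 - press / ss_total


# Hypothetical data: one descriptor value and one activity value per compound.
descriptor = [1.0, 1.5, 2.0, 2.5, 3.0, 3.5]
activity = [5.1, 5.6, 6.0, 6.4, 7.1, 7.4]

slope, intercept = fit_line(descriptor, activity)
q2 = loo_q2(descriptor, activity)
print(f"activity = {slope:.2f} * descriptor + {intercept:.2f}, LOO q2 = {q2:.2f}")
```

Real QSAR studies typically use many descriptors, more flexible models, and an external test set in addition to cross-validation; the sketch only shows how the three steps fit together.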
Affiliation(s)
- Siavoush Dastmalchi
  - Biotechnology Research Center, Tabriz University of Medical Sciences, Iran & School of Pharmacy, Tabriz University of Medical Sciences, Iran

10. Cleves AE, Jain AN. Extrapolative prediction using physically-based QSAR. J Comput Aided Mol Des 2016; 30:127-52. [PMID: 26860112 PMCID: PMC4796382 DOI: 10.1007/s10822-016-9896-1] [Received: 12/14/2015] [Accepted: 01/21/2016]
Abstract
Surflex-QMOD integrates chemical structure and activity data to produce physically-realistic models for binding affinity prediction. Here, we apply QMOD to a 3D-QSAR benchmark dataset and show broad applicability to a diverse set of targets. Testing new ligands within the QMOD model employs automated flexible molecular alignment, with the model itself defining the optimal pose for each ligand. QMOD performance was compared to that of four approaches that depended on manual alignments (CoMFA, two variations of CoMSIA, and CMF). QMOD showed comparable performance to the other methods on a challenging, but structurally limited, test set. The QMOD models were also applied to test a large and structurally diverse dataset of ligands from ChEMBL, nearly all of which were synthesized years after those used for model construction. Extrapolation across diverse chemical structures was possible because the method addresses the ligand pose problem and provides structural and geometric means to quantitatively identify ligands within a model’s applicability domain. Predictions for such ligands for the four tested targets were highly statistically significant based on rank correlation. Those molecules predicted to be highly active ($\mathrm{pK}_i \ge 7.5$) had a mean experimental $\mathrm{pK}_i$ of 7.5, with potent and structurally novel ligands being identified by QMOD for each target.
Affiliation(s)
- Ann E Cleves
  - Helen Diller Family Comprehensive Cancer Center, University of California, San Francisco, CA, USA
- Ajay N Jain
  - Department of Bioengineering and Therapeutic Sciences, University of California, San Francisco, CA, USA

11. Hamzeh-Mivehroud M, Sokouti B, Dastmalchi S. An Introduction to the Basic Concepts in QSAR-Aided Drug Design. Quantitative Structure-Activity Relationships in Drug Design, Predictive Toxicology, and Risk Assessment 2015. [DOI: 10.4018/978-1-4666-8136-1.ch001]
Abstract
The need for the development of new drugs to combat existing and newly identified conditions is unavoidable. One of the important tools used in the advanced drug development pipeline is computer-aided drug design. Traditionally, to find a drug many ligands were synthesized and evaluated for their effectiveness using suitable bioassays and if all other drug-likeness features were met, the candidate(s) would possibly reach the market. Although this approach is still in use in advanced format, computational methods are an indispensable component of modern drug development projects. One of the methods used from very early days of rationalizing the drug design approaches is Quantitative Structure-Activity Relationship (QSAR). This chapter overviews QSAR modeling steps by introducing molecular descriptors, mathematical model development for relating biological activities to molecular structures, and model validation. At the end, several successful cases where QSAR studies were used extensively are presented.
Affiliation(s)
- Maryam Hamzeh-Mivehroud
  - Biotechnology Research Center & School of Pharmacy, Tabriz University of Medical Sciences, Tabriz, Iran
- Babak Sokouti
  - Biotechnology Research Center, Tabriz University of Medical Sciences, Tabriz, Iran
- Siavoush Dastmalchi
  - Biotechnology Research Center & School of Pharmacy, Tabriz University of Medical Sciences, Tabriz, Iran

12. Sitnikov GV, Zhokhova NI, Ustynyuk YA, Varnek A, Baskin II. Continuous indicator fields: a novel universal type of molecular fields. J Comput Aided Mol Des 2014; 29:233-47. [DOI: 10.1007/s10822-014-9818-z] [Received: 09/10/2014] [Accepted: 11/24/2014]