1
Yu X, Tang D, Chng JY, Sholl DS. Efficient Exploration of Adsorption Space for Separations in Metal-Organic Frameworks Combining the Use of Molecular Simulations, Machine Learning, and Ideal Adsorbed Solution Theory. The Journal of Physical Chemistry. C, Nanomaterials and Interfaces 2023; 127:19229-19239. [PMID: 37791097 PMCID: PMC10544990 DOI: 10.1021/acs.jpcc.3c04533] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/05/2023] [Revised: 08/15/2023] [Indexed: 10/05/2023]
Abstract
Adsorption-based separations using metal-organic frameworks (MOFs) are promising candidates for replacing common energy-intensive separation processes. The so-called adsorption space formed by the combination of billions of possible molecules and thousands of reported MOFs is vast. It is very challenging to comprehensively evaluate the performance of MOFs for chemical separation through experiments. Molecular simulations and machine learning (ML) have been widely applied to make predictions for adsorption-based separations. Previous ML approaches to these issues were typically limited to smaller molecules and often had poor accuracy in the dilute limit. To enable exploration of a wider adsorption space, we carefully selected a diverse set of 45 molecules and 335 MOFs and generated single-component isotherms of 15,075 MOF-molecule pairs by grand canonical Monte Carlo. Using this database, we successfully developed accurate (r² > 0.9) machine learning models predicting adsorption isotherms of diverse molecules in large libraries of MOFs. With this approach, we can efficiently make predictions of large collections of MOFs for arbitrary mixture separations. By combining molecular simulation data and ML predictions with Ideal Adsorbed Solution Theory, we tested the ability of these approaches to make predictions of adsorption selectivity and loading for challenging near-azeotropic mixtures.
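To illustrate how single-component isotherms can be combined through Ideal Adsorbed Solution Theory to estimate mixture loadings and selectivity, the following is a minimal sketch for a binary mixture. It assumes each component follows a single-site Langmuir isotherm, which is an illustrative choice and not necessarily the isotherm form used in the paper; the function names and parameter values are hypothetical.

```python
import math


def langmuir_loading(q_sat, b, p):
    """Single-site Langmuir loading at pressure p (assumed isotherm form)."""
    return q_sat * b * p / (1.0 + b * p)


def iast_binary_langmuir(p_total, y1, iso1, iso2):
    """Binary IAST for two Langmuir isotherms given as (q_sat, b) tuples.

    Returns (q1, q2, selectivity of component 1 over component 2).
    Requires 0 < y1 < 1.
    """
    y = (y1, 1.0 - y1)
    isos = (iso1, iso2)

    def pure_pressure(psi, iso):
        # Invert the Langmuir reduced spreading pressure
        # psi = q_sat * ln(1 + b * p0) for the pure-component pressure p0.
        q_sat, b = iso
        return math.expm1(psi / q_sat) / b

    def residual(psi):
        # Adsorbed-phase mole fractions x_i = P * y_i / p0_i must sum to 1.
        return sum(p_total * yi / pure_pressure(psi, iso)
                   for yi, iso in zip(y, isos)) - 1.0

    # The residual decreases monotonically in psi, so bracket and bisect.
    lo, hi = 1e-12, 1.0
    while residual(hi) > 0.0:
        hi *= 2.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if residual(mid) > 0.0:
            lo = mid
        else:
            hi = mid
    psi = 0.5 * (lo + hi)

    p0 = [pure_pressure(psi, iso) for iso in isos]
    x = [p_total * yi / p0i for yi, p0i in zip(y, p0)]
    norm = sum(x)
    x = [xi / norm for xi in x]

    # Total loading from 1/q_total = sum_i x_i / q_i_pure(p0_i).
    inv_q_total = sum(xi / langmuir_loading(iso[0], iso[1], p0i)
                      for xi, iso, p0i in zip(x, isos, p0))
    q_total = 1.0 / inv_q_total
    q1, q2 = x[0] * q_total, x[1] * q_total
    selectivity = (x[0] / x[1]) / (y[0] / y[1])
    return q1, q2, selectivity


if __name__ == "__main__":
    # Hypothetical parameters: equal saturation capacities, affinities 2.0 and 0.5.
    print(iast_binary_langmuir(1.0, 0.5, (3.0, 2.0), (3.0, 0.5)))
```

When both components share the same saturation capacity, this calculation reduces to the extended Langmuir model and the selectivity equals the ratio of the affinity constants, which gives a convenient sanity check.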
Affiliation(s)
- Xiaohan Yu
- School of Chemical & Biomolecular Engineering, Georgia Institute of Technology, Atlanta, Georgia 30332, United States
- Dai Tang
- School of Chemical & Biomolecular Engineering, Georgia Institute of Technology, Atlanta, Georgia 30332, United States
- Jia Yuan Chng
- School of Chemical & Biomolecular Engineering, Georgia Institute of Technology, Atlanta, Georgia 30332, United States
- David S. Sholl
- School of Chemical & Biomolecular Engineering, Georgia Institute of Technology, Atlanta, Georgia 30332, United States
- Oak Ridge National Laboratory, Oak Ridge, Tennessee 37830, United States
2
Chen L, Fan Z, Chang J, Yang R, Hou H, Guo H, Zhang Y, Yang T, Zhou C, Sui Q, Chen Z, Zheng C, Hao X, Zhang K, Cui R, Zhang Z, Ma H, Ding Y, Zhang N, Lu X, Luo X, Jiang H, Zhang S, Zheng M. Sequence-based drug design as a concept in computational drug design. Nat Commun 2023; 14:4217. [PMID: 37452028 PMCID: PMC10349078 DOI: 10.1038/s41467-023-39856-w] [Citation(s) in RCA: 15] [Impact Index Per Article: 15.0] [Received: 07/18/2022] [Accepted: 06/27/2023] [Indexed: 07/18/2023] Open
Abstract
Drug development based on target proteins has been a successful approach in recent decades. However, the conventional structure-based drug design (SBDD) pipeline is a complex, human-engineered process with multiple independently optimized steps. Here, we propose a sequence-to-drug concept for computational drug design based on protein sequence information by end-to-end differentiable learning. We validate this concept in three stages. First, we design TransformerCPI2.0 as a core tool for the concept, which demonstrates generalization ability across proteins and compounds. Second, we interpret the binding knowledge that TransformerCPI2.0 learned. Finally, we use TransformerCPI2.0 to discover new hits for challenging drug targets, and identify new target for an existing drug based on an inverse application of the concept. Overall, this proof-of-concept study shows that the sequence-to-drug concept adds a perspective on drug design. It can serve as an alternative method to SBDD, particularly for proteins that do not yet have high-quality 3D structures available.
Affiliation(s)
- Lifan Chen
- Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, 555 Zuchongzhi Road, Shanghai, 201203, China
- University of Chinese Academy of Sciences, No. 19A Yuquan Road, Beijing, 100049, China
- Zisheng Fan
- Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, 555 Zuchongzhi Road, Shanghai, 201203, China
- School of Chinese Materia Medica, Nanjing University of Chinese Medicine, 138 Xianlin Road, Jiangsu, Nanjing, 210023, China
- Shanghai Institute for Advanced Immunochemical Studies and School of Life Science and Technology, ShanghaiTech University, No. 393 Huaxia Middle Road, Shanghai, 200031, China
- Jie Chang
- Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, 555 Zuchongzhi Road, Shanghai, 201203, China
- School of Chinese Materia Medica, Nanjing University of Chinese Medicine, 138 Xianlin Road, Jiangsu, Nanjing, 210023, China
- Ruirui Yang
- Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, 555 Zuchongzhi Road, Shanghai, 201203, China
- University of Chinese Academy of Sciences, No. 19A Yuquan Road, Beijing, 100049, China
- Shanghai Institute for Advanced Immunochemical Studies and School of Life Science and Technology, ShanghaiTech University, No. 393 Huaxia Middle Road, Shanghai, 200031, China
- Hui Hou
- Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, 555 Zuchongzhi Road, Shanghai, 201203, China
- Hao Guo
- Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, 555 Zuchongzhi Road, Shanghai, 201203, China
- Yinghui Zhang
- Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, 555 Zuchongzhi Road, Shanghai, 201203, China
- University of Chinese Academy of Sciences, No. 19A Yuquan Road, Beijing, 100049, China
- Tianbiao Yang
- Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, 555 Zuchongzhi Road, Shanghai, 201203, China
- University of Chinese Academy of Sciences, No. 19A Yuquan Road, Beijing, 100049, China
- Chenmao Zhou
- Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, 555 Zuchongzhi Road, Shanghai, 201203, China
- School of Chinese Materia Medica, Nanjing University of Chinese Medicine, 138 Xianlin Road, Jiangsu, Nanjing, 210023, China
- Qibang Sui
- Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, 555 Zuchongzhi Road, Shanghai, 201203, China
- University of Chinese Academy of Sciences, No. 19A Yuquan Road, Beijing, 100049, China
- Zhengyang Chen
- Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, 555 Zuchongzhi Road, Shanghai, 201203, China
- University of Chinese Academy of Sciences, No. 19A Yuquan Road, Beijing, 100049, China
- Chen Zheng
- Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, 555 Zuchongzhi Road, Shanghai, 201203, China
- Xinyue Hao
- Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, 555 Zuchongzhi Road, Shanghai, 201203, China
- School of Chinese Materia Medica, Nanjing University of Chinese Medicine, 138 Xianlin Road, Jiangsu, Nanjing, 210023, China
- Keke Zhang
- Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, 555 Zuchongzhi Road, Shanghai, 201203, China
- School of Chinese Materia Medica, Nanjing University of Chinese Medicine, 138 Xianlin Road, Jiangsu, Nanjing, 210023, China
- Rongrong Cui
- Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, 555 Zuchongzhi Road, Shanghai, 201203, China
- Zehong Zhang
- Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, 555 Zuchongzhi Road, Shanghai, 201203, China
- University of Chinese Academy of Sciences, No. 19A Yuquan Road, Beijing, 100049, China
- Hudson Ma
- Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, 555 Zuchongzhi Road, Shanghai, 201203, China
- Yiluan Ding
- Department of Analytical Chemistry, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, 555 Zuchongzhi Road, Shanghai, 201203, China
- Naixia Zhang
- Department of Analytical Chemistry, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, 555 Zuchongzhi Road, Shanghai, 201203, China
- Xiaojie Lu
- Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, 555 Zuchongzhi Road, Shanghai, 201203, China
- University of Chinese Academy of Sciences, No. 19A Yuquan Road, Beijing, 100049, China
- Xiaomin Luo
- Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, 555 Zuchongzhi Road, Shanghai, 201203, China
- University of Chinese Academy of Sciences, No. 19A Yuquan Road, Beijing, 100049, China
- Hualiang Jiang
- Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, 555 Zuchongzhi Road, Shanghai, 201203, China
- University of Chinese Academy of Sciences, No. 19A Yuquan Road, Beijing, 100049, China
- School of Chinese Materia Medica, Nanjing University of Chinese Medicine, 138 Xianlin Road, Jiangsu, Nanjing, 210023, China
- Shanghai Institute for Advanced Immunochemical Studies and School of Life Science and Technology, ShanghaiTech University, No. 393 Huaxia Middle Road, Shanghai, 200031, China
- School of Pharmaceutical Science and Technology, Hangzhou Institute for Advanced Study, University of Chinese Academy of Sciences, 1 Sub-lane Xiangshan, Hangzhou, 310024, China
- Sulin Zhang
- Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, 555 Zuchongzhi Road, Shanghai, 201203, China.
- University of Chinese Academy of Sciences, No. 19A Yuquan Road, Beijing, 100049, China.
- Mingyue Zheng
- Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, 555 Zuchongzhi Road, Shanghai, 201203, China.
- University of Chinese Academy of Sciences, No. 19A Yuquan Road, Beijing, 100049, China.
- School of Chinese Materia Medica, Nanjing University of Chinese Medicine, 138 Xianlin Road, Jiangsu, Nanjing, 210023, China.
- Shanghai Institute for Advanced Immunochemical Studies and School of Life Science and Technology, ShanghaiTech University, No. 393 Huaxia Middle Road, Shanghai, 200031, China.
- School of Pharmaceutical Science and Technology, Hangzhou Institute for Advanced Study, University of Chinese Academy of Sciences, 1 Sub-lane Xiangshan, Hangzhou, 310024, China.
3
Kumari M, Subbarao N. Convolutional neural network-based quantitative structure-activity relationship and fingerprint analysis against inhibitors of anthrax lethal factor. Future Med Chem 2023; 15:853-866. [PMID: 37248697 DOI: 10.4155/fmc-2023-0093] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/27/2023] [Accepted: 05/10/2023] [Indexed: 05/31/2023] Open
Abstract
Aim: To develop a one-dimensional convolutional neural network-based quantitative structure-activity relationship (1D-CNN-QSAR) model to identify novel anthrax inhibitors and analyze chemical space. Methods: We developed a 1D-CNN-QSAR model to identify novel anthrax inhibitors. Results: The statistical results of the 1D-CNN-QSAR model showed a mean square error of 0.045 and a predicted correlation coefficient of 0.79 for the test set. Further, chemical space analysis showed more than 80% fragment pair similarity, with activity cliffs associated with carboxylic acid, 2-phenylfurans, N-phenyldihydropyrazole, N-phenylpyrrole, furan, 4-methylene-1H-pyrazol-5-one, phenylimidazole, phenylpyrrole and phenylpyrazolidine. Conclusion: These fragments may serve as the basis for developing potent novel drug candidates for anthrax. Finally, we concluded that our proposed 1D-CNN-QSAR model and fingerprint analysis might be used to discover potential anthrax drug candidates.
Affiliation(s)
- Madhulata Kumari
- Amity Institute of Biotechnology, Amity University, Rajasthan, Jaipur, India
- Naidu Subbarao
- School of Computational & Integrative Sciences, Jawaharlal Nehru University, New Delhi, 110067, India
4
Remington JM, McKay KT, Beckage NB, Ferrell JB, Schneebeli ST, Li J. GPCRLigNet: rapid screening for GPCR active ligands using machine learning. J Comput Aided Mol Des 2023; 37:147-156. [PMID: 36840893 PMCID: PMC10379640 DOI: 10.1007/s10822-023-00497-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/16/2022] [Accepted: 02/03/2023] [Indexed: 02/26/2023]
Abstract
Molecules with bioactivity towards G protein-coupled receptors represent a subset of the vast space of small drug-like molecules. Here, we compare machine learning models, including dilated graph convolutional networks, that conduct binary classification to quickly identify molecules with activity towards G protein-coupled receptors. The models are trained and validated using a large set of over 600,000 active, inactive, and decoy compounds. The best performing machine learning model, dubbed GPCRLigNet, was a surprisingly simple feedforward dense neural network mapping from Morgan fingerprints to activity. Incorporation of GPCRLigNet into a high-throughput virtual screening workflow is demonstrated with molecular docking towards a particular G protein-coupled receptor, the pituitary adenylate cyclase-activating polypeptide receptor type 1. Through rigorous comparison of docking scores for molecules selected with and without using GPCRLigNet, we demonstrate an enrichment of potentially potent molecules using GPCRLigNet. This work provides a proof of principle that GPCRLigNet can effectively hone the chemical search space towards ligands with G protein-coupled receptor activity.
Affiliation(s)
- Jacob M Remington
- Department of Chemistry, University of Vermont, Burlington, VT, 05405, USA
- Kyle T McKay
- Department of Chemistry, University of Vermont, Burlington, VT, 05405, USA
- Noah B Beckage
- Department of Chemistry, University of Vermont, Burlington, VT, 05405, USA
- Jonathon B Ferrell
- Department of Chemistry, University of Vermont, Burlington, VT, 05405, USA
- Severin T Schneebeli
- Department of Chemistry, University of Vermont, Burlington, VT, 05405, USA; Department of Industrial and Physical Pharmacy, Department of Chemistry, Purdue University, West Lafayette, IN, 47906, USA; Department of Pathology, University of Vermont, Burlington, VT, 05405, USA
- Jianing Li
- Department of Chemistry, University of Vermont, Burlington, VT, 05405, USA; Department of Pathology, University of Vermont, Burlington, VT, 05405, USA; Department of Medicinal Chemistry and Molecular Pharmacology, Purdue University, West Lafayette, IN, 47906, USA
5
Bon M, Bilsland A, Bower J, McAulay K. Fragment-based drug discovery-the importance of high-quality molecule libraries. Mol Oncol 2022; 16:3761-3777. [PMID: 35749608 PMCID: PMC9627785 DOI: 10.1002/1878-0261.13277] [Citation(s) in RCA: 26] [Impact Index Per Article: 13.0] [Received: 01/28/2022] [Revised: 05/16/2022] [Accepted: 06/23/2022] [Indexed: 12/24/2022] Open
Abstract
Fragment-based drug discovery (FBDD) is now established as a complementary approach to high-throughput screening (HTS). Contrary to HTS, where large libraries of drug-like molecules are screened, FBDD screens involve smaller and less complex molecules which, despite a low affinity to protein targets, display more 'atom-efficient' binding interactions than larger molecules. Fragment hits can, therefore, serve as a more efficient start point for subsequent optimisation, particularly for hard-to-drug targets. Since the number of possible molecules increases exponentially with molecular size, small fragment libraries allow for a proportionately greater coverage of their respective 'chemical space' compared with larger HTS libraries comprising larger molecules. However, good library design is essential to ensure optimal chemical and pharmacophore diversity, molecular complexity, and physicochemical characteristics. In this review, we describe our views on fragment library design, and on what constitutes a good fragment from a medicinal and computational chemistry perspective. We highlight emerging chemical and computational technologies in FBDD and discuss strategies for optimising fragment hits. The impact of novel FBDD approaches is already being felt, with the recent approval of the covalent KRASG12C inhibitor sotorasib highlighting the utility of FBDD against targets that were long considered undruggable.
Affiliation(s)
- Marta Bon
- Cancer Research Horizons, Cancer Research UK Beatson Institute, Glasgow, UK
- Alan Bilsland
- Cancer Research Horizons, Cancer Research UK Beatson Institute, Glasgow, UK
- Justin Bower
- Cancer Research Horizons, Cancer Research UK Beatson Institute, Glasgow, UK
- Kirsten McAulay
- Cancer Research Horizons, Cancer Research UK Beatson Institute, Glasgow, UK
6
Kumar R, Sharma A, Alexiou A, Ashraf GM. Artificial Intelligence in De novo Drug Design: Are We Still There? Curr Top Med Chem 2022; 22:2483-2492. [PMID: 36263480 DOI: 10.2174/1568026623666221017143244] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/03/2022] [Revised: 09/06/2022] [Accepted: 09/15/2022] [Indexed: 01/20/2023]
Abstract
BACKGROUND: The artificial intelligence (AI)-assisted design of drug candidates with novel structures and desired properties has received significant attention in the recent past, as have related areas of forward prediction that aim to discover chemical matter worth synthesizing and investigating experimentally. OBJECTIVES: The purpose behind developing AI-driven models is to explore the broader chemical space and suggest new drug candidate scaffolds with promising therapeutic value. Moreover, it is anticipated that such AI-based models may not only significantly reduce the cost and time but also decrease the attrition rate of drug candidates that fail to reach the desirable endpoints at the final stages of drug development. In an attempt to develop AI-based models for de novo drug design, numerous methods have been proposed by various study groups by applying machine learning and deep learning algorithms to chemical datasets. However, there are many challenges in obtaining accurate predictions, and real breakthroughs in de novo drug design are still scarce. METHODS: In this review, we explore the recent trends in developing AI-based models for de novo drug design to assess the current status, challenges, and opportunities in the field. CONCLUSION: The consistently improved AI algorithms and the abundance of curated training chemical data indicate that AI-based de novo drug design should perform better than the current models. Improvements in performance are warranted to obtain better outcomes in the form of potential drug candidates that perform well under in vivo conditions, especially in the case of more complex diseases.
Affiliation(s)
- Rajnish Kumar
- Amity Institute of Biotechnology, Amity University Uttar Pradesh Lucknow Campus, Uttar Pradesh, India
- Anju Sharma
- Department of Applied Science, Indian Institute of Information Technology, Allahabad, Uttar Pradesh, India
- Athanasios Alexiou
- Novel Global Community Educational Foundation, Hebersham, 2770 NSW, Australia; AFNP Med Austria, 1010 Wien, Austria
- Ghulam Md Ashraf
- Pre-Clinical Research Unit (PCRU), King Fahd Medical Research Center, King Abdulaziz University, Jeddah, Saudi Arabia; Department of Medical Laboratory Technology, Faculty of Applied Medical Sciences, King Abdulaziz University, Jeddah, Saudi Arabia
7
Interpretable Machine Learning Models for Molecular Design of Tyrosine Kinase Inhibitors Using Variational Autoencoders and Perturbation-Based Approach of Chemical Space Exploration. Int J Mol Sci 2022; 23:ijms231911262. [PMID: 36232566 PMCID: PMC9569663 DOI: 10.3390/ijms231911262] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 07/29/2022] [Revised: 09/21/2022] [Accepted: 09/21/2022] [Indexed: 11/17/2022] Open
Abstract
In the current study, we introduce an integrative machine learning strategy for the autonomous molecular design of protein kinase inhibitors using variational autoencoders and a novel cluster-based perturbation approach for exploration of the chemical latent space. The proposed strategy combines autoencoder-based embedding of small molecules with a cluster-based perturbation approach for efficient navigation of the latent space and a feature-based kinase inhibition likelihood classifier that guides optimization of the molecular properties and targeted molecular design. In the proposed generative approach, molecules sharing similar structures tend to cluster in the latent space, and interpolating between two molecules in the latent space enables smooth changes in the molecular structures and properties. The results demonstrated that the proposed strategy can efficiently explore the latent space of small molecules and kinase inhibitors along interpretable directions to guide the generation of novel family-specific kinase molecules that display a significant scaffold diversity and optimal biochemical properties. Through assessment of the latent-based and chemical feature-based binary and multiclass classifiers, we developed a robust probabilistic evaluator of kinase inhibition likelihood that is specifically tailored to guide the molecular design of novel SRC kinase molecules. The generated molecules originating from LCK and ABL1 kinase inhibitors yielded ~40% of novel and valid SRC kinase compounds with high kinase inhibition likelihood probability values (p > 0.75) and high similarity (Tanimoto coefficient > 0.6) to the known SRC inhibitors. 
By combining the molecular perturbation design with the kinase inhibition likelihood analysis and similarity assessments, we showed that the proposed molecular design strategy can produce novel valid molecules and transform known inhibitors of different kinase families into potential chemical probes of the SRC kinase with excellent physicochemical profiles and high similarity to the known SRC kinase drugs. The results of our study suggest that task-specific manipulation of a biased latent space may be an important direction for more effective task-oriented and target-specific autonomous chemical design models.
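The two acceptance thresholds quoted in this abstract (inhibition likelihood p > 0.75 and Tanimoto coefficient > 0.6) can be illustrated with a small sketch. It assumes fingerprints are represented as sets of "on" bit indices and that similarity is taken as the maximum over a list of known inhibitors; both are illustrative assumptions, and the function names are hypothetical rather than taken from the paper.

```python
def tanimoto(fp_a, fp_b):
    """Tanimoto coefficient between two fingerprints given as sets of on-bits."""
    a, b = set(fp_a), set(fp_b)
    if not a and not b:
        return 0.0  # convention chosen here for two empty fingerprints
    return len(a & b) / len(a | b)


def passes_filter(p_inhibit, candidate_fp, reference_fps,
                  p_min=0.75, sim_min=0.6):
    """Keep a generated molecule only if both thresholds are exceeded.

    p_inhibit is a classifier probability supplied by the caller; the
    max-over-references similarity rule is an assumption of this sketch.
    """
    best = max((tanimoto(candidate_fp, ref) for ref in reference_fps),
               default=0.0)
    return p_inhibit > p_min and best > sim_min


if __name__ == "__main__":
    known = [{1, 2, 3, 4, 5}]          # toy reference fingerprint
    candidate = {1, 2, 3, 4}           # shares 4 of 5 bits in the union
    print(tanimoto(candidate, known[0]))
    print(passes_filter(0.8, candidate, known))
```

In practice the bit sets would come from a cheminformatics toolkit's fingerprint routine; plain sets are used here only to keep the arithmetic visible.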
8
Togre NS, Vargas AM, Bhargavi G, Mallakuntla MK, Tiwari S. Fragment-Based Drug Discovery against Mycobacteria: The Success and Challenges. Int J Mol Sci 2022; 23:ijms231810669. [PMID: 36142582 PMCID: PMC9500838 DOI: 10.3390/ijms231810669] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/26/2022] [Revised: 09/10/2022] [Accepted: 09/10/2022] [Indexed: 11/29/2022] Open
Abstract
The emergence of drug-resistant mycobacteria, including Mycobacterium tuberculosis (Mtb) and non-tuberculous mycobacteria (NTM), poses an increasing global threat that urgently demands the development of new potent anti-mycobacterial drugs. One of the approaches toward the identification of new drugs is fragment-based drug discovery (FBDD), which is the most ingenious among other drug discovery models, such as structure-based drug design (SBDD) and high-throughput screening. Specialized techniques, such as X-ray crystallography, nuclear magnetic resonance spectroscopy, and many others, are part of the drug discovery approach to combat the Mtb and NTM global menaces. Moreover, the primary drawbacks of traditional methods, such as the limited measurement of biomolecular toxicity and uncertain bioavailability evaluation, are successfully overcome by the FBDD approach. The current review focuses on the recognition of fragment-based drug discovery as a popular approach using virtual, computational, and biophysical methods to identify potent fragment molecules. FBDD focuses on designing optimal inhibitors against potential therapeutic targets of NTM and Mtb (PurC, ArgB, MmpL3, and TrmD). Additionally, we have elaborated on the challenges associated with the FBDD approach in the identification and development of novel compounds. Insights into the applications and overcoming the challenges of FBDD approaches will aid in the identification of potential therapeutic compounds to treat drug-sensitive and drug-resistant NTMs and Mtb infections.
9
Zhang X, Zheng QR, He HZ. Machine-learning-based prediction of hydrogen adsorption capacity at varied temperatures and pressures for MOFs adsorbents. J Taiwan Inst Chem Eng 2022. [DOI: 10.1016/j.jtice.2022.104479] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Indexed: 01/15/2023]
10
Progress and Impact of Latin American Natural Product Databases. Biomolecules 2022; 12:biom12091202. [PMID: 36139041 PMCID: PMC9496143 DOI: 10.3390/biom12091202] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/11/2022] [Revised: 08/27/2022] [Accepted: 08/29/2022] [Indexed: 11/17/2022] Open
Abstract
Natural products (NPs) are a rich source of structurally novel molecules, and the chemical space they encompass is far from being fully explored. Over history, NPs have represented a significant source of bioactive molecules and have served as a source of inspiration for developing many drugs on the market. On the other hand, computer-aided drug design (CADD) has contributed to drug discovery research, mitigating costs and time. In this sense, compound databases represent a fundamental element of CADD. This work reviews the progress toward developing compound databases of natural origin, and it surveys computational methods, emphasizing chemoinformatic approaches to profile natural product databases. Furthermore, it reviews the present state of the art in developing Latin American NP databases and their practical applications to the drug discovery area.
11
Spenke F, Hartke B. Graph-based Automated Macro-Molecule Assembly. J Chem Inf Model 2022; 62:3714-3723. [PMID: 35938711 DOI: 10.1021/acs.jcim.2c00609] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/29/2022]
Abstract
We present a general molecular framework assembly algorithm that takes a largely arbitrary molecular fragment database and a user-supplied target template graph as input. Automatic assembly of molecular fragments from the database, following a prescribed, user-supplied set of connection rules, then turns the template graph into an actual, chemically reasonable molecular framework. Assembly capabilities of our algorithm are tested by producing several abstract, closed-loop shapes. To indicate a few of many possible application areas we demonstrate a host-guest complex and a road toward catalysis. Postassembly substituent exchange can be used to produce electric fields of desired values at desired points inside the framework or at its surface as a stepping stone toward rationally designed, artificial heterogeneous catalysts.
Affiliation(s)
- Florian Spenke
- Institute for Physical Chemistry, Christian-Albrechts-University, Olshausenstrasse 40, Kiel 24098, Germany
- Bernd Hartke
- Institute for Physical Chemistry, Christian-Albrechts-University, Olshausenstrasse 40, Kiel 24098, Germany
12
Warr WA, Nicklaus MC, Nicolaou CA, Rarey M. Exploration of Ultralarge Compound Collections for Drug Discovery. J Chem Inf Model 2022; 62:2021-2034. [PMID: 35421301 DOI: 10.1021/acs.jcim.2c00224] [Citation(s) in RCA: 46] [Impact Index Per Article: 23.0] [Indexed: 02/07/2023]
Abstract
Designing new medicines more cheaply and quickly is tightly linked to the quest of exploring chemical space more widely and efficiently. Chemical space is monumentally large, but recent advances in computer software and hardware have enabled researchers to navigate virtual chemical spaces containing billions of chemical structures. This review specifically concerns collections of many millions or even billions of enumerated chemical structures as well as even larger chemical spaces that are not fully enumerated. We present examples of chemical libraries and spaces and the means used to construct them, and we discuss new technologies for searching huge libraries and for searching combinatorially in chemical space. We also cover space navigation techniques and consider new approaches to de novo drug design and the impact of the "autonomous laboratory" on synthesis of designed compounds. Finally, we summarize some other challenges and opportunities for the future.
Affiliation(s)
- Wendy A Warr
- Wendy Warr & Associates, 6 Berwick Court, Holmes Chapel, Crewe, Cheshire CW4 7HZ, United Kingdom
- Marc C Nicklaus
- NCI, NIH, CADD Group, NCI-Frederick, Frederick, Maryland 21702, United States
- Christos A Nicolaou
- Discovery Chemistry, Lilly Research Laboratories, Eli Lilly and Company, Indianapolis, Indiana 46285, United States
| | - Matthias Rarey
- Universität Hamburg, ZBH Center for Bioinformatics, 20146 Hamburg, Germany
| |
Collapse
|
13
|
Bilsland AE, Pugliese A, Bower J. Implementation of an AI-assisted fragment-generator in an open-source platform. RSC Med Chem 2022; 13:1205-1211. [PMID: 36320432 PMCID: PMC9579942 DOI: 10.1039/d2md00152g] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/20/2022] [Accepted: 07/27/2022] [Indexed: 11/21/2022]
Abstract
We recently reported a deep learning model to facilitate fragment library design, which is critical for efficient hit identification. However, our model was implemented in Python. We have now created an implementation in the KNIME graphical pipelining environment which we hope will allow experimentation by users with limited programming knowledge.
Affiliation(s)
- Alan E. Bilsland
  - Cancer Research Horizons – Therapeutic Innovation, Cancer Research UK Beatson Institute, Garscube Estate, Switchback Road, Glasgow G61 1BD, UK
- Angelo Pugliese
  - BioAscent Discovery, Bo'Ness Road, Newhouse, Lanarkshire ML1 5UH, UK
- Justin Bower
  - Cancer Research Horizons – Therapeutic Innovation, Cancer Research UK Beatson Institute, Garscube Estate, Switchback Road, Glasgow G61 1BD, UK

14
Piticchio SG, Martínez-Cartró M, Scaffidi S, Rachman M, Rodriguez-Arevalo S, Sanchez-Arfelis A, Escolano C, Picaud S, Krojer T, Filippakopoulos P, von Delft F, Galdeano C, Barril X. Discovery of Novel BRD4 Ligand Scaffolds by Automated Navigation of the Fragment Chemical Space. J Med Chem 2021; 64:17887-17900. [PMID: 34898210 DOI: 10.1021/acs.jmedchem.1c01108] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Indexed: 12/13/2022]
Abstract
Fragment-based drug discovery (FBDD) is a very effective hit identification method. However, the evolution of fragment hits into suitable leads remains challenging and largely artisanal. Fragment evolution is often scaffold-centric, meaning that its outcome depends crucially on the chemical structure of the starting fragment. Considering that fragment screening libraries cover only a small proportion of the corresponding chemical space, hits should be seen as probes highlighting privileged areas of the chemical space rather than actual starting points. We have developed an automated computational pipeline to mine the chemical space around any specific fragment hit, rapidly finding analogues that share a common interaction motif but are structurally novel and diverse. On a prospective application on the bromodomain-containing protein 4 (BRD4), starting from a known fragment, the platform yields active molecules with nonobvious scaffold changes. The procedure is fast and inexpensive and has the potential to uncover many hidden opportunities in FBDD.
Affiliation(s)
- Serena G Piticchio
  - Departament de Farmacia i Tecnología Farmacèutica, i Fisicoquímica, Institut de Biomedicina (IBUB), Universitat de Barcelona, Av. Joan XXIII, 27-31, E-08028 Barcelona, Spain
- Míriam Martínez-Cartró
  - Departament de Farmacia i Tecnología Farmacèutica, i Fisicoquímica, Institut de Biomedicina (IBUB), Universitat de Barcelona, Av. Joan XXIII, 27-31, E-08028 Barcelona, Spain
- Salvatore Scaffidi
  - Departament de Farmacia i Tecnología Farmacèutica, i Fisicoquímica, Institut de Biomedicina (IBUB), Universitat de Barcelona, Av. Joan XXIII, 27-31, E-08028 Barcelona, Spain
- Moira Rachman
  - Departament de Farmacia i Tecnología Farmacèutica, i Fisicoquímica, Institut de Biomedicina (IBUB), Universitat de Barcelona, Av. Joan XXIII, 27-31, E-08028 Barcelona, Spain
- Sergio Rodriguez-Arevalo
  - Laboratory of Medicinal Chemistry (Associated Unit to CSIC), Department of Pharmacology, Toxicology and Medicinal Chemistry, Faculty of Pharmacy and Food Sciences, and Institute of Biomedicine (IBUB), University of Barcelona, Av. Joan XXIII, 27-31, E-08028 Barcelona, Spain
- Ainoa Sanchez-Arfelis
  - Laboratory of Medicinal Chemistry (Associated Unit to CSIC), Department of Pharmacology, Toxicology and Medicinal Chemistry, Faculty of Pharmacy and Food Sciences, and Institute of Biomedicine (IBUB), University of Barcelona, Av. Joan XXIII, 27-31, E-08028 Barcelona, Spain
- Carmen Escolano
  - Laboratory of Medicinal Chemistry (Associated Unit to CSIC), Department of Pharmacology, Toxicology and Medicinal Chemistry, Faculty of Pharmacy and Food Sciences, and Institute of Biomedicine (IBUB), University of Barcelona, Av. Joan XXIII, 27-31, E-08028 Barcelona, Spain
- Sarah Picaud
  - Structural Genomics Consortium, Nuffield Department of Medicine, Oxford University, Old Road Campus Research Building, Roosevelt Drive, OX3 7DQ Oxford, United Kingdom
- Tobias Krojer
  - Structural Genomics Consortium, Nuffield Department of Medicine, Oxford University, Old Road Campus Research Building, Roosevelt Drive, OX3 7DQ Oxford, United Kingdom
- Panagis Filippakopoulos
  - Structural Genomics Consortium, Nuffield Department of Medicine, Oxford University, Old Road Campus Research Building, Roosevelt Drive, OX3 7DQ Oxford, United Kingdom
- Frank von Delft
  - Structural Genomics Consortium, Nuffield Department of Medicine, Oxford University, Old Road Campus Research Building, Roosevelt Drive, OX3 7DQ Oxford, United Kingdom
  - Diamond Light Source Ltd., Harwell Science and Innovation Campus, Didcot OX11 0QX, United Kingdom
  - Research Complex at Harwell, Harwell Science and Innovation Campus, Didcot OX11 0FA, United Kingdom
  - Centre for Medicines Discovery, University of Oxford, Oxford OX1 3QU, United Kingdom
  - Department of Biochemistry, University of Johannesburg, Auckland Park 2006, South Africa
- Carles Galdeano
  - Departament de Farmacia i Tecnología Farmacèutica, i Fisicoquímica, Institut de Biomedicina (IBUB), Universitat de Barcelona, Av. Joan XXIII, 27-31, E-08028 Barcelona, Spain
- Xavier Barril
  - Departament de Farmacia i Tecnología Farmacèutica, i Fisicoquímica, Institut de Biomedicina (IBUB), Universitat de Barcelona, Av. Joan XXIII, 27-31, E-08028 Barcelona, Spain
  - Catalan Institution for Research and Advanced Studies (ICREA), Barcelona 08010, Spain

15
Pujol‐Giménez J, Poirier M, Bühlmann S, Schuppisser C, Bhardwaj R, Awale M, Visini R, Javor S, Hediger MA, Reymond J. Inhibitors of Human Divalent Metal Transporters DMT1 (SLC11A2) and ZIP8 (SLC39A8) from a GDB-17 Fragment Library. ChemMedChem 2021; 16:3306-3314. [PMID: 34309203 PMCID: PMC8596699 DOI: 10.1002/cmdc.202100467] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 07/02/2021] [Indexed: 11/06/2022]
Abstract
Solute carrier proteins (SLCs) are membrane proteins controlling fluxes across biological membranes and represent an emerging class of drug targets. Here we searched for inhibitors of divalent metal transporters in a library of 1,676 commercially available 3D-shaped fragment-like molecules from the generated database GDB-17, which lists all possible organic molecules up to 17 atoms of C, N, O, S and halogen following simple criteria for chemical stability and synthetic feasibility. While screening against DMT1 (SLC11A2), an iron transporter associated with hemochromatosis and for which only very few inhibitors are known, only yielded two weak inhibitors, our approach led to the discovery of the first inhibitor of ZIP8 (SLC39A8), a zinc transporter associated with manganese homeostasis and osteoarthritis but with no previously reported pharmacology, demonstrating that this target is druggable.
Affiliation(s)
- Jonai Pujol‐Giménez
  - Department of Biomedical Research and Department of Nephrology and Hypertension, Membrane Transport Discovery Lab, Inselspital, Bern University Hospital, University of Bern, CH-3010 Bern, Switzerland
- Marion Poirier
  - Department of Chemistry, Biochemistry and Pharmaceutical Sciences, University of Bern, Freiestrasse 3, 3012 Bern, Switzerland
- Sven Bühlmann
  - Department of Chemistry, Biochemistry and Pharmaceutical Sciences, University of Bern, Freiestrasse 3, 3012 Bern, Switzerland
- Céline Schuppisser
  - Department of Chemistry, Biochemistry and Pharmaceutical Sciences, University of Bern, Freiestrasse 3, 3012 Bern, Switzerland
- Rajesh Bhardwaj
  - Department of Biomedical Research and Department of Nephrology and Hypertension, Membrane Transport Discovery Lab, Inselspital, Bern University Hospital, University of Bern, CH-3010 Bern, Switzerland
- Mahendra Awale
  - Department of Chemistry, Biochemistry and Pharmaceutical Sciences, University of Bern, Freiestrasse 3, 3012 Bern, Switzerland
- Ricardo Visini
  - Department of Chemistry, Biochemistry and Pharmaceutical Sciences, University of Bern, Freiestrasse 3, 3012 Bern, Switzerland
- Sacha Javor
  - Department of Chemistry, Biochemistry and Pharmaceutical Sciences, University of Bern, Freiestrasse 3, 3012 Bern, Switzerland
- Matthias A. Hediger
  - Department of Biomedical Research and Department of Nephrology and Hypertension, Membrane Transport Discovery Lab, Inselspital, Bern University Hospital, University of Bern, CH-3010 Bern, Switzerland
- Jean‐Louis Reymond
  - Department of Chemistry, Biochemistry and Pharmaceutical Sciences, University of Bern, Freiestrasse 3, 3012 Bern, Switzerland

16
Bilsland AE, McAulay K, West R, Pugliese A, Bower J. Automated Generation of Novel Fragments Using Screening Data, a Dual SMILES Autoencoder, Transfer Learning and Syntax Correction. J Chem Inf Model 2021; 61:2547-2559. [PMID: 34029470 DOI: 10.1021/acs.jcim.0c01226] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Indexed: 12/28/2022]
Abstract
Fragment-based hit identification (FBHI) allows proportionately greater coverage of chemical space using fewer molecules than traditional high-throughput screening approaches. However, effectively exploiting this advantage is highly dependent on the library design. Solubility, stability, chemical complexity, chemical/shape diversity, and synthetic tractability for fragment elaboration are all critical aspects, and molecule design remains a time-consuming task for computational and medicinal chemists. Artificial neural networks have attracted considerable attention in automated de novo design applications and could also prove useful for fragment library design. Chemical autoencoders are neural networks consisting of encoder and decoder parts, which respectively compress and decompress molecular representations. The decoder is applied to samples drawn from the space of compressed representations to generate novel molecules that can be scored for properties of interest. Here, we report an autoencoder model using a recurrent neural network architecture, which was trained using 486,565 fragments curated from commercial sources, to simultaneously reconstruct both SMILES and chemical fingerprints. To explore its utility in fragment design, we applied transfer learning to the fingerprint decoder layers to train a classifier using 66 frequent hitter fragments identified from our screening campaigns. Using a particle swarm optimization sampling approach, we compare the performance of this "dual" model to an architecture encoding SMILES only. The dual model produced valid SMILES with improved features, considering a range of properties including aromatic ring counts, heavy atom count, synthetic accessibility, and a new fragment complexity score we term Feature Complexity (FeCo). 
Additionally, we demonstrate that generative performance is further enhanced by use of a simple syntax-correction procedure during training, in which invalid and undesirable SMILES are spiked into the training set. Finally, we used the syntax-corrected model to generate a library of novel candidate privileged fragments.
Affiliation(s)
- Alan E Bilsland
  - Beatson Drug Discovery Unit, Cancer Research UK Beatson Institute, Garscube Estate, Switchback Road, Bearsden, Glasgow, G61 1BD, U.K.
- Kirsten McAulay
  - Beatson Drug Discovery Unit, Cancer Research UK Beatson Institute, Garscube Estate, Switchback Road, Bearsden, Glasgow, G61 1BD, U.K.
- Ryan West
  - Beatson Drug Discovery Unit, Cancer Research UK Beatson Institute, Garscube Estate, Switchback Road, Bearsden, Glasgow, G61 1BD, U.K.
- Angelo Pugliese
  - Beatson Drug Discovery Unit, Cancer Research UK Beatson Institute, Garscube Estate, Switchback Road, Bearsden, Glasgow, G61 1BD, U.K.
  - BioAscent Discovery Ltd., Bo'Ness Road, Newhouse, Lanarkshire ML1 5UH, U.K.
- Justin Bower
  - Beatson Drug Discovery Unit, Cancer Research UK Beatson Institute, Garscube Estate, Switchback Road, Bearsden, Glasgow, G61 1BD, U.K.

17
Wang ZZ, Shi XX, Huang GY, Hao GF, Yang GF. Fragment-based drug design facilitates selective kinase inhibitor discovery. Trends Pharmacol Sci 2021; 42:551-565. [PMID: 33958239 DOI: 10.1016/j.tips.2021.04.001] [Citation(s) in RCA: 12] [Impact Index Per Article: 4.0] [Received: 10/30/2020] [Revised: 03/30/2021] [Accepted: 04/07/2021] [Indexed: 12/16/2022]
Abstract
Protein kinases (PKs) are important drug targets, but kinases selectivity poses a challenge to protein kinase inhibitors (PKIs) design. Fragment-based drug discovery (FBDD) has achieved great success in the discovery of highly specific PKIs. It makes full use of kinase-fragment interaction in target kinase subpockets to obtain promising selectivity. However, it's difficult to understand the complicated kinase-fragment interaction space, and systemic discussion of these interactions is still lacking. Herein, we introduce the advantages of the FBDD strategy in PKIs design. Key features of the selectivity of kinase-fragment interactions are summarized and analyzed. Some promising PKIs are introduced as case studies to help understand the fragment-to-lead (F2L) optimization process. Novel strategies and technologies for FBDD in PKIs discovery are also outlooked.
Affiliation(s)
- Zhi-Zheng Wang
  - Key Laboratory of Pesticide and Chemical Biology, Ministry of Education, College of Chemistry, International Joint Research Center for Intelligent Biosensor Technology and Health, Central China Normal University, Wuhan, 430079, China
  - International Joint Research Center for Intelligent Biosensor Technology and Health, Central China Normal University, Wuhan 430079, China
- Xing-Xing Shi
  - Key Laboratory of Pesticide and Chemical Biology, Ministry of Education, College of Chemistry, International Joint Research Center for Intelligent Biosensor Technology and Health, Central China Normal University, Wuhan, 430079, China
  - International Joint Research Center for Intelligent Biosensor Technology and Health, Central China Normal University, Wuhan 430079, China
- Guang-Yi Huang
  - Key Laboratory of Pesticide and Chemical Biology, Ministry of Education, College of Chemistry, International Joint Research Center for Intelligent Biosensor Technology and Health, Central China Normal University, Wuhan, 430079, China
  - International Joint Research Center for Intelligent Biosensor Technology and Health, Central China Normal University, Wuhan 430079, China
- Ge-Fei Hao
  - Key Laboratory of Pesticide and Chemical Biology, Ministry of Education, College of Chemistry, International Joint Research Center for Intelligent Biosensor Technology and Health, Central China Normal University, Wuhan, 430079, China
  - International Joint Research Center for Intelligent Biosensor Technology and Health, Central China Normal University, Wuhan 430079, China
  - State Key Laboratory Breeding Base of Green Pesticide and Agricultural Bioengineering, Key Laboratory of Green Pesticide and Agricultural Bioengineering, Ministry of Education, Center for Research and Development of Fine Chemicals, Guizhou University, Guiyang 550025, China
- Guang-Fu Yang
  - Key Laboratory of Pesticide and Chemical Biology, Ministry of Education, College of Chemistry, International Joint Research Center for Intelligent Biosensor Technology and Health, Central China Normal University, Wuhan, 430079, China
  - International Joint Research Center for Intelligent Biosensor Technology and Health, Central China Normal University, Wuhan 430079, China

18
Mei L, Wu F, Hao G, Yang G. Protocol for hit-to-lead optimization of compounds by auto in silico ligand directing evolution (AILDE) approach. STAR Protoc 2021; 2:100312. [PMID: 33554146 PMCID: PMC7856476 DOI: 10.1016/j.xpro.2021.100312] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Indexed: 12/31/2022]
Abstract
Hit-to-lead (H2L) optimization is crucial for drug design, which has become an increasing concern in medicinal chemistry. A virtual screening strategy of auto in silico ligand directing evolution (AILDE) has been developed to yield promising lead compounds rapidly and efficiently. The protocol includes instructions for fragment compound library construction, conformational sampling by molecular dynamics simulation, ligand modification by fragment growing, as well as the binding free energy prediction. For complete details on the use and execution of this protocol, please refer to Wu et al. (2020).
Affiliation(s)
- Longcan Mei
  - Key Laboratory of Pesticide & Chemical Biology, Ministry of Education, College of Chemistry, Central China Normal University, Wuhan 430079, China
  - International Joint Research Center for Intelligent Biosensor Technology and Health, Central China Normal University, Wuhan 430079, China
- Fengxu Wu
  - Key Laboratory of Pesticide & Chemical Biology, Ministry of Education, College of Chemistry, Central China Normal University, Wuhan 430079, China
  - International Joint Research Center for Intelligent Biosensor Technology and Health, Central China Normal University, Wuhan 430079, China
- Gefei Hao
  - Key Laboratory of Pesticide & Chemical Biology, Ministry of Education, College of Chemistry, Central China Normal University, Wuhan 430079, China
  - International Joint Research Center for Intelligent Biosensor Technology and Health, Central China Normal University, Wuhan 430079, China
  - State Key Laboratory Breeding Base of Green Pesticide and Agricultural Bioengineering, Key Laboratory of Green Pesticide and Agricultural Bioengineering, Ministry of Education, Research and Development Center for Fine Chemicals, Guizhou University, Guiyang 550025, China
- Guangfu Yang
  - Key Laboratory of Pesticide & Chemical Biology, Ministry of Education, College of Chemistry, Central China Normal University, Wuhan 430079, China
  - International Joint Research Center for Intelligent Biosensor Technology and Health, Central China Normal University, Wuhan 430079, China
  - Collaborative Innovation Center of Chemical Science and Engineering, Tianjin 300072, China

19
Meier K, Arús‐Pous J, Reymond J. A Potent and Selective Janus Kinase Inhibitor with a Chiral 3D‐Shaped Triquinazine Ring System from Chemical Space. Angew Chem Int Ed Engl 2021. [DOI: 10.1002/ange.202012049] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/06/2022]
Affiliation(s)
- Kris Meier
  - Department of Chemistry and Biochemistry, University of Bern, Freiestrasse 3, 3012 Bern, Switzerland
- Josep Arús‐Pous
  - Department of Chemistry and Biochemistry, University of Bern, Freiestrasse 3, 3012 Bern, Switzerland
- Jean‐Louis Reymond
  - Department of Chemistry and Biochemistry, University of Bern, Freiestrasse 3, 3012 Bern, Switzerland

20
Yang T, Li Z, Chen Y, Feng D, Wang G, Fu Z, Ding X, Tan X, Zhao J, Luo X, Chen K, Jiang H, Zheng M. DrugSpaceX: a large screenable and synthetically tractable database extending drug space. Nucleic Acids Res 2021; 49:D1170-D1178. [PMID: 33104791 PMCID: PMC7778939 DOI: 10.1093/nar/gkaa920] [Citation(s) in RCA: 14] [Impact Index Per Article: 4.7] [Received: 07/24/2020] [Revised: 09/11/2020] [Accepted: 10/05/2020] [Indexed: 02/07/2023]
Abstract
One of the most prominent topics in drug discovery is efficient exploration of the vast drug-like chemical space to find synthesizable and novel chemical structures with desired biological properties. To address this challenge, we created the DrugSpaceX (https://drugspacex.simm.ac.cn/) database based on expert-defined transformations of approved drug molecules. The current version of DrugSpaceX contains >100 million transformed chemical products for virtual screening, with outstanding characteristics in terms of structural novelty, diversity and large three-dimensional chemical space coverage. To illustrate its practical application in drug discovery, we used a case study of discoidin domain receptor 1 (DDR1), a kinase target implicated in fibrosis and other diseases, to show DrugSpaceX performing a quick search of initial hit compounds. Additionally, for ligand identification and optimization purposes, DrugSpaceX also provides several subsets for download, including a 10% diversity subset, an extended drug-like subset, a drug-like subset, a lead-like subset, and a fragment-like subset. In addition to chemical properties and transformation instructions, DrugSpaceX can locate the position of transformation, which will enable medicinal chemists to easily integrate strategy planning and protection design.
Affiliation(s)
- Tianbiao Yang
  - Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, 555 Zuchongzhi Road, Shanghai 201203, China
  - Department of Pharmacy, University of Chinese Academy of Sciences, No.19A Yuquan Road, Beijing 100049, China
  - School of Pharmaceutical Science and Technology, Hangzhou Institute for Advanced Study, UCAS, Hangzhou 310024, China
- Zhaojun Li
  - School of Information Management, Dezhou University, No. 566 University Rd. West, Dezhou 253023, Shandong, China
- Yingjia Chen
  - Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, 555 Zuchongzhi Road, Shanghai 201203, China
  - Department of Pharmacy, University of Chinese Academy of Sciences, No.19A Yuquan Road, Beijing 100049, China
- Dan Feng
  - Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, 555 Zuchongzhi Road, Shanghai 201203, China
  - Department of Chemistry, College of Sciences, Shanghai University, Shanghai, China
- Guangchao Wang
  - School of Information Management, Dezhou University, No. 566 University Rd. West, Dezhou 253023, Shandong, China
- Zunyun Fu
  - Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, 555 Zuchongzhi Road, Shanghai 201203, China
  - Nanjing University of Chinese Medicine, 138 Xianlin Road, Jiangsu, Nanjing 210023, China
- Xiaoyu Ding
  - Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, 555 Zuchongzhi Road, Shanghai 201203, China
  - Department of Pharmacy, University of Chinese Academy of Sciences, No.19A Yuquan Road, Beijing 100049, China
- Xiaoqin Tan
  - Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, 555 Zuchongzhi Road, Shanghai 201203, China
  - Department of Pharmacy, University of Chinese Academy of Sciences, No.19A Yuquan Road, Beijing 100049, China
- Jihui Zhao
  - Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, 555 Zuchongzhi Road, Shanghai 201203, China
  - Department of Pharmacy, University of Chinese Academy of Sciences, No.19A Yuquan Road, Beijing 100049, China
- Xiaomin Luo
  - Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, 555 Zuchongzhi Road, Shanghai 201203, China
  - Department of Pharmacy, University of Chinese Academy of Sciences, No.19A Yuquan Road, Beijing 100049, China
- Kaixian Chen
  - Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, 555 Zuchongzhi Road, Shanghai 201203, China
  - Department of Pharmacy, University of Chinese Academy of Sciences, No.19A Yuquan Road, Beijing 100049, China
- Hualiang Jiang
  - Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, 555 Zuchongzhi Road, Shanghai 201203, China
  - Department of Pharmacy, University of Chinese Academy of Sciences, No.19A Yuquan Road, Beijing 100049, China
  - School of Pharmaceutical Science and Technology, Hangzhou Institute for Advanced Study, UCAS, Hangzhou 310024, China
  - School of Life Science and Technology, ShanghaiTech University, 393 Huaxiazhong Road, Shanghai 200031, China
- Mingyue Zheng
  - Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, 555 Zuchongzhi Road, Shanghai 201203, China
  - Department of Pharmacy, University of Chinese Academy of Sciences, No.19A Yuquan Road, Beijing 100049, China

21
Kim H, Kim E, Lee I, Bae B, Park M, Nam H. Artificial Intelligence in Drug Discovery: A Comprehensive Review of Data-driven and Machine Learning Approaches. BIOTECHNOL BIOPROC E 2021; 25:895-930. [PMID: 33437151 PMCID: PMC7790479 DOI: 10.1007/s12257-020-0049-y] [Citation(s) in RCA: 28] [Impact Index Per Article: 9.3] [Received: 02/13/2020] [Revised: 05/27/2020] [Accepted: 06/03/2020] [Indexed: 02/07/2023]
Abstract
As expenditure on drug development increases exponentially, the overall drug discovery process requires a sustainable revolution. Since artificial intelligence (AI) is leading the fourth industrial revolution, AI can be considered as a viable solution for unstable drug research and development. Generally, AI is applied to fields with sufficient data such as computer vision and natural language processing, but there are many efforts to revolutionize the existing drug discovery process by applying AI. This review provides a comprehensive, organized summary of the recent research trends in AI-guided drug discovery process including target identification, hit identification, ADMET prediction, lead optimization, and drug repositioning. The main data sources in each field are also summarized in this review. In addition, an in-depth analysis of the remaining challenges and limitations will be provided, and proposals for promising future directions in each of the aforementioned areas.
Affiliation(s)
- Hyunho Kim
  - School of Electrical Engineering and Computer Science, Gwangju Institute of Science and Technology (GIST), Gwangju, 61005 Korea
- Eunyoung Kim
  - School of Electrical Engineering and Computer Science, Gwangju Institute of Science and Technology (GIST), Gwangju, 61005 Korea
- Ingoo Lee
  - School of Electrical Engineering and Computer Science, Gwangju Institute of Science and Technology (GIST), Gwangju, 61005 Korea
- Bongsung Bae
  - School of Electrical Engineering and Computer Science, Gwangju Institute of Science and Technology (GIST), Gwangju, 61005 Korea
- Minsu Park
  - School of Electrical Engineering and Computer Science, Gwangju Institute of Science and Technology (GIST), Gwangju, 61005 Korea
- Hojung Nam
  - School of Electrical Engineering and Computer Science, Gwangju Institute of Science and Technology (GIST), Gwangju, 61005 Korea

22
Meier K, Arús‐Pous J, Reymond J. A Potent and Selective Janus Kinase Inhibitor with a Chiral 3D‐Shaped Triquinazine Ring System from Chemical Space. Angew Chem Int Ed Engl 2020; 60:2074-2077. [DOI: 10.1002/anie.202012049] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 09/03/2020] [Revised: 09/25/2020] [Indexed: 01/31/2023]
Affiliation(s)
- Kris Meier
  - Department of Chemistry and Biochemistry, University of Bern, Freiestrasse 3, 3012 Bern, Switzerland
- Josep Arús‐Pous
  - Department of Chemistry and Biochemistry, University of Bern, Freiestrasse 3, 3012 Bern, Switzerland
- Jean‐Louis Reymond
  - Department of Chemistry and Biochemistry, University of Bern, Freiestrasse 3, 3012 Bern, Switzerland

23
Poirier M, Pujol-Giménez J, Manatschal C, Bühlmann S, Embaby A, Javor S, Hediger MA, Reymond JL. Pyrazolyl-pyrimidones inhibit the function of human solute carrier protein SLC11A2 (hDMT1) by metal chelation. RSC Med Chem 2020; 11:1023-1031. [PMID: 33479694 PMCID: PMC7649969 DOI: 10.1039/d0md00085j] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 03/11/2020] [Accepted: 05/06/2020] [Indexed: 12/22/2022]
Abstract
Solute carrier proteins (SLCs) control fluxes of ions and molecules across biological membranes and represent an emerging class of drug targets. SLC11A2 (hDMT1) mediates intestinal iron uptake and its inhibition might be used to treat iron overload diseases such as hereditary hemochromatosis. Here we report a micromolar (IC50 = 1.1 μM) pyrazolyl-pyrimidone inhibitor of radiolabeled iron uptake in hDMT1 overexpressing HEK293 cells acting by a non-competitive mechanism, which however does not affect the electrophysiological properties of the transporter. Isothermal titration calorimetry, competition with calcein, induced precipitation of radioactive iron and cross inhibition of the unrelated iron transporter SLC39A8 (hZIP8) indicate that inhibition is mediated by metal chelation. Mapping the chemical space of thousands of pyrazolo-pyrimidones and similar 2,2'-diazabiaryls in ChEMBL suggests that their reported activities might partly reflect metal chelation. Such metal chelating groups are not listed in pan-assay interference compounds (PAINS) but should be checked when addressing SLCs.
Affiliation(s)
- Marion Poirier
  - Department of Chemistry and Biochemistry, University of Bern, Freiestrasse 3, 3012 Bern, Switzerland
- Jonai Pujol-Giménez
  - Institute of Biochemistry and Molecular Medicine, University of Bern, Bühlstrasse 28, 3012 Bern, Switzerland
  - Membrane Transport Discovery Lab, Department of Nephrology and Hypertension, Inselspital, University of Bern Kinderklinik, Freiburgstrasse 15, 3010 Bern, Switzerland
  - Department of Biomedical Research, University of Bern, Murtenstrasse 35, 3008 Bern, Switzerland
- Cristina Manatschal
  - Department of Biochemistry, University of Zürich, Winterthurerstrasse 190, Zürich, Switzerland
- Sven Bühlmann
  - Department of Chemistry and Biochemistry, University of Bern, Freiestrasse 3, 3012 Bern, Switzerland
- Ahmed Embaby
  - Department of Chemistry and Biochemistry, University of Bern, Freiestrasse 3, 3012 Bern, Switzerland
- Sacha Javor
  - Department of Chemistry and Biochemistry, University of Bern, Freiestrasse 3, 3012 Bern, Switzerland
- Matthias A Hediger
  - Institute of Biochemistry and Molecular Medicine, University of Bern, Bühlstrasse 28, 3012 Bern, Switzerland
  - Membrane Transport Discovery Lab, Department of Nephrology and Hypertension, Inselspital, University of Bern Kinderklinik, Freiburgstrasse 15, 3010 Bern, Switzerland
  - Department of Biomedical Research, University of Bern, Murtenstrasse 35, 3008 Bern, Switzerland
- Jean-Louis Reymond
  - Department of Chemistry and Biochemistry, University of Bern, Freiestrasse 3, 3012 Bern, Switzerland

24
Cai C, Wang S, Xu Y, Zhang W, Tang K, Ouyang Q, Lai L, Pei J. Transfer Learning for Drug Discovery. J Med Chem 2020; 63:8683-8694. [PMID: 32672961 DOI: 10.1021/acs.jmedchem.9b02147] [Citation(s) in RCA: 137] [Impact Index Per Article: 34.3] [Indexed: 01/05/2023]
Abstract
The data sets available to train models for in silico drug discovery efforts are often small. Indeed, the sparse availability of labeled data is a major barrier to artificial-intelligence-assisted drug discovery. One solution to this problem is to develop algorithms that can cope with relatively heterogeneous and scarce data. Transfer learning is a type of machine learning that can leverage existing, generalizable knowledge from other related tasks to enable learning of a separate task with a small set of data. Deep transfer learning is the most commonly used type of transfer learning in the field of drug discovery. This Perspective provides an overview of transfer learning and related applications to drug discovery to date. Furthermore, it provides outlooks on the future development of transfer learning for drug discovery.
Affiliation(s)
- Chenjing Cai
  - Center for Quantitative Biology, Academy for Advanced Interdisciplinary Studies, Peking University, Beijing 100871, P. R. China
- Shiwei Wang
  - PTN Graduate Program, Academy for Advanced Interdisciplinary Studies, Peking University, Beijing 100871, P. R. China
- Youjun Xu
  - BNLMS and Peking-Tsinghua Center for Life Sciences at the College of Chemistry and Molecular Engineering, Peking University, Beijing 100871, P. R. China
- Weilin Zhang
  - Beijing Intelligent Pharma Technology Co., Ltd., Beijing 100083, P. R. China
- Ke Tang
  - Department of Computer Science and Engineering, Southern University of Science and Technology, Shenzhen 518055, P. R. China
- Qi Ouyang
  - Center for Quantitative Biology, Academy for Advanced Interdisciplinary Studies, Peking University, Beijing 100871, P. R. China
  - The State Key Laboratory for Artificial Microstructures and Mesoscopic Physics, School of Physics, Peking University, Beijing 100871, P. R. China
- Luhua Lai
  - Center for Quantitative Biology, Academy for Advanced Interdisciplinary Studies, Peking University, Beijing 100871, P. R. China
  - BNLMS and Peking-Tsinghua Center for Life Sciences at the College of Chemistry and Molecular Engineering, Peking University, Beijing 100871, P. R. China
- Jianfeng Pei
  - Center for Quantitative Biology, Academy for Advanced Interdisciplinary Studies, Peking University, Beijing 100871, P. R. China
25
Probst D, Reymond JL. Visualization of very large high-dimensional data sets as minimum spanning trees. J Cheminform 2020; 12:12. [PMID: 33431043 PMCID: PMC7015965 DOI: 10.1186/s13321-020-0416-x] [Citation(s) in RCA: 116] [Impact Index Per Article: 29.0] [Received: 01/09/2020] [Accepted: 02/04/2020] [Indexed: 01/10/2023] Open
Abstract
The chemical sciences are producing an unprecedented amount of large, high-dimensional data sets containing chemical structures and associated properties. However, there are currently no algorithms to visualize such data while preserving both global and local features with a sufficient level of detail to allow for human inspection and interpretation. Here, we propose a solution to this problem with a new data visualization method, TMAP, capable of representing data sets of up to millions of data points and arbitrary high dimensionality as a two-dimensional tree (http://tmap.gdb.tools). Visualizations based on TMAP are better suited than t-SNE or UMAP for the exploration and interpretation of large data sets due to their tree-like nature, increased local and global neighborhood and structure preservation, and the transparency of the methods the algorithm is based on. We apply TMAP to the most used chemistry data sets including databases of molecules such as ChEMBL, FDB17, the Natural Products Atlas, DSSTox, as well as to the MoleculeNet benchmark collection of data sets. We also show its broad applicability with further examples from biology, particle physics, and literature.
Affiliation(s)
- Daniel Probst
  - Department of Chemistry and Biochemistry, University of Bern, Freiestrasse 3, 3012 Bern, Switzerland
- Jean-Louis Reymond
  - Department of Chemistry and Biochemistry, University of Bern, Freiestrasse 3, 3012 Bern, Switzerland
26
Bühlmann S, Reymond JL. ChEMBL-Likeness Score and Database GDBChEMBL. Front Chem 2020; 8:46. [PMID: 32117874 PMCID: PMC7010641 DOI: 10.3389/fchem.2020.00046] [Citation(s) in RCA: 22] [Impact Index Per Article: 5.5] [Received: 10/25/2019] [Accepted: 01/15/2020] [Indexed: 01/02/2023] Open
Abstract
The generated database GDB17 enumerates 166.4 billion molecules up to 17 atoms of C, N, O, S and halogens following simple rules of chemical stability and synthetic feasibility. However, most molecules in GDB17 are too complex to be considered for chemical synthesis. To address this limitation, we report GDBChEMBL as a subset of GDB17 featuring 10 million molecules selected according to a ChEMBL-likeness score (CLscore) calculated from the frequency of occurrence of circular substructures in ChEMBL, followed by uniform sampling across molecular size, stereocenters and heteroatoms. Compared to the previously reported subsets FDB17 and GDBMedChem, selected from GDB17 by fragment-likeness and medicinal chemistry criteria, respectively, our new subset features molecules with higher synthetic accessibility and possibly bioactivity, yet retains a broad and continuous coverage of chemical space typical of the entire GDB17. GDBChEMBL is accessible at http://gdb.unibe.ch for download and for browsing using an interactive chemical space map at http://faerun.gdb.tools.
Affiliation(s)
- Sven Bühlmann
  - Department of Chemistry and Biochemistry, University of Bern, Bern, Switzerland
- Jean-Louis Reymond
  - Department of Chemistry and Biochemistry, University of Bern, Bern, Switzerland
27
Yang J, Wang D, Jia C, Wang M, Hao G, Yang G. Freely Accessible Chemical Database Resources of Compounds for In Silico Drug Discovery. Curr Med Chem 2020; 26:7581-7597. [PMID: 29737247 DOI: 10.2174/0929867325666180508100436] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Received: 11/14/2017] [Revised: 01/26/2018] [Accepted: 04/18/2018] [Indexed: 11/22/2022]
Abstract
BACKGROUND: In silico drug discovery has proved to be a key component of early drug discovery. However, this task is hampered by limitations in the quantity and quality of compound databases for screening. To overcome these obstacles, freely accessible database resources of compounds have bloomed in recent years. Nevertheless, choosing appropriate tools for handling these freely accessible databases is crucial. To the best of our knowledge, this is the first systematic review on this issue.
OBJECTIVE: The advantages and drawbacks of existing chemical databases were analyzed and summarized for six categories of freely accessible chemical databases collected from the literature.
RESULTS: Suggestions are provided on how and under which conditions these databases can reasonably be used. Tools and procedures for building 3D structure chemical libraries are also introduced.
CONCLUSION: In this review, we described the freely accessible chemical database resources for in silico drug discovery. In particular, the chemical information available for building chemical databases is an attractive resource for drug design that can alleviate experimental pressure.
Affiliation(s)
- JingFang Yang
  - Key Laboratory of Pesticide & Chemical Biology, Ministry of Education, College of Chemistry, Central China Normal University, Wuhan 430079, China
- Di Wang
  - Key Laboratory of Pesticide & Chemical Biology, Ministry of Education, College of Chemistry, Central China Normal University, Wuhan 430079, China
- Chenyang Jia
  - Key Laboratory of Pesticide & Chemical Biology, Ministry of Education, College of Chemistry, Central China Normal University, Wuhan 430079, China
- Mengyao Wang
  - Key Laboratory of Pesticide & Chemical Biology, Ministry of Education, College of Chemistry, Central China Normal University, Wuhan 430079, China
- GeFei Hao
  - Key Laboratory of Pesticide & Chemical Biology, Ministry of Education, College of Chemistry, Central China Normal University, Wuhan 430079, China
- GuangFu Yang
  - Key Laboratory of Pesticide & Chemical Biology, Ministry of Education, College of Chemistry, Central China Normal University, Wuhan 430079, China
  - Collaborative Innovation Center of Chemical Science and Engineering, Tianjin 300072, China
28
Horvath D, Marcou G, Varnek A. Generative topographic mapping in drug design. Drug Discovery Today: Technologies 2019; 32-33:99-107. [PMID: 33386101 DOI: 10.1016/j.ddtec.2020.06.003] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.4] [Received: 04/07/2020] [Revised: 06/10/2020] [Accepted: 06/18/2020] [Indexed: 06/12/2023]
Abstract
This is a review article on Generative Topographic Mapping (GTM), a non-linear dimensionality reduction technique producing generative 2D maps of high-dimensional vector spaces, and its specific applications in drug design (chemical space cartography, compound library design and analysis, virtual screening, pharmacological profiling, de novo drug design, conformational space and docking interaction cartography, etc.). Written by chemoinformaticians for potential users among medicinal chemists and biologists, the article purposely avoids all underlying mathematics. First, the GTM concept is intuitively explained, based on the strong analogies with the rather popular Self-Organizing Maps (SOMs), which are well-established library analysis tools. GTM is basically a fuzzy-logics-based generalization of SOMs. In the second part of the review, some published GTM applications in drug design are briefly revisited.
Affiliation(s)
- Dragos Horvath
  - Laboratory of Chemoinformatics, UMR 7140 University of Strasbourg/CNRS, 4 rue Blaise Pascal, 67000 Strasbourg, France
- Gilles Marcou
  - Laboratory of Chemoinformatics, UMR 7140 University of Strasbourg/CNRS, 4 rue Blaise Pascal, 67000 Strasbourg, France
- Alexandre Varnek
  - Laboratory of Chemoinformatics, UMR 7140 University of Strasbourg/CNRS, 4 rue Blaise Pascal, 67000 Strasbourg, France
29
Moharreri E, Pardakhti M, Srivastava R, Suib SL. Energy-Geometry Dependency of Molecular Structures: A Multistep Machine Learning Approach. ACS Combinatorial Science 2019; 21:614-621. [PMID: 31390176 DOI: 10.1021/acscombsci.9b00028] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Indexed: 11/28/2022]
Abstract
There is growing interest in estimating quantum observables while circumventing expensive computational overhead for facile in silico materials screening. Machine learning (ML) methods are implemented to perform such calculations in shorter times. Here, we introduce a multistep method based on machine learning algorithms to estimate total energy on the basis of spatial coordinates and charges for various chemical structures, including organic molecules, inorganic molecules, and ions. This method quickly calculates total energy with 0.76 au in root-mean-square error (RMSE) and 1.5% in mean absolute percent error (MAPE) when tested on a database of optimized and unoptimized structures. Using similar molecular representations, experimental thermochemical properties were estimated, with MAPE as low as 6% and RMSE of 8 cal/mol·K for heat capacity in a 10-fold cross-validation.
Affiliation(s)
- Ehsan Moharreri
  - Institute of Material Science, University of Connecticut, Storrs, Connecticut 06269, United States
- Maryam Pardakhti
  - Department of Chemical and Biomolecular Engineering, University of Connecticut, Storrs, Connecticut 06269, United States
- Ranjan Srivastava
  - Department of Chemical and Biomolecular Engineering, University of Connecticut, Storrs, Connecticut 06269, United States
- Steven L. Suib
  - Institute of Material Science, University of Connecticut, Storrs, Connecticut 06269, United States
  - Department of Chemistry, University of Connecticut, Storrs, Connecticut 06269, United States
30
Yoshimori A, Horita Y, Tanoue T, Bajorath J. Method for Systematic Analogue Search Using the Mega SAR Matrix Database. J Chem Inf Model 2019; 59:3727-3734. [PMID: 31468964 DOI: 10.1021/acs.jcim.9b00557] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.8] [Indexed: 01/14/2023]
Abstract
Analogue searching is a typical requirement in hit expansion, hit-to-lead, and lead optimization projects. A new computational methodology is introduced to search for existing and virtual analogues of active compounds. The approach is based upon the SAR matrix (SARM) data structure that was originally developed for the systematic identification and structural organization of analogue series. The SARM-based analogue search algorithm further extends the capacity of current substructure-based methods by (i) simultaneously considering existing and virtual analogues that populate chemical space around query compounds, (ii) permitting not only R-group replacements but also well-defined chemical modifications in core structures to further expand the analogue space, and (iii) automatically extracting all possible analogues from large pools. In addition, as a basis for analogue searching following the SARM concept, the Mega-SARM database is introduced. Mega-SARM is derived from nearly 3.7 million compounds and contains ∼250 000 matrices with structurally related analogue series and more than 1.5 million virtual candidate compounds.
Affiliation(s)
- Atsushi Yoshimori
  - Institute for Theoretical Medicine, Inc., 26-1 Muraoka-Higashi 2-chome, Fujisawa, Kanagawa 251-0012, Japan
- Yuichi Horita
  - INFOGRAM, Inc., 2-17-19 Yasuda Building No. 5 3F, Hakataekimae, Hakata-ku, Fukuoka City, Fukuoka 812-0011, Japan
- Toru Tanoue
  - INFOGRAM, Inc., 2-17-19 Yasuda Building No. 5 3F, Hakataekimae, Hakata-ku, Fukuoka City, Fukuoka 812-0011, Japan
- Jürgen Bajorath
  - Department of Life Science Informatics, B-IT, LIMES Program Unit Chemical Biology and Medicinal Chemistry, Rheinische Friedrich-Wilhelms-Universität, Endenicher Allee 19c, D-53115 Bonn, Germany
31
Shi Y, von Itzstein M. How Size Matters: Diversity for Fragment Library Design. Molecules 2019; 24:molecules24152838. [PMID: 31387220 PMCID: PMC6696339 DOI: 10.3390/molecules24152838] [Citation(s) in RCA: 16] [Impact Index Per Article: 3.2] [Received: 07/18/2019] [Revised: 08/02/2019] [Accepted: 08/03/2019] [Indexed: 12/11/2022] Open
Abstract
Fragment-based drug discovery (FBDD) has become a major strategy to derive novel lead candidates for various therapeutic targets, as it promises efficient exploration of chemical space by employing fragment-sized (MW < 300) compounds. One of the first challenges in implementing a FBDD approach is the design of a fragment library, and more specifically, the choice of its size and individual members. A diverse set of fragments is required to maximize the chances of discovering novel hit compounds. However, the exact diversity of a certain collection of fragments remains underdefined, which hinders direct comparisons among different selections of fragments. Based on structural fingerprints, we herein introduced quantitative metrics for the structural diversity of fragment libraries. Structures of commercially available fragments were retrieved from the ZINC database, from which libraries with sizes ranging from 100 to 100,000 compounds were selected. The selected libraries were evaluated and compared quantitatively, resulting in interesting size-diversity relationships. Our results demonstrated that while library size does matter for its diversity, there exists an optimal size for structural diversity. It is also suggested that such quantitative measures can guide the design of diverse fragment libraries under different circumstances.
Affiliation(s)
- Yun Shi
  - Institute for Glycomics, Griffith University, Gold Coast Campus, Gold Coast, Queensland 4222, Australia
- Mark von Itzstein
  - Institute for Glycomics, Griffith University, Gold Coast Campus, Gold Coast, Queensland 4222, Australia
32
Yang X, Wang Y, Byrne R, Schneider G, Yang S. Concepts of Artificial Intelligence for Computer-Assisted Drug Discovery. Chem Rev 2019; 119:10520-10594. [PMID: 31294972 DOI: 10.1021/acs.chemrev.8b00728] [Citation(s) in RCA: 346] [Impact Index Per Article: 69.2] [Indexed: 02/05/2023]
Abstract
Artificial intelligence (AI), and in particular deep learning as a subcategory of AI, provides opportunities for the discovery and development of innovative drugs. Various machine learning approaches have recently (re)emerged, some of which may be considered instances of domain-specific AI that have been successfully employed for drug discovery and design. This review provides a comprehensive portrayal of these machine learning techniques and of their applications in medicinal chemistry. After introducing the basic principles of the various machine learning algorithms, alongside some application notes, the current state of the art of AI-assisted pharmaceutical discovery is discussed, including applications in structure- and ligand-based virtual screening, de novo drug design, physicochemical and pharmacokinetic property prediction, drug repurposing, and related aspects. Finally, several challenges and limitations of the current methods are summarized, with a view to potential future directions for AI-assisted drug discovery and design.
Affiliation(s)
- Xin Yang
  - State Key Laboratory of Biotherapy and Cancer Center, West China Hospital, Sichuan University, Chengdu, Sichuan 610041, China
- Yifei Wang
  - State Key Laboratory of Biotherapy and Cancer Center, West China Hospital, Sichuan University, Chengdu, Sichuan 610041, China
- Ryan Byrne
  - ETH Zurich, Department of Chemistry and Applied Biosciences, Vladimir-Prelog-Weg 4, CH-8093 Zurich, Switzerland
- Gisbert Schneider
  - ETH Zurich, Department of Chemistry and Applied Biosciences, Vladimir-Prelog-Weg 4, CH-8093 Zurich, Switzerland
- Shengyong Yang
  - State Key Laboratory of Biotherapy and Cancer Center, West China Hospital, Sichuan University, Chengdu, Sichuan 610041, China
33
Awale M, Sirockin F, Stiefl N, Reymond JL. Medicinal Chemistry Aware Database GDBMedChem. Mol Inform 2019; 38:e1900031. [PMID: 31169974 DOI: 10.1002/minf.201900031] [Citation(s) in RCA: 12] [Impact Index Per Article: 2.4] [Received: 02/26/2019] [Accepted: 05/21/2019] [Indexed: 12/17/2022]
Abstract
The generated database GDB17 enumerates 166.4 billion possible molecules up to 17 atoms of C, N, O, S and halogens following simple chemical stability and synthetic feasibility rules; however, medicinal chemistry criteria are not taken into account. Here we applied rules inspired by medicinal chemistry to exclude problematic functional groups and complex molecules from GDB17, and sampled the resulting subset uniformly across molecular size, stereochemistry and polarity to form GDBMedChem as a compact collection of 10 million small molecules. This collection has reduced complexity and better synthetic accessibility than the entire GDB17 but retains higher sp3-carbon fraction and natural product likeness scores compared to known drugs. GDBMedChem molecules are more diverse and very different from known molecules in terms of substructures and represent an unprecedented source of diversity for drug design. GDBMedChem is available for 3D-visualization, similarity searching and for download at http://gdb.unibe.ch.
Affiliation(s)
- Mahendra Awale
  - Department of Chemistry and Biochemistry, University of Bern, Freiestrasse 3, 3012 Bern, Switzerland
- Finton Sirockin
  - Novartis Institutes for Biomedical Research, Basel, Switzerland
- Nikolaus Stiefl
  - Novartis Institutes for Biomedical Research, Basel, Switzerland
- Jean-Louis Reymond
  - Department of Chemistry and Biochemistry, University of Bern, Freiestrasse 3, 3012 Bern, Switzerland
34
Zhang R, McIntyre PJ, Collins PM, Foley DJ, Arter C, von Delft F, Bayliss R, Warriner S, Nelson A. Construction of a Shape‐Diverse Fragment Set: Design, Synthesis and Screen against Aurora‐A Kinase. Chemistry 2019; 25:6831-6839. [DOI: 10.1002/chem.201900815] [Citation(s) in RCA: 21] [Impact Index Per Article: 4.2] [Received: 02/21/2019] [Revised: 03/28/2019] [Indexed: 01/16/2023]
Affiliation(s)
- Rong Zhang
  - Astbury Centre for Structural Molecular Biology, University of Leeds, Leeds LS2 9JT, UK
  - School of Chemistry, University of Leeds, Leeds LS2 9JT, UK
- Patrick J. McIntyre
  - Department of Molecular and Cell Biology, Henry Wellcome Building, University of Leicester, Leicester LE1 9HN, UK
- Patrick M. Collins
  - Diamond Light Source Ltd., Harwell Science and Innovation Campus, Didcot OX11 0DE, UK
- Daniel J. Foley
  - Astbury Centre for Structural Molecular Biology, University of Leeds, Leeds LS2 9JT, UK
  - School of Chemistry, University of Leeds, Leeds LS2 9JT, UK
- Christopher Arter
  - Astbury Centre for Structural Molecular Biology, University of Leeds, Leeds LS2 9JT, UK
  - School of Chemistry, University of Leeds, Leeds LS2 9JT, UK
- Frank von Delft
  - Diamond Light Source Ltd., Harwell Science and Innovation Campus, Didcot OX11 0DE, UK
  - Structural Genomics Consortium, Nuffield Department of Medicine, University of Oxford, Oxford OX3 7DQ, UK
  - Department of Biochemistry, University of Johannesburg, Aukland Park 2006, South Africa
- Richard Bayliss
  - Astbury Centre for Structural Molecular Biology, University of Leeds, Leeds LS2 9JT, UK
  - School of Molecular and Cellular Biology, University of Leeds, Leeds LS2 9JT, UK
- Stuart Warriner
  - Astbury Centre for Structural Molecular Biology, University of Leeds, Leeds LS2 9JT, UK
  - School of Chemistry, University of Leeds, Leeds LS2 9JT, UK
- Adam Nelson
  - Astbury Centre for Structural Molecular Biology, University of Leeds, Leeds LS2 9JT, UK
  - School of Chemistry, University of Leeds, Leeds LS2 9JT, UK
35
Awale M, Sirockin F, Stiefl N, Reymond JL. Drug Analogs from Fragment-Based Long Short-Term Memory Generative Neural Networks. J Chem Inf Model 2019; 59:1347-1356. [DOI: 10.1021/acs.jcim.8b00902] [Citation(s) in RCA: 29] [Impact Index Per Article: 5.8] [Indexed: 01/10/2023]
Affiliation(s)
- Mahendra Awale
  - Department of Chemistry and Biochemistry, University of Bern, Freiestrasse 3, 3012 Bern, Switzerland
- Finton Sirockin
  - Novartis Institutes for Biomedical Research, CH-4002 Basel, Switzerland
- Nikolaus Stiefl
  - Novartis Institutes for Biomedical Research, CH-4002 Basel, Switzerland
- Jean-Louis Reymond
  - Department of Chemistry and Biochemistry, University of Bern, Freiestrasse 3, 3012 Bern, Switzerland
36
Abstract
Herein we describe a method for the design, purchase, and assembly of a fragment-screening library from a list of commercially available compounds. The computational tools used in assessment of compound properties as well as the workflow for compound selection are provided for reference as implemented in commercially available software that is free and accessible to most academic users. The workflow can be modified as necessary to generate a fit-for-purpose fragment library with the desired compound property profiles. An analytical process for assessing the quality, identity, and suitability of a purchased fragment for inclusion in a screening collection is described. Results from our in-house library are presented as an example of compound progression through this quality control process.
37
Yang JF, Wang F, Jiang W, Zhou GY, Li CZ, Zhu XL, Hao GF, Yang GF. PADFrag: A Database Built for the Exploration of Bioactive Fragment Space for Drug Discovery. J Chem Inf Model 2018; 58:1725-1730. [DOI: 10.1021/acs.jcim.8b00285] [Citation(s) in RCA: 33] [Impact Index Per Article: 5.5] [Indexed: 12/27/2022]
Affiliation(s)
- Jing-Fang Yang
  - Key Laboratory of Pesticide & Chemical Biology, Ministry of Education, College of Chemistry, Central China Normal University, Wuhan 430079, P.R. China
  - International Joint Research Center for Intelligent Biosensor Technology and Health, Central China Normal University, Wuhan 430079, P.R. China
- Fan Wang
  - Key Laboratory of Pesticide & Chemical Biology, Ministry of Education, College of Chemistry, Central China Normal University, Wuhan 430079, P.R. China
  - International Joint Research Center for Intelligent Biosensor Technology and Health, Central China Normal University, Wuhan 430079, P.R. China
- Wen Jiang
  - Key Laboratory of Pesticide & Chemical Biology, Ministry of Education, College of Chemistry, Central China Normal University, Wuhan 430079, P.R. China
  - International Joint Research Center for Intelligent Biosensor Technology and Health, Central China Normal University, Wuhan 430079, P.R. China
- Guang-You Zhou
  - School of Computer Science, Central China Normal University, Wuhan 430079, P.R. China
- Cheng-Zhang Li
  - Key Laboratory of Pesticide & Chemical Biology, Ministry of Education, College of Chemistry, Central China Normal University, Wuhan 430079, P.R. China
  - International Joint Research Center for Intelligent Biosensor Technology and Health, Central China Normal University, Wuhan 430079, P.R. China
- Xiao-Lei Zhu
  - Key Laboratory of Pesticide & Chemical Biology, Ministry of Education, College of Chemistry, Central China Normal University, Wuhan 430079, P.R. China
  - International Joint Research Center for Intelligent Biosensor Technology and Health, Central China Normal University, Wuhan 430079, P.R. China
- Ge-Fei Hao
  - Key Laboratory of Pesticide & Chemical Biology, Ministry of Education, College of Chemistry, Central China Normal University, Wuhan 430079, P.R. China
  - International Joint Research Center for Intelligent Biosensor Technology and Health, Central China Normal University, Wuhan 430079, P.R. China
- Guang-Fu Yang
  - Key Laboratory of Pesticide & Chemical Biology, Ministry of Education, College of Chemistry, Central China Normal University, Wuhan 430079, P.R. China
  - International Joint Research Center for Intelligent Biosensor Technology and Health, Central China Normal University, Wuhan 430079, P.R. China
  - Collaborative Innovation Center of Chemical Science and Engineering, Tianjing 300072, P.R. China
38
Helal CJ, Bartolozzi A, Goble SD, Mani NS, Guzman-Perez A, Ohri AK, Shi ZC, Subramanyam C. Increased building block access through collaboration. Drug Discov Today 2018; 23:1458-1462. [DOI: 10.1016/j.drudis.2018.03.001] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.2] [Received: 12/12/2017] [Revised: 02/15/2018] [Accepted: 03/06/2018] [Indexed: 12/12/2022]
39
Weiss DR, Karpiak J, Huang XP, Sassano MF, Lyu J, Roth BL, Shoichet BK. Selectivity Challenges in Docking Screens for GPCR Targets and Antitargets. J Med Chem 2018; 61:6830-6845. [PMID: 29990431 PMCID: PMC6105036 DOI: 10.1021/acs.jmedchem.8b00718] [Citation(s) in RCA: 25] [Impact Index Per Article: 4.2] [Indexed: 12/14/2022]
Abstract
To investigate large library docking's ability to find molecules with joint activity against on-targets and selectivity versus antitargets, the dopamine D2 and serotonin 5-HT2A receptors were targeted, seeking selectivity against the histamine H1 receptor. In a second campaign, κ-opioid receptor ligands were sought with selectivity versus the μ-opioid receptor. While hit rates ranged from 40% to 63% against the on-targets, they were just as good against the antitargets, even though the molecules were selected for their putative lack of binding to the off-targets. Affinities, too, were often as good or better for the off-targets. Even though it was occasionally possible to find selective molecules, such as a mid-nanomolar D2/5-HT2A ligand with 21-fold selectivity versus the H1 receptor, this was the exception. Whereas false-negatives are tolerable in docking screens against on-targets, they are intolerable against antitargets; addressing this problem may demand new strategies in the field.
Affiliation(s)
- Dahlia R Weiss
  - Department of Pharmaceutical Chemistry, University of California-San Francisco, San Francisco, California 94158-2550, United States
- Joel Karpiak
  - Department of Pharmaceutical Chemistry, University of California-San Francisco, San Francisco, California 94158-2550, United States
- Xi-Ping Huang
  - Department of Pharmacology and National Institute of Mental Health Psychoactive Drug Screening Program, School of Medicine, University of North Carolina, Chapel Hill, North Carolina 27599, United States
- Maria F Sassano
  - Department of Pharmacology and National Institute of Mental Health Psychoactive Drug Screening Program, School of Medicine, University of North Carolina, Chapel Hill, North Carolina 27599, United States
- Jiankun Lyu
  - Department of Pharmaceutical Chemistry, University of California-San Francisco, San Francisco, California 94158-2550, United States
- Bryan L Roth
  - Department of Pharmacology and National Institute of Mental Health Psychoactive Drug Screening Program, School of Medicine, University of North Carolina, Chapel Hill, North Carolina 27599, United States
- Brian K Shoichet
  - Department of Pharmaceutical Chemistry, University of California-San Francisco, San Francisco, California 94158-2550, United States
40
Tang D, Wu Y, Verploegh RJ, Sholl DS. Efficiently Exploring Adsorption Space to Identify Privileged Adsorbents for Chemical Separations of a Diverse Set of Molecules. ChemSusChem 2018; 11:1567-1575. [PMID: 29624911 DOI: 10.1002/cssc.201702289] [Citation(s) in RCA: 17] [Impact Index Per Article: 2.8] [Received: 12/04/2017] [Revised: 03/13/2018] [Accepted: 04/01/2018] [Indexed: 06/08/2023]
Abstract
Although computational models have been used to predict adsorption of molecules in large libraries of porous adsorbents, previous work of this kind has focused on a small number of molecules as potential adsorbates. In this study, molecular simulations were used to consider the adsorption of a diverse range of molecules in a large collection of metal-organic framework (MOF) materials. Specifically, 11 304 isotherms were obtained from molecular simulations of 24 different adsorbates in 471 MOFs. This information provides insight into several interesting questions that could not be addressed with previously available data. Highly computationally efficient methods are introduced that can predict isotherms for a wide range of adsorbing molecules with far less computation than traditional molecular simulations. By characterizing the 276 binary mixtures defined by the molecules considered, "privileged" adsorbents are shown to exist, which are effective for separating many different molecular mixtures. Finally, correlations that were developed previously to predict molecular solubility in polymers are found to be surprisingly effective in predicting the average properties of molecules adsorbing in MOFs.
Affiliation(s)
- Dai Tang: School of Chemical & Biomolecular Engineering, Georgia Institute of Technology, 311 Ferst Drive NW, Atlanta, Georgia, 30332-0100, USA
- Ying Wu: School of Chemical & Biomolecular Engineering, Georgia Institute of Technology, 311 Ferst Drive NW, Atlanta, Georgia, 30332-0100, USA; School of Chemical and Chemical Engineering, South China University of Technology, Guangzhou, China
- Ross J Verploegh: School of Chemical & Biomolecular Engineering, Georgia Institute of Technology, 311 Ferst Drive NW, Atlanta, Georgia, 30332-0100, USA
- David S Sholl: School of Chemical & Biomolecular Engineering, Georgia Institute of Technology, 311 Ferst Drive NW, Atlanta, Georgia, 30332-0100, USA
|
41
|
Ghahremanpour MM, van Maaren PJ, van der Spoel D. The Alexandria library, a quantum-chemical database of molecular properties for force field development. Sci Data 2018; 5:180062. [PMID: 29633987 PMCID: PMC5892371 DOI: 10.1038/sdata.2018.62] [Citation(s) in RCA: 37] [Impact Index Per Article: 6.2] [Received: 10/09/2017] [Accepted: 02/19/2018] [Indexed: 12/03/2022] Open
Abstract
Data quality as well as library size are crucial issues for force field development. In order to predict molecular properties in a large chemical space, the foundation to build force fields on needs to encompass a large variety of chemical compounds. The tabulated molecular physicochemical properties also need to be accurate. Due to the limited transparency in data used for development of existing force fields it is hard to establish data quality and reusability is low. This paper presents the Alexandria library as an open and freely accessible database of optimized molecular geometries, frequencies, electrostatic moments up to the hexadecupole, electrostatic potential, polarizabilities, and thermochemistry, obtained from quantum chemistry calculations for 2704 compounds. Values are tabulated and where available compared to experimental data. This library can assist systematic development and training of empirical force fields for a broad range of molecules.
Affiliation(s)
- Mohammad M. Ghahremanpour: Uppsala Centre for Computational Chemistry, Science for Life Laboratory, Department of Cell and Molecular Biology, Uppsala University, Husargatan 3, Box 596, SE-75124 Uppsala, Sweden
- Paul J. van Maaren: Uppsala Centre for Computational Chemistry, Science for Life Laboratory, Department of Cell and Molecular Biology, Uppsala University, Husargatan 3, Box 596, SE-75124 Uppsala, Sweden
- David van der Spoel: Uppsala Centre for Computational Chemistry, Science for Life Laboratory, Department of Cell and Molecular Biology, Uppsala University, Husargatan 3, Box 596, SE-75124 Uppsala, Sweden
|
42
|
Opassi G, Gesù A, Massarotti A. The hitchhiker’s guide to the chemical-biological galaxy. Drug Discov Today 2018; 23:565-574. [DOI: 10.1016/j.drudis.2018.01.007] [Citation(s) in RCA: 13] [Impact Index Per Article: 2.2] [Received: 08/09/2017] [Revised: 11/25/2017] [Accepted: 01/04/2018] [Indexed: 12/21/2022]
|
43
|
Brown N, Cambruzzi J, Cox PJ, Davies M, Dunbar J, Plumbley D, Sellwood MA, Sim A, Williams-Jones BI, Zwierzyna M, Sheppard DW. Big Data in Drug Discovery. Prog Med Chem 2018; 57:277-356. [PMID: 29680150 DOI: 10.1016/bs.pmch.2017.12.003] [Citation(s) in RCA: 24] [Impact Index Per Article: 4.0] [Indexed: 12/02/2022]
Abstract
Interpretation of Big Data in the drug discovery community should enhance project timelines and reduce clinical attrition through improved early decision making. The issues we encounter start with the sheer volume of data and how we first ingest it before building an infrastructure to house it to make use of the data in an efficient and productive way. There are many problems associated with the data itself including general reproducibility, but often, it is the context surrounding an experiment that is critical to success. Help, in the form of artificial intelligence (AI), is required to understand and translate the context. On the back of natural language processing pipelines, AI is also used to prospectively generate new hypotheses by linking data together. We explain Big Data from the context of biology, chemistry and clinical trials, showcasing some of the impressive public domain sources and initiatives now available for interrogation.
Affiliation(s)
- Aaron Sim: BenevolentAI, London, United Kingdom
- Magdalena Zwierzyna: BenevolentAI, London, United Kingdom; Institute of Cardiovascular Science, University College London, London, United Kingdom
|
44
|
Lin A, Horvath D, Afonina V, Marcou G, Reymond JL, Varnek A. Mapping of the Available Chemical Space versus the Chemical Universe of Lead-Like Compounds. ChemMedChem 2018; 13:540-554. [PMID: 29154440 DOI: 10.1002/cmdc.201700561] [Citation(s) in RCA: 27] [Impact Index Per Article: 4.5] [Received: 09/15/2017] [Revised: 11/07/2017] [Indexed: 12/15/2022]
Abstract
This is, to our knowledge, the most comprehensive analysis to date based on generative topographic mapping (GTM) of fragment-like chemical space (40 million molecules with no more than 17 heavy atoms, both from the theoretically enumerated GDB-17 and real-world PubChem/ChEMBL databases). The challenge was to prove that a robust map of fragment-like chemical space can actually be built, in spite of a limited (≪10^5) maximal number of compounds ("frame set") usable for fitting the GTM manifold. An evolutionary map building strategy has been updated with a "coverage check" step, which discards manifolds failing to accommodate compounds outside the frame set. The evolved map has a good propensity to separate actives from inactives for more than 20 external structure-activity sets. It was proven to properly accommodate the entire collection of 40 million compounds. Next, it served as a library comparison tool to highlight biases of real-world molecules (PubChem and ChEMBL) versus the universe of all possible species represented by FDB-17, a fragment-like subset of GDB-17 containing 10 million molecules. Specific patterns, proper to some libraries and absent from others (diversity holes), were highlighted.
Affiliation(s)
- Arkadii Lin: Laboratory of Chemoinformatics, Faculty of Chemistry, University of Strasbourg, 4 Blaise Pascal str., 67081, Strasbourg, France
- Dragos Horvath: Laboratory of Chemoinformatics, Faculty of Chemistry, University of Strasbourg, 4 Blaise Pascal str., 67081, Strasbourg, France
- Valentina Afonina: Laboratory of Chemoinformatics, Faculty of Chemistry, University of Strasbourg, 4 Blaise Pascal str., 67081, Strasbourg, France; Laboratory of Chemoinformatics and Molecular Modeling, Department of Organic Chemistry, A.M. Butlerov Institute of Chemistry, Kazan Federal University, 18 Kremlyovskaya str., 420008, Kazan, Russia
- Gilles Marcou: Laboratory of Chemoinformatics, Faculty of Chemistry, University of Strasbourg, 4 Blaise Pascal str., 67081, Strasbourg, France
- Jean-Louis Reymond: Department of Chemistry and Biochemistry, University of Berne, 3 Freiestrasse, 3012, Berne, Switzerland
- Alexandre Varnek: Laboratory of Chemoinformatics, Faculty of Chemistry, University of Strasbourg, 4 Blaise Pascal str., 67081, Strasbourg, France
|
45
|
Probst D, Reymond JL. SmilesDrawer: Parsing and Drawing SMILES-Encoded Molecular Structures Using Client-Side JavaScript. J Chem Inf Model 2018; 58:1-7. [PMID: 29257869 DOI: 10.1021/acs.jcim.7b00425] [Citation(s) in RCA: 32] [Impact Index Per Article: 5.3] [Indexed: 01/18/2023]
Abstract
Here we present SmilesDrawer, a dependency-free JavaScript component capable of both parsing and drawing SMILES-encoded molecular structures client-side, developed to be easily integrated into web projects and to display organic molecules in large numbers and fast succession. SmilesDrawer can draw structurally and stereochemically complex structures such as maitotoxin and C60 without using templates, yet has an exceptionally small computational footprint and low memory usage without the requirement for loading images or any other form of client-server communication, making it easy to integrate even in secure (intranet, firewalled) or offline applications. These features allow the rendering of thousands of molecular structure drawings on a single web page within seconds on a wide range of hardware supporting modern browsers. The source code as well as the most recent build of SmilesDrawer is available on GitHub (http://doc.gdb.tools/smilesDrawer/). Both yarn and npm packages are also available.
Affiliation(s)
- Daniel Probst: Department of Chemistry and Biochemistry, National Center for Competence in Research NCCR TransCure, University of Berne, Freiestrasse 3, 3012 Berne, Switzerland
- Jean-Louis Reymond: Department of Chemistry and Biochemistry, National Center for Competence in Research NCCR TransCure, University of Berne, Freiestrasse 3, 3012 Berne, Switzerland
|
46
|
O'Hagan S, Kell DB. Analysing and Navigating Natural Products Space for Generating Small, Diverse, But Representative Chemical Libraries. Biotechnol J 2017; 13. [PMID: 29168302 DOI: 10.1002/biot.201700503] [Citation(s) in RCA: 17] [Impact Index Per Article: 2.4] [Received: 09/26/2017] [Revised: 11/09/2017] [Indexed: 01/01/2023]
Abstract
Armed with the digital availability of two natural products libraries, amounting to some 195 885 molecular entities, we ask the question of how we can best sample from them to maximize their "representativeness" in smaller and more usable libraries of 96, 384, 1152, and 1920 molecules. The term "representativeness" is intended to include diversity, but for numerical reasons (and the likelihood of being able to perform a QSAR) it is necessary to focus on areas of chemical space that are more highly populated. Encoding chemical structures as fingerprints using the RDKit "patterned" algorithm, we first assess the granularity of the natural products space using a simple clustering algorithm, showing that there are major regions of "denseness" but also a great many very sparsely populated areas. We then apply a "hybrid" hierarchical K-means clustering algorithm to the data to produce more statistically robust clusters from which representative and appropriate numbers of samples may be chosen. There is necessarily again a trade-off between cluster size and cluster number, but within these constraints, libraries containing 384 or 1152 molecules can be found that come from clusters that represent some 18 and 30% of the whole chemical space, with cluster sizes of, respectively, 50 and 27 or above, just about sufficient to perform a QSAR. By using the online availability of molecules via the Molport system (www.molport.com), we are also able to construct (and, for the first time, provide the contents of) a small virtual library of available molecules that provided effective coverage of the chemical space described. Consistent with this, the average molecular similarities of the contents of the libraries developed is considerably smaller than is that of the original libraries. The suggested libraries may have use in molecular or phenotypic screening, including for determining possible transporter substrates.
Affiliation(s)
- Steve O'Hagan: School of Chemistry, The University of Manchester, 131 Princess St, Manchester M1 7DN, UK; The Manchester Institute of Biotechnology, The University of Manchester, 131 Princess St, Manchester M1 7DN, UK
- Douglas B Kell: School of Chemistry, The University of Manchester, 131 Princess St, Manchester M1 7DN, UK; The Manchester Institute of Biotechnology, The University of Manchester, 131 Princess St, Manchester M1 7DN, UK; Centre for the Synthetic Biology of Fine and Speciality Chemicals (SYNBIOCHEM), The University of Manchester, 131 Princess St, Manchester M1 7DN, UK
|
47
|
|
48
|
Visini R, Arús-Pous J, Awale M, Reymond JL. Virtual Exploration of the Ring Systems Chemical Universe. J Chem Inf Model 2017; 57:2707-2718. [PMID: 29019686 DOI: 10.1021/acs.jcim.7b00457] [Citation(s) in RCA: 23] [Impact Index Per Article: 3.3] [Indexed: 12/30/2022]
Abstract
Here, we explore the chemical space of all virtually possible organic molecules focusing on ring systems, which represent the cyclic cores of organic molecules obtained by removing all acyclic bonds and converting all remaining atoms to carbon. This approach circumvents the combinatorial explosion encountered when enumerating the molecules themselves. We report the chemical universe database GDB4c containing 916 130 ring systems up to four saturated or aromatic rings and maximum ring size of 14 atoms and GDB4c3D containing the corresponding 6 555 929 stereoisomers. Almost all (98.6%) of these ring systems are unknown and represent chiral 3D-shaped macrocycles containing small rings and quaternary centers reminiscent of polycyclic natural products. We envision that GDB4c can serve to select new ring systems from which to design analogs of such natural products. The database is available for download at www.gdb.unibe.ch together with interactive visualization and search tools as a resource for molecular design.
Affiliation(s)
- Ricardo Visini: Department of Chemistry and Biochemistry, University of Berne, Freiestrasse 3, 3012 Berne, Switzerland
- Josep Arús-Pous: Department of Chemistry and Biochemistry, University of Berne, Freiestrasse 3, 3012 Berne, Switzerland
- Mahendra Awale: Department of Chemistry and Biochemistry, University of Berne, Freiestrasse 3, 3012 Berne, Switzerland
- Jean-Louis Reymond: Department of Chemistry and Biochemistry, University of Berne, Freiestrasse 3, 3012 Berne, Switzerland
|
49
|
|
50
|
Affiliation(s)
- Xavier Barril: Facultat de Farmacia and Institut de Biomedicina (IBUB), Universitat de Barcelona, Barcelona, Spain; Catalan Institution for Research and Advanced Studies (ICREA), Barcelona, Spain
|