1
Chen LY, Li YP. AutoTemplate: enhancing chemical reaction datasets for machine learning applications in organic chemistry. J Cheminform 2024; 16:74. [PMID: 38937840 PMCID: PMC11212196 DOI: 10.1186/s13321-024-00869-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/02/2024] [Accepted: 06/09/2024] [Indexed: 06/29/2024] [Open Access]
Abstract
This paper presents AutoTemplate, an innovative data preprocessing protocol, addressing the crucial need for high-quality chemical reaction datasets in the realm of machine learning applications in organic chemistry. Recent advances in artificial intelligence have expanded the application of machine learning in chemistry, particularly in yield prediction, retrosynthesis, and reaction condition prediction. However, the effectiveness of these models hinges on the integrity of chemical reaction datasets, which are often plagued by inconsistencies like missing reactants, incorrect atom mappings, and outright erroneous reactions. AutoTemplate introduces a two-stage approach to refine these datasets. The first stage involves extracting meaningful reaction transformation rules and formulating generic reaction templates using a simplified SMARTS representation. This simplification broadens the applicability of templates across various chemical reactions. The second stage is template-guided reaction curation, where these templates are systematically applied to validate and correct the reaction data. This process effectively amends missing reactant information, rectifies atom-mapping errors, and eliminates incorrect data entries. A standout feature of AutoTemplate is its capability to concurrently identify and correct false chemical reactions. It operates on the premise that most reactions in datasets are accurate, using these as templates to guide the correction of flawed entries. The protocol demonstrates its efficacy across a range of chemical reactions, significantly enhancing dataset quality. This advancement provides a more robust foundation for developing reliable machine learning models in chemistry, thereby improving the accuracy of forward and retrosynthetic predictions. 
AutoTemplate marks a significant progression in the preprocessing of chemical reaction datasets, bridging a vital gap and facilitating more precise and efficient machine learning applications in organic synthesis. SCIENTIFIC CONTRIBUTION: The proposed automated preprocessing tool for chemical reaction data aims to identify errors within chemical databases. Specifically, if the errors involve atom mapping or the absence of reactant types, corrections can be systematically applied using reaction templates, ultimately elevating the overall quality of the database.
Affiliation(s)
- Lung-Yi Chen
  - Department of Chemical Engineering, National Taiwan University, No. 1, Sec. 4, Roosevelt Road, Taipei, 10617, Taiwan
- Yi-Pei Li
  - Department of Chemical Engineering, National Taiwan University, No. 1, Sec. 4, Roosevelt Road, Taipei, 10617, Taiwan
  - Taiwan International Graduate Program on Sustainable Chemical Science and Technology (TIGP-SCST), No. 128, Sec. 2, Academia Road, Taipei, 11529, Taiwan
2
Horne RI, Andrzejewska EA, Alam P, Brotzakis ZF, Srivastava A, Aubert A, Nowinska M, Gregory RC, Staats R, Possenti A, Chia S, Sormanni P, Ghetti B, Caughey B, Knowles TPJ, Vendruscolo M. Discovery of potent inhibitors of α-synuclein aggregation using structure-based iterative learning. Nat Chem Biol 2024; 20:634-645. [PMID: 38632492 PMCID: PMC11062903 DOI: 10.1038/s41589-024-01580-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/17/2023] [Accepted: 02/12/2024] [Indexed: 04/19/2024]
Abstract
Machine learning methods hold the promise to reduce the costs and the failure rates of conventional drug discovery pipelines. This issue is especially pressing for neurodegenerative diseases, where the development of disease-modifying drugs has been particularly challenging. To address this problem, we describe here a machine learning approach to identify small molecule inhibitors of α-synuclein aggregation, a process implicated in Parkinson's disease and other synucleinopathies. Because the proliferation of α-synuclein aggregates takes place through autocatalytic secondary nucleation, we aim to identify compounds that bind the catalytic sites on the surface of the aggregates. To achieve this goal, we use structure-based machine learning in an iterative manner to first identify and then progressively optimize secondary nucleation inhibitors. Our results demonstrate that this approach leads to the facile identification of compounds two orders of magnitude more potent than previously reported ones.
Affiliation(s)
- Robert I Horne
  - Centre for Misfolding Diseases, Yusuf Hamied Department of Chemistry, University of Cambridge, Cambridge, UK
- Ewa A Andrzejewska
  - Centre for Misfolding Diseases, Yusuf Hamied Department of Chemistry, University of Cambridge, Cambridge, UK
- Parvez Alam
  - Laboratory of Neurological Infections and Immunity, Rocky Mountain Laboratories, National Institute for Allergy and Infectious Diseases, National Institutes of Health, Hamilton, MT, USA
- Z Faidon Brotzakis
  - Centre for Misfolding Diseases, Yusuf Hamied Department of Chemistry, University of Cambridge, Cambridge, UK
- Ankit Srivastava
  - Laboratory of Neurological Infections and Immunity, Rocky Mountain Laboratories, National Institute for Allergy and Infectious Diseases, National Institutes of Health, Hamilton, MT, USA
- Alice Aubert
  - Centre for Misfolding Diseases, Yusuf Hamied Department of Chemistry, University of Cambridge, Cambridge, UK
- Magdalena Nowinska
  - Centre for Misfolding Diseases, Yusuf Hamied Department of Chemistry, University of Cambridge, Cambridge, UK
- Rebecca C Gregory
  - Centre for Misfolding Diseases, Yusuf Hamied Department of Chemistry, University of Cambridge, Cambridge, UK
- Roxine Staats
  - Centre for Misfolding Diseases, Yusuf Hamied Department of Chemistry, University of Cambridge, Cambridge, UK
- Andrea Possenti
  - Centre for Misfolding Diseases, Yusuf Hamied Department of Chemistry, University of Cambridge, Cambridge, UK
- Sean Chia
  - Centre for Misfolding Diseases, Yusuf Hamied Department of Chemistry, University of Cambridge, Cambridge, UK
  - Bioprocessing Technology Institute, Agency for Science, Technology and Research (A*STAR), Singapore, Singapore
- Pietro Sormanni
  - Centre for Misfolding Diseases, Yusuf Hamied Department of Chemistry, University of Cambridge, Cambridge, UK
- Bernardino Ghetti
  - Department of Pathology and Laboratory Medicine, Indiana University School of Medicine, Indianapolis, IN, USA
- Byron Caughey
  - Laboratory of Neurological Infections and Immunity, Rocky Mountain Laboratories, National Institute for Allergy and Infectious Diseases, National Institutes of Health, Hamilton, MT, USA
- Tuomas P J Knowles
  - Centre for Misfolding Diseases, Yusuf Hamied Department of Chemistry, University of Cambridge, Cambridge, UK
- Michele Vendruscolo
  - Centre for Misfolding Diseases, Yusuf Hamied Department of Chemistry, University of Cambridge, Cambridge, UK
3
Vashishat A, Patel P, Das Gupta G, Das Kurmi B. Alternatives of Animal Models for Biomedical Research: a Comprehensive Review of Modern Approaches. Stem Cell Rev Rep 2024; 20:881-899. [PMID: 38429620 DOI: 10.1007/s12015-024-10701-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 02/19/2024] [Indexed: 03/03/2024]
Abstract
Biomedical research has long relied on animal models to unravel the intricacies of human physiology and pathology. However, concerns surrounding ethics, expenses, and inherent species differences have catalyzed the exploration of alternative avenues. Contemporary alternatives to traditional animal models in biomedical research fall into three main categories: in vitro models, in vertebrate models, and in silico models. Artificial intelligence and machine learning approaches have attracted keen interest for use in different areas of biomedical research. The main goal of this review is to serve as a guide for researchers seeking novel avenues for their investigations and to underscore the importance of considering alternative models in the pursuit of scientific knowledge and medical breakthroughs, including showcasing the broad spectrum of modern approaches that are revolutionizing biomedical research and leading the way toward a more ethical, efficient, and innovative future. These models can provide insight into cellular processes, developmental biology, drug interactions, toxicology, and molecular mechanisms.
Affiliation(s)
- Abhinav Vashishat
  - Department of Pharmaceutics, ISF College of Pharmacy, GT Road, Moga, 142001, Punjab, India
- Preeti Patel
  - Department of Pharmaceutical Chemistry, ISF College of Pharmacy, GT Road, Moga, 142001, Punjab, India
- Ghanshyam Das Gupta
  - Department of Pharmaceutics, ISF College of Pharmacy, GT Road, Moga, 142001, Punjab, India
- Balak Das Kurmi
  - Department of Pharmaceutics, ISF College of Pharmacy, GT Road, Moga, 142001, Punjab, India
4
Yi J, Shi S, Fu L, Yang Z, Nie P, Lu A, Wu C, Deng Y, Hsieh C, Zeng X, Hou T, Cao D. OptADMET: a web-based tool for substructure modifications to improve ADMET properties of lead compounds. Nat Protoc 2024; 19:1105-1121. [PMID: 38263521 DOI: 10.1038/s41596-023-00942-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/16/2022] [Accepted: 10/27/2023] [Indexed: 01/25/2024]
Abstract
Lead optimization is a crucial step in the drug discovery process, which aims to design potential drug candidates from biologically active hits. During lead optimization, active hits undergo modifications to improve their absorption, distribution, metabolism, excretion and toxicity (ADMET) profiles. Medicinal chemists face key questions regarding which compound(s) should be synthesized next and how to balance multiple ADMET properties. Reliable transformation rules from multiple experimental analyses are critical to improve this decision-making process. We developed OptADMET ( https://cadd.nscc-tj.cn/deploy/optadmet/ ), an integrated web-based platform that provides chemical transformation rules for 32 ADMET properties and leverages prior experimental data for lead optimization. The multiproperty transformation rule database contains a total of 41,779 validated transformation rules generated from the analysis of 177,191 reliable experimental datasets. Additionally, 146,450 rules were generated by analyzing 239,194 molecular data predictions. OptADMET provides the ADMET profiles of all optimized molecules from the queried molecule and enables the prediction of desirable substructure transformations and subsequent validation of drug candidates. OptADMET is based on matched molecular pairs analysis derived from synthetic chemistry, thus providing improved practicality over other methods. OptADMET is designed for use by both experimental and computational scientists.
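Note: matched molecular pairs analysis, on which the abstract says OptADMET is based, compares compounds that differ by a single substructure change and summarizes how that change shifts a property. The sketch below shows only the aggregation step, under stated assumptions: the fragment labels, the property deltas, and the `min_pairs` cutoff are hypothetical illustration values, and nothing here reflects OptADMET's actual rule database or code.

```python
from collections import defaultdict
from statistics import mean


def summarize_transformations(pairs, min_pairs=2):
    """Aggregate matched-pair property changes into per-transformation statistics.

    `pairs` is an iterable of (from_fragment, to_fragment, delta_property)
    tuples; the tuple layout is an assumption made for this sketch.
    Transformations supported by fewer than `min_pairs` pairs are dropped.
    """
    grouped = defaultdict(list)
    for from_fragment, to_fragment, delta in pairs:
        grouped[(from_fragment, to_fragment)].append(delta)
    return {
        transformation: {"n": len(deltas), "mean_delta": mean(deltas)}
        for transformation, deltas in grouped.items()
        if len(deltas) >= min_pairs
    }


if __name__ == "__main__":
    # Hypothetical matched pairs: (fragment before, fragment after, property change).
    example_pairs = [
        ("H", "F", 0.30),
        ("H", "F", 0.10),
        ("H", "F", 0.20),
        ("Cl", "OCH3", -0.50),  # only one supporting pair, so it is filtered out
    ]
    for rule, stats in summarize_transformations(example_pairs).items():
        print(rule, stats)
```

Requiring a minimum number of supporting pairs is one simple way to keep a rule from resting on a single observation; real tools typically add statistical tests on top of this.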
Affiliation(s)
- Jiacai Yi
  - Xiangya School of Pharmaceutical Sciences, Central South University, Changsha, China
  - School of Computer Science, National University of Defense Technology, Changsha, China
- Shaohua Shi
  - Xiangya School of Pharmaceutical Sciences, Central South University, Changsha, China
  - School of Chinese Medicine, Hong Kong Baptist University, Hong Kong SAR, China
- Li Fu
  - Xiangya School of Pharmaceutical Sciences, Central South University, Changsha, China
  - CarbonSilicon AI Technology Co., Ltd, Hangzhou, China
- Ziyi Yang
  - Xiangya School of Pharmaceutical Sciences, Central South University, Changsha, China
- Pengfei Nie
  - National Supercomputer Center in Tianjin, Tianjin, China
- Aiping Lu
  - School of Chinese Medicine, Hong Kong Baptist University, Hong Kong SAR, China
  - Guangdong-Hong Kong-Macau Joint Lab on Chinese Medicine and Immune Disease Research, Guangzhou, China
- Chengkun Wu
  - School of Computer Science, National University of Defense Technology, Changsha, China
- Yafeng Deng
  - CarbonSilicon AI Technology Co., Ltd, Hangzhou, China
- Changyu Hsieh
  - CarbonSilicon AI Technology Co., Ltd, Hangzhou, China
  - College of Pharmaceutical Sciences, Zhejiang University, Hangzhou, China
- Xiangxiang Zeng
  - Department of Computer Science, Hunan University, Changsha, China
- Tingjun Hou
  - CarbonSilicon AI Technology Co., Ltd, Hangzhou, China
  - College of Pharmaceutical Sciences, Zhejiang University, Hangzhou, China
- Dongsheng Cao
  - Xiangya School of Pharmaceutical Sciences, Central South University, Changsha, China
  - School of Chinese Medicine, Hong Kong Baptist University, Hong Kong SAR, China
5
Chen EP, Dutta S, Ho MH, DeMartino MP. Model-Based Virtual PK/PD Exploration and Machine Learning Approach to Define PK Drivers in Early Drug Discovery. J Med Chem 2024; 67:3727-3740. [PMID: 38375820 DOI: 10.1021/acs.jmedchem.3c02169] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/21/2024]
Abstract
While poor translatability of preclinical efficacy models can be responsible for clinical phase II failures, misdefinition of the optimal PK properties required to achieve therapeutic efficacy can also be a contributing factor. In the present work, the pharmacological dependency of PK end points in driving efficacy is demonstrated for six common pharmacological processes via model-based analysis. The analysis shows that the response is driven by multiple pharmacology-specific PK end points that change with how the response is defined. Moreover, the results demonstrate that the most important chemical structural features influencing response are specific to both target and downstream pharmacology, meaning the design and screening criteria must be defined uniquely for each target and corresponding pharmacology. The model-based virtual exploration of PK/PD relationships presented in this work offers one approach to identify target pharmacology-specific PK drivers and the associated potency-ADME space early in discovery to increase the probability of success and, ultimately, reduce clinical attrition.
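Note: the "PK end points" discussed in this abstract are summary quantities such as peak concentration, total exposure, and time above a threshold concentration. As a minimal sketch, the function below computes these for a one-compartment model after an intravenous bolus dose. The model choice, the function name, and the parameter values are assumptions for illustration; the abstract does not specify the six pharmacological models the authors used.

```python
import math


def iv_bolus_endpoints(dose_mg, volume_l, clearance_l_per_h, threshold_mg_per_l):
    """PK end points for a one-compartment model after an IV bolus dose.

    Concentration follows C(t) = (dose / V) * exp(-k_el * t), with k_el = CL / V.
    """
    if min(dose_mg, volume_l, clearance_l_per_h, threshold_mg_per_l) <= 0:
        raise ValueError("all inputs must be positive")
    k_el = clearance_l_per_h / volume_l   # elimination rate constant (1/h)
    c_max = dose_mg / volume_l            # peak concentration, reached at t = 0
    auc = dose_mg / clearance_l_per_h     # area under the curve from 0 to infinity
    half_life = math.log(2) / k_el
    # Time during which the concentration stays above the threshold.
    if c_max > threshold_mg_per_l:
        time_above = math.log(c_max / threshold_mg_per_l) / k_el
    else:
        time_above = 0.0
    return {
        "c_max_mg_per_l": c_max,
        "auc_mg_h_per_l": auc,
        "half_life_h": half_life,
        "time_above_threshold_h": time_above,
    }


if __name__ == "__main__":
    # Hypothetical compound: 100 mg dose, V = 50 L, CL = 5 L/h, threshold 0.5 mg/L.
    for name, value in iv_bolus_endpoints(100.0, 50.0, 5.0, 0.5).items():
        print(f"{name}: {value:.3f}")
```

Which of these end points best tracks efficacy depends on the pharmacology, which is the point the abstract makes.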
Affiliation(s)
- Emile P Chen
  - Systems Modeling and Translational Biology, Computational Sciences, GSK, Collegeville, Pennsylvania 19426, United States
- Shayoni Dutta
  - Systems Modeling and Translational Biology, Computational Sciences, GSK, Collegeville, Pennsylvania 19426, United States
- Ming-Hsun Ho
  - Molecular Design, Computational Sciences, GSK, Collegeville, Pennsylvania 19426, United States
6
Chen CA, Li CX, Zhang ZH, Xu WX, Liu SL, Ni WC, Wang XQ, Cheng FF, Wang QG. Qinzhizhudan formula dampens inflammation in microglia polarization of vascular dementia rats by blocking MyD88/NF-κB signaling pathway: Through integrating network pharmacology and experimental validation. J Ethnopharmacol 2024; 318:116769. [PMID: 37400007 DOI: 10.1016/j.jep.2023.116769] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/12/2023] [Revised: 05/24/2023] [Accepted: 06/09/2023] [Indexed: 07/05/2023]
Abstract
ETHNOPHARMACOLOGICAL RELEVANCE: Qinzhizhudan Formula (QZZD) is composed of Scutellaria baicalensis Georgi (Huang Qin) extract, Gardenia jasminoides (Zhizi) extract and Suis Fellis Pulvis (Zhudanfen) (ratio of 4:5:6). This formula is optimized from Qingkailing (QKL) injection. QZZD is protective against brain injury. However, the mechanism by which QZZD treats vascular dementia (VD) has not been elucidated. AIM OF THE STUDY: To ascertain QZZD's effect on the treatment of VD and further investigate the molecular mechanisms. MATERIALS AND METHODS: In this study, we screened the possible components and targets of QZZD against VD and microglia polarization using network pharmacology (NP), then established an animal model by bilateral common carotid artery ligation (2VO). Afterward, the Morris water maze was employed to evaluate cognitive ability, and pathological alterations in the CA1 area of the hippocampus were detected using HE and Nissl staining. To confirm the effect of QZZD on VD and its molecular mechanism, the contents of inflammatory factors IL-1β, TNF-α, IL-4, and IL-10 were measured by ELISA, the phenotype polarization of microglia cells was detected by immunofluorescence staining, and the expressions of MyD88, p-IκBα and p-NF-κB p65 in brain tissue were detected by western blot. RESULTS: A total of 112 active compounds and 363 common targets of QZZD, microglia polarization, and VD were identified, according to the NP analysis. 38 hub targets were screened out from the PPI network. GO analysis and KEGG pathway analysis showed that QZZD may regulate microglia polarization through anti-inflammatory mechanisms such as the Toll-like receptor signaling pathway and the NF-κB signaling pathway. Further results showed that QZZD can alleviate the memory impairment induced by 2VO. QZZD profoundly rescued hippocampal neuronal damage and increased the number of neurons. These advantageous outcomes were linked to the control of microglia polarization.
QZZD decreased M1 phenotypic marker expression while increasing M2 phenotypic marker expression. QZZD may control the polarization of M1 microglia by blocking the core part of the Toll-like receptor signaling pathway, namely the MyD88/NF-κB signaling pathway, which reduced the neurotoxic effects of the microglia. CONCLUSION: Here, we explored the anti-VD microglial polarization characteristic of QZZD for the first time and clarified its mechanisms. These findings will provide valuable clues for the discovery of anti-VD agents.
Affiliation(s)
- Cong-Ai Chen
  - Dongzhimen Hospital Beijing University of Chinese Medicine, Beijing, 100700, China; Beijing University of Chinese Medicine, Beijing, 100029, China
- Chang-Xiang Li
  - Beijing University of Chinese Medicine, Beijing, 100029, China
- Ze-Han Zhang
  - Beijing University of Chinese Medicine, Beijing, 100029, China
- Wen-Xiu Xu
  - Beijing University of Chinese Medicine, Beijing, 100029, China
- Shu-Ling Liu
  - Dongzhimen Hospital Beijing University of Chinese Medicine, Beijing, 100700, China; Beijing University of Chinese Medicine, Beijing, 100029, China
- Wen-Chao Ni
  - Beijing University of Chinese Medicine, Beijing, 100029, China
- Xue-Qian Wang
  - Beijing University of Chinese Medicine, Beijing, 100029, China
- Fa-Feng Cheng
  - Beijing University of Chinese Medicine, Beijing, 100029, China
- Qing-Guo Wang
  - Beijing University of Chinese Medicine, Beijing, 100029, China
7
McDonald SM, Augustine EK, Lanners Q, Rudin C, Brinson LC, Becker ML. Applied machine learning as a driver for polymeric biomaterials design. Nat Commun 2023; 14:4838. [PMID: 37563117 PMCID: PMC10415291 DOI: 10.1038/s41467-023-40459-8] [Citation(s) in RCA: 12] [Impact Index Per Article: 6.0] [Received: 02/08/2023] [Accepted: 07/24/2023] [Indexed: 08/12/2023] [Open Access]
Abstract
Polymers are ubiquitous in almost every aspect of modern society and their use in medical products is similarly pervasive. Despite this, the diversity in commercial polymers used in medicine is stunningly low. Considerable time and resources have been expended over the years towards the development of new polymeric biomaterials which address unmet needs left by the current generation of medical-grade polymers. Machine learning (ML) presents an unprecedented opportunity in this field to bypass the need for trial-and-error synthesis, thus reducing the time and resources invested into new discoveries critical for advancing medical treatments. Current efforts pioneering applied ML in polymer design have employed combinatorial and high throughput experimental design to address data availability concerns. However, the lack of available and standardized characterization of parameters relevant to medicine, including degradation time and biocompatibility, represents a nearly insurmountable obstacle to ML-aided design of biomaterials. Herein, we identify a gap at the intersection of applied ML and biomedical polymer design, highlight current works at this junction more broadly and provide an outlook on challenges and future directions.
Affiliation(s)
- Emily K Augustine
  - Thomas Lord Department of Mechanical Engineering and Materials Science, Duke University, Durham, NC, USA
- Quinn Lanners
  - Department of Biostatistics and Bioinformatics, Duke University, Durham, NC, USA
- Cynthia Rudin
  - Department of Biostatistics and Bioinformatics, Duke University, Durham, NC, USA
- L Catherine Brinson
  - Thomas Lord Department of Mechanical Engineering and Materials Science, Duke University, Durham, NC, USA
- Matthew L Becker
  - Department of Chemistry, Duke University, Durham, NC, USA
  - Thomas Lord Department of Mechanical Engineering and Materials Science, Duke University, Durham, NC, USA
8
Yarahmadi B, Hashemianzadeh SM, Milani Hosseini SMR. Machine-learning-based predictions of imprinting quality using ensemble and non-linear regression algorithms. Sci Rep 2023; 13:12111. [PMID: 37495673 PMCID: PMC10372080 DOI: 10.1038/s41598-023-39374-1] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 05/04/2023] [Accepted: 07/25/2023] [Indexed: 07/28/2023] [Open Access]
Abstract
Molecularly imprinted polymers (MIPs) are artificial polymers in which specific sites for a definite purpose are created during synthesis. Owing to characteristics such as stability, ease of synthesis, reproducibility, reusability, high accuracy, and selectivity, these polymers have many applications. However, the variety of functional monomers, templates, solvents, and synthesis conditions such as pH, temperature, stirring rate, and time limits the selectivity of imprinting. Practical optimization of the synthetic conditions has many drawbacks, including chemical compound usage, equipment requirements, and time costs. Using machine learning (ML) to predict the imprinting factor (IF), which indicates the quality of imprinting, is a very interesting idea for overcoming these problems. ML has many advantages, for example a lack of human error, high accuracy, high repeatability, and prediction of a large amount of data in minimal time. In this research, ML was used to predict the IF using non-linear regression algorithms, including classification and regression tree, support vector regression, and k-nearest neighbors, and ensemble algorithms, such as gradient boosting (GB), random forest, and extra trees. The data sets were obtained experimentally in the laboratory, and the inputs included pH, the type of the template, the type of the monomer, the solvent, the distribution coefficient of the MIP (KMIP), and the distribution coefficient of the non-imprinted polymer (KNIP). The mutual information feature selection method was used to select the important features affecting the IF. The results showed that the GB algorithm had the best performance in predicting the IF; with this algorithm, the maximum R² value (R² = 0.871), the minimum mean absolute error (MAE = -0.982), and the minimum mean square error (MSE = -2.303) were obtained.
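Note: the quantities named in this abstract can be made concrete with a short sketch. It assumes the commonly used definition of the imprinting factor, IF = KMIP / KNIP, which the abstract does not state explicitly, and computes R², MAE, and MSE from their textbook formulas. The function names and sample numbers are hypothetical. By these definitions MAE and MSE cannot be negative, so the negative values reported above may reflect a negated-score convention (scikit-learn's `neg_mean_absolute_error` scorer is one example), although the abstract does not say so.

```python
from statistics import mean


def imprinting_factor(k_mip, k_nip):
    """Imprinting factor, assuming the common definition IF = K_MIP / K_NIP."""
    if k_nip <= 0:
        raise ValueError("k_nip must be positive")
    return k_mip / k_nip


def regression_metrics(y_true, y_pred):
    """Return (R2, MAE, MSE) for paired observed and predicted values."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    residuals = [t - p for t, p in zip(y_true, y_pred)]
    mae = mean(abs(r) for r in residuals)      # mean absolute error, always >= 0
    mse = mean(r * r for r in residuals)       # mean squared error, always >= 0
    y_bar = mean(y_true)
    ss_tot = sum((t - y_bar) ** 2 for t in y_true)
    if ss_tot == 0:
        raise ValueError("R2 is undefined when all observed values are equal")
    ss_res = sum(r * r for r in residuals)
    return 1.0 - ss_res / ss_tot, mae, mse


if __name__ == "__main__":
    # Hypothetical distribution coefficients and predictions, for illustration only.
    print("IF:", imprinting_factor(k_mip=12.0, k_nip=4.0))
    print("R2, MAE, MSE:", regression_metrics([2.0, 4.0, 6.0], [2.5, 3.5, 6.0]))
```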
Affiliation(s)
- Bita Yarahmadi
  - Real Samples Analysis Laboratory, Department of Chemistry, Iran University of Science and Technology, Tehran, Iran
- Seyed Majid Hashemianzadeh
  - Molecular Simulation Research Laboratory, Department of Chemistry, Iran University of Science and Technology, Tehran, Iran
9
Singh AP, Chitme H, Sharma RK, Kandpal JB, Behera A, Abdel-Wahab BA, Orabi MA, Khateeb MM, Shafiuddin Habeeb M, Bakir MB. A Comprehensive Review on Pharmacologically Active Phyto-Constituents from Hedychium species. Molecules 2023; 28:3278. [PMID: 37050042 PMCID: PMC10096824 DOI: 10.3390/molecules28073278] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 02/16/2023] [Revised: 03/25/2023] [Accepted: 03/28/2023] [Indexed: 04/14/2023] [Open Access]
Abstract
In this review, we describe and discuss the phytoconstituents present in Hedychium species and emphasize their potential as drug candidates. Although they have been widely validated in in vitro and in vivo models, to date no effort has been made to compile in a single review all the pharmacologically active phytoconstituents from Hedychium species together with their pharmacological and toxicity profiles. In this study, we present a reinvestigation of the chemical constituents present in Hedychium species obtained from the essential oil and solvent extraction of the flowers, leaves and rhizomes under consideration. Key databases such as PubMed, Science Direct, Scopus, and Google Scholar, amongst others, were systematically searched using keywords to retrieve relevant publications on this plant. An exhaustive electronic survey of the related literature on Hedychium species resulted in around 200 articles. Articles published between 1975 and 2021 were included. The studies conducted on crude extracts, solvent fractions or isolated pure compounds from Hedychium species reported a varied range of biological effects, such as anti-inflammatory, analgesic, antidiabetic, potentially anti-asthmatic, and cytotoxic activities, among other related activities of the chemical constituents present in the essential oil and solvent extracts covered in this review. Traditional and herbal medicines around the world that use different parts of Hedychium species were considered for anti-inflammatory, skincare, analgesic, anti-asthmatic, anti-diabetic, and antidotal uses, among others. These uses support the idea that chemical constituents obtained from solvent extraction may also exert the same action individually or in a synergistic manner.
The review concluded that there is scope for computational and biological studies to find possible new targets for strengthening the potency and selectivity of the relevant compounds, and to find a commercial method for extraction of active pharmaceutical ingredients.
Affiliation(s)
- Alok Pratap Singh
  - Faculty of Pharmacy, DIT University, Dehradun 248009, Uttarakhand, India
  - Department of Research and Development, India Glycols Ltd., Pharma City, Selaqui, Dehradun 248009, Uttarakhand, India
- Havagiray Chitme
  - Faculty of Pharmacy, DIT University, Dehradun 248009, Uttarakhand, India
- JB Kandpal
  - Department of Research and Development, India Glycols Ltd., Pharma City, Selaqui, Dehradun 248009, Uttarakhand, India
- Ashok Behera
  - Faculty of Pharmacy, DIT University, Dehradun 248009, Uttarakhand, India
- Basel A. Abdel-Wahab
  - Department of Pharmacology, College of Pharmacy, Najran University, Najran P.O. Box 1988, Saudi Arabia
- Mohammed Abdelmalek Orabi
  - Department of Pharmacognosy, College of Pharmacy, Najran University, Najran P.O. Box 1988, Saudi Arabia
- Masood Medleri Khateeb
  - Department of Pharmacology, College of Pharmacy, Najran University, Najran P.O. Box 1988, Saudi Arabia
- Marwa B. Bakir
  - Department of Pharmacology, College of Medicine, Najran University, Najran P.O. Box 1988, Saudi Arabia
10
Hu P, Zou J, Yu J, Shi S. De novo drug design based on Stack-RNN with multi-objective reward-weighted sum and reinforcement learning. J Mol Model 2023; 29:121. [PMID: 36991180 DOI: 10.1007/s00894-023-05523-6] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 01/03/2023] [Accepted: 03/20/2023] [Indexed: 03/31/2023]
Abstract
CONTEXT: In recent decades, drug development has become extremely important as different new diseases have emerged. However, drug discovery is a long and complex process with a very low success rate, and methods are needed to improve the efficiency of the process and reduce the possibility of failure. Among them, drug design from scratch has become a promising approach. Molecules are generated from scratch, reducing the reliance on trial and error and prefabricated molecular repositories, but the optimization of their molecular properties is still a challenging multi-objective optimization problem. METHODS: In this study, two stack-augmented recurrent neural networks were used to compose a generative model for generating drug-like molecules, and then reinforcement learning was used for optimization to generate molecules with desirable properties, such as binding affinity and the logarithm of the partition coefficient between octanol and water. In addition, a memory storage network was added to increase the internal diversity of the generated molecules. For multi-objective optimization, we proposed a new approach which utilized the magnitude of different attribute reward values to assign different weights to molecular optimization. The proposed model not only solves the problem that the properties of the generated molecules are extremely biased towards a certain attribute due to possible conflicts between the attributes, but also improves various properties of the generated molecules compared with the traditional weighted sum and alternating weighted sum: the molecular validity reaches 97.3%, the internal diversity is 0.8613, and the proportion of desirable molecules increases from 55.9% to 92%.
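Note: the abstract says that weights are assigned according to the magnitude of each attribute's reward, but it does not give the rule. The sketch below is one hypothetical way to do this, not the authors' formula: each reward is assumed to be normalized to [0, 1], and an objective that is currently lagging (low reward) receives a larger weight, so that no single attribute dominates the combined reward.

```python
def magnitude_based_weights(rewards, eps=1e-8):
    """Weights proportional to each objective's shortfall from the maximum reward.

    Hypothetical rule for illustration; rewards are assumed to lie in [0, 1].
    """
    if not rewards or any(r < 0.0 or r > 1.0 for r in rewards):
        raise ValueError("rewards must be a non-empty sequence of values in [0, 1]")
    # eps keeps the weights defined when every reward is already 1.0.
    shortfalls = [1.0 - r + eps for r in rewards]
    total = sum(shortfalls)
    return [s / total for s in shortfalls]


def combined_reward(rewards, weights):
    """Weighted sum of per-objective rewards."""
    return sum(w * r for w, r in zip(weights, rewards))


if __name__ == "__main__":
    # Hypothetical rewards for two objectives, e.g. binding affinity and logP.
    rewards = [0.9, 0.3]
    weights = magnitude_based_weights(rewards)
    print("weights:", weights)
    print("combined reward:", combined_reward(rewards, weights))
    print("fixed equal-weight sum:", combined_reward(rewards, [0.5, 0.5]))
```

With a fixed weighted sum, a generator can score well by maximizing one easy objective; recomputing the weights from the current rewards shifts emphasis toward whichever objective is behind.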
Affiliation(s)
- Pengwei Hu
  - Department of Mathematics, School of Mathematics and Computer Sciences, Nanchang University, Nanchang, 330031, China
  - Institute of Mathematics and Interdisciplinary Sciences, Nanchang University, Nanchang, 330031, China
- Jinping Zou
  - Department of Mathematics, School of Mathematics and Computer Sciences, Nanchang University, Nanchang, 330031, China
  - Institute of Mathematics and Interdisciplinary Sciences, Nanchang University, Nanchang, 330031, China
- Jialin Yu
  - Department of Mathematics, School of Mathematics and Computer Sciences, Nanchang University, Nanchang, 330031, China
  - Institute of Mathematics and Interdisciplinary Sciences, Nanchang University, Nanchang, 330031, China
- Shaoping Shi
  - Department of Mathematics, School of Mathematics and Computer Sciences, Nanchang University, Nanchang, 330031, China
  - Institute of Mathematics and Interdisciplinary Sciences, Nanchang University, Nanchang, 330031, China
11
Li X, Tang L, Li Z, Qiu D, Yang Z, Li B. Prediction of ADMET Properties of Anti-Breast Cancer Compounds Using Three Machine Learning Algorithms. Molecules 2023; 28:2326. [PMID: 36903569 PMCID: PMC10005249 DOI: 10.3390/molecules28052326] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 01/04/2023] [Revised: 01/18/2023] [Accepted: 01/22/2023] [Indexed: 03/06/2023] [Open Access]
Abstract
In recent years, machine learning methods have been applied successfully in many fields. In this paper, three machine learning algorithms, namely partial least squares-discriminant analysis (PLS-DA), adaptive boosting (AdaBoost), and light gradient boosting machine (LGBM), were applied to establish models for predicting the Absorption, Distribution, Metabolism, Excretion, and Toxicity (ADMET for short) properties, namely Caco-2, CYP3A4, hERG, HOB, and MN, of anti-breast cancer compounds. To the best of our knowledge, this is the first time the LGBM algorithm has been applied to classify the ADMET properties of anti-breast cancer compounds. We evaluated the established models on the prediction set using accuracy, precision, recall, and F1-score. Among the models established using the three algorithms, LGBM yielded the most satisfactory results (accuracy > 0.87, precision > 0.72, recall > 0.73, and F1-score > 0.73). From these results, it can be inferred that LGBM can establish reliable models to predict molecular ADMET properties and provide a useful tool for virtual screening and drug design researchers.
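The four evaluation metrics named in the abstract can be computed directly from confusion-matrix counts. The following is a minimal sketch for a binary endpoint, using the standard definitions; the function name `binary_metrics` and the example labels are hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass


@dataclass
class Metrics:
    accuracy: float
    precision: float
    recall: float
    f1: float


def binary_metrics(y_true, y_pred):
    """Accuracy, precision, recall, and F1 for labels in {0, 1} (1 = positive)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)

    total = tp + tn + fp + fn
    accuracy = (tp + tn) / total if total else 0.0
    # Guard against division by zero when a class is never predicted or present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return Metrics(accuracy, precision, recall, f1)


# Hypothetical labels, for illustration only.
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 1, 0, 0, 0, 1, 1, 0]
result = binary_metrics(y_true, y_pred)
```

Reporting precision and recall alongside accuracy matters when the classes are imbalanced, since accuracy alone can look high for a model that rarely predicts the minority class.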
|
12
|
Eswar K, Mukherjee S, Ganesan P, Kumar Rengan A. Immunomodulatory Natural Polysaccharides: An Overview of the Mechanisms Involved. Eur Polym J 2023. [DOI: 10.1016/j.eurpolymj.2023.111935] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 02/22/2023]
|
13
|
Roy D, Patel C. Revisiting the Use of Quantum Chemical Calculations in LogP octanol-water Prediction. Molecules 2023; 28:801. [PMID: 36677858 PMCID: PMC9866719 DOI: 10.3390/molecules28020801] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/14/2022] [Revised: 01/06/2023] [Accepted: 01/10/2023] [Indexed: 01/15/2023] Open
Abstract
The partition coefficient of drugs and drug-like molecules between an aqueous and an organic phase is an important property for developing new therapeutics. Computational methods are used extensively to predict the partition coefficients of molecules. Quantum chemical calculations are used to develop structure-activity relationship models for such predictions, based either on molecular fragment methods or on direct calculation of the solvation free energy in a solvent continuum. The applicability, merits, and shortcomings of these developments are revisited here.
Affiliation(s)
- Dipankar Roy
- Department of Biological Sciences, University of Alberta, Edmonton, AB T6G 2E9, Canada
| | - Chandan Patel
- Department of Applied Sciences, COEP Technological University, Wellesely Road, Shivajinagar, Pune 411005, Maharashtra, India
| |
|
14
|
Weiss AM, Hossainy S, Rowan SJ, Hubbell JA, Esser-Kahn AP. Immunostimulatory Polymers as Adjuvants, Immunotherapies, and Delivery Systems. Macromolecules 2022; 55:6913-6937. [PMID: 36034324 PMCID: PMC9404695 DOI: 10.1021/acs.macromol.2c00854] [Citation(s) in RCA: 25] [Impact Index Per Article: 8.3] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/25/2022] [Revised: 07/16/2022] [Indexed: 12/14/2022]
Abstract
Activating innate immunity in a controlled manner is necessary for the development of next-generation therapeutics. Adjuvants, or molecules that modulate the immune response, are critical components of vaccines and immunotherapies. While small molecules and biologics dominate the adjuvant market, emerging evidence supports the use of immunostimulatory polymers in therapeutics. Such polymers can stabilize and deliver cargo while stimulating the immune system by functioning as pattern recognition receptor (PRR) agonists. At the same time, in designing polymers that engage the immune system, it is important to consider any unintended initiation of an immune response that results in adverse immune-related events. Here, we highlight biologically derived and synthetic polymer scaffolds, as well as polymer–adjuvant systems and stimuli-responsive polymers loaded with adjuvants, that can invoke an immune response. We present synthetic considerations for the design of such immunostimulatory polymers, outline methods to target their delivery, and discuss their application in therapeutics. Finally, we conclude with our opinions on the design of next-generation immunostimulatory polymers, new applications of immunostimulatory polymers, and the development of improved preclinical immunocompatibility tests for new polymers.
Affiliation(s)
- Adam M. Weiss
- Pritzker School of Molecular Engineering, University of Chicago 5640 S. Ellis Ave., Chicago, Illinois 60637, United States
- Department of Chemistry, University of Chicago 5735 S Ellis Ave., Chicago, Illinois 60637, United States
| | - Samir Hossainy
- Pritzker School of Molecular Engineering, University of Chicago 5640 S. Ellis Ave., Chicago, Illinois 60637, United States
| | - Stuart J. Rowan
- Pritzker School of Molecular Engineering, University of Chicago 5640 S. Ellis Ave., Chicago, Illinois 60637, United States
- Department of Chemistry, University of Chicago 5735 S Ellis Ave., Chicago, Illinois 60637, United States
| | - Jeffrey A. Hubbell
- Pritzker School of Molecular Engineering, University of Chicago 5640 S. Ellis Ave., Chicago, Illinois 60637, United States
| | - Aaron P. Esser-Kahn
- Pritzker School of Molecular Engineering, University of Chicago 5640 S. Ellis Ave., Chicago, Illinois 60637, United States
| |
|
15
|
Gorgulla C, Jayaraj A, Fackeldey K, Arthanari H. Emerging frontiers in virtual drug discovery: From quantum mechanical methods to deep learning approaches. Curr Opin Chem Biol 2022; 69:102156. [PMID: 35576813 PMCID: PMC9990419 DOI: 10.1016/j.cbpa.2022.102156] [Citation(s) in RCA: 11] [Impact Index Per Article: 3.7] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/05/2021] [Revised: 03/16/2022] [Accepted: 04/07/2022] [Indexed: 11/19/2022]
Abstract
Virtual screening-based approaches to discover initial hit and lead compounds have the potential to reduce both the cost and time of early drug discovery stages, as well as to find inhibitors for even challenging target sites such as protein-protein interfaces. Here in this review, we provide an overview of the progress that has been made in virtual screening methodology and technology on multiple fronts in recent years. The advent of ultra-large virtual screens, in which hundreds of millions to billions of compounds are screened, has proven to be a powerful approach to discover highly potent hit compounds. However, these developments are just the tip of the iceberg, with new technologies and methods emerging to propel the field forward. Examples include novel machine-learning approaches, which can reduce the computational costs of virtual screening dramatically, while progress in quantum-mechanical approaches can increase the accuracy of predictions of various small molecule properties.
Affiliation(s)
- Christoph Gorgulla
- Department of Biological Chemistry and Molecular Pharmacology, Blavatnik Institute, Harvard Medical School (HMS), Boston, MA, USA; Department of Physics, Faculty of Arts and Sciences, Harvard University, Cambridge, MA, USA; Department of Cancer Biology, Dana-Farber Cancer Institute (DFCI), Boston, MA, USA
| | | | - Konstantin Fackeldey
- Institute of Mathematics, Technical University Berlin, Berlin, Germany; Zuse Institute Berlin, Berlin, Germany
| | - Haribabu Arthanari
- Department of Biological Chemistry and Molecular Pharmacology, Blavatnik Institute, Harvard Medical School (HMS), Boston, MA, USA; Department of Cancer Biology, Dana-Farber Cancer Institute (DFCI), Boston, MA, USA.
| |
|
16
|
Woodward DJ, Bradley AR, van Hoorn WP. Coverage Score: A Model Agnostic Method to Efficiently Explore Chemical Space. J Chem Inf Model 2022; 62:4391-4402. [PMID: 35867814 DOI: 10.1021/acs.jcim.2c00258] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/28/2022]
Abstract
Selecting the most appropriate compounds to synthesize and test is a vital aspect of drug discovery. Methods like clustering and diversity present weaknesses in selecting the optimal sets for information gain. Active learning techniques often rely on an initial model and computationally expensive semi-supervised batch selection. Herein, we describe a new subset-based selection method, Coverage Score, that combines Bayesian statistics and information entropy to balance representation and diversity to select a maximally informative subset. Coverage Score can be influenced by prior selections and desirable properties. In this paper, subsets selected through Coverage Score are compared against subsets selected through model-independent and model-dependent techniques for several datasets. In drug-like chemical space, Coverage Score consistently selects subsets that lead to more accurate predictions compared to other selection methods. Subsets selected through Coverage Score produced Random Forest models that have a root-mean-square-error up to 12.8% lower than subsets selected at random and can retain up to 99% of the structural dissimilarity of a diversity selection.
Affiliation(s)
- Daniel J Woodward
- Exscientia plc, The Schrödinger Building, Oxford Science Park, Oxford OX4 4GE, U.K
| | - Anthony R Bradley
- Exscientia plc, The Schrödinger Building, Oxford Science Park, Oxford OX4 4GE, U.K
| | - Willem P van Hoorn
- Exscientia plc, The Schrödinger Building, Oxford Science Park, Oxford OX4 4GE, U.K
| |
|
17
|
Ignacz G, Szekely G. Deep learning meets quantitative structure–activity relationship (QSAR) for leveraging structure-based prediction of solute rejection in organic solvent nanofiltration. J Memb Sci 2022. [DOI: 10.1016/j.memsci.2022.120268] [Citation(s) in RCA: 11] [Impact Index Per Article: 3.7] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/26/2022]
|
18
|
Beshore DC, Haidle AM, Arasappan A, Lim YH, Raheem I, Roecker AJ, Shockley SE, Simov V. Building a Culture of Medicinal Chemistry Knowledge Sharing. J Med Chem 2022; 65:3776-3785. [PMID: 35192762 DOI: 10.1021/acs.jmedchem.1c02144] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/17/2022]
Abstract
Increasing the efficiency of the drug discovery process is a challenge faced by drug hunters everywhere. One strategy medicinal chemists employ to meet this challenge is learning from knowledge sources within and beyond their organization. In this Perspective, we discuss the evolution of mechanisms for medicinal chemistry knowledge capture and sharing at Merck & Co. over the past 15 years. We describe our approach to knowledge management and report on the multiple enduring and complementary teams and initiatives we have created to capture and share knowledge within a geographically diverse medicinal chemistry community. In addition, this Perspective will share the benefits we have observed and also reflect on what has allowed our efforts to be both successful and sustainable.
Affiliation(s)
- Douglas C Beshore
- Department of Discovery Chemistry, Merck & Co., Inc., 2000 Galloping Hill Road, Kenilworth, New Jersey 07033, United States
| | - Andrew M Haidle
- Department of Discovery Chemistry, Merck & Co., Inc., 2000 Galloping Hill Road, Kenilworth, New Jersey 07033, United States
| | - Ashok Arasappan
- Department of Discovery Chemistry, Merck & Co., Inc., 2000 Galloping Hill Road, Kenilworth, New Jersey 07033, United States
| | - Yeon-Hee Lim
- Department of Discovery Chemistry, Merck & Co., Inc., 2000 Galloping Hill Road, Kenilworth, New Jersey 07033, United States
| | - Izzat Raheem
- Department of Discovery Chemistry, Merck & Co., Inc., 2000 Galloping Hill Road, Kenilworth, New Jersey 07033, United States
| | - Anthony J Roecker
- Department of Discovery Chemistry, Merck & Co., Inc., 2000 Galloping Hill Road, Kenilworth, New Jersey 07033, United States
| | - Samantha E Shockley
- Department of Discovery Chemistry, Merck & Co., Inc., 2000 Galloping Hill Road, Kenilworth, New Jersey 07033, United States
| | - Vladimir Simov
- Department of Discovery Chemistry, Merck & Co., Inc., 2000 Galloping Hill Road, Kenilworth, New Jersey 07033, United States
| |
|
19
|
Yang ZY, Fu L, Lu AP, Liu S, Hou TJ, Cao DS. Semi-automated workflow for molecular pair analysis and QSAR-assisted transformation space expansion. J Cheminform 2021; 13:86. [PMID: 34774096 PMCID: PMC8590336 DOI: 10.1186/s13321-021-00564-6] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/08/2021] [Accepted: 10/30/2021] [Indexed: 12/01/2022] Open
Abstract
In the process of drug discovery, the optimization of lead compounds has always been a challenge faced by pharmaceutical chemists. Matched molecular pair analysis (MMPA), a promising tool to efficiently extract and summarize the relationship between structural transformation and property change, is suitable for local structural optimization tasks. In particular, the integration of MMPA with QSAR modeling can further strengthen the utility of MMPA in guiding molecular optimization. In this study, a new semi-automated procedure based on KNIME was developed to support MMPA on both large- and small-scale datasets, including molecular preparation, QSAR model construction, applicability domain evaluation, and MMP calculation and application. Two examples covering regression and classification tasks are provided to illustrate the importance of MMPA and to show the reliability and utility of this MMPA-by-QSAR pipeline.
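The core idea of MMPA, relating a structural transformation to a property change, can be illustrated with a small aggregation step. The sketch below assumes that matched pairs have already been identified and are given as tuples of a transformation label and the property values of the two molecules; it is not the KNIME workflow described in the abstract, and the function name, the `min_pairs` threshold, and the example values are hypothetical.

```python
from collections import defaultdict
from statistics import mean


def summarize_transformations(pairs, min_pairs=2):
    """Mean property change per transformation.

    `pairs` is an iterable of (transformation, property_before, property_after).
    Transformations supported by fewer than `min_pairs` pairs are dropped,
    because a single pair gives no indication of a consistent trend.
    Returns {transformation: (mean_delta, number_of_pairs)}.
    """
    deltas = defaultdict(list)
    for transformation, prop_before, prop_after in pairs:
        deltas[transformation].append(prop_after - prop_before)
    return {
        t: (mean(d), len(d))
        for t, d in deltas.items()
        if len(d) >= min_pairs
    }


# Hypothetical matched pairs with made-up property values, for illustration only.
example_pairs = [
    ("H>>F", 1.0, 1.5),
    ("H>>F", 2.0, 2.3),
    ("H>>OH", 1.0, 0.2),
]
summary = summarize_transformations(example_pairs)
```

In a QSAR-assisted setting, the property values for unmeasured molecules could come from a model prediction rather than an experiment, which is one way the transformation space can be expanded.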
Affiliation(s)
- Zi-Yi Yang
- Xiangya School of Pharmaceutical Sciences, Central South University, Changsha, 410013, Hunan, People's Republic of China; Hunan Key Laboratory of Diagnostic and Therapeutic Drug Research for Chronic Diseases, Changsha, 410013, Hunan, China
| | - Li Fu
- Xiangya School of Pharmaceutical Sciences, Central South University, Changsha, 410013, Hunan, People's Republic of China; Hunan Key Laboratory of Diagnostic and Therapeutic Drug Research for Chronic Diseases, Changsha, 410013, Hunan, China
| | - Ai-Ping Lu
- Institute for Advancing Translational Medicine in Bone & Joint Diseases, School of Chinese Medicine, Hong Kong Baptist University, Hong Kong, 999077, SAR, People's Republic of China
| | - Shao Liu
- Department of Pharmacy, Xiangya Hospital, Central South University, Changsha, 410008, Hunan, People's Republic of China
| | - Ting-Jun Hou
- Hangzhou Institute of Innovative Medicine, College of Pharmaceutical Sciences, Zhejiang University, Hangzhou, 310058, Zhejiang, People's Republic of China.
| | - Dong-Sheng Cao
- Xiangya School of Pharmaceutical Sciences, Central South University, Changsha, 410013, Hunan, People's Republic of China; Hunan Key Laboratory of Diagnostic and Therapeutic Drug Research for Chronic Diseases, Changsha, 410013, Hunan, China; Institute for Advancing Translational Medicine in Bone & Joint Diseases, School of Chinese Medicine, Hong Kong Baptist University, Hong Kong, 999077, SAR, People's Republic of China.
| |
|
20
|
Williams W, Zeng L, Gensch T, Sigman MS, Doyle AG, Anslyn EV. The Evolution of Data-Driven Modeling in Organic Chemistry. ACS CENTRAL SCIENCE 2021; 7:1622-1637. [PMID: 34729406 PMCID: PMC8554870 DOI: 10.1021/acscentsci.1c00535] [Citation(s) in RCA: 53] [Impact Index Per Article: 13.3] [Reference Citation Analysis] [Abstract] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Received: 05/03/2021] [Indexed: 05/14/2023]
Abstract
Organic chemistry is replete with complex relationships: for example, how a reactant's structure relates to the resulting product formed; how reaction conditions relate to yield; how a catalyst's structure relates to enantioselectivity. Questions like these are at the foundation of understanding reactivity and developing novel and improved reactions. An approach to probing these questions that is both longstanding and contemporary is data-driven modeling. Here, we provide a synopsis of the history of data-driven modeling in organic chemistry and the terms used to describe these endeavors. We include a timeline of the steps that led to its current state. The case studies included highlight how, as a community, we have advanced physical organic chemistry tools with the aid of computers and data to augment the intuition of expert chemists and to facilitate the prediction of structure-activity and structure-property relationships.
Affiliation(s)
- Wendy L. Williams
- Department of Chemistry and Biochemistry, University of California, Los Angeles, California 90095, United States
- Department of Chemistry, Princeton University, Princeton, New Jersey 08544, United States
| | - Lingyu Zeng
- Department of Chemistry, The University of Texas at Austin, Austin, Texas 78712, United States
| | - Tobias Gensch
- Department of Chemistry, TU Berlin, Straße des 17. Juni 135, Sekr. C2, 10623 Berlin, Germany
| | - Matthew S. Sigman
- Department of Chemistry, University of Utah, Salt Lake City, Utah 84112, United States
| | - Abigail G. Doyle
- Department of Chemistry and Biochemistry, University of California, Los Angeles, California 90095, United States
- Department of Chemistry, Princeton University, Princeton, New Jersey 08544, United States
| | - Eric V. Anslyn
- Department of Chemistry, The University of Texas at Austin, Austin, Texas 78712, United States
| |
|
21
|
da Silva TH, Hachigian TZ, Lee J, King MD. Using computers to ESKAPE the antibiotic resistance crisis. Drug Discov Today 2021; 27:456-470. [PMID: 34688913 DOI: 10.1016/j.drudis.2021.10.005] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/03/2021] [Revised: 08/01/2021] [Accepted: 10/15/2021] [Indexed: 12/16/2022]
Abstract
Since the discovery of penicillin, the development and use of antibiotics have promoted safe and effective control of bacterial infections. However, the number of antibiotic-resistance cases has been ever increasing over time. Thus, the drug discovery process demands fast, efficient and cost-effective alternative approaches for developing lead candidates with outstanding performance. Computational approaches are appealing techniques to develop lead candidates in an in silico fashion. In this review, we provide an overview of the implementation of current in silico state-of-the-art techniques, including machine learning (ML) and deep learning (DL), in drug discovery. We also discuss the development of quantum computing and its potential benefits for antibiotics research and current bottlenecks that limit computational drug discovery advancement.
Affiliation(s)
- Thiago H da Silva
- Micron School of Materials Science and Engineering, Boise State University, Boise, ID 83725, USA
| | - Timothy Z Hachigian
- Micron School of Materials Science and Engineering, Boise State University, Boise, ID 83725, USA
| | - Jeunghoon Lee
- Micron School of Materials Science and Engineering, Boise State University, Boise, ID 83725, USA; Department of Chemistry and Biochemistry, Boise State University, Boise, ID 83725, USA
| | - Matthew D King
- Micron School of Materials Science and Engineering, Boise State University, Boise, ID 83725, USA; Department of Chemistry and Biochemistry, Boise State University, Boise, ID 83725, USA.
| |
|
22
|
Duan Q, Lee J, Chen J, Feng Y, Luo R, Wang C, Bi S, Liu F, Wang W, Huang Y, Xu Z. Image learning to accurately identify complex mixture components. Analyst 2021; 146:5942-5950. [PMID: 34570841 DOI: 10.1039/d1an01288f] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/27/2022]
Abstract
The study of complex mixtures is very important for exploring the evolution of natural phenomena, but the complexity of the mixtures greatly increases the difficulty of material information extraction. Image perception-based machine-learning techniques have the ability to cope with this problem in a data-driven way. Herein, we report a 2D-spectral imaging method to collect matter information from mixture components, and the obtained feature images can be easily provided to deep convolutional neural networks (CNNs) for establishing a spectral network. The results demonstrated that a single CNN trained end-to-end from the proposed images can directly accomplish synchronous measurement of multi-component samples using only raw pixels as inputs. Our strategy has some innate advantages, such as fast data acquisition, low cost, and simple chemical treatment, suggesting that it can be extensively applied in many fields, including environmental science, biology, medicine, and chemistry.
Affiliation(s)
- Qiannan Duan
- Department of Environmental Science, Shaanxi Normal University, Xi'an 710062, China; State Key Laboratory of Pollution Control and Resource Reuse, Jiangsu Key Laboratory of Vehicle Emissions Control, School of the Environment, Nanjing University, Nanjing 210023, China; Shaanxi Key Laboratory of Earth Surface System and Environmental Carrying Capacity, College of Urban and Environmental Sciences, Northwest University, Xi'an 710127, China
| | - Jianchao Lee
- Department of Environmental Science, Shaanxi Normal University, Xi'an 710062, China.
| | - Jiayuan Chen
- Department of Environmental Science, Shaanxi Normal University, Xi'an 710062, China.
| | - Yunjin Feng
- Department of Environmental Science, Shaanxi Normal University, Xi'an 710062, China.
| | - Run Luo
- Department of Environmental Science, Shaanxi Normal University, Xi'an 710062, China.
| | - Can Wang
- Big Data and Urban Spatial Analytics Laboratory, College of Architecture and Urban Planning, Tongji University, Shanghai 200092, China
| | - Sifan Bi
- Department of Environmental Science, Shaanxi Normal University, Xi'an 710062, China.
| | - Fenli Liu
- Department of Environmental Science, Shaanxi Normal University, Xi'an 710062, China.
| | - Wenjing Wang
- Department of Environmental Science, Shaanxi Normal University, Xi'an 710062, China.
| | - Yicai Huang
- Department of Environmental Science, Shaanxi Normal University, Xi'an 710062, China.
| | - Zhaoyi Xu
- State Key Laboratory of Pollution Control and Resource Reuse, Jiangsu Key Laboratory of Vehicle Emissions Control, School of the Environment, Nanjing University, Nanjing 210023, China
| |
|
23
|
Abstract
Given the importance of catalysts in the chemical industry, they have been extensively investigated by experimental and numerical methods. With the development of computational algorithms and computer hardware, large-scale simulations have enabled influential studies with more atomic details reflecting microscopic mechanisms. This review provides a comprehensive summary of recent developments in molecular dynamics, including ab initio molecular dynamics and reaction force-field molecular dynamics. Recent research on both approaches to catalyst calculations is reviewed, including growth, dehydrogenation, hydrogenation, oxidation reactions, bias, and recombination of carbon materials that can guide catalyst calculations. Machine learning (ML) has attracted increasing interest in recent years, and its combination with catalyst research has inspired promising development approaches. Its applications in machine learning potentials, catalyst design, performance prediction, structure optimization, and classification are summarized in detail. This review aims to offer insight into, and a perspective on, ML approaches in catalyst research.
|
24
|
Bule M, Jalalimanesh N, Bayrami Z, Baeeri M, Abdollahi M. The rise of deep learning and transformations in bioactivity prediction power of molecular modeling tools. Chem Biol Drug Des 2021; 98:954-967. [PMID: 34532977 DOI: 10.1111/cbdd.13750] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/21/2019] [Revised: 04/21/2020] [Accepted: 06/07/2020] [Indexed: 12/18/2022]
Abstract
The search for and design of better uses of bioactive compounds feature in many experiments that aim to mimic compounds' functions in the human body. However, finding a cost-effective and time-saving approach is a top priority in different disciplines. Nowadays, artificial intelligence (AI) and particularly deep learning (DL) methods are widely applied to improve the precision and accuracy of models used in the drug discovery process. DL approaches have been used to provide more opportunities for faster, more efficient, cost-effective, and reliable computer-aided drug discovery. Moreover, the increasing volume of biomedical data in areas such as genome sequences, medical images, and protein structures has made data mining algorithms very important in finding novel compounds that could be drugs, uncovering or repurposing drugs, and improving the area of genetic marker-based personalized medicine. Furthermore, deep neural networks (DNNs) have been demonstrated to outperform other techniques such as random forests and SVMs for QSAR studies and ligand-based virtual screening. Despite this, in QSAR studies, the quality of different data sources and potential experimental errors have greatly affected the accuracy of QSAR predictions. Therefore, further research is still needed to improve the accuracy, selectivity, and sensitivity of the DL approach in building the best models for drug discovery.
Affiliation(s)
- Mohammed Bule
- Department of Pharmacy, College of Medicine and Health Sciences, Ambo University, Ambo, Ethiopia; Department of Medicinal Chemistry, School of Pharmacy, Tehran University of Medical Sciences, Tehran, Iran; Toxicology and Diseases Group, Pharmaceutical Sciences Research Center (PSRC), The Institute of Pharmaceutical Sciences (TIPS), Tehran University of Medical Sciences, Tehran, Iran
| | - Nafiseh Jalalimanesh
- Toxicology and Diseases Group, Pharmaceutical Sciences Research Center (PSRC), The Institute of Pharmaceutical Sciences (TIPS), Tehran University of Medical Sciences, Tehran, Iran
| | - Zahra Bayrami
- Toxicology and Diseases Group, Pharmaceutical Sciences Research Center (PSRC), The Institute of Pharmaceutical Sciences (TIPS), Tehran University of Medical Sciences, Tehran, Iran
| | - Maryam Baeeri
- Toxicology and Diseases Group, Pharmaceutical Sciences Research Center (PSRC), The Institute of Pharmaceutical Sciences (TIPS), Tehran University of Medical Sciences, Tehran, Iran
| | - Mohammad Abdollahi
- Toxicology and Diseases Group, Pharmaceutical Sciences Research Center (PSRC), The Institute of Pharmaceutical Sciences (TIPS), Tehran University of Medical Sciences, Tehran, Iran; Department of Toxicology and Pharmacology, School of Pharmacy, Tehran University of Medical Sciences, Tehran, Iran
| |
|
25
|
Rácz A, Bajusz D, Miranda-Quintana RA, Héberger K. Machine learning models for classification tasks related to drug safety. Mol Divers 2021; 25:1409-1424. [PMID: 34110577 PMCID: PMC8342376 DOI: 10.1007/s11030-021-10239-x] [Citation(s) in RCA: 23] [Impact Index Per Article: 5.8] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/01/2021] [Accepted: 05/27/2021] [Indexed: 12/23/2022]
Abstract
In this review, we outline the current trends in the field of machine learning-driven classification studies related to ADME (absorption, distribution, metabolism and excretion) and toxicity endpoints from the past six years (2015-2021). The study focuses only on classification models with large datasets (i.e. more than a thousand compounds). A comprehensive literature search and meta-analysis was carried out for nine different targets: hERG-mediated cardiotoxicity, blood-brain barrier penetration, permeability glycoprotein (P-gp) substrate/inhibitor, cytochrome P450 enzyme family, acute oral toxicity, mutagenicity, carcinogenicity, respiratory toxicity and irritation/corrosion. The comparison of the best classification models was targeted to reveal the differences between machine learning algorithms and modeling types, endpoint-specific performances, dataset sizes and the different validation protocols. Based on the evaluation of the data, we can say that tree-based algorithms are (still) dominating the field, with consensus modeling being an increasing trend in drug safety predictions. Although one can already find classification models with great performances to hERG-mediated cardiotoxicity and the isoenzymes of the cytochrome P450 enzyme family, these targets are still central to ADMET-related research efforts.
Affiliation(s)
- Anita Rácz
- Plasma Chemistry Research Group, Research Centre for Natural Sciences, Magyar tudósok krt. 2, Budapest, 1117, Hungary.
| | - Dávid Bajusz
- Medicinal Chemistry Research Group, Research Centre for Natural Sciences, Magyar tudósok krt. 2, Budapest, 1117, Hungary
| | | | - Károly Héberger
- Plasma Chemistry Research Group, Research Centre for Natural Sciences, Magyar tudósok krt. 2, Budapest, 1117, Hungary.
| |
|
26
|
Piroozmand F, Mohammadipanah F, Sajedi H. Spectrum of deep learning algorithms in drug discovery. Chem Biol Drug Des 2021; 96:886-901. [PMID: 33058458 DOI: 10.1111/cbdd.13674] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/31/2019] [Revised: 02/11/2020] [Accepted: 02/19/2020] [Indexed: 12/16/2022]
Abstract
Deep learning (DL) algorithms are a subset of machine learning algorithms that aim to model complex mappings between a set of elements and their classes. In parallel with advances in revealing the molecular bases of diseases, notable efforts have been made to apply DL in data/library management, reaction optimization, differentiating uncertainties, molecule construction, creating metrics from qualitative results, and prediction of structures or interactions. From source identification to lead discovery and medicinal chemistry of the drug candidate, drug delivery, and modification, the challenges can be subjected to artificial intelligence algorithms to aid in the generation and interpretation of data. Discovery and design approaches both demand automation, large-scale data management, and data fusion, driven by advances in high-throughput methods. The application of DL can accelerate the exploration of drug mechanisms, the finding of novel indications for existing drugs (drug repositioning), drug development, and preclinical and clinical studies. The impact of DL on the workflow of drug discovery and design, and on their complementary tools, is highlighted in this review. Additionally, the types of DL algorithms used for this purpose and their pros and cons, along with the dominant directions of future research, are presented.
Affiliation(s)
- Firoozeh Piroozmand
- Pharmaceutical Biotechnology Lab, Department of Microbiology, School of Biology and Center of Excellence in Phylogeny of Living Organisms, College of Science, University of Tehran, Tehran, Iran
- Fatemeh Mohammadipanah
- Pharmaceutical Biotechnology Lab, Department of Microbiology, School of Biology and Center of Excellence in Phylogeny of Living Organisms, College of Science, University of Tehran, Tehran, Iran
- Hedieh Sajedi
- Department of Computer Science, School of Mathematics, Statistics and Computer Science, College of Science, University of Tehran, Tehran, Iran
|
27
|
Bouhedjar K, Boukelia A, Khorief Nacereddine A, Boucheham A, Belaidi A, Djerourou A. A natural language processing approach based on embedding deep learning from heterogeneous compounds for quantitative structure-activity relationship modeling. Chem Biol Drug Des 2021; 96:961-972. [PMID: 33058460 DOI: 10.1111/cbdd.13742] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 11/12/2019] [Revised: 05/27/2020] [Accepted: 05/31/2020] [Indexed: 12/15/2022]
Abstract
Over the past decade, rapid development in biological and chemical technologies such as high-throughput screening and parallel synthesis has significantly increased the amount of data, which requires the creation and integration of new analytical methods, especially deep learning models. Recently, there has been increasing interest in using deep learning in computer-aided drug discovery owing to its exceptionally successful application in many fields. The present work proposes a natural language processing approach based on embedding deep neural networks. Our method aims to transform the Simplified Molecular Input Line Entry System (SMILES) format into word embedding vectors to represent the semantics of compounds. These vectors are fed into supervised machine learning algorithms such as a convolutional long short-term memory neural network, support vector machine, and random forest to build quantitative structure-activity relationship models on toxicity data sets. The results obtained on toxicity data for the ciliate Tetrahymena pyriformis (IGC50) and on acute rat toxicity data expressed as the median lethal dose (LD50) show that our approach can eventually be used to predict the activities of chemical compounds efficiently. All material used in this study is available online through the GitHub portal (https://github.com/BoukeliaAbdelbasset/NLPDeepQSAR.git).
Affiliation(s)
- Khalid Bouhedjar
- Laboratoire de Synthèse et Biocatalyse Organique, Département de Chimie, Faculté des Sciences, Université Badji Mokhtar Annaba, Annaba, Algeria; Laboratoire Bioinformatique, Centre de Recherche en Biotechnologie (CRBt), Constantine, Algeria
- Abdelbasset Boukelia
- Laboratoire Bioinformatique, Centre de Recherche en Biotechnologie (CRBt), Constantine, Algeria; Computer Science Department, Faculty of NTIC University of Constantine 2 - Abdelhamid Mehri, Constantine, Algeria
- Abdelmalek Khorief Nacereddine
- Laboratory of Physical Chemistry and Biology of Materials, Department of Physics and Chemistry, Higher Normal School of Technological Education-Skikda, Skikda, Algeria
- Anouar Boucheham
- University Salah Boubnider Constantine, Constantine, Algeria; Laboratory of Molecular and Cellular Biology, Constantine, Algeria
- Amine Belaidi
- Laboratoire Bioinformatique, Centre de Recherche en Biotechnologie (CRBt), Constantine, Algeria
- Abdelhafid Djerourou
- Laboratoire de Synthèse et Biocatalyse Organique, Département de Chimie, Faculté des Sciences, Université Badji Mokhtar Annaba, Annaba, Algeria
|
28
|
Del Cueto M, Troisi A. Determining usefulness of machine learning in materials discovery using simulated research landscapes. Phys Chem Chem Phys 2021; 23:14156-14163. [PMID: 34079968 DOI: 10.1039/d1cp01761f] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.0] [Indexed: 01/09/2023]
Abstract
When existing experimental data are combined with machine learning (ML) to predict the performance of new materials, the data acquisition bias determines ML usefulness and the prediction accuracy. In this context, the following two conditions are highly common: (i) constructing new unbiased data sets is too expensive and the global knowledge effectively does not change by performing a limited number of novel measurements; (ii) the performance of the material depends on a limited number of physical parameters, much smaller than the range of variables that can be changed, albeit such parameters are unknown or not measurable. To determine the usefulness of ML under these conditions, we introduce the concept of simulated research landscapes, which describe how datasets of arbitrary complexity evolve over time. Simulated research landscapes allow us to use different discovery strategies to compare standard materials exploration with ML-guided explorations, i.e. we can measure quantitatively the benefit of using a specific ML model. We show that there is a window of opportunity to obtain a significant benefit from ML-guided strategies. The adoption of ML can take place too soon (not enough information to find patterns) or too late (dense datasets only allow for negligible ML benefit), and the adoption of ML can even slow down the discovery process in some cases. We offer a qualitative guide on when ML can accelerate the discovery of new best-performing materials in a field under specific conditions. The answer in each case depends on factors like data dimensionality, corrugation and data collection strategy. We consider how these factors may affect the ML prediction capabilities and discuss some general trends.
Affiliation(s)
- Marcos Del Cueto
- Department of Chemistry, University of Liverpool, Liverpool, L69 3BX, UK.
- Alessandro Troisi
- Department of Chemistry, University of Liverpool, Liverpool, L69 3BX, UK.
|
29
|
Dos Santos Nascimento IJ, de Aquino TM, da Silva-Júnior EF. Drug Repurposing: A Strategy for Discovering Inhibitors against Emerging Viral Infections. Curr Med Chem 2021; 28:2887-2942. [PMID: 32787752 DOI: 10.2174/0929867327666200812215852] [Citation(s) in RCA: 24] [Impact Index Per Article: 6.0] [Received: 04/26/2020] [Revised: 07/21/2020] [Accepted: 07/22/2020] [Indexed: 11/22/2022]
Abstract
BACKGROUND Viral diseases are responsible for several deaths around the world. Over the past few years, the world has seen several outbreaks caused by viral diseases that, for a long time, seemed to possess no risk. These are diseases that have been forgotten for a long time and, until nowadays, there are no approved drugs or vaccines, leading the pharmaceutical industry and several research groups to run out of time in the search for new pharmacological treatments or prevention methods. In this context, drug repurposing proves to be a fast and economically viable technique, considering the fact that it uses drugs that have a well-established safety profile. Thus, in this review, we present the main advances in drug repurposing and their benefit for searching new treatments against emerging viral diseases. METHODS We conducted a search in the bibliographic databases (Science Direct, Bentham Science, PubMed, Springer, ACS Publisher, Wiley, and NIH's COVID-19 Portfolio) using the keywords "drug repurposing", "emerging viral infections" and each of the diseases reported here (CoV; ZIKV; DENV; CHIKV; EBOV and MARV) as an inclusion/exclusion criterion. A subjective analysis was performed regarding the quality of the works for inclusion in this manuscript. Thus, the selected works were those that presented drugs repositioned against the emerging viral diseases presented here by means of computational, high-throughput screening or phenotype-based strategies, with no time limit and of relevant scientific value. RESULTS 291 papers were selected, 24 of which were CHIKV; 52 for ZIKV; 43 for DENV; 35 for EBOV; 10 for MARV; and 56 for CoV and the rest (72 papers) related to the drugs repurposing and emerging viral diseases. Among CoV-related articles, most were published in 2020 (31 papers), updating the current topic. 
Besides, between the years 2003 - 2005, 10 articles were created, and from 2011 - 2015, there were 7 articles, portraying the outbreaks that occurred at that time. For ZIKV, similar to CoV, most publications were during the period of outbreaks between the years 2016 - 2017 (23 articles). Similarly, most CHIKV (13 papers) and DENV (14 papers) publications occur at the same time interval. For EBOV (13 papers) and MARV (4 papers), they were between the years 2015 - 2016. Through this review, several drugs were highlighted that can be evolved in vivo and clinical trials as possible used against these pathogens showed that remdesivir represent potential treatments against CoV. Furthermore, ribavirin may also be a potential treatment against CHIKV; sofosbuvir against ZIKV; celgosivir against DENV, and favipiravir against EBOV and MARV, representing new hopes against these pathogens. CONCLUSION The conclusions of this review manuscript show the potential of the drug repurposing strategy in the discovery of new pharmaceutical products, as from this approach, drugs could be used against emerging viral diseases. Thus, this strategy deserves more attention among research groups and is a promising approach to the discovery of new drugs against emerging viral diseases and also other diseases.
|
30
|
Nichols PL. Automated and enabling technologies for medicinal chemistry. PROGRESS IN MEDICINAL CHEMISTRY 2021; 60:191-272. [PMID: 34147203 DOI: 10.1016/bs.pmch.2021.01.003] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 01/04/2023]
Abstract
Having always been driven by the need to get new treatments to patients as quickly as possible, drug discovery is a constantly evolving process. This chapter will review how medicinal chemistry was established, how it has changed over the years due to the emergence of new enabling technologies, and how early advances in synthesis, purification and analysis, have provided the foundations upon which the current automated and enabling technologies are built. Looking beyond the established technologies, this chapter will also consider technologies that are now emerging, and their impact on the future of drug discovery and the role of medicinal chemists.
Affiliation(s)
- Paula L Nichols
- Synple Chem AG, Kemptthal, Switzerland; ETH, Zurich, Switzerland.
|
31
|
Koutsoukos S, Philippi F, Malaret F, Welton T. A review on machine learning algorithms for the ionic liquid chemical space. Chem Sci 2021; 12:6820-6843. [PMID: 34123314 PMCID: PMC8153233 DOI: 10.1039/d1sc01000j] [Citation(s) in RCA: 49] [Impact Index Per Article: 12.3] [Received: 02/19/2021] [Accepted: 04/28/2021] [Indexed: 01/05/2023] Open
Abstract
There are thousands of papers published every year investigating the properties and possible applications of ionic liquids. Industrial use of these exceptional fluids requires adequate understanding of their physical properties, in order to create the ionic liquid that will optimally suit the application. Computational property prediction arose from the urgent need to minimise the time and cost that would be required to experimentally test different combinations of ions. This review discusses the use of machine learning algorithms as property prediction tools for ionic liquids (either as standalone methods or in conjunction with molecular dynamics simulations), presents common problems of training datasets and proposes ways that could lead to more accurate and efficient models.
Affiliation(s)
- Spyridon Koutsoukos
- Department of Chemistry, Molecular Sciences Research Hub, Imperial College London White City Campus London W12 0BZ UK
- Frederik Philippi
- Department of Chemistry, Molecular Sciences Research Hub, Imperial College London White City Campus London W12 0BZ UK
- Francisco Malaret
- Department of Chemical Engineering, Imperial College London South Kensington Campus London SW7 2AZ UK
- Tom Welton
- Department of Chemistry, Molecular Sciences Research Hub, Imperial College London White City Campus London W12 0BZ UK
|
32
|
Taking the leap between analytical chemistry and artificial intelligence: A tutorial review. Anal Chim Acta 2021; 1161:338403. [DOI: 10.1016/j.aca.2021.338403] [Citation(s) in RCA: 33] [Impact Index Per Article: 8.3] [Received: 12/04/2020] [Revised: 03/02/2021] [Accepted: 03/03/2021] [Indexed: 01/01/2023]
|
33
|
Nayarisseri A, Khandelwal R, Tanwar P, Madhavi M, Sharma D, Thakur G, Speck-Planche A, Singh SK. Artificial Intelligence, Big Data and Machine Learning Approaches in Precision Medicine & Drug Discovery. Curr Drug Targets 2021; 22:631-655. [PMID: 33397265 DOI: 10.2174/1389450122999210104205732] [Citation(s) in RCA: 21] [Impact Index Per Article: 5.3] [Received: 05/23/2020] [Revised: 08/21/2020] [Accepted: 09/14/2020] [Indexed: 11/22/2022]
Abstract
Artificial Intelligence revolutionizes the drug development process that can quickly identify potential biologically active compounds from millions of candidate within a short period. The present review is an overview based on some applications of Machine Learning based tools, such as GOLD, Deep PVP, LIB SVM, etc. and the algorithms involved such as support vector machine (SVM), random forest (RF), decision tree and Artificial Neural Network (ANN), etc. at various stages of drug designing and development. These techniques can be employed in SNP discoveries, drug repurposing, ligand-based drug design (LBDD), Ligand-based Virtual Screening (LBVS) and Structure- based Virtual Screening (SBVS), Lead identification, quantitative structure-activity relationship (QSAR) modeling, and ADMET analysis. It is demonstrated that SVM exhibited better performance in indicating that the classification model will have great applications on human intestinal absorption (HIA) predictions. Successful cases have been reported which demonstrate the efficiency of SVM and RF models in identifying JFD00950 as a novel compound targeting against a colon cancer cell line, DLD-1, by inhibition of FEN1 cytotoxic and cleavage activity. Furthermore, a QSAR model was also used to predict flavonoid inhibitory effects on AR activity as a potent treatment for diabetes mellitus (DM), using ANN. Hence, in the era of big data, ML approaches have been evolved as a powerful and efficient way to deal with the huge amounts of generated data from modern drug discovery to model small-molecule drugs, gene biomarkers and identifying the novel drug targets for various diseases.
Affiliation(s)
- Anuraj Nayarisseri
- In silico Research Laboratory, Eminent Biosciences, Mahalakshmi Nagar, Indore - 452010, Madhya Pradesh, India
- Ravina Khandelwal
- In silico Research Laboratory, Eminent Biosciences, Mahalakshmi Nagar, Indore - 452010, Madhya Pradesh, India
- Poonam Tanwar
- In silico Research Laboratory, Eminent Biosciences, Mahalakshmi Nagar, Indore - 452010, Madhya Pradesh, India
- Maddala Madhavi
- Department of Zoology, Nizam College, Osmania University, Hyderabad - 500001, Telangana State, India
- Diksha Sharma
- In silico Research Laboratory, Eminent Biosciences, Mahalakshmi Nagar, Indore - 452010, Madhya Pradesh, India
- Garima Thakur
- In silico Research Laboratory, Eminent Biosciences, Mahalakshmi Nagar, Indore - 452010, Madhya Pradesh, India
- Alejandro Speck-Planche
- Programa Institucional de Fomento a la Investigacion, Desarrollo e Innovacion, Universidad Tecnologica Metropolitana, Ignacio Valdivieso 2409, P.O. 8940577, San Joaquin, Santiago, Chile
- Sanjeev Kumar Singh
- Computer Aided Drug Designing and Molecular Modeling Lab, Department of Bioinformatics, Alagappa University, Karaikudi-630003, Tamil Nadu, India
|
34
|
Pereira T, Abbasi M, Ribeiro B, Arrais JP. Diversity oriented Deep Reinforcement Learning for targeted molecule generation. J Cheminform 2021; 13:21. [PMID: 33750461 PMCID: PMC7944916 DOI: 10.1186/s13321-021-00498-z] [Citation(s) in RCA: 16] [Impact Index Per Article: 4.0] [Received: 11/13/2020] [Accepted: 02/22/2021] [Indexed: 11/10/2022] Open
Abstract
In this work, we explore the potential of deep learning to streamline the process of identifying new potential drugs through the computational generation of molecules with interesting biological properties. Two deep neural networks compose our targeted generation framework: the Generator, which is trained to learn the building rules of valid molecules employing SMILES strings notation, and the Predictor which evaluates the newly generated compounds by predicting their affinity for the desired target. Then, the Generator is optimized through Reinforcement Learning to produce molecules with bespoken properties. The innovation of this approach is the exploratory strategy applied during the reinforcement training process that seeks to add novelty to the generated compounds. This training strategy employs two Generators interchangeably to sample new SMILES: the initially trained model that will remain fixed and a copy of the previous one that will be updated during the training to uncover the most promising molecules. The evolution of the reward assigned by the Predictor determines how often each one is employed to select the next token of the molecule. This strategy establishes a compromise between the need to acquire more information about the chemical space and the need to sample new molecules, with the experience gained so far. To demonstrate the effectiveness of the method, the Generator is trained to design molecules with an optimized coefficient of partition and also high inhibitory power against the Adenosine [Formula: see text] and [Formula: see text] opioid receptors. The results reveal that the model can effectively adjust the newly generated molecules towards the wanted direction. More importantly, it was possible to find promising sets of unique and diverse molecules, which was the main purpose of the newly implemented strategy.
Affiliation(s)
- Tiago Pereira
- Department of Informatics Engineering, Centre for Informatics and Systems of the University of Coimbra, University of Coimbra, Pinhal de Marrocos, Coimbra, Portugal
- Maryam Abbasi
- Department of Informatics Engineering, Centre for Informatics and Systems of the University of Coimbra, University of Coimbra, Pinhal de Marrocos, Coimbra, Portugal
- Bernardete Ribeiro
- Department of Informatics Engineering, Centre for Informatics and Systems of the University of Coimbra, University of Coimbra, Pinhal de Marrocos, Coimbra, Portugal
- Joel P. Arrais
- Department of Informatics Engineering, Centre for Informatics and Systems of the University of Coimbra, University of Coimbra, Pinhal de Marrocos, Coimbra, Portugal
|
35
|
Mills B, Isaac RE, Foster R. Metalloaminopeptidases of the Protozoan Parasite Plasmodium falciparum as Targets for the Discovery of Novel Antimalarial Drugs. J Med Chem 2021; 64:1763-1785. [PMID: 33534577 DOI: 10.1021/acs.jmedchem.0c01721] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Indexed: 01/04/2023]
Abstract
Malaria poses a significant threat to approximately half of the world's population with an annual death toll close to half a million. The emergence of resistance to front-line antimalarials in the most lethal human parasite species, Plasmodium falciparum (Pf), threatens progress made in malaria control. The prospect of losing the efficacy of antimalarial drugs is driving the search for small molecules with new modes of action. Asexual reproduction of the parasite is critically dependent on the recycling of amino acids through catabolism of hemoglobin (Hb), which makes metalloaminopeptidases (MAPs) attractive targets for the development of new drugs. The Pf genome encodes eight MAPs, some of which have been found to be essential for parasite survival. In this article, we discuss the biological structure and function of each MAP within the Pf genome, along with the drug discovery efforts that have been undertaken to identify novel antimalarial candidates of therapeutic value.
Affiliation(s)
- Belinda Mills
- School of Chemistry, University of Leeds, Leeds, U.K., LS2 9JT
- R Elwyn Isaac
- School of Biology, University of Leeds, Leeds, U.K., LS2 9JT
- Richard Foster
- School of Chemistry, University of Leeds, Leeds, U.K., LS2 9JT
|
36
|
Abstract
Technology advancement demands energy storage devices (ESD) and systems (ESS) with better performance, longer life, higher reliability, and smarter management strategies. Designing such systems involves a trade-off among a large set of parameters, whereas advanced control strategies need to rely on the instantaneous status of many indicators. Machine learning can dramatically accelerate calculations, capture complex mechanisms to improve prediction accuracy, and make optimized decisions based on comprehensive status information. Its computational efficiency makes it applicable to real-time management. This paper reviews recent progress in this emerging area, especially new concepts, approaches, and applications of machine learning technologies for commonly used energy storage devices (including batteries, capacitors/supercapacitors, fuel cells, and other ESDs) and systems (including battery ESS, hybrid ESS, grids and microgrids containing energy storage units, pumped-storage systems, and thermal ESS). A perspective on future directions is also discussed.
Affiliation(s)
- Tianhan Gao
- Department of Mechanical Engineering, University of Michigan, Ann Arbor, MI 48109, USA
- Wei Lu
- Department of Mechanical Engineering, University of Michigan, Ann Arbor, MI 48109, USA; Department of Materials Science & Engineering, University of Michigan, Ann Arbor, MI 48109, USA
|
37
|
Maser MR, Cui AY, Ryou S, DeLano TJ, Yue Y, Reisman SE. Multilabel Classification Models for the Prediction of Cross-Coupling Reaction Conditions. J Chem Inf Model 2021; 61:156-166. [PMID: 33417449 DOI: 10.1021/acs.jcim.0c01234] [Citation(s) in RCA: 32] [Impact Index Per Article: 8.0] [Indexed: 12/14/2022]
Abstract
Machine-learned ranking models have been developed for the prediction of substrate-specific cross-coupling reaction conditions. Data sets of published reactions were curated for Suzuki, Negishi, and C-N couplings, as well as Pauson-Khand reactions. String, descriptor, and graph encodings were tested as input representations, and models were trained to predict the set of conditions used in a reaction as a binary vector. Unique reagent dictionaries categorized by expert-crafted reaction roles were constructed for each data set, leading to context-aware predictions. We find that relational graph convolutional networks and gradient-boosting machines are very effective for this learning task, and we disclose a novel reaction-level graph attention operation in the top-performing model.
Affiliation(s)
- Michael R Maser
- Division of Chemistry and Chemical Engineering, California Institute of Technology, Pasadena, California 91125, United States
- Alexander Y Cui
- Department of Computing and Mathematical Sciences, California Institute of Technology, Pasadena, California 91125, United States
- Serim Ryou
- Computational Vision Lab, California Institute of Technology, Pasadena, California 91125, United States
- Travis J DeLano
- Division of Chemistry and Chemical Engineering, California Institute of Technology, Pasadena, California 91125, United States
- Yisong Yue
- Department of Computing and Mathematical Sciences, California Institute of Technology, Pasadena, California 91125, United States
- Sarah E Reisman
- Division of Chemistry and Chemical Engineering, California Institute of Technology, Pasadena, California 91125, United States
|
38
|
Rognan D. Modeling Protein-Ligand Interactions: Are We Ready for Deep Learning? SYSTEMS MEDICINE 2021. [DOI: 10.1016/b978-0-12-801238-3.11521-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/16/2022] Open
|
39
|
|
40
|
Agamah FE, Mazandu GK, Hassan R, Bope CD, Thomford NE, Ghansah A, Chimusa ER. Computational/in silico methods in drug target and lead prediction. Brief Bioinform 2020; 21:1663-1675. [PMID: 31711157 PMCID: PMC7673338 DOI: 10.1093/bib/bbz103] [Citation(s) in RCA: 116] [Impact Index Per Article: 23.2] [Received: 05/21/2019] [Revised: 07/17/2019] [Accepted: 07/18/2019] [Indexed: 01/10/2023] Open
Abstract
Drug-like compounds are most of the time denied approval and use owing to the unexpected clinical side effects and cross-reactivity observed during clinical trials. These unexpected outcomes resulting in significant increase in attrition rate centralizes on the selected drug targets. These targets may be disease candidate proteins or genes, biological pathways, disease-associated microRNAs, disease-related biomarkers, abnormal molecular phenotypes, crucial nodes of biological network or molecular functions. This is generally linked to several factors, including incomplete knowledge on the drug targets and unpredicted pharmacokinetic expressions upon target interaction or off-target effects. A method used to identify targets, especially for polygenic diseases, is essential and constitutes a major bottleneck in drug development with the fundamental stage being the identification and validation of drug targets of interest for further downstream processes. Thus, various computational methods have been developed to complement experimental approaches in drug discovery. Here, we present an overview of various computational methods and tools applied in predicting or validating drug targets and drug-like molecules. We provide an overview on their advantages and compare these methods to identify effective methods which likely lead to optimal results. We also explore major sources of drug failure considering the challenges and opportunities involved. This review might guide researchers on selecting the most efficient approach or technique during the computational drug discovery process.
Affiliation(s)
- Francis E Agamah
- Division of Human Genetics, Department of Pathology, University of Cape Town, Observatory 7925, South Africa
- Gaston K Mazandu
- Division of Human Genetics, Department of Pathology, University of Cape Town, Observatory 7925, South Africa
- African Institute for Mathematical Sciences, Muizenberg, Cape Town 7945, South Africa
- Radia Hassan
- Division of Human Genetics, Department of Pathology, University of Cape Town, Observatory 7925, South Africa
- Christian D Bope
- Division of Human Genetics, Department of Pathology, University of Cape Town, Observatory 7925, South Africa
- Faculty of Sciences, University of Kinshasa, Kinshasa, Democratic Republic of Congo
- Nicholas E Thomford
- Division of Human Genetics, Department of Pathology, University of Cape Town, Observatory 7925, South Africa
- School of Medical Sciences, University of Cape Coast, PMB, Cape Coast, Ghana
- Anita Ghansah
- Noguchi Memorial Institute for Medical Research, College of Health Sciences, University of Ghana, PO Box LG 581, Legon, Ghana
- Emile R Chimusa
- Division of Human Genetics, Department of Pathology, University of Cape Town, Observatory 7925, South Africa
|
41
|
Samanta S, O’Hagan S, Swainston N, Roberts TJ, Kell DB. VAE-Sim: A Novel Molecular Similarity Measure Based on a Variational Autoencoder. Molecules 2020; 25:E3446. [PMID: 32751155 PMCID: PMC7435890 DOI: 10.3390/molecules25153446] [Citation(s) in RCA: 16] [Impact Index Per Article: 3.2] [Received: 06/26/2020] [Revised: 07/21/2020] [Accepted: 07/28/2020] [Indexed: 01/13/2023] Open
Abstract
Molecular similarity is an elusive but core "unsupervised" cheminformatics concept, yet different "fingerprint" encodings of molecular structures return very different similarity values, even when using the same similarity metric. Each encoding may be of value when applied to other problems with objective or target functions, implying that a priori none are "better" than the others, nor than encoding-free metrics such as maximum common substructure (MCSS). We here introduce a novel approach to molecular similarity, in the form of a variational autoencoder (VAE). This learns the joint distribution p(z|x) where z is a latent vector and x are the (same) input/output data. It takes the form of a "bowtie"-shaped artificial neural network. In the middle is a "bottleneck layer" or latent vector in which inputs are transformed into, and represented as, a vector of numbers (encoding), with a reverse process (decoding) seeking to return the SMILES string that was the input. We train a VAE on over six million druglike molecules and natural products (including over one million in the final holdout set). The VAE vector distances provide a rapid and novel metric for molecular similarity that is both easily and rapidly calculated. We describe the method and its application to a typical similarity problem in cheminformatics.
Affiliation(s)
- Soumitra Samanta
- Department of Biochemistry and Systems Biology, Institute of Systems, Molecular and Integrative Biology, University of Liverpool, Crown St, Liverpool L69 7ZB, UK; (S.S.); (N.S.); (T.J.R.)
- Steve O’Hagan
- Department of Chemistry, The Manchester Institute of Biotechnology, The University of Manchester, 131 Princess St, Manchester M1 7DN, UK
- Neil Swainston
- Department of Biochemistry and Systems Biology, Institute of Systems, Molecular and Integrative Biology, University of Liverpool, Crown St, Liverpool L69 7ZB, UK; (S.S.); (N.S.); (T.J.R.)
- Timothy J. Roberts
- Department of Biochemistry and Systems Biology, Institute of Systems, Molecular and Integrative Biology, University of Liverpool, Crown St, Liverpool L69 7ZB, UK; (S.S.); (N.S.); (T.J.R.)
- Douglas B. Kell
- Department of Biochemistry and Systems Biology, Institute of Systems, Molecular and Integrative Biology, University of Liverpool, Crown St, Liverpool L69 7ZB, UK; (S.S.); (N.S.); (T.J.R.)
- Novo Nordisk Foundation Centre for Biosustainability, Technical University of Denmark, Building 220, Kemitorvet, 2800 Kgs Lyngby, Denmark
|
42
|
Tan YS, Mhoumadi Y, Verma CS. Roles of computational modelling in understanding p53 structure, biology, and its therapeutic targeting. J Mol Cell Biol 2020; 11:306-316. [PMID: 30726928 PMCID: PMC6487789 DOI: 10.1093/jmcb/mjz009] [Citation(s) in RCA: 22] [Impact Index Per Article: 4.4] [Received: 11/14/2018] [Revised: 12/14/2018] [Accepted: 01/31/2019] [Indexed: 12/21/2022] Open
Abstract
The transcription factor p53 plays pivotal roles in numerous biological processes, including the suppression of tumours. The rich availability of biophysical data aimed at understanding its structure–function relationships since the 1990s has enabled the application of a variety of computational modelling techniques towards the establishment of mechanistic models. Together they have provided deep insights into the structure, mechanics, energetics, and dynamics of p53. In parallel, the observation that mutations in p53 or changes in its associated pathways characterize several human cancers has resulted in a race to develop therapeutic modulators of p53, some of which have entered clinical trials. This review describes how computational modelling has played key roles in understanding structural-dynamic aspects of p53, formulating hypotheses about domains that are beyond current experimental investigations, and developing therapeutic molecules that target the p53 pathway.
Affiliation(s)
- Yaw Sing Tan
- Bioinformatics Institute, Agency for Science, Technology and Research (A*STAR), 30 Biopolis Street, #07-01 Matrix, Singapore
- Yasmina Mhoumadi
- Bioinformatics Institute, Agency for Science, Technology and Research (A*STAR), 30 Biopolis Street, #07-01 Matrix, Singapore; School of Biological Sciences, Nanyang Technological University, 60 Nanyang Drive, Singapore
- Chandra S Verma
- Bioinformatics Institute, Agency for Science, Technology and Research (A*STAR), 30 Biopolis Street, #07-01 Matrix, Singapore; School of Biological Sciences, Nanyang Technological University, 60 Nanyang Drive, Singapore; Department of Biological Sciences, National University of Singapore, 14 Science Drive 4, Singapore
|
43
|
Serafim MSM, Kronenberger T, Oliveira PR, Poso A, Honório KM, Mota BEF, Maltarollo VG. The application of machine learning techniques to innovative antibacterial discovery and development. Expert Opin Drug Discov 2020; 15:1165-1180. [PMID: 32552005 DOI: 10.1080/17460441.2020.1776696] [Citation(s) in RCA: 18] [Impact Index Per Article: 3.6] [Indexed: 12/13/2022]
Abstract
INTRODUCTION: After the initial wave of antibiotic discovery, few novel classes of antibiotics have emerged, with the latest dating back to the 1980s. Furthermore, the pace of antibiotic drug discovery is unable to keep up with the increasing prevalence of antibiotic drug resistance. However, the increasing amount of available data promotes the use of machine learning techniques (MLT) in drug discovery projects (e.g. construction of regression/classification models and ranking/virtual screening of compounds). AREAS COVERED: In this review, the authors cover some of the applications of MLT in medicinal chemistry, focusing on the development of new antibiotics and the prediction of resistance and its mechanisms. The aim of this review is to illustrate the main advantages and disadvantages and the major trends from studies over the past 5 years. EXPERT OPINION: The application of MLT to antibacterial drug discovery can aid the selection of new and potent lead compounds, with desirable pharmacokinetic and toxic profiles for further optimization. The increasing volume of available data along with the constant improvement in computational power and algorithms has meant that we are experiencing a transition in the way we face modern issues such as drug resistance, where our decisions are data-driven and experiments can be focused by data-suggested hypotheses.
Affiliation(s)
- Mateus Sá Magalhães Serafim
- Departamento de Microbiologia, Instituto de Ciências Biológicas, Universidade Federal de Minas Gerais (UFMG), Belo Horizonte, Brazil
- Thales Kronenberger
- Department of Internal Medicine VIII, University Hospital of Tübingen, Tübingen, Germany
- Antti Poso
- Department of Internal Medicine VIII, University Hospital of Tübingen, Tübingen, Germany; School of Pharmacy, Faculty of Health Sciences, University of Eastern Finland, Kuopio, Finland
- Káthia Maria Honório
- Escola de Artes, Ciências e Humanidades, Universidade de São Paulo (USP), São Paulo, Brazil; Centro de Ciências Naturais e Humanas, Universidade Federal do ABC, Santo André, Brazil
- Bruno Eduardo Fernandes Mota
- Departamento de Análises Clínicas e Toxicológicas, Faculdade de Farmácia, Universidade Federal de Minas Gerais (UFMG), Belo Horizonte, Brazil
- Vinícius Gonçalves Maltarollo
- Departamento de Produtos Farmacêuticos, Faculdade de Farmácia, Universidade Federal de Minas Gerais (UFMG), Belo Horizonte, Brazil
|
44
|
Coley CW, Eyke NS, Jensen KF. Autonomous Discovery in the Chemical Sciences Part I: Progress. Angew Chem Int Ed Engl 2020; 59:22858-22893. [DOI: 10.1002/anie.201909987] [Citation(s) in RCA: 100] [Impact Index Per Article: 20.0] [Received: 08/06/2019] [Indexed: 01/05/2023]
Affiliation(s)
- Connor W. Coley
- Department of Chemical Engineering, Massachusetts Institute of Technology, Cambridge, MA 02139, USA
- Natalie S. Eyke
- Department of Chemical Engineering, Massachusetts Institute of Technology, Cambridge, MA 02139, USA
- Klavs F. Jensen
- Department of Chemical Engineering, Massachusetts Institute of Technology, Cambridge, MA 02139, USA
|
45
|
Coley CW, Eyke NS, Jensen KF. Autonome Entdeckung in den chemischen Wissenschaften, Teil I: Fortschritt. Angew Chem Int Ed Engl 2020. [DOI: 10.1002/ange.201909987] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.4] [Indexed: 12/14/2022]
Affiliation(s)
- Connor W. Coley
- Department of Chemical Engineering, Massachusetts Institute of Technology, Cambridge, MA 02139, USA
- Natalie S. Eyke
- Department of Chemical Engineering, Massachusetts Institute of Technology, Cambridge, MA 02139, USA
- Klavs F. Jensen
- Department of Chemical Engineering, Massachusetts Institute of Technology, Cambridge, MA 02139, USA
|
46
|
Chuang KV, Gunsalus LM, Keiser MJ. Learning Molecular Representations for Medicinal Chemistry. J Med Chem 2020; 63:8705-8722. [PMID: 32366098 DOI: 10.1021/acs.jmedchem.0c00385] [Citation(s) in RCA: 78] [Impact Index Per Article: 15.6] [Indexed: 12/21/2022]
Abstract
The accurate modeling and prediction of small molecule properties and bioactivities depend on the critical choice of molecular representation. Decades of informatics-driven research have relied on expert-designed molecular descriptors to establish quantitative structure-activity and structure-property relationships for drug discovery. Now, advances in deep learning make it possible to efficiently and compactly learn molecular representations directly from data. In this review, we discuss how active research in molecular deep learning can address limitations of current descriptors and fingerprints while creating new opportunities in cheminformatics and virtual screening. We provide a concise overview of the role of representations in cheminformatics, key concepts in deep learning, and argue that learning representations provides a way forward to improve the predictive modeling of small molecule bioactivities and properties.
Affiliation(s)
- Kangway V Chuang
- Department of Pharmaceutical Chemistry, Department of Bioengineering & Therapeutic Sciences, Institute for Neurodegenerative Diseases, Kavli Institute for Fundamental Neuroscience, Bakar Computational Health Sciences Institute, University of California, San Francisco, 675 Nelson Rising Lane, San Francisco, California 94143, United States
- Laura M Gunsalus
- Department of Pharmaceutical Chemistry, Department of Bioengineering & Therapeutic Sciences, Institute for Neurodegenerative Diseases, Kavli Institute for Fundamental Neuroscience, Bakar Computational Health Sciences Institute, University of California, San Francisco, 675 Nelson Rising Lane, San Francisco, California 94143, United States
- Michael J Keiser
- Department of Pharmaceutical Chemistry, Department of Bioengineering & Therapeutic Sciences, Institute for Neurodegenerative Diseases, Kavli Institute for Fundamental Neuroscience, Bakar Computational Health Sciences Institute, University of California, San Francisco, 675 Nelson Rising Lane, San Francisco, California 94143, United States
|
47
|
Cova TFGG, Pais AACC. Deep Learning for Deep Chemistry: Optimizing the Prediction of Chemical Patterns. Front Chem 2019; 7:809. [PMID: 32039134 PMCID: PMC6988795 DOI: 10.3389/fchem.2019.00809] [Citation(s) in RCA: 67] [Impact Index Per Article: 11.2] [Received: 07/16/2019] [Accepted: 11/11/2019] [Indexed: 12/14/2022] Open
Abstract
Computational Chemistry is currently a synergistic assembly between ab initio calculations, simulation, machine learning (ML) and optimization strategies for describing, solving and predicting chemical data and related phenomena. These include accelerated literature searches, analysis and prediction of physical and quantum chemical properties, transition states, chemical structures, chemical reactions, and also new catalysts and drug candidates. The generalization of scalability to larger chemical problems, rather than specialization, is now the main principle for transforming chemical tasks in multiple fronts, for which systematic and cost-effective solutions have benefited from ML approaches, including those based on deep learning (e.g. quantum chemistry, molecular screening, synthetic route design, catalysis, drug discovery). The latter class of ML algorithms is capable of combining raw input into layers of intermediate features, enabling bench-to-bytes designs with the potential to transform several chemical domains. In this review, the most exciting developments concerning the use of ML in a range of different chemical scenarios are described. A range of different chemical problems and respective rationalization, that have hitherto been inaccessible due to the lack of suitable analysis tools, is thus detailed, evidencing the breadth of potential applications of these emerging multidimensional approaches. Focus is given to the models, algorithms and methods proposed to facilitate research on compound design and synthesis, materials design, prediction of binding, molecular activity, and soft matter behavior. 
The information produced by pairing Chemistry and ML, through data-driven analyses, neural network predictions and monitoring of chemical systems, allows (i) prompting the ability to understand the complexity of chemical data, (ii) streamlining and designing experiments, (iii) discovering new molecular targets and materials, and also (iv) planning or rethinking forthcoming chemical challenges. In fact, optimization engulfs all these tasks directly.
Affiliation(s)
- Tânia F. G. G. Cova
- Coimbra Chemistry Centre, CQC, Department of Chemistry, Faculty of Sciences and Technology, University of Coimbra, Coimbra, Portugal
- Alberto A. C. C. Pais
- Coimbra Chemistry Centre, CQC, Department of Chemistry, Faculty of Sciences and Technology, University of Coimbra, Coimbra, Portugal
|
48
|
Zhao Z, Dai X, Li C, Wang X, Tian J, Feng Y, Xie J, Ma C, Nie Z, Fan P, Qian M, He X, Wu S, Zhang Y, Zheng X. Pyrazolone structural motif in medicinal chemistry: Retrospect and prospect. Eur J Med Chem 2019; 186:111893. [PMID: 31761383 PMCID: PMC7115706 DOI: 10.1016/j.ejmech.2019.111893] [Citation(s) in RCA: 95] [Impact Index Per Article: 15.8] [Received: 10/22/2019] [Revised: 11/14/2019] [Accepted: 11/14/2019] [Indexed: 12/13/2022]
Abstract
The pyrazolone structural motif is a critical element of drugs aimed at different biological end-points. Medicinal chemistry research has produced drug-like pyrazolone candidates with several medicinal features, including antimicrobial, antitumor, CNS (central nervous system), and anti-inflammatory activities, among others. Meanwhile, SAR (structure-activity relationship) investigations have drawn the attention of medicinal chemists, and numerous analogues have been derived for multiple targets. In this review, we comprehensively summarize the biological activity and SAR of pyrazolone analogues, aiming to give an overall retrospect and prospect on pyrazolone derivatives.
Affiliation(s)
- Zefeng Zhao
- School of Pharmacy, Key Laboratory of Resource Biology and Biotechnology in Western China, Ministry of Education, Biomedicine Key Laboratory of Shaanxi Province, Northwest University, 229 Taibai Road, Xi'an, 710069, China
- Xufen Dai
- School of Pharmacy, Key Laboratory of Resource Biology and Biotechnology in Western China, Ministry of Education, Biomedicine Key Laboratory of Shaanxi Province, Northwest University, 229 Taibai Road, Xi'an, 710069, China
- Chenyang Li
- School of Pharmacy, Key Laboratory of Resource Biology and Biotechnology in Western China, Ministry of Education, Biomedicine Key Laboratory of Shaanxi Province, Northwest University, 229 Taibai Road, Xi'an, 710069, China
- Xiao Wang
- School of Pharmacy, Key Laboratory of Resource Biology and Biotechnology in Western China, Ministry of Education, Biomedicine Key Laboratory of Shaanxi Province, Northwest University, 229 Taibai Road, Xi'an, 710069, China
- Jiale Tian
- School of Pharmacy, Key Laboratory of Resource Biology and Biotechnology in Western China, Ministry of Education, Biomedicine Key Laboratory of Shaanxi Province, Northwest University, 229 Taibai Road, Xi'an, 710069, China
- Ying Feng
- School of Pharmacy, Key Laboratory of Resource Biology and Biotechnology in Western China, Ministry of Education, Biomedicine Key Laboratory of Shaanxi Province, Northwest University, 229 Taibai Road, Xi'an, 710069, China
- Jing Xie
- School of Pharmacy, Key Laboratory of Resource Biology and Biotechnology in Western China, Ministry of Education, Biomedicine Key Laboratory of Shaanxi Province, Northwest University, 229 Taibai Road, Xi'an, 710069, China
- Cong Ma
- School of Pharmacy, Key Laboratory of Resource Biology and Biotechnology in Western China, Ministry of Education, Biomedicine Key Laboratory of Shaanxi Province, Northwest University, 229 Taibai Road, Xi'an, 710069, China
- Zhuang Nie
- School of Pharmacy, Key Laboratory of Resource Biology and Biotechnology in Western China, Ministry of Education, Biomedicine Key Laboratory of Shaanxi Province, Northwest University, 229 Taibai Road, Xi'an, 710069, China
- Peinan Fan
- School of Pharmacy, Key Laboratory of Resource Biology and Biotechnology in Western China, Ministry of Education, Biomedicine Key Laboratory of Shaanxi Province, Northwest University, 229 Taibai Road, Xi'an, 710069, China
- Mingcheng Qian
- Department of Medicinal Chemistry, School of Pharmaceutical Engineering and Life Science, Changzhou University, Changzhou, 213164, Jiangsu, China; Laboratory for Medicinal Chemistry, Ghent University, Ottergemsesteenweg 460, B-9000, Ghent, Belgium
- Xirui He
- Department of Bioengineering, Zhuhai Campus of Zunyi Medical University, Zhuhai, 519041, China
- Shaoping Wu
- School of Pharmacy, Key Laboratory of Resource Biology and Biotechnology in Western China, Ministry of Education, Biomedicine Key Laboratory of Shaanxi Province, Northwest University, 229 Taibai Road, Xi'an, 710069, China
- Yongmin Zhang
- School of Pharmacy, Key Laboratory of Resource Biology and Biotechnology in Western China, Ministry of Education, Biomedicine Key Laboratory of Shaanxi Province, Northwest University, 229 Taibai Road, Xi'an, 710069, China; Sorbonne Université, Institut Parisien de Chimie Moléculaire, CNRS UMR 8232, 4 Place Jussieu, 75005, Paris, France
- Xiaohui Zheng
- School of Pharmacy, Key Laboratory of Resource Biology and Biotechnology in Western China, Ministry of Education, Biomedicine Key Laboratory of Shaanxi Province, Northwest University, 229 Taibai Road, Xi'an, 710069, China
|
49
|
Onay A, Onay M. A Drug Decision Support System for Developing a Successful Drug Candidate Using Machine Learning Techniques. Curr Comput Aided Drug Des 2019; 16:407-419. [PMID: 31438830 DOI: 10.2174/1573409915666190716143601] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.7] [Received: 02/02/2019] [Revised: 04/24/2019] [Accepted: 05/06/2019] [Indexed: 11/22/2022]
Abstract
BACKGROUND: Virtual screening of candidate drug molecules using machine learning techniques plays a key role in the pharmaceutical industry in the design and discovery of new drugs. Computational classification methods can determine drug types according to disease groups and distinguish approved drugs from withdrawn ones. INTRODUCTION: Classification models developed in this study can be used as a simple filter in drug modelling to eliminate potentially inappropriate molecules in the early stages. In this work, we developed a Drug Decision Support System (DDSS) to classify each drug candidate molecule as a potential drug or non-drug and to predict its disease group. METHODS: Molecular descriptors were identified for the determination of a number of rules in drug molecules. They were derived using the ADRIANA.Code program and Lipinski's rule of five. We used an Artificial Neural Network (ANN) to classify drug molecules according to the types of diseases. Closed frequent molecular structures in the form of subgraph fragments were also obtained with the Gaston algorithm included in the ParMol package to find common molecular fragments for withdrawn drugs. RESULTS: We observed that TPSA, XlogP, Natoms and HDon_O are the most distinctive features in the pool of molecular descriptors. We evaluated the performance of the classifiers on all datasets and found that classification accuracies are very high on all of them. Neural network models achieved 84.6% and 83.3% accuracies on test sets including cardiac therapy, anti-epileptic and anti-parkinson drugs with approved and withdrawn drugs for drug classification problems. CONCLUSION: The experimental evaluation shows that the system is promising for identifying potential drug molecules and classifying them correctly according to the types of diseases.
Affiliation(s)
- Aytun Onay
- Department of Computer Engineering, Faculty of Engineering & Architecture, Kafkas University, Kars, 36100, Turkey
- Melih Onay
- Department of Environmental Engineering, Computational & Experimental Biochemistry Lab, Faculty of Engineering, Van Yuzuncu Yil University, 65100, Van, Turkey
|
50
|
Sato A, Tanimura N, Honma T, Konagaya A. Significance of Data Selection in Deep Learning for Reliable Binding Mode Prediction of Ligands in the Active Site of CYP3A4. Chem Pharm Bull (Tokyo) 2019; 67:1183-1190. [PMID: 31423003 DOI: 10.1248/cpb.c19-00443] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Indexed: 11/22/2022]
Abstract
For rational drug design, it is essential to predict the binding mode of protein-ligand complexes. Although various machine learning-based models have been reported that use convolutional neural networks (deep learning) to predict binding modes from three-dimensional structures, there are few detailed reports on how best to construct and use datasets. Here, we examined how different datasets affected the prediction of the binding mode of CYP3A4 by a three-dimensional neural network when the number of crystal structures for the target protein was limited. We used four different training datasets: one large, general dataset containing various protein complexes and three smaller, more specific datasets containing complexes with CYP3A4-like pockets, complexes with CYP3A4-binding ligands, and complexes with CYP protein family members. We then trained models with different combinations of datasets with or without subsequent fine-tuning and evaluated the binding mode prediction performance of each model. The model with the best receiver operating characteristic (ROC) area under the curve (AUC) was obtained by training with a combination of the general protein and CYP family datasets. However, the ROC AUC-recall balanced model was obtained by training with this combination of datasets followed by fine-tuning with the CYP3A4-binding ligands dataset. Our results suggest that datasets that balance protein functionality and data size are important for optimizing binding mode prediction performance. In addition, datasets with large median binding pocket sizes may be important for the binding mode prediction specifically of CYP3A4.
Affiliation(s)
- Atsuko Sato
- School of Computing, Department of Computer Science, Tokyo Institute of Technology
- Naoki Tanimura
- Science Solutions Division, Mizuho Information & Research Institute, Inc
- Teruki Honma
- School of Computing, Department of Computer Science, Tokyo Institute of Technology; Center for Biosystems Dynamics Research, RIKEN; Medical Sciences Innovation Hub Program, RIKEN
- Akihiko Konagaya
- School of Computing, Department of Computer Science, Tokyo Institute of Technology
|