51
Wang X, Yang X, Wang Q, Meng D. Unnatural amino acids: promising implications for the development of new antimicrobial peptides. Crit Rev Microbiol 2023; 49:231-255. [PMID: 35254957 DOI: 10.1080/1040841x.2022.2047008] [Citation(s) in RCA: 13] [Impact Index Per Article: 13.0] [Indexed: 01/21/2023]
Abstract
The increasing incidence and rapid spread of bacterial resistance to conventional antibiotics are a serious global threat to public health, highlighting the need to develop new antimicrobial alternatives. Antimicrobial peptides (AMPs) represent a class of promising natural antibiotic candidates due to their broad-spectrum activity and low tendency to induce resistance. However, the development of AMPs for medical use is hampered by several obstacles, such as moderate activity, lability to proteolytic degradation, and low bioavailability. To date, many researchers have focussed on the optimization or design of novel artificial AMPs with desired properties. Unnatural amino acids (UAAs) are valuable building blocks in the manufacture of a variety of pharmaceuticals, and have been used to develop artificial AMPs with specific structural and physicochemical properties. Rational incorporation of UAAs has become a very promising approach to endow AMPs with strong and long-lasting activity but no toxicity. This review aims to summarize key approaches that have been used to incorporate UAAs to develop novel AMPs with improved properties and better performance. It is anticipated that this review will guide future design considerations for UAA-based antimicrobial applications.
Affiliation(s)
- Xiuhong Wang
  - State Key Laboratory of Food Nutrition and Safety, College of Food Science and Engineering, Tianjin University of Science & Technology, Tianjin, People's Republic of China
- Xiaomin Yang
  - State Key Laboratory of Food Nutrition and Safety, College of Food Science and Engineering, Tianjin University of Science & Technology, Tianjin, People's Republic of China
- Qiaoe Wang
  - Key Laboratory of Cosmetic, China National Light Industry, Beijing Technology and Business University, Beijing, People's Republic of China
- Demei Meng
  - State Key Laboratory of Food Nutrition and Safety, College of Food Science and Engineering, Tianjin University of Science & Technology, Tianjin, People's Republic of China
  - Tianjin Gasin-DH Preservation Technology Co., Ltd, Tianjin, People's Republic of China
52
Ravi V, Desikan K. Curvilinear regression analysis of benzenoid hydrocarbons and computation of some reduced reverse degree based topological indices for hyaluronic acid-paclitaxel conjugates. Sci Rep 2023; 13:3239. [PMID: 36828838 PMCID: PMC9958057 DOI: 10.1038/s41598-023-28416-3] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 07/02/2022] [Accepted: 01/18/2023] [Indexed: 02/26/2023] Open
Abstract
Graph-theoretical molecular descriptors, also known as topological indices, are a convenient means of expressing in numerical form the chemical structure encoded in a molecular graph. The structure descriptors derived from molecular graphs are widely used in quantitative structure-property relationship (QSPR) and quantitative structure-activity relationship (QSAR) studies. The reason for introducing new indices is to obtain predictions of target properties of the molecules under consideration that are better than the predictions obtained using already known indices. In this paper, we apply the reduced reverse degree based indices introduced in 2021 by Vignesh et al. In the QSPR analysis, we first compute the reduced reverse degree based indices for a family of benzenoid hydrocarbon molecules and then obtain their correlation with the physicochemical properties of these molecules. We show that all the properties considered for the benzenoid hydrocarbons can be predicted very effectively by the reduced reverse degree based indices. We also compare the predictive capability of reduced reverse degree based topological descriptors against 16 existing degree based indices. Further, we compute the defined reduced reverse degree based topological indices for Hyaluronic Acid-Paclitaxel Conjugates [Formula: see text], [Formula: see text].
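To illustrate how a degree-based topological index turns a molecular graph into a single number, the following minimal sketch computes two long-established indices, the first Zagreb index and the Randić connectivity index, for hydrogen-suppressed graphs of benzene and naphthalene. These two indices are stand-ins chosen for illustration; they are not the reduced reverse degree based indices studied in the paper, and the function names and edge lists are assumptions of this sketch.

```python
from collections import Counter
from math import sqrt


def vertex_degrees(edges):
    """Count how many bonds touch each atom of a hydrogen-suppressed graph."""
    deg = Counter()
    for u, v in edges:
        deg[u] += 1
        deg[v] += 1
    return deg


def first_zagreb(edges):
    """First Zagreb index: sum of squared vertex degrees."""
    return sum(d * d for d in vertex_degrees(edges).values())


def randic(edges):
    """Randic connectivity index: sum over bonds of 1 / sqrt(d_u * d_v)."""
    deg = vertex_degrees(edges)
    return sum(1.0 / sqrt(deg[u] * deg[v]) for u, v in edges)


# Benzene carbon skeleton: a 6-cycle.
BENZENE = [(i, (i + 1) % 6) for i in range(6)]

# Naphthalene carbon skeleton: a 10-cycle perimeter plus the bond
# joining the two ring-fusion atoms (labelled 0 and 5 here).
NAPHTHALENE = [(i, (i + 1) % 10) for i in range(10)] + [(0, 5)]

for name, graph in (("benzene", BENZENE), ("naphthalene", NAPHTHALENE)):
    print(name, first_zagreb(graph), round(randic(graph), 4))
```

In a QSPR study of the kind described, index values like these would be computed for every molecule in the family and then regressed against a measured property.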
Affiliation(s)
- Vignesh Ravi
  - Division of Mathematics, School of Advanced Sciences, Vellore Institute of Technology, Chennai, India (grid.412813.d, 0000 0001 0687 4946)
- Kalyani Desikan
  - Division of Mathematics, School of Advanced Sciences, Vellore Institute of Technology, Chennai, India
53
Szwabowski GL, Baker DL, Parrill AL. Application of computational methods for class A GPCR Ligand discovery. J Mol Graph Model 2023; 121:108434. [PMID: 36841204 DOI: 10.1016/j.jmgm.2023.108434] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Received: 10/03/2022] [Revised: 02/11/2023] [Accepted: 02/13/2023] [Indexed: 02/22/2023]
Abstract
G protein-coupled receptors (GPCR) are integral membrane proteins of considerable interest as targets for drug development due to their role in transmitting cellular signals in a multitude of biological processes. Of the six classes categorizing GPCR (A, B, C, D, E, and F), class A contains the largest number of therapeutically relevant GPCR. Despite their importance as drug targets, many challenges exist for the discovery of novel class A GPCR ligands serving as drug precursors. Though knowledge of the structural and functional characteristics of GPCR has grown significantly over the past 20 years, a large portion of GPCR lack reported, experimentally determined structures. Furthermore, many GPCR have no known endogenous and/or synthetic ligands, limiting further exploration of their biochemical, cellular, and physiological roles. While many successes in GPCR ligand discovery have resulted from experimental high-throughput screening, computational methods have played an increasingly important role in GPCR ligand identification in the past decade. Here we discuss computational techniques applied to GPCR ligand discovery. This review summarizes class A GPCR structure/function and provides an overview of many obstacles currently faced in GPCR ligand discovery. Furthermore, we discuss applications and recent successes of computational techniques used to predict GPCR structure as well as present a summary of ligand- and structure-based methods used to identify potential GPCR ligands. Finally, we discuss computational hit list generation and refinement and provide comprehensive workflows for GPCR ligand identification.
Affiliation(s)
- Daniel L Baker
  - Department of Chemistry, The University of Memphis, Memphis, TN, 38152, USA
- Abby L Parrill
  - Department of Chemistry, The University of Memphis, Memphis, TN, 38152, USA
54
Comparative Studies on Resampling Techniques in Machine Learning and Deep Learning Models for Drug-Target Interaction Prediction. Molecules 2023; 28:molecules28041663. [PMID: 36838652 PMCID: PMC9964614 DOI: 10.3390/molecules28041663] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/19/2022] [Revised: 01/23/2023] [Accepted: 01/24/2023] [Indexed: 02/12/2023] Open
Abstract
The prediction of drug-target interactions (DTIs) is a vital step in drug discovery. The success of machine learning and deep learning methods in accurately predicting DTIs plays a huge role in drug discovery. However, when dealing with learning algorithms, the datasets used are usually highly dimensional and extremely imbalanced. To solve this issue, the dataset must be resampled accordingly. In this paper, we have compared several data resampling techniques to overcome class imbalance in machine learning methods as well as to study the effectiveness of deep learning methods in overcoming class imbalance in DTI prediction in terms of binary classification using ten (10) cancer-related activity classes from BindingDB. It is found that the use of Random Undersampling (RUS) in predicting DTIs severely affects the performance of a model, especially when the dataset is highly imbalanced, thus, rendering RUS unreliable. It is also found that SVM-SMOTE can be used as a go-to resampling method when paired with the Random Forest and Gaussian Naïve Bayes classifiers, whereby a high F1 score is recorded for all activity classes that are severely and moderately imbalanced. Additionally, the deep learning method called Multilayer Perceptron recorded high F1 scores for all activity classes even when no resampling method was applied.
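As a rough illustration of what Random Undersampling (RUS) does to an imbalanced activity class, here is a minimal standard-library sketch. The function name, the toy data, and the 95:5 class ratio are assumptions made for this example; the paper's own pipeline and its BindingDB data are not reproduced here.

```python
import random
from collections import Counter


def random_undersample(X, y, seed=0):
    """Randomly drop samples so that every class matches the minority count."""
    rng = random.Random(seed)
    by_class = {}
    for idx, label in enumerate(y):
        by_class.setdefault(label, []).append(idx)
    n_min = min(len(ids) for ids in by_class.values())
    keep = []
    for ids in by_class.values():
        # Sample without replacement from each class.
        keep.extend(rng.sample(ids, n_min))
    keep.sort()
    return [X[i] for i in keep], [y[i] for i in keep]


# Hypothetical toy set: 5 active (label 1) and 95 inactive (label 0) compounds.
X = [[float(i)] for i in range(100)]
y = [1 if i < 5 else 0 for i in range(100)]

X_bal, y_bal = random_undersample(X, y)
print(Counter(y), "->", Counter(y_bal))
```

In this toy case 90 of the 95 majority-class samples are discarded. That loss of training information is one plausible reason for the poor RUS performance the abstract reports on highly imbalanced classes.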
55
Zhu T, Yu Y, Tao T. A comprehensive evaluation of liposome/water partition coefficient prediction models based on the Technique for Order Preference by Similarity to an Ideal Solution (TOPSIS) method: Challenges from different descriptor dimension reduction methods and machine learning algorithms. JOURNAL OF HAZARDOUS MATERIALS 2023; 443:130181. [PMID: 36257111 DOI: 10.1016/j.jhazmat.2022.130181] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/31/2022] [Revised: 10/09/2022] [Accepted: 10/10/2022] [Indexed: 06/16/2023]
Abstract
The liposome/water partition coefficient (Klip/w) is a key parameter to evaluate the bioaccumulation potential of pollutants. Considering that it is difficult to determine the Klip/w values of all pollutants through experiments, researchers gradually developed models to predict it. However, there is currently no research on how to comprehensively evaluate prediction models and recommend a compelling optimal modeling method. To remedy the defect of single parameters in a traditional model comparison, the TOPSIS evaluation method, based on entropy weight, was first proposed. We use this method to comprehensively evaluate models from multiple angles in this study. Thirty QSPR models, including 3 descriptor dimension reduction methods and 10 algorithms (belonging to 4 tribes), were used to predict Klip/w and verify the effectiveness of the comprehensive assessment method. The results showed that RF (descriptor dimension reduction method), symbolism (tribes) and RF (algorithm) exhibited significant advantages in establishing the Klip/w value prediction model. At present, the application of TOPSIS in environmental model evaluations is almost absent. We hope that the proposed TOPSIS evaluation method can be applied to more chemical datasets and provide a more systematic and comprehensive basis for the application of the QSPR model in environmental studies and other fields.
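The entropy-weighted TOPSIS ranking mentioned above can be sketched as follows. This is a generic textbook formulation (entropy weights computed from column proportions, vector normalization, Euclidean distances to the ideal and anti-ideal points), not necessarily the exact variant used by the authors. The three "models" and their R² and RMSE values are hypothetical numbers invented for the example.

```python
from math import log, sqrt


def entropy_weights(matrix):
    """Entropy weights for the columns of a strictly positive decision matrix."""
    m, n = len(matrix), len(matrix[0])
    k = 1.0 / log(m)
    divergences = []
    for j in range(n):
        col = [row[j] for row in matrix]
        total = sum(col)
        entropy = -k * sum((x / total) * log(x / total) for x in col if x > 0)
        # Criteria whose values vary more carry more information.
        divergences.append(1.0 - entropy)
    s = sum(divergences)
    return [d / s for d in divergences]


def topsis(matrix, benefit, weights=None):
    """Return closeness scores in [0, 1]; higher means closer to the ideal."""
    if weights is None:
        weights = entropy_weights(matrix)
    n = len(matrix[0])
    norms = [sqrt(sum(row[j] ** 2 for row in matrix)) for j in range(n)]
    v = [[weights[j] * row[j] / norms[j] for j in range(n)] for row in matrix]
    best = [max(r[j] for r in v) if benefit[j] else min(r[j] for r in v)
            for j in range(n)]
    worst = [min(r[j] for r in v) if benefit[j] else max(r[j] for r in v)
             for j in range(n)]
    scores = []
    for r in v:
        d_best = sqrt(sum((r[j] - best[j]) ** 2 for j in range(n)))
        d_worst = sqrt(sum((r[j] - worst[j]) ** 2 for j in range(n)))
        scores.append(d_worst / (d_best + d_worst))
    return scores


# Hypothetical models scored on R^2 (higher is better) and RMSE (lower is better).
models = [[0.90, 0.30], [0.80, 0.45], [0.70, 0.60]]
benefit = [True, False]
weights = entropy_weights(models)
scores = topsis(models, benefit, weights)
print(weights, scores)
```

Combining several metrics into one closeness score is what lets a multi-criteria comparison avoid the "single parameter" weakness described in the abstract. The sketch does not guard against degenerate inputs such as identical columns.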
Affiliation(s)
- Tengyi Zhu
  - School of Environmental Science and Engineering, Yangzhou University, Yangzhou 225127, Jiangsu, China
- Yan Yu
  - School of Environmental Science and Engineering, Yangzhou University, Yangzhou 225127, Jiangsu, China
- Tianyun Tao
  - College of Agriculture, Yangzhou University, Yangzhou 225009, Jiangsu, China
56
New avenues in artificial-intelligence-assisted drug discovery. Drug Discov Today 2023; 28:103516. [PMID: 36736583 DOI: 10.1016/j.drudis.2023.103516] [Citation(s) in RCA: 23] [Impact Index Per Article: 23.0] [Received: 08/07/2022] [Revised: 12/08/2022] [Accepted: 01/26/2023] [Indexed: 02/05/2023]
Abstract
Over the past decade, the amount of biomedical data available has grown at unprecedented rates. Increased automation technology and larger data volumes have encouraged the use of machine learning (ML) or artificial intelligence (AI) techniques for mining such data and extracting useful patterns. Because the identification of chemical entities with desired biological activity is a crucial task in drug discovery, AI technologies have the potential to accelerate this process and support decision making. In addition, the advent of deep learning (DL) has shown great promise in addressing diverse problems in drug discovery, such as de novo molecular design. Herein, we will appraise the current state-of-the-art in AI-assisted drug discovery, discussing the recent applications covering generative models for chemical structure generation, scoring functions to improve binding affinity and pose prediction, and molecular dynamics to assist in the parametrization, featurization and generalization tasks. Finally, we will discuss current hurdles and the strategies to overcome them, as well as potential future directions.
57
Jaradat NJ, Alshaer W, Hatmal M, Taha MO. Discovery of new STAT3 inhibitors as anticancer agents using ligand-receptor contact fingerprints and docking-augmented machine learning. RSC Adv 2023; 13:4623-4640. [PMID: 36760267 PMCID: PMC9896621 DOI: 10.1039/d2ra07007c] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/04/2022] [Accepted: 01/28/2023] [Indexed: 02/05/2023] Open
Abstract
STAT3 belongs to a family of seven vital transcription factors. High levels of STAT3 are detected in several types of cancer. Hence, STAT3 inhibition is considered a promising therapeutic anti-cancer strategy. In this work, we used multiple docked poses of STAT3 inhibitors to augment training data for machine learning QSAR modeling. Ligand-Receptor Contact Fingerprints and scoring values were implemented as descriptor variables. Escalating docking-scoring consensus levels were scanned against orthogonal machine learners, and the best learners (Random Forests and XGBoost) were coupled with a genetic algorithm and Shapley additive explanations (SHAP) to identify critical descriptors that determine anti-STAT3 bioactivity, to be translated into pharmacophore model(s). Two successful pharmacophores were deduced and subsequently used for in silico screening against the National Cancer Institute (NCI) database. A total of 26 hits were evaluated in vitro for their anti-STAT3 bioactivities. Of these, three hits of novel chemotypes showed cytotoxic IC50 values in the nanomolar to low-micromolar range (35 nM to 6.7 μM). However, two are potent dihydrofolate reductase (DHFR) inhibitors and therefore should have significant indirect STAT3 inhibitory effects. The third hit (cytotoxic IC50 = 0.44 μM) is a purely direct STAT3 inhibitor (devoid of DHFR activity) and caused, at its cytotoxic IC50, a more than two-fold reduction in the expression of STAT3 downstream genes (c-Myc and Bcl-xL). The presented work indicates that the concept of data augmentation using multiple docked poses is a promising strategy for generating valid machine learning models capable of discriminating active from inactive compounds.
Affiliation(s)
- Nour Jamal Jaradat
  - Department of Pharmaceutical Sciences, Faculty of Pharmacy, University of Jordan, Amman 11492, Jordan; +962 6 5339649, +962 6 5355000 ext. 23305
- Walhan Alshaer
  - Cell Therapy Center, The University of Jordan, Amman 11942, Jordan
- Mamon Hatmal
  - Department of Medical Laboratory Sciences, Faculty of Applied Medical Sciences, The Hashemite University, P.O. Box 330127, Zarqa 13133, Jordan
- Mutasem Omar Taha
  - Department of Pharmaceutical Sciences, Faculty of Pharmacy, University of Jordan, Amman 11492, Jordan; +962 6 5339649, +962 6 5355000 ext. 23305
58
Hossain D, Scott SH, Cluff T, Dukelow SP. The use of machine learning and deep learning techniques to assess proprioceptive impairments of the upper limb after stroke. J Neuroeng Rehabil 2023; 20:15. [PMID: 36707846 PMCID: PMC9881388 DOI: 10.1186/s12984-023-01140-9] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Received: 04/01/2022] [Accepted: 01/18/2023] [Indexed: 01/28/2023] Open
Abstract
BACKGROUND Robots can generate rich kinematic datasets that have the potential to provide far more insight into impairments than standard clinical ordinal scales. Determining how to define the presence or absence of impairment in individuals using kinematic data, however, can be challenging. Machine learning techniques offer a potential solution to this problem. In the present manuscript we examine proprioception in stroke survivors using a robotic arm position matching task. Proprioception is impaired in 50-60% of stroke survivors and has been associated with poorer motor recovery and longer lengths of hospital stay. We present a simple cut-off score technique for individual kinematic parameters and an overall task score to determine impairment. We then compare the ability of different machine learning (ML) techniques and the above-mentioned task score to correctly classify individuals with or without stroke based on kinematic data. METHODS Participants performed an Arm Position Matching (APM) task in an exoskeleton robot. The task produced 12 kinematic parameters that quantify multiple attributes of position sense. We first quantified impairment in individual parameters and an overall task score by determining if participants with stroke fell outside of the 95% cut-off score of control (normative) values. Then, we applied five machine learning algorithms (i.e., Logistic Regression, Decision Tree, Random Forest, Random Forest with Hyperparameters Tuning, and Support Vector Machine), and a deep learning algorithm (i.e., Deep Neural Network) to classify individual participants as to whether or not they had a stroke based only on kinematic parameters using a tenfold cross-validation approach. RESULTS We recruited 429 participants with neuroimaging-confirmed stroke (< 35 days post-stroke) and 465 healthy controls. Depending on the APM parameter, we observed that 10.9-48.4% of stroke participants were impaired, while 44% were impaired based on their overall task score. 
The mean performance metrics of machine learning and deep learning models were: accuracy 82.4%, precision 85.6%, recall 76.5%, and F1 score 80.6%. All machine learning and deep learning models displayed similar classification accuracy; however, the Random Forest model had the highest numerical accuracy (83%). Our models showed higher sensitivity and specificity (AUC = 0.89) in classifying individual participants than the overall task score (AUC = 0.85) based on their performance in the APM task. We also found that variability was the most important feature in classifying performance in the APM task. CONCLUSION Our ML models displayed similar classification performance. ML models were able to integrate more kinematic information and relationships between variables into decision making and displayed better classification performance than the overall task score. ML may help to provide insight into individual kinematic features that have previously been overlooked with respect to clinical importance.
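For readers less familiar with the metrics quoted above, the sketch below shows how accuracy, precision, recall, and F1 score follow from a binary confusion matrix. The counts are hypothetical round numbers chosen for easy arithmetic; they are not the study's data, and the function name is an assumption of this example.

```python
def classification_metrics(tp, fp, tn, fn):
    """Accuracy, precision, recall and F1 from binary confusion-matrix counts."""
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    precision = tp / (tp + fp)   # of those flagged as stroke, how many were
    recall = tp / (tp + fn)      # of those with stroke, how many were flagged
    f1 = 2 * precision * recall / (precision + recall)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Hypothetical counts: 100 participants with stroke, 100 controls.
metrics = classification_metrics(tp=80, fp=10, tn=90, fn=20)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Note that an F1 score averaged across several models, as reported in the abstract, need not equal the F1 computed from the averaged precision and recall.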
Affiliation(s)
- Delowar Hossain
  - Department of Clinical Neuroscience, Cumming School of Medicine, University of Calgary, Calgary, AB, Canada (grid.22072.35, 0000 0004 1936 7697)
- Stephen H. Scott
  - Department of Biomedical and Molecular Sciences, Queen’s University, Kingston, ON, Canada (grid.410356.5, 0000 0004 1936 8331)
- Tyler Cluff
  - Faculty of Kinesiology, University of Calgary, Calgary, AB, Canada (grid.22072.35, 0000 0004 1936 7697)
- Sean P. Dukelow
  - Department of Clinical Neuroscience, Cumming School of Medicine, University of Calgary, Calgary, AB, Canada (grid.22072.35, 0000 0004 1936 7697)
59
Ahmad W, Tayara H, Chong KT. Attention-Based Graph Neural Network for Molecular Solubility Prediction. ACS OMEGA 2023; 8:3236-3244. [PMID: 36713733 PMCID: PMC9878542 DOI: 10.1021/acsomega.2c06702] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Received: 10/18/2022] [Accepted: 12/23/2022] [Indexed: 06/18/2023]
Abstract
Drug discovery (DD) research is aimed at the discovery of new medications. Solubility is an important physicochemical property in drug development. Active pharmaceutical ingredients (APIs) are essential substances for high drug efficacy. During DD research, aqueous solubility (AS) is a key physicochemical attribute required for API characterization. High-precision in silico solubility prediction reduces the experimental cost and time of drug development. Several artificial intelligence tools have been employed for solubility prediction using machine learning and deep learning techniques. This study aims to create different deep learning models that can predict the solubility of a wide range of molecules using the largest currently available solubility data set. Simplified molecular-input line-entry system (SMILES) strings were used as the molecular representation, and models were developed using simple graph convolution, graph isomorphism network, graph attention network, and AttentiveFP network architectures. Based on the performance of the models, the AttentiveFP-based network model was finally selected. The model was trained and tested on 9943 compounds. The model outperformed on 62 anticancer compounds, with Pearson correlation R² and root-mean-square error values of 0.52 and 0.61, respectively. AS prediction could be improved by refining the graph algorithm or by adding more molecular properties.
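The two evaluation metrics quoted in the abstract, squared Pearson correlation (R²) and root-mean-square error (RMSE), can be computed as in the following minimal sketch. The measured and predicted values are invented toy numbers, assumed here to be log-scale solubilities; the abstract does not state the units.

```python
from math import sqrt


def rmse(y_true, y_pred):
    """Root-mean-square error between measured and predicted values."""
    n = len(y_true)
    return sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n)


def pearson_r2(y_true, y_pred):
    """Squared Pearson correlation coefficient."""
    n = len(y_true)
    mean_t = sum(y_true) / n
    mean_p = sum(y_pred) / n
    cov = sum((t - mean_t) * (p - mean_p) for t, p in zip(y_true, y_pred))
    var_t = sum((t - mean_t) ** 2 for t in y_true)
    var_p = sum((p - mean_p) ** 2 for p in y_pred)
    return cov * cov / (var_t * var_p)


# Hypothetical values: every prediction is offset by exactly 0.5 units.
measured = [-1.0, -2.0, -3.0, -4.0]
predicted = [-1.5, -2.5, -3.5, -4.5]

print("R^2 :", pearson_r2(measured, predicted))
print("RMSE:", rmse(measured, predicted))
```

The toy data are perfectly correlated yet systematically offset, so R² is 1 while RMSE is 0.5. The two metrics capture different aspects of model quality, which is why they are usually reported together.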
Affiliation(s)
- Waqar Ahmad
  - Department of Electronics and Information Engineering, Jeonbuk National University, Jeonju 54896, South Korea
- Hilal Tayara
  - School of International Engineering and Science, Jeonbuk National University, Jeonju 54896, South Korea
- Kil To Chong
  - Department of Electronics and Information Engineering, Jeonbuk National University, Jeonju 54896, South Korea
  - Advanced Electronics and Information Research Center, Jeonbuk National University, Jeonju 54896, South Korea
60
Luo Y, Wang P, Mou M, Zheng H, Hong J, Tao L, Zhu F. A novel strategy for designing the magic shotguns for distantly related target pairs. Brief Bioinform 2023; 24:6984790. [PMID: 36631399 DOI: 10.1093/bib/bbac621] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Received: 10/02/2022] [Revised: 11/09/2022] [Accepted: 12/17/2022] [Indexed: 01/13/2023] Open
Abstract
Owing to its promising capacity to improve drug efficacy, polypharmacology has emerged as a new theme in drug discovery for complex diseases. In the discovery of novel multi-target drugs (MTDs), in silico strategies have become essential because of their high throughput and low cost. However, current research mostly targets typical closely related target pairs. Because of the intricate pathogenesis networks of complex diseases, many distantly related targets are found to play crucial roles in synergistic treatment. Therefore, an innovative method for developing drugs that simultaneously act on distantly related target pairs is of utmost importance. At the same time, reducing the false discovery rate in the design of MTDs remains a daunting technological difficulty. In this research, effective small-molecule clustering in the positive dataset, together with a putative negative dataset generation strategy, was adopted during model construction. In a comprehensive assessment on 10 target pairs with hierarchical similarity levels, the proposed strategy successfully reduced the false discovery rate. Model types constructed with much smaller numbers of inhibitor molecules gained considerable yields and showed better false-hit controllability than before. To further evaluate generalization ability, an in-depth assessment of high-throughput virtual screening on the ChEMBL database was conducted. As a result, this novel strategy hierarchically improved the enrichment factors for each target pair (especially for distantly related or unrelated target pairs), corresponding to target pair similarity levels.
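The enrichment factor used to assess virtual screening is commonly defined as the hit rate among the top-ranked fraction of a library divided by the hit rate of the whole library. The sketch below follows that common definition, which may differ in detail from the paper's own computation. The scores, activity labels, and function name are hypothetical.

```python
def enrichment_factor(scores, is_active, fraction):
    """Hit rate in the top-ranked fraction divided by the overall hit rate."""
    n = len(scores)
    n_top = max(1, int(round(n * fraction)))
    # Rank compounds from highest to lowest predicted score.
    ranked = sorted(range(n), key=lambda i: scores[i], reverse=True)
    hits_top = sum(is_active[i] for i in ranked[:n_top])
    total_actives = sum(is_active)
    return (hits_top / n_top) / (total_actives / n)


# Hypothetical library of 100 compounds; compound 0 has the highest score.
scores = list(range(100, 0, -1))
active_ids = {0, 2, 4, 6, 8, 50, 51, 52, 53, 54}   # 10 actives in total
is_active = [1 if i in active_ids else 0 for i in range(100)]

ef_10 = enrichment_factor(scores, is_active, 0.10)
print("EF at 10%:", ef_10)
```

Here 5 of the 10 actives fall in the top 10 compounds, so the top decile is five times richer in actives than random selection would be. An enrichment factor of 1 corresponds to no improvement over random picking.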
Affiliation(s)
- Yongchao Luo
  - College of Pharmaceutical Sciences, Zhejiang University, Hangzhou 310058, China
- Panpan Wang
  - College of Chemistry and Pharmaceutical Engineering, Huanghuai University, Zhumadian 463000, China
- Minjie Mou
  - College of Pharmaceutical Sciences, Zhejiang University, Hangzhou 310058, China
- Hanqi Zheng
  - College of Pharmaceutical Sciences, Zhejiang University, Hangzhou 310058, China
- Jiajun Hong
  - College of Pharmaceutical Sciences, Zhejiang University, Hangzhou 310058, China
- Lin Tao
  - Key Laboratory of Elemene Class Anti-Cancer Chinese Medicine of Zhejiang Province, School of Medicine, Hangzhou Normal University, Hangzhou 310036, China
- Feng Zhu
  - College of Pharmaceutical Sciences, Zhejiang University, Hangzhou 310058, China
61
Jasial S, Hu J, Miyao T, Hirama Y, Onishi S, Matsui R, Osaki K, Funatsu K. Screening and Validation of Odorants against Influenza A Virus Using Interpretable Regression Models. ACS Pharmacol Transl Sci 2023; 6:139-150. [PMID: 36654744 PMCID: PMC9841774 DOI: 10.1021/acsptsci.2c00193] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/03/2022] [Indexed: 12/23/2022]
Abstract
Influenza is a respiratory infection caused by the influenza virus that is prevalent worldwide. One of the most contagious variants of influenza is influenza A virus (IAV), which usually spreads in closed spaces through aerosols. Preventive measures such as novel compounds are needed that can act on viral membranes and provide a safe environment against IAV infection. In this study, we screened compounds with common fragrances that are generally used to mask unpleasant odors but can also exhibit antiviral activity against a strain of IAV. Initially, a set of 188 structurally diverse odorants were collected, and their antiviral activity was measured in vapor phase against the IAV solution. Regression models were built for the prediction of antiviral activity using this set of odorants by taking into account their structural features along with vapor pressure and partition coefficient (n-octanol/water). The models were interpreted using a feature weighting approach and Shapley Additive exPlanations to rationalize the predictions as an additional validation for virtual screening. This model was used to screen odorants from an in-house odorant data set consisting of 2020 odorants, which were later evaluated using in vitro experiments. Out of 11 odorants proposed using the final model, 8 odorants were found to exhibit antiviral activity. The feature interpretation of screened odorants suggested that they contained hydrophilic substructures, such as hydroxyl group, which might contribute to denaturation of proteins on the surface of the virus. These odorants should be explored as a preventive measure in closed spaces to decrease the risk of infections of IAV.
Affiliation(s)
- Swarit Jasial
  - Data Science Center and Graduate School of Science and Technology, Nara Institute of Science and Technology, 8916-5 Takayama-cho, Ikoma, Nara 630-0192, Japan
- Jieying Hu
  - Material Science Research, Kao Corporation, 1334 Minato, Wakayama-shi, Wakayama 640-8580, Japan
- Tomoyuki Miyao
  - Data Science Center and Graduate School of Science and Technology, Nara Institute of Science and Technology, 8916-5 Takayama-cho, Ikoma, Nara 630-0192, Japan
- Yui Hirama
  - Biological Science Research, Kao Corporation, 2606 Akabane, Ichikai-machi, Haga-gun, Tochigi 321-3426, Japan
- Shintaro Onishi
  - Biological Science Research, Kao Corporation, 2606 Akabane, Ichikai-machi, Haga-gun, Tochigi 321-3426, Japan
- Ryoichi Matsui
  - Material Science Research, Kao Corporation, 1334 Minato, Wakayama-shi, Wakayama 640-8580, Japan
- Koji Osaki
  - Material Science Research, Kao Corporation, 1334 Minato, Wakayama-shi, Wakayama 640-8580, Japan
- Kimito Funatsu
  - Data Science Center and Graduate School of Science and Technology, Nara Institute of Science and Technology, 8916-5 Takayama-cho, Ikoma, Nara 630-0192, Japan
62
Togo MV, Mastrolorito F, Ciriaco F, Trisciuzzi D, Tondo AR, Gambacorta N, Bellantuono L, Monaco A, Leonetti F, Bellotti R, Altomare CD, Amoroso N, Nicolotti O. TIRESIA: An eXplainable Artificial Intelligence Platform for Predicting Developmental Toxicity. J Chem Inf Model 2023; 63:56-66. [PMID: 36520016 DOI: 10.1021/acs.jcim.2c01126] [Citation(s) in RCA: 9] [Impact Index Per Article: 9.0] [Indexed: 12/23/2022]
Abstract
Herein, a robust and reproducible eXplainable Artificial Intelligence (XAI) approach is presented, which allows prediction of developmental toxicity, a challenging human-health endpoint in toxicology. The application of XAI as an alternative method is of the utmost importance with developmental toxicity being one of the most animal-intensive areas of regulatory toxicology. In this work, the established CAESAR (Computer Assisted Evaluation of industrial chemical Substances According to Regulations) training set made of 234 chemicals for model learning is employed. Two test sets, including as a whole 585 chemicals, were instead used for validation and generalization purposes. The proposed framework favorably compares with the state-of-the-art approaches in terms of accuracy, sensitivity, and specificity, thus resulting in a reliable support system for developmental toxicity ensuring informativeness, uncertainty estimation, generalization, and transparency. Based on the eXtreme Gradient Boosting (XGB) algorithm, our predictive model provides easy interpretative keys based on specific molecular descriptors and structural alerts enabling one to distinguish toxic and nontoxic chemicals. Inspired by the Organisation for Economic Co-operation and Development (OECD) principles for the validation of Quantitative Structure-Activity Relationships (QSARs) for regulatory purposes, the results are summarized in a standard report in portable document format, enclosing also details concerned with a density-based model applicability domain and SHAP (SHapley Additive exPlanations) explainability, the latter particularly useful to better understand the effective roles played by molecular features. Notably, our model has been implemented in TIRESIA (Toxicology Intelligence and Regulatory Evaluations for Scientific and Industry Applications), a free of charge web platform available at http://tiresia.uniba.it.
Affiliation(s)
- Maria Vittoria Togo
  - Dipartimento di Farmacia-Scienze del Farmaco, Università degli Studi di Bari Aldo Moro, 70125 Bari, Italy
- Fabrizio Mastrolorito
  - Dipartimento di Farmacia-Scienze del Farmaco, Università degli Studi di Bari Aldo Moro, 70125 Bari, Italy
- Fulvio Ciriaco
  - Dipartimento di Chimica, Università degli Studi di Bari Aldo Moro, 70125, Bari, Italy
- Daniela Trisciuzzi
  - Dipartimento di Farmacia-Scienze del Farmaco, Università degli Studi di Bari Aldo Moro, 70125 Bari, Italy
- Anna Rita Tondo
  - Dipartimento di Farmacia-Scienze del Farmaco, Università degli Studi di Bari Aldo Moro, 70125 Bari, Italy
- Nicola Gambacorta
  - Dipartimento di Farmacia-Scienze del Farmaco, Università degli Studi di Bari Aldo Moro, 70125 Bari, Italy
- Loredana Bellantuono
  - Dipartimento di Biomedicina Traslazionale e Neuroscienze (DiBraiN), Università degli Studi di Bari Aldo Moro, 70124 Bari, Italy
  - Istituto Nazionale di Fisica Nucleare, Sezione di Bari, 70125 Bari, Italy
- Alfonso Monaco
  - Istituto Nazionale di Fisica Nucleare, Sezione di Bari, 70125 Bari, Italy
  - Dipartimento Interateneo di Fisica M. Merlin, Università degli Studi di Bari Aldo Moro, 70125 Bari, Italy
- Francesco Leonetti
  - Dipartimento di Farmacia-Scienze del Farmaco, Università degli Studi di Bari Aldo Moro, 70125 Bari, Italy
- Roberto Bellotti
  - Istituto Nazionale di Fisica Nucleare, Sezione di Bari, 70125 Bari, Italy
  - Dipartimento Interateneo di Fisica M. Merlin, Università degli Studi di Bari Aldo Moro, 70125 Bari, Italy
- Cosimo Damiano Altomare
  - Dipartimento di Farmacia-Scienze del Farmaco, Università degli Studi di Bari Aldo Moro, 70125 Bari, Italy
- Nicola Amoroso
  - Dipartimento di Farmacia-Scienze del Farmaco, Università degli Studi di Bari Aldo Moro, 70125 Bari, Italy
  - Istituto Nazionale di Fisica Nucleare, Sezione di Bari, 70125 Bari, Italy
- Orazio Nicolotti
  - Dipartimento di Farmacia-Scienze del Farmaco, Università degli Studi di Bari Aldo Moro, 70125 Bari, Italy
63
|
Pirzada RH, Ahmad B, Qayyum N, Choi S. Modeling structure-activity relationships with machine learning to identify GSK3-targeted small molecules as potential COVID-19 therapeutics. Front Endocrinol (Lausanne) 2023; 14:1084327. [PMID: 36950681 PMCID: PMC10025526 DOI: 10.3389/fendo.2023.1084327] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 10/30/2022] [Accepted: 02/20/2023] [Indexed: 03/08/2023] Open
Abstract
Coronaviruses induce severe upper respiratory tract infections, which can spread to the lungs. The nucleocapsid protein (N protein) plays an important role in genome replication, transcription, and virion assembly in SARS-CoV-2, the virus causing COVID-19, and in other coronaviruses. Glycogen synthase kinase 3 (GSK3) activation phosphorylates the viral N protein. To combat COVID-19 and future coronavirus outbreaks, interference with the dependence of N protein on GSK3 may be a viable strategy. Toward this end, this study aimed to construct robust machine learning models to identify GSK3 inhibitors from Food and Drug Administration-approved and investigational drug libraries using the quantitative structure-activity relationship approach. A non-redundant dataset consisting of 495 and 3070 compounds for GSK3α and GSK3β, respectively, was acquired from the ChEMBL database. Twelve sets of molecular descriptors were used to define these inhibitors, and machine learning algorithms were selected using the LazyPredict package. Histogram-based gradient boosting and light gradient boosting machine algorithms were used to develop predictive models that were evaluated based on the root mean square error and R-squared value. Finally, the top two drugs (selinexor and ruboxistaurin) were selected for molecular dynamics simulation based on the highest predicted activity (negative log of the half-maximal inhibitory concentration, pIC50 value) to further investigate the structural stability of the protein-ligand complexes. This artificial intelligence-based virtual high-throughput screening approach is an effective strategy for accelerating drug discovery and finding novel pharmacological targets while reducing the cost and time.
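The abstract ranks candidate drugs by predicted pIC50, the negative base-10 logarithm of the half-maximal inhibitory concentration expressed in mol/L. The following is a minimal sketch of that conversion and ranking step, not the authors' pipeline; the compound names and IC50 values are hypothetical placeholders.

```python
import math

def pic50_from_ic50_nm(ic50_nm: float) -> float:
    """Convert an IC50 given in nanomolar to pIC50 = -log10(IC50 in mol/L)."""
    if ic50_nm <= 0:
        raise ValueError("IC50 must be positive")
    # 1 nM = 1e-9 M, so -log10(ic50_nm * 1e-9) = 9 - log10(ic50_nm)
    return 9.0 - math.log10(ic50_nm)

# Hypothetical IC50 values (nM), for illustration only
ic50_nm = {"compound_A": 10.0, "compound_B": 1000.0, "compound_C": 100.0}

# Higher pIC50 means higher potency, so sort in descending order
ranked = sorted(ic50_nm, key=lambda name: pic50_from_ic50_nm(ic50_nm[name]), reverse=True)
print(ranked)
```

Working on the logarithmic scale is the usual choice for regression targets in QSAR because raw IC50 values span several orders of magnitude.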
Affiliation(s)
- Rameez Hassan Pirzada
  - Department of Molecular Science and Technology, Ajou University, Suwon, Republic of Korea
  - S&K Therapeutics, Ajou University Campus Plaza, Suwon, Republic of Korea
- Bilal Ahmad
  - Department of Molecular Science and Technology, Ajou University, Suwon, Republic of Korea
- Naila Qayyum
  - Department of Molecular Science and Technology, Ajou University, Suwon, Republic of Korea
- Sangdun Choi
  - Department of Molecular Science and Technology, Ajou University, Suwon, Republic of Korea
  - S&K Therapeutics, Ajou University Campus Plaza, Suwon, Republic of Korea
  - *Correspondence: Sangdun Choi
64
Didachos C, Kintos DP, Fousteris M, Mylonas P, Kanavos A. An Optimized Cloud Computing Method for Extracting Molecular Descriptors. Advances in Experimental Medicine and Biology 2023; 1424:247-254. [PMID: 37486501 DOI: 10.1007/978-3-031-31982-2_28] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 07/25/2023]
Abstract
Extracting molecular descriptors from chemical compounds is an essential preprocessing phase for developing accurate classification models. Supervised machine learning algorithms offer the capability to detect "hidden" patterns that may exist in a large dataset of compounds, which are represented by their molecular descriptors. Assuming that molecules with similar structure tend to share similar physicochemical properties, large chemical libraries can be screened by applying similarity sourcing techniques in order to detect potential bioactive compounds against a molecular target. However, the process of generating these compound features is time-consuming. Our proposed methodology not only employs cloud computing to accelerate the process of extracting molecular descriptors but also introduces an optimized approach to utilize the computational resources in the most efficient way.
Affiliation(s)
- Christos Didachos
  - Computer Engineering and Informatics Department, University of Patras, Patras, Greece
- Phivos Mylonas
  - Department of Informatics, Ionian University, Corfu, Greece
- Andreas Kanavos
  - Department of Informatics, Ionian University, Corfu, Greece
65
Singh DP, Kaushik B. A systematic literature review for the prediction of anticancer drug response using various machine-learning and deep-learning techniques. Chem Biol Drug Des 2023; 101:175-194. [PMID: 36303299 DOI: 10.1111/cbdd.14164] [Citation(s) in RCA: 4] [Impact Index Per Article: 4.0] [Received: 09/02/2022] [Revised: 10/13/2022] [Accepted: 10/24/2022] [Indexed: 12/24/2022]
Abstract
Computational methods have gained prominence in healthcare research. The accessibility of healthcare data has encouraged academics and researchers to develop implementations that help in the prognosis of cancer drug response. Among various computational methods, machine-learning (ML) and deep-learning (DL) methods provide the most consistent and effective approaches to handling the serious consequences of the disease and of the drugs administered to patients. Hence, this systematic literature review covers studies that have investigated drug discovery and the prognosis of anticancer drug response using ML and DL algorithms. For this purpose, PRISMA guidelines were followed to choose research papers from Google Scholar, PubMed, and ScienceDirect. A total of 105 papers that align with the context of this review were chosen. Further, the review presents the accuracy of existing ML and DL methods in the prediction of anticancer drug response. The review found that, despite the availability of various studies, each method has certain challenges associated with it. Future researchers can therefore consider these limitations and challenges when developing an improved anticancer drug response prediction method, which would greatly benefit medical professionals in administering non-invasive treatment to patients.
Affiliation(s)
- Davinder Paul Singh
  - School of Computer Science and Engineering, Shri Mata Vaishno Devi University, Katra, Jammu and Kashmir, India
- Baijnath Kaushik
  - School of Computer Science and Engineering, Shri Mata Vaishno Devi University, Katra, Jammu and Kashmir, India
66
Trisciuzzi D, Siragusa L, Baroni M, Cruciani G, Nicolotti O. An Integrated Machine Learning Model To Spot Peptide Binding Pockets in 3D Protein Screening. J Chem Inf Model 2022; 62:6812-6824. [PMID: 36320100 DOI: 10.1021/acs.jcim.2c00583] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Indexed: 11/06/2022]
Abstract
The prediction of peptide-protein binding sites is of utmost importance to tackle the onset of severe neurodegenerative diseases and cancer. In this work, we detail a novel machine learning model based on Linear Discriminant Analysis (LDA) that proved highly predictive in detecting the putative protein binding regions of small peptides. Starting from 439 high-quality pockets derived from peptide-protein crystallographic complexes, three sets of well-established peptide-binding regions were first selected through a Partitioning Around Medoids (PAM) clustering algorithm based on morphological and energetic 3D GRID-MIF molecular descriptors. Next, the best combination of all the putative interacting peptide pockets and related GRID-MIF scores was automatically explored by using the LDA-based protocol implemented in BioGPS. This approach successfully distinguished the actual interacting peptide regions (AUC = 0.86 and partial ROC enrichment at 5% of 0.48) from all the other pockets of the protein. Validated on two external collection sets, including 445 and 347 crystallographic peptide-protein complexes, our LDA-based model could be effective for running further peptide-protein virtual screening campaigns.
Affiliation(s)
- Daniela Trisciuzzi
  - Department of Pharmacy-Pharmaceutical Sciences, Università degli Studi di Bari "Aldo Moro", 70125 Bari, Italy
  - Molecular Discovery Ltd., Kinetic Business Centre, Theobald Street, Elstree, Borehamwood, Hertfordshire WD6 4PJ, United Kingdom
- Lydia Siragusa
  - Molecular Horizon s.r.l., Via Montelino, 30, 06084 Bettona (PG), Italy
  - Molecular Discovery Ltd., Kinetic Business Centre, Theobald Street, Elstree, Borehamwood, Hertfordshire WD6 4PJ, United Kingdom
- Massimo Baroni
  - Molecular Discovery Ltd., Kinetic Business Centre, Theobald Street, Elstree, Borehamwood, Hertfordshire WD6 4PJ, United Kingdom
- Gabriele Cruciani
  - Department of Chemistry, Biology and Biotechnology, Università degli Studi di Perugia, via Elce di Sotto, 8, 06123 Perugia (PG), Italy
- Orazio Nicolotti
  - Department of Pharmacy-Pharmaceutical Sciences, Università degli Studi di Bari "Aldo Moro", 70125 Bari, Italy
67
Cerchia C, Lavecchia A. In Silico Drug Design and Discovery: Big Data for Small Molecule Design. Biomolecules 2022; 13:biom13010044. [PMID: 36671429 PMCID: PMC9855915 DOI: 10.3390/biom13010044] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/22/2022] [Accepted: 12/23/2022] [Indexed: 12/29/2022] Open
Abstract
Across life sciences, the steadily and rapidly increasing amount of data provide new opportunities for advancing knowledge and represent a key driver of emerging technological advancements [...].
68
Fernandes PDO, Martins JPA, de Melo EB, de Oliveira RB, Kronenberger T, Maltarollo VG. Quantitative structure-activity relationship and machine learning studies of 2-thiazolylhydrazone derivatives with anti-Cryptococcus neoformans activity. J Biomol Struct Dyn 2022; 40:9789-9800. [PMID: 34121616 DOI: 10.1080/07391102.2021.1935321] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Indexed: 12/16/2022]
Abstract
Cryptococcus neoformans is a fungus responsible for infections in humans, with a significant number of cases in immunosuppressed patients, mainly in underdeveloped countries. In this context, the thiazolylhydrazones are a promising class of compounds with activity against C. neoformans. Understanding the structure-activity relationship of these derivatives could lead to the design of robust compounds that could be promising drug candidates for fungal infections. Specifically, modern techniques such as 4D-QSAR and machine learning methods were employed in this work to generate two QSAR models (one 2D and one 4D) with high predictive power (r² for the test set equal to 0.934 and 0.831, respectively), and one random forest classification model was reported with a Matthews correlation coefficient equal to 1 and 0.62 for internal and external validations, respectively. The physicochemical interpretation of the selected models indicated the importance of aliphatic substituents at the hydrazone moiety to antifungal activity, corroborating experimental data. Communicated by Ramaswamy H. Sarma.
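The Matthews correlation coefficient (MCC) quoted above is computed from the four cells of a binary confusion matrix. The sketch below shows the standard formula only; the counts in the usage lines are hypothetical and are not taken from the study.

```python
import math

def matthews_corrcoef(tp: int, tn: int, fp: int, fn: int) -> float:
    """MCC = (TP*TN - FP*FN) / sqrt((TP+FP)(TP+FN)(TN+FP)(TN+FN))."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        # Common convention: report 0 when any marginal total is zero
        return 0.0
    return (tp * tn - fp * fn) / denom

# Hypothetical confusion-matrix counts, for illustration only
print(matthews_corrcoef(tp=10, tn=10, fp=0, fn=0))  # perfect classification
print(matthews_corrcoef(tp=5, tn=5, fp=5, fn=5))    # no better than chance
```

An MCC of 1 means every compound was classified correctly, so a value of 1 on internal validation alongside 0.62 on external validation indicates a noticeably weaker fit on unseen compounds.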
Affiliation(s)
- Philipe de Oliveira Fernandes
  - Departamento de Produtos Farmacêuticos, Faculdade de Farmácia, Universidade Federal de Minas Gerais, Belo Horizonte, Minas Gerais, Brazil
- João Paulo A Martins
  - Departamento de Química, Instituto de Ciências Exatas, Universidade Federal de Minas Gerais, Belo Horizonte, Minas Gerais, Brazil
- Eduardo B de Melo
  - Laboratório de Química Medicinal e Ambiental Teórica, Universidade Estadual do Oeste do Paraná, Cascavel, Paraná, Brazil
- Renata Barbosa de Oliveira
  - Departamento de Produtos Farmacêuticos, Faculdade de Farmácia, Universidade Federal de Minas Gerais, Belo Horizonte, Minas Gerais, Brazil
- Thales Kronenberger
  - Department of Pneumonology and Oncology, Internal Medicine VIII, University Hospital of Tübingen, Tübingen, Baden-Württemberg, Germany
- Vinícius Gonçalves Maltarollo
  - Departamento de Produtos Farmacêuticos, Faculdade de Farmácia, Universidade Federal de Minas Gerais, Belo Horizonte, Minas Gerais, Brazil
69
Li F, Hu Q, Zhang X, Sun R, Liu Z, Wu S, Tian S, Ma X, Dai Z, Yang X, Gao S, Bai F. DeepPROTACs is a deep learning-based targeted degradation predictor for PROTACs. Nat Commun 2022; 13:7133. [PMID: 36414666 PMCID: PMC9681730 DOI: 10.1038/s41467-022-34807-3] [Citation(s) in RCA: 20] [Impact Index Per Article: 10.0] [Received: 10/18/2021] [Accepted: 11/08/2022] [Indexed: 11/24/2022] Open
Abstract
The rational design of PROTACs is difficult due to their obscure structure-activity relationship. This study introduces a deep neural network model, DeepPROTACs, to help design potent PROTAC molecules. It can predict the degradation capacity of a proposed PROTAC molecule based on the structures of a given target protein and E3 ligase. The experimental dataset is mainly collected from PROTAC-DB and labeled according to the DC50 and Dmax values. In the DeepPROTACs model, the ligands and the ligand binding pockets are represented as graphs and fed into Graph Convolutional Networks for feature extraction, while SMILES representations of the linkers are fed into a Bidirectional Long Short-Term Memory layer to generate features. Experiments show that the DeepPROTACs model achieves 77.95% average prediction accuracy and an area under the receiver operating characteristic curve of 0.8470 on the test set. DeepPROTACs is available online at a web server (https://bailab.siais.shanghaitech.edu.cn/services/deepprotacs/) and on GitHub (https://github.com/fenglei104/DeepPROTACs).
Affiliation(s)
- Fenglei Li
  - Shanghai Institute for Advanced Immunochemical Studies, ShanghaiTech University, 393 Middle Huaxia Road, Shanghai, 201210 China
  - School of Information Science and Technology, ShanghaiTech University, 393 Middle Huaxia Road, Shanghai, 201210 China
- Qiaoyu Hu
  - Shanghai Institute for Advanced Immunochemical Studies, ShanghaiTech University, 393 Middle Huaxia Road, Shanghai, 201210 China
- Xianglei Zhang
  - Shanghai Institute for Advanced Immunochemical Studies, ShanghaiTech University, 393 Middle Huaxia Road, Shanghai, 201210 China
- Renhong Sun
  - Gluetacs Therapeutics (Shanghai) Co., Ltd., 99 Haike Road, Zhangjiang Hi-Tech Park, Shanghai, 201210 China
- Zhuanghua Liu
  - School of Information Science and Technology, ShanghaiTech University, 393 Middle Huaxia Road, Shanghai, 201210 China
- Sanan Wu
  - Shanghai Institute for Advanced Immunochemical Studies, ShanghaiTech University, 393 Middle Huaxia Road, Shanghai, 201210 China
- Siyuan Tian
  - Shanghai Institute for Advanced Immunochemical Studies, ShanghaiTech University, 393 Middle Huaxia Road, Shanghai, 201210 China
  - School of Life Science and Technology, ShanghaiTech University, 393 Middle Huaxia Road, Shanghai, 201210 China
- Xinyue Ma
  - Shanghai Institute for Advanced Immunochemical Studies, ShanghaiTech University, 393 Middle Huaxia Road, Shanghai, 201210 China
  - School of Information Science and Technology, ShanghaiTech University, 393 Middle Huaxia Road, Shanghai, 201210 China
- Zhizhuo Dai
  - School of Life Science and Technology, ShanghaiTech University, 393 Middle Huaxia Road, Shanghai, 201210 China
- Xiaobao Yang
  - Gluetacs Therapeutics (Shanghai) Co., Ltd., 99 Haike Road, Zhangjiang Hi-Tech Park, Shanghai, 201210 China
- Shenghua Gao
  - School of Information Science and Technology, ShanghaiTech University, 393 Middle Huaxia Road, Shanghai, 201210 China
- Fang Bai
  - Shanghai Institute for Advanced Immunochemical Studies, ShanghaiTech University, 393 Middle Huaxia Road, Shanghai, 201210 China
  - School of Information Science and Technology, ShanghaiTech University, 393 Middle Huaxia Road, Shanghai, 201210 China
  - School of Life Science and Technology, ShanghaiTech University, 393 Middle Huaxia Road, Shanghai, 201210 China
  - Shanghai Clinical Research and Trial Center, Shanghai, 201210 China
70
Kanapeckaitė A, Mažeikienė A, Geris L, Burokienė N, Cottrell GS, Widera D. Computational pharmacology: New avenues for COVID-19 therapeutics search and better preparedness for future pandemic crises. Biophys Chem 2022; 290:106891. [PMID: 36137310 PMCID: PMC9464258 DOI: 10.1016/j.bpc.2022.106891] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/07/2022] [Revised: 09/03/2022] [Accepted: 09/04/2022] [Indexed: 01/07/2023]
Abstract
The COVID-19 pandemic created an unprecedented global healthcare emergency prompting the exploration of new therapeutic avenues, including drug repurposing. A large number of ongoing studies revealed pervasive issues in clinical research, such as the lack of accessible and organised data. Moreover, current shortcomings in clinical studies highlighted the need for a multi-faceted approach to tackle this health crisis. Thus, we set out to explore and develop new strategies for drug repositioning by employing computational pharmacology, data mining, systems biology, and computational chemistry to advance shared efforts in identifying key targets, affected networks, and potential pharmaceutical intervention options. Our study revealed that formulating pharmacological strategies should rely on both therapeutic targets and their networks. We showed how data mining can reveal regulatory patterns, capture novel targets, alert about side-effects, and help identify new therapeutic avenues. We also highlighted the importance of the miRNA regulatory layer and how this information could be used to monitor disease progression or devise treatment strategies. Importantly, our work bridged the interactome with the chemical compound space to better understand the complex landscape of COVID-19 drugs. Machine and deep learning allowed us to showcase limitations in current chemical libraries for COVID-19 suggesting that both in silico and experimental analyses should be combined to retrieve therapeutically valuable compounds. Based on the gathered data, we strongly advocate for taking this opportunity to establish robust practices for treating today's and future infectious diseases by preparing solid analytical frameworks.
Affiliation(s)
- Austė Kanapeckaitė
  - AK Consulting, Laisvės g. 7, LT 12007 Vilnius, Lithuania (corresponding author)
- Asta Mažeikienė
  - Department of Physiology, Biochemistry, Microbiology and Laboratory Medicine, Institute of Biomedical Sciences, Faculty of Medicine, Vilnius University, M. K. Čiurlionio g. 21, LT-03101 Vilnius, Lithuania
- Liesbet Geris
  - Biomechanics Research Unit, GIGA In Silico Medicine, University of Liège, Quartier Hôpital, Avenue de l'Hôpital 11 (B34), Liège 4000, Belgium
  - Biomechanics Section, Department of Mechanical Engineering, KU Leuven, Celestijnenlaan 300C (2419), Leuven 3001, Belgium
  - Skeletal Biology and Engineering Research Center, Department of Development and Regeneration, KU Leuven, Herestraat 49 (813), Leuven 3000, Belgium
- Neringa Burokienė
  - Clinics of Internal Diseases, Family Medicine and Oncology, Institute of Clinical Medicine, Faculty of Medicine, Vilnius University, M. K. Čiurlionio str. 21/27, LT-03101 Vilnius, Lithuania
- Graeme S. Cottrell
  - University of Reading, School of Pharmacy, Hopkins Building, Reading RG6 6UB, United Kingdom
- Darius Widera
  - University of Reading, School of Pharmacy, Hopkins Building, Reading RG6 6UB, United Kingdom
71
Diéguez-Santana K, Nachimba-Mayanchi MM, Puris A, Gutiérrez RT, González-Díaz H. Prediction of acute toxicity of pesticides for Americamysis bahia using linear and nonlinear QSTR modelling approaches. Environmental Research 2022; 214:113984. [PMID: 35981614 DOI: 10.1016/j.envres.2022.113984] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/27/2022] [Revised: 06/19/2022] [Accepted: 07/22/2022] [Indexed: 06/15/2023]
Abstract
Globally, pesticides are toxic substances with wide applications. However, the widespread use of pesticides has received increasing attention from regulatory agencies due to their various acute and chronic effects on multiple organisms. In this study, Quantitative Structure-Toxicity Relationship (QSTR) models were established using Multiple Linear Regression (MLR) and five Machine Learning (ML) algorithms to predict pesticide toxicity in Americamysis bahia. The most influential descriptors included in the MLR model are RBF, JGI2, nCbH, nRCOOR, nRSR, nPO4 and 'Cl-090', with positive contributions to the dependent variable (negative decimal logarithm of median lethal concentration at 96-h). The Random Forest (RF) regression model was superior amongst the five ML models. We observed higher values of R2 (0.812) and lower values of RMSE (0.595) and MAE (0.462) in the cross-validation training set and external validation set. Similarly, this study had a high level of fitness and was internally robust and externally predictive compared to models presented in similar studies. The results suggest that the developed QSTR models are suitable for reliably predicting the aquatic toxicity of structurally diverse pesticides and can be used for screening, prioritising new pesticides, filling data gaps and overcoming the limitations of in vivo and in vitro tests.
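The abstract compares regression models by R2, RMSE, and MAE. As a minimal sketch of how these three statistics are defined, the function below computes them from observed and predicted values; the numbers in the usage lines are made-up placeholders, not data from the study.

```python
import math

def regression_metrics(y_true, y_pred):
    """Return (R2, RMSE, MAE) for paired observed and predicted values."""
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]
    mae = sum(abs(e) for e in errors) / n            # mean absolute error
    ss_res = sum(e * e for e in errors)              # residual sum of squares
    rmse = math.sqrt(ss_res / n)                     # root mean square error
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)  # total sum of squares
    r2 = 1.0 - ss_res / ss_tot                       # coefficient of determination
    return r2, rmse, mae

# Hypothetical observed vs. predicted toxicity values, for illustration only
observed = [1.0, 2.0, 3.0, 4.0]
predicted = [2.0, 2.0, 3.0, 3.0]
r2, rmse, mae = regression_metrics(observed, predicted)
print(f"R2={r2:.3f} RMSE={rmse:.3f} MAE={mae:.3f}")
```

A better model has R2 closer to 1 and RMSE and MAE closer to 0; RMSE penalizes large individual errors more heavily than MAE does.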
Affiliation(s)
- Karel Diéguez-Santana
  - Department of Organic and Inorganic Chemistry, University of the Basque Country UPV/EHU, 48940, Leioa, Spain
  - Universidad Regional Amazónica Ikiam, Tena, Ecuador
- Amilkar Puris
  - Facultad de Ciencias de la Ingeniería, Universidad Técnica Estatal de Quevedo, Ecuador
- Humberto González-Díaz
  - Department of Organic and Inorganic Chemistry, University of the Basque Country UPV/EHU, 48940, Leioa, Spain
  - Basque Center for Biophysics CSIC-UPVEH, University of Basque Country UPV/EHU, 48940, Leioa, Spain
  - IKERBASQUE, Basque Foundation for Science, 48011, Bilbao, Biscay, Spain
72
van Beek B, Zito J, Visscher L, Infante I. CAT: A Compound Attachment Tool for the Construction of Composite Chemical Compounds. J Chem Inf Model 2022; 62:5525-5535. [PMID: 36314636 PMCID: PMC9976287 DOI: 10.1021/acs.jcim.2c00690] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/29/2022]
Abstract
The continuous improvement of computer architectures allows for the simulation of molecular systems of growing sizes. However, such calculations still require the input of initial structures, which are also becoming increasingly complex. In this work, we present CAT, a Compound Attachment Tool (source code available at https://github.com/nlesc-nano/CAT) and Python package for the automatic construction of composite chemical compounds, which supports the functionalization of organic, inorganic, and hybrid organic-inorganic materials. The CAT workflow consists in defining the anchoring sites on the reference material, usually a large molecular system denoted as a scaffold, and on the molecular species that are attached to it, i.e., the ligands. Usually, ligands are pre-optimized in a conformation biased toward more linear structures to minimize interligand(s) steric interactions, a bias that is important when multiple ligands are attached onto the scaffold. The resulting superstructure(s) are then stored in various formats that can be used afterward in quantum chemical calculations or classical force field-based simulations.
Affiliation(s)
- Bas van Beek
  - Division of Theoretical Chemistry, Faculty of Science, Vrije Universiteit Amsterdam, de Boelelaan 1083, Amsterdam 1081 HV, the Netherlands
- Juliette Zito
  - Dipartimento di Chimica e Chimica Industriale, Università degli Studi di Genova, Via Dodecaneso 31, Genova 16146, Italy
  - Department of Nanochemistry, Istituto Italiano di Tecnologia, Via Morego 30, Genova 16163, Italy
- Lucas Visscher
  - Division of Theoretical Chemistry, Faculty of Science, Vrije Universiteit Amsterdam, de Boelelaan 1083, Amsterdam 1081 HV, the Netherlands
- Ivan Infante
  - Department of Nanochemistry, Istituto Italiano di Tecnologia, Via Morego 30, Genova 16163, Italy
  - BCMaterials, Basque Center for Materials, Applications, and Nanostructures, UPV/EHU Science Park, Leioa 48940, Spain
  - Ikerbasque, Basque Foundation for Science, Bilbao 48009, Spain
73
KUALA: a machine learning-driven framework for kinase inhibitors repositioning. Sci Rep 2022; 12:17877. [PMID: 36284125 PMCID: PMC9595087 DOI: 10.1038/s41598-022-22324-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/01/2022] [Accepted: 10/12/2022] [Indexed: 01/20/2023] Open
Abstract
The family of protein kinases comprises more than 500 genes involved in numerous functions. Hence, their physiological dysfunction has paved the way toward drug discovery for cancer, cardiovascular, and inflammatory diseases. The high similarity of kinase binding sites plays a double role: on the one hand, it is a critical issue for selectivity; on the other hand, according to polypharmacology, a synergistic controlled effect on more than one target could be of great pharmacological interest. Another important aspect of binding-site similarity is the possibility of exploiting it to reposition drugs on targets of the same family. In this study, we propose an approach called Kinase drUgs mAchine Learning frAmework (KUALA) to automatically identify kinase active ligands by using specific sets of molecular descriptors and to provide a multi-target priority score and a repurposing threshold that suggest the best repurposable and non-repurposable molecules. The comprehensive list of all kinase-ligand pairs and their scores can be found at https://github.com/molinfrimed/multi-kinases.
74
Rahman ASMZ, Liu C, Sturm H, Hogan AM, Davis R, Hu P, Cardona ST. A machine learning model trained on a high-throughput antibacterial screen increases the hit rate of drug discovery. PLoS Comput Biol 2022; 18:e1010613. [PMID: 36228001 PMCID: PMC9624395 DOI: 10.1371/journal.pcbi.1010613] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 05/10/2022] [Revised: 11/01/2022] [Accepted: 09/26/2022] [Indexed: 01/24/2023] Open
Abstract
Screening for novel antibacterial compounds in small molecule libraries has a low success rate. We applied machine learning (ML)-based virtual screening for antibacterial activity and evaluated its predictive power by experimental validation. We first binarized 29,537 compounds according to their growth inhibitory activity (hit rate 0.87%) against the antibiotic-resistant bacterium Burkholderia cenocepacia and described their molecular features with a directed-message passing neural network (D-MPNN). Then, we used the data to train an ML model that achieved a receiver operating characteristic (ROC) score of 0.823 on the test set. Finally, we predicted antibacterial activity in virtual libraries corresponding to 1,614 compounds from the Food and Drug Administration (FDA)-approved list and 224,205 natural products. Hit rates of 26% and 12%, respectively, were obtained when we tested the top-ranked predicted compounds for growth inhibitory activity against B. cenocepacia, which represents at least a 14-fold increase from the previous hit rate. In addition, more than 51% of the predicted antibacterial natural compounds inhibited ESKAPE pathogens showing that predictions expand beyond the organism-specific dataset to a broad range of bacteria. Overall, the developed ML approach can be used for compound prioritization before screening, increasing the typical hit rate of drug discovery.
Affiliation(s)
- Chengyou Liu
  - Department of Electrical and Computer Engineering, University of Manitoba, Winnipeg, Manitoba, Canada
- Hunter Sturm
  - Department of Chemistry, University of Manitoba, Winnipeg, Manitoba, Canada
- Andrew M. Hogan
  - Department of Microbiology, University of Manitoba, Winnipeg, Manitoba, Canada
- Rebecca Davis
  - Department of Chemistry, University of Manitoba, Winnipeg, Manitoba, Canada
- Pingzhao Hu
  - Department of Electrical and Computer Engineering, University of Manitoba, Winnipeg, Manitoba, Canada
  - Department of Computer Science, University of Manitoba, Winnipeg, Manitoba, Canada
  - Department of Biochemistry and Medical Genetics, University of Manitoba, Winnipeg, Manitoba, Canada
- Silvia T. Cardona
  - Department of Microbiology, University of Manitoba, Winnipeg, Manitoba, Canada
  - Department of Medical Microbiology & Infectious Diseases, University of Manitoba, Winnipeg, Canada
  - * E-mail:
75
Structural Model Based on Genetic Algorithm for Inhibiting Fatty Acid Amide Hydrolase. AI 2022. [DOI: 10.3390/ai3040052] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/06/2022] Open
Abstract
Fatty acid amide hydrolase (FAAH) is an enzyme responsible for the degradation of anandamide, an endocannabinoid. Pharmacologically blocking this target can lead to anxiolytic effects; therefore, new inhibitors can improve therapy in this field. To speed up the process of drug discovery, various in silico methods can be used, such as molecular docking, quantitative structure–activity relationship (QSAR) models, and artificial intelligence (AI) classification algorithms. Besides architecture, one important factor for an AI model with high accuracy is the quality of the dataset. This issue can be addressed by a genetic algorithm (GA) that selects optimal features for the prediction. The objective of the current study is to use this feature selection method to identify the most relevant molecular descriptors that can be used as independent variables, thus improving the efficacy of AI algorithms that predict FAAH inhibitors. The model that used features chosen by the genetic algorithm had better accuracy than the model that used all molecular descriptors generated by the CDK descriptor calculator 1.4.6 software. Hence, carefully selecting the input data used by AI classification algorithms by means of a GA is a promising strategy in drug development.
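As an illustration of the idea of GA-based feature selection described in this abstract, here is a minimal Python sketch. It is not the authors' implementation: the fitness function is a hypothetical stand-in for classifier accuracy (it rewards a made-up set of "useful" descriptor indices and penalizes mask size), and the population size, mutation rate, and elitism scheme are assumptions chosen for brevity.

```python
import random


def fitness(mask, useful, penalty=0.1):
    """Hypothetical stand-in for model accuracy: reward useful features, penalize size."""
    hits = sum(1 for i, bit in enumerate(mask) if bit and i in useful)
    return hits - penalty * sum(mask)


def evolve(n_features, useful, pop_size=20, generations=30, mutation_rate=0.2, seed=0):
    """Evolve binary feature masks; returns the best mask and the best fitness per step."""
    rng = random.Random(seed)
    pop = [[rng.randint(0, 1) for _ in range(n_features)] for _ in range(pop_size)]
    history = []
    for _ in range(generations):
        pop.sort(key=lambda m: fitness(m, useful), reverse=True)
        history.append(fitness(pop[0], useful))
        parents = pop[: pop_size // 2]          # truncation selection
        children = [pop[0][:]]                  # elitism: keep the current best mask
        while len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, n_features)  # one-point crossover
            child = a[:cut] + b[cut:]
            if rng.random() < mutation_rate:    # flip one randomly chosen bit
                i = rng.randrange(n_features)
                child[i] = 1 - child[i]
            children.append(child)
        pop = children
    best = max(pop, key=lambda m: fitness(m, useful))
    history.append(fitness(best, useful))
    return best, history


# Toy run: 12 descriptors, of which indices 0, 3 and 7 are (by assumption) informative.
best_mask, history = evolve(n_features=12, useful={0, 3, 7})
print("selected descriptor indices:", [i for i, bit in enumerate(best_mask) if bit])
```

In a real study the fitness call would train and validate a classifier on the selected descriptors; because the best mask is always carried over, the best fitness never decreases from one generation to the next.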
|
76
|
Mudedla SK, Braka A, Wu S. Quantum-based machine learning and AI models to generate force field parameters for drug-like small molecules. Front Mol Biosci 2022; 9:1002535. [PMID: 36304919 PMCID: PMC9592901 DOI: 10.3389/fmolb.2022.1002535] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/25/2022] [Accepted: 09/15/2022] [Indexed: 11/28/2022] Open
Abstract
Force fields for drug-like small molecules play an essential role in molecular dynamics simulations and binding free energy calculations. In particular, the accurate generation of partial charges on small molecules is critical to understanding the interactions between proteins and drug-like molecules. However, it is a time-consuming process. Thus, we generated a force field for small molecules and employed a machine learning (ML) model to predict partial charges on molecules in less than a minute. We performed density functional theory (DFT) calculations for 31,770 small molecules that covered the chemical space of drug-like molecules. The partial charges for the atoms in a molecule were predicted using an ML model trained on DFT-based atomic charges. The predicted values were comparable to the charges obtained from DFT calculations. The ML model showed high accuracy in the prediction of atomic charges for external test data sets. We also developed neural network (NN) models to assign atom types, phase angles, and periodicities. All the models performed with high accuracy on test data sets. Our code calculated all the descriptors needed for the prediction of force field parameters and produced topologies for small molecules by combining results from the ML and NN models. To assess the accuracy of the predicted force field parameters, we calculated solvation free energies for small molecules, and the results were in close agreement with experimental free energies. The AI-generated force field was effective in the fast and accurate generation of partial charges and other force field parameters for small drug-like molecules.
Affiliation(s)
| | | | - Sangwook Wu
- R&D Center, PharmCADD, Busan, South Korea
- Department of Physics, Pukyong National University, Busan, South Korea
- *Correspondence: Sangwook Wu,
| |
|
77
|
Ramamurthi D, Selvaraj J, Raj PV, Tallapaneni V, Chandrasekar MJN. Downregulation of NT5C3 gene expressions by elastin-like polypeptide gemcitabine conjugate for ovarian cancer therapy. J Drug Deliv Sci Technol 2022. [DOI: 10.1016/j.jddst.2022.103821] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/14/2022]
|
78
|
Hadiby S, Ali YMB. FNN Based-Virtual Screening Using 2D Pharmacophore Fingerprint for Activity Prediction in Drug Discovery. INTERNATIONAL JOURNAL OF COMPUTATIONAL INTELLIGENCE AND APPLICATIONS 2022. [DOI: 10.1142/s1469026822500195] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/18/2022]
Abstract
Drug discovery remains a difficult field that faces many challenges, from the beginning of the process to the end, on the way to a new potential drug. Technology has helped considerably in achieving many goals at the lowest cost and in the shortest possible time. Machine learning methods have proven their performance over many years, despite their limitations in some cases. The use of deep learning for virtual screening in drug discovery makes it possible to process large amounts of data efficiently and gives more precise results. In this paper, we propose a procedure for virtual screening (VS) based on a feedforward neural network (FNN) to predict the biological activity of a set of chemical compounds on a given receptor. We propose a distance interval and its divisions to describe each chemical compound by a 2D pharmacophore fingerprint. Our model was trained on a dataset of active and inactive chemical compounds on the cyclin A kinase1 receptor (CDK1), a member of a very important protein family that has a role in the regulation of the cell cycle and in cancer development. The results show that the proposed model is efficient and comparable with some widely used machine learning methods in drug discovery.
Affiliation(s)
- Seloua Hadiby
- Department of Computer Science, Computer Research Laboratory, Badji Mokhtar University, Annaba, Algeria
| | - Yamina Mohamed Ben Ali
- Department of Computer Science, Computer Research Laboratory, Badji Mokhtar University, Annaba, Algeria
| |
|
79
|
Protein Function Analysis through Machine Learning. Biomolecules 2022; 12:biom12091246. [PMID: 36139085 PMCID: PMC9496392 DOI: 10.3390/biom12091246] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Received: 07/16/2022] [Revised: 08/22/2022] [Accepted: 08/31/2022] [Indexed: 11/16/2022] Open
Abstract
Machine learning (ML) has been an important part of the computational biology arsenal used to elucidate protein function for decades. With the recent burgeoning of novel ML methods and applications, new ML approaches have been incorporated into many areas of computational biology dealing with protein function. We examine how ML has been integrated into a wide range of computational models to improve prediction accuracy and gain a better understanding of protein function. The applications discussed are protein structure prediction, protein engineering using sequence modifications to achieve stability and druggability characteristics, molecular docking in terms of protein–ligand binding, including allosteric effects, protein–protein interactions, and protein-centric drug discovery. To quantify the mechanisms underlying protein function, a holistic approach that takes structure, flexibility, stability, and dynamics into account is required, as these aspects become inseparable through their interdependence. Another key component of protein function is conformational dynamics, which often manifest as protein kinetics. Computational methods that use ML to generate representative conformational ensembles and to quantify differences in conformational ensembles important for function are included in this review. Future opportunities are highlighted for each of these topics.
|
80
|
An emerging machine learning strategy for electrochemical sensor and supercapacitor using carbonized metal–organic framework. J Electroanal Chem (Lausanne) 2022. [DOI: 10.1016/j.jelechem.2022.116634] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/17/2022]
|
81
|
Veríssimo GC, Serafim MSM, Kronenberger T, Ferreira RS, Honorio KM, Maltarollo VG. Designing drugs when there is low data availability: one-shot learning and other approaches to face the issues of a long-term concern. Expert Opin Drug Discov 2022; 17:929-947. [PMID: 35983695 DOI: 10.1080/17460441.2022.2114451] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Indexed: 11/04/2022]
Abstract
INTRODUCTION Modern drug discovery generally relies on useful information from existing large databases or on uncovering novel data. The lack of biological and/or chemical data tends to slow the development of scientific research and innovation. Here, approaches from the last five years that may help generate or obtain enough relevant data, or improve and accelerate existing methods, are reviewed. AREAS COVERED One-shot learning (OSL) approaches, structural modeling, molecular docking, scoring function space (SFS), molecular dynamics (MD), and quantum mechanics (QM) may be used to amplify the amount of data available to drug design and discovery campaigns; these methods, their perspectives, and discussions of their use in the near future are presented. EXPERT OPINION Recent works have successfully used these techniques to solve a range of issues in the face of data scarcity, including complex problems such as the challenging scenario of drug design aimed at intrinsically disordered proteins and the evaluation of potential adverse effects in a clinical scenario. These examples show that it is possible to improve and kickstart research from scarce available data to design and discover new potential drugs.
Affiliation(s)
- Gabriel C Veríssimo
- Departamento de Produtos Farmacêuticos, Faculdade de Farmácia, Universidade Federal de Minas Gerais (UFMG), Belo Horizonte, Brazil
| | - Mateus Sá M Serafim
- Departamento de Microbiologia, Instituto de Ciências Biológicas, Universidade Federal de Minas Gerais (UFMG), Belo Horizonte, Brazil
| | - Thales Kronenberger
- Department of Medical Oncology and Pneumology, Internal Medicine VIII, University Hospital of Tübingen, Tübingen, Germany
- School of Pharmacy, Faculty of Health Sciences, University of Eastern Finland, Kuopio, Finland
| | - Rafaela S Ferreira
- Departamento de Bioquímica e Imunologia, Instituto de Ciências Biológicas, Universidade Federal de Minas Gerais (UFMG), Belo Horizonte, Brazil
| | - Kathia M Honorio
- Escola de Artes, Ciências e Humanidades, Universidade de São Paulo (USP), São Paulo, Brazil
- Centro de Ciências Naturais e Humanas, Universidade Federal do ABC (UFABC), Santo André, Brazil
| | - Vinícius G Maltarollo
- Departamento de Produtos Farmacêuticos, Faculdade de Farmácia, Universidade Federal de Minas Gerais (UFMG), Belo Horizonte, Brazil
| |
|
82
|
García-Ortegón M, Simm GNC, Tripp AJ, Hernández-Lobato JM, Bender A, Bacallado S. DOCKSTRING: Easy Molecular Docking Yields Better Benchmarks for Ligand Design. J Chem Inf Model 2022; 62:3486-3502. [PMID: 35849793 PMCID: PMC9364321 DOI: 10.1021/acs.jcim.1c01334] [Citation(s) in RCA: 26] [Impact Index Per Article: 13.0] [Received: 11/03/2021] [Indexed: 01/05/2023]
Abstract
The field of machine learning for drug discovery is witnessing an explosion of novel methods. These methods are often benchmarked on simple physicochemical properties such as solubility or general druglikeness, which can be readily computed. However, these properties are poor representatives of objective functions in drug design, mainly because they do not depend on the candidate compound's interaction with the target. By contrast, molecular docking is a widely applied method in drug discovery to estimate binding affinities. However, docking studies require a significant amount of domain knowledge to set up correctly, which hampers adoption. Here, we present dockstring, a bundle for meaningful and robust comparison of ML models using docking scores. dockstring consists of three components: (1) an open-source Python package for straightforward computation of docking scores, (2) an extensive dataset of docking scores and poses of more than 260,000 molecules for 58 medically relevant targets, and (3) a set of pharmaceutically relevant benchmark tasks such as virtual screening or de novo design of selective kinase inhibitors. The Python package implements a robust ligand and target preparation protocol that allows nonexperts to obtain meaningful docking scores. Our dataset is the first to include docking poses, as well as the first of its size that is a full matrix, thus facilitating experiments in multiobjective optimization and transfer learning. Overall, our results indicate that docking scores are a more realistic evaluation objective than simple physicochemical properties, yielding benchmark tasks that are more challenging and more closely related to real problems in drug discovery.
Affiliation(s)
- Miguel García-Ortegón
- Statistical Laboratory, Centre for Mathematical Sciences, University of Cambridge, Wilberforce Rd., Cambridge CB3 0WB, United Kingdom
| | - Gregor N. C. Simm
- Department of Engineering, University of Cambridge, Trumpington St., Cambridge CB2 1PZ, United Kingdom
| | - Austin J. Tripp
- Department of Engineering, University of Cambridge, Trumpington St., Cambridge CB2 1PZ, United Kingdom
| | | | - Andreas Bender
- Yusuf Hamied Department of Chemistry, University of Cambridge, Lensfield Rd., Cambridge CB2 1EW, United Kingdom
| | - Sergio Bacallado
- Statistical Laboratory, Centre for Mathematical Sciences, University of Cambridge, Wilberforce Rd., Cambridge CB3 0WB, United Kingdom
| |
|
83
|
When machine learning meets molecular synthesis. TRENDS IN CHEMISTRY 2022. [DOI: 10.1016/j.trechm.2022.07.005] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/19/2022]
|
84
|
Feldmann C, Bajorath J. Calculation of Exact Shapley Values for Support Vector Machines with Tanimoto Kernel Enables Model Interpretation. iScience 2022; 25:105023. [PMID: 36105596 PMCID: PMC9464958 DOI: 10.1016/j.isci.2022.105023] [Citation(s) in RCA: 7] [Impact Index Per Article: 3.5] [Received: 07/03/2022] [Revised: 08/09/2022] [Accepted: 08/20/2022] [Indexed: 11/24/2022] Open
Abstract
The support vector machine (SVM) algorithm is popular in chemistry and drug discovery. SVM models have a black-box character. Their predictions can be interpreted through feature weighting or the model-agnostic Shapley additive explanations (SHAP) formalism, which locally approximates Shapley values (SVs) originating from game theory. We introduce an algorithm termed SV-expressed Tanimoto similarity (SVETA) for the exact calculation of SVs to explain SVM models employing the Tanimoto kernel, the gold standard for the assessment of molecular similarity. For a model system, the exact calculation of SVs is demonstrated. In an SVM-based compound classification task from drug discovery, only a limited correlation between exact SV and SHAP values is observed, prohibiting the use of approximate values for rationalizing predictions. For exemplary test compounds, atom-based mapping of prioritized features delineates coherent substructures that closely resemble those obtained by analyzing independently derived random forest models, thus providing consistent explanations.
- SVETA: new methodology for explaining support vector machine (SVM) predictions
- Tanimoto similarity-based SVM models are popular in chemistry
- SVETA enables the calculation of exact Shapley values for rationalizing SVM models
- SVETA-based feature mapping provides intuitive explanations of SVM decisions
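To make the two quantities named in this abstract concrete, the Python sketch below computes the Tanimoto similarity of two toy fingerprints and then obtains exact Shapley values by brute-force enumeration of all feature coalitions. This is not the SVETA algorithm: the coalition value used here (the Tanimoto similarity restricted to the features in the coalition) and the toy fingerprints are assumptions made purely for illustration, and brute-force enumeration is feasible only for a handful of features.

```python
from itertools import combinations
from math import factorial


def tanimoto(a: set, b: set) -> float:
    """Tanimoto similarity of two binary fingerprints given as sets of 'on' bits."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def exact_shapley(features, value):
    """Exact Shapley values by enumerating every coalition (exponential cost)."""
    n = len(features)
    phi = {}
    for f in features:
        others = [g for g in features if g != f]
        total = 0.0
        for k in range(n):  # coalition sizes 0 .. n-1
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for subset in combinations(others, k):
                s = set(subset)
                total += weight * (value(s | {f}) - value(s))
        phi[f] = total
    return phi


# Toy fingerprints (hypothetical): bits 1 and 2 are shared, bits 0 and 3 are not.
fp_a = {0, 1, 2}
fp_b = {1, 2, 3}
features = sorted(fp_a | fp_b)


def coalition_value(s):
    # Assumed value function: similarity computed on the coalition's features only.
    return tanimoto(fp_a & s, fp_b & s)


phi = exact_shapley(features, coalition_value)
for f in features:
    print(f"feature {f}: {phi[f]:+.4f}")
print("sum of Shapley values:", round(sum(phi.values()), 6))
print("full Tanimoto similarity:", tanimoto(fp_a, fp_b))
```

By the efficiency property of Shapley values, the contributions sum to the full similarity minus the value of the empty coalition; shared bits receive positive contributions and unshared bits negative ones in this toy setting.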
|
85
|
Zhu S, Bai Q, Li L, Xu T. Drug repositioning in drug discovery of T2DM and repositioning potential of antidiabetic agents. Comput Struct Biotechnol J 2022; 20:2839-2847. [PMID: 35765655 PMCID: PMC9189996 DOI: 10.1016/j.csbj.2022.05.057] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 03/14/2022] [Revised: 05/30/2022] [Accepted: 05/30/2022] [Indexed: 12/19/2022] Open
Abstract
Repositioned or repurposed drugs account for a substantial share of the drugs entering the approval pipeline, which indicates that drug repositioning has huge market potential and value. Computational technologies such as machine learning methods have accelerated the process of drug repositioning in the last few decades. The repositioning potential of type 2 diabetes mellitus (T2DM) drugs for various diseases such as cancer, neurodegenerative diseases, and cardiovascular diseases has been widely studied. Hence, a summary of the repurposing of antidiabetic drugs is of great significance. In this review, we focus on machine learning methods for the development of new T2DM drugs and give an overview of the repurposing potential of existing antidiabetic agents.
Affiliation(s)
- Sha Zhu
- Key Lab of Preclinical Study for New Drugs of Gansu Province, Institute of Biochemistry and Molecular Biology, School of Basic Medical Sciences, Lanzhou University, Lanzhou, Gansu 730000, PR China
| | - Qifeng Bai
- Key Lab of Preclinical Study for New Drugs of Gansu Province, Institute of Biochemistry and Molecular Biology, School of Basic Medical Sciences, Lanzhou University, Lanzhou, Gansu 730000, PR China
- Corresponding author.
| | | | | |
|
86
|
Deep Learning Based-Virtual Screening Using 2D Pharmacophore Fingerprint in Drug Discovery. Neural Process Lett 2022. [DOI: 10.1007/s11063-022-10879-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/29/2022]
|
87
|
Zhang S, Yan Z, Huang Y, Liu L, He D, Wang W, Fang X, Zhang X, Wang F, Wu H, Wang H. HelixADMET: a robust and endpoint extensible ADMET system incorporating self-supervised knowledge transfer. Bioinformatics 2022; 38:3444-3453. [PMID: 35604079 DOI: 10.1093/bioinformatics/btac342] [Citation(s) in RCA: 9] [Impact Index Per Article: 4.5] [Received: 01/16/2022] [Revised: 05/06/2022] [Accepted: 05/17/2022] [Indexed: 11/13/2022] Open
Abstract
MOTIVATION Accurate ADMET (an abbreviation for "absorption, distribution, metabolism, excretion, and toxicity") predictions can efficiently screen out undesirable drug candidates in the early stage of drug discovery. In recent years, multiple comprehensive ADMET systems that adopt advanced machine learning models have been developed, providing services to estimate multiple endpoints. However, those ADMET systems usually suffer from weak extrapolation ability. First, due to the lack of labelled data for each endpoint, typical machine learning models perform poorly on molecules with unobserved scaffolds. Second, most systems only provide fixed built-in endpoints and cannot be customised to satisfy various research requirements. To this end, we develop a robust and endpoint extensible ADMET system, HelixADMET (H-ADMET). H-ADMET incorporates the concept of self-supervised learning to produce a robust pre-trained model. The model is then fine-tuned with a multi-task and multi-stage framework to transfer knowledge between ADMET endpoints, auxiliary tasks, and self-supervised tasks. RESULTS Our results demonstrate that H-ADMET achieves an overall improvement of 4% compared with existing ADMET systems on comparable endpoints. Additionally, the pre-trained model provided by H-ADMET can be fine-tuned to generate new and customised ADMET endpoints, meeting the various demands of drug research and development. AVAILABILITY H-ADMET is freely accessible at https://paddlehelix.baidu.com/app/drug/admet/train. SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Shanzhuo Zhang
- Baidu International Technology (Shenzhen) Co., Ltd., Shenzhen, China
| | - Zhiyuan Yan
- Baidu International Technology (Shenzhen) Co., Ltd., Shenzhen, China
| | - Yueyang Huang
- Baidu International Technology (Shenzhen) Co., Ltd., Shenzhen, China
| | - Lihang Liu
- Baidu International Technology (Shenzhen) Co., Ltd., Shenzhen, China
| | - Donglong He
- Baidu International Technology (Shenzhen) Co., Ltd., Shenzhen, China
| | - Wei Wang
- School of Computer Science and Technology, Harbin Institute of Technology (HIT), Shenzhen, China
| | - Xiaomin Fang
- Baidu International Technology (Shenzhen) Co., Ltd., Shenzhen, China
| | - Xiaonan Zhang
- Baidu International Technology (Shenzhen) Co., Ltd., Shenzhen, China
| | - Fan Wang
- Baidu International Technology (Shenzhen) Co., Ltd., Shenzhen, China
| | - Hua Wu
- Baidu Inc., Beijing, China
| | | |
|
88
|
Yang C, Zhang Y. Delta Machine Learning to Improve Scoring-Ranking-Screening Performances of Protein-Ligand Scoring Functions. J Chem Inf Model 2022; 62:2696-2712. [PMID: 35579568 DOI: 10.1021/acs.jcim.2c00485] [Citation(s) in RCA: 22] [Impact Index Per Article: 11.0] [Indexed: 01/01/2023]
Abstract
Protein-ligand scoring functions are widely used in structure-based drug design for fast evaluation of protein-ligand interactions, and it is of strong interest to develop scoring functions with machine-learning approaches. In this work, by expanding the training set, developing physically meaningful features, employing our recently developed linear empirical scoring function Lin_F9 (Yang, C. J. Chem. Inf. Model. 2021, 61, 4630-4644) as the baseline, and applying extreme gradient boosting (XGBoost) with Δ-machine learning, we have further improved the robustness and applicability of machine-learning scoring functions. Besides achieving top performance in the scoring-ranking-screening power tests of the CASF-2016 benchmark, the new scoring function ΔLin_F9XGB also achieves superior scoring and ranking performance for different structure types that mimic real docking applications. The scoring powers of ΔLin_F9XGB for locally optimized poses, flexible redocked poses, and ensemble docked poses of the CASF-2016 core set achieve Pearson's correlation coefficient (R) values of 0.853, 0.839, and 0.813, respectively. In addition, the large-scale docking-based virtual screening test on the LIT-PCBA data set demonstrates the reliability and robustness of ΔLin_F9XGB in virtual screening applications. The ΔLin_F9XGB scoring function and its code are freely available on the web at https://yzhang.hpc.nyu.edu/Delta_LinF9_XGB.
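The Δ-machine learning idea mentioned in this abstract is that a model learns a correction to a baseline score, and the final prediction is the baseline plus that correction. The Python sketch below illustrates the principle under loud assumptions: all numbers are invented toy data, a one-variable least-squares line stands in for XGBoost, the single "descriptor" is hypothetical, and the correlation is evaluated on the training data rather than on a benchmark such as CASF-2016.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson's correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / sqrt(vx * vy)


def fit_line(x, y):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    slope = sum((a - mx) * (b - my) for a, b in zip(x, y)) / sum((a - mx) ** 2 for a in x)
    return slope, my - slope * mx


# Invented toy data: baseline scores, one hypothetical descriptor, measured affinities.
baseline = [5.0, 6.0, 7.0, 8.0]
descriptor = [0.0, 1.0, 0.0, 1.0]
experimental = [5.2, 7.1, 7.1, 9.0]

# Step 1: the learning target is the residual (experimental minus baseline).
residuals = [e - b for e, b in zip(experimental, baseline)]

# Step 2: fit the correction model on the residuals (stand-in for XGBoost).
slope, intercept = fit_line(descriptor, residuals)

# Step 3: final prediction = baseline + learned correction.
corrected = [b + slope * d + intercept for b, d in zip(baseline, descriptor)]

r_baseline = pearson_r(baseline, experimental)
r_corrected = pearson_r(corrected, experimental)
print(f"R (baseline only):      {r_baseline:.3f}")
print(f"R (baseline + delta):   {r_corrected:.3f}")
```

Learning only the residual lets the correction model focus on what the physically motivated baseline misses, which is one commonly cited motivation for the Δ-learning design.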
Affiliation(s)
- Chao Yang
- Department of Chemistry, New York University, New York, New York 10003, United States
| | - Yingkai Zhang
- Department of Chemistry, New York University, New York, New York 10003, United States
- NYU-ECNU Center for Computational Chemistry at NYU Shanghai, Shanghai 200062, China
| |
|
89
|
Luo L, Zheng T, Wang Q, Liao Y, Zheng X, Zhong A, Huang Z, Luo H. Virtual Screening Based on Machine Learning Explores Mangrove Natural Products as KRASG12C Inhibitors. Pharmaceuticals (Basel) 2022; 15:ph15050584. [PMID: 35631410 PMCID: PMC9146975 DOI: 10.3390/ph15050584] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 04/11/2022] [Revised: 05/03/2022] [Accepted: 05/05/2022] [Indexed: 12/10/2022] Open
Abstract
Mangrove secondary metabolites have many unique biological activities. We identified lead compounds among them that might target KRASG12C. KRAS is considered to be closely related to various cancers. A variety of novel small molecules that directly target KRAS are being developed, including covalent allosteric inhibitors for the KRASG12C mutant, protein–protein interaction inhibitors that bind in the switch I/II pocket or the A59 site, and GTP-competitive inhibitors targeting the nucleotide-binding site. To identify a candidate pool of mangrove secondary metabolic natural products, we tested various machine learning algorithms and selected random forest as the model for predicting the targeting activity of compounds. Lead compounds were then subjected to virtual screening and covalent docking, integrated absorption, distribution, metabolism and excretion (ADME) testing, and structure-based pharmacophore model validation to select the most suitable compounds. Finally, we performed molecular dynamics simulations to verify the binding mode of the lead compound to KRASG12C. The lazypredict package was used initially, and the accuracy and F1 scores of the random forest algorithm exceeded 60%, which can be considered a strong ability to distinguish the data. Four marine natural products were obtained through machine learning identification and covalent docking screening. Compound 44 and compound 14 were selected for further validation after ADME and toxicity studies, and pharmacophore analysis indicated that they had a favorable pharmacodynamic profile. Comparison with the positive control showed that they stabilized switch I and switch II and, like MRTX849, retained a novel binding mechanism at the molecular level. Molecular dynamics analysis showed that they maintained a stable conformation with the target protein, so compound 44 and compound 14 may be effective inhibitors of the G12C mutant.
These findings suggest that the mangrove-derived secondary metabolites compound 44 and compound 14 might be potential therapeutic agents for KRASG12C.
Affiliation(s)
- Lianxiang Luo
- The Marine Biomedical Research Institute, Guangdong Medical University, Zhanjiang 524023, China
- The Marine Biomedical Research Institute of Guangdong Zhanjiang, Zhanjiang 524023, China
- Correspondence: (L.L.); (Z.H.); (H.L.)
| | - Tongyu Zheng
- The First Clinical College, Guangdong Medical University, Zhanjiang 524023, China; (T.Z.); (Q.W.); (Y.L.); (X.Z.); (A.Z.)
| | - Qu Wang
- The First Clinical College, Guangdong Medical University, Zhanjiang 524023, China; (T.Z.); (Q.W.); (Y.L.); (X.Z.); (A.Z.)
| | - Yingling Liao
- The First Clinical College, Guangdong Medical University, Zhanjiang 524023, China; (T.Z.); (Q.W.); (Y.L.); (X.Z.); (A.Z.)
| | - Xiaoqi Zheng
- The First Clinical College, Guangdong Medical University, Zhanjiang 524023, China; (T.Z.); (Q.W.); (Y.L.); (X.Z.); (A.Z.)
| | - Ai Zhong
- The First Clinical College, Guangdong Medical University, Zhanjiang 524023, China; (T.Z.); (Q.W.); (Y.L.); (X.Z.); (A.Z.)
| | - Zunnan Huang
- School of Pharmacy, Guangdong Medical University, Dongguan 523808, China
- Key Laboratory of Big Data Mining and Precision Drug Design of Guangdong Medical University, Dongguan 523808, China
- Correspondence: (L.L.); (Z.H.); (H.L.)
| | - Hui Luo
- The Marine Biomedical Research Institute, Guangdong Medical University, Zhanjiang 524023, China
- The Marine Biomedical Research Institute of Guangdong Zhanjiang, Zhanjiang 524023, China
- Correspondence: (L.L.); (Z.H.); (H.L.)
| |
|
90
|
Nag S, Baidya ATK, Mandal A, Mathew AT, Das B, Devi B, Kumar R. Deep learning tools for advancing drug discovery and development. 3 Biotech 2022; 12:110. [PMID: 35433167 PMCID: PMC8994527 DOI: 10.1007/s13205-022-03165-8] [Citation(s) in RCA: 17] [Impact Index Per Article: 8.5] [Received: 11/06/2021] [Accepted: 03/18/2022] [Indexed: 12/26/2022] Open
Abstract
A few decades ago, drug discovery and development were limited to a group of medicinal chemists working in a lab with an enormous amount of testing, validation, and synthetic procedures, all contributing to considerable investments of time and money to get one drug into the clinic. Advances in computational techniques, combined with a boom in multi-omics data, led to the development of various bioinformatics/pharmacoinformatics/cheminformatics tools that have helped speed up the drug development process. With the advent of artificial intelligence (AI), machine learning (ML), and deep learning (DL), the conventional drug discovery process has been further rationalized. Extensive biological data, available as big data in various databases across the globe, act as the raw material for ML/DL-based approaches and help in the accurate identification of patterns and models that can be used to identify therapeutically active molecules with far less investment of time, workforce, and money. In this review, we begin by introducing the general concepts of the drug discovery pipeline, followed by an outline of the areas of the drug discovery process where ML/DL can be utilized. We also introduce ML and DL along with their applications, various learning methods, and the training models used to develop ML/DL-based algorithms. Furthermore, we summarize various DL-based tools existing in the public domain and their application in the drug discovery paradigm, which includes DL tools for the identification of drug targets and drug–target interactions such as DeepCPI, DeepDTA, WideDTA, PADME, DeepAffinity, and DeepPocket.
Additionally, we have discussed various DL-based models used in protein structure prediction, de novo design of new chemical scaffolds, virtual screening of chemical libraries for hit identification, absorption, distribution, metabolism, excretion, and toxicity (ADMET) prediction, metabolite prediction, clinical trial design, and oral bioavailability prediction. In the end, we have tried to shed light on some of the successful ML/DL-based models used in the drug discovery and development pipeline while also discussing the current challenges and prospects of the application of DL tools in drug discovery and development. We believe that this review will be useful for medicinal and computational chemists searching for DL tools for use in their drug discovery projects.
Affiliation(s)
- Sagorika Nag
- Department of Pharmaceutical Engineering and Technology, Indian Institute of Technology (B.H.U.), Varanasi, UP 221005 India
| | - Anurag T. K. Baidya
- Department of Pharmaceutical Engineering and Technology, Indian Institute of Technology (B.H.U.), Varanasi, UP 221005 India
| | - Abhimanyu Mandal
- Department of Pharmaceutical Engineering and Technology, Indian Institute of Technology (B.H.U.), Varanasi, UP 221005 India
| | - Alen T. Mathew
- Department of Pharmaceutical Engineering and Technology, Indian Institute of Technology (B.H.U.), Varanasi, UP 221005 India
| | - Bhanuranjan Das
- Department of Pharmaceutical Engineering and Technology, Indian Institute of Technology (B.H.U.), Varanasi, UP 221005 India
| | - Bharti Devi
- Department of Pharmaceutical Engineering and Technology, Indian Institute of Technology (B.H.U.), Varanasi, UP 221005 India
| | - Rajnish Kumar
- Department of Pharmaceutical Engineering and Technology, Indian Institute of Technology (B.H.U.), Varanasi, UP 221005 India
| |
|
91
|
Gurvic D, Leach AG, Zachariae U. Data-Driven Derivation of Molecular Substructures That Enhance Drug Activity in Gram-Negative Bacteria. J Med Chem 2022; 65:6088-6099. [PMID: 35427114 PMCID: PMC9059115 DOI: 10.1021/acs.jmedchem.1c01984] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Received: 11/18/2021] [Indexed: 11/28/2022]
Abstract
The complex cell envelope of Gram-negative bacteria creates a formidable barrier to antibiotic influx. Reduced drug uptake impedes drug development and contributes to a wide range of drug-resistant bacterial infections, including those caused by extremely resistant species prioritized by the World Health Organization. To develop new and efficient treatments, a better understanding of the molecular features governing Gram-negative permeability is essential. Here, we present a data-driven approach, using matched molecular pair analysis and machine learning on minimal inhibitory concentration data from Gram-positive and Gram-negative bacteria to uncover chemical features that influence Gram-negative bioactivity. We find recurring chemical moieties, of a wider range than previously known, that consistently improve activity and suggest that this insight can be used to optimize compounds for increased Gram-negative uptake. Our findings may help to expand the chemical space of broad-spectrum antibiotics and aid the search for new antibiotic compound classes.
Affiliation(s)
- Dominik Gurvic
- Computational Biology, School of Life Sciences, University of Dundee, Dow Street, Dundee DD1 5EH, United Kingdom
| | - Andrew G. Leach
- Division of Pharmacy and Optometry, University of Manchester, Oxford Road, Manchester M13 9PL, United Kingdom
- Medchemica Limited, Mereside, Alderley Park, Macclesfield, SK10 4TG, United Kingdom
| | - Ulrich Zachariae
- Computational Biology, School of Life Sciences, University of Dundee, Dow Street, Dundee DD1 5EH, United Kingdom
| |
|
92
|
Alqahtani A. Application of Artificial Intelligence in Discovery and Development of Anticancer and Antidiabetic Therapeutic Agents. EVIDENCE-BASED COMPLEMENTARY AND ALTERNATIVE MEDICINE : ECAM 2022; 2022:6201067. [PMID: 35509623 PMCID: PMC9060979 DOI: 10.1155/2022/6201067] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 03/04/2022] [Revised: 03/17/2022] [Accepted: 04/05/2022] [Indexed: 11/18/2022]
Abstract
Spectacular developments in molecular and cellular biology have led to important discoveries in cancer research. Cancer is one of the major causes of morbidity and mortality globally, and diabetes is likewise one of the leading groups of disorders. Artificial intelligence (AI) has been considered a driver of the fourth industrial revolution. The major hurdles in drug discovery and development are the time and expenditure required to sustain the drug research pipeline. Large amounts of data can be explored and generated by AI and then converted into useful knowledge. Because of this, the world's largest drug companies have already begun to use AI in their drug development research. In the present era, AI has huge potential for the rapid discovery and development of new anticancer drugs. Clinical studies, electronic medical records, high-resolution medical imaging, and genomic assessments are just a few of the tools that could aid drug development. Large data sets are available to researchers in the pharmaceutical and medical fields, which can be analyzed by advanced AI systems. This review looked at how computational biology and AI technologies may be utilized in cancer precision drug development by combining knowledge of cancer medicines, drug resistance, and structural biology. This review also presented a realistic assessment of the potential for AI in understanding and managing diabetes.
Affiliation(s)
- Amal Alqahtani
- College of Medicine, Imam Abdulrahman Bin Faisal University, Dammam, 31541, Saudi Arabia
- Department of Basic Sciences, Deanship of Preparatory Year and Supporting Studies, Imam Abdulrahman Bin Faisal University, P.O. Box 1982, Dammam 34212, Saudi Arabia
| |
|
93
|
Periwal V, Bassler S, Andrejev S, Gabrielli N, Patil KR, Typas A, Patil KR. Bioactivity assessment of natural compounds using machine learning models trained on target similarity between drugs. PLoS Comput Biol 2022; 18:e1010029. [PMID: 35468126 PMCID: PMC9071136 DOI: 10.1371/journal.pcbi.1010029] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/06/2021] [Revised: 05/05/2022] [Accepted: 03/17/2022] [Indexed: 11/19/2022]
Abstract
Natural compounds constitute a rich resource of potential small molecule therapeutics. While experimental access to this resource is limited due to its vast diversity and difficulties in systematic purification, computational assessment of structural similarity with known therapeutic molecules offers a scalable approach. Here, we assessed functional similarity between natural compounds and approved drugs by combining multiple chemical similarity metrics and physicochemical properties using a machine-learning approach. We computed pairwise similarities between 1410 drugs for training classification models and used the drugs' shared protein targets as class labels. The best-performing models were random forests, which gave an average area under the ROC curve of 0.9, a Matthews correlation coefficient of 0.35, and an F1 score of 0.33, suggesting that they captured the structure-activity relation well. The models were then used to predict protein targets of circa 11k natural compounds by comparing them with the drugs. This revealed the therapeutic potential of several natural compounds, including those with support from previously published sources as well as those hitherto unexplored. We experimentally validated the activity of one of the predicted pairs, viz., Cox-1 inhibition by 5-methoxysalicylic acid, a molecule commonly found in tea, herbs and spices. In contrast, another natural compound, 4-isopropylbenzoic acid, which had the highest similarity score under the most heavily weighted similarity metric but was not picked by our models, did not inhibit Cox-1. Our results demonstrate the utility of a machine-learning approach combining multiple chemical features for uncovering the protein binding potential of natural compounds.
Affiliation(s)
- Vinita Periwal
- European Molecular Biology Laboratory, Heidelberg, Germany
- Medical Research Council Toxicology Unit, University of Cambridge, Cambridge, United Kingdom
| | - Stefan Bassler
- European Molecular Biology Laboratory, Heidelberg, Germany
- Faculty of Biosciences, Heidelberg University, Heidelberg, Germany
| | | | | | - Kaustubh Raosaheb Patil
- Institute of Neuroscience and Medicine (INM-7), Jülich, Germany
- Institute of Systems Neuroscience, Medical Faculty, Heinrich Heine University, Düsseldorf, Germany
| | | | - Kiran Raosaheb Patil
- European Molecular Biology Laboratory, Heidelberg, Germany
- Medical Research Council Toxicology Unit, University of Cambridge, Cambridge, United Kingdom
- * E-mail:
| |
|
94
|
Rodríguez-Pérez R, Miljković F, Bajorath J. Machine Learning in Chemoinformatics and Medicinal Chemistry. Annu Rev Biomed Data Sci 2022; 5:43-65. [PMID: 35440144 DOI: 10.1146/annurev-biodatasci-122120-124216] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Indexed: 11/09/2022]
Abstract
In chemoinformatics and medicinal chemistry, machine learning has evolved into an important approach. In recent years, increasing computational resources and new deep learning algorithms have put machine learning onto a new level, addressing previously unmet challenges in pharmaceutical research. In silico approaches for compound activity predictions, de novo design, and reaction modeling have been further advanced by new algorithmic developments and the emergence of big data in the field. Herein, novel applications of machine learning and deep learning in chemoinformatics and medicinal chemistry are reviewed. Opportunities and challenges for new methods and applications are discussed, placing emphasis on proper baseline comparisons, robust validation methodologies, and new applicability domains. Expected final online publication date for the Annual Review of Biomedical Data Science, Volume 5 is August 2022. Please see http://www.annualreviews.org/page/journal/pubdates for revised estimates.
Affiliation(s)
- Raquel Rodríguez-Pérez
- Department of Life Science Informatics, B-IT (Bonn-Aachen International Center for Information Technology), Chemical Biology and Medicinal Chemistry Program Unit, LIMES (Life and Medical Sciences Institute), Rheinische Friedrich-Wilhelms-Universität, Bonn, Germany. Current affiliation: Novartis Institutes for Biomedical Research, Novartis Campus, Basel, Switzerland
| | - Filip Miljković
- Department of Life Science Informatics, B-IT (Bonn-Aachen International Center for Information Technology), Chemical Biology and Medicinal Chemistry Program Unit, LIMES (Life and Medical Sciences Institute), Rheinische Friedrich-Wilhelms-Universität, Bonn, Germany. Current affiliation: Data Science and AI, Imaging and Data Analytics, Clinical Pharmacology and Safety Sciences, R&D AstraZeneca, Gothenburg, Sweden
| | - Jürgen Bajorath
- Department of Life Science Informatics, B-IT (Bonn-Aachen International Center for Information Technology), Chemical Biology and Medicinal Chemistry Program Unit, LIMES (Life and Medical Sciences Institute), Rheinische Friedrich-Wilhelms-Universität, Bonn, Germany
| |
|
95
|
In Silico Identification of Novel Inhibitors Targeting the Homodimeric Interface of Superoxide Dismutase from the Dental Pathogen Streptococcus mutans. Antioxidants (Basel) 2022; 11:antiox11040785. [PMID: 35453470 PMCID: PMC9029323 DOI: 10.3390/antiox11040785] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 03/16/2022] [Revised: 04/10/2022] [Accepted: 04/12/2022] [Indexed: 02/04/2023]
Abstract
The microaerophile Streptococcus mutans, the main organism responsible for the development of dental plaque, has a single cambialistic superoxide dismutase (SmSOD) for its protection against reactive oxygen species. In order to discover novel inhibitors of SmSOD, possibly interfering with biofilm formation by this pathogen, a virtual screening study was carried out using the available 3D structure of SmSOD. Among the selected molecules, compound ALS-31 was capable of inhibiting SmSOD with an IC50 value of 159 µM. Its inhibition power was affected by the Fe/Mn ratio in the active site of SmSOD. Furthermore, ALS-31 also inhibited the activity of other SODs. Gel filtration of SmSOD in the presence of ALS-31 showed that the compound provoked the dissociation of the SmSOD homodimer into two monomers, thus compromising the catalytic activity of the enzyme. A docking model, showing the binding mode of ALS-31 at the dimer interface of SmSOD, is presented. Cell viability of the fibroblast cell line BJ5-ta was not affected up to 100 µM ALS-31. A preliminary lead optimization program allowed the identification of one derivative, ALS-31-9, endowed with a 2.5-fold improved inhibition power. Interestingly, below this concentration, planktonic growth and biofilm formation of S. mutans cultures were inhibited by ALS-31, and even more by its derivative, thus opening the perspective of future drug design studies to fight against dental caries.
|
96
|
Cantrell JM, Chung CH, Chandrasekaran S. Machine learning to design antimicrobial combination therapies: promises and pitfalls. Drug Discov Today 2022; 27:1639-1651. [DOI: 10.1016/j.drudis.2022.04.006] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 11/03/2021] [Revised: 01/20/2022] [Accepted: 04/04/2022] [Indexed: 01/13/2023]
|
97
|
Arif SM, Floto RA, Blundell TL. Using Structure-guided Fragment-Based Drug Discovery to Target Pseudomonas aeruginosa Infections in Cystic Fibrosis. Front Mol Biosci 2022; 9:857000. [PMID: 35433835 PMCID: PMC9006449 DOI: 10.3389/fmolb.2022.857000] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/18/2022] [Accepted: 02/23/2022] [Indexed: 11/13/2022]
Abstract
Cystic fibrosis (CF) is a progressive genetic disease that predisposes lungs and other organs to multiple long-lasting microbial infections. Pseudomonas aeruginosa is the most prevalent and deadly pathogen among these microbes. Lung function of CF patients worsens following chronic infections with P. aeruginosa, which is associated with increased mortality and morbidity. The emergence of multidrug-resistant, extensively drug-resistant and pandrug-resistant strains of P. aeruginosa, due to intrinsic and adaptive antibiotic resistance mechanisms, has rendered current anti-pseudomonal antibiotics ineffective. Hence, new antibacterials are urgently needed to treat P. aeruginosa infections. Structure-guided fragment-based drug discovery (FBDD) is a powerful approach in the field of drug development that has succeeded in delivering six FDA-approved drugs over the past 20 years, targeting a variety of biological molecules. However, FBDD has not been widely used in the development of anti-pseudomonal molecules. In this review, we first give a brief overview of our structure-guided FBDD pipeline and then give a detailed account of FBDD campaigns to combat P. aeruginosa infections by developing small molecules having either bactericidal or anti-virulence properties. We conclude with a brief overview of the FBDD efforts in our lab at the University of Cambridge towards targeting P. aeruginosa infections.
Affiliation(s)
| | - R. Andres Floto
- Molecular Immunity Unit, Department of Medicine, University of Cambridge, MRC-Laboratory of Molecular Biology, Cambridge, United Kingdom
- Cambridge Centre for Lung Infection, Royal Papworth Hospital, Cambridge, United Kingdom
| | - Tom L. Blundell
- Department of Biochemistry, University of Cambridge, Cambridge, United Kingdom
- *Correspondence: Tom L. Blundell,
| |
|
98
|
Alibakhshi A, Hartke B. Implicitly perturbed Hamiltonian as a class of versatile and general-purpose molecular representations for machine learning. Nat Commun 2022; 13:1245. [PMID: 35273170 PMCID: PMC8913769 DOI: 10.1038/s41467-022-28912-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/09/2021] [Accepted: 02/01/2022] [Indexed: 11/28/2022]
Abstract
Unraveling challenging problems by machine learning has recently become a hot topic in many scientific disciplines. For developing rigorous machine-learning models to study problems of interest in molecular sciences, translating molecular structures into quantitative representations suitable as machine-learning inputs plays a central role. Many existing molecular representations, including state-of-the-art ones, although efficient in studying numerous molecular features, are still suboptimal in many challenging cases, as discussed in the context of the present research. The main aim of the present study is to introduce the Implicitly Perturbed Hamiltonian (ImPerHam) as a class of versatile representations for more efficient machine learning of challenging problems in molecular sciences. ImPerHam representations are defined as energy attributes of the molecular Hamiltonian, implicitly perturbed by a number of hypothetical or real arbitrary solvents based on continuum solvation models. We demonstrate the outstanding performance of machine-learning models based on ImPerHam representations for three diverse and challenging cases: predicting inhibition of the CYP450 enzyme, high-precision and transferable evaluation of the non-covalent interaction energy of molecular systems, and accurate reproduction of solvation free energies for large benchmark sets. Molecular representations are fundamental tools for machine-learning models. The current work introduces a new set of molecular representations demonstrated to enable accurate predictions of molecular conformational energy and solvation free energy.
Affiliation(s)
- Amin Alibakhshi
- Theoretical Chemistry, Institute for Physical Chemistry, Christian-Albrechts-University, Olshausenstr. 40, Kiel, Germany
| | - Bernd Hartke
- Theoretical Chemistry, Institute for Physical Chemistry, Christian-Albrechts-University, Olshausenstr. 40, Kiel, Germany
| |
|
99
|
Kalezhi J, Chibuluma M, Chembe C, Chama V, Lungo F, Kunda D. Modelling Covid-19 infections in Zambia using data mining techniques. RESULTS IN ENGINEERING 2022; 13:100363. [PMID: 35317385 PMCID: PMC8813672 DOI: 10.1016/j.rineng.2022.100363] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/29/2021] [Revised: 01/08/2022] [Accepted: 02/01/2022] [Indexed: 06/14/2023]
Abstract
The outbreak of the Covid-19 pandemic has been declared a global health crisis by the World Health Organization since its emergence. Several researchers have proposed a number of techniques to understand how the pandemic affects populations. Reported among these techniques are data mining models, which had been successfully applied in a wide range of situations before the advent of the Covid-19 pandemic. In this work, the researchers have applied a number of existing data mining methods (classifiers) available in the Waikato Environment for Knowledge Analysis (WEKA) machine learning library. WEKA was used to gain a better understanding of how the epidemic spread within Zambia. The classifiers used are the J48 decision tree, Multilayer Perceptron and Naïve Bayes, among others. The predictions of these techniques are compared against simpler classifiers and those reported in related works.
Affiliation(s)
- Josephat Kalezhi
- Department of Computer Engineering, Copperbelt University, Kitwe, Zambia
| | - Mathews Chibuluma
- Department of Information Technology/Systems, Copperbelt University, Kitwe, Zambia
| | | | - Victoria Chama
- Department of Computer Science and Information Technology, Mulungushi University, Kabwe, Zambia
| | - Francis Lungo
- School of Social Sciences, Mulungushi University, Kabwe, Zambia
| | | |
|
100
|
Staszak M, Staszak K, Wieszczycka K, Bajek A, Roszkowski K, Tylkowski B. Machine learning in drug design: Use of artificial intelligence to explore the chemical structure–biological activity relationship. WIRES COMPUTATIONAL MOLECULAR SCIENCE 2022. [DOI: 10.1002/wcms.1568] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Indexed: 12/13/2022]
Affiliation(s)
- Maciej Staszak
- Institute of Technology and Chemical Engineering, Poznan University of Technology, Poznan, Poland
| | - Katarzyna Staszak
- Institute of Technology and Chemical Engineering, Poznan University of Technology, Poznan, Poland
| | - Karolina Wieszczycka
- Institute of Technology and Chemical Engineering, Poznan University of Technology, Poznan, Poland
| | - Anna Bajek
- Department of Tissue Engineering, Collegium Medicum, Nicolaus Copernicus University, Bydgoszcz, Poland
| | - Krzysztof Roszkowski
- Department of Oncology, Collegium Medicum, Nicolaus Copernicus University, Bydgoszcz, Poland
| | - Bartosz Tylkowski
- Department of Chemical Engineering, University Rovira i Virgili, Tarragona, Spain
- Eurecat, Centre Tecnològic de Catalunya, Chemical Technologies Unit, Tarragona, Spain
| |
|