1
Lin M, Cai J, Wei Y, Peng X, Luo Q, Li B, Chen Y, Wang L. MalariaFlow: A comprehensive deep learning platform for multistage phenotypic antimalarial drug discovery. Eur J Med Chem 2024; 277:116776. [PMID: 39173285 DOI: 10.1016/j.ejmech.2024.116776] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/11/2024] [Revised: 07/31/2024] [Accepted: 08/01/2024] [Indexed: 08/24/2024]
Abstract
Malaria remains a significant global health challenge due to the growing drug resistance of Plasmodium parasites and the failure to block transmission within the human host. While machine learning (ML) and deep learning (DL) methods have shown promise in accelerating antimalarial drug discovery, the performance of deep learning models based on molecular graphs and other co-representation approaches warrants further exploration. Current research has overlooked mutant strains of the malaria parasite with varying degrees of sensitivity or resistance, and has not covered the prediction of inhibitory activities across the three major life cycle stages (liver, asexual blood, and gametocyte) within the human host, which is crucial for both treatment and transmission blocking. In this study, we manually curated a benchmark antimalarial activity dataset comprising 407,404 unique compounds and 410,654 bioactivity data points across ten Plasmodium phenotypes and three stages. We systematically compared two fingerprint-based ML models (RF::Morgan and XGBoost::Morgan), four graph-based DL models (GCN, GAT, MPNN, and Attentive FP), and three co-representation DL models (FP-GNN, HiGNN, and FG-BERT). The comparison revealed that: 1) the FP-GNN model achieved the best predictive performance, outperforming the other methods in distinguishing active and inactive compounds across balanced, more positive, and more negative datasets, with an overall AUROC of 0.900; 2) fingerprint-based ML models outperformed graph-based DL models on large datasets (>1000 compounds), but the three co-representation DL models were able to incorporate domain-specific chemical knowledge to bridge this gap, achieving better predictive performance. These findings provide valuable guidance for selecting appropriate ML and DL methods for antimalarial activity prediction tasks.
The interpretability analysis of the FP-GNN model revealed its ability to accurately capture the key structural features responsible for the liver- and blood-stage activities of the known antimalarial drug atovaquone. Finally, we developed a web server, MalariaFlow, incorporating these high-quality models for antimalarial activity prediction, virtual screening, and similarity search, successfully predicting novel triple-stage antimalarial hits validated through experimental testing, demonstrating its effectiveness and value in discovering potential multistage antimalarial drug candidates.
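The AUROC reported in this abstract can be read as the probability that a randomly chosen active compound receives a higher score than a randomly chosen inactive one. The following is a minimal Python sketch of that rank-based calculation. The function name, labels, and scores are made-up illustrations and are not taken from the MalariaFlow benchmark or its code.

```python
def auroc(labels, scores):
    """Area under the ROC curve by pairwise comparison (ties count as 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]  # scores of actives
    neg = [s for y, s in zip(labels, scores) if y == 0]  # scores of inactives
    if not pos or not neg:
        raise ValueError("need at least one active and one inactive compound")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy data (assumed for illustration only): 3 actives, 4 inactives.
toy_labels = [1, 1, 1, 0, 0, 0, 0]
toy_scores = [0.92, 0.81, 0.40, 0.55, 0.30, 0.22, 0.10]
toy_auroc = auroc(toy_labels, toy_scores)
print(f"AUROC = {toy_auroc:.3f}")
```

The double loop is quadratic and only meant to make the definition explicit; for datasets of the size described here, a sort-based implementation from an established library would be the practical choice.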
Affiliation(s)
- Mujie Lin
- School of Biology and Biological Engineering, South China University of Technology, Guangzhou, 510006, China
- Junxi Cai
- School of Civil Engineering and Transportation, South China University of Technology, Guangzhou, 510006, China
- Yuancheng Wei
- School of Software Engineering, South China University of Technology, Guangzhou, 510006, China
- Xinru Peng
- School of Software Engineering, South China University of Technology, Guangzhou, 510006, China
- Qianhui Luo
- School of Biology and Biological Engineering, South China University of Technology, Guangzhou, 510006, China
- Biaoshun Li
- School of Biology and Biological Engineering, South China University of Technology, Guangzhou, 510006, China
- Yihao Chen
- School of Biology and Biological Engineering, South China University of Technology, Guangzhou, 510006, China
- Ling Wang
- School of Biology and Biological Engineering, South China University of Technology, Guangzhou, 510006, China.

2
Madushanka A, Laird E, Clark C, Kraka E. SmartCADD: AI-QM Empowered Drug Discovery Platform with Explainability. J Chem Inf Model 2024; 64:6799-6813. [PMID: 39177478 DOI: 10.1021/acs.jcim.4c00720] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 08/24/2024]
Abstract
Artificial intelligence (AI) has emerged as a pivotal force in enhancing productivity across various sectors, with its impact being profoundly felt within the pharmaceutical and biotechnology domains. Despite AI's rapid adoption, its integration into scientific research faces resistance due to myriad challenges: the opaqueness of AI models, the intricate nature of their implementation, and the issue of data scarcity. In response to these impediments, we introduce SmartCADD, an innovative, open-source virtual screening platform that combines deep learning, computer-aided drug design (CADD), and quantum mechanics methodologies within a user-friendly Python framework. SmartCADD is engineered to streamline the construction of comprehensive virtual screening workflows that incorporate a variety of formerly independent techniques, spanning ADMET property predictions, de novo 2D and 3D pharmacophore modeling, molecular docking, and the integration of explainable AI mechanisms. This manuscript highlights the foundational principles, key functionalities, and the unique integrative approach of SmartCADD. Furthermore, we demonstrate its efficacy through a case study focused on the identification of promising lead compounds for HIV inhibition. By democratizing access to advanced AI and quantum mechanics tools, SmartCADD stands as a catalyst for progress in pharmaceutical research and development, heralding a new era of innovation and efficiency.
Affiliation(s)
- Ayesh Madushanka
- Department of Chemistry, Southern Methodist University, Dallas, Texas 75205, United States
- Eli Laird
- Department of Computer Science, Southern Methodist University, Dallas, Texas 75205, United States
- Corey Clark
- Department of Computer Science, Southern Methodist University, Dallas, Texas 75205, United States
- Elfi Kraka
- Department of Chemistry, Southern Methodist University, Dallas, Texas 75205, United States

3
Srinivasan K, Puliyanda A, Prasad V. Identification of Reaction Network Hypotheses for Complex Feedstocks from Spectroscopic Measurements with Minimal Human Intervention. J Phys Chem A 2024; 128:4714-4729. [PMID: 38836378 DOI: 10.1021/acs.jpca.4c01592] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/06/2024]
Abstract
In this work, we detail an automated reaction network hypothesis generation protocol for processes involving complex feedstocks where information about the species and reactions involved is unknown. Our methodology is process agnostic and can be utilized in any reactive process with spectroscopic measurements that provide information on the evolution of the components in the mixture. We decompose the mixture spectra to obtain spectroscopic signatures of the individual components and use a 1-D convolutional neural network to automatically identify functional groups indicated by them. We employ atom-atom mapping to automatically recover reaction rules that are applied on candidate molecules identified from chemistry databases through fingerprint similarity. The method is tested on synthetic data and on spectroscopic measurements of lab-scale batch hydrothermal liquefaction (HTL) of biomass to determine the accuracy of prediction across datasets of varying complexities. Our methodology is able to identify reaction network hypotheses containing reaction networks close to the ground truth in the case of synthetic data, and we are also able to recover candidate molecules and reaction networks close to the ones reported in the previous literature studies for biomass pyrolysis.
Affiliation(s)
- Karthik Srinivasan
- Department of Chemical and Materials Engineering, Donadeo Innovation Centre for Engineering, 9211, 116st NW, Edmonton T6G 1H9, AB, Canada
- Anjana Puliyanda
- Department of Chemical and Materials Engineering, Donadeo Innovation Centre for Engineering, 9211, 116st NW, Edmonton T6G 1H9, AB, Canada
- Vinay Prasad
- Department of Chemical and Materials Engineering, Donadeo Innovation Centre for Engineering, 9211, 116st NW, Edmonton T6G 1H9, AB, Canada

4
López-Pérez K, Kim TD, Miranda-Quintana RA. iSIM: instant similarity. DIGITAL DISCOVERY 2024; 3:1160-1171. [PMID: 38873032 PMCID: PMC11167700 DOI: 10.1039/d4dd00041b] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/06/2024] [Accepted: 05/06/2024] [Indexed: 06/15/2024]
Abstract
The quantification of molecular similarity has been present since the beginning of cheminformatics. Although several similarity indices and molecular representations have been reported, all of them ultimately reduce to calculating the similarity of only two objects at a time. Hence, to obtain the average similarity of a set of molecules, all pairwise comparisons need to be computed, so the computational cost scales quadratically with the number of molecules. Here we propose an exact alternative that avoids this problem: iSIM (instant similarity). iSIM compares multiple molecules at the same time and yields the same value as the average of the pairwise comparisons for molecules represented by binary fingerprints and real-value descriptors. In this work, we introduce the mathematical framework and several applications of iSIM in chemical sampling, visualization, diversity selection, and clustering.
Affiliation(s)
- Kenneth López-Pérez
- Department of Chemistry and Quantum Theory Project, University of Florida Gainesville Florida 32611 USA
- Taewon D Kim
- Department of Chemistry and Quantum Theory Project, University of Florida Gainesville Florida 32611 USA

5
Liu H, Chen P, Hu B, Wang S, Wang H, Luan J, Wang J, Lin B, Cheng M. FaissMolLib: An efficient and easy deployable tool for ligand-based virtual screening. Comput Biol Chem 2024; 110:108057. [PMID: 38581840 DOI: 10.1016/j.compbiolchem.2024.108057] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/23/2023] [Revised: 03/06/2024] [Accepted: 03/20/2024] [Indexed: 04/08/2024]
Abstract
Virtual screening based on molecular similarity and fingerprints is crucial in drug design, target prediction, and ADMET prediction, aiding in identifying potential hits and optimizing lead compounds. However, challenges such as the lack of comprehensive open-source molecular fingerprint databases and of efficient search methods for virtual screening are prevalent. To address these issues, we introduce FaissMolLib, an open-source virtual screening tool that integrates 2.8 million compounds from the ChEMBL and ZINC databases. Notably, FaissMolLib employs the highly efficient Faiss search algorithm, outperforming the Tanimoto algorithm in identifying similar molecules, with tighter clustering in scatter plots and lower mean, standard deviation, and variance in key molecular properties. This feature enables FaissMolLib to screen 2.8 million compounds in just 0.05 seconds, offering researchers an efficient, easily deployable solution for virtual screening on laptops and for building unique compound databases. This significant advancement holds great potential for accelerating drug discovery efforts and enhancing chemical data analysis. FaissMolLib is freely available at http://liuhaihan.gnway.cc:80. The code and dataset of FaissMolLib are freely available at https://github.com/Superhaihan/FiassMolLib.
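For orientation, the Tanimoto baseline that this abstract compares against can be sketched as a brute-force search in a few lines of Python. This is not FaissMolLib's Faiss-based index and does not reproduce its results. Fingerprints are represented here as sets of on-bit indices, and the library contents, molecule names, and function names are hypothetical.

```python
def tanimoto(a, b):
    """Tanimoto coefficient between two sets of on-bit indices."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def top_k_similar(query, library, k=2):
    """Score every library entry against the query and keep the k best."""
    ranked = sorted(
        library.items(),
        key=lambda item: tanimoto(query, item[1]),
        reverse=True,
    )
    return [(name, round(tanimoto(query, fp), 3)) for name, fp in ranked[:k]]


# Toy library (assumed for illustration only).
toy_library = {
    "mol_a": frozenset({1, 4, 7, 9}),
    "mol_b": frozenset({1, 4, 8}),
    "mol_c": frozenset({2, 3, 5}),
    "mol_d": frozenset({1, 4, 7, 9, 12}),
}
query_fp = frozenset({1, 4, 7, 9})
hits = top_k_similar(query_fp, toy_library, k=2)
print(hits)
```

Because every query scans the whole library, the cost of this baseline grows linearly with library size, which is the motivation for index-based search tools in large-scale screening.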
Affiliation(s)
- Haihan Liu
- Key Laboratory of Structure-Based Drug Design &Discovery of Ministry of Education, Shenyang Pharmaceutical University, Shenyang 110016, People's Republic of China; Key Laboratory of Intelligent Drug Design and New Drug Discovery of Liaoning Province, Shenyang Pharmaceutical University, Shenyang 110016, People's Republic of China; School of Pharmaceutical Engineering, Shenyang Pharmaceutical University, Shenyang 110016, People's Republic of China
- Peiying Chen
- Key Laboratory of Structure-Based Drug Design &Discovery of Ministry of Education, Shenyang Pharmaceutical University, Shenyang 110016, People's Republic of China; Key Laboratory of Intelligent Drug Design and New Drug Discovery of Liaoning Province, Shenyang Pharmaceutical University, Shenyang 110016, People's Republic of China; School of Pharmaceutical Engineering, Shenyang Pharmaceutical University, Shenyang 110016, People's Republic of China
- Baichun Hu
- Key Laboratory of Structure-Based Drug Design &Discovery of Ministry of Education, Shenyang Pharmaceutical University, Shenyang 110016, People's Republic of China; Key Laboratory of Intelligent Drug Design and New Drug Discovery of Liaoning Province, Shenyang Pharmaceutical University, Shenyang 110016, People's Republic of China; School of Pharmaceutical Engineering, Shenyang Pharmaceutical University, Shenyang 110016, People's Republic of China
- Shizun Wang
- Key Laboratory of Structure-Based Drug Design &Discovery of Ministry of Education, Shenyang Pharmaceutical University, Shenyang 110016, People's Republic of China; Key Laboratory of Intelligent Drug Design and New Drug Discovery of Liaoning Province, Shenyang Pharmaceutical University, Shenyang 110016, People's Republic of China; School of Pharmaceutical Engineering, Shenyang Pharmaceutical University, Shenyang 110016, People's Republic of China
- Hanxun Wang
- Key Laboratory of Structure-Based Drug Design &Discovery of Ministry of Education, Shenyang Pharmaceutical University, Shenyang 110016, People's Republic of China; Key Laboratory of Intelligent Drug Design and New Drug Discovery of Liaoning Province, Shenyang Pharmaceutical University, Shenyang 110016, People's Republic of China; School of Pharmaceutical Engineering, Shenyang Pharmaceutical University, Shenyang 110016, People's Republic of China
- Jiasi Luan
- Key Laboratory of Structure-Based Drug Design &Discovery of Ministry of Education, Shenyang Pharmaceutical University, Shenyang 110016, People's Republic of China; Key Laboratory of Intelligent Drug Design and New Drug Discovery of Liaoning Province, Shenyang Pharmaceutical University, Shenyang 110016, People's Republic of China; School of Medical Devices, Shenyang Pharmaceutical University, Shenyang 110016, People's Republic of China
- Jian Wang
- Key Laboratory of Structure-Based Drug Design &Discovery of Ministry of Education, Shenyang Pharmaceutical University, Shenyang 110016, People's Republic of China; Key Laboratory of Intelligent Drug Design and New Drug Discovery of Liaoning Province, Shenyang Pharmaceutical University, Shenyang 110016, People's Republic of China; School of Pharmaceutical Engineering, Shenyang Pharmaceutical University, Shenyang 110016, People's Republic of China.
- Bin Lin
- Key Laboratory of Structure-Based Drug Design &Discovery of Ministry of Education, Shenyang Pharmaceutical University, Shenyang 110016, People's Republic of China; Key Laboratory of Intelligent Drug Design and New Drug Discovery of Liaoning Province, Shenyang Pharmaceutical University, Shenyang 110016, People's Republic of China; School of Pharmaceutical Engineering, Shenyang Pharmaceutical University, Shenyang 110016, People's Republic of China.
- Maosheng Cheng
- Key Laboratory of Structure-Based Drug Design &Discovery of Ministry of Education, Shenyang Pharmaceutical University, Shenyang 110016, People's Republic of China; Key Laboratory of Intelligent Drug Design and New Drug Discovery of Liaoning Province, Shenyang Pharmaceutical University, Shenyang 110016, People's Republic of China; School of Pharmaceutical Engineering, Shenyang Pharmaceutical University, Shenyang 110016, People's Republic of China.

6
Sardar S, Bhattacharya A, Amin SA, Jha T, Gayen S. Exploring molecular fingerprints of different drugs having bile interaction: a stepping stone towards better drug delivery. Mol Divers 2024; 28:1471-1483. [PMID: 37369957 DOI: 10.1007/s11030-023-10670-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/20/2023] [Accepted: 06/10/2023] [Indexed: 06/29/2023]
Abstract
Bile acids are amphiphilic substances produced naturally in humans. In the context of drug delivery and dosage form design, it is critical to understand whether or not a drug interacts with bile inside the gastrointestinal (GI) tract. This study focuses on the identification of structural fingerprints/features important for bile interaction. Molecular modelling methods such as Bayesian classification and recursive partitioning (RP) were used to find fingerprints/features important for bile interaction. For the Bayesian classification study, ROC scores of 0.837 and 0.950 were found for the training set and the test set compounds, respectively. Features such as the fluorine-containing aliphatic/aromatic group, the branched alkyl chain containing a hydroxyl moiety, and the phenothiazine ring were identified as good fingerprints with a positive contribution towards bile interactions, whereas bad fingerprints such as the free carboxylate group and the purine and pyrimidine rings have a negative contribution towards bile interactions. The best tree (tree ID: 1) from the RP study classifies bile-interacting and non-interacting compounds with a ROC score of 0.941 for the training set and 0.875 for the test set. Additionally, SARpy and QSAR-Co analyses have also been performed to classify compounds as bile interacting/non-interacting. Moreover, forty-six recently FDA-approved drugs have been screened by the developed SARpy and QSAR-Co models to assess their bile interaction properties. Overall, this work may help researchers identify bile interacting/non-interacting molecules more quickly and help in the design of formulations and target-specific drug development.
Affiliation(s)
- Sourav Sardar
- Laboratory of Drug Design and Discovery, Department of Pharmaceutical Technology, Jadavpur University, Kolkata, 700032, India
- Arijit Bhattacharya
- Laboratory of Drug Design and Discovery, Department of Pharmaceutical Technology, Jadavpur University, Kolkata, 700032, India
- Sk Abdul Amin
- Department of Pharmaceutical Technology, JIS University, 81, Nilgunj Road, Agarpara, Kolkata, West Bengal, India
- Tarun Jha
- Natural Science Laboratory, Division of Medicinal and Pharmaceutical Chemistry, Department of Pharmaceutical Technology, Jadavpur University, Kolkata, 700032, India.
- Shovanlal Gayen
- Laboratory of Drug Design and Discovery, Department of Pharmaceutical Technology, Jadavpur University, Kolkata, 700032, India.

7
Vogt M. Chemoinformatic approaches for navigating large chemical spaces. Expert Opin Drug Discov 2024; 19:403-414. [PMID: 38300511 DOI: 10.1080/17460441.2024.2313475] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/12/2023] [Accepted: 01/30/2024] [Indexed: 02/02/2024]
Abstract
INTRODUCTION: Large chemical spaces (CSs) include traditional large compound collections, combinatorial libraries covering billions to trillions of molecules, DNA-encoded chemical libraries comprising complete combinatorial CSs in a single mixture, and virtual CSs explored by generative models. The diverse nature of these types of CSs requires different chemoinformatic approaches for navigation.
AREAS COVERED: An overview of different types of large CSs is provided. Molecular representations and similarity metrics suitable for large CS exploration are discussed. A summary of navigation of CSs in generative models is provided. Methods for characterizing and comparing CSs are discussed.
EXPERT OPINION: The size of large CSs might restrict navigation to specialized algorithms and limit it to considering neighborhoods of structurally similar molecules. Efficient navigation of large CSs not only requires methods that scale with size but also requires smart approaches that focus on better but not necessarily larger molecule selections. Deep generative models aim to provide such approaches by implicitly learning features relevant for targeted biological properties. It is unclear whether these models can fulfill this ideal, as validation is difficult as long as the covered CSs remain mainly virtual without experimental verification.
Affiliation(s)
- Martin Vogt
- Department of Life Science Informatics, B-IT, LIMES Program Unit Chemical Biology and Medicinal Chemistry, Rheinische Friedrich-Wilhelms-Universität, Bonn, Germany

8
Avellaneda-Tamayo JF, Chávez-Hernández AL, Prado-Romero DL, Medina-Franco JL. Chemical Multiverse and Diversity of Food Chemicals. J Chem Inf Model 2024; 64:1229-1244. [PMID: 38356237 PMCID: PMC10900296 DOI: 10.1021/acs.jcim.3c01617] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/05/2023] [Revised: 02/03/2024] [Accepted: 02/06/2024] [Indexed: 02/16/2024]
Abstract
Food chemicals have a fundamental role in our lives, with an extended impact on nutrition, disease prevention, and marked economic implications in the food industry. The number of food chemical compounds in public databases has substantially increased in the past few years, and these compounds can be characterized using chemoinformatics approaches. We and other groups explored public food chemical libraries containing up to 26,500 compounds. This study aimed to analyze the chemical contents, diversity, and coverage in the chemical space of food chemicals and additives (hereafter, food components). The food components addressed in this study come from a public database with more than 70,000 compounds, including those predicted via omics techniques. It was concluded that food components have distinctive physicochemical properties and constitutional descriptors despite sharing many chemical structures with natural products. Food components, on average, have large molecular weights and several apolar structures with saturated hydrocarbons. Compared to reference databases, food component structures have low scaffold and fingerprint-based diversity and high structural complexity, as measured by the fraction of sp3 carbons. These structural features are associated with a large fraction of macronutrients as lipids. Lipids in food components were decompiled by an analysis of the maximum common substructures. The chemical multiverse representation of food chemicals showed a larger coverage of chemical space than natural products and FDA-approved drugs by using different sets of representations.
Affiliation(s)
- Juan F. Avellaneda-Tamayo
- DIFACQUIM Research Group, Department of Pharmacy, School of Chemistry, Universidad Nacional Autónoma de México, Avenida Universidad 3000, Mexico City 04510, Mexico
- Ana L. Chávez-Hernández
- DIFACQUIM Research Group, Department of Pharmacy, School of Chemistry, Universidad Nacional Autónoma de México, Avenida Universidad 3000, Mexico City 04510, Mexico
- Diana L. Prado-Romero
- DIFACQUIM Research Group, Department of Pharmacy, School of Chemistry, Universidad Nacional Autónoma de México, Avenida Universidad 3000, Mexico City 04510, Mexico
- José L. Medina-Franco
- DIFACQUIM Research Group, Department of Pharmacy, School of Chemistry, Universidad Nacional Autónoma de México, Avenida Universidad 3000, Mexico City 04510, Mexico

9
Siddharth T, Lewis NE. Predicting pathways for old and new metabolites through clustering. J Theor Biol 2024; 578:111684. [PMID: 38048983 PMCID: PMC11139542 DOI: 10.1016/j.jtbi.2023.111684] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/08/2023] [Revised: 11/17/2023] [Accepted: 11/29/2023] [Indexed: 12/06/2023]
Abstract
The diverse metabolic pathways are fundamental to all living organisms, as they harvest energy, synthesize biomass components, produce molecules to interact with the microenvironment, and neutralize toxins. While the discovery of new metabolites and pathways continues, the prediction of pathways for new metabolites can be challenging. It can take vast amounts of time to elucidate pathways for new metabolites; thus, according to HMDB (Human Metabolome Database), only 60% of metabolites get assigned to pathways. Here, we present an approach to identify pathways based on metabolite structure. We extracted 201 features from SMILES annotations and identified new metabolites from PubMed abstracts and HMDB. After applying clustering algorithms to both groups of features, we quantified correlations between metabolites, and found the clusters accurately linked 92% of known metabolites to their respective pathways. Thus, this approach could be valuable for predicting metabolic pathways for new metabolites.
Affiliation(s)
- Thiru Siddharth
- Department of Computer Science and Engineering, Indian Institute of Information Technology, Bhopal, MP 462003, India
- Nathan E Lewis
- Department of Pediatrics and Bioengineering, University of California San Diego, La Jolla, CA 92093, USA.

10
Barrera-Vázquez OS, Escobar-Ramírez JL, Santiago-Mejía J, Carrasco-Ortega OF, Magos-Guerrero GA. Discovering Potential Compounds for Venous Disease Treatment through Virtual Screening and Network Pharmacology Approach. Molecules 2023; 28:7937. [PMID: 38138427 PMCID: PMC10745828 DOI: 10.3390/molecules28247937] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/31/2023] [Revised: 11/28/2023] [Accepted: 11/30/2023] [Indexed: 12/24/2023]
Abstract
Peripheral venous hypertension has emerged as a prominent characteristic of venous disease (VD). This disease causes lower limb edema due to impaired blood transport in the veins. For the phlebotonic drugs in use, there is moderate evidence of a slight reduction in lower-leg edema and little or no difference in quality of life. To enhance the probability of favorable experimental results, a virtual screening procedure was employed to identify molecules with potential therapeutic activity in VD. Compounds obtained from multiple databases, namely AC Discovery, NuBBE, BIOFACQUIM, and InflamNat, were compared with reference compounds. The examination of structural similarity, targets, and signaling pathways in venous diseases allows for the identification of compounds with potential usefulness in VD. The computational tools employed were the rcdk and ChemmineR packages in RStudio, and Cytoscape. An extended fingerprint analysis allowed us to select 1846 of the 41,655 compounds compiled. Only 229 compounds showed pharmacological targets in the PubChem server, of which 84 molecules interacted with the VD network. Because of their descriptors and multi-target capacity, only 18 of the 84 molecules were identified as potential candidates for experimental evaluation. We opted to evaluate berberine because of its affordability and extensive literature support. The experiment showed the proposed activity in an acute venous hypertension model.
Affiliation(s)
- Gil Alfonso Magos-Guerrero
- Department of Pharmacology, Faculty of Medicine, National Autonomous University of Mexico (UNAM), Mexico City 04510, Mexico; (O.S.B.-V.); (J.L.E.-R.); (J.S.-M.); (O.F.C.-O.)

11
Li X, Yuan H, Wu X, Wang C, Wu M, Shi H, Lv Y. MultiDS-MDA: Integrating multiple data sources into heterogeneous network for predicting novel metabolite-drug associations. Comput Biol Med 2023; 162:107067. [PMID: 37276756 DOI: 10.1016/j.compbiomed.2023.107067] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/24/2023] [Revised: 05/15/2023] [Accepted: 05/27/2023] [Indexed: 06/07/2023]
Abstract
Metabolic processes in the human body play an important role in maintaining normal life activities, and abnormal concentrations of metabolites are closely related to the occurrence and development of diseases. The use of drugs is considered to have a major impact on metabolism, and drug metabolites can contribute to efficacy, drug toxicity, and drug-drug interactions. However, our understanding of metabolite-drug associations is far from complete, and individual data sources tend to be incomplete and noisy. Therefore, the integration of various types of data sources for inferring reliable metabolite-drug associations is urgently needed. In this study, we proposed a computational framework, MultiDS-MDA, for identifying metabolite-drug associations by integrating multiple data sources, including chemical structure information of metabolites and drugs; metabolite-gene, metabolite-disease, drug-gene, and drug-disease relationships; gene ontology (GO) and disease ontology (DO) data; and known metabolite-drug connections. The performance of MultiDS-MDA was evaluated by 5-fold cross-validation, which achieved an area under the ROC curve (AUROC) of 0.911 and an area under the precision-recall curve (AUPRC) of 0.907. Additionally, MultiDS-MDA showed outstanding performance compared with similar approaches. Case studies for three metabolites (cholesterol, thromboxane B2, and coenzyme Q10) and three drugs (simvastatin, pravastatin, and morphine) also demonstrated the reliability and efficiency of MultiDS-MDA, and it is anticipated that MultiDS-MDA will serve as a powerful tool for future exploration of metabolite-drug interactions and contribute to drug development and drug combination.
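The 5-fold cross-validation mentioned in this abstract follows a generic pattern that can be sketched in Python as below. How the authors actually partitioned their metabolite-drug pairs is not stated here, so the shuffling, the seed, the sample count, and the function name are all assumptions made for illustration.

```python
import random


def k_fold_indices(n_samples, k=5, seed=0):
    """Yield (train, test) index lists so each sample is tested exactly once."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)       # reproducible shuffle
    folds = [indices[i::k] for i in range(k)]  # k near-equal folds
    for i in range(k):
        test = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test


# Toy run on 23 hypothetical samples.
splits = list(k_fold_indices(23, k=5))
for fold_no, (train, test) in enumerate(splits, start=1):
    print(f"fold {fold_no}: {len(train)} train / {len(test)} test")
```

A metric such as AUROC or AUPRC would then be computed on each held-out fold and averaged across the five folds.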
Affiliation(s)
- Xiuhong Li
- College of Bioinformatics Science and Technology, Harbin Medical University, China
- Hao Yuan
- College of Bioinformatics Science and Technology, Harbin Medical University, China
- Xiaoliang Wu
- College of Bioinformatics Science and Technology, Harbin Medical University, China
- Chengyi Wang
- College of Bioinformatics Science and Technology, Harbin Medical University, China
- Meitao Wu
- College of Bioinformatics Science and Technology, Harbin Medical University, China
- Hongbo Shi
- College of Bioinformatics Science and Technology, Harbin Medical University, China.
- Yingli Lv
- College of Bioinformatics Science and Technology, Harbin Medical University, China.

12
Pikalyova R, Zabolotna Y, Horvath D, Marcou G, Varnek A. Chemical Library Space: Definition and DNA-Encoded Library Comparison Study Case. J Chem Inf Model 2023. [PMID: 37368824 DOI: 10.1021/acs.jcim.3c00520] [Citation(s) in RCA: 4] [Impact Index Per Article: 4.0] [Indexed: 06/29/2023]
Abstract
The development of DNA-encoded library (DEL) technology introduced new challenges for the analysis of chemical libraries. It is often useful to consider a chemical library as a stand-alone chemoinformatic object, represented both as a collection of independent molecules and as an individual entity, in particular when libraries are inseparable mixtures, like DELs. Herein, we introduce the concept of chemical library space (CLS), in which resident items are individual chemical libraries. We define and compare four vectorial library representations obtained using generative topographic mapping. These allow for an effective comparison of libraries, with the ability to tune and chemically interpret the similarity relationships. In particular, property-tuned CLS encodings make it possible to compare libraries simultaneously with respect to both property and chemotype distributions. We apply the various CLS encodings to the problem of selecting DELs that optimally "match" a reference collection (here ChEMBL28), showing how the choice of the CLS descriptors may help to fine-tune the "matching" (overlap) criteria. Hence, the proposed CLS may represent a new efficient way for polyvalent analysis of thousands of chemical libraries. Selection of an easily accessible compound collection for drug discovery, as a substitute for a difficult-to-produce reference library, can be tuned for either primary or target-focused screening, also considering property distributions of compounds. Alternatively, selection of libraries covering novel regions of the chemical space with respect to a reference compound subspace may serve for library portfolio enrichment.
Affiliation(s)
- Regina Pikalyova
- Laboratory of Chemoinformatics, University of Strasbourg, 4, rue B. Pascal, Strasbourg 67081, France
| | - Yuliana Zabolotna
- Laboratory of Chemoinformatics, University of Strasbourg, 4, rue B. Pascal, Strasbourg 67081, France
| | - Dragos Horvath
- Laboratory of Chemoinformatics, University of Strasbourg, 4, rue B. Pascal, Strasbourg 67081, France
| | - Gilles Marcou
- Laboratory of Chemoinformatics, University of Strasbourg, 4, rue B. Pascal, Strasbourg 67081, France
| | - Alexandre Varnek
- Laboratory of Chemoinformatics, University of Strasbourg, 4, rue B. Pascal, Strasbourg 67081, France
|
13
|
Minh Quang N, Tran Thai H, Le Thi H, Duc Cuong N, Hien NQ, Hoang D, Ngoc VTB, Ky Minh V, Van Tat P. Novel Thiosemicarbazone Quantum Dots in the Treatment of Alzheimer's Disease Combining In Silico Models Using Fingerprints and Physicochemical Descriptors. ACS OMEGA 2023; 8:11076-11099. [PMID: 37008140 PMCID: PMC10061515 DOI: 10.1021/acsomega.2c07934] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/13/2022] [Accepted: 03/07/2023] [Indexed: 06/19/2023]
Abstract
Searching for thiosemicarbazone derivatives with the potential to inhibit acetylcholinesterase for the treatment of Alzheimer's disease (AD) is an important current goal. The QSARKPLS, QSARANN, and QSARSVR models were constructed using binary fingerprints and physicochemical (PC) descriptors of 129 thiosemicarbazone compounds screened from a database of 3791 derivatives. The R² and Q² values for the QSARKPLS, QSARANN, and QSARSVR models are greater than 0.925 and 0.713 using dendritic fingerprint (DF) and PC descriptors, respectively. The in vitro pIC50 activities of four new design-oriented compounds N1, N2, N3, and N4, from the QSARKPLS model using DFs, are consistent with the experimental results and those from the QSARANN and QSARSVR models. The designed compounds N1, N2, N3, and N4 do not violate the Lipinski-5 and Veber rules according to the ADME and BOILED-Egg methods. The binding energy (kcal mol⁻¹) of the novel compounds to the 1ACJ-PDB protein receptor of the AChE enzyme was also obtained by molecular docking and dynamics simulations and is consistent with the values predicted by the QSARANN and QSARSVR models. New compounds N1, N2, N3, and N4 were synthesized, and the experimentally determined in vitro pIC50 activity agreed with that obtained from the in silico models. The newly synthesized thiosemicarbazones N1, N2, N3, and N4 can inhibit 1ACJ-PDB and are predicted to be able to cross the barrier. The DFT B3LYP/def-SV(P)-ECP quantum calculation method was used to calculate E_HOMO and E_LUMO to account for the activities of compounds N1, N2, N3, and N4. The quantum calculation results are consistent with those obtained from the in silico models. The successful results here may contribute to the search for new drugs for the treatment of AD.
Affiliation(s)
- Nguyen Minh Quang
- Faculty of Chemical Engineering, Industrial University of Ho Chi Minh City, 12 Nguyen Van Bao, Dist. Go Vap, Ho Chi Minh 700000, Viet Nam
| | - Hoa Tran Thai
- Faculty of Chemistry, Hue University of Sciences, Hue University, 77 Nguyen Hue, Hue City 530000, Viet Nam
| | - Hoa Le Thi
- Faculty of Chemistry, Hue University of Sciences, Hue University, 77 Nguyen Hue, Hue City 530000, Viet Nam
| | - Nguyen Duc Cuong
- Faculty of Chemistry, Hue University of Sciences, Hue University, 77 Nguyen Hue, Hue City 530000, Viet Nam
- School of Hospitality and Tourism, Hue University, 22 Lam Hoang, Hue City 530000, Viet Nam
| | - Nguyen Quoc Hien
- Vietnam Atomic Energy Institute, 59 Ly Thuong Kiet, Dist. Hoan Kiem, Hanoi City 100000, Viet Nam
| | - DongQuy Hoang
- Faculty of Materials Science and Technology, University of Science, Vietnam National University, Ho Chi Minh 700000, Viet Nam
- Vietnam National University, Ho Chi Minh City 700000, Viet Nam
| | - Vu Thi Bao Ngoc
- Faculty of Chemistry and Environment, University of Dalat, 01 Phu Dong Thien Vuong, Dalat City 660000, Viet Nam
| | - Vo Ky Minh
- Franklin High School, 6400 Whitelock Pkwy, Elk Grove, California 95757, United States
| | - Pham Van Tat
- Department of Sciences and Journal Management, Hoa Sen University, 08 Nguyen Van Trang, Dist. 01, Ho Chi Minh 700000, Viet Nam
|
14
|
Pope JD, Drummer OH, Schneider HG. False-Positive Amphetamines in Urine Drug Screens: A 6-Year Review. J Anal Toxicol 2023; 47:263-270. [PMID: 36367744 DOI: 10.1093/jat/bkac089] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/09/2022] [Revised: 09/14/2022] [Accepted: 11/10/2022] [Indexed: 11/13/2022] Open
Abstract
Immunoassays are routinely used to provide rapid urine drug screening results in the clinical setting. These screening tests are prone to false-positive results and ideally require confirmation by mass spectrometry. In this study, we have examined a large number of urine specimens where drugs other than amphetamines may have caused a false-positive amphetamine immunoassay screening result. Urine drug screens (12,250) in a clinical laboratory that used the CEDIA amphetamine/ecstasy method were reviewed for false-positive results over a 6-year period (2015-2020). An additional 3,486 referred samples, for which confirmatory mass spectrometry was requested, were also reviewed. About 86 in-house samples and 175 referral samples that were CEDIA false-positive screens were further analyzed by an LC-QTOF general unknown screen. Potential cross-reacting drugs were identified, and their molecular similarities to the CEDIA targets were determined. Commercial standards were also analyzed for cross-reactivity in the amphetamine/ecstasy CEDIA screen. Positive amphetamine results in 3.9% of in-house samples and 9.9% of referred tests for confirmatory analysis were false positive for amphetamines. In these false-positive specimens, an average of 6.8 drugs were detected by the LC-QTOF screen. Several drugs were identified as possible cross-reacting drugs in the CEDIA amphetamine/ecstasy assay. Maximum common substructure scores for 70 potential cross-reacting compounds were calculated. This was not helpful in identifying cross-reacting drugs. False-positive amphetamine screens make up 3.9-9.9% of positive amphetamine screens in the clinical laboratory. Knowledge of cross-reacting drugs may be helpful when mass spectrometry testing is unavailable.
Affiliation(s)
- Jeffrey D Pope
- Clinical Biochemistry Unit, Alfred Health, 55 Commercial Rd, Melbourne, VIC 3004, Australia
- Department of Forensic Medicine, Monash University, 65 Kavanagh St., Southbank, VIC 3006, Australia
| | - Olaf H Drummer
- Department of Forensic Medicine, Monash University, 65 Kavanagh St., Southbank, VIC 3006, Australia
- Victorian Institute of Forensic Medicine, 65 Kavanagh St., Southbank, VIC 3006, Australia
| | - Hans G Schneider
- Clinical Biochemistry Unit, Alfred Health, 55 Commercial Rd, Melbourne, VIC 3004, Australia
- School of Public Health and Preventative Medicine, Monash University, 99 Commercial Rd, Melbourne, VIC 3004, Australia
|
15
|
Caballero Alfonso AY, Chayawan C, Gadaleta D, Roncaglioni A, Benfenati E. A KNIME Workflow to Assist the Analogue Identification for Read-Across, Applied to Aromatase Activity. Molecules 2023; 28:1832. [PMID: 36838826 PMCID: PMC9961311 DOI: 10.3390/molecules28041832] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/29/2022] [Revised: 02/07/2023] [Accepted: 02/10/2023] [Indexed: 02/18/2023] Open
Abstract
The reduction and replacement of in vivo tests have become crucial in terms of resources and animal benefits. The read-across approach reduces the number of substances to be tested, exploiting existing experimental data to predict the properties of untested substances. Currently, several tools have been developed to perform read-across, but other approaches, such as computational workflows, can offer a more flexible and less prescriptive approach. In this paper, we are introducing a workflow to support analogue identification for read-across. The implementation of the workflow was performed using a database of azole chemicals with in vitro toxicity data for human aromatase enzymes. The workflow identified analogues based on three similarities: structural similarity (StrS), metabolic similarity (MtS), and mechanistic similarity (McS). Our results showed how multiple similarity metrics can be combined within a read-across assessment. The use of the similarity based on metabolism and toxicological mechanism improved the predictions in particular for sensitivity. Beyond the results predicting a large population of substances, practical examples illustrate the advantages of the proposed approach.
Affiliation(s)
- Ana Yisel Caballero Alfonso
- Laboratory of Environmental Chemistry and Toxicology, Department of Environmental Health Sciences, Istituto di Ricerche Farmacologiche “Mario Negri”—IRCCS, Via Mario Negri, 2, 20156 Milano, Italy
- Jozef Stefan International Postgraduate School, Jamova cesta 39, 1000 Ljubljana, Slovenia
- Correspondence: (A.Y.C.A.); (E.B.)
| | - Chayawan Chayawan
- Laboratory of Environmental Chemistry and Toxicology, Department of Environmental Health Sciences, Istituto di Ricerche Farmacologiche “Mario Negri”—IRCCS, Via Mario Negri, 2, 20156 Milano, Italy
| | - Domenico Gadaleta
- Laboratory of Environmental Chemistry and Toxicology, Department of Environmental Health Sciences, Istituto di Ricerche Farmacologiche “Mario Negri”—IRCCS, Via Mario Negri, 2, 20156 Milano, Italy
| | - Alessandra Roncaglioni
- Laboratory of Environmental Chemistry and Toxicology, Department of Environmental Health Sciences, Istituto di Ricerche Farmacologiche “Mario Negri”—IRCCS, Via Mario Negri, 2, 20156 Milano, Italy
| | - Emilio Benfenati
- Laboratory of Environmental Chemistry and Toxicology, Department of Environmental Health Sciences, Istituto di Ricerche Farmacologiche “Mario Negri”—IRCCS, Via Mario Negri, 2, 20156 Milano, Italy
- Correspondence: (A.Y.C.A.); (E.B.)
|
16
|
Erlina L, Paramita RI, Kusuma WA, Fadilah F, Tedjo A, Pratomo IP, Ramadhanti NS, Nasution AK, Surado FK, Fitriawan A, Istiadi KA, Yanuar A. Virtual screening of Indonesian herbal compounds as COVID-19 supportive therapy: machine learning and pharmacophore modeling approaches. BMC Complement Med Ther 2022; 22:207. [PMID: 35922786 PMCID: PMC9347098 DOI: 10.1186/s12906-022-03686-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/07/2021] [Accepted: 07/21/2022] [Indexed: 11/10/2022] Open
Abstract
Background The number of COVID-19 cases continues to grow in Indonesia. This phenomenon motivates researchers to find alternative drugs for prevention or treatment. Given the rich biodiversity of Indonesian medicinal plants, one alternative is to examine the potential of herbal medicines to support COVID therapy. This study aims to identify potential compound candidates in Indonesian herbal medicines using machine learning and pharmacophore modeling approaches. Methods We used three classification methods with different decision-making processes: support vector machine (SVM), multilayer perceptron (MLP), and random forest (RF). For the pharmacophore modeling approach, we performed a structure-based analysis on the 3D structure of the SARS-CoV-2 main protease (3CLpro) and used repurposed SARS, MERS, and SARS-CoV-2 drugs identified from the literature as datasets in the ligand-based method. Lastly, we used molecular docking to analyze the interactions between 3CLpro and 14 hit compounds from the Indonesian Herbal Database (HerbalDB), with lopinavir as a positive control. Results From the molecular docking analysis, we found six potential compounds that may act as inhibitors of the SARS-CoV-2 main protease: hesperidin; kaempferol-3,4'-di-O-methyl ether (Ermanin); myricetin-3-glucoside; peonidin 3-(4’-arabinosylglucoside); quercetin 3-(2G-rhamnosylrutinoside); and rhamnetin 3-mannosyl-(1-2)-alloside. Conclusions Our layered virtual screening with machine learning and pharmacophore modeling approaches provided a more objective and optimal virtual screening and avoided subjective decision-making about the results. Herbal compounds from the screening, i.e., hesperidin; kaempferol-3,4'-di-O-methyl ether (Ermanin); myricetin-3-glucoside; peonidin 3-(4’-arabinosylglucoside); quercetin 3-(2G-rhamnosylrutinoside); and rhamnetin 3-mannosyl-(1-2)-alloside, are potential antiviral candidates for SARS-CoV-2. Moringa oleifera and Psidium guajava, which contain those compounds, could be an alternative option for COVID-19 herbal prevention. Supplementary Information The online version contains supplementary material available at 10.1186/s12906-022-03686-y.
|
17
|
Yang R, Zha X, Gao X, Wang K, Cheng B, Yan B. Multi-stage virtual screening of natural products against p38α mitogen-activated protein kinase: predictive modeling by machine learning, docking study and molecular dynamics simulation. Heliyon 2022; 8:e10495. [PMID: 36105464 PMCID: PMC9465123 DOI: 10.1016/j.heliyon.2022.e10495] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 01/19/2022] [Revised: 03/20/2022] [Accepted: 08/25/2022] [Indexed: 11/20/2022] Open
Abstract
p38α is a mitogen-activated protein kinase (MAPK), and the signaling pathways in which it is involved are closely related to the inflammation, apoptosis and differentiation of cells, which makes it an attractive target for drug discovery. With its high efficiency and low cost, virtual screening technology is becoming an indispensable part of drug development. In this study, a novel multi-stage virtual screening method based on machine learning, molecular docking and molecular dynamics simulation was developed to identify p38α MAPK inhibitors from natural products in the ZINC database. Compared to any individual approach, the method improves prediction accuracy by using both ligand and receptor information. Ultimately, we screened out two candidate inhibitors with acceptable ADMET properties (ZINC4260400 and ZINC8300300). Among the generated machine learning models, Random Forest (RF) and Support Vector Machine (SVM) performed better, with area under the receiver operating characteristic curve (AUC) values of 0.932 and 0.931 on the test set, and 0.834 and 0.850 on the external validation set. In addition, the results of molecular docking and ADMET prediction showed that two compounds with appropriate pharmacokinetic properties had binding free energies less than −8.0 kcal/mol for the target protein, and the results of molecular dynamics simulations further confirmed that they were stable during the process of inhibition.
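The AUC values quoted in this abstract can be read as the probability that a randomly chosen active compound receives a higher score than a randomly chosen inactive one. The sketch below illustrates that definition on made-up labels and scores; the function name `auroc` and the toy data are assumptions for illustration and are not taken from the cited study.

```python
def auroc(labels, scores):
    """Area under the ROC curve via pairwise comparison.

    labels: iterable of 0/1 class labels (1 = active).
    scores: iterable of model scores, higher = more likely active.
    Ties between a positive and a negative count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Hypothetical toy data, not from the paper.
toy_labels = [1, 1, 0, 0, 1, 0]
toy_scores = [0.9, 0.8, 0.7, 0.3, 0.6, 0.6]
toy_auc = auroc(toy_labels, toy_scores)
```

This quadratic-time form is only meant to make the definition concrete; for real datasets a rank-based implementation from an established library would normally be preferred.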
|
18
|
Panda G, Mishra N, Sharma D, Kutum R, Bhoyar RC, Jain A, Imran M, Senthilvel V, Divakar MK, Mishra A, Garg P, Banerjee P, Sivasubbu S, Scaria V, Ray A. Comprehensive Assessment of Indian Variations in the Druggable Kinome Landscape Highlights Distinct Insights at the Sequence, Structure and Pharmacogenomic Stratum. Front Pharmacol 2022; 13:858345. [PMID: 35865963 PMCID: PMC9294532 DOI: 10.3389/fphar.2022.858345] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/19/2022] [Accepted: 06/07/2022] [Indexed: 11/13/2022] Open
Abstract
India is home to more than 17% of the world’s population and has a diverse genetic makeup, with several clinically relevant rare mutations belonging to many sub-groups that are underrepresented in global sequencing datasets like the 1000 Genome data (1KG), which contains limited samples of Indian ethnicity. Such databases are critical for the pharmaceutical and drug development industry, where diversity plays a crucial role in identifying genetic disposition towards adverse drug reactions. A qualitative and comparative sequence and structural study utilizing variant information present in the recently published, largest curated Indian genome database (IndiGen) and the 1000 Genome data was performed for variants belonging to the kinase coding genes, the second most targeted group of drug targets. The sequence-level analysis identified similarities and differences among different populations based on the nsSNVs and amino acid exchange frequencies, whereas a comparative structural analysis of IndiGen variants was performed with pathogenic variants reported in UniProtKB Humsavar data. The influence of these variations on structural features of the protein, such as structural stability, solvent accessibility, hydrophobicity, and the hydrogen-bond network, was investigated. In-silico screening of known drugs against these Indian variation-containing proteins reveals critical differences in binding strength due to the variations present in the Indian population. In conclusion, this study constitutes a comprehensive investigation of common variations present in the second largest population in the world and of their implications for the sequence, structural and pharmacogenomic landscape. The preliminary investigation reported in this paper, supporting the screening and detection of ADRs specific to the Indian population, could aid in the development of techniques for pre-clinical and post-market screening of drug-related adverse events in the Indian population.
Affiliation(s)
- Gayatri Panda
- Department of Computational Biology, Indraprastha Institute of Information Technology, Okhla, India
| | - Neha Mishra
- Department of Computational Biology, Indraprastha Institute of Information Technology, Okhla, India
| | - Disha Sharma
- Academy of Scientific and Innovative Research (AcSIR), Ghaziabad, India
- CSIR-Institute of Genomics and Integrative Biology, Delhi, India
| | - Rintu Kutum
- Academy of Scientific and Innovative Research (AcSIR), Ghaziabad, India
- CSIR-Institute of Genomics and Integrative Biology, Delhi, India
- Ashoka University, Sonipat, India
| | - Rahul C. Bhoyar
- CSIR-Institute of Genomics and Integrative Biology, Delhi, India
| | - Abhinav Jain
- Academy of Scientific and Innovative Research (AcSIR), Ghaziabad, India
- CSIR-Institute of Genomics and Integrative Biology, Delhi, India
| | - Mohamed Imran
- Academy of Scientific and Innovative Research (AcSIR), Ghaziabad, India
- CSIR-Institute of Genomics and Integrative Biology, Delhi, India
| | - Vigneshwar Senthilvel
- Academy of Scientific and Innovative Research (AcSIR), Ghaziabad, India
- CSIR-Institute of Genomics and Integrative Biology, Delhi, India
| | - Mohit Kumar Divakar
- Academy of Scientific and Innovative Research (AcSIR), Ghaziabad, India
- CSIR-Institute of Genomics and Integrative Biology, Delhi, India
| | - Anushree Mishra
- CSIR-Institute of Genomics and Integrative Biology, Delhi, India
| | - Parth Garg
- Department of Computational Biology, Indraprastha Institute of Information Technology, Okhla, India
| | - Priyanka Banerjee
- Institute for Physiology, Charité-University Medicine Berlin, Berlin, Germany
| | - Sridhar Sivasubbu
- Academy of Scientific and Innovative Research (AcSIR), Ghaziabad, India
- CSIR-Institute of Genomics and Integrative Biology, Delhi, India
| | - Vinod Scaria
- Academy of Scientific and Innovative Research (AcSIR), Ghaziabad, India
- CSIR-Institute of Genomics and Integrative Biology, Delhi, India
| | - Arjun Ray
- Department of Computational Biology, Indraprastha Institute of Information Technology, Okhla, India
- *Correspondence: Arjun Ray,
|
19
|
Yang R, Zhao G, Cheng B, Yan B. Identification of potential matrix metalloproteinase-2 inhibitors from natural products through advanced machine learning-based cheminformatics approaches. Mol Divers 2022:10.1007/s11030-022-10467-9. [PMID: 35773549 DOI: 10.1007/s11030-022-10467-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/20/2022] [Accepted: 05/20/2022] [Indexed: 11/29/2022]
Abstract
Matrix metalloproteinase-2 (MMP-2) is capable of degrading collagen type IV in the vascular basement membrane and extracellular matrix. Studies have shown that MMP-2 is tightly associated with the biological behavior of malignant tumors. Therefore, the identification of inhibitors targeting MMP-2 could be effective in treating the disease by maintaining extracellular matrix homeostasis. In the pharmaceutical and biomedical fields, many computational tools are widely used, which improve the efficiency of the whole process to some extent. Apart from the conventional cheminformatics approaches (e.g., pharmacophore model and molecular docking), virtual screening strategies based on machine learning also have promising applications. In this study, we collected 2871 compound activity data points against MMP-2 from the ChEMBL database and divided them into training and test sets in a 3:1 ratio. Four machine learning algorithms were then selected to construct the classification models, and the best-performing model, i.e., the stacking-based fusion model with the highest AUC value in both training and test datasets, was used for the virtual screening of the ZINC database. Next, we screened 17 potential MMP-2 inhibitors from the results predicted by the machine learning model via ADME/T analysis. The interactions between these compounds and the target protein were explored through molecular docking calculations, and the results showed that ZINC712249, ZINC4270723, and ZINC15858504 had lower binding free energies than the co-crystal ligand. To further examine the binding stability of the complexes, we performed molecular dynamics simulations and finally identified these three hits as the most promising natural products for MMP-2 inhibitors.
Affiliation(s)
- Ruoqi Yang
- Shandong University of Traditional Chinese Medicine, Jinan, 250355, China.
| | - Guiping Zhao
- Shandong University of Traditional Chinese Medicine, Jinan, 250355, China
| | - Bin Cheng
- Affiliated Hospital of Shandong University of Traditional Chinese Medicine, Jinan, 250355, China
| | - Bin Yan
- Shandong University of Traditional Chinese Medicine, Jinan, 250355, China.
|
20
|
Machine Learning for the Prediction of Antiviral Compounds Targeting Avian Influenza A/H9N2 Viral Proteins. Symmetry (Basel) 2022. [DOI: 10.3390/sym14061114] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/04/2023] Open
Abstract
Avian influenza subtype A/H9N2—which infects chickens, reducing egg production by up to 80%—may be transmissible to humans. In humans, this virus is very harmful since it attacks the respiratory system and reproductive tract, replicating in both. Previous attempts to find antiviral candidates capable of inhibiting influenza A/H9N2 transmission were unsuccessful. This study aims to better characterize A/H9N2 to facilitate the discovery of antiviral compounds capable of inhibiting its transmission. The Symmetry of this study is to apply several machine learning methods to perform virtual screening to identify H9N2 antivirus candidates. The parameters used to measure the machine learning model’s quality included accuracy, sensitivity, specificity, balanced accuracy, and receiver operating characteristic score. We found that the extreme gradient boosting method yielded better results in classifying compounds predicted to be suitable antiviral compounds than six other machine learning methods, including logistic regression, k-nearest neighbor analysis, support vector machine, multilayer perceptron, random forest, and gradient boosting. Using this algorithm, we identified 10 candidate synthetic compounds with the highest scores. These high scores predicted that the molecular fingerprint may involve strong bonding characteristics. Thus, we were able to find significant candidates for synthetic H9N2 antivirus compounds and identify the best machine learning method to perform virtual screenings.
|
21
|
Zhu Y, Du C, Zheng H, Wang F, Tian F, Liu X, Li D. Molecular representation of coal-derived asphaltene based on high resolution mass spectrometry. ARAB J CHEM 2022. [DOI: 10.1016/j.arabjc.2021.103531] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Indexed: 01/23/2023] Open
|
22
|
Screening of Potential Indonesia Herbal Compounds Based on Multi-Label Classification for 2019 Coronavirus Disease. BIG DATA AND COGNITIVE COMPUTING 2021. [DOI: 10.3390/bdcc5040075] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 12/15/2022]
Abstract
The coronavirus disease 2019 pandemic is spreading rapidly and requires an acceleration of the drug discovery process. Drug repurposing can help accelerate drug discovery by identifying new efficacy for approved drugs, and it is considered an efficient and economical approach. Research in drug repurposing can be done by observing the interactions of drug compounds with disease-related proteins (drug-target interactions, DTI) and then predicting new drug-target interactions. This study conducted multilabel DTI prediction using the stack autoencoder-deep neural network (SAE-DNN) algorithm. Compound features were extracted using PubChem fingerprint, daylight fingerprint, MACCS fingerprint, and circular fingerprint. The results showed that the SAE-DNN model was able to predict DTI in COVID-19 cases with good performance. The SAE-DNN model with a circular fingerprint dataset produced the best average metrics, with an accuracy of 0.831, recall of 0.918, precision of 0.888, and F-measure of 0.89. Herbal compound prediction using the SAE-DNN model with the circular, daylight, and PubChem fingerprint datasets yielded 92, 65, and 79 herbal compounds, respectively, contained in herbal plants in Indonesia.
|
23
|
Lee Y, Nam S. Performance Comparisons of AlexNet and GoogLeNet in Cell Growth Inhibition IC50 Prediction. Int J Mol Sci 2021; 22:7721. [PMID: 34299341 PMCID: PMC8305019 DOI: 10.3390/ijms22147721] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 05/12/2021] [Revised: 07/09/2021] [Accepted: 07/16/2021] [Indexed: 12/17/2022] Open
Abstract
Drug responses in cancer are diverse due to heterogeneous genomic profiles. Drug responsiveness prediction is important in clinical response to specific cancer treatments. Recently, multi-class drug responsiveness models based on deep learning (DL) models using molecular fingerprints and mutation statuses have emerged. However, for multi-class models for drug responsiveness prediction, comparisons between convolutional neural network (CNN) models (e.g., AlexNet and GoogLeNet) have not been performed. Therefore, in this study, we compared the two CNN models, GoogLeNet and AlexNet, along with the least absolute shrinkage and selection operator (LASSO) model as a baseline model. We constructed the models by taking molecular fingerprints of drugs and cell line mutation statuses as input to predict high, intermediate, and low classes of half-maximal inhibitory concentration (IC50) values of the drugs in the cancer cell lines. Additionally, we compared the models in breast cancer patients as well as on independent gastric cancer cell line drug responsiveness data. We measured model performance based on the area under the receiver operating characteristic (ROC) curve (AUROC). In this study, we compared CNN models for multi-class drug responsiveness prediction. AlexNet and GoogLeNet showed better performance than LASSO. Thus, DL models will be useful tools for precision oncology in terms of drug responsiveness prediction.
Affiliation(s)
- Yeeun Lee
- Department of Health Sciences and Technology, Gachon Advanced Institute for Health Sciences and Technology, Gachon University, Incheon 21999, Korea;
| | - Seungyoon Nam
- Department of Health Sciences and Technology, Gachon Advanced Institute for Health Sciences and Technology, Gachon University, Incheon 21999, Korea;
- College of Medicine, Gachon University, Incheon 21565, Korea
- Gachon Institute of Genome Medicine and Science, Gachon University Gil Medical Center, Incheon 21565, Korea
- Department of Life Sciences, Gachon University, Seongnam 13120, Korea
|
24
|
Miranda-Quintana RA, Rácz A, Bajusz D, Héberger K. Extended similarity indices: the benefits of comparing more than two objects simultaneously. Part 2: speed, consistency, diversity selection. J Cheminform 2021; 13:33. [PMID: 33892799 PMCID: PMC8067665 DOI: 10.1186/s13321-021-00504-4] [Citation(s) in RCA: 18] [Impact Index Per Article: 6.0] [Received: 11/20/2020] [Accepted: 03/12/2021] [Indexed: 11/10/2022] Open
Abstract
Despite being a central concept in cheminformatics, molecular similarity has so far been limited to the comparison of only two molecules at a time using one index, generally the Tanimoto coefficient. In a recent contribution we not only introduced a complete mathematical framework for extended similarity calculations (i.e., comparisons of more than two molecules at a time) but also defined a series of novel indices. Part 1 is a detailed analysis of the effects of various parameters on the similarity values calculated by the extended formulas. Their features were revealed by sum of ranking differences and ANOVA. Here, in addition to characterizing several important aspects of the newly introduced similarity metrics, we will highlight their applicability and utility in real-life scenarios using datasets with popular molecular fingerprints. Remarkably, for large datasets, the use of extended similarity measures provides an unprecedented speed-up over “traditional” pairwise similarity matrix calculations. We also provide illustrative examples of a more direct algorithm based on the extended Tanimoto similarity to select diverse compound sets, resulting in much higher levels of diversity than traditional approaches. We discuss the inner and outer consistency of our indices, which are key in practical applications, showing whether the n-ary and binary indices rank the data in the same way. We demonstrate the use of the new n-ary similarity metrics on t-distributed stochastic neighbor embedding (t-SNE) plots of datasets of varying diversity, or corresponding to ligands of different pharmaceutical targets, which show that our indices provide a better measure of set compactness than standard binary measures. We also present a conceptual example of the applicability of our indices in agglomerative hierarchical algorithms. The Python code for calculating the extended similarity metrics is freely available at: https://github.com/ramirandaq/MultipleComparisons
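For orientation, the "traditional" baseline that this abstract contrasts against can be sketched as follows: the binary Tanimoto coefficient between two fingerprints, averaged over all pairs, which requires a number of comparisons that grows quadratically with the set size. This is not the authors' extended (n-ary) index, whose implementation is in the repository linked above; the function names and the toy fingerprints are assumptions for illustration.

```python
from itertools import combinations


def tanimoto(a, b):
    """Binary Tanimoto coefficient of two fingerprints.

    Each fingerprint is given as a set of on-bit indices:
    |a AND b| / |a OR b|. Two empty fingerprints are treated as identical.
    """
    union = len(a | b)
    return len(a & b) / union if union else 1.0


def mean_pairwise_tanimoto(fingerprints):
    """Average Tanimoto similarity over all N*(N-1)/2 pairs (quadratic cost)."""
    pairs = list(combinations(fingerprints, 2))
    if not pairs:
        raise ValueError("need at least two fingerprints")
    return sum(tanimoto(a, b) for a, b in pairs) / len(pairs)


# Hypothetical toy fingerprints (sets of on-bit positions).
toy_fps = [
    frozenset({0, 1, 2, 3}),
    frozenset({1, 2, 3, 4}),
    frozenset({0, 1, 2, 3}),
]
toy_mean = mean_pairwise_tanimoto(toy_fps)
```

The pairwise loop is what becomes expensive for large libraries, which is the cost the abstract reports the extended indices avoid.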
Affiliation(s)
- Anita Rácz
- Plasma Chemistry Research Group, Research Centre for Natural Sciences, Magyar tudósok krt. 2, 1117, Budapest, Hungary
- Dávid Bajusz
- Medicinal Chemistry Research Group, Research Centre for Natural Sciences, Magyar tudósok krt. 2, 1117, Budapest, Hungary
- Károly Héberger
- Plasma Chemistry Research Group, Research Centre for Natural Sciences, Magyar tudósok krt. 2, 1117, Budapest, Hungary
|
25
|
Shen WX, Zeng X, Zhu F, Wang YL, Qin C, Tan Y, Jiang YY, Chen YZ. Out-of-the-box deep learning prediction of pharmaceutical properties by broadly learned knowledge-based molecular representations. Nat Mach Intell 2021. [DOI: 10.1038/s42256-021-00301-6] [Citation(s) in RCA: 26] [Impact Index Per Article: 8.7] [Indexed: 01/04/2023]
|
26
|
Čmelo I, Voršilák M, Svozil D. Profiling and analysis of chemical compounds using pointwise mutual information. J Cheminform 2021; 13:3. [PMID: 33423694 PMCID: PMC7798221 DOI: 10.1186/s13321-020-00483-y] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 10/05/2020] [Accepted: 12/24/2020] [Indexed: 12/21/2022] Open
Abstract
Pointwise mutual information (PMI) is a measure of association used in information theory. In this paper, PMI is used to characterize several publicly available databases (DrugBank, ChEMBL, PubChem and ZINC) in terms of association strength between compound structural features, resulting in database PMI interrelation profiles. As structural features, substructure fragments obtained by coding individual compounds as MACCS, PubChemKey and ECFP fingerprints are used. The analysis of publicly available databases reveals, in accord with other studies, unusual properties of DrugBank compounds, which further confirms the validity of the PMI profiling approach. Z-standardized relative feature tightness (ZRFT), a PMI-derived measure that quantifies how well the given compound's feature combinations fit those in a particular compound set, is applied for the analysis of compound synthetic accessibility (SA), as well as for the classification of compounds as easy (ES) and hard (HS) to synthesize. ZRFT value distributions are compared with those of SYBA and SAScore. The analysis of ZRFT values of structurally complex compounds in the SAVI database reveals oligopeptide structures that are mispredicted by SAScore as HS, while correctly predicted by ZRFT and SYBA as ES. Compared to SAScore, SYBA and random forest, ZRFT predictions are less accurate, though by a narrow margin (AccZRFT = 94.5%, AccSYBA = 98.8%, AccSAScore = 99.0%, AccRF = 97.3%). However, the ability of ZRFT to distinguish between ES and HS compounds is surprisingly high considering that while SYBA, SAScore and random forest are dedicated SA models, ZRFT is a generic measurement that merely quantifies the strength of interrelations between structural feature pairs. The results presented in the current work indicate that structural feature co-occurrence, quantified by PMI or ZRFT, contains a significant amount of information relevant to physico-chemical properties of organic compounds.
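As a rough illustration of the quantity this abstract builds on, the standard definition PMI(i, j) = log2( p(i, j) / (p(i) p(j)) ) can be estimated from fingerprint bit co-occurrence as sketched below. The representation (sets of on-bit indices) and the function name are assumptions made for this example; ZRFT is not reproduced here because the abstract does not give its formula.

```python
import math
from itertools import combinations


def pmi_profile(fingerprints):
    """Return {(i, j): PMI} for every bit pair that co-occurs at least once.

    fingerprints: list of sets of on-bit indices, one set per compound.
    Probabilities are estimated as simple frequencies over the compound set.
    """
    n = len(fingerprints)
    single = {}  # how many compounds have bit i set
    pair = {}    # how many compounds have both bits i and j set
    for fp in fingerprints:
        bits = sorted(fp)
        for b in bits:
            single[b] = single.get(b, 0) + 1
        for i, j in combinations(bits, 2):
            pair[(i, j)] = pair.get((i, j), 0) + 1

    profile = {}
    for (i, j), n_ij in pair.items():
        p_ij = n_ij / n
        p_i = single[i] / n
        p_j = single[j] / n
        profile[(i, j)] = math.log2(p_ij / (p_i * p_j))
    return profile
```

Positive values mark feature pairs that occur together more often than independence would predict; a value of zero corresponds to statistical independence.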
Affiliation(s)
- I. Čmelo
- CZ-OPENSCREEN National Infrastructure for Chemical Biology, Department of Informatics and Chemistry, Faculty of Chemical Technology, University of Chemistry and Technology Prague, Technická 5, 166 28 Prague, Czech Republic
- M. Voršilák
- CZ-OPENSCREEN National Infrastructure for Chemical Biology, Department of Informatics and Chemistry, Faculty of Chemical Technology, University of Chemistry and Technology Prague, Technická 5, 166 28 Prague, Czech Republic
- CZ-OPENSCREEN National Infrastructure for Chemical Biology, Institute of Molecular Genetics of the ASCR v. v. i., Vídeňská 1083, 142 20 Prague 4, Czech Republic
- D. Svozil
- CZ-OPENSCREEN National Infrastructure for Chemical Biology, Department of Informatics and Chemistry, Faculty of Chemical Technology, University of Chemistry and Technology Prague, Technická 5, 166 28 Prague, Czech Republic
- CZ-OPENSCREEN National Infrastructure for Chemical Biology, Institute of Molecular Genetics of the ASCR v. v. i., Vídeňská 1083, 142 20 Prague 4, Czech Republic
|
27
|
Choi KE, Balupuri A, Kang NS. The Study on the hERG Blocker Prediction Using Chemical Fingerprint Analysis. Molecules 2020; 25:E2615. [PMID: 32512802 PMCID: PMC7321128 DOI: 10.3390/molecules25112615] [Citation(s) in RCA: 12] [Impact Index Per Article: 3.0] [Received: 05/06/2020] [Revised: 06/01/2020] [Accepted: 06/02/2020] [Indexed: 01/31/2023] Open
Abstract
Human ether-a-go-go-related gene (hERG) potassium channel blockage by small molecules may cause severe cardiac side effects. Thus, it is crucial to screen compounds for activity on the hERG channels early in the drug discovery process. In this study, we collected 5299 hERG inhibitors with diverse chemical structures from a number of sources. Based on this dataset, we evaluated different machine learning (ML) and deep learning (DL) algorithms using various integer and binary type fingerprints. A training set of 3991 compounds was used to develop quantitative structure-activity relationship (QSAR) models. The performance of the developed models was evaluated using a test set of 998 compounds. Models were further validated using external set 1 (263 compounds) and external set 2 (47 compounds). Overall, models with integer type fingerprints showed better performance than models with no fingerprints, converted binary type fingerprints or original binary type fingerprints. Comparison of ML and DL algorithms revealed that integer type fingerprints are suitable for ML, whereas binary type fingerprints are suitable for DL. The outcomes of this study indicate that the rational selection of fingerprints is important for hERG blocker prediction.
Affiliation(s)
- Nam Sook Kang
- Graduate School of New Drug Discovery and Development, Chungnam National University, 99 Daehak-ro, Yuseong-gu, Daejeon 34134, Korea; (K.-E.C.); (A.B.)
|
28
|
Kammeraad JA, Goetz J, Walker EA, Tewari A, Zimmerman PM. What Does the Machine Learn? Knowledge Representations of Chemical Reactivity. J Chem Inf Model 2020; 60:1290-1301. [PMID: 32091880 DOI: 10.1021/acs.jcim.9b00721] [Citation(s) in RCA: 15] [Impact Index Per Article: 3.8] [Indexed: 12/30/2022]
Abstract
In a departure from conventional chemical approaches, data-driven models of chemical reactions have recently been shown to be statistically successful using machine learning. These models, however, are largely black box in character and have not provided the kind of chemical insights that historically advanced the field of chemistry. To examine the knowledge base of machine-learning models (what does the machine learn?), this article deconstructs black-box machine-learning models of a diverse chemical reaction data set. Through experimentation with chemical representations and modeling techniques, the analysis provides insights into the nature of how statistical accuracy can arise, even when the model lacks informative physical principles. By peeling back the layers of these complicated models we arrive at a minimal, chemically intuitive model (with no machine learning involved). This model is based on systematic reaction-type classification and Evans-Polanyi relationships within reaction types, which are easily visualized and interpreted. Through exploring this simple model, we gain deeper understanding of the data set and uncover a means for expert interactions to improve the model's reliability.
Affiliation(s)
- Joshua A Kammeraad
- Department of Chemistry, University of Michigan, 930 North University Avenue, Ann Arbor, Michigan 48109, United States
- Jack Goetz
- Department of Statistics, University of Michigan, 1085 South University Avenue, Ann Arbor, Michigan 48109, United States
- Eric A Walker
- Department of Chemistry, University of Michigan, 930 North University Avenue, Ann Arbor, Michigan 48109, United States
- Ambuj Tewari
- Department of Statistics, University of Michigan, 1085 South University Avenue, Ann Arbor, Michigan 48109, United States
- Paul M Zimmerman
- Department of Chemistry, University of Michigan, 930 North University Avenue, Ann Arbor, Michigan 48109, United States
|
29
|
Yang Y, Zhang Y, Hua Y, Chen X, Fan Y, Wang Y, Liang L, Deng C, Lu T, Chen Y, Liu H. In Silico Design and Analysis of a Kinase-Focused Combinatorial Library Considering Diversity and Quality. J Chem Inf Model 2020; 60:92-107. [PMID: 31886658 DOI: 10.1021/acs.jcim.9b00841] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Indexed: 12/20/2022]
Abstract
A structurally diverse, high-quality, and kinase-focused database plays a critical role in finding hits or leads in kinase drug discovery. Here, we propose a workflow for designing a virtual kinase-focused combinatorial library using existing structures. Based on the analysis of known protein kinase inhibitors (PKIs), detailed fragment optimization, fragment selection, fragment linking, and a molecular filtering scheme were defined. Quick recognition of core fragments that can possibly form dual hydrogen bonds with the hinge region of the ATP-pocket was proposed. Furthermore, three diversity and four quality metrics were chosen for compound library analysis, which can be applied to databases with over 30 million structures. Compared with 13 commercial libraries, our protocol demonstrates a special advantage in terms of good skeleton diversity, acceptable fingerprint diversity, balanced scaffold distribution, and high quality, which can work well not only on existing PKIs, but also on four chosen commercial libraries. Overall, the strategy can greatly facilitate the expansion of a desirable chemical space for kinase drug discovery.
Affiliation(s)
- Yan Yang
- Laboratory of Molecular Design and Drug Discovery, School of Science, China Pharmaceutical University, 639 Longmian Avenue, Nanjing 211198, China
- Yanmin Zhang
- Laboratory of Molecular Design and Drug Discovery, School of Science, China Pharmaceutical University, 639 Longmian Avenue, Nanjing 211198, China
- Yi Hua
- Laboratory of Molecular Design and Drug Discovery, School of Science, China Pharmaceutical University, 639 Longmian Avenue, Nanjing 211198, China
- Xingye Chen
- Laboratory of Molecular Design and Drug Discovery, School of Science, China Pharmaceutical University, 639 Longmian Avenue, Nanjing 211198, China
- Yuanrong Fan
- Laboratory of Molecular Design and Drug Discovery, School of Science, China Pharmaceutical University, 639 Longmian Avenue, Nanjing 211198, China
- Yuchen Wang
- Laboratory of Molecular Design and Drug Discovery, School of Science, China Pharmaceutical University, 639 Longmian Avenue, Nanjing 211198, China
- Li Liang
- Laboratory of Molecular Design and Drug Discovery, School of Science, China Pharmaceutical University, 639 Longmian Avenue, Nanjing 211198, China
- Chenglong Deng
- Laboratory of Molecular Design and Drug Discovery, School of Science, China Pharmaceutical University, 639 Longmian Avenue, Nanjing 211198, China
- Tao Lu
- Laboratory of Molecular Design and Drug Discovery, School of Science, China Pharmaceutical University, 639 Longmian Avenue, Nanjing 211198, China; State Key Laboratory of Natural Medicines, China Pharmaceutical University, 24 Tongjiaxiang, Nanjing 210009, China
- Yadong Chen
- Laboratory of Molecular Design and Drug Discovery, School of Science, China Pharmaceutical University, 639 Longmian Avenue, Nanjing 211198, China
- Haichun Liu
- Laboratory of Molecular Design and Drug Discovery, School of Science, China Pharmaceutical University, 639 Longmian Avenue, Nanjing 211198, China
|
30
|
Vo AH, Van Vleet TR, Gupta RR, Liguori MJ, Rao MS. An Overview of Machine Learning and Big Data for Drug Toxicity Evaluation. Chem Res Toxicol 2019; 33:20-37. [DOI: 10.1021/acs.chemrestox.9b00227] [Citation(s) in RCA: 55] [Impact Index Per Article: 11.0] [Indexed: 02/07/2023]
Affiliation(s)
- Andy H. Vo
- Department of Preclinical Safety, AbbVie, 1 North Waukegan Road, North Chicago, Illinois 60064, United States
- Terry R. Van Vleet
- Department of Preclinical Safety, AbbVie, 1 North Waukegan Road, North Chicago, Illinois 60064, United States
- Rishi R. Gupta
- Information Research, Research and Development, AbbVie, 1 North Waukegan Road, North Chicago, Illinois 60064, United States
- Michael J. Liguori
- Department of Preclinical Safety, AbbVie, 1 North Waukegan Road, North Chicago, Illinois 60064, United States
- Mohan S. Rao
- Department of Preclinical Safety, AbbVie, 1 North Waukegan Road, North Chicago, Illinois 60064, United States
|
31
|
Walker E, Kammeraad J, Goetz J, Robo MT, Tewari A, Zimmerman PM. Learning To Predict Reaction Conditions: Relationships between Solvent, Molecular Structure, and Catalyst. J Chem Inf Model 2019; 59:3645-3654. [PMID: 31381340 DOI: 10.1021/acs.jcim.9b00313] [Citation(s) in RCA: 27] [Impact Index Per Article: 5.4] [Indexed: 11/29/2022]
Abstract
Reaction databases provide a great deal of useful information to assist planning of experiments but do not provide any interpretation or chemical concepts to accompany this information. In this work, reactions are labeled with experimental conditions, and network analysis shows that consistencies within clusters of data points can be leveraged to organize this information. In particular, this analysis shows how particular experimental conditions (specifically solvent) are effective in enabling specific organic reactions (Friedel-Crafts, Aldol addition, Claisen condensation, Diels-Alder, and Wittig), including variations within each reaction class. Network analysis shows data points for reactions tend to break into clusters that depend on the catalyst and chemical structure. This type of clustering, which mimics how a chemist reasons, is derived directly from the network. Therefore, the findings of this work could augment synthesis planning by providing predictions in a fashion that mimics human chemists. To numerically evaluate solvent prediction ability, three methods are compared: network analysis (through the k-nearest neighbor algorithm), a support vector machine, and a deep neural network. The most accurate method in 4 of the 5 test cases is the network analysis, with deep neural networks also showing good prediction scores. The network analysis tool was evaluated by an expert panel of chemists, who generally agreed that the algorithm produced accurate solvent choices while simultaneously being transparent in the underlying reasons for its predictions.
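The k-nearest neighbor idea mentioned in this abstract can be sketched in a few lines. This is a toy illustration under stated assumptions, not the authors' network-analysis tool: reactions are assumed to be described by sets of structural features, similarity is taken as Jaccard/Tanimoto overlap, and the function names and solvent labels are hypothetical.

```python
from collections import Counter


def jaccard(a, b):
    """Jaccard (Tanimoto) overlap between two feature sets."""
    union = len(a | b)
    return len(a & b) / union if union else 1.0


def predict_solvent(query, labelled, k=3):
    """Majority vote over the k labelled reactions most similar to the query.

    labelled: list of (feature_set, solvent) pairs (hypothetical toy data).
    """
    ranked = sorted(labelled, key=lambda item: jaccard(query, item[0]), reverse=True)
    votes = Counter(solvent for _, solvent in ranked[:k])
    return votes.most_common(1)[0][0]
```

Because the prediction is a vote among named neighbours, the reactions that drove a given solvent choice can be listed directly, which is in the spirit of the transparency the abstract emphasizes.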
Affiliation(s)
- Eric Walker
- Department of Chemistry, University of Michigan, 930 North University Avenue, Ann Arbor, Michigan 48109, United States
- Joshua Kammeraad
- Department of Chemistry, University of Michigan, 930 North University Avenue, Ann Arbor, Michigan 48109, United States
- Jonathan Goetz
- Department of Statistics, University of Michigan, 1085 South University Avenue, Ann Arbor, Michigan 48109, United States
- Michael T Robo
- Department of Chemistry, University of Michigan, 930 North University Avenue, Ann Arbor, Michigan 48109, United States
- Ambuj Tewari
- Department of Statistics, University of Michigan, 1085 South University Avenue, Ann Arbor, Michigan 48109, United States
- Paul M Zimmerman
- Department of Chemistry, University of Michigan, 930 North University Avenue, Ann Arbor, Michigan 48109, United States
|
32
|
Abstract
Pharmacological science is trying to establish the link between chemicals, targets, and disease-related phenotypes. A plethora of chemical proteomics and structural data have been generated, thanks to the target-based approach that has dominated drug discovery at the turn of the century. This is an invaluable source of information for in silico target profiling. Prediction is based on the principle of chemical similarity (similar drugs bind similar targets) or on first principles from the biophysics of molecular interactions. In the first case, compound comparison is made through ligand-based chemical similarity searches or through classifier-based machine learning approaches. The 3D techniques are based on 3D structural descriptors or energy-based scoring schemes to infer the binding affinity of a compound for its putative target. More recently, a new approach based on compound set metrics has been proposed in which a query compound is compared with a whole set of compounds associated with a target or a family of targets. This chapter reviews the different techniques of in silico target profiling and their main applications such as inference of unwanted targets, drug repurposing, or compound prioritization after phenotypic-based screening campaigns.
|
33
|
A review of ligand-based virtual screening web tools and screening algorithms in large molecular databases in the age of big data. Future Med Chem 2018; 10:2641-2658. [PMID: 30499744 DOI: 10.4155/fmc-2018-0076] [Citation(s) in RCA: 45] [Impact Index Per Article: 7.5] [Indexed: 12/22/2022] Open
Abstract
Virtual screening has become a widely used technique for helping in drug discovery processes. The key to this success is its ability to aid in the identification of novel bioactive compounds by screening large molecular databases. Several web servers have emerged in the last few years supplying platforms to guide users in screening publicly accessible chemical databases in a reasonable time. In this review, we discuss a representative set of online virtual screening servers and their underlying similarity algorithms. Other related topics, such as molecular representation or freely accessible databases are also treated. The most relevant contributions to this review arise from critical discussions concerning the pros and cons of servers and algorithms, and the challenges that future works must solve in a virtual screening framework.
|
34
|
Sánchez-Cruz N, Medina-Franco JL. Statistical-based database fingerprint: chemical space dependent representation of compound databases. J Cheminform 2018; 10:55. [PMID: 30467740 PMCID: PMC6755589 DOI: 10.1186/s13321-018-0311-x] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.2] [Received: 09/08/2018] [Accepted: 11/14/2018] [Indexed: 11/30/2022] Open
Abstract
Background: Simplified representation of compound databases has several applications in cheminformatics. Herein, we introduce an alternative and general method to build single fingerprint representations of compound databases. The approach is inspired by the previously published modal fingerprints, which aim to capture the most significant bits of a fingerprint representation for a compound data set. The novelty of the statistical-based database fingerprint (SB-DFP) proposed herein is that it is generated from binomial proportion comparisons, taking as reference the distribution of “1” bits in a large representative set of the chemical space. Results: To illustrate the method, SB-DFPs were constructed for 28 epigenetic target data sets retrieved from a recently published epigenomics database of interest in probe and drug discovery. For each target data set, the SB-DFPs were built based on two representative fingerprints of different design, using as reference a data set with more than 15 million compounds from ZINC. The application of SB-DFP was illustrated and compared to other methods through association relationships of the 28 epigenetic data sets and similarity searching. It was found that SB-DFPs captured, overall, the common features between data sets and the distinct features of each set. In similarity searching, SB-DFP equaled or outperformed other approaches for at least 20 out of the 28 sets. Conclusions: SB-DFP is a general approach based on binomial proportion comparisons to represent a compound data set with a single fingerprint. SB-DFP can be developed, at least in principle, based on any fingerprint and reference data set. SB-DFP is a good alternative for exploring relationships between targets through their associated compound data sets and for performing similarity searching. Electronic supplementary material: The online version of this article (10.1186/s13321-018-0311-x) contains supplementary material, which is available to authorized users.
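One plausible reading of "binomial proportion comparisons" is a per-bit one-sided test of whether a bit is set more often in the target data set than in the reference set. The sketch below follows that reading and is an assumption throughout: the abstract does not state the exact test, the significance level, or how ties are handled, and the function name, the normal approximation, and the default critical value (2.326, the one-sided 99% normal quantile) are illustrative choices rather than the published procedure.

```python
import math


def sb_dfp_sketch(dataset_fps, reference_freq, n_bits, z_crit=2.326):
    """Build a single database fingerprint as a list of 0/1 values.

    dataset_fps: list of sets of on-bit indices, one per compound.
    reference_freq: per-bit proportion of "1" bits in the reference set.
    A bit is set to 1 when its proportion in the data set exceeds the
    reference proportion by more than z_crit standard errors
    (normal approximation to the binomial; an assumption of this sketch).
    """
    n = len(dataset_fps)
    if n == 0:
        raise ValueError("empty data set")
    counts = [0] * n_bits
    for fp in dataset_fps:
        for b in fp:
            counts[b] += 1

    out = []
    for b in range(n_bits):
        p0 = reference_freq[b]
        p_hat = counts[b] / n
        se = math.sqrt(p0 * (1.0 - p0) / n)
        if se == 0.0:
            # Reference bit is always on or always off: fall back to a direct comparison.
            out.append(1 if p_hat > p0 else 0)
        else:
            out.append(1 if (p_hat - p0) / se > z_crit else 0)
    return out
```

Treating the reference proportion as fixed is a simplification that seems reasonable when the reference set is very large, as with the ZINC-derived reference mentioned in the abstract.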
Affiliation(s)
- Norberto Sánchez-Cruz
- Department of Pharmacy, School of Chemistry, Universidad Nacional Autónoma de México, Avenida Universidad 3000, 04510, Mexico City, Mexico.
- José L Medina-Franco
- Department of Pharmacy, School of Chemistry, Universidad Nacional Autónoma de México, Avenida Universidad 3000, 04510, Mexico City, Mexico.
|
35
|
Saldívar-González FI, Gómez-García A, Chávez-Ponce de León DE, Sánchez-Cruz N, Ruiz-Rios J, Pilón-Jiménez BA, Medina-Franco JL. Inhibitors of DNA Methyltransferases From Natural Sources: A Computational Perspective. Front Pharmacol 2018; 9:1144. [PMID: 30364171 PMCID: PMC6191485 DOI: 10.3389/fphar.2018.01144] [Citation(s) in RCA: 29] [Impact Index Per Article: 4.8] [Received: 08/08/2018] [Accepted: 09/21/2018] [Indexed: 12/15/2022] Open
Abstract
Naturally occurring small molecules include a large variety of natural products from different sources that have confirmed activity against epigenetic targets. In this work we review chemoinformatic, molecular modeling, and other computational approaches that have been used to uncover natural products as inhibitors of DNA methyltransferases, a major family of epigenetic targets of therapeutic interest. Examples of computational approaches surveyed in this work are docking, similarity-based virtual screening, and pharmacophore modeling. We also discuss the chemoinformatic-guided exploration of the chemical space of naturally occurring compounds as epigenetic modulators, which may have significant implications for epigenetic drug discovery and nutriepigenetics.
Affiliation(s)
- Alejandro Gómez-García
- Department of Pharmacy, School of Chemistry, National Autonomous University of Mexico, Mexico City, Mexico
- Norberto Sánchez-Cruz
- Department of Pharmacy, School of Chemistry, National Autonomous University of Mexico, Mexico City, Mexico
- Javier Ruiz-Rios
- Department of Pharmacy, School of Chemistry, National Autonomous University of Mexico, Mexico City, Mexico
- B Angélica Pilón-Jiménez
- Department of Pharmacy, School of Chemistry, National Autonomous University of Mexico, Mexico City, Mexico
- José L Medina-Franco
- Department of Pharmacy, School of Chemistry, National Autonomous University of Mexico, Mexico City, Mexico
|
36
|
Capuzzi SJ, Sun W, Muratov EN, Martínez-Romero C, He S, Zhu W, Li H, Tawa G, Fisher EG, Xu M, Shinn P, Qiu X, García-Sastre A, Zheng W, Tropsha A. Computer-Aided Discovery and Characterization of Novel Ebola Virus Inhibitors. J Med Chem 2018; 61:3582-3594. [PMID: 29624387 DOI: 10.1021/acs.jmedchem.8b00035] [Citation(s) in RCA: 27] [Impact Index Per Article: 4.5] [Indexed: 01/03/2023]
Abstract
The Ebola virus (EBOV) causes severe human infection that lacks effective treatment. A recent screen identified a series of compounds that block EBOV-like particle entry into human cells. Using data from this screen, quantitative structure-activity relationship models were built and employed for virtual screening of a ∼17 million compound library. Experimental testing of 102 hits yielded 14 compounds with IC50 values under 10 μM, including several sub-micromolar inhibitors, and more than 10-fold selectivity against host cytotoxicity. These confirmed hits include FDA-approved drugs and clinical candidates with non-antiviral indications, as well as compounds with novel scaffolds and no previously known bioactivity. Five selected hits inhibited BSL-4 live-EBOV infection in a dose-dependent manner, including vindesine (0.34 μM). Additional studies of these novel anti-EBOV compounds revealed their mechanisms of action, including the inhibition of NPC1 protein, cathepsin B/L, and lysosomal function. Compounds identified in this study are among the most potent and well-characterized anti-EBOV inhibitors reported to date.
Affiliation(s)
- Stephen J Capuzzi
- Laboratory for Molecular Modeling, Division of Chemical Biology and Medicinal Chemistry, UNC Eshelman School of Pharmacy, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina 27599, United States
- Wei Sun
- National Center for Advancing Translational Sciences, National Institutes of Health, Bethesda, Maryland 20892, United States
- Eugene N Muratov
- Laboratory for Molecular Modeling, Division of Chemical Biology and Medicinal Chemistry, UNC Eshelman School of Pharmacy, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina 27599, United States; Department of Chemical Technology, Odessa National Polytechnic University, Odessa 65000, Ukraine
- Carles Martínez-Romero
- Department of Microbiology, Icahn School of Medicine at Mount Sinai, New York, New York 10029, United States; Global Health and Emerging Pathogens Institute, Icahn School of Medicine at Mount Sinai, New York, New York 10029, United States
- Shihua He
- Special Pathogens Program, National Microbiology Laboratory, Public Health Agency of Canada, 1015 Arlington Street, Winnipeg, Manitoba R3E 3R2, Canada
- Wenjun Zhu
- Special Pathogens Program, National Microbiology Laboratory, Public Health Agency of Canada, 1015 Arlington Street, Winnipeg, Manitoba R3E 3R2, Canada; Department of Medical Microbiology, University of Manitoba, 745 Bannatyne Avenue, Winnipeg, Manitoba R3E 0J9, Canada
- Hao Li
- National Center for Advancing Translational Sciences, National Institutes of Health, Bethesda, Maryland 20892, United States
- Gregory Tawa
- National Center for Advancing Translational Sciences, National Institutes of Health, Bethesda, Maryland 20892, United States
- Ethan G Fisher
- National Center for Advancing Translational Sciences, National Institutes of Health, Bethesda, Maryland 20892, United States
- Miao Xu
- National Center for Advancing Translational Sciences, National Institutes of Health, Bethesda, Maryland 20892, United States
- Paul Shinn
- National Center for Advancing Translational Sciences, National Institutes of Health, Bethesda, Maryland 20892, United States
- Xiangguo Qiu
- Special Pathogens Program, National Microbiology Laboratory, Public Health Agency of Canada, 1015 Arlington Street, Winnipeg, Manitoba R3E 3R2, Canada; Department of Medical Microbiology, University of Manitoba, 745 Bannatyne Avenue, Winnipeg, Manitoba R3E 0J9, Canada
- Adolfo García-Sastre
- Department of Microbiology, Icahn School of Medicine at Mount Sinai, New York, New York 10029, United States; Global Health and Emerging Pathogens Institute, Icahn School of Medicine at Mount Sinai, New York, New York 10029, United States; Department of Medicine, Division of Infectious Diseases, Icahn School of Medicine at Mount Sinai, New York, New York 10029, United States
- Wei Zheng
- National Center for Advancing Translational Sciences, National Institutes of Health, Bethesda, Maryland 20892, United States
- Alexander Tropsha
- Laboratory for Molecular Modeling, Division of Chemical Biology and Medicinal Chemistry, UNC Eshelman School of Pharmacy, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina 27599, United States
|
37
|
Naveja JJ, Medina-Franco JL. Insights from pharmacological similarity of epigenetic targets in epipolypharmacology. Drug Discov Today 2018; 23:141-150. [DOI: 10.1016/j.drudis.2017.10.006] [Citation(s) in RCA: 19] [Impact Index Per Article: 3.2] [Received: 08/03/2017] [Revised: 09/05/2017] [Accepted: 10/05/2017] [Indexed: 01/10/2023]
|
38
|
Naveja JJ, Oviedo-Osornio CI, Trujillo-Minero NN, Medina-Franco JL. Chemoinformatics: a perspective from an academic setting in Latin America. Mol Divers 2017; 22:247-258. [PMID: 29204824 DOI: 10.1007/s11030-017-9802-3] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.9] [Received: 10/05/2017] [Accepted: 11/26/2017] [Indexed: 12/13/2022]
Abstract
This perspective discusses the current progress of a chemoinformatics group in a major university in Latin America. Three major aspects are discussed in a critical manner: research, education, and collaboration with industry and other public research networks. An overview of the progress in applied research and the development of research concepts is also presented. Efforts to teach chemoinformatics at the undergraduate and graduate levels are discussed. We address how the partnership with industry and other not-for-profit research institutions not only brings additional sources of funding but, more importantly, increases the impact of the multidisciplinary work and offers students exposure to other research environments. We also discuss the main perspectives and challenges that remain to be addressed in these settings.
Affiliation(s)
- J Jesús Naveja
- School of Chemistry, Department of Pharmacy, Universidad Nacional Autónoma de México, Avenida Universidad 3000, 04510, Mexico City, Mexico; PECEM, Facultad de Medicina, Universidad Nacional Autónoma de México, Avenida Universidad 3000, 04510, Mexico City, Mexico
- C Iluhí Oviedo-Osornio
- School of Chemistry, Department of Pharmacy, Universidad Nacional Autónoma de México, Avenida Universidad 3000, 04510, Mexico City, Mexico
- Nicole N Trujillo-Minero
- School of Chemistry, Department of Pharmacy, Universidad Nacional Autónoma de México, Avenida Universidad 3000, 04510, Mexico City, Mexico
- José L Medina-Franco
- School of Chemistry, Department of Pharmacy, Universidad Nacional Autónoma de México, Avenida Universidad 3000, 04510, Mexico City, Mexico
|
39
|
The potential role of in silico approaches to identify novel bioactive molecules from natural resources. Future Med Chem 2017; 9:1665-1686. [PMID: 28841048 DOI: 10.4155/fmc-2017-0124] [Citation(s) in RCA: 20] [Impact Index Per Article: 2.9] [Indexed: 12/27/2022] Open
Abstract
In recent years, the integration of in silico approaches into natural product (NP) research has reawakened the declining interest in NP-based drug discovery efforts. In particular, advancements in cheminformatics have enabled comparison of NP databases with contemporary small-molecule libraries in terms of molecular properties and chemical space localization. Virtual screening and target fishing approaches have been successful in recognizing the untold macromolecular targets for NPs to exploit the unmet therapeutic needs. Developments in molecular docking and scoring methods, along with molecular dynamics, have enabled more accurate prediction of target-ligand interactions, taking into consideration the remarkable structural complexity of NPs. Hence, innovative in silico strategies have contributed valuably to NP research in drug discovery processes, as reviewed herein.
|