1
Golyak IS, Anfimov DR, Demkin PP, Berezhanskiy PV, Nebritova OA, Morozov AN, Fufurin IL. A hybrid learning approach to better classify exhaled breath's infrared spectra: A noninvasive optical diagnosis for socially significant diseases. JOURNAL OF BIOPHOTONICS 2024:e202400151. [PMID: 39075328 DOI: 10.1002/jbio.202400151] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/08/2024] [Revised: 07/05/2024] [Accepted: 07/15/2024] [Indexed: 07/31/2024]
Abstract
Early diagnosis is crucial for effective treatment of socially significant diseases, such as type 1 diabetes mellitus (T1DM), pneumonia, and asthma. This study employs a diagnostic method based on infrared laser spectroscopy of human exhaled breath. The experimental setup comprises a quantum cascade laser, which emits in a pulsed mode with a peak power of up to 150 mW in the spectral range of 5.3-12.8 μm (780-1890 cm⁻¹), and a Herriott multipass gas cell with an optical path length of 76 m. Using this setup, spectra of exhaled breath in the mid-infrared range were obtained from 165 volunteers, including healthy individuals and patients with T1DM, asthma, and pneumonia. The study proposes a hybrid approach for classifying these spectra, utilizing a variational autoencoder for dimensionality reduction and a support vector machine for classification. The results demonstrate that the proposed hybrid approach outperforms other machine learning method combinations.
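The two spectral ranges quoted in the abstract are the same interval in different units. The following minimal sketch shows the conversion, assuming vacuum wavelengths; the function name is illustrative and not taken from the paper.

```python
def wavelength_um_to_wavenumber_cm1(wavelength_um: float) -> float:
    """Convert a vacuum wavelength in micrometres to a wavenumber in cm^-1."""
    if wavelength_um <= 0:
        raise ValueError("wavelength must be positive")
    # 1 cm = 10,000 um, so wavenumber (cm^-1) = 10,000 / wavelength (um)
    return 10_000.0 / wavelength_um


# Tuning range quoted in the abstract: 5.3-12.8 um
low_wavenumber = wavelength_um_to_wavenumber_cm1(12.8)   # longest wavelength
high_wavenumber = wavelength_um_to_wavenumber_cm1(5.3)   # shortest wavelength
print(f"{low_wavenumber:.1f} to {high_wavenumber:.1f} cm^-1")
```

The converted limits (about 781 and 1887 cm⁻¹) agree with the rounded 780-1890 cm⁻¹ given in the abstract.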

2
Mervin L, Voronov A, Kabeshov M, Engkvist O. QSARtuna: An Automated QSAR Modeling Platform for Molecular Property Prediction in Drug Design. J Chem Inf Model 2024; 64:5365-5374. [PMID: 38950185 DOI: 10.1021/acs.jcim.4c00457] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 07/03/2024]
Abstract
Machine-learning (ML) and deep-learning (DL) approaches are increasingly deployed within the design-make-test-analyze (DMTA) drug design cycle to predict molecular properties of interest for small molecules. Despite this uptake, there are only a few automated packages to aid their development and deployment that also support uncertainty estimation, model explainability, and other key aspects of model usage. This represents a key unmet need within the field, and the large number of molecular representations and algorithms (and associated parameters) means it is nontrivial to robustly optimize, evaluate, reproduce, and deploy models. Here, we present QSARtuna, a molecule property prediction modeling pipeline, written in Python and utilizing the Optuna, Scikit-learn, RDKit, and ChemProp packages, which enables the efficient and automated comparison between molecular representations and machine learning models. The platform was developed with the increasingly important aspects of model uncertainty quantification and explainability considered by design. We describe the framework and provide illustrative examples to demonstrate the capability of the software when applied to simple molecular property prediction, reaction/reactivity prediction, and DNA encoded library enrichment classification. We hope that the release of QSARtuna will further spur innovation in automatic ML modeling and provide a platform for education of best practices in molecular property modeling. The code for the QSARtuna framework is made freely available via GitHub.
Affiliation(s)
- Lewis Mervin
  - Molecular AI, Discovery Sciences, R&D, AstraZeneca, Cambridge CB2 0AA, United Kingdom
- Alexey Voronov
  - Molecular AI, Discovery Sciences, R&D, AstraZeneca, Gothenburg 412 96, Sweden
- Mikhail Kabeshov
  - Molecular AI, Discovery Sciences, R&D, AstraZeneca, Gothenburg 412 96, Sweden
- Ola Engkvist
  - Molecular AI, Discovery Sciences, R&D, AstraZeneca, Gothenburg 412 96, Sweden
  - Department of Computer Science and Engineering, University of Gothenburg, Chalmers University of Technology, Gothenburg 412 96, Sweden

3
Karampuri A, Kundur S, Perugu S. Exploratory drug discovery in breast cancer patients: A multimodal deep learning approach to identify novel drug candidates targeting RTK signaling. Comput Biol Med 2024; 174:108433. [PMID: 38642491 DOI: 10.1016/j.compbiomed.2024.108433] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/01/2024] [Revised: 04/04/2024] [Accepted: 04/07/2024] [Indexed: 04/22/2024]
Abstract
Breast cancer, a highly formidable and diverse malignancy predominantly affecting women globally, poses a significant threat due to its intricate genetic variability, rendering it challenging to diagnose accurately. Various therapies such as immunotherapy, radiotherapy, and diverse chemotherapy approaches like drug repurposing and combination therapy are widely used depending on cancer subtype and metastasis severity. Our study revolves around an innovative drug discovery strategy targeting potential drug candidates specific to RTK signalling, a prominently targeted receptor class in cancer. To accomplish this, we developed and rigorously validated a multimodal deep neural network (MM-DNN) based QSAR model that integrates omics datasets covering genomic and proteomic expression data and drug responses. The results showcase an R² value of 0.917 and an RMSE value of 0.312, affirming the model's commendable predictive capabilities. Structural analogs of drug molecules specific to RTK signalling were sourced from the PubChem database, followed by meticulous screening to eliminate dissimilar compounds. Leveraging the MM-DNN-based QSAR model, we predicted the biological activity of these molecules, subsequently clustering them into three distinct groups. Feature importance analysis was performed. Consequently, we successfully identified prime drug candidates tailored for each potential downstream regulatory protein within the RTK signalling pathway. This method makes the early stages of drug development faster by removing inactive compounds, providing a hopeful path in combating breast cancer.
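For readers unfamiliar with the two reported metrics, here is a minimal sketch of how R² (coefficient of determination) and RMSE (root mean square error) are computed from observed and predicted activities. The numbers are toy values chosen for easy arithmetic and are not data from the study.

```python
import math


def rmse(observed, predicted):
    """Root mean square error between observed and predicted values."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared_errors) / len(observed))


def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    mean_observed = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_observed) ** 2 for o in observed)
    if ss_tot == 0:
        raise ValueError("observed values have zero variance")
    return 1.0 - ss_res / ss_tot


# Toy activities (hypothetical, for illustration only)
observed = [5.0, 6.0, 7.0, 8.0]
predicted = [5.5, 5.5, 7.5, 7.5]
print(r_squared(observed, predicted), rmse(observed, predicted))
```

An R² close to 1 together with a small RMSE, as reported in the abstract, indicates predictions that track the measured values closely on the evaluated data.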
Affiliation(s)
- Anush Karampuri
  - Department of Biotechnology, National Institute of Technology, Warangal, 500604, India
- Sunitha Kundur
  - Department of Biotechnology, National Institute of Technology, Warangal, 500604, India
- Shyam Perugu
  - Department of Biotechnology, National Institute of Technology, Warangal, 500604, India

4
Łapińska N, Pacławski A, Szlęk J, Mendyk A. SerotoninAI: Serotonergic System Focused, Artificial Intelligence-Based Application for Drug Discovery. J Chem Inf Model 2024; 64:2150-2157. [PMID: 38289046 PMCID: PMC11005036 DOI: 10.1021/acs.jcim.3c01517] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/21/2023] [Revised: 01/02/2024] [Accepted: 01/04/2024] [Indexed: 04/09/2024]
Abstract
SerotoninAI is an innovative web application for scientific purposes focused on the serotonergic system. By leveraging SerotoninAI, researchers can assess the affinity (pKi value) of a molecule to all main serotonin receptors and serotonin transporters based on molecule structure introduced as SMILES. Additionally, the application provides essential insights into critical attributes of potential drugs such as blood-brain barrier penetration and human intestinal absorption. The complexity of the serotonergic system demands advanced tools for accurate predictions, which is a fundamental requirement in drug development. SerotoninAI addresses this need by providing an intuitive user interface that generates predictions of pKi values for the main serotonergic targets. The application is freely available on the Internet at https://serotoninai.streamlit.app/, implemented in Streamlit with all major web browsers supported. Currently, to the best of our knowledge, there is no tool that allows users to access affinity predictions for serotonergic targets without registration or financial obligations. SerotoninAI significantly increases the scope of drug development activities worldwide. The source code of the application is available at https://github.com/nczub/SerotoninAI_streamlit.
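The pKi value mentioned in the abstract is the negative base-10 logarithm of the inhibition constant Ki expressed in mol/L. The sketch below shows the conversion in both directions. The function names and the 10 nM example are illustrative assumptions and are not part of SerotoninAI.

```python
import math


def pki_from_ki(ki_molar: float) -> float:
    """pKi = -log10(Ki), with Ki given in mol/L."""
    if ki_molar <= 0:
        raise ValueError("Ki must be positive")
    return -math.log10(ki_molar)


def ki_from_pki(pki: float) -> float:
    """Inverse conversion: Ki (mol/L) = 10 ** (-pKi)."""
    return 10.0 ** (-pki)


# Hypothetical ligand with Ki = 10 nM = 1e-8 mol/L
example_pki = pki_from_ki(1e-8)
print(example_pki)
```

A higher pKi therefore means a lower Ki, that is, a stronger predicted affinity.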
Affiliation(s)
- Natalia Łapińska
  - Department of Pharmaceutical Technology and Biopharmaceutics, Jagiellonian University Medical College, 30-688 Kraków, Poland
  - Doctoral School of Medicinal and Health Sciences, Jagiellonian University Medical College, 30-688 Kraków, Poland
- Adam Pacławski
  - Department of Pharmaceutical Technology and Biopharmaceutics, Jagiellonian University Medical College, 30-688 Kraków, Poland
- Jakub Szlęk
  - Department of Pharmaceutical Technology and Biopharmaceutics, Jagiellonian University Medical College, 30-688 Kraków, Poland
- Aleksander Mendyk
  - Department of Pharmaceutical Technology and Biopharmaceutics, Jagiellonian University Medical College, 30-688 Kraków, Poland

5
Oliveira PF, Guedes RC, Falcao AO. Inferring molecular inhibition potency with AlphaFold predicted structures. Sci Rep 2024; 14:8252. [PMID: 38589418 PMCID: PMC11001998 DOI: 10.1038/s41598-024-58394-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/03/2023] [Accepted: 03/28/2024] [Indexed: 04/10/2024] Open
Abstract
Even though in silico ligand-based drug discovery methods have been successful in predicting interactions with known target proteins, they struggle with new, unassessed targets. To address this challenge, we propose an approach that integrates structural data from AlphaFold 2 predicted protein structures into machine learning models. Our method extracts 3D structural protein fingerprints and combines them with ligand structural data to train a single machine learning model. This model captures the relationship between ligand properties and the unique structural features of various target proteins, enabling predictions for never-before-tested molecules and protein targets. To assess our model, we used a dataset of 144 human G-protein coupled receptors (GPCRs) with over 140,000 measured inhibition constant (Ki) values. Results strongly suggest that our approach performs as well as state-of-the-art ligand-based methods. In a second modeling approach that used 129 targets for training and a separate test set of 15 different protein targets, our model correctly predicted interactions for 73% of targets, with explained variances exceeding 0.50 in 22% of cases. Our findings further verified that models built with experimentally determined protein structures were statistically indistinguishable from those built with the AlphaFold synthetic structures. This study presents a proteo-chemometric drug screening approach that uses a simple and scalable method for extracting protein structural information for use in machine learning models capable of predicting protein-molecule interactions even for orphan targets.
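The second modeling approach holds out whole protein targets, so every test target is unseen during training. The sketch below illustrates such a split by target on toy records. The target and ligand identifiers, the record layout, and the random selection of held-out targets are all assumptions for illustration, since the abstract does not say how the 15 test targets were chosen.

```python
import random


def split_by_target(records, n_test_targets, seed=0):
    """Split records so that no target appears in both train and test."""
    targets = sorted({record["target"] for record in records})
    if not 0 < n_test_targets < len(targets):
        raise ValueError("n_test_targets must be between 1 and n_targets - 1")
    rng = random.Random(seed)  # fixed seed keeps the split reproducible
    test_targets = set(rng.sample(targets, n_test_targets))
    train = [r for r in records if r["target"] not in test_targets]
    test = [r for r in records if r["target"] in test_targets]
    return train, test, test_targets


# Toy data: 144 hypothetical targets with 3 hypothetical ligands each
records = [
    {"target": f"T{t:03d}", "ligand": f"L{l}"}
    for t in range(144)
    for l in range(3)
]
train, test, held_out = split_by_target(records, n_test_targets=15)
print(len({r["target"] for r in train}), len(held_out))
```

Splitting by target rather than by individual measurement is what makes the test a check of generalization to new proteins, which is the orphan-target setting the abstract describes.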
Affiliation(s)
- Pedro F Oliveira
  - Lasige, Faculdade de Ciências, Universidade de Lisboa, 1749-016 Lisboa, Portugal
- Rita C Guedes
  - Research Institute for Medicines (iMed.ULisboa), Faculdade de Farmácia, Universidade de Lisboa, Av. Prof. Gama Pinto, 1649-003 Lisboa, Portugal
- Andre O Falcao
  - Departamento de Informática, Faculdade de Ciências, Universidade de Lisboa, 1749-016 Lisboa, Portugal

6
Ilan Y. Special Issue "Computer-Aided Drug Discovery and Treatment". Int J Mol Sci 2024; 25:2683. [PMID: 38473929 DOI: 10.3390/ijms25052683] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/24/2024] [Accepted: 02/21/2024] [Indexed: 03/14/2024] Open
Abstract
This Special Issue aims to highlight some of the latest developments in drug discovery [...].
Affiliation(s)
- Yaron Ilan
  - Department of Medicine, Hadassah Medical Center, Faculty of Medicine, Hebrew University, Jerusalem 91120, Israel

7
Karampuri A, Perugu S. A breast cancer-specific combinational QSAR model development using machine learning and deep learning approaches. FRONTIERS IN BIOINFORMATICS 2024; 3:1328262. [PMID: 38288043 PMCID: PMC10822965 DOI: 10.3389/fbinf.2023.1328262] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/01/2023] [Accepted: 12/21/2023] [Indexed: 01/31/2024] Open
Abstract
Breast cancer is the most prevalent and heterogeneous form of cancer affecting women worldwide. Various therapeutic strategies are in practice based on the extent of disease spread, such as surgery, chemotherapy, radiotherapy, and immunotherapy. Combinational therapy is another strategy that has proven to be effective in controlling cancer progression. It involves administering an anchor drug, a well-established primary therapeutic agent with known efficacy for specific targets, together with a library drug, a supplementary drug that enhances the efficacy of the anchor drug and broadens the therapeutic approach. Our work focused on harnessing regression-based machine learning (ML) and deep learning (DL) algorithms to develop a structure-activity relationship between the molecular descriptors of drug pairs and their combined biological activity through a quantitative structure-activity relationship (QSAR) model. Eleven widely used machine learning and deep learning algorithms were used to develop QSAR models. A total of 52 breast cancer cell lines, 25 anchor drugs, and 51 library drugs were considered in developing the QSAR model. Deep neural networks (DNNs) achieved an R² (coefficient of determination) of 0.94 with an RMSE (root mean square error) of 0.255, making them the most effective algorithm for developing a structure-activity relationship with strong generalization capabilities. In conclusion, applying combinational therapy alongside ML and DL techniques represents a promising approach to combating breast cancer.
Affiliation(s)
- Shyam Perugu
  - Department of Biotechnology, National Institute of Technology, Warangal, India

8
Evbuomwan IO, Alejolowo OO, Elebiyo TC, Nwonuma CO, Ojo OA, Edosomwan EU, Chikwendu JI, Elosiuba NV, Akulue JC, Dogunro FA, Rotimi DE, Osemwegie OO, Ojo AB, Ademowo OG, Adeyemi OS, Oluba OM. In silico modeling revealed phytomolecules derived from Cymbopogon citratus (DC.) leaf extract as promising candidates for malaria therapy. J Biomol Struct Dyn 2024; 42:101-118. [PMID: 36974933 DOI: 10.1080/07391102.2023.2192799] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/02/2023] [Accepted: 03/10/2023] [Indexed: 03/29/2023]
Abstract
The emergence of varying levels of resistance to currently available antimalarial drugs significantly threatens global health. This factor heightens the urgency to explore bioactive compounds from natural products with a view to discovering and developing newer antimalarial drugs with novel modes of action. Therefore, we evaluated the inhibitory effects of sixteen phytocompounds from Cymbopogon citratus leaf extract against Plasmodium falciparum drug targets such as P. falciparum circumsporozoite protein (PfCSP), P. falciparum merozoite surface protein 1 (PfMSP1) and P. falciparum erythrocyte membrane protein 1 (PfEMP1). In silico approaches including molecular docking, pharmacophore modeling and 3D-QSAR were adopted to analyze the inhibitory activity of the compounds under consideration. The molecular docking results indicated that swertiajaponin, a compound from C. citratus, exhibited a higher binding affinity (-7.8 kcal/mol) to PfMSP1 than the standard artesunate-amodiaquine (-6.6 kcal/mol). Swertiajaponin also formed strong hydrogen bond interactions with the LYS29, CYS30, TYR34, ASN52, GLY55 and CYS28 amino acid residues. In addition, quercetin, another compound from C. citratus, exhibited significant binding energies of -6.8 and -8.3 kcal/mol with PfCSP and PfEMP1, respectively, although these were slightly weaker than those of the standard artemether-lumefantrine (-7.4 kcal/mol against PfCSP and -8.7 kcal/mol against PfEMP1). Overall, the present study provides evidence that swertiajaponin and other phytomolecules from C. citratus have modulatory properties toward P. falciparum drug targets and thus may warrant further exploration in early drug discovery efforts against malaria. Furthermore, these findings lend credence to the folkloric use of C. citratus for malaria treatment. Communicated by Ramaswamy H. Sarma.
Affiliation(s)
- Ikponmwosa Owen Evbuomwan
  - SDG #03 Group - Good Health and Well-Being Research Cluster, Landmark University, Omu-Aran, Nigeria
  - Department of Biochemistry, Landmark University, Omu-Aran, Nigeria
  - Department of Food Science and Microbiology, Landmark University, Omu-Aran, Nigeria
- Omokolade Oluwaseyi Alejolowo
  - SDG #03 Group - Good Health and Well-Being Research Cluster, Landmark University, Omu-Aran, Nigeria
  - Department of Biochemistry, Landmark University, Omu-Aran, Nigeria
- Charles Obiora Nwonuma
  - SDG #03 Group - Good Health and Well-Being Research Cluster, Landmark University, Omu-Aran, Nigeria
  - Department of Biochemistry, Landmark University, Omu-Aran, Nigeria
- Oluwafemi Adeleke Ojo
  - Phytomedicine, Molecular Toxicology and Computational Biochemistry Research Group, Department of Biochemistry, Bowen University, Iwo, Nigeria
- Evelyn Uwa Edosomwan
  - Department of Animal and Environmental Biology, University of Benin, Benin City, Nigeria
- Damilare Emmanuel Rotimi
  - SDG #03 Group - Good Health and Well-Being Research Cluster, Landmark University, Omu-Aran, Nigeria
  - Department of Biochemistry, Landmark University, Omu-Aran, Nigeria
- Olusegun George Ademowo
  - Department of Pharmacology and Therapeutics, Faculty of Basic Medical Sciences, University of Ibadan, Ibadan, Nigeria
  - Drug Research Laboratory, Institute of Advanced Medical Research and Training (IMRAT), College of Medicine, University of Ibadan, Ibadan, Nigeria
- Oluyomi Stephen Adeyemi
  - SDG #03 Group - Good Health and Well-Being Research Cluster, Landmark University, Omu-Aran, Nigeria
  - Department of Biochemistry, Landmark University, Omu-Aran, Nigeria
  - Laboratory of Sustainable Animal Environment, Graduate School of Agricultural Science, Tohoku University, Osaki, Miyagi, Japan
- Olarewaju Michael Oluba
  - SDG #03 Group - Good Health and Well-Being Research Cluster, Landmark University, Omu-Aran, Nigeria
  - Department of Biochemistry, Landmark University, Omu-Aran, Nigeria

9
Oselusi SO, Sibuyi NRS, Meyer M, Madiehe AM. Ehretia Species Phytoconstituents as Potential Lead Compounds against Klebsiella pneumoniae Carbapenemase: A Computational Approach. BIOMED RESEARCH INTERNATIONAL 2023; 2023:8022356. [PMID: 37869630 PMCID: PMC10586912 DOI: 10.1155/2023/8022356] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/14/2023] [Revised: 09/05/2023] [Accepted: 09/26/2023] [Indexed: 10/24/2023]
Abstract
The evolution of antibiotic-resistant carbapenemase has negatively impacted the management of critical healthcare-associated infections. K. pneumoniae carbapenemase-2 (KPC-2) expressing bacteria have developed resistance to conventional therapeutic options, including those used as a last resort for life-threatening diseases. In this study, Ehretia species phytoconstituents were screened for their potential to inhibit KPC-2 protein using in silico approaches. Molecular docking was used to identify strong KPC-2 protein binding phytoconstituents retrieved from the literature. The best-docked conformation of the ligands was selected based on their glide energy and binding interactions. To determine their binding free energies, these hit compounds were subjected to molecular mechanics with generalized Born and surface area (MM-GBSA) calculations in the PRIME module. Pharmacological assessments of the ligands were performed to evaluate their drug-likeness. Molecular dynamics (MD) simulations were used to analyze the conformational stability of the selected druglike compounds within the active site of the KPC-2 protein. Overall, a total of 69 phytoconstituents were compiled from the literature. Fourteen of these compounds exhibited a stronger binding affinity for the protein target than the reference drugs. Four of these top hit compounds, DB09, DB12, DB28, and DB66, revealed the highest efficacy in terms of drug-likeness properties. The MD simulation established that among the druglike compounds, DB66 attained stable conformations after 150 ns simulation in the active site of the protein. We concluded that DB66 from Ehretia species could play a significant role in therapeutic efforts against KPC-2-expressing bacteria.
Affiliation(s)
- Samson O. Oselusi
  - Nanobiotechnology Research Group, Department of Biotechnology, University of the Western Cape, Private Bag X17, Bellville, Cape Town 7535, South Africa
  - DSI/Mintek Nanotechnology Innovation Centre (NIC), Biolabels Research Node, Department of Biotechnology, University of the Western Cape, Private Bag X17, Bellville, Cape Town 7535, South Africa
- Nicole R. S. Sibuyi
  - DSI/Mintek Nanotechnology Innovation Centre (NIC), Biolabels Research Node, Department of Biotechnology, University of the Western Cape, Private Bag X17, Bellville, Cape Town 7535, South Africa
- Mervin Meyer
  - DSI/Mintek Nanotechnology Innovation Centre (NIC), Biolabels Research Node, Department of Biotechnology, University of the Western Cape, Private Bag X17, Bellville, Cape Town 7535, South Africa
- Abram M. Madiehe
  - Nanobiotechnology Research Group, Department of Biotechnology, University of the Western Cape, Private Bag X17, Bellville, Cape Town 7535, South Africa
  - DSI/Mintek Nanotechnology Innovation Centre (NIC), Biolabels Research Node, Department of Biotechnology, University of the Western Cape, Private Bag X17, Bellville, Cape Town 7535, South Africa

10
Lanini J, Santarossa G, Sirockin F, Lewis R, Fechner N, Misztela H, Lewis S, Maziarz K, Stanley M, Segler M, Stiefl N, Schneider N. PREFER: A New Predictive Modeling Framework for Molecular Discovery. J Chem Inf Model 2023; 63:4497-4504. [PMID: 37487018 DOI: 10.1021/acs.jcim.3c00523] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 07/26/2023]
Abstract
Machine-learning and deep-learning models have been extensively used in cheminformatics to predict molecular properties, to reduce the need for direct measurements, and to accelerate compound prioritization. However, different setups and frameworks and the large number of molecular representations make it difficult to properly evaluate, reproduce, and compare them. Here we present a new PREdictive modeling FramEwoRk for molecular discovery (PREFER), written in Python (version 3.7.7) and based on AutoSklearn (version 0.14.7), that allows comparison between different molecular representations and common machine-learning models. We provide an overview of the design of our framework and show exemplary use cases and results of several representation-model combinations on diverse data sets, both public and in-house. Finally, we discuss the use of PREFER on small data sets. The code of the framework is freely available on GitHub.
Affiliation(s)
- Jessica Lanini
  - Novartis Institutes for BioMedical Research, Novartis Pharma AG, Novartis Campus, 4002 Basel, Switzerland
- Gianluca Santarossa
  - Novartis Institutes for BioMedical Research, Novartis Pharma AG, Novartis Campus, 4002 Basel, Switzerland
- Finton Sirockin
  - Novartis Institutes for BioMedical Research, Novartis Pharma AG, Novartis Campus, 4002 Basel, Switzerland
- Richard Lewis
  - Novartis Institutes for BioMedical Research, Novartis Pharma AG, Novartis Campus, 4002 Basel, Switzerland
- Nikolas Fechner
  - Novartis Institutes for BioMedical Research, Novartis Pharma AG, Novartis Campus, 4002 Basel, Switzerland
- Sarah Lewis
  - Microsoft Research AI4Science, Cambridge CB1 2FB, U.K.
- Megan Stanley
  - Microsoft Research AI4Science, Cambridge CB1 2FB, U.K.
- Marwin Segler
  - Microsoft Research AI4Science, Cambridge CB1 2FB, U.K.
- Nikolaus Stiefl
  - Novartis Institutes for BioMedical Research, Novartis Pharma AG, Novartis Campus, 4002 Basel, Switzerland
- Nadine Schneider
  - Novartis Institutes for BioMedical Research, Novartis Pharma AG, Novartis Campus, 4002 Basel, Switzerland

11
Jihad MI, Mahdi MF. Molecular Docking Study of New Sorafenib Analogues as Platelet-Derived Growth Factor Receptor Inhibitors for the Treatment of Cancer. JOURNAL OF PHARMACY AND BIOALLIED SCIENCES 2023; 15:S1023-S1026. [PMID: 37694099 PMCID: PMC10485473 DOI: 10.4103/jpbs.jpbs_244_23] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/12/2023] [Revised: 03/16/2023] [Accepted: 03/17/2023] [Indexed: 09/12/2023] Open
Abstract
Cancer is a disease triggered by the uncontrolled growth of a group of cells, usually arising from a single cell. Chemotherapy is a common systemic therapy that uses anticancer drugs, also known as chemotherapeutic agents, to treat cancer. Tyrosine kinases are a subset of protein kinases, a family of over 90 enzymes that selectively phosphorylate tyrosine residues in various substrates. Receptors with intrinsic tyrosine kinase activity mediate the actions of several growth factors, differentiation factors, and hormones, resulting in the proliferation and differentiation of the affected cells. In the fight against cancer, the platelet-derived growth factor receptor has emerged as a novel target, because inhibiting this receptor inhibits the tyrosine kinase cascade. Docking investigations were conducted using the Genetic Optimization for Ligand Docking (GOLD) Suite (v. 5.7.1) from the Cambridge Crystallographic Data Centre. A high-resolution X-ray crystal structure of the platelet-derived growth factor protein [Protein Data Bank (PDB) ID 6JOL], with a resolution of 2 Å, was downloaded from the PDB website. Compounds II, III, VII, and VIII have greater binding energies than the gold-standard medication sorafenib, which gives a Piecewise Linear Potential (PLP) fitness value of 85.3. Other ligands exhibit good inhibitory action and docking scores comparable to that of the reference ligand sorafenib.
Affiliation(s)
- Marwan I. Jihad
  - Department of Pharmaceutical Chemistry, College of Pharmacy, University of Mustansiriyah, Baghdad, Iraq
- Monther F. Mahdi
  - Department of Pharmaceutical Chemistry, College of Pharmacy, University of Mustansiriyah, Baghdad, Iraq

12
Qureshi R, Irfan M, Gondal TM, Khan S, Wu J, Hadi MU, Heymach J, Le X, Yan H, Alam T. AI in drug discovery and its clinical relevance. Heliyon 2023; 9:e17575. [PMID: 37396052 PMCID: PMC10302550 DOI: 10.1016/j.heliyon.2023.e17575] [Citation(s) in RCA: 28] [Impact Index Per Article: 28.0] [Received: 01/10/2023] [Revised: 06/17/2023] [Accepted: 06/21/2023] [Indexed: 07/04/2023] Open
Abstract
The COVID-19 pandemic has emphasized the need for novel drug discovery processes. However, the journey from conceptualizing a drug to its eventual implementation in clinical settings is a long, complex, and expensive process, with many potential points of failure. Over the past decade, a vast growth in medical information has coincided with advances in computational hardware (cloud computing, GPUs, and TPUs) and the rise of deep learning. Medical data generated from large molecular screening profiles, personal health or pathology records, and public health organizations could benefit from analysis by Artificial Intelligence (AI) approaches to speed up and prevent failures in the drug discovery pipeline. We present applications of AI at various stages of drug discovery pipelines, including the inherently computational approaches of de novo design and prediction of a drug's likely properties. Open-source databases and AI-based software tools that facilitate drug design are discussed along with their associated problems of molecule representation, data collection, complexity, labeling, and disparities among labels. How contemporary AI methods, such as graph neural networks, reinforcement learning, and generative models, along with structure-based methods (i.e., molecular dynamics simulations and molecular docking), can contribute to drug discovery applications and analysis of drug responses is also explored. Finally, recent developments and investments in AI-based start-up companies for biotechnology and drug design, and their current progress, hopes and promotions, are discussed in this article.
Affiliation(s)
- Rizwan Qureshi
  - College of Science and Engineering, Hamad Bin Khalifa University, Doha, Qatar
  - Department of Imaging Physics, MD Anderson Cancer Center, The University of Texas, Houston, USA
- Muhammad Irfan
  - Faculty of Electrical Engineering, Ghulam Ishaq Khan Institute of Engineering Sciences and Technology, Swabi, Pakistan
- Sheheryar Khan
  - School of Professional Education & Executive Development, The Hong Kong Polytechnic University, Hong Kong
- Jia Wu
  - Department of Imaging Physics, MD Anderson Cancer Center, The University of Texas, Houston, USA
- John Heymach
  - Department of Thoracic Head and Neck Medical Oncology, Division of Cancer Medicine, The University of Texas, MD Anderson Cancer Center, Houston, USA
- Xiuning Le
  - Department of Thoracic Head and Neck Medical Oncology, Division of Cancer Medicine, The University of Texas, MD Anderson Cancer Center, Houston, USA
- Hong Yan
  - Department of Electrical Engineering, City University of Hong Kong, Kowloon, Hong Kong
- Tanvir Alam
  - College of Science and Engineering, Hamad Bin Khalifa University, Doha, Qatar

13
Al Haj Ishak Al Ali R, Mondamert L, Berjeaud JM, Jandry J, Crépin A, Labanowski J. Application of QSAR Approach to Assess the Effects of Organic Pollutants on Bacterial Virulence Factors. Microorganisms 2023; 11:1375. [PMID: 37374877 DOI: 10.3390/microorganisms11061375] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/11/2023] [Revised: 05/15/2023] [Accepted: 05/22/2023] [Indexed: 06/29/2023] Open
Abstract
The release of a wide variety of persistent chemical contaminants into wastewater has become a growing concern due to their potential health and environmental risks. While the toxic effects of these pollutants on aquatic organisms have been extensively studied, their impact on microbial pathogens and their virulence mechanisms remains largely unexplored. This research paper focuses on the identification and prioritization of chemical pollutants that increase bacterial pathogenicity, which is a public health concern. In order to predict how chemical compounds, such as pesticides and pharmaceuticals, would affect the virulence mechanisms of three bacterial strains (Escherichia coli K12, Pseudomonas aeruginosa H103, and Salmonella enterica serovar Typhimurium), this study has developed quantitative structure-activity relationship (QSAR) models. Analysis of variance (ANOVA) functions are used to develop QSAR models, based on the chemical structure of the compounds, that predict their effect on the growth and swarming behavior of the bacterial strains. The results showed uncertainty in the created model, and that increases in virulence factors, including growth and motility of bacteria, after exposure to the studied compounds can be predicted. These results could be more accurate if the interactions between groups of functions were included. Therefore, to build an accurate and universal model, it is essential to incorporate a larger number of compounds of similar and different structures.
Affiliation(s)
- Roukaya Al Haj Ishak Al Ali
- Institute of Chemistry, Materials and Natural Resources of Poitiers, UMR CNRS 7285, University of Poitiers, 86000 Poitiers, France
- Leslie Mondamert
- Institute of Chemistry, Materials and Natural Resources of Poitiers, UMR CNRS 7285, University of Poitiers, 86000 Poitiers, France
- Jean-Marc Berjeaud
- Ecology and Biology of Interactions, UMR CNRS 7267, University of Poitiers, 86000 Poitiers, France
- Joelle Jandry
- Faculty of Agronomy and Veterinary Sciences, Lebanese University, Dekwaneh, Lebanon
- Alexandre Crépin
- Ecology and Biology of Interactions, UMR CNRS 7267, University of Poitiers, 86000 Poitiers, France
- Jérôme Labanowski
- Institute of Chemistry, Materials and Natural Resources of Poitiers, UMR CNRS 7285, University of Poitiers, 86000 Poitiers, France
|
14
|
Srisongkram T, Khamtang P, Weerapreeyakul N. Prediction of KRAS G12C inhibitors using conjoint fingerprint and machine learning-based QSAR models. J Mol Graph Model 2023; 122:108466. [PMID: 37058997 DOI: 10.1016/j.jmgm.2023.108466] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Received: 09/30/2022] [Revised: 03/19/2023] [Accepted: 03/29/2023] [Indexed: 04/16/2023]
Abstract
Kirsten rat sarcoma virus G12C (KRASG12C) is the major protein mutation associated with non-small cell lung cancer (NSCLC) severity. Inhibiting KRASG12C is therefore one of the key therapeutic strategies for NSCLC patients. In this paper, a cost-effective, data-driven drug design approach employing machine learning-based quantitative structure-activity relationship (QSAR) analysis was built for predicting ligand affinities against KRASG12C protein. A curated and non-redundant dataset of 1033 compounds with KRASG12C inhibitory activity (pIC50) was used to build and test the models. The PubChem fingerprint, Substructure fingerprint, Substructure fingerprint count, and the conjoint fingerprint (a combination of PubChem fingerprint and Substructure fingerprint count) were used to train the models. Using comprehensive validation methods and various machine learning algorithms, the results clearly showed that the XGBoost regression (XGBoost) achieved the highest performance in terms of goodness of fit, predictivity, generalizability and model robustness (R2 = 0.81, Q2CV = 0.60, Q2Ext = 0.62, R2 - Q2Ext = 0.19, R2Y-Random = 0.31 ± 0.03, Q2Y-Random = -0.09 ± 0.04). The top 13 molecular fingerprints that correlated with the predicted pIC50 values were SubFPC274 (aromatic atoms), SubFPC307 (number of chiral-centers), PubChemFP37 (≥1 Chlorine), SubFPC18 (Number of alkylarylethers), SubFPC1 (number of primary carbons), SubFPC300 (number of 1,3-tautomerizables), PubChemFP621 (N-C:C:C:N structure), PubChemFP23 (≥1 Fluorine), SubFPC2 (number of secondary carbons), SubFPC295 (number of C-ONS bonds), PubChemFP199 (≥4 6-membered rings), PubChemFP180 (≥1 nitrogen-containing 6-membered ring), and SubFPC180 (number of tertiary amines). These molecular fingerprints were virtualized and validated using molecular docking experiments. In conclusion, this conjoint fingerprint and XGBoost-QSAR model proved useful as a high-throughput screening tool for KRASG12C inhibitor identification and drug design.
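The validation statistics quoted in this abstract can be illustrated with a minimal Python sketch. It assumes the usual coefficient-of-determination form, 1 - SS_res/SS_tot; the abstract does not state the exact formulas used for Q2CV or Q2Ext, so the function names and the optional reference-mean argument below are illustrative assumptions, not the authors' implementation.

```python
def r_squared(y_true, y_pred, ref_mean=None):
    """Coefficient of determination, 1 - SS_res / SS_tot.

    ref_mean is an assumed convenience: some external-validation
    variants use the training-set mean instead of the test-set mean.
    """
    mean = sum(y_true) / len(y_true) if ref_mean is None else ref_mean
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1 - ss_res / ss_tot


def fit_gap(r2_train, q2_ext):
    """Gap between goodness of fit and external predictivity (R2 - Q2Ext)."""
    return round(r2_train - q2_ext, 2)


# The gap reported in the abstract: R2 = 0.81 and Q2Ext = 0.62.
print(fit_gap(0.81, 0.62))
```

A smaller gap between the training R2 and the external Q2 is generally read as a sign of less overfitting.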
Affiliation(s)
- Tarapong Srisongkram
- Division of Pharmaceutical Chemistry, Faculty of Pharmaceutical Sciences, Khon Kaen University, 40002, Thailand
- Natthida Weerapreeyakul
- Division of Pharmaceutical Chemistry, Faculty of Pharmaceutical Sciences, Khon Kaen University, 40002, Thailand
|
15
|
Xu Z, Chughtai H, Tian L, Liu L, Roy JF, Bayen S. Development of quantitative structure-retention relationship models to improve the identification of leachables in food packaging using non-targeted analysis. Talanta 2023; 253:123861. [PMID: 36095943 DOI: 10.1016/j.talanta.2022.123861] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/22/2022] [Revised: 08/15/2022] [Accepted: 08/17/2022] [Indexed: 12/13/2022]
Abstract
Quantitative structure-retention relationship (QSRR) models can be used to predict the chromatographic retention time of chemicals and facilitate the identification of unknown compounds, notably with non-targeted analysis. In this study, QSRR models were developed from the data obtained for 178 pure chemical standards and four types of analytical columns (C18, phenylhexyl, pentafluorophenyl, cyano) in liquid chromatography quadrupole time-of-flight mass spectrometry (LC-Q-TOF-MS). First, different data partitioning ratios and feature selection methods [random forest (RF) and support vector machine (SVM)] were tested to build models to predict chromatographic retention times based on 2D molecular descriptors. The internal and external performances of the non-linear (RF) and corresponding linear predictive models were systematically compared, and RF models resulted in better predictive capacities [p < 0.05, with an average PVE (proportion of variance explained) value of 0.89 ± 0.02] than linear models (0.79 ± 0.03). For each column, the resulting model was applied to identify leachables from actual plastic packaging samples. An in-depth investigation of the top 20 most intense molecular features revealed that all false-positives could be identified as outliers in the QSRR models (outside of the 95% prediction bands). Furthermore, analyzing a sample on multiple chromatographic columns and applying the associated QSRR models increased the capacity to filter false positives. Such an approach will contribute to a more effective identification of unknown or unexpected leachables in plastics (e.g. non-intended added substances), therefore refining our understanding of the chemical risks associated with food contact materials.
Affiliation(s)
- Ziyun Xu
- Department of Food Science and Agricultural Chemistry, McGill University, Ste-Anne-de-Bellevue, QC, Canada
- Hamza Chughtai
- Department of Food Science and Agricultural Chemistry, McGill University, Ste-Anne-de-Bellevue, QC, Canada
- Lei Tian
- Department of Food Science and Agricultural Chemistry, McGill University, Ste-Anne-de-Bellevue, QC, Canada
- Lan Liu
- Department of Food Science and Agricultural Chemistry, McGill University, Ste-Anne-de-Bellevue, QC, Canada
- Stéphane Bayen
- Department of Food Science and Agricultural Chemistry, McGill University, Ste-Anne-de-Bellevue, QC, Canada
|
16
|
Metwally AA, Nayel AA, Hathout RM. In silico prediction of siRNA ionizable-lipid nanoparticles In vivo efficacy: Machine learning modeling based on formulation and molecular descriptors. Front Mol Biosci 2022; 9:1042720. [PMID: 36619167 PMCID: PMC9811823 DOI: 10.3389/fmolb.2022.1042720] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 09/12/2022] [Accepted: 12/07/2022] [Indexed: 12/24/2022] Open
Abstract
In silico prediction of the in vivo efficacy of siRNA ionizable-lipid nanoparticles is desirable as it can save time and resources dedicated to wet-lab experimentation. This study aims to computationally predict the in vivo efficacy of siRNA nanoparticles. A data set containing 120 entries was prepared by combining molecular descriptors of the ionizable lipids with two nanoparticle formulation characteristics. Input descriptor combinations were selected by an evolutionary algorithm. Artificial neural networks, support vector machines and partial least squares regression were used for QSAR modeling. Depending on how the data set was split, two training sets and two external validation sets were prepared. Training and validation sets contained 90 and 30 entries, respectively. The results showed successful predictions of validation set log (siRNA dose) with R2val = 0.86-0.89 and 0.75-80 for validation sets one and two, respectively. Artificial neural networks resulted in the best R2val for both validation sets. For predictions that have high bias, improvement of R2val from 0.47 to 0.96 was achieved by selecting the training set lipids lying within the applicability domain. In conclusion, in vivo performance of siRNA nanoparticles was successfully predicted by combining cheminformatics with machine learning techniques.
Affiliation(s)
- Abdelkader A. Metwally
- Department of Pharmaceutics, Faculty of Pharmacy, Health Sciences Center, Kuwait University, Kuwait City, Kuwait
- Department of Pharmaceutics and Industrial Pharmacy, Faculty of Pharmacy, Ain Shams University, Cairo, Egypt
- Correspondence: Abdelkader A. Metwally
- Amira A. Nayel
- Clinical Pharmacy Department, Alexandria Ophthalmology Hospital, Alexandria, Egypt
- Department of Clinical Pharmacy and Pharmacy Practice, Faculty of Pharmacy, Alexandria University, Alexandria, Egypt
- Rania M. Hathout
- Department of Pharmaceutics and Industrial Pharmacy, Faculty of Pharmacy, Ain Shams University, Cairo, Egypt
|
17
|
Franco C, Kausar S, Silva MFB, Guedes RC, Falcao AO, Brito MA. Multi-Targeting Approach in Glioblastoma Using Computer-Assisted Drug Discovery Tools to Overcome the Blood–Brain Barrier and Target EGFR/PI3Kp110β Signaling. Cancers (Basel) 2022; 14:3506. [PMID: 35884571 PMCID: PMC9317902 DOI: 10.3390/cancers14143506] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 07/02/2022] [Accepted: 07/12/2022] [Indexed: 02/04/2023] Open
Abstract
Simple Summary: Treatment of glioblastoma is hampered by the activation of compensatory survival mechanisms by malignant cells that lead to drug resistance. Moreover, the blood–brain barrier (BBB) precludes the brain entrance of most drugs. We hypothesized that computer-assisted drug discovery tools would reveal novel multi-targeting drug candidates with BBB-permeant and favorable ADMET properties. We aimed to discover molecules with predicted ability to inhibit the EGFR/PI3Kp110β pathway and to validate their efficacy and safety in biological assays. We used quantitative structure–activity relationship models and structure-based virtual screening, and assessed ADMET properties, to identify BBB-permeant drug candidates. Moreover, we tested their anti-tumor efficacy and BBB safety and permeation in cell models. We found two EGFR, two PI3Kp110β, and, mostly, two dual inhibitors with anti-tumor effects. Among them, one EGFR and two PI3Kp110β inhibitors were able to cross the BBB endothelium without compromising it. These studies revealed novel drug candidates for glioblastoma treatment.
Abstract: The epidermal growth factor receptor (EGFR) is upregulated in glioblastoma, becoming an attractive therapeutic target. However, activation of compensatory pathways generates inputs to downstream PI3Kp110β signaling, leading to anti-EGFR therapeutic resistance. Moreover, the blood–brain barrier (BBB) limits drugs’ brain penetration. We aimed to discover EGFR/PI3Kp110β pathway inhibitors for a multi-targeting approach, with favorable ADMET and BBB-permeant properties. We used quantitative structure–activity relationship models and structure-based virtual screening, and assessed ADMET properties, to identify BBB-permeant drug candidates. Predictions were validated in in vitro models of the human BBB and BBB-glioma co-cultures. The results disclosed 27 molecules (18 EGFR, 6 PI3Kp110β, and 3 dual inhibitors) for biological validation, performed in two glioblastoma cell lines (U87MG and U87MG overexpressing EGFR). Six molecules (two EGFR, two PI3Kp110β, and two dual inhibitors) decreased cell viability by 40–99%, with the greatest effect observed for the dual inhibitors. The glioma cytotoxicity was confirmed by analysis of targets’ downregulation and increased apoptosis (15–85%). Safety to BBB endothelial cells was confirmed for three of those molecules (one EGFR and two PI3Kp110β inhibitors). These molecules crossed the endothelial monolayer in the BBB in vitro model and in the BBB-glioblastoma co-culture system. These results revealed novel drug candidates for glioblastoma treatment.
Affiliation(s)
- Catarina Franco
- LASIGE, Department of Informatics, Faculty of Sciences, Universidade de Lisboa, Campo Grande, 1749-016 Lisboa, Portugal
- Research Institute for Medicines, Faculty of Pharmacy, Universidade de Lisboa, Av. Prof. Gama Pinto, 1649-003 Lisboa, Portugal
- Samina Kausar
- LASIGE, Department of Informatics, Faculty of Sciences, Universidade de Lisboa, Campo Grande, 1749-016 Lisboa, Portugal
- Research Institute for Medicines, Faculty of Pharmacy, Universidade de Lisboa, Av. Prof. Gama Pinto, 1649-003 Lisboa, Portugal
- Margarida F. B. Silva
- Research Institute for Medicines, Faculty of Pharmacy, Universidade de Lisboa, Av. Prof. Gama Pinto, 1649-003 Lisboa, Portugal
- Department of Pharmaceutical Sciences and Medicines, Faculty of Pharmacy, Universidade de Lisboa, Av. Prof. Gama Pinto, 1649-003 Lisboa, Portugal
- Rita C. Guedes
- Research Institute for Medicines, Faculty of Pharmacy, Universidade de Lisboa, Av. Prof. Gama Pinto, 1649-003 Lisboa, Portugal
- Department of Pharmaceutical Sciences and Medicines, Faculty of Pharmacy, Universidade de Lisboa, Av. Prof. Gama Pinto, 1649-003 Lisboa, Portugal
- Andre O. Falcao
- LASIGE, Department of Informatics, Faculty of Sciences, Universidade de Lisboa, Campo Grande, 1749-016 Lisboa, Portugal
- Correspondence: (A.O.F.); (M.A.B.); Tel.: +351-217500239 (A.O.F.); +351-217946449 (M.A.B.)
- Maria Alexandra Brito
- Research Institute for Medicines, Faculty of Pharmacy, Universidade de Lisboa, Av. Prof. Gama Pinto, 1649-003 Lisboa, Portugal
- Department of Pharmaceutical Sciences and Medicines, Faculty of Pharmacy, Universidade de Lisboa, Av. Prof. Gama Pinto, 1649-003 Lisboa, Portugal
- Correspondence: (A.O.F.); (M.A.B.); Tel.: +351-217500239 (A.O.F.); +351-217946449 (M.A.B.)
|
18
|
Hung TNK, Le NQK, Le NH, Tuan LV, Nguyen TP, Thi C, Kang JH. An AI-based prediction model for drug-drug interactions in osteoporosis and Paget's diseases from SMILES. Mol Inform 2022; 41:e2100264. [PMID: 34989149 DOI: 10.1002/minf.202100264] [Citation(s) in RCA: 36] [Impact Index Per Article: 18.0] [Received: 10/05/2021] [Accepted: 01/05/2022] [Indexed: 11/06/2022]
Abstract
Osteoporosis and Paget's disease are two of the most common skeletal diseases in the elderly. Nowadays, a combination of multiple drugs is the optimal therapy to slow the pathologic process of osteoporosis and Paget's disease, but it carries various underlying adverse effects due to drug-drug interactions (DDIs). Artificial intelligence (AI) has the potential to evaluate the interaction, pharmacodynamics, and possible side effects between drugs. In this research, we created an AI-based machine-learning model to predict the outcomes of interactions between drugs used for osteoporosis and Paget's treatment and, furthermore, to reduce the cost and time of implementing the best combination of medications in clinical practice. Our dataset was collected from the DrugBank database, and we then extracted a variety of chemical features from the simplified molecular-input line-entry system (SMILES) of defined drug pairs that interact with each other. Finally, machine-learning algorithms were implemented to learn the extracted features. Our stack ensemble model from Random Forest and XGBoost reached an average accuracy of 74% in predicting DDIs. It was superior to individual models and previous methods in most measurement metrics. This study showed the potential of AI models in predicting DDIs of Osteoporosis-Paget's disease in particular, and other diseases in general.
Affiliation(s)
- Cao Thi
- University of Medicine and Pharmacy at Ho Chi Minh City, Viet Nam
|
19
|
Ye Q, Hsieh CY, Yang Z, Kang Y, Chen J, Cao D, He S, Hou T. A unified drug-target interaction prediction framework based on knowledge graph and recommendation system. Nat Commun 2021; 12:6775. [PMID: 34811351 PMCID: PMC8635420 DOI: 10.1038/s41467-021-27137-3] [Citation(s) in RCA: 67] [Impact Index Per Article: 22.3] [Received: 06/29/2021] [Accepted: 11/05/2021] [Indexed: 02/06/2023] Open
Abstract
Prediction of drug-target interactions (DTI) plays a vital role in drug development in various areas, such as virtual screening, drug repurposing and identification of potential drug side effects. Although extensive efforts have been invested in perfecting DTI prediction, existing methods still suffer from the high sparsity of DTI datasets and the cold start problem. Here, we develop KGE_NFM, a unified framework for DTI prediction by combining knowledge graph (KG) and recommendation system. This framework first learns a low-dimensional representation for various entities in the KG, and then integrates the multimodal information via neural factorization machine (NFM). KGE_NFM is evaluated under three realistic scenarios, and achieves accurate and robust predictions on four benchmark datasets, especially in the scenario of the cold start for proteins. Our results indicate that KGE_NFM provides valuable insight to integrate KG and recommendation system-based techniques into a unified framework for novel DTI discovery.
Affiliation(s)
- Qing Ye
- Innovation Institute for Artificial Intelligence in Medicine of Zhejiang University, College of Pharmaceutical Sciences, Zhejiang University, Hangzhou, 310058, Zhejiang, China
- College of Control Science and Engineering, Zhejiang University, Hangzhou, 310027, Zhejiang, China
- State Key Lab of CAD&CG, Zhejiang University, Hangzhou, Zhejiang, 310058, China
- Chang-Yu Hsieh
- Tencent Quantum Laboratory, Shenzhen, 518057, Guangdong, China
- Ziyi Yang
- Tencent Quantum Laboratory, Shenzhen, 518057, Guangdong, China
- Yu Kang
- Innovation Institute for Artificial Intelligence in Medicine of Zhejiang University, College of Pharmaceutical Sciences, Zhejiang University, Hangzhou, 310058, Zhejiang, China
- Jiming Chen
- College of Control Science and Engineering, Zhejiang University, Hangzhou, 310027, Zhejiang, China
- Dongsheng Cao
- Xiangya School of Pharmaceutical Sciences, Central South University, Changsha, 410013, Hunan, China
- Shibo He
- College of Control Science and Engineering, Zhejiang University, Hangzhou, 310027, Zhejiang, China
- Tingjun Hou
- Innovation Institute for Artificial Intelligence in Medicine of Zhejiang University, College of Pharmaceutical Sciences, Zhejiang University, Hangzhou, 310058, Zhejiang, China
- State Key Lab of CAD&CG, Zhejiang University, Hangzhou, Zhejiang, 310058, China
|
20
|
Hermansyah O, Bustamam A, Yanuar A. Virtual screening of dipeptidyl peptidase-4 inhibitors using quantitative structure-activity relationship-based artificial intelligence and molecular docking of hit compounds. Comput Biol Chem 2021; 95:107597. [PMID: 34800858 DOI: 10.1016/j.compbiolchem.2021.107597] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 04/17/2021] [Revised: 10/25/2021] [Accepted: 10/26/2021] [Indexed: 12/31/2022]
Abstract
Dipeptidyl peptidase-4 (DPP-4) inhibitors are becoming essential drugs in the treatment of type 2 diabetes mellitus; however, some classes of these drugs exert side effects, including joint pain and pancreatitis. Studies suggest that these side effects might be related to secondary inhibition of DPP-8 and DPP-9. In this study, we identified DPP-4-inhibitor hit compounds selective against DPP-8 and DPP-9. We built a virtual screening workflow using a quantitative structure-activity relationship (QSAR) strategy based on artificial intelligence to allow faster screening of millions of molecules for the DPP-4 target relative to other screening methods. Five regression machine learning algorithms and four classification machine learning algorithms were applied to build virtual screening workflows, with the QSAR model applied using support vector regression (R2pred 0.78) and the classification QSAR model using the random forest algorithm with 92.2% accuracy. Virtual screening of >10 million molecules yielded 2,716 hit compounds with a pIC50 value of >7.5. Additionally, molecular docking results of several potential hit compounds for DPP-4, DPP-8, and DPP-9 identified CH0002 as showing high inhibitory potential against DPP-4 and low inhibitory potential for DPP-8 and DPP-9 enzymes. These results demonstrated the effectiveness of this technique for identifying DPP-4-inhibitor hit compounds selective for DPP-4 and against DPP-8 and DPP-9 and suggest its potential efficacy for applications to discover hit compounds of other targets.
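To make the pIC50 > 7.5 hit threshold concrete, here is a small Python sketch of the standard conversion pIC50 = -log10(IC50 in mol/L). The function names and the example compound labels are hypothetical; the abstract does not describe how the authors' workflow filtered predictions.

```python
import math


def pic50_from_ic50_nm(ic50_nm):
    """Convert an IC50 given in nanomolar to pIC50 (-log10 of molar IC50)."""
    return -math.log10(ic50_nm * 1e-9)


def ic50_nm_from_pic50(pic50):
    """Convert a pIC50 back to an IC50 in nanomolar."""
    return 10 ** (-pic50) / 1e-9


def select_hits(predicted_pic50, threshold=7.5):
    """Keep compounds whose predicted pIC50 exceeds the threshold."""
    return [name for name, value in predicted_pic50.items() if value > threshold]


# Hypothetical predictions, for illustration only.
predictions = {"cmpd_a": 7.6, "cmpd_b": 7.5, "cmpd_c": 6.0}
print(select_hits(predictions))
print(ic50_nm_from_pic50(7.5))
```

Under this conversion, a pIC50 of 7.5 corresponds to an IC50 of roughly 32 nM, so the screen retains only compounds predicted to be more potent than that.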
Affiliation(s)
- Oky Hermansyah
- Laboratory of Biomedical Computation and Drug Design, Faculty of Pharmacy, Universitas Indonesia, Depok 16424, Indonesia
- Alhadi Bustamam
- Department of Mathematics, Faculty of Mathematics and Natural Sciences, Universitas Indonesia, Depok 16424, Indonesia
- Arry Yanuar
- Laboratory of Biomedical Computation and Drug Design, Faculty of Pharmacy, Universitas Indonesia, Depok 16424, Indonesia
|
21
|
Guo J, Janet JP, Bauer MR, Nittinger E, Giblin KA, Papadopoulos K, Voronov A, Patronov A, Engkvist O, Margreitter C. DockStream: a docking wrapper to enhance de novo molecular design. J Cheminform 2021; 13:89. [PMID: 34789335 PMCID: PMC8596819 DOI: 10.1186/s13321-021-00563-7] [Citation(s) in RCA: 21] [Impact Index Per Article: 7.0] [Received: 08/05/2021] [Accepted: 10/29/2021] [Indexed: 01/09/2023] Open
Abstract
Recently, we have released the de novo design platform REINVENT in version 2.0. This improved and extended iteration supports far more features and scoring function components, which allows bespoke and tailor-made protocols to maximize impact in small molecule drug discovery projects. A major obstacle for generative models is producing active compounds, for which predictive (QSAR) models have been applied to enrich target activity. However, QSAR models are inherently limited by their applicability domains. To overcome these limitations, we introduce a structure-based scoring component for REINVENT. DockStream is a flexible, stand-alone molecular docking wrapper that provides access to a collection of ligand embedders and docking backends. Using the benchmarking and analysis workflow provided in DockStream, execution and subsequent analysis of a variety of docking configurations can be automated. Docking algorithms vary greatly in performance depending on the target, and the benchmarking and analysis workflow provides a streamlined solution to identifying productive docking configurations. We show that an informative docking configuration can inform the REINVENT agent to optimize towards improving docking scores using public data. With docking activated, REINVENT is able to retain key interactions in the binding site, discard molecules which do not fit the binding cavity, harness unused (sub-)pockets, and improve overall performance in the scaffold-hopping scenario. The code is freely available at https://github.com/MolecularAI/DockStream.
Affiliation(s)
- Jeff Guo
- Molecular AI, Discovery Sciences, R&D, AstraZeneca, Gothenburg, Sweden
- Jon Paul Janet
- Medicinal Chemistry, Research and Early Development, Cardiovascular, Renal and Metabolism (CVRM), BioPharmaceuticals R&D, AstraZeneca, Gothenburg, Sweden
- Matthias R Bauer
- Structure & Biophysics, Discovery Sciences, R&D, AstraZeneca, Cambridge, UK
- Eva Nittinger
- Medicinal Chemistry, Research and Early Development, Respiratory and Immunology (R&I), BioPharmaceuticals R&D, AstraZeneca, Gothenburg, Sweden
- Kathryn A Giblin
- Medicinal Chemistry, Research and Early Development, Oncology R&D, AstraZeneca, Cambridge, UK
- Alexey Voronov
- Molecular AI, Discovery Sciences, R&D, AstraZeneca, Gothenburg, Sweden
- Atanas Patronov
- Molecular AI, Discovery Sciences, R&D, AstraZeneca, Gothenburg, Sweden
- Ola Engkvist
- Molecular AI, Discovery Sciences, R&D, AstraZeneca, Gothenburg, Sweden
- Department of Computer Science and Engineering, Chalmers University of Technology, Gothenburg, Sweden
|
22
|
Machine Learning Applied to the Modeling of Pharmacological and ADMET Endpoints. Methods Mol Biol 2021. [PMID: 34731464 DOI: 10.1007/978-1-0716-1787-8_2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/26/2023]
Abstract
The well-known concept of quantitative structure-activity relationships (QSAR) has been gaining significant interest in the recent years. Data, descriptors, and algorithms are the main pillars to build useful models that support more efficient drug discovery processes with in silico methods. Significant advances in all three areas are the reason for the regained interest in these models. In this book chapter we review various machine learning (ML) approaches that make use of measured in vitro/in vivo data of many compounds. We put these in context with other digital drug discovery methods and present some application examples.
|
23
|
Computational identification of 2,4-disubstituted amino-pyrimidines as L858R/T790M-EGFR double mutant inhibitors using pharmacophore mapping, molecular docking, binding free energy calculation, DFT study and molecular dynamic simulation. In Silico Pharmacol 2021; 9:54. [PMID: 34631361 DOI: 10.1007/s40203-021-00113-x] [Citation(s) in RCA: 31] [Impact Index Per Article: 10.3] [Received: 04/29/2021] [Accepted: 09/24/2021] [Indexed: 10/20/2022] Open
Abstract
Pharmacophore modelling studies have been performed for a series of 2,4-disubstituted-pyrimidine derivatives as EGFR L858R/T790M tyrosine kinase inhibitors. The high-scoring AARR.15 hypothesis was selected as the best pharmacophore model, with the highest survival score of 3.436 and with two hydrogen bond acceptors and two aromatic ring features. Pharmacophore-based virtual screening followed by structure-based screening yielded six molecules (ZINC17013227, ZINC17013215, ZINC9573324, ZINC9573445, ZINC24023331 and ZINC17013503) from the ZINC database with significant in silico predicted activity and strong binding affinity towards the EGFR L858R/T790M tyrosine kinase. In silico toxicity and cytochrome profiling indicated that all six virtually screened compounds were substrates/inhibitors of the CYP-3A4 metabolizing enzyme and were non-carcinogenic and devoid of Ames mutagenesis. Density functional theory (DFT) and molecular dynamic (MD) simulation further validated the obtained hits.
|
24
|
Nandy A, Duan C, Taylor MG, Liu F, Steeves AH, Kulik HJ. Computational Discovery of Transition-metal Complexes: From High-throughput Screening to Machine Learning. Chem Rev 2021; 121:9927-10000. [PMID: 34260198 DOI: 10.1021/acs.chemrev.1c00347] [Citation(s) in RCA: 70] [Impact Index Per Article: 23.3] [Indexed: 12/13/2022]
Abstract
Transition-metal complexes are attractive targets for the design of catalysts and functional materials. The behavior of the metal-organic bond, while very tunable for achieving target properties, is challenging to predict and necessitates searching a wide and complex space to identify needles in haystacks for target applications. This review will focus on the techniques that make high-throughput search of transition-metal chemical space feasible for the discovery of complexes with desirable properties. The review will cover the development, promise, and limitations of "traditional" computational chemistry (i.e., force field, semiempirical, and density functional theory methods) as it pertains to data generation for inorganic molecular discovery. The review will also discuss the opportunities and limitations in leveraging experimental data sources. We will focus on how advances in statistical modeling, artificial intelligence, multiobjective optimization, and automation accelerate discovery of lead compounds and design rules. The overall objective of this review is to showcase how bringing together advances from diverse areas of computational chemistry and computer science have enabled the rapid uncovering of structure-property relationships in transition-metal chemistry. We aim to highlight how unique considerations in motifs of metal-organic bonding (e.g., variable spin and oxidation state, and bonding strength/nature) set them and their discovery apart from more commonly considered organic molecules. We will also highlight how uncertainty and relative data scarcity in transition-metal chemistry motivate specific developments in machine learning representations, model training, and in computational chemistry. Finally, we will conclude with an outlook of areas of opportunity for the accelerated discovery of transition-metal complexes.
Affiliation(s)
- Aditya Nandy
- Department of Chemical Engineering, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, United States
- Department of Chemistry, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, United States
- Chenru Duan
- Department of Chemical Engineering, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, United States
- Department of Chemistry, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, United States
- Michael G Taylor
- Department of Chemical Engineering, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, United States
- Fang Liu
- Department of Chemical Engineering, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, United States
- Adam H Steeves
- Department of Chemical Engineering, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, United States
- Heather J Kulik
- Department of Chemical Engineering, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, United States
|
25
|
Casanova-Alvarez O, Morales-Helguera A, Cabrera-Pérez MÁ, Molina-Ruiz R, Molina C. A Novel Automated Framework for QSAR Modeling of Highly Imbalanced Leishmania High-Throughput Screening Data. J Chem Inf Model 2021; 61:3213-3231. [PMID: 34191520 DOI: 10.1021/acs.jcim.0c01439] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Indexed: 11/29/2022]
Abstract
In silico prediction of antileishmanial activity using quantitative structure-activity relationship (QSAR) models has so far been developed on small, limited datasets. Nowadays, the availability of large and diverse high-throughput screening data gives the scientific community an opportunity to model this activity from the chemical structure. In this study, we present the first KNIME automated workflow for modeling a large, diverse, and highly imbalanced dataset of compounds with antileishmanial activity. Because the data is strongly biased toward inactive compounds, a novel strategy was implemented based on the selection of different balanced training sets and a further consensus model using single decision trees as the base model and three criteria for output combinations. The decision tree consensus was adopted after comparing its classification performance to consensuses built upon Gaussian-Naïve-Bayes, Support-Vector-Machine, Random-Forest, Gradient-Boost, and Multi-Layer-Perceptron base models. All these consensuses were rigorously validated using internal and external test validation sets and were compared against each other using Friedman and Bonferroni-Dunn statistics. For the retained decision tree-based consensus model, which covers 100% of the chemical space of the dataset at the lowest consensus level, the overall accuracy statistics for the test and external sets were between 71 and 74% and 71 and 76%, respectively, while for a reduced chemical space (21%) and with an incremental consensus level, the accuracy statistics were substantially improved, with values for the test and external sets between 86 and 92% and 88 and 92%, respectively. These results highlight the relevance of the consensus model either to prioritize a relatively small set of active compounds with high prediction sensitivity, using the Incremental Consensus at high level values, or to predict as many compounds as possible by lowering the level of Incremental Consensus. Finally, the workflow developed eliminates human bias, improves the procedure reproducibility, and allows other researchers to reproduce our design and use it in their own QSAR problems.
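The trade-off described in this abstract, between covering all compounds at a low consensus level and predicting fewer compounds more accurately at a high one, can be illustrated with a minimal Python sketch. It is not the authors' KNIME workflow: the abstract does not spell out the three output-combination criteria, so the function names (`balanced_training_sets`, `consensus_vote`), the undersampling scheme, and the abstention rule below are all assumptions made for illustration.

```python
import random
from typing import List, Optional, Sequence


def balanced_training_sets(
    actives: Sequence[str], inactives: Sequence[str], seed: int = 0
) -> List[List[str]]:
    """Pair all actives with successive, non-overlapping chunks of inactives.

    Hypothetical scheme: each returned set holds every active compound plus
    an equally sized random chunk of the inactive (majority) class.
    """
    if not actives:
        raise ValueError("at least one active compound is required")
    rng = random.Random(seed)
    shuffled = list(inactives)
    rng.shuffle(shuffled)
    n = len(actives)
    sets = []
    # Leftover inactives that cannot fill a whole chunk are dropped.
    for start in range(0, len(shuffled) - n + 1, n):
        sets.append(list(actives) + shuffled[start:start + n])
    return sets


def consensus_vote(votes: Sequence[int], level: float) -> Optional[int]:
    """Combine 0/1 votes from the base models (one per balanced set).

    Returns 1 (active) or 0 (inactive) when at least `level` of the models
    agree, and None (no prediction) otherwise. Raising `level` shrinks the
    set of compounds that receive a prediction.
    """
    if not votes:
        raise ValueError("votes must not be empty")
    frac_active = sum(votes) / len(votes)
    if frac_active >= level:
        return 1
    if 1 - frac_active >= level:
        return 0
    return None
```

Under these assumptions, one base model (for example a single decision tree) would be trained per balanced set, and `consensus_vote` would be applied to the per-compound votes; compounds returning `None` fall outside the covered chemical space at that consensus level.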
Affiliation(s)
- Omar Casanova-Alvarez
- Departamento de Química, Facultad de Química-Farmacia, Universidad Central "Marta Abreu" de Las Villas, Santa Clara, Villa Clara 54830, Cuba
- Aliuska Morales-Helguera
- Centro de Bioactivos Químicos, Universidad Central "Marta Abreu" de Las Villas, Santa Clara, Villa Clara 54830, Cuba
- Miguel Ángel Cabrera-Pérez
- Centro de Bioactivos Químicos, Universidad Central "Marta Abreu" de Las Villas, Santa Clara, Villa Clara 54830, Cuba
- Reinaldo Molina-Ruiz
- Centro de Bioactivos Químicos, Universidad Central "Marta Abreu" de Las Villas, Santa Clara, Villa Clara 54830, Cuba
- Christophe Molina
- PIKAÏROS S.A., B03 - 2 Allée de la Clairière, 31650 Saint Orens de Gameville, France
|
26
|
Artificial intelligence in drug design: algorithms, applications, challenges and ethics. FUTURE DRUG DISCOVERY 2021. [DOI: 10.4155/fdd-2020-0028] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Indexed: 12/12/2022] Open
Abstract
The discovery paradigm of drugs is rapidly growing due to advances in machine learning (ML) and artificial intelligence (AI). This review covers myriad faces of AI and ML in drug design. There is a plethora of AI algorithms, the most common of which are summarized in this review. In addition, AI is fraught with challenges that are highlighted along with plausible solutions to them. Examples are provided to illustrate the use of AI and ML in drug discovery and in predicting drug properties such as binding affinities and interactions, solubility, toxicology, blood–brain barrier permeability and chemical properties. The review also includes examples depicting the implementation of AI and ML in tackling intractable diseases such as COVID-19, cancer and Alzheimer’s disease. Ethical considerations and future perspectives of AI are also covered in this review.
|
27
|
A hybrid modeling approach for assessing mechanistic models of small molecule partitioning in vivo using a machine learning-integrated modeling platform. Sci Rep 2021; 11:11143. [PMID: 34045592 PMCID: PMC8160209 DOI: 10.1038/s41598-021-90637-1] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 09/30/2020] [Accepted: 05/13/2021] [Indexed: 12/17/2022] Open
Abstract
Prediction of the first-in-human dosing regimens is a critical step in drug development and requires accurate quantitation of drug distribution. Traditional in vivo studies used to characterize clinical candidate’s volume of distribution are error-prone, time- and cost-intensive and lack reproducibility in clinical settings. The paper demonstrates how a computational platform integrating machine learning optimization with mechanistic modeling can be used to simulate compound plasma concentration profile and predict tissue-plasma partition coefficients with high accuracy by varying the lipophilicity descriptor logP. The approach applied to chemically diverse small molecules resulted in comparable geometric mean fold-errors of 1.50 and 1.63 in pharmacokinetic outputs for direct tissue:plasma partition and hybrid logP optimization, with the latter enabling prediction of tissue permeation that can be used to guide toxicity and efficacy dosing in human subjects. The optimization simulations required to achieve these results were parallelized on the AWS cloud and generated outputs in under 5 h. Accuracy, speed, and scalability of the framework indicate that it can be used to assess the relevance of other mechanistic relationships implicated in pharmacokinetic-pharmacodynamic phenomena with a lower risk of overfitting datasets and generate large database of physiologically-relevant drug disposition for further integration with machine learning models.
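The geometric mean fold-errors of 1.50 and 1.63 quoted in this abstract can be read against a small worked example. The abstract does not give the formula, so the sketch below assumes the commonly used absolute-log definition, GMFE = 10^(mean |log10(predicted/observed)|); the function name and the sample values are hypothetical.

```python
import math
from typing import Sequence


def geometric_mean_fold_error(
    predicted: Sequence[float], observed: Sequence[float]
) -> float:
    """Geometric mean fold-error under the absolute-log definition (assumed).

    A value of 1.0 means perfect agreement; 2.0 means predictions are off by
    a factor of two on average, regardless of direction.
    """
    if len(predicted) != len(observed) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    # Over- and under-prediction by the same factor contribute equally.
    abs_logs = [abs(math.log10(p / o)) for p, o in zip(predicted, observed)]
    return 10 ** (sum(abs_logs) / len(abs_logs))


# Hypothetical values: one two-fold over-prediction, one two-fold under-prediction.
gmfe = geometric_mean_fold_error([2.0, 0.5], [1.0, 1.0])
```

On this reading, a fold-error of 1.50 corresponds to predictions that are within a factor of about 1.5 of the observed values on average.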
|
28
|
Balakrishnan S, VanGessel FG, Boukouvalas Z, Barnes BC, Fuge MD, Chung PW. Locally Optimizable Joint Embedding Framework to Design Nitrogen-rich Molecules that are Similar but Improved. Mol Inform 2021; 40:e2100011. [PMID: 33909951 DOI: 10.1002/minf.202100011] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 01/11/2021] [Accepted: 03/31/2021] [Indexed: 11/06/2022]
Abstract
Deep learning has shown great potential for generating molecules with desired properties. But the cost and time required to obtain relevant property data have limited study to only a few classes of materials for which extensive data have already been collected. We develop a deep learning method that combines a generative model with a property prediction model to fuse small data of one class of molecules with larger data in another class. Common low-level physicochemical properties are jointly embedded into a latent space that can be used to design molecules in the smaller class. The chemical space around the molecules in the training set is explored through local gradient ascent optimization. Based on nine molecules from the original training set, nine new molecules are found to have improved properties while remaining structurally similar to the training molecules thereby easing requirements for entirely new synthesis routes. Validation is performed using an equilibrium thermochemistry code to verify the molecules and target properties. A specific example targeting the Chapman-Jouguet velocity and small data for nitrogen-rich molecules is shown. Despite the relative lack of nitrogen-rich molecule data, the results demonstrate that fusing and joint embedding with plentiful low nitrogen molecular data can produce higher generative performance than using the scarce data alone.
Affiliation(s)
- Sangeeth Balakrishnan
- Department of Mechanical Engineering, University of Maryland in College Park, College Park, Maryland, USA
- Francis G VanGessel
- U.S. Naval Surface Warfare Center, Indian Head Division, Indian Head, Maryland, USA
- Zois Boukouvalas
- Department of Mathematics & Statistics, American University, Washington, DC, USA
- Brian C Barnes
- U.S. Army Combat Capabilities Development Command Army Research Laboratory, Aberdeen Proving Ground, Maryland, USA
- Mark D Fuge
- Department of Mechanical Engineering, University of Maryland in College Park, College Park, Maryland, USA
- Peter W Chung
- Department of Mechanical Engineering, University of Maryland in College Park, College Park, Maryland, USA
|
29
|
Pastor M, Gómez-Tamayo JC, Sanz F. Flame: an open source framework for model development, hosting, and usage in production environments. J Cheminform 2021; 13:31. [PMID: 33875019 PMCID: PMC8054391 DOI: 10.1186/s13321-021-00509-z] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 11/10/2020] [Accepted: 04/08/2021] [Indexed: 01/17/2023] Open
Abstract
This article describes Flame, an open source software for building predictive models and supporting their use in production environments. Flame is a web application with a web-based graphic interface, which can be used as a desktop application or installed in a server receiving requests from multiple users. Models can be built starting from any collection of biologically annotated chemical structures since the software supports structural normalization, molecular descriptor calculation, and machine learning model generation using predefined workflows. The model building workflow can be customized from the graphic interface, selecting the type of normalization, molecular descriptors, and machine learning algorithm to be used from a panel of state-of-the-art methods implemented natively. Moreover, Flame implements a mechanism allowing to extend its source code, adding unlimited model customization. Models generated with Flame can be easily exported, facilitating collaborative model development. All models are stored in a model repository supporting model versioning. Models are identified by unique model IDs and include detailed documentation formatted using widely accepted standards. The current version is the result of nearly 3 years of development in collaboration with users from the pharmaceutical industry within the IMI eTRANSAFE project, which aims, among other objectives, to develop high-quality predictive models based on shared legacy data for assessing the safety of drug candidates.
Affiliation(s)
- Manuel Pastor
- Research Programme on Biomedical Informatics (GRIB), Department of Experimental and Health Sciences, Hospital del Mar Medical Research Institute (IMIM), Universitat Pompeu Fabra, Barcelona, Spain.
- José Carlos Gómez-Tamayo
- Research Programme on Biomedical Informatics (GRIB), Department of Experimental and Health Sciences, Hospital del Mar Medical Research Institute (IMIM), Universitat Pompeu Fabra, Barcelona, Spain
- Ferran Sanz
- Research Programme on Biomedical Informatics (GRIB), Department of Experimental and Health Sciences, Hospital del Mar Medical Research Institute (IMIM), Universitat Pompeu Fabra, Barcelona, Spain
|
30
|
Nazem F, Ghasemi F, Fassihi A, Dehnavi AM. 3D U-Net: A voxel-based method in binding site prediction of protein structure. J Bioinform Comput Biol 2021; 19:2150006. [PMID: 33866960 DOI: 10.1142/s0219720021500062] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Indexed: 11/18/2022]
Abstract
Binding site prediction for new proteins is important in structure-based drug design. The identified binding sites may be helpful in the development of treatments for new viral outbreaks in the world when there is no information available about their pockets with COVID-19 being a case in point. Identification of the pockets using computational methods, as an alternative method, has recently attracted much interest. In this study, the binding site prediction is viewed as a semantic segmentation problem. An improved 3D version of the U-Net model based on the dice loss function is utilized to predict the binding sites accurately. The performance of the proposed model on the independent test datasets and SARS-COV-2 shows the segmentation model could predict the binding sites with a more accurate shape than the recently published deep learning model, i.e. DeepSite. Therefore, the model may help predict the binding sites of proteins and could be used in drug design for novel proteins.
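The dice loss mentioned in this abstract can be sketched on a flattened voxel grid. This is a generic soft Dice loss, not the authors' 3D U-Net training code: the smoothing constant `eps`, the function name, and the toy voxel values are assumptions for illustration.

```python
from typing import Sequence


def dice_loss(
    pred: Sequence[float], target: Sequence[float], eps: float = 1e-6
) -> float:
    """Soft Dice loss, 1 - (2*|P.T| + eps) / (|P| + |T| + eps).

    `pred` holds per-voxel probabilities in [0, 1] and `target` holds the
    0/1 ground-truth mask, both flattened to one dimension. `eps` (assumed)
    keeps the ratio defined when both masks are empty.
    """
    if len(pred) != len(target):
        raise ValueError("pred and target must have the same length")
    intersection = sum(p * t for p, t in zip(pred, target))
    total = sum(pred) + sum(target)
    return 1.0 - (2.0 * intersection + eps) / (total + eps)


# Toy example: a 4-voxel grid where the predicted pocket overlaps half of the true one.
loss = dice_loss([1.0, 0.0, 0.0, 0.0], [1.0, 1.0, 0.0, 0.0])
```

Because the loss depends on overlap rather than per-voxel accuracy, it is not dominated by the large number of empty background voxels, which is one common reason for choosing it in segmentation problems of this kind.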
Affiliation(s)
- Fatemeh Nazem
- Department of Bioelectric and Biomedical Engineering, School of Advanced Technologies in Medicine, Isfahan University of Medical Sciences, Hezar-Jerib Ave, Isfahan 81746 73461, Iran
- Fahimeh Ghasemi
- Department of Bioinformatics and Systems Biology, School of Advanced Technologies in Medicine, Isfahan University of Medical Sciences, Hezar-Jerib Ave, Isfahan 81746 73461, Iran
- Afshin Fassihi
- Department of Medicinal Chemistry, School of Pharmacology and Pharmaceutical Sciences, Isfahan University of Medical Sciences, Hezar-Jerib Ave, Isfahan 81746 73461, Iran
- Alireza Mehri Dehnavi
- Department of Bioelectric and Biomedical Engineering, School of Advanced Technologies in Medicine, Isfahan University of Medical Sciences, Hezar-Jerib Ave, Isfahan 81746 73461, Iran
|
31
|
Benassi JC, Barbosa FAR, Candiotto G, Grinevicius VMAS, Filho DW, Braga AL, Pedrosa RC. Docking and molecular dynamics predicted B-DNA and dihydropyrimidinone selenoesters interactions elucidating antiproliferative effects on breast adenocarcinoma cells. J Biomol Struct Dyn 2021; 40:8261-8273. [PMID: 33847252 DOI: 10.1080/07391102.2021.1910569] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Indexed: 10/21/2022]
Abstract
Dihydropyrimidinones have demonstrated different biological activities, including anticancer properties. The cytotoxic and antiproliferative potential of new dihydropyrimidinone-derived selenoester (Se-DHPM) compounds was assessed in vitro against breast adenocarcinoma cells (MCF-7). Among the eight Se-DHPM compounds tested, 49A and 49F were the most cytotoxic for MCF-7 and the most selective for the non-tumor strain (McCoy), and they reduced cell viability in a time- and concentration-dependent manner. Compounds 49A and 49F increased the rate of cell death due to apoptosis and necrosis compared with the control; however, only 49F showed antiproliferative potential, reducing the number of colonies formed. In the molecular assay, 49A interacted with CT-DNA and caused hyperchromism, while 49F caused a hypochromic effect. The intercalation test revealed that the two compounds caused destabilization in the CT-DNA molecule. This effect was evidenced by the loss of fluorescence when the compounds competed with and displaced propidium iodide. Simulations (docking and molecular dynamics) using B-DNA brought a greater understanding of ligand-B-DNA interactions. Furthermore, they predicted that the compounds act as minor groove ligands that are stabilized through hydrogen bonds and hydrophobic interactions. However, the form of interaction foreseen for 49A was more energetically favorable and had more stable hydrogen bonds during the simulation time. Despite some violations foreseen in the ADMET for 49F, the set of other results points to this Se-DHPM as a promising lead compound with anti-tumor potential for breast cancer. Communicated by Ramaswamy H. Sarma.
Affiliation(s)
- Jean C Benassi
- Department of Biochemistry, Federal University of Santa Catarina, Florianópolis, Brazil
- Flavio A R Barbosa
- Department of Chemistry, Federal University of Santa Catarina, Florianópolis, Brazil
- Graziâni Candiotto
- Institute of Chemistry, Federal University of Rio de Janeiro, Rio de Janeiro, Brazil
- Danilo Wilhelm Filho
- Department of Ecology and Zoology, Federal University of Santa Catarina, Florianópolis, Brazil
- Antônio L Braga
- Department of Chemistry, Federal University of Santa Catarina, Florianópolis, Brazil
- Rozangela C Pedrosa
- Department of Biochemistry, Federal University of Santa Catarina, Florianópolis, Brazil
|
32
|
AKT Inhibitors: The Road Ahead to Computational Modeling-Guided Discovery. Int J Mol Sci 2021; 22:ijms22083944. [PMID: 33920446 PMCID: PMC8070654 DOI: 10.3390/ijms22083944] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Received: 02/14/2021] [Revised: 04/07/2021] [Accepted: 04/08/2021] [Indexed: 12/26/2022] Open
Abstract
AKT is a serine/threonine protein kinase comprising three isoforms, namely AKT1, AKT2 and AKT3, whose inhibitors have been recognized as promising therapeutic targets for various human disorders, especially cancer. In this work, we report a systematic evaluation of multi-target Quantitative Structure-Activity Relationship (mt-QSAR) models to probe AKT inhibitory activity, based on different feature selection algorithms and machine learning tools. The best predictive linear and non-linear mt-QSAR models were found by the genetic algorithm-based linear discriminant analysis (GA-LDA) and gradient boosting (Xgboost) techniques, respectively, using a dataset containing 5523 inhibitors of the AKT isoforms assayed under various experimental conditions. The linear model highlighted the key structural attributes responsible for higher inhibitory activity, whereas the non-linear model displayed an overall accuracy higher than 90%. Both these predictive models, generated through internal and external validation methods, were then used for screening the Asinex kinase inhibitor library to identify the most promising virtual hits as pan-AKT inhibitors. The virtual hits identified were then filtered by stepwise analyses based on reverse pharmacophore-mapping based prediction. Finally, results of molecular dynamics simulations were used to estimate the theoretical binding affinity of the selected virtual hits towards the three isoforms of enzyme AKT. Our computational findings thus provide important guidelines to facilitate the discovery of novel AKT inhibitors.
|
33
|
Molecular optimization by capturing chemist's intuition using deep neural networks. J Cheminform 2021; 13:26. [PMID: 33743817 PMCID: PMC7980633 DOI: 10.1186/s13321-021-00497-0] [Citation(s) in RCA: 32] [Impact Index Per Article: 10.7] [Received: 10/30/2020] [Accepted: 02/22/2021] [Indexed: 01/08/2023] Open
Abstract
A main challenge in drug discovery is finding molecules with a desirable balance of multiple properties. Here, we focus on the task of molecular optimization, where the goal is to optimize a given starting molecule towards desirable properties. This task can be framed as a machine translation problem in natural language processing, where in our case, a molecule is translated into a molecule with optimized properties based on the SMILES representation. Typically, chemists would use their intuition to suggest chemical transformations for the starting molecule being optimized. A widely used strategy is the concept of matched molecular pairs where two molecules differ by a single transformation. We seek to capture the chemist’s intuition from matched molecular pairs using machine translation models. Specifically, the sequence-to-sequence model with attention mechanism, and the Transformer model are employed to generate molecules with desirable properties. As a proof of concept, three ADMET properties are optimized simultaneously: logD, solubility, and clearance, which are important properties of a drug. Since desirable properties often vary from project to project, the user-specified desirable property changes are incorporated into the input as an additional condition together with the starting molecules being optimized. Thus, the models can be guided to generate molecules satisfying the desirable properties. Additionally, we compare the two machine translation models based on the SMILES representation, with a graph-to-graph translation model HierG2G, which has shown the state-of-the-art performance in molecular optimization. Our results show that the Transformer can generate more molecules with desirable properties by making small modifications to the given starting molecules, which can be intuitive to chemists. A further enrichment of diverse molecules can be achieved by using an ensemble of models.
|
34
|
Aishwarya S, Gunasekaran K, Sagaya Jansi R, Sangeetha G. From genomes to molecular dynamics - A bottom up approach in extrication of SARS CoV-2 main protease inhibitors. Comput Toxicol 2021; 18:100156. [PMID: 33532671 PMCID: PMC7844360 DOI: 10.1016/j.comtox.2021.100156] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 11/18/2020] [Revised: 12/24/2020] [Accepted: 01/21/2021] [Indexed: 12/13/2022]
Abstract
The recent pandemic outbreak of coronavirus disease-19 has traumatized countries worldwide since its origin in late December 2019. Though the virus originated in China, it has spread rapidly across the world due to its firmly established community transmission. To successfully tackle the spread and further infection, a clear multidimensional understanding of the molecular mechanisms is needed. Hence, 942 viral genome sequences were analysed to predict the core genomes crucial in the virus life cycle. Additionally, 35 small interfering RNA transcripts were predicted that can specifically target the viral core proteins and reduce pathogenesis. The crystal structure of the Covid-19 main protease (6LU7) was chosen as an attractive target because it had fewer mutations and its structure had significant identity to the annotated protein sequence of the core genome. Drug repurposing of both recruiting and non-recruiting drugs was carried out through molecular docking procedures, which identified bitolterol as a good inhibitor of the Covid-19 protease. The study was extended further to screen antiviral phytocompounds through quantitative structure-activity relationship and molecular docking, identifying davidigenin, from licorice, as the best novel lead with good interactions and binding energy. The docking of the best compounds in all three categories was validated with molecular dynamics simulations, which implied stable binding of the drug and lead molecule. Though the studies need clinical evaluations, the results are suggestive of curbing the pandemic.
Affiliation(s)
- S Aishwarya
- Department of Bioinformatics, Stella Maris College (Autonomous), Chennai 600086, India; Centre for Advanced Studies in Crystallography and Biophysics, University of Madras, Chennai 600025, India
- K Gunasekaran
- Centre for Advanced Studies in Crystallography and Biophysics, University of Madras, Chennai 600025, India
- R Sagaya Jansi
- Department of Bioinformatics, Stella Maris College (Autonomous), Chennai 600086, India
- G Sangeetha
- Centre for Advanced Studies in Crystallography and Biophysics, University of Madras, Chennai 600025, India
|
35
|
Therapeutic Path to Double Knockout: Investigating the Selective Dual-Inhibitory Mechanisms of Adenosine Receptors A1 and A2 by a Novel Methoxy-Substituted Benzofuran Derivative in the Treatment of Parkinson's Disease. Cell Biochem Biophys 2020; 79:25-36. [PMID: 33222095 DOI: 10.1007/s12013-020-00957-8] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Accepted: 11/05/2020] [Indexed: 10/22/2022]
Abstract
The dual inhibition of adenosine receptors A1 (A1 AR) and A2 (A2A AR) has been considered an efficient strategy in the treatment of Parkinson's disease (PD). This led to the recent development of a series of methoxy-substituted benzofuran derivatives, among which compound 3j exhibited dual-inhibitory potencies in the micromolar range. Therefore, in this study, we seek to resolve the mechanisms by which this novel compound elicits its selective dual targeting against A1 AR and A2A AR. Unique to the binding of 3j in both proteins, from our findings, is the ring-ring interaction elicited by A1Phe275 (→ A2Phe170) with the benzofuran ring of the compound. As observed, this π-stacking interaction contributes notably to the stability of 3j at the active sites of A1 and A2A AR. Besides, conserved active site residues in the proteins such as A1Ala170 (→ A2Ala65), A1Ile173 (→ A2Ile68), A1Val191 (→ A2Val86), A1Leu192 (→ A2Leu87), A1Ala195 (→ A2Ala90), A1Met284 (→ A2Met179), A1Tyr375 (→ A2Tyr369), A1Ile378 (→ A2Ile372), and A1His382 (→ A2His376) were commonly involved with other ring substituents, which further complement the dual binding and stability of 3j. This reflects a similar interaction mechanism that involved aromatic (π) interactions. Consequently, vdW energies contributed immensely to the dual binding of the compound, which culminated in high ΔGbinds that were homogenous in both proteins. Furthermore, 3j commonly disrupted the stable and compact conformation of A1 and A2A AR, coupled with their active sites where Cα deviations were relatively high. Ligand mobility analysis also revealed that both compounds exhibited a similar motion pattern at the active site of the proteins relative to their optimal dual binding. We believe that findings from this study will significantly aid the structure-based design of highly selective dual-inhibitors of A1 and A2A AR.
|
36
|
Bagri K, Kumar A, Nimbhal M, Kumar P. Index of ideality of correlation and correlation contradiction index: a confluent perusal on acetylcholinesterase inhibitors. MOLECULAR SIMULATION 2020. [DOI: 10.1080/08927022.2020.1770753] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Indexed: 02/07/2023]
Affiliation(s)
- Kiran Bagri
- Department of Pharmaceutical Sciences, Guru Jambheshwar University of Science & Technology, Hisar, India
- Ashwani Kumar
- Department of Pharmaceutical Sciences, Guru Jambheshwar University of Science & Technology, Hisar, India
- Manisha Nimbhal
- Department of Pharmaceutical Sciences, Guru Jambheshwar University of Science & Technology, Hisar, India
- Parvin Kumar
- Department of Chemistry, Kurukshetra University, Kurukshetra, India
|
37
|
Pharmacophore modelling, QSAR study, molecular docking and insilico ADME prediction of 1,2,3-triazole and pyrazolopyridones as DprE1 inhibitor antitubercular agents. SN APPLIED SCIENCES 2020. [DOI: 10.1007/s42452-020-2638-y] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 12/19/2022] Open
|
38
|
Schaduangrat N, Lampa S, Simeon S, Gleeson MP, Spjuth O, Nantasenamat C. Towards reproducible computational drug discovery. J Cheminform 2020; 12:9. [PMID: 33430992 PMCID: PMC6988305 DOI: 10.1186/s13321-020-0408-x] [Citation(s) in RCA: 78] [Impact Index Per Article: 19.5] [Received: 07/17/2019] [Accepted: 01/02/2020] [Indexed: 12/11/2022] Open
Abstract
The reproducibility of experiments has been a long standing impediment for further scientific progress. Computational methods have been instrumental in drug discovery efforts owing to its multifaceted utilization for data collection, pre-processing, analysis and inference. This article provides an in-depth coverage on the reproducibility of computational drug discovery. This review explores the following topics: (1) the current state-of-the-art on reproducible research, (2) research documentation (e.g. electronic laboratory notebook, Jupyter notebook, etc.), (3) science of reproducible research (i.e. comparison and contrast with related concepts as replicability, reusability and reliability), (4) model development in computational drug discovery, (5) computational issues on model development and deployment, (6) use case scenarios for streamlining the computational drug discovery protocol. In computational disciplines, it has become common practice to share data and programming codes used for numerical calculations as to not only facilitate reproducibility, but also to foster collaborations (i.e. to drive the project further by introducing new ideas, growing the data, augmenting the code, etc.). It is therefore inevitable that the field of computational drug design would adopt an open approach towards the collection, curation and sharing of data/code.
Affiliation(s)
- Nalini Schaduangrat
- Center of Data Mining and Biomedical Informatics, Faculty of Medical Technology, Mahidol University, 10700, Bangkok, Thailand
- Samuel Lampa
- Department of Pharmaceutical Biosciences, Uppsala University, 751 24, Uppsala, Sweden
- Saw Simeon
- Interdisciplinary Graduate Program in Bioscience, Faculty of Science, Kasetsart University, 10900, Bangkok, Thailand
- Matthew Paul Gleeson
- Department of Biomedical Engineering, Faculty of Engineering, King Mongkut's Institute of Technology Ladkrabang, 10520, Bangkok, Thailand.
- Ola Spjuth
- Department of Pharmaceutical Biosciences, Uppsala University, 751 24, Uppsala, Sweden.
- Chanin Nantasenamat
- Center of Data Mining and Biomedical Informatics, Faculty of Medical Technology, Mahidol University, 10700, Bangkok, Thailand.
|
39
|
Ambure P, Cordeiro MNDS. Importance of Data Curation in QSAR Studies Especially While Modeling Large-Size Datasets. METHODS IN PHARMACOLOGY AND TOXICOLOGY 2020. [DOI: 10.1007/978-1-0716-0150-1_5] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Indexed: 12/16/2022]
|
40
|
Bibi S, Wang YB, Tang DX, Kamal MA, Yu H. Prospects for Discovering the Secondary Metabolites of Cordyceps Sensu Lato by the Integrated Strategy. Med Chem 2019; 17:97-120. [PMID: 31880251 DOI: 10.2174/1573406416666191227120425] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.4] [Received: 08/23/2019] [Revised: 12/10/2019] [Accepted: 12/10/2019] [Indexed: 11/22/2022]
Abstract
BACKGROUND: Some species of Cordyceps sensu lato are famous Chinese herbs with significant biological activities, often used as edible food and traditional medicine in China. Cordyceps represents the largest entomopathogenic group of fungi, including 40 genera and 1339 species in three families and incertae sedis of Hypocreales.
OBJECTIVE: Most of the Cordyceps-derivatives have been approved clinically for the treatment of various diseases such as diabetes, cancers, inflammation, cardiovascular, renal and neurological disorders and are used worldwide as supplements and herbal drugs, but there is still need for highly efficient Cordyceps-derived drugs for fatal diseases with approval of the U.S. Food and Drug Administration.
METHODS: Computer-aided drug design concepts could improve the discovery of putative Cordyceps-derived medicine within less time and low budget. The integration of computer-aided drug design methods with experimental validation has contributed to the successful discovery of novel drugs.
RESULTS: This review focused on modern taxonomy, active metabolites, and modern drug design techniques that could accelerate conventional drug design and discovery of Cordyceps s. l. Successful application of computer-aided drug design methods in Cordyceps research has been discussed.
CONCLUSION: It has been concluded that computer-aided drug design techniques could influence the multiple target-focused drug design, because each metabolite of Cordyceps has shown significant activities for the various diseases with very few or no side effects.
Affiliation(s)
- Shabana Bibi
  - Yunnan Herbal Laboratory, School of Life Sciences, Yunnan University, Kunming 650091, Yunnan, China
- Yuan-Bing Wang
  - Yunnan Herbal Laboratory, School of Life Sciences, Yunnan University, Kunming 650091, Yunnan, China
- De-Xiang Tang
  - Yunnan Herbal Laboratory, School of Life Sciences, Yunnan University, Kunming 650091, Yunnan, China
- Mohammad Amjad Kamal
  - King Fahd Medical Research Center, King Abdulaziz University, P. O. Box 80216, Jeddah 21589, Saudi Arabia
- Hong Yu
  - Yunnan Herbal Laboratory, School of Life Sciences, Yunnan University, Kunming 650091, Yunnan, China
|
41
|
Miranda PHDS, Lourenço EMG, Morais AMS, de Oliveira PIC, Silverio PSDSN, Jordão AK, Barbosa EG. Molecular modeling of a series of dehydroquinate dehydratase type II inhibitors of Mycobacterium tuberculosis and design of new binders. Mol Divers 2019; 25:1-12. [PMID: 31820222 DOI: 10.1007/s11030-019-10020-1] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Received: 08/28/2019] [Accepted: 11/22/2019] [Indexed: 11/24/2022]
Abstract
Tuberculosis, caused by Mycobacterium tuberculosis (M. tuberculosis), is still responsible for a large number of fatal cases, especially in developing countries, with alarming rates of incidence and prevalence worldwide. M. tuberculosis has a remarkable ability to develop new resistance mechanisms to conventional antimicrobial treatment. Because of this, there is an urgent need for novel bioactive compounds for its treatment. Dehydroquinate dehydratase II (DHQase II) is considered a key enzyme of the shikimate pathway, and it is a promising target for the design of new bioactive compounds with antibacterial action. The aim of this work was the construction of QSAR models to aid the design of new potential DHQase II inhibitors. For that purpose, various molecular modeling approaches, such as activity cliff analysis, QSAR models and computer-aided ligand design, were utilized. A predictive in silico 4D-QSAR model was built using a database comprising 86 inhibitors of DHQase II, and the model was used to predict the activity of the designed ligands. The obtained model predicted DHQase II inhibition well for an external validation dataset ([Formula: see text] = 0.72). The activity cliff analysis also shed light on important structural features that were applied to the ligand design.
Affiliation(s)
- Paulo H de S Miranda
  - Departamento de Farmácia, Universidade Federal do Rio Grande do Norte, Natal, RN, Brazil
- Estela M G Lourenço
  - Departamento de Farmácia, Universidade Federal do Rio Grande do Norte, Natal, RN, Brazil
- Alexander M S Morais
  - Departamento de Farmácia, Universidade Federal do Rio Grande do Norte, Natal, RN, Brazil
- Pedro I C de Oliveira
  - Programa de Pós-Graduação em Bioinformática, Universidade Federal do Rio Grande do Norte, Natal, RN, Brazil
- Alessandro K Jordão
  - Departamento de Farmácia, Universidade Federal do Rio Grande do Norte, Natal, RN, Brazil
- Euzébio G Barbosa
  - Departamento de Farmácia, Universidade Federal do Rio Grande do Norte, Natal, RN, Brazil
  - Programa de Pós-Graduação em Bioinformática, Universidade Federal do Rio Grande do Norte, Natal, RN, Brazil
|
42
|
Kausar S, Falcao AO. A visual approach for analysis and inference of molecular activity spaces. J Cheminform 2019; 11:63. [PMID: 33430986 PMCID: PMC6805449 DOI: 10.1186/s13321-019-0386-z] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Received: 10/01/2018] [Accepted: 10/05/2019] [Indexed: 11/25/2022] Open
Abstract
BACKGROUND Molecular space visualization can help to explore the diversity of large heterogeneous chemical data, which ultimately may increase the understanding of structure-activity relationships (SAR) in drug discovery projects. Visual SAR analysis can therefore be useful for library design, chemical classification for biological evaluation, and virtual screening for the selection of compounds for synthesis or in vitro testing. As such, computational approaches for molecular space visualization have become an important issue in cheminformatics research. The proposed approach uses molecular similarity as the sole input for computing a probabilistic surface of molecular activity (PSMA). This similarity matrix is transformed into 2D using different dimension reduction algorithms (Principal Coordinates Analysis (PCooA), Kruskal multidimensional scaling, Sammon mapping and t-SNE). From this projection, a kernel density function is applied to compute the probability of activity for each coordinate in the new projected space. RESULTS This methodology was tested on four different quantitative structure-activity relationship (QSAR) binary classification data sets, and the PSMAs were computed for each. The generated maps showed internal consistency, with active molecules grouped together for all data sets and all dimensionality reduction algorithms. To validate the quality of the generated maps, the 2D coordinates of test molecules were computed in the new reference space using a data transformation matrix. In total, sixteen PSMAs were built, and their performance was assessed using the Area Under Curve (AUC) and the Matthews Correlation Coefficient (MCC). For the best projections for each data set, AUC testing results ranged from 0.87 to 0.98 and the MCC scores ranged from 0.33 to 0.77, suggesting this methodology can validly capture the complexities of the molecular activity space. All four mapping functions provided generally good results, yet the overall performance of PCooA and t-SNE was slightly better than that of Sammon mapping and Kruskal multidimensional scaling. CONCLUSIONS Our results showed that, by using an appropriate combination of metric space representation and dimensionality reduction applied over metric spaces, it is possible to produce a visual PSMA whose consistency has been validated by using the map as a classification model. The produced maps can be used as prediction tools, as it is simple to project any molecule into this new reference space as long as its similarities to the molecules used to compute the initial similarity matrix can be computed.
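The last step described in this abstract, turning a 2D projection of molecules into a probability-of-activity surface, can be sketched roughly as follows. This is an illustration under stated assumptions, not the authors' implementation: it assumes the 2D coordinates have already been produced by some projection, uses a Gaussian kernel with a hypothetical `bandwidth` parameter, and takes the probability at a point to be the kernel-weighted fraction of active molecules around it. The function name `activity_probability` and the toy data are invented for the example.

```python
import math

def activity_probability(point, coords, labels, bandwidth=1.0):
    """Kernel-weighted fraction of active molecules around `point`.

    coords: list of (x, y) positions from an assumed 2D projection.
    labels: 1 for active, 0 for inactive (same order as coords).
    bandwidth: hypothetical Gaussian kernel width, chosen by the user.
    """
    active_weight = 0.0
    total_weight = 0.0
    for (x, y), label in zip(coords, labels):
        squared_dist = (point[0] - x) ** 2 + (point[1] - y) ** 2
        weight = math.exp(-squared_dist / (2.0 * bandwidth ** 2))
        total_weight += weight
        if label == 1:
            active_weight += weight
    # Far from every molecule the weights underflow; fall back to "unknown".
    return active_weight / total_weight if total_weight > 0 else 0.5

# Toy data (assumption): two actives near the origin, two inactives near (3, 3).
coords = [(0.0, 0.0), (0.2, 0.1), (3.0, 3.0), (3.1, 2.9)]
labels = [1, 1, 0, 0]

p_near_actives = activity_probability((0.1, 0.0), coords, labels)
p_near_inactives = activity_probability((3.0, 3.0), coords, labels)
```

Evaluating the function over a grid of points would give a surface of the kind the abstract calls a PSMA; the bandwidth controls how smooth that surface is.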
Affiliation(s)
- Samina Kausar
  - LaSIGE, Departamento de Informática, Faculdade de Ciências, Universidade de Lisboa, 1749-016 Lisboa, Portugal
  - BioISI: Biosystems & Integrative Sciences Institute, Faculdade de Ciencias, Universidade de Lisboa, 1749-016 Lisboa, Portugal
- Andre O. Falcao
  - LaSIGE, Departamento de Informática, Faculdade de Ciências, Universidade de Lisboa, 1749-016 Lisboa, Portugal
  - BioISI: Biosystems & Integrative Sciences Institute, Faculdade de Ciencias, Universidade de Lisboa, 1749-016 Lisboa, Portugal
|
43
|
Ambure P, Gajewicz-Skretna A, Cordeiro MNDS, Roy K. New Workflow for QSAR Model Development from Small Data Sets: Small Dataset Curator and Small Dataset Modeler. Integration of Data Curation, Exhaustive Double Cross-Validation, and a Set of Optimal Model Selection Techniques. J Chem Inf Model 2019; 59:4070-4076. [DOI: 10.1021/acs.jcim.9b00476] [Citation(s) in RCA: 32] [Impact Index Per Article: 6.4] [Indexed: 12/30/2022]
Affiliation(s)
- Pravin Ambure
  - LAQV@REQUIMTE/Department of Chemistry and Biochemistry, University of Porto, 4169-007 Porto, Portugal
- Agnieszka Gajewicz-Skretna
  - Laboratory of Environmental Chemometrics, Faculty of Chemistry, University of Gdansk, Gdansk 80-308, Poland
- M. Natalia D. S. Cordeiro
  - LAQV@REQUIMTE/Department of Chemistry and Biochemistry, University of Porto, 4169-007 Porto, Portugal
- Kunal Roy
  - Drug Theoretics and Cheminformatics Laboratory, Department of Pharmaceutical Technology, Jadavpur University, Kolkata 700032, India
|
44
|
Exploring the Potential of Spherical Harmonics and PCVM for Compounds Activity Prediction. Int J Mol Sci 2019; 20:ijms20092175. [PMID: 31052500 PMCID: PMC6539940 DOI: 10.3390/ijms20092175] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Received: 02/21/2019] [Revised: 04/14/2019] [Accepted: 04/29/2019] [Indexed: 01/11/2023] Open
Abstract
Biologically active chemical compounds may provide remedies for several diseases. Meanwhile, Machine Learning techniques applied to Drug Discovery, which are cheaper and faster than wet-lab experiments, can identify molecules with the expected pharmacological activity more effectively. Therefore, it is urgent and essential to develop more representative descriptors and reliable classification methods to accurately predict molecular activity. In this paper, we investigate the potential of a novel representation based on Spherical Harmonics fed into a Probabilistic Classification Vector Machines classifier, namely SHPCVM, for the compound activity prediction task. We make use of representation learning to acquire the features which describe the molecules as precisely as possible. To verify the performance of SHPCVM, ten-fold cross-validation tests are performed on twenty-one G protein-coupled receptors (GPCRs). Experimental outcomes (accuracy of 0.86), assessed by classification accuracy, precision, recall, Matthews' Correlation Coefficient and Cohen's kappa, reveal that our relatively short Spherical Harmonics-based representation combined with Probabilistic Classification Vector Machines can achieve very satisfactory performance for GPCRs.
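Two of the metrics named in this abstract, Matthews' Correlation Coefficient and Cohen's kappa, can both be computed from a binary confusion matrix. The sketch below uses the standard textbook formulas; the function names and the example counts are illustrative assumptions and are not taken from the paper.

```python
import math

def matthews_corrcoef(tp, fp, fn, tn):
    """Matthews Correlation Coefficient from binary confusion-matrix counts."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # By convention, return 0 when a row or column of the matrix is empty.
    return (tp * tn - fp * fn) / denom if denom else 0.0

def cohen_kappa(tp, fp, fn, tn):
    """Cohen's kappa: observed agreement corrected for chance agreement."""
    n = tp + fp + fn + tn
    observed = (tp + tn) / n
    # Chance agreement from the marginal totals of predictions and labels.
    expected = ((tp + fp) * (tp + fn) + (tn + fn) * (tn + fp)) / (n * n)
    return (observed - expected) / (1.0 - expected) if expected != 1.0 else 0.0

# Hypothetical counts: 40 true positives, 10 false positives,
# 10 false negatives, 40 true negatives.
mcc_value = matthews_corrcoef(40, 10, 10, 40)
kappa_value = cohen_kappa(40, 10, 10, 40)
```

Both metrics correct for chance agreement, so on imbalanced data they can be much lower than plain accuracy.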
|
45
|
Kausar S, Falcao AO. Analysis and Comparison of Vector Space and Metric Space Representations in QSAR Modeling. Molecules 2019; 24:E1698. [PMID: 31052325 PMCID: PMC6539555 DOI: 10.3390/molecules24091698] [Citation(s) in RCA: 12] [Impact Index Per Article: 2.4] [Received: 02/25/2019] [Revised: 04/18/2019] [Accepted: 04/26/2019] [Indexed: 12/16/2022] Open
Abstract
The performance of quantitative structure-activity relationship (QSAR) models largely depends on the relevance of the selected molecular representation used as input data matrices. This work presents a thorough comparative analysis of two main categories of molecular representations (vector space and metric space) for fitting robust machine learning models in QSAR problems. For the assessment of these methods, seven different molecular representations, comprising RDKit descriptors, five different fingerprint types (MACCS, PubChem, FP2-based, Atom Pair, and ECFP4), and a graph matching approach (non-contiguous atom matching structure similarity; NAMS), in both vector space and metric space, were subjected to state-of-the-art machine learning methods that included different dimensionality reduction methods (feature selection and linear dimensionality reduction). Five distinct QSAR data sets were used for direct assessment and analysis. Results show that, in general, metric-space and vector-space representations are able to produce equivalent models, but there are significant differences between individual approaches. The NAMS-based similarity approach consistently outperformed most fingerprint representations in model quality, closely followed by Atom Pair fingerprints. To further verify these findings, the metric space-based models were fitted to the same data sets with the closest neighbors removed. These latter results further strengthened the above conclusions. The metric space graph-based approach appeared significantly superior to the other representations, albeit at a significant computational cost.
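The difference between the two families of representation can be illustrated with a small sketch. It assumes fingerprints stored as sets of on-bit indices (toy data invented for the example) and uses the Tanimoto coefficient as the similarity measure, which is a common choice for fingerprints but is an assumption here. In the vector-space view each molecule is described by its own bits; in the metric-space view it is described by its distances to the other molecules. This is not the paper's pipeline, and the NAMS graph matching approach is not shown.

```python
def tanimoto(a, b):
    """Tanimoto (Jaccard) similarity between two sets of on-bit indices."""
    union = len(a | b)
    return len(a & b) / union if union else 1.0

def to_bit_vector(bits, n_bits):
    """Vector-space view: one 0/1 feature per fingerprint bit."""
    return [1 if i in bits else 0 for i in range(n_bits)]

def distance_matrix(fingerprints):
    """Metric-space view: each row holds distances (1 - similarity) to all molecules."""
    return [[1.0 - tanimoto(p, q) for q in fingerprints] for p in fingerprints]

# Hypothetical 8-bit fingerprints for three molecules.
fingerprints = [{0, 1, 2}, {1, 2, 3}, {5, 6}]

vector_space = [to_bit_vector(fp, 8) for fp in fingerprints]
metric_space = distance_matrix(fingerprints)
```

A model fitted on `vector_space` sees individual bits as features, while a model fitted on `metric_space` sees only pairwise distances, so a new molecule can be represented as long as its similarity to the reference molecules can be computed.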
Affiliation(s)
- Samina Kausar
  - LASIGE, Faculdade de Ciencias, Universidade de Lisboa, 1749-016 Lisboa, Portugal
  - BioISI-Biosystems & Integrative Sciences Institute, Faculdade de Ciencias, Universidade de Lisboa, 1749-016 Lisboa, Portugal
- Andre O Falcao
  - LASIGE, Faculdade de Ciencias, Universidade de Lisboa, 1749-016 Lisboa, Portugal
  - BioISI-Biosystems & Integrative Sciences Institute, Faculdade de Ciencias, Universidade de Lisboa, 1749-016 Lisboa, Portugal
|
46
|
Kazmi SR, Jun R, Yu MS, Jung C, Na D. In silico approaches and tools for the prediction of drug metabolism and fate: A review. Comput Biol Med 2019; 106:54-64. [PMID: 30682640 DOI: 10.1016/j.compbiomed.2019.01.008] [Citation(s) in RCA: 52] [Impact Index Per Article: 10.4] [Received: 09/19/2018] [Revised: 01/14/2019] [Accepted: 01/14/2019] [Indexed: 01/08/2023]
Abstract
The fate of administered drugs is largely influenced by their metabolism. For example, endogenous enzyme-catalyzed conversion of drugs may result in therapeutic inactivation or activation or may transform the drugs into toxic chemical compounds. This highlights the importance of drug metabolism in drug discovery and development, and accounts for the wide variety of experimental technologies that provide insights into the fate of drugs. In view of the high cost of traditional drug development, a number of computational approaches have been developed for predicting the metabolic fate of drug candidates, allowing for screening of large numbers of chemical compounds and then identifying a small number of promising candidates. In this review, we introduce in silico approaches and tools that have been developed to predict drug metabolism and fate, and assess their potential to facilitate the virtual discovery of promising drug candidates. We also provide a brief description of various recent models for predicting different aspects of enzyme-drug reactions and provide a list of recent in silico tools used for drug metabolism prediction.
Affiliation(s)
- Sayada Reemsha Kazmi
  - School of Integrative Engineering, Chung-Ang University, 84 Heukseok-ro, Dongjak-gu, Seoul, 06974, Republic of Korea
- Ren Jun
  - School of Integrative Engineering, Chung-Ang University, 84 Heukseok-ro, Dongjak-gu, Seoul, 06974, Republic of Korea
- Myeong-Sang Yu
  - School of Integrative Engineering, Chung-Ang University, 84 Heukseok-ro, Dongjak-gu, Seoul, 06974, Republic of Korea
- Chanjin Jung
  - School of Integrative Engineering, Chung-Ang University, 84 Heukseok-ro, Dongjak-gu, Seoul, 06974, Republic of Korea
- Dokyun Na
  - School of Integrative Engineering, Chung-Ang University, 84 Heukseok-ro, Dongjak-gu, Seoul, 06974, Republic of Korea
|