1. Hemant Kumar S, Venkatachalapathy M, Sistla R, Poongavanam V. Advances in molecular glues: exploring chemical space and design principles for targeted protein degradation. Drug Discov Today 2024; 29:104205. [PMID: 39393773 DOI: 10.1016/j.drudis.2024.104205] [Received: 07/06/2024] [Revised: 09/18/2024] [Accepted: 10/04/2024] [Indexed: 10/13/2024]
Abstract
The discovery of the E3 ligase cereblon (CRBN) as the target of thalidomide and its analogs revolutionized the field of targeted protein degradation (TPD). This ubiquitin-mediated degradation pathway was first harnessed by bivalent degraders. Recently, the emergence of low-molecular-weight molecular glue degraders (MGDs) has expanded the TPD landscape, because MGDs operate via the same mechanism while offering attractive physicochemical properties that are consistent with small-molecule therapeutics. This review delves into the discovery and advancement of MGDs, with case studies on cyclin K and the zinc finger protein IKZF2, highlighting the design principles, biological assays and therapeutic applications. Additionally, it examines the chemical space of molecular glues and outlines the collaborative efforts that are fueling innovation in this field.
Affiliation(s)
- S Hemant Kumar
  - thinkMolecular Technologies Pvt. Ltd, Haralur, Bangalore, KA 560102, India
- Ramesh Sistla
  - thinkMolecular Technologies Pvt. Ltd, Haralur, Bangalore, KA 560102, India

2. Cho C, Lee S, Bang D, Piao Y, Kim S. ChemAP: predicting drug approval with chemical structures before clinical trial phase by leveraging multi-modal embedding space and knowledge distillation. Sci Rep 2024; 14:23010. [PMID: 39362916 PMCID: PMC11449903 DOI: 10.1038/s41598-024-72868-0] [Received: 07/22/2024] [Accepted: 09/11/2024] [Indexed: 10/05/2024] [Open access]
Abstract
Recent studies showed that the likelihood of drug approval can be predicted with clinical data and structure information of drug using computational approaches. Predicting the likelihood of drug approval can be innovative and of high impact. However, models that leverage clinical data are applicable only in clinical stages, which is not very practical. Prioritizing drug candidates and early-stage decision-making in the de novo drug development process is crucial in pharmaceutical research to optimize resource allocation. For early-stage decision-making, we need a computational model that uses only chemical structures. This seemingly impossible task may utilize the predictive power with multi-modal features including clinical data. In this work, we introduce ChemAP (Chemical structure-based drug Approval Predictor), a novel deep learning scheme for drug approval prediction in the early-stage drug discovery phase. ChemAP aims to enhance the possibility of early-stage decision-making by enriching semantic knowledge to fill in the gap between multi-modal and single-modal chemical spaces through knowledge distillation techniques. This approach facilitates the effective construction of chemical space solely from chemical structure data, guided by multi-modal knowledge related to efficacy, such as clinical trials and patents of drugs. In this study, ChemAP achieved state-of-the-art performance, outperforming both traditional machine learning and deep learning models in drug approval prediction, with AUROC and AUPRC scores of 0.782 and 0.842 respectively on the drug approval benchmark dataset. Additionally, we demonstrated its generalizability by outperforming baseline models on a recent external dataset, which included drugs from the 2023 FDA-approved list and the 2024 clinical trial failure drug list, achieving AUROC and AUPRC scores of 0.694 and 0.851. 
These results demonstrate that ChemAP is an effective method in predicting drug approval only with chemical structure information of drug so that decision-making can be done at the early stages of drug development process. To the best of our knowledge, our work is the first of its kind to show that prediction of drug approval is possible only with structure information of drug by defining the chemical space of approved and unapproved drugs using deep learning technology.
Affiliation(s)
- Changyun Cho
  - Interdisciplinary Program in Bioinformatics, Seoul National University, Seoul, 08826, Republic of Korea
  - AIGENDRUG Co., Ltd, Seoul, Republic of Korea
- Sangseon Lee
  - Institute of Computer Technology, Seoul National University, Seoul, 08826, Republic of Korea
  - Department of Artificial Intelligence, Inha University, Incheon, 22212, Republic of Korea
- Dongmin Bang
  - Interdisciplinary Program in Bioinformatics, Seoul National University, Seoul, 08826, Republic of Korea
  - AIGENDRUG Co., Ltd, Seoul, Republic of Korea
- Yinhua Piao
  - Department of Computer Science and Engineering, Seoul National University, Seoul, 08826, Republic of Korea
- Sun Kim
  - Interdisciplinary Program in Bioinformatics, Seoul National University, Seoul, 08826, Republic of Korea
  - Department of Computer Science and Engineering, Seoul National University, Seoul, 08826, Republic of Korea
  - AIGENDRUG Co., Ltd, Seoul, Republic of Korea
  - Interdisciplinary Program in Artificial Intelligence, Seoul National University, Seoul, 08826, Republic of Korea

3. Ai Q, Meng F, Shi J, Pelkie B, Coley CW. Extracting structured data from organic synthesis procedures using a fine-tuned large language model. Digital Discovery 2024; 3:1822-1831. [PMID: 39157760 PMCID: PMC11322921 DOI: 10.1039/d4dd00091a] [Received: 04/06/2024] [Accepted: 07/30/2024] [Indexed: 08/20/2024]
Abstract
The popularity of data-driven approaches and machine learning (ML) techniques in the field of organic chemistry and its various subfields has increased the value of structured reaction data. Most data in chemistry is represented by unstructured text, and despite the vastness of the organic chemistry literature (papers, patents), manual conversion from unstructured text to structured data remains a largely manual endeavor. Software tools for this task would facilitate downstream applications such as reaction prediction and condition recommendation. In this study, we fine-tune a large language model (LLM) to extract reaction information from organic synthesis procedure text into structured data following the Open Reaction Database (ORD) schema, a comprehensive data structure designed for organic reactions. The fine-tuned model produces syntactically correct ORD records with an average accuracy of 91.25% for ORD "messages" (e.g., full compound, workups, or condition definitions) and 92.25% for individual data fields (e.g., compound identifiers, mass quantities), with the ability to recognize compound-referencing tokens and to infer reaction roles. We investigate its failure modes and evaluate performance on specific subtasks such as reaction role classification.
Affiliation(s)
- Qianxiang Ai
  - Department of Chemical Engineering, Massachusetts Institute of Technology, Cambridge, MA, USA
- Fanwang Meng
  - Department of Chemical Engineering, Massachusetts Institute of Technology, Cambridge, MA, USA
- Jiale Shi
  - Department of Chemical Engineering, Massachusetts Institute of Technology, Cambridge, MA, USA
- Brenden Pelkie
  - Department of Chemical Engineering, University of Washington, Seattle, WA, USA
- Connor W Coley
  - Department of Chemical Engineering, Massachusetts Institute of Technology, Cambridge, MA, USA

4. Manen-Freixa L, Antolin AA. Polypharmacology prediction: the long road toward comprehensively anticipating small-molecule selectivity to de-risk drug discovery. Expert Opin Drug Discov 2024; 19:1043-1069. [PMID: 39004919 DOI: 10.1080/17460441.2024.2376643] [Received: 03/15/2024] [Accepted: 07/02/2024] [Indexed: 07/16/2024]
Abstract
INTRODUCTION: Small molecules often bind to multiple targets, a behavior termed polypharmacology. Anticipating polypharmacology is essential for drug discovery since unknown off-targets can modulate safety and efficacy - profoundly affecting drug discovery success. Unfortunately, experimental methods to assess selectivity present significant limitations and drugs still fail in the clinic due to unanticipated off-targets. Computational methods are a cost-effective, complementary approach to predict polypharmacology. AREAS COVERED: This review aims to provide a comprehensive overview of the state of polypharmacology prediction and discuss its strengths and limitations, covering both classical cheminformatics methods and bioinformatic approaches. The authors review available data sources, paying close attention to their different coverage. The authors then discuss major algorithms grouped by the types of data that they exploit using selected examples. EXPERT OPINION: Polypharmacology prediction has made impressive progress over the last decades and contributed to identify many off-targets. However, data incompleteness currently limits most approaches to comprehensively predict selectivity. Moreover, our limited agreement on model assessment challenges the identification of the best algorithms - which at present show modest performance in prospective real-world applications. Despite these limitations, the exponential increase of multidisciplinary Big Data and AI hold much potential to better polypharmacology prediction and de-risk drug discovery.
Affiliation(s)
- Leticia Manen-Freixa
  - Oncobell Division, Bellvitge Biomedical Research Institute (IDIBELL) and ProCURE Department, Catalan Institute of Oncology (ICO), Barcelona, Spain
- Albert A Antolin
  - Oncobell Division, Bellvitge Biomedical Research Institute (IDIBELL) and ProCURE Department, Catalan Institute of Oncology (ICO), Barcelona, Spain
  - Center for Cancer Drug Discovery, The Division of Cancer Therapeutics, The Institute of Cancer Research, London, UK

5. Feller AL, Wilke CO. Peptide-specific chemical language model successfully predicts membrane diffusion of cyclic peptides. bioRxiv 2024:2024.08.09.607221. [PMID: 39149303 PMCID: PMC11326283 DOI: 10.1101/2024.08.09.607221] [Indexed: 08/17/2024]
Abstract
Biological language modeling has significantly advanced the prediction of membrane penetration for small molecule drugs and natural peptides. However, accurately predicting membrane diffusion for peptides with pharmacologically relevant modifications remains a substantial challenge. Here, we introduce PeptideCLM, a peptide-focused chemical language model capable of encoding peptides with chemical modifications, unnatural or non-canonical amino acids, and cyclizations. We assess this model by predicting membrane diffusion of cyclic peptides, demonstrating greater predictive power than existing chemical language models. Our model is versatile, able to be extended beyond membrane diffusion predictions to other target values. Its advantages include the ability to model macromolecules using chemical string notation, a largely unexplored domain, and a simple, flexible architecture that allows for adaptation to any peptide or other macromolecule dataset.
Affiliation(s)
- Aaron L Feller
  - Interdisciplinary Life Sciences, The University of Texas, Austin
- Claus O Wilke
  - Department of Integrative Biology, The University of Texas, Austin
  - Interdisciplinary Life Sciences, The University of Texas, Austin

6. Morin L, Weber V, Meijer GI, Yu F, Staar PWJ. PatCID: an open-access dataset of chemical structures in patent documents. Nat Commun 2024; 15:6532. [PMID: 39095357 PMCID: PMC11297020 DOI: 10.1038/s41467-024-50779-y] [Received: 02/17/2024] [Accepted: 07/19/2024] [Indexed: 08/04/2024] [Open access]
Abstract
The automatic analysis of patent publications has potential to accelerate research across various domains, including drug discovery and material science. Within patent documents, crucial information often resides in visual depictions of molecule structures. PatCID (Patent-extracted Chemical-structure Images database for Discovery) allows to access such information at scale. It enables users to search which molecules are displayed in which documents. PatCID contains 81M chemical-structure images and 14M unique chemical structures. Here, we compare PatCID with state-of-the-art chemical patent-databases. On a random set, PatCID retrieves 56.0% of molecules, which is higher than automatically-created databases, Google Patents (41.5%) and SureChEMBL (23.5%), as well as manually-created databases, Reaxys (53.5%) and SciFinder (49.5%). Leveraging state-of-the-art methods of document understanding, PatCID high-quality data outperforms currently available automatically-generated patent-databases. PatCID even competes with proprietary manually-created patent-databases. This enables promising applications for automatic literature review and learning-based molecular generation methods. The dataset is freely accessible for download.
Affiliation(s)
- Lucas Morin
  - IBM Research, Säumerstrasse 4, 8803, Rüschlikon, Switzerland
  - Department of Information Technology and Electrical Engineering, ETH Zürich, Sternwartstrasse 7, 8092, Zürich, Switzerland
- Valéry Weber
  - IBM Research, Säumerstrasse 4, 8803, Rüschlikon, Switzerland
- Fisher Yu
  - Department of Information Technology and Electrical Engineering, ETH Zürich, Sternwartstrasse 7, 8092, Zürich, Switzerland
- Peter W J Staar
  - IBM Research, Säumerstrasse 4, 8803, Rüschlikon, Switzerland

7. Saifi I, Bhat BA, Hamdani SS, Bhat UY, Lobato-Tapia CA, Mir MA, Dar TUH, Ganie SA. Artificial intelligence and cheminformatics tools: a contribution to the drug development and chemical science. J Biomol Struct Dyn 2024; 42:6523-6541. [PMID: 37434311 DOI: 10.1080/07391102.2023.2234039] [Received: 02/12/2023] [Accepted: 07/03/2023] [Indexed: 07/13/2023]
Abstract
In the ever-evolving field of drug discovery, the integration of Artificial Intelligence (AI) and Machine Learning (ML) with cheminformatics has proven to be a powerful combination. Cheminformatics, which combines the principles of computer science and chemistry, is used to extract chemical information and search compound databases, while the application of AI and ML allows for the identification of potential hit compounds, optimization of synthesis routes, and prediction of drug efficacy and toxicity. This collaborative approach has led to the discovery, preclinical evaluations and approval of over 70 drugs in recent years. To aid researchers in the pursuit of new drugs, this article presents a comprehensive list of databases, datasets, predictive and generative models, scoring functions and web platforms that have been launched between 2021 and 2022. These resources provide a wealth of information and tools for computer-assisted drug development, and are a valuable asset for those working in the field of cheminformatics. Overall, the integration of AI, ML and cheminformatics has greatly advanced the drug discovery process and continues to hold great potential for the future. As new resources and technologies become available, we can expect to see even more groundbreaking discoveries and advancements in these fields. Communicated by Ramaswamy H. Sarma.
Affiliation(s)
- Ifra Saifi
  - Chaudhary Charan Singh University, Meerut, Uttar Pradesh, India
- Basharat Ahmad Bhat
  - Department of Bioresources, School of Biological Sciences, University of Kashmir, Srinagar, J&K, India
- Syed Suhail Hamdani
  - Department of Bioresources, School of Biological Sciences, University of Kashmir, Srinagar, J&K, India
- Umar Yousuf Bhat
  - Department of Zoology, School of Biological Sciences, University of Kashmir, Srinagar, J&K, India
- Mushtaq Ahmad Mir
  - Department of Clinical Laboratory Sciences, College of Applied Medical Science, King Khalid University, KSA, Saudi Arabia
- Tanvir Ul Hasan Dar
  - Department of Biotechnology, School of Biosciences and Biotechnology, BGSB University, Rajouri, India
- Showkat Ahmad Ganie
  - Department of Clinical Biochemistry, School of Biological Sciences, University of Kashmir, Srinagar, J&K, India

8. Domingo-Fernández D, Gadiya Y, Preto A, Krettler CA, Mubeen S, Allen A, Healey D, Colluru V. Natural Products Have Increased Rates of Clinical Trial Success throughout the Drug Development Process. Journal of Natural Products 2024; 87:1844-1851. [PMID: 38970498 PMCID: PMC11287737 DOI: 10.1021/acs.jnatprod.4c00581] [Received: 05/17/2024] [Revised: 06/15/2024] [Accepted: 06/16/2024] [Indexed: 07/08/2024]
Abstract
Natural products (NPs) or their derivatives represent a large proportion of drugs that successfully progress through clinical trials to approval. This study explores the presence of NPs in both early- and late-stage drug discovery to determine their success rate, and the factors or features of natural products that contribute to such success. As a proxy for early drug development stages, we analyzed patent applications over several decades, finding a consistent proportion of NP, NP-derived, and synthetic-compound-based patent documents, with the latter group outnumbering NP and NP-derived ones (approximately 77% vs 23%). We next assessed clinical trial data, where we observed a steady increase in NP and NP-derived compounds from clinical trial phases I to III (from approximately 35% in phase I to 45% in phase III), with an inverse trend observed in synthetics (from approximately 65% in phase I to 55% in phase III). Finally, in vitro and in silico toxicity studies revealed that NPs and their derivatives were less toxic alternatives to their synthetic counterparts. These discoveries offer valuable insights for successful NP-based drug development, highlighting the potential benefits of prioritizing NPs and their derivatives as starting points.
Affiliation(s)
- Yojana Gadiya
  - Enveda Biosciences, 5700 Flatiron Parkway, Boulder, Colorado 80301, United States
- António José Preto
  - Enveda Biosciences, 5700 Flatiron Parkway, Boulder, Colorado 80301, United States
- Sarah Mubeen
  - Enveda Biosciences, 5700 Flatiron Parkway, Boulder, Colorado 80301, United States
- August Allen
  - Enveda Biosciences, 5700 Flatiron Parkway, Boulder, Colorado 80301, United States
- David Healey
  - Enveda Biosciences, 5700 Flatiron Parkway, Boulder, Colorado 80301, United States
- Viswa Colluru
  - Enveda Biosciences, 5700 Flatiron Parkway, Boulder, Colorado 80301, United States

9. Godinez-Macias KP, Winzeler EA. CACTI: an in silico chemical analysis tool through the integration of chemogenomic data and clustering analysis. J Cheminform 2024; 16:84. [PMID: 39049122 PMCID: PMC11270953 DOI: 10.1186/s13321-024-00885-2] [Received: 03/05/2024] [Accepted: 07/14/2024] [Indexed: 07/27/2024] [Open access]
Abstract
It is well-accepted that knowledge of a small molecule's target can accelerate optimization. Although chemogenomic databases are helpful resources for predicting or finding compound interaction partners, they tend to be limited and poorly annotated. Furthermore, unlike genes, compound identifiers are often not standardized, and many synonyms may exist, especially in the biological literature, making batch analysis of compounds difficult. Here, we constructed an open-source annotation and target hypothesis prediction tool that explores some of the largest chemical and biological databases, mining these for both common name, synonyms, and structurally similar molecules. We used this Chemical Analysis and Clustering for Target Identification (CACTI) tool to analyze the Pathogen Box collection, an open-source set of 400 drug-like compounds active against a variety of microbial pathogens. Our analysis resulted in 4,315 new synonyms, 35,963 pieces of new information and target prediction hints for 58 members. Scientific contributions: With the employment of this tool, a comprehensive report with known evidence, close analogs and drug-target prediction can be obtained for large-scale chemical libraries that will facilitate their evaluation and future target validation and optimization efforts.
Affiliation(s)
- Karla P Godinez-Macias
  - Department of Pediatrics, University of California, San Diego, School of Medicine, La Jolla, CA, 92093, USA
- Elizabeth A Winzeler
  - Department of Pediatrics, University of California, San Diego, School of Medicine, La Jolla, CA, 92093, USA

10. Thomas M, Ahmad M, Tresadern G, de Fabritiis G. PromptSMILES: prompting for scaffold decoration and fragment linking in chemical language models. J Cheminform 2024; 16:77. [PMID: 38965600 PMCID: PMC11225391 DOI: 10.1186/s13321-024-00866-5] [Received: 03/02/2024] [Accepted: 06/04/2024] [Indexed: 07/06/2024] [Open access]
Abstract
SMILES-based generative models are amongst the most robust and successful recent methods used to augment drug design. They are typically used for complete de novo generation, however, scaffold decoration and fragment linking applications are sometimes desirable which requires a different grammar, architecture, training dataset and therefore, re-training of a new model. In this work, we describe a simple procedure to conduct constrained molecule generation with a SMILES-based generative model to extend applicability to scaffold decoration and fragment linking by providing SMILES prompts, without the need for re-training. In combination with reinforcement learning, we show that pre-trained, decoder-only models adapt to these applications quickly and can further optimize molecule generation towards a specified objective. We compare the performance of this approach to a variety of orthogonal approaches and show that performance is comparable or better. For convenience, we provide an easy-to-use python package to facilitate model sampling which can be found on GitHub and the Python Package Index. Scientific contribution: This novel method extends an autoregressive chemical language model to scaffold decoration and fragment linking scenarios. This doesn't require re-training, the use of a bespoke grammar, or curation of a custom dataset, as commonly required by other approaches.
Affiliation(s)
- Morgan Thomas
  - Computational Science Laboratory, Universitat Pompeu Fabra, Barcelona Biomedical Research Park (PRBB), C Dr. Aguiader 88, 08003, Barcelona, Spain
- Mazen Ahmad
  - In Silico Discovery, Janssen Pharmaceutica N. V., Turnhoutseweg 30, 2340, Beerse, Belgium
- Gary Tresadern
  - In Silico Discovery, Janssen Pharmaceutica N. V., Turnhoutseweg 30, 2340, Beerse, Belgium
- Gianni de Fabritiis
  - Computational Science Laboratory, Universitat Pompeu Fabra, Barcelona Biomedical Research Park (PRBB), C Dr. Aguiader 88, 08003, Barcelona, Spain
  - Acellera Labs, C Dr. Trueta 183, 08005, Barcelona, Spain
  - Institució Catalana de Recerca i Estudis Avançats (ICREA), Passeig Lluis Companys 23, 08010, Barcelona, Spain

11. Kosonocky CW, Wilke CO, Marcotte EM, Ellington AD. Mining patents with large language models elucidates the chemical function landscape. Digital Discovery 2024; 3:1150-1159. [PMID: 38873033 PMCID: PMC11167698 DOI: 10.1039/d4dd00011k] [Received: 01/18/2024] [Accepted: 05/03/2024] [Indexed: 06/15/2024]
Abstract
The fundamental goal of small molecule discovery is to generate chemicals with target functionality. While this often proceeds through structure-based methods, we set out to investigate the practicality of methods that leverage the extensive corpus of chemical literature. We hypothesize that a sufficiently large text-derived chemical function dataset would mirror the actual landscape of chemical functionality. Such a landscape would implicitly capture complex physical and biological interactions given that chemical function arises from both a molecule's structure and its interacting partners. To evaluate this hypothesis, we built a Chemical Function (CheF) dataset of patent-derived functional labels. This dataset, comprising 631 K molecule-function pairs, was created using an LLM- and embedding-based method to obtain 1.5 K unique functional labels for approximately 100 K randomly selected molecules from their corresponding 188 K unique patents. We carry out a series of analyses demonstrating that the CheF dataset contains a semantically coherent textual representation of the functional landscape congruent with chemical structural relationships, thus approximating the actual chemical function landscape. We then demonstrate through several examples that this text-based functional landscape can be leveraged to identify drugs with target functionality using a model able to predict functional profiles from structure alone. We believe that functional label-guided molecular discovery may serve as an alternative approach to traditional structure-based methods in the pursuit of designing novel functional molecules.
Affiliation(s)
- Clayton W Kosonocky
  - Department of Molecular Biosciences, University of Texas at Austin, Austin, TX 78705, USA
- Claus O Wilke
  - Department of Integrative Biology, University of Texas at Austin, Austin, TX 78705, USA
- Edward M Marcotte
  - Department of Molecular Biosciences, University of Texas at Austin, Austin, TX 78705, USA
  - Center for Systems and Synthetic Biology, University of Texas at Austin, Austin, TX 78705, USA
- Andrew D Ellington
  - Department of Molecular Biosciences, University of Texas at Austin, Austin, TX 78705, USA
  - Center for Systems and Synthetic Biology, University of Texas at Austin, Austin, TX 78705, USA

12. Zhang Y, Zhang Z, Ke D, Pan X, Wang X, Xiao X, Ji C. FragGrow: A Web Server for Structure-Based Drug Design by Fragment Growing within Constraints. J Chem Inf Model 2024; 64:3970-3976. [PMID: 38725251 DOI: 10.1021/acs.jcim.4c00154] [Indexed: 05/28/2024]
Abstract
Fragment growing is an important ligand design strategy in drug discovery. In this study, we present FragGrow, a web server that facilitates structure-based drug design by fragment growing. FragGrow offers two working modes: one for growing molecules through the direct replacement of hydrogen atoms or substructures and the other for growing via virtual synthesis. FragGrow works by searching for suitable fragments that meet a set of constraints from an indexed 3D fragment database and using them to create new compounds in 3D space. The users can set a range of constraints when searching for their desired fragment, including the fragment's ability to interact with specific protein sites; its size, topology, and physicochemical properties; and the presence of particular heteroatoms and functional groups within the fragment. We hope that FragGrow will serve as a useful tool for medicinal chemists in ligand design. The FragGrow server is freely available to researchers and can be accessed at https://fraggrow.xundrug.cn.
Affiliation(s)
- Yueqing Zhang
  - Shanghai Engineering Research Center of Molecular Therapeutics & New Drug Development, School of Chemistry and Molecular Engineering, East China Normal University, Shanghai, 200062, China
  - NYU-ECNU Center for Computational Chemistry at NYU Shanghai, Shanghai, 200062, China
- Zhihan Zhang
  - Shanghai Engineering Research Center of Molecular Therapeutics & New Drug Development, School of Chemistry and Molecular Engineering, East China Normal University, Shanghai, 200062, China
  - NYU-ECNU Center for Computational Chemistry at NYU Shanghai, Shanghai, 200062, China
- Dongliang Ke
  - Shanghai Engineering Research Center of Molecular Therapeutics & New Drug Development, School of Chemistry and Molecular Engineering, East China Normal University, Shanghai, 200062, China
  - NYU-ECNU Center for Computational Chemistry at NYU Shanghai, Shanghai, 200062, China
- Xiaolin Pan
  - Shanghai Engineering Research Center of Molecular Therapeutics & New Drug Development, School of Chemistry and Molecular Engineering, East China Normal University, Shanghai, 200062, China
  - NYU-ECNU Center for Computational Chemistry at NYU Shanghai, Shanghai, 200062, China
- Xingyu Wang
  - NYU-ECNU Center for Computational Chemistry at NYU Shanghai, Shanghai, 200062, China
- Xudong Xiao
  - Shanghai Engineering Research Center of Molecular Therapeutics & New Drug Development, School of Chemistry and Molecular Engineering, East China Normal University, Shanghai, 200062, China
- Changge Ji
  - Shanghai Engineering Research Center of Molecular Therapeutics & New Drug Development, School of Chemistry and Molecular Engineering, East China Normal University, Shanghai, 200062, China
  - NYU-ECNU Center for Computational Chemistry at NYU Shanghai, Shanghai, 200062, China

13. Gadiya Y, Shetty S, Hofmann-Apitius M, Gribbon P, Zaliani A. Exploring SureChEMBL from a drug discovery perspective. Sci Data 2024; 11:507. [PMID: 38755219 PMCID: PMC11099139 DOI: 10.1038/s41597-024-03371-4] [Received: 01/25/2024] [Accepted: 05/13/2024] [Indexed: 05/18/2024] [Open access]
Abstract
In the pharmaceutical industry, the patent protection of drugs and medicines is accorded importance because of the high costs involved in the development of novel drugs. Over the years, researchers have analyzed patent documents to identify freedom-to-operate spaces for novel drug candidates. To assist this, several well-established public patent document data repositories have enabled automated methodologies for extracting information on therapeutic agents. In this study, we delve into one such publicly available patent database, SureChEMBL, which catalogues patent documents related to life sciences. Our exploration begins by identifying patent compounds across public chemical data resources, followed by pinpointing sections in patent documents where the chemical annotations were found. Next, we exhibit the potential of compounds to serve as drug candidates by evaluating their conformity to drug-likeness criteria. Lastly, we examine the drug development stage reported for these compounds to understand their clinical success. In summary, our investigation aims at providing a comprehensive overview of the patent compounds catalogued in SureChEMBL, assessing their relevance to pharmaceutical drug discovery.
Affiliation(s)
- Yojana Gadiya
- Fraunhofer Institute for Translational Medicine and Pharmacology (ITMP), Schnackenburgallee 114, 22525, Hamburg, Germany.
- Fraunhofer Cluster of Excellence for Immune-Mediated Diseases (CIMD), Theodor Stern Kai 7, 60590, Frankfurt, Germany.
- Bonn-Aachen International Center for Information Technology (B-IT), University of Bonn, 53113, Bonn, Germany.
| | - Simran Shetty
- Fraunhofer Institute for Translational Medicine and Pharmacology (ITMP), Schnackenburgallee 114, 22525, Hamburg, Germany
- Fraunhofer Cluster of Excellence for Immune-Mediated Diseases (CIMD), Theodor Stern Kai 7, 60590, Frankfurt, Germany
- Hamburg University of Applied Sciences (HAW), 20099, Hamburg, Germany
| | - Martin Hofmann-Apitius
- Bonn-Aachen International Center for Information Technology (B-IT), University of Bonn, 53113, Bonn, Germany
- Department of Bioinformatics, Fraunhofer Institute for Algorithms and Scientific Computing (SCAI), Schloss Birlinghoven, 53757, Sankt Augustin, Germany
| | - Philip Gribbon
- Fraunhofer Institute for Translational Medicine and Pharmacology (ITMP), Schnackenburgallee 114, 22525, Hamburg, Germany
- Fraunhofer Cluster of Excellence for Immune-Mediated Diseases (CIMD), Theodor Stern Kai 7, 60590, Frankfurt, Germany
| | - Andrea Zaliani
- Fraunhofer Institute for Translational Medicine and Pharmacology (ITMP), Schnackenburgallee 114, 22525, Hamburg, Germany
- Fraunhofer Cluster of Excellence for Immune-Mediated Diseases (CIMD), Theodor Stern Kai 7, 60590, Frankfurt, Germany
| |
|
14
|
Oprea TI, Bologa C, Holmes J, Mathias S, Metzger VT, Waller A, Yang JJ, Leach AR, Jensen LJ, Kelleher KJ, Sheils TK, Mathé E, Avram S, Edwards JS. Overview of the Knowledge Management Center for Illuminating the Druggable Genome. Drug Discov Today 2024; 29:103882. [PMID: 38218214 PMCID: PMC10939799 DOI: 10.1016/j.drudis.2024.103882] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/18/2023] [Revised: 12/22/2023] [Accepted: 01/09/2024] [Indexed: 01/15/2024]
Abstract
The Knowledge Management Center (KMC) for the Illuminating the Druggable Genome (IDG) project aims to aggregate, update, and articulate protein-centric data knowledge for the entire human proteome, with emphasis on the understudied proteins from the three IDG protein families. KMC collates and analyzes data from over 70 resources to compile the Target Central Resource Database (TCRD), which is the web-based informatics platform (Pharos). These data include experimental, computational, and text-mined information on protein structures, compound interactions, and disease and phenotype associations. Based on this knowledge, proteins are classified into different Target Development Levels (TDLs) for identification of understudied targets. Additional work by the KMC focuses on enriching target knowledge and producing DrugCentral and other data visualization tools for expanding investigation of understudied targets.
Affiliation(s)
- Tudor I Oprea
- Translational Informatics Division, Department of Internal Medicine, University of New Mexico, Albuquerque, NM, USA
| | - Cristian Bologa
- Translational Informatics Division, Department of Internal Medicine, University of New Mexico, Albuquerque, NM, USA
| | - Jayme Holmes
- Translational Informatics Division, Department of Internal Medicine, University of New Mexico, Albuquerque, NM, USA
| | - Stephen Mathias
- Translational Informatics Division, Department of Internal Medicine, University of New Mexico, Albuquerque, NM, USA
| | - Vincent T Metzger
- Translational Informatics Division, Department of Internal Medicine, University of New Mexico, Albuquerque, NM, USA
| | - Anna Waller
- Translational Informatics Division, Department of Internal Medicine, University of New Mexico, Albuquerque, NM, USA
| | - Jeremy J Yang
- Translational Informatics Division, Department of Internal Medicine, University of New Mexico, Albuquerque, NM, USA
| | - Andrew R Leach
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, UK
| | - Lars Juhl Jensen
- Novo Nordisk Foundation Center for Protein Research, Faculty of Health and Medical Sciences, University of Copenhagen, Copenhagen, Denmark
| | - Keith J Kelleher
- National Center for Advancing Translational Sciences (NCATS), NIH, Bethesda, MD, USA
| | - Timothy K Sheils
- National Center for Advancing Translational Sciences (NCATS), NIH, Bethesda, MD, USA
| | - Ewy Mathé
- National Center for Advancing Translational Sciences (NCATS), NIH, Bethesda, MD, USA
| | - Sorin Avram
- Coriolan Dragulescu Institute of Chemistry, Timisoara, Romania
| | - Jeremy S Edwards
- Translational Informatics Division, Department of Internal Medicine, University of New Mexico, Albuquerque, NM, USA; Department of Chemistry and Chemical Biology, University of New Mexico, Albuquerque, NM, USA.
| |
|
15
|
Gómez-Sacristán P, Simeon S, Tran-Nguyen VK, Patil S, Ballester PJ. Inactive-enriched machine-learning models exploiting patent data improve structure-based virtual screening for PDL1 dimerizers. J Adv Res 2024:S2090-1232(24)00037-7. [PMID: 38280715 DOI: 10.1016/j.jare.2024.01.024] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/30/2023] [Revised: 12/01/2023] [Accepted: 01/21/2024] [Indexed: 01/29/2024] Open
Abstract
INTRODUCTION: Small-molecule Programmed Cell Death Protein 1/Programmed Death-Ligand 1 (PD1/PDL1) inhibition via PDL1 dimerization has the potential to lead to inexpensive drugs with better cancer patient outcomes and milder side effects. However, this therapeutic approach has proven challenging, with only one PDL1 dimerizer reaching early clinical trials so far. There is hence a need for fast and accurate methods to develop alternative PDL1 dimerizers. OBJECTIVES: We aim to show that structure-based virtual screening (SBVS) based on PDL1-specific machine-learning (ML) scoring functions (SFs) is a powerful drug design tool for detecting PD1/PDL1 inhibitors via PDL1 dimerization. METHODS: By incorporating the latest MLSF advances, we generated and evaluated PDL1-specific MLSFs (classifiers and inactive-enriched regressors) on two demanding test sets. RESULTS: Sixty PDL1-specific MLSFs (30 classifiers and 30 regressors) were generated. Our large-scale analysis provides highly predictive PDL1-specific MLSFs that benefitted from training with large volumes of docked inactives and from enabling inactive-enriched regression. CONCLUSION: PDL1-specific MLSFs strongly outperformed generic SFs of various types on this target and are released here without restrictions.
Affiliation(s)
| | - Saw Simeon
- Centre de Recherche en Cancérologie de Marseille, Marseille 13009, France
| | | | - Sachin Patil
- NanoBio Laboratory, Widener University, Chester, PA 19013, USA
| | - Pedro J Ballester
- Department of Bioengineering, Imperial College London, London SW7 2AZ, UK.
| |
|
16
|
Zhu TF, Qian R, Wei X, Lu AP, Cao DS. PatentNetML: A Novel Framework for Predicting Key Compounds in Patents Using Network Science and Machine Learning. J Med Chem 2024; 67:1347-1359. [PMID: 38181431 DOI: 10.1021/acs.jmedchem.3c01893] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/07/2024]
Abstract
Patents play a crucial role in drug research and development, providing early access to unpublished data and offering unique insights. Identifying key compounds in patents is essential to finding novel lead compounds. This study collected a comprehensive data set comprising 1555 patents, encompassing 1000 key compounds, to explore innovative approaches for predicting these key compounds. Our novel PatentNetML framework integrated network science and machine learning algorithms, combining network measures, ADMET properties, and physicochemical properties, to construct robust classification models to identify key compounds. Through a model interpretation and an analysis of three compelling case studies, we showcase the potential of PatentNetML in unveiling hidden patterns and connections within diverse patents. While our framework is pioneering, we acknowledge its limitations when applied to patents that deviate from the assumed central pattern. This work serves as a promising foundation for future research endeavors aimed at efficiently identifying promising drug candidates and expediting drug discovery in the pharmaceutical industry.
Affiliation(s)
- Ting-Fei Zhu
- Xiangya School of Pharmaceutical Sciences, Central South University, Changsha 410003, Hunan, China
- School of Chinese Medicine, Hong Kong Baptist University, Hong Kong SAR 999077, China
| | - Rong Qian
- Xiangya School of Pharmaceutical Sciences, Central South University, Changsha 410003, Hunan, China
- School of Chinese Medicine, Hong Kong Baptist University, Hong Kong SAR 999077, China
| | - Xiao Wei
- Xiangya School of Pharmaceutical Sciences, Central South University, Changsha 410003, Hunan, China
| | - Ai-Ping Lu
- School of Chinese Medicine, Hong Kong Baptist University, Hong Kong SAR 999077, China
- Guangdong-Hong Kong-Macau Joint Lab on Chinese Medicine and Immune Disease Research, Guangzhou 510000, China
| | - Dong-Sheng Cao
- Xiangya School of Pharmaceutical Sciences, Central South University, Changsha 410003, Hunan, China
- School of Chinese Medicine, Hong Kong Baptist University, Hong Kong SAR 999077, China
| |
|
17
|
Zdrazil B, Felix E, Hunter F, Manners EJ, Blackshaw J, Corbett S, de Veij M, Ioannidis H, Lopez DM, Mosquera J, Magarinos M, Bosc N, Arcila R, Kizilören T, Gaulton A, Bento A, Adasme M, Monecke P, Landrum G, Leach A. The ChEMBL Database in 2023: a drug discovery platform spanning multiple bioactivity data types and time periods. Nucleic Acids Res 2024; 52:D1180-D1192. [PMID: 37933841 PMCID: PMC10767899 DOI: 10.1093/nar/gkad1004] [Citation(s) in RCA: 71] [Impact Index Per Article: 71.0] [Received: 09/09/2023] [Revised: 10/09/2023] [Accepted: 10/23/2023] [Indexed: 11/08/2023] Open
Abstract
ChEMBL (https://www.ebi.ac.uk/chembl/) is a manually curated, high-quality, large-scale, open, FAIR and Global Core Biodata Resource of bioactive molecules with drug-like properties, previously described in the 2012, 2014, 2017 and 2019 Nucleic Acids Research Database Issues. Since its introduction in 2009, ChEMBL's content has changed dramatically in size and diversity of data types. Through incorporation of multiple new datasets from depositors since the 2019 update, ChEMBL now contains slightly more bioactivity data from deposited data vs data extracted from literature. In collaboration with the EUbOPEN consortium, chemical probe data is now regularly deposited into ChEMBL. Release 27 made curated data available for compounds screened for potential anti-SARS-CoV-2 activity from several large-scale drug repurposing screens. In addition, new patent bioactivity data have been added to the latest ChEMBL releases, and various new features have been incorporated, including a Natural Product likeness score, updated flags for Natural Products, a new flag for Chemical Probes, and the initial annotation of the action type for ∼270 000 bioactivity measurements.
Affiliation(s)
- Barbara Zdrazil
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridgeshire CB10 1SD, UK
| | - Eloy Felix
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridgeshire CB10 1SD, UK
| | - Fiona Hunter
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridgeshire CB10 1SD, UK
| | - Emma J Manners
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridgeshire CB10 1SD, UK
| | - James Blackshaw
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridgeshire CB10 1SD, UK
| | - Sybilla Corbett
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridgeshire CB10 1SD, UK
| | - Marleen de Veij
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridgeshire CB10 1SD, UK
| | - Harris Ioannidis
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridgeshire CB10 1SD, UK
| | - David Mendez Lopez
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridgeshire CB10 1SD, UK
| | - Juan F Mosquera
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridgeshire CB10 1SD, UK
| | - Maria Paula Magarinos
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridgeshire CB10 1SD, UK
| | - Nicolas Bosc
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridgeshire CB10 1SD, UK
| | - Ricardo Arcila
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridgeshire CB10 1SD, UK
| | - Tevfik Kizilören
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridgeshire CB10 1SD, UK
| | - Anna Gaulton
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridgeshire CB10 1SD, UK
| | - A Patrícia Bento
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridgeshire CB10 1SD, UK
| | - Melissa F Adasme
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridgeshire CB10 1SD, UK
| | - Peter Monecke
- Sanofi, R&D, Preclinical Safety, Industriepark Höchst, 65926 Frankfurt am Main, Germany
| | - Gregory A Landrum
- Department of Chemistry and Applied Biosciences, ETH Zürich, 8093 Zürich, Switzerland
| | - Andrew R Leach
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridgeshire CB10 1SD, UK
| |
|
18
|
Martinez-Sevillano M, Falaguera MJ, Mestres J. CIPSI: An open chemical intellectual property service for medicinal chemists. Mol Inform 2024; 43:e202300221. [PMID: 38010631 DOI: 10.1002/minf.202300221] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/30/2023] [Revised: 11/22/2023] [Accepted: 11/23/2023] [Indexed: 11/29/2023]
Abstract
The availability of patent chemical data offers public access to a chemical space that is not well covered by other sources collecting small molecules from scholarly literature. However, open applications to facilitate the search and analysis of biologically-relevant molecular structures present in patents are still largely missing. We have developed CIPSI, an open Chemical Intellectual Property Service @ IMIM to assist medicinal chemists in searching and analysing molecules in SureChEMBL patents. The current version contains 6,240,500 molecules from 236,689 pharmacological patents, of which 5,949,214 are confidently assigned to core chemical structures reminiscent of the Markush structure in the patent claim. The platform includes some graphical tools to facilitate comparative patent analyses between drugs, chemical substructures, and company assignees. CIPSI is available at https://cipsi.org.
Affiliation(s)
- Maria Martinez-Sevillano
- Systems Pharmacology, Research Group on Biomedical Informatics (GRIB), IMIM Hospital del Mar Medical Research Institute, Doctor Aiguader 88, 08028, Barcelona, Spain
| | - Maria J Falaguera
- European Molecular Biology Laboratory - European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, CB10 1SD, UK
- Open Targets, Wellcome Genome Campus, Hinxton, CB10 1SD, UK
| | - Jordi Mestres
- Systems Pharmacology, Research Group on Biomedical Informatics (GRIB), IMIM Hospital del Mar Medical Research Institute, Doctor Aiguader 88, 08028, Barcelona, Spain
- Institut de Quimica Computacional i Catalisi, Facultat de Ciencies, Universitat de Girona, Maria Aurelia Capmany 69, 17003, Girona, Spain
| |
|
19
|
Laha A, Sarkar A, Panja AS, Bandopadhyay R. Screening of Prospective Antiallergic Compound as FcεRI Inhibitors and Its Antiallergic Efficacy Through Immunoinformatics Approaches. Mol Biotechnol 2024; 66:26-33. [PMID: 36988875 DOI: 10.1007/s12033-023-00728-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/26/2022] [Accepted: 03/21/2023] [Indexed: 03/30/2023]
Abstract
The occurrence of allergy, a type I hypersensitivity reaction, is rising exponentially all over the world. Sometimes, allergy proves to be fatal for atopic patients, due to the occurrence of anaphylaxis. This study aims to find an anti-allergic agent that can inhibit the binding of IgE to the human high-affinity IgE receptor (FcεRI), thereby preventing the degranulation of mast cells. A considerable number of potential anti-allergic compounds were assessed for their inhibitory strength through ADMET studies. AutoDock was used for estimating the binding energy between anti-allergic compounds and FcεRI, along with the interacting amino acids. The docked pose showing favorable binding energy was subjected to a molecular dynamics simulation study. Marrubiin, a diterpenoid lactone from Lamiaceae, and epicatechin-3-gallate appear to be effective in blocking FcεRI. This in silico study proposes the use of marrubiin and epicatechin-3-gallate in the downregulation of allergic responses. Owing to its better inhibition constant, the future direction of this study is to analyze the safety and efficacy of marrubiin in anti-allergic activities through in vivo human clinical trials.
Affiliation(s)
- Anubhab Laha
- UGC Centre for Advanced Study, Department of Botany, The University of Burdwan, Golapbag, Burdwan, West Bengal, 713104, India
- Department of Botany, Chandernagore College, Chandernagore, Hooghly, West Bengal, 712136, India
| | - Aniket Sarkar
- Post-Graduate Department of Biotechnology, Oriental Institute of Science and Technology, Vidyasagar University, Midnapore, West Bengal, India
| | - Anindya Sundar Panja
- Department of Biotechnology, Molecular Informatics Laboratory, Oriental Institute of Science and Technology, Vidyasagar University, Midnapore, West Bengal, 721102, India
| | - Rajib Bandopadhyay
- UGC Centre for Advanced Study, Department of Botany, The University of Burdwan, Golapbag, Burdwan, West Bengal, 713104, India.
| |
|
20
|
Kosonocky CW, Wilke CO, Marcotte EM, Ellington AD. Mining Patents with Large Language Models Elucidates the Chemical Function Landscape. arXiv 2023:arXiv:2309.08765v2. [PMID: 38196747 PMCID: PMC10775343] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/11/2024]
Abstract
The fundamental goal of small molecule discovery is to generate chemicals with target functionality. While this often proceeds through structure-based methods, we set out to investigate the practicality of orthogonal methods that leverage the extensive corpus of chemical literature. We hypothesize that a sufficiently large text-derived chemical function dataset would mirror the actual landscape of chemical functionality. Such a landscape would implicitly capture complex physical and biological interactions given that chemical function arises from both a molecule's structure and its interacting partners. To evaluate this hypothesis, we built a Chemical Function (CheF) dataset of patent-derived functional labels. This dataset, comprising 631K molecule-function pairs, was created using an LLM- and embedding-based method to obtain functional labels for approximately 100K molecules from their corresponding 188K unique patents. We carry out a series of analyses demonstrating that the CheF dataset contains a semantically coherent textual representation of the functional landscape congruent with chemical structural relationships, thus approximating the actual chemical function landscape. We then demonstrate that this text-based functional landscape can be leveraged to identify drugs with target functionality using a model able to predict functional profiles from structure alone. We believe that functional label-guided molecular discovery may serve as an orthogonal approach to traditional structure-based methods in the pursuit of designing novel functional molecules.
|
21
|
Insana G, Ignatchenko A, Martin M, Bateman A. MBDBMetrics: an online metrics tool to measure the impact of biological data resources. Bioinformatics Advances 2023; 3:vbad180. [PMID: 38130879 PMCID: PMC10733715 DOI: 10.1093/bioadv/vbad180] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/20/2023] [Revised: 11/13/2023] [Indexed: 12/23/2023]
Abstract
Motivation: There now exist thousands of molecular biology databases covering every aspect of biological data. This database infrastructure takes significant effort and funding to develop and maintain. The creators of these databases need to make strong justifications to funders to prove their impact or importance. There are many publication metrics and tools available, such as Google Scholar to measure citation impact or AltMetrics, which covers multiple measures including social media coverage. Results: In this article, we describe a series of novel impact metrics that have been applied initially to the UniProt database and are now made available via a Google Colab notebook to enable any molecular biology resource to gain several additional metrics. These metrics, powered by freely available APIs from Europe PubMed Central and SureChEMBL, cover mentions of the resource in full-text articles (including the section of the paper in which the mention occurs), grant acknowledgements, and mentions in patent applications. This tool, which we call MBDBMetrics, is a useful adjunct to existing tools. Availability and implementation: The MBDBMetrics tool is available at the following locations: https://colab.research.google.com/drive/1aEmSQR9DGQIZmHAIuQV9mLv7Mw9Ppkin and https://github.com/g-insana/MBDBMetrics.
Affiliation(s)
- Giuseppe Insana
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton CB10 1SD, United Kingdom
| | - Alex Ignatchenko
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton CB10 1SD, United Kingdom
| | - Maria Martin
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton CB10 1SD, United Kingdom
| | - Alex Bateman
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton CB10 1SD, United Kingdom
| |
|
22
|
Shimizu Y, Ohta M, Ishida S, Terayama K, Osawa M, Honma T, Ikeda K. AI-driven molecular generation of not-patented pharmaceutical compounds using world open patent data. J Cheminform 2023; 15:120. [PMID: 38093324 PMCID: PMC10716930 DOI: 10.1186/s13321-023-00791-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/25/2023] [Accepted: 12/02/2023] [Indexed: 12/17/2023] Open
Abstract
Developing compounds with novel structures is important for the production of new drugs. From an intellectual property perspective, confirming the patent status of newly developed compounds is essential, particularly for pharmaceutical companies. Recent advances in artificial intelligence (AI) have made it possible to generate large numbers of compounds. However, confirming the patent status of these generated molecules has been a challenge because there are no free and easy-to-use tools for determining, in a timely manner, whether the generated compounds are novel in terms of patents; additionally, there are no appropriate reference databases for pharmaceutical patents worldwide. In this study, two public databases, SureChEMBL and Google Patents Public Datasets, were used to create a reference database of drug-related patented compounds using international patent classification. An exact structure search system was constructed using InChIKey and a relational database system to rapidly search for compounds in the reference database. Because drug-related patented compounds are a good source for generative AI to learn useful chemical structures, they were used as the training data. Furthermore, molecule generation was successfully directed by increasing and decreasing the number of generated patented compounds through incorporation of patent status (i.e., patented or not) into learning. The use of patent status enabled generation of novel molecules with high drug-likeness. Generation using generative AI with patent information would help to efficiently propose compounds that are novel in terms of pharmaceutical patents. Scientific contribution: In this study, a new molecule-generation method that takes into account the patent status of molecules, which has rarely been considered but is an important feature in drug discovery, was developed. The method enables the generation of novel molecules based on pharmaceutical patents with high drug-likeness and will help in the efficient development of effective drug compounds.
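The exact structure search described in this abstract (InChIKey lookups in a relational database) can be sketched as follows. This is a minimal illustration rather than the authors' system: it assumes SQLite as the relational database, a hypothetical `patent_compound` table, and placeholder patent identifiers. The InChIKeys are the standard keys for aspirin, caffeine, and ibuprofen.

```python
import sqlite3


def build_reference_db(rows):
    """Create an in-memory table of (inchikey, patent_id) pairs."""
    conn = sqlite3.connect(":memory:")
    # The composite primary key gives an index whose leading column is
    # inchikey, so exact-match lookups do not scan the whole table.
    conn.execute(
        "CREATE TABLE patent_compound ("
        " inchikey TEXT NOT NULL,"
        " patent_id TEXT NOT NULL,"
        " PRIMARY KEY (inchikey, patent_id))"
    )
    conn.executemany("INSERT INTO patent_compound VALUES (?, ?)", rows)
    conn.commit()
    return conn


def patents_for(conn, inchikey):
    """Return the patent IDs that list this exact structure, sorted."""
    cur = conn.execute(
        "SELECT patent_id FROM patent_compound"
        " WHERE inchikey = ? ORDER BY patent_id",
        (inchikey,),
    )
    return [row[0] for row in cur.fetchall()]


ASPIRIN = "BSYNRYMUTXBXSQ-UHFFFAOYSA-N"
CAFFEINE = "RYYVLZVUVIJVGH-UHFFFAOYSA-N"
IBUPROFEN = "HEFNNWSXXWATRW-UHFFFAOYSA-N"

# Patent identifiers below are hypothetical placeholders.
conn = build_reference_db([
    (ASPIRIN, "PATENT-0001"),
    (ASPIRIN, "PATENT-0002"),
    (CAFFEINE, "PATENT-0003"),
])

print(patents_for(conn, ASPIRIN))     # structure present in the reference set
print(patents_for(conn, IBUPROFEN))   # structure absent from the reference set
```

Because an InChIKey is a fixed-length hash of the structure, the novelty check for a generated molecule reduces to a single indexed string comparison, which is what makes this kind of lookup fast at scale.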
Affiliation(s)
- Yugo Shimizu
- HPC- and AI-driven Drug Development Platform Division, RIKEN Center for Computational Science, 1-7-22 Suehiro-cho, Tsurumi-ku, Yokohama City, Kanagawa, 230-0045, Japan
- Division of Physics for Life Functions, Keio University Faculty of Pharmacy, 1-5-30 Shibakoen, Minato-ku, Tokyo, 105-8512, Japan
| | - Masateru Ohta
- HPC- and AI-driven Drug Development Platform Division, RIKEN Center for Computational Science, 1-7-22 Suehiro-cho, Tsurumi-ku, Yokohama City, Kanagawa, 230-0045, Japan
| | - Shoichi Ishida
- Graduate School of Medical Life Science, Yokohama City University, 1-7-29 Suehiro-cho, Tsurumi-ku, Yokohama City, Kanagawa, 230-0045, Japan
| | - Kei Terayama
- Graduate School of Medical Life Science, Yokohama City University, 1-7-29 Suehiro-cho, Tsurumi-ku, Yokohama City, Kanagawa, 230-0045, Japan
| | - Masanori Osawa
- Division of Physics for Life Functions, Keio University Faculty of Pharmacy, 1-5-30 Shibakoen, Minato-ku, Tokyo, 105-8512, Japan
| | - Teruki Honma
- RIKEN Center for Biosystems Dynamics Research, 1-7-22 Suehiro-cho, Tsurumi-ku, Yokohama City, Kanagawa, 230-0045, Japan
| | - Kazuyoshi Ikeda
- HPC- and AI-driven Drug Development Platform Division, RIKEN Center for Computational Science, 1-7-22 Suehiro-cho, Tsurumi-ku, Yokohama City, Kanagawa, 230-0045, Japan.
- Division of Physics for Life Functions, Keio University Faculty of Pharmacy, 1-5-30 Shibakoen, Minato-ku, Tokyo, 105-8512, Japan.
| |
|
23
|
Kosonocky CW, Feller AL, Wilke CO, Ellington AD. Using alternative SMILES representations to identify novel functional analogues in chemical similarity vector searches. Patterns (New York, N.Y.) 2023; 4:100865. [PMID: 38106612 PMCID: PMC10724362 DOI: 10.1016/j.patter.2023.100865] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 05/17/2023] [Revised: 08/09/2023] [Accepted: 10/06/2023] [Indexed: 12/19/2023]
Abstract
Chemical similarity searches are a widely used family of in silico methods for identifying pharmaceutical leads. These methods historically relied on structure-based comparisons to compute similarity. Here, we use a chemical language model to create a vector-based chemical search. We extend previous implementations by creating a prompt engineering strategy that utilizes two different chemical string representation algorithms: one for the query and the other for the database. We explore this method by reviewing search results from nine queries with diverse targets. We find that the method identifies molecules with similar patent-derived functionality to the query, as determined by our validated LLM-assisted patent summarization pipeline. Further, many of these functionally similar molecules have different structures and scaffolds from the query, making them unlikely to be found with traditional chemical similarity searches. This method may serve as a new tool for the discovery of novel molecular structural classes that achieve target functionality.
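The ranking step of a vector-based chemical search like the one described here can be sketched with plain cosine similarity. This is only an illustration under stated assumptions: the three-dimensional vectors are toy placeholders standing in for embeddings that a chemical language model would produce from SMILES strings, and the names `cosine`, `top_k`, and `mol_a` to `mol_c` are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def top_k(query_vec, database, k=2):
    """Return the k (name, score) pairs most similar to the query."""
    scored = [(name, cosine(query_vec, vec)) for name, vec in database.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Toy embeddings; real ones would come from the language model.
database = {
    "mol_a": [1.0, 0.0, 0.0],
    "mol_b": [0.9, 0.1, 0.0],
    "mol_c": [0.0, 1.0, 0.0],
}
query = [1.0, 0.05, 0.0]

hits = top_k(query, database, k=2)
print([name for name, _ in hits])
```

In the approach the abstract describes, the query and the database entries would be embedded from two different SMILES representations; that choice affects only how the vectors are produced, not the ranking shown here.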
Affiliation(s)
- Clayton W. Kosonocky
- Department of Molecular Biosciences, University of Texas at Austin, Austin, TX 78705, USA
| | - Aaron L. Feller
- Department of Molecular Biosciences, University of Texas at Austin, Austin, TX 78705, USA
| | - Claus O. Wilke
- Department of Integrative Biology, University of Texas at Austin, Austin, TX 78705, USA
| | - Andrew D. Ellington
- Department of Molecular Biosciences, University of Texas at Austin, Austin, TX 78705, USA
- Center for Systems and Synthetic Biology, University of Texas at Austin, Austin, TX 78705, USA
| |
|
24
|
Tran-Nguyen VK, Junaid M, Simeon S, Ballester PJ. A practical guide to machine-learning scoring for structure-based virtual screening. Nat Protoc 2023; 18:3460-3511. [PMID: 37845361 DOI: 10.1038/s41596-023-00885-w] [Citation(s) in RCA: 9] [Impact Index Per Article: 9.0] [Received: 02/08/2022] [Accepted: 07/03/2023] [Indexed: 10/18/2023]
Abstract
Structure-based virtual screening (SBVS) via docking has been used to discover active molecules for a range of therapeutic targets. Chemical and protein data sets that contain integrated bioactivity information have increased both in number and in size. Artificial intelligence and, more concretely, its machine-learning (ML) branch, including deep learning, have effectively exploited these data sets to build scoring functions (SFs) for SBVS against targets with an atomic-resolution 3D model (e.g., generated by X-ray crystallography or predicted by AlphaFold2). Often outperforming their generic and non-ML counterparts, target-specific ML-based SFs represent the state of the art for SBVS. Here, we present a comprehensive and user-friendly protocol to build and rigorously evaluate these new SFs for SBVS. This protocol is organized into four sections: (i) using a public benchmark of a given target to evaluate an existing generic SF; (ii) preparing experimental data for a target from public repositories; (iii) partitioning data into a training set and a test set for subsequent target-specific ML modeling; and (iv) generating and evaluating target-specific ML SFs by using the prepared training-test partitions. All necessary code and input/output data related to three example targets (acetylcholinesterase, HMG-CoA reductase, and peroxisome proliferator-activated receptor-α) are available at https://github.com/vktrannguyen/MLSF-protocol, can be run by using a single computer within 1 week and make use of easily accessible software/programs (e.g., Smina, CNN-Score, RF-Score-VS and DeepCoy) and web resources. Our aim is to provide practical guidance on how to augment training data to enhance SBVS performance, how to identify the most suitable supervised learning algorithm for a data set, and how to build an SF with the highest likelihood of discovering target-active molecules within a given compound library.
Affiliation(s)
- Muhammad Junaid
  - Centre de Recherche en Cancérologie de Marseille, Marseille, France
- Saw Simeon
  - Centre de Recherche en Cancérologie de Marseille, Marseille, France
|
25
|
John L, Nagamani S, Mahanta HJ, Vaikundamani S, Kumar N, Kumar A, Jamir E, Priyadarsinee L, Sastry GN. Molecular Property Diagnostic Suite Compound Library (MPDS-CL): a structure-based classification of the chemical space. Mol Divers 2023:10.1007/s11030-023-10752-1. [PMID: 37902900 DOI: 10.1007/s11030-023-10752-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/05/2023] [Accepted: 10/17/2023] [Indexed: 11/01/2023]
Abstract
Molecular Property Diagnostic Suite Compound Library (MPDS-CL) is an open-source Galaxy-based cheminformatics web portal that presents a structure-based classification of molecules. A structure-based classification has been carried out for nearly 150 million unique compounds, obtained from 42 publicly available databases and curated for redundancy removal through 97 hierarchically well-defined atom composition-based portions. These were further subjected to a 56-bit fingerprint-based classification algorithm, which led to the formation of 56 structurally well-defined classes. The classes thus obtained were further divided into clusters based on their molecular weight. Thus, the entire set of molecules was put into 56 different classes and 625 clusters. This led to the assignment of a unique ID, named MPDS-AadharID, to each of these 149,169,443 molecules. MPDS-AadharID is akin to the unique number given to citizens in India (similar to the SSN in the US and the NINO in the UK). The unique features of MPDS-CL are (a) several search options, such as exact structure search, substructure search, property-based search and fingerprint-based search, using SMILES, InChIKey and key-in; (b) automatic generation of information for processing with MPDS and other Galaxy tools; (c) the class and cluster of each molecule, which makes searching for similar molecules easier and faster; and (d) information on the presence of the molecules in multiple databases. MPDS-CL can be accessed at https://mpds.neist.res.in:8086/.
Affiliation(s)
- Lijo John
  - Advanced Computation and Data Sciences Division, CSIR - North East Institute of Science and Technology, Jorhat, 785006, India
  - Academy of Scientific and Innovative Research (AcSIR), Ghaziabad, 201002, India
- Selvaraman Nagamani
  - Advanced Computation and Data Sciences Division, CSIR - North East Institute of Science and Technology, Jorhat, 785006, India
  - Academy of Scientific and Innovative Research (AcSIR), Ghaziabad, 201002, India
- Hridoy Jyoti Mahanta
  - Advanced Computation and Data Sciences Division, CSIR - North East Institute of Science and Technology, Jorhat, 785006, India
  - Academy of Scientific and Innovative Research (AcSIR), Ghaziabad, 201002, India
- S Vaikundamani
  - Advanced Computation and Data Sciences Division, CSIR - North East Institute of Science and Technology, Jorhat, 785006, India
- Nandan Kumar
  - Advanced Computation and Data Sciences Division, CSIR - North East Institute of Science and Technology, Jorhat, 785006, India
  - Academy of Scientific and Innovative Research (AcSIR), Ghaziabad, 201002, India
- Asheesh Kumar
  - Advanced Computation and Data Sciences Division, CSIR - North East Institute of Science and Technology, Jorhat, 785006, India
- Esther Jamir
  - Advanced Computation and Data Sciences Division, CSIR - North East Institute of Science and Technology, Jorhat, 785006, India
  - Academy of Scientific and Innovative Research (AcSIR), Ghaziabad, 201002, India
- Lipsa Priyadarsinee
  - Advanced Computation and Data Sciences Division, CSIR - North East Institute of Science and Technology, Jorhat, 785006, India
  - Academy of Scientific and Innovative Research (AcSIR), Ghaziabad, 201002, India
- G Narahari Sastry
  - Advanced Computation and Data Sciences Division, CSIR - North East Institute of Science and Technology, Jorhat, 785006, India
  - Academy of Scientific and Innovative Research (AcSIR), Ghaziabad, 201002, India
|
26
|
Medina J, White AD. Bloom filters for molecules. J Cheminform 2023; 15:95. [PMID: 37828615 PMCID: PMC10571468 DOI: 10.1186/s13321-023-00765-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/14/2023] [Accepted: 09/25/2023] [Indexed: 10/14/2023] Open
Abstract
Ultra-large chemical libraries are reaching tens to hundreds of billions of molecules. A challenge for these libraries is to efficiently check whether a proposed molecule is present. Here we propose and study Bloom filters for testing whether a molecule is present in a set using either string or fingerprint representations. Bloom filters are small enough to hold billions of molecules in just a few GB of memory and check membership in under a millisecond. We found that string representations can have a false positive rate below 1% and require significantly less storage than fingerprints. Canonical SMILES with Bloom filters using the simple FNV (Fowler-Noll-Vo) hashing function provide fast and accurate membership tests with small memory requirements. We provide a general implementation and specific filters for detecting whether a molecule is purchasable, patented, or a natural product according to existing databases at https://github.com/whitead/molbloom.
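The membership test described in this abstract can be illustrated with a small, self-contained sketch. It is not the molbloom implementation: the class name `MoleculeBloomFilter`, the seeding scheme and the double-hashing trick used to derive several bit positions from two FNV-1a hashes are all assumptions made for illustration. Inputs are assumed to be SMILES strings that have already been canonicalized by some external toolkit.

```python
import math


def fnv1a_64(data: bytes, seed: int = 0) -> int:
    """64-bit FNV-1a hash. XORing `seed` into the offset basis is an
    illustrative way to get a second hash, not part of the FNV definition."""
    h = 0xCBF29CE484222325 ^ seed          # FNV-1a 64-bit offset basis
    for byte in data:
        h ^= byte
        h = (h * 0x100000001B3) & 0xFFFFFFFFFFFFFFFF  # FNV prime, keep 64 bits
    return h


class MoleculeBloomFilter:
    """Hypothetical Bloom filter keyed on (already canonical) SMILES strings."""

    def __init__(self, expected_items: int, false_positive_rate: float = 0.01):
        n, p = expected_items, false_positive_rate
        # Standard sizing: m = -n ln(p) / (ln 2)^2 bits, k = (m / n) ln 2 hashes.
        self.num_bits = max(8, math.ceil(-n * math.log(p) / (math.log(2) ** 2)))
        self.num_hashes = max(1, round(self.num_bits / n * math.log(2)))
        self.bits = bytearray((self.num_bits + 7) // 8)

    def _positions(self, smiles: str):
        data = smiles.encode("utf-8")
        h1 = fnv1a_64(data)
        h2 = fnv1a_64(data, seed=0x9E3779B97F4A7C15) | 1  # odd step size
        # Double hashing: position_i = h1 + i * h2 (mod number of bits).
        for i in range(self.num_hashes):
            yield (h1 + i * h2) % self.num_bits

    def add(self, smiles: str) -> None:
        for pos in self._positions(smiles):
            self.bits[pos >> 3] |= 1 << (pos & 7)

    def __contains__(self, smiles: str) -> bool:
        # True means "possibly present"; False means "definitely absent".
        return all(self.bits[pos >> 3] & (1 << (pos & 7))
                   for pos in self._positions(smiles))


library = ["CCO", "c1ccccc1", "CC(=O)O"]
bloom = MoleculeBloomFilter(expected_items=len(library), false_positive_rate=0.01)
for smiles in library:
    bloom.add(smiles)

print("CCO" in bloom)  # members are always reported as present
```

A Bloom filter never produces false negatives, so every molecule that was added is reported as present; a molecule that was not added is usually reported as absent, with a small chance of a false positive controlled by the sizing parameters.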
Affiliation(s)
- Jorge Medina
  - Department of Chemical Engineering, University of Rochester, Rochester, NY, USA
- Andrew D White
  - Department of Chemical Engineering, University of Rochester, Rochester, NY, USA
|
27
|
Takács G, Havasi D, Sándor M, Dohánics Z, Balogh GT, Kiss R. DIY Virtual Chemical Libraries - Novel Starting Points for Drug Discovery. ACS Med Chem Lett 2023; 14:1188-1197. [PMID: 37736187 PMCID: PMC10510501 DOI: 10.1021/acsmedchemlett.3c00146] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/18/2023] [Accepted: 08/28/2023] [Indexed: 09/23/2023] Open
Abstract
The advancement of in silico technologies such as library enumeration and synthetic feasibility prediction has made drug discovery pipelines rely more and more on virtual libraries, which provide a significantly larger pool of compounds than in-stock supplier catalogs. Virtual libraries from external sources, however, may be associated with long delivery times and high costs. In this study, we present a Do-It-Yourself (DIY) combinatorial chemistry library containing over 14 million almost completely novel products built from 1000 low-cost building blocks, based on robust reactions frequently applied in medicinal chemistry laboratories. The applicability of the DIY library to various drug discovery approaches is demonstrated by extensive physicochemical property and structural diversity profiling and by the generation of focused libraries. We found that internally built DIY chemical libraries present a viable alternative to external virtual catalogs by providing access to a large number of low-cost and quickly accessible potential chemical starting points for drug discovery.
Affiliation(s)
- Gergely Takács
  - Department of Chemical and Environmental Process Engineering, Faculty of Chemical Technology and Biotechnology, Budapest University of Technology and Economics, Műegyetem rakpart 3, Budapest 1111, Hungary
  - Mcule.com Kft, Bartók Béla út 105-113, Budapest 1115, Hungary
- Dávid Havasi
  - Department of Chemical and Environmental Process Engineering, Faculty of Chemical Technology and Biotechnology, Budapest University of Technology and Economics, Műegyetem rakpart 3, Budapest 1111, Hungary
  - Mcule.com Kft, Bartók Béla út 105-113, Budapest 1115, Hungary
- Márk Sándor
  - Mcule.com Kft, Bartók Béla út 105-113, Budapest 1115, Hungary
- Zsolt Dohánics
  - Mcule.com Kft, Bartók Béla út 105-113, Budapest 1115, Hungary
- György T. Balogh
  - Department of Chemical and Environmental Process Engineering, Faculty of Chemical Technology and Biotechnology, Budapest University of Technology and Economics, Műegyetem rakpart 3, Budapest 1111, Hungary
  - Department of Pharmaceutical Chemistry, Faculty of Pharmaceutical Sciences, Semmelweis University, Hőgyes Endre utca 7-9, Budapest 1092, Hungary
- Róbert Kiss
  - Mcule.com Kft, Bartók Béla út 105-113, Budapest 1115, Hungary
|
28
|
Clyde A, Liu X, Brettin T, Yoo H, Partin A, Babuji Y, Blaiszik B, Mohd-Yusof J, Merzky A, Turilli M, Jha S, Ramanathan A, Stevens R. AI-accelerated protein-ligand docking for SARS-CoV-2 is 100-fold faster with no significant change in detection. Sci Rep 2023; 13:2105. [PMID: 36747041 PMCID: PMC9901402 DOI: 10.1038/s41598-023-28785-9] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 11/29/2022] [Accepted: 01/24/2023] [Indexed: 02/08/2023] Open
Abstract
Protein-ligand docking is a computational method for identifying drug leads. The method is capable of narrowing a vast library of compounds down to a tractable size for downstream simulation or experimental testing and is widely used in drug discovery. While there has been progress in accelerating scoring of compounds with artificial intelligence, few works have bridged these successes back to the virtual screening community in terms of utility and forward-looking development. We demonstrate the power of high-speed ML models by scoring 1 billion molecules in under a day (50 k predictions per GPU second). We showcase a workflow for docking utilizing surrogate AI-based models as a pre-filter to a standard docking workflow. Our workflow is ten times faster at screening a library of compounds than the standard technique, with an error rate of less than 0.01% in detecting the underlying best-scoring 0.1% of compounds. Our analysis of the speedup explains that another order of magnitude speedup must come from model accuracy rather than computing speed. In order to drive another order of magnitude of acceleration, we share a benchmark dataset consisting of 200 million 3D complex structures and 2D structure scores across a consistent set of 13 million "in-stock" molecules over 15 receptors, or binding sites, across the SARS-CoV-2 proteome. We believe this is strong evidence for the community to begin focusing on improving the accuracy of surrogate models to improve the ability to screen massive compound libraries 100× or even 1000× faster than current techniques and reduce missing top hits. The technique outlined aims to be a fast drop-in replacement for docking for screening billion-scale molecular libraries.
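The pre-filter idea in this abstract can be sketched on synthetic data: a cheap surrogate ranks the whole library, and only the best-ranked fraction is passed to the expensive docking step. Everything below is hypothetical and chosen for illustration only, including the function name `prefilter_then_dock`, the 10% retention fraction, the noise levels and the simulated scores; it is not the authors' workflow or code. Lower scores are assumed to be better, as is common for docking scores.

```python
import random


def prefilter_then_dock(molecules, surrogate_score, dock_score, keep_fraction=0.1):
    """Rank all molecules with the cheap surrogate, then dock only the
    best-ranked `keep_fraction`. Lower score is treated as better."""
    ranked = sorted(molecules, key=surrogate_score)
    n_keep = max(1, int(len(ranked) * keep_fraction))
    docked = [(mol, dock_score(mol)) for mol in ranked[:n_keep]]
    return sorted(docked, key=lambda pair: pair[1])


# Synthetic library: integer IDs stand in for molecules.
rng = random.Random(0)
molecules = list(range(10_000))
true_scores = {m: rng.gauss(-6.0, 1.5) for m in molecules}            # "real" docking scores
surrogate_scores = {m: s + rng.gauss(0.0, 0.5) for m, s in true_scores.items()}  # noisy proxy

dock_log = []  # records every expensive docking call


def dock(mol):
    dock_log.append(mol)
    return true_scores[mol]


hits = prefilter_then_dock(molecules, surrogate_scores.get, dock, keep_fraction=0.1)

# Recall of the true best-scoring 0.1% (10 molecules) among the docked shortlist.
top_true = set(sorted(molecules, key=true_scores.get)[:10])
recall = len(top_true & {mol for mol, _ in hits}) / len(top_true)
print(f"docked {len(dock_log)} of {len(molecules)} molecules, top-0.1% recall = {recall:.2f}")

# Throughput arithmetic from the abstract: 1 billion molecules at 50 k predictions per GPU second.
gpu_seconds = 1_000_000_000 / 50_000
print(f"{gpu_seconds:.0f} GPU-seconds, about {gpu_seconds / 3600:.1f} GPU-hours")
```

In this sketch the speedup comes entirely from docking fewer molecules, so the recall of true top scorers depends on how well the surrogate ranks them, which mirrors the abstract's point that further gains must come from surrogate accuracy.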
Affiliation(s)
- Austin Clyde
  - Argonne National Laboratory, Data Science and Learning Division, Chicago, Lemont, 60439, USA
  - Department of Computer Science, University of Chicago, Chicago, 60637, USA
- Xuefeng Liu
  - Department of Computer Science, University of Chicago, Chicago, 60637, USA
- Thomas Brettin
  - Department of Computer Science, University of Chicago, Chicago, 60637, USA
  - Argonne National Laboratory, Computing, Environment, and Life Sciences Directorate, Lemont, 60439, USA
- Hyunseung Yoo
  - Argonne National Laboratory, Data Science and Learning Division, Chicago, Lemont, 60439, USA
- Alexander Partin
  - Argonne National Laboratory, Data Science and Learning Division, Chicago, Lemont, 60439, USA
- Yadu Babuji
  - Department of Computer Science, University of Chicago, Chicago, 60637, USA
- Ben Blaiszik
  - Argonne National Laboratory, Data Science and Learning Division, Chicago, Lemont, 60439, USA
  - University of Chicago, Globus, Chicago, 60637, USA
- Jamaludin Mohd-Yusof
  - Los Alamos National Laboratory, Computer, Computational, and Statistical Sciences, Los Alamos, 87545, USA
- Andre Merzky
  - Department of Electrical and Computer Engineering, Rutgers University, Piscataway, 08854, USA
  - Brookhaven National Laboratory, Computational Sciences Initiative, Upton, 11973, USA
- Matteo Turilli
  - Department of Electrical and Computer Engineering, Rutgers University, Piscataway, 08854, USA
  - Brookhaven National Laboratory, Computational Sciences Initiative, Upton, 11973, USA
- Shantenu Jha
  - Department of Electrical and Computer Engineering, Rutgers University, Piscataway, 08854, USA
  - Brookhaven National Laboratory, Computational Sciences Initiative, Upton, 11973, USA
- Arvind Ramanathan
  - Argonne National Laboratory, Data Science and Learning Division, Chicago, Lemont, 60439, USA
- Rick Stevens
  - Department of Computer Science, University of Chicago, Chicago, 60637, USA
  - Argonne National Laboratory, Computing, Environment, and Life Sciences Directorate, Lemont, 60439, USA
|
29
|
Gadiya Y, Zaliani A, Gribbon P, Hofmann-Apitius M. PEMT: a patent enrichment tool for drug discovery. Bioinformatics 2023; 39:btac716. [PMID: 36322820 PMCID: PMC9805556 DOI: 10.1093/bioinformatics/btac716] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Received: 07/26/2022] [Revised: 10/10/2022] [Accepted: 11/01/2022] [Indexed: 11/11/2022] Open
Abstract
MOTIVATION: Drug discovery practitioners in industry and academia use semantic tools to extract information from online scientific literature to generate new insights into targets, therapeutics and diseases. However, due to complexities in access and analysis, patent-based literature is often overlooked as a source of information. As drug discovery is a highly competitive field, naturally, tools that tap into patent literature can provide any actor in the field an advantage in terms of better informed decision-making. Hence, we aim to facilitate access to patent literature through the creation of an automatic tool for extracting information from patents described in existing public resources. RESULTS: Here, we present PEMT, a novel patent enrichment tool, that takes advantage of public databases like ChEMBL and SureChEMBL to extract relevant patent information linked to chemical structures and/or gene names described through FAIR principles and metadata annotations. PEMT aims at supporting drug discovery and research by establishing a patent landscape around genes of interest. The pharmaceutical focus of the tool is mainly due to the subselection of International Patent Classification codes, but in principle, it can be used for other patent fields, provided that a link between a concept and chemical structure is investigated. Finally, we demonstrate a use-case in rare diseases by generating a gene-patent list based on the epidemiological prevalence of these diseases and exploring their underlying patent landscapes. AVAILABILITY AND IMPLEMENTATION: PEMT is an open-source Python tool and its source code and PyPI package are available at https://github.com/Fraunhofer-ITMP/PEMT and https://pypi.org/project/PEMT/, respectively. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Yojana Gadiya
  - Fraunhofer Institute for Translational Medicine and Pharmacology (ITMP), Hamburg 22525, Germany
  - Fraunhofer Cluster of Excellence for Immune-Mediated Diseases (CIMD), Frankfurt 60590, Germany
- Andrea Zaliani
  - Fraunhofer Institute for Translational Medicine and Pharmacology (ITMP), Hamburg 22525, Germany
  - Fraunhofer Cluster of Excellence for Immune-Mediated Diseases (CIMD), Frankfurt 60590, Germany
- Philip Gribbon
  - Fraunhofer Institute for Translational Medicine and Pharmacology (ITMP), Hamburg 22525, Germany
  - Fraunhofer Cluster of Excellence for Immune-Mediated Diseases (CIMD), Frankfurt 60590, Germany
- Martin Hofmann-Apitius
  - Department of Bioinformatics, Fraunhofer Institute for Algorithms and Scientific Computing (SCAI), Sankt Augustin 53754, Germany
  - Bonn-Aachen International Center for Information Technology (B-IT), University of Bonn, Bonn 53113, Germany
|
30
|
Magariños MP, Gaulton A, Félix E, Kiziloren T, Arcila R, Oprea TI, Leach AR. Illuminating the druggable genome through patent bioactivity data. PeerJ 2023; 11:e15153. [PMID: 37151295 PMCID: PMC10162037 DOI: 10.7717/peerj.15153] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Received: 11/25/2022] [Accepted: 03/10/2023] [Indexed: 05/09/2023] Open
Abstract
The patent literature is a potentially valuable source of bioactivity data. In this article we describe a process to prioritise 3.7 million life science relevant patents obtained from the SureChEMBL database (https://www.surechembl.org/), according to how likely they were to contain bioactivity data for potent small molecules on less-studied targets, based on the classification developed by the Illuminating the Druggable Genome (IDG) project. The overall goal was to select a smaller number of patents that could be manually curated and incorporated into the ChEMBL database. Using relatively simple annotation and filtering pipelines, we have been able to identify a substantial number of patents containing quantitative bioactivity data for understudied targets that had not previously been reported in the peer-reviewed medicinal chemistry literature. We quantify the added value of such methods in terms of the numbers of targets that are so identified, and provide some specific illustrative examples. Our work underlines the potential value in searching the patent corpus in addition to the more traditional peer-reviewed literature. The small molecules found in these patents, together with their measured activity against the targets, are now accessible via the ChEMBL database.
Affiliation(s)
- Anna Gaulton
  - EMBL-EBI, Hinxton, United Kingdom
  - Exscientia, Oxford, United Kingdom
- Tudor I. Oprea
  - Translational informatics Division, Department of Internal Medicine, School of Medicine, University of New Mexico, Albuquerque, United States
|
31
|
Ahmad I, Kuznetsov AE, Pirzada AS, Alsharif KF, Daglia M, Khan H. Computational pharmacology and computational chemistry of 4-hydroxyisoleucine: Physicochemical, pharmacokinetic, and DFT-based approaches. Front Chem 2023; 11:1145974. [PMID: 37123881 PMCID: PMC10133580 DOI: 10.3389/fchem.2023.1145974] [Citation(s) in RCA: 11] [Impact Index Per Article: 11.0] [Received: 02/20/2023] [Accepted: 03/21/2023] [Indexed: 05/02/2023] Open
Abstract
Computational pharmacology and chemistry of drug-like properties, along with pharmacokinetic studies, have made it easier to select or predict potential drug candidates. 4-Hydroxyisoleucine is a pharmacologically active natural product with prominent antidiabetic properties. In this study, ADMETLab 2.0 was used to determine its important drug-related properties. 4-Hydroxyisoleucine is compliant with important drug-like physicochemical properties and with drug-likeness rules such as the Lipinski, Pfizer, and GlaxoSmithKline (GSK) rules. Pharmacokinetically, it has been predicted to have satisfactory cell permeability. Blood-brain barrier permeation may add central nervous system (CNS) effects, while a very slight probability of it being a CYP2C9 substrate exists. None of the well-known toxicities were predicted in silico, which is congruent with wet lab results, except for a "very slight risk" of respiratory toxicity. The molecule is non-ecotoxic as analyzed with common indicators such as bioconcentration and LC50 for fathead minnow and Daphnia magna. The toxicity parameters identified 4-hydroxyisoleucine as non-toxic to androgen receptors, PPAR-γ, mitochondrial membrane receptor, heat shock element, and p53. Moreover, out of seven parameters, not a single toxicophore was found. The density functional theory (DFT) study supported the findings obtained from the drug-like property predictions. Hence, it is logical to proceed further with detailed pharmacokinetics and a drug development process for 4-hydroxyisoleucine.
Affiliation(s)
- Imad Ahmad
  - Department of Pharmacy, Abdul Wali Khan University Mardan, Mardan, Pakistan
- Aleksey E. Kuznetsov
  - Department of Chemistry, Universidad Tecnica Federico Santa Maria, Santiago, Chile
- Khalaf F. Alsharif
  - Department of Clinical Laboratory, College of Applied Medical Science, Taif University, Taif, Saudi Arabia
- Maria Daglia
  - Department of Pharmacy, University of Naples Federico II, Naples, Italy
  - International Research Centre for Food Nutrition and Safety, Jiangsu University, Zhenjiang, China
- Haroon Khan
  - Department of Pharmacy, Abdul Wali Khan University Mardan, Mardan, Pakistan
  - *Correspondence: Haroon Khan,
|
32
|
Jama M, Ahmed M, Jutla A, Wiethan C, Kumar J, Moon TC, West F, Overduin M, Barakat KH. Discovery of allosteric SHP2 inhibitors through ensemble-based consensus molecular docking, endpoint and absolute binding free energy calculations. Comput Biol Med 2023; 152:106442. [PMID: 36566625 DOI: 10.1016/j.compbiomed.2022.106442] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 10/12/2022] [Revised: 12/05/2022] [Accepted: 12/15/2022] [Indexed: 12/24/2022]
Abstract
SHP2 (Src homology-2 domain-containing protein tyrosine phosphatase-2) is a cytoplasmic protein-tyrosine phosphatase encoded by the gene PTPN11. It plays a crucial role in regulating cell growth and differentiation. Specifically, SHP2 is an oncoprotein associated with developmental pathologies and several different cancer types, including gastric cancer, leukemia and breast cancer, and is of great therapeutic interest. Given these roles, current research efforts have focused on developing SHP2 inhibitors. Allosteric SHP2 inhibitors have been shown to be more selective and pharmacologically appealing than competitive catalytic inhibitors targeting SHP2. Nevertheless, there remains a need for novel allosteric inhibitor scaffolds targeting SHP2 to develop compounds with improved selectivity, cell permeability, and bioavailability. Towards this goal, this study applied various computational tools to screen over 6 million compounds against the allosteric site within SHP2. The top-ranked hits from our in silico screening were validated using protein thermal shift and biolayer interferometry assays, revealing three potent compounds. Kinetic binding assays were employed to measure the binding affinities of the top-ranked compounds and demonstrated that they all bind to SHP2 with nanomolar affinity. Hence, the compounds and the computational workflow described herein provide an effective approach for identifying and designing a generation of improved allosteric inhibitors of SHP2.
Affiliation(s)
- Maryam Jama
  - Faculty of Pharmacy and Pharmaceutical Sciences, University of Alberta, Canada
- Marawan Ahmed
  - Faculty of Pharmacy and Pharmaceutical Sciences, University of Alberta, Canada
- Anna Jutla
  - Department of Biochemistry, Faculty of Medicine and Dentistry, University of Alberta, Canada
- Jitendra Kumar
  - Department of Biochemistry, Faculty of Medicine and Dentistry, University of Alberta, Canada
- Tae Chul Moon
  - Faculty of Pharmacy and Pharmaceutical Sciences, University of Alberta, Canada
- Frederick West
  - Department of Chemistry, University of Alberta, Canada
  - Department of Oncology and Cancer Research Institute of Northern Alberta, University of Alberta, Canada
- Michael Overduin
  - Department of Biochemistry, Faculty of Medicine and Dentistry, University of Alberta, Canada
- Khaled H Barakat
  - Faculty of Pharmacy and Pharmaceutical Sciences, University of Alberta, Canada
  - Li Ka Shing Institute of Virology, University of Alberta, Canada
|
33
|
Ciray F, Doğan T. Machine learning-based prediction of drug approvals using molecular, physicochemical, clinical trial, and patent-related features. Expert Opin Drug Discov 2022; 17:1425-1441. [PMID: 36444655 DOI: 10.1080/17460441.2023.2153830] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/30/2022]
Abstract
BACKGROUND: Drug development productivity has been declining lately due to elevated costs and reduced discovery rates. Therefore, pharmaceutical companies have been seeking alternative ways to determine and evaluate drug candidates. RESEARCH DESIGN AND METHODS: In this work, we proposed a new computational approach to directly predict the regulatory approval of drug candidates, and implemented it as a method called 'DrugApp.' To accomplish this task, we employed multiple types of features including molecular and physicochemical properties of drug candidates, together with clinical trial and patent-related features, which are then processed by random forest classifiers to train our disease group-specific approval prediction models. RESULTS: Our evaluations indicated DrugApp has a high and robust prediction performance. Within a use-case study, we showed our method can predict phase IV trial drugs that are later withdrawn from the market due to severe side effects. Finally, we used DrugApp models to forecast the approval of drug candidates that are currently in phases I/II/III of clinical trials. CONCLUSIONS: We hope that our study will aid the research community in terms of evaluating and improving the process of drug development. The datasets, source code, results, and pre-trained models of DrugApp are freely available at https://github.com/HUBioDataLab/DrugApp.
Affiliation(s)
- Fulya Ciray
  - Biological Data Science Laboratory, Department of Computer Engineering, Hacettepe University, Ankara, Turkey
  - Department of Health Informatics, Graduate School of Informatics, METU, Ankara, Turkey
- Tunca Doğan
  - Biological Data Science Laboratory, Department of Computer Engineering, Hacettepe University, Ankara, Turkey
  - Department of Health Informatics, Institute of Informatics, Hacettepe University, Ankara, Turkey
  - Department of Bioinformatics, Graduate School of Health Sciences, Hacettepe University, Ankara, Turkey
|
34
|
Lim S, Lee S, Piao Y, Choi M, Bang D, Gu J, Kim S. On modeling and utilizing chemical compound information with deep learning technologies: A task-oriented approach. Comput Struct Biotechnol J 2022; 20:4288-4304. [PMID: 36051875 PMCID: PMC9399946 DOI: 10.1016/j.csbj.2022.07.049] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/02/2022] [Revised: 07/29/2022] [Accepted: 07/29/2022] [Indexed: 11/22/2022] Open
Abstract
A large number of chemical compounds are available in databases such as PubChem and ZINC. However, the currently known compounds, though numerous, represent only a fraction of all possible compounds, which together make up what is known as chemical space. Many of the compounds in the databases are annotated with properties and assay data that can be used for drug discovery efforts. For this goal, a number of machine learning algorithms have been developed, and recent deep learning technologies can be effectively used to navigate chemical space, especially for unknown chemical compounds, in terms of drug-related tasks. In this article, we survey how deep learning technologies can model and utilize chemical compound information in a task-oriented way by exploiting annotated properties and assay data in chemical compound databases. We first compile the kinds of tasks that machine learning methods are trying to accomplish. Then, we survey deep learning technologies to show their modeling power and current applications for accomplishing drug-related tasks. Next, we survey deep learning techniques that address the insufficiency of annotated data for more effective navigation of chemical space. Chemical compound information alone may not be powerful enough for drug-related tasks; thus, we survey what kinds of information, such as assay and gene expression data, can be used to improve the prediction power of deep learning models. Finally, we conclude this survey with four important newly developed technologies that are yet to be fully incorporated into computational analysis of chemical information.
Affiliation(s)
- Sangsoo Lim
  - Bioinformatics Institute, Seoul National University, Gwanak-ro 1, Gwanak-gu, Seoul 08826, South Korea
- Sangseon Lee
  - Institute of Computer Technology, Seoul National University, Gwanak-ro 1, Gwanak-gu, Seoul 08826, South Korea
- Yinhua Piao
  - Department of Computer Science and Engineering, Seoul National University, Gwanak-ro 1, Gwanak-gu, Seoul 08826, South Korea
- MinGyu Choi
  - Department of Chemistry, Seoul National University, Gwanak-ro 1, Gwanak-gu, Seoul 08826, South Korea
  - AIGENDRUG Co., Ltd., Gwanak-ro 1, Gwanak-gu, Seoul 08826, South Korea
- Dongmin Bang
  - Interdisciplinary Program in Bioinformatics, Seoul National University, Gwanak-ro 1, Gwanak-gu, Seoul 08826, South Korea
- Jeonghyeon Gu
  - Interdisciplinary Program in Artificial Intelligence, Seoul National University, Gwanak-ro 1, Gwanak-gu, Seoul 08826, South Korea
- Sun Kim
  - Department of Computer Science and Engineering, Seoul National University, Gwanak-ro 1, Gwanak-gu, Seoul 08826, South Korea
  - Interdisciplinary Program in Artificial Intelligence, Seoul National University, Gwanak-ro 1, Gwanak-gu, Seoul 08826, South Korea
  - MOGAM Institute for Biomedical Research, Yong-in 16924, South Korea
  - AIGENDRUG Co., Ltd., Gwanak-ro 1, Gwanak-gu, Seoul 08826, South Korea
|
35
|
Artificial intelligence and machine-learning approaches in structure and ligand-based discovery of drugs affecting central nervous system. Mol Divers 2022; 27:959-985. [PMID: 35819579 DOI: 10.1007/s11030-022-10489-3] [Citation(s) in RCA: 8] [Impact Index Per Article: 4.0] [Received: 04/27/2022] [Accepted: 06/21/2022] [Indexed: 12/11/2022]
Abstract
CNS disorders are indications with very high unmet medical need, a relatively small number of available drugs, and a subpar satisfaction level among patients and caregivers. Discovery of CNS drugs is an extremely expensive affair with its own unique challenges, leading to extremely high attrition rates and low efficiency. With the explosion of data in the information age, there is hardly any aspect of life that has not been touched by data-driven technologies such as artificial intelligence (AI) and machine learning (ML). Drug discovery is no exception: the emergence of big data via genomic, proteomic, biological, and chemical technologies has driven pharmaceutical giants to collaborate with AI-oriented companies to revolutionise drug discovery, with the goal of increasing the efficiency of the process. In recent years, many examples of innovative applications of AI and ML techniques in CNS drug discovery have been reported. Research on therapeutics for diseases such as schizophrenia, Alzheimer's and Parkinsonism has been given a new direction and thrust by these developments. AI and ML have been applied to both ligand-based and structure-based drug discovery and design of CNS therapeutics. In this review, we summarise the general aspects of AI and ML from the perspective of drug discovery, followed by comprehensive coverage of recent developments in the applications of AI/ML techniques in CNS drug discovery.
Collapse
|
36
|
Shearer J, Castro JL, Lawson ADG, MacCoss M, Taylor RD. Rings in Clinical Trials and Drugs: Present and Future. J Med Chem 2022; 65:8699-8712. [PMID: 35730680 PMCID: PMC9289879 DOI: 10.1021/acs.jmedchem.2c00473] [Citation(s) in RCA: 110] [Impact Index Per Article: 55.0] [Indexed: 11/28/2022]
Abstract
We present a comprehensive analysis of all ring systems (both heterocyclic and nonheterocyclic) in clinical trial compounds and FDA-approved drugs. We show 67% of small molecules in clinical trials comprise only ring systems found in marketed drugs, which mirrors previously published findings for newly approved drugs. We also show there are approximately 450 000 unique ring systems derived from 2.24 billion molecules currently available in synthesized chemical space, and molecules in clinical trials utilize only 0.1% of this available pool. Moreover, there are fewer ring systems in drugs compared with those in clinical trials, but this is balanced by the drug ring systems being reused more often. Furthermore, systematic changes of up to two atoms on existing drug and clinical trial ring systems give a set of 3902 future clinical trial ring systems, which are predicted to cover approximately 50% of the novel ring systems entering clinical trials.
Affiliation(s)
| | | | | | - Malcolm MacCoss
- Bohicket Pharma Consulting Limited Liability Company, 2556 Seabrook Island Road, Seabrook Island, South Carolina 29455, United States
| | | |
|
37
|
Kojima E, Iimuro A, Nakajima M, Kinuta H, Asada N, Sako Y, Nakata Z, Uemura K, Arita S, Miki S, Wakasa-Morimoto C, Tachibana Y. Pocket-to-Lead: Structure-Based De Novo Design of Novel Non-peptidic HIV-1 Protease Inhibitors Using the Ligand Binding Pocket as a Template. J Med Chem 2022; 65:6157-6170. [PMID: 35416651 DOI: 10.1021/acs.jmedchem.1c02217] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Indexed: 11/28/2022]
Abstract
A novel strategy for lead identification that we have dubbed the "Pocket-to-Lead" strategy is demonstrated using HIV-1 protease as a model target. Hit compounds are sometimes difficult to obtain because complex pharmacophoric features are hard to satisfy. In this study, a virtual fragment hit that does not match all of the pharmacophore features but has key interactions and vectors that could grow into the remaining pharmacophore features was optimized in silico. The designed compound 9 demonstrated weak but evident inhibitory activity (IC50 = 54 μM), and the design concept was proven by the co-crystal structure. Then, structure-based drug design promptly gave compound 14 (IC50 = 0.0071 μM, EC50 = 0.86 μM), an almost 10,000-fold improvement in activity from 9. The structures of the designed molecules proved to be novel with high synthetic feasibility, indicating the usefulness of this strategy for tackling tough targets with complex pharmacophores.
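The fold improvement quoted in this abstract can be checked directly from the two IC50 values it reports. The sketch below is only an arithmetic illustration; the function name `fold_improvement` is hypothetical and not taken from the paper.

```python
def fold_improvement(ic50_before_um: float, ic50_after_um: float) -> float:
    """Ratio of two IC50 values given in the same unit (here micromolar)."""
    return ic50_before_um / ic50_after_um


# IC50 values as reported in the abstract: compound 9 and compound 14.
ratio = fold_improvement(54.0, 0.0071)
print(f"{ratio:.0f}-fold")  # prints "7606-fold"
```

The ratio is roughly 7,600, which is consistent with the abstract's rounded description of an "almost 10,000-fold" gain in potency.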
Affiliation(s)
- Eiichi Kojima
- Shionogi Pharmaceutical Research Center, 3-1-1 Futaba-cho, Toyonaka, Osaka 561-0825, Japan
| | - Atsuhiro Iimuro
- Shionogi Pharmaceutical Research Center, 3-1-1 Futaba-cho, Toyonaka, Osaka 561-0825, Japan
| | - Mado Nakajima
- Shionogi Pharmaceutical Research Center, 3-1-1 Futaba-cho, Toyonaka, Osaka 561-0825, Japan
| | - Hirotaka Kinuta
- Shionogi Pharmaceutical Research Center, 3-1-1 Futaba-cho, Toyonaka, Osaka 561-0825, Japan
| | - Naoya Asada
- Shionogi Pharmaceutical Research Center, 3-1-1 Futaba-cho, Toyonaka, Osaka 561-0825, Japan
| | - Yusuke Sako
- Shionogi Pharmaceutical Research Center, 3-1-1 Futaba-cho, Toyonaka, Osaka 561-0825, Japan
| | - Zenzaburo Nakata
- Shionogi Pharmaceutical Research Center, 3-1-1 Futaba-cho, Toyonaka, Osaka 561-0825, Japan
| | - Kentaro Uemura
- Shionogi Pharmaceutical Research Center, 3-1-1 Futaba-cho, Toyonaka, Osaka 561-0825, Japan
| | - Shuhei Arita
- Shionogi Pharmaceutical Research Center, 3-1-1 Futaba-cho, Toyonaka, Osaka 561-0825, Japan
| | - Shinobu Miki
- Shionogi Pharmaceutical Research Center, 3-1-1 Futaba-cho, Toyonaka, Osaka 561-0825, Japan
| | - Chiaki Wakasa-Morimoto
- Shionogi Pharmaceutical Research Center, 3-1-1 Futaba-cho, Toyonaka, Osaka 561-0825, Japan
| | - Yuki Tachibana
- Shionogi Pharmaceutical Research Center, 3-1-1 Futaba-cho, Toyonaka, Osaka 561-0825, Japan
| |
|
38
|
Wang PH, Chen JH, Tseng YJ. Intelligent pharmaceutical patent search on a near-term gate-based quantum computer. Sci Rep 2022; 12:175. [PMID: 34997034 PMCID: PMC8742058 DOI: 10.1038/s41598-021-04031-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/20/2021] [Accepted: 12/14/2021] [Indexed: 12/03/2022] Open
Abstract
Pharmaceutical patent analysis is the key to product protection for pharmaceutical companies. In patent claims, a Markush structure is a standard chemical structure drawing with variable substituents. Overlaps between apparently dissimilar Markush structures are nearly unrecognizable when the structures span a broad chemical space. We propose a quantum search-based method which performs an exact comparison between two non-enumerated Markush structures with a constraint satisfaction oracle. The quantum circuit is verified with a quantum simulator, and the real effect of noise is estimated using a five-qubit superconductivity-based IBM quantum computer. The probability of measuring the correct states can be increased by improving the connectivity of the most computation-intensive qubits. Depolarizing error is the most influential error. The quantum method for exactly comparing two patents is hard to simulate classically and thus creates a quantum advantage in patent analysis.
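The abstract describes a quantum search driven by a constraint satisfaction oracle but does not name the algorithm. As a rough illustration only, the sketch below assumes a Grover-style amplitude amplification and simulates it classically on a tiny state vector. The function name, the three-qubit register, and the marked state `5` are all hypothetical and are not taken from the paper.

```python
import math


def grover_success_probability(n_qubits: int, marked: set, iterations: int) -> float:
    """Classically simulate Grover-style search with real amplitudes."""
    size = 2 ** n_qubits
    # Start in the uniform superposition over all basis states.
    amps = [1.0 / math.sqrt(size)] * size
    for _ in range(iterations):
        # Oracle: flip the sign of every state that satisfies the constraint.
        amps = [-a if i in marked else a for i, a in enumerate(amps)]
        # Diffusion: reflect every amplitude about the mean amplitude.
        mean = sum(amps) / size
        amps = [2.0 * mean - a for a in amps]
    # Probability of measuring any marked state.
    return sum(amps[i] ** 2 for i in marked)


# Hypothetical toy case: 3 qubits (8 states), one satisfying state, 2 iterations.
prob = grover_success_probability(3, {5}, 2)
print(f"P(success) = {prob:.3f}")
```

With one marked state among eight, two iterations raise the success probability from 1/8 to about 0.945 in this noiseless simulation. The hardware noise effects discussed in the abstract are not modelled here.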
Affiliation(s)
- Pei-Hua Wang
- Graduate Institute of Biomedical Electronics and Bioinformatics, National Taiwan University, No. 1 Sec. 4, Roosevelt Road, Taipei, 106, Taiwan
| | - Jen-Hao Chen
- Department of Computer Science and Information Engineering, National Taiwan University, No. 1 Sec. 4, Roosevelt Road, Taipei, 106, Taiwan
- Chunghwa Telecom Co., Ltd, Taipei, 106, Taiwan
| | - Yufeng Jane Tseng
- Graduate Institute of Biomedical Electronics and Bioinformatics, National Taiwan University, No. 1 Sec. 4, Roosevelt Road, Taipei, 106, Taiwan
- Department of Computer Science and Information Engineering, National Taiwan University, No. 1 Sec. 4, Roosevelt Road, Taipei, 106, Taiwan
| |
|
39
|
Choi J, Lee J. V-Dock: Fast Generation of Novel Drug-like Molecules Using Machine-Learning-Based Docking Score and Molecular Optimization. Int J Mol Sci 2021; 22:11635. [PMID: 34769065 PMCID: PMC8584000 DOI: 10.3390/ijms222111635] [Citation(s) in RCA: 10] [Impact Index Per Article: 3.3] [Received: 09/16/2021] [Revised: 10/13/2021] [Accepted: 10/24/2021] [Indexed: 02/06/2023] Open
Abstract
We propose a computational workflow to design novel drug-like molecules by combining the global optimization of molecular properties and protein-ligand docking with machine learning. However, most existing methods depend heavily on experimental data, and many targets do not have sufficient data to train reliable activity prediction models. To overcome this limitation, protein-ligand docking calculations must be performed using the limited data available. Such docking calculations during molecular generation require considerable computational time, preventing extensive exploration of the chemical space. To address this problem, we trained a machine-learning-based model that predicted the docking energy using SMILES to accelerate the molecular generation process. Docking scores could be accurately predicted using only a SMILES string. We combined this docking score prediction model with the global molecular property optimization approach, MolFinder, to find novel molecules exhibiting the desired properties with high values of predicted docking scores. We named this design approach V-dock. Using V-dock, we efficiently generated many novel molecules with high docking scores for a target protein, a similarity to the reference molecule, and desirable drug-like and bespoke properties, such as QED. The predicted docking scores of the generated molecules were verified by correlating them with the actual docking scores.
Affiliation(s)
- Jieun Choi
- Department of Chemistry, Division of Chemistry and Biochemistry, Kangwon National University, Chuncheon 24341, Korea
| | - Juyong Lee
- Department of Chemistry, Division of Chemistry and Biochemistry, Kangwon National University, Chuncheon 24341, Korea
- Arontier Co., Seoul 06735, Korea
| |
|
40
|
Ohms J. Current methodologies for chemical compound searching in patents: A case study. World Patent Information 2021. [DOI: 10.1016/j.wpi.2021.102055] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Indexed: 10/20/2022]
|
41
|
Congenericity of Claimed Compounds in Patent Applications. Molecules 2021; 26:5253. [PMID: 34500686 PMCID: PMC8433967 DOI: 10.3390/molecules26175253] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 07/23/2021] [Revised: 08/17/2021] [Accepted: 08/18/2021] [Indexed: 12/04/2022] Open
Abstract
A method is presented to analyze quantitatively the degree of congenericity of claimed compounds in patent applications. The approach successfully differentiates patents exemplified with highly congeneric compounds of a structurally compact and well defined chemical series from patents containing a more diverse set of compounds around a more vaguely described patent claim. An application to 750 common patents available in SureChEMBL, SureChEMBLccs and ChEMBL is presented and the congenericity of patent compounds in those different sources discussed.
|
42
|
Grisoni F, Huisman BJH, Button AL, Moret M, Atz K, Merk D, Schneider G. Combining generative artificial intelligence and on-chip synthesis for de novo drug design. Science Advances 2021; 7:eabg3338. [PMID: 34117066 PMCID: PMC8195470 DOI: 10.1126/sciadv.abg3338] [Citation(s) in RCA: 51] [Impact Index Per Article: 17.0] [Received: 12/27/2020] [Accepted: 04/23/2021] [Indexed: 05/24/2023]
Abstract
Automating the molecular design-make-test-analyze cycle accelerates hit and lead finding for drug discovery. Using deep learning for molecular design and a microfluidics platform for on-chip chemical synthesis, liver X receptor (LXR) agonists were generated from scratch. The computational pipeline was tuned to explore the chemical space of known LXRα agonists and generate novel molecular candidates. To ensure compatibility with automated on-chip synthesis, the chemical space was confined to the virtual products obtainable from 17 one-step reactions. Twenty-five de novo designs were successfully synthesized in flow. In vitro screening of the crude reaction products revealed 17 (68%) hits, with up to 60-fold LXR activation. The batch resynthesis, purification, and retesting of 14 of these compounds confirmed that 12 of them were potent LXR agonists. These results support the suitability of the proposed design-make-test-analyze framework as a blueprint for automated drug design with artificial intelligence and miniaturized bench-top synthesis.
Affiliation(s)
- Francesca Grisoni
- ETH Zurich, Department of Chemistry and Applied Biosciences, RETHINK, Zurich, Switzerland.
- Eindhoven University of Technology, Department of Biomedical Engineering, Eindhoven, Netherlands
| | - Berend J H Huisman
- ETH Zurich, Department of Chemistry and Applied Biosciences, RETHINK, Zurich, Switzerland
| | - Alexander L Button
- ETH Zurich, Department of Chemistry and Applied Biosciences, RETHINK, Zurich, Switzerland
- University of Lausanne, Department of Computational Biology, Lausanne, Switzerland
| | - Michael Moret
- ETH Zurich, Department of Chemistry and Applied Biosciences, RETHINK, Zurich, Switzerland
| | - Kenneth Atz
- ETH Zurich, Department of Chemistry and Applied Biosciences, RETHINK, Zurich, Switzerland
| | - Daniel Merk
- ETH Zurich, Department of Chemistry and Applied Biosciences, RETHINK, Zurich, Switzerland.
- Goethe University Frankfurt, Institute of Pharmaceutical Chemistry, Frankfurt, Germany
| | - Gisbert Schneider
- ETH Zurich, Department of Chemistry and Applied Biosciences, RETHINK, Zurich, Switzerland.
- ETH Singapore SEC Ltd, Singapore, Singapore
| |
|
43
|
Active Learning and the Potential of Neural Networks Accelerate Molecular Screening for the Design of a New Molecule Effective against SARS-CoV-2. BioMed Research International 2021; 2021:6696012. [PMID: 34124259 PMCID: PMC8172298 DOI: 10.1155/2021/6696012] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 12/27/2020] [Revised: 05/07/2021] [Accepted: 05/15/2021] [Indexed: 12/04/2022]
Abstract
A global pandemic has emerged following the appearance of a new severe acute respiratory virus, officially named severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), strongly affecting the health sector as well as the world economy. Despite the existence of a few approved and effective vaccines at the time of writing, a sense of urgency has emerged worldwide to discover new technical tools and new drugs as soon as possible. In this context, many studies are currently underway to develop new tools and therapies against SARS-CoV-2 and other viruses, using different approaches. The 3-chymotrypsin-like (3CL) protease, which is directly involved in the cotranslational and posttranslational modifications of viral polyproteins essential for the existence and replication of the virus in the host, is one of the coronavirus target proteins that has been the subject of these extensive studies. Currently, the majority of these studies aim at repurposing known, clinically approved drugs against this new virus, but this approach has had limited success. Recently, different studies have demonstrated the effectiveness of artificial intelligence-based techniques for understanding existing chemical spaces and generating new small molecules that are both effective and efficient. In this framework, we combined a generative recurrent neural network model with transfer learning methods and active learning-based algorithms to design novel small molecules capable of effectively inhibiting the 3CL protease in human cells. We then analyzed these small molecules to find the correct binding site that matches the structure of the 3CL protease of our target virus, alongside the other analyses performed in this study. Based on these screening results, some molecules achieved a good binding score close to -18 kcal/mol and can be considered good potential candidates for further synthesis and testing against SARS-CoV-2.
|
44
|
Falaguera MJ, Mestres J. Identification of the Core Chemical Structure in SureChEMBL Patents. J Chem Inf Model 2021; 61:2241-2247. [PMID: 33929850 DOI: 10.1021/acs.jcim.1c00151] [Citation(s) in RCA: 9] [Impact Index Per Article: 3.0] [Indexed: 11/28/2022]
Abstract
The SureChEMBL database provides open access to 17 million chemical entities mentioned in 14 million patents published since 1970. However, alongside molecules covered by patent claims, the database is full of starting materials and intermediate products of little pharmacological relevance. Herein, we introduce a new filtering protocol to automatically select the core chemical structures best representing a congeneric series of pharmacologically relevant molecules in patents. The protocol is first validated against a selection of 890 SureChEMBL patents for which a total of 51,738 manually curated molecules are deposited in ChEMBL. Our protocol was able to select 92.5% of the molecules in ChEMBL from all 270,968 molecules in SureChEMBL for those patents. Subsequently, the protocol was applied to all 240,988 US pharmacological patents for which 9,111,706 molecules are available in SureChEMBL. The unsupervised filtering process selected 5,949,214 molecules (65.3% of the total number of molecules) that form highly congeneric chemical series in 188,795 of those patents (78.3% of the total number of patents). A SureChEMBL version enriched with molecules of pharmacological relevance is available for download at https://ftp.ebi.ac.uk/pub/databases/chembl/SureChEMBLccs.
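The percentages in this abstract follow from the raw counts it reports. The sketch below recomputes them as a consistency check. The helper `share` and the variable names are hypothetical, and the recovered-molecule count is only what the stated 92.5% implies, not a figure given in the abstract.

```python
def share(selected: int, total: int) -> float:
    """Return selected/total expressed as a percentage."""
    return 100.0 * selected / total


# Counts quoted in the abstract for the US pharmacological patent set.
molecule_share = share(5_949_214, 9_111_706)
patent_share = share(188_795, 240_988)

# Implied number of ChEMBL molecules recovered in the 890-patent validation set.
validation_selected = round(0.925 * 51_738)

print(f"molecules kept: {molecule_share:.1f}%")
print(f"patents with a congeneric series: {patent_share:.1f}%")
print(f"implied validation recall count: {validation_selected}")
```

Both recomputed shares round to the values stated in the abstract (65.3% and 78.3%), and the validation recall corresponds to roughly 47,858 of the 51,738 curated molecules.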
Affiliation(s)
- Maria J Falaguera
- Research Group on Systems Pharmacology, Research Program on Biomedical Informatics (GRIB), IMIM Hospital del Mar Medical Research Institute and University Pompeu Fabra, Parc de Recerca Biomèdica (PRBB), Doctor Aiguader 88, 08003 Barcelona, Catalonia, Spain
| | - Jordi Mestres
- Research Group on Systems Pharmacology, Research Program on Biomedical Informatics (GRIB), IMIM Hospital del Mar Medical Research Institute and University Pompeu Fabra, Parc de Recerca Biomèdica (PRBB), Doctor Aiguader 88, 08003 Barcelona, Catalonia, Spain
| |
|
45
|
Yang ZY, Yang ZJ, Zhao Y, Yin MZ, Lu AP, Chen X, Liu S, Hou TJ, Cao DS. PySmash: Python package and individual executable program for representative substructure generation and application. Brief Bioinform 2021; 22:6168498. [PMID: 33709154 DOI: 10.1093/bib/bbab017] [Citation(s) in RCA: 9] [Impact Index Per Article: 3.0] [Received: 11/07/2020] [Revised: 01/06/2021] [Accepted: 01/12/2021] [Indexed: 01/23/2023] Open
Abstract
BACKGROUND: Substructure screening is widely applied to evaluate the molecular potency and ADMET properties of compounds in drug discovery pipelines, and it can also be used to interpret QSAR models for the design of new compounds with desirable physicochemical and biological properties. With the continuous accumulation of experimental data, data-driven computational systems that can derive representative substructures from large chemical libraries are attracting more attention. Therefore, the development of an integrated and convenient tool to generate and implement representative substructures is urgently needed.
RESULTS: In this study, PySmash, a user-friendly and powerful tool to generate different types of representative substructures, was developed. The current version of PySmash provides both a Python package and an individual executable program, which achieves ease of operation and pipeline integration. Three types of substructure generation algorithms, including circular, path-based and functional group-based algorithms, are provided. Users can conveniently customize their own requirements for substructure size, accuracy and coverage, statistical significance and parallel computation during execution. In addition, PySmash provides a function for screening external data.
CONCLUSION: PySmash, a user-friendly and integrated tool for the automatic generation and implementation of representative substructures, is presented. Three screening examples, including toxicophore derivation, privileged motif detection and the integration of substructures with machine learning (ML) models, are provided to illustrate the utility of PySmash in safety profile evaluation, therapeutic activity exploration and molecular optimization, respectively. Its executable program and Python package are available at https://github.com/kotori-y/pySmash.
Affiliation(s)
- Zi-Yi Yang
- Department of Pharmacy, Xiangya Hospital, Central South University and the Xiangya School of Pharmaceutical Sciences, Central South University, Sichuan, China
| | - Zhi-Jiang Yang
- Xiangya School of Pharmaceutical Sciences, Central South University, Hunan, China
| | - Yue Zhao
- Xiangya School of Pharmaceutical Sciences, Central South University (Changsha), Sichuan, China
| | - Ming-Zhu Yin
- Department of Dermatology, Hunan Engineering Research Center of Skin Health and Disease, Hunan Key Laboratory of Skin Cancer and Psoriasis, Xiangya Hospital, Central South University, Hunan
| | - Ai-Ping Lu
- Institute for Advancing Translational Medicine in Bone and Joint Diseases, School of Chinese Medicine, Hong Kong Baptist University, Hong Kong
| | - Xiang Chen
- Department of Dermatology, Hunan Engineering Research Center of Skin Health and Disease, Hunan Key Laboratory of Skin Cancer and Psoriasis, Xiangya Hospital, Central South University, Hunan
| | - Shao Liu
- Department of Pharmacy, Xiangya Hospital, Central South University, Hunan
| | - Ting-Jun Hou
- College of Pharmaceutical Sciences, Zhejiang University, China
| | - Dong-Sheng Cao
- Xiangya School of Pharmaceutical Sciences, Central South University, China
| |
|
46
|
Bian Y, Xie XQ. Generative chemistry: drug discovery with deep learning generative models. J Mol Model 2021; 27:71. [PMID: 33543405 PMCID: PMC10984615 DOI: 10.1007/s00894-021-04674-8] [Citation(s) in RCA: 45] [Impact Index Per Article: 15.0] [Received: 11/13/2020] [Accepted: 01/13/2021] [Indexed: 12/15/2022]
Abstract
The de novo design of molecular structures using deep learning generative models introduces an encouraging solution to drug discovery in the face of the continuously increasing cost of new drug development. From the generation of original texts, images, and videos to the design of novel molecular structures from scratch, the creativity of deep learning generative models exhibits the height machine intelligence can achieve. The purpose of this paper is to review the latest advances in generative chemistry, which relies on generative modeling to expedite the drug discovery process. This review starts with a brief history of artificial intelligence in drug discovery to outline this emerging paradigm. Commonly used chemical databases, molecular representations, and tools in cheminformatics and machine learning are covered as the infrastructure for generative chemistry. Detailed discussions focus on utilizing cutting-edge generative architectures, including recurrent neural networks, variational autoencoders, adversarial autoencoders, and generative adversarial networks, for compound generation. Challenges and future perspectives follow.
Affiliation(s)
- Yuemin Bian
- Department of Pharmaceutical Sciences and Computational Chemical Genomics Screening Center, School of Pharmacy, University of Pittsburgh, Pittsburgh, PA, 15261, USA
- NIH National Center of Excellence for Computational Drug Abuse Research, University of Pittsburgh, Pittsburgh, PA, 15261, USA
| | - Xiang-Qun Xie
- Department of Pharmaceutical Sciences and Computational Chemical Genomics Screening Center, School of Pharmacy, University of Pittsburgh, Pittsburgh, PA, 15261, USA.
- NIH National Center of Excellence for Computational Drug Abuse Research, University of Pittsburgh, Pittsburgh, PA, 15261, USA.
- Drug Discovery Institute, University of Pittsburgh, 335 Sutherland Drive, 206 Salk Pavilion, Pittsburgh, PA, 15261, USA.
- Departments of Computational Biology and Structural Biology, School of Medicine, University of Pittsburgh, Pittsburgh, PA, 15261, USA.
| |
|
47
|
Awale M, Hert J, Guasch L, Riniker S, Kramer C. The Playbooks of Medicinal Chemistry Design Moves. J Chem Inf Model 2021; 61:729-742. [PMID: 33522806 DOI: 10.1021/acs.jcim.0c01143] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Indexed: 12/18/2022]
Abstract
Large databases of biologically relevant molecules, such as ChEMBL, SureChEMBL, or compound collections of pharmaceutical or agrochemical companies, are invaluable sources of medicinal chemistry information, albeit implicit. We developed a modified matched molecular pair approach to systematically and exhaustively extract the transformations in these databases and distill them into snippets of explicit design knowledge that are easily interpretable and directly applicable. The resulting "playbooks of medicinal chemistry design moves" capture the collective pharmaceutical and agrochemical research expertise across multiple chemists, companies, targets, and projects. They can be queried in an automated fashion for systematic prospective design and compound generation. The ChEMBL playbook and an application to exploit it are available at https://github.com/mahendra-awale/medchem_moves.
Affiliation(s)
- Mahendra Awale
- Computer-Aided Drug Design/Therapeutic Modalities, Roche Innovation Center Basel, F. Hoffmann-La Roche Ltd., Grenzacherstrasse 124, 4070 Basel, Switzerland
| | - Jérôme Hert
- Computer-Aided Drug Design/Therapeutic Modalities, Roche Innovation Center Basel, F. Hoffmann-La Roche Ltd., Grenzacherstrasse 124, 4070 Basel, Switzerland
| | - Laura Guasch
- Computer-Aided Drug Design/Therapeutic Modalities, Roche Innovation Center Basel, F. Hoffmann-La Roche Ltd., Grenzacherstrasse 124, 4070 Basel, Switzerland
| | - Sereina Riniker
- Laboratory of Physical Chemistry, ETH Zürich, Vladimir-Prelog-Weg 2, 8093 Zürich, Switzerland
| | - Christian Kramer
- Computer-Aided Drug Design/Therapeutic Modalities, Roche Innovation Center Basel, F. Hoffmann-La Roche Ltd., Grenzacherstrasse 124, 4070 Basel, Switzerland
| |
|
48
|
Machine learning, artificial intelligence, and data science breaking into drug design and neglected diseases. WIREs Computational Molecular Science 2021. [DOI: 10.1002/wcms.1513] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Indexed: 12/17/2022]
|
49
|
Coley CW, Eyke NS, Jensen KF. Autonomous Discovery in the Chemical Sciences, Part II: Outlook. Angew Chem Int Ed Engl 2020. [DOI: 10.1002/ange.201909989] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Indexed: 12/25/2022]
Affiliation(s)
- Connor W. Coley
- Department of Chemical Engineering Massachusetts Institute of Technology Cambridge MA 02139 USA
| | - Natalie S. Eyke
- Department of Chemical Engineering Massachusetts Institute of Technology Cambridge MA 02139 USA
| | - Klavs F. Jensen
- Department of Chemical Engineering Massachusetts Institute of Technology Cambridge MA 02139 USA
| |
|
50
|
Langevin M, Minoux H, Levesque M, Bianciotto M. Scaffold-Constrained Molecular Generation. J Chem Inf Model 2020; 60:5637-5646. [PMID: 33301333 DOI: 10.1021/acs.jcim.0c01015] [Citation(s) in RCA: 20] [Impact Index Per Article: 5.0] [Indexed: 01/08/2023]
Abstract
One of the major applications of generative models for drug discovery targets the lead-optimization phase. During the optimization of a lead series, it is common to have scaffold constraints imposed on the structure of the molecules designed. Without enforcing such constraints, the probability of generating molecules with the required scaffold is extremely low, which hinders the practicality of generative models for de novo drug design. To tackle this issue, we introduce a new algorithm, named SAMOA (Scaffold Constrained Molecular Generation), to perform scaffold-constrained in silico molecular design. We build on the well-known SMILES-based Recurrent Neural Network (RNN) generative model, with a modified sampling procedure to achieve scaffold-constrained generation. We directly benefit from the associated reinforcement learning methods, which allow the design of molecules optimized for different properties while exploring only the relevant chemical space. We showcase the method's ability to perform scaffold-constrained generation on various tasks: designing novel molecules around scaffolds extracted from SureChEMBL chemical series, generating novel active molecules on the Dopamine Receptor D2 (DRD2) target, and finally, designing predicted actives on the MMP-12 series, an industrial lead-optimization project.
Affiliation(s)
- Maxime Langevin
- PASTEUR, Département de chimie, École Normale Supérieure, PSL University, Sorbonne Université, CNRS, 75005 Paris, France
- Molecular Design Sciences - Integrated Drug Discovery, Sanofi R&D, 94400 Vitry-sur-Seine, France
| | - Hervé Minoux
- Molecular Design Sciences - Integrated Drug Discovery, Sanofi R&D, 94400 Vitry-sur-Seine, France
| | - Maximilien Levesque
- PASTEUR, Département de chimie, École Normale Supérieure, PSL University, Sorbonne Université, CNRS, 75005 Paris, France
- Aqemia, 75001 Paris, France
| | - Marc Bianciotto
- Molecular Design Sciences - Integrated Drug Discovery, Sanofi R&D, 94400 Vitry-sur-Seine, France
| |
|