1
Luginina AP, Khnykin AN, Khorn PA, Moiseeva OV, Safronova NA, Pospelov VA, Dashevskii DE, Belousov AS, Borschevskiy VI, Mishin AV. Rational Design of Drugs Targeting G-Protein-Coupled Receptors: Ligand Search and Screening. BIOCHEMISTRY. BIOKHIMIIA 2024; 89:958-972. [PMID: 38880655 DOI: 10.1134/s0006297924050158] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/10/2024] [Revised: 02/22/2024] [Accepted: 02/23/2024] [Indexed: 06/18/2024]
Abstract
G protein-coupled receptors (GPCRs) are transmembrane proteins that participate in many physiological processes and represent major pharmacological targets. Recent advances in structural biology of GPCRs have enabled the development of drugs based on the receptor structure (structure-based drug design, SBDD). SBDD utilizes information about the receptor-ligand complex to search for suitable compounds, thus expanding the chemical space of possible receptor ligands without the need for experimental screening. This review describes the use of structure-based virtual screening (SBVS) for GPCR ligands and approaches for the functional testing of potential drug compounds, and discusses recent advances and successful examples in the application of SBDD for the identification of GPCR ligands.
Affiliation(s)
- Aleksandra P Luginina
- Research Center for Molecular Mechanisms of Aging and Age-Related Diseases, Moscow Institute of Physics and Technology, Dolgoprudny, Moscow Region, 141701, Russia
- Andrey N Khnykin
- Research Center for Molecular Mechanisms of Aging and Age-Related Diseases, Moscow Institute of Physics and Technology, Dolgoprudny, Moscow Region, 141701, Russia
- Polina A Khorn
- Research Center for Molecular Mechanisms of Aging and Age-Related Diseases, Moscow Institute of Physics and Technology, Dolgoprudny, Moscow Region, 141701, Russia
- Olga V Moiseeva
- Research Center for Molecular Mechanisms of Aging and Age-Related Diseases, Moscow Institute of Physics and Technology, Dolgoprudny, Moscow Region, 141701, Russia
- Skryabin Institute of Biochemistry and Physiology of Microorganisms, Russian Academy of Sciences, Pushchino, Moscow Region, 142290, Russia
- Nadezhda A Safronova
- Research Center for Molecular Mechanisms of Aging and Age-Related Diseases, Moscow Institute of Physics and Technology, Dolgoprudny, Moscow Region, 141701, Russia
- Vladimir A Pospelov
- Research Center for Molecular Mechanisms of Aging and Age-Related Diseases, Moscow Institute of Physics and Technology, Dolgoprudny, Moscow Region, 141701, Russia
- Dmitrii E Dashevskii
- Research Center for Molecular Mechanisms of Aging and Age-Related Diseases, Moscow Institute of Physics and Technology, Dolgoprudny, Moscow Region, 141701, Russia
- Anatolii S Belousov
- Research Center for Molecular Mechanisms of Aging and Age-Related Diseases, Moscow Institute of Physics and Technology, Dolgoprudny, Moscow Region, 141701, Russia
- Valentin I Borschevskiy
- Research Center for Molecular Mechanisms of Aging and Age-Related Diseases, Moscow Institute of Physics and Technology, Dolgoprudny, Moscow Region, 141701, Russia.
- Frank Laboratory of Neutron Physics, Joint Institute for Nuclear Research, Dubna, Moscow Region, 141980, Russia
- Alexey V Mishin
- Research Center for Molecular Mechanisms of Aging and Age-Related Diseases, Moscow Institute of Physics and Technology, Dolgoprudny, Moscow Region, 141701, Russia.
2
Kim H, Lee K, Kim C, Lim J, Kim WY. DFRscore: Deep Learning-Based Scoring of Synthetic Complexity with Drug-Focused Retrosynthetic Analysis for High-Throughput Virtual Screening. J Chem Inf Model 2024; 64:2432-2444. [PMID: 37651152 DOI: 10.1021/acs.jcim.3c01134] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Indexed: 09/01/2023]
Abstract
Recently emerging generative AI models enable us to produce a vast number of compounds for potential applications. While they can provide novel molecular structures, the synthetic feasibility of the generated molecules is often questioned. To address this issue, a few recent studies have attempted to use deep learning models to estimate the synthetic accessibility of many molecules rapidly. However, retrosynthetic analysis tools used to train the models rely on reaction templates automatically extracted from a large reaction database that are not domain-specific and may exhibit low chemical correctness. To overcome this limitation, we introduce DFRscore (Drug-Focused Retrosynthetic score), a deep learning-based approach for a more practical assessment of synthetic accessibility in drug discovery. The DFRscore model is trained exclusively on drug-focused reactions, providing a predicted number of minimally required synthetic steps for each compound. This approach enables practitioners to filter out compounds that do not meet their desired level of synthetic accessibility at an early stage of high-throughput virtual screening for accelerated drug discovery. The proposed strategy can be easily adapted to other domains by adjusting the synthesis planning setup of the reaction templates and starting materials.
Affiliation(s)
- Hyeongwoo Kim
- Department of Chemistry, KAIST, 291 Daehak-ro, Yuseong-gu, Daejeon 34141, Republic of Korea
- Kyunghoon Lee
- Department of Chemistry, KAIST, 291 Daehak-ro, Yuseong-gu, Daejeon 34141, Republic of Korea
- Chansu Kim
- Department of Chemistry, KAIST, 291 Daehak-ro, Yuseong-gu, Daejeon 34141, Republic of Korea
- Jaechang Lim
- HITS Incorporation, 124 Teheran-ro, Gangnam-gu, Seoul 06234, Republic of Korea
- Woo Youn Kim
- Department of Chemistry, KAIST, 291 Daehak-ro, Yuseong-gu, Daejeon 34141, Republic of Korea
- HITS Incorporation, 124 Teheran-ro, Gangnam-gu, Seoul 06234, Republic of Korea
- AI Institute, KAIST, 291 Daehak-ro, Yuseong-gu, Daejeon 34141, Republic of Korea
3
Soleymani S, Gravel N, Huang LC, Yeung W, Bozorgi E, Bendzunas NG, Kochut KJ, Kannan N. Dark kinase annotation, mining, and visualization using the Protein Kinase Ontology. PeerJ 2023; 11:e16087. [PMID: 38077442 PMCID: PMC10704995 DOI: 10.7717/peerj.16087] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 04/05/2023] [Accepted: 08/22/2023] [Indexed: 12/18/2023] Open
Abstract
The Protein Kinase Ontology (ProKinO) is an integrated knowledge graph that conceptualizes the complex relationships among protein kinase sequence, structure, function, and disease in a human- and machine-readable format. In this study, we have significantly expanded ProKinO by incorporating additional data on expression patterns and drug interactions. Furthermore, we have developed a completely new browser from the ground up to render the knowledge graph visible and interactive on the web. We have enriched ProKinO with new classes and relationships that capture information on kinase ligand binding sites, expression patterns, and functional features. These additions extend ProKinO's capabilities as a discovery tool, enabling it to uncover novel insights about understudied members of the protein kinase family. We next demonstrate the application of ProKinO. Specifically, through graph mining and aggregate SPARQL queries, we identify the p21-activated protein kinase 5 (PAK5) as one of the most frequently mutated dark kinases in human cancers with abnormal expression in multiple cancers, including a previously unappreciated role in acute myeloid leukemia. We have identified recurrent oncogenic mutations in the PAK5 activation loop predicted to alter substrate binding and phosphorylation. Additionally, we have identified common ligand/drug binding residues in PAK family kinases, underscoring ProKinO's potential application in drug discovery. The updated ontology browser and the addition of a web component, ProtVista, which enables interactive mining of kinase sequence annotations in 3D structures and AlphaFold models, provide a valuable resource for the signaling community. The updated ProKinO database is accessible at https://prokino.uga.edu.
Affiliation(s)
- Saber Soleymani
- Department of Computer Science, University of Georgia, Athens, GA, United States
- Nathan Gravel
- Institute of Bioinformatics, University of Georgia, Athens, GA, United States
- Liang-Chin Huang
- Institute of Bioinformatics, University of Georgia, Athens, GA, United States
- Wayland Yeung
- Institute of Bioinformatics, University of Georgia, Athens, GA, United States
- Elika Bozorgi
- Department of Computer Science, University of Georgia, Athens, GA, United States
- Nathaniel G. Bendzunas
- Department of Biochemistry and Molecular Biology, University of Georgia, Athens, GA, United States
- Krzysztof J. Kochut
- Department of Computer Science, University of Georgia, Athens, GA, United States
- Natarajan Kannan
- Institute of Bioinformatics, University of Georgia, Athens, GA, United States
- Department of Biochemistry and Molecular Biology, University of Georgia, Athens, GA, United States
4
Wang S, Wang L, Li F, Bai F. DeepSA: a deep-learning driven predictor of compound synthesis accessibility. J Cheminform 2023; 15:103. [PMID: 37919805 PMCID: PMC10621138 DOI: 10.1186/s13321-023-00771-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/05/2023] [Accepted: 10/20/2023] [Indexed: 11/04/2023] Open
Abstract
With the continuous development of artificial intelligence technology, more and more computational models for generating new molecules are being developed. However, we are often confronted with the question of whether these compounds are easy or difficult to synthesize, that is, their synthetic accessibility. In this study, a deep learning-based computational model called DeepSA was proposed to predict the synthetic accessibility of compounds, providing a useful tool for selecting molecules. DeepSA is a chemical language model that was developed by training on a dataset of 3,593,053 molecules using various natural language processing (NLP) algorithms, offering advantages over state-of-the-art methods and having a much higher area under the receiver operating characteristic curve (AUROC), i.e., 89.6%, in discriminating those molecules that are difficult to synthesize. This helps users select less expensive molecules for synthesis, reducing the time and cost required for drug discovery and development. Interestingly, a comparison of DeepSA with a Graph Attention-based method shows that using SMILES alone can also efficiently visualize and extract a compound's informative features. DeepSA is available online on our group's web server ( https://bailab.siais.shanghaitech.edu.cn/services/deepsa/ ), and the code is available at https://github.com/Shihang-Wang-58/DeepSA .
Affiliation(s)
- Shihang Wang
- Shanghai Institute for Advanced Immunochemical Studies and School of Life Science and Technology, ShanghaiTech University, 393 Middle Huaxia Road, Shanghai, 201210, China
- Lin Wang
- Shanghai Institute for Advanced Immunochemical Studies and School of Life Science and Technology, ShanghaiTech University, 393 Middle Huaxia Road, Shanghai, 201210, China
- Fenglei Li
- School of Information Science and Technology, ShanghaiTech University, 393 Middle Huaxia Road, Shanghai, 201210, China
- Fang Bai
- Shanghai Institute for Advanced Immunochemical Studies and School of Life Science and Technology, ShanghaiTech University, 393 Middle Huaxia Road, Shanghai, 201210, China.
- School of Information Science and Technology, ShanghaiTech University, 393 Middle Huaxia Road, Shanghai, 201210, China.
- Shanghai Clinical Research and Trial Center, Shanghai, 201210, China.
5
Abstract
Drug development is a wide scientific field that faces many challenges these days. Among them are extremely high development costs, long development times, and a small number of new drugs that are approved each year. New and innovative technologies are needed to solve these problems that make the drug discovery process of small molecules more time and cost efficient, and that allow previously undruggable receptor classes to be targeted, such as protein-protein interactions. Structure-based virtual screenings (SBVSs) have become a leading contender in this context. In this review, we give an introduction to the foundations of SBVSs and survey their progress in the past few years with a focus on ultralarge virtual screenings (ULVSs). We outline key principles of SBVSs, recent success stories, new screening techniques, available deep learning-based docking methods, and promising future research directions. ULVSs have an enormous potential for the development of new small-molecule drugs and are already starting to transform early-stage drug discovery.
Affiliation(s)
- Christoph Gorgulla
- Harvard Medical School and Physics Department, Harvard University, Boston, Massachusetts, USA
- Department of Cancer Biology, Dana-Farber Cancer Institute, Boston, Massachusetts, USA
- Current affiliation: Department of Structural Biology, St. Jude Children's Research Hospital, Memphis, Tennessee, USA
6
Chen L, Fan Z, Chang J, Yang R, Hou H, Guo H, Zhang Y, Yang T, Zhou C, Sui Q, Chen Z, Zheng C, Hao X, Zhang K, Cui R, Zhang Z, Ma H, Ding Y, Zhang N, Lu X, Luo X, Jiang H, Zhang S, Zheng M. Sequence-based drug design as a concept in computational drug design. Nat Commun 2023; 14:4217. [PMID: 37452028 PMCID: PMC10349078 DOI: 10.1038/s41467-023-39856-w] [Citation(s) in RCA: 15] [Impact Index Per Article: 15.0] [Received: 07/18/2022] [Accepted: 06/27/2023] [Indexed: 07/18/2023] Open
Abstract
Drug development based on target proteins has been a successful approach in recent decades. However, the conventional structure-based drug design (SBDD) pipeline is a complex, human-engineered process with multiple independently optimized steps. Here, we propose a sequence-to-drug concept for computational drug design based on protein sequence information by end-to-end differentiable learning. We validate this concept in three stages. First, we design TransformerCPI2.0 as a core tool for the concept, which demonstrates generalization ability across proteins and compounds. Second, we interpret the binding knowledge that TransformerCPI2.0 learned. Finally, we use TransformerCPI2.0 to discover new hits for challenging drug targets, and identify a new target for an existing drug based on an inverse application of the concept. Overall, this proof-of-concept study shows that the sequence-to-drug concept adds a perspective on drug design. It can serve as an alternative method to SBDD, particularly for proteins that do not yet have high-quality 3D structures available.
Affiliation(s)
- Lifan Chen
- Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, 555 Zuchongzhi Road, Shanghai, 201203, China
- University of Chinese Academy of Sciences, No. 19A Yuquan Road, Beijing, 100049, China
- Zisheng Fan
- Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, 555 Zuchongzhi Road, Shanghai, 201203, China
- School of Chinese Materia Medica, Nanjing University of Chinese Medicine, 138 Xianlin Road, Jiangsu, Nanjing, 210023, China
- Shanghai Institute for Advanced Immunochemical Studies and School of Life Science and Technology, ShanghaiTech University, No. 393 Huaxia Middle Road, Shanghai, 200031, China
- Jie Chang
- Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, 555 Zuchongzhi Road, Shanghai, 201203, China
- School of Chinese Materia Medica, Nanjing University of Chinese Medicine, 138 Xianlin Road, Jiangsu, Nanjing, 210023, China
- Ruirui Yang
- Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, 555 Zuchongzhi Road, Shanghai, 201203, China
- University of Chinese Academy of Sciences, No. 19A Yuquan Road, Beijing, 100049, China
- Shanghai Institute for Advanced Immunochemical Studies and School of Life Science and Technology, ShanghaiTech University, No. 393 Huaxia Middle Road, Shanghai, 200031, China
- Hui Hou
- Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, 555 Zuchongzhi Road, Shanghai, 201203, China
- Hao Guo
- Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, 555 Zuchongzhi Road, Shanghai, 201203, China
- Yinghui Zhang
- Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, 555 Zuchongzhi Road, Shanghai, 201203, China
- University of Chinese Academy of Sciences, No. 19A Yuquan Road, Beijing, 100049, China
- Tianbiao Yang
- Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, 555 Zuchongzhi Road, Shanghai, 201203, China
- University of Chinese Academy of Sciences, No. 19A Yuquan Road, Beijing, 100049, China
- Chenmao Zhou
- Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, 555 Zuchongzhi Road, Shanghai, 201203, China
- School of Chinese Materia Medica, Nanjing University of Chinese Medicine, 138 Xianlin Road, Jiangsu, Nanjing, 210023, China
- Qibang Sui
- Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, 555 Zuchongzhi Road, Shanghai, 201203, China
- University of Chinese Academy of Sciences, No. 19A Yuquan Road, Beijing, 100049, China
- Zhengyang Chen
- Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, 555 Zuchongzhi Road, Shanghai, 201203, China
- University of Chinese Academy of Sciences, No. 19A Yuquan Road, Beijing, 100049, China
- Chen Zheng
- Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, 555 Zuchongzhi Road, Shanghai, 201203, China
- Xinyue Hao
- Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, 555 Zuchongzhi Road, Shanghai, 201203, China
- School of Chinese Materia Medica, Nanjing University of Chinese Medicine, 138 Xianlin Road, Jiangsu, Nanjing, 210023, China
- Keke Zhang
- Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, 555 Zuchongzhi Road, Shanghai, 201203, China
- School of Chinese Materia Medica, Nanjing University of Chinese Medicine, 138 Xianlin Road, Jiangsu, Nanjing, 210023, China
- Rongrong Cui
- Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, 555 Zuchongzhi Road, Shanghai, 201203, China
- Zehong Zhang
- Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, 555 Zuchongzhi Road, Shanghai, 201203, China
- University of Chinese Academy of Sciences, No. 19A Yuquan Road, Beijing, 100049, China
- Hudson Ma
- Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, 555 Zuchongzhi Road, Shanghai, 201203, China
- Yiluan Ding
- Department of Analytical Chemistry, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, 555 Zuchongzhi Road, Shanghai, 201203, China
- Naixia Zhang
- Department of Analytical Chemistry, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, 555 Zuchongzhi Road, Shanghai, 201203, China
- Xiaojie Lu
- Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, 555 Zuchongzhi Road, Shanghai, 201203, China
- University of Chinese Academy of Sciences, No. 19A Yuquan Road, Beijing, 100049, China
- Xiaomin Luo
- Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, 555 Zuchongzhi Road, Shanghai, 201203, China
- University of Chinese Academy of Sciences, No. 19A Yuquan Road, Beijing, 100049, China
- Hualiang Jiang
- Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, 555 Zuchongzhi Road, Shanghai, 201203, China
- University of Chinese Academy of Sciences, No. 19A Yuquan Road, Beijing, 100049, China
- School of Chinese Materia Medica, Nanjing University of Chinese Medicine, 138 Xianlin Road, Jiangsu, Nanjing, 210023, China
- Shanghai Institute for Advanced Immunochemical Studies and School of Life Science and Technology, ShanghaiTech University, No. 393 Huaxia Middle Road, Shanghai, 200031, China
- School of Pharmaceutical Science and Technology, Hangzhou Institute for Advanced Study, University of Chinese Academy of Sciences, 1 Sub-lane Xiangshan, Hangzhou, 310024, China
- Sulin Zhang
- Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, 555 Zuchongzhi Road, Shanghai, 201203, China.
- University of Chinese Academy of Sciences, No. 19A Yuquan Road, Beijing, 100049, China.
- Mingyue Zheng
- Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, 555 Zuchongzhi Road, Shanghai, 201203, China.
- University of Chinese Academy of Sciences, No. 19A Yuquan Road, Beijing, 100049, China.
- School of Chinese Materia Medica, Nanjing University of Chinese Medicine, 138 Xianlin Road, Jiangsu, Nanjing, 210023, China.
- Shanghai Institute for Advanced Immunochemical Studies and School of Life Science and Technology, ShanghaiTech University, No. 393 Huaxia Middle Road, Shanghai, 200031, China.
- School of Pharmaceutical Science and Technology, Hangzhou Institute for Advanced Study, University of Chinese Academy of Sciences, 1 Sub-lane Xiangshan, Hangzhou, 310024, China.
7
Bhattacharjee A, Sarma S, Sen T, Devi MV, Deka B, Singh AK. Genome mining to identify valuable secondary metabolites and their regulation in Actinobacteria from different niches. Arch Microbiol 2023; 205:127. [PMID: 36944761 DOI: 10.1007/s00203-023-03482-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/07/2022] [Revised: 02/20/2023] [Accepted: 03/11/2023] [Indexed: 03/23/2023]
Abstract
Actinobacteria are the largest bacterial group, with 18 significant lineages, and are ubiquitously distributed across all possible terrains. They are known to produce more than 10,000 medically relevant compounds. Despite their ability to make critical secondary metabolites and the availability of genome sequences, these two have not been linked with certainty. With this intent, our study aims at understanding the biosynthetic capacity in terms of secondary metabolite production in 528 Actinobacteria species from five different habitats, viz., soil, water, plants, animals, and humans. In our analysis of 9,646 clusters of 59 different classes, we have documented 64,000 secondary metabolites (SMs), of which more than 74% were of unique type, while 19% were partially conserved and 7% were conserved compounds. In the case of conserved compounds, we found the highest distribution in soil, 79.12%. We found alternate sources of antibiotics, such as viomycin, vancomycin, teicoplanin, fosfomycin, ficellomycin and patulin, and antitumour compounds, such as doxorubicin and tacrolimus, in the soil. Our study also reported alternate sources for the toxin cyanobactin in water and plant isolates. We further analysed the clusters to determine their regulatory pathways and reported the prominent presence of the two-component system of the TetR/AcrR family, as well as other partial domains like the CitB superfamily and HTH superfamily, and discussed their role in secondary metabolite production. This information will be helpful in exploring Actinobacteria from other environments and in discovering new chemical moieties of clinical significance.
Affiliation(s)
- Abhilash Bhattacharjee
- Biotechnology Group, Biological Sciences and Technology Division, CSIR-North East Institute of Science and Technology, Jorhat, 785006, Assam, India
- Academy of Scientific and Innovative Research (AcSIR), Ghaziabad, 220002, India
- Department of Botany, Dibrugarh Hanumanbax Surajmall Kanoi College, Dibrugarh, 786001, Assam, India
- Sangita Sarma
- Biotechnology Group, Biological Sciences and Technology Division, CSIR-North East Institute of Science and Technology, Jorhat, 785006, Assam, India
- Academy of Scientific and Innovative Research (AcSIR), Ghaziabad, 220002, India
- Tejosmita Sen
- Biotechnology Group, Biological Sciences and Technology Division, CSIR-North East Institute of Science and Technology, Jorhat, 785006, Assam, India
- Academy of Scientific and Innovative Research (AcSIR), Ghaziabad, 220002, India
- Moirangthem Veigyabati Devi
- Biotechnology Group, Biological Sciences and Technology Division, CSIR-North East Institute of Science and Technology, Jorhat, 785006, Assam, India
- Academy of Scientific and Innovative Research (AcSIR), Ghaziabad, 220002, India
- Banani Deka
- Biotechnology Group, Biological Sciences and Technology Division, CSIR-North East Institute of Science and Technology, Jorhat, 785006, Assam, India
- Academy of Scientific and Innovative Research (AcSIR), Ghaziabad, 220002, India
- Anil Kumar Singh
- Biotechnology Group, Biological Sciences and Technology Division, CSIR-North East Institute of Science and Technology, Jorhat, 785006, Assam, India.
- Academy of Scientific and Innovative Research (AcSIR), Ghaziabad, 220002, India.
8
Identification of Diagnostic Genes and Effective Drugs Associated with Osteoporosis Treatment by Single-Cell RNA-Seq Analysis and Network Pharmacology. Mediators Inflamm 2022; 2022:6830635. [PMID: 36199280 PMCID: PMC9527401 DOI: 10.1155/2022/6830635] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 07/21/2022] [Revised: 08/25/2022] [Accepted: 09/01/2022] [Indexed: 11/17/2022] Open
Abstract
Background: Osteoporosis is a common bone metabolic disease with increased bone fragility and fracture rate. Effective diagnosis and treatment of osteoporosis still need to be explored due to the increasing incidence of the disease. Methods: Single-cell RNA-seq data were acquired from the GSE147287 dataset. Osteoporosis-related genes were obtained from ChEMBL. Cell subpopulations were identified and characterized by scRNA-seq, t-SNE, clusterProfiler, and other computational methods. The “limma” R package was used to identify all differentially expressed genes. A diagnosis model was built using the rms R package. Key drugs were determined by protein-protein interaction analysis and molecular docking. Results: Firstly, 15,577 cells were obtained, and 12 cell subpopulations were identified by clustering, among which 6 cell subpopulations belonged to CD45+ BM-MSCs and the other subpopulations were CD45- BM-MSCs. CD45- BM-MSCs_6 and CD45+ BM-MSCs_5 were considered key subpopulations. Furthermore, we found that 7 genes were correlated with the above two subpopulations, and the F9 gene had the highest AUC. Finally, five compounds were identified, among which DB03742 bound well to the F9 protein. Conclusions: This work discovered that 7 genes were correlated with the CD45- BM-MSCs_6 and CD45+ BM-MSCs_5 subpopulations in osteoporosis, among which the F9 gene had better research value. Moreover, compound DB03742 was a potential inhibitor of the F9 protein.
9
Gorgulla C, Jayaraj A, Fackeldey K, Arthanari H. Emerging frontiers in virtual drug discovery: From quantum mechanical methods to deep learning approaches. Curr Opin Chem Biol 2022; 69:102156. [PMID: 35576813 PMCID: PMC9990419 DOI: 10.1016/j.cbpa.2022.102156] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 11/05/2021] [Revised: 03/16/2022] [Accepted: 04/07/2022] [Indexed: 11/19/2022]
Abstract
Virtual screening-based approaches to discover initial hit and lead compounds have the potential to reduce both the cost and time of early drug discovery stages, as well as to find inhibitors for even challenging target sites such as protein-protein interfaces. In this review, we provide an overview of the progress that has been made in virtual screening methodology and technology on multiple fronts in recent years. The advent of ultra-large virtual screens, in which hundreds of millions to billions of compounds are screened, has proven to be a powerful approach to discover highly potent hit compounds. However, these developments are just the tip of the iceberg, with new technologies and methods emerging to propel the field forward. Examples include novel machine-learning approaches, which can reduce the computational costs of virtual screening dramatically, while progress in quantum-mechanical approaches can increase the accuracy of predictions of various small molecule properties.
Affiliation(s)
- Christoph Gorgulla
- Department of Biological Chemistry and Molecular Pharmacology, Blavatnik Institute, Harvard Medical School (HMS), Boston, MA, USA; Department of Physics, Faculty of Arts and Sciences, Harvard University, Cambridge, MA, USA; Department of Cancer Biology, Dana-Farber Cancer Institute (DFCI), Boston, MA, USA
- Konstantin Fackeldey
- Institute of Mathematics, Technical University Berlin, Berlin, Germany; Zuse Institute Berlin, Berlin, Germany
- Haribabu Arthanari
- Department of Biological Chemistry and Molecular Pharmacology, Blavatnik Institute, Harvard Medical School (HMS), Boston, MA, USA; Department of Cancer Biology, Dana-Farber Cancer Institute (DFCI), Boston, MA, USA.
10
Yu J, Wang J, Zhao H, Gao J, Kang Y, Cao D, Wang Z, Hou T. Organic Compound Synthetic Accessibility Prediction Based on the Graph Attention Mechanism. J Chem Inf Model 2022; 62:2973-2986. [PMID: 35675668 DOI: 10.1021/acs.jcim.2c00038] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/29/2022]
Abstract
Accurate estimation of the synthetic accessibility of small molecules is needed in many phases of drug discovery. Several expert-crafted scoring methods and descriptor-based quantitative structure-activity relationship (QSAR) models have been developed for synthetic accessibility assessment, but their practical applications in drug discovery are still quite limited because of relatively low prediction accuracy and poor model interpretability. In this study, we proposed a data-driven interpretable prediction framework called GASA (Graph Attention-based assessment of Synthetic Accessibility) to evaluate the synthetic accessibility of small molecules by distinguishing compounds to be easy- (ES) or hard-to-synthesize (HS). GASA is a graph neural network (GNN) architecture that makes self-feature deduction by applying an attention mechanism to automatically capture the most important structural features related to synthetic accessibility. The sampling around the hypothetical classification boundary was used to improve the ability of GASA to distinguish structurally similar molecules. GASA was extensively evaluated and compared with two descriptor-based machine learning methods (random forest, RF; eXtreme gradient boosting, XGBoost) and four existing scores (SYBA: SYnthetic Bayesian Accessibility; SCScore: Synthetic Complexity score; RAscore: Retrosynthetic Accessibility score; SAscore: Synthetic Accessibility score). Our analysis demonstrates that GASA achieved remarkable performance in distinguishing similar molecules compared with other methods and had a broader applicability domain. In addition, we show how GASA learns the important features that affect molecular synthetic accessibility by assigning attention weights to different atoms. An online prediction service for GASA was offered at http://cadd.zju.edu.cn/gasa/.
Affiliation(s)
- Jiahui Yu
  - Innovation Institute for Artificial Intelligence in Medicine of Zhejiang University, College of Pharmaceutical Sciences, Zhejiang University, Hangzhou 310058, Zhejiang, P. R. China
- Jike Wang
  - Innovation Institute for Artificial Intelligence in Medicine of Zhejiang University, College of Pharmaceutical Sciences, Zhejiang University, Hangzhou 310058, Zhejiang, P. R. China
  - School of Computer Science, Wuhan University, Wuhan 430072, Hubei, P. R. China
- Hong Zhao
  - Innovation Institute for Artificial Intelligence in Medicine of Zhejiang University, College of Pharmaceutical Sciences, Zhejiang University, Hangzhou 310058, Zhejiang, P. R. China
- Junbo Gao
  - Innovation Institute for Artificial Intelligence in Medicine of Zhejiang University, College of Pharmaceutical Sciences, Zhejiang University, Hangzhou 310058, Zhejiang, P. R. China
- Yu Kang
  - Innovation Institute for Artificial Intelligence in Medicine of Zhejiang University, College of Pharmaceutical Sciences, Zhejiang University, Hangzhou 310058, Zhejiang, P. R. China
- Dongsheng Cao
  - Xiangya School of Pharmaceutical Sciences, Central South University, Changsha 410004, Hunan, P. R. China
- Zhe Wang
  - Innovation Institute for Artificial Intelligence in Medicine of Zhejiang University, College of Pharmaceutical Sciences, Zhejiang University, Hangzhou 310058, Zhejiang, P. R. China
  - State Key Lab of CAD&CG, Zhejiang University, Hangzhou 310058, Zhejiang, P. R. China
- Tingjun Hou
  - Innovation Institute for Artificial Intelligence in Medicine of Zhejiang University, College of Pharmaceutical Sciences, Zhejiang University, Hangzhou 310058, Zhejiang, P. R. China
  - State Key Lab of CAD&CG, Zhejiang University, Hangzhou 310058, Zhejiang, P. R. China

11
Ahmed F, Lee JW, Samantasinghar A, Kim YS, Kim KH, Kang IS, Memon FH, Lim JH, Choi KH. SperoPredictor: An Integrated Machine Learning and Molecular Docking-Based Drug Repurposing Framework With Use Case of COVID-19. Front Public Health 2022; 10:902123. [PMID: 35784208 PMCID: PMC9244710 DOI: 10.3389/fpubh.2022.902123] [Citation(s) in RCA: 21] [Impact Index Per Article: 10.5] [Received: 03/22/2022] [Accepted: 05/02/2022] [Indexed: 12/13/2022] Open
Abstract
The global spread of the SARS coronavirus 2 (SARS-CoV-2), its manifestation in human hosts as a contagious disease, and its variants have induced a pandemic resulting in the deaths of over 6,000,000 people. Extensive efforts have been devoted to drug research to cure and refrain the spread of COVID-19, but only one drug has received FDA approval yet. Traditional drug discovery is inefficient, costly, and unable to react to pandemic threats. Drug repurposing represents an effective strategy for drug discovery and reduces the time and cost compared to de novo drug discovery. In this study, a generic drug repurposing framework (SperoPredictor) has been developed which systematically integrates the various types of drugs and disease data and takes the advantage of machine learning (Random Forest, Tree Ensemble, and Gradient Boosted Trees) to repurpose potential drug candidates against any disease of interest. Drug and disease data for FDA-approved drugs (n = 2,865), containing four drug features and three disease features, were collected from chemical and biological databases and integrated with the form of drug-disease association tables. The resulting dataset was split into 70% for training, 15% for testing, and the remaining 15% for validation. The testing and validation accuracies of the models were 99.3% for Random Forest and 99.03% for Tree Ensemble. In practice, SperoPredictor identified 25 potential drug candidates against 6 human host-target proteomes identified from a systematic review of journals. Literature-based validation indicated 12 of 25 predicted drugs (48%) have been already used for COVID-19 followed by molecular docking and re-docking which indicated 4 of 13 drugs (30%) as potential candidates against COVID-19 to be pre-clinically and clinically validated. 
Finally, SperoPredictor results illustrated the ability of the platform to be rapidly deployed to repurpose the drugs as a rapid response to emergent situations (like COVID-19 and other pandemics).
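The 70% / 15% / 15% partition mentioned in the abstract can be sketched as follows. This is a minimal illustration assuming a simple seeded random shuffle; the abstract does not say whether the authors shuffled, stratified, or grouped records, and the function name is hypothetical.

```python
import random

def split_70_15_15(records, seed=0):
    """Shuffle records and split them 70% / 15% / 15% (train / test / validation)."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)     # fixed seed keeps the split reproducible
    n = len(shuffled)
    n_train = n * 70 // 100                   # integer arithmetic avoids float rounding
    n_test = n * 15 // 100
    train = shuffled[:n_train]
    test = shuffled[n_train:n_train + n_test]
    validation = shuffled[n_train + n_test:]  # remainder, roughly 15%
    return train, test, validation

# 2,865 stands in for the number of FDA-approved drugs quoted in the abstract.
train, test, validation = split_70_15_15(range(2865))
print(len(train), len(test), len(validation))  # 2005 429 431
```

Giving the remainder to the last partition keeps every record in exactly one set even when the total is not divisible by 100.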
Affiliation(s)
- Faheem Ahmed
  - Department of Mechatronics Engineering, Jeju National University, Jeju, South Korea
- Jae Wook Lee
  - Department of Mechatronics Engineering, Jeju National University, Jeju, South Korea
  - BioSpero, Inc., Jeju, South Korea
- Kyung Hwan Kim
  - Department of Mechatronics Engineering, Jeju National University, Jeju, South Korea
- In Suk Kang
  - Department of Mechatronics Engineering, Jeju National University, Jeju, South Korea
- Fida Hussain Memon
  - Department of Mechatronics Engineering, Jeju National University, Jeju, South Korea
- Jong Hwan Lim
  - Department of Mechatronics Engineering, Jeju National University, Jeju, South Korea
- Kyung Hyun Choi
  - Department of Mechatronics Engineering, Jeju National University, Jeju, South Korea
  - BioSpero, Inc., Jeju, South Korea

12
Warr WA, Nicklaus MC, Nicolaou CA, Rarey M. Exploration of Ultralarge Compound Collections for Drug Discovery. J Chem Inf Model 2022; 62:2021-2034. [PMID: 35421301 DOI: 10.1021/acs.jcim.2c00224] [Citation(s) in RCA: 46] [Impact Index Per Article: 23.0] [Indexed: 02/07/2023]
Abstract
Designing new medicines more cheaply and quickly is tightly linked to the quest of exploring chemical space more widely and efficiently. Chemical space is monumentally large, but recent advances in computer software and hardware have enabled researchers to navigate virtual chemical spaces containing billions of chemical structures. This review specifically concerns collections of many millions or even billions of enumerated chemical structures as well as even larger chemical spaces that are not fully enumerated. We present examples of chemical libraries and spaces and the means used to construct them, and we discuss new technologies for searching huge libraries and for searching combinatorially in chemical space. We also cover space navigation techniques and consider new approaches to de novo drug design and the impact of the "autonomous laboratory" on synthesis of designed compounds. Finally, we summarize some other challenges and opportunities for the future.
Affiliation(s)
- Wendy A Warr
  - Wendy Warr & Associates, 6 Berwick Court, Holmes Chapel, Crewe, Cheshire CW4 7HZ, United Kingdom
- Marc C Nicklaus
  - NCI, NIH, CADD Group, NCI-Frederick, Frederick, Maryland 21702, United States
- Christos A Nicolaou
  - Discovery Chemistry, Lilly Research Laboratories, Eli Lilly and Company, Indianapolis, Indiana 46285, United States
- Matthias Rarey
  - Universität Hamburg, ZBH Center for Bioinformatics, 20146 Hamburg, Germany

13
Su A, Cheng Y, Xue H, She Y, Rajan K. Artificial intelligence informed toxicity screening of amine chemistries used in the synthesis of hybrid organic–inorganic perovskites. AIChE J 2022. [DOI: 10.1002/aic.17699] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/12/2022]
Affiliation(s)
- An Su
  - College of Chemical Engineering Zhejiang University of Technology Hangzhou China
  - Department of Materials Design and Innovation University at Buffalo Buffalo New York USA
- Yingying Cheng
  - College of Chemical Engineering Zhejiang University of Technology Hangzhou China
- Haotian Xue
  - Collaborative Innovation Center of Yangtze River Delta Region Green Pharmaceuticals Zhejiang University of Technology Hangzhou China
- Yuanbin She
  - College of Chemical Engineering Zhejiang University of Technology Hangzhou China
- Krishna Rajan
  - Department of Materials Design and Innovation University at Buffalo Buffalo New York USA

14
Staszak M, Staszak K, Wieszczycka K, Bajek A, Roszkowski K, Tylkowski B. Machine learning in drug design: Use of artificial intelligence to explore the chemical structure–biological activity relationship. WIREs Computational Molecular Science 2022. [DOI: 10.1002/wcms.1568] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Indexed: 12/13/2022]
Affiliation(s)
- Maciej Staszak
  - Institute of Technology and Chemical Engineering Poznan University of Technology Poznan Poland
- Katarzyna Staszak
  - Institute of Technology and Chemical Engineering Poznan University of Technology Poznan Poland
- Karolina Wieszczycka
  - Institute of Technology and Chemical Engineering Poznan University of Technology Poznan Poland
- Anna Bajek
  - Department of Tissue Engineering Collegium Medicum, Nicolaus Copernicus University Bydgoszcz Poland
- Krzysztof Roszkowski
  - Department of Oncology Collegium Medicum Nicolaus Copernicus University Bydgoszcz Poland
- Bartosz Tylkowski
  - Department of Chemical Engineering University Rovira i Virgili Tarragona Spain
  - Eurecat, Centre Tecnològic de Catalunya Chemical Technologies Unit Tarragona Spain

15
Genheden S, Engkvist O, Bjerrum E. Fast prediction of distances between synthetic routes with deep learning. Machine Learning: Science and Technology 2022. [DOI: 10.1088/2632-2153/ac4a91] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Indexed: 11/11/2022] Open
Abstract
We expand the recent work on clustering of synthetic routes and train a deep learning model to predict the distances between arbitrary routes. The model is based on a long short-term memory representation of a synthetic route and is trained as a twin network to reproduce the tree edit distance (TED) between two routes. The machine learning approach is approximately two orders of magnitude faster than the TED approach and enables clustering many more routes from a retrosynthesis route prediction. The clusters have a high degree of similarity to the clusters given by the TED-based approach and are accordingly intuitive and explainable. We provide the developed model as open-source.

16
Wang XR, Cao TT, Jia CM, Tian XM, Wang Y. Quantitative prediction model for affinity of drug-target interactions based on molecular vibrations and overall system of ligand-receptor. BMC Bioinformatics 2021; 22:497. [PMID: 34649499 PMCID: PMC8515642 DOI: 10.1186/s12859-021-04389-w] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 05/15/2021] [Accepted: 09/20/2021] [Indexed: 12/27/2022] Open
Abstract
Background: The study of drug–target interaction (DTI) affinity plays an important role in safety assessment and pharmacology. Currently, quantitative structure–activity relationship (QSAR) and molecular docking (MD) are the most common methods in research on DTI affinity. However, they are often built for a specific target or several targets, and most QSAR and MD methods are based either on the structure of drug molecules or on the structure of receptors, with low accuracy and a small scope of application. How to construct quantitative prediction models with high accuracy and wide applicability remains a challenge. To this end, this paper screened molecular descriptors based on molecular vibrations and took the molecule-target pair as a whole system to construct prediction models with high accuracy and wide applicability based on the dissociation constant (Kd) and the concentration for 50% of maximal effect (EC50), and to provide a reference for quantifying the affinity of DTIs.
Results: After comprehensive comparison, the results showed that RF models are the optimal models to analyze and predict DTI affinity, with coefficients of determination (R²) all greater than 0.94. Compared to the quantitative models reported in the literature, the RF models developed in this paper have higher accuracy and wider applicability. In addition, E-state molecular descriptors associated with molecular vibrations and the normalized Moreau-Broto autocorrelation (G3), Moran autocorrelation (G4), and transition-distribution (G7) protein descriptors are of higher importance in the quantification of DTIs.
Conclusion: By screening molecular descriptors based on molecular vibrations and taking the molecule-target pair as a whole system, we obtained optimal RF-based models that are more accurate and more widely applicable, which indicates that selecting molecular descriptors associated with molecular vibrations and treating the molecule-target pair as a whole system are reliable methods for improving model performance. It can provide a reference for quantifying the affinity of DTIs.
Supplementary Information: The online version contains supplementary material available at 10.1186/s12859-021-04389-w.
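For readers unfamiliar with the metric quoted above, the coefficient of determination can be computed as in the sketch below. It uses the standard definition, 1 minus the ratio of the residual sum of squares to the total sum of squares; the observed and predicted values are made up for illustration and are not taken from the paper.

```python
def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))   # unexplained variation
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)           # total variation
    if ss_tot == 0:
        raise ValueError("R^2 is undefined when all observed values are identical")
    return 1.0 - ss_res / ss_tot

# Hypothetical observed vs. predicted affinities, for illustration only.
observed = [5.0, 6.0, 7.0, 8.0]
predicted = [5.5, 6.0, 7.0, 7.5]
print(r_squared(observed, predicted))  # 0.9
```

A value near 1 means the predictions account for almost all of the variation in the observed affinities; it says nothing by itself about performance on targets outside the data used to compute it.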
Affiliation(s)
- Xian-Rui Wang
  - Key Laboratory of TCM-Information Engineer of State Administration of TCM, School of Chinese Pharmacy, Beijing University of Chinese Medicine, Beijing, 100102, China
- Ting-Ting Cao
  - Key Laboratory of TCM-Information Engineer of State Administration of TCM, School of Chinese Pharmacy, Beijing University of Chinese Medicine, Beijing, 100102, China
- Cong Min Jia
  - Key Laboratory of TCM-Information Engineer of State Administration of TCM, School of Chinese Pharmacy, Beijing University of Chinese Medicine, Beijing, 100102, China
- Xue-Mei Tian
  - Key Laboratory of TCM-Information Engineer of State Administration of TCM, School of Chinese Pharmacy, Beijing University of Chinese Medicine, Beijing, 100102, China
- Yun Wang
  - Key Laboratory of TCM-Information Engineer of State Administration of TCM, School of Chinese Pharmacy, Beijing University of Chinese Medicine, Beijing, 100102, China

17
Leguy J, Glavatskikh M, Cauchy T, Da Mota B. Scalable estimator of the diversity for de novo molecular generation resulting in a more robust QM dataset (OD9) and a more efficient molecular optimization. J Cheminform 2021; 13:76. [PMID: 34600576 PMCID: PMC8487551 DOI: 10.1186/s13321-021-00554-8] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 06/18/2021] [Accepted: 09/15/2021] [Indexed: 01/21/2023] Open
Abstract
Chemical diversity is one of the key terms when dealing with machine learning and molecular generation. This is particularly true for quantum chemical datasets, which should be composed meticulously because the calculations are highly time-demanding. Previously we have seen that the best-known quantum chemical dataset, QM9, lacks chemical diversity. As a consequence, ML models trained on QM9 showed generalizability shortcomings. In this paper we would like to present (i) a fast and generic method to evaluate chemical diversity, (ii) a new quantum chemical dataset of 435k molecules, OD9, that includes QM9 and new molecules generated with a diversity objective, (iii) an analysis of the diversity impact on unconstrained and goal-directed molecular generation on the example of QED optimization. Our innovative approach makes it possible to individually estimate the impact of a solution on the diversity of a set, allowing for effective incremental evaluation. In the first application, we will see how the diversity constraint allows us to generate more than a million molecules that would efficiently complete the reference datasets. The compounds were calculated with DFT thanks to a collaborative effort through the QuChemPedIA@home BOINC project. With regard to goal-directed molecular generation, getting a high QED score is not complicated, but adding a little diversity can cut the number of calls to the evaluation function by a factor of ten.
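One simple way to obtain this kind of incremental evaluation is to count the distinct features a set covers, so that the contribution of a candidate is the number of features it would add. The sketch below illustrates that idea only; it is a coverage-based proxy chosen for clarity and is not claimed to be the authors' estimator. The class name and the feature labels are hypothetical.

```python
from collections import Counter

class FeatureDiversity:
    """Diversity of a set measured as the number of distinct features it covers."""

    def __init__(self):
        self.counts = Counter()   # feature -> number of set members carrying it

    def gain(self, features):
        """Number of features this candidate has that the set does not cover yet."""
        return sum(1 for f in set(features) if self.counts[f] == 0)

    def add(self, features):
        """Add a member to the set, updating coverage incrementally."""
        self.counts.update(set(features))

    def score(self):
        """Current diversity: how many distinct features are covered."""
        return len(self.counts)

div = FeatureDiversity()
div.add({"C-C", "C-O"})            # hypothetical feature labels for one molecule
print(div.gain({"C-O", "C-N"}))    # 1: only "C-N" is new
div.add({"C-O", "C-N"})
print(div.score())                 # 3
```

Because `gain` looks only at the candidate's own features, evaluating a new molecule costs time proportional to its size rather than to the size of the whole set.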
Affiliation(s)
- Jules Leguy
  - Univ Angers, LERIA, SFR MATHSTIC, 49000, Angers, France
- Marta Glavatskikh
  - Univ Angers, LERIA, SFR MATHSTIC, 49000, Angers, France
  - Univ Angers, CNRS, MOLTECH-ANJOU, SFR MATRIX, 49000, Angers, France
- Thomas Cauchy
  - Univ Angers, CNRS, MOLTECH-ANJOU, SFR MATRIX, 49000, Angers, France
- Benoit Da Mota
  - Univ Angers, LERIA, SFR MATHSTIC, 49000, Angers, France

18
Yu P, Sterling AJ, Hein J. A Novel Automated Screening Method for Combinatorially Generated Small Molecules. J Chem Inf Model 2021; 61:1637-1646. [PMID: 33844913 DOI: 10.1021/acs.jcim.0c01462] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/28/2022]
Abstract
A main challenge in the enumeration of small-molecule chemical spaces for drug design is to quickly and accurately differentiate between possible and impossible molecules. Current approaches for screening enumerated molecules (e.g., 2D heuristics and 3D force fields) have not been able to achieve a balance between accuracy and speed. We have developed a new automated approach for fast and high-quality screening of small molecules, with the following steps: (1) for each molecule in the set, an ensemble of 2D descriptors as feature encoding is computed; (2) on a random small subset, classification (feasible/infeasible) targets via a 3D-based approach are generated; (3) a classification dataset with the computed features and targets is formed and a machine learning model for predicting the 3D approach's decisions is trained; and (4) the trained model is used to screen the remainder of the enumerated set. Our approach is ≈8× (7.96× to 8.84×) faster than screening via 3D simulations without significantly sacrificing accuracy; while compared to 2D-based pruning rules, this approach is more accurate, with better coverage of known feasible molecules. Once the topological features and 3D conformer evaluation methods are established, the process can be fully automated, without any additional chemistry expertise.
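The four steps listed in the abstract can be sketched as a small surrogate-screening loop. Everything here is an illustrative assumption: the function and parameter names are hypothetical, a nearest-centroid rule stands in for whatever machine learning model is used, and integers with a threshold stand in for molecules and the 3D-based feasibility check.

```python
import random

def surrogate_screen(items, cheap_features, expensive_check, sample_size, seed=0):
    """Label a random subset with the expensive check, then predict the rest.

    Returns (decisions, labelled): decisions[i] is True for "feasible", and
    labelled is the set of indices that went through the expensive check.
    """
    if not 1 <= sample_size <= len(items):
        raise ValueError("sample_size must be between 1 and len(items)")
    rng = random.Random(seed)
    indices = range(len(items))
    # Step 1: cheap feature vector for every item.
    feats = [cheap_features(x) for x in items]
    # Step 2: expensive labels on a small random subset only.
    labelled = set(rng.sample(indices, sample_size))
    labels = {i: bool(expensive_check(items[i])) for i in labelled}
    # Step 3: fit a nearest-centroid classifier on the labelled subset.
    centroids = {}
    for cls in (True, False):
        rows = [feats[i] for i in labelled if labels[i] == cls]
        if rows:  # a class absent from the sample simply gets no centroid
            centroids[cls] = [sum(col) / len(rows) for col in zip(*rows)]

    def predict(vec):
        def dist(cls):
            return sum((a - b) ** 2 for a, b in zip(vec, centroids[cls]))
        return min(centroids, key=dist)

    # Step 4: screen the remainder with the fitted model.
    decisions = [labels[i] if i in labelled else predict(feats[i]) for i in indices]
    return decisions, labelled

# Toy stand-ins: "molecules" are integers and the expensive check is a threshold.
molecules = list(range(100))
decisions, labelled = surrogate_screen(
    molecules,
    cheap_features=lambda m: (float(m),),
    expensive_check=lambda m: m < 50,
    sample_size=20,
)
print(sum(decisions), "of", len(decisions), "predicted feasible;",
      len(labelled), "expensive calls")
```

The saving comes from step 2: the expensive check runs on `sample_size` items instead of all of them, at the price of whatever errors the cheap model makes on the rest.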
Affiliation(s)
- Pingshi Yu
  - Department of Statistics, University of Oxford, 29 St Giles', Oxford OX1 2JD, U.K.
  - Department of Computer Science, University of Oxford, 15 Parks Road, Oxford OX1 3QD, U.K.
- Alistair J Sterling
  - Department of Chemistry, University of Oxford, Mansfield Road, Oxford OX1 3TA, U.K.
- Jotun Hein
  - Department of Statistics, University of Oxford, 29 St Giles', Oxford OX1 2JD, U.K.

19
Thakkar A, Chadimová V, Bjerrum EJ, Engkvist O, Reymond JL. Retrosynthetic accessibility score (RAscore) - rapid machine learned synthesizability classification from AI driven retrosynthetic planning. Chem Sci 2021; 12:3339-3349. [PMID: 34164104 DOI: 10.26434/chemrxiv.13019993.v1] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Indexed: 05/24/2023] Open
Abstract
Computer aided synthesis planning (CASP) is part of a suite of artificial intelligence (AI) based tools that are able to propose synthesis routes to a wide range of compounds. However, at present they are too slow to be used to screen the synthetic feasibility of millions of generated or enumerated compounds before identification of potential bioactivity by virtual screening (VS) workflows. Herein we report a machine learning (ML) based method capable of classifying whether a synthetic route can be identified for a particular compound or not by the CASP tool AiZynthFinder. The resulting ML models return a retrosynthetic accessibility score (RAscore) of any molecule of interest, and computes at least 4500 times faster than retrosynthetic analysis performed by the underlying CASP tool. The RAscore should be useful for pre-screening millions of virtual molecules from enumerated databases or generative models for synthetic accessibility and produce higher quality databases for virtual screening of biological activity.
Affiliation(s)
- Amol Thakkar
  - Hit Discovery, Discovery Sciences, R&D, AstraZeneca Gothenburg 431 50 Sweden
  - Department of Chemistry and Biochemistry, University of Bern Bern CH-3012 Switzerland
- Veronika Chadimová
  - Hit Discovery, Discovery Sciences, R&D, AstraZeneca Gothenburg 431 50 Sweden
- Ola Engkvist
  - Hit Discovery, Discovery Sciences, R&D, AstraZeneca Gothenburg 431 50 Sweden
- Jean-Louis Reymond
  - Department of Chemistry and Biochemistry, University of Bern Bern CH-3012 Switzerland

20
Thakkar A, Chadimová V, Bjerrum EJ, Engkvist O, Reymond JL. Retrosynthetic accessibility score (RAscore) - rapid machine learned synthesizability classification from AI driven retrosynthetic planning. Chem Sci 2021; 12:3339-3349. [PMID: 34164104 PMCID: PMC8179384 DOI: 10.1039/d0sc05401a] [Citation(s) in RCA: 47] [Impact Index Per Article: 15.7] [Indexed: 12/17/2022] Open
Abstract
Computer aided synthesis planning (CASP) is part of a suite of artificial intelligence (AI) based tools that are able to propose synthesis routes to a wide range of compounds. However, at present they are too slow to be used to screen the synthetic feasibility of millions of generated or enumerated compounds before identification of potential bioactivity by virtual screening (VS) workflows. Herein we report a machine learning (ML) based method capable of classifying whether a synthetic route can be identified for a particular compound or not by the CASP tool AiZynthFinder. The resulting ML models return a retrosynthetic accessibility score (RAscore) of any molecule of interest, and computes at least 4500 times faster than retrosynthetic analysis performed by the underlying CASP tool. The RAscore should be useful for pre-screening millions of virtual molecules from enumerated databases or generative models for synthetic accessibility and produce higher quality databases for virtual screening of biological activity. The retrosynthetic accessibility score (RAscore) is based on AI driven retrosynthetic planning, and is useful for rapid scoring of synthetic feasibility and pre-screening of large datasets of virtual/generated molecules.
Affiliation(s)
- Amol Thakkar
  - Hit Discovery, Discovery Sciences, R&D, AstraZeneca Gothenburg 431 50 Sweden
  - Department of Chemistry and Biochemistry, University of Bern Bern CH-3012 Switzerland
- Veronika Chadimová
  - Hit Discovery, Discovery Sciences, R&D, AstraZeneca Gothenburg 431 50 Sweden
- Ola Engkvist
  - Hit Discovery, Discovery Sciences, R&D, AstraZeneca Gothenburg 431 50 Sweden
- Jean-Louis Reymond
  - Department of Chemistry and Biochemistry, University of Bern Bern CH-3012 Switzerland

21
Yang T, Li Z, Chen Y, Feng D, Wang G, Fu Z, Ding X, Tan X, Zhao J, Luo X, Chen K, Jiang H, Zheng M. DrugSpaceX: a large screenable and synthetically tractable database extending drug space. Nucleic Acids Res 2021; 49:D1170-D1178. [PMID: 33104791 PMCID: PMC7778939 DOI: 10.1093/nar/gkaa920] [Citation(s) in RCA: 14] [Impact Index Per Article: 4.7] [Received: 07/24/2020] [Revised: 09/11/2020] [Accepted: 10/05/2020] [Indexed: 02/07/2023] Open
Abstract
One of the most prominent topics in drug discovery is efficient exploration of the vast drug-like chemical space to find synthesizable and novel chemical structures with desired biological properties. To address this challenge, we created the DrugSpaceX (https://drugspacex.simm.ac.cn/) database based on expert-defined transformations of approved drug molecules. The current version of DrugSpaceX contains >100 million transformed chemical products for virtual screening, with outstanding characteristics in terms of structural novelty, diversity and large three-dimensional chemical space coverage. To illustrate its practical application in drug discovery, we used a case study of discoidin domain receptor 1 (DDR1), a kinase target implicated in fibrosis and other diseases, to show DrugSpaceX performing a quick search of initial hit compounds. Additionally, for ligand identification and optimization purposes, DrugSpaceX also provides several subsets for download, including a 10% diversity subset, an extended drug-like subset, a drug-like subset, a lead-like subset, and a fragment-like subset. In addition to chemical properties and transformation instructions, DrugSpaceX can locate the position of transformation, which will enable medicinal chemists to easily integrate strategy planning and protection design.
Affiliation(s)
- Tianbiao Yang
  - Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, 555 Zuchongzhi Road, Shanghai 201203, China
  - Department of Pharmacy, University of Chinese Academy of Sciences, No.19A Yuquan Road, Beijing 100049, China
  - School of Pharmaceutical Science and Technology, Hangzhou Institute for Advanced Study, UCAS, Hangzhou 310024, China
- Zhaojun Li
  - School of Information Management, Dezhou University, No. 566 University Rd. West, Dezhou 253023, Shandong, China
- Yingjia Chen
  - Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, 555 Zuchongzhi Road, Shanghai 201203, China
  - Department of Pharmacy, University of Chinese Academy of Sciences, No.19A Yuquan Road, Beijing 100049, China
- Dan Feng
  - Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, 555 Zuchongzhi Road, Shanghai 201203, China
  - Department of Chemistry, College of Sciences, Shanghai University, Shanghai, China
- Guangchao Wang
  - School of Information Management, Dezhou University, No. 566 University Rd. West, Dezhou 253023, Shandong, China
- Zunyun Fu
  - Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, 555 Zuchongzhi Road, Shanghai 201203, China
  - Nanjing University of Chinese Medicine, 138 Xianlin Road, Jiangsu, Nanjing 210023, China
- Xiaoyu Ding
  - Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, 555 Zuchongzhi Road, Shanghai 201203, China
  - Department of Pharmacy, University of Chinese Academy of Sciences, No.19A Yuquan Road, Beijing 100049, China
- Xiaoqin Tan
  - Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, 555 Zuchongzhi Road, Shanghai 201203, China
  - Department of Pharmacy, University of Chinese Academy of Sciences, No.19A Yuquan Road, Beijing 100049, China
- Jihui Zhao
  - Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, 555 Zuchongzhi Road, Shanghai 201203, China
  - Department of Pharmacy, University of Chinese Academy of Sciences, No.19A Yuquan Road, Beijing 100049, China
- Xiaomin Luo
  - Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, 555 Zuchongzhi Road, Shanghai 201203, China
  - Department of Pharmacy, University of Chinese Academy of Sciences, No.19A Yuquan Road, Beijing 100049, China
- Kaixian Chen
  - Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, 555 Zuchongzhi Road, Shanghai 201203, China
  - Department of Pharmacy, University of Chinese Academy of Sciences, No.19A Yuquan Road, Beijing 100049, China
- Hualiang Jiang
  - Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, 555 Zuchongzhi Road, Shanghai 201203, China
  - Department of Pharmacy, University of Chinese Academy of Sciences, No.19A Yuquan Road, Beijing 100049, China
  - School of Pharmaceutical Science and Technology, Hangzhou Institute for Advanced Study, UCAS, Hangzhou 310024, China
  - School of Life Science and Technology, ShanghaiTech University, 393 Huaxiazhong Road, Shanghai 200031, China
- Mingyue Zheng
  - Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, 555 Zuchongzhi Road, Shanghai 201203, China
  - Department of Pharmacy, University of Chinese Academy of Sciences, No.19A Yuquan Road, Beijing 100049, China

22
Leguy J, Cauchy T, Glavatskikh M, Duval B, Da Mota B. EvoMol: a flexible and interpretable evolutionary algorithm for unbiased de novo molecular generation. J Cheminform 2020; 12:55. [PMID: 33431049 PMCID: PMC7494000 DOI: 10.1186/s13321-020-00458-z] [Citation(s) in RCA: 16] [Impact Index Per Article: 4.0] [Received: 06/19/2020] [Accepted: 08/31/2020] [Indexed: 11/24/2022] Open
Abstract
The objective of this work is to design a molecular generator capable of exploring known as well as unfamiliar areas of the chemical space. Our method must be flexible to adapt to very different problems. Therefore, it has to be able to work with or without the influence of prior data and knowledge. Moreover, regardless of the success, it should be as interpretable as possible to allow for diagnosis and improvement. We propose here a new open source generation method using an evolutionary algorithm to sequentially build molecular graphs. It is independent of starting data and can generate totally unseen compounds. To be able to search a large part of the chemical space, we define an original set of 7 generic mutations close to the atomic level. Our method achieves excellent performances and even records on the QED, penalised logP, SAscore, CLscore as well as the set of goal-directed functions defined in GuacaMol. To demonstrate its flexibility, we tackle a very different objective issued from the organic molecular materials domain. We show that EvoMol can generate sets of optimised molecules having high energy HOMO or low energy LUMO, starting only from methane. We can also set constraints on a synthesizability score and structural features. Finally, the interpretability of EvoMol allows for the visualisation of its exploration process as a chemically relevant tree.
Affiliation(s)
- Jules Leguy
  - Laboratoire LERIA, UNIV Angers, SFR MathSTIC, 2 Bd Lavoisier, 49045, Angers, France
- Thomas Cauchy
  - Laboratoire MOLTECH-Anjou, UMR CNRS 6200, UNIV Angers, SFR MATRIX, 2 Bd Lavoisier, 49045, Angers, France
- Marta Glavatskikh
  - Laboratoire LERIA, UNIV Angers, SFR MathSTIC, 2 Bd Lavoisier, 49045, Angers, France
  - Laboratoire MOLTECH-Anjou, UMR CNRS 6200, UNIV Angers, SFR MATRIX, 2 Bd Lavoisier, 49045, Angers, France
- Béatrice Duval
  - Laboratoire LERIA, UNIV Angers, SFR MathSTIC, 2 Bd Lavoisier, 49045, Angers, France
- Benoit Da Mota
  - Laboratoire LERIA, UNIV Angers, SFR MathSTIC, 2 Bd Lavoisier, 49045, Angers, France

23
Poirier M, Pujol-Giménez J, Manatschal C, Bühlmann S, Embaby A, Javor S, Hediger MA, Reymond JL. Pyrazolyl-pyrimidones inhibit the function of human solute carrier protein SLC11A2 (hDMT1) by metal chelation. RSC Med Chem 2020; 11:1023-1031. [PMID: 33479694 PMCID: PMC7649969 DOI: 10.1039/d0md00085j] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 03/11/2020] [Accepted: 05/06/2020] [Indexed: 12/22/2022] Open
Abstract
Solute carrier proteins (SLCs) control fluxes of ions and molecules across biological membranes and represent an emerging class of drug targets. SLC11A2 (hDMT1) mediates intestinal iron uptake, and its inhibition might be used to treat iron overload diseases such as hereditary hemochromatosis. Here we report a micromolar (IC50 = 1.1 μM) pyrazolyl-pyrimidone inhibitor of radiolabeled iron uptake in hDMT1-overexpressing HEK293 cells. The inhibitor acts by a non-competitive mechanism but does not affect the electrophysiological properties of the transporter. Isothermal titration calorimetry, competition with calcein, induced precipitation of radioactive iron, and cross-inhibition of the unrelated iron transporter SLC39A8 (hZIP8) indicate that inhibition is mediated by metal chelation. Mapping the chemical space of thousands of pyrazolo-pyrimidones and similar 2,2'-diazabiaryls in ChEMBL suggests that their reported activities might partly reflect metal chelation. Such metal-chelating groups are not listed among pan-assay interference compounds (PAINS) but should be checked when addressing SLCs.
Affiliation(s)
- Marion Poirier
- Department of Chemistry and Biochemistry, University of Bern, Freiestrasse 3, 3012 Bern, Switzerland
| | - Jonai Pujol-Giménez
- Institute of Biochemistry and Molecular Medicine, University of Bern, Bühlstrasse 28, 3012 Bern, Switzerland
- Membrane Transport Discovery Lab, Department of Nephrology and Hypertension, Inselspital, University of Bern Kinderklinik, Freiburgstrasse 15, 3010 Bern, Switzerland
- Department of Biomedical Research, University of Bern, Murtenstrasse 35, 3008 Bern, Switzerland
| | - Cristina Manatschal
- Department of Biochemistry, University of Zürich, Winterthurerstrasse 190, Zürich, Switzerland
| | - Sven Bühlmann
- Department of Chemistry and Biochemistry, University of Bern, Freiestrasse 3, 3012 Bern, Switzerland
| | - Ahmed Embaby
- Department of Chemistry and Biochemistry, University of Bern, Freiestrasse 3, 3012 Bern, Switzerland
| | - Sacha Javor
- Department of Chemistry and Biochemistry, University of Bern, Freiestrasse 3, 3012 Bern, Switzerland
| | - Matthias A Hediger
- Institute of Biochemistry and Molecular Medicine, University of Bern, Bühlstrasse 28, 3012 Bern, Switzerland
- Membrane Transport Discovery Lab, Department of Nephrology and Hypertension, Inselspital, University of Bern Kinderklinik, Freiburgstrasse 15, 3010 Bern, Switzerland
- Department of Biomedical Research, University of Bern, Murtenstrasse 35, 3008 Bern, Switzerland
| | - Jean-Louis Reymond
- Department of Chemistry and Biochemistry, University of Bern, Freiestrasse 3, 3012 Bern, Switzerland
| |
|