351
Zhang Z, Guan J, Zhou S. FraGAT: a fragment-oriented multi-scale graph attention model for molecular property prediction. Bioinformatics 2021; 37:2981-2987. [PMID: 33769437 PMCID: PMC8479684 DOI: 10.1093/bioinformatics/btab195] [Citation(s) in RCA: 27] [Impact Index Per Article: 6.8] [Received: 10/06/2020] [Revised: 02/05/2021] [Accepted: 03/24/2021] [Indexed: 02/02/2023]
Abstract
MOTIVATION: Molecular property prediction has been a hot topic in recent years. Existing graph-based models ignore the hierarchical structures of molecules. According to chemical and pharmaceutical knowledge, the functional groups of a molecule are closely related to its physicochemical properties and binding affinities. It should therefore be helpful to represent molecular graphs by fragments that contain functional groups for molecular property prediction. RESULTS: In this article, to boost the performance of molecular property prediction, we first propose a definition of molecular graph fragments that may be or contain functional groups, which are relevant to molecular properties, and then develop a fragment-oriented multi-scale graph attention network for molecular property prediction, called FraGAT. Experiments on several widely used benchmarks are conducted to evaluate FraGAT. Experimental results show that FraGAT achieves state-of-the-art predictive performance in most cases. Furthermore, our case studies show that when the fragments used to represent the molecule graphs contain functional groups, the model can make better predictions. This conforms to our expectation and demonstrates the interpretability of the proposed model. AVAILABILITY AND IMPLEMENTATION: The code and data underlying this work are available on GitHub at https://github.com/ZiqiaoZhang/FraGAT. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Ziqiao Zhang
- Shanghai Key Lab of Intelligent Information Processing, and School of Computer Science, Fudan University, Shanghai 200433, China
- Jihong Guan
- Department of Computer Science and Technology, Tongji University, Shanghai 201804, China
- Shuigeng Zhou
- Shanghai Key Lab of Intelligent Information Processing, and School of Computer Science, Fudan University, Shanghai 200433, China
352
Sato A, Miyao T, Funatsu K. Prediction of Reaction Yield for Buchwald-Hartwig Cross-coupling Reactions Using Deep Learning. Mol Inform 2021; 41:e2100156. [PMID: 34585854 DOI: 10.1002/minf.202100156] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.8] [Received: 07/09/2021] [Accepted: 09/12/2021] [Indexed: 11/09/2022]
Abstract
Chemical reaction yield is one of the most important factors for determining reaction conditions. Recently, several machine learning-based prediction models using high-throughput experiment (HTE) data sets were reported for the prediction of reaction yield. However, none of them were at a practical level in terms of predictive ability. In this study, we propose a message passing neural network (MPNN) model for chemical yield prediction, focusing on the Buchwald-Hartwig cross-coupling HTE data set. As the initial atom embedding in the MPNN model, we propose to use Mol2Vec feature vectors pre-trained on a large compound database. The predictive ability of the proposed model was higher than that of five previously reported models on three out of five data sets. Moreover, visualization of important atoms based on the self-attention mechanism favored Mol2Vec as an atom embedding over other embeddings, including previously employed simple representations.
Affiliation(s)
- Akinori Sato
- Graduate School of Science and Technology, Nara Institute of Science and Technology, 8916-5 Takayama-cho, Ikoma, Nara, 630-0192, Japan
- Tomoyuki Miyao
- Graduate School of Science and Technology, Nara Institute of Science and Technology, 8916-5 Takayama-cho, Ikoma, Nara, 630-0192, Japan
- Data Science Center, Nara Institute of Science and Technology, 8916-5 Takayama-cho, Ikoma, Nara, 630-0192, Japan
- Kimito Funatsu
- Data Science Center, Nara Institute of Science and Technology, 8916-5 Takayama-cho, Ikoma, Nara, 630-0192, Japan
353
Deep learning identifies synergistic drug combinations for treating COVID-19. Proc Natl Acad Sci U S A 2021; 118:2105070118. [PMID: 34526388 PMCID: PMC8488647 DOI: 10.1073/pnas.2105070118] [Citation(s) in RCA: 81] [Impact Index Per Article: 20.3] [Accepted: 08/21/2021] [Indexed: 11/18/2022]
Abstract
Effective treatments for COVID-19 are urgently needed. However, discovering single-agent therapies with activity against severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) has been challenging. Combination therapies play an important role in antiviral therapies, due to their improved efficacy and reduced toxicity. Recent approaches have applied deep learning to identify synergistic drug combinations for diseases with vast preexisting datasets, but these are not applicable to new diseases with limited combination data, such as COVID-19. Given that drug synergy often occurs through inhibition of discrete biological targets, here we propose a neural network architecture that jointly learns drug-target interaction and drug-drug synergy. The model consists of two parts: a drug-target interaction module and a target-disease association module. This design enables the model to utilize drug-target interaction data and single-agent antiviral activity data, in addition to available drug-drug combination datasets, which may be small in nature. By incorporating additional biological information, our model performs significantly better in synergy prediction accuracy than previous methods with limited drug combination training data. We empirically validated our model predictions and discovered two drug combinations, remdesivir and reserpine as well as remdesivir and IQ-1S, which display strong antiviral SARS-CoV-2 synergy in vitro. Our approach, which was applied here to address the urgent threat of COVID-19, can be readily extended to other diseases for which a dearth of chemical-chemical combination data exists.
354
Wang J, Liu X, Shen S, Deng L, Liu H. DeepDDS: deep graph neural network with attention mechanism to predict synergistic drug combinations. Brief Bioinform 2021; 23:6375262. [PMID: 34571537 DOI: 10.1093/bib/bbab390] [Citation(s) in RCA: 45] [Impact Index Per Article: 11.3] [Received: 07/07/2021] [Revised: 08/14/2021] [Accepted: 08/28/2021] [Indexed: 11/13/2022]
Abstract
MOTIVATION: Drug combination therapy has become an increasingly promising method in the treatment of cancer. However, the number of possible drug combinations is so large that it is hard to screen synergistic drug combinations through wet-lab experiments. Therefore, computational screening has become an important way to prioritize drug combinations. Graph neural networks have recently shown remarkable performance in the prediction of compound-protein interactions, but they have not been applied to the screening of drug combinations. RESULTS: In this paper, we propose a deep learning model based on a graph neural network and an attention mechanism to identify drug combinations that can effectively inhibit the viability of specific cancer cells. The feature embeddings of drug molecular structures and gene expression profiles are taken as input to a multilayer feedforward neural network to identify synergistic drug combinations. We compared DeepDDS (Deep Learning for Drug-Drug Synergy prediction) with classical machine learning methods and other deep learning-based methods on a benchmark data set, and the leave-one-out experimental results showed that DeepDDS achieved better performance than competitive methods. Also, on an independent test set released by the well-known pharmaceutical enterprise AstraZeneca, DeepDDS was superior to competitive methods by more than 16% in predictive precision. Furthermore, we explored the interpretability of the graph attention network and found that the correlation matrix of atomic features revealed important chemical substructures of drugs. We believe that DeepDDS is an effective tool for prioritizing synergistic drug combinations for further wet-lab experimental validation. AVAILABILITY AND IMPLEMENTATION: Source code and data are available at https://github.com/Sinwang404/DeepDDS/tree/master.
Affiliation(s)
- Jinxian Wang
- Hunan Agricultural University in 2019, and at present is studying for a Master's degree at Central South University, China
- Xuejun Liu
- School of Computer Science and Technology, Nanjing Tech University, Nanjing, China
- Siyuan Shen
- School of Software, Xinjiang University, Urumqi, China
- Lei Deng
- School of Computer Science and Engineering, Central South University, Changsha, China
- Hui Liu
- School of Computer Science and Technology, Nanjing Tech University, Nanjing, China
355
An X, Chen X, Yi D, Li H, Guan Y. Representation of molecules for drug response prediction. Brief Bioinform 2021; 23:6375515. [PMID: 34571534 DOI: 10.1093/bib/bbab393] [Citation(s) in RCA: 17] [Impact Index Per Article: 4.3] [Received: 04/08/2021] [Revised: 08/28/2021] [Accepted: 08/30/2021] [Indexed: 12/18/2022]
Abstract
The rapid development of machine learning and deep learning algorithms in the recent decade has spurred an outburst of their applications in many research fields. In the chemistry domain, machine learning has been widely used to aid in drug screening, drug toxicity prediction, quantitative structure-activity relationship prediction, anti-cancer synergy score prediction, etc. This review is dedicated to the application of machine learning in drug response prediction. Specifically, we focus on molecular representations, which are crucial to the success of drug response prediction and other chemistry-related prediction tasks. We introduce three types of commonly used molecular representation methods, together with their implementation and application examples. This review will serve as a brief introduction to the broad field of molecular representations.
Affiliation(s)
- Xin An
- Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI, USA
- Xi Chen
- Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI, USA
- Daiyao Yi
- Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI, USA
- Hongyang Li
- Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI, USA
- Yuanfang Guan
- Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI, USA
356
Eibeck A, Nurkowski D, Menon A, Bai J, Wu J, Zhou L, Mosbach S, Akroyd J, Kraft M. Predicting Power Conversion Efficiency of Organic Photovoltaics: Models and Data Analysis. ACS Omega 2021; 6:23764-23775. [PMID: 34568656 PMCID: PMC8459373 DOI: 10.1021/acsomega.1c02156] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 04/23/2021] [Accepted: 07/16/2021] [Indexed: 06/13/2023]
Abstract
In this paper, the ability of three selected neural machine learning models and three baseline models to predict the power conversion efficiency (PCE) of organic photovoltaics (OPVs) from molecular structure information is assessed. The bidirectional long short-term memory (gFSI/BiLSTM), attentive fingerprints (attentive FP), and simple graph neural networks (simple GNN) as well as baseline support vector regression (SVR), random forests (RF), and high-dimensional model representation (HDMR) methods are trained on both the large computational Harvard Clean Energy Project Database (CEPDB) and the much smaller experimental Harvard organic photovoltaic 15 dataset (HOPV15). It was found that the neural-based models generally performed better on the computational dataset, with the attentive FP model reaching state-of-the-art performance with a test set mean squared error of 0.071. The experimental dataset proved much harder to fit, with all of the models exhibiting rather poor performance. In contrast to the computational dataset, the baseline models were found to perform better than the neural models on the experimental dataset. To improve the ability of machine learning models to predict PCEs for OPVs, either better computational results that correlate well with experiments or more experimental data at well-controlled conditions are likely required.
Affiliation(s)
- Andreas Eibeck
- CARES, Cambridge Centre for Advanced Research and Education in Singapore, 1 Create Way, CREATE Tower, #05-05, 138602 Singapore
- Daniel Nurkowski
- CMCL Innovations, Sheraton House, Castle Park, Cambridge CB3 0AX, U.K.
- Angiras Menon
- Department of Chemical Engineering and Biotechnology, University of Cambridge, Philippa Fawcett Drive, Cambridge CB3 0AS, U.K.
- Jiaru Bai
- Department of Chemical Engineering and Biotechnology, University of Cambridge, Philippa Fawcett Drive, Cambridge CB3 0AS, U.K.
- Jinkui Wu
- School of Chemical Engineering, Sichuan University, Chengdu, Sichuan 610065, China
- Li Zhou
- School of Chemical Engineering, Sichuan University, Chengdu, Sichuan 610065, China
- Sebastian Mosbach
- CMCL Innovations, Sheraton House, Castle Park, Cambridge CB3 0AX, U.K.
- Department of Chemical Engineering and Biotechnology, University of Cambridge, Philippa Fawcett Drive, Cambridge CB3 0AS, U.K.
- Jethro Akroyd
- CMCL Innovations, Sheraton House, Castle Park, Cambridge CB3 0AX, U.K.
- Department of Chemical Engineering and Biotechnology, University of Cambridge, Philippa Fawcett Drive, Cambridge CB3 0AS, U.K.
- Markus Kraft
- CARES, Cambridge Centre for Advanced Research and Education in Singapore, 1 Create Way, CREATE Tower, #05-05, 138602 Singapore
- Department of Chemical Engineering and Biotechnology, University of Cambridge, Philippa Fawcett Drive, Cambridge CB3 0AS, U.K.
- School of Chemical and Biomedical Engineering, Nanyang Technological University, 62 Nanyang Drive, 637459 Singapore
357
Xiong Z, Jeon M, Allaway RJ, Kang J, Park D, Lee J, Jeon H, Ko M, Jiang H, Zheng M, Tan AC, Guo X, Dang KK, Tropsha A, Hecht C, Das TK, Carlson HA, Abagyan R, Guinney J, Schlessinger A, Cagan R. Crowdsourced identification of multi-target kinase inhibitors for RET- and TAU- based disease: The Multi-Targeting Drug DREAM Challenge. PLoS Comput Biol 2021; 17:e1009302. [PMID: 34520464 PMCID: PMC8483411 DOI: 10.1371/journal.pcbi.1009302] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Received: 02/26/2021] [Revised: 09/30/2021] [Accepted: 07/23/2021] [Indexed: 01/22/2023]
Abstract
A continuing challenge in modern medicine is the identification of safer and more efficacious drugs. Precision therapeutics, which have one molecular target, have been long promised to be safer and more effective than traditional therapies. This approach has proven to be challenging for multiple reasons including lack of efficacy, rapidly acquired drug resistance, and narrow patient eligibility criteria. An alternative approach is the development of drugs that address the overall disease network by targeting multiple biological targets ('polypharmacology'). Rational development of these molecules will require improved methods for predicting single chemical structures that target multiple drug targets. To address this need, we developed the Multi-Targeting Drug DREAM Challenge, in which we challenged participants to predict single chemical entities that target pro-targets but avoid anti-targets for two unrelated diseases: RET-based tumors and a common form of inherited Tauopathy. Here, we report the results of this DREAM Challenge and the development of two neural network-based machine learning approaches that were applied to the challenge of rational polypharmacology. Together, these platforms provide a potentially useful first step towards developing lead therapeutic compounds that address disease complexity through rational polypharmacology.
Affiliation(s)
- Zhaoping Xiong
- Shanghai Institute for Advanced Immunochemical Studies, ShanghaiTech University, Shanghai, China
- Minji Jeon
- Department of Computer Science and Engineering, Korea University, Seoul, Republic of Korea
- Jaewoo Kang
- Department of Computer Science and Engineering, Korea University, Seoul, Republic of Korea
- Interdisciplinary Graduate Program in Bioinformatics, Korea University, Seoul, Republic of Korea
- Donghyeon Park
- Department of Computer Science and Engineering, Korea University, Seoul, Republic of Korea
- Jinhyuk Lee
- Department of Computer Science and Engineering, Korea University, Seoul, Republic of Korea
- Hwisang Jeon
- Interdisciplinary Graduate Program in Bioinformatics, Korea University, Seoul, Republic of Korea
- Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, Shanghai, China
- Miyoung Ko
- Department of Computer Science and Engineering, Korea University, Seoul, Republic of Korea
- Hualiang Jiang
- Shanghai Institute for Advanced Immunochemical Studies, ShanghaiTech University, Shanghai, China
- Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, Shanghai, China
- Mingyue Zheng
- Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, Shanghai, China
- Aik Choon Tan
- Department of Biostatistics and Bioinformatics, Moffitt Cancer Center, Tampa, Florida, United States of America
- Xindi Guo
- Sage Bionetworks, Seattle, Washington, United States of America
- Kristen K. Dang
- Sage Bionetworks, Seattle, Washington, United States of America
- Alex Tropsha
- Laboratory for Molecular Modeling, Division of Chemical Biology and Medicinal Chemistry, UNC Eshelman School of Pharmacy, University of North Carolina, Chapel Hill, North Carolina, United States of America
- Chana Hecht
- Department of Cell, Developmental, and Regenerative Biology, Icahn School of Medicine at Mount Sinai, New York City, New York, United States of America
- Tirtha K. Das
- Department of Cell, Developmental, and Regenerative Biology, Icahn School of Medicine at Mount Sinai, New York City, New York, United States of America
- Heather A. Carlson
- Department of Medicinal Chemistry, University of Michigan, Ann Arbor, Michigan, United States of America
- Ruben Abagyan
- Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California, San Diego, California, United States of America
- Justin Guinney
- Sage Bionetworks, Seattle, Washington, United States of America
- Avner Schlessinger
- Department of Pharmacological Sciences, Icahn School of Medicine at Mount Sinai, New York City, New York, United States of America
- Ross Cagan
- Department of Cell, Developmental, and Regenerative Biology, Icahn School of Medicine at Mount Sinai, New York City, New York, United States of America
- Institute of Cancer Sciences, University of Glasgow, Glasgow, Scotland, United Kingdom
358
Zhang S, Jiang M, Wang S, Wang X, Wei Z, Li Z. SAG-DTA: Prediction of Drug-Target Affinity Using Self-Attention Graph Network. Int J Mol Sci 2021; 22:ijms22168993. [PMID: 34445696 PMCID: PMC8396496 DOI: 10.3390/ijms22168993] [Citation(s) in RCA: 19] [Impact Index Per Article: 4.8] [Received: 07/07/2021] [Revised: 08/14/2021] [Accepted: 08/17/2021] [Indexed: 11/16/2022]
Abstract
The prediction of drug–target affinity (DTA) is a crucial step for drug screening and discovery. In this study, a new graph-based prediction model named SAG-DTA (self-attention graph drug–target affinity) was implemented. Unlike previous graph-based methods, the proposed model utilized self-attention mechanisms on the drug molecular graph to obtain effective representations of drugs for DTA prediction. Features of each atom node in the molecular graph were weighted using an attention score before being aggregated into the molecule representation. Various self-attention scoring methods were compared in this study. In addition, two pooling architectures, namely, global and hierarchical architectures, were presented and evaluated on benchmark datasets. Results of comparative experiments on both regression and binary classification tasks showed that SAG-DTA was superior to previous sequence-based or other graph-based methods and exhibited good generalization ability.
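The attention-weighted aggregation described in this abstract can be illustrated with a minimal sketch. It is not the SAG-DTA implementation: the function name `attention_readout`, the linear scoring vector, and the softmax normalization are all assumptions chosen for illustration, and the paper itself compares several scoring methods.

```python
import numpy as np

def attention_readout(node_features, score_weights):
    """Pool per-atom features into one molecule vector (illustrative only)."""
    # One scalar score per atom, from a hypothetical linear scoring vector.
    scores = node_features @ score_weights
    # Softmax over atoms (an assumed normalization), shifted for stability.
    weights = np.exp(scores - scores.max())
    weights /= weights.sum()
    # The attention-weighted sum of atom features is the molecule vector.
    return weights @ node_features, weights

rng = np.random.default_rng(0)
atoms = rng.normal(size=(5, 8))   # toy molecule: 5 atoms, 8 features each
w = rng.normal(size=8)            # hypothetical scoring parameters
mol_vec, attn = attention_readout(atoms, w)
```

With an all-zero scoring vector every atom receives the same weight, so the readout reduces to plain mean pooling; this gives a convenient baseline for judging what the attention scores contribute.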
Affiliation(s)
- Shugang Zhang
- College of Computer Science and Technology, Ocean University of China, Qingdao 266100, China
- Mingjian Jiang
- School of Information and Control Engineering, Qingdao University of Technology, Qingdao 266033, China
- Shuang Wang
- College of Computer Science and Technology, China University of Petroleum (East China), Qingdao 266580, China
- Zhiqiang Wei
- College of Computer Science and Technology, Ocean University of China, Qingdao 266100, China
- Zhen Li
- College of Computer Science and Technology, Qingdao University, Qingdao 266071, China
- Correspondence: Tel./Fax: +86-532-85953086
359
Carracedo-Reboredo P, Liñares-Blanco J, Rodríguez-Fernández N, Cedrón F, Novoa FJ, Carballal A, Maojo V, Pazos A, Fernandez-Lozano C. A review on machine learning approaches and trends in drug discovery. Comput Struct Biotechnol J 2021; 19:4538-4558. [PMID: 34471498 PMCID: PMC8387781 DOI: 10.1016/j.csbj.2021.08.011] [Citation(s) in RCA: 136] [Impact Index Per Article: 34.0] [Received: 03/01/2021] [Revised: 08/06/2021] [Accepted: 08/06/2021] [Indexed: 12/30/2022]
Abstract
Drug discovery aims at finding new compounds with specific chemical properties for the treatment of diseases. In recent years, computer science has become an important component of this search, driven by the rapid rise of machine learning techniques and their democratization. With the objectives set by the Precision Medicine initiative and the new challenges they generate, it is necessary to establish robust, standard and reproducible computational methodologies. Currently, predictive models based on Machine Learning have gained great importance in the step prior to preclinical studies. This stage drastically reduces costs and research times in the discovery of new drugs. This review article focuses on how these new methodologies have been used in recent years of research. Analyzing the state of the art in this field gives an idea of where cheminformatics will develop in the short term, the limitations it presents and the positive results it has achieved. This review focuses mainly on the methods used to model the molecular data, as well as the biological problems addressed and the Machine Learning algorithms used for drug discovery in recent years.
Key Words
- ADMET, Absorption, distribution, metabolism, elimination and toxicity
- ADR, Adverse Drug Reaction
- AI, Artificial Intelligence
- ANN, Artificial Neural Networks
- APFP, Atom Pairs 2d FingerPrint
- AUC, Area under the Curve
- BBB, Blood–Brain barrier
- CDK, Chemistry Development Kit
- CNN, Convolutional Neural Networks
- CNS, Central Nervous System
- CPI, Compound-protein interaction
- CV, Cross Validation
- Cheminformatics
- DL, Deep Learning
- DNA, Deoxyribonucleic acid
- Deep Learning
- Drug Discovery
- ECFP, Extended Connectivity Fingerprints
- FDA, Food and Drug Administration
- FNN, Fully Connected Neural Networks
- FP, Fingerprints
- FS, Feature Selection
- GCN, Graph Convolutional Networks
- GEO, Gene Expression Omnibus
- GNN, Graph Neural Networks
- GO, Gene Ontology
- KEGG, Kyoto Encyclopedia of Genes and Genomes
- MACCS, Molecular ACCess System
- MCC, Matthews correlation coefficient
- MD, Molecular Descriptors
- MKL, Multiple Kernel Learning
- ML, Machine Learning
- Machine Learning
- Molecular Descriptors
- NB, Naive Bayes
- OOB, Out of Bag
- PCA, Principal Component Analysis
- QSAR
- QSAR, Quantitative structure–activity relationship
- RF, Random Forest
- RNA, Ribonucleic Acid
- SMILES, simplified molecular-input line-entry system
- SVM, Support Vector Machines
- TCGA, The Cancer Genome Atlas
- WHO, World Health Organization
- t-SNE, t-Distributed Stochastic Neighbor Embedding
Affiliation(s)
- Paula Carracedo-Reboredo
- Department of Computer Science and Information Technologies, Faculty of Computer Science, Universidade da Coruna, Campus Elviña s/n, A Coruña 15071, Spain
- Jose Liñares-Blanco
- Department of Computer Science and Information Technologies, Faculty of Computer Science, Universidade da Coruna, Campus Elviña s/n, A Coruña 15071, Spain
- CITIC-Research Center of Information and Communication Technologies, Universidade da Coruna, A Coruña 15071, Spain
- Nereida Rodríguez-Fernández
- CITIC-Research Center of Information and Communication Technologies, Universidade da Coruna, A Coruña 15071, Spain
- Department of Computer Science and Information Technologies, Faculty of Communication Science, Universidade da Coruna, Campus Elviña s/n, A Coruña 15071, Spain
- Francisco Cedrón
- Department of Computer Science and Information Technologies, Faculty of Computer Science, Universidade da Coruna, Campus Elviña s/n, A Coruña 15071, Spain
- Francisco J. Novoa
- Department of Computer Science and Information Technologies, Faculty of Computer Science, Universidade da Coruna, Campus Elviña s/n, A Coruña 15071, Spain
- Adrian Carballal
- Department of Computer Science and Information Technologies, Faculty of Computer Science, Universidade da Coruna, Campus Elviña s/n, A Coruña 15071, Spain
- CITIC-Research Center of Information and Communication Technologies, Universidade da Coruna, A Coruña 15071, Spain
- Department of Computer Science and Information Technologies, Faculty of Communication Science, Universidade da Coruna, Campus Elviña s/n, A Coruña 15071, Spain
- Victor Maojo
- Biomedical Informatics Group, Artificial Intelligence Department, Polytechnic University of Madrid, Calle de los Ciruelos, Boadilla del Monte, Madrid 28660, Spain
- Alejandro Pazos
- Department of Computer Science and Information Technologies, Faculty of Computer Science, Universidade da Coruna, Campus Elviña s/n, A Coruña 15071, Spain
- CITIC-Research Center of Information and Communication Technologies, Universidade da Coruna, A Coruña 15071, Spain
- Grupo de Redes de Neuronas Artificiales y Sistemas Adaptativos. Imagen Médica y Diagnóstico Radiológico (RNASA-IMEDIR), Complexo Hospitalario Universitario de A Coruña (CHUAC), SERGAS, Universidade da Coruña, Instituto de Investigación Biomédica de A Coruña (INIBIC), A Coruña, Spain
- Carlos Fernandez-Lozano
- Department of Computer Science and Information Technologies, Faculty of Computer Science, Universidade da Coruna, Campus Elviña s/n, A Coruña 15071, Spain
- CITIC-Research Center of Information and Communication Technologies, Universidade da Coruna, A Coruña 15071, Spain
- Grupo de Redes de Neuronas Artificiales y Sistemas Adaptativos. Imagen Médica y Diagnóstico Radiológico (RNASA-IMEDIR), Complexo Hospitalario Universitario de A Coruña (CHUAC), SERGAS, Universidade da Coruña, Instituto de Investigación Biomédica de A Coruña (INIBIC), A Coruña, Spain
360
Wang X, Liu M, Zhang Y, He S, Qin C, Li Y, Lu T. Deep fusion learning facilitates anatomical therapeutic chemical recognition in drug repurposing and discovery. Brief Bioinform 2021; 22:6342939. [PMID: 34368838 DOI: 10.1093/bib/bbab289] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Received: 03/21/2021] [Revised: 07/03/2021] [Accepted: 07/06/2021] [Indexed: 01/17/2023]
Abstract
The advent of large-scale biomedical data and computational algorithms provides new opportunities for drug repurposing and discovery. It is of great interest to find an appropriate data representation and modeling method to facilitate these studies. The anatomical therapeutic chemical (ATC) classification system, proposed by the World Health Organization (WHO), is an essential source of information for drug repurposing and discovery. Besides, computational methods are applied to predict drug ATC classification. We conducted a systematic review of ATC computational prediction studies and revealed the differences in data sets, data representation, algorithm approaches, and evaluation metrics. We then proposed a deep fusion learning (DFL) framework to optimize the ATC prediction model, namely DeepATC. The methods based on graph convolutional network, inferring biological network and multimodel attentive fusion network were applied in DeepATC to extract the molecular topological information and low-dimensional representation from the molecular graph and heterogeneous biological networks. The results indicated that DeepATC achieved superior model performance with area under the curve (AUC) value at 0.968. Furthermore, the DFL framework was performed for the transcriptome data-based ATC prediction, as well as another independent task that is significantly relevant to drug discovery, namely drug-target interaction. The DFL-based model achieved excellent performance in the above-extended validation task, suggesting that the idea of aggregating the heterogeneous biological network and node's (molecule or protein) self-topological features will bring inspiration for broader drug repurposing and discovery research.
Affiliation(s)
- Xiting Wang
- Life Science School, Beijing University of Chinese Medicine, Beijing, China
- Meng Liu
- Chinese Medicine School, Beijing University of Chinese Medicine, Beijing, China
- Yiling Zhang
- Beijing University of Chinese Medicine, Beijing, China
- Shuangshuang He
- Chinese Medicine School, Beijing University of Chinese Medicine, Beijing, China
- Caimeng Qin
- School of Life Sciences, Beijing University of Chinese Medicine and Institute of Biophysics, Chinese Academy of Sciences, Beijing, China
- Yu Li
- Chinese Medicine School, Beijing University of Chinese Medicine, Beijing, China
- Tao Lu
- Integrative Medicine Center in School of Life Sciences, Beijing University of Chinese Medicine, Beijing, China
361
Yu L, Su Y, Liu Y, Zeng X. Review of unsupervised pretraining strategies for molecules representation. Brief Funct Genomics 2021; 20:323-332. [PMID: 34342611 DOI: 10.1093/bfgp/elab036] [Citation(s) in RCA: 13] [Impact Index Per Article: 3.3] [Received: 06/15/2021] [Revised: 07/07/2021] [Accepted: 07/08/2021] [Indexed: 11/14/2022]
Abstract
In recent years, computer-assisted techniques have made great progress in the field of drug discovery. Yet the problem of limited labeled data remains challenging and restricts the performance of these techniques in specific tasks, such as molecular property prediction, compound-protein interaction and de novo molecular generation. One effective solution is to utilize the experience and knowledge gained from other tasks to cope with related pursuits. Unsupervised pretraining is promising, due to its capability of leveraging a vast number of unlabeled molecules and acquiring a more informative molecular representation for the downstream tasks. In particular, models trained on large-scale unlabeled molecules can capture generalizable features, and this ability can be employed to improve the performance of specific downstream tasks. Many relevant pretraining works have been recently proposed. Here, we provide an overview of molecular unsupervised pretraining and related applications in drug discovery. Challenges and possible solutions are also summarized.
|
362
|
Lopez K, Pinheiro S, Zamora WJ. Multiple linear regression models for predicting the n‑octanol/water partition coefficients in the SAMPL7 blind challenge. J Comput Aided Mol Des 2021; 35:923-931. [PMID: 34251523 PMCID: PMC8273033 DOI: 10.1007/s10822-021-00409-2] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 04/05/2021] [Accepted: 07/05/2021] [Indexed: 01/19/2023]
Abstract
A multiple linear regression model called MLR-3 is used for predicting the experimental n-octanol/water partition coefficient (log PN) of 22 N-sulfonamides proposed by the organizers of the SAMPL7 blind challenge. The MLR-3 method was trained with 82 molecules including drug-like sulfonamides and small organic molecules, which resembled the main functional groups present in the challenge dataset. Our model, submitted as "TFE-MLR", presented a root-mean-square error of 0.58 and mean absolute error of 0.41 in log P units, accomplishing the highest accuracy, among empirical methods and also in all submissions based on the ranked ones. Overall, the results support the appropriateness of multiple linear regression approach MLR-3 for computing the n-octanol/water partition coefficient in sulfonamide-bearing compounds. In this context, the outstanding performance of empirical methodologies, where 75% of the ranked submissions achieved root-mean-square errors < 1 log P units, support the suitability of these strategies for obtaining accurate and fast predictions of physicochemical properties as partition coefficients of bioorganic compounds.
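The two error metrics quoted in this abstract (root-mean-square error and mean absolute error) can be computed as in the minimal sketch below. The function names and the log P values are hypothetical and chosen only for illustration; they are not the SAMPL7 data or the authors' code.

```python
import math


def rmse(predicted, observed):
    """Root-mean-square error between paired predictions and observations."""
    if len(predicted) != len(observed) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(p - o) ** 2 for p, o in zip(predicted, observed)]
    return math.sqrt(sum(squared) / len(squared))


def mae(predicted, observed):
    """Mean absolute error between paired predictions and observations."""
    if len(predicted) != len(observed) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    absolute = [abs(p - o) for p, o in zip(predicted, observed)]
    return sum(absolute) / len(absolute)


# Made-up log P values for illustration only (assumption, not challenge data).
observed = [1.0, 2.0, 3.0, 4.0]
predicted = [1.5, 2.0, 2.0, 4.5]

print("RMSE:", rmse(predicted, observed))
print("MAE:", mae(predicted, observed))
```

Because RMSE squares each residual before averaging, a single large miss raises it more than it raises MAE, which is why the two values are usually reported together.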
Affiliation(s)
- Kenneth Lopez
  - School of Chemistry, University of Costa Rica, San Pedro, San José, Costa Rica
- Silvana Pinheiro
  - Institute of Exact and Natural Sciences, Federal University of Pará, Belém, Pará, 66075-110, Brazil
- William J Zamora
  - School of Chemistry, University of Costa Rica, San Pedro, San José, Costa Rica
  - Advanced Computing Lab (CNCA), National High Technology Center (CeNAT-CONARE), Pavas, San José, Costa Rica
|
363
|
Gallego V, Naveiro R, Roca C, Ríos Insua D, Campillo NE. AI in drug development: a multidisciplinary perspective. Mol Divers 2021; 25:1461-1479. [PMID: 34251580 PMCID: PMC8342381 DOI: 10.1007/s11030-021-10266-8] [Citation(s) in RCA: 14] [Impact Index Per Article: 3.5] [Received: 04/21/2021] [Accepted: 06/29/2021] [Indexed: 01/09/2023]
Abstract
The introduction of a new drug to the commercial market follows a complex and long process that typically spans over several years and entails large monetary costs due to a high attrition rate. Because of this, there is an urgent need to improve this process using innovative technologies such as artificial intelligence (AI). Different AI tools are being applied to support all four steps of the drug development process (basic research for drug discovery; pre-clinical phase; clinical phase; and postmarketing). Some of the main tasks where AI has proven useful include identifying molecular targets, searching for hit and lead compounds, synthesising drug-like compounds and predicting ADME-Tox. This review, on the one hand, brings in a mathematical vision of some of the key AI methods used in drug development closer to medicinal chemists and, on the other hand, brings the drug development process and the use of different models closer to mathematicians. Emphasis is placed on two aspects not mentioned in similar surveys, namely, Bayesian approaches and their applications to molecular modelling and the eventual final use of the methods to actually support decisions. Promoting a perfect synergy.
Affiliation(s)
- Víctor Gallego
  - Institute of Mathematical Sciences (ICMAT-CSIC), Nicolás Cabrera 13-15, 28049, Madrid, Spain
- Roi Naveiro
  - Institute of Mathematical Sciences (ICMAT-CSIC), Nicolás Cabrera 13-15, 28049, Madrid, Spain
- Carlos Roca
  - AItenea Biotech S.L. Parque Científico de Madrid, Faraday, 7, 28049, Madrid, Spain
- David Ríos Insua
  - ICMAT-CSIC and Dept. of Statistics and OR, U. Compl. Madrid, Madrid, Spain
- Nuria E Campillo
  - CIB-Margarita Salas (CSIC), Ramiro de Maeztu, 9, 28040, Madrid, Spain
|
364
|
Menke J, Massa J, Koch O. Natural product scores and fingerprints extracted from artificial neural networks. Comput Struct Biotechnol J 2021; 19:4593-4602. [PMID: 34584636 PMCID: PMC8445839 DOI: 10.1016/j.csbj.2021.07.032] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 03/31/2021] [Revised: 07/26/2021] [Accepted: 07/26/2021] [Indexed: 11/21/2022] Open
Abstract
Due to their desirable properties, natural products are an important ligand class for medicinal chemists. However, due to their structural distinctiveness, traditional cheminformatic approaches, like ligand-based virtual screening, often perform worse for natural products. Based on our recent work, we evaluated the ability of neural networks to generate fingerprints more appropriate for use with natural products. A manually curated dataset of natural products and synthetic decoys was used to train a multi-layer perceptron network and an autoencoder-like network. In-depth analysis showed that the extracted natural product-specific neural fingerprint outperforms traditional as well as natural product-specific fingerprints on three datasets. Further, we explored how the activations from the output layer of a network can work as a novel natural product likeness score. Overall, two natural product-specific datasets were generated, which are publicly available together with the code to create the fingerprints and the novel natural product likeness score.
Affiliation(s)
- Janosch Menke
  - Institute of Pharmaceutical and Medicinal Chemistry, Westfälische Wilhelms-Universität Münster, Corrensstraße 48, 48149 Münster, Germany
- Joana Massa
  - Institute of Pharmaceutical and Medicinal Chemistry, Westfälische Wilhelms-Universität Münster, Corrensstraße 48, 48149 Münster, Germany
- Oliver Koch
  - Institute of Pharmaceutical and Medicinal Chemistry, Westfälische Wilhelms-Universität Münster, Corrensstraße 48, 48149 Münster, Germany
  - Center for Multiscale Theory and Computation, Westfälische Wilhelms-Universität Münster, Corrensstraße 48, 48149 Münster, Germany
|
365
|
Wang L, Zhao L, Liu X, Fu J, Zhang A. SepPCNET: Deeping Learning on a 3D Surface Electrostatic Potential Point Cloud for Enhanced Toxicity Classification and Its Application to Suspected Environmental Estrogens. Environmental Science & Technology 2021; 55:9958-9967. [PMID: 34240848 DOI: 10.1021/acs.est.1c01228] [Citation(s) in RCA: 19] [Impact Index Per Article: 4.8] [Indexed: 06/13/2023]
Abstract
Deep learning (DL) offers an unprecedented opportunity to revolutionize the landscape of toxicity prediction based on quantitative structure-activity relationship (QSAR) studies in the big data era. However, the structural description in the reported DL-QSAR models is still restricted to the two-dimensional level. Inspired by point clouds, a type of geometric data structure, a novel three-dimensional (3D) molecular surface point cloud with electrostatic potential (SepPC) was proposed to describe chemical structures. Each surface point of a chemical is assigned its 3D coordinate and molecular electrostatic potential. A novel DL architecture SepPCNET was then introduced to directly consume unordered SepPC data for toxicity classification. The SepPCNET model was trained on 1317 chemicals tested in a battery of 18 estrogen receptor-related assays of the ToxCast program. The obtained model recognized the active and inactive chemicals at accuracies of 82.8 and 88.9%, respectively, with a total accuracy of 88.3% on the internal test set and 92.5% on the external test set, which outperformed other up-to-date machine learning models and succeeded in recognizing the difference in the activity of isomers. Additional insights into the toxicity mechanism were also gained by visualizing critical points and extracting data-driven point features of active chemicals.
Affiliation(s)
- Liguo Wang
  - State Key Laboratory of Environmental Chemistry and Ecotoxicology, Research Center for Eco-Environmental Sciences, Chinese Academy of Sciences, Beijing 100085, P. R. China
  - University of Chinese Academy of Sciences, Beijing 100049, P. R. China
- Lu Zhao
  - State Key Laboratory of Environmental Chemistry and Ecotoxicology, Research Center for Eco-Environmental Sciences, Chinese Academy of Sciences, Beijing 100085, P. R. China
  - University of Chinese Academy of Sciences, Beijing 100049, P. R. China
- Xian Liu
  - State Key Laboratory of Environmental Chemistry and Ecotoxicology, Research Center for Eco-Environmental Sciences, Chinese Academy of Sciences, Beijing 100085, P. R. China
- Jianjie Fu
  - State Key Laboratory of Environmental Chemistry and Ecotoxicology, Research Center for Eco-Environmental Sciences, Chinese Academy of Sciences, Beijing 100085, P. R. China
  - School of Environment, Hangzhou Institute for Advanced Study, University of Chinese Academy of Sciences, Hangzhou 310012, P. R. China
- Aiqian Zhang
  - State Key Laboratory of Environmental Chemistry and Ecotoxicology, Research Center for Eco-Environmental Sciences, Chinese Academy of Sciences, Beijing 100085, P. R. China
  - University of Chinese Academy of Sciences, Beijing 100049, P. R. China
  - School of Environment, Hangzhou Institute for Advanced Study, University of Chinese Academy of Sciences, Hangzhou 310012, P. R. China
|
366
|
Hung C, Gini G. QSAR modeling without descriptors using graph convolutional neural networks: the case of mutagenicity prediction. Mol Divers 2021; 25:1283-1299. [PMID: 34146224 DOI: 10.1007/s11030-021-10250-2] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.0] [Received: 03/06/2021] [Accepted: 06/08/2021] [Indexed: 11/30/2022]
Abstract
Deep neural networks are effective in learning directly from low-level encoded data without the need of feature extraction. This paper shows how QSAR models can be constructed from 2D molecular graphs without computing chemical descriptors. Two graph convolutional neural network-based models are presented with and without a Bayesian estimation of the prediction uncertainty. The property under investigation is mutagenicity: Models developed here predict the output of the Ames test. These models take the SMILES representation of the molecules as input to produce molecular graphs in terms of adjacency matrices and subsequently use attention mechanisms to weight the role of their subgraphs in producing the output. The results positively compare with current state-of-the-art models. Furthermore, our proposed model interpretation can be enhanced by the automatic extraction of the substructures most important in driving the prediction, as well as by uncertainty estimations.
|
367
|
Chen W, Chen G, Zhao L, Chen CYC. Predicting Drug-Target Interactions with Deep-Embedding Learning of Graphs and Sequences. J Phys Chem A 2021; 125:5633-5642. [PMID: 34142824 DOI: 10.1021/acs.jpca.1c02419] [Citation(s) in RCA: 16] [Impact Index Per Article: 4.0] [Indexed: 11/28/2022]
Abstract
Computational approaches for predicting drug-target interactions (DTIs) play an important role in drug discovery since conventional screening experiments are time-consuming and expensive. In this study, we proposed end-to-end representation learning of a graph neural network with an attention mechanism and an attentive bidirectional long short-term memory (BiLSTM) to predict DTIs. For efficient training, we introduced a bidirectional encoder representations from transformers (BERT) pretrained method to extract substructure features from protein sequences and a local breadth-first search (BFS) to learn subgraph information from molecular graphs. Integrating both models, we developed a DTI prediction system. As a result, the proposed method achieved high performances with increases of 2.4% and 9.4% for AUC and recall, respectively, on unbalanced datasets compared with other methods. Extensive experiments showed that our model can relatively screen potential drugs for specific protein. Furthermore, visualizing the attention weights provides biological insight.
Affiliation(s)
- Wei Chen
  - Artificial Intelligence Medical Center, School of Intelligent Systems Engineering, Sun Yat-sen University, Shenzhen 510275, China
- Guanxing Chen
  - Artificial Intelligence Medical Center, School of Intelligent Systems Engineering, Sun Yat-sen University, Shenzhen 510275, China
- Lu Zhao
  - Artificial Intelligence Medical Center, School of Intelligent Systems Engineering, Sun Yat-sen University, Shenzhen 510275, China
  - Department of Clinical Laboratory, The Sixth Affiliated Hospital, Sun Yat-sen University, Guangzhou 510655, China
- Calvin Yu-Chian Chen
  - Artificial Intelligence Medical Center, School of Intelligent Systems Engineering, Sun Yat-sen University, Shenzhen 510275, China
  - Department of Medical Research, China Medical University Hospital, Taichung 40447, Taiwan
  - Department of Bioinformatics and Medical Engineering, Asia University, Taichung 41354, Taiwan
|
368
|
Zhong W, Zhao L, Yang Z, Yu-Chian Chen C. Graph convolutional network approach to investigate potential selective Limk1 inhibitors. J Mol Graph Model 2021; 107:107965. [PMID: 34167067 DOI: 10.1016/j.jmgm.2021.107965] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.0] [Received: 02/26/2021] [Revised: 05/27/2021] [Accepted: 06/07/2021] [Indexed: 12/26/2022]
Abstract
Since the Limk1 is a promising drug target and few inhibitors with good Limk1/ROCK2 selectivity have been reported, discovering potential and selective Limk1 inhibitors with novel scaffolds is becoming an urgent need to develop new treatments for the related diseases. Here, we utilized molecular docking to screen potential compounds of Limk1 from Traditional Chinese Medicine (TCM) database. Meanwhile, we performed a three-dimensional graph convolutional network (3DGCN), based on 3D molecular graph, to predict the inhibitory activity of Limk1 and ROCK2. Compared with the baseline models (RF, GCN and Weave), the 3DGCN achieved higher accuracy and the averaged RMSE values on test sets for Limk1 and ROCK2 were 0.721 and 0.852 respectively. In 3DGCN, above 80% of the test-set molecules from both two datasets were predicted within absolute error of 1.0 and the feature visualization suggested that it could automatically learn relevant structure features including 3D molecular information from a specific task for prediction. Furthermore, molecular dynamics (MD) simulations within 100 ns were employed to verify the stability of ligand-protein complexes and reveal the binding modes of the potential selective lead compounds of Limk1. Finally, integrating docking results, the predicted values by the 3DGCN and the MD analysis, we found that 7549 and 2007_15649 might be the potential and selective inhibitors for Limk1 receptor.
Affiliation(s)
- Weihe Zhong
  - Artificial Intelligence Medical Center, School of Intelligent Systems Engineering, Sun Yat-sen University, Shenzhen, Guangdong, 510275, China
  - School of Biomedical Engineering, Sun Yat-sen University, Shenzhen, Guangdong, 510275, China
- Lu Zhao
  - Artificial Intelligence Medical Center, School of Intelligent Systems Engineering, Sun Yat-sen University, Shenzhen, Guangdong, 510275, China
  - Department of Clinical Laboratory, The Sixth Affiliated Hospital, Sun Yat-sen University, Guangzhou, Guangdong, 510655, China
- Ziduo Yang
  - Artificial Intelligence Medical Center, School of Intelligent Systems Engineering, Sun Yat-sen University, Shenzhen, Guangdong, 510275, China
- Calvin Yu-Chian Chen
  - Artificial Intelligence Medical Center, School of Intelligent Systems Engineering, Sun Yat-sen University, Shenzhen, Guangdong, 510275, China
  - Department of Medical Research, China Medical University Hospital, Taichung, 40447, Taiwan
  - Department of Bioinformatics and Medical Engineering, Asia University, Taichung, 41354, Taiwan
|
369
|
Druchok M, Yarish D, Garkot S, Nikolaienko T, Gurbych O. Ensembling machine learning models to boost molecular affinity prediction. Comput Biol Chem 2021; 93:107529. [PMID: 34192653 DOI: 10.1016/j.compbiolchem.2021.107529] [Citation(s) in RCA: 9] [Impact Index Per Article: 2.3] [Received: 04/20/2021] [Revised: 06/01/2021] [Accepted: 06/08/2021] [Indexed: 02/01/2023]
Abstract
This study unites six popular machine learning approaches to enhance the prediction of a molecular binding affinity between receptors (large protein molecules) and ligands (small organic molecules). Here we examine a scheme where affinity of ligands is predicted against a single receptor - human thrombin, thus, the models consider ligand features only. However, the suggested approach can be repurposed for other receptors. The methods include Support Vector Machine, Random Forest, CatBoost, feed-forward neural network, graph neural network, and Bidirectional Encoder Representations from Transformers. The first five methods use input features based on physico-chemical properties of molecules, while the last one is based on textual molecular representations. All approaches do not rely on atomic spatial coordinates, avoiding a potential bias from known structures, and are capable of generalizing for compounds with unknown conformations. Within each of the methods, we have trained two models that solve classification and regression tasks. Then, all models are grouped into a pipeline of two subsequent ensembles. The first ensemble aggregates six classification models which vote whether a ligand binds to a receptor or not. If a ligand is classified as active (i.e., binds), the second ensemble predicts its binding affinity in terms of the inhibition constant Ki.
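The two-stage pipeline described in this abstract (a voting ensemble of classifiers, followed by a regression ensemble only for ligands voted active) can be sketched roughly as follows. The abstract does not state how votes are tallied or how the regressors are combined, so the strict-majority rule and the plain mean used here are assumptions, and the toy models and the `score` feature are hypothetical stand-ins rather than the authors' models.

```python
from statistics import mean


def predict_affinity(features, classifiers, regressors):
    """Two-stage ensemble: vote on activity, then regress only if active.

    Returns None when the classifiers do not reach a strict majority
    (assumed voting rule); otherwise returns the mean of the regressors'
    predictions (assumed aggregation rule).
    """
    votes = [bool(clf(features)) for clf in classifiers]
    if 2 * sum(votes) <= len(votes):
        return None  # classified as inactive, so no affinity is predicted
    return mean(reg(features) for reg in regressors)


# Hypothetical toy models operating on a single made-up feature.
classifiers = [
    lambda f: f["score"] > 0.2,
    lambda f: f["score"] > 0.5,
    lambda f: f["score"] > 0.8,
]
regressors = [
    lambda f: 2.0 * f["score"],
    lambda f: 4.0 * f["score"],
]

print(predict_affinity({"score": 0.3}, classifiers, regressors))  # inactive
print(predict_affinity({"score": 0.6}, classifiers, regressors))  # active
```

Gating the regressors behind the classifiers means affinity values are only produced for ligands the first ensemble considers binders, which mirrors the order of the two ensembles in the abstract.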
Affiliation(s)
- Maksym Druchok
  - SoftServe, Inc., 2d Sadova Str., 79021 Lviv, Ukraine
  - Institute for Condensed Matter Physics, NAS of Ukraine, 1 Svientsitskii Str., 79011 Lviv, Ukraine
- Sofiya Garkot
  - SoftServe, Inc., 2d Sadova Str., 79021 Lviv, Ukraine
  - Ukrainian Catholic University, 17 Svientsitskii Str., 79011 Lviv, Ukraine
- Tymofii Nikolaienko
  - SoftServe, Inc., 2d Sadova Str., 79021 Lviv, Ukraine
  - Taras Shevchenko National University of Kyiv, 64/13, Volodymyrska Str., 01601 Kyiv, Ukraine
- Oleksandr Gurbych
  - SoftServe, Inc., 2d Sadova Str., 79021 Lviv, Ukraine
  - Lviv Polytechnic National University, 5 Kniazia Romana Str., 79005 Lviv, Ukraine
|
370
|
Serafim MSM, Dos Santos Júnior VS, Gertrudes JC, Maltarollo VG, Honorio KM. Machine learning techniques applied to the drug design and discovery of new antivirals: a brief look over the past decade. Expert Opin Drug Discov 2021; 16:961-975. [PMID: 33957833 DOI: 10.1080/17460441.2021.1918098] [Citation(s) in RCA: 9] [Impact Index Per Article: 2.3] [Indexed: 12/12/2022]
Abstract
Introduction: Drug design and discovery of new antivirals will always be extremely important in medicinal chemistry, taking into account known and new viral diseases that are yet to come. Although machine learning (ML) has been shown to improve predictions on the biological potential of chemicals and accelerate the discovery of drugs over the past decade, new methods and their combinations have improved their performance and established promising perspectives regarding ML in the search for new antivirals. Areas covered: The authors consider some interesting areas that deal with different ML techniques applied to antivirals. Recent innovative studies on ML and antivirals were selected and analyzed in detail. Also, the authors provide a brief look at the past to the present to detect advances and bottlenecks in the area. Expert opinion: From classical ML techniques, it was possible to boost the searches for antivirals. However, from the emergence of new algorithms and the improvement in old approaches, promising results will be achieved every day, as we have observed in the case of SARS-CoV-2. Recent experience has shown that it is possible to use ML to discover new antiviral candidates from virtual screening and drug repurposing.
Affiliation(s)
- Mateus Sá Magalhães Serafim
  - Departamento de Microbiologia, Instituto de Ciências Biológicas, Universidade Federal de Minas Gerais (UFMG), Belo Horizonte, Brazil
- Jadson Castro Gertrudes
  - Departamento de Computação, Instituto de Ciências Exatas e Biológicas, Universidade Federal de Ouro Preto (UFOP), Ouro Preto, Brazil
- Vinícius Gonçalves Maltarollo
  - Departamento de Produtos Farmacêuticos, Faculdade de Farmácia, Universidade Federal de Minas Gerais (UFMG), Belo Horizonte, Brazil
- Kathia Maria Honorio
  - Escola de Artes, Ciências e Humanidades, Universidade de São Paulo (USP), São Paulo, Brazil
  - Centro de Ciências Naturais e Humanas, Universidade Federal do ABC (UFABC), Santo André, Brazil
|
371
|
Wu Z, Jiang D, Wang J, Hsieh CY, Cao D, Hou T. Mining Toxicity Information from Large Amounts of Toxicity Data. J Med Chem 2021; 64:6924-6936. [PMID: 33961429 DOI: 10.1021/acs.jmedchem.1c00421] [Citation(s) in RCA: 51] [Impact Index Per Article: 12.8] [Indexed: 11/29/2022]
Abstract
Safety is a main reason for drug failures, and therefore, the detection of compound toxicity and potential adverse effects in the early stage of drug development is highly desirable. However, accurate prediction of many toxicity endpoints is extremely challenging due to low accessibility of sufficient and reliable toxicity data, as well as complicated and diversified mechanisms related to toxicity. In this study, we proposed the novel multitask graph attention (MGA) framework to learn the regression and classification tasks simultaneously. MGA has shown excellent predictive power on 33 toxicity data sets and has the capability to extract general toxicity features and generate customized toxicity fingerprints. In addition, MGA provides a new way to detect structural alerts and discover the relationship between different toxicity tasks, which will be quite helpful to mine toxicity information from large amounts of toxicity data.
Affiliation(s)
- Zhenxing Wu
  - Innovation Institute for Artificial Intelligence in Medicine, College of Pharmaceutical Sciences, Zhejiang University, Hangzhou 310058 Zhejiang, P. R. China
  - State Key Lab of CAD&CG, Zhejiang University, Hangzhou 310058 Zhejiang, P. R. China
- Dejun Jiang
  - Innovation Institute for Artificial Intelligence in Medicine, College of Pharmaceutical Sciences, Zhejiang University, Hangzhou 310058 Zhejiang, P. R. China
- Jike Wang
  - Innovation Institute for Artificial Intelligence in Medicine, College of Pharmaceutical Sciences, Zhejiang University, Hangzhou 310058 Zhejiang, P. R. China
  - National Engineering Research Center for Multimedia Software, School of Computer Science, Wuhan University, Wuhan 430072 Hubei, P. R. China
- Chang-Yu Hsieh
  - Tencent Quantum Laboratory, Tencent, Shenzhen 518057 Guangdong, P. R. China
- Dongsheng Cao
  - Xiangya School of Pharmaceutical Sciences, Central South University, Changsha 410004 Hunan, P. R. China
- Tingjun Hou
  - Innovation Institute for Artificial Intelligence in Medicine, College of Pharmaceutical Sciences, Zhejiang University, Hangzhou 310058 Zhejiang, P. R. China
  - State Key Lab of CAD&CG, Zhejiang University, Hangzhou 310058 Zhejiang, P. R. China
|
372
|
Ye Z, Yang W, Yang Y, Ouyang D. Interpretable machine learning methods for in vitro pharmaceutical formulation development. Food Frontiers 2021. [DOI: 10.1002/fft2.78] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Indexed: 12/21/2022] Open
Affiliation(s)
- Zhuyifan Ye
  - State Key Laboratory of Quality Research in Chinese Medicine, Institute of Chinese Medical Sciences (ICMS), University of Macau, Macau, China
- Wenmian Yang
  - State Key Laboratory of Internet of Things for Smart City, University of Macau, Macau, China
- Yilong Yang
  - School of Software, Beihang University, Beijing, China
- Defang Ouyang
  - State Key Laboratory of Quality Research in Chinese Medicine, Institute of Chinese Medical Sciences (ICMS), University of Macau, Macau, China
|
373
|
Zhang XC, Wu CK, Yang ZJ, Wu ZX, Yi JC, Hsieh CY, Hou TJ, Cao DS. MG-BERT: leveraging unsupervised atomic representation learning for molecular property prediction. Brief Bioinform 2021; 22:6265201. [PMID: 33951729 DOI: 10.1093/bib/bbab152] [Citation(s) in RCA: 53] [Impact Index Per Article: 13.3] [Received: 01/27/2021] [Revised: 03/11/2021] [Accepted: 04/01/2021] [Indexed: 11/12/2022] Open
Abstract
MOTIVATION Accurate and efficient prediction of molecular properties is one of the fundamental issues in drug design and discovery pipelines. Traditional feature engineering-based approaches require extensive expertise in the feature design and selection process. With the development of artificial intelligence (AI) technologies, data-driven methods exhibit unparalleled advantages over the feature engineering-based methods in various domains. Nevertheless, when applied to molecular property prediction, AI models usually suffer from the scarcity of labeled data and show poor generalization ability. RESULTS In this study, we proposed molecular graph BERT (MG-BERT), which integrates the local message passing mechanism of graph neural networks (GNNs) into the powerful BERT model to facilitate learning from molecular graphs. Furthermore, an effective self-supervised learning strategy named masked atoms prediction was proposed to pretrain the MG-BERT model on a large amount of unlabeled data to mine context information in molecules. We found the MG-BERT model can generate context-sensitive atomic representations after pretraining and transfer the learned knowledge to the prediction of a variety of molecular properties. The experimental results show that the pretrained MG-BERT model with a little extra fine-tuning can consistently outperform the state-of-the-art methods on all 11 ADMET datasets. Moreover, the MG-BERT model leverages attention mechanisms to focus on atomic features essential to the target property, providing excellent interpretability for the trained model. The MG-BERT model does not require any hand-crafted feature as input and is more reliable due to its excellent interpretability, providing a novel framework to develop state-of-the-art models for a wide range of drug discovery tasks.
Affiliation(s)
- Xiao-Chen Zhang
  - State Key Laboratory of High-Performance Computing, School of Computer Science, National University of Defense Technology, China
- Cheng-Kun Wu
  - State Key Laboratory of High-Performance Computing, School of Computer Science, National University of Defense Technology, China
- Zhi-Jiang Yang
  - Xiangya School of Pharmaceutical Sciences, Central South University, China
- Zhen-Xing Wu
  - College of Pharmaceutical Sciences, Zhejiang University, China
- Jia-Cai Yi
  - State Key Laboratory of High-Performance Computing, School of Computer Science, National University of Defense Technology, China
- Chang-Yu Hsieh
  - Tencent Quantum Laboratory since 2018. He received his PhD degree in Physics from the University of Ottawa in 2012 and worked as a postdoctoral researcher at the University of Toronto (2012-2013) and Massachusetts Institute of Technology (2013-2016), respectively. Before joining Tencent, he worked as a senior researcher at Singapore-MIT Alliance for Science and Technology (2017-2018)
- Ting-Jun Hou
  - College of Pharmaceutical Sciences, Zhejiang University, China
- Dong-Sheng Cao
  - Xiangya School of Pharmaceutical Sciences, Central South University, China
|
374
|
Li P, Wang J, Qiao Y, Chen H, Yu Y, Yao X, Gao P, Xie G, Song S. An effective self-supervised framework for learning expressive molecular global representations to drug discovery. Brief Bioinform 2021; 22:6262238. [PMID: 33940598 DOI: 10.1093/bib/bbab109] [Citation(s) in RCA: 44] [Impact Index Per Article: 11.0] [Received: 02/10/2021] [Revised: 03/06/2021] [Accepted: 03/12/2021] [Indexed: 11/13/2022] Open
Abstract
How to produce expressive molecular representations is a fundamental challenge in artificial intelligence-driven drug discovery. Graph neural network (GNN) has emerged as a powerful technique for modeling molecular data. However, previous supervised approaches usually suffer from the scarcity of labeled data and poor generalization capability. Here, we propose a novel molecular pre-training graph-based deep learning framework, named MPG, that learns molecular representations from large-scale unlabeled molecules. In MPG, we proposed a powerful GNN for modelling molecular graph named MolGNet, and designed an effective self-supervised strategy for pre-training the model at both the node and graph-level. After pre-training on 11 million unlabeled molecules, we revealed that MolGNet can capture valuable chemical insights to produce interpretable representation. The pre-trained MolGNet can be fine-tuned with just one additional output layer to create state-of-the-art models for a wide range of drug discovery tasks, including molecular properties prediction, drug-drug interaction and drug-target interaction, on 14 benchmark datasets. The pre-trained MolGNet in MPG has the potential to become an advanced molecular encoder in the drug discovery pipeline.
Affiliation(s)
- Pengyong Li
- Department of Biomedical Engineering at Tsinghua University, China
| | - Jun Wang
- Ping An Healthcare Technology, Chaoyang, 100027 Beijing, China
| | - Yixuan Qiao
- Operations Research and Cybernetics at Beijing University of Technology, China
| | - Hao Chen
- Cybernetics at Beijing University of Technology, China
| | - Yihuan Yu
- Beijing University of Biomedical Engineering, China
| | - Xiaojun Yao
- Analytical Chemistry and Chemoinformatics at Lanzhou University, China
| | - Peng Gao
- Ping An Healthcare Technology, Chaoyang, 100027 Beijing, China
| | - Guotong Xie
- Ping An Healthcare Technology, Chaoyang, 100027 Beijing, China
| | - Sen Song
- Tsinghua Laboratory of Brain and Intelligence and Department of Biomedical Engineering, Tsinghua University, Haidian, 100084 Beijing, China
|
375
|
Vaškevičius M, Kapočiūtė-Dzikienė J, Šlepikas L. Prediction of Chromatography Conditions for Purification in Organic Synthesis Using Deep Learning. Molecules 2021; 26:2474. [PMID: 33922736 PMCID: PMC8123027 DOI: 10.3390/molecules26092474] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 03/17/2021] [Revised: 04/15/2021] [Accepted: 04/22/2021] [Indexed: 01/27/2023] Open
Abstract
In this research, a process for developing normal-phase liquid chromatography solvent systems has been proposed. In contrast to the development of conditions via thin-layer chromatography (TLC), this process is based on the architecture of two hierarchically connected neural network-based components. Using a large database of reaction procedures allows those two components to perform an essential role in the machine-learning-based prediction of chromatographic purification conditions, i.e., solvents and the ratio between solvents. In our paper, we build two datasets and test various molecular vectorization approaches, such as extended-connectivity fingerprints, learned embedding, and auto-encoders along with different types of deep neural networks to demonstrate a novel method for modeling chromatographic solvent systems employing two neural networks in sequence. Afterward, we present our findings and provide insights on the most effective methods for solving prediction tasks. Our approach results in a system of two neural networks with long short-term memory (LSTM)-based auto-encoders, where the first predicts solvent labels (by reaching the classification accuracy of 0.950 ± 0.001) and in the case of two solvents, the second one predicts the ratio between two solvents (R^2 metric equal to 0.982 ± 0.001). Our approach can be used as a guidance instrument in laboratories to accelerate scouting for suitable chromatography conditions.
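The two-stage flow described in this abstract (a first model predicts the solvent labels, and a second model is consulted for the ratio only when two solvents are predicted) might be sketched as follows. This is a minimal illustration under stated assumptions: `predict_conditions`, `toy_classifier` and `toy_regressor` are hypothetical names, and the toy functions are placeholders standing in for the LSTM auto-encoder networks of the paper, not a reproduction of them.

```python
from typing import Callable, List, Optional, Tuple

# Hypothetical stage signatures; the real stages are trained neural networks.
SolventClassifier = Callable[[str], List[str]]
RatioRegressor = Callable[[str, Tuple[str, str]], float]


def predict_conditions(
    smiles: str,
    classify: SolventClassifier,
    regress: RatioRegressor,
) -> Tuple[List[str], Optional[float]]:
    """Run stage 1 (solvent labels), then stage 2 (ratio) only for solvent pairs."""
    solvents = classify(smiles)
    ratio: Optional[float] = None
    if len(solvents) == 2:
        # The second model is only meaningful when exactly two solvents are predicted.
        ratio = regress(smiles, (solvents[0], solvents[1]))
    return solvents, ratio


def toy_classifier(smiles: str) -> List[str]:
    """Placeholder rule (assumption): oxygen-containing inputs get a two-solvent system."""
    return ["hexane", "ethyl acetate"] if "O" in smiles else ["dichloromethane"]


def toy_regressor(smiles: str, pair: Tuple[str, str]) -> float:
    """Placeholder (assumption): always returns a fixed fraction of the first solvent."""
    return 0.8


if __name__ == "__main__":
    print(predict_conditions("CCO", toy_classifier, toy_regressor))
    print(predict_conditions("CC", toy_classifier, toy_regressor))
```

Keeping the two stages as separate callables mirrors the sequential design named in the abstract: the ratio model never has to handle single-solvent cases.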
Affiliation(s)
- Mantas Vaškevičius
- Department of Applied Informatics, Vytautas Magnus University, LT-44404 Kaunas, Lithuania;
- JSC Synhet, Biržų Str. 6, LT-44139 Kaunas, Lithuania;
|
376
|
Wu Z, Jiang D, Hsieh CY, Chen G, Liao B, Cao D, Hou T. Hyperbolic relational graph convolution networks plus: a simple but highly efficient QSAR-modeling method. Brief Bioinform 2021; 22:6235968. [PMID: 33866354 DOI: 10.1093/bib/bbab112] [Citation(s) in RCA: 27] [Impact Index Per Article: 6.8] [Received: 01/12/2021] [Revised: 03/11/2021] [Accepted: 03/12/2021] [Indexed: 01/04/2023] Open
Abstract
Accurate predictions of druggability and bioactivities of compounds are desirable to reduce the high cost and time of drug discovery. After more than five decades of continuing developments, quantitative structure-activity relationship (QSAR) methods have been established as indispensable tools that facilitate fast, reliable and affordable assessments of physicochemical and biological properties of compounds in drug-discovery programs. Currently, there are mainly two types of QSAR methods: descriptor-based methods and graph-based methods. The former are developed based on predefined molecular descriptors, whereas the latter are developed based on simple atomic and bond information. In this study, we presented a simple but highly efficient modeling method by combining molecular graphs and molecular descriptors as the input of a modified graph neural network, called hyperbolic relational graph convolution network plus (HRGCN+). The evaluation results show that HRGCN+ achieves state-of-the-art performance on 11 drug-discovery-related datasets. We also explored the impact of the addition of traditional molecular descriptors on the predictions of graph-based methods, and found that the addition of molecular descriptors can indeed boost the predictive power of graph-based methods. The results also highlight the strong anti-noise capability of our method. In addition, our method provides a way to interpret models at both the atom and descriptor levels, which can help medicinal chemists extract hidden information from complex datasets. We also offer an online prediction service for HRGCN+ at https://quantum.tencent.com/hrgcn/.
Affiliation(s)
- Zhenxing Wu
- College of Pharmaceutical Sciences, Zhejiang University, under the supervision of Prof. Tingjun Hou
| | - Dejun Jiang
- College of Pharmaceutical Sciences, Zhejiang University, under the supervision of Prof. Tingjun Hou
| | | | - Guangyong Chen
- Shenzhen Institute of Advanced Technology Chinese Academy of Sciences
| | - Ben Liao
- demonstrated history of working in industry and academia. Skilled in machine learning, mathematics, natural language processing, computer vision and graph neural networks. Strong education professional with a PhD from Université de Paris in France
| | - Dongsheng Cao
- Xiangya School of Pharmaceutical Sciences, Central South University
| | - Tingjun Hou
- College of Pharmaceutical Sciences, Zhejiang University
|
377
|
Zhi HY, Zhao L, Lee CC, Chen CYC. A Novel Graph Neural Network Methodology to Investigate Dihydroorotate Dehydrogenase Inhibitors in Small Cell Lung Cancer. Biomolecules 2021; 11:biom11030477. [PMID: 33806898 PMCID: PMC8005042 DOI: 10.3390/biom11030477] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 01/29/2021] [Revised: 02/26/2021] [Accepted: 03/16/2021] [Indexed: 12/17/2022] Open
Abstract
Small cell lung cancer (SCLC) is a particularly aggressive tumor subtype, and dihydroorotate dehydrogenase (DHODH) has been demonstrated to be a therapeutic target for SCLC. Network pharmacology analysis and virtual screening were utilized to identify related proteins and investigate candidates with high docking capacity to multiple targets. Graph neural networks (GNNs) and machine learning were used to build reliable predictive models. We proposed a novel concept of multi-GNNs, and then built three multi-GNN models called GIAN, GIAT, and SGCA, which achieved satisfactory results on our dataset containing 532 molecules, with all R^2 values greater than 0.92 on the training set and higher than 0.8 on the test set. Compared with the machine learning algorithms random forest (RF) and support vector regression (SVR), multi-GNNs had a better modeling effect and higher precision. Furthermore, a long 300 ns molecular dynamics simulation verified the stability of the protein–ligand complexes. The results showed that ZINC8577218, ZINC95618747, and ZINC4261765 might be potent inhibitors of DHODH. Multi-GNNs show great performance in practice, making them a promising field for future research. We therefore suggest that this novel concept of multi-GNNs is a promising protocol for drug discovery.
Affiliation(s)
- Hong-Yi Zhi
- Artificial Intelligence Medical Center, School of Intelligent Systems Engineering, Sun Yat-sen University, Shenzhen 510275, China; (H.-Y.Z.); (L.Z.)
| | - Lu Zhao
- Artificial Intelligence Medical Center, School of Intelligent Systems Engineering, Sun Yat-sen University, Shenzhen 510275, China; (H.-Y.Z.); (L.Z.)
- Department of Clinical Laboratory, The Sixth Affiliated Hospital, Sun Yat-sen University, Guangzhou 510655, China
| | - Cheng-Chun Lee
- Department of Medical Research, China Medical University Hospital, Taichung 40447, Taiwan;
| | - Calvin Yu-Chian Chen
- Artificial Intelligence Medical Center, School of Intelligent Systems Engineering, Sun Yat-sen University, Shenzhen 510275, China; (H.-Y.Z.); (L.Z.)
- Department of Medical Research, China Medical University Hospital, Taichung 40447, Taiwan;
- Department of Bioinformatics and Medical Engineering, Asia University, Taichung 41354, Taiwan
|
378
|
Ding J, Xu N, Nguyen MT, Qiao Q, Shi Y, He Y, Shao Q. Machine learning for molecular thermodynamics. Chin J Chem Eng 2021. [DOI: 10.1016/j.cjche.2020.10.044] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Indexed: 12/12/2022]
|
379
|
Shen WX, Zeng X, Zhu F, Wang YL, Qin C, Tan Y, Jiang YY, Chen YZ. Out-of-the-box deep learning prediction of pharmaceutical properties by broadly learned knowledge-based molecular representations. NAT MACH INTELL 2021. [DOI: 10.1038/s42256-021-00301-6] [Citation(s) in RCA: 26] [Impact Index Per Article: 6.5] [Indexed: 01/04/2023]
|
380
|
Jiang D, Wu Z, Hsieh CY, Chen G, Liao B, Wang Z, Shen C, Cao D, Wu J, Hou T. Could graph neural networks learn better molecular representation for drug discovery? A comparison study of descriptor-based and graph-based models. J Cheminform 2021; 13:12. [PMID: 33597034 PMCID: PMC7888189 DOI: 10.1186/s13321-020-00479-8] [Citation(s) in RCA: 203] [Impact Index Per Article: 50.8] [Received: 09/21/2020] [Accepted: 11/26/2020] [Indexed: 12/31/2022] Open
Abstract
Graph neural networks (GNNs) have been considered an attractive modelling method for molecular property prediction, and numerous studies have shown that GNNs could yield more promising results than traditional descriptor-based methods. In this study, based on 11 public datasets covering various property endpoints, the predictive capacity and computational efficiency of the prediction models developed by eight machine learning (ML) algorithms, including four descriptor-based models (SVM, XGBoost, RF and DNN) and four graph-based models (GCN, GAT, MPNN and Attentive FP), were extensively tested and compared. The results demonstrate that on average the descriptor-based models outperform the graph-based models in terms of prediction accuracy and computational efficiency. SVM generally achieves the best predictions for the regression tasks. Both RF and XGBoost can achieve reliable predictions for the classification tasks, and some of the graph-based models, such as Attentive FP and GCN, can yield outstanding performance for a fraction of larger or multi-task datasets. In terms of computational cost, XGBoost and RF are the two most efficient algorithms and only need a few seconds to train a model even for a large dataset. The model interpretations by the SHAP method can effectively explore the established domain knowledge for the descriptor-based models. Finally, we explored the use of these models for virtual screening (VS) towards HIV and demonstrated that different ML algorithms offer diverse VS profiles. All in all, we believe that the off-the-shelf descriptor-based models can still be directly employed to accurately predict various chemical endpoints with excellent computability and interpretability.
Affiliation(s)
- Dejun Jiang
- Innovation Institute for Artificial Intelligence in Medicine of Zhejiang University, College of Pharmaceutical Sciences, Zhejiang University, Hangzhou, 310058, Zhejiang, China.,State Key Lab of CAD & CG, Zhejiang University, Hangzhou, 310058, Zhejiang, China.,College of Computer Science and Technology, Zhejiang University, Hangzhou, China
| | - Zhenxing Wu
- Innovation Institute for Artificial Intelligence in Medicine of Zhejiang University, College of Pharmaceutical Sciences, Zhejiang University, Hangzhou, 310058, Zhejiang, China
| | - Chang-Yu Hsieh
- Tencent Quantum Laboratory Tencent, Shenzhen, 518057, Guangdong, China
| | - Guangyong Chen
- Shenzhen Institutes of Advanced Technology, Shenzhen, 518055, Guangdong, China
| | - Ben Liao
- Tencent Quantum Laboratory Tencent, Shenzhen, 518057, Guangdong, China
| | - Zhe Wang
- Innovation Institute for Artificial Intelligence in Medicine of Zhejiang University, College of Pharmaceutical Sciences, Zhejiang University, Hangzhou, 310058, Zhejiang, China
| | - Chao Shen
- Innovation Institute for Artificial Intelligence in Medicine of Zhejiang University, College of Pharmaceutical Sciences, Zhejiang University, Hangzhou, 310058, Zhejiang, China
| | - Dongsheng Cao
- Xiangya School of Pharmaceutical Sciences, Central South University, Changsha, 410004, Hunan, China.
| | - Jian Wu
- College of Computer Science and Technology, Zhejiang University, Hangzhou, China.
| | - Tingjun Hou
- Innovation Institute for Artificial Intelligence in Medicine of Zhejiang University, College of Pharmaceutical Sciences, Zhejiang University, Hangzhou, 310058, Zhejiang, China.,State Key Lab of CAD & CG, Zhejiang University, Hangzhou, 310058, Zhejiang, China.
|
381
|
Graph neural networks for automated de novo drug design. Drug Discov Today 2021; 26:1382-1393. [PMID: 33609779 DOI: 10.1016/j.drudis.2021.02.011] [Citation(s) in RCA: 50] [Impact Index Per Article: 12.5] [Received: 11/26/2020] [Revised: 01/27/2021] [Accepted: 02/11/2021] [Indexed: 01/10/2023]
Abstract
The goal of de novo drug design is to create novel chemical entities with desired biological activities and pharmacokinetics (PK) properties. Over recent years, with the development of artificial intelligence (AI) technologies, data-driven methods have rapidly gained in popularity in this field. Among them, graph neural networks (GNNs), a type of neural network directly operating on the graph structure data, have received extensive attention. In this review, we introduce the applications of GNNs in de novo drug design from three aspects: molecule scoring, molecule generation and optimization, and synthesis planning. Furthermore, we also discuss the current challenges and future directions of GNNs in de novo drug design.
|
382
|
Kim H, Kim E, Lee I, Bae B, Park M, Nam H. Artificial Intelligence in Drug Discovery: A Comprehensive Review of Data-driven and Machine Learning Approaches. BIOTECHNOL BIOPROC E 2021; 25:895-930. [PMID: 33437151 PMCID: PMC7790479 DOI: 10.1007/s12257-020-0049-y] [Citation(s) in RCA: 28] [Impact Index Per Article: 7.0] [Received: 02/13/2020] [Revised: 05/27/2020] [Accepted: 06/03/2020] [Indexed: 02/07/2023]
Abstract
As expenditure on drug development increases exponentially, the overall drug discovery process requires a sustainable revolution. Since artificial intelligence (AI) is leading the fourth industrial revolution, AI can be considered a viable solution for unstable drug research and development. Generally, AI is applied to fields with sufficient data such as computer vision and natural language processing, but there are many efforts to revolutionize the existing drug discovery process by applying AI. This review provides a comprehensive, organized summary of the recent research trends in the AI-guided drug discovery process, including target identification, hit identification, ADMET prediction, lead optimization, and drug repositioning. The main data sources in each field are also summarized in this review. In addition, an in-depth analysis of the remaining challenges and limitations is provided, along with proposals for promising future directions in each of the aforementioned areas.
Affiliation(s)
- Hyunho Kim
- School of Electrical Engineering and Computer Science, Gwangju Institute of Science and Technology (GIST), Gwangju, 61005 Korea
| | - Eunyoung Kim
- School of Electrical Engineering and Computer Science, Gwangju Institute of Science and Technology (GIST), Gwangju, 61005 Korea
| | - Ingoo Lee
- School of Electrical Engineering and Computer Science, Gwangju Institute of Science and Technology (GIST), Gwangju, 61005 Korea
| | - Bongsung Bae
- School of Electrical Engineering and Computer Science, Gwangju Institute of Science and Technology (GIST), Gwangju, 61005 Korea
| | - Minsu Park
- School of Electrical Engineering and Computer Science, Gwangju Institute of Science and Technology (GIST), Gwangju, 61005 Korea
| | - Hojung Nam
- School of Electrical Engineering and Computer Science, Gwangju Institute of Science and Technology (GIST), Gwangju, 61005 Korea
|
383
|
Wen M, Blau SM, Spotte-Smith EWC, Dwaraknath S, Persson KA. BonDNet: a graph neural network for the prediction of bond dissociation energies for charged molecules. Chem Sci 2020; 12:1858-1868. [PMID: 34163950 PMCID: PMC8179073 DOI: 10.1039/d0sc05251e] [Citation(s) in RCA: 34] [Impact Index Per Article: 6.8] [Received: 09/20/2020] [Accepted: 12/03/2020] [Indexed: 12/13/2022] Open
Abstract
A broad collection of technologies, including e.g. drug metabolism, biofuel combustion, photochemical decontamination of water, and interfacial passivation in energy production/storage systems rely on chemical processes that involve bond-breaking molecular reactions. In this context, a fundamental thermodynamic property of interest is the bond dissociation energy (BDE) which measures the strength of a chemical bond. Fast and accurate prediction of BDEs for arbitrary molecules would lay the groundwork for data-driven projections of complex reaction cascades and hence a deeper understanding of these critical chemical processes and, ultimately, how to reverse design them. In this paper, we propose a chemically inspired graph neural network machine learning model, BonDNet, for the rapid and accurate prediction of BDEs. BonDNet maps the difference between the molecular representations of the reactants and products to the reaction BDE. Because of the use of this difference representation and the introduction of global features, including molecular charge, it is the first machine learning model capable of predicting both homolytic and heterolytic BDEs for molecules of any charge. To test the model, we have constructed a dataset of both homolytic and heterolytic BDEs for neutral and charged (-1 and +1) molecules. BonDNet achieves a mean absolute error (MAE) of 0.022 eV for unseen test data, significantly below chemical accuracy (0.043 eV). Besides the ability to handle complex bond dissociation reactions that no previous model could consider, BonDNet distinguishes itself even in only predicting homolytic BDEs for neutral molecules; it achieves an MAE of 0.020 eV on the PubChem BDE dataset, a 20% improvement over the previous best performing model. 
We gain additional insight into the model's predictions by analyzing the patterns in the features representing the molecules and the bond dissociation reactions, which are qualitatively consistent with chemical rules and intuition. BonDNet is just one application of our general approach to representing and learning chemical reactivity, and it could be easily extended to the prediction of other reaction properties in the future.
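The difference representation mentioned in this abstract (products minus reactants, mapped to a reaction energy) might be sketched as follows. This is a minimal illustration only: the feature vectors are given by hand and the readout is a plain linear map, both of which are assumptions made for the example, whereas the paper learns its molecular representations with a graph neural network. All function names here are hypothetical.

```python
from typing import List, Sequence

Vector = List[float]


def sum_vectors(vectors: Sequence[Vector]) -> Vector:
    """Element-wise sum of equally sized feature vectors."""
    return [sum(components) for components in zip(*vectors)]


def reaction_difference(reactants: Sequence[Vector], products: Sequence[Vector]) -> Vector:
    """Difference representation: summed product features minus summed reactant features."""
    reactant_total = sum_vectors(reactants)
    product_total = sum_vectors(products)
    return [p - r for p, r in zip(product_total, reactant_total)]


def predict_energy(diff: Vector, weights: Vector, bias: float) -> float:
    """Hypothetical linear readout standing in for the learned mapping to a BDE."""
    return sum(w * d for w, d in zip(weights, diff)) + bias


if __name__ == "__main__":
    # One reactant molecule breaking into two product fragments (toy 2-d features).
    reactants = [[1.0, 2.0]]
    products = [[0.5, 1.0], [1.0, 0.5]]
    diff = reaction_difference(reactants, products)
    print(diff)
    print(predict_energy(diff, weights=[2.0, 1.0], bias=0.0))
```

Because the input is a difference over the whole reaction rather than a property of a single bond, the same function shape applies whether the fragments are neutral or charged, provided the features encode charge. For scale, the chemical accuracy threshold of 0.043 eV quoted in the abstract corresponds to about 1 kcal/mol.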
Affiliation(s)
- Mingjian Wen
- Department of Materials Science and Engineering, University of California Berkeley CA 94720 USA
- Energy Technologies Area, Lawrence Berkeley National Laboratory Berkeley CA 94720 USA
| | - Samuel M Blau
- Energy Technologies Area, Lawrence Berkeley National Laboratory Berkeley CA 94720 USA
| | - Evan Walter Clark Spotte-Smith
- Department of Materials Science and Engineering, University of California Berkeley CA 94720 USA
- Energy Technologies Area, Lawrence Berkeley National Laboratory Berkeley CA 94720 USA
| | - Shyam Dwaraknath
- Energy Technologies Area, Lawrence Berkeley National Laboratory Berkeley CA 94720 USA
| | - Kristin A Persson
- Department of Materials Science and Engineering, University of California Berkeley CA 94720 USA
- Molecular Foundry, Lawrence Berkeley National Laboratory Berkeley CA 94720 USA
|
384
|
Wang X, Liu M, Zhang L, Wang Y, Li Y, Lu T. Optimizing Pharmacokinetic Property Prediction Based on Integrated Datasets and a Deep Learning Approach. J Chem Inf Model 2020; 60:4603-4613. [PMID: 32804486 DOI: 10.1021/acs.jcim.0c00568] [Citation(s) in RCA: 17] [Impact Index Per Article: 3.4] [Indexed: 12/12/2022]
Abstract
Oral bioavailability (OBA)-related pharmacokinetic properties, such as aqueous solubility, lipophilicity, and intestinal membrane permeability, play a significant role in drug discovery. However, their measurement is usually costly and time-consuming. Therefore, prediction models based on diverse approaches have been established in recent decades. Computational prediction of molecular properties has become an important step in drug discovery, aiming to identify potential drug-like candidates and reduce costs. However, limitations related to dataset capacity and algorithm adaptation still place restrictions on the applicability of the related models. In this study, we considered both dataset and algorithm optimization to address the challenge of predicting OBA-related molecular properties. Benchmark datasets of aqueous solubility (log S), lipophilicity (log D), and membrane permeability measured using the Caco-2 cell line (log Papp) were constructed by merging and calibrating experimental data from diverse articles and databases. Then, a novel molecular property prediction model, called a multiembedding-based synthetic network (MESN), was generated by applying a deep learning algorithm based on the synthesis of multiple types of molecular embeddings. MESN achieves performance improvements over other state-of-the-art methods for the prediction of aqueous solubility, lipophilicity, and membrane permeability. Results were also obtained using several other algorithms and independent validation datasets as a control study. Moreover, a dimension reduction analysis (based on t-distributed stochastic neighbor embedding, t-SNE) and an atomic feature similarity analysis showed that the molecular embeddings extracted from the MESN model exhibit good clustering and diversity. 
Overall, considering the fundamental role of the data and the superior prediction performance of the model, we highlight the applicability of MESN on benchmark datasets for further utility in drug discovery-related molecular property prediction.
Affiliation(s)
- Xiting Wang
- Life Science School, Beijing University of Chinese Medicine, Beijing 100029, China
| | - Meng Liu
- Chinese Medicine School, Beijing University of Chinese Medicine, Beijing 100029, China
| | - Lan Zhang
- Chinese Medicine School, Beijing University of Chinese Medicine, Beijing 100029, China
| | - Yun Wang
- School of Chinese Pharmacy, Beijing University of Chinese Medicine, Beijing 100029, China
| | - Yu Li
- Chinese Medicine School, Beijing University of Chinese Medicine, Beijing 100029, China
| | - Tao Lu
- Life Science School, Beijing University of Chinese Medicine, Beijing 100029, China
|
385
|
Wang J, Cao D, Tang C, Xu L, He Q, Yang B, Chen X, Sun H, Hou T. DeepAtomicCharge: a new graph convolutional network-based architecture for accurate prediction of atomic charges. Brief Bioinform 2020; 22:6278692. [PMID: 34020543 DOI: 10.1093/bib/bbaa183] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.8] [Received: 06/18/2020] [Revised: 07/06/2020] [Accepted: 07/15/2020] [Indexed: 01/18/2023] Open
Abstract
Atomic charges play a very important role in drug-target recognition. However, computation of atomic charges with high-level quantum mechanics (QM) calculations is very time-consuming. A number of machine learning (ML)-based atomic charge prediction methods have been proposed to speed up the calculation of high-accuracy atomic charges in recent years. However, most of them used a set of predefined molecular properties, such as molecular fingerprints, for model construction, which is knowledge-dependent and may lead to biased predictions due to the representation preference of different molecular properties used for training. To solve the problem, we present a new architecture based on graph convolutional network (GCN) and develop a high-accuracy atomic charge prediction model named DeepAtomicCharge. The new GCN architecture is designed with only the atomic properties and the connection information between the atoms in molecules and can dynamically learn and convert molecules into appropriate atomic features without any prior knowledge of the molecules. Using the designed GCN architecture, substantial improvement is achieved for the prediction accuracy of atomic charges. The average root-mean-square error (RMSE) of DeepAtomicCharge is 0.0121 e, which is obviously more accurate than that (0.0180 e) reported by the previous benchmark study on the same two external test sets. Moreover, the new GCN architecture needs much lower storage space compared with other methods, and the predicted DDEC atomic charges can be efficiently used in large-scale structure-based drug design, thus opening a new avenue for high-performance atomic charge prediction and application.
Affiliation(s)
- Jike Wang
- School of Computer Science, Wuhan University, Wuhan 430072, Hubei, P. R. China.,Innovation Institute for Artificial Intelligence in Medicine of Zhejiang University, College of Pharmaceutical Sciences, Zhejiang University, Hangzhou 310058, Zhejiang, P. R. China
| | - Dongsheng Cao
- Xiangya School of Pharmaceutical Sciences, Central South University, Changsha 410004, Hunan, P. R. China
| | - Cunchen Tang
- School of Computer Science, Wuhan University, Wuhan 430072, Hubei, P. R. China.,National Engineering Research Center for Multimedia Software, School of Computer Science, Wuhan University, Wuhan 430072, Hubei, P. R. China.,Artificial Intelligence Institute, School of Computer Science, Wuhan University, Wuhan 430072, Hubei, P. R. China
| | - Lei Xu
- Institute of Bioinformatics and Medical Engineering, School of Electrical and Information Engineering, Jiangsu University of Technology, Changzhou 213001, Jiangsu, P. R. China
| | - Qiaojun He
- Innovation Institute for Artificial Intelligence in Medicine of Zhejiang University, College of Pharmaceutical Sciences, Zhejiang University, Hangzhou 310058, Zhejiang, P. R. China
| | - Bo Yang
- Innovation Institute for Artificial Intelligence in Medicine of Zhejiang University, College of Pharmaceutical Sciences, Zhejiang University, Hangzhou 310058, Zhejiang, P. R. China
| | - Xi Chen
- School of Computer Science, Wuhan University, Wuhan 430072, Hubei, P. R. China.,National Engineering Research Center for Multimedia Software, School of Computer Science, Wuhan University, Wuhan 430072, Hubei, P. R. China.,Artificial Intelligence Institute, School of Computer Science, Wuhan University, Wuhan 430072, Hubei, P. R. China
| | - Huiyong Sun
- Department of Medicinal Chemistry, China Pharmaceutical University, Nanjing 210009, Jiangsu, P.R. China
| | - Tingjun Hou
- Innovation Institute for Artificial Intelligence in Medicine of Zhejiang University, College of Pharmaceutical Sciences, Zhejiang University, Hangzhou 310058, Zhejiang, P. R. China
|
386
|
Cai C, Wang S, Xu Y, Zhang W, Tang K, Ouyang Q, Lai L, Pei J. Transfer Learning for Drug Discovery. J Med Chem 2020; 63:8683-8694. [PMID: 32672961 DOI: 10.1021/acs.jmedchem.9b02147] [Citation(s) in RCA: 155] [Impact Index Per Article: 31.0] [Indexed: 01/05/2023]
Abstract
The data sets available to train models for in silico drug discovery efforts are often small. Indeed, the sparse availability of labeled data is a major barrier to artificial-intelligence-assisted drug discovery. One solution to this problem is to develop algorithms that can cope with relatively heterogeneous and scarce data. Transfer learning is a type of machine learning that can leverage existing, generalizable knowledge from other related tasks to enable learning of a separate task with a small set of data. Deep transfer learning is the most commonly used type of transfer learning in the field of drug discovery. This Perspective provides an overview of transfer learning and related applications to drug discovery to date. Furthermore, it provides outlooks on the future development of transfer learning for drug discovery.
Affiliation(s)
- Chenjing Cai
- Center for Quantitative Biology, Academy for Advanced Interdisciplinary Studies, Peking University, Beijing 100871, P. R. China
| | - Shiwei Wang
- PTN Graduate Program, Academy for Advanced Interdisciplinary Studies, Peking University, Beijing 100871, P. R. China
| | - Youjun Xu
- BNLMS and Peking-Tsinghua Center for Life Sciences at the College of Chemistry and Molecular Engineering, Peking University, Beijing, 100871, P. R. China
| | - Weilin Zhang
- Beijing Intelligent Pharma Technology Co., Ltd., Beijing 100083, P. R. China
| | - Ke Tang
- Department of Computer Science and Engineering, Southern University of Science and Technology, Shenzhen 518055, P. R. China
| | - Qi Ouyang
- Center for Quantitative Biology, Academy for Advanced Interdisciplinary Studies, Peking University, Beijing 100871, P. R. China.,The State Key Laboratory for Artificial Microstructures and Mesoscopic Physics, School of Physics, Peking University, Beijing 100871, P. R. China
| | - Luhua Lai
- Center for Quantitative Biology, Academy for Advanced Interdisciplinary Studies, Peking University, Beijing 100871, P. R. China.,BNLMS and Peking-Tsinghua Center for Life Sciences at the College of Chemistry and Molecular Engineering, Peking University, Beijing, 100871, P. R. China
| | - Jianfeng Pei
- Center for Quantitative Biology, Academy for Advanced Interdisciplinary Studies, Peking University, Beijing 100871, P. R. China
|
387
|
Wang Y, Hu J, Lai J, Li Y, Jin H, Zhang L, Zhang LR, Liu ZM. TF3P: Three-Dimensional Force Fields Fingerprint Learned by Deep Capsular Network. J Chem Inf Model 2020; 60:2754-2765. [PMID: 32392062 DOI: 10.1021/acs.jcim.0c00005] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.4] [Indexed: 01/19/2023]
Abstract
Molecular fingerprints are the workhorse in ligand-based drug discovery. In recent years, an increasing number of research papers have reported fascinating results on using deep neural networks to learn 2D molecular representations as fingerprints. It is anticipated that the integration of deep learning would also contribute to the prosperity of 3D fingerprints. Here, we introduce deep learning into 3D small molecule fingerprints for the first time, presenting a new one that we term the three-dimensional force fields fingerprint (TF3P). TF3P is learned by a deep capsular network whose training does not need labeled data sets for specific predictive tasks. TF3P can encode the 3D force field information of molecules and demonstrates a stronger ability to capture 3D structural changes, to recognize molecules alike in 3D but not in 2D, and to identify similar targets inaccessible by other 2D or 3D fingerprints based only on ligand similarity. Furthermore, TF3P is compatible with both statistical models (e.g., similarity ensemble approach) and machine learning models. Altogether, we report TF3P as a new 3D small molecule fingerprint with a promising future in ligand-based drug discovery. All code is written in Python and available at https://github.com/canisw/tf3p.
Affiliation(s)
- Yanxing Wang
- State Key Laboratory of Natural and Biomimetic Drugs, School of Pharmaceutical Sciences, Peking University, Beijing 100191, P. R. China
- Jianxing Hu
- State Key Laboratory of Natural and Biomimetic Drugs, School of Pharmaceutical Sciences, Peking University, Beijing 100191, P. R. China
- Junyong Lai
- State Key Laboratory of Natural and Biomimetic Drugs, School of Pharmaceutical Sciences, Peking University, Beijing 100191, P. R. China
- Yibo Li
- Academy for Advanced Interdisciplinary Studies, Peking University, Beijing, 100191, P. R. China
- Hongwei Jin
- State Key Laboratory of Natural and Biomimetic Drugs, School of Pharmaceutical Sciences, Peking University, Beijing 100191, P. R. China
- Lihe Zhang
- State Key Laboratory of Natural and Biomimetic Drugs, School of Pharmaceutical Sciences, Peking University, Beijing 100191, P. R. China
- Liang-Ren Zhang
- State Key Laboratory of Natural and Biomimetic Drugs, School of Pharmaceutical Sciences, Peking University, Beijing 100191, P. R. China
- Zhen-Ming Liu
- State Key Laboratory of Natural and Biomimetic Drugs, School of Pharmaceutical Sciences, Peking University, Beijing 100191, P. R. China
388
Matsuzaka Y, Uesawa Y. DeepSnap-Deep Learning Approach Predicts Progesterone Receptor Antagonist Activity With High Performance. Front Bioeng Biotechnol 2020; 7:485. [PMID: 32039185 PMCID: PMC6987043 DOI: 10.3389/fbioe.2019.00485] [Citation(s) in RCA: 12] [Impact Index Per Article: 2.4] [Received: 11/01/2019] [Accepted: 12/30/2019] [Indexed: 12/16/2022] Open
Abstract
The progesterone receptor (PR) is an important therapeutic target for many malignancies and endocrine disorders due to its role in controlling ovulation and pregnancy via the reproductive cycle. Therefore, the modulation of PR activity using its agonists and antagonists is receiving increasing interest as a novel treatment strategy. However, clinical trials using PR modulators have not yet yielded conclusive evidence. Recently, increasing evidence from several fields shows that the classification of chemical compounds, including agonists and antagonists, can be done with recent improvements in deep learning (DL) using deep neural networks. Therefore, we recently proposed a novel DL-based quantitative structure-activity relationship (QSAR) strategy using transfer learning to build prediction models for agonists and antagonists. By employing this novel approach, referred to as the DeepSnap-DL method, which uses images captured from three-dimensional (3D) chemical structures at multiple angles as input data for DL classification, we constructed prediction models of PR antagonists in this study. Here, the DeepSnap-DL method showed high-performance prediction of PR antagonists through optimization of some parameters and image adjustment from 3D structures. Furthermore, comparison of the prediction models from this approach with conventional machine learning (ML) methods indicated that the DeepSnap-DL method outperformed these ML methods. Therefore, the models constructed by DeepSnap-DL would be a powerful tool not only for the QSAR field in predicting physiological and agonist/antagonist activities, toxicity, and molecular binding, but also for identifying biological or pathological phenomena.
Affiliation(s)
- Yoshihiro Uesawa
- Department of Medical Molecular Informatics, Meiji Pharmaceutical University, Tokyo, Japan
389
David L, Arús-Pous J, Karlsson J, Engkvist O, Bjerrum EJ, Kogej T, Kriegl JM, Beck B, Chen H. Applications of Deep-Learning in Exploiting Large-Scale and Heterogeneous Compound Data in Industrial Pharmaceutical Research. Front Pharmacol 2019; 10:1303. [PMID: 31749705 PMCID: PMC6848277 DOI: 10.3389/fphar.2019.01303] [Citation(s) in RCA: 26] [Impact Index Per Article: 4.3] [Received: 08/07/2019] [Accepted: 10/14/2019] [Indexed: 12/21/2022] Open
Abstract
In recent years, the development of high-throughput screening (HTS) technologies and their establishment in an industrialized environment have given scientists the possibility to test millions of molecules and profile them against a multitude of biological targets in a short period of time, generating data at a much faster pace and with a higher quality than before. Besides the structure-activity data from traditional bioassays, more complex assays such as transcriptomics profiling or imaging have also been established as routine profiling experiments thanks to the advancement of Next Generation Sequencing or automated microscopy technologies. In industrial pharmaceutical research, these technologies are typically established in conjunction with automated platforms in order to enable efficient handling of screening collections of thousands to millions of compounds. To exploit the ever-growing amount of data generated by these approaches, computational techniques are constantly evolving. In this regard, artificial intelligence technologies such as deep learning and machine learning methods play a key role in the cheminformatics and bio-image analytics fields to address activity prediction, scaffold hopping, de novo molecule design, reaction/retrosynthesis predictions, or high content screening analysis. Herein we summarize the current state of analyzing large-scale compound data in industrial pharmaceutical research and describe the impact it has had on the drug discovery process over the last two decades, with a specific focus on deep-learning technologies.
Affiliation(s)
- Laurianne David
- Hit Discovery, Discovery Sciences, Biopharmaceutical R&D, AstraZeneca, Gothenburg, Sweden
- Department of Life Science Informatics, B-IT, Rheinische Friedrich-Wilhelms-Universität Bonn, Bonn, Germany
- Josep Arús-Pous
- Hit Discovery, Discovery Sciences, Biopharmaceutical R&D, AstraZeneca, Gothenburg, Sweden
- Department of Chemistry and Biochemistry, University of Bern, Bern, Switzerland
- Johan Karlsson
- Quantitative Biology, Discovery Sciences, Biopharmaceutical R&D, AstraZeneca, Gothenburg, Sweden
- Ola Engkvist
- Hit Discovery, Discovery Sciences, Biopharmaceutical R&D, AstraZeneca, Gothenburg, Sweden
- Esben Jannik Bjerrum
- Hit Discovery, Discovery Sciences, Biopharmaceutical R&D, AstraZeneca, Gothenburg, Sweden
- Thierry Kogej
- Hit Discovery, Discovery Sciences, Biopharmaceutical R&D, AstraZeneca, Gothenburg, Sweden
- Jan M. Kriegl
- Department of Medicinal Chemistry, Boehringer Ingelheim Pharma GmbH & Co. KG, Biberach an der Riss, Germany
- Bernd Beck
- Department of Medicinal Chemistry, Boehringer Ingelheim Pharma GmbH & Co. KG, Biberach an der Riss, Germany
- Hongming Chen
- Hit Discovery, Discovery Sciences, Biopharmaceutical R&D, AstraZeneca, Gothenburg, Sweden
- Chemistry and Chemical Biology Centre, Guangzhou Regenerative Medicine and Health – Guangdong Laboratory, Guangzhou, China