1
Gangwal A, Ansari A, Ahmad I, Azad AK, Wan Sulaiman WMA. Current strategies to address data scarcity in artificial intelligence-based drug discovery: A comprehensive review. Comput Biol Med 2024; 179:108734. [PMID: 38964243 DOI: 10.1016/j.compbiomed.2024.108734] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/07/2024] [Revised: 06/01/2024] [Accepted: 06/08/2024] [Indexed: 07/06/2024]
Abstract
Artificial intelligence (AI) has played a vital role in computer-aided drug design (CADD). This development has been further accelerated by the increasing use of machine learning (ML), mainly deep learning (DL), and by advances in computing hardware and software. As a result, initial doubts about the application of AI in drug discovery have been dispelled, leading to significant benefits in medicinal chemistry. At the same time, it is crucial to recognize that AI is still in its infancy and faces limitations that must be addressed to harness its full potential in drug discovery. Notable limitations include insufficient, unlabeled, and non-uniform data; the resemblance of some AI-generated molecules to existing molecules; the unavailability of adequate benchmarks; intellectual property rights (IPR) hurdles in data sharing; poor understanding of biology; a focus on proxy data and ligands; and the lack of holistic methods for representing input molecular structures without pre-processing (feature engineering). The major component of AI infrastructure is input data: besides a few other factors, most successes of AI-driven drug discovery depend on the quality and quantity of the data used to train and test AI algorithms. Data-hungry DL approaches may fail to live up to their promise without sufficient data. The current literature suggests a few methods that, to a certain extent, handle low-data settings effectively and improve the output of AI models in drug discovery. These include transfer learning (TL), active learning (AL), single or one-shot learning (OSL), multi-task learning (MTL), data augmentation (DA), and data synthesis (DS). A different method, federated learning (FL), enables proprietary data to be shared on a common platform to train ML models without compromising data privacy. In this review, we compare and discuss these methods, their recent applications, and their limitations in modeling small-molecule data to improve the output of AI methods in drug discovery. The article also summarizes some other novel methods for handling inadequate data.
Affiliation(s)
- Amit Gangwal
  - Department of Natural Product Chemistry, Shri Vile Parle Kelavani Mandal's Institute of Pharmacy, Dhule, 424001, Maharashtra, India
- Azim Ansari
  - Computer Aided Drug Design Center, Shri Vile Parle Kelavani Mandal's Institute of Pharmacy, Dhule, 424001, Maharashtra, India
- Iqrar Ahmad
  - Department of Pharmaceutical Chemistry, Prof. Ravindra Nikam College of Pharmacy, Gondur, Dhule, 424002, Maharashtra, India
- Abul Kalam Azad
  - Faculty of Pharmacy, University College of MAIWP International, Batu Caves, 68100, Kuala Lumpur, Malaysia
2
Guo Y, Gao Y, Song J. MolCFL: A personalized and privacy-preserving drug discovery framework based on generative clustered federated learning. J Biomed Inform 2024; 157:104712. [PMID: 39182631 DOI: 10.1016/j.jbi.2024.104712] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/05/2024] [Revised: 08/16/2024] [Accepted: 08/21/2024] [Indexed: 08/27/2024]
Abstract
In today's era of rapid development of large models, the traditional drug development process is undergoing a profound transformation. The vast demand for data and consumption of computational resources are making independent drug discovery increasingly difficult. By integrating federated learning technology into the drug discovery field, we have found a solution that both protects privacy and shares computational power. However, the differences in data held by various pharmaceutical institutions and the diversity in drug design objectives have exacerbated the issue of data heterogeneity, making traditional federated learning consensus models unable to meet the personalized needs of all parties. In this study, we introduce and evaluate an innovative drug discovery framework, MolCFL, which utilizes a multi-layer perceptron (MLP) as the generator and a graph convolutional network (GCN) as the discriminator in a generative adversarial network (GAN). By learning the graph structure of molecules, it generates new molecules in a highly personalized manner and then optimizes the learning process by clustering federated learning, grouping compound data with high similarity. MolCFL not only enhances the model's ability to protect privacy but also significantly improves the efficiency and personalization of molecular design. MolCFL exhibits superior performance when handling non-independently and identically distributed data compared to traditional models. Experimental results show that the framework demonstrates outstanding performance on two benchmark datasets, with the generated new molecules achieving over 90% in Uniqueness and close to 100% in Novelty. MolCFL not only improves the quality and efficiency of drug molecule design but also, through its highly customized clustered federated learning environment, promotes collaboration and specialization in the drug discovery process while ensuring data privacy. 
These features make MolCFL a powerful tool suitable for addressing the various challenges faced in the modern drug research and development field.
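The aggregation idea behind clustered federated learning can be sketched in a few lines. This is a minimal illustration, not MolCFL's implementation: the abstract does not specify how clients are clustered or how parameters are aggregated, so the size-weighted averaging (FedAvg-style), the function names `federated_average` and `clustered_average`, and the pre-assigned `cluster_ids` are all assumptions made for this example.

```python
from collections import defaultdict


def federated_average(client_weights, client_sizes):
    """Size-weighted average of client parameter vectors (FedAvg-style)."""
    total = sum(client_sizes)
    dim = len(client_weights[0])
    return [
        sum(w[i] * n for w, n in zip(client_weights, client_sizes)) / total
        for i in range(dim)
    ]


def clustered_average(client_weights, client_sizes, cluster_ids):
    """Average separately within each cluster, giving one model per cluster.

    cluster_ids is assumed to be given; a real system would derive it,
    for example from the similarity of the clients' compound data.
    """
    groups = defaultdict(list)
    for w, n, c in zip(client_weights, client_sizes, cluster_ids):
        groups[c].append((w, n))
    return {
        c: federated_average([w for w, _ in members], [n for _, n in members])
        for c, members in groups.items()
    }


# Hypothetical toy data: three clients, two parameters each.
weights = [[1.0, 2.0], [3.0, 4.0], [10.0, 10.0]]
sizes = [1, 3, 2]          # number of local training samples per client
clusters = [0, 0, 1]       # clients 0 and 1 hold similar data
models = clustered_average(weights, sizes, clusters)
```

Only parameter vectors leave each client, never the underlying molecules, and clients with dissimilar data end up with separate models instead of one consensus model.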
Affiliation(s)
- Yan Guo
  - Inner Mongolia University, College of Computer Science, Hohhot, 010000, China
- Yongqiang Gao
  - Inner Mongolia University, College of Computer Science, Hohhot, 010000, China
- Jiawei Song
  - Inner Mongolia University, College of Computer Science, Hohhot, 010000, China
3
Dorsey MA, Dsouza K, Ranganath D, Harris JS, Lane TR, Urbina F, Ekins S. Near-Term Quantum Classification Algorithms Applied to Antimalarial Drug Discovery. J Chem Inf Model 2024; 64:5922-5930. [PMID: 39013438 PMCID: PMC11338495 DOI: 10.1021/acs.jcim.4c00953] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 07/18/2024]
Abstract
Computational approaches are widely applied in drug discovery to explore properties related to bioactivity, physiochemistry, and toxicology. Over at least the last 20 years, the exploitation of machine learning on molecular data sets has been used to understand the structure-activity relationships that exist between biomolecules and druggable targets. More recently, these methods have also seen application for phenotypic screening data for neglected diseases such as tuberculosis and malaria. Herein, we apply machine learning to build quantum Quantitative Structure Activity Relationship models from antimalarial data sets. There is a continual need for new antimalarials to address drug resistance, and the readily available in vitro data sets could be utilized with newer machine learning approaches as these develop. Furthermore, quantum machine learning is a relatively new method that uses a quantum computer to perform the calculations. First, we present a classical-quantum hybrid computational approach by building a Latent Bernoulli Autoencoder machine learning model for compressing bit-vector descriptors to a size that can be adapted to quantum computers for classification tasks with limited loss of embedded information. Second, we apply our method for feature map compression to quantum classification algorithms, including a completely novel machine learning algorithm with no analogy in classical computers: the Quantum Fourier Transform Classifier. We apply both these approaches to build quantum machine learning models for small-molecule antimalarials with quantum simulation software and then benchmark these quantum models against classical machine learning approaches. While there are many challenges currently facing the development of reliable quantum computers, our results demonstrate that there is potential for the use of this technology in the field of drug discovery.
Affiliation(s)
- Matthew A. Dorsey
  - Chemical and Biomolecular Engineering, North Carolina State University, Raleigh, North Carolina 27606, United States
- Kelvin Dsouza
  - Electrical and Computer Engineering, North Carolina State University, Raleigh, North Carolina 27606, United States
- Dhruv Ranganath
  - Biomedical Engineering, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina 27514, United States
- Joshua S. Harris
  - Collaborations Pharmaceuticals, Inc., 840 Main Campus Drive, Lab 3510, Raleigh, NC 27606, USA
- Thomas R. Lane
  - Collaborations Pharmaceuticals, Inc., 840 Main Campus Drive, Lab 3510, Raleigh, NC 27606, USA
- Fabio Urbina
  - Collaborations Pharmaceuticals, Inc., 840 Main Campus Drive, Lab 3510, Raleigh, NC 27606, USA
- Sean Ekins
  - Collaborations Pharmaceuticals, Inc., 840 Main Campus Drive, Lab 3510, Raleigh, NC 27606, USA
4
Smaldone AM, Batista VS. Quantum-to-Classical Neural Network Transfer Learning Applied to Drug Toxicity Prediction. J Chem Theory Comput 2024; 20:4901-4908. [PMID: 38795030 DOI: 10.1021/acs.jctc.4c00432] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/27/2024]
Abstract
Toxicity is a roadblock that prevents an inordinate number of drugs from being used in potentially life-saving applications. Deep learning provides a promising solution to finding ideal drug candidates; however, the vastness of chemical space coupled with the underlying $O(n^3)$ matrix multiplication means these efforts quickly become computationally demanding. To remedy this, we present a hybrid quantum-classical neural network for predicting drug toxicity utilizing a quantum circuit design that mimics classical neural behavior by explicitly calculating matrix products with complexity $O(n^2)$. Leveraging the Hadamard test for efficient inner product estimation rather than the conventionally used swap test, we reduce the number of qubits by half and remove the need for quantum phase estimation. Directly computing matrix products quantum mechanically allows for learnable weights to be transferred from a quantum to a classical device for further training. We apply our framework to the Tox21 data set and show that it achieves commensurate predictive accuracy to the model's fully classical $O(n^3)$ analogue. Additionally, we demonstrate that the model continues to learn, without disruption, once transferred to a fully classical architecture. We believe that combining the quantum advantage of reduced complexity and the classical advantage of noise-free calculation will pave the way for more scalable machine learning models.
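The Hadamard-test estimate of an inner product mentioned in this abstract can be illustrated with a small classical simulation. This is a sketch under stated assumptions rather than the authors' circuit: it assumes real-valued vectors, an ancilla-controlled preparation of the state $(|0\rangle|a\rangle + |1\rangle|b\rangle)/\sqrt{2}$, and simulated measurement shots; the function names are hypothetical.

```python
import math
import random


def normalize(v):
    """Scale a real vector to unit length so it can act as a state vector."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def hadamard_test_p0(a, b):
    """Exact probability of measuring the ancilla in |0>.

    Starting from (|0>|a> + |1>|b>)/sqrt(2) and applying a Hadamard gate to
    the ancilla leaves amplitude (a + b)/2 on the |0> branch, so
    P(0) = ||a + b||^2 / 4 = (1 + <a|b>) / 2 for real unit vectors.
    """
    return sum((x + y) ** 2 for x, y in zip(a, b)) / 4.0


def estimate_inner_product(a, b, shots=20000, seed=0):
    """Estimate <a|b> from simulated ancilla measurements."""
    a, b = normalize(a), normalize(b)
    p0 = hadamard_test_p0(a, b)
    rng = random.Random(seed)
    zeros = sum(1 for _ in range(shots) if rng.random() < p0)
    return 2.0 * zeros / shots - 1.0


a = [1.0, 0.0]
b = [1.0, 1.0]
estimate = estimate_inner_product(a, b)  # should be close to 1/sqrt(2)
```

The statistical error of the estimate shrinks roughly as one over the square root of the number of shots, which is the usual trade-off when reading inner products out of a measurement.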
Affiliation(s)
- Anthony M Smaldone
  - Department of Chemistry, Yale University, New Haven 06511, Connecticut, United States
- Victor S Batista
  - Department of Chemistry, Yale University, New Haven 06511, Connecticut, United States
5
Yang Z, Huang T, Pan L, Wang J, Wang L, Ding J, Xiao J. QuanDB: a quantum chemical property database towards enhancing 3D molecular representation learning. J Cheminform 2024; 16:48. [PMID: 38685101 PMCID: PMC11059686 DOI: 10.1186/s13321-024-00843-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/06/2024] [Accepted: 04/24/2024] [Indexed: 05/02/2024] Open
Abstract
Previous studies have shown that the three-dimensional (3D) geometric and electronic structure of molecules play a crucial role in determining their key properties and intermolecular interactions. Therefore, it is necessary to establish a quantum chemical (QC) property database containing the most stable 3D geometric conformations and electronic structures of molecules. In this study, a high-quality QC property database, called QuanDB, was developed, which included structurally diverse molecular entities and featured a user-friendly interface. Currently, QuanDB contains 154,610 compounds sourced from public databases and scientific literature, with 10,125 scaffolds. The elemental composition comprises nine elements: H, C, O, N, P, S, F, Cl, and Br. For each molecule, QuanDB provides 53 global and 5 local QC properties and the most stable 3D conformation. These properties are divided into three categories: geometric structure, electronic structure, and thermodynamics. Geometric structure optimization and single point energy calculation at the theoretical level of B3LYP-D3(BJ)/6-311G(d)/SMD/water and B3LYP-D3(BJ)/def2-TZVP/SMD/water, respectively, were applied to ensure highly accurate calculations of QC properties, with the computational cost exceeding $10^7$ core-hours. QuanDB provides high-value geometric and electronic structure information for use in molecular representation models, which are critical for machine-learning-based molecular design, thereby contributing to a comprehensive description of the chemical compound space. As a new high-quality dataset for QC properties, QuanDB is expected to become a benchmark tool for the training and optimization of machine learning models, thus further advancing the development of novel drugs and materials. QuanDB is freely available, without registration, at https://quandb.cmdrg.com/ .
Affiliation(s)
- Zhijiang Yang
  - State Key Laboratory of NBC Protection for Civilian, Beijing, People's Republic of China
- Tengxin Huang
  - State Key Laboratory of NBC Protection for Civilian, Beijing, People's Republic of China
- Li Pan
  - State Key Laboratory of NBC Protection for Civilian, Beijing, People's Republic of China
- Jingjing Wang
  - State Key Laboratory of NBC Protection for Civilian, Beijing, People's Republic of China
- Liangliang Wang
  - State Key Laboratory of NBC Protection for Civilian, Beijing, People's Republic of China
- Junjie Ding
  - State Key Laboratory of NBC Protection for Civilian, Beijing, People's Republic of China
- Junhua Xiao
  - State Key Laboratory of NBC Protection for Civilian, Beijing, People's Republic of China
6
Fan W, He Y, Zhu F. RM-GPT: Enhance the comprehensive generative ability of molecular GPT model via LocalRNN and RealFormer. Artif Intell Med 2024; 150:102827. [PMID: 38553166 DOI: 10.1016/j.artmed.2024.102827] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/30/2023] [Revised: 02/26/2024] [Accepted: 02/26/2024] [Indexed: 04/02/2024]
Abstract
Due to the surging of cost, artificial intelligence-assisted de novo drug design has supplanted conventional methods and become an emerging option for drug discovery. Although there have arisen many successful examples of applying generative models to the molecular field, these methods struggle to deal with conditional generation that meet chemists' practical requirements which ask for a controllable process to generate new molecules or optimize basic molecules with appointed conditions. To address this problem, a Recurrent Molecular-Generative Pretrained Transformer model is proposed, supplemented by LocalRNN and Residual Attention Layer Transformer, referred to as RM-GPT. RM-GPT rebuilds GPT model's architecture by incorporating LocalRNN and Residual Attention Layer Transformer so that it is able to extract local information and build connectivity between attention blocks. The incorporation of Transformer in these two modules enables leveraging the parallel computing advantages of multi-head attention mechanisms while extracting local structural information effectively. Through exploring and learning in a large chemical space, RM-GPT absorbs the ability to generate drug-like molecules with conditions in demand, such as desired properties and scaffolds, precisely and stably. RM-GPT achieved better results than SOTA methods on conditional generation.
Affiliation(s)
- Wenfeng Fan
  - School of Computer Science and Technology, Soochow University, Suzhou, 215006, China
- Yue He
  - School of Computer Science and Technology, Soochow University, Suzhou, 215006, China
- Fei Zhu
  - School of Computer Science and Technology, Soochow University, Suzhou, 215006, China
7
Wang Y, Wang C, Liu T, Qi H, Chen S, Cai X, Zhang M, Aliper A, Ren F, Ding X, Zhavoronkov A. Discovery of Tetrahydropyrazolopyrazine Derivatives as Potent and Selective MYT1 Inhibitors for the Treatment of Cancer. J Med Chem 2024; 67:420-432. [PMID: 38146659 DOI: 10.1021/acs.jmedchem.3c01476] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/27/2023]
Abstract
Breast and gynecological cancers are among the leading causes of death in women worldwide, illustrating the urgent need for innovative treatment options. We identified MYT1 as a promising new therapeutic target for breast and gynecological cancer using PandaOmics, an AI-driven target discovery platform. The synthetic lethal relationship of MYT1 in tumor cell lines with CCNE1 amplification enhanced this rationale. Through structure-based drug design, we developed a series of novel, potent, and highly selective inhibitors specifically targeting MYT1. Importantly, our lead compound, featuring a tetrahydropyrazolopyrazine ring, exhibits remarkable selectivity over WEE1, a related kinase associated with bone marrow suppression upon inhibition. Optimization of potency and physical properties resulted in the discovery of compound 21, a novel MYT1 inhibitor, exhibiting optimal pharmacokinetic properties and promising in vivo antitumor efficacy.
Affiliation(s)
- Yazhou Wang
  - Insilico Medicine Shanghai Ltd, Suite 901, Tower C, Changtai Plaza, 2889 Jinke Road, Pudong New District, Shanghai 201203, China
- Chao Wang
  - Insilico Medicine Shanghai Ltd, Suite 901, Tower C, Changtai Plaza, 2889 Jinke Road, Pudong New District, Shanghai 201203, China
- Tingting Liu
  - Insilico Medicine Shanghai Ltd, Suite 901, Tower C, Changtai Plaza, 2889 Jinke Road, Pudong New District, Shanghai 201203, China
- Hongyun Qi
  - Insilico Medicine Shanghai Ltd, Suite 901, Tower C, Changtai Plaza, 2889 Jinke Road, Pudong New District, Shanghai 201203, China
- Shan Chen
  - Insilico Medicine Shanghai Ltd, Suite 901, Tower C, Changtai Plaza, 2889 Jinke Road, Pudong New District, Shanghai 201203, China
- Xin Cai
  - Insilico Medicine Shanghai Ltd, Suite 901, Tower C, Changtai Plaza, 2889 Jinke Road, Pudong New District, Shanghai 201203, China
- Man Zhang
  - Insilico Medicine Shanghai Ltd, Suite 901, Tower C, Changtai Plaza, 2889 Jinke Road, Pudong New District, Shanghai 201203, China
- Alex Aliper
  - Insilico Medicine AI Limited, Masdar City 145748, Abu Dhabi, United Arab Emirates
- Feng Ren
  - Insilico Medicine Shanghai Ltd, Suite 901, Tower C, Changtai Plaza, 2889 Jinke Road, Pudong New District, Shanghai 201203, China
- Xiao Ding
  - Insilico Medicine Shanghai Ltd, Suite 901, Tower C, Changtai Plaza, 2889 Jinke Road, Pudong New District, Shanghai 201203, China
- Alex Zhavoronkov
  - Insilico Medicine Shanghai Ltd, Suite 901, Tower C, Changtai Plaza, 2889 Jinke Road, Pudong New District, Shanghai 201203, China
  - Insilico Medicine AI Limited, Masdar City 145748, Abu Dhabi, United Arab Emirates
8
Kim MJ, Martin CA, Kim J, Jablonski MM. Computational methods in glaucoma research: Current status and future outlook. Mol Aspects Med 2023; 94:101222. [PMID: 37925783 PMCID: PMC10842846 DOI: 10.1016/j.mam.2023.101222] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/01/2023] [Revised: 10/06/2023] [Accepted: 10/19/2023] [Indexed: 11/07/2023]
Abstract
Advancements in computational techniques have transformed glaucoma research, providing a deeper understanding of genetics, disease mechanisms, and potential therapeutic targets. Systems genetics integrates genomic and clinical data, aiding in identifying drug targets, comprehending disease mechanisms, and personalizing treatment strategies for glaucoma. Molecular dynamics simulations offer valuable molecular-level insights into glaucoma-related biomolecule behavior and drug interactions, guiding experimental studies and drug discovery efforts. Artificial intelligence (AI) technologies hold promise in revolutionizing glaucoma research, enhancing disease diagnosis, target identification, and drug candidate selection. The generalized protocols for systems genetics, MD simulations, and AI model development are included as a guide for glaucoma researchers. These computational methods, however, are not separate and work harmoniously together to discover novel ways to combat glaucoma. Ongoing research and progresses in genomics technologies, MD simulations, and AI methodologies project computational methods to become an integral part of glaucoma research in the future.
Affiliation(s)
- Minjae J Kim
  - Department of Ophthalmology, The Hamilton Eye Institute, The University of Tennessee Health Science Center, Memphis, TN, 38163, USA
- Cole A Martin
  - Department of Ophthalmology, The Hamilton Eye Institute, The University of Tennessee Health Science Center, Memphis, TN, 38163, USA
- Jinhwa Kim
  - Graduate School of Artificial Intelligence, Graduate School of Metaverse, Department of Management Information Systems, Sogang University, 1 Shinsoo-Dong, Mapo-Gu, Seoul, South Korea
- Monica M Jablonski
  - Department of Ophthalmology, The Hamilton Eye Institute, The University of Tennessee Health Science Center, Memphis, TN, 38163, USA
9
Bhatia AS, Saggi MK, Kais S. Quantum Machine Learning Predicting ADME-Tox Properties in Drug Discovery. J Chem Inf Model 2023; 63:6476-6486. [PMID: 37603536 DOI: 10.1021/acs.jcim.3c01079] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Indexed: 08/23/2023]
Abstract
In the drug discovery paradigm, the evaluation of absorption, distribution, metabolism, and excretion (ADME) and toxicity properties of new chemical entities is one of the most critical issues, which is a time-consuming process, immensely expensive, and poses formidable challenges in pharmaceutical R&D. In recent years, emerging technologies like artificial intelligence (AI), big data, and cloud technologies have garnered great attention to predict the ADME and toxicity of molecules. Currently, the blend of quantum computation and machine learning has attracted considerable attention in almost every field ranging from chemistry to biomedicine and several engineering disciplines as well. Quantum computers have the potential to bring advances in high-throughput experimental techniques and in screening billions of molecules by reducing development costs and time associated with the drug discovery process. Motivated by the efficiency of quantum kernel methods, we proposed a quantum machine learning (QML) framework consisting of a classical support vector classifier algorithm with a kernel-based quantum classifier. To demonstrate the feasibility of the proposed QML framework, the simplified molecular input line entry system (SMILES) notation-based string kernel, combined with a quantum support vector classifier, is used for the evaluation of chemical/drug ADME-Tox properties. The proposed quantum machine learning framework is validated and assessed via large-scale simulations. Based on our results from numerical simulations, the quantum model achieved the best performance as compared to classical counterparts in terms of the area under the curve of the receiver operating characteristic curve (AUC ROC; 0.80-0.95) for predicting outcomes on ADME-Tox data sets for small molecules, with a different number of features. The deployment of the proposed framework in the pharmaceutical industry would be extremely valuable in making the best decisions possible.
Affiliation(s)
- Amandeep Singh Bhatia
  - School of Electrical and Computer Engineering, Purdue University, West Lafayette, Indiana 47907, United States
- Mandeep Kaur Saggi
  - Department of Chemistry, Purdue University, West Lafayette, Indiana 47907, United States
- Sabre Kais
  - Department of Chemistry, Purdue University, West Lafayette, Indiana 47907, United States
10
Hagg A, Kirschner KN. Open-Source Machine Learning in Computational Chemistry. J Chem Inf Model 2023; 63:4505-4532. [PMID: 37466636 PMCID: PMC10430767 DOI: 10.1021/acs.jcim.3c00643] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 05/04/2023] [Indexed: 07/20/2023]
Abstract
The field of computational chemistry has seen a significant increase in the integration of machine learning concepts and algorithms. In this Perspective, we surveyed 179 open-source software projects, with corresponding peer-reviewed papers published within the last 5 years, to better understand the topics within the field being investigated by machine learning approaches. For each project, we provide a short description, the link to the code, the accompanying license type, and whether the training data and resulting models are made publicly available. Based on those deposited in GitHub repositories, the most popular employed Python libraries are identified. We hope that this survey will serve as a resource to learn about machine learning or specific architectures thereof by identifying accessible codes with accompanying papers on a topic basis. To this end, we also include computational chemistry open-source software for generating training data and fundamental Python libraries for machine learning. Based on our observations and considering the three pillars of collaborative machine learning work, open data, open source (code), and open models, we provide some suggestions to the community.
Affiliation(s)
- Alexander Hagg
  - Institute of Technology, Resource and Energy-Efficient Engineering (TREE), University of Applied Sciences Bonn-Rhein-Sieg, 53757 Sankt Augustin, Germany
  - Department of Electrical Engineering, Mechanical Engineering and Technical Journalism, University of Applied Sciences Bonn-Rhein-Sieg, 53757 Sankt Augustin, Germany
- Karl N. Kirschner
  - Institute of Technology, Resource and Energy-Efficient Engineering (TREE), University of Applied Sciences Bonn-Rhein-Sieg, 53757 Sankt Augustin, Germany
  - Department of Computer Science, University of Applied Sciences Bonn-Rhein-Sieg, 53757 Sankt Augustin, Germany
11
Pyrkov A, Aliper A, Bezrukov D, Lin YC, Polykovskiy D, Kamya P, Ren F, Zhavoronkov A. Quantum computing for near-term applications in generative chemistry and drug discovery. Drug Discov Today 2023; 28:103675. [PMID: 37331692 DOI: 10.1016/j.drudis.2023.103675] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/17/2023] [Revised: 05/22/2023] [Accepted: 06/13/2023] [Indexed: 06/20/2023]
Abstract
In recent years, drug discovery and life sciences have been revolutionized with machine learning and artificial intelligence (AI) methods. Quantum computing is touted to be the next most significant leap in technology; one of the main early practical applications for quantum computing solutions is predicted to be in quantum chemistry simulations. Here, we review the near-term applications of quantum computing and their advantages for generative chemistry and highlight the challenges that can be addressed with noisy intermediate-scale quantum (NISQ) devices. We also discuss the possible integration of generative systems running on quantum computers into established generative AI platforms.
Affiliation(s)
- Alexey Pyrkov
  - Insilico Medicine Hong Kong Ltd, Pak Shek Kok, New Territories, Hong Kong
- Alex Aliper
  - Insilico Medicine AI Ltd, Masdar City, Abu Dhabi, United Arab Emirates
- Dmitry Bezrukov
  - Insilico Medicine Hong Kong Ltd, Pak Shek Kok, New Territories, Hong Kong
- Yen-Chu Lin
  - Insilico Medicine Taiwan Ltd, Taipei, Taiwan
- Feng Ren
  - Insilico Medicine Shanghai Ltd, Shanghai, China
- Alex Zhavoronkov
  - Insilico Medicine Hong Kong Ltd, Pak Shek Kok, New Territories, Hong Kong