1
Gu C, Jang WD, Oh KS, Ryu JY. AnoChem: Prediction of chemical structural abnormalities based on machine learning models. Comput Struct Biotechnol J 2024; 23:2116-2121. [PMID: 38808129 PMCID: PMC11130677 DOI: 10.1016/j.csbj.2024.05.017] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/22/2024] [Revised: 05/08/2024] [Accepted: 05/08/2024] [Indexed: 05/30/2024] [Open Access]
Abstract
De novo drug design aims to rationally discover novel and potent compounds while reducing experimental costs during the drug development stage. Despite the numerous generative models that have been developed, few successful cases of drug design utilizing generative models have been reported. One of the most common challenges is designing compounds that are not synthesizable or realistic. Therefore, methods capable of accurately assessing the chemical structures proposed by generative models for drug design are needed. In this study, we present AnoChem, a computational framework based on deep learning designed to assess the likelihood of a generated molecule being real. AnoChem achieves an area under the receiver operating characteristic curve score of 0.900 for distinguishing between real and generated molecules. We utilized AnoChem to evaluate and compare the performances of several generative models, using other metrics, namely SAscore and Fréchet ChemNet distance (FCD). AnoChem demonstrates a strong correlation with these metrics, validating its effectiveness as a reliable tool for assessing generative models. The source code for AnoChem is available at https://github.com/CSB-L/AnoChem.
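As a reading aid for the area under the receiver operating characteristic curve reported above, the following is a minimal sketch of how that score can be computed from labels and classifier scores. It is not AnoChem's code: the function name `auroc`, the label convention (1 = real, 0 = generated), and the toy data are assumptions made for illustration only.

```python
def auroc(labels, scores):
    """Area under the ROC curve via pairwise ranking.

    Equals the probability that a randomly chosen positive example
    receives a higher score than a randomly chosen negative example,
    with ties counted as one half.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 1 = real molecule, 0 = generated molecule.
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.7, 0.3, 0.2]

# 8 of the 9 (real, generated) pairs are ranked correctly.
print(auroc(labels, scores))
```

A score of 0.5 corresponds to random ranking and 1.0 to perfect separation. The pairwise loop is quadratic in the number of examples, so a rank-based implementation would be preferable for large evaluation sets.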
Affiliation(s)
- Changdai Gu
  - Artificial Intelligence Laboratory, Oncocross Co., Ltd., Saechang-ro, Mapo-gu, Seoul 04168, Republic of Korea
  - Department of Artificial Intelligence, College of Computing, Yonsei University, 50 Yonsei-ro, Seodaemun-gu, Seoul 03722, Republic of Korea
- Woo Dae Jang
  - Data Convergence Drug Research Center, Korea Research Institute of Chemical Technology, 141 Gajeong-ro, Yuseong-gu, Daejeon 34114, Republic of Korea
  - Department of Medicinal and Pharmaceutical Chemistry, University of Science and Technology, Daejeon 34129, Republic of Korea
- Kwang-Seok Oh
  - Data Convergence Drug Research Center, Korea Research Institute of Chemical Technology, 141 Gajeong-ro, Yuseong-gu, Daejeon 34114, Republic of Korea
  - Department of Medicinal and Pharmaceutical Chemistry, University of Science and Technology, Daejeon 34129, Republic of Korea
- Jae Yong Ryu
  - Artificial Intelligence Laboratory, Oncocross Co., Ltd., Saechang-ro, Mapo-gu, Seoul 04168, Republic of Korea
  - Department of Biotechnology, Duksung Women’s University, 33 Samyang-Ro 144-Gil, Dobong-gu, Seoul 01369, Republic of Korea

2
Yin X, Wang J, Ge M, Feng X, Zhang G. Designing Small Molecule PI3Kγ Inhibitors: A Review of Structure-Based Methods and Computational Approaches. J Med Chem 2024; 67:10530-10547. [PMID: 38988222 DOI: 10.1021/acs.jmedchem.4c00347] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 07/12/2024]
Abstract
The PI3K/AKT/mTOR pathway plays critical roles in a wide array of biological processes. Phosphatidylinositol 3-kinase gamma (PI3Kγ), a class IB PI3K family member, represents a potential therapeutic opportunity for the treatment of cancer, inflammation, and autoimmunity. In this Perspective, we provide a comprehensive overview of the structure, biological function, and regulation of PI3Kγ. We also focus on the development of PI3Kγ inhibitors over the past decade and emphasize their binding modes, structure-activity relationships, and pharmacological activities. The application of computational technologies and artificial intelligence in the discovery of novel PI3Kγ inhibitors is also introduced. This review aims to provide a timely and updated overview on the strategies for targeting PI3Kγ.
Affiliation(s)
- Xiaoming Yin
  - Hebei University of Science & Technology, Shijiazhuang 050018, People's Republic of China
  - Hebei Research Center of Pharmaceutical and Chemical Engineering, Shijiazhuang 050018, People's Republic of China
- Jiaying Wang
  - Hebei University of Science & Technology, Shijiazhuang 050018, People's Republic of China
  - Hebei Research Center of Pharmaceutical and Chemical Engineering, Shijiazhuang 050018, People's Republic of China
- Minghao Ge
  - Hebei University of Science & Technology, Shijiazhuang 050018, People's Republic of China
  - Hebei Research Center of Pharmaceutical and Chemical Engineering, Shijiazhuang 050018, People's Republic of China
- Xue Feng
  - Hebei University of Science & Technology, Shijiazhuang 050018, People's Republic of China
- Guogang Zhang
  - Hebei University of Science & Technology, Shijiazhuang 050018, People's Republic of China
  - Hebei Research Center of Pharmaceutical and Chemical Engineering, Shijiazhuang 050018, People's Republic of China

3
Singh S, Kaur N, Gehlot A. Application of artificial intelligence in drug design: A review. Comput Biol Med 2024; 179:108810. [PMID: 38991316 DOI: 10.1016/j.compbiomed.2024.108810] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/18/2024] [Revised: 05/31/2024] [Accepted: 06/24/2024] [Indexed: 07/13/2024]
Abstract
Artificial intelligence (AI) is a field of computer science that involves acquiring information, developing rule bases, and mimicking human behaviour. The fundamental concept behind AI is to create intelligent computer systems that can operate with minimal human intervention or without any intervention at all. These rule-based systems are developed using various machine learning and deep learning models, enabling them to solve complex problems. AI is integrated with these models to learn, understand, and analyse provided data. The rapid advancement of AI is reshaping numerous industries, with the pharmaceutical sector experiencing a notable transformation. AI is increasingly being employed to automate, optimize, and personalize various facets of the pharmaceutical industry, particularly in pharmacological research. Traditional drug development methods are known for being time-consuming, expensive, and less efficient, often taking around a decade and costing billions of dollars. The integration of AI techniques addresses these challenges by enabling the examination of compounds with desired properties from a vast pool of input drugs. Furthermore, it plays a crucial role in drug screening by predicting toxicity, bioactivity, ADME properties (absorption, distribution, metabolism, and excretion), physicochemical properties, and more. AI enhances the drug design process by improving the efficiency and accuracy of predicting drug behaviour, interactions, and properties. These approaches further significantly improve the precision of drug discovery processes and decrease clinical trial costs, leading to the development of more effective drugs.
Affiliation(s)
- Simrandeep Singh
  - Department of Electronics & Communication Engineering, UCRD, Chandigarh University, Gharuan, Punjab, India
- Navjot Kaur
  - Department of Pharmacognosy, Amar Shaheed Baba Ajit Singh Jujhar Singh Memorial College of Pharmacy, Bela, Ropar, India
- Anita Gehlot
  - Uttaranchal Institute of Technology, Uttaranchal University, Dehradun, India

4
Thomas M, Ahmad M, Tresadern G, de Fabritiis G. PromptSMILES: prompting for scaffold decoration and fragment linking in chemical language models. J Cheminform 2024; 16:77. [PMID: 38965600 PMCID: PMC11225391 DOI: 10.1186/s13321-024-00866-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/02/2024] [Accepted: 06/04/2024] [Indexed: 07/06/2024] [Open Access]
Abstract
SMILES-based generative models are amongst the most robust and successful recent methods used to augment drug design. They are typically used for complete de novo generation; however, scaffold decoration and fragment linking applications are sometimes desirable, which requires a different grammar, architecture, training dataset and, therefore, re-training of a new model. In this work, we describe a simple procedure to conduct constrained molecule generation with a SMILES-based generative model to extend applicability to scaffold decoration and fragment linking by providing SMILES prompts, without the need for re-training. In combination with reinforcement learning, we show that pre-trained, decoder-only models adapt to these applications quickly and can further optimize molecule generation towards a specified objective. We compare the performance of this approach to a variety of orthogonal approaches and show that performance is comparable or better. For convenience, we provide an easy-to-use Python package to facilitate model sampling, which can be found on GitHub and the Python Package Index. Scientific contribution: This novel method extends an autoregressive chemical language model to scaffold decoration and fragment linking scenarios. This doesn't require re-training, the use of a bespoke grammar, or curation of a custom dataset, as commonly required by other approaches.
Affiliation(s)
- Morgan Thomas
  - Computational Science Laboratory, Universitat Pompeu Fabra, Barcelona Biomedical Research Park (PRBB), C Dr. Aguiader 88, 08003, Barcelona, Spain
- Mazen Ahmad
  - In Silico Discovery, Janssen Pharmaceutica N. V., Turnhoutseweg 30, 2340, Beerse, Belgium
- Gary Tresadern
  - In Silico Discovery, Janssen Pharmaceutica N. V., Turnhoutseweg 30, 2340, Beerse, Belgium
- Gianni de Fabritiis
  - Computational Science Laboratory, Universitat Pompeu Fabra, Barcelona Biomedical Research Park (PRBB), C Dr. Aguiader 88, 08003, Barcelona, Spain
  - Acellera Labs, C Dr. Trueta 183, 08005, Barcelona, Spain
  - Institució Catalana de Recerca i Estudis Avançats (ICREA), Passeig Lluis Companys 23, 08010, Barcelona, Spain

5
Yang L, Guo Q, Zhang L. AI-assisted chemistry research: a comprehensive analysis of evolutionary paths and hotspots through knowledge graphs. Chem Commun (Camb) 2024; 60:6977-6987. [PMID: 38910536 DOI: 10.1039/d4cc01892c] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/25/2024]
Abstract
Artificial intelligence (AI) offers transformative potential for chemical research through its ability to optimize reactions and processes, enhance energy efficiency, and reduce waste. AI-assisted chemical research (AI + chem) has become a global hotspot. To better understand the current research status of "AI + chem", this study conducted a scientific bibliometric investigation using CiteSpace. The Web of Science Core Collection was utilized to retrieve original articles related to "AI + chem" published from 2000 to 2024. The obtained data allowed for the visualization of the knowledge background, current research status, and latest knowledge structure of "AI + chem". "AI + chem" has entered a stage of explosive growth, and the number of papers will maintain long-term high-speed growth. This article systematically analyzes the latest progress in "AI + chem" and objectively predicts future trends, including molecular design, reaction prediction, materials design, drug design, and quantum chemistry. The outcomes of this study will provide readers with a comprehensive understanding of the overall landscape of "AI + chem".
Affiliation(s)
- Lin Yang
  - School of Intellectual Property, Dalian University of Technology, Dalian 116024, Liaoning, P. R. China
- Qingle Guo
  - School of Intellectual Property, Dalian University of Technology, Dalian 116024, Liaoning, P. R. China
- Lijing Zhang
  - School of Chemistry, Dalian University of Technology, Dalian 116024, Liaoning, P. R. China

6
Morehead A, Cheng J. Geometry-complete diffusion for 3D molecule generation and optimization. Commun Chem 2024; 7:150. [PMID: 38961141 PMCID: PMC11222514 DOI: 10.1038/s42004-024-01233-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/19/2024] [Accepted: 06/20/2024] [Indexed: 07/05/2024] [Open Access]
Abstract
Generative deep learning methods have recently been proposed for generating 3D molecules using equivariant graph neural networks (GNNs) within a denoising diffusion framework. However, such methods are unable to learn important geometric properties of 3D molecules, as they adopt molecule-agnostic and non-geometric GNNs as their 3D graph denoising networks, which notably hinders their ability to generate valid large 3D molecules. In this work, we address these gaps by introducing the Geometry-Complete Diffusion Model (GCDM) for 3D molecule generation, which outperforms existing 3D molecular diffusion models by significant margins across conditional and unconditional settings for the QM9 dataset and the larger GEOM-Drugs dataset, respectively. Importantly, we demonstrate that GCDM's generative denoising process enables the model to generate a significant proportion of valid and energetically-stable large molecules at the scale of GEOM-Drugs, whereas previous methods fail to do so with the features they learn. Additionally, we show that extensions of GCDM can not only effectively design 3D molecules for specific protein pockets but can be repurposed to consistently optimize the geometry and chemical composition of existing 3D molecules for molecular stability and property specificity, demonstrating new versatility of molecular diffusion models. Code and data are freely available on GitHub .
Affiliation(s)
- Alex Morehead
  - Department of Electrical Engineering & Computer Science, NextGen Precision Health, University of Missouri, Columbia, MO, 65211, USA
- Jianlin Cheng
  - Department of Electrical Engineering & Computer Science, NextGen Precision Health, University of Missouri, Columbia, MO, 65211, USA

7
Qin T, Wang Y, Kong M, Zhong H, Wu T, Xi Z, Qian Z, Li K, Cai Y, Wu J, Li W. Identification of potential PIM-2 inhibitors via ligand-based generative models, molecular docking and molecular dynamics simulations. Mol Divers 2024:10.1007/s11030-024-10916-7. [PMID: 38954072 DOI: 10.1007/s11030-024-10916-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/01/2024] [Accepted: 06/11/2024] [Indexed: 07/04/2024]
Abstract
Proviral Integrations of Moloney-2 (PIM-2) kinase is a promising target for various cancers and other diseases, and its inhibitors hold potential for treating related diseases. However, there is currently no clinically available PIM-2 inhibitor. In this study, we constructed a generative model for de novo PIM-2 inhibitor design based on artificial intelligence and performed molecular docking and molecular dynamics (MD) simulations to develop an efficient PIM-2 inhibitor generative model and discover potential PIM-2 inhibitors. First, we designed a generative model based on a Bi-directional Long Short-Term Memory (BiLSTM) framework combined with a transfer learning strategy and generated a new PIM-2 small molecule library using existing active drug databases. The generated compound library was then virtually screened by molecular docking and scaffold similarity comparison, identifying 10 initial hit compounds with better performance. Next, using the inhibitor in the crystal structure as a positive control, we performed two rounds of MD simulations, with lengths of 100 ns and 500 ns, respectively, to study the dynamic stability of the protein-ligand systems of the 10 compounds with PIM-2. We analyzed the interactions with key hinge region residues, binding free energies, and changes in the ATP pocket size. The generative model demonstrates good molecular generation capability and can generate efficient novel molecules with physicochemical properties similar to those of active PIM-2 drugs. Among the 10 initially selected hit compounds, 5 compounds, C3 (-29.69 kcal/mol), C4 (-33.31 kcal/mol), C5 (-28.59 kcal/mol), C8 (-34.68 kcal/mol), and C9 (-25.88 kcal/mol), have higher binding energies with PIM-2 than the positive drug 3YR (-26.18 kcal/mol). The MD simulation results are consistent with the docking analysis: these compounds have lower and more stable RMSD values for their complex systems compared with the complex system of the reported positive drug 3YR and PIM-2. They can form long-term stable interactions with the active site and the hinge region of PIM-2, which suggests these compounds are likely to have potent inhibitory effects on PIM-2. This study provides an efficient generative model for PIM-2 inhibitor research and discovers 5 potential novel PIM-2 inhibitors.
Affiliation(s)
- Tianli Qin
  - The First Affiliated Hospital of Wenzhou Medical University, Wenzhou, 325000, China
  - The Eye Hospital, School of Ophthalmology & Optometry, Wenzhou Medical University, Wenzhou, 325027, China
- Yijian Wang
  - The First Affiliated Hospital of Wenzhou Medical University, Wenzhou, 325000, China
- Miaomiao Kong
  - The First Affiliated Hospital of Wenzhou Medical University, Wenzhou, 325000, China
- Hongliang Zhong
  - The First Affiliated Hospital of Wenzhou Medical University, Wenzhou, 325000, China
  - Oujiang Laboratory (Zhejiang Lab for Regenerative Medicine, Vision and Brain Health), Wenzhou, 325000, Zhejiang, China
- Tao Wu
  - The First Affiliated Hospital of Wenzhou Medical University, Wenzhou, 325000, China
- Zixuan Xi
  - The First Affiliated Hospital of Wenzhou Medical University, Wenzhou, 325000, China
- Zhenyong Qian
  - The First Affiliated Hospital of Wenzhou Medical University, Wenzhou, 325000, China
  - Oujiang Laboratory (Zhejiang Lab for Regenerative Medicine, Vision and Brain Health), Wenzhou, 325000, Zhejiang, China
- Ke Li
  - The First Affiliated Hospital of Wenzhou Medical University, Wenzhou, 325000, China
  - Oujiang Laboratory (Zhejiang Lab for Regenerative Medicine, Vision and Brain Health), Wenzhou, 325000, Zhejiang, China
- Yuepiao Cai
  - School of Pharmaceutical Sciences, Wenzhou Medical University, Wenzhou, 325000, China
- Jianzhang Wu
  - The Eye Hospital, School of Ophthalmology & Optometry, Wenzhou Medical University, Wenzhou, 325027, China
  - Oujiang Laboratory (Zhejiang Lab for Regenerative Medicine, Vision and Brain Health), Wenzhou, 325000, Zhejiang, China
- Wulan Li
  - The First Affiliated Hospital of Wenzhou Medical University, Wenzhou, 325000, China

8
Albrijawi MT, Alhajj R. LSTM-driven drug design using SELFIES for target-focused de novo generation of HIV-1 protease inhibitor candidates for AIDS treatment. PLoS One 2024; 19:e0303597. [PMID: 38905197 PMCID: PMC11192380 DOI: 10.1371/journal.pone.0303597] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/01/2024] [Accepted: 04/26/2024] [Indexed: 06/23/2024] [Open Access]
Abstract
The battle against viral drug resistance highlights the need for innovative approaches to replace time-consuming and costly traditional methods. Deep generative models offer automation potential, especially in the fight against human immunodeficiency virus (HIV), as they can synthesize diverse molecules effectively. In this paper, an LSTM-based deep generative model named "LSTM-ProGen" is proposed, tailored explicitly for the de novo design of drug candidate molecules that interact with a specific target protein (HIV-1 protease). LSTM-ProGen distinguishes itself by employing a long short-term memory (LSTM) architecture to generate novel molecules with target specificity against the HIV-1 protease. The thorough training process involves fine-tuning LSTM-ProGen on a diverse range of compounds sourced from the ChEMBL database. The model was optimized to meet specific requirements, with multiple iterations to enhance its predictive capabilities and ensure it generates molecules that exhibit favorable target interactions. The training process encompasses an array of performance evaluation metrics, such as drug-likeness properties. Our evaluation includes extensive in silico analysis using molecular docking and PCA-based visualization to explore the chemical space that the new molecules cover compared to those in the training set. These evaluations reveal that a subset of 12 de novo molecules generated by LSTM-ProGen exhibit a striking ability to interact with the target protein, rivaling or even surpassing the efficacy of native ligands. Extended versions with further refinement of LSTM-ProGen hold promise as versatile tools for designing efficacious and customized drug candidates tailored to specific targets, thus accelerating drug development and facilitating the discovery of new therapies for various diseases.
Affiliation(s)
- M. Taleb Albrijawi
  - Department of Computer Engineering, Istanbul Medipol University, Istanbul, Turkey
- Reda Alhajj
  - Department of Computer Engineering, Istanbul Medipol University, Istanbul, Turkey
  - Department of Computer Science, University of Calgary, Alberta, Canada
  - Department of Health Informatics, University of Southern Denmark, Odense, Denmark

9
Yoo S, Kim J. Adapt-cMolGPT: A Conditional Generative Pre-Trained Transformer with Adapter-Based Fine-Tuning for Target-Specific Molecular Generation. Int J Mol Sci 2024; 25:6641. [PMID: 38928346 PMCID: PMC11203498 DOI: 10.3390/ijms25126641] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/16/2024] [Revised: 06/09/2024] [Accepted: 06/14/2024] [Indexed: 06/28/2024] [Open Access]
Abstract
Small-molecule drug design aims to generate compounds that target specific proteins, playing a crucial role in the early stages of drug discovery. Recently, research has emerged that utilizes the GPT model, which has achieved significant success in various fields, to generate molecular compounds. However, due to the persistent challenge of small datasets in the pharmaceutical field, there has been some degradation in the performance of generating target-specific compounds. To address this issue, we propose an enhanced target-specific drug generation model, Adapt-cMolGPT, which modifies molecular representation and optimizes the fine-tuning process. In particular, we introduce a new fine-tuning method that incorporates an adapter module into a pre-trained base model and alternates weight updates by sections. We evaluated the proposed model through multiple experiments and demonstrated performance improvements compared to previous models. In the experimental results, Adapt-cMolGPT generated a greater number of novel and valid compounds compared to other models, with these generated compounds exhibiting properties similar to those of real molecular data. These results indicate that our proposed method is highly effective in designing drugs targeting specific proteins.
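The abstract above counts novel compounds among the generated molecules. As a rough illustration of how such counts are often turned into metrics, the sketch below computes uniqueness and novelty over SMILES strings. The definitions used here are a common convention and are not confirmed to match the paper's; the function name `generation_metrics` and the toy SMILES lists are hypothetical. It also assumes every string is already canonicalized, and it omits validity checking, which would need a cheminformatics toolkit.

```python
def generation_metrics(generated, training):
    """Uniqueness and novelty of generated SMILES strings.

    Assumption: strings are already canonical SMILES, so plain string
    equality stands in for molecular identity.
    """
    if not generated:
        raise ValueError("generated list is empty")
    unique = set(generated)
    novel = unique - set(training)
    return {
        # Fraction of generated strings that are distinct.
        "uniqueness": len(unique) / len(generated),
        # Fraction of distinct generated strings absent from training data.
        "novelty": len(novel) / len(unique),
    }


# Hypothetical toy data, for illustration only.
training = ["CCO", "c1ccccc1", "CC(=O)O"]
generated = ["CCO", "CCN", "CCN", "c1ccncc1"]

metrics = generation_metrics(generated, training)
print(metrics)
```

In this toy case three of the four generated strings are distinct, and two of those three do not occur in the training list.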
Affiliation(s)
- Soyoung Yoo
  - Department of Artificial Intelligence, Sejong University, Seoul 05006, Republic of Korea
- Junghyun Kim
  - Department of Artificial Intelligence, Sejong University, Seoul 05006, Republic of Korea
  - Deep Learning Architecture Research Center, Sejong University, Seoul 05006, Republic of Korea

10
Gangwal A, Lavecchia A. Unleashing the power of generative AI in drug discovery. Drug Discov Today 2024; 29:103992. [PMID: 38663579 DOI: 10.1016/j.drudis.2024.103992] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/09/2024] [Revised: 03/22/2024] [Accepted: 04/18/2024] [Indexed: 05/04/2024]
Abstract
Artificial intelligence (AI) is revolutionizing drug discovery by enhancing precision, reducing timelines and costs, and enabling AI-driven computer-aided drug design. This review focuses on recent advancements in deep generative models (DGMs) for de novo drug design, exploring diverse algorithms and their profound impact. It critically analyses the challenges that are intricately interwoven into these technologies, proposing strategies to unlock their full potential. It features case studies of both successes and failures in advancing drugs to clinical trials with AI assistance. Last, it outlines a forward-looking plan for optimizing DGMs in de novo drug design, thereby fostering faster and more cost-effective drug development.
Affiliation(s)
- Amit Gangwal
  - Department of Natural Product Chemistry, Shri Vile Parle Kelavani Mandal's Institute of Pharmacy, Dhule 424001, Maharashtra, India
- Antonio Lavecchia
  - "Drug Discovery" Laboratory, Department of Pharmacy, University of Naples Federico II, I-80131 Naples, Italy

11
Krishnan SR, Bung N, Srinivasan R, Roy A. Target-specific novel molecules with their recipe: Incorporating synthesizability in the design process. J Mol Graph Model 2024; 129:108734. [PMID: 38442440 DOI: 10.1016/j.jmgm.2024.108734] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/03/2023] [Revised: 02/14/2024] [Accepted: 02/15/2024] [Indexed: 03/07/2024]
Abstract
Application of artificial intelligence (AI) in drug discovery has led to several success stories in recent times. While traditional methods mostly relied upon screening large chemical libraries for early-stage drug design, de novo design can help identify novel target-specific molecules by sampling from a much larger chemical space. Although this has increased the possibility of finding diverse and novel molecules from previously unexplored chemical space, it has also posed a great challenge for medicinal chemists to synthesize at least some of the de novo designed novel molecules for experimental validation. To address this challenge, in this work, we propose a novel forward synthesis-based generative AI method, which is used to explore the synthesizable chemical space. The method uses a structure-based drug design framework, where the target protein structure and a target-specific seed fragment from co-crystal structures can be the initial inputs. A random fragment from a purchasable fragment library can also be the input if a target-specific fragment is unavailable. Then template-based forward synthesis route prediction and molecule generation are performed in parallel using the Monte Carlo Tree Search (MCTS) method, where the subsequent fragments for molecule growth can again be obtained from a purchasable fragment library. The rewards for each iteration of MCTS are computed using a drug-target affinity (DTA) model based on the docking pose of the generated reaction intermediates at the binding site of the target protein of interest. With the help of the proposed method, it is now possible to overcome one of the major obstacles posed to AI-based drug design approaches through the ability of the method to design novel target-specific synthesizable molecules.
Affiliation(s)
- Navneet Bung
  - TCS Research (Life Sciences Division), Tata Consultancy Services Limited, Hyderabad, 500081, India
- Rajgopal Srinivasan
  - TCS Research (Life Sciences Division), Tata Consultancy Services Limited, Hyderabad, 500081, India
- Arijit Roy
  - TCS Research (Life Sciences Division), Tata Consultancy Services Limited, Hyderabad, 500081, India

12
Wang L, Zhou Z, Yang X, Shi S, Zeng X, Cao D. The present state and challenges of active learning in drug discovery. Drug Discov Today 2024; 29:103985. [PMID: 38642700 DOI: 10.1016/j.drudis.2024.103985] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/03/2024] [Revised: 04/08/2024] [Accepted: 04/15/2024] [Indexed: 04/22/2024]
Abstract
Active learning (AL) is an iterative feedback process that efficiently identifies valuable data within vast chemical space, even with limited labeled data. This characteristic renders it a valuable approach to tackle the ongoing challenges faced in drug discovery, such as the ever-expanding exploration space and the limitations of labeled data. Consequently, AL is increasingly gaining prominence in the field of drug development. In this paper, we comprehensively review the application of AL at all stages of drug discovery, including compound-target interaction prediction, virtual screening, molecular generation and optimization, as well as molecular property prediction. Additionally, we discuss the challenges and prospects associated with the current applications of AL in drug discovery.
Affiliation(s)
- Lei Wang
  - Xiangya School of Pharmaceutical Sciences, Central South University, Changsha 410013, Hunan, China
- Zhenran Zhou
  - Department of Computer Science, Hunan University, Changsha 410082, Hunan, China
- Xixi Yang
  - Department of Computer Science, Hunan University, Changsha 410082, Hunan, China
- Shaohua Shi
  - Institute for Advancing Translational Medicine in Bone and Joint Diseases, School of Chinese Medicine, Hong Kong Baptist University, Hong Kong SAR, China
- Xiangxiang Zeng
  - Department of Computer Science, Hunan University, Changsha 410082, Hunan, China
- Dongsheng Cao
  - Xiangya School of Pharmaceutical Sciences, Central South University, Changsha 410013, Hunan, China

13
Das M, Ghosh A, Sunoj RB. Advances in machine learning with chemical language models in molecular property and reaction outcome predictions. J Comput Chem 2024; 45:1160-1176. [PMID: 38299229 DOI: 10.1002/jcc.27315] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/22/2023] [Revised: 01/06/2024] [Accepted: 01/09/2024] [Indexed: 02/02/2024]
Abstract
Molecular properties and reactions form the foundation of chemical space. Over the years, innumerable molecules have been synthesized; a smaller fraction of them found immediate applications, while a larger proportion served as a testimony to the creative and empirical nature of the domain of chemical science. With increasing emphasis on sustainable practices, it is desirable that a target set of molecules is synthesized, preferably through fewer empirical attempts instead of a larger library, to realize an active candidate. On this front, predictive endeavors using machine learning (ML) models built on available data acquire timely significance. Prediction of molecular properties and reaction outcomes remains one of the burgeoning applications of ML in chemical science. Among several methods of encoding molecular samples for ML models, the ones that employ language-like representations are gaining steady popularity. Such representations would additionally help adopt well-developed natural language processing (NLP) models for chemical applications. Given this advantageous background, herein we describe several successful chemical applications of NLP focusing on molecular property and reaction outcome predictions. From relatively simpler recurrent neural networks (RNNs) to complex models like transformers, different network architectures have been leveraged for tasks such as de novo drug design, catalyst generation, and forward and retro-synthesis predictions. The chemical language model (CLM) provides promising avenues toward a broad range of applications in a time- and cost-effective manner. While we showcase an optimistic outlook of CLMs, attention is also placed on the persisting challenges in the reaction domain, which would optimistically be addressed by advanced algorithms tailored to chemical language and with increased availability of high-quality datasets.
Affiliation(s)
- Manajit Das
  - Department of Chemistry, Indian Institute of Technology Bombay, Mumbai, India
- Ankit Ghosh
  - Department of Chemistry, Indian Institute of Technology Bombay, Mumbai, India
- Raghavan B Sunoj
  - Department of Chemistry, Indian Institute of Technology Bombay, Mumbai, India
  - Centre for Machine Intelligence and Data Science, Indian Institute of Technology Bombay, Mumbai, India
14
Zhang H, Liu Y, Liu X, Wang C, Guo M. Equivariant score-based generative diffusion framework for 3D molecules. BMC Bioinformatics 2024; 25:203. [PMID: 38816718 DOI: 10.1186/s12859-024-05810-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/30/2023] [Accepted: 05/13/2024] [Indexed: 06/01/2024] Open
Abstract
BACKGROUND: Molecular biology is crucial for drug discovery, protein design, and human health. Due to the vastness of the drug-like chemical space, depending on biomedical experts to manually design molecules is exceedingly expensive. Utilizing generative methods with deep learning technology offers an effective approach to streamline the search space for molecular design and save costs. This paper introduces a novel E(3)-equivariant score-based diffusion framework for 3D molecular generation via SDEs, aiming to address the constraints of unified Gaussian diffusion methods. Within the proposed framework EMDS, the complete diffusion is decomposed into separate diffusion processes for distinct components of the molecular feature space, while the modeling processes also capture the complex dependency among these components. Moreover, angle and torsion angle information is integrated into the networks to enhance the modeling of atom coordinates and utilize spatial information more effectively.
RESULTS: Experiments on the widely utilized QM9 dataset demonstrate that our proposed framework significantly outperforms the state-of-the-art methods in all evaluation metrics for 3D molecular generation. Additionally, ablation experiments are conducted to highlight the contribution of key components in our framework, demonstrating the effectiveness of the proposed framework and the performance improvements of incorporating angle and torsion angle information for molecular generation. Finally, the comparative results of distribution show that our method is highly effective in generating molecules that closely resemble the actual scenario.
CONCLUSION: Through the experiments and comparative results, our framework clearly outperforms previous 3D molecular generation methods, exhibiting significantly better capacity for modeling chemically realistic molecules. The excellent performance of EMDS in 3D molecular generation brings novel and encouraging opportunities for tackling challenging biomedical molecule and protein scenarios.
Affiliation(s)
- Hao Zhang
  - School of Computer Science and Technology, Harbin Institute of Technology, Harbin, 150001, China
- Yang Liu
  - School of Computer Science and Technology, Harbin Institute of Technology, Harbin, 150001, China
- Xiaoyan Liu
  - School of Computer Science and Technology, Harbin Institute of Technology, Harbin, 150001, China
- Cheng Wang
  - School of Computer Science and Technology, Harbin Institute of Technology, Harbin, 150001, China
- Maozu Guo
  - School of Electrical and Information Engineering, Beijing University of Civil Engineering and Architecture, Beijing, 100044, China
15
Ansari M, White AD. Learning peptide properties with positive examples only. DIGITAL DISCOVERY 2024; 3:977-986. [PMID: 38756224 PMCID: PMC11094695 DOI: 10.1039/d3dd00218g] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/05/2023] [Accepted: 03/30/2024] [Indexed: 05/18/2024]
Abstract
Deep learning can create accurate predictive models by exploiting existing large-scale experimental data, and guide the design of molecules. However, a major barrier is the requirement of both positive and negative examples in the classical supervised learning frameworks. Notably, most peptide databases come with missing information and low number of observations on negative examples, as such sequences are hard to obtain using high-throughput screening methods. To address this challenge, we solely exploit the limited known positive examples in a semi-supervised setting, and discover peptide sequences that are likely to map to certain antimicrobial properties via positive-unlabeled learning (PU). In particular, we use the two learning strategies of adapting base classifier and reliable negative identification to build deep learning models for inferring solubility, hemolysis, binding against SHP-2, and non-fouling activity of peptides, given their sequence. We evaluate the predictive performance of our PU learning method and show that by only using the positive data, it can achieve competitive performance when compared with the classical positive-negative (PN) classification approach, where there is access to both positive and negative examples.
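The "reliable negative identification" strategy named in this abstract can be illustrated with a small two-step sketch. This is not the authors' implementation: the amino-acid composition features, the nearest-centroid classifier, the `fraction` parameter, and all function names are assumptions chosen to keep the example self-contained, whereas the abstract describes deep learning models. Step 1 picks the unlabeled sequences least similar to the known positives as reliable negatives; step 2 trains an ordinary positive-negative classifier on positives versus those reliable negatives.

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def composition(seq):
    """Amino-acid composition (fraction of each residue) of a peptide."""
    counts = Counter(seq)
    n = max(len(seq), 1)
    return [counts.get(a, 0) / n for a in AMINO_ACIDS]


def centroid(vectors):
    """Component-wise mean of a list of equal-length vectors."""
    return [sum(col) / len(vectors) for col in zip(*vectors)]


def distance(u, v):
    """Euclidean distance between two vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def reliable_negatives(positives, unlabeled, fraction=0.4):
    """Step 1: unlabeled sequences farthest from the positive centroid."""
    pos_center = centroid([composition(s) for s in positives])
    ranked = sorted(
        unlabeled,
        key=lambda s: distance(composition(s), pos_center),
        reverse=True,
    )
    k = max(1, int(len(ranked) * fraction))
    return ranked[:k]


def fit_centroid_classifier(positives, negatives):
    """Step 2: nearest-centroid classifier on positives vs. reliable negatives."""
    pos_center = centroid([composition(s) for s in positives])
    neg_center = centroid([composition(s) for s in negatives])

    def predict(seq):
        x = composition(seq)
        return distance(x, pos_center) < distance(x, neg_center)

    return predict


# Toy data (hypothetical sequences, for illustration only).
positives = ["KKLLKK", "KRKLLK", "KKKLRL"]
unlabeled = ["KKLKLK", "DDEEDD", "EEDDEE", "KLKKRK", "DEDEDE"]
negatives = reliable_negatives(positives, unlabeled)
predict = fit_centroid_classifier(positives, negatives)
```

In practice the distance-to-centroid ranking would be replaced by a learned scoring model; the point of the sketch is only the division into "find confident negatives" and "train a standard classifier".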
Affiliation(s)
- Mehrad Ansari
  - Department of Chemical Engineering, University of Rochester, Rochester, NY 14627, USA
- Andrew D White
  - Department of Chemical Engineering, University of Rochester, Rochester, NY 14627, USA
16
Wang S, Liang D, Wang J, Dong K, Zhang Y, Liang H, Xu X, Song T. FraHMT: A Fragment-Oriented Heterogeneous Graph Molecular Generation Model for Target Proteins. J Chem Inf Model 2024; 64:3718-3732. [PMID: 38644797 DOI: 10.1021/acs.jcim.4c00252] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/23/2024]
Abstract
The molecular generation task stands as a pivotal step in the domains of computational chemistry and drug discovery, aiming to computationally generate molecular structures for specific properties. In contrast to previous models that focused primarily on SMILES strings or molecular graphs, our model placed a special emphasis on the substructure information on molecules, enabling the model to learn richer chemical rules and structure features from fragments and chemical reaction information on molecules. To accomplish this, we fragmented the molecules to construct heterogeneous graph representations based on atom and fragment information. Then our model mapped the heterogeneous graph data into a latent vector space by using an encoder and employed a self-regressive generative model as a decoder for molecular generation. Additionally, we performed transfer learning on the model using a small set of ligand molecules known to be active against the target protein to generate molecules that bind better to the target protein. Experimental results demonstrate that our model is highly competitive with state-of-the-art models. It can generate valid and diverse molecules with favorable physicochemical properties and drug-likeness. Importantly, they produce novel molecules with high docking scores against the target proteins.
Affiliation(s)
- Shuang Wang
  - College of Computer Science and Technology, China University of Petroleum, QingDao 266580, China
- Dingming Liang
  - College of Computer Science and Technology, China University of Petroleum, QingDao 266580, China
- Jianmin Wang
  - College of Computer Science and Technology, China University of Petroleum, QingDao 266580, China
  - The Interdisciplinary Graduate Program in Integrative Biotechnology, Yonsei University, Incheon 21983, Republic of Korea
- Kaiyu Dong
  - College of Computer Science and Technology, China University of Petroleum, QingDao 266580, China
- Yunjing Zhang
  - College of Computer Science and Technology, China University of Petroleum, QingDao 266580, China
- Huicong Liang
  - Marine Biomedical Institute of Qingdao, School of Medicine and Pharmacy, Ocean University of China, QingDao 266580, China
- Ximing Xu
  - Marine Biomedical Institute of Qingdao, School of Medicine and Pharmacy, Ocean University of China, QingDao 266580, China
- Tao Song
  - College of Computer Science and Technology, China University of Petroleum, QingDao 266580, China
  - Department of Artificial Intelligence, Faculty of Computer Science, Polytechnical University of Madrid, Madrid 28031, Spain
17
Yang Y, Sun S, Yang S, Yang Q, Lu X, Wang X, Yu Q, Huo X, Qian X. Structural annotation of unknown molecules in a miniaturized mass spectrometer based on a transformer enabled fragment tree method. Commun Chem 2024; 7:109. [PMID: 38740942 DOI: 10.1038/s42004-024-01189-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/30/2023] [Accepted: 04/26/2024] [Indexed: 05/16/2024] Open
Abstract
Structural annotation of small molecules in tandem mass spectrometry has always been a central challenge in mass spectrometry analysis, especially using a miniaturized mass spectrometer for on-site testing. Here, we propose the Transformer enabled Fragment Tree (TeFT) method, which combines various types of fragmentation tree models and a deep learning Transformer module. It is aimed to generate the specific structure of molecules de novo solely from mass spectrometry spectra. The evaluation results on different open-source databases indicated that the proposed model achieved remarkable results in that the majority of molecular structures of compounds in the test can be successfully recognized. Also, the TeFT has been validated on a miniaturized mass spectrometer with low-resolution spectra for 16 flavonoid alcohols, achieving complete structure prediction for 8 substances. Finally, TeFT confirmed the structure of the compound contained in a Chinese medicine substance called the Anweiyang capsule. These results indicate that the TeFT method is suitable for annotating fragmentation peaks with clear fragmentation rules, particularly when applied to on-site mass spectrometry with lower mass resolution.
Affiliation(s)
- Yiming Yang
  - Shenzhen International Graduate School, Tsinghua University, Shenzhen, 518055, China
- Shuang Sun
  - Shenzhen International Graduate School, Tsinghua University, Shenzhen, 518055, China
- Shuyuan Yang
  - Shenzhen International Graduate School, Tsinghua University, Shenzhen, 518055, China
- Qin Yang
  - Shenzhen International Graduate School, Tsinghua University, Shenzhen, 518055, China
- Xinqiong Lu
  - CHIN Instrument (Hefei) Co., Ltd., Hefei, 231200, China
- Xiaohao Wang
  - Shenzhen International Graduate School, Tsinghua University, Shenzhen, 518055, China
- Quan Yu
  - Shenzhen International Graduate School, Tsinghua University, Shenzhen, 518055, China
- Xinming Huo
  - Key Laboratory of Sensing Technology and Biomedical Instruments of Guangdong Province, School of Biomedical Engineering, Shenzhen Campus of Sun Yat-sen University, Shenzhen, 518107, China
- Xiang Qian
  - Shenzhen International Graduate School, Tsinghua University, Shenzhen, 518055, China
18
Luginina AP, Khnykin AN, Khorn PA, Moiseeva OV, Safronova NA, Pospelov VA, Dashevskii DE, Belousov AS, Borschevskiy VI, Mishin AV. Rational Design of Drugs Targeting G-Protein-Coupled Receptors: Ligand Search and Screening. BIOCHEMISTRY. BIOKHIMIIA 2024; 89:958-972. [PMID: 38880655 DOI: 10.1134/s0006297924050158] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/10/2024] [Revised: 02/22/2024] [Accepted: 02/23/2024] [Indexed: 06/18/2024]
Abstract
G protein-coupled receptors (GPCRs) are transmembrane proteins that participate in many physiological processes and represent major pharmacological targets. Recent advances in structural biology of GPCRs have enabled the development of drugs based on the receptor structure (structure-based drug design, SBDD). SBDD utilizes information about the receptor-ligand complex to search for suitable compounds, thus expanding the chemical space of possible receptor ligands without the need for experimental screening. The review describes the use of structure-based virtual screening (SBVS) for GPCR ligands and approaches for the functional testing of potential drug compounds, as well as discusses recent advances and successful examples in the application of SBDD for the identification of GPCR ligands.
Affiliation(s)
- Aleksandra P Luginina
  - Research Center for Molecular Mechanisms of Aging and Age-Related Diseases, Moscow Institute of Physics and Technology, Dolgoprudny, Moscow Region, 141701, Russia
- Andrey N Khnykin
  - Research Center for Molecular Mechanisms of Aging and Age-Related Diseases, Moscow Institute of Physics and Technology, Dolgoprudny, Moscow Region, 141701, Russia
- Polina A Khorn
  - Research Center for Molecular Mechanisms of Aging and Age-Related Diseases, Moscow Institute of Physics and Technology, Dolgoprudny, Moscow Region, 141701, Russia
- Olga V Moiseeva
  - Research Center for Molecular Mechanisms of Aging and Age-Related Diseases, Moscow Institute of Physics and Technology, Dolgoprudny, Moscow Region, 141701, Russia
  - Skryabin Institute of Biochemistry and Physiology of Microorganisms, Russian Academy of Sciences, Pushchino, Moscow Region, 142290, Russia
- Nadezhda A Safronova
  - Research Center for Molecular Mechanisms of Aging and Age-Related Diseases, Moscow Institute of Physics and Technology, Dolgoprudny, Moscow Region, 141701, Russia
- Vladimir A Pospelov
  - Research Center for Molecular Mechanisms of Aging and Age-Related Diseases, Moscow Institute of Physics and Technology, Dolgoprudny, Moscow Region, 141701, Russia
- Dmitrii E Dashevskii
  - Research Center for Molecular Mechanisms of Aging and Age-Related Diseases, Moscow Institute of Physics and Technology, Dolgoprudny, Moscow Region, 141701, Russia
- Anatolii S Belousov
  - Research Center for Molecular Mechanisms of Aging and Age-Related Diseases, Moscow Institute of Physics and Technology, Dolgoprudny, Moscow Region, 141701, Russia
- Valentin I Borschevskiy
  - Research Center for Molecular Mechanisms of Aging and Age-Related Diseases, Moscow Institute of Physics and Technology, Dolgoprudny, Moscow Region, 141701, Russia
  - Frank Laboratory of Neutron Physics, Joint Institute for Nuclear Research, Dubna, Moscow Region, 141980, Russia
- Alexey V Mishin
  - Research Center for Molecular Mechanisms of Aging and Age-Related Diseases, Moscow Institute of Physics and Technology, Dolgoprudny, Moscow Region, 141701, Russia
19
Chakarborty S, Irshad IU, Mahima, Sharma AK. TIR predictor and optimizer: Web-tools for accurate prediction of translation initiation rate and precision gene design in Saccharomyces cerevisiae. Biotechnol J 2024; 19:e2400081. [PMID: 38719586 DOI: 10.1002/biot.202400081] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/10/2024] [Revised: 04/15/2024] [Accepted: 04/16/2024] [Indexed: 05/14/2024]
Abstract
Translation initiation is the primary determinant of the rate of protein production. The variation in the rate with which this step occurs can cause up to three orders of magnitude differences in cellular protein levels. Several mRNA features, including mRNA stability in proximity to the start codon, coding sequence length, and presence of specific motifs in the mRNA molecule, have been shown to influence the translation initiation rate. These molecular factors acting at different strengths allow precise control of in vivo translation initiation rate and thus the rate of protein synthesis. However, despite the paramount importance of translation initiation rate in protein synthesis, accurate prediction of the absolute values of initiation rate remains a challenge. In fact, as of now, there is no available model for predicting the initiation rate in Saccharomyces cerevisiae. To address this, we train a machine learning model for predicting the in vivo initiation rate in S. cerevisiae transcripts. The model is trained using a diverse set of mRNA transcripts, enabling the comparison of initiation rates across different transcripts. Our model exhibited excellent accuracy in predicting the translation initiation rate and demonstrated its effectiveness with both endogenous and exogenous transcripts. Then, by combining the machine learning model with the Monte-Carlo search algorithm, we have also devised a method to optimize the nucleotide sequence of any gene to achieve a specific target initiation rate. The machine learning model we've developed for predicting translation initiation rates, along with the gene optimization method, are deployed as a web server. Both web servers are accessible for free at the following link: ajeetsharmalab.com/TIRPredictor. Thus, this research advances our fundamental understanding of translation initiation processes, with direct applications in biotechnology.
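The idea of pairing a rate predictor with a Monte Carlo search to reach a target initiation rate can be sketched as follows. Everything here is illustrative: `toy_rate_predictor` is a hypothetical stand-in (the A/U fraction of the sequence) rather than the trained model, the Metropolis acceptance rule and the `temperature` and `steps` values are assumptions about how such a search might be set up, and the sketch does not restrict mutations to synonymous codons, which a real gene-design workflow would likely need.

```python
import math
import random


def toy_rate_predictor(seq):
    """Hypothetical stand-in for a trained model: fraction of A/U bases."""
    return sum(base in "AU" for base in seq) / len(seq)


def monte_carlo_optimize(seq, target, predictor, steps=2000,
                         temperature=0.02, seed=0):
    """Mutate one base at a time, steering the predicted rate toward target."""
    rng = random.Random(seed)  # seeded for reproducibility
    current = list(seq)
    current_err = abs(predictor(current) - target)
    best, best_err = current[:], current_err

    for _ in range(steps):
        pos = rng.randrange(len(current))
        old = current[pos]
        current[pos] = rng.choice([b for b in "ACGU" if b != old])
        new_err = abs(predictor(current) - target)

        # Metropolis rule: always accept improvements, and accept a worse
        # move with probability exp(-(increase in error) / temperature).
        accept = new_err <= current_err or (
            rng.random() < math.exp((current_err - new_err) / temperature)
        )
        if accept:
            current_err = new_err
            if new_err < best_err:
                best, best_err = current[:], new_err
        else:
            current[pos] = old  # reject: undo the mutation

    return "".join(best), best_err


start = "GCGCGCGCGCGCGCGCGCGC"
optimized, error = monte_carlo_optimize(start, 0.5, toy_rate_predictor)
```

Tracking the best sequence separately from the current one means the returned error can never exceed that of the starting sequence, even when the walk temporarily accepts worse moves.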
Affiliation(s)
- Mahima
  - Department of Physics, Indian Institute of Technology Jammu, Jammu, India
- Ajeet K Sharma
  - Department of Physics, Indian Institute of Technology Jammu, Jammu, India
  - Department of Biosciences and Bioengineering, Indian Institute of Technology Jammu, Jammu, India
20
Kumar N, Acharya V. Advances in machine intelligence-driven virtual screening approaches for big-data. Med Res Rev 2024; 44:939-974. [PMID: 38129992 DOI: 10.1002/med.21995] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/12/2022] [Revised: 07/15/2023] [Accepted: 10/29/2023] [Indexed: 12/23/2023]
Abstract
Virtual screening (VS) is an integral and ever-evolving part of the drug discovery framework. VS is traditionally classified into ligand-based (LB) and structure-based (SB) approaches. Machine intelligence, or artificial intelligence, has wide applications in the drug discovery domain to reduce time and resource consumption. In combination with machine intelligence algorithms, VS has emerged as a progressive technology that learns within robust decision orders for data curation and hit molecule screening from large VS libraries in minutes or hours. The exponential growth of chemical and biological data, which has evolved into "big-data" in the public domain, demands modern and advanced machine intelligence-driven VS approaches to screen hit molecules from ultra-large VS libraries. VS has evolved from individual approaches (LB and SB) to integrated LB and SB techniques that explore various ligand and target protein aspects for an enhanced rate of appropriate hit molecule prediction. Current trends demand advanced and intelligent solutions to handle enormous data in the drug discovery domain for screening and optimizing hits or leads with few or no false positive hits. Following the big-data drift and the tremendous growth in computational architecture, we present this review. The article categorizes and emphasizes individual VS techniques, details the literature on machine learning implementation and modern machine intelligence approaches, and discusses limitations and future prospects.
Affiliation(s)
- Neeraj Kumar
  - Artificial Intelligence for Computational Biology Lab (AICoB), Biotechnology Division, CSIR-Institute of Himalayan Bioresource Technology, Palampur, Himachal Pradesh, India
  - Academy of Scientific and Innovative Research, Ghaziabad, India
- Vishal Acharya
  - Artificial Intelligence for Computational Biology Lab (AICoB), Biotechnology Division, CSIR-Institute of Himalayan Bioresource Technology, Palampur, Himachal Pradesh, India
  - Academy of Scientific and Innovative Research, Ghaziabad, India
21
Zhang G, Zhang Y, Li L, Zhou J, Chen H, Ji J, Li Y, Cao Y, Xu Z, Pian C. Exploring Novel Fentanyl Analogues Using a Graph-Based Transformer Model. Interdiscip Sci 2024:10.1007/s12539-024-00623-0. [PMID: 38683279 DOI: 10.1007/s12539-024-00623-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/04/2023] [Revised: 02/23/2024] [Accepted: 02/25/2024] [Indexed: 05/01/2024]
Abstract
The structures of fentanyl and its analogues are easy to modify, and few types have been included in databases so far, which allows criminals to evade the supervision of relevant departments. This paper introduces a molecular graph-based transformer model, which is combined with a data augmentation method based on substructure replacement to generate novel fentanyl analogues. A total of 140,000 molecules were generated, and after a series of screening steps, 36,799 potential fentanyl analogues were obtained. We calculated the molecular properties of these 36,799 potential fentanyl analogues. The results showed that the model could learn some properties of the original fentanyl molecules. We compared the molecules generated by the transformer model and the substructure-replacement data augmentation method with those generated by two other deep learning-based molecular generation models, and found that the model in this paper can generate more novel potential fentanyl analogues. Finally, the findings indicate that a transformer model based on molecular graphs helps us explore the structures of potential fentanyl molecules as well as understand the distribution of the original fentanyl molecules.
Affiliation(s)
- Guangle Zhang
  - College of Science, Wuxi University, 214105, Wuxi, China
- Yuan Zhang
  - College of Agriculture, Nanjing Agricultural University, 210095, Nanjing, China
- Ling Li
  - Zhejiang Laboratory, 311121, Hangzhou, China
- Jiaying Zhou
  - College of Science, Nanjing Agricultural University, 210095, Nanjing, China
- Honglin Chen
  - College of Science, Nanjing Agricultural University, 210095, Nanjing, China
- Jinwen Ji
  - College of Agriculture, Nanjing Agricultural University, 210095, Nanjing, China
- Yanru Li
  - College of Agriculture, Nanjing Agricultural University, 210095, Nanjing, China
- Yue Cao
  - Department of Forensic Medicine, Nanjing Medical University, 211166, Nanjing, China
- Zhihui Xu
  - School of Pharmacy, China Pharmaceutical University, 211198, Nanjing, China
- Cong Pian
  - School of Basic Medicine and Clinical Pharmacy, China Pharmaceutical University, 211198, Nanjing, China
22
Guo Z, Fan Y, Yu C, Lu H, Zhang Z. GCMSFormer: A Fully Automatic Method for the Resolution of Overlapping Peaks in Gas Chromatography-Mass Spectrometry. Anal Chem 2024; 96:5878-5886. [PMID: 38560891 DOI: 10.1021/acs.analchem.3c05772] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/04/2024]
Abstract
Gas chromatography-mass spectrometry (GC-MS) is one of the most important instruments for analyzing volatile organic compounds. However, the complexity of real samples and the limitations of chromatographic separation capabilities lead to coeluting compounds without ideal separation. In this study, a Transformer-based automatic resolution method (GCMSFormer) is proposed to resolve mass spectra from GC-MS peaks in an end-to-end manner, predicting the mass spectra of components directly from the raw overlapping peaks data. Furthermore, orthogonal projection resolution (OPR) was integrated into GCMSFormer to resolve minor components. The GCMSFormer model was trained, validated, and tested using 100,000 augmented data. It achieves 99.88% of the bilingual evaluation understudy (BLEU) value on the test set, significantly higher than the 97.68% BLEU value of the baseline sequence-to-sequence model long short-term memory (LSTM). GCMSFormer was also compared with two nondeep learning resolution tools (MZmine and AMDIS) and two deep learning resolution tools (PARAFAC2 with DL and MSHub/GNPS) on a real plant essential oil GC-MS data set. Their resolution results were compared on evaluation metrics, including the number of compounds resolved, mass spectral match score, correlation coefficient, explained variance, and resolution speed. The results demonstrate that GCMSFormer has better resolution performance, higher automation, and faster resolution speed. In summary, GCMSFormer is an end-to-end, fast, fully automatic, and accurate method for analyzing GC-MS data of complex samples.
Affiliation(s)
- Zixuan Guo
  - College of Chemistry and Chemical Engineering, Central South University, Hunan, Changsha 410083, China
- Yingjie Fan
  - College of Chemistry and Chemical Engineering, Central South University, Hunan, Changsha 410083, China
- Chuanxiu Yu
  - College of Chemistry and Chemical Engineering, Central South University, Hunan, Changsha 410083, China
- Hongmei Lu
  - College of Chemistry and Chemical Engineering, Central South University, Hunan, Changsha 410083, China
- Zhimin Zhang
  - College of Chemistry and Chemical Engineering, Central South University, Hunan, Changsha 410083, China
23
Bhowmik D, Zhang P, Fox Z, Irle S, Gounley J. Enhancing molecular design efficiency: Uniting language models and generative networks with genetic algorithms. PATTERNS (NEW YORK, N.Y.) 2024; 5:100947. [PMID: 38645768 PMCID: PMC11026973 DOI: 10.1016/j.patter.2024.100947] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/26/2023] [Revised: 11/14/2023] [Accepted: 02/08/2024] [Indexed: 04/23/2024]
Abstract
This study examines the effectiveness of generative models in drug discovery, material science, and polymer science, aiming to overcome constraints associated with traditional inverse design methods relying on heuristic rules. Generative models generate synthetic data resembling real data, enabling deep learning model training without extensive labeled datasets. They prove valuable in creating virtual libraries of molecules for material science and facilitating drug discovery by generating molecules with specific properties. While generative adversarial networks (GANs) are explored for these purposes, mode collapse restricts their efficacy, limiting novel structure variability. To address this, we introduce a masked language model (LM) inspired by natural language processing. Although LMs alone can have inherent limitations, we propose a hybrid architecture combining LMs and GANs to efficiently generate new molecules, demonstrating superior performance over standalone masked LMs, particularly for smaller population sizes. This hybrid LM-GAN architecture enhances efficiency in optimizing properties and generating novel samples.
Affiliation(s)
- Debsindhu Bhowmik
  - Computational Sciences and Engineering Division, Oak Ridge National Laboratory, Oak Ridge, TN 37831, USA
- Pei Zhang
  - Computational Sciences and Engineering Division, Oak Ridge National Laboratory, Oak Ridge, TN 37831, USA
- Zachary Fox
  - Computational Sciences and Engineering Division, Oak Ridge National Laboratory, Oak Ridge, TN 37831, USA
- Stephan Irle
  - Computational Sciences and Engineering Division, Oak Ridge National Laboratory, Oak Ridge, TN 37831, USA
- John Gounley
  - Computational Sciences and Engineering Division, Oak Ridge National Laboratory, Oak Ridge, TN 37831, USA
24
Xie J, Chen S, Lei J, Yang Y. DiffDec: Structure-Aware Scaffold Decoration with an End-to-End Diffusion Model. J Chem Inf Model 2024; 64:2554-2564. [PMID: 38267393 DOI: 10.1021/acs.jcim.3c01466] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/26/2024]
Abstract
In molecular optimization, one popular way is R-group decoration on molecular scaffolds, and many efforts have been made to generate R-groups based on deep generative models. However, these methods mostly use information on known binding ligands, without fully utilizing target structure information. In this study, we proposed a new method, DiffDec, to involve 3D pocket constraints by a modified diffusion technique for optimizing molecules through molecular scaffold decoration. For end-to-end generation of R-groups with different sizes, we designed a novel fake atom mechanism. DiffDec was shown to be able to generate structure-aware R-groups with realistic geometric substructures by the analysis of bond angles and dihedral angles and simultaneously generate multiple R-groups for one scaffold on different growth anchors. The growth anchors could be provided by users or automatically determined by our model. DiffDec achieved R-group recovery rates of 69.67% and 45.34% in the single and multiple R-group decoration tasks, respectively, and these values were significantly higher than competing methods (37.33% and 26.85%). According to the molecular docking study, our decorated molecules obtained a better average binding affinity than baseline methods. The docking pose analysis revealed that DiffDec could decorate scaffolds with R-groups that exhibited improved binding affinities and more favorable interactions with the pocket. These results demonstrated the potential and applicability of DiffDec in real-world scaffold decoration for molecular optimization.
Affiliation(s)
- Junjie Xie
  - School of Computer Science and Engineering, Sun Yat-sen University, Guangzhou 510006, China
  - AixplorerBio Inc., Jiaxing 314031, China
- Sheng Chen
  - School of Computer Science and Engineering, Sun Yat-sen University, Guangzhou 510006, China
  - AixplorerBio Inc., Jiaxing 314031, China
- Jinping Lei
  - School of Pharmaceutical Sciences, Sun Yat-sen University, Guangzhou 510006, China
- Yuedong Yang
  - School of Computer Science and Engineering, Sun Yat-sen University, Guangzhou 510006, China
25
Pang C, Qiao J, Zeng X, Zou Q, Wei L. Deep Generative Models in De Novo Drug Molecule Generation. J Chem Inf Model 2024; 64:2174-2194. [PMID: 37934070 DOI: 10.1021/acs.jcim.3c01496] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/08/2023]
Abstract
The discovery of new drugs has important implications for human health. Traditional methods for drug discovery rely on experiments to optimize the structure of lead molecules, which are time-consuming and high-cost. Recently, artificial intelligence has exhibited promising and efficient performance for drug-like molecule generation. In particular, deep generative models achieve great success in de novo generation of drug-like molecules with desired properties, showing massive potential for novel drug discovery. In this study, we review the recent progress of molecule generation using deep generative models, mainly focusing on molecule representations, public databases, data processing tools, and advanced artificial intelligence based molecule generation frameworks. In particular, we present a comprehensive comparison of state-of-the-art deep generative models for molecule generation and a summary of commonly used molecular design strategies. We identify research gaps and challenges of molecule generation such as the need for better databases, missing 3D information in molecular representation, and the lack of high-precision evaluation metrics. We suggest future directions for molecular generation and drug discovery.
Affiliation(s)
- Chao Pang
  - School of Software, Shandong University, Jinan 250100, China
  - Joint SDU-NTU Centre for Artificial Intelligence Research (C-FAIR), Shandong University, Jinan 250100, China
- Jianbo Qiao
  - School of Software, Shandong University, Jinan 250100, China
  - Joint SDU-NTU Centre for Artificial Intelligence Research (C-FAIR), Shandong University, Jinan 250100, China
- Xiangxiang Zeng
  - College of Information Science and Engineering, Hunan University, Changsha 410082, China
- Quan Zou
  - Institute of Fundamental and Frontier Sciences, University of Electronic Science and Technology of China, Chengdu 610054, China
- Leyi Wei
  - School of Software, Shandong University, Jinan 250100, China
  - Joint SDU-NTU Centre for Artificial Intelligence Research (C-FAIR), Shandong University, Jinan 250100, China
|
26
|
Ghiandoni GM, Flanagan SR, Bodkin MJ, Nizi MG, Galera-Prat A, Brai A, Chen B, Wallace JEA, Hristozov D, Webster J, Manfroni G, Lehtiö L, Tabarrini O, Gillet VJ. Synthetically accessible de novo design using reaction vectors: Application to PARP1 inhibitors. Mol Inform 2024; 43:e202300183. [PMID: 38258328 DOI: 10.1002/minf.202300183] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/21/2023] [Revised: 01/16/2024] [Accepted: 01/22/2024] [Indexed: 01/24/2024]
Abstract
De novo design has been a hotly pursued topic for many years. Most recent developments have involved the use of deep learning methods for generative molecular design. Despite increasing levels of algorithmic sophistication, the design of molecules that are synthetically accessible remains a major challenge. Reaction-based de novo design takes a conceptually simpler approach and aims to address synthesisability directly by mimicking synthetic chemistry and driving structural transformations by known reactions that are applied in a stepwise manner. However, the use of a small number of hand-coded transformations restricts the chemical space that can be accessed and there are few examples in the literature where molecules and their synthetic routes have been designed and executed successfully. Here we describe the application of reaction-based de novo design to the design of synthetically accessible and biologically active compounds as proof-of-concept of our reaction vector-based software. Reaction vectors are derived automatically from known reactions and allow access to a wide region of synthetically accessible chemical space. The design was aimed at producing molecules that are active against PARP1 and which have improved brain penetration properties compared to existing PARP1 inhibitors. We synthesised a selection of the designed molecules according to the provided synthetic routes and tested them experimentally. The results demonstrate that reaction vectors can be applied to the design of novel molecules of biological relevance that are also synthetically accessible.
Affiliation(s)
- Gian Marco Ghiandoni
  - Information School, University of Sheffield, Regent Court, 211 Portobello, Sheffield, S1 4DP, UK
- Stuart R Flanagan
  - Evotec (U.K.) Ltd, 114 Innovation Drive, Milton Park, Abingdon, OX14 4RZ, UK
- Michael J Bodkin
  - Evotec (U.K.) Ltd, 114 Innovation Drive, Milton Park, Abingdon, OX14 4RZ, UK
- Maria Giulia Nizi
  - Department of Pharmaceutical Sciences, University of Perugia, 06123, Perugia, Italy
- Albert Galera-Prat
  - Faculty of Biochemistry and Molecular Medicine & Biocenter Oulu, University of Oulu, Oulu, FI-90014, Finland
- Annalaura Brai
  - Department of Biotechnology, Chemistry and Pharmacy, University of Siena, I-53100, Siena, Italy
- Beining Chen
  - Department of Chemistry, University of Sheffield, Dainton Building, Brook Hill, Sheffield, S3 7HF, UK
- James E A Wallace
  - Evotec (U.K.) Ltd, 114 Innovation Drive, Milton Park, Abingdon, OX14 4RZ, UK
- Dimitar Hristozov
  - Evotec (U.K.) Ltd, 114 Innovation Drive, Milton Park, Abingdon, OX14 4RZ, UK
- James Webster
  - Information School, University of Sheffield, Regent Court, 211 Portobello, Sheffield, S1 4DP, UK
- Giuseppe Manfroni
  - Department of Pharmaceutical Sciences, University of Perugia, 06123, Perugia, Italy
- Lari Lehtiö
  - Faculty of Biochemistry and Molecular Medicine & Biocenter Oulu, University of Oulu, Oulu, FI-90014, Finland
- Oriana Tabarrini
  - Department of Pharmaceutical Sciences, University of Perugia, 06123, Perugia, Italy
- Valerie J Gillet
  - Information School, University of Sheffield, Regent Court, 211 Portobello, Sheffield, S1 4DP, UK
|
27
|
Vogt M. Chemoinformatic approaches for navigating large chemical spaces. Expert Opin Drug Discov 2024; 19:403-414. [PMID: 38300511 DOI: 10.1080/17460441.2024.2313475] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/12/2023] [Accepted: 01/30/2024] [Indexed: 02/02/2024]
Abstract
INTRODUCTION: Large chemical spaces (CSs) include traditional large compound collections, combinatorial libraries covering billions to trillions of molecules, DNA-encoded chemical libraries comprising complete combinatorial CSs in a single mixture, and virtual CSs explored by generative models. The diverse nature of these types of CSs requires different chemoinformatic approaches for navigation. AREAS COVERED: An overview of different types of large CSs is provided. Molecular representations and similarity metrics suitable for large CS exploration are discussed. A summary of navigation of CSs in generative models is provided. Methods for characterizing and comparing CSs are discussed. EXPERT OPINION: The size of large CSs might restrict navigation to specialized algorithms and limit it to considering neighborhoods of structurally similar molecules. Efficient navigation of large CSs not only requires methods that scale with size but also requires smart approaches that focus on better but not necessarily larger molecule selections. Deep generative models aim to provide such approaches by implicitly learning features relevant for targeted biological properties. It is unclear whether these models can fulfill this ideal as validation is difficult as long as the covered CSs remain mainly virtual without experimental verification.
Affiliation(s)
- Martin Vogt
  - Department of Life Science Informatics, B-IT, LIMES Program Unit Chemical Biology and Medicinal Chemistry, Rheinische Friedrich-Wilhelms-Universität, Bonn, Germany
|
28
|
Fan W, He Y, Zhu F. RM-GPT: Enhance the comprehensive generative ability of molecular GPT model via LocalRNN and RealFormer. Artif Intell Med 2024; 150:102827. [PMID: 38553166 DOI: 10.1016/j.artmed.2024.102827] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/30/2023] [Revised: 02/26/2024] [Accepted: 02/26/2024] [Indexed: 04/02/2024]
Abstract
Owing to surging costs, artificial intelligence-assisted de novo drug design has supplanted conventional methods and become an emerging option for drug discovery. Although many successful examples of applying generative models to the molecular field have arisen, these methods struggle with conditional generation that meets chemists' practical requirements, which call for a controllable process to generate new molecules or optimize basic molecules under specified conditions. To address this problem, a Recurrent Molecular-Generative Pretrained Transformer model is proposed, supplemented by LocalRNN and Residual Attention Layer Transformer, referred to as RM-GPT. RM-GPT rebuilds the GPT model's architecture by incorporating LocalRNN and Residual Attention Layer Transformer so that it is able to extract local information and build connectivity between attention blocks. The incorporation of Transformer in these two modules enables leveraging the parallel computing advantages of multi-head attention mechanisms while extracting local structural information effectively. Through exploring and learning in a large chemical space, RM-GPT acquires the ability to generate drug-like molecules that satisfy the required conditions, such as desired properties and scaffolds, precisely and stably. RM-GPT achieved better results than SOTA methods on conditional generation.
Affiliation(s)
- Wenfeng Fan
  - School of Computer Science and Technology, Soochow University, Suzhou, 215006, China.
- Yue He
  - School of Computer Science and Technology, Soochow University, Suzhou, 215006, China.
- Fei Zhu
  - School of Computer Science and Technology, Soochow University, Suzhou, 215006, China.
|
29
|
Zhang Y, Tong Y, Xia X, Wu Q, Su Y. A domain-label-guided translation model for molecular optimization. Methods 2024; 224:71-78. [PMID: 38395182 DOI: 10.1016/j.ymeth.2024.02.005] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/21/2023] [Revised: 02/11/2024] [Accepted: 02/17/2024] [Indexed: 02/25/2024] Open
Abstract
Molecular optimization, which aims to improve molecular properties by modifying complex molecular structures, is a crucial and challenging task in drug discovery. In recent years, translation models provide a promising way to transform low-property molecules to high-property molecules, which enables molecular optimization to achieve remarkable progress. However, most existing models require matched molecular pairs, which are prone to be limited by the datasets. Although some models do not require matched molecular pairs, their performance is usually sacrificed due to the lack of useful supervising information. To address this issue, a domain-label-guided translation model is proposed in this paper, namely DLTM. In the model, the domain label information of molecules is exploited as a control condition to obtain different embedding representations, enabling the model to generate diverse molecules. Besides, the model adopts a classifier network to identify the property categories of transformed molecules, guiding the model to generate molecules with desired properties. The performance of DLTM is verified on two optimization tasks, namely the quantitative estimation of drug-likeness and penalized logP. Experimental results show that the proposed DLTM is superior to the compared baseline models.
Affiliation(s)
- Yajie Zhang
  - School of Computer Science and Technology, Anhui University, Hefei, 230601, China.
- Yongqi Tong
  - School of Computer Science and Technology, Anhui University, Hefei, 230601, China.
- Xin Xia
  - Institute of Artificial Intelligence, Hefei Comprehensive National Science Center, Hefei, 230088, China; School of Artificial Intelligence, Anhui University, Hefei, 230601, China.
- Qingwen Wu
  - Affiliated Hospital of Jining Medical University, Jining, 272007, China.
- Yansen Su
  - Institute of Artificial Intelligence, Hefei Comprehensive National Science Center, Hefei, 230088, China; School of Artificial Intelligence, Anhui University, Hefei, 230601, China.
|
30
|
Ghandikota SK, Jegga AG. Application of artificial intelligence and machine learning in drug repurposing. PROGRESS IN MOLECULAR BIOLOGY AND TRANSLATIONAL SCIENCE 2024; 205:171-211. [PMID: 38789178 DOI: 10.1016/bs.pmbts.2024.03.030] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/26/2024]
Abstract
The purpose of drug repurposing is to leverage previously approved drugs for a particular disease indication and apply them to another disease. It can be seen as a faster and more cost-effective approach to drug discovery and a powerful tool for achieving precision medicine. In addition, drug repurposing can be used to identify therapeutic candidates for rare diseases and phenotypic conditions with limited information on disease biology. Machine learning and artificial intelligence (AI) methodologies have enabled the construction of effective, data-driven repurposing pipelines by integrating and analyzing large-scale biomedical data. Recent technological advances, especially in heterogeneous network mining and natural language processing, have opened up exciting new opportunities and analytical strategies for drug repurposing. In this review, we first introduce the challenges in repurposing approaches and highlight some success stories, including those during the COVID-19 pandemic. Next, we review some existing computational frameworks in the literature, organized on the basis of the type of biomedical input data analyzed and the computational algorithms involved. In conclusion, we outline some exciting new directions that drug repurposing research may take, as pioneered by the generative AI revolution.
Affiliation(s)
- Sudhir K Ghandikota
  - Division of Biomedical Informatics, Cincinnati Children's Hospital Medical Center, Cincinnati, OH, United States
- Anil G Jegga
  - Division of Biomedical Informatics, Cincinnati Children's Hospital Medical Center, Cincinnati, OH, United States; Department of Pediatrics, University of Cincinnati College of Medicine, Cincinnati, OH, United States.
|
31
|
Wang C, Ong HH, Chiba S, Rajapakse JC. GLDM: hit molecule generation with constrained graph latent diffusion model. Brief Bioinform 2024; 25:bbae142. [PMID: 38581415 PMCID: PMC10998532 DOI: 10.1093/bib/bbae142] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/19/2023] [Revised: 03/08/2024] [Accepted: 03/03/2024] [Indexed: 04/08/2024] Open
Abstract
Discovering hit molecules with desired biological activity in a directed manner is a promising but profound task in computer-aided drug discovery. Inspired by recent generative AI approaches, particularly Diffusion Models (DM), we propose the Graph Latent Diffusion Model (GLDM), a latent DM that preserves both the effectiveness of autoencoders in compressing complex chemical data and the DM's capability of generating novel molecules. Specifically, we first develop an autoencoder to encode the molecular data into low-dimensional latent representations and then train the DM on the latent space to generate molecules inducing targeted biological activity defined by gene expression profiles. Manipulating DM in the latent space rather than the input space avoids complicated operations to map molecule decomposition and reconstruction to diffusion processes, and thus improves training efficiency. Experiments show that GLDM not only achieves outstanding performance on molecular generation benchmarks, but also generates samples with optimal chemical properties and the potential to induce desired biological activity.
Affiliation(s)
- Conghao Wang
  - School of Computer Science and Engineering, Nanyang Technological University, 50 Nanyang Ave, 639798, Singapore
- Hiok Hian Ong
  - School of Computer Science and Engineering, Nanyang Technological University, 50 Nanyang Ave, 639798, Singapore
- Shunsuke Chiba
  - School of Chemistry, Chemical Engineering and Biotechnology, Nanyang Technological University, 21 Nanyang Link, 637371, Singapore
- Jagath C Rajapakse
  - School of Computer Science and Engineering, Nanyang Technological University, 50 Nanyang Ave, 639798, Singapore
|
32
|
Chang J, Ye JC. Bidirectional generation of structure and properties through a single molecular foundation model. Nat Commun 2024; 15:2323. [PMID: 38485914 PMCID: PMC10940637 DOI: 10.1038/s41467-024-46440-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/12/2023] [Accepted: 02/27/2024] [Indexed: 03/18/2024] Open
Abstract
Recent successes of foundation models in artificial intelligence have prompted the emergence of large-scale chemical pre-trained models. Despite the growing interest in large molecular pre-trained models that provide informative representations for downstream tasks, attempts at multimodal pre-training in the molecular domain have been limited. To address this, here we present a multimodal molecular pre-trained model that incorporates the modalities of structure and biochemical properties, drawing inspiration from recent advances in multimodal learning techniques. Our proposed model pipeline of data handling and training objectives aligns the structure/property features in a common embedding space, which enables the model to regard bidirectional information between the molecules' structure and properties. These contributions yield synergistic knowledge, allowing us to tackle both multimodal and unimodal downstream tasks through a single model. Through extensive experiments, we demonstrate that our model has the capabilities to solve various meaningful chemical challenges, including conditional molecule generation, property prediction, molecule classification, and reaction prediction.
Affiliation(s)
- Jinho Chang
  - Graduate School of AI, KAIST, Daejeon, South Korea
- Jong Chul Ye
  - Graduate School of AI, KAIST, Daejeon, South Korea.
|
33
|
Dodds M, Guo J, Löhr T, Tibo A, Engkvist O, Janet JP. Sample efficient reinforcement learning with active learning for molecular design. Chem Sci 2024; 15:4146-4160. [PMID: 38487235 PMCID: PMC10935729 DOI: 10.1039/d3sc04653b] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/03/2023] [Accepted: 02/07/2024] [Indexed: 03/17/2024] Open
Abstract
Reinforcement learning (RL) is a powerful and flexible paradigm for searching for solutions in high-dimensional action spaces. However, bridging the gap between playing computer games with thousands of simulated episodes and solving real scientific problems with complex and involved environments (up to actual laboratory experiments) requires improvements in terms of sample efficiency to make the most of expensive information. The discovery of new drugs is a major commercial application of RL, motivated by the very large nature of the chemical space and the need to perform multiparameter optimization (MPO) across different properties. In silico methods, such as virtual library screening (VS) and de novo molecular generation with RL, show great promise in accelerating this search. However, incorporation of increasingly complex computational models in these workflows requires increasing sample efficiency. Here, we introduce an active learning system linked with an RL model (RL-AL) for molecular design, which aims to improve the sample-efficiency of the optimization process. We identify and characterize unique challenges combining RL and AL, investigate the interplay between the systems, and develop a novel AL approach to solve the MPO problem. Our approach greatly expedites the search for novel solutions relative to baseline-RL for simple ligand- and structure-based oracle functions, with a 5- to 66-fold increase in hits generated for a fixed oracle budget and a 4- to 64-fold reduction in computational time to find a specific number of hits. Furthermore, compounds discovered through RL-AL display substantial enrichment of a multi-parameter scoring objective, indicating superior efficacy in curating high-scoring compounds, without a reduction in output diversity. This significant acceleration improves the feasibility of oracle functions that have largely been overlooked in RL due to high computational costs, for example free energy perturbation methods, and in principle is applicable to any RL domain.
Affiliation(s)
- Michael Dodds
  - Molecular AI, Discovery Sciences, R&D, AstraZeneca 431 50 Gothenburg Sweden
- Jeff Guo
  - Molecular AI, Discovery Sciences, R&D, AstraZeneca 431 50 Gothenburg Sweden
- Thomas Löhr
  - Molecular AI, Discovery Sciences, R&D, AstraZeneca 431 50 Gothenburg Sweden
- Alessandro Tibo
  - Molecular AI, Discovery Sciences, R&D, AstraZeneca 431 50 Gothenburg Sweden
- Ola Engkvist
  - Molecular AI, Discovery Sciences, R&D, AstraZeneca 431 50 Gothenburg Sweden
- Jon Paul Janet
  - Molecular AI, Discovery Sciences, R&D, AstraZeneca 431 50 Gothenburg Sweden
|
34
|
Tu G, Fu T, Zheng G, Xu B, Gou R, Luo D, Wang P, Xue W. Computational Chemistry in Structure-Based Solute Carrier Transporter Drug Design: Recent Advances and Future Perspectives. J Chem Inf Model 2024; 64:1433-1455. [PMID: 38294194 DOI: 10.1021/acs.jcim.3c01736] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/01/2024]
Abstract
Solute carrier transporters (SLCs) are a class of important transmembrane proteins that are involved in the transportation of diverse solute ions and small molecules into cells. There are approximately 450 SLCs within the human body, and more than a quarter of them are emerging as attractive therapeutic targets for multiple complex diseases, e.g., depression, cancer, and diabetes. However, only 44 unique transporters (∼9.8% of the SLC superfamily) with 3D structures and specific binding sites have been reported. To design innovative and effective drugs targeting diverse SLCs, there are a number of obstacles that need to be overcome. However, computational chemistry, including physics-based molecular modeling and machine learning- and deep learning-based artificial intelligence (AI), provides an alternative and complementary way to the classical drug discovery approach. Here, we present a comprehensive overview of recent advances and existing challenges of the computational techniques in structure-based drug design of SLCs from three main aspects: (i) characterizing multiple conformations of the proteins during the functional process of transportation, (ii) identifying druggable sites, especially the cryptic allosteric ones, on the transporters for substrate and drug binding, and (iii) discovering diverse small molecules or synthetic protein binders targeting the binding sites. This work is expected to provide guidelines for a deep understanding of the structure and function of the SLC superfamily to facilitate rational design of novel modulators of the transporters with the aid of state-of-the-art computational chemistry technologies including artificial intelligence.
Affiliation(s)
- Gao Tu
  - Chongqing Key Laboratory of Natural Product Synthesis and Drug Research, School of Pharmaceutical Sciences, Chongqing University, Chongqing 401331, China
- Tingting Fu
  - College of Pharmaceutical Sciences, Zhejiang University, Hangzhou 310058, China
- Binbin Xu
  - Chengdu Sintanovo Biotechnology Co., Ltd., Chengdu 610200, China
- Rongpei Gou
  - Chongqing Key Laboratory of Natural Product Synthesis and Drug Research, School of Pharmaceutical Sciences, Chongqing University, Chongqing 401331, China
- Ding Luo
  - Chongqing Key Laboratory of Natural Product Synthesis and Drug Research, School of Pharmaceutical Sciences, Chongqing University, Chongqing 401331, China
- Panpan Wang
  - College of Chemistry and Pharmaceutical Engineering, Huanghuai University, Zhumadian 463000, China
- Weiwei Xue
  - Chongqing Key Laboratory of Natural Product Synthesis and Drug Research, School of Pharmaceutical Sciences, Chongqing University, Chongqing 401331, China
|
35
|
Manshour N, He F, Wang D, Xu D. Integrating Protein Structure Prediction and Bayesian Optimization for Peptide Design. RESEARCH SQUARE 2024:rs.3.rs-4045284. [PMID: 38559017 PMCID: PMC10980098 DOI: 10.21203/rs.3.rs-4045284/v1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/04/2024]
Abstract
Peptide design, with the goal of identifying peptides possessing unique biological properties, stands as a crucial challenge in peptide-based drug discovery. While traditional and computational methods have made significant strides, they often encounter hurdles due to the complexities and costs of laboratory experiments. Recent advancements in deep learning and Bayesian Optimization have paved the way for innovative research in this domain. In this context, our study presents a novel approach that effectively combines protein structure prediction with Bayesian Optimization for peptide design. By applying carefully designed objective functions, we guide and enhance the optimization trajectory for new peptide sequences. Benchmarked against multiple native structures, our methodology is tailored to generate new peptides to their optimal potential biological properties.
Affiliation(s)
- Negin Manshour
  - University of Missouri, Columbia, Columbia MO 65211, USA
- Fei He
  - University of Missouri, Columbia, Columbia MO 65211, USA
- Duolin Wang
  - University of Missouri, Columbia, Columbia MO 65211, USA
- Dong Xu
  - University of Missouri, Columbia, Columbia MO 65211, USA
|
36
|
Parrilla-Gutiérrez JM, Granda JM, Ayme JF, Bajczyk MD, Wilbraham L, Cronin L. Electron density-based GPT for optimization and suggestion of host-guest binders. NATURE COMPUTATIONAL SCIENCE 2024; 4:200-209. [PMID: 38459272 DOI: 10.1038/s43588-024-00602-x] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 06/18/2023] [Accepted: 01/23/2024] [Indexed: 03/10/2024]
Abstract
Here we present a machine learning model trained on electron density for the production of host-guest binders. These are read out as simplified molecular-input line-entry system (SMILES) format with >98% accuracy, enabling a complete characterization of the molecules in two dimensions. Our model generates three-dimensional representations of the electron density and electrostatic potentials of host-guest systems using a variational autoencoder, and then utilizes these representations to optimize the generation of guests via gradient descent. Finally the guests are converted to SMILES using a transformer. The successful practical application of our model to established molecular host systems, cucurbit[n]uril and metal-organic cages, resulted in the discovery of 9 previously validated guests for CB[6] and 7 unreported guests (with association constant Ka ranging from 13.5 M⁻¹ to 5,470 M⁻¹) and the discovery of 4 unreported guests for [Pd₂1₄]⁴⁺ (with Ka ranging from 44 M⁻¹ to 529 M⁻¹).
Affiliation(s)
- Juan M Parrilla-Gutiérrez
  - School of Chemistry, University of Glasgow, Glasgow, UK
  - School of Computing, Engineering and Built Environment, Glasgow Caledonian University, Glasgow, UK
- Jarosław M Granda
  - School of Chemistry, University of Glasgow, Glasgow, UK
  - Institute of Organic Chemistry, Polish Academy of Sciences, Warsaw, Poland
- Leroy Cronin
  - School of Chemistry, University of Glasgow, Glasgow, UK.
|
37
|
Temizer AB, Uludoğan G, Özçelik R, Koulani T, Ozkirimli E, Ulgen KO, Karali N, Özgür A. Exploring data-driven chemical SMILES tokenization approaches to identify key protein-ligand binding moieties. Mol Inform 2024; 43:e202300249. [PMID: 38196065 DOI: 10.1002/minf.202300249] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/19/2023] [Revised: 11/13/2023] [Accepted: 01/06/2024] [Indexed: 01/11/2024]
Abstract
Machine learning models have found numerous successful applications in computational drug discovery. A large body of these models represents molecules as sequences since molecular sequences are easily available, simple, and informative. The sequence-based models often segment molecular sequences into pieces called chemical words, analogous to the words that make up sentences in human languages, and then apply advanced natural language processing techniques for tasks such as de novo drug design, property prediction, and binding affinity prediction. However, the chemical characteristics and significance of these building blocks, chemical words, remain unexplored. To address this gap, we employ data-driven SMILES tokenization techniques such as Byte Pair Encoding, WordPiece, and Unigram to identify chemical words and compare the resulting vocabularies. To understand the chemical significance of these words, we build a language-inspired pipeline that treats high affinity ligands of protein targets as documents and selects key chemical words making up those ligands based on tf-idf weighting. The experiments on multiple protein-ligand affinity datasets show that despite differences in words, lengths, and validity among the vocabularies generated by different subword tokenization algorithms, the identified key chemical words exhibit similarity. Further, we conduct case studies on a number of targets to analyze the impact of key chemical words on binding. We find that these key chemical words are specific to protein targets and correspond to known pharmacophores and functional groups. Our approach elucidates chemical properties of the words identified by machine learning models and can be used in drug discovery studies to determine significant chemical moieties.
Affiliation(s)
- Asu Busra Temizer
  - Department of Pharmaceutical Chemistry, Faculty of Pharmacy, İstanbul University, İstanbul, Turkey
  - Department of Pharmaceutical Chemistry, Institute of Health Sciences, İstanbul University, İstanbul, Turkey
- Gökçe Uludoğan
  - Department of Computer Engineering, Boğaziçi University, İstanbul, Turkey
- Rıza Özçelik
  - Department of Computer Engineering, Boğaziçi University, İstanbul, Turkey
- Taha Koulani
  - Department of Pharmaceutical Chemistry, Faculty of Pharmacy, İstanbul University, İstanbul, Turkey
  - Department of Pharmaceutical Chemistry, Institute of Health Sciences, İstanbul University, İstanbul, Turkey
- Elif Ozkirimli
  - Science and Research Informatics, F. Hoffmann-La Roche Ltd, Basel, Switzerland
- Kutlu O Ulgen
  - Department of Chemical Engineering, Boğaziçi University, İstanbul, Turkey
- Nilgun Karali
  - Department of Pharmaceutical Chemistry, Faculty of Pharmacy, İstanbul University, İstanbul, Turkey
- Arzucan Özgür
  - Department of Computer Engineering, Boğaziçi University, İstanbul, Turkey
|
38
|
Wang M, Wu Z, Wang J, Weng G, Kang Y, Pan P, Li D, Deng Y, Yao X, Bing Z, Hsieh CY, Hou T. Genetic Algorithm-Based Receptor Ligand: A Genetic Algorithm-Guided Generative Model to Boost the Novelty and Drug-Likeness of Molecules in a Sampling Chemical Space. J Chem Inf Model 2024; 64:1213-1228. [PMID: 38302422 DOI: 10.1021/acs.jcim.3c01964] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/03/2024]
Abstract
Deep learning-based de novo molecular design has recently gained significant attention. While numerous DL-based generative models have been successfully developed for designing novel compounds, the majority of the generated molecules lack sufficiently novel scaffolds or high drug-like profiles. The aforementioned issues may not be fully captured by commonly used metrics for the assessment of molecular generative models, such as novelty, diversity, and quantitative estimation of the drug-likeness score. To address these limitations, we proposed a genetic algorithm-guided generative model called GARel (genetic algorithm-based receptor-ligand interaction generator), a novel framework for training a DL-based generative model to produce drug-like molecules with novel scaffolds. To efficiently train the GARel model, we utilized dense net to update the parameters based on molecules with novel scaffolds and drug-like features. To demonstrate the capability of the GARel model, we used it to design inhibitors for three targets: AA2AR, EGFR, and SARS-CoV-2. The results indicate that GARel-generated molecules feature more diverse and novel scaffolds and possess more desirable physicochemical properties and favorable docking scores. Compared with other generative models, GARel makes significant progress in balancing novelty and drug-likeness, providing a promising direction for the further development of DL-based de novo design methodology with potential impacts on drug discovery.
Affiliation(s)
- Mingyang Wang
- College of Pharmaceutical Sciences and Cancer Center, Zhejiang University, Hangzhou 310058, Zhejiang, China
- CarbonSilicon AI Technology Co., Ltd., Hangzhou 310018, Zhejiang, China
| | - Zhengjian Wu
- College of Pharmaceutical Sciences and Cancer Center, Zhejiang University, Hangzhou 310058, Zhejiang, China
- School of Computer Science, Wuhan University, Wuhan 430072, Hubei, China
| | - Jike Wang
- College of Pharmaceutical Sciences and Cancer Center, Zhejiang University, Hangzhou 310058, Zhejiang, China
- CarbonSilicon AI Technology Co., Ltd., Hangzhou 310018, Zhejiang, China
| | - Gaoqi Weng
- College of Pharmaceutical Sciences and Cancer Center, Zhejiang University, Hangzhou 310058, Zhejiang, China
| | - Yu Kang
- College of Pharmaceutical Sciences and Cancer Center, Zhejiang University, Hangzhou 310058, Zhejiang, China
| | - Peichen Pan
- College of Pharmaceutical Sciences and Cancer Center, Zhejiang University, Hangzhou 310058, Zhejiang, China
| | - Dan Li
- College of Pharmaceutical Sciences and Cancer Center, Zhejiang University, Hangzhou 310058, Zhejiang, China
| | - Yafeng Deng
- CarbonSilicon AI Technology Co., Ltd., Hangzhou 310018, Zhejiang, China
| | - Xiaojun Yao
- Dr. Neher's Biophysics Laboratory for Innovative Drug Discovery, Macau Institute for Applied Research in Medicine and Health, State Key Laboratory of Quality Research in Chinese Medicine, Macau University of Science and Technology, Taipa, Macau 999078, China
| | - Zhitong Bing
- Institute of Modern Physics, Chinese Academy of Sciences, Lanzhou, Gansu 730000, China
| | - Chang-Yu Hsieh
- College of Pharmaceutical Sciences and Cancer Center, Zhejiang University, Hangzhou 310058, Zhejiang, China
| | - Tingjun Hou
- College of Pharmaceutical Sciences and Cancer Center, Zhejiang University, Hangzhou 310058, Zhejiang, China
| |
|
39
|
Liu L, Zhao X, Huang X. Generating Potential RET-Specific Inhibitors Using a Novel LSTM Encoder-Decoder Model. Int J Mol Sci 2024; 25:2357. [PMID: 38397034 PMCID: PMC10889381 DOI: 10.3390/ijms25042357] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/21/2024] [Revised: 02/11/2024] [Accepted: 02/13/2024] [Indexed: 02/25/2024] Open
Abstract
The receptor tyrosine kinase RET (rearranged during transfection) plays a vital role in various cell signaling pathways and is a critical factor in the development of the nervous system. Abnormal activation of the RET kinase can lead to several cancers, including thyroid cancer and non-small-cell lung cancer. However, most RET kinase inhibitors are multi-kinase inhibitors. Therefore, the development of an effective RET-specific inhibitor continues to present a significant challenge. To address this issue, we built a molecular generation model based on fragment-based drug design (FBDD) and a long short-term memory (LSTM) encoder-decoder structure to generate receptor-specific molecules with novel scaffolds. Remarkably, our model was trained with a molecular assembly accuracy of 98.4%. Leveraging the pre-trained model, we rapidly generated a RET-specific-candidate active-molecule library by transfer learning. Virtual screening based on our molecular generation model was performed, combined with molecular dynamics simulation and binding energy calculation, to discover specific RET inhibitors, and five novel molecules were selected. Further analyses indicated that two of these molecules have good binding affinities and synthesizability, exhibiting high selectivity. Overall, this investigation demonstrates the capacity of our model to generate novel receptor-specific molecules and provides a rapid method to discover potential drugs.
Affiliation(s)
| | - Xi Zhao
- Institute of Theoretical Chemistry, College of Chemistry, Jilin University, Changchun 130061, China
| | - Xuri Huang
- Institute of Theoretical Chemistry, College of Chemistry, Jilin University, Changchun 130061, China
| |
|
40
|
Kerstjens A, De Winter H. Molecule auto-correction to facilitate molecular design. J Comput Aided Mol Des 2024; 38:10. [PMID: 38363377 PMCID: PMC10873457 DOI: 10.1007/s10822-024-00549-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/06/2023] [Accepted: 01/11/2024] [Indexed: 02/17/2024]
Abstract
Ensuring that computationally designed molecules are chemically reasonable is at best cumbersome. We present a molecule correction algorithm that morphs invalid molecular graphs into structurally related valid analogs. The algorithm is implemented as a tree search, guided by a set of policies to minimize its cost. We showcase how the algorithm can be applied to molecular design, either as a post-processing step or as an integral part of molecule generators.
Affiliation(s)
- Alan Kerstjens
- Laboratory of Medicinal Chemistry, Department of Pharmaceutical Sciences, University of Antwerp, Universiteitslaan 1, 2610, Wilrijk, Belgium
| | - Hans De Winter
- Laboratory of Medicinal Chemistry, Department of Pharmaceutical Sciences, University of Antwerp, Universiteitslaan 1, 2610, Wilrijk, Belgium.
| |
|
41
|
Zhang H, Huang J, Xie J, Huang W, Yang Y, Xu M, Lei J, Chen H. GRELinker: A Graph-Based Generative Model for Molecular Linker Design with Reinforcement and Curriculum Learning. J Chem Inf Model 2024; 64:666-676. [PMID: 38241022 DOI: 10.1021/acs.jcim.3c01700] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/13/2024]
Abstract
Fragment-based drug discovery (FBDD) is widely used in drug design. One useful strategy in FBDD is designing linkers for linking fragments to optimize their molecular properties. In the current study, we present a novel generative fragment linking model, GRELinker, which utilizes a gated-graph neural network combined with reinforcement and curriculum learning to generate molecules with desirable attributes. The model has been shown to be efficient in multiple tasks, including controlling log P, optimizing synthesizability or predicted bioactivity of compounds, and generating molecules with high 3D similarity but low 2D similarity to the lead compound. Specifically, our model outperforms the previously reported reinforcement learning (RL) built-in method DRlinker on these benchmark tasks. Moreover, GRELinker has been successfully used in an actual FBDD case to generate optimized molecules with enhanced affinities by employing the docking score as the scoring function in RL. Besides, the implementation of curriculum learning in our framework enables the generation of structurally complex linkers more efficiently. These results demonstrate the benefits and feasibility of GRELinker in linker design for molecular optimization and drug discovery.
Affiliation(s)
- Hao Zhang
- School of Pharmaceutical Science, Sun Yat-sen University, Guangzhou 510006, China
| | - Jinchao Huang
- School of Pharmaceutical Science, Sun Yat-sen University, Guangzhou 510006, China
| | - Junjie Xie
- School of Computer Science and Engineering, Sun Yat-sen University, Guangzhou 510006, China
| | - Weifeng Huang
- School of Pharmaceutical Science, Sun Yat-sen University, Guangzhou 510006, China
| | - Yuedong Yang
- School of Computer Science and Engineering, Sun Yat-sen University, Guangzhou 510006, China
| | - Mingyuan Xu
- Guangzhou National Laboratory, Guangzhou International Bio Island, No. 9 Xin Dao Huan Bei Road, Guangzhou 510005, China
| | - Jinping Lei
- School of Pharmaceutical Science, Sun Yat-sen University, Guangzhou 510006, China
| | - Hongming Chen
- Guangzhou National Laboratory, Guangzhou International Bio Island, No. 9 Xin Dao Huan Bei Road, Guangzhou 510005, China
| |
|
42
|
Kyro GW, Morgunov A, Brent RI, Batista VS. ChemSpaceAL: An Efficient Active Learning Methodology Applied to Protein-Specific Molecular Generation. J Chem Inf Model 2024; 64:653-665. [PMID: 38287889 DOI: 10.1021/acs.jcim.3c01456] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/31/2024]
Abstract
The incredible capabilities of generative artificial intelligence models have inevitably led to their application in the domain of drug discovery. Within this domain, the vastness of chemical space motivates the development of more efficient methods for identifying regions with molecules that exhibit desired characteristics. In this work, we present a computationally efficient active learning methodology and demonstrate its applicability to targeted molecular generation. When applied to c-Abl kinase, a protein with FDA-approved small-molecule inhibitors, the model learns to generate molecules similar to the inhibitors without prior knowledge of their existence and even reproduces two of them exactly. We also show that the methodology is effective for a protein without any commercially available small-molecule inhibitors, the HNH domain of the CRISPR-associated protein 9 (Cas9) enzyme. To facilitate implementation and reproducibility, we made all of our software available through the open-source ChemSpaceAL Python package.
Affiliation(s)
- Gregory W Kyro
- Department of Chemistry, Yale University, New Haven, Connecticut 06511-8499, United States
| | - Anton Morgunov
- Department of Chemistry, Yale University, New Haven, Connecticut 06511-8499, United States
| | - Rafael I Brent
- Department of Chemistry, Yale University, New Haven, Connecticut 06511-8499, United States
| | - Victor S Batista
- Department of Chemistry, Yale University, New Haven, Connecticut 06511-8499, United States
| |
|
43
|
Shen T, Guo J, Han Z, Zhang G, Liu Q, Si X, Wang D, Wu S, Xia J. AutoMolDesigner for Antibiotic Discovery: An AI-Based Open-Source Software for Automated Design of Small-Molecule Antibiotics. J Chem Inf Model 2024; 64:575-583. [PMID: 38265916 DOI: 10.1021/acs.jcim.3c01562] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/26/2024]
Abstract
Discovery of small-molecule antibiotics with novel chemotypes serves as one of the essential strategies to address antibiotic resistance. Although a considerable number of computational tools committed to molecular design have been reported, there is a deficit in holistic and efficient tools specifically developed for small-molecule antibiotic discovery. To address this issue, we report AutoMolDesigner, a computational modeling software dedicated to small-molecule antibiotic design. It is a generalized framework comprising two functional modules, i.e., generative-deep-learning-enabled molecular generation and automated machine-learning-based antibacterial activity/property prediction, wherein individually trained models and curated datasets are out-of-the-box for whole-cell-based antibiotic screening and design. It is open-source, thus allowing for the incorporation of new features for flexible use. Unlike most software programs based on Linux and command lines, this application equipped with a Qt-based graphical user interface can be run on personal computers with multiple operating systems, making it much easier to use for experimental scientists. The software and related materials are freely available at GitHub (https://github.com/taoshen99/AutoMolDesigner) and Zenodo (https://zenodo.org/record/10097899).
Affiliation(s)
- Tao Shen
- State Key Laboratory of Bioactive Substance and Function of Natural Medicines, Institute of Materia Medica, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing 100050, China
| | - Jiale Guo
- State Key Laboratory of Bioactive Substance and Function of Natural Medicines, Institute of Materia Medica, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing 100050, China
| | - Zunsheng Han
- State Key Laboratory of Bioactive Substance and Function of Natural Medicines, Institute of Materia Medica, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing 100050, China
| | - Gao Zhang
- State Key Laboratory of Bioactive Substance and Function of Natural Medicines, Institute of Materia Medica, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing 100050, China
| | - Qingxin Liu
- State Key Laboratory of Bioactive Substance and Function of Natural Medicines, Institute of Materia Medica, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing 100050, China
- School of Pharmacy, Jiangsu Ocean University, Lianyungang, Jiangsu 222005, China
| | - Xinxin Si
- School of Pharmacy, Jiangsu Ocean University, Lianyungang, Jiangsu 222005, China
| | - Dongmei Wang
- State Key Laboratory of Bioactive Substance and Function of Natural Medicines, Institute of Materia Medica, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing 100050, China
| | - Song Wu
- State Key Laboratory of Bioactive Substance and Function of Natural Medicines, Institute of Materia Medica, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing 100050, China
| | - Jie Xia
- State Key Laboratory of Bioactive Substance and Function of Natural Medicines, Institute of Materia Medica, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing 100050, China
| |
|
44
|
Gangwal A, Ansari A, Ahmad I, Azad AK, Kumarasamy V, Subramaniyan V, Wong LS. Generative artificial intelligence in drug discovery: basic framework, recent advances, challenges, and opportunities. Front Pharmacol 2024; 15:1331062. [PMID: 38384298 PMCID: PMC10879372 DOI: 10.3389/fphar.2024.1331062] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/07/2023] [Accepted: 01/17/2024] [Indexed: 02/23/2024] Open
Abstract
There are two main ways to discover or design small drug molecules. The first involves fine-tuning existing molecules or commercially successful drugs through quantitative structure-activity relationships and virtual screening. The second approach involves generating new molecules through de novo drug design or inverse quantitative structure-activity relationship. Both methods aim to get a drug molecule with the best pharmacokinetic and pharmacodynamic profiles. However, bringing a new drug to market is an expensive and time-consuming endeavor, with the average cost being estimated at around $2.5 billion. One of the biggest challenges is screening the vast number of potential drug candidates to find one that is both safe and effective. The development of artificial intelligence in recent years has been phenomenal, ushering in a revolution in many fields. The field of pharmaceutical sciences has also significantly benefited from multiple applications of artificial intelligence, especially drug discovery projects. Artificial intelligence models are finding use in molecular property prediction, molecule generation, virtual screening, synthesis planning, repurposing, among others. Lately, generative artificial intelligence has gained popularity across domains for its ability to generate entirely new data, such as images, sentences, audios, videos, novel chemical molecules, etc. Generative artificial intelligence has also delivered promising results in drug discovery and development. This review article delves into the fundamentals and framework of various generative artificial intelligence models in the context of drug discovery via de novo drug design approach. Various basic and advanced models have been discussed, along with their recent applications. 
The review also explores recent examples and advances in the generative artificial intelligence approach, as well as the challenges and ongoing efforts to fully harness the potential of generative artificial intelligence in generating novel drug molecules in a faster and more affordable manner. Some clinical-level assets generated from generative artificial intelligence have also been discussed in this review to show the ever-increasing application of artificial intelligence in drug discovery through commercial partnerships.
Affiliation(s)
- Amit Gangwal
- Department of Natural Product Chemistry, Shri Vile Parle Kelavani Mandal’s Institute of Pharmacy, Dhule, Maharashtra, India
| | - Azim Ansari
- Computer Aided Drug Design Center, Shri Vile Parle Kelavani Mandal’s Institute of Pharmacy, Dhule, Maharashtra, India
| | - Iqrar Ahmad
- Department of Pharmaceutical Chemistry, Prof. Ravindra Nikam College of Pharmacy, Dhule, India
| | - Abul Kalam Azad
- Faculty of Pharmacy, University College of MAIWP International, Batu Caves, Malaysia
| | - Vinoth Kumarasamy
- Department of Parasitology and Medical Entomology, Faculty of Medicine, Universiti Kebangsaan Malaysia, Cheras, Malaysia
| | - Vetriselvan Subramaniyan
- Pharmacology Unit, Jeffrey Cheah School of Medicine and Health Sciences, Monash University Malaysia, Selangor, Malaysia
- School of Bioengineering and Biosciences, Lovely Professional University, Phagwara, Punjab, India
| | - Ling Shing Wong
- Faculty of Health and Life Sciences, INTI International University, Nilai, Malaysia
| |
|
45
|
Jinsong S, Qifeng J, Xing C, Hao Y, Wang L. Molecular fragmentation as a crucial step in the AI-based drug development pathway. Commun Chem 2024; 7:20. [PMID: 38302655 PMCID: PMC10834946 DOI: 10.1038/s42004-024-01109-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/15/2023] [Accepted: 01/19/2024] [Indexed: 02/03/2024] Open
Abstract
The AI-based small molecule drug discovery has become a significant trend at the intersection of computer science and life sciences. In the pursuit of novel compounds, fragment-based drug discovery has emerged as a novel approach. The Generative Pre-trained Transformers (GPT) model has showcased remarkable prowess across various domains, rooted in its pre-training and representation learning of fundamental linguistic units. Analogous to natural language, molecular encoding, as a form of chemical language, necessitates fragmentation aligned with specific chemical logic for accurate molecular encoding. This review provides a comprehensive overview of the current state of the art in molecular fragmentation. We systematically summarize the approaches and applications of various molecular fragmentation techniques, with special emphasis on the characteristics and scope of applicability of each technique, and discuss their applications. We also provide an outlook on the current development trends of molecular fragmentation techniques, including some potential research directions and challenges.
Affiliation(s)
- Shao Jinsong
- Nantong University, School of Information Science and Technology, Nantong, China
| | - Jia Qifeng
- Nantong University, School of Information Science and Technology, Nantong, China
| | - Chen Xing
- Nantong University, School of Information Science and Technology, Nantong, China
| | - Yajie Hao
- Nantong University, School of Information Science and Technology, Nantong, China
| | - Li Wang
- Nantong University, Research Center for Intelligence Information Technology, Nantong, China.
| |
|
46
|
Tropsha A, Isayev O, Varnek A, Schneider G, Cherkasov A. Integrating QSAR modelling and deep learning in drug discovery: the emergence of deep QSAR. Nat Rev Drug Discov 2024; 23:141-155. [PMID: 38066301 DOI: 10.1038/s41573-023-00832-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 10/21/2023] [Indexed: 02/08/2024]
Abstract
Quantitative structure-activity relationship (QSAR) modelling, an approach that was introduced 60 years ago, is widely used in computer-aided drug design. In recent years, progress in artificial intelligence techniques, such as deep learning, the rapid growth of databases of molecules for virtual screening and dramatic improvements in computational power have supported the emergence of a new field of QSAR applications that we term 'deep QSAR'. Marking a decade from the pioneering applications of deep QSAR to tasks involved in small-molecule drug discovery, we herein describe key advances in the field, including deep generative and reinforcement learning approaches in molecular design, deep learning models for synthetic planning and the application of deep QSAR models in structure-based virtual screening. We also reflect on the emergence of quantum computing, which promises to further accelerate deep QSAR applications and the need for open-source and democratized resources to support computer-aided drug design.
Affiliation(s)
| | | | | | | | - Artem Cherkasov
- University of British Columbia, Vancouver, BC, Canada.
- Photonic Inc., Coquitlam, BC, Canada.
| |
|
47
|
Satalkar V, Degaga GD, Li W, Pang YT, McShan AC, Gumbart JC, Mitchell JC, Torres MP. Generative β-hairpin design using a residue-based physicochemical property landscape. Biophys J 2024:S0006-3495(24)00070-5. [PMID: 38297834 DOI: 10.1016/j.bpj.2024.01.029] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/23/2023] [Revised: 12/20/2023] [Accepted: 01/25/2024] [Indexed: 02/02/2024] Open
Abstract
De novo peptide design is a new frontier that has broad application potential in the biological and biomedical fields. Most existing models for de novo peptide design are largely based on sequence homology that can be restricted based on evolutionarily derived protein sequences and lack the physicochemical context essential in protein folding. Generative machine learning for de novo peptide design is a promising way to synthesize theoretical data that are based on, but unique from, the observable universe. In this study, we created and tested a custom peptide generative adversarial network intended to design peptide sequences that can fold into the β-hairpin secondary structure. This deep neural network model is designed to establish a preliminary foundation of the generative approach based on physicochemical and conformational properties of 20 canonical amino acids, for example, hydrophobicity and residue volume, using extant structure-specific sequence data from the PDB. The beta generative adversarial network model robustly distinguishes secondary structures of β hairpin from α helix and intrinsically disordered peptides with an accuracy of up to 96% and generates artificial β-hairpin peptide sequences with minimum sequence identities around 31% and 50% when compared against the current NCBI PDB and nonredundant databases, respectively. These results highlight the potential of generative models specifically anchored by physicochemical and conformational property features of amino acids to expand the sequence-to-structure landscape of proteins beyond evolutionary limits.
Affiliation(s)
- Vardhan Satalkar
- School of Biological Sciences, Georgia Institute of Technology, Atlanta, Georgia
| | - Gemechis D Degaga
- Biosciences Division, Oak Ridge National Laboratory, Oak Ridge, Tennessee
| | - Wei Li
- School of Biological Sciences, Georgia Institute of Technology, Atlanta, Georgia
| | - Yui Tik Pang
- School of Physics, Georgia Institute of Technology, Atlanta, Georgia
| | - Andrew C McShan
- School of Chemistry and Biochemistry, Georgia Institute of Technology, Atlanta, Georgia
| | - James C Gumbart
- School of Physics, Georgia Institute of Technology, Atlanta, Georgia; School of Chemistry and Biochemistry, Georgia Institute of Technology, Atlanta, Georgia
| | - Julie C Mitchell
- Biosciences Division, Oak Ridge National Laboratory, Oak Ridge, Tennessee.
| | - Matthew P Torres
- School of Biological Sciences, Georgia Institute of Technology, Atlanta, Georgia; School of Chemistry and Biochemistry, Georgia Institute of Technology, Atlanta, Georgia.
| |
|
48
|
Nowak D, Huczyński A, Bachorz RA, Hoffmann M. Machine Learning Application for Medicinal Chemistry: Colchicine Case, New Structures, and Anticancer Activity Prediction. Pharmaceuticals (Basel) 2024; 17:173. [PMID: 38399388 PMCID: PMC10892630 DOI: 10.3390/ph17020173] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/20/2023] [Revised: 01/02/2024] [Accepted: 01/12/2024] [Indexed: 02/25/2024] Open
Abstract
In the contemporary era, the exploration of machine learning (ML) has gained widespread attention and is being leveraged to augment traditional methodologies in quantitative structure-activity relationship (QSAR) investigations. The principal objective of this research was to assess the anticancer potential of colchicine-based compounds across five distinct cell lines. This research endeavor ultimately sought to construct ML models proficient in forecasting anticancer activity as quantified by the IC50 value, while concurrently generating innovative colchicine-derived compounds. The resistance index (RI) is computed to evaluate the drug resistance exhibited by LoVo/DX cells relative to LoVo cancer cell lines. Meanwhile, the selectivity index (SI) is computed to determine the potential of a compound to demonstrate superior efficacy against tumor cells compared to its toxicity against normal cells, such as BALB/3T3. We introduce a novel ML system adept at recommending novel chemical structures predicated on known anticancer activity. Our investigation entailed the assessment of inhibitory capabilities across five cell lines, employing predictive models utilizing various algorithms, including random forest, decision tree, support vector machines, k-nearest neighbors, and multiple linear regression. The most proficient model, as determined by quality metrics, was employed to predict the anticancer activity of novel colchicine-based compounds. This methodological approach yielded the establishment of a library encompassing new colchicine-based compounds, each assigned an IC50 value. Additionally, this study resulted in the development of a validated predictive model, capable of reasonably estimating IC50 values based on molecular structure input.
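The abstract names the resistance index (RI) and selectivity index (SI) without giving formulas. A minimal sketch follows, assuming the commonly used ratio definitions: RI as the IC50 against the resistant line divided by the IC50 against the parental line, and SI as the IC50 against normal cells divided by the IC50 against tumor cells. The function names and the numeric inputs are hypothetical and are not taken from the paper.

```python
def _check_positive(value: float, name: str) -> None:
    """IC50 values are concentrations, so they must be strictly positive."""
    if value <= 0:
        raise ValueError(f"{name} must be positive, got {value}")


def resistance_index(ic50_resistant: float, ic50_parental: float) -> float:
    """Assumed definition: IC50(resistant line, e.g. LoVo/DX) / IC50(parental line, e.g. LoVo).

    Values near 1 suggest the compound is about equally active on both lines.
    """
    _check_positive(ic50_resistant, "ic50_resistant")
    _check_positive(ic50_parental, "ic50_parental")
    return ic50_resistant / ic50_parental


def selectivity_index(ic50_normal: float, ic50_tumor: float) -> float:
    """Assumed definition: IC50(normal cells, e.g. BALB/3T3) / IC50(tumor cells).

    Values above 1 suggest the compound is more potent on tumor cells.
    """
    _check_positive(ic50_normal, "ic50_normal")
    _check_positive(ic50_tumor, "ic50_tumor")
    return ic50_normal / ic50_tumor


# Hypothetical IC50 values in the same concentration unit (illustration only).
ri = resistance_index(ic50_resistant=20.0, ic50_parental=4.0)
si = selectivity_index(ic50_normal=30.0, ic50_tumor=10.0)
```

Both indices are unitless as long as the two IC50 values in each ratio share a unit.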
Affiliation(s)
- Damian Nowak
- Department of Quantum Chemistry, Faculty of Chemistry, Adam Mickiewicz University in Poznan, Uniwersytetu Poznanskiego 8, 61-614 Poznan, Poland
| | - Adam Huczyński
- Department of Medical Chemistry, Faculty of Chemistry, Adam Mickiewicz University in Poznan, Uniwersytetu Poznanskiego 8, 61-614 Poznan, Poland
| | - Rafał Adam Bachorz
- Institute of Medical Biology of Polish Academy of Sciences, Lodowa 106, 93-232 Lodz, Poland
- Institute of Computing Science, Faculty of Computing, Poznań University of Technology, Piotrowo 2, 60-965 Poznań, Poland
| | - Marcin Hoffmann
- Department of Quantum Chemistry, Faculty of Chemistry, Adam Mickiewicz University in Poznan, Uniwersytetu Poznanskiego 8, 61-614 Poznan, Poland
| |
|
49
|
Weng G, Zhao H, Nie D, Zhang H, Liu L, Hou T, Kang Y. RediscMol: Benchmarking Molecular Generation Models in Biological Properties. J Med Chem 2024; 67:1533-1543. [PMID: 38181194 DOI: 10.1021/acs.jmedchem.3c02051] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/07/2024]
Abstract
Deep learning-based molecular generative models have garnered emerging attention for their capability to generate molecules with novel structures and desired physicochemical properties. However, the evaluation of these models, particularly in a biological context, remains insufficient. To address the limitations of existing metrics and emulate practical application scenarios, we construct the RediscMol benchmark that comprises active molecules extracted from 5 kinase and 3 GPCR data sets. A set of rediscovery- and similarity-related metrics are introduced to assess the performance of 8 representative generative models (CharRNN, VAE, Reinvent, AAE, ORGAN, RNNAttn, TransVAE, and GraphAF). Our findings based on the RediscMol benchmark differ from those of previous evaluations. CharRNN, VAE, and Reinvent exhibit a greater ability to reproduce known active molecules, while RNNAttn, TransVAE, and GraphAF struggle in this aspect despite their notable performance on commonly used distribution-learning metrics. Our evaluation framework may provide valuable guidance for advancing generative models in real-world drug design scenarios.
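The abstract does not define its rediscovery-related metrics. As a rough illustration of the general idea only, the sketch below computes the fraction of held-out known actives that appear in a generated set. It assumes every molecule is already written as a canonical SMILES string, so exact string matching is meaningful; the function name and the toy molecules are hypothetical and are not the RediscMol definitions.

```python
from typing import Iterable


def rediscovery_rate(generated: Iterable[str], known_actives: Iterable[str]) -> float:
    """Fraction of known actives that occur at least once among generated molecules.

    Assumes both inputs are canonical SMILES strings, so string equality
    stands in for molecular identity.
    """
    actives = set(known_actives)
    if not actives:
        raise ValueError("known_actives must not be empty")
    # Duplicates in the generated set do not count twice.
    rediscovered = actives & set(generated)
    return len(rediscovered) / len(actives)


# Toy example: two of the three held-out actives are reproduced.
held_out_actives = ["CCO", "c1ccccc1", "CC(=O)O"]
generated_molecules = ["CCO", "CCO", "CCN", "c1ccccc1"]
rate = rediscovery_rate(generated_molecules, held_out_actives)
```

Dividing by the number of known actives rather than the number of generated molecules keeps the score comparable across models that sample different numbers of molecules, although a real benchmark would also fix the sampling budget.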
Affiliation(s)
- Gaoqi Weng
- Innovation Institute for Artificial Intelligence in Medicine of Zhejiang University, College of Pharmaceutical Sciences, Zhejiang University, Hangzhou 310058, Zhejiang, China
| | - Huifeng Zhao
- Innovation Institute for Artificial Intelligence in Medicine of Zhejiang University, College of Pharmaceutical Sciences, Zhejiang University, Hangzhou 310058, Zhejiang, China
| | - Dou Nie
- Innovation Institute for Artificial Intelligence in Medicine of Zhejiang University, College of Pharmaceutical Sciences, Zhejiang University, Hangzhou 310058, Zhejiang, China
| | - Haotian Zhang
- Innovation Institute for Artificial Intelligence in Medicine of Zhejiang University, College of Pharmaceutical Sciences, Zhejiang University, Hangzhou 310058, Zhejiang, China
| | - Liwei Liu
- Advanced Computing and Storage Laboratory, Central Research Institute, 2012 Laboratories, Huawei Technologies Co., Ltd., Shenzhen 518129, Guangdong, China
| | - Tingjun Hou
- Innovation Institute for Artificial Intelligence in Medicine of Zhejiang University, College of Pharmaceutical Sciences, Zhejiang University, Hangzhou 310058, Zhejiang, China
| | - Yu Kang
- Innovation Institute for Artificial Intelligence in Medicine of Zhejiang University, College of Pharmaceutical Sciences, Zhejiang University, Hangzhou 310058, Zhejiang, China
| |
|
50
|
Chowdhury J, Fricke C, Bamidele O, Bello M, Yang W, Heyden A, Terejanu G. Invariant Molecular Representations for Heterogeneous Catalysis. J Chem Inf Model 2024; 64:327-339. [PMID: 38197612 PMCID: PMC10806804 DOI: 10.1021/acs.jcim.3c00594] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/26/2023] [Revised: 12/25/2023] [Accepted: 12/28/2023] [Indexed: 01/11/2024]
Abstract
Catalyst screening is a critical step in the discovery and development of heterogeneous catalysts, which are vital for a wide range of chemical processes. In recent years, computational catalyst screening, primarily through density functional theory (DFT), has gained significant attention as a method for identifying promising catalysts. However, the computation of adsorption energies for all likely chemical intermediates present in complex surface chemistries is computationally intensive and costly due to the expensive nature of these calculations and the intrinsic idiosyncrasies of the methods or data sets used. This study introduces a novel machine learning (ML) method to learn adsorption energies from multiple DFT functionals by using invariant molecular representations (IMRs). To do this, we first extract molecular fingerprints for the reaction intermediates and later use a Siamese-neural-network-based training strategy to learn invariant molecular representations or the IMR across all available functionals. Our Siamese network-based representations demonstrate superior performance in predicting adsorption energies compared with other molecular representations. Notably, when considering mean absolute values of adsorption energies as 0.43 eV (PBE-D3), 0.46 eV (BEEF-vdW), 0.81 eV (RPBE), and 0.37 eV (scan+rVV10), our IMR method has achieved the lowest mean absolute errors (MAEs) of 0.18, 0.10, 0.16, and 0.18 eV, respectively. These results emphasize the superior predictive capacity of our Siamese network-based representations. The empirical findings in this study illuminate the efficacy, robustness, and dependability of our proposed ML paradigm in predicting adsorption energies, specifically for propane dehydrogenation on a platinum catalyst surface.
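The reported MAEs are easier to compare once each is divided by the mean absolute adsorption energy of its functional. The sketch below does that arithmetic with the four value pairs quoted in the abstract, assuming the MAEs are listed in the same order as the functionals ("respectively"). The `mean_absolute_error` helper and all variable names are illustrative and do not come from the paper.

```python
from typing import Sequence


def mean_absolute_error(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Standard MAE: the average of |true - predicted| over all samples."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


# (mean absolute adsorption energy in eV, reported MAE in eV), as quoted in the abstract.
reported = {
    "PBE-D3": (0.43, 0.18),
    "BEEF-vdW": (0.46, 0.10),
    "RPBE": (0.81, 0.16),
    "scan+rVV10": (0.37, 0.18),
}

# Relative error: MAE as a fraction of the typical magnitude being predicted.
relative_mae = {
    functional: mae / mean_abs
    for functional, (mean_abs, mae) in reported.items()
}

for functional, ratio in relative_mae.items():
    print(f"{functional}: MAE is {ratio:.0%} of the mean absolute adsorption energy")
```

On this reading, the same absolute error of 0.18 eV weighs differently for PBE-D3 and scan+rVV10 because their typical adsorption-energy magnitudes differ.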
Affiliation(s)
- Jawad Chowdhury
- Department of Computer Science, University of North Carolina at Charlotte, Charlotte, North Carolina 28223, United States
| | - Charles Fricke
- Department of Chemical Engineering, University of South Carolina, Columbia, South Carolina 29208, United States
| | - Olajide Bamidele
- Department of Chemical Engineering, University of South Carolina, Columbia, South Carolina 29208, United States
| | - Mubarak Bello
- Department of Chemical Engineering, University of South Carolina, Columbia, South Carolina 29208, United States
| | - Wenqiang Yang
- Department of Chemical Engineering, University of South Carolina, Columbia, South Carolina 29208, United States
| | - Andreas Heyden
- Department of Chemical Engineering, University of South Carolina, Columbia, South Carolina 29208, United States
| | - Gabriel Terejanu
- Department of Computer Science, University of North Carolina at Charlotte, Charlotte, North Carolina 28223, United States
| |
|