1
Zheng P, Pan T, Gao Y, Chen J, Li L, Chen Y, Fang D, Li X, Gao F, Li Y. Predicting the exposure of mycophenolic acid in children with autoimmune diseases using a limited sampling strategy: A retrospective study. Clin Transl Sci 2025; 18:e70092. [PMID: 39727288 DOI: 10.1111/cts.70092] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/08/2024] [Revised: 10/29/2024] [Accepted: 11/05/2024] [Indexed: 12/28/2024] Open
Abstract
Mycophenolic acid (MPA) is commonly used to treat autoimmune diseases in children, and therapeutic drug monitoring is recommended to ensure adequate drug exposure. However, multiple blood samples are required to calculate the area under the plasma concentration-time curve (AUC), causing patient discomfort and wasting human and financial resources. This study aimed to use machine learning and deep learning algorithms to develop a prediction model of MPA exposure for pediatric autoimmune diseases with an optimized sampling frequency. Data from pediatric autoimmune patients were collected at Nanfang Hospital between June 2018 and June 2023. Univariate analysis was applied for feature selection. Ten algorithms, including Random Forest, XGBoost, LightGBM, Gradient Boosting Decision Tree, CatBoost, Artificial Neural Network, Gradient Boosting Machine, Transformer, Wide&Deep, and TabNet, were employed for modeling based on two, three, or four concentrations of MPA. A total of 614 MPA AUC0-12h samples from 209 patients were enrolled. Among the 10 models evaluated, the Wide&Deep model exhibited the best predictive performance. The predictive performance of the Wide&Deep model using four and three blood concentration points was similar (R² ≈ 1 for four points; R² = 0.95 for three points). No significant difference in accuracy within ±30% was observed between models utilizing three and four blood concentration points (p = 0.06). This study demonstrates that, with the Wide&Deep model, MPA exposure can be accurately estimated with three sampling points in children with autoimmune diseases. This model could help reduce discomfort in pediatric patients without reducing the accuracy of MPA exposure estimates in clinical practice.
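To make the two quantities in this abstract concrete, the following is a minimal sketch, not the authors' pipeline. It assumes the reference AUC is obtained with the linear trapezoidal rule (a common convention that the abstract does not confirm) and reads "accuracy within ±30%" as the share of predictions whose relative error is at most 30%. The function names, sampling times, and concentrations are hypothetical illustrations, not study data.

```python
def auc_trapezoid(times_h, conc_mg_l):
    """Linear trapezoidal AUC over the sampled interval (mg*h/L)."""
    if len(times_h) != len(conc_mg_l) or len(times_h) < 2:
        raise ValueError("need at least two matching time/concentration points")
    auc = 0.0
    for i in range(1, len(times_h)):
        dt = times_h[i] - times_h[i - 1]
        if dt <= 0:
            raise ValueError("times must be strictly increasing")
        # Area of one trapezoid between consecutive samples.
        auc += dt * (conc_mg_l[i] + conc_mg_l[i - 1]) / 2.0
    return auc


def fraction_within(predicted, reference, tolerance=0.30):
    """Share of predictions whose relative error is within +/- tolerance."""
    hits = sum(
        1 for p, r in zip(predicted, reference)
        if abs(p - r) <= tolerance * abs(r)
    )
    return hits / len(reference)


# Hypothetical full 12 h profile (hours, mg/L); not data from the study.
times = [0, 0.5, 1, 2, 4, 6, 8, 12]
conc = [1.0, 8.0, 12.0, 6.0, 3.0, 2.0, 1.5, 1.0]
reference_auc = auc_trapezoid(times, conc)
```

A limited sampling model would replace the eight-point profile with a prediction from two to four concentrations; `fraction_within` could then compare those predictions against the full-profile reference values.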
Affiliation(s)
- Ping Zheng
  - Department of Pharmacy, Nanfang Hospital, Southern Medical University, Guangzhou, China
  - Clinical Pharmacy Center, Nanfang Hospital, Southern Medical University, Guangzhou, China
- Ting Pan
  - Second Affiliated Hospital to Naval Medical University, Shanghai, China
- Ya Gao
  - Department of Pharmacy, Fuwai Hospital, Chinese Academy of Medical Sciences, Beijing, China
- Juan Chen
  - Department of Pharmacy, Nanfang Hospital, Southern Medical University, Guangzhou, China
  - Clinical Pharmacy Center, Nanfang Hospital, Southern Medical University, Guangzhou, China
- Liren Li
  - Department of Pharmacy, Nanfang Hospital, Southern Medical University, Guangzhou, China
  - Clinical Pharmacy Center, Nanfang Hospital, Southern Medical University, Guangzhou, China
- Yan Chen
  - Department of Pharmacy, Nanfang Hospital, Southern Medical University, Guangzhou, China
  - Clinical Pharmacy Center, Nanfang Hospital, Southern Medical University, Guangzhou, China
- Dandan Fang
  - Beijing Medicinovo Technology Co. Ltd, Beijing, China
- Xuechun Li
  - Dalian Medicinovo Technology Co. Ltd, Dalian, China
- Fei Gao
  - Beijing Medicinovo Technology Co. Ltd, Beijing, China
- Yilei Li
  - Department of Pharmacy, Nanfang Hospital, Southern Medical University, Guangzhou, China
  - Clinical Pharmacy Center, Nanfang Hospital, Southern Medical University, Guangzhou, China
2
Wang N, Li X, Xiao J, Liu S, Cao D. Data-driven toxicity prediction in drug discovery: Current status and future directions. Drug Discov Today 2024; 29:104195. [PMID: 39357621 DOI: 10.1016/j.drudis.2024.104195] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/05/2024] [Revised: 09/13/2024] [Accepted: 09/26/2024] [Indexed: 10/04/2024]
Abstract
Early toxicity assessment plays a vital role in the drug discovery process on account of its significant influence on the attrition rate of candidates. Recently, constant upgrading of information technology has greatly promoted the continuous development of toxicity prediction. To give an overview of the current state of data-driven toxicity prediction, we reviewed relevant studies and summarized them in three main respects: the features and difficulties of toxicity prediction, the evolution of modeling approaches, and the available tools for toxicity prediction. For each part, we expound the research status, existing challenges, and feasible solutions. Finally, several new directions and suggestions for toxicity prediction are also put forward.
Affiliation(s)
- Ningning Wang
  - Department of Pharmacy, Xiangya Hospital, Central South University, Changsha 410008 Hunan, PR China; National Clinical Research Center for Geriatric Disorders, Xiangya Hospital, Central South University, Changsha 410008 Hunan, PR China; The Hunan Institute of Pharmacy Practice and Clinical Research, Changsha 410008 Hunan, PR China
- Xinliang Li
  - Department of Pharmacy, Xiangya Hospital, Central South University, Changsha 410008 Hunan, PR China; National Clinical Research Center for Geriatric Disorders, Xiangya Hospital, Central South University, Changsha 410008 Hunan, PR China; The Hunan Institute of Pharmacy Practice and Clinical Research, Changsha 410008 Hunan, PR China
- Jing Xiao
  - Hunan Institute for Drug Control, Changsha 410001 Hunan, PR China
- Shao Liu
  - Department of Pharmacy, Xiangya Hospital, Central South University, Changsha 410008 Hunan, PR China; National Clinical Research Center for Geriatric Disorders, Xiangya Hospital, Central South University, Changsha 410008 Hunan, PR China; The Hunan Institute of Pharmacy Practice and Clinical Research, Changsha 410008 Hunan, PR China
- Dongsheng Cao
  - Department of Pharmacy, Xiangya Hospital, Central South University, Changsha 410008 Hunan, PR China; National Clinical Research Center for Geriatric Disorders, Xiangya Hospital, Central South University, Changsha 410008 Hunan, PR China; Xiangya School of Pharmaceutical Sciences, Central South University, Changsha 410013, Hunan, PR China
3
Milon TI, Wang Y, Fontenot RL, Khajouie P, Villinger F, Raghavan V, Xu W. Development of a novel representation of drug 3D structures and enhancement of the TSR-based method for probing drug and target interactions. Comput Biol Chem 2024; 112:108117. [PMID: 38852360 PMCID: PMC11390338 DOI: 10.1016/j.compbiolchem.2024.108117] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/05/2024] [Revised: 05/13/2024] [Accepted: 05/31/2024] [Indexed: 06/11/2024]
Abstract
Understanding the mechanisms underlying interactions between drugs and target proteins is critical for drug discovery. In our earlier studies, we introduced the Triangular Spatial Relationship (TSR)-based algorithm, which enables the representation of a protein's 3D structure as a vector of integers (TSR keys). These TSR keys correspond to substructures of the 3D structure of a protein and are computed based on the triangles constructed by all possible triples of Cα atoms within the protein. In this study, we report on a new TSR-based algorithm for probing drug and target interactions. Specifically, we have extended the previous algorithm in three novel directions: TSR keys for representing the 3D structure of a drug or a ligand, cross TSR keys between drugs and their targets, and intra-residual TSR keys for phosphorylated amino acids. The outcomes illustrate the key contributions as follows: (i) The TSR-based method, which uses the TSR keys as features, is unique in its capability to interpret hierarchical relationships of drugs as well as drug-target complexes using common and specific TSR keys. (ii) The method can distinguish not only the binding sites from the rest of the protein structures, but also the binding sites of primary targets from those of off-targets. (iii) The method has the potential to correlate the 3D structures of drugs with their functions. (iv) Representation of 3D structures by TSR keys has the unique advantage of making searches for similar substructures across structure datasets easier. In summary, this study presents a novel computational methodology, with significant advantages, for providing insights into the mechanism underlying drug and target interactions.
Affiliation(s)
- Tarikul I Milon
  - Department of Chemistry, University of Louisiana at Lafayette, P.O. Box 44370, Lafayette, LA 70504, USA
- Yuhong Wang
  - National Center for Advancing Translational Sciences, 9800 Medical Center Drive, Rockville, MD 20850, USA
- Ryan L Fontenot
  - Department of Chemistry, University of Louisiana at Lafayette, P.O. Box 44370, Lafayette, LA 70504, USA
- Poorya Khajouie
  - Department of Chemistry, University of Louisiana at Lafayette, P.O. Box 44370, Lafayette, LA 70504, USA; The Center for Advanced Computer Studies, University of Louisiana at Lafayette, LA 70504, USA
- Francois Villinger
  - Department of Biology, University of Louisiana at Lafayette, New Iberia, LA 70560, USA
- Vijay Raghavan
  - The Center for Advanced Computer Studies, University of Louisiana at Lafayette, LA 70504, USA
- Wu Xu
  - Department of Chemistry, University of Louisiana at Lafayette, P.O. Box 44370, Lafayette, LA 70504, USA
4
Chhetri SP, Bhandari VS, Maharjan R, Lamichhane TR. Identification of lead inhibitors for 3CLpro of SARS-CoV-2 target using machine learning based virtual screening, ADMET analysis, molecular docking and molecular dynamics simulations. RSC Adv 2024; 14:29683-29692. [PMID: 39297030 PMCID: PMC11408992 DOI: 10.1039/d4ra04502e] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/20/2024] [Accepted: 09/04/2024] [Indexed: 09/21/2024] Open
Abstract
The SARS-CoV-2 3CLpro is a critical target for COVID-19 therapeutics due to its role in viral replication. We employed a screening pipeline to identify novel inhibitors by combining machine learning classification with similarity checks of approved medications. A voting classifier, integrating three machine learning classifiers, was used to filter a large database (∼10 million compounds) for potential inhibitors. This ensemble-based machine learning technique enhances overall performance and robustness compared to individual classifiers. From the screening, three compounds, M1, M2, and M3, were selected for further analysis. Absorption, distribution, metabolism, excretion, and toxicity (ADMET) analysis compared these candidates to nirmatrelvir and azvudine. Molecular docking followed by 200 ns MD simulations showed that only M1 (6-[2,4-bis(dimethylamino)-6,8-dihydro-5H-pyrido[3,4-d]pyrimidine-7-carbonyl]-1H-pyrimidine-2,4-dione) remained stable. For azvudine and M1, the estimated median lethal doses are 1000 and 550 mg kg⁻¹, respectively, with maximum tolerated doses of 0.289 and 0.614 log mg per kg per day. The predicted inhibitory activity of M1 is 7.35, similar to that of nirmatrelvir. The binding free energy based on Molecular Mechanics Poisson-Boltzmann Surface Area (MM-PBSA) of M1 is -18.86 ± 4.38 kcal mol⁻¹, indicating strong binding interactions. These findings suggest that M1 merits further investigation as a potential SARS-CoV-2 treatment.
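The voting step described in this abstract can be illustrated with a small sketch. It assumes hard (majority) voting over three binary classifiers; the abstract does not say whether hard or soft voting was used, nor which classifiers were combined. The three rule functions, their thresholds, and the compound records below are hypothetical stand-ins for trained models and real library entries.

```python
from collections import Counter


def majority_vote(predictions):
    """Hard voting: return the most common label among classifier outputs."""
    label, _count = Counter(predictions).most_common(1)[0]
    return label


def screen(compounds, classifiers):
    """Keep the ids of compounds that the ensemble labels as active (1)."""
    kept = []
    for compound in compounds:
        votes = [clf(compound) for clf in classifiers]
        if majority_vote(votes) == 1:
            kept.append(compound["id"])
    return kept


# Hypothetical toy rules standing in for three trained classifiers.
def clf_weight(c):
    return 1 if c["mol_weight"] < 500 else 0


def clf_logp(c):
    return 1 if c["logp"] < 5 else 0


def clf_donors(c):
    return 1 if c["h_donors"] <= 5 else 0


# Hypothetical compound records; not taken from the screened database.
library = [
    {"id": "A", "mol_weight": 350.0, "logp": 2.1, "h_donors": 2},
    {"id": "B", "mol_weight": 620.0, "logp": 6.3, "h_donors": 3},
    {"id": "C", "mol_weight": 480.0, "logp": 5.5, "h_donors": 7},
]
hits = screen(library, [clf_weight, clf_logp, clf_donors])
```

With an odd number of binary voters a tie cannot occur, which is one reason three-member ensembles are a convenient choice for hard voting.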
Affiliation(s)
- Rajesh Maharjan
  - Central Department of Physics, Tribhuvan University, Kathmandu 44600, Nepal
5
Li B, Tan K, Lao AR, Wang H, Zheng H, Zhang L. A comprehensive review of artificial intelligence for pharmacology research. Front Genet 2024; 15:1450529. [PMID: 39290983 PMCID: PMC11405247 DOI: 10.3389/fgene.2024.1450529] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/17/2024] [Accepted: 08/26/2024] [Indexed: 09/19/2024] Open
Abstract
With the innovation and advancement of artificial intelligence, more and more artificial intelligence techniques are employed in drug research, biomedical frontier research, and clinical medicine practice, especially in the field of pharmacology research. Thus, this review focuses on the applications of artificial intelligence in drug discovery, compound pharmacokinetic prediction, and clinical pharmacology. We briefly introduce the basic knowledge and development of artificial intelligence, present a comprehensive review, and then summarize the latest studies and discuss the strengths and limitations of artificial intelligence models. Additionally, we highlight several important studies and point out possible research directions.
Affiliation(s)
- Bing Li
  - College of Computer Science, Sichuan University, Chengdu, China
- Kan Tan
  - College of Computer Science, Sichuan University, Chengdu, China
- Angelyn R Lao
  - Department of Mathematics and Statistics, De La Salle University, Manila, Philippines
- Haiying Wang
  - School of Computing, Ulster University, Belfast, United Kingdom
- Huiru Zheng
  - School of Computing, Ulster University, Belfast, United Kingdom
- Le Zhang
  - College of Computer Science, Sichuan University, Chengdu, China
6
Satalkar V, Degaga GD, Li W, Pang YT, McShan AC, Gumbart JC, Mitchell JC, Torres MP. Generative β-hairpin design using a residue-based physicochemical property landscape. Biophys J 2024; 123:2790-2806. [PMID: 38297834 PMCID: PMC11393682 DOI: 10.1016/j.bpj.2024.01.029] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/23/2023] [Revised: 12/20/2023] [Accepted: 01/25/2024] [Indexed: 02/02/2024] Open
Abstract
De novo peptide design is a new frontier that has broad application potential in the biological and biomedical fields. Most existing models for de novo peptide design are largely based on sequence homology that can be restricted based on evolutionarily derived protein sequences and lack the physicochemical context essential in protein folding. Generative machine learning for de novo peptide design is a promising way to synthesize theoretical data that are based on, but unique from, the observable universe. In this study, we created and tested a custom peptide generative adversarial network intended to design peptide sequences that can fold into the β-hairpin secondary structure. This deep neural network model is designed to establish a preliminary foundation of the generative approach based on physicochemical and conformational properties of 20 canonical amino acids, for example, hydrophobicity and residue volume, using extant structure-specific sequence data from the PDB. The beta generative adversarial network model robustly distinguishes secondary structures of β hairpin from α helix and intrinsically disordered peptides with an accuracy of up to 96% and generates artificial β-hairpin peptide sequences with minimum sequence identities around 31% and 50% when compared against the current NCBI PDB and nonredundant databases, respectively. These results highlight the potential of generative models specifically anchored by physicochemical and conformational property features of amino acids to expand the sequence-to-structure landscape of proteins beyond evolutionary limits.
Affiliation(s)
- Vardhan Satalkar
  - School of Biological Sciences, Georgia Institute of Technology, Atlanta, Georgia
- Gemechis D Degaga
  - Biosciences Division, Oak Ridge National Laboratory, Oak Ridge, Tennessee
- Wei Li
  - School of Biological Sciences, Georgia Institute of Technology, Atlanta, Georgia
- Yui Tik Pang
  - School of Physics, Georgia Institute of Technology, Atlanta, Georgia
- Andrew C McShan
  - School of Chemistry and Biochemistry, Georgia Institute of Technology, Atlanta, Georgia
- James C Gumbart
  - School of Physics, Georgia Institute of Technology, Atlanta, Georgia; School of Chemistry and Biochemistry, Georgia Institute of Technology, Atlanta, Georgia
- Julie C Mitchell
  - Biosciences Division, Oak Ridge National Laboratory, Oak Ridge, Tennessee
- Matthew P Torres
  - School of Biological Sciences, Georgia Institute of Technology, Atlanta, Georgia; School of Chemistry and Biochemistry, Georgia Institute of Technology, Atlanta, Georgia
7
Rezić I, Somogyi Škoc M. Computational Methodologies in Synthesis, Preparation and Application of Antimicrobial Polymers, Biomolecules, and Nanocomposites. Polymers (Basel) 2024; 16:2320. [PMID: 39204538 PMCID: PMC11359845 DOI: 10.3390/polym16162320] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/30/2024] [Revised: 08/05/2024] [Accepted: 08/14/2024] [Indexed: 09/04/2024] Open
Abstract
The design and optimization of antimicrobial materials (polymers, biomolecules, or nanocomposites) can be significantly advanced by computational methodologies such as molecular dynamics (MD), which provides insights into the interactions and stability of the antimicrobial agents within the polymer matrix, and machine learning (ML) or design of experiments (DOE), which predict and optimize antimicrobial efficacy and material properties. These innovations not only enhance the efficiency of developing antimicrobial polymers but also enable the creation of materials with tailored properties to meet specific application needs, ensuring safety and longevity in their usage. Therefore, this paper presents the computational methodologies employed in the synthesis and application of antimicrobial polymers, biomolecules, and nanocomposites. By leveraging advanced computational techniques such as MD, ML, or DOE, significant advancements in the design and optimization of antimicrobial materials are achieved. Recent progress is comprehensively reviewed, the contributions of the most relevant methodologies to state-of-the-art materials science are highlighted, and future directions in the field are outlined. Finally, future possibilities and opportunities are derived from the current state-of-the-art methodologies, providing perspectives on the potential evolution of polymer science and the engineering of novel materials.
Affiliation(s)
- Iva Rezić
  - Department of Applied Chemistry, Faculty of Textile Technology, University of Zagreb, 10000 Zagreb, Croatia
- Maja Somogyi Škoc
  - Department of Materials Testing, Faculty of Textile Technology, University of Zagreb, 10000 Zagreb, Croatia
8
Baik SM, Kwon HJ, Kim Y, Lee J, Park YH, Park DJ. Machine learning model for osteoporosis diagnosis based on bone turnover markers. Health Informatics J 2024; 30:14604582241270778. [PMID: 39115269 DOI: 10.1177/14604582241270778] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 09/18/2024]
Abstract
This study aimed to assess the diagnostic utility of bone turnover markers (BTMs) and demographic variables for identifying individuals with osteoporosis. A cross-sectional study involving 280 participants was conducted. Serum BTM values were obtained from 88 patients with osteoporosis and 192 controls without osteoporosis. Six machine learning models, including extreme gradient boosting (XGBoost), light gradient boosting machine (LGBM), CatBoost, random forest, support vector machine, and k-nearest neighbors, were employed to evaluate osteoporosis diagnosis. The performance measures included the area under the receiver operating characteristic curve (AUROC), F1-score, and accuracy. After AUROC optimization, LGBM exhibited the highest AUROC of 0.706. After F1-score optimization, LGBM's F1-score improved from 0.50 to 0.65. Combining the top three optimized models (LGBM, XGBoost, and CatBoost) resulted in an AUROC of 0.706, an F1-score of 0.65, and an accuracy of 0.73. BTMs, along with age and sex, were found to contribute significantly to osteoporosis diagnosis. This study demonstrates the potential of machine learning models utilizing BTMs and demographic variables for diagnosing preexisting osteoporosis. The findings highlight the clinical relevance of accessible clinical data in osteoporosis assessment, providing a promising tool for early diagnosis and management.
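For readers less familiar with the reported metrics, here is a minimal sketch of how F1-score and accuracy follow from a binary confusion matrix, using the standard definitions. The labels below are hypothetical and are not the study's data; the function names are illustrative only.

```python
def confusion_counts(y_true, y_pred):
    """Return (tp, fp, fn, tn) for binary labels, where 1 is the positive class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    return tp, fp, fn, tn


def f1_and_accuracy(y_true, y_pred):
    """F1 is the harmonic mean of precision and recall; accuracy is the hit rate."""
    tp, fp, fn, tn = confusion_counts(y_true, y_pred)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0:
        f1 = 0.0
    else:
        f1 = 2 * precision * recall / (precision + recall)
    accuracy = (tp + tn) / len(y_true)
    return f1, accuracy


# Hypothetical labels: 1 = osteoporosis, 0 = control.
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 1, 0, 0, 0, 0]
f1, accuracy = f1_and_accuracy(y_true, y_pred)
```

Because F1 ignores true negatives, it can differ noticeably from accuracy on imbalanced cohorts such as the 88 versus 192 split described above.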
Affiliation(s)
- Seung Min Baik
  - Division of Critical Care Medicine, Department of Surgery, Ewha Womans University Mokdong Hospital, Ewha Womans University College of Medicine, Seoul, Korea
  - Department of Surgery, Korea University College of Medicine, Seoul, Korea
- Hi Jeong Kwon
  - Department of Laboratory Medicine, College of Medicine, The Catholic University of Korea, Seoul, Korea
- Yeongsic Kim
  - Department of Laboratory Medicine, College of Medicine, The Catholic University of Korea, Seoul, Korea
- Jehoon Lee
  - Department of Laboratory Medicine, Eunpyeong St. Mary's Hospital, College of Medicine, The Catholic University of Korea, Seoul, Korea
- Young Hoon Park
  - Division of Hematology, Department of Internal Medicine, Ewha Womans University Mokdong Hospital, Ewha Womans University College of Medicine, Seoul, Korea
- Dong Jin Park
  - Department of Laboratory Medicine, Eunpyeong St. Mary's Hospital, College of Medicine, The Catholic University of Korea, Seoul, Korea
9
Liang Q, Liu Z, Liang Z, Zhu C, Li D, Kong Q, Mou H. Development strategies and application of antimicrobial peptides as future alternatives to in-feed antibiotics. The Science of the Total Environment 2024; 927:172150. [PMID: 38580107 DOI: 10.1016/j.scitotenv.2024.172150] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/12/2024] [Revised: 03/14/2024] [Accepted: 03/30/2024] [Indexed: 04/07/2024]
Abstract
The use of in-feed antibiotics has been widely restricted due to the significant environmental pollution and food safety concerns they have caused. Antimicrobial peptides (AMPs) have attracted widespread attention as potential future alternatives to in-feed antibiotics owing to their demonstrated antimicrobial activity and environmentally friendly characteristics. However, the weak bioactivity, insufficient stability, and low production yields of natural AMPs impede their practical application in the feed industry. To address these problems, efforts have been made to develop strategies for obtaining AMPs with enhanced properties. Herein, we summarize approaches to improving the properties of AMPs as potential alternatives to in-feed antibiotics, mainly including optimization of structural parameters, sequence modification, selection of microbial hosts, fusion expression, and industrial fermentation control. Additionally, the potential for application of AMPs in animal husbandry is discussed. This comprehensive review lays a strong theoretical foundation for the development of in-feed AMPs to promote public health globally.
Affiliation(s)
- Qingping Liang
  - College of Food Science and Engineering, Ocean University of China, Qingdao 266404, China
- Zhemin Liu
  - Fundamental Science R&D Center of Vazyme Biotech Co. Ltd., Nanjing 210000, China
- Ziyu Liang
  - Section of Neurobiology, Department of Biological Sciences, University of Southern California, Los Angeles, CA 90089, USA
- Changliang Zhu
  - College of Food Science and Engineering, Ocean University of China, Qingdao 266404, China
- Dongyu Li
  - College of Food Science and Engineering, Ocean University of China, Qingdao 266404, China
- Qing Kong
  - College of Food Science and Engineering, Ocean University of China, Qingdao 266404, China
- Haijin Mou
  - College of Food Science and Engineering, Ocean University of China, Qingdao 266404, China
10
Tang X, Dai H, Knight E, Wu F, Li Y, Li T, Gerstein M. A survey of generative AI for de novo drug design: new frontiers in molecule and protein generation. Brief Bioinform 2024; 25:bbae338. [PMID: 39007594 PMCID: PMC11247410 DOI: 10.1093/bib/bbae338] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/10/2024] [Revised: 05/21/2024] [Accepted: 06/27/2024] [Indexed: 07/16/2024] Open
Abstract
Artificial intelligence (AI)-driven methods can vastly improve the historically costly drug design process, with various generative models already in widespread use. Generative models for de novo drug design, in particular, focus on the creation of novel biological compounds entirely from scratch, representing a promising future direction. Rapid development in the field, combined with the inherent complexity of the drug design process, creates a difficult landscape for new researchers to enter. In this survey, we organize de novo drug design into two overarching themes: small molecule and protein generation. Within each theme, we identify a variety of subtasks and applications, highlighting important datasets, benchmarks, and model architectures and comparing the performance of top models. We take a broad approach to AI-driven drug design, allowing for both micro-level comparisons of various methods within each subtask and macro-level observations across different fields. We discuss parallel challenges and approaches between the two applications and highlight future directions for AI-driven de novo drug design as a whole. An organized repository of all covered sources is available at https://github.com/gersteinlab/GenAI4Drug.
Affiliation(s)
- Xiangru Tang
  - Department of Computer Science, Yale University, New Haven, CT 06520, United States
- Howard Dai
  - Department of Computer Science, Yale University, New Haven, CT 06520, United States
- Elizabeth Knight
  - School of Medicine, Yale University, New Haven, CT 06520, United States
- Fang Wu
  - Computer Science Department, Stanford University, CA 94305, United States
- Yunyang Li
  - Department of Computer Science, Yale University, New Haven, CT 06520, United States
- Tianxiao Li
  - Program in Computational Biology & Bioinformatics, Yale University, New Haven, CT 06520, United States
- Mark Gerstein
  - Department of Computer Science, Yale University, New Haven, CT 06520, United States
  - Program in Computational Biology & Bioinformatics, Yale University, New Haven, CT 06520, United States
  - Department of Statistics & Data Science, Yale University, New Haven, CT 06520, United States
  - Department of Biomedical Informatics & Data Science, Yale University, New Haven, CT 06520, United States
  - Department of Molecular Biophysics & Biochemistry, Yale University, New Haven, CT 06520, United States
11
Vigna V, Cova TFGG, Nunes SCC, Pais AACC, Sicilia E. Machine Learning-Based Prediction of Reduction Potentials for PtIV Complexes. J Chem Inf Model 2024; 64:3733-3743. [PMID: 38683970 DOI: 10.1021/acs.jcim.4c00315] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/02/2024]
Abstract
Some of the well-known drawbacks of clinically approved PtII complexes can be overcome using six-coordinate PtIV complexes as inert prodrugs, which release the corresponding four-coordinate active PtII species upon reduction by cellular reducing agents. Therefore, the key factor in the mechanism of action of PtIV prodrugs is their tendency to be reduced, which, when the mechanism involved is of the outer-sphere type, is measured by the value of the reduction potential. Machine learning (ML) models can be used to effectively capture intricate relationships within PtIV complex data, leading to highly accurate predictions of reduction potentials and other properties, and offering significant insights into their electrochemical behavior and potential applications. In this study, a machine learning-based approach for predicting the reduction potentials of PtIV complexes based on relevant molecular descriptors is presented. Leveraging a data set of experimentally determined reduction potentials and a diverse range of molecular descriptors, the proposed model demonstrates remarkable predictive accuracy (MSE = 0.016 V², RMSE = 0.13 V, R² = 0.92). Ab initio calculations and a set of different machine learning algorithms and feature engineering techniques have been employed to systematically explore the relationship between molecular structure and similarity and reduction potential. Specifically, it has been investigated whether the reduction potential of these compounds can be described by combining ML models across different combinations of constitutional, topological, and electronic molecular descriptors. Our results not only provide insights into the crucial factors influencing reduction potentials but also offer a rapid and effective tool for the rational design of PtIV complexes with tailored electrochemical properties for pharmaceutical applications. This approach has the potential to significantly expedite the development and screening of novel PtIV prodrug candidates.
The analysis of principal components and key features extracted from the model highlights the significance of structural descriptors of the 2D Atom Pairs type and the lowest unoccupied molecular orbital energy. Specifically, with just 20 appropriately selected descriptors, a notable separation of complexes based on their reduction potential value is achieved.
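The three error metrics quoted in this abstract are related by standard definitions, which the sketch below spells out. The reported values are internally consistent, since the square root of 0.016 V² is about 0.126 V, which rounds to the stated 0.13 V. The example potentials are hypothetical and are not taken from the paper's data set.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return (MSE, RMSE, R^2) using the standard definitions."""
    n = len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    mse = ss_res / n
    rmse = math.sqrt(mse)  # same unit as the target (V)
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    r2 = 1.0 - ss_res / ss_tot
    return mse, rmse, r2


# Hypothetical reduction potentials in volts (measured vs predicted).
measured = [-0.5, -0.3, -0.1, 0.1]
predicted = [-0.4, -0.3, -0.2, 0.1]
mse, rmse, r2 = regression_metrics(measured, predicted)

# Consistency check on the values reported in the abstract.
reported_rmse = math.sqrt(0.016)
```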
Affiliation(s)
- V Vigna
  - PROMOCS Laboratory, Department of Chemistry and Chemical Technologies, University of Calabria, Arcavacata di Rende 87036, Italy
- T F G G Cova
  - Coimbra Chemistry Centre, Department of Chemistry, Institute of Molecular Sciences (IMS), Faculty of Sciences and Technology, University of Coimbra, Coimbra 3004-535, Portugal
- S C C Nunes
  - Coimbra Chemistry Centre, Department of Chemistry, Institute of Molecular Sciences (IMS), Faculty of Sciences and Technology, University of Coimbra, Coimbra 3004-535, Portugal
- A A C C Pais
  - Coimbra Chemistry Centre, Department of Chemistry, Institute of Molecular Sciences (IMS), Faculty of Sciences and Technology, University of Coimbra, Coimbra 3004-535, Portugal
- E Sicilia
  - PROMOCS Laboratory, Department of Chemistry and Chemical Technologies, University of Calabria, Arcavacata di Rende 87036, Italy
12
Kumar N, Acharya V. Advances in machine intelligence-driven virtual screening approaches for big-data. Med Res Rev 2024; 44:939-974. [PMID: 38129992 DOI: 10.1002/med.21995] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/12/2022] [Revised: 07/15/2023] [Accepted: 10/29/2023] [Indexed: 12/23/2023]
Abstract
Virtual screening (VS) is an integral and ever-evolving domain of the drug discovery framework. VS is traditionally classified into ligand-based (LB) and structure-based (SB) approaches. Machine intelligence, or artificial intelligence, has wide applications in the drug discovery domain to reduce time and resource consumption. In combination with machine intelligence algorithms, VS has emerged as a revolutionary technology that learns within robust decision orders for data curation and hit molecule screening from large VS libraries in minutes or hours. The exponential growth of chemical and biological data, which has evolved into "big-data" in the public domain, demands modern and advanced machine intelligence-driven VS approaches to screen hit molecules from ultra-large VS libraries. VS has evolved from individual approaches (LB and SB) to integrated LB and SB techniques that explore various ligand and target protein aspects to enhance the rate of appropriate hit molecule prediction. Current trends demand advanced and intelligent solutions to handle enormous data in the drug discovery domain for screening and optimizing hits or leads with few or no false positive hits. Following the big-data drift and the tremendous growth in computational architecture, we present this review. The article categorizes and emphasizes individual VS techniques, details the literature on machine learning implementation, modern machine intelligence approaches, and their limitations, and deliberates on future prospects.
Affiliation(s)
- Neeraj Kumar
- Artificial Intelligence for Computational Biology Lab (AICoB), Biotechnology Division, CSIR-Institute of Himalayan Bioresource Technology, Palampur, Himachal Pradesh, India
- Academy of Scientific and Innovative Research, Ghaziabad, India
| | - Vishal Acharya
- Artificial Intelligence for Computational Biology Lab (AICoB), Biotechnology Division, CSIR-Institute of Himalayan Bioresource Technology, Palampur, Himachal Pradesh, India
- Academy of Scientific and Innovative Research, Ghaziabad, India
| |
|
13
|
Zhang X, Gao H, Wang H, Chen Z, Zhang Z, Chen X, Li Y, Qi Y, Wang R. PLANET: A Multi-objective Graph Neural Network Model for Protein-Ligand Binding Affinity Prediction. J Chem Inf Model 2024; 64:2205-2220. [PMID: 37319418 DOI: 10.1021/acs.jcim.3c00253] [Citation(s) in RCA: 15] [Impact Index Per Article: 15.0] [Indexed: 06/17/2023]
Abstract
Predicting protein-ligand binding affinity is a central issue in drug design. Various deep learning models have been published in recent years, where many of them rely on 3D protein-ligand complex structures as input and tend to focus on the single task of reproducing binding affinity. In this study, we have developed a graph neural network model called PLANET (Protein-Ligand Affinity prediction NETwork). This model takes the graph-represented 3D structure of the binding pocket on the target protein and the 2D chemical structure of the ligand molecule as input. It was trained through a multi-objective process with three related tasks, including deriving the protein-ligand binding affinity, protein-ligand contact map, and ligand distance matrix. Besides the protein-ligand complexes with known binding affinity data retrieved from the PDBbind database, a large number of non-binder decoys were also added to the training data for deriving the final model of PLANET. When tested on the CASF-2016 benchmark, PLANET exhibited a scoring power comparable to the best result yielded by other deep learning models as well as a reasonable ranking power and docking power. In virtual screening trials conducted on the DUD-E benchmark, PLANET's performance was notably better than several deep learning and machine learning models. As on the LIT-PCBA benchmark, PLANET achieved comparable accuracy as the conventional docking program Glide, but it only spent less than 1% of Glide's computation time to finish the same job because PLANET did not need exhaustive conformational sampling. Considering the decent accuracy and efficiency of PLANET in binding affinity prediction, it may become a useful tool for conducting large-scale virtual screening.
Affiliation(s)
- Xiangying Zhang
- Department of Medicinal Chemistry, School of Pharmacy, Fudan University, 826 Zhangheng Road, Shanghai 201203, People's Republic of China
| | - Haotian Gao
- Department of Medicinal Chemistry, School of Pharmacy, Fudan University, 826 Zhangheng Road, Shanghai 201203, People's Republic of China
| | - Haojie Wang
- Department of Medicinal Chemistry, School of Pharmacy, Fudan University, 826 Zhangheng Road, Shanghai 201203, People's Republic of China
| | - Zhihang Chen
- Department of Medicinal Chemistry, School of Pharmacy, Fudan University, 826 Zhangheng Road, Shanghai 201203, People's Republic of China
| | - Zhe Zhang
- Department of Medicinal Chemistry, School of Pharmacy, Fudan University, 826 Zhangheng Road, Shanghai 201203, People's Republic of China
| | - Xinchong Chen
- Department of Medicinal Chemistry, School of Pharmacy, Fudan University, 826 Zhangheng Road, Shanghai 201203, People's Republic of China
| | - Yan Li
- Department of Medicinal Chemistry, School of Pharmacy, Fudan University, 826 Zhangheng Road, Shanghai 201203, People's Republic of China
| | - Yifei Qi
- Department of Medicinal Chemistry, School of Pharmacy, Fudan University, 826 Zhangheng Road, Shanghai 201203, People's Republic of China
| | - Renxiao Wang
- Department of Medicinal Chemistry, School of Pharmacy, Fudan University, 826 Zhangheng Road, Shanghai 201203, People's Republic of China
| |
|
14
|
Mastrolorito F, Togo MV, Gambacorta N, Trisciuzzi D, Giannuzzi V, Bonifazi F, Liantonio A, Imbrici P, De Luca A, Altomare CD, Ciriaco F, Amoroso N, Nicolotti O. TISBE: A Public Web Platform for the Consensus-Based Explainable Prediction of Developmental Toxicity. Chem Res Toxicol 2024; 37:323-339. [PMID: 38200616 DOI: 10.1021/acs.chemrestox.3c00310] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/12/2024]
Abstract
Despite being extremely relevant for the protection of prenatal and neonatal health, the developmental toxicity (Dev Tox) is a highly complex endpoint whose molecular rationale is still largely unknown. The lack of availability of high-quality data as well as robust nontesting methods makes its understanding even more difficult. Thus, the application of new explainable alternative methods is of utmost importance, with Dev Tox being one of the most animal-intensive research themes of regulatory toxicology. Descending from TIRESIA (Toxicology Intelligence and Regulatory Evaluations for Scientific and Industry Applications), the present work describes TISBE (TIRESIA Improved on Structure-Based Explainability), a new public web platform implementing four fundamental advancements for in silico analyses: a three times larger dataset, a transparent XAI (explainable artificial intelligence) framework employing a fragment-based fingerprint coding, a novel consensus classifier based on five independent machine learning models, and a new applicability domain (AD) method based on a double top-down approach for better estimating the prediction reliability. The training set (TS) includes as many as 1008 chemicals annotated with experimental toxicity values. Based on a 5-fold cross-validation, a median value of 0.410 for the Matthews correlation coefficient was calculated; TISBE was very effective, with a median value of sensitivity and specificity equal to 0.984 and 0.274, respectively. TISBE was applied on two external pools made of 1484 bioactive compounds and 85 pediatric drugs taken from ChEMBL (Chemical European Molecular Biology Laboratory) and TEDDY (Task-Force in Europe for Drug Development in the Young) repositories, respectively. Notably, TISBE gives users the option to clearly spot the molecular fragments responsible for the toxicity or the safety of a given chemical query and is available for free at https://prometheus.farmacia.uniba.it/tisbe.
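The abstract above reports a high sensitivity (0.984) alongside a low specificity (0.274) and a moderate Matthews correlation coefficient (0.410). As a minimal sketch of how these three quantities relate to a binary confusion matrix, the following uses the standard textbook definitions. The function name and the counts in the example are hypothetical and are not taken from the TISBE paper.

```python
import math


def confusion_metrics(tp, fp, tn, fn):
    """Return (sensitivity, specificity, MCC) from binary confusion-matrix counts."""
    sensitivity = tp / (tp + fn)  # true positive rate
    specificity = tn / (tn + fp)  # true negative rate
    # MCC uses all four cells, so it stays moderate when one class is predicted poorly.
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return sensitivity, specificity, mcc


if __name__ == "__main__":
    # Hypothetical counts, chosen only to illustrate the calculation.
    sens, spec, mcc = confusion_metrics(tp=90, fp=40, tn=60, fn=10)
    print(f"sensitivity={sens:.3f} specificity={spec:.3f} MCC={mcc:.3f}")
```

Because MCC accounts for both classes, a classifier that flags almost everything as toxic can score very high on sensitivity while its MCC remains modest, which is consistent with the pattern of values quoted in the abstract.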
Affiliation(s)
- Fabrizio Mastrolorito
- Dipartimento di Farmacia-Scienze del Farmaco, Università degli Studi di Bari Aldo Moro, 70125 Bari, Italy
| | - Maria Vittoria Togo
- Dipartimento di Farmacia-Scienze del Farmaco, Università degli Studi di Bari Aldo Moro, 70125 Bari, Italy
| | - Nicola Gambacorta
- Dipartimento di Farmacia-Scienze del Farmaco, Università degli Studi di Bari Aldo Moro, 70125 Bari, Italy
| | - Daniela Trisciuzzi
- Dipartimento di Farmacia-Scienze del Farmaco, Università degli Studi di Bari Aldo Moro, 70125 Bari, Italy
| | - Viviana Giannuzzi
- Fondazione per la Ricerca Farmacologica Gianni Benzi Onlus, 70010 Valenzano (BA), Italy
| | - Fedele Bonifazi
- Fondazione per la Ricerca Farmacologica Gianni Benzi Onlus, 70010 Valenzano (BA), Italy
| | - Antonella Liantonio
- Dipartimento di Farmacia-Scienze del Farmaco, Università degli Studi di Bari Aldo Moro, 70125 Bari, Italy
| | - Paola Imbrici
- Dipartimento di Farmacia-Scienze del Farmaco, Università degli Studi di Bari Aldo Moro, 70125 Bari, Italy
| | - Annamaria De Luca
- Dipartimento di Farmacia-Scienze del Farmaco, Università degli Studi di Bari Aldo Moro, 70125 Bari, Italy
| | - Cosimo Damiano Altomare
- Dipartimento di Farmacia-Scienze del Farmaco, Università degli Studi di Bari Aldo Moro, 70125 Bari, Italy
| | - Fulvio Ciriaco
- Dipartimento di Chimica, Università degli Studi di Bari Aldo Moro, 70125 Bari, Italy
| | - Nicola Amoroso
- Dipartimento di Farmacia-Scienze del Farmaco, Università degli Studi di Bari Aldo Moro, 70125 Bari, Italy
- Istituto Nazionale di Fisica Nucleare, Sezione di Bari, 70125 Bari, Italy
| | - Orazio Nicolotti
- Dipartimento di Farmacia-Scienze del Farmaco, Università degli Studi di Bari Aldo Moro, 70125 Bari, Italy
| |
|
15
|
Yang S, Kar S. Protracted molecular dynamics and secondary structure introspection to identify dual-target inhibitors of Nipah virus exerting approved small molecules repurposing. Sci Rep 2024; 14:3696. [PMID: 38355980 PMCID: PMC10866979 DOI: 10.1038/s41598-024-54281-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/01/2023] [Accepted: 02/10/2024] [Indexed: 02/16/2024] Open
Abstract
Nipah virus (NiV), with its significantly higher mortality rate compared to COVID-19, presents a looming threat as a potential next pandemic, particularly if constant mutations of NiV increase its transmissibility and transmission. Considering the importance of preventing the facilitation of the virus entry into host cells averting the process of assembly forming the viral envelope, and encapsulating the nucleocapsid, it is crucial to take the Nipah attachment glycoprotein-human ephrin-B2 and matrix protein as dual targets. Repurposing approved small molecules in drug development is a strategic choice, as it leverages molecules with known safety profiles, accelerating the path to finding effective treatments against NiV. The approved small molecules from DrugBank were used for repurposing and were subjected to extra precision docking followed by absorption, distribution, metabolism, excretion, and toxicity (ADMET) profiling. The 4 best molecules were selected for 500 ns molecular dynamics (MD) simulation followed by Molecular mechanics with generalized Born and surface area solvation (MM-GBSA). Further, the free energy landscape, the principal component analysis followed by the defined secondary structure of proteins analysis were introspected. The inclusive analysis proposed that Iotrolan (DB09487) and Iodixanol (DB01249) are effective dual inhibitors, while Rutin (DB01698) and Lactitol (DB12942) were found to actively target the matrix protein only.
Affiliation(s)
- Siyun Yang
- Chemometrics and Molecular Modeling Laboratory, Department of Chemistry and Physics, Kean University, 1000 Morris Avenue, Union, NJ, 07083, USA
| | - Supratik Kar
- Chemometrics and Molecular Modeling Laboratory, Department of Chemistry and Physics, Kean University, 1000 Morris Avenue, Union, NJ, 07083, USA.
| |
|
16
|
Serafim MSM, Kronenberger T, Rocha REO, Rosa ADRA, Mello TLG, Poso A, Ferreira RS, Abrahão JS, Kroon EG, Mota BEF, Maltarollo VG. Aminopyrimidine Derivatives as Multiflavivirus Antiviral Compounds Identified from a Consensus Virtual Screening Approach. J Chem Inf Model 2024; 64:393-411. [PMID: 38194508 DOI: 10.1021/acs.jcim.3c01505] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/11/2024]
Abstract
Around three billion people are at risk of infection by the dengue virus (DENV) and potentially other flaviviruses. Worldwide outbreaks of DENV, Zika virus (ZIKV), and yellow fever virus (YFV), the lack of antiviral drugs, and limitations on vaccine usage emphasize the need for novel antiviral research. Here, we propose a consensus virtual screening approach to discover potential protease inhibitors (NS3pro) against different flavivirus. We employed an in silico combination of a hologram quantitative structure-activity relationship (HQSAR) model and molecular docking on characterized binding sites followed by molecular dynamics (MD) simulations, which filtered a data set of 7.6 million compounds to 2,775 hits. Lastly, docking and MD simulations selected six final potential NS3pro inhibitors with stable interactions along the simulations. Five compounds had their antiviral activity confirmed against ZIKV, YFV, DENV-2, and DENV-3 (ranging from 4.21 ± 0.14 to 37.51 ± 0.8 μM), displaying aggregator characteristics for enzymatic inhibition against ZIKV NS3pro (ranging from 28 ± 7 to 70 ± 7 μM). Taken together, the compounds identified in this approach may contribute to the design of promising candidates to treat different flavivirus infections.
Affiliation(s)
- Mateus Sá Magalhães Serafim
- Departamento de Microbiologia, Instituto de Ciências Biológicas, Universidade Federal de Minas Gerais (UFMG), Belo Horizonte, MG 31270-901, Brazil
| | - Thales Kronenberger
- Institute of Pharmacy, Pharmaceutical/Medicinal Chemistry and Tübingen Center for Academic Drug Discovery (TüCAD2), Eberhard Karls University Tübingen, Auf der Morgenstelle 8, Tübingen 72076, Germany
- Excellence Cluster "Controlling Microbes to Fight Infections" (CMFI), Tübingen 72076, Germany
- School of Pharmacy, Faculty of Health Sciences, University of Eastern Finland, Kuopio 70211, Finland
| | - Rafael Eduardo Oliveira Rocha
- Departamento de Bioquímica e Imunologia, Instituto de Ciências Biológicas, Universidade Federal de Minas Gerais (UFMG), Belo Horizonte, MG 31270-901, Brazil
| | - Amanda Del Rio Abreu Rosa
- Departamento de Bioquímica e Imunologia, Instituto de Ciências Biológicas, Universidade Federal de Minas Gerais (UFMG), Belo Horizonte, MG 31270-901, Brazil
| | - Thaysa Lara Gonçalves Mello
- Departamento de Produtos Farmacêuticos, Faculdade de Farmácia, Universidade Federal de Minas Gerais (UFMG), Belo Horizonte, MG 31270-901, Brazil
| | - Antti Poso
- Institute of Pharmacy, Pharmaceutical/Medicinal Chemistry and Tübingen Center for Academic Drug Discovery (TüCAD2), Eberhard Karls University Tübingen, Auf der Morgenstelle 8, Tübingen 72076, Germany
- School of Pharmacy, Faculty of Health Sciences, University of Eastern Finland, Kuopio 70211, Finland
- Department of Medical Oncology and Pneumology, University Hospital of Tübingen, Tübingen 70211, Germany
| | - Rafaela Salgado Ferreira
- Departamento de Bioquímica e Imunologia, Instituto de Ciências Biológicas, Universidade Federal de Minas Gerais (UFMG), Belo Horizonte, MG 31270-901, Brazil
| | - Jonatas Santos Abrahão
- Departamento de Microbiologia, Instituto de Ciências Biológicas, Universidade Federal de Minas Gerais (UFMG), Belo Horizonte, MG 31270-901, Brazil
| | - Erna Geessien Kroon
- Departamento de Microbiologia, Instituto de Ciências Biológicas, Universidade Federal de Minas Gerais (UFMG), Belo Horizonte, MG 31270-901, Brazil
| | - Bruno Eduardo Fernandes Mota
- Departamento de Análises Clínicas e Toxicológicas, Faculdade de Farmácia, Universidade Federal de Minas Gerais (UFMG), Belo Horizonte, MG 31270-901, Brazil
| | - Vinícius Gonçalves Maltarollo
- Departamento de Produtos Farmacêuticos, Faculdade de Farmácia, Universidade Federal de Minas Gerais (UFMG), Belo Horizonte, MG 31270-901, Brazil
| |
|
17
|
Shomope I, Percival KM, Abdel Jabbar NM, Husseini GA. Predicting Calcein Release from Ultrasound-Targeted Liposomes: A Comparative Analysis of Random Forest and Support Vector Machine. Technol Cancer Res Treat 2024; 23:15330338241296725. [PMID: 39539114 PMCID: PMC11561990 DOI: 10.1177/15330338241296725] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/03/2024] [Revised: 09/29/2024] [Accepted: 10/15/2024] [Indexed: 11/16/2024] Open
Abstract
OBJECTIVE This study presents a comparative analysis of RF and SVM for predicting calcein release from ultrasound-triggered, targeted liposomes under varied low-frequency ultrasound (LFUS) power densities (6.2, 9, and 10 mW/cm²). METHODS Liposomes loaded with calcein and targeted with seven different moieties (cRGD, estrone, folate, Herceptin, hyaluronic acid, lactobionic acid, and transferrin) were synthesized using the thin-film hydration method. The liposomes were characterized using Dynamic Light Scattering and Bicinchoninic Acid assays. Extensive data collection and preprocessing were performed. RF and SVM models were trained and evaluated using mean absolute error (MAE), mean squared error (MSE), coefficient of determination (R²), and the a20 index as performance metrics. RESULTS RF consistently outperformed SVM, achieving R² scores above 0.96 across all power densities, particularly excelling at higher power densities and indicating a strong correlation with the actual data. CONCLUSION RF outperforms SVM in drug release prediction, though both show strengths and apply based on specific prediction needs.
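The four performance metrics named in the abstract above can be computed with a few lines of code. The sketch below is illustrative only: the function name and the sample values are hypothetical, and the a20 index is implemented under the commonly used definition (the fraction of samples whose predicted-to-measured ratio lies between 0.80 and 1.20), which the abstract itself does not spell out.

```python
def regression_metrics(y_true, y_pred):
    """Return MAE, MSE, R^2 and the a20 index for paired measurements and predictions."""
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]
    mae = sum(abs(e) for e in errors) / n
    mse = sum(e * e for e in errors) / n
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    # R^2 = 1 - SS_res / SS_tot; undefined when all measurements are identical.
    r2 = 1.0 - (mse * n) / ss_tot if ss_tot else float("nan")
    # Assumed a20 definition: share of samples with 0.8 <= predicted/measured <= 1.2.
    # Samples with a measured value of zero cannot form a ratio and count as misses.
    within = sum(1 for t, p in zip(y_true, y_pred) if t != 0 and 0.8 <= p / t <= 1.2)
    return {"mae": mae, "mse": mse, "r2": r2, "a20": within / n}


if __name__ == "__main__":
    # Hypothetical release percentages, not data from the study.
    measured = [10.0, 20.0, 30.0, 40.0]
    predicted = [11.0, 18.0, 33.0, 60.0]
    print(regression_metrics(measured, predicted))
```

MAE and MSE are scale-dependent error magnitudes, R² measures explained variance, and the a20 index reports how often a prediction falls within a ±20% band, so the four together describe both average error and per-sample reliability.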
Affiliation(s)
- Ibrahim Shomope
- Department of Chemical and Biological Engineering, American University of Sharjah, Sharjah 26666, United Arab Emirates
| | - Kelly M. Percival
- Department of Chemical and Biological Engineering, American University of Sharjah, Sharjah 26666, United Arab Emirates
| | - Nabil M. Abdel Jabbar
- Department of Chemical and Biological Engineering, American University of Sharjah, Sharjah 26666, United Arab Emirates
| | - Ghaleb A. Husseini
- Department of Chemical and Biological Engineering, American University of Sharjah, Sharjah 26666, United Arab Emirates
- Materials Science and Engineering Program, College of Arts and Sciences, American University of Sharjah, Sharjah PO Box 26666, United Arab Emirates
| |
|
18
|
Li Y, Fan Z, Rao J, Chen Z, Chu Q, Zheng M, Li X. An overview of recent advances and challenges in predicting compound-protein interaction (CPI). MEDICAL REVIEW (2021) 2023; 3:465-486. [PMID: 38282802 PMCID: PMC10808869 DOI: 10.1515/mr-2023-0030] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/18/2023] [Accepted: 08/30/2023] [Indexed: 01/30/2024]
Abstract
Compound-protein interactions (CPIs) are critical in drug discovery for identifying therapeutic targets, drug side effects, and repurposing existing drugs. Machine learning (ML) algorithms have emerged as powerful tools for CPI prediction, offering notable advantages in cost-effectiveness and efficiency. This review provides an overview of recent advances in both structure-based and non-structure-based CPI prediction ML models, highlighting their performance and achievements. It also offers insights into CPI prediction-related datasets and evaluation benchmarks. Lastly, the article presents a comprehensive assessment of the current landscape of CPI prediction, elucidating the challenges faced and outlining emerging trends to advance the field.
Affiliation(s)
- Yanbei Li
- School of Pharmaceutical Science and Technology, Hangzhou Institute for Advanced Study, UCAS, Hangzhou, Zhejiang Province, China
- Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, Shanghai, China
- University of Chinese Academy of Sciences, Beijing, China
| | - Zhehuan Fan
- Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, Shanghai, China
- University of Chinese Academy of Sciences, Beijing, China
| | - Jingxin Rao
- Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, Shanghai, China
- University of Chinese Academy of Sciences, Beijing, China
| | - Zhiyi Chen
- School of Pharmaceutical Science and Technology, Hangzhou Institute for Advanced Study, UCAS, Hangzhou, Zhejiang Province, China
- Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, Shanghai, China
- University of Chinese Academy of Sciences, Beijing, China
| | - Qinyu Chu
- School of Pharmaceutical Science and Technology, Hangzhou Institute for Advanced Study, UCAS, Hangzhou, Zhejiang Province, China
- Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, Shanghai, China
- University of Chinese Academy of Sciences, Beijing, China
| | - Mingyue Zheng
- School of Pharmaceutical Science and Technology, Hangzhou Institute for Advanced Study, UCAS, Hangzhou, Zhejiang Province, China
- Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, Shanghai, China
- University of Chinese Academy of Sciences, Beijing, China
| | - Xutong Li
- Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, Shanghai, China
- University of Chinese Academy of Sciences, Beijing, China
| |
|
19
|
Rai M, Singh AV, Paudel N, Kanase A, Falletta E, Kerkar P, Heyda J, Barghash RF, Pratap Singh S, Soos M. Herbal concoction Unveiled: A computational analysis of phytochemicals' pharmacokinetic and toxicological profiles using novel approach methodologies (NAMs). Curr Res Toxicol 2023; 5:100118. [PMID: 37609475 PMCID: PMC10440360 DOI: 10.1016/j.crtox.2023.100118] [Citation(s) in RCA: 8] [Impact Index Per Article: 4.0] [Received: 06/05/2023] [Revised: 08/08/2023] [Accepted: 08/09/2023] [Indexed: 08/24/2023] Open
Abstract
Herbal medications have an extensive history of use in treating various diseases, attributed to their perceived efficacy and safety. Traditional medicine practitioners and contemporary healthcare providers have shown particular interest in herbal syrups, especially for respiratory illnesses associated with the SARS-CoV-2 virus. However, the current understanding of the pharmacokinetic and toxicological properties of phytochemicals in these herbal mixtures is limited. This study presents a comprehensive computational analysis utilizing novel approach methodologies (NAMs) to investigate the pharmacokinetic and toxicological profiles of phytochemicals in herbal syrup, leveraging in-silico techniques and prediction tools such as PubChem, SwissADME, and Molsoft's database. Although molecular dynamics, docking, and broader system-wide analyses were not considered, future studies hold potential for further investigation in these areas. By combining drug-likeness with molecular simulation, researchers identify diverse phytochemicals suitable for complex medication development examining their pharmacokinetic-toxicological profiles in phytopharmaceutical syrup. The study focuses on herbal solutions for respiratory infections, with the goal of adding to the pool of all-natural treatments for such ailments. This research has the potential to revolutionize environmental and alternative medicine by leveraging in-silico models and innovative analytical techniques to identify novel phytochemicals with enhanced therapeutic benefits and explore network-based and systems biology approaches for a deeper understanding of their interactions with biological systems. Overall, our study offers valuable insights into the computational analysis of the pharmacokinetic and toxicological profiles of herbal concoction. This paves the way for advancements in environmental and alternative medicine. 
However, we acknowledge the need for future studies to address the aforementioned topics that were not adequately covered in this research.
Affiliation(s)
- Mansi Rai
- Department of Microbiology, Central University of Rajasthan NH-8, Bandar Sindri, Dist-Ajmer-305817, Rajasthan, India
| | - Ajay Vikram Singh
- Department of Chemical and Product Safety, German Federal Institute of Risk Assessment (BfR), Maxdohrnstrasse 8-10, 10589 Berlin, Germany
| | - Namuna Paudel
- Department of Chemistry, Amrit Campus, Institute of Science and Technology, Tribhuvan University, Lainchaur, Kathmandu 44600, Nepal
| | - Anurag Kanase
- Opentrons Labworks Inc., Brooklyn, NY 11201, the United States of America
| | - Ermelinda Falletta
- Department of Chemistry, University of Milan, Via Golgi 19, 20133 Milan, Italy
| | - Pranali Kerkar
- Rutgers School of Public Health, 683 Hoes Lane West Piscataway, NJ 08854, the United States of America
| | - Jan Heyda
- Department of Physical Chemistry, University of Chemistry and Technology Prague, Technicka 5, Prague 6 Dejvice, 166 28, Czech Republic
| | - Reham F. Barghash
- Institute of Chemical Industries Researches, National Research Centre, Dokki, Cairo 12622, Egypt
| | | | - Miroslav Soos
- Department of Chemical Engineering, University of Chemistry and Technology Prague, Technicka 3, Prague 6 Dejvice, 166 28, Czech Republic
| |
|
20
|
Gupta Y, Savytskyi OV, Coban M, Venugopal A, Pleqi V, Weber CA, Chitale R, Durvasula R, Hopkins C, Kempaiah P, Caulfield TR. Protein structure-based in-silico approaches to drug discovery: Guide to COVID-19 therapeutics. Mol Aspects Med 2023; 91:101151. [PMID: 36371228 PMCID: PMC9613808 DOI: 10.1016/j.mam.2022.101151] [Citation(s) in RCA: 15] [Impact Index Per Article: 7.5] [Received: 08/14/2022] [Revised: 10/19/2022] [Accepted: 10/21/2022] [Indexed: 11/06/2022]
Abstract
With more than 5 million fatalities and close to 300 million reported cases, COVID-19 is the first documented pandemic due to a coronavirus that continues to be a major health challenge. Despite being rapid, uncontrollable, and highly infectious in its spread, it also created incentives for technology development and redefined public health needs and research agendas to fast-track innovations to be translated. Breakthroughs in computational biology peaked during the pandemic with renewed attention to making all cutting-edge technology deliver agents to combat the disease. The demand to develop effective treatments yielded surprising collaborations from previously segregated fields of science and technology. The long-standing pharmaceutical industry's aversion to repurposing existing drugs due to a lack of exponential financial gain was overrun by the health crisis and pressures created by front-line researchers and providers. Effective vaccine development even at an unprecedented pace took more than a year to develop and commence trials. Now the emergence of variants and waning protections during the booster shots is resulting in breakthrough infections that continue to strain health care systems. As of now, every protein of SARS-CoV-2 has been structurally characterized and related host pathways have been extensively mapped out. The research community has addressed the druggability of a multitude of possible targets. This has been made possible due to existing technology for virtual computer-assisted drug development as well as new tools and technologies such as artificial intelligence to deliver new leads. Here in this article, we are discussing advances in the drug discovery field related to target-based drug discovery and exploring the implications of known target-specific agents on COVID-19 therapeutic management. 
The current scenario calls for more personalized medicine efforts and stratifying patient populations early on for their need for different combinations of prognosis-specific therapeutics. We intend to highlight target hotspots and their potential agents, with the ultimate goal of using rational design of new therapeutics to not only end this pandemic but also uncover a generalizable platform for use in future pandemics.
Affiliation(s)
- Yash Gupta
- Department of Medicine, Infectious Diseases, Mayo Clinic, Jacksonville, FL, USA
| | - Oleksandr V Savytskyi
- Department of Neuroscience, Mayo Clinic, Jacksonville, FL, USA; In Vivo Biosystems, Eugene, OR, USA
| | - Matt Coban
- Department of Neuroscience, Mayo Clinic, Jacksonville, FL, USA; Department of Cancer Biology, Mayo Clinic, Jacksonville, FL, USA
| | | | - Vasili Pleqi
- Department of Medicine, Infectious Diseases, Mayo Clinic, Jacksonville, FL, USA
| | - Caleb A Weber
- Department of Neuroscience, Mayo Clinic, Jacksonville, FL, USA
| | - Rohit Chitale
- Department of Medicine, Infectious Diseases, Mayo Clinic, Jacksonville, FL, USA; The Council on Strategic Risks, 1025 Connecticut Ave NW, Washington, DC, USA
| | - Ravi Durvasula
- Department of Medicine, Infectious Diseases, Mayo Clinic, Jacksonville, FL, USA
| | | | - Prakasha Kempaiah
- Department of Medicine, Infectious Diseases, Mayo Clinic, Jacksonville, FL, USA
| | - Thomas R Caulfield
- Department of Neuroscience, Mayo Clinic, Jacksonville, FL, USA; Department of QHS Computational Biology, Mayo Clinic, Jacksonville, FL, USA; Department of Biochemistry and Molecular Biology, Mayo Clinic, Rochester, MN, USA; Department of Clinical Genomics, Mayo Clinic, Rochester, MN, USA; Department of Neurosurgery, Mayo Clinic, Jacksonville, FL, USA.
| |
|
21
|
Petinrin OO, Saeed F, Toseef M, Liu Z, Basurra S, Muyide IO, Li X, Lin Q, Wong KC. Machine learning in metastatic cancer research: Potentials, possibilities, and prospects. Comput Struct Biotechnol J 2023; 21:2454-2470. [PMID: 37077177 PMCID: PMC10106342 DOI: 10.1016/j.csbj.2023.03.046] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 01/17/2023] [Revised: 03/26/2023] [Accepted: 03/27/2023] [Indexed: 03/31/2023] Open
Abstract
Cancer has received extensive recognition for its high mortality rate, with metastatic cancer being the top cause of cancer-related deaths. Metastatic cancer involves the spread of the primary tumor to other body organs. As much as the early detection of cancer is essential, the timely detection of metastasis, the identification of biomarkers, and treatment choice are valuable for improving the quality of life for metastatic cancer patients. This study reviews the existing studies on classical machine learning (ML) and deep learning (DL) in metastatic cancer research. Since the majority of metastatic cancer research data are collected in the formats of PET/CT and MRI image data, deep learning techniques are heavily involved. However, its black-box nature and expensive computational cost are notable concerns. Furthermore, existing models could be overestimated for their generality due to the non-diverse population in clinical trial datasets. Therefore, research gaps are itemized; follow-up studies should be carried out on metastatic cancer using machine learning and deep learning tools with data in a symmetric manner.
Affiliation(s)
| | - Faisal Saeed
- DAAI Research Group, Department of Computing and Data Science, School of Computing and Digital Technology, Birmingham City University, Birmingham B4 7XG, UK
| | - Muhammad Toseef
- Department of Computer Science, City University of Hong Kong, Kowloon Tong, Kowloon, Hong Kong SAR
| | - Zhe Liu
- Department of Computer Science, City University of Hong Kong, Kowloon Tong, Kowloon, Hong Kong SAR
| | - Shadi Basurra
- DAAI Research Group, Department of Computing and Data Science, School of Computing and Digital Technology, Birmingham City University, Birmingham B4 7XG, UK
| | | | - Xiangtao Li
- School of Artificial Intelligence, Jilin University, Jilin, China
| | - Qiuzhen Lin
- School of Computer Science and Software Engineering, Shenzhen University, Shenzhen, China
- Ka-Chun Wong
- Department of Computer Science, City University of Hong Kong, Kowloon Tong, Kowloon, Hong Kong SAR
- Hong Kong Institute for Data Science, City University of Hong Kong, Kowloon Tong, Kowloon, Hong Kong SAR
|
22
|
Wang X, Yang X, Wang Q, Meng D. Unnatural amino acids: promising implications for the development of new antimicrobial peptides. Crit Rev Microbiol 2023; 49:231-255. [PMID: 35254957 DOI: 10.1080/1040841x.2022.2047008] [Citation(s) in RCA: 19] [Impact Index Per Article: 9.5] [Indexed: 01/21/2023]
Abstract
The increasing incidence and rapid spread of bacterial resistance to conventional antibiotics are a serious global threat to public health, highlighting the need to develop new antimicrobial alternatives. Antimicrobial peptides (AMPs) represent a class of promising natural antibiotic candidates due to their broad-spectrum activity and low tendency to induce resistance. However, the development of AMPs for medical use is hampered by several obstacles, such as moderate activity, lability to proteolytic degradation, and low bioavailability. To date, many researchers have focussed on the optimization or design of novel artificial AMPs with desired properties. Unnatural amino acids (UAAs) are valuable building blocks in the manufacture of a variety of pharmaceuticals, and have been used to develop artificial AMPs with specific structural and physicochemical properties. Rational incorporation of UAAs has become a very promising approach to endow AMPs with strong and long-lasting activity but no toxicity. This review aims to summarize key approaches that have been used to incorporate UAAs to develop novel AMPs with improved properties and better performance. It is anticipated that this review will guide future design considerations for UAA-based antimicrobial applications.
Affiliation(s)
- Xiuhong Wang
- State Key Laboratory of Food Nutrition and Safety, College of Food Science and Engineering, Tianjin University of Science & Technology, Tianjin, People's Republic of China
- Xiaomin Yang
- State Key Laboratory of Food Nutrition and Safety, College of Food Science and Engineering, Tianjin University of Science & Technology, Tianjin, People's Republic of China
- Qiaoe Wang
- Key Laboratory of Cosmetic, China National Light Industry, Beijing Technology and Business University, Beijing, People's Republic of China
- Demei Meng
- State Key Laboratory of Food Nutrition and Safety, College of Food Science and Engineering, Tianjin University of Science & Technology, Tianjin, People's Republic of China; Tianjin Gasin-DH Preservation Technology Co., Ltd, Tianjin, People's Republic of China
|
23
|
Kumar M, Nguyen TPN, Kaur J, Singh TG, Soni D, Singh R, Kumar P. Opportunities and challenges in application of artificial intelligence in pharmacology. Pharmacol Rep 2023; 75:3-18. [PMID: 36624355 PMCID: PMC9838466 DOI: 10.1007/s43440-022-00445-1] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Received: 10/20/2022] [Revised: 12/23/2022] [Accepted: 12/25/2022] [Indexed: 01/11/2023]
Abstract
Artificial intelligence (AI) is a machine science that can mimic human behaviour such as the intelligent analysis of data. AI functions with specialized algorithms and integrates with deep and machine learning. Living in the digital world can generate a huge amount of medical data every day. Therefore, we need an automated and reliable evaluation tool that can make decisions more accurately and faster. Machine learning has the potential to learn, understand and analyse the data used in healthcare systems. In the last few years, AI has been employed in various fields of pharmaceutical science, especially in pharmacological research. It helps in the analysis of preclinical (laboratory animal) and clinical (human) trial data. AI also plays an important role in various processes such as drug discovery/manufacturing, diagnosis of big data for disease identification, personalized treatment, clinical trial research, radiotherapy, surgical robotics, smart electronic health records, and epidemic outbreak prediction. Moreover, AI has been used in the evaluation of biomarkers and diseases. In this review, we explain various models and general processes of machine learning and their role in pharmacological science. Therefore, AI with deep learning and machine learning could be relevant in pharmacological research.
Affiliation(s)
- Mandeep Kumar
- Department of Pharmacy, Unit of Pharmacology and Toxicology, University of Genoa, Genoa, Italy
- T P Nhung Nguyen
- Department of Pharmacy, Unit of Pharmacology and Toxicology, University of Genoa, Genoa, Italy
- Department of Pharmacy, Da Nang University of Medical Technology and Pharmacy, Da Nang, Vietnam
- Jasleen Kaur
- Department of Pharmacology and Toxicology, National Institute of Pharmaceutical Education and Research (NIPER), Lucknow, Uttar Pradesh, 226002, India
- Divya Soni
- Department of Pharmacology, Central University of Punjab, Ghudda, Bathinda, Punjab, 151401, India
- Randhir Singh
- Department of Pharmacology, Central University of Punjab, Ghudda, Bathinda, Punjab, 151401, India
- Puneet Kumar
- Department of Pharmacology, Central University of Punjab, Ghudda, Bathinda, Punjab, 151401, India.
|
24
|
Application of machine learning to predict tacrolimus exposure in liver and kidney transplant patients given the MeltDose formulation. Eur J Clin Pharmacol 2023; 79:311-319. [PMID: 36564549 DOI: 10.1007/s00228-022-03445-5] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Received: 10/04/2022] [Accepted: 12/15/2022] [Indexed: 12/25/2022]
Abstract
PURPOSE Machine Learning (ML) algorithms represent an interesting alternative to maximum a posteriori Bayesian estimators (MAP-BE) for tacrolimus AUC estimation, but it is not known if training an ML model using a lower number of full pharmacokinetic (PK) profiles (= "true" reference AUC) provides better performances than using a larger dataset of less accurate AUC estimates. The objectives of this study were: to develop and benchmark ML algorithms trained using full PK profiles to estimate MeltDose®-tacrolimus individual AUCs using 2 or 3 blood concentrations; and to compare their performance to MAP-BE. METHODS Data from liver (n = 113) and kidney (n = 97) transplant recipients involved in MeltDose-tacrolimus PK studies were used for the training and evaluation of ML algorithms. "True" AUC0-24 h was calculated for each patient using the trapezoidal rule on the full PK profile. ML algorithms were trained to estimate tacrolimus true AUC using 2 or 3 blood concentrations. Performances were evaluated in 2 external sets of 16 (renal) and 48 (liver) transplant patients. RESULTS Best estimation performances were obtained with the MARS algorithm and the following limited sampling strategies (LSS): predose (0), 8, and 12 h post-dose (rMPE = -1.28%, rRMSE = 7.57%), or 0 and 12 h (rMPE = -1.9%, rRMSE = 10.06%). In the external dataset, the performances of the final ML algorithms based on two samples in kidney (rMPE = -3.1%, rRMSE = 11.1%) or liver transplant recipients (rMPE = -3.4%, rRMSE = 9.86%) were as good as or better than those of MAP-BEs based on three time points. CONCLUSION The MARS ML models developed using "true" MeltDose®-tacrolimus AUCs yielded accurate individual estimations using only two blood concentrations.
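Editorial illustration (not taken from the cited study): the abstract above derives the reference AUC with the trapezoidal rule and reports rMPE and rRMSE. The sketch below shows one way these quantities could be computed, assuming the commonly used definitions in which rMPE is the mean of (predicted - reference) / reference and rRMSE is the root mean square of that same relative error, both expressed in percent. The function names and any sample values are hypothetical.

```python
import math


def auc_trapezoid(times_h, concentrations):
    """Linear trapezoidal AUC over paired time (h) and concentration values."""
    if len(times_h) != len(concentrations) or len(times_h) < 2:
        raise ValueError("need at least two paired time/concentration points")
    auc = 0.0
    for i in range(1, len(times_h)):
        dt = times_h[i] - times_h[i - 1]
        # Area of one trapezoid between consecutive sampling times.
        auc += dt * (concentrations[i] + concentrations[i - 1]) / 2.0
    return auc


def relative_errors(auc_predicted, auc_reference):
    """Return (rMPE, rRMSE) in percent under the assumed definitions."""
    rel = [(p - r) / r for p, r in zip(auc_predicted, auc_reference)]
    rmpe = 100.0 * sum(rel) / len(rel)                           # bias
    rrmse = 100.0 * math.sqrt(sum(e * e for e in rel) / len(rel))  # imprecision
    return rmpe, rrmse
```

Under these definitions rMPE measures bias (over- and under-predictions cancel) while rRMSE measures imprecision, which is why a method can show an rMPE close to zero alongside a noticeably larger rRMSE.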
|
25
|
Li Q, Ma Z, Qin S, Zhao WJ. Virtual Screening-Based Drug Development for the Treatment of Nervous System Diseases. Curr Neuropharmacol 2023; 21:2447-2464. [PMID: 36043797 PMCID: PMC10616913 DOI: 10.2174/1570159x20666220830105350] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 05/21/2022] [Revised: 08/04/2022] [Accepted: 08/19/2022] [Indexed: 11/22/2022] Open
Abstract
The incidence rate of nervous system diseases has increased in recent years. Nerve injury or neurodegenerative diseases usually cause neuronal loss and neuronal circuit damage, which seriously affect motor nerve and autonomic nervous function. Therefore, safe and effective treatment is needed. As traditional drug research becomes slower and more expensive, it is vital to enlist the help of cutting-edge technology. Virtual screening (VS) is an attractive option for the identification and development of promising new compounds with high efficiency and low cost. With the assistance of computer-aided drug design (CADD), VS is becoming more and more popular in new drug development and research. In recent years, it has become a reality to transform non-neuronal cells into functional neurons through small molecular compounds, which provides a broader application prospect than transcription factor-mediated neuronal reprogramming. This review mainly summarizes related theory and technology of VS and the drug research and development using VS technology in nervous system diseases in recent years, and focuses more on the potential application of VS technology in neuronal reprogramming, thus facilitating new drug design for both prevention and treatment of nervous system diseases.
Affiliation(s)
- Qian Li
- Wuxi School of Medicine, Jiangnan University, Wuxi 214122, Jiangsu, P.R. China
- Zhaobin Ma
- College of Life Science and Technology, Kunming University of Science and Technology, Kunming 650504, Yunnan, P.R. China
- Shuhua Qin
- College of Life Science and Technology, Kunming University of Science and Technology, Kunming 650504, Yunnan, P.R. China
- Wei-Jiang Zhao
- Wuxi School of Medicine, Jiangnan University, Wuxi 214122, Jiangsu, P.R. China
- Department of Cell Biology, Wuxi School of Medicine, Jiangnan University, Wuxi 214122, Jiangsu, P.R. China
|
26
|
Durairaj J, de Ridder D, van Dijk AD. Beyond sequence: Structure-based machine learning. Comput Struct Biotechnol J 2022; 21:630-643. [PMID: 36659927 PMCID: PMC9826903 DOI: 10.1016/j.csbj.2022.12.039] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/26/2022] [Revised: 12/21/2022] [Accepted: 12/21/2022] [Indexed: 12/31/2022] Open
Abstract
Recent breakthroughs in protein structure prediction demarcate the start of a new era in structural bioinformatics. Combined with various advances in experimental structure determination and the uninterrupted pace at which new structures are published, this promises an age in which protein structure information is as prevalent and ubiquitous as sequence. Machine learning in protein bioinformatics has been dominated by sequence-based methods, but this is now changing to make use of the deluge of rich structural information as input. Machine learning methods making use of structures are scattered across literature and cover a number of different applications and scopes; while some try to address questions and tasks within a single protein family, others aim to capture characteristics across all available proteins. In this review, we look at the variety of structure-based machine learning approaches, how structures can be used as input, and typical applications of these approaches in protein biology. We also discuss current challenges and opportunities in this all-important and increasingly popular field.
Affiliation(s)
- Janani Durairaj
- Biozentrum, University of Basel, Basel, Switzerland
- Bioinformatics Group, Department of Plant Sciences, Wageningen University and Research, Wageningen, the Netherlands
- Dick de Ridder
- Bioinformatics Group, Department of Plant Sciences, Wageningen University and Research, Wageningen, the Netherlands
- Aalt D.J. van Dijk
- Bioinformatics Group, Department of Plant Sciences, Wageningen University and Research, Wageningen, the Netherlands
|
27
|
Guo Y, Rui SS, Xu W, Sun C. Machine Learning Method for Fatigue Strength Prediction of Nickel-Based Superalloy with Various Influencing Factors. Materials (Basel, Switzerland) 2022; 16:46. [PMID: 36614382 PMCID: PMC9820995 DOI: 10.3390/ma16010046] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/21/2022] [Revised: 12/13/2022] [Accepted: 12/15/2022] [Indexed: 06/17/2023]
Abstract
The accurate prediction of fatigue performance is of great engineering significance for the safe and reliable service of components. However, due to the complexity of influencing factors on fatigue behavior and the incomplete understanding of the fatigue failure mechanism, it is difficult to correlate well the influence of various factors on fatigue performance. Machine learning could be used to deal with the association or influence of complex factors due to its good nonlinear approximation and multi-variable learning ability. In this paper, the gradient boosting regression tree model, the long short-term memory model and the polynomial regression model with ridge regularization in machine learning are used to predict the fatigue strength of a nickel-based superalloy GH4169 under different temperatures, stress ratios and fatigue life in the literature. By dividing different training and testing sets, the influence of the composition of data in the training set on the predictive ability of the machine learning method is investigated. The results indicate that the machine learning method shows great potential in the fatigue strength prediction through learning and training limited data, which could provide a new means for the prediction of fatigue performance incorporating complex influencing factors. However, the predicted results are closely related to the data in the training set. More abundant data in the training set is necessary to achieve a better predictive capability of the machine learning model. For example, it is hard to give good predictions for the anomalous data if the anomalous data are absent in the training set.
Affiliation(s)
- Yiyun Guo
- State Key Laboratory of Nonlinear Mechanics, Institute of Mechanics, Chinese Academy of Sciences, Beijing 100190, China
- School of Engineering Science, University of Chinese Academy of Sciences, Beijing 100049, China
- Shao-Shi Rui
- State Key Laboratory of Nonlinear Mechanics, Institute of Mechanics, Chinese Academy of Sciences, Beijing 100190, China
- Wei Xu
- Beijing Key Laboratory of Aeronautical Materials Testing and Evaluation, Beijing Institute of Aeronautical Materials, Beijing 100095, China
- Chengqi Sun
- State Key Laboratory of Nonlinear Mechanics, Institute of Mechanics, Chinese Academy of Sciences, Beijing 100190, China
- School of Engineering Science, University of Chinese Academy of Sciences, Beijing 100049, China
|
28
|
Zhang Y, Luo M, Wu P, Wu S, Lee TY, Bai C. Application of Computational Biology and Artificial Intelligence in Drug Design. Int J Mol Sci 2022; 23:13568. [PMID: 36362355 PMCID: PMC9658956 DOI: 10.3390/ijms232113568] [Citation(s) in RCA: 22] [Impact Index Per Article: 7.3] [Received: 10/08/2022] [Revised: 10/29/2022] [Accepted: 11/03/2022] [Indexed: 08/24/2023] Open
Abstract
Traditional drug design requires a great amount of research time and developmental expense. Booming computational approaches, including computational biology, computer-aided drug design, and artificial intelligence, have the potential to expedite drug discovery by minimizing the time and financial cost. In recent years, computational approaches have been widely used to improve the efficacy and effectiveness of the drug discovery pipeline, leading to the approval of plenty of new drugs for marketing. The present review emphasizes the applications of these indispensable computational approaches in aiding target identification, lead discovery, and lead optimization. Some challenges of using these approaches for drug design are also discussed. Moreover, we propose a methodology for integrating various computational techniques into new drug discovery and design.
Affiliation(s)
- Yue Zhang
- School of Life and Health Sciences, School of Medicine, The Chinese University of Hong Kong, Shenzhen 518172, China
- School of Chemistry and Materials Science, University of Science and Technology of China, Hefei 230026, China
- Warshel Institute for Computational Biology, Shenzhen 518172, China
- Mengqi Luo
- School of Life and Health Sciences, School of Medicine, The Chinese University of Hong Kong, Shenzhen 518172, China
- South China Hospital, Health Science Center, Shenzhen University, Shenzhen 518116, China
- Peng Wu
- School of Biomedical Engineering, Health Science Center, Shenzhen University, Shenzhen 518055, China
- Song Wu
- South China Hospital, Health Science Center, Shenzhen University, Shenzhen 518116, China
- Tzong-Yi Lee
- School of Life and Health Sciences, School of Medicine, The Chinese University of Hong Kong, Shenzhen 518172, China
- Warshel Institute for Computational Biology, Shenzhen 518172, China
- Chen Bai
- School of Life and Health Sciences, School of Medicine, The Chinese University of Hong Kong, Shenzhen 518172, China
- Warshel Institute for Computational Biology, Shenzhen 518172, China
|
29
|
Avery C, Patterson J, Grear T, Frater T, Jacobs DJ. Protein Function Analysis through Machine Learning. Biomolecules 2022; 12:1246. [PMID: 36139085 PMCID: PMC9496392 DOI: 10.3390/biom12091246] [Citation(s) in RCA: 9] [Impact Index Per Article: 3.0] [Received: 07/16/2022] [Revised: 08/22/2022] [Accepted: 08/31/2022] [Indexed: 11/16/2022] Open
Abstract
Machine learning (ML) has been an important arsenal in computational biology used to elucidate protein function for decades. With the recent burgeoning of novel ML methods and applications, new ML approaches have been incorporated into many areas of computational biology dealing with protein function. We examine how ML has been integrated into a wide range of computational models to improve prediction accuracy and gain a better understanding of protein function. The applications discussed are protein structure prediction, protein engineering using sequence modifications to achieve stability and druggability characteristics, molecular docking in terms of protein-ligand binding, including allosteric effects, protein-protein interactions and protein-centric drug discovery. To quantify the mechanisms underlying protein function, a holistic approach that takes structure, flexibility, stability, and dynamics into account is required, as these aspects become inseparable through their interdependence. Another key component of protein function is conformational dynamics, which often manifest as protein kinetics. Computational methods that use ML to generate representative conformational ensembles and quantify differences in conformational ensembles important for function are included in this review. Future opportunities are highlighted for each of these topics.
Affiliation(s)
- Chris Avery
- Department of Bioinformatics and Genomics, University of North Carolina at Charlotte, Charlotte, NC 28223, USA
- John Patterson
- Department of Bioinformatics and Genomics, University of North Carolina at Charlotte, Charlotte, NC 28223, USA
- Tyler Grear
- Department of Bioinformatics and Genomics, University of North Carolina at Charlotte, Charlotte, NC 28223, USA
- Department of Physics and Optical Science, University of North Carolina at Charlotte, Charlotte, NC 28223, USA
- Theodore Frater
- Department of Bioinformatics and Genomics, University of North Carolina at Charlotte, Charlotte, NC 28223, USA
- Donald J. Jacobs
- Department of Physics and Optical Science, University of North Carolina at Charlotte, Charlotte, NC 28223, USA
|
30
|
Veríssimo GC, Serafim MSM, Kronenberger T, Ferreira RS, Honorio KM, Maltarollo VG. Designing drugs when there is low data availability: one-shot learning and other approaches to face the issues of a long-term concern. Expert Opin Drug Discov 2022; 17:929-947. [PMID: 35983695 DOI: 10.1080/17460441.2022.2114451] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Indexed: 11/04/2022]
Abstract
INTRODUCTION Modern drug discovery generally is accessed by useful information from previous large databases or uncovering novel data. The lack of biological and/or chemical data tends to slow the development of scientific research and innovation. Here, approaches that may help provide solutions to generate or obtain enough relevant data or improve/accelerate existing methods within the last five years were reviewed. AREAS COVERED One-shot learning (OSL) approaches, structural modeling, molecular docking, scoring function space (SFS), molecular dynamics (MD), and quantum mechanics (QM) may be used to amplify the amount of available data to drug design and discovery campaigns, presenting methods, their perspectives, and discussions to be employed in the near future. EXPERT OPINION Recent works have successfully used these techniques to solve a range of issues in the face of data scarcity, including complex problems such as the challenging scenario of drug design aimed at intrinsically disordered proteins and the evaluation of potential adverse effects in a clinical scenario. These examples show that it is possible to improve and kickstart research from scarce available data to design and discover new potential drugs.
Affiliation(s)
- Gabriel C Veríssimo
- Departamento de Produtos Farmacêuticos, Faculdade de Farmácia, Universidade Federal de Minas Gerais (UFMG), Belo Horizonte, Brazil
- Mateus Sá M Serafim
- Departamento de Microbiologia, Instituto de Ciências Biológicas, Universidade Federal de Minas Gerais (UFMG), Belo Horizonte, Brazil
- Thales Kronenberger
- Department of Medical Oncology and Pneumology, Internal Medicine VIII, University Hospital of Tübingen, Tübingen, Germany; School of Pharmacy, Faculty of Health Sciences, University of Eastern Finland, Kuopio, Finland
- Rafaela S Ferreira
- Departamento de Bioquímica e Imunologia, Instituto de Ciências Biológicas, Universidade Federal de Minas Gerais (UFMG), Belo Horizonte, Brazil
- Kathia M Honorio
- Escola de Artes, Ciências e Humanidades, Universidade de São Paulo (USP), São Paulo, Brazil; Centro de Ciências Naturais e Humanas, Universidade Federal do ABC (UFABC), Santo André, Brazil
- Vinícius G Maltarollo
- Departamento de Produtos Farmacêuticos, Faculdade de Farmácia, Universidade Federal de Minas Gerais (UFMG), Belo Horizonte, Brazil
|
31
|
McGibbon M, Money-Kyrle S, Blay V, Houston DR. SCORCH: Improving structure-based virtual screening with machine learning classifiers, data augmentation, and uncertainty estimation. J Adv Res 2022; 46:135-147. [PMID: 35901959 PMCID: PMC10105235 DOI: 10.1016/j.jare.2022.07.001] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Received: 03/05/2022] [Revised: 07/08/2022] [Accepted: 07/09/2022] [Indexed: 11/17/2022] Open
Abstract
INTRODUCTION The discovery of a new drug is a costly and lengthy endeavour. The computational prediction of which small molecules can bind to a protein target can accelerate this process if the predictions are fast and accurate enough. Recent machine-learning scoring functions re-evaluate the output of molecular docking to achieve more accurate predictions. However, previous scoring functions were trained on crystallised protein-ligand complexes and datasets of decoys. The limited availability of crystal structures and biases in the decoy datasets can lower the performance of scoring functions. OBJECTIVES To address key limitations of previous scoring functions and thus improve the predictive performance of structure-based virtual screening. METHODS A novel machine-learning scoring function was created, named SCORCH (Scoring COnsensus for RMSD-based Classification of Hits). To develop SCORCH, training data is augmented by considering multiple ligand poses and labelling poses based on their RMSD from the native pose. Decoy bias is addressed by generating property-matched decoys for each ligand and using the same methodology for preparing and docking decoys and ligands. A consensus of 3 different machine learning approaches is also used to improve performance. RESULTS We find that multi-pose augmentation in SCORCH improves its docking power and screening power on independent benchmark datasets. SCORCH outperforms an equivalent scoring function trained on single poses, with a 1% enrichment factor (EF) of 13.78 vs. 10.86 on 18 DEKOIS 2.0 targets and a mean native pose rank of 5.9 vs. 30.4 on CSAR 2014. Additionally, SCORCH outperforms widely used scoring functions in virtual screening and pose prediction on independent benchmark datasets. CONCLUSION By rationally addressing key limitations of previous scoring functions, SCORCH improves the performance of virtual screening. SCORCH also provides an estimate of its uncertainty, which can help reduce the cost and time required for drug discovery.
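Editorial illustration (not taken from the cited study): the abstract above reports a 1% enrichment factor (EF). A minimal sketch of the usual textbook definition follows, in which EF is the hit rate among the top-scored fraction of a screened library divided by the hit rate in the whole library. It assumes that higher scores mean better-ranked compounds and that the top fraction is rounded up to at least one compound; the cited paper's exact procedure is not given in the abstract, and the function name is hypothetical.

```python
import math


def enrichment_factor(scores, is_active, fraction=0.01):
    """EF at `fraction`: hit rate in the top-ranked subset over the overall hit rate."""
    n_total = len(scores)
    if n_total == 0 or n_total != len(is_active):
        raise ValueError("scores and is_active must be non-empty and equal in length")
    total_actives = sum(1 for a in is_active if a)
    if total_actives == 0:
        raise ValueError("EF is undefined when the library contains no actives")
    # Size of the top-ranked subset (assumption: round up, keep at least one).
    n_top = max(1, math.ceil(n_total * fraction))
    # Rank compounds from best to worst score (assumption: higher is better).
    ranked = sorted(zip(scores, is_active), key=lambda pair: pair[0], reverse=True)
    hits_top = sum(1 for _, active in ranked[:n_top] if active)
    return (hits_top / n_top) / (total_actives / n_total)
```

An EF of 1 corresponds to random ranking, and the maximum attainable value depends on the active-to-decoy ratio of the benchmark, so EF values are best compared on the same dataset.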
Affiliation(s)
- Miles McGibbon
- Institute of Quantitative Biology, Biochemistry and Biotechnology, University of Edinburgh, Edinburgh, Scotland EH9 3BF, UK
- Sam Money-Kyrle
- Institute of Quantitative Biology, Biochemistry and Biotechnology, University of Edinburgh, Edinburgh, Scotland EH9 3BF, UK
- Vincent Blay
- Department of Microbiology and Environmental Toxicology, University of California at Santa Cruz, Santa Cruz, CA 95064, USA; Institute for Integrative Systems Biology (I(2)SysBio), Universitat de València and Spanish Research Council (CSIC), 46980 Valencia, Spain.
- Douglas R Houston
- Institute of Quantitative Biology, Biochemistry and Biotechnology, University of Edinburgh, Edinburgh, Scotland EH9 3BF, UK.
|
32
|
Azevedo L, Serafim MSM, Maltarollo VG, Grabrucker AM, Granato D. Atherosclerosis fate in the era of tailored functional foods: Evidence-based guidelines elicited from structure- and ligand-based approaches. Trends Food Sci Technol 2022. [DOI: 10.1016/j.tifs.2022.07.010] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 02/07/2023]
|
33
|
Orosz Á, Héberger K, Rácz A. Comparison of Descriptor- and Fingerprint Sets in Machine Learning Models for ADME-Tox Targets. Front Chem 2022; 10:852893. [PMID: 35755260 PMCID: PMC9214226 DOI: 10.3389/fchem.2022.852893] [Citation(s) in RCA: 10] [Impact Index Per Article: 3.3] [Received: 01/11/2022] [Accepted: 04/14/2022] [Indexed: 01/12/2023] Open
Abstract
The screening of compounds for ADME-Tox targets plays an important role in drug design. QSPR models can increase the speed of these specific tasks, although the performance of the models highly depends on several factors, such as the applied molecular descriptors. In this study, a detailed comparison of the most popular descriptor groups has been carried out for six main ADME-Tox classification targets: Ames mutagenicity, P-glycoprotein inhibition, hERG inhibition, hepatotoxicity, blood–brain-barrier permeability, and cytochrome P450 2C9 inhibition. The literature-based, medium-sized binary classification datasets (all above 1,000 molecules) were used for the model building by two common algorithms, XGBoost and the RPropMLP neural network. Five molecular representation sets were compared along with their joint applications: Morgan, Atompairs, and MACCS fingerprints, and the traditional 1D and 2D molecular descriptors, as well as 3D molecular descriptors, separately. The statistical evaluation of the model performances was based on 18 different performance parameters. Although all the developed models were close to the usual performance of QSPR models for each specific ADME-Tox target, the results clearly showed the superiority of the traditional 1D, 2D, and 3D descriptors in the case of the XGBoost algorithm. It is worth trying the classical tools in single model building because the use of 2D descriptors can produce even better models for almost every dataset than the combination of all the examined descriptor sets.
Affiliation(s)
- Álmos Orosz
- Plasma Chemistry Research Group, Research Centre for Natural Sciences, Budapest, Hungary
- Károly Héberger
- Plasma Chemistry Research Group, Research Centre for Natural Sciences, Budapest, Hungary
- Anita Rácz
- Plasma Chemistry Research Group, Research Centre for Natural Sciences, Budapest, Hungary
|
34
|
Tarín-Pelló A, Suay-García B, Pérez-Gracia MT. Antibiotic resistant bacteria: current situation and treatment options to accelerate the development of a new antimicrobial arsenal. Expert Rev Anti Infect Ther 2022; 20:1095-1108. [PMID: 35576494 DOI: 10.1080/14787210.2022.2078308] [Citation(s) in RCA: 36] [Impact Index Per Article: 12.0] [Indexed: 12/11/2022]
Abstract
INTRODUCTION Antibiotic resistance is one of the biggest public health threats worldwide. Currently, antibiotic-resistant bacteria kill 700,000 people every year. These data represent the near future in which we find ourselves, a "post-antibiotic era" where the identification and development of new treatments are key. This review is focused on the current and emerging antimicrobial therapies which can solve this global threat. AREAS COVERED Through a literature search using databases such as Medline and Web of Science, and search engines such as Google Scholar, different antimicrobial therapies were analyzed, including pathogen-oriented therapy, phagotherapy, microbiota and antivirulent therapy. Additionally, the development pathways of new antibiotics were described, emphasizing the potential advantages that the combination of a drug repurposing strategy with the application of mathematical prediction models could bring to solve the problem of AMRs. EXPERT OPINION This review offers several starting points to solve a single problem: reducing the number of AMR. The data suggest that the strategies described could provide many benefits to improve antimicrobial treatments. However, the development of new antimicrobials remains necessary. Drug repurposing, with the application of mathematical prediction models, is considered to be of interest due to its rapid and effective potential to increase the current therapeutic arsenal.
Affiliation(s)
- Antonio Tarín-Pelló
- Área de Microbiología, Departamento de Farmacia, Instituto de Ciencias Biomédicas, Facultad de Ciencias de la Salud
- Beatriz Suay-García
- ESI International Chair@CEU-UCH, Departamento de Matemáticas, Física y Ciencias Tecnológicas, Universidad Cardenal Herrera-CEU, CEU Universities, C/ Santiago Ramón y Cajal, 46115 Alfara del Patriarca, Valencia, Spain
- María-Teresa Pérez-Gracia
- Área de Microbiología, Departamento de Farmacia, Instituto de Ciencias Biomédicas, Facultad de Ciencias de la Salud
|
35
|
Periwal V, Bassler S, Andrejev S, Gabrielli N, Patil KR, Typas A, Patil KR. Bioactivity assessment of natural compounds using machine learning models trained on target similarity between drugs. PLoS Comput Biol 2022; 18:e1010029. [PMID: 35468126 PMCID: PMC9071136 DOI: 10.1371/journal.pcbi.1010029] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/06/2021] [Revised: 05/05/2022] [Accepted: 03/17/2022] [Indexed: 11/19/2022] Open
Abstract
Natural compounds constitute a rich resource of potential small molecule therapeutics. While experimental access to this resource is limited due to its vast diversity and difficulties in systematic purification, computational assessment of structural similarity with known therapeutic molecules offers a scalable approach. Here, we assessed functional similarity between natural compounds and approved drugs by combining multiple chemical similarity metrics and physicochemical properties using a machine-learning approach. We computed pairwise similarities between 1410 drugs for training classification models and used the drugs shared protein targets as class labels. The best performing models were random forest which gave an average area under the ROC of 0.9, Matthews correlation coefficient of 0.35, and F1 score of 0.33, suggesting that it captured the structure-activity relation well. The models were then used to predict protein targets of circa 11k natural compounds by comparing them with the drugs. This revealed therapeutic potential of several natural compounds, including those with support from previously published sources as well as those hitherto unexplored. We experimentally validated one of the predicted pair’s activities, viz., Cox-1 inhibition by 5-methoxysalicylic acid, a molecule commonly found in tea, herbs and spices. In contrast, another natural compound, 4-isopropylbenzoic acid, with the highest similarity score when considering most weighted similarity metric but not picked by our models, did not inhibit Cox-1. Our results demonstrate the utility of a machine-learning approach combining multiple chemical features for uncovering protein binding potential of natural compounds.
Affiliation(s)
- Vinita Periwal
- European Molecular Biology Laboratory, Heidelberg, Germany
- Medical Research Council Toxicology Unit, University of Cambridge, Cambridge, United Kingdom
- Stefan Bassler
- European Molecular Biology Laboratory, Heidelberg, Germany
- Faculty of Biosciences, Heidelberg University, Heidelberg, Germany
- Kaustubh Raosaheb Patil
- Institute of Neuroscience and Medicine (INM-7), Jülich, Germany
- Institute of Systems Neuroscience, Medical Faculty, Heinrich Heine University, Düsseldorf, Germany
- Kiran Raosaheb Patil
- European Molecular Biology Laboratory, Heidelberg, Germany
- Medical Research Council Toxicology Unit, University of Cambridge, Cambridge, United Kingdom
|
36
|
Parakkal S, Datta R, Das D. DeepBBBP: High accuracy Blood-Brain-Barrier Permeability Prediction with a Mixed Deep Learning Model. Mol Inform 2022; 41:e2100315. [PMID: 35393777 DOI: 10.1002/minf.202100315] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/24/2021] [Accepted: 04/07/2022] [Indexed: 11/05/2022]
Abstract
Blood-brain-barrier permeability (BBBP) is an important property that is used to establish the drug-likeness of a molecule, as it establishes whether the molecule can cross the BBB when desired. It also eliminates those molecules which are not supposed to cross the barrier, as doing so would lead to toxicity. BBBP can be measured in vivo, in vitro or in silico. With the advent and subsequent rise of in silico methods for virtual drug screening, quite a bit of work has been done to predict this feature using statistical machine learning (ML) and deep learning (DL) based methods. In this work a mixed DL-based model, consisting of a Multi-layer Perceptron (MLP) and Convolutional Neural Network layers, has been paired with Mol2vec. Mol2vec is a convenient and unsupervised machine learning technique which produces high-dimensional vector representations of molecules and its molecular substructures. These succinct vector representations are utilized as inputs to the mixed DL model that is used for BBBP predictions. Several well-known benchmarks incorporating BBBP data have been used for supervised training and prediction by our mixed DL model which demonstrates superior results when compared to existing ML and DL techniques used for predicting BBBP.
|
37
|
Martinelli DD. Generative machine learning for de novo drug discovery: A systematic review. Comput Biol Med 2022; 145:105403. [PMID: 35339849 DOI: 10.1016/j.compbiomed.2022.105403] [Citation(s) in RCA: 33] [Impact Index Per Article: 11.0] [Received: 01/14/2022] [Revised: 03/10/2022] [Accepted: 03/11/2022] [Indexed: 02/08/2023]
Abstract
Recent research on artificial intelligence indicates that machine learning algorithms can auto-generate novel drug-like molecules. Generative models have revolutionized de novo drug discovery, rendering the explorative process more efficient. Several model frameworks and input formats have been proposed to enhance the performance of intelligent algorithms in generative molecular design. In this systematic literature review of experimental articles and reviews over the last five years, machine learning models, challenges associated with computational molecule design along with proposed solutions, and molecular encoding methods are discussed. A query-based search of the PubMed, ScienceDirect, Springer, Wiley Online Library, arXiv, MDPI, bioRxiv, and IEEE Xplore databases yielded 87 studies. Twelve additional studies were identified via citation searching. Of the articles in which machine learning was implemented, six prominent algorithms were identified: long short-term memory recurrent neural networks (LSTM-RNNs), variational autoencoders (VAEs), generative adversarial networks (GANs), adversarial autoencoders (AAEs), evolutionary algorithms, and gated recurrent unit (GRU-RNNs). Furthermore, eight central challenges were designated: homogeneity of generated molecular libraries, deficient synthesizability, limited assay data, model interpretability, incapacity for multi-property optimization, incomparability, restricted molecule size, and uncertainty in model evaluation. Molecules were encoded either as strings, which were occasionally augmented using randomization, as 2D graphs, or as 3D graphs. Statistical analysis and visualization are performed to illustrate how approaches to machine learning in de novo drug design have evolved over the past five years. Finally, future opportunities and reservations are discussed.
|
38
|
Mahendran N, Vincent PMDR, Srinivasan K, Chang CY. Improving the Classification of Alzheimer's Disease Using Hybrid Gene Selection Pipeline and Deep Learning. Front Genet 2021; 12:784814. [PMID: 34868275 PMCID: PMC8632950 DOI: 10.3389/fgene.2021.784814] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.0] [Received: 09/28/2021] [Accepted: 10/20/2021] [Indexed: 11/13/2022] Open
Abstract
Alzheimer’s is a progressive, irreversible, neurodegenerative brain disease. Even with prominent symptoms, it takes years to notice, decode, and reveal Alzheimer’s. However, advancements in technologies, such as imaging techniques, help in early diagnosis. Still, sometimes the results are inaccurate, which delays the treatment. Thus, the research in recent times focused on identifying the molecular biomarkers that differentiate the genotype and phenotype characteristics. However, the gene expression dataset’s generated features are huge, 1,000 or even more than 10,000. To overcome such a curse of dimensionality, feature selection techniques are introduced. We designed a gene selection pipeline combining a filter, wrapper, and unsupervised method to select the relevant genes. We combined the minimum Redundancy and maximum Relevance (mRmR), Wrapper-based Particle Swarm Optimization (WPSO), and Auto encoder to select the relevant features. We used the GSE5281 Alzheimer’s dataset from the Gene Expression Omnibus. We implemented an Improved Deep Belief Network (IDBN) with simple stopping criteria after choosing the relevant genes. We used a Bayesian Optimization technique to tune the hyperparameters in the Improved Deep Belief Network. The tabulated results show that the proposed pipeline shows promising results.
Affiliation(s)
- Nivedhitha Mahendran
- School of Information Technology and Engineering, Vellore Institute of Technology, Vellore, India
- P M Durai Raj Vincent
- School of Information Technology and Engineering, Vellore Institute of Technology, Vellore, India
- Kathiravan Srinivasan
- School of Computer Science and Engineering, Vellore Institute of Technology, Vellore, India
- Chuan-Yu Chang
- Department of Computer Science and Information Engineering, National Yunlin University of Science and Technology, Yunlin, Taiwan
|
39
|
Unsupervised Representation Learning for Proteochemometric Modeling. Int J Mol Sci 2021; 22:ijms222312882. [PMID: 34884688 PMCID: PMC8657702 DOI: 10.3390/ijms222312882] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 10/15/2021] [Revised: 11/25/2021] [Accepted: 11/26/2021] [Indexed: 11/18/2022] Open
Abstract
In silico protein–ligand binding prediction is an ongoing area of research in computational chemistry and machine learning based drug discovery, as an accurate predictive model could greatly reduce the time and resources necessary for the detection and prioritization of possible drug candidates. Proteochemometric modeling (PCM) attempts to create an accurate model of the protein–ligand interaction space by combining explicit protein and ligand descriptors. This requires the creation of information-rich, uniform and computer interpretable representations of proteins and ligands. Previous studies in PCM modeling rely on pre-defined, handcrafted feature extraction methods, and many methods use protein descriptors that require alignment or are otherwise specific to a particular group of related proteins. However, recent advances in representation learning have shown that unsupervised machine learning can be used to generate embeddings that outperform complex, human-engineered representations. Several different embedding methods for proteins and molecules have been developed based on various language-modeling methods. Here, we demonstrate the utility of these unsupervised representations and compare three protein embeddings and two compound embeddings in a fair manner. We evaluate performance on various splits of a benchmark dataset, as well as on an internal dataset of protein–ligand binding activities and find that unsupervised-learned representations significantly outperform handcrafted representations.
|
40
|
Selvaraj C, Chandra I, Singh SK. Artificial intelligence and machine learning approaches for drug design: challenges and opportunities for the pharmaceutical industries. Mol Divers 2021; 26:1893-1913. [PMID: 34686947 PMCID: PMC8536481 DOI: 10.1007/s11030-021-10326-z] [Citation(s) in RCA: 37] [Impact Index Per Article: 9.3] [Received: 04/05/2021] [Accepted: 09/24/2021] [Indexed: 12/27/2022]
Abstract
The global spread of COVID-19 has raised the importance of pharmaceutical drug development as intractable and hot research. Developing new drug molecules to overcome any disease is a costly and lengthy process, but the process continues uninterrupted. The critical point to consider the drug design is to use the available data resources and to find new and novel leads. Once the drug target is identified, several interdisciplinary areas work together with artificial intelligence (AI) and machine learning (ML) methods to get enriched drugs. These AI and ML methods are applied in every step of the computer-aided drug design, and integrating these AI and ML methods results in a high success rate of hit compounds. In addition, this AI and ML integration with high-dimension data and its powerful capacity have taken a step forward. Clinical trials output prediction through the AI/ML integrated models could further decrease the clinical trials cost by also improving the success rate. Through this review, we discuss the backend of AI and ML methods in supporting the computer-aided drug design, along with its challenge and opportunity for the pharmaceutical industry. From the available information or data, the AI and ML based prediction for the high throughput virtual screening. After this integration of AI and ML, the success rate of hit identification has gained a momentum with huge success by providing novel drugs.
Affiliation(s)
- Chandrabose Selvaraj
- CADD and Molecular Modelling Lab, Department of Bioinformatics, Alagappa University, Science Block, Karaikudi, Tamil Nadu, 630004, India
- Ishwar Chandra
- CADD and Molecular Modelling Lab, Department of Bioinformatics, Alagappa University, Science Block, Karaikudi, Tamil Nadu, 630004, India
- Sanjeev Kumar Singh
- CADD and Molecular Modelling Lab, Department of Bioinformatics, Alagappa University, Science Block, Karaikudi, Tamil Nadu, 630004, India
|
41
|
Dhanabalan AK, Subaraja M, Palanichamy K, Velmurugan D, Gunasekaran K. Identification of a Chlorogenic Ester as a Monoamine Oxidase (MAO-B) Inhibitor by Integrating "Traditional and Machine Learning" Virtual Screening and In Vitro as well as In Vivo Validation: A Lead against Neurodegenerative Disorders? ACS Chem Neurosci 2021; 12:3690-3707. [PMID: 34553601 DOI: 10.1021/acschemneuro.1c00430] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Indexed: 12/20/2022] Open
Abstract
Parkinson's disease (PD) is the furthermost motor disorder of adult-onset dementia connected to memory and other cognitive abilities. Monoamine oxidases (MAOs) have gained significant attention in recent years owing to their possible therapeutic use against PD. Expression of MAO-B has been found to be elevated in PD patients for increased uptake of dopamine, producing hydrogen peroxide and finally causing neuronal injury. In this work, two new compounds have been identified as leads against MAO-B, and one of those compounds has been validated in vitro and in vivo. From the Protein Data Bank, MAO-B protein structures complexed with selegiline, 6-hydroxy-N-propargyl-1(R)-aminoindan, or a chromen derivative have been selected as templates for shape-based virtual screening (SB-VS) against the Traditional Chinese Medicinal (TCM) natural database. In parallel, using machine learning, a molecular-descriptor-based support vector model (SVM) was prepared and screened. For this purpose, naïve Bayesian, logistic regression, and random forest strategies were employed with the best specific molecular descriptor, which yielded a model with an overall accuracy (Q) of 0.81. Two common hit compounds lead-1 and lead-2 resulting from both shape and SVM screenings were analyzed through molecular docking and molecular dynamics (MD) simulation (200 ns). Also, from trajectory analysis such as molecular mechanics generalized Born surface area (MMGB/SA) and the residual interaction network (RIN) analyzer, both leads were found to bind at the active site with a favorable correlated motion, including domain movements. Lead-2, which is a chlorogenic ester, was synthesized and found to have no cytotoxic effect up to 50 μg/mL on Neuro-2A cells. The significant reactive oxygen species (ROS) scavenging activity by lead-2 could be correlated to its neuroprotective efficacy. Its capacity to inhibit human MAO-B through a competitive mode could be observed. 
An experimental zebra fish model confirms the neuroprotection by lead-2 by assessing the locomotor activities under malathion influence and treatment of lead-2. Also, histopathology analysis revealed that lead-2 could slow down degeneration in the brain. The present study emphasizes that integrating machine learning in parallel with traditional virtual screening may be useful to identify effective lead compounds for a given target.
Affiliation(s)
- Anantha Krishnan Dhanabalan
- Centre of Advanced Study in Crystallography and Biophysics, University of Madras, Guindy Campus, Chennai 600025, Tamil Nadu, India
- Mamangam Subaraja
- Vivekanandha College of Arts and Sciences for Women (Autonomous), Tiruchengode 637205, Tamil Nadu, India
- Kuppusamy Palanichamy
- Centre of Advanced Study in Crystallography and Biophysics, University of Madras, Guindy Campus, Chennai 600025, Tamil Nadu, India
- Devadasan Velmurugan
- Centre of Advanced Study in Crystallography and Biophysics, University of Madras, Guindy Campus, Chennai 600025, Tamil Nadu, India
- Krishnasamy Gunasekaran
- Centre of Advanced Study in Crystallography and Biophysics, University of Madras, Guindy Campus, Chennai 600025, Tamil Nadu, India
- Bioinformatics Infrastructure Facility, University of Madras, Guindy Campus, Chennai 600025, Tamil Nadu, India
|
42
|
Lee KH, Fant AD, Guo J, Guan A, Jung J, Kudaibergenova M, Miranda WE, Ku T, Cao J, Wacker S, Duff HJ, Newman AH, Noskov SY, Shi L. Toward Reducing hERG Affinities for DAT Inhibitors with a Combined Machine Learning and Molecular Modeling Approach. J Chem Inf Model 2021; 61:4266-4279. [PMID: 34420294 DOI: 10.1021/acs.jcim.1c00856] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.5] [Indexed: 01/08/2023]
Abstract
Psychostimulant drugs, such as cocaine, inhibit dopamine reuptake via blockading the dopamine transporter (DAT), which is the primary mechanism underpinning their abuse. Atypical DAT inhibitors are dissimilar to cocaine and can block cocaine- or methamphetamine-induced behaviors, supporting their development as part of a treatment regimen for psychostimulant use disorders. When developing these atypical DAT inhibitors as medications, it is necessary to avoid off-target binding that can produce unwanted side effects or toxicities. In particular, the blockade of a potassium channel, human ether-a-go-go (hERG), can lead to potentially lethal ventricular tachycardia. In this study, we established a counter screening platform for DAT and against hERG binding by combining machine learning-based quantitative structure-activity relationship (QSAR) modeling, experimental validation, and molecular modeling and simulations. Our results show that the available data are adequate to establish robust QSAR models, as validated by chemical synthesis and pharmacological evaluation of a validation set of DAT inhibitors. Furthermore, the QSAR models based on subsets of the data according to experimental approaches used have predictive power as well, which opens the door to target specific functional states of a protein. Complementarily, our molecular modeling and simulations identified the structural elements responsible for a pair of DAT inhibitors having opposite binding affinity trends at DAT and hERG, which can be leveraged for rational optimization of lead atypical DAT inhibitors with desired pharmacological properties.
Affiliation(s)
- Kuo Hao Lee
- Computational Chemistry and Molecular Biophysics Section, Molecular Targets and Medications Discovery Branch, National Institute on Drug Abuse-Intramural Research Program, National Institutes of Health, Baltimore, Maryland 21224, United States
- Andrew D Fant
- Computational Chemistry and Molecular Biophysics Section, Molecular Targets and Medications Discovery Branch, National Institute on Drug Abuse-Intramural Research Program, National Institutes of Health, Baltimore, Maryland 21224, United States
- Jiqing Guo
- Libin Cardiovascular Institute of Alberta, Cumming School of Medicine, University of Calgary, Calgary, Alberta T2N 4N1, Canada
- Andy Guan
- Computational Chemistry and Molecular Biophysics Section, Molecular Targets and Medications Discovery Branch, National Institute on Drug Abuse-Intramural Research Program, National Institutes of Health, Baltimore, Maryland 21224, United States
- Joslyn Jung
- Computational Chemistry and Molecular Biophysics Section, Molecular Targets and Medications Discovery Branch, National Institute on Drug Abuse-Intramural Research Program, National Institutes of Health, Baltimore, Maryland 21224, United States
- Mary Kudaibergenova
- Centre for Molecular Simulation, Department of Biological Sciences, University of Calgary, Calgary, Alberta T2N 1N4, Canada
- Williams E Miranda
- Centre for Molecular Simulation, Department of Biological Sciences, University of Calgary, Calgary, Alberta T2N 1N4, Canada
- Therese Ku
- Medicinal Chemistry Section, Molecular Targets and Medications Discovery Branch, National Institute on Drug Abuse-Intramural Research Program, National Institutes of Health, Baltimore, Maryland 21224, United States
- Jianjing Cao
- Medicinal Chemistry Section, Molecular Targets and Medications Discovery Branch, National Institute on Drug Abuse-Intramural Research Program, National Institutes of Health, Baltimore, Maryland 21224, United States
- Soren Wacker
- Libin Cardiovascular Institute of Alberta, Cumming School of Medicine, University of Calgary, Calgary, Alberta T2N 4N1, Canada
- Centre for Molecular Simulation, Department of Biological Sciences, University of Calgary, Calgary, Alberta T2N 1N4, Canada
- Achlys Inc., 7-126 Li Ka Shing Center for Health and Innovation, Edmonton, Alberta T6G 2E1, Canada
- Henry J Duff
- Libin Cardiovascular Institute of Alberta, Cumming School of Medicine, University of Calgary, Calgary, Alberta T2N 4N1, Canada
- Amy Hauck Newman
- Medicinal Chemistry Section, Molecular Targets and Medications Discovery Branch, National Institute on Drug Abuse-Intramural Research Program, National Institutes of Health, Baltimore, Maryland 21224, United States
- Sergei Y Noskov
- Centre for Molecular Simulation, Department of Biological Sciences, University of Calgary, Calgary, Alberta T2N 1N4, Canada
- Lei Shi
- Computational Chemistry and Molecular Biophysics Section, Molecular Targets and Medications Discovery Branch, National Institute on Drug Abuse-Intramural Research Program, National Institutes of Health, Baltimore, Maryland 21224, United States
|
43
|
Mao J, Akhtar J, Zhang X, Sun L, Guan S, Li X, Chen G, Liu J, Jeon HN, Kim MS, No KT, Wang G. Comprehensive strategies of machine-learning-based quantitative structure-activity relationship models. iScience 2021; 24:103052. [PMID: 34553136 PMCID: PMC8441174 DOI: 10.1016/j.isci.2021.103052] [Citation(s) in RCA: 49] [Impact Index Per Article: 12.3] [Indexed: 11/21/2022] Open
Abstract
Early quantitative structure-activity relationship (QSAR) technologies have unsatisfactory versatility and accuracy in fields such as drug discovery because they are based on traditional machine learning and interpretive expert features. The development of Big Data and deep learning technologies significantly improve the processing of unstructured data and unleash the great potential of QSAR. Here we discuss the integration of wet experiments (which provide experimental data and reliable verification), molecular dynamics simulation (which provides mechanistic interpretation at the atomic/molecular levels), and machine learning (including deep learning) techniques to improve QSAR models. We first review the history of traditional QSAR and point out its problems. We then propose a better QSAR model characterized by a new iterative framework to integrate machine learning with disparate data input. Finally, we discuss the application of QSAR and machine learning to many practical research fields, including drug development and clinical trials.
Affiliation(s)
- Jiashun Mao
- The Interdisciplinary Graduate Program in Integrative Biotechnology and Translational Medicine, Yonsei University, Incheon 21983, Republic of Korea
- Department of Biology, School of Life Sciences, Southern University of Science and Technology, 1088 Xueyuan Avenue, Shenzhen, Guangdong 518055, China
- Guangdong Provincial Key Laboratory of Computational Science and Material Design, Shenzhen, Guangdong 518055 China
- Javed Akhtar
- Department of Biology, School of Life Sciences, Southern University of Science and Technology, 1088 Xueyuan Avenue, Shenzhen, Guangdong 518055, China
- Guangdong Provincial Key Laboratory of Cell Microenvironment and Disease Research, Shenzhen, Guangdong 518055, China
- Xiao Zhang
- Shanghai Rural Commercial Bank Co., Ltd, Shanghai 200002, China
- Liang Sun
- Department of Physics, City University of Hong Kong, 83 Tat Chee Avenue, Kowloon, Hong Kong, China
- Shenghui Guan
- Department of Biology, School of Life Sciences, Southern University of Science and Technology, 1088 Xueyuan Avenue, Shenzhen, Guangdong 518055, China
- Guangdong Provincial Key Laboratory of Computational Science and Material Design, Shenzhen, Guangdong 518055 China
- Xinyu Li
- School of Life and Health Sciences and Warshel Institute for Computational Biology, The Chinese University of Hong Kong, Shenzhen 518172, China
- Guangming Chen
- Department of Biology, School of Life Sciences, Southern University of Science and Technology, 1088 Xueyuan Avenue, Shenzhen, Guangdong 518055, China
- Guangdong Provincial Key Laboratory of Cell Microenvironment and Disease Research, Shenzhen, Guangdong 518055, China
- Jiaxin Liu
- Biotechnology, College of Life Science and Biotechnology, Yonsei University, Seoul 03722, Republic of Korea
- Hyeon-Nae Jeon
- Biotechnology, College of Life Science and Biotechnology, Yonsei University, Seoul 03722, Republic of Korea
- Min Sung Kim
- Biotechnology, College of Life Science and Biotechnology, Yonsei University, Seoul 03722, Republic of Korea
- Kyoung Tai No
- The Interdisciplinary Graduate Program in Integrative Biotechnology and Translational Medicine, Yonsei University, Incheon 21983, Republic of Korea
- Guanyu Wang
- Department of Biology, School of Life Sciences, Southern University of Science and Technology, 1088 Xueyuan Avenue, Shenzhen, Guangdong 518055, China
- Guangdong Provincial Key Laboratory of Computational Science and Material Design, Shenzhen, Guangdong 518055 China
- Guangdong Provincial Key Laboratory of Cell Microenvironment and Disease Research, Shenzhen, Guangdong 518055, China
|
44
|
Jandova Z, Vargiu AV, Bonvin AMJJ. Native or Non-Native Protein-Protein Docking Models? Molecular Dynamics to the Rescue. J Chem Theory Comput 2021; 17:5944-5954. [PMID: 34342983 PMCID: PMC8444332 DOI: 10.1021/acs.jctc.1c00336] [Citation(s) in RCA: 21] [Impact Index Per Article: 5.3] [Received: 04/06/2021] [Indexed: 11/29/2022]
Abstract
Molecular docking excels at creating a plethora of potential models of protein-protein complexes. To correctly distinguish the favorable, native-like models from the remaining ones remains, however, a challenge. We assessed here if a protocol based on molecular dynamics (MD) simulations would allow distinguishing native from non-native models to complement scoring functions used in docking. To this end, the first models for 25 protein-protein complexes were generated using HADDOCK. Next, MD simulations complemented with machine learning were used to discriminate between native and non-native complexes based on a combination of metrics reporting on the stability of the initial models. Native models showed higher stability in almost all measured properties, including the key ones used for scoring in the Critical Assessment of PRedicted Interaction (CAPRI) competition, namely the positional root mean square deviations and fraction of native contacts from the initial docked model. A random forest classifier was trained, reaching a 0.85 accuracy in correctly distinguishing native from non-native complexes. Reasonably modest simulation lengths of the order of 50-100 ns are sufficient to reach this accuracy, which makes this approach applicable in practice.
Affiliation(s)
- Zuzana Jandova
- Computational Structural Biology Group, Bijvoet Centre for Biomolecular Research, Faculty of Science—Chemistry, Utrecht University, Padualaan 8, 3584 CH Utrecht, the Netherlands
- Attilio Vittorio Vargiu
- Physics Department, University of Cagliari, Cittadella Universitaria, S.P. 8 km 0.700, 09042 Monserrato, Italy
- Alexandre M. J. J. Bonvin
- Computational Structural Biology Group, Bijvoet Centre for Biomolecular Research, Faculty of Science—Chemistry, Utrecht University, Padualaan 8, 3584 CH Utrecht, the Netherlands
|
45
|
Gawriljuk VO, Zin PPK, Puhl AC, Zorn KM, Foil DH, Lane TR, Hurst B, Tavella TA, Costa FTM, Lakshmanane P, Bernatchez J, Godoy AS, Oliva G, Siqueira-Neto JL, Madrid PB, Ekins S. Machine Learning Models Identify Inhibitors of SARS-CoV-2. J Chem Inf Model 2021; 61:4224-4235. [PMID: 34387990 DOI: 10.1021/acs.jcim.1c00683] [Citation(s) in RCA: 21] [Impact Index Per Article: 5.3] [Indexed: 12/18/2022]
Abstract
With the rapidly evolving SARS-CoV-2 variants of concern, there is an urgent need for the discovery of further treatments for the coronavirus disease (COVID-19). Drug repurposing is one of the most rapid strategies for addressing this need, and numerous compounds have already been selected for in vitro testing by several groups. These have led to a growing database of molecules with in vitro activity against the virus. Machine learning models can assist drug discovery through prediction of the best compounds based on previously published data. Herein, we have implemented several machine learning methods to develop predictive models from recent SARS-CoV-2 in vitro inhibition data and used them to prioritize additional FDA-approved compounds for in vitro testing selected from our in-house compound library. From the compounds predicted with a Bayesian machine learning model, lumefantrine, an antimalarial was selected for testing and showed limited antiviral activity in cell-based assays while demonstrating binding (Kd 259 nM) to the spike protein using microscale thermophoresis. Several other compounds which we prioritized have since been tested by others and were also found to be active in vitro. This combined machine learning and in vitro testing approach can be expanded to virtually screen available molecules with predicted activity against SARS-CoV-2 reference WIV04 strain and circulating variants of concern. In the process of this work, we have created multiple iterations of machine learning models that can be used as a prioritization tool for SARS-CoV-2 antiviral drug discovery programs. The very latest model for SARS-CoV-2 with over 500 compounds is now freely available at www.assaycentral.org.
Affiliation(s)
- Victor O Gawriljuk
  - São Carlos Institute of Physics, University of São Paulo, Av. João Dagnone, 1100-Santa Angelina, São Carlos, São Paulo 13563-120, Brazil
- Phyo Phyo Kyaw Zin
  - Collaborations Pharmaceuticals, Inc., 840 Main Campus Drive, Lab 3510, Raleigh, North Carolina 27606, United States
- Ana C Puhl
  - Collaborations Pharmaceuticals, Inc., 840 Main Campus Drive, Lab 3510, Raleigh, North Carolina 27606, United States
- Kimberley M Zorn
  - Collaborations Pharmaceuticals, Inc., 840 Main Campus Drive, Lab 3510, Raleigh, North Carolina 27606, United States
- Daniel H Foil
  - Collaborations Pharmaceuticals, Inc., 840 Main Campus Drive, Lab 3510, Raleigh, North Carolina 27606, United States
- Thomas R Lane
  - Collaborations Pharmaceuticals, Inc., 840 Main Campus Drive, Lab 3510, Raleigh, North Carolina 27606, United States
- Brett Hurst
  - Institute for Antiviral Research, Utah State University, Logan, Utah 84322-5600, United States; Department of Animal, Dairy and Veterinary Sciences, Utah State University, Logan, Utah 84322-4815, United States
- Tatyana Almeida Tavella
  - Laboratory of Tropical Diseases-Prof. Dr. Luiz Jacinto da Silva, Department of Genetics, Evolution, Microbiology and Immunology, University of Campinas-UNICAMP, Campinas, São Paulo, Brazil
- Fabio Trindade Maranhão Costa
  - Laboratory of Tropical Diseases-Prof. Dr. Luiz Jacinto da Silva, Department of Genetics, Evolution, Microbiology and Immunology, University of Campinas-UNICAMP, Campinas, São Paulo, Brazil
- Premkumar Lakshmanane
  - Department of Microbiology and Immunology, University of North Carolina School of Medicine, Chapel Hill, North Carolina 27599, United States
- Jean Bernatchez
  - Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California San Diego, La Jolla, San Diego, California 92093, United States
- Andre S Godoy
  - São Carlos Institute of Physics, University of São Paulo, Av. João Dagnone, 1100-Santa Angelina, São Carlos, São Paulo 13563-120, Brazil
- Glaucius Oliva
  - São Carlos Institute of Physics, University of São Paulo, Av. João Dagnone, 1100-Santa Angelina, São Carlos, São Paulo 13563-120, Brazil
- Jair L Siqueira-Neto
  - Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California San Diego, La Jolla, San Diego, California 92093, United States
- Peter B Madrid
  - SRI International, 333 Ravenswood Avenue, Menlo Park, California 94025, United States
- Sean Ekins
  - Collaborations Pharmaceuticals, Inc., 840 Main Campus Drive, Lab 3510, Raleigh, North Carolina 27606, United States
|
46
|
Dos Santos Nascimento IJ, da Silva-Júnior EF, de Aquino TM. Molecular Modeling Targeting Transmembrane Serine Protease 2 (TMPRSS2) as an Alternative Drug Target Against Coronaviruses. Curr Drug Targets 2021; 23:240-259. [PMID: 34370633 DOI: 10.2174/1389450122666210809090909] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 03/11/2021] [Revised: 06/03/2021] [Accepted: 06/07/2021] [Indexed: 11/22/2022]
Abstract
Since November 2019, the new Coronavirus disease (COVID-19) caused by the etiological agent SARS-CoV-2 has been responsible for several cases worldwide, becoming pandemic in March 2020. Pharmaceutical industries and academics have joined their efforts to discover new therapies to control the disease, since there are no specific drugs to combat this emerging virus. Thus, several targets have been explored; among them, transmembrane protease serine 2 (TMPRSS2) has gained greater interest in the scientific community. In this context, this review describes the importance of the TMPRSS2 protease and the significant advances in virtual screening focused on discovering new inhibitors. The review finds that molecular modeling methods could be powerful tools in identifying new molecules against SARS-CoV-2. Thus, it could be used to guide researchers worldwide to explore the biological and clinical potential of compounds that could be promising drug candidates against SARS-CoV-2, acting by inhibition of the TMPRSS2 protein.
Affiliation(s)
- Igor José Dos Santos Nascimento
  - Laboratory of Synthesis and Research in Medicinal Chemistry (LSRMEC), Institute of Chemistry and Biotechnology, Federal University of Alagoas, Maceió, Brazil
- Edeildo Ferreira da Silva-Júnior
  - Laboratory of Synthesis and Research in Medicinal Chemistry (LSRMEC), Institute of Chemistry and Biotechnology, Federal University of Alagoas, Maceió, Brazil
- Thiago Mendonça de Aquino
  - Laboratory of Synthesis and Research in Medicinal Chemistry (LSRMEC), Institute of Chemistry and Biotechnology, Federal University of Alagoas, Maceió, Brazil
|
47
|
Tynes M, Gao W, Burrill DJ, Batista ER, Perez D, Yang P, Lubbers N. Pairwise Difference Regression: A Machine Learning Meta-algorithm for Improved Prediction and Uncertainty Quantification in Chemical Search. J Chem Inf Model 2021; 61:3846-3857. [PMID: 34347460 DOI: 10.1021/acs.jcim.1c00670] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Indexed: 11/28/2022]
Abstract
Machine learning (ML) plays a growing role in the design and discovery of chemicals, aiming to reduce the need to perform expensive experiments and simulations. ML for such applications is promising but difficult, as models must generalize to vast chemical spaces from small training sets and must have reliable uncertainty quantification metrics to identify and prioritize unexplored regions. Ab initio computational chemistry and chemical intuition alike often take advantage of differences between chemical conditions, rather than their absolute structure or state, to generate more reliable results. We have developed an analogous comparison-based approach for ML regression, called pairwise difference regression (PADRE), which is applicable to arbitrary underlying learning models and operates on pairs of input data points. During training, the model learns to predict differences between all possible pairs of input points. During prediction, the test points are paired with all training set points, giving rise to a set of predictions that can be treated as a distribution of which the mean is treated as a final prediction and the dispersion is treated as an uncertainty measure. Pairwise difference regression was shown to reliably improve the performance of the random forest algorithm across five chemical ML tasks. Additionally, the pair-derived dispersion is both well correlated with model error and performs well in active learning. We also show that this method is competitive with state-of-the-art neural network techniques. Thus, pairwise difference regression is a promising tool for candidate selection algorithms used in chemical discovery.
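The training and prediction steps described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the pair representation (both points plus their element-wise difference), the tiny nearest-neighbour base learner (standing in for the random forest named in the abstract), and all class and function names are assumptions made for illustration only.

```python
from statistics import mean, pstdev


def pair_features(a, b):
    """Features for an ordered pair: both points plus their difference (assumed layout)."""
    return list(a) + list(b) + [ai - bi for ai, bi in zip(a, b)]


class NearestNeighbourRegressor:
    """Tiny k-NN regressor; a hypothetical stand-in for any base learner."""

    def __init__(self, k=1):
        self.k = k

    def fit(self, X, y):
        self.X, self.y = [list(x) for x in X], list(y)
        return self

    def predict_one(self, x):
        def sq_dist(row):
            return sum((ri - xi) ** 2 for ri, xi in zip(row, x))

        # Sort training rows by distance and average the k closest targets.
        order = sorted(range(len(self.X)), key=lambda i: sq_dist(self.X[i]))
        return mean(self.y[i] for i in order[: self.k])


class PairwiseDifferenceRegressor:
    """Wraps a base learner so that it learns target differences between pairs."""

    def __init__(self, base):
        self.base = base

    def fit(self, X, y):
        self.X_train, self.y_train = [list(x) for x in X], list(y)
        # Training: every ordered pair (i, j) becomes one example with target y_i - y_j.
        pairs = [pair_features(xi, xj) for xi in self.X_train for xj in self.X_train]
        diffs = [yi - yj for yi in self.y_train for yj in self.y_train]
        self.base.fit(pairs, diffs)
        return self

    def predict(self, x):
        # Prediction: pair the query with each training point; each pair yields
        # one estimate y_j + predicted (y - y_j).
        estimates = [
            yj + self.base.predict_one(pair_features(x, xj))
            for xj, yj in zip(self.X_train, self.y_train)
        ]
        # The mean is the final prediction; the spread serves as an uncertainty measure.
        return mean(estimates), pstdev(estimates)
```

Because the number of training pairs grows quadratically with the training set size, a sketch like this is only practical for small datasets; whether and how the original work subsamples pairs is not stated in the abstract.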
Affiliation(s)
- Michael Tynes
  - Theoretical Division, Los Alamos National Laboratory, Los Alamos, New Mexico 87545, United States; Center for Nonlinear Studies, Los Alamos National Laboratory, Los Alamos, New Mexico 87545, United States
- Wenhao Gao
  - Computer, Computational, and Statistical Sciences Division, Los Alamos National Laboratory, Los Alamos, New Mexico 87545, United States; Department of Chemical Engineering, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, United States
- Daniel J Burrill
  - Theoretical Division, Los Alamos National Laboratory, Los Alamos, New Mexico 87545, United States; Center for Nonlinear Studies, Los Alamos National Laboratory, Los Alamos, New Mexico 87545, United States
- Enrique R Batista
  - Theoretical Division, Los Alamos National Laboratory, Los Alamos, New Mexico 87545, United States; Center for Nonlinear Studies, Los Alamos National Laboratory, Los Alamos, New Mexico 87545, United States
- Danny Perez
  - Theoretical Division, Los Alamos National Laboratory, Los Alamos, New Mexico 87545, United States
- Ping Yang
  - Theoretical Division, Los Alamos National Laboratory, Los Alamos, New Mexico 87545, United States
- Nicholas Lubbers
  - Computer, Computational, and Statistical Sciences Division, Los Alamos National Laboratory, Los Alamos, New Mexico 87545, United States
|
48
|
Fernandes PO, Martins DM, de Souza Bozzi A, Martins JPA, de Moraes AH, Maltarollo VG. Molecular insights on ABL kinase activation using tree-based machine learning models and molecular docking. Mol Divers 2021; 25:1301-1314. [PMID: 34191245 PMCID: PMC8241884 DOI: 10.1007/s11030-021-10261-z] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.0] [Received: 03/31/2021] [Accepted: 06/18/2021] [Indexed: 12/14/2022]
Abstract
Abelson kinase (c-Abl) is a non-receptor tyrosine kinase involved in several biological processes essential for cell differentiation, migration, proliferation, and survival. This enzyme's activation might be an alternative strategy for treating diseases such as chemotherapy-induced neutropenia, prostate cancer, and breast cancer. Recently, a series of compounds that promote the activation of c-Abl has been identified, opening a promising ground for c-Abl drug development. Structure-based drug design (SBDD) and ligand-based drug design (LBDD) methodologies have significantly impacted recent drug development initiatives. Here, we combined SBDD and LBDD approaches to characterize critical chemical properties and interactions of the identified c-Abl activators. We used molecular docking simulations combined with tree-based machine learning models (decision tree, AdaBoost, and random forest) to understand the structural features that c-Abl activators require for binding to the myristoyl pocket and, consequently, for promoting enzyme and cellular activation. We obtained predictive and robust models with Matthews correlation coefficient values higher than 0.4 for all endpoints and identified characteristics that led to constructing a structure-activity relationship (SAR) model.
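For readers unfamiliar with the metric reported above, the Matthews correlation coefficient (MCC) for a binary classifier can be computed directly from the confusion-matrix counts. The sketch below uses the standard formula; the function name and the example counts are hypothetical and are not taken from the study.

```python
import math


def matthews_corrcoef_from_counts(tp, tn, fp, fn):
    """MCC = (TP*TN - FP*FN) / sqrt((TP+FP)(TP+FN)(TN+FP)(TN+FN))."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # By convention, return 0 when any marginal total is zero.
    return 0.0 if denom == 0 else (tp * tn - fp * fn) / denom


# Hypothetical counts for illustration only (not data from the paper).
example_mcc = matthews_corrcoef_from_counts(tp=40, tn=30, fp=10, fn=20)
```

MCC ranges from -1 (total disagreement) through 0 (no better than chance) to 1 (perfect prediction), so a threshold such as 0.4 indicates a positive but far from perfect correlation between predicted and observed classes.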
Affiliation(s)
- Philipe Oliveira Fernandes
  - Departamento de Produtos Farmacêuticos, Faculdade de Farmácia, Universidade Federal de Minas Gerais, Belo Horizonte, MG, Brazil
- Diego Magno Martins
  - Departamento de Química, Instituto de Ciências Exatas, Universidade Federal de Minas Gerais, Belo Horizonte, MG, Brazil
- Aline de Souza Bozzi
  - Departamento de Química, Instituto de Ciências Exatas, Universidade Federal de Minas Gerais, Belo Horizonte, MG, Brazil
- João Paulo A Martins
  - Departamento de Química, Instituto de Ciências Exatas, Universidade Federal de Minas Gerais, Belo Horizonte, MG, Brazil
- Adolfo Henrique de Moraes
  - Departamento de Química, Instituto de Ciências Exatas, Universidade Federal de Minas Gerais, Belo Horizonte, MG, Brazil
- Vinícius Gonçalves Maltarollo
  - Departamento de Produtos Farmacêuticos, Faculdade de Farmácia, Universidade Federal de Minas Gerais, Belo Horizonte, MG, Brazil
|
49
|
Rácz A, Bajusz D, Miranda-Quintana RA, Héberger K. Machine learning models for classification tasks related to drug safety. Mol Divers 2021; 25:1409-1424. [PMID: 34110577 PMCID: PMC8342376 DOI: 10.1007/s11030-021-10239-x] [Citation(s) in RCA: 23] [Impact Index Per Article: 5.8] [Received: 04/01/2021] [Accepted: 05/27/2021] [Indexed: 12/23/2022]
Abstract
In this review, we outline the current trends in the field of machine learning-driven classification studies related to ADME (absorption, distribution, metabolism and excretion) and toxicity endpoints from the past six years (2015-2021). The study focuses only on classification models with large datasets (i.e. more than a thousand compounds). A comprehensive literature search and meta-analysis was carried out for nine different targets: hERG-mediated cardiotoxicity, blood-brain barrier penetration, permeability glycoprotein (P-gp) substrate/inhibitor, cytochrome P450 enzyme family, acute oral toxicity, mutagenicity, carcinogenicity, respiratory toxicity and irritation/corrosion. The comparison of the best classification models was targeted to reveal the differences between machine learning algorithms and modeling types, endpoint-specific performances, dataset sizes and the different validation protocols. Based on the evaluation of the data, we can say that tree-based algorithms are (still) dominating the field, with consensus modeling being an increasing trend in drug safety predictions. Although one can already find classification models with strong performance for hERG-mediated cardiotoxicity and the isoenzymes of the cytochrome P450 enzyme family, these targets are still central to ADMET-related research efforts.
Affiliation(s)
- Anita Rácz
  - Plasma Chemistry Research Group, Research Centre for Natural Sciences, Magyar tudósok krt. 2, Budapest, 1117, Hungary
- Dávid Bajusz
  - Medicinal Chemistry Research Group, Research Centre for Natural Sciences, Magyar tudósok krt. 2, Budapest, 1117, Hungary
- Károly Héberger
  - Plasma Chemistry Research Group, Research Centre for Natural Sciences, Magyar tudósok krt. 2, Budapest, 1117, Hungary
|
50
|
Abstract
The application of artificial intelligence (AI) is currently changing a wide range of areas of life. Artificial intelligence involves the emulation of human behavior with the aid of methods from mathematics and informatics. Machine learning (ML) is a subfield of AI. ML algorithms have the potential to optimize patient care, in that they can be used in a supportive way in personalized medicine, decision making and risk prediction. Although the majority of applications in medicine are still limited to data analysis and research, it is certain that ML will become increasingly important, both scientifically and clinically, in this supportive function. Therefore, it is necessary for clinicians to have at least a basic understanding of the functional principles, strengths and weaknesses of ML.
Affiliation(s)
- J Sassenscheidt
  - Klinik und Poliklinik für Anästhesiologie, Zentrum für Anästhesiologie und Intensivmedizin, Martinistr. 52, 20246, Hamburg, Germany
  - Abteilung für Anästhesiologie, Intensivmedizin, Notfallmedizin, Schmerztherapie, Asklepios Klinik Altona, Paul-Ehrlich-Straße 1, 22763, Hamburg, Germany
- B Jungwirth
  - Klinik für Anästhesiologie, Universitätsklinikum Ulm, 89070, Ulm, Germany
- J C Kubitz
  - Klinik und Poliklinik für Anästhesiologie, Zentrum für Anästhesiologie und Intensivmedizin, Martinistr. 52, 20246, Hamburg, Germany
|