1. Mehmood A, Kaushik AC, Wei DQ. DDSBC: A Stacking Ensemble Classifier-Based Approach for Breast Cancer Drug-Pair Cell Synergy Prediction. J Chem Inf Model 2024. [PMID: 39116326 DOI: 10.1021/acs.jcim.4c01101] [Indexed: 08/10/2024]
Abstract
Breast cancer (BC) ranks as a leading cause of mortality among women worldwide, with incidence rates continuing to rise. The quest for effective treatments has led to the adoption of drug combination therapy, aiming to enhance drug efficacy. However, identifying synergistic drug combinations remains a daunting challenge due to the myriad of potential drug pairs. Current research leverages machine learning (ML) and deep learning (DL) models for drug-pair synergy prediction and classification. Nevertheless, these models often underperform on specific cancer types, including BC, as they are trained on data spanning various cancers without any specialization. Here, we introduce a stacking ensemble classifier, the drug-drug synergy for breast cancer (DDSBC), tailored explicitly for BC drug-pair cell synergy classification. Unlike existing models that generalize across cancer types, DDSBC is exclusively developed for BC, offering a more focused approach. Our comparative analysis against classical ML methods as well as DL models developed for drug synergy prediction highlights DDSBC's superior performance across test and independent datasets on BC data. Despite certain metrics where other methods narrowly surpass DDSBC by 1-2%, DDSBC consistently emerges as the top-ranked model, showcasing significant differences in scoring metrics and robust performance in ablation studies. DDSBC's performance and practicality position it as a preferred choice or an adjunctive validation tool for identifying synergistic or antagonistic drug pairs in BC, providing valuable insights for treatment strategies.
Affiliation(s)
- Aamir Mehmood
- State Key Laboratory of Microbial Metabolism, Joint International Research Laboratory of Metabolic & Developmental Sciences, and School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, Shanghai 200030, P. R. China
- Aman Chandra Kaushik
- State Key Laboratory of Microbial Metabolism, Joint International Research Laboratory of Metabolic & Developmental Sciences, and School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, Shanghai 200030, P. R. China
- Dong-Qing Wei
- State Key Laboratory of Microbial Metabolism, Joint International Research Laboratory of Metabolic & Developmental Sciences, and School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, Shanghai 200030, P. R. China
2. Cao Y, Balduf T, Beachy MD, Bennett MC, Bochevarov AD, Chien A, Dub PA, Dyall KG, Furness JW, Halls MD, Hughes TF, Jacobson LD, Kwak HS, Levine DS, Mainz DT, Moore KB, Svensson M, Videla PE, Watson MA, Friesner RA. Quantum chemical package Jaguar: A survey of recent developments and unique features. J Chem Phys 2024; 161:052502. [PMID: 39092934 DOI: 10.1063/5.0213317] [Received: 04/09/2024] [Accepted: 07/12/2024] [Indexed: 08/04/2024]
Abstract
This paper is dedicated to the quantum chemical package Jaguar, which is commercial software developed and distributed by Schrödinger, Inc. We discuss Jaguar's scientific features that are relevant to chemical research as well as describe those aspects of the program that are pertinent to the user interface, the organization of the computer code, and its maintenance and testing. Among the scientific topics that feature prominently in this paper are the quantum chemical methods grounded in the pseudospectral approach. A number of multistep workflows dependent on Jaguar are covered: prediction of protonation equilibria in aqueous solutions (particularly calculations of tautomeric stability and pKa), reactivity predictions based on automated transition state search, assembly of Boltzmann-averaged spectra such as vibrational and electronic circular dichroism, as well as nuclear magnetic resonance. Discussed also are quantum chemical calculations that are oriented toward materials science applications, in particular, prediction of properties of optoelectronic materials and organic semiconductors, and molecular catalyst design. The topic of treatment of conformations inevitably comes up in real world research projects and is considered as part of all the workflows mentioned above. In addition, we examine the role of machine learning methods in quantum chemical calculations performed by Jaguar, from auxiliary functions that return the approximate calculation runtime in a user interface, to prediction of actual molecular properties. The current work is second in a series of reviews of Jaguar, the first having been published more than ten years ago. Thus, this paper serves as a rare milestone on the path that is being traversed by Jaguar's development in more than thirty years of its existence.
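The Boltzmann-averaged spectra mentioned in the abstract above can be illustrated with a generic sketch. This is not Jaguar's workflow or API: the function names, the example energies, and the three-point "spectra" are hypothetical, and the code only assumes that each conformer has a relative energy in kcal/mol and a spectrum sampled on a shared grid.

```python
import math

# Molar gas constant in kcal/(mol*K); energies below are per mole.
R_KCAL = 0.0019872041


def boltzmann_weights(rel_energies_kcal, temperature=298.15):
    """Return normalized Boltzmann populations for conformer energies (kcal/mol)."""
    beta = 1.0 / (R_KCAL * temperature)
    e_min = min(rel_energies_kcal)  # shift by the minimum for numerical stability
    factors = [math.exp(-beta * (e - e_min)) for e in rel_energies_kcal]
    total = sum(factors)
    return [f / total for f in factors]


def boltzmann_average(spectra, weights):
    """Population-weighted average of spectra sampled on a common grid."""
    n_points = len(spectra[0])
    return [
        sum(w * s[i] for w, s in zip(weights, spectra))
        for i in range(n_points)
    ]


# Hypothetical data: three conformers, three grid points each.
energies = [0.0, 0.5, 2.0]
spectra = [
    [1.0, 0.0, 0.2],
    [0.0, 1.0, 0.2],
    [0.5, 0.5, 0.2],
]
weights = boltzmann_weights(energies)
averaged = boltzmann_average(spectra, weights)
```

The lowest-energy conformer receives the largest weight, so its features dominate the averaged spectrum. A production workflow would also need a conformational search and a choice of energy model, neither of which is shown here.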
Affiliation(s)
- Yixiang Cao
- Schrödinger, Inc., 1540 Broadway, Floor 24, New York, New York 10036, USA
- Ty Balduf
- Schrödinger, Inc., 1540 Broadway, Floor 24, New York, New York 10036, USA
- Michael D Beachy
- Schrödinger, Inc., 1540 Broadway, Floor 24, New York, New York 10036, USA
- M Chandler Bennett
- Schrödinger, Inc., 1540 Broadway, Floor 24, New York, New York 10036, USA
- Art D Bochevarov
- Schrödinger, Inc., 1540 Broadway, Floor 24, New York, New York 10036, USA
- Alan Chien
- Schrödinger, Inc., 1540 Broadway, Floor 24, New York, New York 10036, USA
- Pavel A Dub
- Schrödinger, Inc., 9868 Scranton Road, Suite 3200, San Diego, California 92121, USA
- Kenneth G Dyall
- Schrödinger, Inc., 101 SW Main St., Suite 1300, Portland, Oregon 97204, USA
- James W Furness
- Schrödinger, Inc., 1540 Broadway, Floor 24, New York, New York 10036, USA
- Mathew D Halls
- Schrödinger, Inc., 9868 Scranton Road, Suite 3200, San Diego, California 92121, USA
- Thomas F Hughes
- Schrödinger, Inc., 1540 Broadway, Floor 24, New York, New York 10036, USA
- Leif D Jacobson
- Schrödinger, Inc., 101 SW Main St., Suite 1300, Portland, Oregon 97204, USA
- H Shaun Kwak
- Schrödinger, Inc., 101 SW Main St., Suite 1300, Portland, Oregon 97204, USA
- Daniel S Levine
- Schrödinger, Inc., 1540 Broadway, Floor 24, New York, New York 10036, USA
- Daniel T Mainz
- Schrödinger, Inc., 1540 Broadway, Floor 24, New York, New York 10036, USA
- Kevin B Moore
- Schrödinger, Inc., 1540 Broadway, Floor 24, New York, New York 10036, USA
- Mats Svensson
- Schrödinger, Inc., 1540 Broadway, Floor 24, New York, New York 10036, USA
- Pablo E Videla
- Schrödinger, Inc., 1540 Broadway, Floor 24, New York, New York 10036, USA
- Mark A Watson
- Schrödinger, Inc., 1540 Broadway, Floor 24, New York, New York 10036, USA
- Richard A Friesner
- Department of Chemistry, Columbia University, 3000 Broadway, New York, New York 10027, USA
3. Saifi I, Bhat BA, Hamdani SS, Bhat UY, Lobato-Tapia CA, Mir MA, Dar TUH, Ganie SA. Artificial intelligence and cheminformatics tools: a contribution to the drug development and chemical science. J Biomol Struct Dyn 2024; 42:6523-6541. [PMID: 37434311 DOI: 10.1080/07391102.2023.2234039] [Received: 02/12/2023] [Accepted: 07/03/2023] [Indexed: 07/13/2023]
Abstract
In the ever-evolving field of drug discovery, the integration of Artificial Intelligence (AI) and Machine Learning (ML) with cheminformatics has proven to be a powerful combination. Cheminformatics, which combines the principles of computer science and chemistry, is used to extract chemical information and search compound databases, while the application of AI and ML allows for the identification of potential hit compounds, optimization of synthesis routes, and prediction of drug efficacy and toxicity. This collaborative approach has led to the discovery, preclinical evaluation, and approval of over 70 drugs in recent years. To aid researchers in the pursuit of new drugs, this article presents a comprehensive list of databases, datasets, predictive and generative models, scoring functions, and web platforms that were launched between 2021 and 2022. These resources provide a wealth of information and tools for computer-assisted drug development, and are a valuable asset for those working in the field of cheminformatics. Overall, the integration of AI, ML, and cheminformatics has greatly advanced the drug discovery process and continues to hold great potential for the future. As new resources and technologies become available, we can expect to see even more groundbreaking discoveries and advancements in these fields. Communicated by Ramaswamy H. Sarma.
Affiliation(s)
- Ifra Saifi
- Chaudhary Charan Singh University, Meerut, Uttar Pradesh, India
- Basharat Ahmad Bhat
- Department of Bioresources, School of Biological Sciences, University of Kashmir, Srinagar, J&K, India
- Syed Suhail Hamdani
- Department of Bioresources, School of Biological Sciences, University of Kashmir, Srinagar, J&K, India
- Umar Yousuf Bhat
- Department of Zoology, School of Biological Sciences, University of Kashmir, Srinagar, J&K, India
- Mushtaq Ahmad Mir
- Department of Clinical Laboratory Sciences, College of Applied Medical Science, King Khalid University, KSA, Saudi Arabia
- Tanvir Ul Hasan Dar
- Department of Biotechnology, School of Biosciences and Biotechnology, BGSB University, Rajouri, India
- Showkat Ahmad Ganie
- Department of Clinical Biochemistry, School of Biological Sciences, University of Kashmir, Srinagar, J&K, India
4. Ho Manh L, Chen VCP, Rosenberger J, Wang S, Yang Y, Schug KA. Prediction of Vacuum Ultraviolet/Ultraviolet Gas-Phase Absorption Spectra Using Molecular Feature Representations and Machine Learning. J Chem Inf Model 2024; 64:5547-5556. [PMID: 38938209 DOI: 10.1021/acs.jcim.4c00676] [Indexed: 06/29/2024]
Abstract
Ultraviolet (UV) absorption spectroscopy is a widely used tool for quantitative and qualitative analyses of chemical compounds. In the gas phase, vacuum UV (VUV) and UV absorption spectra are specific and diagnostic for many small molecules. An accurate prediction of VUV/UV absorption spectra can aid the characterization of new or unknown molecules in areas such as fuels, forensics, and pharmaceutical research. An alternative to quantum chemical spectral prediction is the use of artificial intelligence. Here, different molecular feature representation techniques were used and developed to encode chemical structures for testing three machine learning models to predict gas-phase VUV/UV absorption spectra. Structure data files (.sdf) and VUV/UV absorption spectra for 1397 volatile and semivolatile chemical compounds were used to train and test the models. New molecular features (termed ABOCH) were introduced to better capture pi-bonding, aromaticity, and halogenation. The incorporation of these new features benefited spectral prediction and demonstrated superior performance compared to computationally intensive molecular-based deep learning methods. Of the machine learning methods, the use of a Random Forest regressor returned the best accuracy score with the shortest training time. The developed machine learning prediction model also outperformed spectral predictions based on the time-dependent density functional theory.
Affiliation(s)
- Linh Ho Manh
- Department of Industrial, Manufacturing, and Systems Engineering, The University of Texas at Arlington, Arlington, Texas 76019, United States
- Victoria C P Chen
- Department of Industrial, Manufacturing, and Systems Engineering, The University of Texas at Arlington, Arlington, Texas 76019, United States
- Jay Rosenberger
- Department of Industrial, Manufacturing, and Systems Engineering, The University of Texas at Arlington, Arlington, Texas 76019, United States
- Shouyi Wang
- Department of Industrial, Manufacturing, and Systems Engineering, The University of Texas at Arlington, Arlington, Texas 76019, United States
- Yujing Yang
- Department of Industrial, Manufacturing, and Systems Engineering, The University of Texas at Arlington, Arlington, Texas 76019, United States
- Kevin A Schug
- Department of Chemistry and Biochemistry, The University of Texas at Arlington, Arlington, Texas 76019, United States
5. Shirani H, Hashemianzadeh SM. Quantum-level machine learning calculations of Levodopa. Comput Biol Chem 2024; 112:108146. [PMID: 39067350 DOI: 10.1016/j.compbiolchem.2024.108146] [Received: 02/14/2024] [Revised: 06/20/2024] [Accepted: 07/08/2024] [Indexed: 07/30/2024]
Abstract
Many drug molecules contain functional groups, resulting in a torsional barrier corresponding to rotation around the bond linking the fragments. In medicinal chemistry and pharmaceutical sciences, including drug design studies, accurate calculation of the potential energy surface (PES) of these molecular torsions is extremely important. Machine learning (ML), including deep learning (DL), is currently one of the most rapidly evolving tools in computer-aided drug discovery and molecular simulations. In this work, we used the ANI-1x neural network potential as a quantum-level ML model to predict the PESs of the antiparkinsonian drug molecule L-3,4-dihydroxyphenylalanine (Levodopa). The electronic energies and structural parameters calculated by density functional theory (DFT) using the wB97X functional with all available Pople basis sets indicated that the 6-31G(d) basis set, when used with the wB97X functional, exhibits behavior similar to that of the ANI-1x model. The investigation of vibrational frequencies showed a linear correlation between DFT and ML data. All ANI-1x calculations were completed in a very short computing time. From this perspective, we expect the ANI-1x model applied in this work to be appreciably efficient and effective in computational structure-based drug design studies.
Affiliation(s)
- Hossein Shirani
- Molecular Simulation Research Laboratory, Department of Chemistry, Iran University of Science and Technology, P.O. Box 16846-13114, Tehran, Iran.
- Seyed Majid Hashemianzadeh
- Molecular Simulation Research Laboratory, Department of Chemistry, Iran University of Science and Technology, P.O. Box 16846-13114, Tehran, Iran.
6. Li J, Lardon R, Mangelinckx S, Geelen D. A practical guide to the discovery of biomolecules with biostimulant activity. J Exp Bot 2024; 75:3797-3817. [PMID: 38630561 DOI: 10.1093/jxb/erae156] [Received: 11/13/2023] [Accepted: 04/16/2024] [Indexed: 04/19/2024]
Abstract
The growing demand for sustainable solutions in agriculture, which are critical for crop productivity and food quality in the face of climate change and the need to reduce agrochemical usage, has brought biostimulants into the spotlight as valuable tools for regenerative agriculture. With their diverse biological activities, biostimulants can contribute to crop growth, nutrient use efficiency, and abiotic stress resilience, as well as to the restoration of soil health. Biomolecules including humic substances, protein lysates, phenolics, and carbohydrates have undergone thorough investigation because of their demonstrated biostimulant activities. Here, we review the process of the discovery and development of extract-based biostimulants, and propose a practical step-by-step pipeline that starts with initial identification of biomolecules, followed by extraction and isolation, determination of bioactivity, identification of active compound(s), elucidation of mechanisms, formulation, and assessment of effectiveness. The different steps generate a roadmap that aims to expedite the transfer of interdisciplinary knowledge from laboratory-scale studies to pilot-scale production in practical scenarios that are aligned with the prevailing regulatory frameworks.
Affiliation(s)
- Jing Li
- HortiCell, Department Plants and Crops, Faculty of Bioscience Engineering, Ghent University, Coupure Links 653, 9000 Ghent, Belgium
- Robin Lardon
- HortiCell, Department Plants and Crops, Faculty of Bioscience Engineering, Ghent University, Coupure Links 653, 9000 Ghent, Belgium
- Sven Mangelinckx
- SynBioC, Department of Green Chemistry and Technology, Faculty of Bioscience Engineering, Ghent University, Coupure Links 653, 9000 Ghent, Belgium
- Danny Geelen
- HortiCell, Department Plants and Crops, Faculty of Bioscience Engineering, Ghent University, Coupure Links 653, 9000 Ghent, Belgium
7. Liu J, Gui Y, Rao J, Sun J, Wang G, Ren Q, Qu N, Niu B, Chen Z, Sheng X, Wang Y, Zheng M, Li X. In silico off-target profiling for enhanced drug safety assessment. Acta Pharm Sin B 2024; 14:2927-2941. [PMID: 39027254 PMCID: PMC11252485 DOI: 10.1016/j.apsb.2024.03.002] [Received: 11/28/2023] [Revised: 02/21/2024] [Accepted: 02/29/2024] [Indexed: 07/20/2024]
Abstract
Ensuring drug safety in the early stages of drug development is crucial to avoid costly failures in subsequent phases. However, the economic burden associated with detecting drug off-targets and potential side effects through in vitro safety screening and animal testing is substantial. Drug off-target interactions, along with the adverse drug reactions they induce, are significant factors affecting drug safety. To assess the liability of candidate drugs, we developed an artificial intelligence model for the precise prediction of compound off-target interactions, leveraging multi-task graph neural networks. The outcomes of off-target predictions can serve as representations for compounds, enabling the differentiation of drugs under various ATC codes and the classification of compound toxicity. Furthermore, the predicted off-target profiles are employed in adverse drug reaction (ADR) enrichment analysis, facilitating the inference of potential ADRs for a drug. Using the withdrawn drug Pergolide as an example, we elucidate the mechanisms underlying ADRs at the target level, contributing to the exploration of the potential clinical relevance of newly predicted off-target interactions. Overall, our work facilitates the early assessment of compound safety/toxicity based on off-target identification, deduces potential ADRs of drugs, and ultimately promotes the secure development of drugs.
Affiliation(s)
- Jin Liu
- College of Pharmaceutical Sciences, Zhejiang University, Hangzhou 310058, China
- Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica Chinese Academy of Sciences, Shanghai 201203, China
- Yike Gui
- Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica Chinese Academy of Sciences, Shanghai 201203, China
- Nanjing University of Chinese Medicine, Nanjing 210023, China
- Jingxin Rao
- Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica Chinese Academy of Sciences, Shanghai 201203, China
- University of Chinese Academy of Sciences, Beijing 100049, China
- Jingjing Sun
- Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica Chinese Academy of Sciences, Shanghai 201203, China
- University of Chinese Academy of Sciences, Beijing 100049, China
- Gang Wang
- Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica Chinese Academy of Sciences, Shanghai 201203, China
- University of Chinese Academy of Sciences, Beijing 100049, China
- Qun Ren
- Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica Chinese Academy of Sciences, Shanghai 201203, China
- Nanjing University of Chinese Medicine, Nanjing 210023, China
- Ning Qu
- Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica Chinese Academy of Sciences, Shanghai 201203, China
- University of Chinese Academy of Sciences, Beijing 100049, China
- Buying Niu
- Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica Chinese Academy of Sciences, Shanghai 201203, China
- University of Chinese Academy of Sciences, Beijing 100049, China
- Zhiyi Chen
- Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica Chinese Academy of Sciences, Shanghai 201203, China
- University of Chinese Academy of Sciences, Beijing 100049, China
- School of Pharmaceutical Science and Technology, Hangzhou Institute for Advanced Study, Hangzhou 330106, China
- Xia Sheng
- Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica Chinese Academy of Sciences, Shanghai 201203, China
- University of Chinese Academy of Sciences, Beijing 100049, China
- Yitian Wang
- Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica Chinese Academy of Sciences, Shanghai 201203, China
- University of Chinese Academy of Sciences, Beijing 100049, China
- Mingyue Zheng
- College of Pharmaceutical Sciences, Zhejiang University, Hangzhou 310058, China
- Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica Chinese Academy of Sciences, Shanghai 201203, China
- University of Chinese Academy of Sciences, Beijing 100049, China
- Nanjing University of Chinese Medicine, Nanjing 210023, China
- School of Pharmaceutical Science and Technology, Hangzhou Institute for Advanced Study, Hangzhou 330106, China
- Xutong Li
- Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica Chinese Academy of Sciences, Shanghai 201203, China
- University of Chinese Academy of Sciences, Beijing 100049, China
8. Liu Q, He D, Fan M, Wang J, Cui Z, Wang H, Mi Y, Li N, Meng Q, Hou Y. Prediction and Interpretation Microglia Cytotoxicity by Machine Learning. J Chem Inf Model 2024. [PMID: 38949724 DOI: 10.1021/acs.jcim.4c00366] [Indexed: 07/02/2024]
Abstract
Ameliorating microglia-mediated neuroinflammation is a crucial strategy in developing new drugs for neurodegenerative diseases. Plant compounds are an important screening target for the discovery of drugs for the treatment of neurodegenerative diseases. However, due to the spatial complexity of phytochemicals, it becomes particularly important to evaluate the effectiveness of compounds while avoiding the mixing of cytotoxic substances in the early stages of compound screening. Traditional high-throughput screening methods suffer from high cost and low efficiency. A computational model based on machine learning provides a novel avenue for cytotoxicity determination. In this study, a microglia cytotoxicity classifier was developed using a machine learning approach. First, we proposed a data splitting strategy based on the Murcko generic scaffold of each molecule. Under this split, three machine learning approaches were coupled with three kinds of molecular representation methods to construct microglia cytotoxicity classifiers, which were then compared and assessed by predictive accuracy, balanced accuracy, F1-score, and Matthews correlation coefficient. Then, recursive feature elimination integrated with a support vector machine (RFE-SVC) was introduced as a dimension reduction method for high-dimensional molecular fingerprints to further improve model performance. Among all the microglia cytotoxicity classifiers, the SVM coupled with the ECFP4 fingerprint after feature selection (ECFP4-RFE-SVM) obtained the most accurate classification on the test set (ACC of 0.99, BA of 0.99, F1-score of 0.99, MCC of 0.97). Finally, the Shapley additive explanations (SHAP) method was used to interpret the microglia cytotoxicity classifier, and key substructure SMARTS were identified as structural alerts.
Experimental results show that ECFP4-RFE-SVM has reliable classification capability for microglia cytotoxicity, and that SHAP can not only provide a rational explanation for microglia cytotoxicity predictions but also offer a guideline for subsequent molecular cytotoxicity modifications.
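The four scores reported in the abstract above (ACC, BA, F1-score, MCC) all follow from a binary confusion matrix. The sketch below shows the standard formulas in plain Python. It is not the authors' code: the function name and the ten example labels are hypothetical and chosen only to make the arithmetic easy to follow.

```python
import math


def classification_metrics(y_true, y_pred):
    """Compute ACC, BA, F1, and MCC for binary labels (1 = positive, 0 = negative)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)

    acc = (tp + tn) / len(pairs)
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    ba = (sensitivity + specificity) / 2.0  # mean of per-class recall
    f1 = 2 * tp / (2 * tp + fp + fn) if (2 * tp + fp + fn) else 0.0

    # MCC is defined as 0 here when any marginal count is zero.
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {"ACC": acc, "BA": ba, "F1": f1, "MCC": mcc}


# Hypothetical labels: 4 positives (one missed), 6 negatives (one false alarm).
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 0, 1]
metrics = classification_metrics(y_true, y_pred)
```

On imbalanced data, accuracy can stay high while BA and MCC drop, which is one reason to report all four scores together.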
Affiliation(s)
- Qing Liu
- College of Information Science and Engineering, State Key Laboratory of Synthetical Automation for Process Industries, Northeastern University, Shenyang 110819, P. R. China
- Dakuo He
- College of Information Science and Engineering, State Key Laboratory of Synthetical Automation for Process Industries, Northeastern University, Shenyang 110819, P. R. China
- Mengmeng Fan
- College of Information Science and Engineering, State Key Laboratory of Synthetical Automation for Process Industries, Northeastern University, Shenyang 110819, P. R. China
- Jinpeng Wang
- College of Information Science and Engineering, State Key Laboratory of Synthetical Automation for Process Industries, Northeastern University, Shenyang 110819, P. R. China
- Zeyu Cui
- College of Information Science and Engineering, State Key Laboratory of Synthetical Automation for Process Industries, Northeastern University, Shenyang 110819, P. R. China
- Hao Wang
- College of Information Science and Engineering, State Key Laboratory of Synthetical Automation for Process Industries, Northeastern University, Shenyang 110819, P. R. China
- Yan Mi
- Key Laboratory of Bioresource Research and Development of Liaoning Province, College of Life and Health Sciences, National Frontiers Science Center for Industrial Intelligence and Systems Optimization, Key Laboratory of Data Analytics and Optimization for Smart Industry, Ministry of Education, Northeastern University, Shenyang 110169, P. R. China
- Ning Li
- School of Traditional Chinese Materia Medica, Key Laboratory for TCM Material Basis Study and Innovative Drug Development of Shenyang City, Shenyang Pharmaceutical University, Shenyang 110016, P. R. China
- Qingqi Meng
- Key Laboratory of Bioresource Research and Development of Liaoning Province, College of Life and Health Sciences, National Frontiers Science Center for Industrial Intelligence and Systems Optimization, Key Laboratory of Data Analytics and Optimization for Smart Industry, Ministry of Education, Northeastern University, Shenyang 110169, P. R. China
- Yue Hou
- Key Laboratory of Bioresource Research and Development of Liaoning Province, College of Life and Health Sciences, National Frontiers Science Center for Industrial Intelligence and Systems Optimization, Key Laboratory of Data Analytics and Optimization for Smart Industry, Ministry of Education, Northeastern University, Shenyang 110169, P. R. China
9. Yu L, Che M, Wu X, Luo H. Research on ultrasound-based radiomics: a bibliometric analysis. Quant Imaging Med Surg 2024; 14:4520-4539. [PMID: 39022291 PMCID: PMC11250334 DOI: 10.21037/qims-23-1867] [Received: 12/30/2023] [Accepted: 05/16/2024] [Indexed: 07/20/2024]
Abstract
Background: A large number of studies related to ultrasound-based radiomics have been published in recent years; however, a systematic bibliometric analysis of this topic has not yet been conducted. In this study, we attempted to identify the hotspots and frontiers in ultrasound-based radiomics through bibliometrics and to systematically characterize the overall framework and characteristics of studies through mapping and visualization.
Methods: A literature search was carried out in the Web of Science Core Collection (WoSCC) database from January 2016 to December 2023 according to a predetermined search formula. Bibliometric analysis and visualization of the results were performed using CiteSpace, VOSviewer, R, and other platforms.
Results: Ultimately, 466 eligible papers were included in the study. Publication trend analysis showed that the annual publication trend in ultrasound-based radiomics could be divided into three phases: there were no more than five documents published in this field in any year before 2018, a small yearly increase in the number of annual publications occurred between 2018 and 2022, and a high, stable number of publications appeared after 2022. In the analysis of publication sources, China was found to be the main contributor, with a much higher number of publications than other countries, and was followed by the United States and Italy. Frontiers in Oncology was the journal with the highest number of papers in this field, publishing 60 articles. Among the academic institutions, Fudan University, Sun Yat-sen University, and the Chinese Academy of Sciences ranked as the top three in terms of the number of documents. In the analysis of authors and cocited authors, the author with the most publications was Yuanyuan Wang, who has published 19 articles in 8 years, while Philippe Lambin was the most cited author, with 233 citations. Visualization of the results from the cocitation analysis of the literature revealed a strong centrality of the subject terms papillary thyroid cancer, biological behavior, potential biomarkers, and comparative assessment, which may be the main focal points of research in this subject. Based on the findings of the keyword analysis and cluster analysis, the keywords can be categorized into two major groups: (I) technological innovations that enable the construction of radiomics models, such as machine learning and deep learning, and (II) applications of predictive models to support clinical decision-making in certain diseases, such as papillary thyroid cancer, hepatocellular carcinoma (HCC), and breast cancer.
Conclusions: Ultrasound-based radiomics has received widespread attention in the medical field and has gradually been applied in clinical research. Radiomics, a relatively late development in medical technology, has made substantial contributions to the diagnosis, prediction, and prognostic evaluation of diseases. Additionally, the coupling of artificial intelligence techniques with ultrasound imaging has yielded a number of promising tools that facilitate clinical decision-making and enable the practice of precision medicine. Finally, the development of ultrasound-based radiomics requires multidisciplinary cooperation and joint efforts from the fields of biomedicine, information technology, statistics, and clinical medicine.
Affiliation(s)
- Lu Yu
- Department of Ultrasound, The Second Affiliated Hospital of Sichuan University, Chengdu, China
- Key Laboratory of Birth Defects and Related Diseases of Women and Children (Sichuan University), Ministry of Education, Chengdu, China
| | - Mengting Che
- Department of Tumor Radiotherapy and Chemotherapy, The Second Affiliated Hospital of Sichuan University, Chengdu, China
| | - Xu Wu
- Department of Ultrasound, The Second Affiliated Hospital of Sichuan University, Chengdu, China
- Key Laboratory of Birth Defects and Related Diseases of Women and Children (Sichuan University), Ministry of Education, Chengdu, China
| | - Hong Luo
- Department of Ultrasound, The Second Affiliated Hospital of Sichuan University, Chengdu, China
- Key Laboratory of Birth Defects and Related Diseases of Women and Children (Sichuan University), Ministry of Education, Chengdu, China
|
10
|
Burley SK, Wu-Wu A, Dutta S, Ganesan S, Zheng SXF. Impact of structural biology and the Protein Data Bank on US FDA new drug approvals of low molecular weight antineoplastic agents 2019-2023. Oncogene 2024; 43:2229-2243. [PMID: 38886570 PMCID: PMC11245395 DOI: 10.1038/s41388-024-03077-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/28/2024] [Revised: 06/04/2024] [Accepted: 06/05/2024] [Indexed: 06/20/2024]
Abstract
Open access to three-dimensional atomic-level biostructure information from the Protein Data Bank (PDB) facilitated discovery/development of 100% of the 34 new low molecular weight, protein-targeted, antineoplastic agents approved by the US FDA 2019-2023. Analyses of PDB holdings, the scientific literature, and related documents for each drug-target combination revealed that the impact of structural biologists and public-domain 3D biostructure data was broad and substantial, ranging from understanding target biology (100% of all drug targets), to identifying a given target as likely druggable (100% of all targets), to structure-guided drug discovery (>80% of all new small-molecule drugs, made up of 50% confirmed and >30% probable cases). In addition to aggregate impact assessments, illustrative case studies are presented for six first-in-class small-molecule anti-cancer drugs, including a selective inhibitor of nuclear export targeting Exportin 1 (selinexor, Xpovio), an ATP-competitive CSF-1R receptor tyrosine kinase inhibitor (pexidartinib, Turalio), a non-ATP-competitive inhibitor of the BCR-Abl fusion protein targeting the myristoyl binding pocket within the kinase catalytic domain of Abl (asciminib, Scemblix), a covalently-acting G12C KRAS inhibitor (sotorasib, Lumakras or Lumykras), an EZH2 methyltransferase inhibitor (tazemetostat, Tazverik), and an agent targeting the basic-Helix-Loop-Helix transcription factor HIF-2α (belzutifan, Welireg).
Affiliation(s)
- Stephen K Burley
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ, 08854, USA.
- Rutgers Cancer Institute of New Jersey, Robert Wood Johnson Medical School, New Brunswick, NJ, 08903, USA.
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, San Diego Supercomputer Center, University of California, San Diego, La Jolla, CA, 92093, USA.
- Department of Chemistry and Chemical Biology, Rutgers, The State University of New Jersey, Piscataway, NJ, 08854, USA.
| | - Amy Wu-Wu
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ, 08854, USA
| | - Shuchismita Dutta
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ, 08854, USA
- Rutgers Cancer Institute of New Jersey, Robert Wood Johnson Medical School, New Brunswick, NJ, 08903, USA
| | - Shridar Ganesan
- Rutgers Cancer Institute of New Jersey, Robert Wood Johnson Medical School, New Brunswick, NJ, 08903, USA
| | - Steven X F Zheng
- Rutgers Cancer Institute of New Jersey, Robert Wood Johnson Medical School, New Brunswick, NJ, 08903, USA
|
11
|
Daghighi A, Casanola-Martin GM, Iduoku K, Kusic H, González-Díaz H, Rasulev B. Multi-Endpoint Acute Toxicity Assessment of Organic Compounds Using Large-Scale Machine Learning Modeling. Environmental Science & Technology 2024; 58:10116-10127. [PMID: 38797941 DOI: 10.1021/acs.est.4c01017] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/29/2024]
Abstract
In recent years, alternatives to animal testing, such as computational and machine learning approaches, have become increasingly crucial for toxicity testing. However, the complexity and scarcity of available biomedical data challenge the development of predictive models. Combining nonlinear machine learning with multicondition descriptors offers a solution for using data from various assays to create a robust model. This work applies multicondition descriptors (MCDs) to develop a QSTR (Quantitative Structure-Toxicity Relationship) model based on a large toxicity data set comprising more than 80,000 compounds and 59 different end points (122,572 data points). The prediction capabilities of developed single-task multi-end point machine learning models as well as a novel data analysis approach with the use of Convolutional Neural Networks (CNN) are discussed. The results show that using MCDs significantly improves the model and using them with CNN-1D yields the best result (training R² = 0.93, external R² = 0.70). Several structural features showed a high level of contribution to the toxicity, including van der Waals surface area (VSA), number of nitrogen-containing fragments (nN+), presence of S-P fragments, ionization potential, and presence of C-N fragments. The developed models can be very useful tools to predict the toxicity of various compounds under different conditions, enabling quick toxicity assessment of new compounds.
Affiliation(s)
- Amirreza Daghighi
- Department of Coatings and Polymeric Materials, North Dakota State University, Fargo, North Dakota 58102, United States
- Biomedical Engineering Program, North Dakota State University, Fargo, North Dakota 58102, United States
| | - Gerardo M Casanola-Martin
- Department of Coatings and Polymeric Materials, North Dakota State University, Fargo, North Dakota 58102, United States
| | - Kweeni Iduoku
- Department of Coatings and Polymeric Materials, North Dakota State University, Fargo, North Dakota 58102, United States
- Biomedical Engineering Program, North Dakota State University, Fargo, North Dakota 58102, United States
| | - Hrvoje Kusic
- Faculty of Chemical Engineering and Technology, University of Zagreb, Marulicev Trg 19, Zagreb 10000, Croatia
| | - Humberto González-Díaz
- Department of Organic and Inorganic Chemistry, University of Basque Country UPV/EHU, Leioa 48940, Spain
- BIOFISIKA, Basque Center for Biophysics CSIC-UPVEH, Leioa 48940, Spain
- IKERBASQUE, Basque Foundation for Science, Bilbao, Biscay 48011, Spain
| | - Bakhtiyor Rasulev
- Department of Coatings and Polymeric Materials, North Dakota State University, Fargo, North Dakota 58102, United States
- Biomedical Engineering Program, North Dakota State University, Fargo, North Dakota 58102, United States
|
12
|
Zhou Y, Wang Z, Huang Z, Li W, Chen Y, Yu X, Tang Y, Liu G. In silico prediction of ocular toxicity of compounds using explainable machine learning and deep learning approaches. J Appl Toxicol 2024; 44:892-907. [PMID: 38329145 DOI: 10.1002/jat.4586] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/28/2023] [Revised: 01/16/2024] [Accepted: 01/16/2024] [Indexed: 02/09/2024]
Abstract
The accurate identification of chemicals with ocular toxicity is of paramount importance in health hazard assessment. In contemporary chemical toxicology, there is a growing emphasis on refining, reducing, and replacing animal testing in safety evaluations. Therefore, the development of robust computational tools is crucial for regulatory applications. The performance of predictive models is heavily reliant on the quality and quantity of data. In this investigation, we amalgamated the most extensive dataset (4901 compounds) sourced from governmental GHS-compliant databases and literature to develop binary classification models of chemical ocular toxicity. We employed 12 molecular representations in conjunction with six machine learning algorithms and two deep learning algorithms to create a series of binary classification models. The findings indicated that the deep learning method GCN outperformed the machine learning models in cross-validation, achieving an impressive AUC of 0.915. However, the top-performing machine learning model (RF-Descriptor) demonstrated excellent performance with an AUC of 0.869 on the test set and was therefore selected as the best model. To enhance model interpretability, we conducted the SHAP method and attention weights analysis. The two approaches offered visual depictions of the relevance of key descriptors and substructures in predicting ocular toxicity of chemicals. Thus, we successfully struck a delicate balance between data quality and model interpretability, rendering our model valuable for predicting and comprehending potential ocular-toxic compounds in the early stages of drug discovery.
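As a reading aid for the AUC values quoted in this abstract (0.915 in cross-validation, 0.869 on the test set), the following minimal sketch computes the ROC AUC as the fraction of positive/negative pairs that a model ranks correctly, counting ties as half. The function name `roc_auc` and the toy labels and scores are illustrative assumptions, not code or data from the study.

```python
def roc_auc(labels, scores):
    """ROC AUC as the probability that a random positive outranks a random negative."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # pair ranked correctly
            elif p == n:
                wins += 0.5   # tie counts as half
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 1 = toxic, 0 = non-toxic, scores = predicted probabilities.
labels = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.4, 0.3, 0.2]
auc = roc_auc(labels, scores)
```

The pairwise form is quadratic in the number of compounds, which is acceptable for illustration; rank-based implementations give the same value more efficiently.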
Affiliation(s)
- Yiqing Zhou
- Shanghai Frontiers Science Center of Optogenetic Techniques for Cell Metabolism, Shanghai Key Laboratory of New Drug Design, School of Pharmacy, East China University of Science and Technology, Shanghai, China
| | - Ze Wang
- Shanghai Frontiers Science Center of Optogenetic Techniques for Cell Metabolism, Shanghai Key Laboratory of New Drug Design, School of Pharmacy, East China University of Science and Technology, Shanghai, China
| | - Zejun Huang
- Shanghai Frontiers Science Center of Optogenetic Techniques for Cell Metabolism, Shanghai Key Laboratory of New Drug Design, School of Pharmacy, East China University of Science and Technology, Shanghai, China
| | - Weihua Li
- Shanghai Frontiers Science Center of Optogenetic Techniques for Cell Metabolism, Shanghai Key Laboratory of New Drug Design, School of Pharmacy, East China University of Science and Technology, Shanghai, China
| | - Yuanting Chen
- Shanghai Frontiers Science Center of Optogenetic Techniques for Cell Metabolism, Shanghai Key Laboratory of New Drug Design, School of Pharmacy, East China University of Science and Technology, Shanghai, China
| | - Xinxin Yu
- Shanghai Frontiers Science Center of Optogenetic Techniques for Cell Metabolism, Shanghai Key Laboratory of New Drug Design, School of Pharmacy, East China University of Science and Technology, Shanghai, China
| | - Yun Tang
- Shanghai Frontiers Science Center of Optogenetic Techniques for Cell Metabolism, Shanghai Key Laboratory of New Drug Design, School of Pharmacy, East China University of Science and Technology, Shanghai, China
| | - Guixia Liu
- Shanghai Frontiers Science Center of Optogenetic Techniques for Cell Metabolism, Shanghai Key Laboratory of New Drug Design, School of Pharmacy, East China University of Science and Technology, Shanghai, China
|
13
|
Sirocchi C, Biancucci F, Donati M, Bogliolo A, Magnani M, Menotta M, Montagna S. Exploring machine learning for untargeted metabolomics using molecular fingerprints. Computer Methods and Programs in Biomedicine 2024; 250:108163. [PMID: 38626559 DOI: 10.1016/j.cmpb.2024.108163] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/18/2023] [Revised: 03/15/2024] [Accepted: 04/03/2024] [Indexed: 04/18/2024]
Abstract
BACKGROUND Metabolomics, the study of substrates and products of cellular metabolism, offers valuable insights into an organism's state under specific conditions and has the potential to revolutionise preventive healthcare and pharmaceutical research. However, analysing large metabolomics datasets remains challenging, with available methods relying on limited and incompletely annotated metabolic pathways. METHODS This study, inspired by well-established methods in drug discovery, employs machine learning on metabolite fingerprints to explore the relationship of their structure with responses in experimental conditions beyond known pathways, shedding light on metabolic processes. It evaluates fingerprinting effectiveness in representing metabolites, addressing challenges like class imbalance, data sparsity, high dimensionality, duplicate structural encoding, and interpretable features. Feature importance analysis is then applied to reveal key chemical configurations affecting classification, identifying related metabolite groups. RESULTS The approach is tested on two datasets: one on Ataxia Telangiectasia and another on endothelial cells under low oxygen. Machine learning on molecular fingerprints predicts metabolite responses effectively, and feature importance analysis aligns with known metabolic pathways, unveiling new affected metabolite groups for further study. CONCLUSION In conclusion, the presented approach leverages the strengths of drug discovery to address critical issues in metabolomics research and aims to bridge the gap between these two disciplines. This work lays the foundation for future research in this direction, possibly exploring alternative structural encodings and machine learning models.
Affiliation(s)
- Christel Sirocchi
- Department of Pure and Applied Sciences, University of Urbino, Piazza della Repubblica, 13, Urbino, 61029, Italy.
| | - Federica Biancucci
- Department of Biomolecular Sciences, University of Urbino, Via Saffi 2, Urbino, 61029, Italy
| | - Matteo Donati
- Department of Pure and Applied Sciences, University of Urbino, Piazza della Repubblica, 13, Urbino, 61029, Italy
| | - Alessandro Bogliolo
- Department of Pure and Applied Sciences, University of Urbino, Piazza della Repubblica, 13, Urbino, 61029, Italy
| | - Mauro Magnani
- Department of Biomolecular Sciences, University of Urbino, Via Saffi 2, Urbino, 61029, Italy
| | - Michele Menotta
- Department of Biomolecular Sciences, University of Urbino, Via Saffi 2, Urbino, 61029, Italy
| | - Sara Montagna
- Department of Pure and Applied Sciences, University of Urbino, Piazza della Repubblica, 13, Urbino, 61029, Italy
|
14
|
Burger PB, Hu X, Balabin I, Muller M, Stanley M, Joubert F, Kaiser TM. FEP Augmentation as a Means to Solve Data Paucity Problems for Machine Learning in Chemical Biology. J Chem Inf Model 2024; 64:3812-3825. [PMID: 38651738 PMCID: PMC11094716 DOI: 10.1021/acs.jcim.4c00071] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/12/2024] [Revised: 04/01/2024] [Accepted: 04/02/2024] [Indexed: 04/25/2024]
Abstract
In the realm of medicinal chemistry, the primary objective is to swiftly optimize a multitude of chemical properties of a set of compounds to yield a clinical candidate poised for clinical trials. In recent years, two computational techniques, machine learning (ML) and physics-based methods, have evolved substantially and are now frequently incorporated into the medicinal chemist's toolbox to enhance the efficiency of both hit optimization and candidate design. Both computational methods come with their own set of limitations, and they are often used independently of each other. ML's capability to screen extensive compound libraries expediently is tempered by its reliance on quality data, which can be scarce especially during early-stage optimization. Contrarily, physics-based approaches like free energy perturbation (FEP) are frequently constrained by low throughput and high cost by comparison; however, physics-based methods are capable of making highly accurate binding affinity predictions. In this study, we harnessed the strength of FEP to overcome data paucity in ML by generating virtual activity data sets which then inform the training of algorithms. Here, we show that ML algorithms trained with an FEP-augmented data set could achieve comparable predictive accuracy to data sets trained on experimental data from biological assays. Throughout the paper, we emphasize key mechanistic considerations that must be taken into account when aiming to augment data sets and lay the groundwork for successful implementation. Ultimately, the study advocates for the synergy of physics-based methods and ML to expedite the lead optimization process. We believe that the physics-based augmentation of ML will significantly benefit drug discovery, as these techniques continue to evolve.
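The augmentation idea in this abstract can be illustrated with a minimal sketch. It assumes, purely for illustration, that FEP yields absolute binding free energies that are converted to pKd through the standard relation ΔG = RT ln(Kd) and then pooled with measured values; the abstract does not state how the authors built their virtual activity data sets. The names `dg_to_pkd` and `augment`, the compound identifiers, and all numbers are hypothetical.

```python
import math

R_KCAL = 0.0019872041  # gas constant in kcal/(mol*K)


def dg_to_pkd(dg_kcal_per_mol, temperature_k=298.15):
    """Convert a binding free energy to pKd using dG = RT ln(Kd), 1 M standard state."""
    return -dg_kcal_per_mol / (R_KCAL * temperature_k * math.log(10))


def augment(experimental, fep_predictions):
    """Pool measured pKd records with FEP-derived virtual records, tagging each source."""
    merged = [{"id": cid, "pkd": pkd, "source": "assay"}
              for cid, pkd in experimental.items()]
    for cid, dg in fep_predictions.items():
        if cid not in experimental:  # assumption: measured values take precedence
            merged.append({"id": cid, "pkd": dg_to_pkd(dg), "source": "fep"})
    return merged


experimental = {"cpd-1": 7.2, "cpd-2": 6.1}        # hypothetical measured pKd values
fep_predictions = {"cpd-2": -8.0, "cpd-3": -9.0}   # hypothetical dG values in kcal/mol
training_set = augment(experimental, fep_predictions)
```

Keeping a `source` tag on each record makes it possible to weight or filter virtual data points later, which is one way to respect the mechanistic caveats the abstract mentions.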
Affiliation(s)
- Pieter B. Burger
- Avicenna Biosciences Inc., 101 W. Chapel Hill Street, Suite 210, Durham, North Carolina 27001, United States
| | - Xiaohu Hu
- Schrödinger, Inc., 120 West 45th Street, New York, New York 10036, United States
| | - Ilya Balabin
- Avicenna Biosciences Inc., 101 W. Chapel Hill Street, Suite 210, Durham, North Carolina 27001, United States
| | - Morné Muller
- Avicenna Biosciences Inc., 101 W. Chapel Hill Street, Suite 210, Durham, North Carolina 27001, United States
| | - Megan Stanley
- Microsoft Research AI4Science, 21 Station Road, Cambridge CB1 2FB, U.K.
| | - Fourie Joubert
- Centre for Bioinformatics and Computational Biology, Department of Biochemistry, Genetics and Microbiology, University of Pretoria, Pretoria 0001, South Africa
| | - Thomas M. Kaiser
- Avicenna Biosciences Inc., 101 W. Chapel Hill Street, Suite 210, Durham, North Carolina 27001, United States
|
15
|
Handa K, Yoshimura S, Kageyama M, Iijima T. Development of Novel Methods for QSAR Modeling by Machine Learning Repeatedly: A Case Study on Drug Distribution to Each Tissue. J Chem Inf Model 2024; 64:3662-3669. [PMID: 38639496 DOI: 10.1021/acs.jcim.4c00046] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/20/2024]
Abstract
Artificial intelligence is expected to help identify excellent candidates in drug discovery. However, we face a lack of data, as it is time-consuming and expensive to acquire complete raw data for many compounds. Hence, we tried to develop a novel quantitative structure-activity relationship (QSAR) method to predict a parameter more precisely from an incomplete data set by optimizing data handling and making use of predicted explanatory variables. As a case study, we focused on the tissue-to-plasma partition coefficient (Kp), which is an important parameter for understanding drug distribution in tissues and building physiologically based pharmacokinetic models, and which is representative of small and sparse data sets. In this study, we predicted the Kp values of 119 compounds in nine tissues (adipose, brain, gut, heart, kidney, liver, lung, muscle, and skin), although some of these values were not available. To fill the missing Kp values for each tissue, we first predicted them from the nonmissing data set using a random forest (RF) model with in vitro parameters (log P, fu, Drug Class, and fi), as in a classical QSAR prediction. Next, to predict the tissue-specific Kp values in a test data set, we constructed a second RF model that used as explanatory variables not only the in vitro parameters but also the Kp values of other tissues (i.e., other than the target tissue) predicted by the first RF model. Furthermore, we tested all possible combinations of explanatory variables and selected the model with the highest predictability on the test data set as the final model. The evaluation of Kp prediction accuracy based on the root-mean-square error and R² value revealed that the proposed models outperformed other machine learning methods such as the conventional RF and message-passing neural networks. Significant improvements were observed in the Kp values of adipose tissue, brain, kidney, liver, and skin.
These improvements indicated that the Kp information on other tissues can be used to predict the Kp of a specific tissue. Additionally, we found a novel relationship between tissues by evaluating all combinations of explanatory variables. In conclusion, we developed a novel RF model to predict Kp values. We hope that this method will soon be applied to various problems in experimental biology, where data sets often contain missing values.
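The two-stage data handling described in this abstract can be sketched as follows. This is only an outline of the data flow under stated assumptions: `mean_model` is a deliberately trivial placeholder standing in for the random forest, measured Kp values are kept and only the gaps are filled (the abstract does not say whether measured values were also replaced by predictions), and the tissue subset, descriptor values, and Kp numbers are invented for illustration.

```python
from statistics import mean

TISSUES = ["adipose", "brain", "liver"]  # subset of the nine tissues, for brevity


def mean_model(train_x, train_y):
    """Placeholder learner (NOT a random forest): always predicts the training mean."""
    mu = mean(train_y)
    return lambda x: mu


def fill_missing_kp(rows, fit=mean_model):
    """Stage 1: per tissue, fit on compounds with a measured Kp and predict the gaps."""
    filled = [dict(r, kp=dict(r["kp"])) for r in rows]  # copy so the input stays untouched
    for t in TISSUES:
        known = [r for r in rows if r["kp"].get(t) is not None]
        model = fit([r["x"] for r in known], [r["kp"][t] for r in known])
        for r in filled:
            if r["kp"].get(t) is None:
                r["kp"][t] = model(r["x"])
    return filled


def stage2_features(row, target):
    """Stage 2 inputs: in vitro descriptors plus Kp of every tissue except the target."""
    return list(row["x"]) + [row["kp"][t] for t in TISSUES if t != target]


# Hypothetical compounds: x = in vitro descriptors, kp = measured values (None = missing).
rows = [
    {"x": [2.1, 0.30], "kp": {"adipose": 4.0, "brain": 1.0, "liver": None}},
    {"x": [0.5, 0.80], "kp": {"adipose": 2.0, "brain": None, "liver": 3.0}},
    {"x": [3.3, 0.10], "kp": {"adipose": None, "brain": 3.0, "liver": 5.0}},
]
filled = fill_missing_kp(rows)
features = stage2_features(filled[0], target="brain")
```

Passing the learner in as `fit` keeps the data handling separate from the model, so a real regressor could be substituted without changing the two stages.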
Affiliation(s)
- Koichi Handa
- Toxicology & DMPK Research Department, Teijin Institute for Bio-medical Research, Teijin Pharma Limited, 4-3-2 Asahigaoka, Hino-shi, Tokyo 191-8512, Japan
| | - Saki Yoshimura
- Toxicology & DMPK Research Department, Teijin Institute for Bio-medical Research, Teijin Pharma Limited, 4-3-2 Asahigaoka, Hino-shi, Tokyo 191-8512, Japan
| | - Michiharu Kageyama
- Toxicology & DMPK Research Department, Teijin Institute for Bio-medical Research, Teijin Pharma Limited, 4-3-2 Asahigaoka, Hino-shi, Tokyo 191-8512, Japan
| | - Takeshi Iijima
- Toxicology & DMPK Research Department, Teijin Institute for Bio-medical Research, Teijin Pharma Limited, 4-3-2 Asahigaoka, Hino-shi, Tokyo 191-8512, Japan
|
16
|
Montero V, Montana M, Carré M, Vanelle P. Quinoxaline derivatives: Recent discoveries and development strategies towards anticancer agents. Eur J Med Chem 2024; 271:116360. [PMID: 38614060 DOI: 10.1016/j.ejmech.2024.116360] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/20/2023] [Revised: 03/19/2024] [Accepted: 03/25/2024] [Indexed: 04/15/2024]
Abstract
Cancer is a leading cause of death and a major health problem worldwide. While many effective anticancer agents are available, most drugs currently on the market are not specific, raising issues like the common side effects of chemotherapy. However, recent research holds promise for the development of more efficient and safer anticancer drugs. Quinoxaline and its derivatives are becoming recognized as a novel class of chemotherapeutic agents with activity against different tumors. The present review compiles and discusses studies on the anticancer therapeutic potential of quinoxaline derivatives, covering articles published between January 2018 and January 2023.
Affiliation(s)
- Vincent Montero
- Aix Marseille Univ, CNRS, ICR UMR 7273, Equipe Pharmaco-Chimie Radicalaire, Faculté de Pharmacie, CEDEX 05, 13385, Marseille, France; AP-HM, Service de Pharmacologie Clinique et Pharmacovigilance, Hôpital de la Timone, Marseille CEDEX 05, 13385, France.
| | - Marc Montana
- Aix Marseille Univ, CNRS, ICR UMR 7273, Equipe Pharmaco-Chimie Radicalaire, Faculté de Pharmacie, CEDEX 05, 13385, Marseille, France; AP-HM, Oncopharma, Hôpital Nord, Marseille, France
| | - Manon Carré
- Centre de Recherche en Cancérologie de Marseille (CRCM), Inserm UMR1068, CNRS UMR7258, Aix-Marseille Université UM105, Institut Paoli Calmettes - Faculté de Pharmacie, Marseille, France
| | - Patrice Vanelle
- Aix Marseille Univ, CNRS, ICR UMR 7273, Equipe Pharmaco-Chimie Radicalaire, Faculté de Pharmacie, CEDEX 05, 13385, Marseille, France; AP-HM, Service Central de la Qualité et de l'Information Pharmaceutiques, Hôpital Conception, Marseille, 13005, France
|
17
|
Morales N, Valdés-Muñoz E, González J, Valenzuela-Hormazábal P, Palma JM, Galarza C, Catagua-González Á, Yáñez O, Pereira A, Bustos D. Machine Learning-Driven Classification of Urease Inhibitors Leveraging Physicochemical Properties as Effective Filter Criteria. Int J Mol Sci 2024; 25:4303. [PMID: 38673888 PMCID: PMC11049951 DOI: 10.3390/ijms25084303] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/15/2024] [Revised: 04/03/2024] [Accepted: 04/08/2024] [Indexed: 04/28/2024]
Abstract
Urease, a pivotal enzyme in nitrogen metabolism, plays a crucial role in various microorganisms, including the pathogenic Helicobacter pylori. Inhibiting urease activity offers a promising approach to combating infections and associated ailments, such as chronic kidney diseases and gastric cancer. However, identifying potent urease inhibitors remains challenging due to resistance issues that hinder traditional approaches. Recently, machine learning (ML)-based models have demonstrated the ability to predict the bioactivity of molecules rapidly and effectively. In this study, we present ML models designed to predict urease inhibitors by leveraging essential physicochemical properties. The methodological approach involved constructing a dataset of urease inhibitors through an extensive literature search. Subsequently, these inhibitors were characterized based on physicochemical properties calculations. An exploratory data analysis was then conducted to identify and analyze critical features. Ultimately, 252 classification models were trained, utilizing a combination of seven ML algorithms, three attribute selection methods, and six different strategies for categorizing inhibitory activity. The investigation unveiled discernible trends distinguishing urease inhibitors from non-inhibitors. This differentiation enabled the identification of essential features that are crucial for precise classification. Through a comprehensive comparison of ML algorithms, tree-based methods like random forest, decision tree, and XGBoost exhibited superior performance. Additionally, incorporating the "chemical family type" attribute significantly enhanced model accuracy. Strategies involving a gray-zone categorization demonstrated marked improvements in predictive precision. This research underscores the transformative potential of ML in predicting urease inhibitors. 
The meticulous methodology outlined herein offers actionable insights for developing robust predictive models within biochemical systems.
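A gray-zone categorization of the kind mentioned in this abstract can be sketched as below. The sketch assumes that inhibitory activity is expressed as an IC50 in micromolar and that compounds between two cutoffs are excluded from training; the abstract does not give the activity measure or the thresholds, so `gray_zone_label`, both cutoff values, and the sample data are hypothetical.

```python
def gray_zone_label(ic50_um, active_below=10.0, inactive_above=50.0):
    """Return 1 (inhibitor), 0 (non-inhibitor), or None (gray zone, excluded).

    Both cutoffs are hypothetical micromolar values chosen only for illustration.
    """
    if active_below >= inactive_above:
        raise ValueError("active_below must be smaller than inactive_above")
    if ic50_um <= active_below:
        return 1
    if ic50_um >= inactive_above:
        return 0
    return None  # ambiguous potency: left out of the training set


ic50_values = [0.8, 12.0, 49.9, 75.0, 10.0]  # hypothetical IC50 values in micromolar
labels = [gray_zone_label(v) for v in ic50_values]
kept = [(v, y) for v, y in zip(ic50_values, labels) if y is not None]
```

Dropping the intermediate compounds trades data set size for cleaner class boundaries, which is consistent with the improved predictive precision the abstract reports for such strategies.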
Affiliation(s)
- Natalia Morales
- Magíster en Ciencias de la Computación, Universidad Católica del Maule, Talca 3460000, Chile
| | - Elizabeth Valdés-Muñoz
- Doctorado en Biotecnología Traslacional, Centro de Biotecnología de los Recursos Naturales, Universidad Católica del Maule, Talca 3480094, Chile
| | - Jaime González
- Magíster en Ciencias de la Computación, Universidad Católica del Maule, Talca 3460000, Chile
| | - Paulina Valenzuela-Hormazábal
- Departamento de Farmacología, Facultad de Ciencias Biológicas, Universidad de Concepción, Concepción 4030000, Chile
| | - Jonathan M. Palma
- Facultad de Ingeniería, Universidad de Talca, Curicó 3344158, Chile
| | - Christian Galarza
- Departamento de Matemáticas, Facultad de Ciencias Naturales y Matemáticas, Escuela Superior Politécnica del Litoral, Guayaquil EC090903, Ecuador
| | - Ángel Catagua-González
- Departamento de Matemáticas, Facultad de Ciencias Naturales y Matemáticas, Escuela Superior Politécnica del Litoral, Guayaquil EC090903, Ecuador
| | - Osvaldo Yáñez
- Núcleo de Investigación en Data Science, Facultad de Ingeniería y Negocios, Universidad de las Américas, Santiago 7500000, Chile
| | - Alfredo Pereira
- Facultad de Ingeniería, Arquitectura y Diseño, Universidad San Sebastián, Bellavista 7, Santiago 8420524, Chile
| | - Daniel Bustos
- Laboratorio de Bioinformática y Química Computacional, Departamento de Medicina Traslacional, Facultad de Medicina, Universidad Católica del Maule, Talca 3480094, Chile
|
18
|
Pore S, Banerjee A, Roy K. Application of machine learning-based read-across structure-property relationship (RASPR) as a new tool for predictive modelling: Prediction of power conversion efficiency (PCE) for selected classes of organic dyes in dye-sensitized solar cells (DSSCs). Mol Inform 2024; 43:e202300210. [PMID: 38374528 DOI: 10.1002/minf.202300210] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/14/2023] [Revised: 12/31/2023] [Accepted: 02/04/2024] [Indexed: 02/21/2024]
Abstract
The application of in silico approaches to predict material properties has been an effective alternative to experimental methods. Recently, the concepts of quantitative structure-property relationship (QSPR) and read-across (RA) methods were merged to develop a new chemoinformatic tool: read-across structure-property relationship (RASPR). The RASPR method is applicable to both large and small datasets, as it uses various similarity and error-based measures. It has also been observed that RASPR models tend to have an increased external predictivity compared to the corresponding QSPR models. In this study, we have modeled the power conversion efficiency (PCE) of organic dyes used in dye-sensitized solar cells (DSSCs) by using the quantitative RASPR (q-RASPR) method. We have used relatively large classes of organic dyes for modelling: phenothiazines (n=207), porphyrins (n=281), and triphenylamines (n=229). We have divided each of the datasets into training and test sets in 3 different combinations, and with the training sets we have developed three different QSPR models with structural and physicochemical descriptors and validated them with the corresponding test sets. These modeled descriptors were used to calculate the RASPR descriptors using a Java-based tool, RASAR Descriptor Calculator v2.0 (https://sites.google.com/jadavpuruniversity.in/dtc-lab-software/home), and then data fusion was performed by pooling the previously selected structural and physicochemical descriptors with the calculated RASPR descriptors. A further feature selection algorithm was then employed to develop the final RASPR PLS models.
Here, we also developed different machine learning (ML) models with the descriptors selected in the QSPR PLS and RASPR PLS models, and it was found that models with RASPR descriptors surpassed in external predictivity the models with only structural and physicochemical descriptors: RMSEP was reduced for phenothiazines from 1.16-1.25 to 1.07-1.18, for porphyrins from 1.60-1.79 to 1.45-1.53, and for triphenylamines from 1.27-1.54 to 1.20-1.47.
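For readers unfamiliar with the metric, RMSEP (root-mean-square error of prediction) is the root-mean-square difference between observed and predicted values on the external test set, so lower values mean better external predictivity. The following minimal sketch shows the calculation; the function name `rmsep` and the sample PCE values are illustrative assumptions and are not taken from the study.

```python
import math


def rmsep(observed, predicted):
    """Root-mean-square error of prediction over an external test set."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(y - y_hat) ** 2 for y, y_hat in zip(observed, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Hypothetical observed and predicted PCE values (%) for three test-set dyes.
observed = [1.0, 2.0, 3.0]
predicted = [1.0, 2.0, 5.0]
error = rmsep(observed, predicted)
```

Because the errors are squared before averaging, a single badly predicted dye raises RMSEP more than several small misses of the same total size.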
Affiliation(s)
- Souvik Pore
- Drug Theoretics and Chemoinformatics Laboratory, Department of Pharmaceutical Technology, Jadavpur University, 188 Raja S C Mullick Road, 700032, Kolkata, India
| | - Arkaprava Banerjee
- Drug Theoretics and Chemoinformatics Laboratory, Department of Pharmaceutical Technology, Jadavpur University, 188 Raja S C Mullick Road, 700032, Kolkata, India
| | - Kunal Roy
- Drug Theoretics and Chemoinformatics Laboratory, Department of Pharmaceutical Technology, Jadavpur University, 188 Raja S C Mullick Road, 700032, Kolkata, India
|
19
|
Das AP, Agarwal SM. Recent advances in the area of plant-based anti-cancer drug discovery using computational approaches. Mol Divers 2024; 28:901-925. [PMID: 36670282 PMCID: PMC9859751 DOI: 10.1007/s11030-022-10590-7] [Citation(s) in RCA: 5] [Impact Index Per Article: 5.0] [Received: 08/22/2022] [Accepted: 12/18/2022] [Indexed: 01/22/2023]
Abstract
Phytocompounds are a well-established source for drug discovery due to their unique chemical and functional diversity. In the area of cancer therapeutics, several phytocompounds have been used to date to design and develop new drugs. Pharmaceutical companies and researchers globally are keen to discover new anti-cancer leads, for which phytocompounds can be considered a valuable source. Simultaneously, in recent years, computational approaches such as virtual screening (VS), molecular dynamics (MD), pharmacophore modelling, quantitative structure-activity relationship (QSAR), absorption, distribution, metabolism, excretion, and toxicity (ADMET) prediction, network biology, and machine learning (ML) have gained importance due to their efficiency, speed, and cost-effectiveness. Therefore, the present review brings together the information on plant-based molecules identified for cancer lead discovery by in silico approaches. The aim of this review is to discuss studies published in the last 5-6 years that seek to identify phytomolecules as leads against cancer with the help of traditional computational approaches as well as newer techniques like network pharmacology and ML. This review also lists the databases and webservers available in the public domain for phytocompound-related information that can be harnessed for drug discovery. It is expected that the present review will be useful to pharmacologists, medicinal chemists, molecular biologists, and other researchers involved in the development of natural products (NPs) into clinically effective lead molecules.
Affiliation(s)
- Agneesh Pratim Das
  - Bioinformatics Division, ICMR-National Institute of Cancer Prevention and Research, I-7, Sector-39, Noida, Uttar Pradesh, 201301, India
- Subhash Mohan Agarwal
  - Bioinformatics Division, ICMR-National Institute of Cancer Prevention and Research, I-7, Sector-39, Noida, Uttar Pradesh, 201301, India
|
20
|
Srisongkram T, Tookkane D. Insights into the structure-activity relationship of pyrimidine-sulfonamide analogues for targeting BRAF V600E protein. Biophys Chem 2024; 307:107179. [PMID: 38241826 DOI: 10.1016/j.bpc.2024.107179] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/23/2023] [Revised: 01/08/2024] [Accepted: 01/11/2024] [Indexed: 01/21/2024]
Abstract
B-rapidly accelerated fibrosarcoma (BRAF) V600E plays a crucial role in the progression of cutaneous melanoma. Core structures of BRAF V600E inhibitors are based on pyrimidine-sulfonamide scaffolds. Exploring the QSAR of these structures can improve our understanding of BRAF V600E inhibitor drug design. This study utilized machine learning-based QSAR to elucidate chemical substructures of pyrimidine-sulfonamide analogues that correlated to the BRAF V600E inhibitory activity. The findings indicate that the support vector regression (SVR) combined with 15 fingerprints achieved the highest statistical performances in terms of goodness-of-fit, robustness, and predictability. Nine key fingerprints from pyrimidine-sulfonamide analogues were identified to exert the BRAF V600E inhibitory activity. These key fingerprints were validated using network-based activity cliff landscape and molecular docking. Together, the developed algorithm can serve as a screening tool for designing BRAF V600E inhibitors. To further utilize this model, we deployed our developed algorithm at https://qsarlabs.com/#braf.
Affiliation(s)
- Tarapong Srisongkram
  - Division of Pharmaceutical Chemistry, Faculty of Pharmaceutical Sciences, Khon Kaen University, 40002, Thailand
- Dheerapat Tookkane
  - Division of Pharmaceutical Chemistry, Faculty of Pharmaceutical Sciences, Khon Kaen University, 40002, Thailand
|
21
|
Velásquez-López Y, Ruiz-Escudero A, Arrasate S, González-Díaz H. Implementation of IFPTML Computational Models in Drug Discovery Against Flaviviridae Family. J Chem Inf Model 2024; 64:1841-1852. [PMID: 38466369 PMCID: PMC10966645 DOI: 10.1021/acs.jcim.3c01796] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/07/2023] [Revised: 02/26/2024] [Accepted: 02/27/2024] [Indexed: 03/13/2024]
Abstract
The Flaviviridae family consists of single-stranded positive-sense RNA viruses and contains the genera Flavivirus, Hepacivirus, Pegivirus, and Pestivirus. Currently, there is an outbreak of viral diseases caused by this family affecting millions of people worldwide, leading to significant morbidity and mortality rates. Advances in computational chemistry have greatly facilitated the discovery of novel drugs and treatments for diseases associated with this family. Chemoinformatic techniques, such as the perturbation theory machine learning method, have played a crucial role in developing new approaches based on ML models that can effectively aid drug discovery. The IFPTML models have shown their capability to handle, classify, and process large data sets with high specificity. The results obtained from different models indicate that this methodology is proficient in processing the data, resulting in a reduction of the false positive rate by 4.25%, along with an accuracy of 83% and reliability of 92%. These values suggest that the model can serve as a computational tool in assisting drug discovery efforts and the development of new treatments against Flaviviridae family diseases.
Affiliation(s)
- Yendrek Velásquez-López
  - Departamento de Química Orgánica e Inorgánica, Facultad de Ciencia y Tecnología, Universidad del País Vasco/Euskal Herriko Unibertsitatea UPV/EHU. Apdo. 644. 48080 Bilbao (Spain)
  - Bio-Cheminformatics Research Group, Universidad de Las Américas, Quito 170504, (Ecuador)
- Andrea Ruiz-Escudero
  - Department of Pharmacology, University of the Basque Country UPV/EHU, 48940 Leioa, (Spain)
  - IKERDATA S.L., ZITEK, University of Basque Country UPV/EHU, Rectorate Building, 48940 Leioa, Spain
- Sonia Arrasate
  - Departamento de Química Orgánica e Inorgánica, Facultad de Ciencia y Tecnología, Universidad del País Vasco/Euskal Herriko Unibertsitatea UPV/EHU. Apdo. 644. 48080 Bilbao (Spain)
- Humberto González-Díaz
  - Departamento de Química Orgánica e Inorgánica, Facultad de Ciencia y Tecnología, Universidad del País Vasco/Euskal Herriko Unibertsitatea UPV/EHU. Apdo. 644. 48080 Bilbao (Spain)
  - BIOFISIKA, Basque Center for Biophysics CSIC-UPV/EHU, 48940 Bilbao (Spain)
  - IKERBASQUE, Basque Foundation for Science, 48011 Bilbao (Spain)
|
22
|
Siddique F, Anwaar A, Bashir M, Nadeem S, Rawat R, Eyupoglu V, Afzal S, Bibi M, Bin Jardan YA, Bourhia M. Revisiting methotrexate and phototrexate Zinc15 library-based derivatives using deep learning in-silico drug design approach. Front Chem 2024; 12:1380266. [PMID: 38576849 PMCID: PMC10991842 DOI: 10.3389/fchem.2024.1380266] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/01/2024] [Accepted: 03/05/2024] [Indexed: 04/06/2024]
Abstract
Introduction: Cancer is the second most prevalent cause of mortality in the world, despite the availability of several medications for cancer treatment. Therefore, the cancer research community has emphasized computational techniques to speed up the discovery of novel anticancer drugs. Methods: In the current study, QSAR-based virtual screening was performed on the Zinc15 compound library (271 derivatives of methotrexate (MTX) and phototrexate (PTX)) to predict their inhibitory activity against dihydrofolate reductase (DHFR), a potential anticancer drug target. The deep learning-based ADMET parameters were employed to generate a 2D QSAR model using the multiple linear regression (MPL) methods with leave-one-out cross-validated (LOO-CV) Q2 and correlation coefficient R2 values as high as 0.77 and 0.81, respectively. Results: From the QSAR model and virtual screening analysis, the top hits (09, 27, 41, 68, 74, 85, 99, 180) exhibited pIC50 ranging from 5.85 to 7.20 with a minimum binding score of -11.6 to -11.0 kcal/mol and were subjected to further investigation. The ADMET attributes predicted using the message-passing neural network (MPNN) model demonstrated the potential of the selected hits as oral medications based on lipophilic profile Log P (0.19-2.69) and bioavailability (76.30% to 78.46%). The clinical toxicity score was 31.24% to 35.30%, with the lowest toxicity score (8.30%) observed for compound 180. DFT calculations were carried out to determine the stability, physicochemical parameters, and chemical reactivity of the selected compounds. The docking results were further validated by 100 ns molecular dynamics simulation analysis. Conclusion: The promising lead compounds compared favorably with the standard reference drugs MTX and PTX for anticancer activity and can lead to novel therapies after experimental validation. Furthermore, it is suggested that the inhibitory potential of the identified hits be examined via in vitro and in vivo approaches.
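The leave-one-out cross-validated Q2 reported in this abstract can be illustrated with a small sketch. It assumes a single-descriptor least-squares line and the common definition Q2 = 1 - PRESS / SS_tot with SS_tot taken about the overall mean; the function names and data are hypothetical, and the study's own model uses multiple descriptors.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def loo_q2(xs, ys):
    """Leave-one-out Q2: refit without each point, then predict that point."""
    press = 0.0
    for i in range(len(ys)):
        train_x = xs[:i] + xs[i + 1:]
        train_y = ys[:i] + ys[i + 1:]
        slope, intercept = fit_line(train_x, train_y)
        press += (ys[i] - (slope * xs[i] + intercept)) ** 2
    mean_y = sum(ys) / len(ys)
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1.0 - press / ss_tot


# Hypothetical descriptor values and activities; illustrative only.
descriptor = [1.0, 2.0, 3.0, 4.0, 5.0]
activity = [1.0, 2.1, 2.9, 4.2, 5.0]
print(round(loo_q2(descriptor, activity), 3))
```

Because each point is predicted by a model that never saw it, Q2 is typically lower than the fitted R2, which matches the ordering of the two values quoted in the abstract.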
Affiliation(s)
- Farhan Siddique
  - School of Pharmaceutical Science and Technology, Tianjin University, Tianjin, China
  - Department of Pharmaceutical Chemistry, Faculty of Pharmacy, Bahauddin Zakariya University, Multan, Pakistan
- Ahmar Anwaar
  - Faculty of Pharmacy, Bahauddin Zakariya University, Multan, Pakistan
- Maryam Bashir
  - Department of Pharmaceutical Chemistry, Faculty of Pharmacy, Bahauddin Zakariya University, Multan, Pakistan
  - Southern Punjab Institute of Health Sciences, Multan, Pakistan
- Sumaira Nadeem
  - Department of Pharmacy, The Women University, Multan, Pakistan
- Ravi Rawat
  - School of Health Sciences & Technology, UPES University, Dehradun, India
- Volkan Eyupoglu
  - Department of Chemistry, Cankırı Karatekin University, Cankırı, Türkiye
- Samina Afzal
  - Department of Pharmaceutical Chemistry, Faculty of Pharmacy, Bahauddin Zakariya University, Multan, Pakistan
- Mehvish Bibi
  - Department of Pharmaceutical Chemistry, Faculty of Pharmacy, Bahauddin Zakariya University, Multan, Pakistan
- Yousef A. Bin Jardan
  - Department of Pharmaceutics, College of Pharmacy, King Saud University, Riyadh, Saudi Arabia
- Mohammed Bourhia
  - Laboratory of Biotechnology and Natural Resources Valorization, Faculty of Sciences, Ibn Zohr University, Agadir, Morocco
|
23
|
Subong BJJ, Ozawa T. Bio-Chemoinformatics-Driven Analysis of nsp7 and nsp8 Mutations and Their Effects on Viral Replication Protein Complex Stability. Curr Issues Mol Biol 2024; 46:2598-2619. [PMID: 38534781 DOI: 10.3390/cimb46030165] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/20/2024] [Revised: 03/12/2024] [Accepted: 03/14/2024] [Indexed: 03/28/2024]
Abstract
The nonstructural proteins 7 and 8 (nsp7 and nsp8) of SARS-CoV-2 are highly important proteins involved in the RNA-dependent RNA polymerase (RdRp) replication complex. In this study, we analyzed the global mutations of nsp7 and nsp8 in 2022 and 2023 and their effects on the viral replication protein complex using bio-chemoinformatics. Frequently occurring variants are found to be single amino acid mutations for both nsp7 and nsp8. The most frequently occurring mutations for nsp7, which include L56F, L71F, S25L, M3I, D77N, V33I, and T83I, are predicted to cause destabilizing effects, whereas those in nsp8 are predicted to cause stabilizing effects, with the threonine to isoleucine mutation (T89I, T145I, T123I, T148I, T187I) being a frequent mutation. A conserved domain database analysis generated critical interaction residues for nsp7 (Lys-7, His-36, and Asn-37) and nsp8 (Lys-58, Pro-183, and Arg-190), which, according to thermodynamic calculations, are prone to destabilization. Based on a computational alanine scan, Trp-29 and Phe-49 of nsp7 and Trp-154, Tyr-135, and Phe-15 of nsp8 cause greater destabilizing effects on the protein complex, suggesting them as possible new target sites. This study provides an intensive analysis of the mutations of nsp7 and nsp8 and their possible implications for viral complex stability.
Affiliation(s)
- Bryan John J Subong
  - Department of Chemistry, School of Science, University of Tokyo, 7-3-1 Hongo, Bunkyo-ku, Tokyo 113-8654, Japan
- Takeaki Ozawa
  - Department of Chemistry, School of Science, University of Tokyo, 7-3-1 Hongo, Bunkyo-ku, Tokyo 113-8654, Japan
|
24
|
Gopal D, Muthuraj R, Balaya RDA, Kanekar S, Ahmed I, Chandrasekaran J. Computational discovery of novel FYN kinase inhibitors: a cheminformatics and machine learning-driven approach to targeted cancer and neurodegenerative therapy. Mol Divers 2024:10.1007/s11030-024-10819-7. [PMID: 38418686 DOI: 10.1007/s11030-024-10819-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/20/2023] [Accepted: 01/30/2024] [Indexed: 03/02/2024]
Abstract
In this study, we explored the potential of novel inhibitors for FYN kinase, a critical target in cancer and neurodegenerative disorders, by integrating advanced cheminformatics, machine learning, and molecular simulation techniques. Our approach involved analyzing key interactions for FYN inhibition using established multi-kinase inhibitors such as Staurosporine, Dasatinib, and Saracatinib. We utilized ECFP4 circular fingerprints and the t-SNE machine learning algorithm to compare molecular similarities between FDA-approved drugs and known clinical trial inhibitors. This led to the identification of potential inhibitors, including Afatinib, Copanlisib, and Vandetanib. Using the DrugSpaceX platform, we generated a vast library of 72,196 analogues from these leads, which after careful refinement, resulted in 6008 promising candidates. Subsequent clustering identified 48 analogues with significant similarity to known inhibitors. Notably, two candidates derived from Vandetanib, DE27123047 and DE27123035, exhibited strong docking affinities and stable binding in molecular dynamics simulations. These candidates showed high potential as effective FYN kinase inhibitors, as evidenced by MMGBSA calculations and MCE-18 scores exceeding 50. Additionally, our exploration into their molecular architecture revealed potential modification sites on the quinazolin-4-amine scaffold, suggesting opportunities for strategic alterations to enhance activity and optimize ADME properties. Our research is a pioneering effort in drug discovery, unveiling novel candidates for FYN inhibition and demonstrating the efficacy of a multi-layered computational strategy. The molecular insights gained provide a pathway for strategic refinements and future experimental validations, setting a new direction in targeted drug development against diseases involving FYN kinase.
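Fingerprint-based similarity comparison of the kind described in this abstract can be sketched with the Tanimoto coefficient on sets of "on" bit indices. The abstract does not state which similarity measure was used alongside t-SNE, so Tanimoto is an assumption here, and the bit sets and names below are made up; real ECFP4 bits would come from a cheminformatics toolkit.

```python
def tanimoto(bits_a, bits_b):
    """Tanimoto coefficient between two fingerprints given as on-bit index sets."""
    a, b = set(bits_a), set(bits_b)
    union = len(a | b)
    if union == 0:
        return 0.0  # convention chosen here for two empty fingerprints
    return len(a & b) / union


def rank_by_similarity(query_bits, library):
    """Sort library entries by decreasing Tanimoto similarity to the query."""
    scored = [(name, tanimoto(query_bits, bits)) for name, bits in library.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical on-bit sets standing in for circular fingerprints.
known_inhibitor = {3, 17, 42, 88, 120}
approved_drugs = {
    "drug_a": {3, 17, 42, 90},
    "drug_b": {5, 61, 200},
    "drug_c": {3, 17, 42, 88, 121},
}
for name, score in rank_by_similarity(known_inhibitor, approved_drugs):
    print(name, round(score, 2))
```

Ranking a library against a known inhibitor in this way is one simple route to shortlisting candidates before clustering or docking.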
Affiliation(s)
- Dhanushya Gopal
  - Department of Pharmacology, Sri Ramachandra Faculty of Pharmacy, Sri Ramachandra Institute of Higher Education and Research (Deemed to be University), Chennai, 600116, India
- Rajesh Muthuraj
  - Department of Pharmacology, Sri Ramachandra Faculty of Pharmacy, Sri Ramachandra Institute of Higher Education and Research (Deemed to be University), Chennai, 600116, India
- Saptami Kanekar
  - Centre for Integrative Omics Data Science, Yenepoya (Deemed to be University), Mangalore, Karnataka, India
- Iqrar Ahmed
  - Department of Pharmaceutical Chemistry, Prof. Ravindra Nikam College of Pharmacy, Dhule, India
  - Division of Computer Aided Drug Design, Department of Pharmaceutical Chemistry, R. C. Patel Institute of Pharmaceutical Education and Research, Shirpur, India
- Jaikanth Chandrasekaran
  - Department of Pharmacology, Sri Ramachandra Faculty of Pharmacy, Sri Ramachandra Institute of Higher Education and Research (Deemed to be University), Chennai, 600116, India
|
25
|
Tripathi T, Singh DB, Tripathi T. Computational resources and chemoinformatics for translational health research. Advances in Protein Chemistry and Structural Biology 2024; 139:27-55. [PMID: 38448138 DOI: 10.1016/bs.apcsb.2023.11.003] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/08/2024]
Abstract
The integration of computational resources and chemoinformatics has revolutionized translational health research. It has offered a powerful set of tools for accelerating drug discovery. This chapter overviews the computational resources and chemoinformatics methods used in translational health research. The resources and methods can be used to analyze large datasets, identify potential drug candidates, predict drug-target interactions, and optimize treatment regimens. These resources have the potential to transform the drug discovery process and foster personalized medicine research. We discuss insights into their various applications in translational health and emphasize the need for addressing challenges, promoting collaboration, and advancing the field to fully realize the potential of these tools in transforming healthcare.
Affiliation(s)
- Tripti Tripathi
  - Molecular and Structural Biophysics Laboratory, Department of Biochemistry, North-Eastern Hill University, Shillong, India
- Dev Bukhsh Singh
  - Department of Biotechnology, Siddharth University, Kapilvastu, Siddharth Nagar, India
- Timir Tripathi
  - Molecular and Structural Biophysics Laboratory, Department of Zoology, North-Eastern Hill University, Shillong, India
|
26
|
Tiwari SP, Shi W, Budhathoki S, Baker J, Sekizkardes AK, Zhu L, Kusuma VA, Hopkinson DP, Steckel JA. Creation of Polymer Datasets with Targeted Backbones for Screening of High-Performance Membranes for Gas Separation. J Chem Inf Model 2024; 64:638-652. [PMID: 38294781 DOI: 10.1021/acs.jcim.3c01232] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/01/2024]
Abstract
A simple approach was developed to computationally construct a polymer dataset by combining simplified molecular-input line-entry system (SMILES) strings of a targeted polymer backbone and a variety of molecular fragments. This method was used to create 14 polymer datasets by combining seven polymer backbones and molecules from two large molecular datasets (MOSES and QM9). Polymer backbones that were studied include four polydimethylsiloxane (PDMS) based backbones, poly(ethylene oxide) (PEO), poly(allyl glycidyl ether) (PAGE), and polyphosphazene (PPZ). The generated polymer datasets can be used for various cheminformatics tasks, including high-throughput screening for gas permeability and selectivity. This study utilized machine learning (ML) models to screen the polymers for CO2/CH4 and CO2/N2 gas separation using membranes. Several polymers of interest were identified. The results highlight that employing an ML model fitted to polymer selectivities leads to higher accuracy in predicting polymer selectivity compared to using the ratio of predicted permeabilities.
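The idea of combining a backbone SMILES string with molecular fragments can be sketched as plain string substitution. This is a minimal illustration under stated assumptions, not the authors' procedure: the `{R}` placeholder, the function name, and the PDMS-like template are hypothetical, each fragment is assumed to attach through its first atom, and the results should still be validated with a cheminformatics toolkit.

```python
def build_polymer_smiles(backbone_template, fragments, placeholder="{R}"):
    """Substitute each fragment SMILES into the backbone's attachment placeholder."""
    if placeholder not in backbone_template:
        raise ValueError("backbone template has no attachment placeholder")
    return [backbone_template.replace(placeholder, frag) for frag in fragments]


# Hypothetical PDMS-like repeat unit: [*] marks the chain ends and {R}
# marks the side-group position that receives a fragment.
pdms_like = "[*][Si](C)({R})O[*]"
side_groups = ["CC", "CCO", "c1ccccc1"]

for smiles in build_polymer_smiles(pdms_like, side_groups):
    print(smiles)
```

Crossing a handful of backbone templates with a large fragment library in this way yields one candidate dataset per backbone-library pair, consistent with the seven backbones and two molecular datasets giving 14 polymer datasets in the abstract.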
Affiliation(s)
- Surya Prakash Tiwari
  - National Energy Technology Laboratory, 626 Cochran Mill Road, Pittsburgh, Pennsylvania 15236, United States
  - NETL Support Contractor, 626 Cochran Mill Road, Pittsburgh, Pennsylvania 15236, United States
- Wei Shi
  - National Energy Technology Laboratory, 626 Cochran Mill Road, Pittsburgh, Pennsylvania 15236, United States
- Samir Budhathoki
  - National Energy Technology Laboratory, 626 Cochran Mill Road, Pittsburgh, Pennsylvania 15236, United States
  - NETL Support Contractor, 626 Cochran Mill Road, Pittsburgh, Pennsylvania 15236, United States
- James Baker
  - National Energy Technology Laboratory, 626 Cochran Mill Road, Pittsburgh, Pennsylvania 15236, United States
  - NETL Support Contractor, 626 Cochran Mill Road, Pittsburgh, Pennsylvania 15236, United States
- Ali K Sekizkardes
  - National Energy Technology Laboratory, 626 Cochran Mill Road, Pittsburgh, Pennsylvania 15236, United States
  - NETL Support Contractor, 626 Cochran Mill Road, Pittsburgh, Pennsylvania 15236, United States
- Lingxiang Zhu
  - National Energy Technology Laboratory, 626 Cochran Mill Road, Pittsburgh, Pennsylvania 15236, United States
  - NETL Support Contractor, 626 Cochran Mill Road, Pittsburgh, Pennsylvania 15236, United States
- Victor A Kusuma
  - National Energy Technology Laboratory, 626 Cochran Mill Road, Pittsburgh, Pennsylvania 15236, United States
  - NETL Support Contractor, 626 Cochran Mill Road, Pittsburgh, Pennsylvania 15236, United States
- David P Hopkinson
  - National Energy Technology Laboratory, 626 Cochran Mill Road, Pittsburgh, Pennsylvania 15236, United States
- Janice A Steckel
  - National Energy Technology Laboratory, 626 Cochran Mill Road, Pittsburgh, Pennsylvania 15236, United States
|
27
|
Chen D, Liu J, Wei GW. TopoFormer: Multiscale Topology-enabled Structure-to-Sequence Transformer for Protein-Ligand Interaction Predictions. Research Square 2024:rs.3.rs-3640878. [PMID: 38405777 PMCID: PMC10889053 DOI: 10.21203/rs.3.rs-3640878/v1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/27/2024]
Abstract
Pre-trained deep Transformers have had tremendous success in a wide variety of disciplines. However, in computational biology, essentially all Transformers are built upon biological sequences, which ignores vital stereochemical information and may result in crucial errors in downstream predictions. On the other hand, three-dimensional (3D) molecular structures are incompatible with the sequential architecture of Transformer and natural language processing (NLP) models in general. This work addresses this foundational challenge with a topological Transformer (TopoFormer). TopoFormer is built by integrating NLP and a multiscale topology technique, the persistent topological hyperdigraph Laplacian (PTHL), which systematically converts intricate 3D protein-ligand complexes at various spatial scales into an NLP-admissible sequence of topological invariants and homotopic shapes. Element-specific PTHLs are further developed to embed crucial physical, chemical, and biological interactions into topological sequences. TopoFormer surges ahead of conventional algorithms and recent deep learning variants and gives rise to exemplary scoring accuracy and superior performance in ranking, docking, and screening tasks in a number of benchmark datasets. The proposed topological sequences can be extracted from all kinds of structural data in data science to facilitate various NLP models, heralding a new era in AI-driven discovery.
Affiliation(s)
- Dong Chen
  - Department of Mathematics, Michigan State University, MI, 48824, USA
- Jian Liu
  - Department of Mathematics, Michigan State University, MI, 48824, USA
  - Mathematical Science Research Center, Chongqing University of Technology, Chongqing 400054, China
- Guo-Wei Wei
  - Department of Mathematics, Michigan State University, MI, 48824, USA
  - Department of Electrical and Computer Engineering, Michigan State University, MI 48824, USA
  - Department of Biochemistry and Molecular Biology, Michigan State University, MI 48824, USA
|
28
|
Meng W, Pan H, Sha Y, Zhai X, Xing A, Lingampelly SS, Sripathi SR, Wang Y, Li K. Metabolic Connectome and Its Role in the Prediction, Diagnosis, and Treatment of Complex Diseases. Metabolites 2024; 14:93. [PMID: 38392985 PMCID: PMC10890086 DOI: 10.3390/metabo14020093] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/15/2023] [Revised: 01/17/2024] [Accepted: 01/25/2024] [Indexed: 02/25/2024]
Abstract
The interconnectivity of advanced biological systems is essential for their proper functioning. In modern connectomics, biological entities such as proteins, genes, RNA, DNA, and metabolites are often represented as nodes, while the physical, biochemical, or functional interactions between them are represented as edges. Among these entities, metabolites are particularly significant as they exhibit a closer relationship to an organism's phenotype compared to genes or proteins. Moreover, the metabolome has the ability to amplify small proteomic and transcriptomic changes, even those from minor genomic changes. Metabolic networks, which consist of complex systems comprising hundreds of metabolites and their interactions, play a critical role in biological research by mediating energy conversion and chemical reactions within cells. This review provides an introduction to common metabolic network models and their construction methods. It also explores the diverse applications of metabolic networks in elucidating disease mechanisms, predicting and diagnosing diseases, and facilitating drug development. Additionally, it discusses potential future directions for research in metabolic networks. Ultimately, this review serves as a valuable reference for researchers interested in metabolic network modeling, analysis, and their applications.
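The node-and-edge representation described in this abstract can be sketched as an undirected adjacency map. The function names and the three-edge toy network below are illustrative assumptions, not a model from the review.

```python
from collections import defaultdict


def build_network(interactions):
    """Build an undirected graph: metabolite -> set of neighbouring metabolites."""
    graph = defaultdict(set)
    for a, b in interactions:
        graph[a].add(b)
        graph[b].add(a)
    return graph


def degrees(graph):
    """Number of distinct interaction partners for each node."""
    return {node: len(neighbours) for node, neighbours in graph.items()}


# Toy edges around glucose-6-phosphate; each edge stands for one conversion.
toy_interactions = [
    ("glucose", "glucose-6-phosphate"),
    ("glucose-6-phosphate", "fructose-6-phosphate"),
    ("glucose-6-phosphate", "6-phosphogluconolactone"),
]
network = build_network(toy_interactions)
hub = max(degrees(network).items(), key=lambda item: item[1])
print(hub)
```

Node degree is one of the simplest network statistics; highly connected metabolites such as the hub found here are natural starting points when looking for disease-relevant nodes.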
Affiliation(s)
- Weiyu Meng
  - Center for Artificial Intelligence Driven Drug Discovery, Faculty of Applied Sciences, Macao Polytechnic University, Macau SAR 999078, China
- Hongxin Pan
  - Center for Artificial Intelligence Driven Drug Discovery, Faculty of Applied Sciences, Macao Polytechnic University, Macau SAR 999078, China
- Yuyang Sha
  - Center for Artificial Intelligence Driven Drug Discovery, Faculty of Applied Sciences, Macao Polytechnic University, Macau SAR 999078, China
- Xiaobing Zhai
  - Center for Artificial Intelligence Driven Drug Discovery, Faculty of Applied Sciences, Macao Polytechnic University, Macau SAR 999078, China
- Abao Xing
  - Center for Artificial Intelligence Driven Drug Discovery, Faculty of Applied Sciences, Macao Polytechnic University, Macau SAR 999078, China
- Srinivasa R Sripathi
  - Henderson Ocular Stem Cell Laboratory, Retina Foundation of the Southwest, Dallas, TX 75231, USA
- Yuefei Wang
  - National Key Laboratory of Chinese Medicine Modernization, State Key Laboratory of Component-Based Chinese Medicine, Tianjin University of Traditional Chinese Medicine, Tianjin 301617, China
  - Haihe Laboratory of Modern Chinese Medicine, Tianjin 301617, China
- Kefeng Li
  - Center for Artificial Intelligence Driven Drug Discovery, Faculty of Applied Sciences, Macao Polytechnic University, Macau SAR 999078, China
|
29
|
Chowdhury J, Fricke C, Bamidele O, Bello M, Yang W, Heyden A, Terejanu G. Invariant Molecular Representations for Heterogeneous Catalysis. J Chem Inf Model 2024; 64:327-339. [PMID: 38197612 PMCID: PMC10806804 DOI: 10.1021/acs.jcim.3c00594] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/26/2023] [Revised: 12/25/2023] [Accepted: 12/28/2023] [Indexed: 01/11/2024]
Abstract
Catalyst screening is a critical step in the discovery and development of heterogeneous catalysts, which are vital for a wide range of chemical processes. In recent years, computational catalyst screening, primarily through density functional theory (DFT), has gained significant attention as a method for identifying promising catalysts. However, the computation of adsorption energies for all likely chemical intermediates present in complex surface chemistries is computationally intensive and costly due to the expensive nature of these calculations and the intrinsic idiosyncrasies of the methods or data sets used. This study introduces a novel machine learning (ML) method to learn adsorption energies from multiple DFT functionals by using invariant molecular representations (IMRs). To do this, we first extract molecular fingerprints for the reaction intermediates and then use a Siamese-neural-network-based training strategy to learn invariant molecular representations (IMRs) across all available functionals. Our Siamese network-based representations demonstrate superior performance in predicting adsorption energies compared with other molecular representations. Notably, when considering mean absolute values of adsorption energies as 0.43 eV (PBE-D3), 0.46 eV (BEEF-vdW), 0.81 eV (RPBE), and 0.37 eV (scan+rVV10), our IMR method has achieved the lowest mean absolute errors (MAEs) of 0.18, 0.10, 0.16, and 0.18 eV, respectively. These results emphasize the superior predictive capacity of our Siamese network-based representations. The empirical findings in this study illuminate the efficacy, robustness, and dependability of our proposed ML paradigm in predicting adsorption energies, specifically for propane dehydrogenation on a platinum catalyst surface.
Affiliation(s)
- Jawad Chowdhury
  - Department of Computer Science, University of North Carolina at Charlotte, Charlotte, North Carolina 28223, United States
- Charles Fricke
  - Department of Chemical Engineering, University of South Carolina, Columbia, South Carolina 29208, United States
- Olajide Bamidele
  - Department of Chemical Engineering, University of South Carolina, Columbia, South Carolina 29208, United States
- Mubarak Bello
  - Department of Chemical Engineering, University of South Carolina, Columbia, South Carolina 29208, United States
- Wenqiang Yang
  - Department of Chemical Engineering, University of South Carolina, Columbia, South Carolina 29208, United States
- Andreas Heyden
  - Department of Chemical Engineering, University of South Carolina, Columbia, South Carolina 29208, United States
- Gabriel Terejanu
  - Department of Computer Science, University of North Carolina at Charlotte, Charlotte, North Carolina 28223, United States
|
30
|
Yin Z, Chen Y, Hao Y, Pandiyan S, Shao J, Wang L. FOTF-CPI: A compound-protein interaction prediction transformer based on the fusion of optimal transport fragments. iScience 2024; 27:108756. [PMID: 38230261 PMCID: PMC10790010 DOI: 10.1016/j.isci.2023.108756] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/25/2023] [Revised: 11/05/2023] [Accepted: 12/13/2023] [Indexed: 01/18/2024]
Abstract
Compound-protein interaction (CPI) affinity prediction plays an important role in reducing the cost and time of drug discovery. However, the interpretability of how fragments function in CPI is impacted by the fact that current methods ignore the affinity relationships between fragments of compounds and fragments of proteins in CPI modeling. This article introduces an improved Transformer called FOTF-CPI (a Fusion of Optimal Transport Fragments compound-protein interaction prediction model). We use an optimal transport-based fragmentation approach to improve the model's understanding of compound and protein sequences. Additionally, a fused attention mechanism is employed, which combines the features of fragments to capture full affinity information. This fused attention redistributes higher attention scores to fragments with higher affinity. Experimental results show FOTF-CPI achieves an average 2% higher performance than other models on all three datasets. Furthermore, the visualization confirms the potential of FOTF-CPI for drug discovery applications.
Affiliation(s)
- Zeyu Yin
- School of Information Science and Technology, Nantong University, Nantong 226001, China
| | - Yu Chen
- School of Information Science and Technology, Nantong University, Nantong 226001, China
| | - Yajie Hao
- School of Information Science and Technology, Nantong University, Nantong 226001, China
| | - Sanjeevi Pandiyan
- Research Center for Intelligent Information Technology, Nantong University, Nantong 226001, China
| | - Jinsong Shao
- School of Information Science and Technology, Nantong University, Nantong 226001, China
| | - Li Wang
- School of Information Science and Technology, Nantong University, Nantong 226001, China
- Research Center for Intelligent Information Technology, Nantong University, Nantong 226001, China
| |
|
31
|
Baygi SF, Barupal DK. IDSL_MINT: a deep learning framework to predict molecular fingerprints from mass spectra. J Cheminform 2024; 16:8. [PMID: 38238779 PMCID: PMC10797927 DOI: 10.1186/s13321-024-00804-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/22/2023] [Accepted: 01/14/2024] [Indexed: 01/22/2024] Open
Abstract
The majority of tandem mass spectrometry (MS/MS) spectra in untargeted metabolomics and exposomics studies lack any annotation. Our deep learning framework, Integrated Data Science Laboratory for Metabolomics and Exposomics-Mass INTerpreter (IDSL_MINT), can translate MS/MS spectra into molecular fingerprint descriptors. IDSL_MINT allows users to leverage the power of the transformer model for mass spectrometry data, similar to the large language models. Models are trained on user-provided reference MS/MS libraries via any customizable molecular fingerprint descriptors. IDSL_MINT was benchmarked using the LipidMaps database and improved the annotation rate of a test study for MS/MS spectra that were not originally annotated using existing mass spectral libraries. IDSL_MINT may improve the overall annotation rates in untargeted metabolomics and exposomics studies. The IDSL_MINT framework and tutorials are available in the GitHub repository at https://github.com/idslme/IDSL_MINT. Scientific contribution statement: Structural annotation of MS/MS spectra from untargeted metabolomics and exposomics datasets is a major bottleneck in gaining new biological insights. Machine learning models to convert spectra into molecular fingerprints can help in the annotation process. Here, we present IDSL_MINT, a new, easy-to-use and customizable deep-learning framework to train and utilize new models to predict molecular fingerprints from spectra for the compound annotation workflows.
Affiliation(s)
- Sadjad Fakouri Baygi
- Department of Environmental Medicine and Public Health, Icahn School of Medicine at Mount Sinai, CAM Building, 3rd Floor, 17 E 102 St, New York, NY, 10029, USA
| | - Dinesh Kumar Barupal
- Department of Environmental Medicine and Public Health, Icahn School of Medicine at Mount Sinai, CAM Building, 3rd Floor, 17 E 102 St, New York, NY, 10029, USA.
| |
|
32
|
Chang H, Zhang Z, Tian J, Bai T, Xiao Z, Wang D, Qiao R, Li C. Machine Learning-Based Virtual Screening and Identification of the Fourth-Generation EGFR Inhibitors. ACS OMEGA 2024; 9:2314-2324. [PMID: 38250375 PMCID: PMC10795152 DOI: 10.1021/acsomega.3c06225] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/22/2023] [Revised: 11/06/2023] [Accepted: 11/15/2023] [Indexed: 01/23/2024]
Abstract
Epidermal growth factor receptor (EGFR) plays a pivotal regulatory role in treating patients with advanced non-small cell lung cancer (NSCLC). Following the emergence of the EGFR tertiary CIS C797S mutation, all types of inhibitors lose their inhibitory activity, necessitating the urgent development of new inhibitors. Computer systems employ machine learning methods to process substantial volumes of data and construct models that enable more accurate predictions of the outcomes of new inputs. The purpose of this article is to uncover innovative fourth-generation epidermal growth factor receptor tyrosine kinase inhibitors (EGFR-TKIs) with the aid of machine learning techniques. The paper's data set was high-dimensional and sparse, encompassing both structured and unstructured descriptors. To address this considerable challenge, we introduced a fusion framework to select critical molecule descriptors by integrating the full quadratic effect model and the Lasso model. Based on structural descriptors obtained from the full quadratic effect model, we conceived and synthesized a variety of small-molecule inhibitors. These inhibitors demonstrated potent inhibitory effects on the two mutated kinases L858R/T790M/C797S and Del19/T790M/C797S. Moreover, we applied our model to virtual screening, successfully identifying four hit compounds. We have evaluated the ADME characteristics of these hits and look forward to conducting activity evaluations on them in the future to discover a new generation of EGFR-TKI.
Affiliation(s)
- Hao Chang
- State Key Laboratory of Chemical Resource Engineering, Beijing University of Chemical Technology, Beijing 100029, P. R. China
| | - Zeyu Zhang
- School of Mathematics and Statistics, Beijing Institute of Technology, Beijing 100081, P. R. China
| | - Jiaxin Tian
- State Key Laboratory of Chemical Resource Engineering, Beijing University of Chemical Technology, Beijing 100029, P. R. China
| | - Tian Bai
- School of Mathematics and Statistics, Beijing Institute of Technology, Beijing 100081, P. R. China
| | - Zijie Xiao
- State Key Laboratory of Chemical Resource Engineering, Beijing University of Chemical Technology, Beijing 100029, P. R. China
| | - Dianpeng Wang
- School of Mathematics and Statistics, Beijing Institute of Technology, Beijing 100081, P. R. China
| | - Renzhong Qiao
- State Key Laboratory of Chemical Resource Engineering, Beijing University of Chemical Technology, Beijing 100029, P. R. China
| | - Chao Li
- State Key Laboratory of Chemical Resource Engineering, Beijing University of Chemical Technology, Beijing 100029, P. R. China
| |
|
33
|
Li Z, Huang R, Xia M, Patterson TA, Hong H. Fingerprinting Interactions between Proteins and Ligands for Facilitating Machine Learning in Drug Discovery. Biomolecules 2024; 14:72. [PMID: 38254672 PMCID: PMC10813698 DOI: 10.3390/biom14010072] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/02/2023] [Revised: 12/26/2023] [Accepted: 12/28/2023] [Indexed: 01/24/2024] Open
Abstract
Molecular recognition is fundamental in biology, underpinning intricate processes through specific protein-ligand interactions. This understanding is pivotal in drug discovery, yet traditional experimental methods face limitations in exploring the vast chemical space. Computational approaches, notably quantitative structure-activity/property relationship analysis, have gained prominence. Molecular fingerprints encode molecular structures and serve as property profiles, which are essential in drug discovery. While two-dimensional (2D) fingerprints are commonly used, three-dimensional (3D) structural interaction fingerprints offer enhanced structural features specific to target proteins. Machine learning models trained on interaction fingerprints enable precise binding prediction. Recent focus has shifted to structure-based predictive modeling, with machine-learning scoring functions excelling due to feature engineering guided by key interactions. Notably, 3D interaction fingerprints are gaining ground due to their robustness. Various structural interaction fingerprints have been developed and used in drug discovery, each with unique capabilities. This review recapitulates the developed structural interaction fingerprints and provides two case studies to illustrate the power of interaction fingerprint-driven machine learning. The first elucidates structure-activity relationships in β2 adrenoceptor ligands, demonstrating the ability to differentiate agonists and antagonists. The second employs a retrosynthesis-based pre-trained molecular representation to predict protein-ligand dissociation rates, offering insights into binding kinetics. Despite remarkable progress, challenges persist in interpreting complex machine learning models built on 3D fingerprints, emphasizing the need for strategies to make predictions interpretable. Binding site plasticity and induced fit effects pose additional complexities. 
Interaction fingerprints are promising but require continued research to harness their full potential.
Affiliation(s)
- Zoe Li
- National Center for Toxicological Research, US Food and Drug Administration, Jefferson, AR 72079, USA; (Z.L.); (T.A.P.)
| | - Ruili Huang
- National Center for Advancing Translational Sciences, National Institutes of Health, Bethesda, MD 20892, USA; (R.H.); (M.X.)
| | - Menghang Xia
- National Center for Advancing Translational Sciences, National Institutes of Health, Bethesda, MD 20892, USA; (R.H.); (M.X.)
| | - Tucker A. Patterson
- National Center for Toxicological Research, US Food and Drug Administration, Jefferson, AR 72079, USA; (Z.L.); (T.A.P.)
| | - Huixiao Hong
- National Center for Toxicological Research, US Food and Drug Administration, Jefferson, AR 72079, USA; (Z.L.); (T.A.P.)
| |
|
34
|
Faquetti ML, Slappendel L, Bigonne H, Grisoni F, Schneider P, Aichinger G, Schneider G, Sturla SJ, Burden AM. Baricitinib and tofacitinib off-target profile, with a focus on Alzheimer's disease. ALZHEIMER'S & DEMENTIA (NEW YORK, N. Y.) 2024; 10:e12445. [PMID: 38528988 PMCID: PMC10962475 DOI: 10.1002/trc2.12445] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/04/2023] [Revised: 11/10/2023] [Accepted: 12/27/2023] [Indexed: 03/27/2024]
Abstract
INTRODUCTION Janus kinase (JAK) inhibitors were recently identified as promising drug candidates for repurposing in Alzheimer's disease (AD) due to their capacity to suppress inflammation via modulation of JAK/STAT signaling pathways. Besides interaction with primary therapeutic targets, JAK inhibitor drugs frequently interact with unintended, often unknown, biological off-targets, leading to associated effects. Nevertheless, the relevance of JAK inhibitors' off-target interactions in the context of AD remains unclear. METHODS Putative off-targets of baricitinib and tofacitinib were predicted using a machine learning (ML) approach. After screening scientific literature, off-targets were filtered based on their relevance to AD. Targets that had not been previously identified as off-targets of baricitinib or tofacitinib were subsequently tested using biochemical or cell-based assays. From those, active concentrations were compared to bioavailable concentrations in the brain predicted by physiologically based pharmacokinetic (PBPK) modeling. RESULTS With the aid of ML and in vitro activity assays, we identified two enzymes previously unknown to be inhibited by baricitinib, namely casein kinase 2 subunit alpha 2 (CK2-α2) and dual leucine zipper kinase (MAP3K12), both with binding constant (Kd) values of 5.8 μM. Predicted maximum concentrations of baricitinib in brain tissue using PBPK modeling range from 1.3 to 23 nM, which is two to three orders of magnitude below the corresponding binding constant. CONCLUSION In this study, we extended the list of baricitinib off-targets that are potentially relevant for AD progression and predicted drug distribution in the brain. The results suggest a low likelihood of successful repurposing in AD due to low brain permeability, even at the maximum recommended daily dose. While additional research is needed to evaluate the potential impact of the off-target interaction on AD, the combined approach of ML-based target prediction, in vitro confirmation, and PBPK modeling may help prioritize drugs with a high likelihood of being effectively repurposed for AD. Highlights: This study explored JAK inhibitors' off-targets in AD using a multidisciplinary approach. We combined machine learning, in vitro tests, and PBPK modelling to predict and validate new off-target interactions of tofacitinib and baricitinib in AD. Previously unknown inhibition of two enzymes (CK2-α2 and MAP3K12) by baricitinib was confirmed using in vitro experiments. Our PBPK model indicates that baricitinib's low brain permeability limits AD repurposing. The proposed multidisciplinary approach optimizes drug repurposing efforts in AD research.
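The gap between the reported binding constant and the predicted brain concentrations can be checked with a short calculation. The sketch below uses only the figures quoted in the abstract (a binding constant of 5.8 μM and predicted maximum brain concentrations of 1.3 to 23 nM); the function and variable names are illustrative and do not come from the paper.

```python
import math

def orders_of_magnitude_below(kd_nm: float, conc_nm: float) -> float:
    """Return log10(Kd / concentration), i.e. how far the concentration sits below Kd."""
    return math.log10(kd_nm / conc_nm)

KD_NM = 5.8 * 1000            # 5.8 uM expressed in nM
BRAIN_CONC_NM = (1.3, 23.0)   # predicted range of maximum brain concentrations, in nM

# Smallest gap occurs at the highest predicted concentration, largest gap at the lowest.
gap_at_high_conc = orders_of_magnitude_below(KD_NM, BRAIN_CONC_NM[1])
gap_at_low_conc = orders_of_magnitude_below(KD_NM, BRAIN_CONC_NM[0])
print(f"{gap_at_high_conc:.1f} to {gap_at_low_conc:.1f} orders of magnitude")
```

With these inputs the gap is about 2.4 orders of magnitude at 23 nM and about 3.6 at 1.3 nM, so the abstract's "two to three orders of magnitude" reads as a rounded description of this range.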
Affiliation(s)
- Maria L. Faquetti
- Department of Chemistry and Applied Biosciences, Institute of Pharmaceutical Sciences, ETH Zurich, Zurich, Switzerland
| | - Laura Slappendel
- Department of Health Sciences and Technology, Institute of Food, Nutrition and Health, ETH Zurich, Zurich, Switzerland
| | - Hélène Bigonne
- Department of Health Sciences and Technology, Institute of Food, Nutrition and Health, ETH Zurich, Zurich, Switzerland
| | - Francesca Grisoni
- Department of Biomedical Engineering, Institute for Complex Molecular Systems, Eindhoven University of Technology, Eindhoven, the Netherlands
- Centre for Living Technologies, Alliance TU/e, WUR, UU, UMC Utrecht, Utrecht, the Netherlands
| | - Petra Schneider
- Department of Chemistry and Applied Biosciences, Institute of Pharmaceutical Sciences, ETH Zurich, Zurich, Switzerland
- inSili.com LLC, Zurich, Switzerland
| | - Georg Aichinger
- Department of Health Sciences and Technology, Institute of Food, Nutrition and Health, ETH Zurich, Zurich, Switzerland
| | - Gisbert Schneider
- Department of Chemistry and Applied Biosciences, Institute of Pharmaceutical Sciences, ETH Zurich, Zurich, Switzerland
- ETH Singapore SEC Ltd, Singapore, Singapore
| | - Shana J. Sturla
- Department of Health Sciences and Technology, Institute of Food, Nutrition and Health, ETH Zurich, Zurich, Switzerland
| | - Andrea M. Burden
- Department of Chemistry and Applied Biosciences, Institute of Pharmaceutical Sciences, ETH Zurich, Zurich, Switzerland
| |
|
35
|
Murali A, Panwar U, Singh SK. Exploring the Role of Chemoinformatics in Accelerating Drug Discovery: A Computational Approach. Methods Mol Biol 2024; 2714:203-213. [PMID: 37676601 DOI: 10.1007/978-1-0716-3441-7_12] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 09/08/2023]
Abstract
Cheminformatics is expected to become the preferred approach for handling large numbers of chemical datasets in drug discovery. It supports pharmaceutical development and the assessment of chemical compounds at a faster and more efficient rate. Additionally, as technological advancement impacts research, cheminformatics is being used more and more in the field of health science. This chapter describes the concepts of cheminformatics along with its involvement in drug discovery with a case study.
Affiliation(s)
- Aarthy Murali
- Computer Aided Drug Design and Molecular Modelling Lab, Department of Bioinformatics, Science Block, Alagappa University, Karaikudi, Tamil Nadu, India
| | - Umesh Panwar
- Computer Aided Drug Design and Molecular Modelling Lab, Department of Bioinformatics, Science Block, Alagappa University, Karaikudi, Tamil Nadu, India
| | - Sanjeev Kumar Singh
- Computer Aided Drug Design and Molecular Modelling Lab, Department of Bioinformatics, Science Block, Alagappa University, Karaikudi, Tamil Nadu, India
- Department of Data Sciences, Centre of Biomedical Research, SGPGIMS Campus, Lucknow, Uttar Pradesh, India
| |
|
36
|
Neal WM, Pandey P, Khan SI, Khan IA, Chittiboyina AG. Machine learning and traditional QSAR modeling methods: a case study of known PXR activators. J Biomol Struct Dyn 2024; 42:903-917. [PMID: 37059719 DOI: 10.1080/07391102.2023.2196701] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/25/2022] [Accepted: 03/22/2023] [Indexed: 04/16/2023]
Abstract
Pregnane X receptor (PXR), extensively expressed in human tissues related to digestion and metabolism, is responsible for recognizing and detoxifying diverse xenobiotics encountered by humans. To comprehend the promiscuous nature of PXR and its ability to bind a variety of ligands, computational approaches, viz., quantitative structure-activity relationship (QSAR) models, aid in the rapid dereplication of potential toxicological agents and mitigate the number of animals used to establish a meaningful regulatory decision. Recent advancements in machine learning techniques accommodating larger datasets are expected to aid in developing effective predictive models for complex mixtures (viz., dietary supplements) before undertaking in-depth experiments. Five hundred structurally diverse PXR ligands were used to develop traditional two-dimensional (2D) QSAR, machine-learning-based 2D-QSAR, field-based three-dimensional (3D) QSAR, and machine-learning-based 3D-QSAR models to establish the utility of predictive machine learning methods. Additionally, the applicability domain of the agonists was established to ensure the generation of robust QSAR models. A prediction set of dietary PXR agonists was used to externally-validate generated QSAR models. QSAR data analysis revealed that machine-learning 3D-QSAR techniques were more accurate in predicting the activity of external terpenes with an external validation squared correlation coefficient (R2) of 0.70 versus an R2 of 0.52 in machine-learning 2D-QSAR. Additionally, a visual summary of the binding pocket of PXR was assembled from the field 3D-QSAR models. By developing multiple QSAR models in this study, a robust groundwork for assessing PXR agonism from various chemical backbones has been established in anticipation of the identification of potential causative agents in complex mixtures.
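The external-validation figures above are described as squared correlation coefficients. As a minimal sketch of that metric, assuming it means the squared Pearson correlation between observed and predicted activities (the abstract does not say whether this or the 1 - SS_res/SS_tot form was used), the calculation looks like the following. The activity values are made up for illustration and are not data from the study.

```python
from statistics import mean
from typing import Sequence

def squared_pearson(observed: Sequence[float], predicted: Sequence[float]) -> float:
    """Squared Pearson correlation between observed and predicted values."""
    mean_obs, mean_pred = mean(observed), mean(predicted)
    cov = sum((o - mean_obs) * (p - mean_pred) for o, p in zip(observed, predicted))
    var_obs = sum((o - mean_obs) ** 2 for o in observed)
    var_pred = sum((p - mean_pred) ** 2 for p in predicted)
    return cov * cov / (var_obs * var_pred)

# Hypothetical activities for a small external prediction set.
observed_activity = [5.0, 6.0, 7.0, 8.0]
predicted_activity = [5.5, 5.9, 7.4, 7.6]

r2_external = squared_pearson(observed_activity, predicted_activity)
print(round(r2_external, 3))
```

Because this form only measures linear association, a model with a systematic offset can still score highly, which is one reason external-validation reports sometimes quote more than one statistic.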
Affiliation(s)
- William M Neal
- Division of Pharmacognosy, Department of BioMolecular Sciences, School of Pharmacy, The University of Mississippi, University, MS, USA
| | - Pankaj Pandey
- National Center for Natural Products Research, Research Institute of Pharmaceutical Sciences, School of Pharmacy, The University of Mississippi, University, MS, USA
| | - Shabana I Khan
- Division of Pharmacognosy, Department of BioMolecular Sciences, School of Pharmacy, The University of Mississippi, University, MS, USA
- National Center for Natural Products Research, Research Institute of Pharmaceutical Sciences, School of Pharmacy, The University of Mississippi, University, MS, USA
| | - Ikhlas A Khan
- Division of Pharmacognosy, Department of BioMolecular Sciences, School of Pharmacy, The University of Mississippi, University, MS, USA
- National Center for Natural Products Research, Research Institute of Pharmaceutical Sciences, School of Pharmacy, The University of Mississippi, University, MS, USA
| | - Amar G Chittiboyina
- National Center for Natural Products Research, Research Institute of Pharmaceutical Sciences, School of Pharmacy, The University of Mississippi, University, MS, USA
| |
|
37
|
Shen C, Luo J, Xia K. Molecular geometric deep learning. CELL REPORTS METHODS 2023; 3:100621. [PMID: 37875121 PMCID: PMC10694498 DOI: 10.1016/j.crmeth.2023.100621] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/19/2023] [Revised: 06/16/2023] [Accepted: 09/28/2023] [Indexed: 10/26/2023]
Abstract
Molecular representation learning plays an important role in molecular property prediction. Existing molecular property prediction models rely on the de facto standard of covalent-bond-based molecular graphs for representing molecular topology at the atomic level and totally ignore the non-covalent interactions within the molecule. In this study, we propose a molecular geometric deep learning model to predict the properties of molecules that aims to comprehensively consider the information of covalent and non-covalent interactions of molecules. The essential idea is to incorporate a more general molecular representation into geometric deep learning (GDL) models. We systematically test molecular GDL (Mol-GDL) on fourteen commonly used benchmark datasets. The results show that Mol-GDL can achieve a better performance than state-of-the-art (SOTA) methods. Extensive tests have demonstrated the important role of non-covalent interactions in molecular property prediction and the effectiveness of Mol-GDL models.
Affiliation(s)
- Cong Shen
- College of Computer Science and Electronic Engineering, Hunan University, Changsha 410000, China; School of Physical and Mathematical Sciences, Nanyang Technological University, Singapore 637371, Singapore
| | - Jiawei Luo
- College of Computer Science and Electronic Engineering, Hunan University, Changsha 410000, China.
| | - Kelin Xia
- School of Physical and Mathematical Sciences, Nanyang Technological University, Singapore 637371, Singapore.
| |
|
38
|
Hou Y, Bai Y, Lu C, Wang Q, Wang Z, Gao J, Xu H. Applying molecular docking to pesticides. PEST MANAGEMENT SCIENCE 2023; 79:4140-4152. [PMID: 37547967 DOI: 10.1002/ps.7700] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/30/2023] [Revised: 07/17/2023] [Accepted: 08/05/2023] [Indexed: 08/08/2023]
Abstract
Pesticide creation is related to the development of sustainable agricultural and ecological safety, and molecular docking technology can effectively help in pesticide innovation. This paper introduces the basic theory behind molecular docking, pesticide databases, and docking software. It also summarizes the application of molecular docking in the pesticide field, including the virtual screening of lead compounds, detection of pesticides and their metabolites in the environment, reverse screening of pesticide targets, and the study of resistance mechanisms. Finally, problems with the use of molecular docking technology in pesticide creation are discussed, along with prospects for its future use in new pesticide development. © 2023 Society of Chemical Industry.
Affiliation(s)
- Yang Hou
- Engineering Research Center of Pesticide of Heilongjiang Province, College of Advanced Agriculture and Ecological Environment, Heilongjiang University, Harbin, China
| | - Yuqian Bai
- Engineering Research Center of Pesticide of Heilongjiang Province, College of Advanced Agriculture and Ecological Environment, Heilongjiang University, Harbin, China
| | - Chang Lu
- Engineering Research Center of Pesticide of Heilongjiang Province, College of Advanced Agriculture and Ecological Environment, Heilongjiang University, Harbin, China
| | - Qiuchan Wang
- Engineering Research Center of Pesticide of Heilongjiang Province, College of Advanced Agriculture and Ecological Environment, Heilongjiang University, Harbin, China
| | - Zishi Wang
- Engineering Research Center of Pesticide of Heilongjiang Province, College of Advanced Agriculture and Ecological Environment, Heilongjiang University, Harbin, China
| | - Jinsheng Gao
- Engineering Research Center of Pesticide of Heilongjiang Province, College of Advanced Agriculture and Ecological Environment, Heilongjiang University, Harbin, China
| | - Hongliang Xu
- Engineering Research Center of Pesticide of Heilongjiang Province, College of Advanced Agriculture and Ecological Environment, Heilongjiang University, Harbin, China
| |
|
39
|
Bernardi A, Bennett WFD, He S, Jones D, Kirshner D, Bennion BJ, Carpenter TS. Advances in Computational Approaches for Estimating Passive Permeability in Drug Discovery. MEMBRANES 2023; 13:851. [PMID: 37999336 PMCID: PMC10673305 DOI: 10.3390/membranes13110851] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/20/2023] [Revised: 10/19/2023] [Accepted: 10/21/2023] [Indexed: 11/25/2023]
Abstract
Passive permeation of cellular membranes is a key feature of many therapeutics. The relevance of passive permeability spans all biological systems as they all employ biomembranes for compartmentalization. A variety of computational techniques are currently utilized and under active development to facilitate the characterization of passive permeability. These methods include lipophilicity relations, molecular dynamics simulations, and machine learning, which vary in accuracy, complexity, and computational cost. This review briefly introduces the underlying theories, such as the prominent inhomogeneous solubility diffusion model, and covers a number of recent applications. Various machine-learning applications, which have demonstrated good potential for high-volume, data-driven permeability predictions, are also discussed. Due to the confluence of novel computational methods and next-generation exascale computers, we anticipate an exciting future for computationally driven permeability predictions.
Affiliation(s)
- Timothy S. Carpenter
- Lawrence Livermore National Laboratory, Livermore, CA 94550, USA; (A.B.); (W.F.D.B.); (S.H.); (D.J.); (D.K.); (B.J.B.)
| |
|
40
|
Yin X, Wang X, Li Y, Wang J, Wang Y, Deng Y, Hou T, Liu H, Luo P, Yao X. CODD-Pred: A Web Server for Efficient Target Identification and Bioactivity Prediction of Small Molecules. J Chem Inf Model 2023; 63:6169-6176. [PMID: 37820365 DOI: 10.1021/acs.jcim.3c00685] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/13/2023]
Abstract
Target identification and bioactivity prediction are critical steps in the drug discovery process. Here we introduce CODD-Pred (COmprehensive Drug Design Predictor), an online web server with well-curated data sets from the GOSTAR database, which is designed with a dual purpose of predicting potential protein drug targets and computing bioactivity values of small molecules. We first designed a double molecular graph perception (DMGP) framework for target prediction based on a large library of 646 498 small molecules interacting with 640 human targets. The framework achieved a top-5 accuracy of over 80% for hitting at least one target on both external validation sets. Additionally, its performance on the external validation set comprising 200 molecules surpassed that of four existing target prediction servers. Second, we collected 56 targets closely related to the occurrence and development of cancer, metabolic diseases, and inflammatory immune diseases and developed a multi-model self-validation activity prediction (MSAP) framework that enables accurate bioactivity quantification predictions for small-molecule ligands of these 56 targets. CODD-Pred is a handy tool for rapid evaluation and optimization of small molecules with specific target activity. CODD-Pred is freely accessible at http://codd.iddd.group/.
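The "top-5 accuracy for hitting at least one target" reported above can be read as the fraction of molecules for which any known target appears among the five highest-ranked predictions. The sketch below illustrates that reading on toy data; the function name, molecule labels, and target labels are hypothetical, and the exact evaluation protocol used for CODD-Pred is an assumption here.

```python
from typing import Dict, List, Set

def top_k_hit_rate(ranked: Dict[str, List[str]], truth: Dict[str, Set[str]], k: int = 5) -> float:
    """Fraction of molecules with at least one true target among the k top-ranked predictions."""
    hits = sum(1 for mol, preds in ranked.items() if truth[mol] & set(preds[:k]))
    return hits / len(ranked)

# Made-up ranked target predictions (best first) for four molecules.
ranked_predictions = {
    "mol_a": ["T1", "T2", "T3", "T4", "T5", "T6"],
    "mol_b": ["T9", "T8", "T7", "T6", "T5", "T4"],
    "mol_c": ["T2", "T3", "T4", "T5", "T6", "T1"],
    "mol_d": ["T7", "T1", "T9", "T3", "T2", "T8"],
}
# Made-up known targets for the same molecules.
known_targets = {
    "mol_a": {"T3"},        # found at rank 3: hit
    "mol_b": {"T1", "T2"},  # neither is predicted: miss
    "mol_c": {"T1"},        # found only at rank 6: miss for k=5
    "mol_d": {"T9", "T4"},  # T9 found at rank 3: hit
}

score = top_k_hit_rate(ranked_predictions, known_targets, k=5)
print(score)  # 0.5
```

A molecule with several known targets counts as a hit as soon as one of them is recovered, so this metric is more lenient than per-target recall.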
Affiliation(s)
- Xiaodan Yin
- Dr. Neher's Biophysics Laboratory for Innovative Drug Discovery, State Key Laboratory of Quality Research in Chinese Medicine, Macau Institute for Applied Research in Medicine and Health, Macau University of Science and Technology, Macao, 999078, China
- Carbon-Silicon AI Technology Co., Ltd, Zhejiang, Hangzhou 310018, China
| | - Xiaorui Wang
- Dr. Neher's Biophysics Laboratory for Innovative Drug Discovery, State Key Laboratory of Quality Research in Chinese Medicine, Macau Institute for Applied Research in Medicine and Health, Macau University of Science and Technology, Macao, 999078, China
- Carbon-Silicon AI Technology Co., Ltd, Zhejiang, Hangzhou 310018, China
| | - Yuquan Li
- College of Chemistry and Chemical Engineering, Lanzhou University, Lanzhou, 730000, China
| | - Jike Wang
- College of Pharmaceutical Sciences and Cancer Center, Zhejiang University, Hangzhou, 310058, China
| | - Yuwei Wang
- College of Pharmacy, Shaanxi University of Chinese Medicine, Xianyang, 712000, China
| | - Yafeng Deng
- Carbon-Silicon AI Technology Co., Ltd, Zhejiang, Hangzhou 310018, China
| | - Tingjun Hou
- College of Pharmaceutical Sciences and Cancer Center, Zhejiang University, Hangzhou, 310058, China
| | - Huanxiang Liu
- Faculty of Applied Sciences, Macao Polytechnic University, Macao, 999078, China
| | - Pei Luo
- Dr. Neher's Biophysics Laboratory for Innovative Drug Discovery, State Key Laboratory of Quality Research in Chinese Medicine, Macau Institute for Applied Research in Medicine and Health, Macau University of Science and Technology, Macao, 999078, China
| | - Xiaojun Yao
- Faculty of Applied Sciences, Macao Polytechnic University, Macao, 999078, China
| |
|
41
|
Pratap Reddy Gajulapalli V. Development of Kinase-Centric Drugs: A Computational Perspective. ChemMedChem 2023; 18:e202200693. [PMID: 37442809 DOI: 10.1002/cmdc.202200693] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 12/20/2022] [Revised: 07/12/2023] [Accepted: 07/12/2023] [Indexed: 07/15/2023]
Abstract
Kinases are prominent drug targets in the pharmaceutical and research community due to their involvement in signal transduction, physiological responses, and upon dysregulation, in diseases such as cancer, neurological and autoimmune disorders. Several FDA-approved small-molecule drugs have been developed to combat human diseases since Gleevec was approved for the treatment of chronic myelogenous leukemia. Kinases were considered "undruggable" in the beginning. Several FDA-approved small-molecule drugs have become available in recent years. Most of these drugs target ATP-binding sites, but a few target allosteric sites. Among kinases that belong to the same family, the catalytic domain shows high structural and sequence conservation. Inhibitors of ATP-binding sites can cause off-target binding. Because members of the same family have similar sequences and structural patterns, often complex relationships between kinases and inhibitors are observed. To design and develop drugs with desired selectivity, it is essential to understand the target selectivity for kinase inhibitors. To create new inhibitors with the desired selectivity, several experimental methods have been designed to profile the kinase selectivity of small molecules. Experimental approaches are often expensive, laborious, time-consuming, and limited by the available kinases. Researchers have used computational methodologies to address these limitations in the design and development of effective therapeutics. Many computational methods have been developed over the last few decades, either to complement experimental findings or to forecast kinase inhibitor activity and selectivity. The purpose of this review is to provide insight into recent advances in theoretical/computational approaches for the design of new kinase inhibitors with the desired selectivity and optimization of existing inhibitors.
|
42
|
Ningthoujam SS, Nath R, Kityania S, Mazumder PB, Dutta Choudhury M, Talukdar AD, Nahar L, Sarker SD. R software for QSAR analysis in phytopharmacological studies. PHYTOCHEMICAL ANALYSIS : PCA 2023; 34:709-728. [PMID: 37392081 DOI: 10.1002/pca.3239] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/08/2023] [Revised: 05/11/2023] [Accepted: 05/11/2023] [Indexed: 07/02/2023]
Abstract
INTRODUCTION: In recent decades, quantitative structure-activity relationship (QSAR) analysis has become an important method for drug design and natural product research. With the availability of bioinformatic and cheminformatic tools, a vast number of descriptors have been generated, making it challenging to select potential independent variables that are accurately related to the dependent response variable. OBJECTIVE: The objective of this study is to demonstrate various descriptor selection procedures, such as the Boruta approach, all subsets regression, the ANOVA approach, the AIC method, stepwise regression, and genetic algorithm, that can be used in QSAR studies. Additionally, we performed regression diagnostics using R software to test parameters such as normality, linearity, residual histograms, PP plots, multicollinearity, and homoscedasticity. RESULTS: The workflow designed in this study highlights the different descriptor selection procedures and regression diagnostics that can be used in QSAR studies. The results showed that the Boruta approach and genetic algorithm performed better than other methods in selecting potential independent variables. The regression diagnostics parameters tested using R software, such as normality, linearity, residual histograms, PP plots, multicollinearity, and homoscedasticity, helped in identifying and diagnosing model errors, ensuring the reliability of the QSAR model. CONCLUSION: QSAR analysis is vital in drug design and natural product research. To develop a reliable QSAR model, it is essential to choose suitable descriptors and perform regression diagnostics. This study offers an accessible, customizable approach for researchers to select appropriate descriptors and diagnose errors in QSAR studies.
Affiliation(s)
- Rajat Nath
- Department of Life Science and Bioinformatics, Assam University, Silchar, Assam, India
- Sibashish Kityania
- Department of Life Science and Bioinformatics, Assam University, Silchar, Assam, India
- Anupam Das Talukdar
- Department of Life Science and Bioinformatics, Assam University, Silchar, Assam, India
- Lutfun Nahar
- Laboratory of Growth Regulators, Institute of Experimental Botany, The Czech Academy of Sciences and Palacký University, Olomouc, Czech Republic
- Satyajit D Sarker
- Centre for Natural Products Discovery (CNPD), School of Pharmacy and Biomolecular Sciences, Liverpool John Moores University, Liverpool, UK
|
43
|
Chen H, Bajorath J. Meta-learning for transformer-based prediction of potent compounds. Sci Rep 2023; 13:16145. [PMID: 37752164 PMCID: PMC10522638 DOI: 10.1038/s41598-023-43046-5] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 06/22/2023] [Accepted: 09/18/2023] [Indexed: 09/28/2023] Open
Abstract
For many machine learning applications in drug discovery, only limited amounts of training data are available. This typically applies to compound design and activity prediction and often restricts machine learning, especially deep learning. For low-data applications, specialized learning strategies can be considered to limit required training data. Among these is meta-learning that attempts to enable learning in low-data regimes by combining outputs of different models and utilizing meta-data from these predictions. However, in drug discovery settings, meta-learning is still in its infancy. In this study, we have explored meta-learning for the prediction of potent compounds via generative design using transformer models. For different activity classes, meta-learning models were derived to predict highly potent compounds from weakly potent templates in the presence of varying amounts of fine-tuning data and compared to other transformers developed for this task. Meta-learning consistently led to statistically significant improvements in model performance, in particular, when fine-tuning data were limited. Moreover, meta-learning models generated target compounds with higher potency and larger potency differences between templates and targets than other transformers, indicating their potential for low-data compound design.
Affiliation(s)
- Hengwei Chen
- Department of Life Science Informatics and Data Science, B-IT, Lamarr Institute for Machine Learning and Artificial Intelligence, LIMES Program Unit Chemical Biology and Medicinal Chemistry, Rheinische Friedrich-Wilhelms-Universität, Friedrich-Hirzebruch-Allee 5/6, 53115, Bonn, Germany
- Jürgen Bajorath
- Department of Life Science Informatics and Data Science, B-IT, Lamarr Institute for Machine Learning and Artificial Intelligence, LIMES Program Unit Chemical Biology and Medicinal Chemistry, Rheinische Friedrich-Wilhelms-Universität, Friedrich-Hirzebruch-Allee 5/6, 53115, Bonn, Germany
|
44
|
Williams AH, Zhan CG. Staying Ahead of the Game: How SARS-CoV-2 has Accelerated the Application of Machine Learning in Pandemic Management. BioDrugs 2023; 37:649-674. [PMID: 37464099 DOI: 10.1007/s40259-023-00611-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 05/28/2023] [Indexed: 07/20/2023]
Abstract
In recent years, machine learning (ML) techniques have garnered considerable interest for their potential use in accelerating the rate of drug discovery. With the emergence of the severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) pandemic, the utilization of ML has become even more crucial in the search for effective antiviral medications. The pandemic has presented the scientific community with a unique challenge, and the rapid identification of potential treatments has become an urgent priority. Researchers have been able to accelerate the process of identifying drug candidates, repurposing existing drugs, and designing new compounds with desirable properties using machine learning in drug discovery. To train predictive models, ML techniques in drug discovery rely on the analysis of large datasets, including both experimental and clinical data. These models can be used to predict the biological activities, potential side effects, and interactions with specific target proteins of drug candidates. This strategy has proven to be an effective method for identifying potential coronavirus disease 2019 (COVID-19) and other disease treatments. This paper offers a thorough analysis of the various ML techniques implemented to combat COVID-19, including supervised and unsupervised learning, deep learning, and natural language processing. The paper discusses the impact of these techniques on pandemic drug development, including the identification of potential treatments, the understanding of the disease mechanism, and the creation of effective and safe therapeutics. The lessons learned can be applied to future outbreaks and drug discovery initiatives.
Affiliation(s)
- Alexander H Williams
- Molecular Modeling and Biopharmaceutical Center, University of Kentucky, 789 South Limestone Street, Lexington, KY, 40536, USA
- Department of Pharmaceutical Sciences, College of Pharmacy, University of Kentucky, 789 South Limestone Street, Lexington, KY, 40536, USA
- GSK Upper Providence, 1250 S. Collegeville Road, Collegeville, PA, 19426, USA
- Chang-Guo Zhan
- Molecular Modeling and Biopharmaceutical Center, University of Kentucky, 789 South Limestone Street, Lexington, KY, 40536, USA
- Department of Pharmaceutical Sciences, College of Pharmacy, University of Kentucky, 789 South Limestone Street, Lexington, KY, 40536, USA
|
45
|
Küper A, Blanc-Durand P, Gafita A, Kersting D, Fendler WP, Seibold C, Moraitis A, Lückerath K, James ML, Seifert R. Is There a Role of Artificial Intelligence in Preclinical Imaging? Semin Nucl Med 2023; 53:687-693. [PMID: 37037684 DOI: 10.1053/j.semnuclmed.2023.03.003] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/21/2023] [Revised: 03/14/2023] [Accepted: 03/14/2023] [Indexed: 04/12/2023]
Abstract
This review provides an overview of the current opportunities for integrating artificial intelligence methods into the field of preclinical imaging research in nuclear medicine. The growing demand for imaging agents and therapeutics that are adapted to specific tumor phenotypes can be excellently served by the evolving multiple capabilities of molecular imaging and theranostics. However, the increasing demand for rapid development of novel, specific radioligands with minimal side effects that excel in diagnostic imaging and achieve significant therapeutic effects requires a challenging preclinical pipeline: from target identification through chemical, physical, and biological development to the conduct of clinical trials, coupled with dosimetry and various pre, interim, and post-treatment staging images to create a translational feedback loop for evaluating the efficacy of diagnostic or therapeutic ligands. In virtually all areas of this pipeline, the use of artificial intelligence and in particular deep-learning systems such as neural networks could not only address the above-mentioned challenges, but also provide insights that would not have been possible without their use. In the future, we expect that not only the clinical aspects of nuclear medicine will be supported by artificial intelligence, but that there will also be a general shift toward artificial intelligence-assisted in silico research that will address the increasingly complex nature of identifying targets for cancer patients and developing radioligands.
Affiliation(s)
- Alina Küper
- Department of Nuclear Medicine, University Hospital Essen; West German Cancer Center; German Cancer Consortium (DKTK), Essen, Germany
- Paul Blanc-Durand
- Department of Nuclear Medicine, Assistance Publique - Hôpitaux de Paris, Paris, France
- Andrei Gafita
- Division of Nuclear Medicine and Molecular Imaging, The Russell H. Morgan Department of Radiology and Radiological Science, Johns Hopkins University School of Medicine, Baltimore, MD
- David Kersting
- Department of Nuclear Medicine, University Hospital Essen; West German Cancer Center; German Cancer Consortium (DKTK), Essen, Germany
- Wolfgang P Fendler
- Department of Nuclear Medicine, University Hospital Essen; West German Cancer Center; German Cancer Consortium (DKTK), Essen, Germany
- Constantin Seibold
- Computer Vision for Human-Computer Interaction Lab, Karlsruhe Institute of Technology, Karlsruhe, Germany
- Alexandros Moraitis
- Department of Nuclear Medicine, University Hospital Essen; West German Cancer Center; German Cancer Consortium (DKTK), Essen, Germany
- Katharina Lückerath
- Department of Nuclear Medicine, University Hospital Essen; West German Cancer Center; German Cancer Consortium (DKTK), Essen, Germany
- Michelle L James
- Department of Radiology, Stanford University School of Medicine, Stanford, CA; Department of Neurology and Neurological Sciences, Stanford University School of Medicine, Stanford, CA
- Robert Seifert
- Department of Nuclear Medicine, University Hospital Essen; West German Cancer Center; German Cancer Consortium (DKTK), Essen, Germany
|
46
|
Guo S, Jiang J, Ren H, Wang S. Fusion of Multiple Spectra for Investigating Chemical Bonding Properties via Machine Learning. J Phys Chem Lett 2023; 14:7461-7468. [PMID: 37579021 DOI: 10.1021/acs.jpclett.3c01709] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Indexed: 08/16/2023]
Abstract
Chemical bonding properties are crucial to understanding the chemical behavior of molecules. Spectroscopy is a versatile technical tool to study various microscopic properties, but its interpretation suffers from human biases and the loss of high-dimensional information. Here, we present a machine learning approach to predict diverse bonding properties, including the bond dissociation energy (BDE), bond length, and α-C connectivity of hydroxyls in organic molecules, by fusing multiple spectra with different physical mechanisms. Combining nuclear magnetic resonance and vibrational spectroscopy yields higher prediction accuracy than either achieves separately. On the hold-out test data set, the models achieve mean absolute errors of 1.243 kcal/mol and 1.041 × 10⁻⁴ Å for BDE and bond length, respectively, and an accuracy of 95.09% for hydroxyl α-C connectivity. Our models demonstrate strong extrapolation capabilities when they are transferred to different molecules, external electric fields, and solvation environments. These end-to-end models pave the way to investigating chemical bonding properties by using spectroscopic observables.
Affiliation(s)
- Sibei Guo
- Key Laboratory of Precision and Intelligent Chemistry, School of Chemistry and Materials Science, University of Science and Technology of China, Hefei, Anhui 230026, China
- Jun Jiang
- Key Laboratory of Precision and Intelligent Chemistry, School of Chemistry and Materials Science, University of Science and Technology of China, Hefei, Anhui 230026, China
- Hefei National Laboratory, University of Science and Technology of China, Hefei, Anhui 230088, China
- Hao Ren
- School of Materials Science and Engineering, China University of Petroleum (East China), Qingdao, Shandong 266580, China
- Song Wang
- Key Laboratory of Precision and Intelligent Chemistry, School of Chemistry and Materials Science, University of Science and Technology of China, Hefei, Anhui 230026, China
|
47
|
Miao Y, Ma H, Huang J. Recent Advances in Toxicity Prediction: Applications of Deep Graph Learning. Chem Res Toxicol 2023; 36:1206-1226. [PMID: 37562046 DOI: 10.1021/acs.chemrestox.2c00384] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 08/12/2023]
Abstract
The development of new drugs is time-consuming and expensive, and as such, accurately predicting the potential toxicity of a drug candidate is crucial in ensuring its safety and efficacy. Recently, deep graph learning has become prevalent in this field due to its computational power and cost efficiency. Many novel deep graph learning methods aid toxicity prediction and thereby advance drug development. This review aims to connect fundamental knowledge with burgeoning deep graph learning methods. We first summarize the essential components of deep graph learning models for toxicity prediction, including molecular descriptors, molecular representations, evaluation metrics, validation methods, and data sets. Furthermore, based on various graph-related representations of molecules, we introduce several representative studies and methods for toxicity prediction from the perspective of GNN architectures and graph pretrained models. Compared to other types of models, deep graph models not only offer higher accuracy and efficiency but also provide more intuitive insights, which is significant for model interpretation and generalization ability. Graph pretrained models are emerging because they can extract prominent features from large-scale unlabeled molecular graph data and improve the performance of downstream toxicity prediction tasks. We hope this survey can serve as a handbook for individuals interested in exploring deep graph learning for toxicity prediction.
Affiliation(s)
- Yuwei Miao
- Department of Computer Science and Engineering, University of Texas at Arlington, Arlington, Texas 76019, United States
- Hehuan Ma
- Department of Computer Science and Engineering, University of Texas at Arlington, Arlington, Texas 76019, United States
- Junzhou Huang
- Department of Computer Science and Engineering, University of Texas at Arlington, Arlington, Texas 76019, United States
|
48
|
Lamens A, Bajorath J. Explaining Multiclass Compound Activity Predictions Using Counterfactuals and Shapley Values. Molecules 2023; 28:5601. [PMID: 37513472 PMCID: PMC10383571 DOI: 10.3390/molecules28145601] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 06/15/2023] [Revised: 07/18/2023] [Accepted: 07/21/2023] [Indexed: 07/30/2023] Open
Abstract
Most machine learning (ML) models produce black box predictions that are difficult, if not impossible, to understand. In pharmaceutical research, black box predictions work against the acceptance of ML models for guiding experimental work. Hence, there is increasing interest in approaches for explainable ML, which is a part of explainable artificial intelligence (XAI), to better understand prediction outcomes. Herein, we have devised a test system for the rationalization of multiclass compound activity prediction models that combines two approaches from XAI for feature relevance or importance analysis, including counterfactuals (CFs) and Shapley additive explanations (SHAP). For compounds with different single- and dual-target activities, we identified small compound modifications that induce feature changes inverting class label predictions. In combination with feature mapping, CFs and SHAP value calculations provide chemically intuitive explanations for model decisions.
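As a minimal sketch of the Shapley value concept that underlies SHAP (an illustration only, not the authors' test system), the following computes exact Shapley values by enumerating all feature subsets. The scoring function `toy_model`, its three features, and its weights are hypothetical and invented for this example; practical SHAP implementations approximate this sum for real models.

```python
from itertools import combinations
from math import factorial


def shapley_values(value_fn, n_features):
    """Exact Shapley values for a set function over feature indices 0..n-1."""
    players = range(n_features)
    phi = [0.0] * n_features
    for i in players:
        others = [j for j in players if j != i]
        for size in range(len(others) + 1):
            # Weight of a coalition of this size: |S|! (n - |S| - 1)! / n!
            weight = (
                factorial(size)
                * factorial(n_features - size - 1)
                / factorial(n_features)
            )
            for subset in combinations(others, size):
                s = frozenset(subset)
                # Marginal contribution of feature i when added to coalition s
                phi[i] += weight * (value_fn(s | {i}) - value_fn(s))
    return phi


def toy_model(present):
    """Hypothetical score based on which features are present."""
    score = 0.0
    if 0 in present:
        score += 2.0
    if 1 in present:
        score += 1.0
    if 0 in present and 2 in present:
        score += 1.0  # interaction term shared between features 0 and 2
    return score


phi = shapley_values(toy_model, 3)
print(phi)
```

The attributions come out at about 2.5, 1.0, and 0.5. They sum to the difference between the score with all features present and the score with none, and the interaction term is split evenly between the two features involved.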
Affiliation(s)
- Alec Lamens
- Department of Life Science Informatics, B-IT, LIMES Program, Unit Chemical Biology and Medicinal Chemistry, Rheinische Friedrich-Wilhelms-Universität, Friedrich-Hirzebruch-Allee 5/6, D-53115 Bonn, Germany
- Jürgen Bajorath
- Department of Life Science Informatics, B-IT, LIMES Program, Unit Chemical Biology and Medicinal Chemistry, Rheinische Friedrich-Wilhelms-Universität, Friedrich-Hirzebruch-Allee 5/6, D-53115 Bonn, Germany
|
49
|
Raslan MA, Raslan SA, Shehata EM, Mahmoud AS, Sabri NA. Advances in the Applications of Bioinformatics and Chemoinformatics. Pharmaceuticals (Basel) 2023; 16:1050. [PMID: 37513961 PMCID: PMC10384252 DOI: 10.3390/ph16071050] [Citation(s) in RCA: 5] [Impact Index Per Article: 5.0] [Received: 04/26/2023] [Revised: 07/19/2023] [Accepted: 07/20/2023] [Indexed: 07/30/2023] Open
Abstract
Chemoinformatics involves integrating the principles of physical chemistry with computer-based and information science methodologies, commonly referred to as "in silico techniques", in order to address a wide range of descriptive and prescriptive chemistry issues, including applications to biology, drug discovery, and related molecular areas. The incorporation of machine learning has been considered of high importance in the field of drug design, enabling the extraction of chemical data from enormous compound databases to develop drugs endowed with significant biological features. The present review discusses the field of cheminformatics and proposes the use of virtual chemical libraries in virtual screening methods to increase the probability of discovering novel hit chemicals. The virtual libraries address the need to increase the quality of the compounds as well as discover promising ones. The review also discusses various applications of bioinformatics in disease classification, diagnosis, and identification of multidrug-resistant organisms. The use of ensemble models and brute-force feature selection methodology has resulted in high accuracy rates for heart disease and COVID-19 diagnosis, along with the role of special formulations for targeting meningitis and Alzheimer's disease. Additionally, the review presents the correlation between genomic variations and disease states such as obesity and chronic progressive external ophthalmoplegia, the investigation of the antibacterial activity of pyrazole and benzimidazole-based compounds against resistant microorganisms, and applications of chemoinformatics to the prediction of drug properties and toxicity.
Affiliation(s)
- Amr S Mahmoud
- Department of Obstetrics and Gynecology, Faculty of Medicine, Ain Shams University, Cairo P.O. Box 11566, Egypt
- Nagwa A Sabri
- Department of Clinical Pharmacy, Faculty of Pharmacy, Ain Shams University, Cairo P.O. Box 11566, Egypt
|
50
|
Niazi SK, Mariam Z. Recent Advances in Machine-Learning-Based Chemoinformatics: A Comprehensive Review. Int J Mol Sci 2023; 24:11488. [PMID: 37511247 PMCID: PMC10380192 DOI: 10.3390/ijms241411488] [Citation(s) in RCA: 7] [Impact Index Per Article: 7.0] [Received: 06/09/2023] [Revised: 06/30/2023] [Accepted: 07/12/2023] [Indexed: 07/30/2023] Open
Abstract
In modern drug discovery, the combination of chemoinformatics and quantitative structure-activity relationship (QSAR) modeling has emerged as a formidable alliance, enabling researchers to harness the vast potential of machine learning (ML) techniques for predictive molecular design and analysis. This review delves into the fundamental aspects of chemoinformatics, elucidating the intricate nature of chemical data and the crucial role of molecular descriptors in unveiling the underlying molecular properties. Molecular descriptors, including 2D fingerprints and topological indices, in conjunction with structure-activity relationships (SARs), are pivotal in unlocking the pathway to small-molecule drug discovery. Technical intricacies of developing robust ML-QSAR models, including feature selection, model validation, and performance evaluation, are discussed. Various ML algorithms, such as regression analysis and support vector machines, are showcased for their ability to predict and comprehend the relationships between molecular structures and biological activities. This review serves as a comprehensive guide for researchers, providing an understanding of the synergy between chemoinformatics, QSAR, and ML. By embracing these cutting-edge technologies, predictive molecular analysis holds promise for expediting the discovery of novel therapeutic agents in the pharmaceutical sciences.
Affiliation(s)
- Sarfaraz K Niazi
- College of Pharmacy, University of Illinois, Chicago, IL 61820, USA
- Zamara Mariam
- School of Interdisciplinary Engineering & Sciences (SINES), National University of Sciences & Technology (NUST), Islamabad 24090, Pakistan
|