1
Gini GC. QSAR: Using the Past to Study the Present. Methods Mol Biol 2025; 2834:3-39. [PMID: 39312158 DOI: 10.1007/978-1-0716-4003-6_1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/15/2024]
Abstract
Quantitative structure-activity relationship (QSAR) modeling is a method for predicting the physical and biological properties of small molecules; it is in use in industry and public services. However, like any scientific method, it faces a growing number of demands, especially given its possible role in assessing the safety of new chemicals. To answer the question of whether QSAR, by exploiting available knowledge, can build new knowledge, the chapter reviews QSAR methods in search of a QSAR epistemology. QSAR stands on three pillars: biological data, chemical knowledge, and modeling algorithms. Usually the biological data, resulting from good experimental practice, are taken as a true picture of the world, and chemical knowledge has scientific bases; so if a QSAR model is not working, the modeling is blamed. The role of modeling in developing scientific theories and in producing knowledge is therefore analyzed. QSAR is a mature technology and is part of a large body of in silico and other computational methods. An active debate about the acceptability of QSAR models, how to communicate them, and what explanations to provide accompanies the development of today's QSAR models. An example on predicting possible endocrine-disrupting chemicals (EDC) shows the many faces of modern QSAR methods.
2
Liu C, Zong C, Chen S, Chu J, Yang Y, Pan Y, Yuan B, Zhang H. Machine learning-driven QSAR models for predicting the cytotoxicity of five common microplastics. Toxicology 2024; 508:153918. [PMID: 39137828 DOI: 10.1016/j.tox.2024.153918] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/05/2024] [Revised: 07/31/2024] [Accepted: 08/09/2024] [Indexed: 08/15/2024]
Abstract
In the field of microplastics (MPs) toxicity prediction, machine learning (ML) computer simulation techniques are showing great potential. In this study, six ML algorithms were used to build quantitative structure-activity relationship (QSAR) models that predict the toxicity of MPs to BEAS-2B cells. Among the six algorithms, the extreme gradient boosting model showed the best fit and prediction performance (training R² = 0.9876, test R² = 0.9286). Additionally, Williams plot analysis showed that the six models predicted stably within their applicability domains, with few outliers. Finally, three feature importance methods, Embedded Feature Importance (EFI), Recursive Feature Elimination (RFE), and SHapley Additive exPlanations (SHAP), consistently identified particle size as the most critical feature affecting toxicity prediction. The proposed QSAR model can be used for preliminary environmental exposure assessments of MPs and to better understand the associated health risks.
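The fit statistics quoted in this abstract are coefficients of determination, and a Williams plot marks a warning leverage beyond which predictions are treated as outside the applicability domain. The following is a minimal sketch of both quantities, not the authors' code: the function names and the sample values are made up for illustration, and the threshold h* = 3(p + 1)/n is the convention commonly used for Williams plots, assumed here rather than confirmed by the abstract.

```python
def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    if len(observed) != len(predicted) or len(observed) < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    if ss_tot == 0:
        raise ValueError("observed values are constant; R^2 is undefined")
    return 1.0 - ss_res / ss_tot


def leverage_threshold(n_samples, n_descriptors):
    """Warning leverage h* = 3(p + 1)/n, a common Williams-plot convention."""
    return 3.0 * (n_descriptors + 1) / n_samples


# Illustrative, made-up values (not data from the study).
observed = [0.20, 0.45, 0.60, 0.85]
predicted = [0.25, 0.40, 0.65, 0.80]
print(r_squared(observed, predicted))
print(leverage_threshold(n_samples=100, n_descriptors=4))
```

Computing the statistic separately on training and test sets gives the two values a study like this would report; a large gap between them is a common sign of overfitting.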
Affiliation(s)
- Chengzhi Liu
  - College of Safety Science and Engineering, Nanjing Tech University, Nanjing, Jiangsu 210009, China
- Cheng Zong
  - College of Safety Science and Engineering, Nanjing Tech University, Nanjing, Jiangsu 210009, China
- Shuang Chen
  - College of Safety Science and Engineering, Nanjing Tech University, Nanjing, Jiangsu 210009, China
- Jiangliang Chu
  - College of Safety Science and Engineering, Nanjing Tech University, Nanjing, Jiangsu 210009, China
- Yifan Yang
  - College of Safety Science and Engineering, Nanjing Tech University, Nanjing, Jiangsu 210009, China
- Yong Pan
  - College of Safety Science and Engineering, Nanjing Tech University, Nanjing, Jiangsu 210009, China
- Beilei Yuan
  - College of Safety Science and Engineering, Nanjing Tech University, Nanjing, Jiangsu 210009, China
- Huazhong Zhang
  - Department of Emergency Medicine, The First Affiliated Hospital with Nanjing Medical University, Nanjing, Jiangsu 210029, China; Institute of Poisoning, Nanjing Medical University, Nanjing 211100, China
3
Schwaebe B, He H, Glaubensklee C, Ogunseitan OA, Schoenung JM. Chemical hazard assessment toward safer electrolytes for lithium-ion batteries. Integr Environ Assess Manag 2024; 20:2231-2244. [PMID: 38837720 DOI: 10.1002/ieam.4963] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/02/2024] [Revised: 04/16/2024] [Accepted: 04/29/2024] [Indexed: 06/07/2024]
Abstract
Commercialization of rechargeable lithium-ion (Li-ion) batteries has revolutionized the design of portable electronic devices and is facilitating the current transition to electric vehicles. The technological specifications of Li-ion batteries continue to evolve through the introduction of various high-risk liquid electrolyte chemicals, yet critical evaluation of the physical, environmental, and human health hazards of these substances is lacking. Using the GreenScreen for Safer Chemicals approach, we conducted a chemical hazard assessment (CHA) of 103 electrolyte chemicals categorized into seven chemical groups: salts, carbonates, esters, ethers, sulfoxides-sulfites-sulfones, overcharge protection additives, and flame-retardant additives. To minimize data gaps, we focused on six toxicity and hazard data sources, including three empirical and three nonempirical predictive data sources. Furthermore, we investigated the structural similarities among selected electrolyte chemicals using the ChemMine tool and the simplified molecular input line entry system inputs from PubChem to evaluate whether chemicals with similar structures exhibit similar toxicity. The results demonstrate that salts, overcharge protection additives, and flame-retardant additives contain the most toxic components in the electrolyte solutions. Furthermore, carbonates, esters, and ethers account for most flammability hazards in Li-ion batteries. This study supports the complementary use of quantitative structure-activity relationship models to minimize data gaps and inconsistencies in CHA. Integr Environ Assess Manag 2024;20:2231-2244. © 2024 The Author(s). Integrated Environmental Assessment and Management published by Wiley Periodicals LLC on behalf of Society of Environmental Toxicology & Chemistry (SETAC).
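Structural similarity between two chemicals is often scored with the Tanimoto coefficient over sets of structural features. The sketch below illustrates that coefficient only; it is not the ChemMine implementation, and the fragment labels are hypothetical placeholders rather than descriptors derived from real SMILES strings.

```python
def tanimoto(features_a, features_b):
    """Tanimoto (Jaccard) coefficient: |A & B| / |A | B| on feature sets."""
    a, b = set(features_a), set(features_b)
    if not a and not b:
        return 0.0  # convention chosen for this sketch when both sets are empty
    return len(a & b) / len(a | b)


# Hypothetical fragment labels for two carbonate solvents (illustration only).
chemical_1 = {"C=O", "C-O-C", "ring5", "CH2-CH2"}
chemical_2 = {"C=O", "C-O-C", "ring5", "CH-CH3"}
print(tanimoto(chemical_1, chemical_2))
```

A score near 1 means the two feature sets largely overlap; whether similar scores imply similar toxicity is the empirical question the study examines.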
Affiliation(s)
- Branden Schwaebe
  - Department of Materials Science and Engineering, University of California, Irvine, California, USA
- Haoyang He
  - Department of Materials Science and Engineering, University of California, Irvine, California, USA
- Christopher Glaubensklee
  - Department of Materials Science and Engineering, University of California, Irvine, California, USA
- Oladele A Ogunseitan
  - Department of Population Health and Disease Prevention, University of California Irvine, Irvine, California, USA
  - World Institute for Sustainable Development of Materials (WISDOM), University of California, Irvine, California, USA
- Julie M Schoenung
  - Department of Materials Science and Engineering, University of California, Irvine, California, USA
  - World Institute for Sustainable Development of Materials (WISDOM), University of California, Irvine, California, USA
  - Department of Materials Science & Engineering, J. Mike Walker '66 Department of Mechanical Engineering, Texas A&M University, College Station, Texas, USA
4
Ancajas CMF, Oyedele AS, Butt CM, Walker AS. Advances, opportunities, and challenges in methods for interrogating the structure activity relationships of natural products. Nat Prod Rep 2024; 41:1543-1578. [PMID: 38912779 PMCID: PMC11484176 DOI: 10.1039/d4np00009a] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/27/2024] [Indexed: 06/25/2024]
Abstract
Time span in literature: 1985 to early 2024.
Natural products play a key role in drug discovery, both as a direct source of drugs and as a starting point for the development of synthetic compounds. Most natural products are not suitable to be used as drugs without further modification due to insufficient activity or poor pharmacokinetic properties. Choosing what modifications to make requires an understanding of the compound's structure-activity relationships. Use of structure-activity relationships is commonplace and essential in medicinal chemistry campaigns applied to human-designed synthetic compounds. Structure-activity relationships have also been used to improve the properties of natural products, but several challenges still limit these efforts. Here, we review methods for studying the structure-activity relationships of natural products and their limitations. Specifically, we will discuss how synthesis, including total synthesis, late-stage derivatization, chemoenzymatic synthetic pathways, and engineering and genome mining of biosynthetic pathways can be used to produce natural product analogs and discuss the challenges of each of these approaches. Finally, we will discuss computational methods including machine learning methods for analyzing the relationship between biosynthetic genes and product activity, computer aided drug design techniques, and interpretable artificial intelligence approaches towards elucidating structure-activity relationships from models trained to predict bioactivity from chemical structure. Our focus will be on these latter topics as their applications for natural products have not been extensively reviewed. We suggest that these methods are all complementary to each other, and that only collaborative efforts using a combination of these techniques will result in a full understanding of the structure-activity relationships of natural products.
Affiliation(s)
- Caitlin M Butt
  - Department of Chemistry, Vanderbilt University, Nashville, TN, USA
- Allison S Walker
  - Department of Chemistry, Vanderbilt University, Nashville, TN, USA
  - Department of Biological Sciences, Vanderbilt University, Nashville, TN, USA
  - Department of Pathology, Microbiology, and Immunology, Vanderbilt University Medical Center, Nashville, TN, USA
5
Srivastava P, Steuer A, Ferri F, Nicoli A, Schultz K, Bej S, Di Pizio A, Wolkenhauer O. Bitter peptide prediction using graph neural networks. J Cheminform 2024; 16:111. [PMID: 39375808 PMCID: PMC11459932 DOI: 10.1186/s13321-024-00909-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/31/2023] [Accepted: 09/22/2024] [Indexed: 10/09/2024] [Open Access]
Abstract
Bitter taste is an unpleasant taste modality that affects food consumption. Bitter peptides are generated during enzymatic processes that produce functional, bioactive protein hydrolysates or during the aging process of fermented products such as cheese, soybean protein, and wine. Understanding the underlying peptide sequences responsible for bitter taste can pave the way for more efficient identification of these peptides. This paper presents BitterPep-GCN, a feature-agnostic graph convolution network for bitter peptide prediction. The graph-based model learns the embedding of amino acids in the bitter peptide sequences and uses mixed pooling for bitter classification. BitterPep-GCN was benchmarked using BTP640, a publicly available bitter peptide dataset. The latent peptide embeddings generated by the trained model were used to analyze the activity of sequence motifs responsible for the bitter taste of the peptides. In particular, we calculated the activity for individual amino acids and dipeptide, tripeptide, and tetrapeptide sequence motifs present in the peptides. Our analyses pinpoint specific amino acids, such as F, G, P, and R, as well as sequence motifs, notably tripeptide and tetrapeptide motifs containing FF, as key bitter signatures in peptides. This work not only provides a new predictor of bitter taste for a more efficient identification of bitter peptides in various food products but also gives a hint into the molecular basis of bitterness.
Scientific Contribution
Our work provides the first application of Graph Neural Networks for the prediction of peptide bitter taste. The best-developed model, BitterPep-GCN, learns the embedding of amino acids in the bitter peptide sequences and uses mixed pooling for bitter classification. The embeddings were used to analyze the sequence motifs responsible for the bitter taste.
Affiliation(s)
- Prashant Srivastava
  - Institute of Computer Science, University of Rostock, 18051, Rostock, Germany
- Alexandra Steuer
  - Section III In Silico Biology & Machine Learning, Leibniz Institute for Food Systems Biology at the Technical University of Munich, 85354, Freising, Germany
  - Professorship for Chemoinformatics and Protein Modelling, TUM School of Life Sciences, Technical University of Munich, 85354, Freising, Germany
- Francesco Ferri
  - Section III In Silico Biology & Machine Learning, Leibniz Institute for Food Systems Biology at the Technical University of Munich, 85354, Freising, Germany
  - Professorship for Chemoinformatics and Protein Modelling, TUM School of Life Sciences, Technical University of Munich, 85354, Freising, Germany
- Alessandro Nicoli
  - Section III In Silico Biology & Machine Learning, Leibniz Institute for Food Systems Biology at the Technical University of Munich, 85354, Freising, Germany
  - Professorship for Chemoinformatics and Protein Modelling, TUM School of Life Sciences, Technical University of Munich, 85354, Freising, Germany
- Kristian Schultz
  - Institute of Computer Science, University of Rostock, 18051, Rostock, Germany
- Saptarshi Bej
  - Indian Institute of Science Education and Research Thiruvananthapuram, Maruthamala P. O, Vithura, 695551, Kerala, India
- Antonella Di Pizio
  - Section III In Silico Biology & Machine Learning, Leibniz Institute for Food Systems Biology at the Technical University of Munich, 85354, Freising, Germany
  - Professorship for Chemoinformatics and Protein Modelling, TUM School of Life Sciences, Technical University of Munich, 85354, Freising, Germany
- Olaf Wolkenhauer
  - Institute of Computer Science, University of Rostock, 18051, Rostock, Germany
  - Section III In Silico Biology & Machine Learning, Leibniz Institute for Food Systems Biology at the Technical University of Munich, 85354, Freising, Germany
6
Milon TI, Wang Y, Fontenot RL, Khajouie P, Villinger F, Raghavan V, Xu W. Development of a novel representation of drug 3D structures and enhancement of the TSR-based method for probing drug and target interactions. Comput Biol Chem 2024; 112:108117. [PMID: 38852360 PMCID: PMC11390338 DOI: 10.1016/j.compbiolchem.2024.108117] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/05/2024] [Revised: 05/13/2024] [Accepted: 05/31/2024] [Indexed: 06/11/2024]
Abstract
Understanding the mechanisms underlying interactions between drugs and target proteins is critical for drug discovery. In our earlier studies, we introduced the Triangular Spatial Relationship (TSR)-based algorithm, which enables the representation of a protein's 3D structure as a vector of integers (TSR keys). These TSR keys correspond to substructures of the 3D structure of a protein and are computed based on the triangles constructed by all possible triples of Cα atoms within the protein. In this study, we report on a new TSR-based algorithm for probing drug and target interactions. Specifically, we have extended the previous algorithm in three novel directions: TSR keys for representing the 3D structure of a drug or a ligand, cross TSR keys between drugs and their targets, and intra-residual TSR keys for phosphorylated amino acids. The outcomes illustrate the key contributions as follows: (i) The TSR-based method, which uses the TSR keys as features, is unique in its capability to interpret hierarchical relationships of drugs as well as drug-target complexes using common and specific TSR keys. (ii) The method can distinguish not only the binding sites from the rest of the protein structures, but also the binding sites of primary targets from those of off-targets. (iii) The method has the potential to correlate the 3D structures of drugs with their functions. (iv) Representing 3D structures by TSR keys has the unique advantage of making searches for similar substructures across structure datasets easier. In summary, this study presents a novel computational methodology, with significant advantages, for providing insights into the mechanism underlying drug and target interactions.
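The idea of describing a 3D structure by the triangles formed from all triples of atoms can be illustrated with a small sketch. This is not the authors' TSR key function, which the abstract does not specify; the key used here (sorted side lengths placed into distance bins) and the names `triangle_keys` and `bin_width` are assumptions chosen only to show the enumeration of triples and the resulting integer signatures.

```python
from collections import Counter
from itertools import combinations
import math


def triangle_keys(points, bin_width=1.0):
    """Count discretized triangle signatures over all triples of 3D points.

    Each triangle is summarized by its three side lengths, sorted and
    binned, so the signature depends only on interatomic distances and
    not on how the structure is translated or rotated.
    """
    keys = Counter()
    for p, q, r in combinations(points, 3):
        sides = sorted((math.dist(p, q), math.dist(q, r), math.dist(p, r)))
        keys[tuple(int(s // bin_width) for s in sides)] += 1
    return keys


# Four made-up coordinates standing in for atom positions.
atoms = [(0.0, 0.0, 0.0), (3.0, 0.0, 0.0), (0.0, 4.0, 0.0), (3.0, 4.0, 0.0)]
print(triangle_keys(atoms))
```

Because the number of triples grows cubically with the number of atoms, a structure of n atoms yields n(n-1)(n-2)/6 triangles; comparing the resulting key counts between structures is one way to look for shared substructures.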
Affiliation(s)
- Tarikul I Milon
  - Department of Chemistry, University of Louisiana at Lafayette, P.O. Box 44370, Lafayette, LA 70504, USA
- Yuhong Wang
  - National Center for Advancing Translational Sciences, 9800 Medical Center Drive, Rockville, MD 20850, USA
- Ryan L Fontenot
  - Department of Chemistry, University of Louisiana at Lafayette, P.O. Box 44370, Lafayette, LA 70504, USA
- Poorya Khajouie
  - Department of Chemistry, University of Louisiana at Lafayette, P.O. Box 44370, Lafayette, LA 70504, USA; The Center for Advanced Computer Studies, University of Louisiana at Lafayette, LA 70504, USA
- Francois Villinger
  - Department of Biology, University of Louisiana at Lafayette, New Iberia, LA 70560, USA
- Vijay Raghavan
  - The Center for Advanced Computer Studies, University of Louisiana at Lafayette, LA 70504, USA
- Wu Xu
  - Department of Chemistry, University of Louisiana at Lafayette, P.O. Box 44370, Lafayette, LA 70504, USA
7
Walter M, Borghardt JM, Humbeck L, Skalic M. Multi-Task ADME/PK prediction at industrial scale: leveraging large and diverse experimental datasets. Mol Inform 2024; 43:e202400079. [PMID: 38973777 DOI: 10.1002/minf.202400079] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/07/2023] [Revised: 04/10/2024] [Accepted: 05/04/2024] [Indexed: 07/09/2024]
Abstract
ADME (Absorption, Distribution, Metabolism, Excretion) properties are key parameters for judging whether a drug candidate exhibits a desired pharmacokinetic (PK) profile. In this study, we tested multi-task machine learning (ML) models to predict ADME and animal PK endpoints trained on in-house data generated at Boehringer Ingelheim. Models were evaluated both at the design stage of a compound (i.e., no experimental data of test compounds available) and at the testing stage, when a particular assay would be conducted (i.e., experimental data of earlier conducted assays may be available). Using realistic time-splits, we found a clear benefit in performance of multi-task graph-based neural network models over single-task models, which was even stronger when experimental data of earlier assays are available. In an attempt to explain the success of multi-task models, we found that especially endpoints with the largest numbers of data points (physicochemical endpoints, clearance in microsomes) are responsible for increased predictivity in more complex ADME and PK endpoints. In summary, our study provides insight into how data for multiple ADME/PK endpoints in a pharmaceutical company can be best leveraged to optimize predictivity of ML models.
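A time-split evaluates a model on compounds measured after those it was trained on, which mimics prospective use more closely than a random split. The sketch below shows the general idea only; the record layout, the field name `measured_on`, and the cutoff date are hypothetical and do not describe the in-house data or the splitting protocol used in the study.

```python
from datetime import date


def time_split(records, cutoff):
    """Split records into (train, test): measured before vs. on/after cutoff."""
    train = [r for r in records if r["measured_on"] < cutoff]
    test = [r for r in records if r["measured_on"] >= cutoff]
    return train, test


# Made-up assay records for illustration.
records = [
    {"compound": "cmpd-1", "measured_on": date(2021, 3, 1), "value": 1.2},
    {"compound": "cmpd-2", "measured_on": date(2022, 6, 15), "value": 0.7},
    {"compound": "cmpd-3", "measured_on": date(2023, 1, 10), "value": 2.4},
]
train, test = time_split(records, cutoff=date(2022, 1, 1))
print([r["compound"] for r in train], [r["compound"] for r in test])
```

Because no test compound predates the cutoff, the model never sees measurements from the future relative to the compounds it is asked to predict.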
Affiliation(s)
- Moritz Walter
  - Medicinal Chemistry Department, Boehringer Ingelheim Pharma GmbH & Co. KG, Birkendorfer Str. 65, 88397, Biberach an der Riss, Germany
- Jens M Borghardt
  - Drug Discovery Sciences Department, Boehringer Ingelheim Pharma GmbH & Co. KG, Birkendorfer Str. 65, 88397, Biberach an der Riss, Germany
- Lina Humbeck
  - Medicinal Chemistry Department, Boehringer Ingelheim Pharma GmbH & Co. KG, Birkendorfer Str. 65, 88397, Biberach an der Riss, Germany
- Miha Skalic
  - Medicinal Chemistry Department, Boehringer Ingelheim Pharma GmbH & Co. KG, Birkendorfer Str. 65, 88397, Biberach an der Riss, Germany
8
Giovannuzzi S, Shyamal SS, Bhowmik R, Ray R, Manaithiya A, Carta F, Parrkila S, Aspatwar A, Supuran CT. Physiological modeling of the metaverse of the Mycobacterium tuberculosis β-CA inhibition mechanism. Comput Biol Med 2024; 181:109029. [PMID: 39173489 DOI: 10.1016/j.compbiomed.2024.109029] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/29/2024] [Revised: 08/11/2024] [Accepted: 08/12/2024] [Indexed: 08/24/2024]
Abstract
Tuberculosis (TB) is an infectious disease that primarily affects the lungs of humans and is caused by the bacterium Mycobacterium tuberculosis (Mtb). In this study, we introduce a computational framework designed to identify the chemical features crucial for the effective inhibition of Mtb β-CAs. By applying a mechanistic model, we elucidated the features essential for robust inhibition. Using this model, we engineered molecules that exhibit potent inhibitory activity and introduce relevant novel chemistry. The designed molecules were prioritized for synthesis based on their pKi values predicted by the QSAR (Quantitative Structure-Activity Relationship) model. All the rationally designed and synthesized compounds were evaluated in vitro against different carbonic anhydrase isoforms expressed from the pathogen Mtb; moreover, the off-target and widely human-expressed CA I and II were also evaluated. Among the reported derivatives, 2, 4, and 5 demonstrated the most valuable in vitro activity, making them promising candidates for the treatment of TB infection. All the synthesized molecules exhibited favorable pharmacokinetic and toxicological profiles based on in silico predictions. Docking analysis confirmed that the zinc-binding groups bind effectively into the catalytic triad of the Mtb β-CAs, supporting the in vitro outcomes with these binding interactions. Furthermore, molecules with good prediction accuracies according to previously established mechanistic and QSAR models were used to delve deeper into the realm of systems biology to understand their mechanism in combating tuberculotic pathogenesis. The results pointed to the key involvement of the compounds in modulating immune responses via NF-κB1, SRC kinase, and TNF-α to modulate granuloma formation and clearance via T cells. This dual action, in which the pathogen's enzyme is inhibited while the human immune machinery is modulated, represents a paradigm shift toward more effective and comprehensive treatment approaches for combating tuberculosis.
Affiliation(s)
- Simone Giovannuzzi
  - Department of Neuroscience, Psychology, Drug Research, and Child's Health, Section of Pharmaceutical and Nutraceutical Sciences, University of Florence, Via Ugo Schiff 6, 50019, Sesto Fiorentino, Italy
- Sagar Singh Shyamal
  - Department of Pharmaceutical Engineering & Technology, Indian Institute of Technology (Banaras Hindu University), Varanasi, India
- Ratul Bhowmik
  - Faculty of Medicine and Health Technology, Tampere University, Tampere, Finland
- Rajarshi Ray
  - Faculty of Medicine and Health Technology, Tampere University, Tampere, Finland
- Ajay Manaithiya
  - Faculty of Medicine and Health Technology, Tampere University, Tampere, Finland
- Fabrizio Carta
  - Department of Neuroscience, Psychology, Drug Research, and Child's Health, Section of Pharmaceutical and Nutraceutical Sciences, University of Florence, Via Ugo Schiff 6, 50019, Sesto Fiorentino, Italy
- Seppo Parrkila
  - Faculty of Medicine and Health Technology, Tampere University, Tampere, Finland; Fimlab Ltd, Tampere University Hospital, Tampere, Finland
- Ashok Aspatwar
  - Faculty of Medicine and Health Technology, Tampere University, Tampere, Finland
- Claudiu T Supuran
  - Department of Neuroscience, Psychology, Drug Research, and Child's Health, Section of Pharmaceutical and Nutraceutical Sciences, University of Florence, Via Ugo Schiff 6, 50019, Sesto Fiorentino, Italy
9
Wright BA, Sarpong R. Molecular complexity as a driving force for the advancement of organic synthesis. Nat Rev Chem 2024; 8:776-792. [PMID: 39251714 DOI: 10.1038/s41570-024-00645-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 07/26/2024] [Indexed: 09/11/2024]
Abstract
The generation of molecular complexity is a primary goal in the field of synthetic chemistry. In the context of retrosynthetic analysis, the concept of molecular complexity is central to identifying productive disconnections and the development of efficient total syntheses. However, this field-defining concept is frequently invoked on an intuitive basis without precise definition or appreciation of its subtleties. Methods for quantifying molecular complexity could prove useful for characterizing the state of synthesis in a more rigorous, reliable and reproducible fashion. As a first step to evaluating the importance of these methods to the state of the field, here we present our perspective on the development of molecular complexity quantification and its implications for chemical synthesis. The extension and application of these methods beyond computer-aided synthesis planning and medicinal chemistry to the traditional practice of 'complex molecule' synthesis could have the potential to unearth new opportunities and more efficient approaches for synthesis.
Affiliation(s)
- Brandon A Wright
  - Department of Chemistry, University of California, Berkeley, USA
- Richmond Sarpong
  - Department of Chemistry, University of California, Berkeley, USA
10
Wang N, Li X, Xiao J, Liu S, Cao D. Data-driven toxicity prediction in drug discovery: Current status and future directions. Drug Discov Today 2024; 29:104195. [PMID: 39357621 DOI: 10.1016/j.drudis.2024.104195] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/05/2024] [Revised: 09/13/2024] [Accepted: 09/26/2024] [Indexed: 10/04/2024]
Abstract
Early toxicity assessment plays a vital role in the drug discovery process on account of its significant influence on the attrition rate of candidates. Recently, constant upgrading of information technology has greatly promoted the continuous development of toxicity prediction. To give an overview of the current state of data-driven toxicity prediction, we reviewed relevant studies and summarized them in three main respects: the features and difficulties of toxicity prediction, the evolution of modeling approaches, and the available tools for toxicity prediction. For each part, we expound the research status, existing challenges, and feasible solutions. Finally, several new directions and suggestions for toxicity prediction are also put forward.
Affiliation(s)
- Ningning Wang
  - Department of Pharmacy, Xiangya Hospital, Central South University, Changsha 410008 Hunan, PR China; National Clinical Research Center for Geriatric Disorders, Xiangya Hospital, Central South University, Changsha 410008 Hunan, PR China; The Hunan Institute of Pharmacy Practice and Clinical Research, Changsha 410008 Hunan, PR China
- Xinliang Li
  - Department of Pharmacy, Xiangya Hospital, Central South University, Changsha 410008 Hunan, PR China; National Clinical Research Center for Geriatric Disorders, Xiangya Hospital, Central South University, Changsha 410008 Hunan, PR China; The Hunan Institute of Pharmacy Practice and Clinical Research, Changsha 410008 Hunan, PR China
- Jing Xiao
  - Hunan Institute for Drug Control, Changsha 410001 Hunan, PR China
- Shao Liu
  - Department of Pharmacy, Xiangya Hospital, Central South University, Changsha 410008 Hunan, PR China; National Clinical Research Center for Geriatric Disorders, Xiangya Hospital, Central South University, Changsha 410008 Hunan, PR China; The Hunan Institute of Pharmacy Practice and Clinical Research, Changsha 410008 Hunan, PR China
- Dongsheng Cao
  - Department of Pharmacy, Xiangya Hospital, Central South University, Changsha 410008 Hunan, PR China; National Clinical Research Center for Geriatric Disorders, Xiangya Hospital, Central South University, Changsha 410008 Hunan, PR China; Xiangya School of Pharmaceutical Sciences, Central South University, Changsha 410013, Hunan, PR China
11
Kelleci Çelik F, Doğan S, Karaduman G. Drug-induced torsadogenicity prediction model: An explainable machine learning-driven quantitative structure-toxicity relationship approach. Comput Biol Med 2024; 182:109209. [PMID: 39332120 DOI: 10.1016/j.compbiomed.2024.109209] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/22/2024] [Revised: 09/03/2024] [Accepted: 09/23/2024] [Indexed: 09/29/2024]
Abstract
Drug-induced Torsade de Pointes (TdP), a life-threatening polymorphic ventricular tachyarrhythmia, emerges due to the cardiotoxic effects of pharmaceuticals. The need for precise mechanisms and clinical biomarkers to detect this adverse effect presents substantial challenges in drug safety assessment. In this study, we propose that analyzing the physicochemical properties of pharmaceuticals can provide valuable insights into their potential for torsadogenic cardiotoxicity. Our research centers on estimating TdP risk based on the molecular structure of drugs. We introduce a novel quantitative structure-toxicity relationship (QSTR) prediction model that leverages an in silico approach developed by adopting the 4R rule in laboratory animals. This approach eliminates the need for animal testing, saves time, and reduces cost. Our algorithm has successfully predicted the torsadogenic risks of various pharmaceutical compounds. To develop this model, we employed Support Vector Machine (SVM) and ensemble techniques, including Random Forest (RF), Extreme Gradient Boosting (XGBoost), and Categorical Boosting (CatBoost). We enhanced the model's predictive accuracy through a rigorous two-step feature selection process. Furthermore, we utilized the SHapley Additive exPlanations (SHAP) technique to explain the prediction of torsadogenic risk, particularly within the RF model. This study represents a significant step towards creating a robust QSTR model, which can serve as an early screening tool for assessing the torsadogenic potential of pharmaceutical candidates or existing drugs. By incorporating molecular structure-based insights, we aim to enhance drug safety evaluation and minimize the risks of drug-induced TdP, ultimately benefiting both patients and the pharmaceutical industry.
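SHAP explanations rest on Shapley values, which attribute a prediction to individual features by averaging each feature's marginal contribution over all feature subsets. The sketch below computes exact Shapley values by brute force for a toy function against a single baseline. It is not the SHAP library and not the study's random forest: `toy_model`, the instance, and the baseline are invented for illustration, and practical tools use approximations or tree-specific algorithms because this enumeration grows exponentially with the number of features.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, instance, baseline):
    """Exact Shapley values of `predict` at `instance` relative to `baseline`."""
    n = len(instance)

    def value(coalition):
        # Features in the coalition take the instance value, others the baseline.
        x = [instance[i] if i in coalition else baseline[i] for i in range(n)]
        return predict(x)

    phi = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        total = 0.0
        for size in range(n):
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                total += weight * (value(s | {i}) - value(s))
        phi.append(total)
    return phi


def toy_model(x):
    # Hypothetical model: one additive term plus one interaction term.
    return 2.0 * x[0] + x[1] * x[2]


contributions = shapley_values(toy_model, instance=[1.0, 2.0, 3.0], baseline=[0.0, 0.0, 0.0])
print(contributions)
```

The contributions always sum to the difference between the prediction for the instance and the prediction for the baseline, which is the property that makes them usable as an additive explanation of a single prediction.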
Affiliation(s)
- Feyza Kelleci Çelik
  - Karamanoğlu Mehmetbey University, Vocational School of Health Services, 70200, Karaman, Turkey
- Seyyide Doğan
  - Karamanoğlu Mehmetbey University, Faculty of Economics and Administrative Science, 70200, Karaman, Turkey
- Gül Karaduman
  - Karamanoğlu Mehmetbey University, Department of Mathematics, 70100, Karaman, Turkey
Collapse
|
12
|
Li J, Li X, Kah M, Yue L, Cheng B, Wang C, Wang Z, Xing B. Unlocking the potential of carbon dots in agriculture using data-driven approaches. Sci Total Environ 2024; 944:173605. [PMID: 38879020 DOI: 10.1016/j.scitotenv.2024.173605] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/27/2024] [Revised: 05/10/2024] [Accepted: 05/27/2024] [Indexed: 06/26/2024]
Abstract
The utilization of carbon dots (CDs) in agriculture to enhance plant growth has gained significant attention, but the data remain fragmented. Systematically integrating existing data is needed to identify the factors driving the interactions between CDs and plants and to strategically guide future research. Articles reporting on CDs and their effects on plants were searched based on inclusion and exclusion criteria, resulting in the collection of 71 articles comprising a total of 2564 data points. The meta-analysis reveals that the soil and foliar application of red-emitting bio-derived CDs at a low concentration (<10 ppm) leads to the most beneficial effects on plant growth. Random forest and gradient boosting algorithms revealed that the size and dose of CDs were important factors in predicting plant responses across multiple aspects (CDs properties, plant properties, environmental factors, and experimental conditions). Specifically, smaller sizes are more favorable: below 6 nm for growth indicators (GI), 3-6 nm for nutrient and quality (NuQ), below 7 nm for photosynthesis (PSN), and below 5 nm for antioxidant responses (AR). Overall, our analysis of existing data suggests that CDs applications can significantly improve plant responses (GI, NuQ, PSN, and AR) by 10-39 %. To unlock the full potential of CDs, customized synthesis techniques should be employed to meet the specific requirements of different crops and climate conditions. For example, we recommend the synthesis of small CDs (<7 nm) with emission peak values falling within the ranges of 405-475 nm and 610-670 nm to enhance plant growth. Global predictions of plant responses to CDs application in future scenarios show significant improvements ranging from 17 to 58 %, suggesting that CDs have widespread applicability. This novel understanding of the impact of CDs on plant response provides valuable insights for optimizing the application of these nanomaterials in agriculture.
Affiliation(s)
- Jing Li
  - Institute of Environmental Processes and Pollution Control, and School of Environment and Ecology, Jiangnan University, Wuxi, Jiangsu 214122, China; Jiangsu Engineering Laboratory for Biomass Energy and Carbon Reduction Technology, Jiangnan University, Wuxi, Jiangsu, 214122, China
- Xiaona Li
  - Institute of Environmental Processes and Pollution Control, and School of Environment and Ecology, Jiangnan University, Wuxi, Jiangsu 214122, China; Jiangsu Engineering Laboratory for Biomass Energy and Carbon Reduction Technology, Jiangnan University, Wuxi, Jiangsu, 214122, China
- Melanie Kah
  - School of Environment, University of Auckland, Auckland 1010, New Zealand
- Le Yue
  - Institute of Environmental Processes and Pollution Control, and School of Environment and Ecology, Jiangnan University, Wuxi, Jiangsu 214122, China; Jiangsu Engineering Laboratory for Biomass Energy and Carbon Reduction Technology, Jiangnan University, Wuxi, Jiangsu, 214122, China
- Bingxu Cheng
  - Institute of Environmental Processes and Pollution Control, and School of Environment and Ecology, Jiangnan University, Wuxi, Jiangsu 214122, China; Jiangsu Engineering Laboratory for Biomass Energy and Carbon Reduction Technology, Jiangnan University, Wuxi, Jiangsu, 214122, China
- Chuanxi Wang
  - Institute of Environmental Processes and Pollution Control, and School of Environment and Ecology, Jiangnan University, Wuxi, Jiangsu 214122, China; Jiangsu Engineering Laboratory for Biomass Energy and Carbon Reduction Technology, Jiangnan University, Wuxi, Jiangsu, 214122, China
- Zhenyu Wang
  - Institute of Environmental Processes and Pollution Control, and School of Environment and Ecology, Jiangnan University, Wuxi, Jiangsu 214122, China; Jiangsu Engineering Laboratory for Biomass Energy and Carbon Reduction Technology, Jiangnan University, Wuxi, Jiangsu, 214122, China
- Baoshan Xing
  - Stockbridge School of Agriculture, University of Massachusetts, Amherst, MA 01003, USA
|
13
|
Ren JN, Chen Q, Ye HYX, Cao C, Guo YM, Yang JR, Wang H, Khan MZI, Chen JZ. FGTN: Fragment-based graph transformer network for predicting reproductive toxicity. Arch Toxicol 2024:10.1007/s00204-024-03866-4. [PMID: 39292235 DOI: 10.1007/s00204-024-03866-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/10/2024] [Accepted: 09/10/2024] [Indexed: 09/19/2024]
Abstract
Reproductive toxicity is an important issue in chemical safety. Traditional laboratory testing methods are costly and time-consuming and raise ethical issues. Only a few in silico models have been reported to predict human reproductive toxicity, and none of them makes full use of the topological information of compounds. In addition, most existing atom-based graph neural network methods focus on attributing model predictions to individual nodes or edges rather than to chemically meaningful fragments or substructures. In the current study, we develop a novel fragment-based graph transformer network (FGTN) approach to generate a QSAR model of human reproductive toxicity by considering the internal topological structure information of compounds. In the FGTN model, the compound is represented by a graph in which fragments are the nodes and the bonds linking two fragments are the edges. A super molecule-level node is further proposed to connect all fragment nodes by undirected edges, obtaining global molecular features from fragment embeddings. The FGTN model achieved an accuracy (ACC) of 0.861 and an area under the receiver operating characteristic curve (AUC) value of 0.914 on nonredundant blind tests, outperforming traditional fingerprint-based machine learning models and an atom-based GCN model. The FGTN model can attribute toxic predictions to fragments, generating specific structural alerts for positive compounds. Moreover, FGTN may also have the capability to distinguish various chemical isomers. We believe that FGTN can be used as a reliable and effective tool for human reproductive toxicity prediction, contributing to the advancement of chemical safety assessment.
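The graph representation described in the abstract (fragments as nodes, inter-fragment bonds as edges, plus one super node linked to every fragment) can be sketched as a plain adjacency structure. This is an illustrative sketch only: the function `build_fragment_graph`, the `<SUPER>` label, and the example fragmentation are assumptions made here, and the transformer layers and fragmentation rules of FGTN are not reproduced.

```python
def build_fragment_graph(fragments, fragment_bonds):
    """Return (nodes, adjacency) for a fragment-level graph with a super node.

    fragments: list of fragment labels, one node per fragment.
    fragment_bonds: list of (i, j) index pairs for bonds joining two fragments.
    """
    nodes = list(fragments) + ["<SUPER>"]  # molecule-level node goes last
    super_idx = len(fragments)
    adjacency = {i: set() for i in range(len(nodes))}

    # Undirected edges between bonded fragments.
    for i, j in fragment_bonds:
        adjacency[i].add(j)
        adjacency[j].add(i)

    # Undirected edges between the super node and every fragment node.
    for i in range(len(fragments)):
        adjacency[i].add(super_idx)
        adjacency[super_idx].add(i)
    return nodes, adjacency


# Hypothetical fragmentation of one molecule into three pieces (assumption).
fragments = ["c1ccccc1", "C(=O)N", "CC"]
bonds = [(0, 1), (1, 2)]
nodes, adj = build_fragment_graph(fragments, bonds)
```

Because the super node neighbors every fragment, any message-passing or attention step over `adj` lets it aggregate information from the whole molecule in a single hop.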
Affiliation(s)
- Jia-Nan Ren
  - College of Pharmaceutical Sciences, Zhejiang University, 866 Yuhangtang Rd., Hangzhou, 310058, Zhejiang, China
- Qiang Chen
  - College of Pharmaceutical Sciences, Zhejiang University, 866 Yuhangtang Rd., Hangzhou, 310058, Zhejiang, China
- Hong-Yu-Xiang Ye
  - College of Pharmaceutical Sciences, Zhejiang University, 866 Yuhangtang Rd., Hangzhou, 310058, Zhejiang, China
- Cheng Cao
  - College of Pharmaceutical Sciences, Zhejiang University, 866 Yuhangtang Rd., Hangzhou, 310058, Zhejiang, China
  - Polytechnic Institute, Zhejiang University, 269 Shixiang Rd., Hangzhou, 310015, Zhejiang, China
- Ya-Min Guo
  - College of Pharmaceutical Sciences, Zhejiang University, 866 Yuhangtang Rd., Hangzhou, 310058, Zhejiang, China
- Jin-Rong Yang
  - College of Pharmaceutical Sciences, Zhejiang University, 866 Yuhangtang Rd., Hangzhou, 310058, Zhejiang, China
  - Polytechnic Institute, Zhejiang University, 269 Shixiang Rd., Hangzhou, 310015, Zhejiang, China
- Hao Wang
  - College of Pharmaceutical Sciences, Zhejiang University, 866 Yuhangtang Rd., Hangzhou, 310058, Zhejiang, China
- Muhammad Zafar Irshad Khan
  - College of Pharmaceutical Sciences, Zhejiang University, 866 Yuhangtang Rd., Hangzhou, 310058, Zhejiang, China
- Jian-Zhong Chen
  - College of Pharmaceutical Sciences, Zhejiang University, 866 Yuhangtang Rd., Hangzhou, 310058, Zhejiang, China
|
14
|
Lephalala M, Vives SS, Bisetty K. Chaotic neural network algorithm with competitive learning integrated with partial Least Square models for the prediction of the toxicity of fragrances in sanitizers and disinfectants. Sci Total Environ 2024; 942:173754. [PMID: 38844215 DOI: 10.1016/j.scitotenv.2024.173754] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/13/2024] [Revised: 05/18/2024] [Accepted: 06/02/2024] [Indexed: 06/10/2024]
Abstract
This study addresses the need for accurate structural data regarding the toxicity of fragrances in sanitizers and disinfectants. We compare the predictive and descriptive (model stability) potential of multiple linear regression (MLR) and partial least squares (PLS) models optimized through variable selection (VS). A novel hybrid modeling strategy that combines a chaotic neural network algorithm with competitive learning (CCLNNA) and PLS can offer specific optimization with satisfactory results, even for a limited dataset. Beyond this preliminary comparative analysis, the goal is to introduce an adapted CCLNNA optimization strategy for VS, inspired by neural networks, and to explore the influence of the percentage of significant descriptors in the optimization function, with the aim of enhancing the final model's capabilities. We analyzed an available dataset of 24 molecules, incorporating ADMET and PaDEL descriptors as predictor variables, to explore the relationship between the response/target variable (pLC50) and the optimized set of descriptors. The suitability of the selected PLS models (cross- and external-validated accuracy combined with a percentage of significant descriptors of at least 80 %) underscores the importance of expanding the dataset to strengthen the validation protocols, thus enhancing future model reliability and environmental impact.
Affiliation(s)
- Matshidiso Lephalala
  - Department of Chemistry, Durban University of Technology, P.O. Box 1334, Durban 4000, South Africa
- Salvador Sagrado Vives
  - Departamento de Química Analítica, Facultad de Farmacia, Universitat de València, E-46100 Burjassot, Valencia, Spain; Instituto Interuniversitario de Investigación de Reconocimiento Molecular y Desarrollo Tecnológico (IDM), Universitat Politècnica de València, Universitat de València, Valencia, Spain
- Krishna Bisetty
  - Department of Chemistry, Durban University of Technology, P.O. Box 1334, Durban 4000, South Africa
|
15
|
Zhai S, Tan Y, Zhu C, Zhang C, Gao Y, Mao Q, Zhang Y, Duan H, Yin Y. PepExplainer: An explainable deep learning model for selection-based macrocyclic peptide bioactivity prediction and optimization. Eur J Med Chem 2024; 275:116628. [PMID: 38944933 DOI: 10.1016/j.ejmech.2024.116628] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/17/2024] [Revised: 06/21/2024] [Accepted: 06/24/2024] [Indexed: 07/02/2024]
Abstract
Macrocyclic peptides possess unique features, making them highly promising as a drug modality. However, evaluating their bioactivity through wet lab experiments is generally resource-intensive and time-consuming. Despite advancements in artificial intelligence (AI) for bioactivity prediction, challenges remain due to limited data availability and the interpretability issues of deep learning models, often leading to less-than-ideal predictions. To address these challenges, we developed PepExplainer, an explainable graph neural network based on substructure mask explanation (SME). This model excels at deciphering amino acid substructures, translating macrocyclic peptides into detailed molecular graphs at the atomic level, and efficiently handling non-canonical amino acids and complex macrocyclic peptide structures. PepExplainer's effectiveness is enhanced by utilizing the correlation between peptide enrichment data from a selection-based focused library and bioactivity data, and by employing transfer learning to improve bioactivity predictions of macrocyclic peptides against the IL-17C/IL-17RE interaction. Additionally, PepExplainer was further validated for bioactivity prediction using an additional set of thirteen newly synthesized macrocyclic peptides. Moreover, it enabled the optimization of the IC50 of a macrocyclic peptide, reducing it from 15 nM to 5.6 nM based on the contribution scores provided by PepExplainer. This achievement underscores PepExplainer's ability to decipher complex molecular patterns, highlighting its potential to accelerate the discovery and optimization of macrocyclic peptides.
Affiliation(s)
- Silong Zhai
  - School of Pharmaceutical Sciences, Zhejiang University of Technology, Hangzhou, 310014, China
- Yahong Tan
  - State Key Laboratory of Microbial Technology, Institute of Microbial Technology, Shandong University, Qingdao, 266237, China
- Cheng Zhu
  - School of Pharmaceutical Sciences, Zhejiang University of Technology, Hangzhou, 310014, China
- Chengyun Zhang
  - School of Pharmaceutical Sciences, Zhejiang University of Technology, Hangzhou, 310014, China
- Yan Gao
  - Qilu Institute of Technology, Jinan, 250200, China
- Qingyi Mao
  - School of Pharmaceutical Sciences, Zhejiang University of Technology, Hangzhou, 310014, China
- Youming Zhang
  - State Key Laboratory of Microbial Technology, Institute of Microbial Technology, Shandong University, Qingdao, 266237, China
- Hongliang Duan
  - Faculty of Applied Sciences, Macao Polytechnic University, Macao, 999078, China
- Yizhen Yin
  - State Key Laboratory of Microbial Technology, Institute of Microbial Technology, Shandong University, Qingdao, 266237, China; Shandong Research Institute of Industrial Technology, Jinan, 250101, China
|
16
|
Zhu J, Azam NA, Haraguchi K, Zhao L, Nagamochi H, Akutsu T. Molecular Design Based on Integer Programming and Splitting Data Sets by Hyperplanes. IEEE/ACM Trans Comput Biol Bioinform 2024; 21:1529-1541. [PMID: 38767997 DOI: 10.1109/tcbb.2024.3402675] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/22/2024]
Abstract
A novel framework for designing the molecular structure of chemical compounds with a desired chemical property has recently been proposed. The framework infers a desired chemical graph by solving a mixed integer linear program (MILP) that simulates the computation process of two functions: a feature function defined by a two-layered model on chemical graphs and a prediction function constructed by a machine learning method. To improve the learning performance of prediction functions in the framework, we design a method that splits a given data set $C$ into two subsets $C^{(i)}$, $i = 1, 2$, by a hyperplane in a chemical space so that most compounds in the first (resp., second) subset have observed values lower (resp., higher) than a threshold $\theta$. We construct a prediction function $\psi$ for the data set $C$ by combining prediction functions $\psi_i$, $i = 1, 2$, each of which is constructed on $C^{(i)}$ independently. The results of our computational experiments suggest that the proposed method improved the learning performance for several chemical properties for which a good prediction function has been difficult to construct.
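The split-and-combine step can be illustrated with a small sketch. It assumes the hyperplane (weights `w`, offset `b`) is already given, whereas the paper searches for one that separates low from high observed values; the MILP inference stage is omitted, and the per-subset "learner" `fit_mean_predictor` is a deliberately trivial stand-in. All names and data below are hypothetical.

```python
def dot(w, x):
    """Inner product of two equal-length vectors."""
    return sum(wi * xi for wi, xi in zip(w, x))


def split_by_hyperplane(data, w, b):
    """Split (x, y) pairs into C1 (w.x <= b) and C2 (w.x > b)."""
    c1 = [(x, y) for x, y in data if dot(w, x) <= b]
    c2 = [(x, y) for x, y in data if dot(w, x) > b]
    return c1, c2


def fit_mean_predictor(subset):
    """Stand-in for a learned prediction function: predicts the subset mean."""
    mean = sum(y for _, y in subset) / len(subset)
    return lambda x: mean


def combine(psi1, psi2, w, b):
    """Route each input to the predictor of the side it falls on."""
    return lambda x: psi1(x) if dot(w, x) <= b else psi2(x)


# Made-up feature vectors and observed property values (assumption).
data = [([0.0, 1.0], 1.0), ([1.0, 0.0], 1.2), ([3.0, 2.0], 4.8), ([4.0, 3.0], 5.2)]
w, b = [1.0, 1.0], 3.0  # hypothetical hyperplane x1 + x2 = 3

c1, c2 = split_by_hyperplane(data, w, b)
psi = combine(fit_mean_predictor(c1), fit_mean_predictor(c2), w, b)
```

Each predictor sees only the subset on its side of the hyperplane, so it only has to model a narrower range of observed values, which is the motivation given in the abstract for splitting the data set.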
|
17
|
Khan MZI, Ren JN, Cao C, Ye HYX, Wang H, Guo YM, Yang JR, Chen JZ. Comprehensive hepatotoxicity prediction: ensemble model integrating machine learning and deep learning. Front Pharmacol 2024; 15:1441587. [PMID: 39234116 PMCID: PMC11373136 DOI: 10.3389/fphar.2024.1441587] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/31/2024] [Accepted: 07/24/2024] [Indexed: 09/06/2024]
Abstract
Background: Chemicals may lead to acute liver injuries, posing a serious threat to human health. Establishing the precise safety profile of a compound is challenging because testing procedures are complex and expensive. In silico approaches help identify the potential risk of drug candidates in the initial stage of drug development and thus reduce development costs. Methods: In the current study, QSAR models were developed for hepatotoxicity prediction using an ensemble strategy that integrates machine learning (ML) and deep learning (DL) algorithms with various molecular features. A large dataset of 2588 chemicals and drugs was randomly divided into training (80%) and test (20%) sets, followed by the training of individual base models using diverse ML or DL algorithms based on three different kinds of descriptors and fingerprints. Feature selection approaches were employed to optimize the models based on model performance. Hybrid ensemble approaches were further utilized to determine the method with the best performance. Results: The voting ensemble classifier emerged as the optimal model, achieving an excellent prediction accuracy of 80.26%, an AUC of 82.84%, and a recall of over 93%, followed by the bagging and stacking ensemble classifiers. The model was further verified by an external test set, internal 10-fold cross-validation, and rigorous benchmark training, exhibiting much better reliability than the published models. Conclusion: The proposed ensemble model offers a dependable assessment, with good performance, of the risk that chemicals and drugs induce liver damage.
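A voting ensemble combines the class predictions of several base models into one decision. The sketch below shows hard (majority) voting together with the accuracy and recall metrics quoted in the abstract; whether the study used hard or soft voting is not stated here, so hard voting is an assumption, and the labels and base-model outputs are made up for illustration.

```python
def majority_vote(predictions_per_model):
    """Hard voting: each row holds one model's 0/1 predictions for all samples."""
    n_models = len(predictions_per_model)
    votes = [sum(col) for col in zip(*predictions_per_model)]
    # Predict 1 only when strictly more than half of the models vote 1.
    return [1 if 2 * v > n_models else 0 for v in votes]


def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label matches the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def recall(y_true, y_pred):
    """Fraction of true positives (toxic compounds) that were detected."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp / (tp + fn)


# Made-up labels (1 = hepatotoxic) and base-model predictions (assumption).
y_true = [1, 1, 0, 0, 1]
model_a = [1, 0, 0, 1, 1]
model_b = [1, 1, 0, 0, 0]
model_c = [0, 1, 1, 0, 1]

ensemble = majority_vote([model_a, model_b, model_c])
```

In this toy case each base model makes two mistakes, but no two models err on the same sample, so the majority vote recovers every label. Real base models are correlated, and the gain is correspondingly smaller.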
Affiliation(s)
- Jia-Nan Ren
  - College of Pharmaceutical Sciences, Zhejiang University, Hangzhou, China
- Cheng Cao
  - College of Pharmaceutical Sciences, Zhejiang University, Hangzhou, China
  - Polytechnic Institute, Zhejiang University, Hangzhou, China
- Hong-Yu-Xiang Ye
  - College of Pharmaceutical Sciences, Zhejiang University, Hangzhou, China
- Hao Wang
  - College of Pharmaceutical Sciences, Zhejiang University, Hangzhou, China
- Ya-Min Guo
  - College of Pharmaceutical Sciences, Zhejiang University, Hangzhou, China
- Jin-Rong Yang
  - College of Pharmaceutical Sciences, Zhejiang University, Hangzhou, China
  - Polytechnic Institute, Zhejiang University, Hangzhou, China
- Jian-Zhong Chen
  - College of Pharmaceutical Sciences, Zhejiang University, Hangzhou, China
|
18
|
Kostal J, Voutchkova-Kostal A, Bercu JP, Graham JC, Hillegass J, Masuda-Herrera M, Trejo-Martin A, Gould J. Quantum-Mechanics Calculations Elucidate Skin-Sensitizing Pharmaceutical Compounds. Chem Res Toxicol 2024; 37:1404-1414. [PMID: 39069667 DOI: 10.1021/acs.chemrestox.4c00185] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 07/30/2024]
Abstract
Skin sensitization is a critical end point in occupational toxicology that necessitates the use of fast, accurate, and affordable models to aid in establishing handling guidance for worker protection. While many in silico models have been developed, the scarcity of reliable data for active pharmaceutical ingredients (APIs) and their intermediates (together regarded as pharmaceutical compounds) brings into question the reliability of these tools, which are largely constructed using publicly available nonspecialty chemicals. Here, we present the quantum-mechanical (QM) Computer-Aided Discovery and REdesign (CADRE) model, which was developed with the bioactive and structurally complex chemical space in mind by relying on the fundamentals of chemical interactions in key events (versus structural attributes of training-set data). Validated in this study on 345 APIs and intermediates, CADRE achieved 95% accuracy, sensitivity, and specificity and a combined 79% accuracy in assigning potency categories compared to the mouse local lymph node assay data. We show how historical outcomes from CADRE testing in the pharmaceutical space, generated over the past 10 years on ca. 2500 chemicals, can be used to probe the relationships between sensitization mechanisms (or the underlying chemical classes) and the probability of eliciting a sensitization response in mice of a given potency. We believe this information to be of value to both practitioners, who can use it to quickly screen and triage their data sets, as well as to model developers to fine-tune their structure-based tools. Lastly, we leverage our experimentally validated subset of APIs and intermediates to show the importance of dermal permeability on the sensitization potential and potency. 
We demonstrate that common physicochemical properties used to assess permeation, such as the octanol-water partition coefficient and molecular weight, are poor proxies for the more accurate energy-pair distributions that can be computed from mixed QM and classical simulations using model representations of the stratum corneum.
Affiliation(s)
- Jakub Kostal
  - Designing Out Toxicity (DOT) Consulting LLC, 2121 Eisenhower Avenue, Alexandria, Virginia 22314, United States
  - The George Washington University, 800 22nd St. NW, Washington, District of Columbia 20052, United States
- Adelina Voutchkova-Kostal
  - Designing Out Toxicity (DOT) Consulting LLC, 2121 Eisenhower Avenue, Alexandria, Virginia 22314, United States
- Joel P Bercu
  - Gilead Sciences Inc., 333 Lakeside Drive, Foster City, California 94404, United States
- Jessica C Graham
  - Genentech, Inc., 1 DNA Way, South San Francisco, California 94080, United States
- Jedd Hillegass
  - Bristol Myers Squibb, 1 Squibb Drive, New Brunswick, New Jersey 08901, United States
- Melisa Masuda-Herrera
  - Gilead Sciences Inc., 333 Lakeside Drive, Foster City, California 94404, United States
- Janet Gould
  - SafeBridge Regulatory & Life Sciences Group, 330 Seventh Ave #2001, New York, New York 10001, United States
|
19
|
Bultum LE, Kim G, Lee SW, Lee D. Data Mining and in Silico Analysis of Ethiopian Traditional Medicine: Unveiling the Therapeutic Potential of Rumex abyssinicus Jacq. Cell Biochem Biophys 2024:10.1007/s12013-024-01478-4. [PMID: 39154130 DOI: 10.1007/s12013-024-01478-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 07/28/2024] [Indexed: 08/19/2024]
Abstract
Multicomponent traditional medicine prescriptions are widely used in Ethiopia for disease treatment. However, inconsistencies across practitioners, cultures, and locations have hindered the development of reliable therapeutic medicines. Systematic analysis of traditional medicine data is crucial for identifying consistent and reliable medicinal materials. In this study, we compiled and analyzed a dataset of 505 prescriptions, encompassing 567 medicinal materials used for treating 106 diseases. Using association rule mining, we identified significant associations between diseases and medicinal materials. Notably, wound healing, the most frequently treated condition, was strongly associated with Rumex abyssinicus Jacq., showing a high support value. This association led to further in silico and network analysis of R. abyssinicus Jacq. compounds, revealing 756 therapeutic targets enriched in various KEGG pathways and biological processes. The Random-Walk with Restart (RWR) algorithm applied to the CODA PPI network identified these targets as linked to diseases such as cancer, inflammation, and metabolic, immune, respiratory, and neurological disorders. Many hub target genes from the PPI network were also directly associated with wound healing, supporting the traditional use of R. abyssinicus Jacq. for treating wounds. In conclusion, this study uncovers significant associations between diseases and medicinal materials in Ethiopian traditional medicine, emphasizing the therapeutic potential of R. abyssinicus Jacq. These findings provide a foundation for further research, including in vitro and in vivo studies, to explore and validate the efficacy of traditional and natural product-derived medicines.
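The "support value" mentioned in the abstract is one of the standard association-rule metrics. A minimal sketch of how support, confidence, and lift are computed for a single rule is given below, assuming each prescription is represented as a set that contains the treated disease and its medicinal materials. The function `rule_metrics` and the five prescriptions are invented for illustration and are not taken from the study's dataset.

```python
def rule_metrics(transactions, antecedent, consequent):
    """Support, confidence and lift for the rule antecedent -> consequent."""
    n = len(transactions)
    n_a = sum(1 for t in transactions if antecedent in t)
    n_c = sum(1 for t in transactions if consequent in t)
    n_both = sum(1 for t in transactions if antecedent in t and consequent in t)

    support = n_both / n             # how often the pair occurs at all
    confidence = n_both / n_a        # P(consequent | antecedent)
    lift = confidence / (n_c / n)    # > 1 means a positive association
    return support, confidence, lift


# Made-up prescriptions: disease plus medicinal materials (assumption).
prescriptions = [
    {"wound", "plant_A"},
    {"wound", "plant_A", "plant_B"},
    {"wound", "plant_C"},
    {"cough", "plant_B"},
    {"cough", "plant_A"},
]

support, confidence, lift = rule_metrics(prescriptions, "wound", "plant_A")
```

A full mining run would evaluate every disease and material pair this way and keep only the rules above chosen support and confidence thresholds.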
Affiliation(s)
- Lemessa Etana Bultum
  - Department of Bio and Brain Engineering, Korea Advanced Institute of Science and Technology (KAIST), Daejeon, South Korea
  - Bio-Synergy Research Center, Daejeon, South Korea
  - Institute of Agricultural Life Sciences, Dong-A University, Busan, South Korea
- Gwangmin Kim
  - Department of Bio and Brain Engineering, Korea Advanced Institute of Science and Technology (KAIST), Daejeon, South Korea
  - Bio-Synergy Research Center, Daejeon, South Korea
- Seon-Woo Lee
  - Institute of Agricultural Life Sciences, Dong-A University, Busan, South Korea
- Doheon Lee
  - Department of Bio and Brain Engineering, Korea Advanced Institute of Science and Technology (KAIST), Daejeon, South Korea
  - Bio-Synergy Research Center, Daejeon, South Korea
|
20
|
Rezić I, Somogyi Škoc M. Computational Methodologies in Synthesis, Preparation and Application of Antimicrobial Polymers, Biomolecules, and Nanocomposites. Polymers (Basel) 2024; 16:2320. [PMID: 39204538 PMCID: PMC11359845 DOI: 10.3390/polym16162320] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/30/2024] [Revised: 08/05/2024] [Accepted: 08/14/2024] [Indexed: 09/04/2024]
Abstract
The design and optimization of antimicrobial materials (polymers, biomolecules, or nanocomposites) can be significantly advanced by computational methodologies such as molecular dynamics (MD), which provides insights into the interactions and stability of the antimicrobial agents within the polymer matrix, and machine learning (ML) or design of experiment (DOE), which predict and optimize antimicrobial efficacy and material properties. These innovations not only enhance the efficiency of developing antimicrobial polymers but also enable the creation of materials with tailored properties to meet specific application needs, ensuring safety and longevity in their usage. Therefore, this paper presents the computational methodologies employed in the synthesis and application of antimicrobial polymers, biomolecules, and nanocomposites. By leveraging advanced computational techniques such as MD, ML, or DOE, significant advancements in the design and optimization of antimicrobial materials are achieved. Recent progress is comprehensively reviewed, the contributions of the most relevant methodologies to state-of-the-art materials science are highlighted, and future directions in the field are outlined. Finally, future possibilities and opportunities are derived from the current state-of-the-art methodologies, providing perspectives on the potential evolution of polymer science and the engineering of novel materials.
Affiliation(s)
- Iva Rezić
  - Department of Applied Chemistry, Faculty of Textile Technology, University of Zagreb, 10000 Zagreb, Croatia
- Maja Somogyi Škoc
  - Department of Materials Testing, Faculty of Textile Technology, University of Zagreb, 10000 Zagreb, Croatia
|
21
|
Noga M, Jurowski K. Toxicity of Bromo-DragonFLY as a New Psychoactive Substance: Application of In Silico Methods for the Prediction of Key Toxicological Parameters Important to Clinical and Forensic Toxicology. Chem Res Toxicol 2024. [PMID: 39119730 DOI: 10.1021/acs.chemrestox.4c00105] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 08/10/2024]
Abstract
Bromo-DragonFLY is a synthetic new psychoactive substance (NPS) that has gained attention due to its powerful and long-lasting hallucinogenic effects, legal status, and widespread availability. This study aimed to use various in silico toxicology methods to predict key toxicological parameters for Bromo-DragonFLY, including acute toxicity (LD50), genotoxicity, cardiotoxicity, health effects, and the potential for endocrine disruption. The results indicate significant acute toxicity with noticeable variations across different species, a low likelihood of genotoxic potential suggesting potential DNA damage, and a notable risk of cardiotoxicity associated with inhibition of the hERG channel. Evaluation of endocrine disruption suggests a low probability of Bromo-DragonFLY interacting with the estrogen receptor α (ER-α), indicating minimal estrogenic activity. These insights from in silico investigations are important for advancing our understanding of this NPS in forensic and clinical toxicology. These initial toxicological examinations establish a foundation for future research efforts and contribute to developing risk assessment and management strategies for using and misusing NPS.
Affiliation(s)
- Maciej Noga
  - Department of Regulatory and Forensic Toxicology, Institute of Medical Expertises in Łódź, Ul. Aleksandrowska 67/93, 91-205 Łódź, Poland
- Kamil Jurowski
  - Department of Regulatory and Forensic Toxicology, Institute of Medical Expertises in Łódź, Ul. Aleksandrowska 67/93, 91-205 Łódź, Poland
  - Laboratory of Innovative Toxicological Research and Analyzes, Institute of Medical Studies, Medical College, Rzeszów University, Al. Mjr. W. Kopisto 2a, 35-959 Rzeszów, Poland
|
22
|
Xuan Y, Wang Y, Li R, Zhong Y, Wang N, Zhang L, Chen Q, Yu S, Yuan J. Using machine learning to classify the immunosuppressive activity of per- and polyfluoroalkyl substances. Toxicol Mech Methods 2024:1-9. [PMID: 39104137 DOI: 10.1080/15376516.2024.2387733] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/27/2024] [Revised: 07/28/2024] [Accepted: 07/29/2024] [Indexed: 08/07/2024]
Abstract
Per- and polyfluoroalkyl substances (PFASs), a class of persistent organic pollutants, have immunosuppressive effects. The evaluation of this effect has been a focus of regulatory toxicology. In this investigation, 146 PFASs (immunosuppressive or nonimmunosuppressive) and corresponding concentration gradients were collected from the literature, and their structures were characterized using Dragon descriptors. Feature importance analysis and stepwise feature elimination were used for feature selection. Three machine learning (ML) methods, namely Random Forest (RF), Extreme Gradient Boosting Machine (XGB), and Categorical Boosting Machine (CB), were utilized for model development. Model interpretability was explored by feature importance analysis and correlation analysis. The findings indicated that all three models exhibited excellent performance. Among them, the best-performing RF model had an average AUC score of 0.9720 for the testing set. The results of the feature importance analysis demonstrated that concentration, SpPosA_X, IVDE, R2s, and SIC2 were the crucial molecular features. Applicability domain analysis was also performed to determine reliable prediction boundaries for the model. In conclusion, this study is the first application of ML models to investigate the immunosuppressive activity of PFASs. The variables used in the models can help understand the mechanism of the immunosuppressive activity of PFASs, allow researchers to more effectively assess the immunosuppressive potential of a large number of PFASs, and thus better guide environmental and health risk assessment efforts.
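The AUC score used to compare the classifiers can be read as the probability that a randomly chosen positive compound receives a higher score than a randomly chosen negative one. The sketch below computes it directly from that definition (the Mann-Whitney formulation, with ties counted as half). The labels and scores are invented for illustration and are unrelated to the reported 0.9720.

```python
def roc_auc(y_true, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    Runs in O(n_pos * n_neg), which is fine for small test sets.
    """
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    wins = 0.0
    for p in pos:
        for q in neg:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Made-up labels (1 = immunosuppressive) and classifier scores (assumption).
y_true = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.5, 0.3, 0.1]
auc = roc_auc(y_true, scores)
```

Here eight of the nine positive-negative pairs are ordered correctly, so `auc` is 8/9; the one misordered pair is the positive scored 0.4 against the negative scored 0.5.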
Affiliation(s)
- Yuxin Xuan
- College of Public Health, Zhengzhou University, Zhengzhou, P. R. China
| | - Yulu Wang
- College of Public Health, Zhengzhou University, Zhengzhou, P. R. China
| | - Rui Li
- College of Public Health, Zhengzhou University, Zhengzhou, P. R. China
| | - Yuyan Zhong
- College of Public Health, Zhengzhou University, Zhengzhou, P. R. China
| | - Na Wang
- College of Public Health, Zhengzhou University, Zhengzhou, P. R. China
| | - Lingyin Zhang
- College of Public Health, Zhengzhou University, Zhengzhou, P. R. China
| | - Qian Chen
- College of Public Health, Zhengzhou University, Zhengzhou, P. R. China
| | - Shuling Yu
- Key Laboratory of Natural Medicine and Immune-Engineering of Henan Province, Henan University, Kaifeng, Henan, P. R. China
| | - Jintao Yuan
- College of Public Health, Zhengzhou University, Zhengzhou, P. R. China
| |
|
23
|
Cao A, Zhang L, Bu Y, Sun D. Machine Learning Prediction of On/Off Target-driven Clinical Adverse Events. Pharm Res 2024; 41:1649-1658. [PMID: 39095534 DOI: 10.1007/s11095-024-03742-x] [Received: 02/20/2024] [Accepted: 07/06/2024] [Indexed: 08/04/2024]
Abstract
OBJECTIVE Currently, 90% of clinical drug development fails, and 30% of these failures are due to clinical toxicity. The current extensive animal toxicity studies are not predictive of clinical adverse events (AEs) at clinical doses, while current computational models consider only a few factors and have had limited success in clinical toxicity prediction. We aimed to address these issues by developing a machine learning (ML) model to directly predict clinical AEs. METHODS Using a dataset with 759 FDA-approved drugs with known AEs, we first adapted the ConPLex ML model to predict IC50 values of these FDA-approved drugs against their on-target and off-target binding among 477 protein targets. Subsequently, we constructed a new ML model to predict clinical AEs using IC50 values of 759 drugs' primary on-target and off-target effects along with tissue-specific protein expression profiles. RESULTS The adapted ConPLex model predicted drug-target interactions for both on- and off-target effects, as shown by co-localization of the 6 small molecule kinase inhibitors with their respective kinases. The coupled ML models demonstrated good predictive capability of clinical AEs, with accuracy over 75%. CONCLUSIONS Our approach provides new insight into the mechanistic understanding of in vivo drug toxicity in relation to drug on-/off-target interactions. The coupled ML models, once validated with larger datasets, may offer advantages in directly predicting clinical AEs using in vitro/ex vivo and preclinical data, which will help to reduce drug development failure due to clinical toxicity.
Affiliation(s)
- Albert Cao
- Department of Pharmaceutical Sciences, College of Pharmacy, University of Michigan, Ann Arbor, MI, 48109, United States
- Centennial High School, Ellicott City, MD, 21042, United States
| | - Luchen Zhang
- Department of Pharmaceutical Sciences, College of Pharmacy, University of Michigan, Ann Arbor, MI, 48109, United States
| | - Yingzi Bu
- Department of Pharmaceutical Sciences, College of Pharmacy, University of Michigan, Ann Arbor, MI, 48109, United States
- Michigan Institute for Computational Discovery & Engineering, University of Michigan, Ann Arbor, MI, 48109, United States
| | - Duxin Sun
- Department of Pharmaceutical Sciences, College of Pharmacy, University of Michigan, Ann Arbor, MI, 48109, United States.
- Duxin Sun, 1600 Huron Parkway, North Campus Research Complex, Building 520, Ann Arbor, MI, 48109, United States.
| |
|
24
|
Agea MI, Čmelo I, Dehaen W, Chen Y, Kirchmair J, Sedlák D, Bartůněk P, Šícho M, Svozil D. Chemical space exploration with Molpher: Generating and assessing a glucocorticoid receptor ligand library. Mol Inform 2024; 43:e202300316. [PMID: 38979783 DOI: 10.1002/minf.202300316] [Received: 11/10/2023] [Revised: 04/23/2024] [Accepted: 04/24/2024] [Indexed: 07/10/2024]
Abstract
Computational exploration of chemical space is crucial in modern cheminformatics research for accelerating the discovery of new biologically active compounds. In this study, we present a detailed analysis of the chemical library of potential glucocorticoid receptor (GR) ligands generated by the molecular generator, Molpher. To generate the targeted GR library and construct the classification models, structures from the ChEMBL database as well as from the internal IMG library, which was experimentally screened for biological activity in the primary luciferase reporter cell assay, were utilized. The composition of the targeted GR ligand library was compared with a reference library that randomly samples chemical space. A random forest model was used to determine the biological activity of ligands, incorporating its applicability domain using conformal prediction. It was demonstrated that the GR library is significantly enriched with GR ligands compared to the random library. Furthermore, a prospective analysis demonstrated that Molpher successfully designed compounds, which were subsequently experimentally confirmed to be active on the GR. A collection of 34 potential new GR ligands was also identified. Moreover, an important contribution of this study is the establishment of a comprehensive workflow for evaluating computationally generated ligands, particularly those with potential activity against targets that are challenging to dock.
Affiliation(s)
- M Isabel Agea
- Department of Informatics and Chemistry & CZ-OPENSCREEN: National Infrastructure for Chemical Biology, Faculty of Chemical Technology, University of Chemistry and Technology, Prague, 16628, Czech Republic
| | - Ivan Čmelo
- Department of Informatics and Chemistry & CZ-OPENSCREEN: National Infrastructure for Chemical Biology, Faculty of Chemical Technology, University of Chemistry and Technology, Prague, 16628, Czech Republic
| | - Wim Dehaen
- Department of Informatics and Chemistry & CZ-OPENSCREEN: National Infrastructure for Chemical Biology, Faculty of Chemical Technology, University of Chemistry and Technology, Prague, 16628, Czech Republic
- Department of Organic Chemistry, Faculty of Chemical Technology, University of Chemistry and Technology, Prague, 16628, Czech Republic
| | - Ya Chen
- Center for Bioinformatics (ZBH), Department of Informatics, Faculty of Mathematics, Informatics and Natural Sciences, Universität Hamburg, 20146, Hamburg, Germany
- Division of Pharmaceutical Chemistry, Department of Pharmaceutical Sciences, Faculty of Life Sciences, University of Vienna, 1090, Vienna, Austria
| | - Johannes Kirchmair
- Center for Bioinformatics (ZBH), Department of Informatics, Faculty of Mathematics, Informatics and Natural Sciences, Universität Hamburg, 20146, Hamburg, Germany
- Division of Pharmaceutical Chemistry, Department of Pharmaceutical Sciences, Faculty of Life Sciences, University of Vienna, 1090, Vienna, Austria
| | - David Sedlák
- CZ-OPENSCREEN: National Infrastructure for Chemical Biology, Institute of Molecular Genetics of the Czech Academy of Sciences, Prague, 14220, Czech Republic
| | - Petr Bartůněk
- CZ-OPENSCREEN: National Infrastructure for Chemical Biology, Institute of Molecular Genetics of the Czech Academy of Sciences, Prague, 14220, Czech Republic
| | - Martin Šícho
- Department of Informatics and Chemistry & CZ-OPENSCREEN: National Infrastructure for Chemical Biology, Faculty of Chemical Technology, University of Chemistry and Technology, Prague, 16628, Czech Republic
| | - Daniel Svozil
- Department of Informatics and Chemistry & CZ-OPENSCREEN: National Infrastructure for Chemical Biology, Faculty of Chemical Technology, University of Chemistry and Technology, Prague, 16628, Czech Republic
- CZ-OPENSCREEN: National Infrastructure for Chemical Biology, Institute of Molecular Genetics of the Czech Academy of Sciences, Prague, 14220, Czech Republic
| |
|
25
|
Zell L, Hofer TS, Schubert M, Popoff A, Höll A, Marschhofer M, Huber-Cantonati P, Temml V, Schuster D. Impact of 2-hydroxypropyl-β-cyclodextrin inclusion complex formation on dopamine receptor-ligand interaction - A case study. Biochem Pharmacol 2024; 226:116340. [PMID: 38848779 DOI: 10.1016/j.bcp.2024.116340] [Received: 02/08/2024] [Revised: 05/10/2024] [Accepted: 06/04/2024] [Indexed: 06/09/2024]
Abstract
The octanol-water distribution coefficient (logP), used as a measure of lipophilicity, plays a major role in the drug design and discovery processes. While average logP values have remained unchanged in approved oral drugs since 1983, current medicinal chemistry trends towards increasingly lipophilic compounds that require adapted analytical workflows and drug delivery systems. Solubility enhancers like cyclodextrins (CDs), especially 2-hydroxypropyl-β-CD (2-HP-β-CD), have been studied in vitro and in vivo to investigate their ADMET (absorption, distribution, metabolism, excretion and toxicity)-related properties. However, data are scarce regarding the applicability of CD inclusion complexes (ICs) in vitro compared to pure compounds. In this study, dopamine receptor (DR) ligands were used as a case study, utilizing a combined in silico/in vitro workflow. Media-dependent solubility and IC stoichiometry were investigated using HPLC. NMR was used to observe chemical shift deviations caused by IC formation, while in silico approaches utilizing basin hopping global minimization were used to propose putative IC binding modes. A cell-based in vitro homogeneous time-resolved fluorescence (HTRF) assay was used to quantify ligand binding affinity at the DR subtype 2 (D2R). While all ligands showed increased solubility using 2-HP-β-CD, they differed regarding IC stoichiometry and receptor binding affinity. This case study shows that IC formation was ligand-dependent and sometimes altered in vitro binding. Therefore, IC formation cannot be recommended as a general means of improving compound solubility for in vitro studies, as it may alter ligand binding.
Affiliation(s)
- Lukas Zell
- Department of Pharmaceutical and Medicinal Chemistry, Institute of Pharmacy, Paracelsus Medical University, 5020 Salzburg, Austria; Research and Innovation Center for Novel Therapies and Regenerative Medicine, Austria
| | - Thomas S Hofer
- Institute of General, Inorganic and Theoretical Chemistry, Center for Biochemistry and Biomedicine, University of Innsbruck, 6020 Innsbruck, Austria
| | - Mario Schubert
- Department of Biosciences and Medical Biology, University of Salzburg, 5020 Salzburg, Austria; Department of Chemistry, Freie Universität Berlin, 14195 Berlin, Germany
| | - Alexander Popoff
- Department of Pharmaceutical and Medicinal Chemistry, Institute of Pharmacy, Paracelsus Medical University, 5020 Salzburg, Austria; Research and Innovation Center for Novel Therapies and Regenerative Medicine, Austria
| | - Anna Höll
- Department of Pharmaceutical and Medicinal Chemistry, Institute of Pharmacy, Paracelsus Medical University, 5020 Salzburg, Austria; Research and Innovation Center for Novel Therapies and Regenerative Medicine, Austria
| | - Moritz Marschhofer
- Department of Pharmaceutical and Medicinal Chemistry, Institute of Pharmacy, Paracelsus Medical University, 5020 Salzburg, Austria; Research and Innovation Center for Novel Therapies and Regenerative Medicine, Austria
| | - Petra Huber-Cantonati
- Department of Pharmaceutical Biology, Institute of Pharmacy, Paracelsus Medical University, 5020 Salzburg, Austria; Research and Innovation Center for Novel Therapies and Regenerative Medicine, Austria
| | - Veronika Temml
- Department of Pharmaceutical and Medicinal Chemistry, Institute of Pharmacy, Paracelsus Medical University, 5020 Salzburg, Austria; Research and Innovation Center for Novel Therapies and Regenerative Medicine, Austria
| | - Daniela Schuster
- Department of Pharmaceutical and Medicinal Chemistry, Institute of Pharmacy, Paracelsus Medical University, 5020 Salzburg, Austria; Research and Innovation Center for Novel Therapies and Regenerative Medicine, Austria.
| |
|
26
|
Wang JH, Sung TY. ToxTeller: Predicting Peptide Toxicity Using Four Different Machine Learning Approaches. ACS OMEGA 2024; 9:32116-32123. [PMID: 39072096 PMCID: PMC11270677 DOI: 10.1021/acsomega.4c04246] [Received: 05/03/2024] [Revised: 06/20/2024] [Accepted: 06/25/2024] [Indexed: 07/30/2024]
Abstract
Examining the toxicity of peptides is essential for therapeutic peptide-based drug design. Machine learning approaches are frequently used to develop highly accurate predictors for peptide toxicity prediction. In this paper, we present ToxTeller, which provides four predictors using logistic regression, support vector machines, random forests, and XGBoost, respectively. For prediction model development, we construct a data set of toxic and nontoxic peptides from SwissProt and ConoServer databases with existence evidence levels checked. We also fully utilize the protein annotation in SwissProt to collect more toxic peptides than using keyword search alone. From this data set, we construct an independent test data set that shares at most 40% sequence similarity within itself and with the training data set. From a quite comprehensive list of 28 feature combinations, we conduct 10-fold cross-validation on the training data set to determine the optimized feature combination for model development. ToxTeller's performance is evaluated and compared with existing predictors on the independent test data set. Since toxic peptides must be avoided for drug design, we analyze strategies for reducing false-negative predictions of toxic peptides and suggest selecting models by top sensitivity instead of the widely used Matthews correlation coefficient, and also suggest using a meta-predictor approach with multiple predictors.
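The suggestion to select models by sensitivity rather than by the Matthews correlation coefficient (MCC) can be illustrated with two confusion matrices. The sketch below is a hedged illustration only: the two models and their counts are hypothetical, and the helper names `sensitivity` and `mcc` are assumptions, not part of ToxTeller.

```python
import math


def sensitivity(tp, fn):
    """Fraction of truly toxic peptides that are predicted toxic (recall)."""
    return tp / (tp + fn) if (tp + fn) else 0.0


def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient from confusion-matrix counts."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0


# Hypothetical confusion matrices for two candidate predictors.
models = {
    "A": {"tp": 90, "tn": 70, "fp": 30, "fn": 10},  # few missed toxic peptides
    "B": {"tp": 80, "tn": 90, "fp": 10, "fn": 20},  # more balanced overall
}

pick_by_sensitivity = max(
    models, key=lambda m: sensitivity(models[m]["tp"], models[m]["fn"])
)
pick_by_mcc = max(models, key=lambda m: mcc(**models[m]))
```

With these made-up counts the two criteria disagree: MCC favours the more balanced model B, while sensitivity favours model A, which misses fewer toxic peptides at the cost of more false alarms.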
Affiliation(s)
- Jen-Hung Wang
- Institute of Information Science, Academia Sinica, Taipei 11529, Taiwan
| | - Ting-Yi Sung
- Institute of Information Science, Academia Sinica, Taipei 11529, Taiwan
| |
|
27
|
Lai Y, Koelmel JP, Walker DI, Price EJ, Papazian S, Manz KE, Castilla-Fernández D, Bowden JA, Nikiforov V, David A, Bessonneau V, Amer B, Seethapathy S, Hu X, Lin EZ, Jbebli A, McNeil BR, Barupal D, Cerasa M, Xie H, Kalia V, Nandakumar R, Singh R, Tian Z, Gao P, Zhao Y, Froment J, Rostkowski P, Dubey S, Coufalíková K, Seličová H, Hecht H, Liu S, Udhani HH, Restituito S, Tchou-Wong KM, Lu K, Martin JW, Warth B, Godri Pollitt KJ, Klánová J, Fiehn O, Metz TO, Pennell KD, Jones DP, Miller GW. High-Resolution Mass Spectrometry for Human Exposomics: Expanding Chemical Space Coverage. ENVIRONMENTAL SCIENCE & TECHNOLOGY 2024; 58:12784-12822. [PMID: 38984754 PMCID: PMC11271014 DOI: 10.1021/acs.est.4c01156] [Received: 02/01/2024] [Revised: 06/11/2024] [Accepted: 06/12/2024] [Indexed: 07/11/2024]
Abstract
In the modern "omics" era, measurement of the human exposome is a critical missing link between genetic drivers and disease outcomes. High-resolution mass spectrometry (HRMS), routinely used in proteomics and metabolomics, has emerged as a leading technology to broadly profile chemical exposure agents and related biomolecules for accurate mass measurement, high sensitivity, rapid data acquisition, and increased resolution of chemical space. Non-targeted approaches are increasingly accessible, supporting a shift from conventional hypothesis-driven, quantitation-centric targeted analyses toward data-driven, hypothesis-generating chemical exposome-wide profiling. However, HRMS-based exposomics encounters unique challenges. New analytical and computational infrastructures are needed to expand the analysis coverage through streamlined, scalable, and harmonized workflows and data pipelines that permit longitudinal chemical exposome tracking, retrospective validation, and multi-omics integration for meaningful health-oriented inferences. In this article, we survey the literature on state-of-the-art HRMS-based technologies, review current analytical workflows and informatic pipelines, and provide an up-to-date reference on exposomic approaches for chemists, toxicologists, epidemiologists, care providers, and stakeholders in health sciences and medicine. We propose efforts to benchmark fit-for-purpose platforms for expanding coverage of chemical space, including gas/liquid chromatography-HRMS (GC-HRMS and LC-HRMS), and discuss opportunities, challenges, and strategies to advance the burgeoning field of the exposome.
Affiliation(s)
- Yunjia Lai
- Department of Environmental Health Sciences, Mailman School of Public Health, Columbia University, New York, New York 10032, United States
| | - Jeremy P. Koelmel
- Department of Environmental Health Sciences, Yale School of Public Health, New Haven, Connecticut 06520, United States
| | - Douglas I. Walker
- Gangarosa Department of Environmental Health, Rollins School of Public Health, Emory University, Atlanta, Georgia 30322, United States
| | - Elliott J. Price
- RECETOX, Faculty of Science, Masaryk University, Kotlářská 2, 611 37 Brno, Czech Republic
| | - Stefano Papazian
- Department of Environmental Science, Science for Life Laboratory, Stockholm University, SE-106 91 Stockholm, Sweden
- National Facility for Exposomics, Metabolomics Platform, Science for Life Laboratory, Stockholm University, Solna 171 65, Sweden
| | - Katherine E. Manz
- Department of Environmental Health Sciences, School of Public Health, University of Michigan, Ann Arbor, Michigan 48109, United States
| | - Delia Castilla-Fernández
- Department of Food Chemistry and Toxicology, Faculty of Chemistry, University of Vienna, 1010 Vienna, Austria
| | - John A. Bowden
- Center for Environmental and Human Toxicology, Department of Physiological Sciences, College of Veterinary Medicine, University of Florida, Gainesville, Florida 32611, United States
| | | | - Arthur David
- Univ Rennes, Inserm, EHESP, Irset (Institut de recherche en santé, environnement et travail) − UMR_S, 1085 Rennes, France
| | - Vincent Bessonneau
- Univ Rennes, Inserm, EHESP, Irset (Institut de recherche en santé, environnement et travail) − UMR_S, 1085 Rennes, France
| | - Bashar Amer
- Thermo Fisher Scientific, San Jose, California 95134, United States
| | | | - Xin Hu
- Gangarosa Department of Environmental Health, Rollins School of Public Health, Emory University, Atlanta, Georgia 30322, United States
| | - Elizabeth Z. Lin
- Department of Environmental Health Sciences, Yale School of Public Health, New Haven, Connecticut 06520, United States
| | - Akrem Jbebli
- RECETOX, Faculty of Science, Masaryk University, Kotlářská 2, 611 37 Brno, Czech Republic
| | - Brooklynn R. McNeil
- Biomarkers Core Laboratory, Irving Institute for Clinical and Translational Research, Columbia University Irving Medical Center, New York, New York 10032, United States
| | - Dinesh Barupal
- Department of Environmental Medicine and Public Health, Icahn School of Medicine at Mount Sinai, New York, New York 10029, United States
| | - Marina Cerasa
- Institute of Atmospheric Pollution Research, Italian National Research Council, 00015 Monterotondo, Rome, Italy
| | - Hongyu Xie
- Department of Environmental Science, Science for Life Laboratory, Stockholm University, SE-106 91 Stockholm, Sweden
| | - Vrinda Kalia
- Department of Environmental Health Sciences, Mailman School of Public Health, Columbia University, New York, New York 10032, United States
| | - Renu Nandakumar
- Biomarkers Core Laboratory, Irving Institute for Clinical and Translational Research, Columbia University Irving Medical Center, New York, New York 10032, United States
| | - Randolph Singh
- Department of Environmental Health Sciences, Mailman School of Public Health, Columbia University, New York, New York 10032, United States
| | - Zhenyu Tian
- Department of Chemistry and Chemical Biology, Northeastern University, Boston, Massachusetts 02115, United States
| | - Peng Gao
- Department of Environmental and Occupational Health, and Department of Civil and Environmental Engineering, University of Pittsburgh, Pittsburgh, Pennsylvania 15261, United States
- UPMC Hillman Cancer Center, Pittsburgh, Pennsylvania 15232, United States
| | - Yujia Zhao
- Institute for Risk Assessment Sciences, Utrecht University, Utrecht 3584CM, The Netherlands
| | | | | | - Saurabh Dubey
- Biomarkers Core Laboratory, Irving Institute for Clinical and Translational Research, Columbia University Irving Medical Center, New York, New York 10032, United States
| | - Kateřina Coufalíková
- RECETOX, Faculty of Science, Masaryk University, Kotlářská 2, 611 37 Brno, Czech Republic
| | - Hana Seličová
- RECETOX, Faculty of Science, Masaryk University, Kotlářská 2, 611 37 Brno, Czech Republic
| | - Helge Hecht
- RECETOX, Faculty of Science, Masaryk University, Kotlářská 2, 611 37 Brno, Czech Republic
| | - Sheng Liu
- Department of Environmental Health Sciences, Yale School of Public Health, New Haven, Connecticut 06520, United States
| | - Hanisha H. Udhani
- Biomarkers Core Laboratory, Irving Institute for Clinical and Translational Research, Columbia University Irving Medical Center, New York, New York 10032, United States
| | - Sophie Restituito
- Department of Environmental Health Sciences, Mailman School of Public Health, Columbia University, New York, New York 10032, United States
| | - Kam-Meng Tchou-Wong
- Department of Environmental Health Sciences, Mailman School of Public Health, Columbia University, New York, New York 10032, United States
| | - Kun Lu
- Department of Environmental Sciences and Engineering, Gillings School of Global Public Health, The University of North Carolina at Chapel Hill, Chapel Hill, North Carolina 27599, United States
| | - Jonathan W. Martin
- Department of Environmental Science, Science for Life Laboratory, Stockholm University, SE-106 91 Stockholm, Sweden
- National Facility for Exposomics, Metabolomics Platform, Science for Life Laboratory, Stockholm University, Solna 171 65, Sweden
| | - Benedikt Warth
- Department of Food Chemistry and Toxicology, Faculty of Chemistry, University of Vienna, 1010 Vienna, Austria
| | - Krystal J. Godri Pollitt
- Department of Environmental Health Sciences, Yale School of Public Health, New Haven, Connecticut 06520, United States
| | - Jana Klánová
- RECETOX, Faculty of Science, Masaryk University, Kotlářská 2, 611 37 Brno, Czech Republic
| | - Oliver Fiehn
- West Coast Metabolomics Center, University of California−Davis, Davis, California 95616, United States
| | - Thomas O. Metz
- Biological Sciences Division, Pacific Northwest National Laboratory, Richland, Washington 99354, United States
| | - Kurt D. Pennell
- School of Engineering, Brown University, Providence, Rhode Island 02912, United States
| | - Dean P. Jones
- Department of Medicine, School of Medicine, Emory University, Atlanta, Georgia 30322, United States
| | - Gary W. Miller
- Department of Environmental Health Sciences, Mailman School of Public Health, Columbia University, New York, New York 10032, United States
| |
|
28
|
Xu Y, Liaw A, Sheridan RP, Svetnik V. Development and Evaluation of Conformal Prediction Methods for Quantitative Structure-Activity Relationship. ACS OMEGA 2024; 9:29478-29490. [PMID: 39005801 PMCID: PMC11238240 DOI: 10.1021/acsomega.4c02017] [Received: 02/29/2024] [Revised: 06/10/2024] [Accepted: 06/12/2024] [Indexed: 07/16/2024]
Abstract
The quantitative structure-activity relationship (QSAR) regression model is a commonly used technique for predicting the biological activities of compounds using their molecular descriptors. Besides accurate activity estimation, obtaining a prediction uncertainty metric like a prediction interval is highly desirable. Quantifying prediction uncertainty is an active research area in statistical and machine learning (ML), but the implementation for QSAR remains challenging. However, most ML algorithms with high predictive performance require add-on companions for estimating the uncertainty of their prediction. Conformal prediction (CP) is a promising approach as its main components are agnostic to the prediction modes, and it produces valid prediction intervals under weak assumptions on the data distribution. We proposed computationally efficient CP algorithms tailored to the most widely used ML models, including random forests, deep neural networks, and gradient boosting. The algorithms use a novel approach to the derivation of nonconformity scores from the estimates of prediction uncertainty generated by the ensembles of point predictions. The validity and efficiency of proposed algorithms are demonstrated on a diverse collection of QSAR data sets as well as simulation studies. The provided software implementing our algorithms can be used as stand-alone or easily incorporated into other ML software packages for QSAR modeling.
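The idea of deriving nonconformity scores from an ensemble's uncertainty estimate can be sketched with a generic split (inductive) conformal procedure. This is a minimal illustration under stated assumptions, not the authors' algorithms or software: it assumes a held-out calibration set, a point prediction per compound, and a strictly positive per-compound spread (for example the standard deviation across ensemble members). All function names are hypothetical.

```python
import math


def conformal_quantile(scores, alpha):
    """Finite-sample-corrected (1 - alpha) quantile of the calibration scores."""
    n = len(scores)
    k = math.ceil((n + 1) * (1 - alpha))  # rank of the calibration score to use
    if k > n:
        return math.inf  # too few calibration points for the requested coverage
    return sorted(scores)[k - 1]


def calibrate(y_true, y_pred, spread, alpha=0.1):
    """Nonconformity score: absolute error scaled by the ensemble spread (> 0)."""
    scores = [abs(y - p) / s for y, p, s in zip(y_true, y_pred, spread)]
    return conformal_quantile(scores, alpha)


def predict_interval(pred, spread, q):
    """Interval whose width adapts to the per-compound spread."""
    return (pred - q * spread, pred + q * spread)


# Hypothetical calibration data: measured activity, prediction, ensemble spread.
q = calibrate([1.0, 2.0, 4.0], [1.5, 2.0, 3.0], [1.0, 1.0, 2.0], alpha=0.25)
interval = predict_interval(10.0, 2.0, q)
```

Scaling the error by the spread makes intervals wider for compounds where the ensemble disagrees and narrower where it agrees, while the quantile step is what supplies the coverage guarantee under exchangeability.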
Affiliation(s)
- Yuting Xu
- Early Development Statistics, Merck & Co., Inc., Rahway, New Jersey 07065, United States
| | - Andy Liaw
- Early Development Statistics, Merck & Co., Inc., Rahway, New Jersey 07065, United States
| | - Robert P. Sheridan
- Modeling and Informatics, Merck & Co., Inc., Rahway, New Jersey 07033, United States
| | - Vladimir Svetnik
- Early Development Statistics, Merck & Co., Inc., Rahway, New Jersey 07065, United States
| |
|
29
|
Kırboğa KK, Işık M. Explainable artificial intelligence in the design of selective carbonic anhydrase I-II inhibitors via molecular fingerprinting. J Comput Chem 2024; 45:1530-1539. [PMID: 38491535 DOI: 10.1002/jcc.27335] [Received: 12/20/2023] [Revised: 02/04/2024] [Accepted: 02/10/2024] [Indexed: 03/18/2024]
Abstract
Inhibiting the enzymes carbonic anhydrase I (CA I) and carbonic anhydrase II (CA II) presents a potential avenue for addressing nervous system ailments such as glaucoma and Alzheimer's disease. Our study explored harnessing explainable artificial intelligence (XAI) to unveil the molecular traits inherent in CA I and CA II inhibitors. The PubChem molecular fingerprints of these inhibitors, sourced from the ChEMBL database, were subjected to detailed XAI analysis. The study encompassed training 10 regression models using IC50 values, and their efficacy was gauged using metrics including R2, RMSE, and time taken. The Decision Tree Regressor algorithm emerged as the optimal performer (R2: 0.93, RMSE: 0.43, time-taken: 0.07). Furthermore, the PFI method unveiled key molecular features for CA I inhibitors, notably PubChemFP432 (C(O)N) and PubChemFP6978 (C(O)O). The SHAP analysis highlighted the significance of attributes like PubChemFP539 (C(O)NCC), PubChemFP601 (C(O)OCC), and PubChemFP432 (C(O)N) in CA I inhibition. Likewise, features for CA II inhibitors encompassed PubChemFP528 (C(O)OCCN), PubChemFP791 (C(O)OCCC), PubChemFP696 (C(O)OCCCC), PubChemFP335 (C(O)NCCN), PubChemFP580 (C(O)NCCCN), and PubChemFP180 (C(O)NCCC), identified through SHAP analysis. The sulfonamide group (S), aromatic ring (A), and hydrogen bonding group (H) exert a substantial impact on CA I and CA II enzyme activities and IC50 values through the XAI approach. These insights into the CA I and CA II inhibitors are poised to guide future drug discovery efforts, serving as a beacon for innovative therapeutic interventions.
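Assuming "PFI" in this abstract refers to permutation feature importance, the technique can be sketched as follows: shuffle one fingerprint column at a time and record how much the model's error grows. The code is a generic, hedged illustration; the function names, the use of mean squared error, and the toy data are assumptions and do not reproduce the study's pipeline.

```python
import random


def mse(model, X, y):
    """Mean squared error of a callable model over rows X and targets y."""
    return sum((model(row) - t) ** 2 for row, t in zip(X, y)) / len(y)


def permutation_importance(model, X, y, n_repeats=10, seed=0):
    """Average increase in MSE when each feature column is shuffled."""
    rng = random.Random(seed)
    baseline = mse(model, X, y)
    importances = []
    for j in range(len(X[0])):
        increases = []
        for _ in range(n_repeats):
            column = [row[j] for row in X]
            rng.shuffle(column)  # break the link between feature j and the target
            X_perm = [row[:j] + [v] + row[j + 1:] for row, v in zip(X, column)]
            increases.append(mse(model, X_perm, y) - baseline)
        importances.append(sum(increases) / n_repeats)
    return importances


# Hypothetical toy case: the target depends only on the first fingerprint bit.
X_toy = [[0, 1], [1, 0], [0, 0], [1, 1]]
y_toy = [0, 1, 0, 1]
toy_model = lambda row: row[0]
toy_importances = permutation_importance(toy_model, X_toy, y_toy)
```

In this toy case the second bit is never used by the model, so shuffling it leaves the error unchanged and its importance is zero, whereas shuffling the first bit can only increase the error.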
Affiliation(s)
- Kevser Kübra Kırboğa
- Faculty of Engineering, Department of Bioengineering, Bilecik Seyh Edebali University, Bilecik, Turkey
- Bioengineering Department, Süleyman Demirel University, Isparta, Turkey
| | - Mesut Işık
- Faculty of Engineering, Department of Bioengineering, Bilecik Seyh Edebali University, Bilecik, Turkey
| |
|
30
|
Liu J, Gui Y, Rao J, Sun J, Wang G, Ren Q, Qu N, Niu B, Chen Z, Sheng X, Wang Y, Zheng M, Li X. In silico off-target profiling for enhanced drug safety assessment. Acta Pharm Sin B 2024; 14:2927-2941. [PMID: 39027254 PMCID: PMC11252485 DOI: 10.1016/j.apsb.2024.03.002] [Received: 11/28/2023] [Revised: 02/21/2024] [Accepted: 02/29/2024] [Indexed: 07/20/2024]
Abstract
Ensuring drug safety in the early stages of drug development is crucial to avoid costly failures in subsequent phases. However, the economic burden associated with detecting drug off-targets and potential side effects through in vitro safety screening and animal testing is substantial. Drug off-target interactions, along with the adverse drug reactions they induce, are significant factors affecting drug safety. To assess the liability of candidate drugs, we developed an artificial intelligence model for the precise prediction of compound off-target interactions, leveraging multi-task graph neural networks. The outcomes of off-target predictions can serve as representations for compounds, enabling the differentiation of drugs under various ATC codes and the classification of compound toxicity. Furthermore, the predicted off-target profiles are employed in adverse drug reaction (ADR) enrichment analysis, facilitating the inference of potential ADRs for a drug. Using the withdrawn drug Pergolide as an example, we elucidate the mechanisms underlying ADRs at the target level, contributing to the exploration of the potential clinical relevance of newly predicted off-target interactions. Overall, our work facilitates the early assessment of compound safety/toxicity based on off-target identification, deduces potential ADRs of drugs, and ultimately promotes the secure development of drugs.
Affiliation(s)
- Jin Liu
- College of Pharmaceutical Sciences, Zhejiang University, Hangzhou 310058, China
- Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica Chinese Academy of Sciences, Shanghai 201203, China
| | - Yike Gui
- Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica Chinese Academy of Sciences, Shanghai 201203, China
- Nanjing University of Chinese Medicine, Nanjing 210023, China
| | - Jingxin Rao
- Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica Chinese Academy of Sciences, Shanghai 201203, China
- University of Chinese Academy of Sciences, Beijing 100049, China
| | - Jingjing Sun
- Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica Chinese Academy of Sciences, Shanghai 201203, China
- University of Chinese Academy of Sciences, Beijing 100049, China
| | - Gang Wang
- Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica Chinese Academy of Sciences, Shanghai 201203, China
- University of Chinese Academy of Sciences, Beijing 100049, China
| | - Qun Ren
- Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica Chinese Academy of Sciences, Shanghai 201203, China
- Nanjing University of Chinese Medicine, Nanjing 210023, China
| | - Ning Qu
- Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica Chinese Academy of Sciences, Shanghai 201203, China
- University of Chinese Academy of Sciences, Beijing 100049, China
| | - Buying Niu
- Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica Chinese Academy of Sciences, Shanghai 201203, China
- University of Chinese Academy of Sciences, Beijing 100049, China
| | - Zhiyi Chen
- Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica Chinese Academy of Sciences, Shanghai 201203, China
- University of Chinese Academy of Sciences, Beijing 100049, China
- School of Pharmaceutical Science and Technology, Hangzhou Institute for Advanced Study, Hangzhou 330106, China
| | - Xia Sheng
- Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica Chinese Academy of Sciences, Shanghai 201203, China
- University of Chinese Academy of Sciences, Beijing 100049, China
| | - Yitian Wang
- Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica Chinese Academy of Sciences, Shanghai 201203, China
- University of Chinese Academy of Sciences, Beijing 100049, China
| | - Mingyue Zheng
- College of Pharmaceutical Sciences, Zhejiang University, Hangzhou 310058, China
- Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica Chinese Academy of Sciences, Shanghai 201203, China
- University of Chinese Academy of Sciences, Beijing 100049, China
- Nanjing University of Chinese Medicine, Nanjing 210023, China
- School of Pharmaceutical Science and Technology, Hangzhou Institute for Advanced Study, Hangzhou 330106, China
| | - Xutong Li
- Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica Chinese Academy of Sciences, Shanghai 201203, China
- University of Chinese Academy of Sciences, Beijing 100049, China
| |
|
31
|
Kehrein J, Bunker A, Luxenhofer R. POxload: Machine Learning Estimates Drug Loadings of Polymeric Micelles. Mol Pharm 2024; 21:3356-3374. [PMID: 38805643 PMCID: PMC11394009 DOI: 10.1021/acs.molpharmaceut.4c00086] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/30/2024]
Abstract
Block copolymers, composed of poly(2-oxazoline)s and poly(2-oxazine)s, can serve as drug delivery systems; they form micelles that carry poorly water-soluble drugs. Many recent studies have investigated the effects of structural changes of the polymer and the hydrophobic cargo on drug loading. In this work, we combine these data to establish an extended formulation database. Different molecular properties and fingerprints are tested for their applicability to serve as formulation-specific mixture descriptors. A variety of classification and regression models are built for different descriptor subsets and thresholds of loading efficiency and loading capacity, with the best models achieving overall good statistics for both cross- and external validation (balanced accuracies of 0.8). Subsequently, important features are dissected for interpretation, and the DrugBank is screened for potential therapeutic use cases where these polymers could be used to develop novel formulations of hydrophobic drugs. The most promising models are provided as an open-source software tool for other researchers to test the applicability of these delivery systems for potential new drug candidates.
Affiliation(s)
- Josef Kehrein
- Soft Matter Chemistry, Department of Chemistry, Faculty of Science, University of Helsinki, A. I. Virtasen aukio 1, 00014 Helsinki, Finland
- Drug Research Program, Division of Pharmaceutical Biosciences Faculty of Pharmacy, University of Helsinki, Viikinkaari 5 E, 00014 Helsinki, Finland
| | - Alex Bunker
- Drug Research Program, Division of Pharmaceutical Biosciences Faculty of Pharmacy, University of Helsinki, Viikinkaari 5 E, 00014 Helsinki, Finland
| | - Robert Luxenhofer
- Soft Matter Chemistry, Department of Chemistry, Faculty of Science, University of Helsinki, A. I. Virtasen aukio 1, 00014 Helsinki, Finland
| |
|
32
|
Hosseini MAH, Alizadeh AA, Shayanfar A. Prediction of the First-Pass Metabolism of a Drug After Oral Intake Based on Structural Parameters and Physicochemical Properties. Eur J Drug Metab Pharmacokinet 2024; 49:449-465. [PMID: 38733548 DOI: 10.1007/s13318-024-00892-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 03/07/2024] [Indexed: 05/13/2024]
Abstract
BACKGROUND AND OBJECTIVE: Oral first-pass metabolism plays a key role in a drug's pharmacokinetic profile. Predicting it from chemical structural parameters can be useful in the drug-design process. Developing an orally administered drug requires an acceptable pharmacokinetic profile, and such predictions can reduce the cost and time associated with evaluating the extent of the first-pass metabolism of a candidate compound in preclinical studies. The aim of this study is to estimate the first-pass metabolism of an orally administered drug. METHODS: A set of compounds with reported first-pass metabolism data was collected. Moreover, human intestinal absorption percentage and oral bioavailability data were extracted from the literature to propose a classification system that groups drugs by the extent of their first-pass metabolism. Various structural parameters were calculated for each compound. The relationship between the structural and physicochemical values of each compound and the class to which it belongs was obtained using logistic regression. RESULTS: Initial analysis showed that compounds with logD7.4 > 1 or a rugosity factor > 1.5 are more likely to have high first-pass metabolism. Four different models that can predict the oral first-pass metabolism with acceptable error were introduced. The overall accuracies of the models ranged from 72% (for models with simple descriptors) to 78% (for models with complex descriptors). Although the models with simple descriptors are less accurate than the complex models, they are more interpretable and easier for researchers to use. CONCLUSION: A novel classification of drugs based on the extent of the oral first-pass metabolism was introduced, and mechanistic models were developed to assign candidate compounds to the appropriate proposed classes.
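The two thresholds reported in the RESULTS above can be written down as a simple screening flag. The sketch below is only an illustration of that rule of thumb, not the authors' logistic regression models; the class name `Compound`, the function `likely_high_first_pass`, and all example values are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Compound:
    name: str
    log_d_7_4: float  # distribution coefficient at pH 7.4
    rugosity: float   # rugosity factor (surface descriptor)


def likely_high_first_pass(c: Compound) -> bool:
    """Flag a compound when either threshold quoted in the abstract is exceeded.

    This encodes only the initial-analysis observation (logD7.4 > 1 or
    rugosity > 1.5); it is an assumption-laden shortcut, not a fitted model.
    """
    return c.log_d_7_4 > 1.0 or c.rugosity > 1.5


# Hypothetical compounds with made-up descriptor values.
examples = [
    Compound("hypothetical_A", log_d_7_4=2.3, rugosity=1.2),
    Compound("hypothetical_B", log_d_7_4=0.4, rugosity=1.1),
    Compound("hypothetical_C", log_d_7_4=-0.5, rugosity=1.8),
]
flags = {c.name: likely_high_first_pass(c) for c in examples}
print(flags)
```

A flag of this kind only says that a compound is "more likely" to show high first-pass metabolism; the abstract reports 72% to 78% accuracy for the actual fitted models.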
Affiliation(s)
- Mir Amir Hossein Hosseini
- Student Research Committee, Tabriz University of Medical Sciences, Tabriz, Iran
- Department of Clinical Pharmacy, Faculty of Pharmacy, Tabriz University of Medical Sciences, Tabriz, Iran
| | - Ali Akbar Alizadeh
- Biotechnology Research Center, Tabriz University of Medical Sciences, Tabriz, Iran
| | - Ali Shayanfar
- Pharmaceutical Analysis Research Center, Tabriz University of Medical Sciences, Tabriz, Iran.
- Faculty of Pharmacy, Tabriz University of Medical Sciences, Golgasht St., Tabriz, 51664-14766, Iran.
| |
|
33
|
Limbu S, Glasgow E, Block T, Dakshanamurthy S. A Machine-Learning-Driven Pathophysiology-Based New Approach Method for the Dose-Dependent Assessment of Hazardous Chemical Mixtures and Experimental Validations. TOXICS 2024; 12:481. [PMID: 39058133 PMCID: PMC11281031 DOI: 10.3390/toxics12070481] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/30/2024] [Revised: 06/21/2024] [Accepted: 06/26/2024] [Indexed: 07/28/2024]
Abstract
Environmental chemicals, such as PFAS, exist as mixtures and are frequently encountered at varying concentrations, which can lead to serious health effects, such as cancer. Therefore, understanding the dose-dependent toxicity of chemical mixtures is essential for health risk assessment. However, comprehensive methods to assess toxicity and identify the mechanisms of these harmful mixtures are currently absent. In this study, the dose-dependent toxicity assessments of chemical mixtures are performed in three methodologically distinct phases. In the first phase, we evaluated our machine-learning method (AI-HNN) and pathophysiology method (CPTM) for predicting toxicity. In the second phase, we integrated AI-HNN and CPTM to establish a comprehensive new approach method (NAM) framework called AI-CPTM that is targeted at refining prediction accuracy and providing a comprehensive understanding of toxicity mechanisms. The third phase involved experimental validations of the AI-CPTM predictions. Initially, we developed binary, multiclass classification, and regression models to predict binary, categorical toxicity, and toxic potencies using nearly a thousand experimental mixtures. This empirical dataset was expanded with assumption-based virtual mixtures, compensating for the lack of experimental data and broadening the scope of the dataset. For comparison, we also developed machine-learning models based on RF, Bagging, AdaBoost, SVR, GB, KR, DT, KN, and Consensus methods. The AI-HNN achieved overall accuracies of over 80%, with the AUC exceeding 90%. In the final phase, we demonstrated the superior performance and predictive capability of AI-CPTM, including for PFAS mixtures and their interaction effects, through rigorous literature and statistical validations, along with experimental dose-response zebrafish-embryo toxicity assays. 
Overall, the AI-CPTM approach significantly improves upon the limitations of standalone AI models, showing extensive enhancements in identifying toxic chemicals and mixtures and their mechanisms. This study is the first to develop a hybrid NAM that integrates AI with a pathophysiology method to comprehensively predict chemical-mixture toxicity, carcinogenicity, and mechanisms.
Affiliation(s)
| | | | | | - Sivanesan Dakshanamurthy
- Lombardi Comprehensive Cancer Center, Georgetown University Medical Center, 3700 O St. NW, Washington, DC 20057, USA
| |
|
34
|
Garduño-Juárez R, Tovar-Anaya DO, Perez-Aguilar JM, Lozano-Aguirre Beltran LF, Zubillaga RA, Alvarez-Perez MA, Villarreal-Ramirez E. Molecular Dynamic Simulations for Biopolymers with Biomedical Applications. Polymers (Basel) 2024; 16:1864. [PMID: 39000719 PMCID: PMC11244511 DOI: 10.3390/polym16131864] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/27/2024] [Revised: 04/13/2024] [Accepted: 04/13/2024] [Indexed: 07/17/2024] Open
Abstract
Computational modeling (CM) is a versatile scientific methodology used to examine the properties and behavior of complex systems, such as polymeric materials for biomedical engineering. CM has emerged as a primary tool for predicting, setting up, and interpreting experimental results. Integrating in silico and in vitro experiments accelerates scientific advancements, yielding quicker results at a reduced cost. While CM is a mature discipline, its use in biomedical engineering for biopolymer materials has only recently gained prominence. In biopolymer biomedical engineering, CM focuses on three key research areas: (A) Computer-aided design (CAD/CAM) uses specialized software to design and model biopolymers for various biomedical applications. This technology allows researchers to create precise three-dimensional models of biopolymers, taking into account their chemical, structural, and functional properties. These models can be used to enhance the structure of biopolymers and improve their effectiveness in specific medical applications. (B) Finite element analysis is a computational technique used to analyze and solve problems in engineering and physics. This approach divides the physical domain into small finite elements with simple geometric shapes, which enables the study and understanding of the mechanical and structural behavior of biopolymers in biomedical environments. (C) Molecular dynamics (MD) simulations use advanced computational techniques to study the behavior of biopolymers at the molecular and atomic levels. These simulations are fundamental for better understanding biological processes at the molecular level. Studying the wide-ranging uses of MD simulations in biopolymers involves examining the structural, functional, and evolutionary aspects of biomolecular systems over time. MD simulations solve Newton's equations of motion for all-atom systems, producing spatial trajectories for each atom. This provides valuable insights into properties such as water absorption on biopolymer surfaces and interactions with solid surfaces, which are crucial for assessing biomaterials. This review provides a comprehensive overview of the various applications of MD simulations in biopolymers. Additionally, it highlights the flexibility, robustness, and synergistic relationship between in silico and experimental techniques.
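The statement that MD simulations solve Newton's equations of motion to produce a trajectory can be illustrated with a minimal sketch. The example below integrates a single particle in one dimension with the velocity Verlet scheme, a commonly used MD integrator. The harmonic force, unit mass, time step, and the function name `velocity_verlet` are illustrative assumptions and do not come from the review.

```python
def velocity_verlet(x, v, force, mass, dt, n_steps):
    """Integrate m * d2x/dt2 = force(x) and return (positions, final velocity)."""
    a = force(x) / mass
    positions = [x]
    for _ in range(n_steps):
        # Advance the position using the current velocity and acceleration.
        x = x + v * dt + 0.5 * a * dt * dt
        # Recompute the acceleration at the new position.
        a_new = force(x) / mass
        # Advance the velocity with the average of old and new accelerations.
        v = v + 0.5 * (a + a_new) * dt
        a = a_new
        positions.append(x)
    return positions, v


def harmonic_force(x):
    """Toy force F = -k * x with k = 1 (assumed, for illustration only)."""
    return -1.0 * x


trajectory, v_final = velocity_verlet(
    x=1.0, v=0.0, force=harmonic_force, mass=1.0, dt=0.01, n_steps=628
)
# Total energy (kinetic + potential) should stay close to its initial value 0.5.
energy = 0.5 * v_final**2 + 0.5 * trajectory[-1] ** 2
print(len(trajectory), energy)
```

Production MD codes apply the same update to every atom in three dimensions, with forces derived from a molecular force field rather than a single spring.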
Affiliation(s)
- Ramón Garduño-Juárez
- Instituto de Ciencias Físicas, Universidad Nacional Autónoma de México, Cuernavaca 62210, Mexico
| | - David O Tovar-Anaya
- Laboratorio de Bioingeniería de Tejidos, División de Estudios de Posgrado e Investigación, Coyoacán 04510, Mexico
| | - Jose Manuel Perez-Aguilar
- School of Chemical Sciences, Meritorious Autonomous University of Puebla (BUAP), University City, Puebla 72570, Mexico
| | | | - Rafael A Zubillaga
- Departamento de Química, Universidad Autónoma Metropolitana-Iztapalapa, Mexico City 09340, Mexico
| | - Marco Antonio Alvarez-Perez
- Laboratorio de Bioingeniería de Tejidos, División de Estudios de Posgrado e Investigación, Coyoacán 04510, Mexico
| | - Eduardo Villarreal-Ramirez
- Laboratorio de Bioingeniería de Tejidos, División de Estudios de Posgrado e Investigación, Coyoacán 04510, Mexico
| |
|
35
|
Zhang R, Nolte D, Sanchez-Villalobos C, Ghosh S, Pal R. Topological regression as an interpretable and efficient tool for quantitative structure-activity relationship modeling. Nat Commun 2024; 15:5072. [PMID: 38871711 DOI: 10.1038/s41467-024-49372-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/13/2023] [Accepted: 06/04/2024] [Indexed: 06/15/2024] Open
Abstract
Quantitative structure-activity relationship (QSAR) modeling is a powerful tool for drug discovery, yet the lack of interpretability of commonly used QSAR models hinders their application in molecular design. We propose a similarity-based regression framework, topological regression (TR), that offers a statistically grounded, computationally fast, and interpretable technique to predict drug responses. We compare the predictive performance of TR on 530 ChEMBL human target activity datasets against the predictive performance of deep-learning-based QSAR models. Our results suggest that our sparse TR model can achieve equal, if not better, performance than the deep learning-based QSAR models and provide better intuitive interpretation by extracting an approximate isometry between the chemical space of the drugs and their activity space.
Affiliation(s)
- Ruibo Zhang
- Department of Electrical and Computer Engineering, Texas Tech University, Lubbock, TX, 79409, USA
| | - Daniel Nolte
- Department of Electrical and Computer Engineering, Texas Tech University, Lubbock, TX, 79409, USA
| | - Cesar Sanchez-Villalobos
- Department of Electrical and Computer Engineering, Texas Tech University, Lubbock, TX, 79409, USA
| | - Souparno Ghosh
- Department of Statistics, University of Nebraska - Lincoln, Lincoln, NB, 68588, USA.
| | - Ranadip Pal
- Department of Electrical and Computer Engineering, Texas Tech University, Lubbock, TX, 79409, USA.
| |
|
36
|
Zhou Y, Wang Z, Huang Z, Li W, Chen Y, Yu X, Tang Y, Liu G. In silico prediction of ocular toxicity of compounds using explainable machine learning and deep learning approaches. J Appl Toxicol 2024; 44:892-907. [PMID: 38329145 DOI: 10.1002/jat.4586] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/28/2023] [Revised: 01/16/2024] [Accepted: 01/16/2024] [Indexed: 02/09/2024]
Abstract
The accurate identification of chemicals with ocular toxicity is of paramount importance in health hazard assessment. In contemporary chemical toxicology, there is a growing emphasis on refining, reducing, and replacing animal testing in safety evaluations. Therefore, the development of robust computational tools is crucial for regulatory applications. The performance of predictive models is heavily reliant on the quality and quantity of data. In this investigation, we assembled the most extensive dataset (4901 compounds) sourced from governmental GHS-compliant databases and the literature to develop binary classification models of chemical ocular toxicity. We employed 12 molecular representations in conjunction with six machine learning algorithms and two deep learning algorithms to create a series of binary classification models. The findings indicated that the deep learning method GCN outperformed the machine learning models in cross-validation, achieving an AUC of 0.915. However, the top-performing machine learning model (RF-Descriptor) achieved an AUC of 0.869 on the test set and was therefore selected as the best model. To enhance model interpretability, we applied the SHAP method and attention-weight analysis. The two approaches offered visual depictions of the relevance of key descriptors and substructures in predicting the ocular toxicity of chemicals. Thus, we struck a balance between data quality and model interpretability, rendering our model valuable for predicting and understanding potential ocular-toxic compounds in the early stages of drug discovery.
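The AUC values quoted above summarize how well a classifier ranks toxic compounds above non-toxic ones. As a minimal sketch, the AUC can be computed as the fraction of (positive, negative) pairs in which the positive receives the higher score, with ties counted as one half. The function name `roc_auc` and the labels and scores below are made up for illustration and are unrelated to the study's data.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison (Mann-Whitney form)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # tie counts as half a win
    return wins / (len(pos) * len(neg))


# Hypothetical predictions: 1 = toxic, 0 = non-toxic.
labels = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.3, 0.2]
auc = roc_auc(labels, scores)
print(round(auc, 3))
```

The pairwise form is quadratic in the number of compounds, so for datasets of thousands of compounds a rank-based or library implementation is the more practical choice.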
Affiliation(s)
- Yiqing Zhou
- Shanghai Frontiers Science Center of Optogenetic Techniques for Cell Metabolism, Shanghai Key Laboratory of New Drug Design, School of Pharmacy, East China University of Science and Technology, Shanghai, China
| | - Ze Wang
- Shanghai Frontiers Science Center of Optogenetic Techniques for Cell Metabolism, Shanghai Key Laboratory of New Drug Design, School of Pharmacy, East China University of Science and Technology, Shanghai, China
| | - Zejun Huang
- Shanghai Frontiers Science Center of Optogenetic Techniques for Cell Metabolism, Shanghai Key Laboratory of New Drug Design, School of Pharmacy, East China University of Science and Technology, Shanghai, China
| | - Weihua Li
- Shanghai Frontiers Science Center of Optogenetic Techniques for Cell Metabolism, Shanghai Key Laboratory of New Drug Design, School of Pharmacy, East China University of Science and Technology, Shanghai, China
| | - Yuanting Chen
- Shanghai Frontiers Science Center of Optogenetic Techniques for Cell Metabolism, Shanghai Key Laboratory of New Drug Design, School of Pharmacy, East China University of Science and Technology, Shanghai, China
| | - Xinxin Yu
- Shanghai Frontiers Science Center of Optogenetic Techniques for Cell Metabolism, Shanghai Key Laboratory of New Drug Design, School of Pharmacy, East China University of Science and Technology, Shanghai, China
| | - Yun Tang
- Shanghai Frontiers Science Center of Optogenetic Techniques for Cell Metabolism, Shanghai Key Laboratory of New Drug Design, School of Pharmacy, East China University of Science and Technology, Shanghai, China
| | - Guixia Liu
- Shanghai Frontiers Science Center of Optogenetic Techniques for Cell Metabolism, Shanghai Key Laboratory of New Drug Design, School of Pharmacy, East China University of Science and Technology, Shanghai, China
| |
|
37
|
Kaboudi N, Asl SG, Nourani N, Shayanfar A. Solubilization of drugs using beta-cyclodextrin: Experimental data and modeling. ANNALES PHARMACEUTIQUES FRANÇAISES 2024; 82:663-672. [PMID: 38340807 DOI: 10.1016/j.pharma.2024.02.003] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/18/2023] [Revised: 01/30/2024] [Accepted: 02/05/2024] [Indexed: 02/12/2024]
Abstract
Many drug candidates fail to complete the entire drug development process because of poor physicochemical properties. Solubility is an important physicochemical property that plays a vital role in various stages of drug discovery and development. Several methods have been proposed to enhance the solubility of drugs, and complex formation with cyclodextrins is among them. Beta-cyclodextrin (βCD) is a common excipient for solubilization of drugs. The aim of this study is to develop mechanistic QSPR models to predict the solubility enhancement of a drug in the presence of βCD. In this study, the solubility enhancement of some drugs in the presence of 10 mM βCD at 25 °C was experimentally determined or collected from the literature. Two different models to predict the solubilization by βCD were developed by binary logistic regression using structural properties of the drugs, achieving more than 80% accuracy. Polar surface area and excess molar refraction are the main parameters for estimating solubilization by βCD. Moreover, other descriptors related to hydrophobicity and the hydrogen-bonding capability of the molecules could improve the accuracy of the established models.
Affiliation(s)
- Navid Kaboudi
- Student Research Committee, Faculty of Pharmacy, Tabriz University of Medical Sciences, Tabriz, Iran
| | - Saba Ghasemi Asl
- Student Research Committee, Faculty of Pharmacy, Tabriz University of Medical Sciences, Tabriz, Iran
| | - Nasim Nourani
- Biotechnology Research Center, Tabriz University of Medical Sciences, Tabriz, Iran
| | - Ali Shayanfar
- Pharmaceutical Analysis Research Center, Tabriz University of Medical Sciences, Tabriz, Iran; Faculty of Pharmacy, Tabriz University of Medical Sciences, Tabriz, Iran.
| |
|
38
|
Bessa CDPB, Feu AE, de Menezes RPB, Scotti MT, Lima JMG, Lima ML, Tempone AG, de Andrade JP, Bastida J, Borges WDS. Multitarget anti-parasitic activities of isoquinoline alkaloids isolated from Hippeastrum aulicum (Amaryllidaceae). PHYTOMEDICINE : INTERNATIONAL JOURNAL OF PHYTOTHERAPY AND PHYTOPHARMACOLOGY 2024; 128:155414. [PMID: 38503155 DOI: 10.1016/j.phymed.2024.155414] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/26/2023] [Revised: 01/02/2024] [Accepted: 02/03/2024] [Indexed: 03/21/2024]
Abstract
BACKGROUND: Chagas disease and leishmaniasis affect a significant portion of the Latin American population and still lack efficient treatments. In this context, natural products emerge as promising compounds for developing more effective therapies, aiming to mitigate side effects and drug resistance. Notably, species from the Amaryllidaceae family emerge as potential reservoirs of antiparasitic agents due to the presence of diverse biologically active alkaloids. PURPOSE: To assess the anti-Trypanosoma cruzi and anti-Leishmania infantum activity of five isolated alkaloids from Hippeastrum aulicum Herb. (Amaryllidaceae) against different life stages of the parasites using in silico and in vitro assays. Furthermore, molecular docking was employed to evaluate the interaction of the most active alkaloids. METHODS: Five natural isoquinoline alkaloids isolated in suitable quantities for in vitro testing underwent preliminary in silico analysis to predict their potential efficacy against Trypanosoma cruzi (amastigote and trypomastigote forms) and Leishmania infantum (amastigote and promastigote forms). The in vitro antiparasitic activity and mammalian cytotoxicity were investigated, and the findings of both analyses (in silico and in vitro) were then compared. Additionally, this study employed the molecular docking technique, utilizing cruzain (T. cruzi) and sterol 14α-demethylase (CYP51, L. infantum) as crucial biological targets for parasite survival, specifically focusing on compounds that exhibited promising activities against both parasites. RESULTS: Computational techniques identified the alkaloids haemanthamine (1) and lycorine (8) as the most active against T. cruzi (amastigote and trypomastigote) and L. infantum (amastigote and promastigote), while also revealing unprecedented activity of alkaloid 7‑methoxy-O-methyllycorenine (6). 
The in vitro analysis confirmed the in silico tests, in which compound 1 presented the best activities against the promastigote and amastigote forms of L. infantum with half-maximal inhibitory concentration (IC50) 0.6 µM and 1.78 µM, respectively. Compound 8 exhibited significant activity against the amastigote form of T. cruzi (IC50 7.70 µM), and compound 6 demonstrated activity against the trypomastigote forms of T. cruzi and amastigote of L. infantum, with IC50 values of 89.55 and 86.12 µM, respectively. Molecular docking analyses indicated that alkaloids 1 and 8 exhibited superior interaction energies compared to the inhibitors. CONCLUSION The hitherto unreported potential of compound 6 against T. cruzi trypomastigotes and L. infantum amastigotes is now brought to the forefront. Furthermore, the acquired dataset signifies that the isolated alkaloids 1 and 8 from H. aulicum might serve as prototypes for subsequent structural refinements aimed at the exploration of novel leads against both T. cruzi and L. infantum parasites.
Affiliation(s)
- Carliani Dal Piero Betzel Bessa
- Programa de Pós-Graduação em Química, Departamento de Química, Universidade Federal do Espírito Santo, Vitória-ES 29075-910, Brazil
| | - Amanda Eiriz Feu
- Programa de Pós-Graduação em Química, Departamento de Química, Universidade Federal do Espírito Santo, Vitória-ES 29075-910, Brazil
| | - Renata Priscila Barros de Menezes
- Programa de Pós-graduação em Produtos Naturais e Sintéticos Bioativos (PgPNSB), Universidade Federal da Paraíba, Campus I, Cidade Universitária, João Pessoa 58051-900, Brazil
| | - Marcus Tullius Scotti
- Programa de Pós-graduação em Produtos Naturais e Sintéticos Bioativos (PgPNSB), Universidade Federal da Paraíba, Campus I, Cidade Universitária, João Pessoa 58051-900, Brazil
| | | | - Marta Lopes Lima
- School of Life Sciences, University of Dundee, Scotland DD1 4HN, United Kingdom
| | | | - Jean Paulo de Andrade
- Departamento de Medicina Traslacional, Facultad de Medicina, Escuela de Química y Farmacia, Universidad Católica del Maule, Talca 3480112, Chile
| | - Jaume Bastida
- Departament de Biologia, Sanitat i Medi Ambient, Facultat de Farmàcia i Ciències de l´Alimentació, Universitat de Barcelona, Barcelona 08028, Spain
| | - Warley de Souza Borges
- Programa de Pós-Graduação em Química, Departamento de Química, Universidade Federal do Espírito Santo, Vitória-ES 29075-910, Brazil.
| |
|
39
|
Wang J, Wang P, Liu B, Kinney PL, Huang L, Chen K. Comprehensive evaluation framework for intervention on health effects of ambient temperature. ECO-ENVIRONMENT & HEALTH 2024; 3:154-164. [PMID: 38646097 PMCID: PMC11031729 DOI: 10.1016/j.eehl.2024.01.004] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/11/2023] [Revised: 12/28/2023] [Accepted: 01/12/2024] [Indexed: 04/23/2024]
Abstract
Despite the existence of many interventions to mitigate or adapt to the health effects of climate change, their effectiveness remains unclear. Here, we introduce the Comprehensive Evaluation Framework for Intervention on Health Effects of Ambient Temperature to evaluate study designs and effects of intervention studies. The framework comprises three types of interventions: proactive, indirect, and direct, and four categories of indicators: classification, methods, scope, and effects. We trialed the framework by an evaluation of existing intervention studies. The evaluation revealed that each intervention has its own applicable characteristics in terms of effectiveness, feasibility, and generalizability scores. We expanded the framework's potential by offering a list of intervention recommendations in different scenarios. Future applications are then explored to establish models of the relationship between study designs and intervention effects, facilitating effective interventions to address the health effects of ambient temperature under climate change.
Affiliation(s)
- Jiaming Wang
- State Key Laboratory of Pollution Control and Resource Reuse, School of the Environment, Nanjing University, Nanjing 210023, China
| | - Peng Wang
- State Key Laboratory of Pollution Control and Resource Reuse, School of the Environment, Nanjing University, Nanjing 210023, China
- Faculty of Civil Engineering and Mechanics, Jiangsu University, Zhenjiang 212013, China
| | - Beibei Liu
- State Key Laboratory of Pollution Control and Resource Reuse, School of the Environment, Nanjing University, Nanjing 210023, China
| | - Patrick L. Kinney
- Department of Environmental Health, Boston University School of Public Health, Boston, MA 02118, USA
| | - Lei Huang
- State Key Laboratory of Pollution Control and Resource Reuse, School of the Environment, Nanjing University, Nanjing 210023, China
- Center for Public Health Research, Medical School of Nanjing University, Nanjing 210093, China
| | - Kai Chen
- Department of Environmental Health Sciences, Yale Center on Climate Change and Health, Yale School of Public Health, New Haven, CT 06510, USA
| |
|
40
|
Melo-Filho CC, Su G, Liu K, Muratov EN, Tropsha A, Liu J. Modeling interactions between Heparan sulfate and proteins based on the Heparan sulfate microarray analysis. Glycobiology 2024; 34:cwae039. [PMID: 38836441 PMCID: PMC11180703 DOI: 10.1093/glycob/cwae039] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/17/2024] [Revised: 04/30/2024] [Accepted: 05/29/2024] [Indexed: 06/06/2024] Open
Abstract
Heparan sulfate (HS), a sulfated polysaccharide abundant in the extracellular matrix, plays pivotal roles in various physiological and pathological processes by interacting with proteins. Investigating the binding selectivity of HS oligosaccharides to target proteins is essential, but the exhaustive inclusion of all possible oligosaccharides in microarray experiments is impractical. To address this challenge, we present a hybrid pipeline that integrates microarray and in silico techniques to design oligosaccharides with desired protein affinity. Using fibroblast growth factor 2 (FGF2) as a model protein, we assembled an in-house dataset of HS oligosaccharides on microarrays and developed two structural representations: a standard representation with all atoms explicit and a simplified representation with disaccharide units as "quasi-atoms." Predictive Quantitative Structure-Activity Relationship (QSAR) models for FGF2 affinity were developed using the Random Forest (RF) algorithm. The resulting models, considering the applicability domain, demonstrated high predictivity, with a correct classification rate of 0.81-0.80 and improved positive predictive values (PPV) up to 0.95. Virtual screening of 40 new oligosaccharides using the simplified model identified 15 computational hits, 11 of which were experimentally validated for high FGF2 affinity. This hybrid approach marks a significant step toward the targeted design of oligosaccharides with desired protein interactions, providing a foundation for broader applications in glycobiology.
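The two statistics quoted above can be made concrete with a short sketch. It assumes the common definitions: correct classification rate (CCR) as the mean of sensitivity and specificity, and positive predictive value (PPV) as the fraction of predicted positives that are true positives. The abstract does not state its formulas, so these definitions are an assumption, and the confusion-matrix counts used for the CCR example are hypothetical. The hit-rate calculation uses only the screening numbers given in the abstract (15 computational hits, 11 validated).

```python
def correct_classification_rate(tp, fn, tn, fp):
    """CCR, assumed here to be the mean of sensitivity and specificity."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return 0.5 * (sensitivity + specificity)


def positive_predictive_value(tp, fp):
    """PPV: share of predicted positives that are confirmed positives."""
    return tp / (tp + fp)


# Virtual screening outcome from the abstract: 15 hits, 11 validated.
screening_ppv = positive_predictive_value(tp=11, fp=15 - 11)

# Hypothetical confusion-matrix counts, for illustration only.
example_ccr = correct_classification_rate(tp=40, fn=10, tn=45, fp=5)

print(round(screening_ppv, 3), round(example_ccr, 3))
```

Read this way, the prospective screen confirmed roughly 73% of the predicted hits.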
Affiliation(s)
- Cleber C Melo-Filho
- Laboratory for Molecular Modeling, Division of Chemical Biology and Medicinal Chemistry, UNC Eshelman School of Pharmacy, 301 Beard Hall, University of North Carolina, Chapel Hill, NC 27599, United States
| | - Guowei Su
- Glycan Therapeutics, 617 Hutton Street, Raleigh, NC 27606, United States
| | - Kevin Liu
- Glycan Therapeutics, 617 Hutton Street, Raleigh, NC 27606, United States
| | - Eugene N Muratov
- Laboratory for Molecular Modeling, Division of Chemical Biology and Medicinal Chemistry, UNC Eshelman School of Pharmacy, 301 Beard Hall, University of North Carolina, Chapel Hill, NC 27599, United States
| | - Alexander Tropsha
- Laboratory for Molecular Modeling, Division of Chemical Biology and Medicinal Chemistry, UNC Eshelman School of Pharmacy, 301 Beard Hall, University of North Carolina, Chapel Hill, NC 27599, United States
| | - Jian Liu
- Division of Chemical Biology and Medicinal Chemistry, Eshelman School of Pharmacy, 1044 Genetic Medicine Bldg., University of North Carolina, Chapel Hill, NC 27599, United States
| |
|
41
|
Camargo PG, Dos Santos CR, Girão Albuquerque M, Rangel Rodrigues C, Lima CHDS. Py-CoMFA, docking, and molecular dynamics simulations of Leishmania (L.) amazonensis arginase inhibitors. Sci Rep 2024; 14:11575. [PMID: 38773273 PMCID: PMC11109165 DOI: 10.1038/s41598-024-62520-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/07/2024] [Accepted: 05/17/2024] [Indexed: 05/23/2024] Open
Abstract
Leishmaniasis is a disease caused by a protozoan of the genus Leishmania, affecting millions of people, mainly in tropical countries, due to poor social conditions and low economic development. First-line chemotherapeutic agents involve highly toxic pentavalent antimonials, while treatment failure is mainly due to the emergence of drug-resistant strains. Leishmania arginase (ARG) enzyme is vital in pathogenicity and contributes to a higher infection rate, thus representing a potential drug target. This study helps in designing ARG inhibitors for the treatment of leishmaniasis. Py-CoMFA (3D-QSAR) models were constructed using 34 inhibitors from different chemical classes against ARG from L. (L.) amazonensis (LaARG). The 3D-QSAR predictions showed an excellent correlation between experimental and calculated pIC50 values. The molecular docking study identified the favorable hydrophobicity contribution of phenyl and cyclohexyl groups as substituents in the enzyme allosteric site. Molecular dynamics simulations of selected protein-ligand complexes were conducted to understand derivatives' interaction modes and affinity in both active and allosteric sites. Two cinnamide compounds, 7g and 7k, were identified, with similar structures to the reference 4h allosteric site inhibitor. These compounds can guide the development of more effective arginase inhibitors as potential antileishmanial drugs.
Affiliation(s)
- Priscila Goes Camargo
- Faculdade de Farmácia, Departamento de Fármacos e Medicamentos, Universidade Federal do Rio de Janeiro, Rio de Janeiro, RJ, Brazil
| | - Carine Ribeiro Dos Santos
- Laboratório de Modelagem Molecular (LabMMol), Instituto de Química, Universidade Federal do Rio de Janeiro, Rio de Janeiro, RJ, Brazil
| | - Magaly Girão Albuquerque
- Laboratório de Modelagem Molecular (LabMMol), Instituto de Química, Universidade Federal do Rio de Janeiro, Rio de Janeiro, RJ, Brazil
| | - Carlos Rangel Rodrigues
- Faculdade de Farmácia, Departamento de Fármacos e Medicamentos, Universidade Federal do Rio de Janeiro, Rio de Janeiro, RJ, Brazil.
| | - Camilo Henrique da Silva Lima
- Laboratório de Modelagem Molecular (LabMMol), Instituto de Química, Universidade Federal do Rio de Janeiro, Rio de Janeiro, RJ, Brazil.
| |
|
42
|
Walter M, Webb SJ, Gillet VJ. Interpreting Neural Network Models for Toxicity Prediction by Extracting Learned Chemical Features. J Chem Inf Model 2024; 64:3670-3688. [PMID: 38686880 PMCID: PMC11094726 DOI: 10.1021/acs.jcim.4c00127] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/23/2024] [Revised: 04/15/2024] [Accepted: 04/15/2024] [Indexed: 05/02/2024]
Abstract
Neural network models have become a popular machine-learning technique for the toxicity prediction of chemicals. However, due to their complex structure, it is difficult to understand predictions made by these models, which limits confidence. Current techniques to tackle this problem, such as SHAP or integrated gradients, provide insights by attributing importance to the input features of individual compounds. While these methods have produced promising results in some cases, they do not shed light on how representations of compounds are transformed in hidden layers, which constitute how neural networks learn. We present a novel technique to interpret neural networks which identifies chemical substructures in training data found to be responsible for the activation of hidden neurons. For individual test compounds, the importance of hidden neurons is determined, and the associated substructures are leveraged to explain the model prediction. Using structural alerts for mutagenicity from the Derek Nexus expert system as ground truth, we demonstrate the validity of the approach and show that model explanations are competitive with and complementary to explanations obtained from an established feature attribution method.
Affiliation(s)
- Moritz Walter
- Information School, University of Sheffield, The Wave, 2 Whitham Road, Sheffield S10 2AH, U.K.
| | - Samuel J. Webb
- Lhasa Limited, Granary Wharf House, 2 Canal Wharf, Leeds LS11 5PY, U.K.
| | - Valerie J. Gillet
- Information School, University of Sheffield, The Wave, 2 Whitham Road, Sheffield S10 2AH, U.K.
| |
|
43
|
Burger PB, Hu X, Balabin I, Muller M, Stanley M, Joubert F, Kaiser TM. FEP Augmentation as a Means to Solve Data Paucity Problems for Machine Learning in Chemical Biology. J Chem Inf Model 2024; 64:3812-3825. [PMID: 38651738 PMCID: PMC11094716 DOI: 10.1021/acs.jcim.4c00071] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/12/2024] [Revised: 04/01/2024] [Accepted: 04/02/2024] [Indexed: 04/25/2024]
Abstract
In the realm of medicinal chemistry, the primary objective is to swiftly optimize a multitude of chemical properties of a set of compounds to yield a clinical candidate poised for clinical trials. In recent years, two computational techniques, machine learning (ML) and physics-based methods, have evolved substantially and are now frequently incorporated into the medicinal chemist's toolbox to enhance the efficiency of both hit optimization and candidate design. Both computational methods come with their own set of limitations, and they are often used independently of each other. ML's capability to screen extensive compound libraries expediently is tempered by its reliance on quality data, which can be scarce, especially during early-stage optimization. Conversely, physics-based approaches like free energy perturbation (FEP) are frequently constrained by low throughput and high cost by comparison; however, physics-based methods are capable of making highly accurate binding affinity predictions. In this study, we harnessed the strength of FEP to overcome data paucity in ML by generating virtual activity data sets which then inform the training of algorithms. Here, we show that ML algorithms trained with an FEP-augmented data set could achieve predictive accuracy comparable to that of algorithms trained on experimental data from biological assays. Throughout the paper, we emphasize key mechanistic considerations that must be taken into account when aiming to augment data sets and lay the groundwork for successful implementation. Ultimately, the study advocates for the synergy of physics-based methods and ML to expedite the lead optimization process. We believe that the physics-based augmentation of ML will significantly benefit drug discovery, as these techniques continue to evolve.
Affiliation(s)
- Pieter B. Burger
- Avicenna Biosciences Inc., 101 W. Chapel Hill Street, Suite 210, Durham, North Carolina 27001, United States
| | - Xiaohu Hu
- Schrödinger, Inc., 120 West 45th Street, New York, New York 10036, United States
| | - Ilya Balabin
- Avicenna Biosciences Inc., 101 W. Chapel Hill Street, Suite 210, Durham, North Carolina 27001, United States
| | - Morné Muller
- Avicenna Biosciences Inc., 101 W. Chapel Hill Street, Suite 210, Durham, North Carolina 27001, United States
| | - Megan Stanley
- Microsoft Research AI4Science, 21 Station Road, Cambridge CB1 2FB, U.K.
| | - Fourie Joubert
- Centre for Bioinformatics and Computational Biology, Department of Biochemistry, Genetics and Microbiology, University of Pretoria, Pretoria 0001, South Africa
| | - Thomas M. Kaiser
- Avicenna Biosciences Inc., 101 W. Chapel Hill Street, Suite 210, Durham, North Carolina 27001, United States
| |
|
44
|
Ni B, Wang H, Khalaf HKS, Blay V, Houston DR. AutoDock-SS: AutoDock for Multiconformational Ligand-Based Virtual Screening. J Chem Inf Model 2024; 64:3779-3789. [PMID: 38624083 PMCID: PMC11094722 DOI: 10.1021/acs.jcim.4c00136] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/25/2024] [Revised: 04/01/2024] [Accepted: 04/02/2024] [Indexed: 04/17/2024]
Abstract
Ligand-based virtual screening (LBVS) can be pivotal for identifying potential drug leads, especially when the target protein's structure is unknown. However, current LBVS methods are limited in their ability to consider the ligand conformational flexibility. This study presents AutoDock-SS (Similarity Searching), which adapts protein-ligand docking for use in LBVS. AutoDock-SS integrates novel ligand-based grid maps and AutoDock-GPU into a novel three-dimensional LBVS workflow. Unlike other approaches based on pregenerated conformer libraries, AutoDock-SS's built-in conformational search optimizes conformations dynamically based on the reference ligand, thus providing a more accurate representation of relevant ligand conformations. AutoDock-SS supports two modes: single and multiple ligand queries, allowing for the seamless consideration of multiple reference ligands. When tested on the Directory of Useful Decoys, Enhanced (DUD-E) data set, AutoDock-SS surpassed alternative 3D LBVS methods, achieving a mean AUROC of 0.775 and an EF1% of 25.72 in single-reference mode. The multireference mode, evaluated on the augmented DUD-E+ data set, demonstrated superior accuracy with a mean AUROC of 0.843 and an EF1% of 34.59. This enhanced performance underscores AutoDock-SS's ability to treat compounds as conformationally flexible while considering the ligand's shape, pharmacophore, and electrostatic potential, expanding the potential of LBVS methods.
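For readers unfamiliar with the two screening metrics quoted in this abstract, the following is a minimal sketch of how AUROC and the enrichment factor at 1% (EF1%) are commonly computed from a ranked list. It is not the AutoDock-SS evaluation code. The function names and toy data are hypothetical, and the EF convention used (hit rate in the top fraction divided by the overall hit rate, with the cutoff rounded up) is one of several in circulation.

```python
import math


def enrichment_factor(scores, labels, fraction=0.01):
    """Enrichment factor in the top `fraction` of the ranking.

    Higher scores are assumed to rank first; labels are 1 (active) or 0 (decoy).
    """
    n_total = len(scores)
    n_actives = sum(labels)
    if n_total == 0 or n_actives == 0:
        raise ValueError("need at least one compound and one active")
    n_top = max(1, math.ceil(n_total * fraction))
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    actives_in_top = sum(label for _, label in ranked[:n_top])
    return (actives_in_top / n_top) / (n_actives / n_total)


def auroc(scores, labels):
    """AUROC as the probability that a random active outscores a random decoy.

    Ties count as half a win. This pairwise form is O(actives * decoys),
    which is fine for a sketch but slow for large libraries.
    """
    actives = [s for s, l in zip(scores, labels) if l == 1]
    decoys = [s for s, l in zip(scores, labels) if l == 0]
    if not actives or not decoys:
        raise ValueError("need at least one active and one decoy")
    wins = sum(
        1.0 if a > d else 0.5 if a == d else 0.0
        for a in actives
        for d in decoys
    )
    return wins / (len(actives) * len(decoys))


# Toy library of 200 compounds with 10 actives, invented for illustration.
toy_scores = list(range(200, 0, -1))
toy_active_positions = {0, 1, 10, 20, 30, 40, 50, 60, 70, 80}
toy_labels = [1 if i in toy_active_positions else 0 for i in range(200)]
print(enrichment_factor(toy_scores, toy_labels), auroc(toy_scores, toy_labels))
```

Under this convention the maximum possible EF1% is bounded by the ratio of compounds to actives, so EF values are only comparable between benchmarks with similar active-to-decoy ratios.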
Affiliation(s)
- Boyang Ni
- Institute for Quantitative Biology, Biochemistry and Biotechnology, University of Edinburgh, Edinburgh EH9 3BF, U.K.
| | - Haoying Wang
- Institute for Quantitative Biology, Biochemistry and Biotechnology, University of Edinburgh, Edinburgh EH9 3BF, U.K.
| | - Huda Kadhim Salem Khalaf
- Institute for Quantitative Biology, Biochemistry and Biotechnology, University of Edinburgh, Edinburgh EH9 3BF, U.K.
| | - Vincent Blay
- Department of Microbiology and Environmental Toxicology, University of California at Santa Cruz, Santa Cruz, California 95064, United States
| | - Douglas R. Houston
- Institute for Quantitative Biology, Biochemistry and Biotechnology, University of Edinburgh, Edinburgh EH9 3BF, U.K.
| |
|
45
|
Lovrić M, Wang T, Staffe MR, Šunić I, Časni K, Lasky-Su J, Chawes B, Rasmussen MA. A Chemical Structure and Machine Learning Approach to Assess the Potential Bioactivity of Endogenous Metabolites and Their Association with Early Childhood Systemic Inflammation. Metabolites 2024; 14:278. [PMID: 38786755 PMCID: PMC11122766 DOI: 10.3390/metabo14050278] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/06/2024] [Revised: 04/29/2024] [Accepted: 05/08/2024] [Indexed: 05/25/2024] Open
Abstract
Metabolomics has gained much attention due to its potential to reveal molecular disease mechanisms and present viable biomarkers. This work uses a panel of untargeted serum metabolomes from 602 children from the COPSAC2010 mother-child cohort. The annotated part of the metabolome consists of 517 chemical compounds curated using automated procedures. We created a filtering method for the quantified metabolites using predicted quantitative structure-bioactivity relationships for the Tox21 database on nuclear receptors and stress response in cell lines. The metabolites measured in the children's serums are predicted to affect specific targeted models, known for their significance in inflammation, immune function, and health outcomes. The targets from Tox21 have been used as targets with quantitative structure-activity relationships (QSARs). They were trained for ~7000 structures, saved as models, and then applied to the annotated metabolites to predict their potential bioactivities. The models were selected based on strict accuracy criteria surpassing random effects. After application, 52 metabolites showed potential bioactivity based on structural similarity with known active compounds from the Tox21 set. The filtered compounds were subsequently used and weighted by their bioactive potential to show an association with early childhood hs-CRP levels at six months in a linear model supporting a physiological adverse effect on systemic low-grade inflammation.
Affiliation(s)
- Mario Lovrić
- COPSAC, Copenhagen Prospective Studies on Asthma in Childhood, Herlev and Gentofte Hospital, 2820 Gentofte, Denmark
- Centre for Applied Bioanthropology, Institute for Anthropological Research, 10000 Zagreb, Croatia
- The Lisbon Council, 1040 Brussels, Belgium
| | - Tingting Wang
- COPSAC, Copenhagen Prospective Studies on Asthma in Childhood, Herlev and Gentofte Hospital, 2820 Gentofte, Denmark
| | - Mads Rønnow Staffe
- Department of Food Science, University of Copenhagen, 1958 Frederiksberg, Denmark
| | - Iva Šunić
- Centre for Applied Bioanthropology, Institute for Anthropological Research, 10000 Zagreb, Croatia
| | | | - Jessica Lasky-Su
- Department of Medicine, Boston, MA 02115, USA
- Channing Division of Network Medicine, Brigham and Women’s Hospital, Harvard Medical School, Boston, MA 02115, USA
| | - Bo Chawes
- COPSAC, Copenhagen Prospective Studies on Asthma in Childhood, Herlev and Gentofte Hospital, 2820 Gentofte, Denmark
- Department of Clinical Medicine, Faculty of Health and Medical Sciences, University of Copenhagen, 2300 Copenhagen, Denmark
| | - Morten Arendt Rasmussen
- COPSAC, Copenhagen Prospective Studies on Asthma in Childhood, Herlev and Gentofte Hospital, 2820 Gentofte, Denmark
- Department of Food Science, University of Copenhagen, 1958 Frederiksberg, Denmark
| |
|
46
|
Yang Z, Huang T, Pan L, Wang J, Wang L, Ding J, Xiao J. QuanDB: a quantum chemical property database towards enhancing 3D molecular representation learning. J Cheminform 2024; 16:48. [PMID: 38685101 PMCID: PMC11059686 DOI: 10.1186/s13321-024-00843-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/06/2024] [Accepted: 04/24/2024] [Indexed: 05/02/2024] Open
Abstract
Previous studies have shown that the three-dimensional (3D) geometric and electronic structures of molecules play a crucial role in determining their key properties and intermolecular interactions. Therefore, it is necessary to establish a quantum chemical (QC) property database containing the most stable 3D geometric conformations and electronic structures of molecules. In this study, a high-quality QC property database, called QuanDB, was developed, which included structurally diverse molecular entities and featured a user-friendly interface. Currently, QuanDB contains 154,610 compounds sourced from public databases and scientific literature, with 10,125 scaffolds. The elemental composition comprises nine elements: H, C, O, N, P, S, F, Cl, and Br. For each molecule, QuanDB provides 53 global and 5 local QC properties and the most stable 3D conformation. These properties are divided into three categories: geometric structure, electronic structure, and thermodynamics. Geometric structure optimization and single point energy calculation at the theoretical level of B3LYP-D3(BJ)/6-311G(d)/SMD/water and B3LYP-D3(BJ)/def2-TZVP/SMD/water, respectively, were applied to ensure highly accurate calculations of QC properties, with the computational cost exceeding 10^7 core-hours. QuanDB provides high-value geometric and electronic structure information for use in molecular representation models, which are critical for machine-learning-based molecular design, thereby contributing to a comprehensive description of the chemical compound space. As a new high-quality dataset for QC properties, QuanDB is expected to become a benchmark tool for the training and optimization of machine learning models, thus further advancing the development of novel drugs and materials. QuanDB is freely available, without registration, at https://quandb.cmdrg.com/.
Affiliation(s)
- Zhijiang Yang
- State Key Laboratory of NBC Protection for Civilian, Beijing, People's Republic of China
| | - Tengxin Huang
- State Key Laboratory of NBC Protection for Civilian, Beijing, People's Republic of China
| | - Li Pan
- State Key Laboratory of NBC Protection for Civilian, Beijing, People's Republic of China
| | - Jingjing Wang
- State Key Laboratory of NBC Protection for Civilian, Beijing, People's Republic of China
| | - Liangliang Wang
- State Key Laboratory of NBC Protection for Civilian, Beijing, People's Republic of China.
| | - Junjie Ding
- State Key Laboratory of NBC Protection for Civilian, Beijing, People's Republic of China.
| | - Junhua Xiao
- State Key Laboratory of NBC Protection for Civilian, Beijing, People's Republic of China.
| |
|
47
|
Kim K, Jang A, Shin H, Ye I, Lee JE, Kim T, Park H, Hong S. Concurrent Optimizations of Efficacy and Blood-Brain Barrier Permeability in New Macrocyclic LRRK2 Inhibitors for Potential Parkinson's Disease Therapeutics. J Med Chem 2024. [PMID: 38684226 DOI: 10.1021/acs.jmedchem.4c00520] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/02/2024]
Abstract
The elevated activity of leucine-rich repeat kinase 2 (LRRK2) is implicated in the pathogenesis of Parkinson's disease (PD). The quest for effective LRRK2 inhibitors has been impeded by the formidable challenge of crossing the blood-brain barrier (BBB). We leveraged structure-based de novo design and developed robust three-dimensional quantitative structure-activity relationship (3D-QSAR) models to predict BBB permeability, enhancing the likelihood of the inhibitor's brain accessibility. Our strategy involved the synthesis of macrocyclic molecules by linking the two terminal nitrogen atoms of HG-10-102-01 with an alkyl chain ranging from 2 to 4 units, laying the groundwork for innovative LRRK2 inhibitor designs. Through meticulous computational and synthetic optimization of both biochemical efficacy and BBB permeability, 9 out of 14 synthesized candidates demonstrated potent low-nanomolar inhibition and significant BBB penetration. Further assessments of in vitro and in vivo effectiveness, coupled with pharmacological profiling, highlighted compound 8 as the promising new lead compound for PD therapeutics.
Affiliation(s)
- Kewon Kim
- Department of Chemistry, Korea Advanced Institute of Science and Technology (KAIST), Daejeon 34141, Korea
- Center for Catalytic Hydrocarbon Functionalizations, Institute for Basic Science (IBS), Daejeon 34141, Korea
| | - Ahyoung Jang
- Department of Chemistry, Korea Advanced Institute of Science and Technology (KAIST), Daejeon 34141, Korea
- Center for Catalytic Hydrocarbon Functionalizations, Institute for Basic Science (IBS), Daejeon 34141, Korea
| | - Hochul Shin
- Whan In Pharmaceutical Co., Ltd., 11, Beobwon-ro 6-gil, Songpa-gu, Seoul 05855, Korea
| | - Inhae Ye
- Whan In Pharmaceutical Co., Ltd., 11, Beobwon-ro 6-gil, Songpa-gu, Seoul 05855, Korea
| | - Ji Eun Lee
- Whan In Pharmaceutical Co., Ltd., 11, Beobwon-ro 6-gil, Songpa-gu, Seoul 05855, Korea
| | - Taeho Kim
- Department of Bioscience and Biotechnology, Sejong University, 209 Neungdong-ro, Kwangjin-gu, Seoul 05006, Korea
| | - Hwangseo Park
- Department of Bioscience and Biotechnology, Sejong University, 209 Neungdong-ro, Kwangjin-gu, Seoul 05006, Korea
| | - Sungwoo Hong
- Department of Chemistry, Korea Advanced Institute of Science and Technology (KAIST), Daejeon 34141, Korea
- Center for Catalytic Hydrocarbon Functionalizations, Institute for Basic Science (IBS), Daejeon 34141, Korea
| |
|
48
|
Song L, Zhu H, Wang K, Li M. LGGA-MPP: Local Geometry-Guided Graph Attention for Molecular Property Prediction. J Chem Inf Model 2024; 64:3105-3113. [PMID: 38516950 DOI: 10.1021/acs.jcim.3c02058] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/23/2024]
Abstract
Molecular property prediction is a fundamental task of drug discovery. With the rapid development of deep learning, computational approaches for predicting molecular properties are experiencing increasing popularity. However, these existing methods often ignore the 3D information on molecules, which is critical in molecular representation learning. In the past few years, several self-supervised learning (SSL) approaches have been proposed to exploit the geometric information by using pre-training on 3D molecular graphs and fine-tuning on 2D molecular graphs. Most of these approaches are based on the global geometry of molecules, and there is still a challenge in capturing the local structure and local interpretability. To this end, we propose local geometry-guided graph attention (LGGA), which integrates local geometry into the attention mechanism and message-passing of graph neural networks (GNNs). LGGA introduces a novel method to model molecules, enhancing the model's ability to capture intricate local structural details. Experiments on various data sets demonstrate that the integration of local geometry has a significant impact on the improved results, and our model outperforms the state-of-the-art methods for molecular property prediction, establishing its potential as a promising tool in drug discovery and related fields.
Affiliation(s)
- Lei Song
- School of Software, XinJiang University, Urumqi 830091, China
| | - Huimin Zhu
- School of Computer Science and Engineering, Central South University, Changsha 410083, China
| | - Kaili Wang
- School of Computer Science and Engineering, Central South University, Changsha 410083, China
| | - Min Li
- School of Computer Science and Engineering, Central South University, Changsha 410083, China
| |
|
49
|
Fan J, Shi S, Xiang H, Fu L, Duan Y, Cao D, Lu H. Predicting Elimination of Small-Molecule Drug Half-Life in Pharmacokinetics Using Ensemble and Consensus Machine Learning Methods. J Chem Inf Model 2024; 64:3080-3092. [PMID: 38563433 DOI: 10.1021/acs.jcim.3c02030] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/04/2024]
Abstract
Half-life is a significant pharmacokinetic parameter included in the excretion phase of absorption, distribution, metabolism, and excretion. It is one of the key factors for the successful marketing of drug candidates. Therefore, predicting half-life is of great significance in drug design. In this study, we employed eXtreme Gradient Boosting (XGBoost), random forest (RF), gradient boosting machine (GBM), and support vector machine (SVM) to build quantitative structure-activity relationship (QSAR) models on 3512 compounds and evaluated model performance by using root-mean-square error (RMSE), R², and mean absolute error (MAE) metrics and interpreted features by SHapley Additive exPlanation (SHAP). Furthermore, we developed consensus models through integrating four individual models and validated their performance using a Y-randomization test and applicability domain analysis. Finally, matched molecular pair analysis was used to extract the transformation rules. Our results revealed that XGBoost outperformed other individual models (RMSE = 0.176, R² = 0.845, MAE = 0.141). The consensus model integrating all four models continued to enhance prediction performance (RMSE = 0.172, R² = 0.856, MAE = 0.138). We evaluated the reliability, robustness, and generalization ability via Y-randomization test and applicability domain analysis. Meanwhile, we utilized SHAP to interpret features and employed matched molecular pair analysis to extract chemical transformation rules that provide suggestions for optimizing drug structure. In conclusion, we believe that the consensus model developed in this study serves as a reliable tool to evaluate half-life in drug discovery, and the chemical transformation rules concluded in this study could provide valuable suggestions in drug discovery.
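The consensus idea and the three regression metrics named in this abstract can be sketched as follows. The abstract does not say how the four models were integrated, so the unweighted mean used here is an assumption, and the function names and toy numbers are hypothetical rather than taken from the paper.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return (RMSE, R2, MAE) for paired observed and predicted values."""
    n = len(y_true)
    residuals = [t - p for t, p in zip(y_true, y_pred)]
    ss_res = sum(r * r for r in residuals)
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    rmse = math.sqrt(ss_res / n)
    mae = sum(abs(r) for r in residuals) / n
    r2 = 1.0 - ss_res / ss_tot  # coefficient of determination
    return rmse, r2, mae


def consensus_prediction(predictions_by_model):
    """Unweighted mean of per-model predictions (assumed integration scheme)."""
    return [sum(values) / len(values) for values in zip(*predictions_by_model)]


# Toy values, invented for illustration only.
observed = [1.0, 2.0, 3.0, 4.0]
model_a = [1.2, 1.8, 3.3, 3.9]
model_b = [0.8, 2.2, 2.7, 4.1]
consensus = consensus_prediction([model_a, model_b])

for name, predicted in [("A", model_a), ("B", model_b), ("consensus", consensus)]:
    print(name, regression_metrics(observed, predicted))
```

In this toy case the two models err in opposite directions, so averaging cancels their errors. Real consensus gains are usually much smaller, as the modest improvement from RMSE 0.176 to 0.172 in the abstract suggests.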
Affiliation(s)
- Jianing Fan
- Health Management Center, Third Xiangya Hospital of Central South University, Changsha, Hunan 410013, P. R. China
- Department of Cardiology, Third Xiangya Hospital of Central South University, Changsha, Hunan 410013, P. R. China
| | - Shaohua Shi
- School of Chinese Medicine, Hong Kong Baptist University, Kowloon, Hong Kong 999077, P. R. China
| | - Hong Xiang
- Center for Experimental Medicine, Third Xiangya Hospital of Central South University, Changsha, Hunan 410013, P. R. China
| | - Li Fu
- Xiangya School of Pharmaceutical Sciences, Central South University, Changsha, Hunan 410013, P. R. China
| | - Yanjing Duan
- Xiangya School of Pharmaceutical Sciences, Central South University, Changsha, Hunan 410013, P. R. China
| | - Dongsheng Cao
- Xiangya School of Pharmaceutical Sciences, Central South University, Changsha, Hunan 410013, P. R. China
- Institute for Advancing Translational Medicine in Bone & Joint Diseases, School of Chinese Medicine, Hong Kong Baptist University, Kowloon, Hong Kong SAR 999077, P. R. China
- Department of Pharmacy, Xiangya Hospital, Central South University, Changsha 410008, Hunan P. R. China
| | - Hongwei Lu
- Health Management Center, Third Xiangya Hospital of Central South University, Changsha, Hunan 410013, P. R. China
- Department of Cardiology, Third Xiangya Hospital of Central South University, Changsha, Hunan 410013, P. R. China
- Center for Experimental Medicine, Third Xiangya Hospital of Central South University, Changsha, Hunan 410013, P. R. China
| |
|
50
|
Aminu KS, Uzairu A, Chandra A, Singh N, Abechi SE, Shallangwa GA, Umar AB. Exploring the potential of 2-arylbenzimidazole scaffolds as novel α-amylase inhibitors: QSAR, molecular docking, simulation and pharmacokinetic studies. In Silico Pharmacol 2024; 12:29. [PMID: 38617707 PMCID: PMC11009192 DOI: 10.1007/s40203-024-00205-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/24/2023] [Accepted: 03/13/2024] [Indexed: 04/16/2024] Open
Abstract
Previous studies have shown that 2-arylbenzimidazole derivatives have a strong anti-diabetic effect. To further explore this potential, we developed new analogues of the compound using ligand-based drug design and tested their inhibitory and binding properties through QSAR analyses, molecular docking, dynamic simulations and pharmacokinetic studies. By using quantitative structure-activity relationship and ligand-based modification, a highly precise predictive model was developed and potent compounds were designed from the derivatives of 2-arylbenzimidazoles. Molecular docking and simulation studies were then conducted to identify the optimal binding poses and pharmacokinetic profiles of the newly generated therapeutic drugs. DFT was employed to optimize the chemical structures of 2-arylbenzimidazole derivatives at the B3LYP/6-31G* level of theory. The model with the highest R² (training set), adjusted R², cross-validated Q², and R² (test set) values (0.926, 0.912, 0.903, and 0.709, respectively) was chosen to predict the inhibitory activities of the derivatives. Five analogues designed using a ligand-based strategy had higher activity than the hit molecule. Additionally, the designed molecules had more favorable MolDock scores than the hit molecule and acarbose, and simulation studies confirmed their stability and binding affinities towards the protein. The ADME and druglikeness properties of the analogues indicated that they are safe to consume orally and have a high potential for total clearance. The results of this study showed that the suggested analogues could act as α-amylase inhibitors, which could be used as a basis for the creation of new drugs to treat type 2 diabetes mellitus.
Affiliation(s)
- Khalifa Sunusi Aminu
- Department of Chemistry, Ahmadu Bello University, Zaria, Nigeria
- Department of Pure and Industrial Chemistry, Bayero University, Kano, Nigeria
| | - Adamu Uzairu
- Department of Chemistry, Ahmadu Bello University, Zaria, Nigeria
| | - Anshuman Chandra
- School of Physical Sciences, Jawaharlal Nehru University, New Delhi, India
| | - Nagendra Singh
- School of Biotechnology, Gautam Buddha University, Greater Noida, India
| | | | | | | |
|