1
He Y, Gao Y, Liu K, Han W. Database, prediction, and antibacterial research of astringency based on large language models. Comput Biol Med 2025; 184:109375. [PMID: 39531926 DOI: 10.1016/j.compbiomed.2024.109375] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/18/2024] [Revised: 11/05/2024] [Accepted: 11/06/2024] [Indexed: 11/16/2024]
Abstract
Astringency, a sensory experience causing mouth dryness, significantly impacts the taste of foods such as wine and tea, and astringent molecules may exhibit antibacterial properties. Traditional methods for predicting astringency are costly, and the connection between astringency and antibacterial activity remains largely unexplored. In this study, we present a pioneering computational approach that includes: (1) the creation of the first comprehensive astringency database comprising 238 molecules; (2) the development of a Ligand-Based Prediction (LBP) framework that combines large language models, deep learning, and traditional machine learning for enhanced molecular and peptide prediction; (3) an astringency predictor achieving 0.95 accuracy and 0.90 AUC, validated through electronic tongue measurements; (4) antibacterial predictors for molecules and peptides with accuracies of 0.92 and 0.88, respectively, revealing that 51 % of astringent molecules possess antibacterial properties; (5) accessibility of these predictors via the AstringentPD and ABPD web servers. This work not only enhances the understanding of taste-related molecules but also elucidates the relationship between astringency and antibacterial properties, setting the stage for future explorations in food science and medicinal applications.
Affiliation(s)
- Yi He
  - Key Laboratory for Molecular Enzymology and Engineering of Ministry of Education, School of Life Sciences, Jilin University, 2699 Qianjin Street, Changchun, 130012, China
- Yilin Gao
  - Key Laboratory for Molecular Enzymology and Engineering of Ministry of Education, School of Life Sciences, Jilin University, 2699 Qianjin Street, Changchun, 130012, China
- Kaifeng Liu
  - Key Laboratory for Molecular Enzymology and Engineering of Ministry of Education, School of Life Sciences, Jilin University, 2699 Qianjin Street, Changchun, 130012, China
- Weiwei Han
  - Key Laboratory for Molecular Enzymology and Engineering of Ministry of Education, School of Life Sciences, Jilin University, 2699 Qianjin Street, Changchun, 130012, China
2
Ramahi ADA, Shinde VV, Pearce TC, Sinka IC. Virtual screening of drug materials for pharmaceutical tablet manufacturability with reference to sticking. Int J Pharm 2024; 667:124722. [PMID: 39293578 DOI: 10.1016/j.ijpharm.2024.124722] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/14/2024] [Revised: 09/12/2024] [Accepted: 09/13/2024] [Indexed: 09/20/2024]
Abstract
The manufacturing of pharmaceutical solid dosage forms, such as tablets, involves a large number of successive processing operations, including crystallisation of the drug substance, granulation, drying, milling, mixing of the formulation, and compaction. Each step is fraught with manufacturing problems. Undesired adhesion of powders to the surface of the compaction tooling, known as sticking, is a frequent and highly disruptive problem that occurs at the very end of the process chain when the tablet is formed. As alternatives to the mechanistic approaches to address sticking, we introduce two different machine learning strategies to predict sticking directly from the chemical formula of the drug substance, represented by molecular descriptors. An empirical database for sticking behaviour was developed and used to train the machine learning (ML) algorithms to predict sticking characteristics from molecular descriptors. The ML model successfully classified sticking/non-sticking behaviour of powders with 100% separation. Predictions were made for materials in the Handbook of Pharmaceutical Excipients and a subset of molecules included in the ChEMBL database, demonstrating the potential use of machine learning approaches to screen for sticking propensity early during drug discovery and development. This is the first time molecular descriptors and machine learning have been used to predict and screen for sticking behaviour. The method has the potential to transform the development of medicines by providing manufacturability information at the drug screening stage and is potentially applicable to other manufacturing problems controlled by the chemistry of the drug substance.
3
Cui R, Ickler M, Menath J, Vogel N, Klinger D. Nanogels with tailored hydrophobicity and their behavior at air/water interfaces. Soft Matter 2024; 21:100-112. [PMID: 39629622 DOI: 10.1039/d4sm01186d] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/07/2024]
Abstract
The interfacial behavior of micro-/nanogels is governed to a large extent by the hydrophobicity of their polymeric network. Prevailing studies examining this influence mostly rely on external stimuli like temperature or pH to modulate the particle hydrophobicity. With such stimuli, a sudden transition between the hydrophilic and hydrophobic states prevents systematic and gradual modulation of hydrophobicity. This limits detailed correlations between interfacial behavior and network hydrophobicity. To address this challenge, we introduce a nanogel platform that allows accurate tuning of hydrophobicity on a molecular level. For this, via post-functionalization of active ester-based particles, we prepare poly(N-(2-hydroxypropyl)methacrylamide) (PHPMA) nanogels as a hydrophilic benchmark and introduce gradually varied amounts of hydrophobic propyl or dodecyl moieties to increase the nanogel hydrophobicity. We study the deformation and arrangement of these particles at an air/water interface and correlate the results with quantitative measures for nanogel hydrophobicity. We observe that increasing the hydrophobicity of nanogels, either by increasing the hydrophobic moiety ratio or the alkyl chain length, leads to decreased particle deformability and aggregation of an interfacially adsorbed monolayer. Contrary to what may be intuitively assumed, these changes are not gradual, but rather occur suddenly above a threshold in hydrophobicity. Our study further shows that hydrophobicity affects the nanogel properties differently in bulk and when adsorbed at liquid interfaces. Thus, this study establishes the transition of interfacial behavior from soft gel-like particles to a solid spherical morphology triggered by the increase in hydrophobicity.
Affiliation(s)
- Ruiguang Cui
  - Institute of Pharmacy, Freie Universität Berlin, Königin-Luise-Str. 2-4, 14197 Berlin, Germany
- Maret Ickler
  - Institute of Particle Technology, Friedrich-Alexander-Universität Erlangen-Nürnberg, 91058 Erlangen, Germany
- Johannes Menath
  - Institute of Particle Technology, Friedrich-Alexander-Universität Erlangen-Nürnberg, 91058 Erlangen, Germany
- Nicolas Vogel
  - Institute of Particle Technology, Friedrich-Alexander-Universität Erlangen-Nürnberg, 91058 Erlangen, Germany
- Daniel Klinger
  - Institute of Pharmacy, Freie Universität Berlin, Königin-Luise-Str. 2-4, 14197 Berlin, Germany
4
Chen S, Fan T, Zhang N, Zhao L, Zhong R, Sun G. The oral acute toxicity of per- and polyfluoroalkyl compounds (PFASs) to Rat and Mouse: A mechanistic interpretation and prioritization analysis of untested PFASs by QSAR, q-RASAR and interspecies modelling methods. Journal of Hazardous Materials 2024; 480:136071. [PMID: 39383696 DOI: 10.1016/j.jhazmat.2024.136071] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/23/2023] [Revised: 09/07/2024] [Accepted: 10/04/2024] [Indexed: 10/11/2024]
Abstract
Per- and polyfluoroalkyl substances (PFASs) are widely used in modern industry, causing many adverse effects on both the environment and human health. In this study, for the first time, we followed OECD guidelines to systematically investigate the quantitative structure-activity relationship (QSAR) of the oral acute toxicity of PFASs to Rat and Mouse using simple 2D descriptors. The Read-Across similarity descriptors and 2D descriptors were also combined to develop the quantitative read-across structure-activity relationship (q-RASAR) models. Interspecies toxicity (iST) correlation was also explored between the two rodent species. All developed QSAR, q-RASAR and iST models met the state-of-the-art validation criteria and were applied for toxicity predictions of hundreds of untested PFASs in true external sets. Subsequently, we performed the priority ranking of the untested PFASs based on the model predictions, with the mechanistic interpretation of the top 20 most toxic PFASs predicted by both QSAR and q-RASAR models. The two univariate iST models were also used for filling the interspecies toxicity data gap. Overall, the developed QSAR, q-RASAR and iST models can be used as effective tools for predicting the oral acute toxicity of untested PFASs to Rat and Mouse, thus being important for risk assessment of PFASs in ecological environment.
Affiliation(s)
- Shuo Chen
  - Beijing Key Laboratory of Environmental and Viral Oncology, College of Chemistry and Life Science, Beijing University of Technology, Beijing 100124, China
- Tengjiao Fan
  - Beijing Key Laboratory of Environmental and Viral Oncology, College of Chemistry and Life Science, Beijing University of Technology, Beijing 100124, China; Department of Medical Technology, Beijing Pharmaceutical University of Staff and Workers (CPC Party School of Beijing Tong Ren Tang (Group) co., Ltd.), Beijing 100079, China
- Na Zhang
  - Beijing Key Laboratory of Environmental and Viral Oncology, College of Chemistry and Life Science, Beijing University of Technology, Beijing 100124, China
- Lijiao Zhao
  - Beijing Key Laboratory of Environmental and Viral Oncology, College of Chemistry and Life Science, Beijing University of Technology, Beijing 100124, China
- Rugang Zhong
  - Beijing Key Laboratory of Environmental and Viral Oncology, College of Chemistry and Life Science, Beijing University of Technology, Beijing 100124, China
- Guohui Sun
  - Beijing Key Laboratory of Environmental and Viral Oncology, College of Chemistry and Life Science, Beijing University of Technology, Beijing 100124, China
5
Pandey SK, Roy K. Development of hybrid models by the integration of the read-across hypothesis with the QSAR framework for the assessment of developmental and reproductive toxicity (DART) tested according to OECD TG 414. Toxicol Rep 2024; 13:101822. [PMID: 39649380 PMCID: PMC11621937 DOI: 10.1016/j.toxrep.2024.101822] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/21/2024] [Revised: 11/15/2024] [Accepted: 11/18/2024] [Indexed: 12/10/2024] Open
Abstract
The governing laws mandate animal testing guidelines (TG) to assess the developmental and reproductive toxicity (DART) potential of new and current chemical compounds for the categorization, hazard identification, and labeling. In silico modeling has evolved as a promising, economical, and animal-friendly technique for assessing a chemical's potential for DART testing. The complexity of the endpoint has presented a problem for Quantitative Structure-Activity Relationship (QSAR) model developers as various facets of the chemical have to be appropriately analyzed to predict the DART. For the next-generation risk assessment (NGRA) studies, researchers and governing bodies are exploring various new approach methodologies (NAMs) integrated to address complex endpoints like repeated dose toxicity and DART. We have developed four hybrid computational models for DART studies of rodents and rabbits for their adult and fetal life stages separately. The hybrid models were created by integrating QSAR features with similarities-derived features (obtained from read-across hypotheses). This analysis has identified that this integrated method gives a better statistical quality compared to the traditional QSAR models, and the predictivity and transferability of the model are also enhanced in this new approach.
6
Braun J, Katzberger P, Landrum GA, Riniker S. Understanding and Quantifying Molecular Flexibility: Torsion Angular Bin Strings. J Chem Inf Model 2024; 64:7917-7924. [PMID: 39390326 PMCID: PMC11523068 DOI: 10.1021/acs.jcim.4c01513] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/22/2024] [Revised: 10/01/2024] [Accepted: 10/02/2024] [Indexed: 10/12/2024]
Abstract
Molecular flexibility is a commonly used, but not easily quantified, term. It is at the core of understanding the composition and size of a conformational ensemble and contributes to many molecular properties. For many computational workflows, it is necessary to reduce a conformational ensemble to meaningful representatives; however, defining them and guaranteeing the ensemble's completeness is difficult. We introduce the concepts of torsion angular bin strings (TABS) as a discrete vector representation of a conformer's dihedral angles and the number of possible TABS (nTABS) as an estimate of the ensemble size of a molecule. Here, we show that nTABS corresponds to an upper limit for the size of the conformational space of small molecules and compare the classification of conformer ensembles by TABS with classifications by RMSD. Overcoming known drawbacks of the RMSD measure, such as its dependence on molecular size and the need to pick a threshold, TABS is shown to meaningfully discretize the conformational space and hence allows, for example, fast checks of the coverage of the conformational space. The current proof-of-concept implementation is based on the ETKDGv3 conformer generator as implemented in the RDKit and known torsion preferences extracted from small-molecule crystallographic data.
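As a rough illustration of the idea of a bin string, the sketch below assigns each dihedral angle of a conformer to a periodic bin and joins the bin indices into one string. The function names (`bin_index`, `tabs_string`, `naive_bin_product`, `count_distinct`) and the uniform three-bin boundaries in the usage note are assumptions made for illustration only. The abstract states that the actual bins come from crystallographic torsion preferences, and the real nTABS may treat symmetry and correlated torsions differently, so this is not the authors' implementation.

```python
from bisect import bisect_right
from math import prod


def bin_index(angle_deg, boundaries):
    """Return the 1-based periodic bin of an angle.

    `boundaries` is a sorted list of bin edges in [0, 360). Angles below the
    first edge and at or above the last edge fall into the same wrap-around bin.
    """
    a = angle_deg % 360.0                 # map e.g. -170 to 190
    idx = bisect_right(boundaries, a)     # number of edges <= a
    return idx % len(boundaries) + 1      # wrap the last interval onto bin 1


def tabs_string(dihedrals_deg, boundaries_per_torsion):
    """Join the bin index of every torsion into one string (assumes < 10 bins)."""
    return "".join(
        str(bin_index(a, b)) for a, b in zip(dihedrals_deg, boundaries_per_torsion)
    )


def naive_bin_product(boundaries_per_torsion):
    """Product of bin counts: a naive upper bound on distinct bin strings."""
    return prod(len(b) for b in boundaries_per_torsion)


def count_distinct(conformers, boundaries_per_torsion):
    """Number of distinct bin strings observed in a conformer ensemble."""
    return len({tabs_string(c, boundaries_per_torsion) for c in conformers})
```

With hypothetical edges `[60, 180, 300]` for each of three torsions, the conformer `[10, 125, -170]` maps to the string `"123"`, and the naive product of bin counts is 27. Comparing `count_distinct` for an ensemble against that product gives a quick, threshold-free coverage check in the spirit described above.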
Affiliation(s)
- Jessica Braun
  - Department of Chemistry and Applied Biosciences, ETH Zurich, Vladimir-Prelog-Weg 2, Zurich 8093, Switzerland
- Paul Katzberger
  - Department of Chemistry and Applied Biosciences, ETH Zurich, Vladimir-Prelog-Weg 2, Zurich 8093, Switzerland
- Gregory A. Landrum
  - Department of Chemistry and Applied Biosciences, ETH Zurich, Vladimir-Prelog-Weg 2, Zurich 8093, Switzerland
- Sereina Riniker
  - Department of Chemistry and Applied Biosciences, ETH Zurich, Vladimir-Prelog-Weg 2, Zurich 8093, Switzerland
7
Fan J, Qian C, Zhou S. A Universal Framework for General Prediction of Physicochemical Properties: The Natural Growth Model. Research (Washington, D.C.) 2024; 7:0510. [PMID: 39445107 PMCID: PMC11496607 DOI: 10.34133/research.0510] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/22/2024] [Revised: 09/24/2024] [Accepted: 09/29/2024] [Indexed: 10/25/2024]
Abstract
To describe precisely and reasonably how interatomic and intermolecular interactions contribute to the physicochemical properties of complex systems, a chemical message passing strategy driven by a graph neural network is proposed. By distinguishing the inherent and environmental features of atoms, and by properly delivering these messages as systems grow from the atomic to the bulk level, the evolution of system features eventually yields target properties such as absorption wavelength, emission wavelength, solubility, photoluminescence quantum yield, ionization energy, and lipophilicity. Because such a model combines chemical principles with the natural behavior of atom aggregation across multiple scales, it is likely to prove rational and efficient for more general aims in dealing with complex systems.
Affiliation(s)
- Jinming Fan
  - College of Chemical and Biological Engineering, Zhejiang Provincial Key Laboratory of Advanced Chemical Engineering Manufacture Technology, Zhejiang University, 310058 Hangzhou, P. R. China
  - Zhejiang Provincial Innovation Center of Advanced Chemicals Technology, Institute of Zhejiang University - Quzhou, 324000 Quzhou, P. R. China
- Chao Qian
  - College of Chemical and Biological Engineering, Zhejiang Provincial Key Laboratory of Advanced Chemical Engineering Manufacture Technology, Zhejiang University, 310058 Hangzhou, P. R. China
  - Zhejiang Provincial Innovation Center of Advanced Chemicals Technology, Institute of Zhejiang University - Quzhou, 324000 Quzhou, P. R. China
- Shaodong Zhou
  - College of Chemical and Biological Engineering, Zhejiang Provincial Key Laboratory of Advanced Chemical Engineering Manufacture Technology, Zhejiang University, 310058 Hangzhou, P. R. China
  - Zhejiang Provincial Innovation Center of Advanced Chemicals Technology, Institute of Zhejiang University - Quzhou, 324000 Quzhou, P. R. China
8
Yang B, Schaefer AJ, Small BL, Leseberg JA, Bischof SM, Webster-Gardiner MS, Ess DH. Experimentally-based Fe-catalyzed ethene oligomerization machine learning model provides highly accurate prediction of propagation/termination selectivity. Chem Sci 2024:d4sc03433c. [PMID: 39449687 PMCID: PMC11495513 DOI: 10.1039/d4sc03433c] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/25/2024] [Accepted: 10/09/2024] [Indexed: 10/26/2024] Open
Abstract
Linear α-olefins (1-alkenes) are critical comonomers for ethene copolymerization. A major impediment in the development of new homogeneous Fe catalysts for ethene oligomerization to produce comonomers and other important commercial products is the prediction of the propagation versus termination rates that control the α-olefin distribution (e.g., 1-butene through 1-decene), which is often referred to as a K-value. Because the transition states for propagation versus termination are generally separated by less than 1 kcal mol⁻¹ in energy, this selectivity cannot be accurately predicted by either DFT or wavefunction methods (even DLPNO-CCSD(T)). Therefore, we developed a machine learning model with sub-kcal mol⁻¹ accuracy, based on several hundred experimental selectivity values and straightforward 2D chemical and physical features, that enables the prediction of α-olefin distribution K-values. As part of our model, we developed a new ad hoc feature that boosted the model performance. This machine learning model captures the effects of a broad range of ligand architectures and chemically nonintuitive trends in oligomerization selectivity. Our machine learning model was experimentally validated by prediction of a K-value for a new Fe phosphaneyl-pyridinyl-quinoline catalyst followed by experimental measurement that showed precise agreement. In addition to quantitative predictions, we demonstrate how this machine learning model can provide qualitative catalyst design using a proximity-of-pairs-type analysis.
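To show why a sub-kcal difference in barrier height matters so much for the product distribution, the sketch below converts a free-energy difference between the termination and propagation transition states into a K-value and then into chain-length fractions. It assumes the common Schulz-Flory convention, K = k_prop / (k_prop + k_term), and a simple transition-state-theory rate ratio. Neither the convention nor the function names `k_from_ddg` and `schulz_flory` are taken from the abstract, and the temperature is a placeholder.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal mol^-1 K^-1


def k_from_ddg(ddg_kcal, temp_k=298.15):
    """K-value from ddG = G(termination TS) - G(propagation TS), in kcal/mol.

    Assumes k_prop / k_term = exp(ddG / RT) and K = k_prop / (k_prop + k_term).
    """
    ratio = math.exp(ddg_kcal / (R_KCAL * temp_k))  # k_prop / k_term
    return ratio / (1.0 + ratio)


def schulz_flory(k, n_max):
    """Mole fractions of chains that stop after n = 1..n_max growth steps.

    Each chain propagates with probability k and terminates with 1 - k,
    so the fraction for n steps is (1 - k) * k**(n - 1).
    """
    return [(1.0 - k) * k ** (n - 1) for n in range(1, n_max + 1)]
```

Under these assumptions, equal barriers give K = 0.5, while a shift of only 1 kcal/mol at room temperature moves K to roughly 0.84, which changes the predicted olefin distribution substantially. This is consistent with the abstract's point that errors of about 1 kcal/mol are too large for useful selectivity predictions.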
Affiliation(s)
- Bo Yang
  - Department of Chemistry and Biochemistry, Brigham Young University Provo Utah 84602 USA
- Anthony J Schaefer
  - Department of Chemistry and Biochemistry, Brigham Young University Provo Utah 84602 USA
- Brooke L Small
  - Research & Technology, Chevron Phillips Chemical 1862 Kingwood Drive Kingwood Texas 77339 USA
- Julie A Leseberg
  - Research & Technology, Chevron Phillips Chemical 1862 Kingwood Drive Kingwood Texas 77339 USA
- Steven M Bischof
  - Research & Technology, Chevron Phillips Chemical 1862 Kingwood Drive Kingwood Texas 77339 USA
- Daniel H Ess
  - Department of Chemistry and Biochemistry, Brigham Young University Provo Utah 84602 USA
9
Ivanov J, Tenchov R, Ralhan K, Iyer KA, Agarwal S, Zhou QA. In Silico Insights: QSAR Modeling of TBK1 Kinase Inhibitors for Enhanced Drug Discovery. J Chem Inf Model 2024; 64:7488-7502. [PMID: 39289178 PMCID: PMC11480986 DOI: 10.1021/acs.jcim.4c00864] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/11/2024] [Revised: 08/17/2024] [Accepted: 09/05/2024] [Indexed: 09/19/2024]
Abstract
TBK1, or TANK-binding kinase 1, is an enzyme that functions as a serine/threonine protein kinase. It plays a crucial role in various cellular processes, including the innate immune response to viruses, cell proliferation, apoptosis, autophagy, and antitumor immunity. Dysregulation of TBK1 activity can lead to autoimmune diseases, neurodegenerative disorders, and cancer. Due to its central role in these critical pathways, TBK1 is a significant focus of research for therapeutic drug development. In this paper, we explore data from the CAS Content Collection regarding TBK1 and its implication in a large assortment of diseases and disorders. With the demand for developing efficient TBK1 inhibitors being outlined, we focus on utilizing a machine learning approach for developing predictive models for TBK1 inhibition, derived from the fragment-functional analysis descriptors. Using the extensive CAS Content Collection, we assembled a training set of TBK1 inhibitors with experimentally measured IC50 values. We explored several machine learning techniques combined with various molecular descriptors to derive and select the best TBK1 inhibitor QSAR models. Certain significant structural alerts that potentially contribute to inhibition of TBK1 are outlined and discussed. The merit of the article stems from identifying the most adequate TBK1 QSAR models and subsequent successful development of advanced positive training data to facilitate and enhance drug discovery for an important therapeutic target such as TBK1 inhibitors, based on an extensive, wide-ranging set of scientific information provided by the CAS Content Collection.
Affiliation(s)
- Julian M. Ivanov
  - CAS, A Division of the American Chemical Society, Columbus, Ohio 43210, United States
- Rumiana Tenchov
  - CAS, A Division of the American Chemical Society, Columbus, Ohio 43210, United States
10
Banerjee A, Kar S, Roy K, Patlewicz G, Charest N, Benfenati E, Cronin MTD. Molecular similarity in chemical informatics and predictive toxicity modeling: from quantitative read-across (q-RA) to quantitative read-across structure-activity relationship (q-RASAR) with the application of machine learning. Crit Rev Toxicol 2024; 54:659-684. [PMID: 39225123 DOI: 10.1080/10408444.2024.2386260] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/03/2024] [Revised: 07/25/2024] [Accepted: 07/25/2024] [Indexed: 09/04/2024]
Abstract
This article aims to provide a comprehensive critical, yet readable, review of general interest to the chemistry community on molecular similarity as applied to chemical informatics and predictive modeling with a special focus on read-across (RA) and read-across structure-activity relationships (RASAR). Molecular similarity-based computational tools, such as quantitative structure-activity relationships (QSARs) and RA, are routinely used to fill the data gaps for a wide range of properties including toxicity endpoints for regulatory purposes. This review will explore the background of RA starting from how structural information has been used through to how other similarity contexts such as physicochemical, absorption, distribution, metabolism, and elimination (ADME) properties, and biological aspects are being characterized. More recent developments of RA's integration with QSAR have resulted in the emergence of novel models such as ToxRead, generalized read-across (GenRA), and quantitative RASAR (q-RASAR). Conventional QSAR techniques have been excluded from this review except where necessary for context.
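The core of quantitative read-across, filling a data gap for a query compound from its most similar tested neighbours, can be sketched in a few lines. The example below is a minimal illustration, assuming compounds are represented as sets of structural feature identifiers and that similarity is the Tanimoto coefficient. The function names `tanimoto` and `read_across`, the choice of `k`, and the weighting scheme are illustrative assumptions, not the specific algorithms of the tools named in the review.

```python
def tanimoto(a, b):
    """Tanimoto (Jaccard) similarity between two feature sets."""
    a, b = set(a), set(b)
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def read_across(query, sources, k=3):
    """Similarity-weighted mean of the k most similar source compounds.

    `sources` is a list of (feature_set, measured_value) pairs.
    """
    scored = sorted(
        ((tanimoto(query, feats), value) for feats, value in sources),
        key=lambda pair: pair[0],
        reverse=True,
    )[:k]
    total = sum(sim for sim, _ in scored)
    if total == 0.0:
        raise ValueError("no source compound shares any feature with the query")
    return sum(sim * value for sim, value in scored) / total
```

In a q-RASAR-style workflow, quantities derived from such neighbour similarities (for example, the weighted prediction itself or the mean similarity to the closest neighbours) would be added as extra descriptors to a conventional QSAR model. That step is only described here, not implemented.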
Affiliation(s)
- Arkaprava Banerjee
  - Department of Pharmaceutical Technology, Drug Theoretics and Cheminformatics (DTC) Laboratory, Jadavpur University, Kolkata, India
- Supratik Kar
  - Department of Chemistry and Physics, Chemometrics & Molecular Modeling Laboratory, Kean University, Union, NJ, USA
- Kunal Roy
  - Department of Pharmaceutical Technology, Drug Theoretics and Cheminformatics (DTC) Laboratory, Jadavpur University, Kolkata, India
- Grace Patlewicz
  - Center for Computational Toxicology and Exposure, US Environmental Protection Agency, Research Triangle Park, NC, USA
- Nathaniel Charest
  - Center for Computational Toxicology and Exposure, US Environmental Protection Agency, Research Triangle Park, NC, USA
- Emilio Benfenati
  - Department of Environmental Health Sciences, Istituto di Ricerche Farmacologiche Mario Negri IRCCS, Milan, Italy
- Mark T D Cronin
  - School of Pharmacy and Biomolecular Sciences, Liverpool John Moores University, Liverpool, UK
11
Ascencio-Medina E, He S, Daghighi A, Iduoku K, Casanola-Martin GM, Arrasate S, González-Díaz H, Rasulev B. Prediction of Dielectric Constant in Series of Polymers by Quantitative Structure-Property Relationship (QSPR). Polymers (Basel) 2024; 16:2731. [PMID: 39408442 PMCID: PMC11478900 DOI: 10.3390/polym16192731] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/12/2024] [Revised: 09/13/2024] [Accepted: 09/24/2024] [Indexed: 10/20/2024] Open
Abstract
This work is devoted to the investigation of dielectric permittivity, which is influenced by electronic, ionic, and dipolar polarization mechanisms that contribute to the material's capacity to store electrical energy. In this study, an extended dataset of 86 polymers was analyzed, and two quantitative structure-property relationship (QSPR) models were developed to predict dielectric permittivity. From an initial set of 1273 descriptors, the most relevant ones were selected using a genetic algorithm, and machine learning models were built using the Gradient Boosting Regressor (GBR). In contrast to Multiple Linear Regression (MLR)- and Partial Least Squares (PLS)-based models, the gradient boosting models excel in handling nonlinear relationships and multicollinearity, iteratively optimizing decision trees to improve accuracy without overfitting. The developed GBR models showed high R² coefficients of 0.938 and 0.822 for the training and test sets, respectively. An Accumulated Local Effect (ALE) technique was applied to assess the relationship between the selected descriptors (eight for the GB_A model and six for the GB_B model) and their impact on the target property. ALE analysis revealed that descriptors such as TDB09m had a strong positive effect on permittivity, while MLOGP2 showed a negative effect. These results highlight the effectiveness of the GBR approach in predicting the dielectric properties of polymers, offering improved accuracy and interpretability.
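For readers comparing the training and test figures, the coefficient of determination can be computed as one minus the ratio of the residual sum of squares to the total sum of squares. The sketch below is a plain-Python version of that standard formula; the function name `r_squared` is an assumption, and the abstract does not state which variant of R² the authors report.

```python
def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))   # unexplained variation
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)           # total variation
    return 1.0 - ss_res / ss_tot
```

A model that reproduces every value gives 1.0, and a model that always predicts the mean of the observed values gives 0.0. A gap between training and test values, such as the one reported above, is the usual quantity to inspect when judging how well a model generalizes.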
Affiliation(s)
- Estefania Ascencio-Medina
  - Department of Coatings and Polymeric Materials, North Dakota State University, Fargo, ND 58102, USA
  - IKERDATA S.L., ZITEK, University of the Basque Country (UPV/EHU), Rectorate Building, 48940 Bilbao, Biscay, Spain
- Shan He
  - Department of Coatings and Polymeric Materials, North Dakota State University, Fargo, ND 58102, USA
  - IKERDATA S.L., ZITEK, University of the Basque Country (UPV/EHU), Rectorate Building, 48940 Bilbao, Biscay, Spain
  - Department of Organic and Inorganic Chemistry, Faculty of Science and Technology, University of the Basque Country (UPV/EHU), P.O. Box 644, 48940 Bilbao, Biscay, Spain
- Amirreza Daghighi
  - Department of Coatings and Polymeric Materials, North Dakota State University, Fargo, ND 58102, USA
  - Biomedical Engineering Program, North Dakota State University, Fargo, ND 58105, USA
- Kweeni Iduoku
  - Department of Coatings and Polymeric Materials, North Dakota State University, Fargo, ND 58102, USA
  - Biomedical Engineering Program, North Dakota State University, Fargo, ND 58105, USA
- Gerardo M. Casanola-Martin
  - Department of Coatings and Polymeric Materials, North Dakota State University, Fargo, ND 58102, USA
- Sonia Arrasate
  - Department of Organic and Inorganic Chemistry, Faculty of Science and Technology, University of the Basque Country (UPV/EHU), P.O. Box 644, 48940 Bilbao, Biscay, Spain
- Humberto González-Díaz
  - Department of Organic and Inorganic Chemistry, Faculty of Science and Technology, University of the Basque Country (UPV/EHU), P.O. Box 644, 48940 Bilbao, Biscay, Spain
  - IKERBASQUE, Basque Foundation for Science, 48011 Bilbao, Biscay, Spain
- Bakhtiyor Rasulev
  - Department of Coatings and Polymeric Materials, North Dakota State University, Fargo, ND 58102, USA
  - Biomedical Engineering Program, North Dakota State University, Fargo, ND 58105, USA
12
Ivanov SM. Calculated hydration free energies become less accurate with increases in molecular weight. PLoS One 2024; 19:e0309996. [PMID: 39298397 DOI: 10.1371/journal.pone.0309996] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/24/2024] [Accepted: 08/22/2024] [Indexed: 09/21/2024] Open
Abstract
In order for computer-aided drug design to fulfil its long-held promise of delivering new medicines faster and cheaper, extensive development and validation work must be done first. This pertains particularly to molecular dynamics force fields, where one important aspect, the hydration free energy (HFE) of small molecules, is often insufficiently analyzed. While most benchmarking studies report excellent accuracies of calculated hydration free energies (usually within 2 kcal/mol of experimental values), we find that deeper analysis reveals significant shortcomings. Herein, we report a dependence of HFE prediction errors on ligand molecular weight: the higher the weight, the bigger the prediction error and the higher the probability that the calculated result is erroneous by a large amount. We show that in the drug-like molecular weight region, HFE predictions can easily be off by 5 kcal/mol or more. This is likely to be highly problematic in a drug discovery and development setting. We make our HFE results and molecular descriptors freely and fully available in order to encourage deeper analysis of future molecular dynamics results and facilitate development of the next generation of force fields.
Affiliation(s)
- Stefan M Ivanov
  - Faculty of Pharmacy, Medical University of Sofia, Sofia, Bulgaria
13
König C, Vellido A. Understanding predictions of drug profiles using explainable machine learning models. BioData Min 2024; 17:25. [PMID: 39090651 PMCID: PMC11293102 DOI: 10.1186/s13040-024-00378-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/26/2024] [Accepted: 07/26/2024] [Indexed: 08/04/2024] Open
Abstract
PURPOSE: The analysis of absorption, distribution, metabolism, and excretion (ADME) molecular properties is relevant to drug design, as they directly influence the drug's effectiveness at its target location. This study concerns their prediction using explainable machine learning (ML) models. The aim is to find which molecular features are relevant to the prediction of the different ADME properties and to measure their impact on the predictive model. METHODS: The relative relevance of individual features for ADME activity is gauged by estimating feature importance in the ML models' predictions. Feature importance is calculated using feature permutation, and the individual impact of features is measured with SHapley Additive exPlanations (SHAP). RESULTS: The study reveals the relevance of specific molecular descriptors for each ADME property and quantifies their impact on the ADME property prediction. CONCLUSION: The reported research illustrates how explainable ML models can provide detailed insights into the individual contributions of molecular features to the final prediction of an ADME property, in an effort to support experts in drug candidate selection through a better understanding of the impact of molecular features.
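Feature permutation, as mentioned in the methods of this abstract, scores a feature by how much the model's error grows when that feature's column is shuffled. The sketch below is a generic illustration in plain Python, not the authors' pipeline: the toy model, the data, and the choice of mean absolute error as the score are assumptions. The SHAP part of the analysis is not shown.

```python
import random

def mean_absolute_error(y_true, y_pred):
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

def permutation_importance(predict, rows, targets, n_repeats=10, seed=0):
    """Average increase in MAE when a single feature column is shuffled."""
    rng = random.Random(seed)
    baseline = mean_absolute_error(targets, [predict(r) for r in rows])
    importances = []
    for col in range(len(rows[0])):
        increases = []
        for _ in range(n_repeats):
            shuffled = [r[col] for r in rows]
            rng.shuffle(shuffled)
            # Rebuild each row with only this column replaced.
            permuted = [r[:col] + [v] + r[col + 1:] for r, v in zip(rows, shuffled)]
            score = mean_absolute_error(targets, [predict(r) for r in permuted])
            increases.append(score - baseline)
        importances.append(sum(increases) / n_repeats)
    return importances

# Hypothetical stand-in for a fitted model: it reads only feature 0.
def toy_model(row):
    return 2.0 * row[0]

toy_rows = [[float(i), float((i * 7) % 5)] for i in range(20)]
toy_targets = [toy_model(r) for r in toy_rows]
importances = permutation_importance(toy_model, toy_rows, toy_targets)
```

Because the toy model ignores feature 1, shuffling that column leaves every prediction unchanged and its importance is exactly zero, while feature 0 receives a positive score.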
Affiliation(s)
- Caroline König
- Intelligent Data Science and Artificial Intelligence (IDEAI-UPC) Research Centre, Universitat Politècnica de Catalunya (UPC Barcelona Tech), Jordi Girona 1-3, Barcelona, 08034, Catalonia, Spain.
- Department of Computer Science, Universitat Politècnica de Catalunya (UPC Barcelona Tech), Jordi Girona 1-3, Barcelona, 08034, Catalonia, Spain.
- Alfredo Vellido
- Intelligent Data Science and Artificial Intelligence (IDEAI-UPC) Research Centre, Universitat Politècnica de Catalunya (UPC Barcelona Tech), Jordi Girona 1-3, Barcelona, 08034, Catalonia, Spain
- Department of Computer Science, Universitat Politècnica de Catalunya (UPC Barcelona Tech), Jordi Girona 1-3, Barcelona, 08034, Catalonia, Spain
14
Gomatam A, Hirlekar BU, Singh KD, Murty US, Dixit VA. Improved QSAR models for PARP-1 inhibition using data balancing, interpretable machine learning, and matched molecular pair analysis. Mol Divers 2024; 28:2135-2152. [PMID: 38374474 DOI: 10.1007/s11030-024-10809-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/17/2023] [Accepted: 01/07/2024] [Indexed: 02/21/2024]
Abstract
The poly(ADP-ribose) polymerase-1 (PARP-1) enzyme is an important target in the treatment of breast cancer. Currently, treatment options include the drugs Olaparib, Niraparib, Rucaparib, and Talazoparib; however, these drugs can cause severe side effects including hematological toxicity and cardiotoxicity. Although in silico models for the prediction of PARP-1 activity have been developed, the drawbacks of these models include low specificity, a narrow applicability domain, and a lack of interpretability. To address these issues, a comprehensive machine learning (ML)-based quantitative structure-activity relationship (QSAR) approach for the informed prediction of PARP-1 activity is presented. Classification with the K-nearest neighbor algorithm, using the Synthetic Minority Oversampling Technique (SMOTE) for data balancing, gave robust and predictive models (accuracy 0.86, sensitivity 0.88, specificity 0.80). Regression models were built on structurally congeneric datasets, with the models for the phthalazinone class and fused cyclic compounds giving the best performance. In accordance with the Organization for Economic Cooperation and Development (OECD) guidelines, a mechanistic interpretation is proposed using Shapley Additive Explanations (SHAP) to identify the important topological features that differentiate between PARP-1 actives and inactives. Moreover, an analysis of the PARP-1 dataset revealed the prevalence of activity cliffs, which may negatively impact the model's predictive performance. Finally, a set of chemical transformation rules was extracted using matched molecular pair analysis (MMPA); these rules provide mechanistic insights and can guide medicinal chemists in the design of novel PARP-1 inhibitors.
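The data balancing step named in this abstract, SMOTE, creates synthetic minority-class samples by interpolating between a minority sample and one of its nearest minority-class neighbours. The following plain-Python sketch shows that core idea only. The function name, the value of k, and the toy points are hypothetical, and a maintained library implementation would normally be used in practice.

```python
import math
import random

def smote_sketch(minority, n_synthetic, k=2, seed=0):
    """Create synthetic minority samples by interpolating toward near neighbours."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_synthetic):
        base = rng.choice(minority)
        # The k nearest minority-class neighbours of the chosen sample, excluding itself.
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: math.dist(base, p),
        )[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # position along the segment from base to neighbour
        synthetic.append(tuple(b + gap * (n - b) for b, n in zip(base, neighbour)))
    return synthetic

# Hypothetical two-descriptor minority class.
toy_minority = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
new_points = smote_sketch(toy_minority, n_synthetic=6)
```

Every synthetic point lies on a line segment between two real minority samples, so the new data stay inside the region the minority class already occupies.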
Affiliation(s)
- Anish Gomatam
- Department of Medicinal Chemistry, National Institute of Pharmaceutical Education and Research, (NIPER Guwahati), Department of Pharmaceuticals, Ministry of Chemicals and Fertilizers, Govt. of India, Sila Katamur (Halugurisuk), Dist: Kamrup, P.O.: Changsari, Guwahati, Assam, 781101, India
- Bhakti Umesh Hirlekar
- Department of Medicinal Chemistry, National Institute of Pharmaceutical Education and Research, (NIPER Guwahati), Department of Pharmaceuticals, Ministry of Chemicals and Fertilizers, Govt. of India, Sila Katamur (Halugurisuk), Dist: Kamrup, P.O.: Changsari, Guwahati, Assam, 781101, India
- Krishan Dev Singh
- Department of Medicinal Chemistry, National Institute of Pharmaceutical Education and Research, (NIPER Guwahati), Department of Pharmaceuticals, Ministry of Chemicals and Fertilizers, Govt. of India, Sila Katamur (Halugurisuk), Dist: Kamrup, P.O.: Changsari, Guwahati, Assam, 781101, India
- Upadhyayula Suryanarayana Murty
- Department of Medicinal Chemistry, National Institute of Pharmaceutical Education and Research, (NIPER Guwahati), Department of Pharmaceuticals, Ministry of Chemicals and Fertilizers, Govt. of India, Sila Katamur (Halugurisuk), Dist: Kamrup, P.O.: Changsari, Guwahati, Assam, 781101, India
- Vaibhav A Dixit
- Department of Medicinal Chemistry, National Institute of Pharmaceutical Education and Research, (NIPER Guwahati), Department of Pharmaceuticals, Ministry of Chemicals and Fertilizers, Govt. of India, Sila Katamur (Halugurisuk), Dist: Kamrup, P.O.: Changsari, Guwahati, Assam, 781101, India.
15
Lu G, Pan F, Li X, Zhu Z, Zhao L, Wu Y, Tian W, Peng W, Liu J. Virtual screening strategy for anti-DPP-IV natural flavonoid derivatives based on machine learning. J Biomol Struct Dyn 2024; 42:6645-6659. [PMID: 37489054 DOI: 10.1080/07391102.2023.2237594] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/27/2023] [Accepted: 07/06/2023] [Indexed: 07/26/2023]
Abstract
Flavonoids have been widely recognized for their antidiabetic effects, especially their inhibition of DPP-IV activity. However, natural flavonoid derivatives are highly diverse, and even subtle structural differences can change their inhibitory activity against DPP-IV by several orders of magnitude, which makes it challenging to find novel and potent anti-DPP-IV flavonoid derivatives experimentally. Therefore, there is an urgent need to develop an efficient screening pipeline that targets active natural products. Here, we propose a fusion strategy based on a QSAR model, and to simplify this process, it was applied to the discovery of flavonoid derivatives with potent anti-DPP-IV activity. First, a high-quality QSAR model (test-set R² = 0.816, MAE = 0.14, MSE = 0.026) composed of seven key molecular property parameters was constructed with a genetic algorithm (GA) and passed leave-one-out cross-validation. A total of 1,668 flavonoid derivatives were obtained from the natural products enriched by NPCD based on molecular fingerprint similarity (> 0.8). Next, the enriched flavonoid derivatives were screened using the QED score combined with QSAR model predictions, yielding 33 flavonoid derivatives (predicted IC50 < 6.5 μM). Subsequently, three flavonoid derivatives (5,7,3',5'-tetrahydroxyflavone, 3,7-dihydroxy-5,3',4'-trimethoxyflavone, and 5,7,2',5'-tetrahydroxyflavone) with highly effective anti-DPP-IV activity were obtained by ADMET analysis. Finally, the DPP-IV inhibitory potential of these three flavonoid derivatives was verified by 100 ns MD simulation and MM/PB(GB)SA. Communicated by Ramaswamy H. Sarma.
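The enrichment step in this abstract filters a compound library by molecular fingerprint similarity above 0.8. The abstract does not name the similarity measure, so the sketch below assumes the Tanimoto coefficient, a common choice for bit fingerprints. The fingerprints are represented as hypothetical sets of "on" bit indices, and all names are illustrative.

```python
def tanimoto(bits_a, bits_b):
    """Tanimoto (Jaccard) similarity between two sets of 'on' fingerprint bits."""
    union = len(bits_a | bits_b)
    return len(bits_a & bits_b) / union if union else 0.0

def enrich_by_similarity(query_bits, library, threshold=0.8):
    """Keep library entries whose similarity to the query exceeds the threshold."""
    return [name for name, bits in library.items()
            if tanimoto(query_bits, bits) > threshold]

# Hypothetical fingerprints: each set lists the indices of bits that are set.
query = set(range(10))
toy_library = {
    "candidate_a": set(range(9)),          # 9 shared bits, 10 in the union
    "candidate_b": set(range(5, 15)),      # 5 shared bits, 15 in the union
    "candidate_c": set(range(10)) | {42},  # 10 shared bits, 11 in the union
}
hits = enrich_by_similarity(query, toy_library)
```

In a real workflow the bit sets would come from a cheminformatics toolkit; only the thresholding logic is shown here.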
Affiliation(s)
- Gen Lu
- Key Laboratory of Livestock Infectious Diseases, Ministry of Education, Shenyang Agricultural University, Shenyang, China
- Fei Pan
- State Key Laboratory of Resource Insects, Institute of Apicultural Research, Chinese Academy of Agricultural Sciences, Beijing, China
- Beijing Engineering and Technology Research Center of Food Additives, Beijing Technology and Business University, Beijing, China
- Xiaotong Li
- Key Laboratory of Livestock Infectious Diseases, Ministry of Education, Shenyang Agricultural University, Shenyang, China
- Zehui Zhu
- Beijing Engineering and Technology Research Center of Food Additives, Beijing Technology and Business University, Beijing, China
- Lei Zhao
- Beijing Engineering and Technology Research Center of Food Additives, Beijing Technology and Business University, Beijing, China
- Ya Wu
- Institute of Resource Biology and Biotechnology, Department of Biotechnology, College of Life Science and Technology, Huazhong University of Science and Technology, Wuhan, China
- Wenli Tian
- State Key Laboratory of Resource Insects, Institute of Apicultural Research, Chinese Academy of Agricultural Sciences, Beijing, China
- Wenjun Peng
- State Key Laboratory of Resource Insects, Institute of Apicultural Research, Chinese Academy of Agricultural Sciences, Beijing, China
- Jinling Liu
- Key Laboratory of Livestock Infectious Diseases, Ministry of Education, Shenyang Agricultural University, Shenyang, China
16
Xu Y, Ma S, Cui H, Chen J, Xu S, Gong F, Golubovic A, Zhou M, Wang KC, Varley A, Lu RXZ, Wang B, Li B. AGILE platform: a deep learning powered approach to accelerate LNP development for mRNA delivery. Nat Commun 2024; 15:6305. [PMID: 39060305 PMCID: PMC11282250 DOI: 10.1038/s41467-024-50619-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/29/2024] [Accepted: 07/09/2024] [Indexed: 07/28/2024] Open
Abstract
Ionizable lipid nanoparticles (LNPs) are seeing widespread use in mRNA delivery, notably in SARS-CoV-2 mRNA vaccines. However, the expansion of mRNA therapies beyond COVID-19 is impeded by the absence of LNPs tailored for diverse cell types. In this study, we present the AI-Guided Ionizable Lipid Engineering (AGILE) platform, a synergistic combination of deep learning and combinatorial chemistry. AGILE streamlines ionizable lipid development with efficient library design, in silico lipid screening via deep neural networks, and adaptability to diverse cell lines. Using AGILE, we rapidly design, synthesize, and evaluate ionizable lipids for mRNA delivery, selecting from a vast library. Intriguingly, AGILE reveals cell-specific preferences for ionizable lipids, indicating that lipids should be tailored for optimal delivery to different cell types. These findings highlight AGILE's potential in expediting the development of customized LNPs, addressing the complex needs of mRNA delivery in clinical practice and thereby broadening the scope and efficacy of mRNA therapies.
Affiliation(s)
- Yue Xu
- Leslie Dan Faculty of Pharmacy, University of Toronto, Toronto, ON, Canada
- Shihao Ma
- Department of Computer Science, University of Toronto, Toronto, ON, Canada
- Vector Institute for Artificial Intelligence, Toronto, ON, Canada
- Peter Munk Cardiac Centre, University Health Network, Toronto, ON, Canada
- Haotian Cui
- Department of Computer Science, University of Toronto, Toronto, ON, Canada
- Vector Institute for Artificial Intelligence, Toronto, ON, Canada
- Peter Munk Cardiac Centre, University Health Network, Toronto, ON, Canada
- Jingan Chen
- Leslie Dan Faculty of Pharmacy, University of Toronto, Toronto, ON, Canada
- Institute of Biomedical Engineering, University of Toronto, Toronto, ON, Canada
- Shufen Xu
- Leslie Dan Faculty of Pharmacy, University of Toronto, Toronto, ON, Canada
- Fanglin Gong
- Leslie Dan Faculty of Pharmacy, University of Toronto, Toronto, ON, Canada
- Institute of Biomedical Engineering, University of Toronto, Toronto, ON, Canada
- Alex Golubovic
- Leslie Dan Faculty of Pharmacy, University of Toronto, Toronto, ON, Canada
- Muye Zhou
- Leslie Dan Faculty of Pharmacy, University of Toronto, Toronto, ON, Canada
- Kevin Chang Wang
- Leslie Dan Faculty of Pharmacy, University of Toronto, Toronto, ON, Canada
- Andrew Varley
- Leslie Dan Faculty of Pharmacy, University of Toronto, Toronto, ON, Canada
- Rick Xing Ze Lu
- Leslie Dan Faculty of Pharmacy, University of Toronto, Toronto, ON, Canada
- Bo Wang
- Department of Computer Science, University of Toronto, Toronto, ON, Canada.
- Vector Institute for Artificial Intelligence, Toronto, ON, Canada.
- Peter Munk Cardiac Centre, University Health Network, Toronto, ON, Canada.
- Princess Margaret Cancer Center, University Health Network, Toronto, ON, Canada.
- Bowen Li
- Leslie Dan Faculty of Pharmacy, University of Toronto, Toronto, ON, Canada.
- Institute of Biomedical Engineering, University of Toronto, Toronto, ON, Canada.
- Department of Chemistry, University of Toronto, Toronto, ON, Canada.
- Department of Laboratory Medicine and Pathobiology, University of Toronto, Toronto, ON, Canada.
17
Liang T, Liu W, Tan K, Wu A, Lu X. Advancing Ionic Liquid Research with pSCNN: A Novel Approach for Accurate Normal Melting Temperature Predictions. ACS Omega 2024; 9:31694-31702. [PMID: 39072063 PMCID: PMC11270577 DOI: 10.1021/acsomega.4c02393] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/11/2024] [Revised: 04/12/2024] [Accepted: 06/25/2024] [Indexed: 07/30/2024]
Abstract
Ionic liquids (ILs), known for their distinct and tunable properties, offer a broad spectrum of potential applications across various fields, including chemistry, materials science, and energy storage. However, practical applications of ILs are often limited by their unfavorable physicochemical properties. Experimental screening becomes impractical due to the vast number of potential IL combinations. Therefore, the development of a robust and efficient model for predicting IL properties is imperative. Because the normal melting point (Tm) is the defining feature of ILs, it is of practical significance to establish an accurate yet efficient model to predict it, which may facilitate the discovery and design of novel ILs for specific applications. In this study, we present a pseudo-Siamese convolutional neural network (pSCNN), inspired by SCNN, that focuses on Tm. Utilizing a data set of 3098 ILs, we systematically assess various deep learning models (ANN, pSCNN, and Transformer-CNF), along with molecular descriptors (ECFP fingerprint and Mordred properties), for their performance in predicting the Tm of ILs. Remarkably, among the investigated modeling schemes, the pSCNN coupled with filtered Mordred descriptors demonstrates superior performance, yielding mean absolute error (MAE) and root-mean-square error (RMSE) values of 24.36 and 31.56 °C, respectively. Feature analysis further highlights the effectiveness of the pSCNN model. Moreover, the pSCNN method, with a pair of inputs, can be extended beyond ionic liquid melting point prediction.
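The two error metrics reported in this abstract, MAE and RMSE, are both computed from the differences between predicted and measured values. The short Python sketch below shows how, using hypothetical melting points that are unrelated to the study's data set.

```python
import math

def mae(y_true, y_pred):
    """Mean absolute error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

def rmse(y_true, y_pred):
    """Root-mean-square error."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))

# Hypothetical melting points in °C, chosen only for illustration.
measured = [10.0, 50.0, 120.0, 80.0]
predicted = [40.0, 50.0, 80.0, 90.0]
toy_mae = mae(measured, predicted)
toy_rmse = rmse(measured, predicted)
```

RMSE is never smaller than MAE on the same data, and the gap widens when a few predictions are far off, which is why the two are often reported together.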
Affiliation(s)
- Tao Liang
- State Key Laboratory of Physical Chemistry of Solid Surface, Fujian Provincial Key Laboratory for Theoretical and Computational Chemistry, Department of Chemistry, College of Chemistry and Chemical Engineering, Xiamen University, Xiamen 361005, P. R. China
- Wei Liu
- State Key Laboratory of Physical Chemistry of Solid Surface, Fujian Provincial Key Laboratory for Theoretical and Computational Chemistry, Department of Chemistry, College of Chemistry and Chemical Engineering, Xiamen University, Xiamen 361005, P. R. China
- Kai Tan
- State Key Laboratory of Physical Chemistry of Solid Surface, Fujian Provincial Key Laboratory for Theoretical and Computational Chemistry, Department of Chemistry, College of Chemistry and Chemical Engineering, Xiamen University, Xiamen 361005, P. R. China
- Anan Wu
- State Key Laboratory of Physical Chemistry of Solid Surface, Fujian Provincial Key Laboratory for Theoretical and Computational Chemistry, Department of Chemistry, College of Chemistry and Chemical Engineering, Xiamen University, Xiamen 361005, P. R. China
- Xin Lu
- State Key Laboratory of Physical Chemistry of Solid Surface, Fujian Provincial Key Laboratory for Theoretical and Computational Chemistry, Department of Chemistry, College of Chemistry and Chemical Engineering, Xiamen University, Xiamen 361005, P. R. China
18
Baran K, Barczak B, Kloskowski A. Modeling lignin extraction with ionic liquids using machine learning approach. Sci Total Environ 2024; 935:173234. [PMID: 38768717 DOI: 10.1016/j.scitotenv.2024.173234] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/11/2024] [Revised: 04/25/2024] [Accepted: 05/12/2024] [Indexed: 05/22/2024]
Abstract
Lignin is, after cellulose, the second most common natural biopolymer on Earth, containing a third of the organic carbon in the biosphere. For many years, lignin was perceived as waste from the production of cellulose and hemicellulose and was used as a biofuel for bioenergy production. Recently, however, lignin has been considered a renewable raw material for the production of chemicals and materials to replace petrochemical resources. In this context, an increasing demand for high-quality lignin is to be expected. It is, therefore, essential to optimize the technological processes for obtaining it from natural sources, such as biomass. This work describes an investigation of machine learning-based quantitative structure-property relationship (QSPR) modeling for the preliminary processing of lignin recovery from herbaceous biomass using ionic liquids (ILs). The models are trained on experimental data collected from original publications on the topic, and molecular descriptors of the ionic liquids are used to represent structural information. The study explores the impact of both the ILs' chemical structure and process parameters on the efficiency of lignin recovery from different bio sources. The findings give an insight into the extraction process and could serve as a foundation for further design of efficient and selective processes for lignin recovery using ionic liquids, which can have significant implications for producing biofuels, chemicals, and materials.
Affiliation(s)
- Karol Baran
- Department of Physical Chemistry, Faculty of Chemistry, Gdansk University of Technology, Narutowicza 11/12, 80-233 Gdansk, Poland.
- Beata Barczak
- Department of Energy Conversion and Storage, Faculty of Chemistry, Gdansk University of Technology, Narutowicza 11/12, 80-233 Gdansk, Poland
- Adam Kloskowski
- Department of Physical Chemistry, Faculty of Chemistry, Gdansk University of Technology, Narutowicza 11/12, 80-233 Gdansk, Poland
19
Su S, Masuda T, Takai M. Explainable Prediction of Hydrophilic/Hydrophobic Property of Polymer Brush Surfaces by Chemical Modeling and Machine Learning. J Phys Chem B 2024; 128:6589-6597. [PMID: 38950384 DOI: 10.1021/acs.jpcb.3c08422] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 07/03/2024]
Abstract
Polymer informatics has attracted increasing attention as a specialized branch of material informatics. Hydrophilicity/hydrophobicity is one of the most important properties of interfaces involved in antifouling, self-cleaning, antifogging, oil/water separation, protein adsorption, and bioseparation. Establishing a quantitative structure-property relationship for the hydrophilicity/hydrophobicity of polymeric interfaces could significantly benefit from machine learning modeling. In this study, we aimed to construct machine learning models that could predict the static water contact angle (CA) as an indicator of hydrophilicity/hydrophobicity based on a data set of polymer brushes. The features of the polymer brush surfaces were numerically described using their grafted structures (thickness) and molecular descriptors derived from their chemical structures. We achieved accurate prediction and an understanding of important parameters by selecting appropriate molecular descriptors on the basis of the Pearson correlation and by training machine learning models with nested cross-validation. Model interpretation by Shapley additive explanations (SHAP) analysis indicated that the amount of partial polar/nonpolar structure in the molecule, as well as the averaged hydrophobicity represented by MolLogP, plays an important role in determining the CA. Moreover, the model can predict the CAs of polymer brushes composed of chemical structures that are not present in existing databases. CA values are also predicted for hypothetical polymer brushes.
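One common way to take the Pearson correlation into account when choosing descriptors is to drop descriptors that are almost collinear with one already kept. The abstract does not state how the correlation was used, so the sketch below is an assumption: the greedy rule, the 0.9 threshold, and the toy descriptor values are all hypothetical.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient; both inputs must be non-constant."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

def drop_correlated(descriptors, threshold=0.9):
    """Greedily keep descriptors whose |r| with every kept descriptor is <= threshold."""
    kept = []
    for name, values in descriptors.items():
        if all(abs(pearson(values, descriptors[k])) <= threshold for k in kept):
            kept.append(name)
    return kept

# Hypothetical descriptor table; one column is a rescaled copy of another.
toy_descriptors = {
    "MolLogP": [0.5, 1.0, 1.5, 2.0, 2.5],
    "logp_scaled": [1.0, 2.0, 3.0, 4.0, 5.0],
    "thickness_nm": [12.0, 5.0, 9.0, 3.0, 11.0],
}
selected = drop_correlated(toy_descriptors)
```

The result depends on the order in which descriptors are visited, so a real pipeline would fix that order deliberately, for example by ranking descriptors by relevance first.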
Affiliation(s)
- Shiwei Su
- Department of Bioengineering, School of Engineering, The University of Tokyo 7-3-1 Hongo, Bunkyo-ku, Tokyo 113-8565, Japan
- Tsukuru Masuda
- Department of Bioengineering, School of Engineering, The University of Tokyo 7-3-1 Hongo, Bunkyo-ku, Tokyo 113-8565, Japan
- Madoka Takai
- Department of Bioengineering, School of Engineering, The University of Tokyo 7-3-1 Hongo, Bunkyo-ku, Tokyo 113-8565, Japan
20
Bandini E, Castellano Ontiveros R, Kajtazi A, Eghbali H, Lynen F. Physicochemical modelling of the retention mechanism of temperature-responsive polymeric columns for HPLC through machine learning algorithms. J Cheminform 2024; 16:72. [PMID: 38907264 PMCID: PMC11193285 DOI: 10.1186/s13321-024-00873-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/14/2023] [Accepted: 06/14/2024] [Indexed: 06/23/2024] Open
Abstract
Temperature-responsive liquid chromatography (TRLC) offers a promising alternative to reversed-phase liquid chromatography (RPLC) for environmentally friendly analytical techniques by utilizing pure water as a mobile phase, eliminating the need for harmful organic solvents. TRLC columns, packed with temperature-responsive polymers coupled to silica particles, exhibit a unique retention mechanism influenced by temperature-induced polymer hydration. An investigation of the physicochemical parameters driving separation at high and low temperatures is crucial for better column manufacturing and selectivity control. Assessment of predictability using a dataset of 139 molecules analyzed at different temperatures elucidated the molecular descriptors (MDs) relevant to retention mechanisms. Linear regression, support vector regression (SVR), and tree-based ensemble models were evaluated, with no standout performer. The precision, accuracy, and robustness of models were validated through metrics, such as r and mean absolute error (MAE), and statistical analysis. At 45 °C, logP predominantly influenced retention, akin to reversed-phase columns, while at 5 °C, complex interactions with lipophilic and negative MDs, along with specific functional groups, dictated retention. These findings provide deeper insights into TRLC mechanisms, facilitating method development and maximizing column potential.
Affiliation(s)
- Elena Bandini
- Separation Science Group, Department of Organic and Macromolecular Chemistry, University of Ghent, Krijgslaan 281 S4bis, Ghent, 9000, Belgium
- Rodrigo Castellano Ontiveros
- School of Electrical Engineering and Computer Science, KTH Royal Institute of Technology, Stockholm, 11428, Sweden
- Ardiana Kajtazi
- Separation Science Group, Department of Organic and Macromolecular Chemistry, University of Ghent, Krijgslaan 281 S4bis, Ghent, 9000, Belgium
- Hamed Eghbali
- Packaging and Specialty Plastics R&D, Dow Benelux B.V., Terneuzen, 4530 AA, the Netherlands
- Frédéric Lynen
- Separation Science Group, Department of Organic and Macromolecular Chemistry, University of Ghent, Krijgslaan 281 S4bis, Ghent, 9000, Belgium
21
Adessi TG, Wagner PM, Bisogno FR, Nicotra VE, Guido ME, García ME. Enhancing structural diversity through chemical engineering of Ambrosia tenuifolia extract for novel anti-glioblastoma compounds. Sci Rep 2024; 14:14229. [PMID: 38902325 PMCID: PMC11190268 DOI: 10.1038/s41598-024-63639-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/29/2023] [Accepted: 05/30/2024] [Indexed: 06/22/2024] Open
Abstract
Natural products are an unsurpassed source of lead structures in drug discovery. The biosynthetic machinery of the producing organism offers an important source for modifying complex natural products, leading to analogs that are unattainable by chemical semisynthesis or total synthesis. In this report, through the combination of natural products chemistry and diversity-oriented synthesis, a diversity-enhanced extracts approach is proposed using chemical reactions that remodel molecular scaffolds directly on extracts of natural resources. This method was applied to a subextract enriched in sesquiterpene lactones from Ambrosia tenuifolia (Fam. Asteraceae), using acidic conditions (p-toluenesulfonic acid) to change molecular skeletons. The chemically modified extract was then fractionated by a bioguided approach to obtain the pure compounds responsible for the anti-glioblastoma (GBM) activity in T98G cell cultures. With the best candidate, chronobiological experiments were performed to evaluate the temporal susceptibility of GBM cell cultures to the treatment and to define the best time to apply the therapy. Finally, bioinformatics tools were used to supply qualitative and quantitative information on the physicochemical properties, chemical space, and structural similarity of the compound library obtained. As a result, natural product derivatives containing new molecular skeletons were obtained, with possible applications as chemotherapeutic agents against human GBM T98G cell cultures.
Affiliation(s)
- Tonino G Adessi
- Facultad de Ciencias Químicas, Universidad Nacional de Córdoba (UNC), Edificio de Ciencias Químicas 2, Haya de la Torre y Medina Allende, Ciudad Universitaria, CP X5000HUA, Córdoba, Argentina
- Instituto Multidisciplinario de Biología Vegetal (IMBIV), Consejo Nacional de Investigaciones Científicas y Técnicas (CONICET), Córdoba, Argentina
- Paula M Wagner
- Facultad de Ciencias Químicas, Universidad Nacional de Córdoba (UNC), Edificio de Ciencias Químicas 2, Haya de la Torre y Medina Allende, Ciudad Universitaria, CP X5000HUA, Córdoba, Argentina
- Departamento de Química Biológica Ranwel Caputto, Centro de Investigaciones en Química Biológica de Córdoba (CIQUIBIC-CONICET), Córdoba, Argentina
- Fabricio R Bisogno
- Facultad de Ciencias Químicas, Universidad Nacional de Córdoba (UNC), Edificio de Ciencias Químicas 2, Haya de la Torre y Medina Allende, Ciudad Universitaria, CP X5000HUA, Córdoba, Argentina
- Instituto de Investigaciones en Físico-Química de Córdoba (INFIQC-CONICET), Córdoba, Argentina
- Viviana E Nicotra
- Facultad de Ciencias Químicas, Universidad Nacional de Córdoba (UNC), Edificio de Ciencias Químicas 2, Haya de la Torre y Medina Allende, Ciudad Universitaria, CP X5000HUA, Córdoba, Argentina
- Instituto Multidisciplinario de Biología Vegetal (IMBIV), Consejo Nacional de Investigaciones Científicas y Técnicas (CONICET), Córdoba, Argentina
- Mario E Guido
- Facultad de Ciencias Químicas, Universidad Nacional de Córdoba (UNC), Edificio de Ciencias Químicas 2, Haya de la Torre y Medina Allende, Ciudad Universitaria, CP X5000HUA, Córdoba, Argentina
- Departamento de Química Biológica Ranwel Caputto, Centro de Investigaciones en Química Biológica de Córdoba (CIQUIBIC-CONICET), Córdoba, Argentina
- Manuela E García
- Facultad de Ciencias Químicas, Universidad Nacional de Córdoba (UNC), Edificio de Ciencias Químicas 2, Haya de la Torre y Medina Allende, Ciudad Universitaria, CP X5000HUA, Córdoba, Argentina.
- Instituto Multidisciplinario de Biología Vegetal (IMBIV), Consejo Nacional de Investigaciones Científicas y Técnicas (CONICET), Córdoba, Argentina.
22
Chen P, Zhao N, Wang R, Chen G, Hu Y, Dou Z, Ban C. Hepatotoxicity and lipid metabolism disorders of 8:2 polyfluoroalkyl phosphate diester in zebrafish: In vivo and in silico evidence. J Hazard Mater 2024; 469:133807. [PMID: 38412642 DOI: 10.1016/j.jhazmat.2024.133807] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/09/2023] [Revised: 02/10/2024] [Accepted: 02/14/2024] [Indexed: 02/29/2024]
Abstract
8:2 polyfluoroalkyl phosphate diester (8:2 diPAP) has been shown to accumulate in the liver, but whether it induces hepatotoxicity and lipid metabolism disorders remains largely unknown. In this study, zebrafish embryos were exposed to 8:2 diPAP for 7 d. Hepatocellular hypertrophy and karyolysis were noted after exposure to 0.5 ng/L 8:2 diPAP, suggesting suppressed liver development. Compared to the water control, 8:2 diPAP led to significantly higher triglyceride and total cholesterol levels, but markedly lower levels of low-density lipoprotein, implying disturbed lipid homeostasis. The levels of two peroxisome proliferator activated receptor (PPAR) subtypes (pparα and pparγ) involved in hepatotoxicity and lipid metabolism were significantly upregulated by 8:2 diPAP, consistent with their overexpression as determined by immunohistochemistry. In silico results showed that 8:2 diPAP formed hydrogen bonds with PPARα and PPARγ. Among seven machine learning models, Adaptive Boosting performed the best in predicting the binding affinities of PPARα and PPARγ on the test set. The predicted binding affinity of 8:2 diPAP to PPARα (7.12) was higher than that to PPARγ (6.97) by Adaptive Boosting, which matched well with the experimental results. Our results revealed PPAR-mediated adverse effects of 8:2 diPAP on the liver and lipid metabolism of zebrafish larvae.
Affiliation(s)
- Pengyu Chen
- Jiangsu Province Engineering Research Center for Marine Bio-resources Sustainable Utilization, College of Oceanography, Hohai University, Nanjing 210024, China; Key Laboratory of Integrated Regulation and Resources Development of Shallow Lakes of Ministry of Education, Hohai University, Nanjing 210024, China.
- Na Zhao
- Jiangsu Province Engineering Research Center for Marine Bio-resources Sustainable Utilization, College of Oceanography, Hohai University, Nanjing 210024, China
| | - Ruihan Wang
- Key Laboratory of Bioactive Substance and Function of Natural Medicines, Institute of Materia Medica, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing 100050, China
| | - Geng Chen
- College of Pharmaceutical Sciences, Zhejiang University, Hangzhou 310058, China
| | - Yuxi Hu
- Jiangsu Province Engineering Research Center for Marine Bio-resources Sustainable Utilization, College of Oceanography, Hohai University, Nanjing 210024, China
| | - Zhichao Dou
- Jiangsu Province Engineering Research Center for Marine Bio-resources Sustainable Utilization, College of Oceanography, Hohai University, Nanjing 210024, China
| | - Chenglong Ban
- Jiangsu Province Engineering Research Center for Marine Bio-resources Sustainable Utilization, College of Oceanography, Hohai University, Nanjing 210024, China
|
23
|
Shkil DO, Muhamedzhanova AA, Petrov PI, Skorb EV, Aliev TA, Steshin IS, Tumanov AV, Kislinskiy AS, Fedorov MV. Expanding Predictive Capacities in Toxicology: Insights from Hackathon-Enhanced Data and Model Aggregation. Molecules 2024; 29:1826. [PMID: 38675645 PMCID: PMC11055041 DOI: 10.3390/molecules29081826] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/06/2024] [Revised: 04/11/2024] [Accepted: 04/15/2024] [Indexed: 04/28/2024] Open
Abstract
In the realm of predictive toxicology for small molecules, the applicability domain of QSAR models is often limited by the coverage of the chemical space in the training set. Consequently, classical models fail to provide reliable predictions for wide classes of molecules. However, the emergence of innovative data collection methods such as intensive hackathons promises to quickly expand the available chemical space for model construction. Combined with algorithmic refinement methods, these tools can address the challenges of toxicity prediction, enhancing both the robustness and applicability of the corresponding models. This study aimed to investigate the roles of gradient boosting and strategic data aggregation in enhancing the predictive ability of models for the toxicity of small organic molecules. We focused on evaluating the impact of incorporating fragment features and expanding the chemical space, facilitated by a comprehensive dataset procured in an open hackathon. We used gradient boosting techniques, accounting for critical features such as the structural fragments or functional groups often associated with manifestations of toxicity.
Affiliation(s)
- Dmitrii O. Shkil
- Syntelly LLC, Moscow 121205, Russia; (A.A.M.); (I.S.S.); (A.V.T.); (A.S.K.)
- Moscow Institute of Physics and Technology, Moscow 141700, Russia
| | | | | | - Ekaterina V. Skorb
- Infochemistry Scientific Center, ITMO University, Saint-Petersburg 191002, Russia; (E.V.S.); (T.A.A.)
| | - Timur A. Aliev
- Infochemistry Scientific Center, ITMO University, Saint-Petersburg 191002, Russia; (E.V.S.); (T.A.A.)
| | - Ilya S. Steshin
- Syntelly LLC, Moscow 121205, Russia; (A.A.M.); (I.S.S.); (A.V.T.); (A.S.K.)
| | | | | | - Maxim V. Fedorov
- Kharkevich Institute for Information Transmission Problems of Russian Academy of Sciences, Moscow 127994, Russia
|
24
|
Xin L, Yu H, Liu S, Ying GG, Chen CE. POPs identification using simple low-code machine learning. THE SCIENCE OF THE TOTAL ENVIRONMENT 2024; 921:171143. [PMID: 38387592 DOI: 10.1016/j.scitotenv.2024.171143] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/26/2023] [Revised: 02/19/2024] [Accepted: 02/19/2024] [Indexed: 02/24/2024]
Abstract
Effectively identifying persistent organic pollutants (POPs) with extensive organic chemical datasets poses a formidable challenge but is of utmost importance. Leveraging machine learning techniques can enhance this process, but previous models often demanded advanced programming skills and high-end computing resources. In this study, we harnessed the simplicity of PyCaret, a Python-based package, to construct machine-learning models for POP screening based on 2D molecular descriptors. We compared the performance of these models against a deep convolutional neural network (DCNN) model. Utilising minimal Python code, we generated several models that exhibited superior or comparable performance to the DCNN. The most outstanding performer, the Light Gradient Boosting Machine (LGBM), achieved an accuracy of 96.20 %, an AUC of 97.70 %, and an F1 score of 82.58 %. This model outshone the DCNN model. Furthermore, it excelled in identifying POPs within the REACH PBT and compiled industrial chemical lists. Our findings highlight the accessibility and simplicity of PyCaret, requiring only a few lines of code, rendering it suitable for non-computing professionals in environmental sciences. The ability of low code machine learning tools (e.g. PyCaret) to facilitate model comparison and interpretation holds promise, encouraging prompt assessment and management of chemical substances.
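The gap between a high accuracy and a lower F1 score, as reported for the LGBM model, is typical of imbalanced screening sets in which the positive class is rare. The following is a minimal sketch of how accuracy, F1, and AUC are computed, using only the Python standard library; the labels, scores, and the 0.5 threshold are hypothetical and are not taken from the study.

```python
def accuracy(labels, preds):
    """Fraction of samples whose predicted class matches the true class."""
    return sum(1 for y, p in zip(labels, preds) if y == p) / len(labels)


def f1_score(labels, preds):
    """Harmonic mean of precision and recall for the positive class (label 1)."""
    tp = sum(1 for y, p in zip(labels, preds) if y == 1 and p == 1)
    fp = sum(1 for y, p in zip(labels, preds) if y == 0 and p == 1)
    fn = sum(1 for y, p in zip(labels, preds) if y == 1 and p == 0)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def roc_auc(labels, scores):
    """Probability that a random positive outscores a random negative (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical imbalanced toy set: 2 positives (POP-like), 8 negatives.
labels = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
scores = [0.9, 0.4, 0.6, 0.3, 0.2, 0.2, 0.1, 0.1, 0.05, 0.05]
preds = [1 if s >= 0.5 else 0 for s in scores]  # assumed decision threshold

toy_accuracy = accuracy(labels, preds)
toy_f1 = f1_score(labels, preds)
toy_auc = roc_auc(labels, scores)
```

On this toy set the accuracy is dominated by the many correctly rejected negatives, while F1 reflects only how well the rare positives are recovered.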
Affiliation(s)
- Lei Xin
- School of Environment, MOE Key Laboratory of Theoretical Chemistry of Environment, South China Normal University, Guangzhou 510006, China; Environmental Research Institute, Guangdong Provincial Key Laboratory of Chemical Pollution and Environmental Safety, South China Normal University, Guangzhou 510006, China
| | - Haiying Yu
- College of Geography and Environmental Sciences, Zhejiang Normal University, Jinhua 321004, China
| | - Sisi Liu
- School of Environment, MOE Key Laboratory of Theoretical Chemistry of Environment, South China Normal University, Guangzhou 510006, China; Environmental Research Institute, Guangdong Provincial Key Laboratory of Chemical Pollution and Environmental Safety, South China Normal University, Guangzhou 510006, China
| | - Guang-Guo Ying
- School of Environment, MOE Key Laboratory of Theoretical Chemistry of Environment, South China Normal University, Guangzhou 510006, China; Environmental Research Institute, Guangdong Provincial Key Laboratory of Chemical Pollution and Environmental Safety, South China Normal University, Guangzhou 510006, China
| | - Chang-Er Chen
- School of Environment, MOE Key Laboratory of Theoretical Chemistry of Environment, South China Normal University, Guangzhou 510006, China; Environmental Research Institute, Guangdong Provincial Key Laboratory of Chemical Pollution and Environmental Safety, South China Normal University, Guangzhou 510006, China.
|
25
|
Pham TH, Le PK, Son DN. A data-driven QSPR model for screening organic corrosion inhibitors for carbon steel using machine learning techniques. RSC Adv 2024; 14:11157-11168. [PMID: 38590346 PMCID: PMC10999907 DOI: 10.1039/d4ra02159b] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/21/2024] [Accepted: 04/02/2024] [Indexed: 04/10/2024] Open
Abstract
Machine learning (ML) techniques have shown great potential for screening corrosion inhibitors. In this study, a data-driven quantitative structure-property relationship (QSPR) model using the gradient boosting decision tree (GB) algorithm combined with the permutation feature importance (PFI) technique was developed to predict the corrosion inhibition efficiency (IE) of organic compounds on carbon steel. The results showed that the PFI method effectively selected the molecular descriptors most relevant to the IE. Using these important molecular descriptors, an IE predictive model was trained on a dataset encompassing various categories of organic corrosion inhibitors for carbon steel, achieving RMSE, MAE, and R2 of 6.40%, 4.80%, and 0.72, respectively. The integration of GB with PFI within the ML workflow demonstrated significantly enhanced IE predictive capability compared to previously reported ML models. Subsequent assessments involved the application of the trained model to drug-based corrosion inhibitors. The model demonstrates robust predictive capability when validated on available and our own experimental results. Furthermore, the model has been employed to predict IE for more than 1500 drug compounds, suggesting five novel drug compounds with the highest predicted IE on carbon steel. The developed ML workflow and associated model will be useful in accelerating the development of next-generation corrosion inhibitors for carbon steel.
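The abstract does not give implementation details for the PFI step. As a rough sketch of the general technique (not the authors' workflow), permutation feature importance can be estimated by shuffling one descriptor column at a time and recording how much the prediction error grows. The model `toy_model`, the two descriptors, and the data below are hypothetical.

```python
import random


def rmse(y_true, y_pred):
    """Root-mean-square error between two equal-length sequences."""
    return (sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)) ** 0.5


def permutation_importance(predict, X, y, n_repeats=5, seed=0):
    """Mean increase in RMSE when each feature column is shuffled in turn."""
    rng = random.Random(seed)
    baseline = rmse(y, [predict(row) for row in X])
    importances = []
    for j in range(len(X[0])):
        increases = []
        for _ in range(n_repeats):
            column = [row[j] for row in X]
            rng.shuffle(column)  # break the link between feature j and the target
            X_perm = [row[:j] + [v] + row[j + 1:] for row, v in zip(X, column)]
            permuted_error = rmse(y, [predict(row) for row in X_perm])
            increases.append(permuted_error - baseline)
        importances.append(sum(increases) / n_repeats)
    return importances


def toy_model(row):
    """Hypothetical fitted model: the prediction depends on descriptor 0 only."""
    return 10.0 * row[0]


# Hypothetical data: descriptor 0 drives the target, descriptor 1 is noise.
X = [[float(i), float((i * 7) % 5)] for i in range(8)]
y = [toy_model(row) for row in X]

scores = permutation_importance(toy_model, X, y)
```

Descriptors whose shuffling leaves the error unchanged (here descriptor 1) are candidates for removal, which is the sense in which PFI acts as a feature selector.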
Affiliation(s)
- Thanh Hai Pham
- Ho Chi Minh City University of Technology (HCMUT) 268 Ly Thuong Kiet Street, District 10 Ho Chi Minh City Vietnam
- Vietnam National University Ho Chi Minh City Linh Trung Ward Ho Chi Minh City Vietnam
- Vietnam Institute for Tropical Technology and Environmental Protection 57A Truong Quoc Dung Street Phu Nhuan District Ho Chi Minh City Vietnam
| | - Phung K Le
- Ho Chi Minh City University of Technology (HCMUT) 268 Ly Thuong Kiet Street, District 10 Ho Chi Minh City Vietnam
- Vietnam National University Ho Chi Minh City Linh Trung Ward Ho Chi Minh City Vietnam
| | - Do Ngoc Son
- Ho Chi Minh City University of Technology (HCMUT) 268 Ly Thuong Kiet Street, District 10 Ho Chi Minh City Vietnam
- Vietnam National University Ho Chi Minh City Linh Trung Ward Ho Chi Minh City Vietnam
|
26
|
Charest N, Lowe CN, Ramsland C, Meyer B, Samano V, Williams AJ. Improving predictions of compound amenability for liquid chromatography-mass spectrometry to enhance non-targeted analysis. Anal Bioanal Chem 2024; 416:2565-2579. [PMID: 38530399 PMCID: PMC11228616 DOI: 10.1007/s00216-024-05229-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/13/2023] [Revised: 02/14/2024] [Accepted: 02/16/2024] [Indexed: 03/28/2024]
Abstract
Mass-spectrometry-based non-targeted analysis (NTA), in which mass spectrometric signals are assigned chemical identities based on a systematic collation of evidence, is a growing area of interest for toxicological risk assessment. Successful NTA results in better identification of potentially hazardous pollutants within the environment, facilitating the development of targeted analytical strategies to best characterize risks to human and ecological health. A supporting component of the NTA process involves assessing whether suspected chemicals are amenable to the mass spectrometric method, which is necessary in order to assign an observed signal to the chemical structure. Prior work from this group involved the development of a random forest model for predicting the amenability of 5517 unique chemical structures to liquid chromatography-mass spectrometry (LC-MS). This work improves the interpretability of the group's prior model of the same endpoint, as well as integrating 1348 more data points across negative and positive ionization modes. We enhance interpretability by feature engineering, a machine learning practice that reduces the input dimensionality while attempting to preserve performance statistics. We emphasize the importance of interpretable machine learning models within the context of building confidence in NTA identification. The novel data were curated by the labeling of compounds as amenable or unamenable by expert curators, resulting in an enhanced set of chemical compounds to expand the applicability domain of the prior model. The balanced accuracy benchmark of the newly developed model is comparable to performance previously reported (mean CV BA is 0.84 vs. 0.82 in positive mode, and 0.85 vs. 0.82 in negative mode), while on a novel external set, derived from this work's data, the Matthews correlation coefficients (MCC) for the novel models are 0.66 and 0.68 for positive and negative mode, respectively. 
Our group's prior published models scored MCC of 0.55 and 0.54 on the same external sets. This demonstrates appreciable improvement over the chemical space captured by the expanded dataset. This work forms part of our ongoing efforts to develop models with higher interpretability and higher performance to support NTA efforts.
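Balanced accuracy (BA) and the Matthews correlation coefficient (MCC) are both derived from the binary confusion matrix. The sketch below shows the standard definitions; the counts are hypothetical and are not the study's data.

```python
from math import sqrt


def balanced_accuracy(tp, fp, tn, fn):
    """Mean of sensitivity (recall on positives) and specificity (recall on negatives)."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return (sensitivity + specificity) / 2


def matthews_corrcoef(tp, fp, tn, fn):
    """MCC ranges from -1 to 1; it is defined as 0 when any marginal total is 0."""
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0
    return (tp * tn - fp * fn) / denom


# Hypothetical counts: 50 amenable and 50 unamenable compounds.
toy_ba = balanced_accuracy(tp=40, fp=5, tn=45, fn=10)
toy_mcc = matthews_corrcoef(tp=40, fp=5, tn=45, fn=10)
```

Because MCC uses all four cells of the confusion matrix, it can separate two models whose balanced accuracies look similar.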
Affiliation(s)
- Nathaniel Charest
- Center for Computational Toxicology and Exposure, Office of Research and Development, U.S. Environmental Protection Agency, Research Triangle Park, North Carolina, 27711, USA.
| | - Charles N Lowe
- Center for Computational Toxicology and Exposure, Office of Research and Development, U.S. Environmental Protection Agency, Research Triangle Park, North Carolina, 27711, USA
| | | | - Brian Meyer
- Senior Environmental Employment Program, U.S. Environmental Protection Agency, Research Triangle Park, North Carolina, 27711, USA
| | - Vicente Samano
- Senior Environmental Employment Program, U.S. Environmental Protection Agency, Research Triangle Park, North Carolina, 27711, USA
| | - Antony J Williams
- Center for Computational Toxicology and Exposure, Office of Research and Development, U.S. Environmental Protection Agency, Research Triangle Park, North Carolina, 27711, USA
|
27
|
Han W, Xu X, Fan Q, Yan Y, Zhang Y, Chen Y, Liu H. In silico construction of a focused fragment library facilitating exploration of chemical space. Mol Inform 2024; 43:e202300256. [PMID: 38193642 DOI: 10.1002/minf.202300256] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/23/2023] [Revised: 12/11/2023] [Accepted: 01/06/2024] [Indexed: 01/10/2024]
Abstract
Fragment-based drug design (FBDD) has emerged as a captivating subject in the realm of computer-aided drug design, enabling the generation of novel molecules through the rearrangement of ring systems within known compounds. The construction of a focused fragment library plays a pivotal role in FBDD, necessitating the compilation of all potential bioactive ring systems capable of interacting with a specific target. In our study, we propose a workflow for the development of a focused fragment library and a combinatorial compound library. The fragment library comprises seed fragments and collected fragments. The extraction of seed fragments is guided by receptor information, serving as a prerequisite for establishing a focused library. Conversely, collected fragments are obtained using the feature graph method, which offers a simplified representation of fragments and strikes a balance between diversity and similarity when categorizing different fragments. The utilization of the feature graph facilitates the rational partitioning of chemical space at the fragment level, enabling the exploration of desired chemical space and enhancing the efficiency of screening compound libraries. Analysis demonstrates that our workflow enables the enumeration of a greater number of entirely new potential compounds, thereby aiding in the rational design of drugs.
Affiliation(s)
- Weijie Han
- Laboratory of Molecular Design and Drug Discovery, School of Science, China Pharmaceutical University, 639 Longmian Avenue, Nanjing, 211198, China
| | - Xiaohe Xu
- Laboratory of Molecular Design and Drug Discovery, School of Science, China Pharmaceutical University, 639 Longmian Avenue, Nanjing, 211198, China
| | - Qing Fan
- Laboratory of Molecular Design and Drug Discovery, School of Science, China Pharmaceutical University, 639 Longmian Avenue, Nanjing, 211198, China
| | - Yingchao Yan
- Laboratory of Molecular Design and Drug Discovery, School of Science, China Pharmaceutical University, 639 Longmian Avenue, Nanjing, 211198, China
| | - YanMin Zhang
- Laboratory of Molecular Design and Drug Discovery, School of Science, China Pharmaceutical University, 639 Longmian Avenue, Nanjing, 211198, China
| | - Yadong Chen
- Laboratory of Molecular Design and Drug Discovery, School of Science, China Pharmaceutical University, 639 Longmian Avenue, Nanjing, 211198, China
| | - Haichun Liu
- Laboratory of Molecular Design and Drug Discovery, School of Science, China Pharmaceutical University, 639 Longmian Avenue, Nanjing, 211198, China
|
28
|
Setiya A, Jani V, Sonavane U, Joshi R. MolToxPred: small molecule toxicity prediction using machine learning approach. RSC Adv 2024; 14:4201-4220. [PMID: 38292268 PMCID: PMC10826801 DOI: 10.1039/d3ra07322j] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/27/2023] [Accepted: 01/23/2024] [Indexed: 02/01/2024] Open
Abstract
Different types of chemicals and products may exhibit various health risks when administered into the human body. For toxicity reasons, the number of new drugs entering the market through the conventional drug development process has been reduced over the years. However, with the advent of big data and artificial intelligence, machine learning techniques have emerged as a potential solution for predicting toxicity and ensuring efficient drug development and chemical safety. An ML model for toxicity prediction can reduce experimental costs and time while addressing ethical concerns by drastically reducing the need for animals and clinical trials. Herein, MolToxPred, an ML-based tool, has been developed using a stacked model approach to predict the potential toxicity of small molecules and metabolites. The stacked model consists of random forest, multi-layer perceptron, and LightGBM as base classifiers and Logistic Regression as the meta classifier. For training and validation purposes, a comprehensive set of toxic and non-toxic molecules is curated. Different structural and physicochemical-based features in the form of molecular descriptors and fingerprints were employed. MolToxPred utilizes a comprehensive feature selection process and optimizes its hyperparameters through Bayesian optimization with stratified 5-fold cross-validation. In the evaluation phase, MolToxPred achieved an AUROC of 87.76% on the test set and 88.84% on an external validation set. The McNemar test was used as the post-hoc test to determine if the stacked models' performance was significantly different compared to the base learners. The developed stacked model outperformed its base classifiers and an existing tool in the literature, reaffirming its better performance. The hypothesis is that the incorporation of a diverse set of data, the subsequent feature selection, and a stacked ensemble approach give MolToxPred the edge over other methods. 
In addition to this, an attempt has been made to identify structural alerts responsible for endpoints of the Tox21 data to determine the association of a molecule with a plausible downstream pathway of action. MolToxPred may be helpful for drug discovery and regulatory pipelines in pharmaceutical and other industries for in silico toxicity prediction of small molecule candidates.
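The McNemar test compares two classifiers on the same samples by looking only at the cases where they disagree. The abstract does not say which variant was used; the sketch below assumes the chi-square form with continuity correction, and the predictions are hypothetical.

```python
from math import erfc, sqrt


def mcnemar_test(y_true, pred_a, pred_b):
    """Return (statistic, p_value) for McNemar's test with continuity correction.

    b counts samples that model A classifies correctly and model B does not;
    c counts the reverse. Samples on which both agree carry no information.
    """
    b = sum(1 for t, a, m in zip(y_true, pred_a, pred_b) if a == t and m != t)
    c = sum(1 for t, a, m in zip(y_true, pred_a, pred_b) if a != t and m == t)
    if b + c == 0:
        return 0.0, 1.0
    statistic = (abs(b - c) - 1) ** 2 / (b + c)
    # Survival function of a chi-square distribution with 1 degree of freedom.
    p_value = erfc(sqrt(statistic / 2))
    return statistic, p_value


# Hypothetical predictions on 20 toxic molecules (true label 1 throughout).
y_true = [1] * 20
stacked = [1] * 10 + [0] * 2 + [1] * 8   # stacked model: 18 correct
base = [0] * 10 + [1] * 2 + [1] * 8      # base learner: 10 correct

statistic, p_value = mcnemar_test(y_true, stacked, base)
```

With few discordant pairs, an exact binomial version of the test is usually preferred over the chi-square approximation.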
Affiliation(s)
- Anjali Setiya
- HPC-Medical & Bioinformatics Applications Group, Centre for Development of Advanced Computing (C-DAC) Innovation Park, Panchawati, Pashan Pune 411008 India
| | - Vinod Jani
- HPC-Medical & Bioinformatics Applications Group, Centre for Development of Advanced Computing (C-DAC) Innovation Park, Panchawati, Pashan Pune 411008 India
| | - Uddhavesh Sonavane
- HPC-Medical & Bioinformatics Applications Group, Centre for Development of Advanced Computing (C-DAC) Innovation Park, Panchawati, Pashan Pune 411008 India
| | - Rajendra Joshi
- HPC-Medical & Bioinformatics Applications Group, Centre for Development of Advanced Computing (C-DAC) Innovation Park, Panchawati, Pashan Pune 411008 India
|
29
|
Kim Y, Jung H, Kumar S, Paton RS, Kim S. Designing solvent systems using self-evolving solubility databases and graph neural networks. Chem Sci 2024; 15:923-939. [PMID: 38239675 PMCID: PMC10793204 DOI: 10.1039/d3sc03468b] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/06/2023] [Accepted: 12/04/2023] [Indexed: 01/22/2024] Open
Abstract
Designing solvent systems is key to achieving the facile synthesis and separation of desired products from chemical processes, so many machine learning models have been developed to predict solubilities. However, breakthroughs are needed to address deficiencies in the models' predictive accuracy and generalizability; this can be addressed by expanding and integrating experimental and computational solubility databases. To maximize predictive accuracy, these two databases should not be trained separately, and they should not be simply combined without reconciling the discrepancies from different magnitudes of errors and uncertainties. Here, we introduce self-evolving solubility databases and graph neural networks developed through semi-supervised self-training approaches. Solubilities from quantum-mechanical calculations are referred to during semi-supervised learning, but they are not directly added to the experimental database. Dataset augmentation is performed from 11 637 experimental solubilities to >900 000 data points in the integrated database, while correcting for the discrepancies between experiment and computation. Our model was successfully applied to study solvent selection in organic reactions and separation processes. The accuracy (mean absolute error around 0.2 kcal mol⁻¹ for the test set) is quantitatively useful in exploring Linear Free Energy Relationships between reaction rates and solvation free energies for 11 organic reactions. Our model also accurately predicted the partition coefficients of lignin-derived monomers and drug-like molecules. While there is room for expanding solubility predictions to transition states, radicals, charged species, and organometallic complexes, this approach will be attractive to predictive chemistry areas where experimental, computational, and other heterogeneous data should be combined.
Affiliation(s)
- Yeonjoon Kim
- Department of Chemistry, Colorado State University Fort Collins CO 80523 USA
- Department of Chemistry, Pukyong National University Busan 48513 Republic of Korea
| | - Hojin Jung
- Department of Chemistry, Colorado State University Fort Collins CO 80523 USA
| | - Sabari Kumar
- Department of Chemistry, Colorado State University Fort Collins CO 80523 USA
| | - Robert S Paton
- Department of Chemistry, Colorado State University Fort Collins CO 80523 USA
| | - Seonah Kim
- Department of Chemistry, Colorado State University Fort Collins CO 80523 USA
|
30
|
Choung OH, Vianello R, Segler M, Stiefl N, Jiménez-Luna J. Extracting medicinal chemistry intuition via preference machine learning. Nat Commun 2023; 14:6651. [PMID: 37907461 PMCID: PMC10618272 DOI: 10.1038/s41467-023-42242-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/07/2023] [Accepted: 09/21/2023] [Indexed: 11/02/2023] Open
Abstract
The lead optimization process in drug discovery campaigns is an arduous endeavour where the input of many medicinal chemists is weighed in order to reach a desired molecular property profile. Building the expertise to successfully drive such projects collaboratively is a very time-consuming process that typically spans many years within a chemist's career. In this work we aim to replicate this process by applying artificial intelligence learning-to-rank techniques on feedback that was obtained from 35 chemists at Novartis over the course of several months. We exemplify the usefulness of the learned proxies in routine tasks such as compound prioritization, motif rationalization, and biased de novo drug design. Annotated response data is provided, and developed models and code made available through a permissive open-source license.
Affiliation(s)
- Oh-Hyeon Choung
- Novartis Institutes for Biomedical Research, 4002, Basel, Switzerland
| | - Riccardo Vianello
- Novartis Institutes for Biomedical Research, 4002, Basel, Switzerland
| | - Marwin Segler
- Microsoft Research AI4Science, CB1 2FB, Cambridge, UK
| | - Nikolaus Stiefl
- Novartis Institutes for Biomedical Research, 4002, Basel, Switzerland.
|
31
|
Nguyen HT, Yoshinouchi Y, Hirano M, Nomiyama K, Nakata H, Kim EY, Iwata H. In silico simulations and molecular descriptors to predict in vitro transactivation potencies of Baikal seal estrogen receptors by environmental contaminants. ECOTOXICOLOGY AND ENVIRONMENTAL SAFETY 2023; 265:115495. [PMID: 37748367 DOI: 10.1016/j.ecoenv.2023.115495] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/02/2023] [Revised: 08/31/2023] [Accepted: 09/16/2023] [Indexed: 09/27/2023]
Abstract
Baikal seals (Pusa sibirica) are vulnerable to high levels of organic pollutants. Here, we evaluated the transactivation potencies of bisphenols (BPs) and hydroxylated polychlorinated biphenyls (OH-PCBs) via the Baikal seal estrogen receptor α and β (bsERα and bsERβ) using in vitro and in silico approaches. In vitro reporter gene assays showed that most BPs and OH-PCBs exhibited estrogenic activity with bsER subtype-specific potency. Among the BPs tested, bisphenol AF showed the lowest EC50 for both bsERs. 4'-OH-CB50 and 4'-OH-CB30 showed the lowest EC50 among OH-PCBs tested for bsERα and bsERβ, respectively. 4-((4-Isopropoxyphenyl)-sulfonyl)phenol, 4'-OH-CB72, and 4'-OH-CB121 showed weak bsERα-specific transactivation. Only 4-OH-CB107 affected neither bsER. In silico docking simulations revealed the binding affinities of these chemicals to bsERs and partially explained the in vitro results. Using the in silico simulations and molecular descriptors as explanatory variables and the in vitro results as objective variables, the quantitative structure-activity relationship (QSAR) models constructed for classification and regression accurately separated bsER-active compounds from non-active compounds and predicted the in vitro bsERα- and bsERβ-transactivation potencies, respectively. The QSAR models also suggested that chemical polarity, van der Waals surface area, bridging atom structure, position of the phenolic-OH group, and ligand interactions with key residues of the ligand binding pocket are critical variables to account for the bsER transactivation potency of the test compounds. We also succeeded in constructing computational models for predicting in vitro transactivation potencies of mouse ERs in the same manner, demonstrating the applicability of our approach independent of species-specific responses.
Affiliation(s)
- Hoa Thanh Nguyen
- Center for Marine Environmental Studies, Ehime University, Matsuyama 7908577, Japan
| | - Yuka Yoshinouchi
- Center for Marine Environmental Studies, Ehime University, Matsuyama 7908577, Japan
| | - Masashi Hirano
- Department of Food and Life Science, School of Agriculture, Tokai University, Kumamoto 8612055, Japan
| | - Kei Nomiyama
- Center for Marine Environmental Studies, Ehime University, Matsuyama 7908577, Japan
| | - Haruhiko Nakata
- Faculty of Advanced Science and Technology, Kumamoto University, Kumamoto 8608555, Japan
| | - Eun-Young Kim
- Department of Life and Nanopharmaceutical Science and Department of Biology, Kyung Hee University, Seoul 130701, Republic of Korea
| | - Hisato Iwata
- Center for Marine Environmental Studies, Ehime University, Matsuyama 7908577, Japan.
|
32
|
Krzyzanowski A, Pahl A, Grigalunas M, Waldmann H. Spacial Score─A Comprehensive Topological Indicator for Small-Molecule Complexity. J Med Chem 2023; 66:12739-12750. [PMID: 37651653 PMCID: PMC10544027 DOI: 10.1021/acs.jmedchem.3c00689] [Citation(s) in RCA: 7] [Impact Index Per Article: 7.0] [Received: 04/17/2023] [Indexed: 09/02/2023]
Abstract
The fraction of sp3-hybridized carbons (Fsp3) and the fraction of stereogenic carbons (FCstereo) are two widely employed scores of molecular complexity with strong links to biologically relevant features. However, they do not comprehensively express molecular topology, and they often do not match the chemical intuition of complexity. We propose the spacial score (SPS) as an empirical scoring system that builds upon the principle underlying Fsp3 and FCstereo and expresses the spacial complexity of a compound in a uniform manner on a highly granular scale. The size-normalized SPS (nSPS) can differentiate distributions of natural products and synthetic compounds and is applicable in the analysis of biological activity data. Analysis of the ChEMBL database revealed general trends of increasing selectivity and potency with increasing nSPS. SPS can also be used advantageously in planning and analysis of synthesis programs for direct comparison of chemical transformations and intermediates in reaction sequences.
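For orientation, the two baseline scores that SPS builds upon are simple ratios over the carbon atoms of a molecule. The sketch below assumes the commonly used definitions (sp3 carbons over total carbons, stereogenic carbons over total carbons) and takes precomputed atom counts as input; the counts in the example are hypothetical, and the SPS scoring rules themselves are not reproduced here because the abstract does not state them.

```python
def fraction_sp3(n_sp3_carbons, n_carbons):
    """Fsp3: share of carbon atoms that are sp3-hybridized."""
    if n_carbons == 0:
        return 0.0
    return n_sp3_carbons / n_carbons


def fraction_stereo(n_stereo_carbons, n_carbons):
    """FCstereo: share of carbon atoms that are stereogenic centers."""
    if n_carbons == 0:
        return 0.0
    return n_stereo_carbons / n_carbons


# Hypothetical molecule: 10 carbons, 6 of them sp3, 2 of them stereogenic.
toy_fsp3 = fraction_sp3(6, 10)
toy_fcstereo = fraction_stereo(2, 10)
```

Both ratios ignore ring topology and heteroatoms, which is the limitation the abstract points to when it says they do not comprehensively express molecular topology.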
Affiliation(s)
- Adrian Krzyzanowski
- Department of Chemical Biology, Max Planck Institute of Molecular Physiology, Otto-Hahn-Straße 11, 44227 Dortmund, Germany
- Faculty of Chemistry, Chemical Biology Technical University Dortmund, Otto-Hahn-Straße 6, 44221 Dortmund, Germany
| | - Axel Pahl
- Compound Management and Screening Center, Max Planck Institute of Molecular Physiology, Otto-Hahn-Straße 11, 44227 Dortmund, Germany
| | - Michael Grigalunas
- Department of Chemical Biology, Max Planck Institute of Molecular Physiology, Otto-Hahn-Straße 11, 44227 Dortmund, Germany
| | - Herbert Waldmann
- Department of Chemical Biology, Max Planck Institute of Molecular Physiology, Otto-Hahn-Straße 11, 44227 Dortmund, Germany
- Faculty of Chemistry, Chemical Biology Technical University Dortmund, Otto-Hahn-Straße 6, 44221 Dortmund, Germany
|
33
|
Saranjam L, Nedyalkova M, Fuguet E, Simeonov V, Mas F, Madurga S. Collection of Partition Coefficients in Hexadecyltrimethylammonium Bromide, Sodium Cholate, and Lithium Perfluorooctanesulfonate Micellar Solutions: Experimental Determination and Computational Predictions. Molecules 2023; 28:5729. [PMID: 37570699 PMCID: PMC10420229 DOI: 10.3390/molecules28155729] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/08/2023] [Revised: 07/25/2023] [Accepted: 07/26/2023] [Indexed: 08/13/2023] Open
Abstract
This study focuses on determining the partition coefficients (log P) of a diverse set of 63 molecules in three distinct micellar systems: hexadecyltrimethylammonium bromide (HTAB), sodium cholate (SC), and lithium perfluorooctanesulfonate (LPFOS). The experimental log P values were obtained through micellar electrokinetic chromatography (MEKC) experiments, conducted under controlled pH conditions. Then, Quantum Mechanics (QM) and machine learning approaches are proposed for the prediction of the partition coefficients in these three micellar systems. In the applied QM approach, the experimentally obtained partition coefficients were correlated with the calculated values for the case of the 15 solvent mixtures. Using Density Functional Theory (DFT) with the B3LYP functional, we calculated the solvation free energies of 63 molecules in these 16 solvents. The combined data from the experimental partition coefficients in the three micellar formulations showed that the 1-propanol/water combination demonstrated the best agreement with the experimental partition coefficients for the SC and HTAB micelles. Moreover, we employed the SVM approach and k-means clustering based on the generation of the chemical descriptor space. The analysis revealed distinct partitioning patterns associated with specific characteristic features within each identified class. These results indicate the utility of the combined techniques when we want an efficient and quicker model for predicting partition coefficients in diverse micelles.
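A calculated partition coefficient follows from the difference between the solvation free energies of a solute in the two phases. The sketch below uses the standard thermodynamic relation log P = (ΔG_water − ΔG_organic) / (RT ln 10); the energies in the example are hypothetical, and the abstract does not state the exact conventions or temperature the authors used.

```python
from math import log

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol*K)


def log_p_from_solvation(dg_water, dg_organic, temperature=298.15):
    """log P (organic/water) from solvation free energies in kcal/mol.

    A more negative dg_organic than dg_water means the solute prefers the
    organic phase, which gives a positive log P.
    """
    return (dg_water - dg_organic) / (R_KCAL * temperature * log(10))


# Hypothetical solute: solvation free energy of -4.00 kcal/mol in water
# and -6.73 kcal/mol in the organic solvent, at an assumed 298.15 K.
toy_log_p = log_p_from_solvation(-4.00, -6.73)
```

At 298.15 K the denominator is about 1.364 kcal/mol, so each log P unit corresponds to roughly 1.36 kcal/mol of transfer free energy.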
Affiliation(s)
- Leila Saranjam
- Department of Material Science and Physical Chemistry, Research Institute of Theoretical and Computational Chemistry (IQTCUB), University of Barcelona, C/Martí i Franquès 1, 08028 Barcelona, Spain; (L.S.); (F.M.)
| | - Miroslava Nedyalkova
- Faculty of Chemistry and Pharmacy, University of Sofia “St. Kl. Ohridski”, 1 James Bourchier Blvd., 1164 Sofia, Bulgaria;
| | - Elisabet Fuguet
- Department of Chemical Engineering and Analytical Chemistry, Institute of Biomedicine (IBUB), University of Barcelona, C/Martí i Franquès 1, 08028 Barcelona, Spain;
- Serra Húnter Programme, Generalitat de Catalunya, 08017 Barcelona, Spain
| | - Vasil Simeonov
- Faculty of Chemistry and Pharmacy, University of Sofia “St. Kl. Ohridski”, 1 James Bourchier Blvd., 1164 Sofia, Bulgaria;
| | - Francesc Mas
- Department of Material Science and Physical Chemistry, Research Institute of Theoretical and Computational Chemistry (IQTCUB), University of Barcelona, C/Martí i Franquès 1, 08028 Barcelona, Spain; (L.S.); (F.M.)
| | - Sergio Madurga
- Department of Material Science and Physical Chemistry, Research Institute of Theoretical and Computational Chemistry (IQTCUB), University of Barcelona, C/Martí i Franquès 1, 08028 Barcelona, Spain; (L.S.); (F.M.)
| |
|
34
|
Sobańska AW. In silico assessment of risks associated with pesticides exposure during pregnancy. CHEMOSPHERE 2023; 329:138649. [PMID: 37043889 DOI: 10.1016/j.chemosphere.2023.138649] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/24/2023] [Revised: 04/04/2023] [Accepted: 04/07/2023] [Indexed: 05/03/2023]
Abstract
Novel Quantitative Structure-Activity Relationship (QSAR) models of compounds' placenta (PL) permeability, expressed as their log FM (fetus-to-mother blood concentration) values or as a binary PL1/0 (crossing/non-crossing) score, were generated using a number of statistical tools: Multiple Linear Regression, Boosted Trees, Principal Component Analysis and Artificial Neural Networks, on the basis of molecular descriptors calculated by Mordred software and selected using Partial Least Squares (PLS) analysis. It was established that the most important predictor of both log FM and the binary PL1/0 score is Lipinski, a binary variable reflecting the compounds' ability to satisfy the criteria of drug-likeness according to Lipinski's "Rule of 5". The quantitative (log FM) and qualitative (PL1/0) models of PL permeability were applied to 345 pesticides from different chemical families (triazines, carbamates, pyrethroids, organochlorine, organophosphorus and miscellaneous compounds). The ability of the studied pesticides to cross the placenta was assessed; the basic physico-chemical parameters responsible for good or poor placenta transport of pesticides were identified, and the relationships between the pesticides' PL permeability, blood-brain barrier (BBB) transfer and gastro-intestinal (GI) absorption were investigated. It was found (on the basis of logistic regression analysis) that the probability of a compound crossing the placenta (PL1) is inversely correlated with its lipophilicity and molar refractivity and positively correlated with the total count of oxygen and nitrogen atoms.
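The binary "Lipinski" predictor mentioned in this abstract can be illustrated with the commonly cited Rule of 5 thresholds (molecular weight ≤ 500, log P ≤ 5, at most 5 hydrogen-bond donors, at most 10 hydrogen-bond acceptors). The sketch below is an assumption about how such a flag might be computed from precomputed descriptors; the function name, the `max_violations` parameter, and the descriptor values are hypothetical, and the study's exact definition may differ.

```python
def lipinski_flag(mol_weight, log_p, h_donors, h_acceptors, max_violations=0):
    """Return 1 if the compound satisfies the Rule of 5, else 0.

    max_violations is an illustrative knob: some workflows tolerate one
    violation, others require all four criteria to hold.
    """
    violations = sum([
        mol_weight > 500,   # molecular weight in Da
        log_p > 5,          # octanol/water partition coefficient
        h_donors > 5,       # hydrogen-bond donors
        h_acceptors > 10,   # hydrogen-bond acceptors
    ])
    return 1 if violations <= max_violations else 0


# Hypothetical descriptor sets, not values from the paper.
small_polar = lipinski_flag(mol_weight=215.7, log_p=2.6, h_donors=2, h_acceptors=5)
large_lipophilic = lipinski_flag(mol_weight=505.2, log_p=6.9, h_donors=0, h_acceptors=3)
print(small_polar, large_lipophilic)
```

In practice the four descriptors would come from a descriptor package rather than being typed in by hand; they are passed as plain numbers here to keep the example self-contained.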
Affiliation(s)
- Anna W Sobańska
- Department of Analytical Chemistry, Medical University of Lodz, 90-151, Łódź, Muszyńskiego 1, Poland.
| |
|
35
|
Duncan KM, Trousdale RC, Gonzales CN, Steel WH, Walker RA. l-Phenylalanine Partitioning Mechanisms in Model Biological Membranes. J Phys Chem B 2023. [PMID: 37315336 DOI: 10.1021/acs.jpcb.2c08582] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/16/2023]
Abstract
Time-resolved fluorescence spectroscopy in combination with differential scanning calorimetry (DSC) was used to study the chemical interactions that occur when l-phenylalanine is introduced to solutions containing phosphatidylcholine vesicles. Studies reported in this work address open questions about l-Phe's affinity for lipid vesicle bilayers, the effects of l-Phe partitioning on bilayer properties, l-Phe's solvation within a lipid bilayer, and the amount of l-Phe within that local solvation environment. DSC data show that l-Phe reduces the amount of heat necessary to melt saturated phosphatidylcholine bilayers from their gel to liquid-crystalline state but does not change the transition temperature (Tgel-lc). Time-resolved emission shows only a single l-Phe lifetime at low temperatures corresponding to l-Phe remaining solvated in aqueous solution. At temperatures close to Tgel-lc, a second, shorter lifetime appears that is assigned to l-Phe already embedded within the membrane that becomes hydrated as water starts to permeate the lipid bilayer. This new lifetime is attributed to a conformationally restricted rotamer in the bilayer's polar headgroup region and accounts for up to 30% of the emission amplitude. Results reported for dipalmitoylphosphatidylcholine (DPPC, 16:0) lipid vesicles prove to be general, with similar effects observed for dimyristoylphosphatidylcholine (DMPC, 14:0) and distearoylphosphatidylcholine (DSPC, 18:0) vesicles. Taken together, these results create a complete and compelling picture of how l-Phe associates with model biological membranes. Furthermore, this approach to examining amino acid partitioning into membranes and the resulting solvation forces points to new strategies for studying the structure and chemistry of membrane-soluble peptides and selected membrane proteins.
Affiliation(s)
- Katelyn M Duncan
- Department of Chemistry and Biochemistry, Montana State University, Bozeman, Montana 59717, United States
| | - Rhys C Trousdale
- Department of Chemistry and Biochemistry, Montana State University, Bozeman, Montana 59717, United States
| | - Cristina N Gonzales
- Department of Chemistry, Reed College, Portland, Oregon 97202, United States
| | - William H Steel
- Department of Chemistry, York College of Pennsylvania, York, Pennsylvania 17403, United States
| | - Robert A Walker
- Department of Chemistry and Biochemistry, Montana State University, Bozeman, Montana 59717, United States
- Montana Materials Science Program, Montana State University, Bozeman, Montana 59717, United States
| |
|
36
|
Fan J, Qian C, Zhou S. Machine Learning Spectroscopy Using a 2-Stage, Generalized Constituent Contribution Protocol. RESEARCH (WASHINGTON, D.C.) 2023; 6:0115. [PMID: 37287889 PMCID: PMC10243197 DOI: 10.34133/research.0115] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/27/2022] [Accepted: 03/20/2023] [Indexed: 06/09/2023]
Abstract
A corrected group contribution (CGC)-molecule contribution (MC)-Bayesian neural network (BNN) protocol for accurate prediction of absorption spectra is presented. Upon combination of BNN with CGC methods, the full absorption spectra of various molecules are afforded accurately and efficiently by using only a small dataset for training. Here, with a small training sample (<100), accurate prediction of maximum wavelength for single molecules is afforded with the first stage of the protocol; by contrast, previously reported machine learning (ML) methods require >1,000 samples to ensure the accuracy of prediction. Furthermore, with <500 samples, the mean square error in the prediction of full ultraviolet spectra reaches <2%; for comparison, ML models with molecular SMILES for training require a much larger dataset (>2,000) to achieve comparable accuracy. Moreover, by employing an MC method designed specifically for CGC that properly interprets the mixing rule, the spectra of mixtures are obtained with high accuracy. The logical origins of the good performance of the protocol are discussed in detail. Considering that such a constituent contribution protocol combines chemical principles and data-driven tools, it will most likely prove efficient for solving molecular-property-relevant problems in wider fields.
Affiliation(s)
- Jinming Fan
- College of Chemical and Biological Engineering, Zhejiang Provincial Key Laboratory of Advanced Chemical Engineering Manufacture Technology, Zhejiang University, 310027 Hangzhou, P. R. China
- Institute of Zhejiang University - Quzhou, Zheda Rd. #99, 324000 Quzhou, P. R. China
| | - Chao Qian
- College of Chemical and Biological Engineering, Zhejiang Provincial Key Laboratory of Advanced Chemical Engineering Manufacture Technology, Zhejiang University, 310027 Hangzhou, P. R. China
- Institute of Zhejiang University - Quzhou, Zheda Rd. #99, 324000 Quzhou, P. R. China
| | - Shaodong Zhou
- College of Chemical and Biological Engineering, Zhejiang Provincial Key Laboratory of Advanced Chemical Engineering Manufacture Technology, Zhejiang University, 310027 Hangzhou, P. R. China
- Institute of Zhejiang University - Quzhou, Zheda Rd. #99, 324000 Quzhou, P. R. China
| |
|
37
|
Bao LQ, Baecker D, Mai Dung DT, Phuong Nhung N, Thi Thuan N, Nguyen PL, Phuong Dung PT, Huong TTL, Rasulev B, Casanola-Martin GM, Nam NH, Pham-The H. Development of Activity Rules and Chemical Fragment Design for In Silico Discovery of AChE and BACE1 Dual Inhibitors against Alzheimer's Disease. Molecules 2023; 28:3588. [PMID: 37110831 PMCID: PMC10142303 DOI: 10.3390/molecules28083588] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Received: 03/17/2023] [Revised: 04/15/2023] [Accepted: 04/18/2023] [Indexed: 04/29/2023] Open
Abstract
Multi-target drug development has become an attractive strategy in the discovery of drugs to treat Alzheimer's disease (AzD). In this study, for the first time, a rule-based machine learning (ML) approach with classification trees (CT) was applied for the rational design of novel dual-target acetylcholinesterase (AChE) and β-site amyloid-protein precursor cleaving enzyme 1 (BACE1) inhibitors. Updated data from 3524 compounds with AChE and BACE1 measurements were curated from the ChEMBL database. The best global accuracies of training/external validation for AChE and BACE1 were 0.85/0.80 and 0.83/0.81, respectively. The rules were then applied to screen dual inhibitors from the original databases. Based on the best rules obtained from each classification tree, a set of potential AChE and BACE1 inhibitors was identified, and active fragments were extracted using Murcko-type decomposition analysis. More than 250 novel inhibitors were designed in silico based on active fragments, and their AChE and BACE1 inhibitory activity was predicted using consensus QSAR models and docking validations. The rule-based and ML approach applied in this study may be useful for the in silico design and screening of new AChE and BACE1 dual inhibitors against AzD.
Affiliation(s)
- Le-Quang Bao
- Department of Pharmaceutical Chemistry, Hanoi University of Pharmacy, 13-15 Le Thanh Tong, Hoan Kiem, Hanoi 10000, Vietnam
| | - Daniel Baecker
- Department of Pharmaceutical and Medicinal Chemistry, Institute of Pharmacy, University of Greifswald, Friedrich-Ludwig-Jahn-Straße 17, 17489 Greifswald, Germany
| | - Do Thi Mai Dung
- Department of Pharmaceutical Chemistry, Hanoi University of Pharmacy, 13-15 Le Thanh Tong, Hoan Kiem, Hanoi 10000, Vietnam
| | - Nguyen Phuong Nhung
- Department of Pharmaceutical Chemistry, Hanoi University of Pharmacy, 13-15 Le Thanh Tong, Hoan Kiem, Hanoi 10000, Vietnam
| | - Nguyen Thi Thuan
- Department of Pharmaceutical Chemistry, Hanoi University of Pharmacy, 13-15 Le Thanh Tong, Hoan Kiem, Hanoi 10000, Vietnam
| | - Phuong Linh Nguyen
- College of Computing & Informatics, Drexel University, 3141 Chestnut St., Philadelphia, PA 19104, USA
| | - Phan Thi Phuong Dung
- Department of Pharmaceutical Chemistry, Hanoi University of Pharmacy, 13-15 Le Thanh Tong, Hoan Kiem, Hanoi 10000, Vietnam
| | - Tran Thi Lan Huong
- Department of Pharmaceutical Chemistry, Hanoi University of Pharmacy, 13-15 Le Thanh Tong, Hoan Kiem, Hanoi 10000, Vietnam
| | - Bakhtiyor Rasulev
- Department of Coatings and Polymeric Materials, North Dakota State University, Fargo, ND 58102, USA
| | | | - Nguyen-Hai Nam
- Department of Pharmaceutical Chemistry, Hanoi University of Pharmacy, 13-15 Le Thanh Tong, Hoan Kiem, Hanoi 10000, Vietnam
| | - Hai Pham-The
- Department of Pharmaceutical Chemistry, Hanoi University of Pharmacy, 13-15 Le Thanh Tong, Hoan Kiem, Hanoi 10000, Vietnam
| |
|
38
|
García-Andrade X, García Tahoces P, Pérez-Ríos J, Martínez Núñez E. Barrier Height Prediction by Machine Learning Correction of Semiempirical Calculations. J Phys Chem A 2023; 127:2274-2283. [PMID: 36877614 PMCID: PMC10845151 DOI: 10.1021/acs.jpca.2c08340] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Received: 11/28/2022] [Revised: 02/19/2023] [Indexed: 03/07/2023]
Abstract
Different machine learning (ML) models are proposed in the present work to predict density functional theory-quality barrier heights (BHs) from semiempirical quantum mechanical (SQM) calculations. The ML models include a multitask deep neural network, gradient-boosted trees by means of the XGBoost interface, and Gaussian process regression. The obtained mean absolute errors are similar to those of previous models considering the same number of data points. The ML corrections proposed in this paper could be useful for rapid screening of the large reaction networks that appear in combustion chemistry or in astrochemistry. Finally, our results show that 70% of the features with the highest impact on model output are bespoke predictors. This custom-made set of predictors could be employed by future Δ-ML models to improve the quantitative prediction of other reaction properties.
Affiliation(s)
| | - Pablo García Tahoces
- Department of Electronics and Computer Science, University of Santiago de Compostela, Santiago de Compostela 15782, Spain
| | - Jesús Pérez-Ríos
- Department of Physics, Stony Brook University, Stony Brook, New York 11794, United States
- Institute for Advanced Computational Science, Stony Brook University, Stony Brook, New York 11794-3800, United States
| | - Emilio Martínez Núñez
- Department of Physical Chemistry, University of Santiago de Compostela, Santiago de Compostela 15782, Spain
| |
|
39
|
Lameiro RF, Montanari CA. Investigating the Lack of Translation from Cruzain Inhibition to Trypanosoma cruzi Activity with Machine Learning and Chemical Space Analyses. ChemMedChem 2023; 18:e202200434. [PMID: 36692246 DOI: 10.1002/cmdc.202200434] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/06/2022] [Revised: 01/17/2023] [Accepted: 01/17/2023] [Indexed: 01/25/2023]
Abstract
Chagas disease is a neglected tropical disease caused by the protozoa Trypanosoma cruzi. Cruzain, its main cysteine protease, is commonly targeted in drug discovery efforts to find new treatments for this disease. Even though the essentiality of this enzyme for the parasite has been established, many cruzain inhibitors fail as trypanocidal agents. This lack of translation from biochemical to biological assays can involve several factors, including suboptimal physicochemical properties. In this work, we aim to rationalize this phenomenon through chemical space analyses of calculated molecular descriptors. These include statistical tests, visualization of projections, scaffold analysis, and creation of machine learning models coupled with interpretability methods. Our results demonstrate a significant difference between the chemical spaces of cruzain and T. cruzi inhibitors, with compounds with more hydrogen bond donors and rotatable bonds being more likely to be good cruzain inhibitors, but less likely to be active on T. cruzi. In addition, cruzain inhibitors seem to occupy specific regions of the chemical space that cannot be easily correlated with T. cruzi activity, which means that using predictive modeling to determine whether cruzain inhibitors will be trypanocidal is not a straightforward task. We believe that the conclusions from this work might be of interest for future projects that aim to develop novel trypanocidal compounds.
Affiliation(s)
- Rafael F Lameiro
- Medicinal and Biological Chemistry Group, São Carlos Institute of Chemistry, University of São Paulo, Trabalhador São-Carlense Avenue 400, São Carlos, Brazil
| | - Carlos A Montanari
- Medicinal and Biological Chemistry Group, São Carlos Institute of Chemistry, University of São Paulo, Trabalhador São-Carlense Avenue 400, São Carlos, Brazil
| |
|
40
|
Liu C, Chen Y, Guo G, Zhao Q, Jiang H, Wu K, Peng Q, Chen Y, Fang D, Shen B, Shen H, Wu D, Sun H. Interpretable Machine Learning Model for Predicting Interaction Energies between Dimethyl Sulfide and Potential Absorbing Solvents. Ind Eng Chem Res 2023. [DOI: 10.1021/acs.iecr.2c04559] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/16/2023]
Affiliation(s)
- Chuanlei Liu
- School of Chemical Engineering, East China University of Science and Technology, Shanghai 200237, China
| | - Yuxiang Chen
- School of Chemical Engineering, East China University of Science and Technology, Shanghai 200237, China
| | - Guanchu Guo
- School of Chemical Engineering, East China University of Science and Technology, Shanghai 200237, China
| | - Qiyue Zhao
- School of Chemical Engineering, East China University of Science and Technology, Shanghai 200237, China
| | - Hao Jiang
- School of Chemical Engineering, East China University of Science and Technology, Shanghai 200237, China
| | - Kongguo Wu
- School of Chemical Engineering, East China University of Science and Technology, Shanghai 200237, China
| | - Qilong Peng
- School of Chemical Engineering, East China University of Science and Technology, Shanghai 200237, China
| | - Yu Chen
- School of Chemical Engineering, East China University of Science and Technology, Shanghai 200237, China
| | - Diyi Fang
- School of Chemical Engineering, East China University of Science and Technology, Shanghai 200237, China
| | - Benxian Shen
- School of Chemical Engineering, East China University of Science and Technology, Shanghai 200237, China
- International Joint Research Center of Green Energy Chemical Engineering, East China University of Science and Technology, Shanghai 200237, China
| | - Haitao Shen
- International Joint Research Center of Green Energy Chemical Engineering, East China University of Science and Technology, Shanghai 200237, China
| | - Di Wu
- Alexandra Navrotsky Institute for Experimental Thermodynamics, Washington State University, Pullman, Washington 99163, United States
- Gene and Linda Voiland School of Chemical Engineering and Bioengineering, Washington State University, Pullman, Washington 99163, United States
- Materials Science and Engineering, Washington State University, Pullman, Washington 99163, United States
- Department of Chemistry, Washington State University, Pullman, Washington 99163, United States
| | - Hui Sun
- School of Chemical Engineering, East China University of Science and Technology, Shanghai 200237, China
- International Joint Research Center of Green Energy Chemical Engineering, East China University of Science and Technology, Shanghai 200237, China
| |
|
41
|
Vincze A, Dékány G, Bicsak R, Formanek A, Moreau Y, Koplányi G, Takács G, Katona G, Balogh-Weiser D, Arany Á, Balogh GT. Natural Lipid Extracts as an Artificial Membrane for Drug Permeability Assay: In Vitro and In Silico Characterization. Pharmaceutics 2023; 15:899. [PMID: 36986760 PMCID: PMC10053807 DOI: 10.3390/pharmaceutics15030899] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/30/2023] [Revised: 03/04/2023] [Accepted: 03/06/2023] [Indexed: 03/12/2023] Open
Abstract
In vitro non-cellular permeability models such as the parallel artificial membrane permeability assay (PAMPA) are widely applied tools for early-phase drug candidate screening. In addition to the commonly used porcine brain polar lipid extract for modeling the blood–brain barrier’s permeability, the total and polar fractions of bovine heart and liver lipid extracts were investigated in the PAMPA model by measuring the permeability of 32 diverse drugs. The zeta potential of the lipid extracts and the net charge of their glycerophospholipid components were also determined. Physicochemical parameters of the 32 compounds were calculated using three independent forms of software (Marvin Sketch, RDKit, and ACD/Percepta). The relationship between the lipid-specific permeabilities and the physicochemical descriptors of the compounds was investigated using linear correlation, Spearman correlation, and PCA analysis. While the results showed only subtle differences between total and polar lipids, permeability through liver lipids highly differed from that of the heart or brain lipid-based models. Correlations between the in silico descriptors (e.g., number of amide bonds, heteroatoms, and aromatic heterocycles, accessible surface area, and H-bond acceptor–donor balance) of drug molecules and permeability values were also found, which provides support for understanding tissue-specific permeability.
Affiliation(s)
- Anna Vincze
- Department of Chemical and Environmental Process Engineering, Budapest University of Technology and Economics, Műegyetem rkp. 3, H-1111 Budapest, Hungary
| | - Gergely Dékány
- Department of Chemical and Environmental Process Engineering, Budapest University of Technology and Economics, Műegyetem rkp. 3, H-1111 Budapest, Hungary
| | - Richárd Bicsak
- Department of Chemical and Environmental Process Engineering, Budapest University of Technology and Economics, Műegyetem rkp. 3, H-1111 Budapest, Hungary
| | - András Formanek
- ESAT-STADIUS KU LEUVEN, 3001 Leuven, Belgium
- Department of Measurement and Information Systems, Faculty of Electrical Engineering and Informatics, Budapest University of Technology and Economics, Műegyetem rkp. 3, H-1111 Budapest, Hungary
| | - Yves Moreau
- ESAT-STADIUS KU LEUVEN, 3001 Leuven, Belgium
| | - Gábor Koplányi
- Department of Organic Chemistry and Technology, Faculty of Chemical Technology and Biotechnology, Budapest University of Technology and Economics, Műegyetem rkp. 3, H-1111 Budapest, Hungary
| | - Gergely Takács
- Department of Chemical and Environmental Process Engineering, Budapest University of Technology and Economics, Műegyetem rkp. 3, H-1111 Budapest, Hungary
- Mcule.com Kft, Bartók Béla út 105-113, H-1115 Budapest, Hungary
| | - Gábor Katona
- Institute of Pharmaceutical Technology and Regulatory Affairs, Faculty of Pharmacy, University of Szeged, Eötvös Str. 6, H-6720 Szeged, Hungary
| | - Diána Balogh-Weiser
- Department of Organic Chemistry and Technology, Faculty of Chemical Technology and Biotechnology, Budapest University of Technology and Economics, Műegyetem rkp. 3, H-1111 Budapest, Hungary
- Department of Physical Chemistry and Materials Science, Faculty of Chemical Technology and Biotechnology, Budapest University of Technology and Economics, Műegyetem rkp. 3, H-1111 Budapest, Hungary
| | - Ádám Arany
- ESAT-STADIUS KU LEUVEN, 3001 Leuven, Belgium
| | - György T. Balogh
- Department of Chemical and Environmental Process Engineering, Budapest University of Technology and Economics, Műegyetem rkp. 3, H-1111 Budapest, Hungary
- Institute of Pharmacodynamics and Biopharmacy, Faculty of Pharmacy, University of Szeged, Eötvös u. 6, H-6720 Szeged, Hungary
| |
|
42
|
Su Y, Dai Y, Zeng Y, Wei C, Chen Y, Ge F, Zheng P, Zhou D, Dral PO, Wang C. Interpretable Machine Learning of Two-Photon Absorption. ADVANCED SCIENCE (WEINHEIM, BADEN-WURTTEMBERG, GERMANY) 2023; 10:e2204902. [PMID: 36658720 PMCID: PMC10015897 DOI: 10.1002/advs.202204902] [Citation(s) in RCA: 5] [Impact Index Per Article: 5.0] [Received: 08/26/2022] [Revised: 12/19/2022] [Indexed: 06/17/2023]
Abstract
Molecules with strong two-photon absorption (TPA) are important in many advanced applications such as upconverted lasers and photodynamic therapy, but their design is hampered by the high cost of experimental screening and accurate quantum chemical (QC) calculations. Here a systematic study is performed by collecting an experimental TPA database with ≈900 molecules, analyzing with interpretable machine learning (ML) the key molecular features explaining TPA magnitudes, and building a fast ML model for predictions. The ML model has prediction errors of similar magnitude to those of experimental and affordable QC methods and has the potential for high-throughput screening, as additionally validated with new experimental measurements. ML feature analysis is generally consistent with common beliefs, which it quantifies and rectifies. The most important feature is conjugation length, followed by features reflecting the effects of donor and acceptor substitution and coplanarity.
Affiliation(s)
- Yuming Su
- State Key Laboratory of Physical Chemistry of Solid Surfaces, Department of Chemistry, College of Chemistry and Chemical Engineering, iChem, Innovation Laboratory for Sciences and Technologies of Energy Materials of Fujian Province (IKKEM), Xiamen University, 361005 Xiamen, P. R. China
| | - Yiheng Dai
- State Key Laboratory of Physical Chemistry of Solid Surfaces, Department of Chemistry, College of Chemistry and Chemical Engineering, iChem, Innovation Laboratory for Sciences and Technologies of Energy Materials of Fujian Province (IKKEM), Xiamen University, 361005 Xiamen, P. R. China
| | - Yifan Zeng
- State Key Laboratory of Physical Chemistry of Solid Surfaces, Department of Chemistry, College of Chemistry and Chemical Engineering, iChem, Innovation Laboratory for Sciences and Technologies of Energy Materials of Fujian Province (IKKEM), Xiamen University, 361005 Xiamen, P. R. China
| | - Caiyun Wei
- State Key Laboratory of Physical Chemistry of Solid Surfaces, Department of Chemistry, College of Chemistry and Chemical Engineering, iChem, Innovation Laboratory for Sciences and Technologies of Energy Materials of Fujian Province (IKKEM), Xiamen University, 361005 Xiamen, P. R. China
| | - Yangtao Chen
- State Key Laboratory of Physical Chemistry of Solid Surfaces, Department of Chemistry, College of Chemistry and Chemical Engineering, iChem, Innovation Laboratory for Sciences and Technologies of Energy Materials of Fujian Province (IKKEM), Xiamen University, 361005 Xiamen, P. R. China
| | - Fuchun Ge
- Department of Chemistry, College of Chemistry and Chemical Engineering, iChem, Xiamen University, Fujian Provincial Key Laboratory of Theoretical and Computational Chemistry, Xiamen University, 361005 Xiamen, P. R. China
| | - Peikun Zheng
- Department of Chemistry, College of Chemistry and Chemical Engineering, iChem, Xiamen University, Fujian Provincial Key Laboratory of Theoretical and Computational Chemistry, Xiamen University, 361005 Xiamen, P. R. China
| | - Da Zhou
- School of Mathematical Sciences and Fujian Provincial Key Laboratory of Mathematical Modeling and High-Performance Scientific Computation, Xiamen University, Xiamen 361005, P. R. China
| | - Pavlo O. Dral
- Department of Chemistry, College of Chemistry and Chemical Engineering, iChem, Xiamen University, Fujian Provincial Key Laboratory of Theoretical and Computational Chemistry, Xiamen University, 361005 Xiamen, P. R. China
| | - Cheng Wang
- State Key Laboratory of Physical Chemistry of Solid Surfaces, Department of Chemistry, College of Chemistry and Chemical Engineering, iChem, Innovation Laboratory for Sciences and Technologies of Energy Materials of Fujian Province (IKKEM), Xiamen University, 361005 Xiamen, P. R. China
| |
|
43
|
Predicting the Mechanical Properties of Polyurethane Elastomers Using Machine Learning. CHINESE JOURNAL OF POLYMER SCIENCE 2023. [DOI: 10.1007/s10118-022-2838-6] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Indexed: 02/19/2023]
|
44
|
Svrkota B, Krmar J, Protić A, Otašević B. The secret of reversed-phase/weak cation exchange retention mechanisms in mixed-mode liquid chromatography applied for small drug molecule analysis. J Chromatogr A 2023; 1690:463776. [PMID: 36640679 DOI: 10.1016/j.chroma.2023.463776] [Citation(s) in RCA: 4] [Impact Index Per Article: 4.0] [Received: 11/02/2022] [Revised: 01/02/2023] [Accepted: 01/03/2023] [Indexed: 01/07/2023]
Abstract
Resolving complex sample mixtures by liquid chromatography in a single run is challenging. The so-called mixed-mode liquid chromatography (MMLC), which combines several retention mechanisms within a single column, can provide resource-efficient separation of solutes of diverse nature. The Acclaim Mixed-Mode WCX-1 column, encompassing hydrophobic and weak cation exchange interactions, was employed for the analysis of small drug molecules. The stationary phase's interaction abilities were assessed by analysing molecules of different ionisation potentials. Mixed Quantitative Structure-Retention Relationship (QSRR) models were developed for revealing significant experimental parameters (EPs) and molecular features governing molecular retention. According to the plan of Face-Centred Central Composite Design, EPs (column temperature, acetonitrile content, pH and buffer concentration of aqueous mobile phase) variations were included in QSRR modelling. QSRRs were developed upon the whole data set (global model) and upon discrete parts, related to similarly ionized analytes (local models), by applying gradient boosted trees as a regression tool. Root mean squared errors of prediction for the global model and the local QSRR models for cations, anions and neutrals were respectively 0.131; 0.105; 0.102 and 0.042 with the coefficient of determination 0.947; 0.872; 0.954 and 0.996, indicating satisfactory performances of all models, with slightly better accuracy of local ones. The research showed that influences of EPs were dependent on the molecule's ionisation potential. The molecular descriptors highlighted by models pointed out that electrostatic and hydrophobic interactions and hydrogen bonds participate in the retention process. The molecule's conformation significance was evaluated along with the topological relationship between the interaction centres, explicitly determined for each molecular species through local models. All models showed good molecular retention predictability, thus showing potential for facilitating method development.
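The abstract reports root mean squared errors of prediction and coefficients of determination for the QSRR models. A minimal sketch of how these two metrics are conventionally computed is given below; the function names and the toy retention values are illustrative assumptions, not data or code from the study.

```python
import math


def rmse(observed, predicted):
    """Root mean squared error between observed and predicted values."""
    n = len(observed)
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)


def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


# Hypothetical retention values for four analytes.
observed = [1.0, 2.0, 3.0, 4.0]
predicted = [1.0, 2.0, 3.0, 5.0]
print(rmse(observed, predicted), r_squared(observed, predicted))
```

The two metrics answer different questions: the error is expressed in the units of the modelled response, while the coefficient of determination is relative to the spread of the observed values, so a local model fitted to a narrow retention range can show a low error together with a lower coefficient of determination.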
Affiliation(s)
- Bojana Svrkota
- University of Belgrade - Faculty of Pharmacy, Department of Drug Analysis, Vojvode Stepe 450, 11221 Belgrade, Serbia
| | - Jovana Krmar
- University of Belgrade - Faculty of Pharmacy, Department of Drug Analysis, Vojvode Stepe 450, 11221 Belgrade, Serbia
| | - Ana Protić
- University of Belgrade - Faculty of Pharmacy, Department of Drug Analysis, Vojvode Stepe 450, 11221 Belgrade, Serbia
| | - Biljana Otašević
- University of Belgrade - Faculty of Pharmacy, Department of Drug Analysis, Vojvode Stepe 450, 11221 Belgrade, Serbia.
| |
Collapse
|
45
|
Wieske LHE, Atilaw Y, Poongavanam V, Erdélyi M, Kihlberg J. Going Viral: An Investigation into the Chameleonic Behaviour of Antiviral Compounds. Chemistry 2023; 29:e202202798. [PMID: 36286339 PMCID: PMC10107787 DOI: 10.1002/chem.202202798] [Citation(s) in RCA: 8] [Impact Index Per Article: 8.0] [Received: 09/07/2022] [Revised: 10/23/2022] [Accepted: 10/25/2022] [Indexed: 12/15/2022]
Abstract
The ability to adjust conformations in response to the polarity of the environment, i.e. molecular chameleonicity, is considered to be important for conferring both high aqueous solubility and high cell permeability to drugs in chemical space beyond Lipinski's rule of 5. We determined the conformational ensembles populated by the antiviral drugs asunaprevir, simeprevir, atazanavir and daclatasvir in polar (DMSO-d6) and non-polar (chloroform) environments with NMR spectroscopy. Daclatasvir was fairly rigid, whereas the first three showed large flexibility in both environments, which translated into major differences in solvent-accessible 3D polar surface area within each conformational ensemble. No significant differences in size and polar surface area were observed between the DMSO-d6 and chloroform ensembles of these three drugs. We propose that such flexible compounds be characterized as "partial molecular chameleons" and hypothesize that their ability to adopt conformations with low polar surface area contributes to their membrane permeability and oral absorption.
Affiliation(s)
- Lianne H E Wieske
- Department of Chemistry - BMC, Uppsala University, Box 576, SE-751 23, Uppsala, Sweden
- Yoseph Atilaw
- Department of Chemistry - BMC, Uppsala University, Box 576, SE-751 23, Uppsala, Sweden
- Máté Erdélyi
- Department of Chemistry - BMC, Uppsala University, Box 576, SE-751 23, Uppsala, Sweden
- Jan Kihlberg
- Department of Chemistry - BMC, Uppsala University, Box 576, SE-751 23, Uppsala, Sweden
|
46
|
Sanabria-Chanaga E, Meneses-Ruiz DM, Puertas-Santamaría EF, Mancha-Meléndez FM, Bratoeff E, Loza-Mejía MA, Salazar JR. Synthesis, in silico, and in vivo anti-inflammatory evaluation of 3β-cinnamoyloxy substituted pregna-4,16-diene-6,20-diones derivatives. J Biomol Struct Dyn 2022; 40:12184-12193. [PMID: 34468278 DOI: 10.1080/07391102.2021.1969279] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/24/2022]
Abstract
Pregnane derivatives have been studied mainly for their 5α-reductase activity. However, the anti-inflammatory activities of such compounds are still poorly explored. In the search for new anti-inflammatory agents, seven new pregnane derivatives, 6a-g, with cinnamic acid esters at C-3 were prepared and fully characterized. The anti-inflammatory activity of the compounds was assessed in the TPA-induced mouse ear edema model. Of these, compound 6b was the most active in reducing edema, with an ED50 of 0.017 mg/ear. Molecular docking and molecular dynamics studies were also performed to identify a potential molecular target related to the inflammatory process. The in vivo results suggest that 6b could be a potent anti-inflammatory compound, while the in silico studies suggest that it interacts with some critical enzymes in the inflammatory response.
Affiliation(s)
- Elkin Sanabria-Chanaga
- Departamento de Química, Universidad de Pamplona, Pamplona, Colombia; Departamento de Farmacia, Facultad de Química, Universidad Nacional Autónoma de México, Ciudad Universitaria, Coyoacán, Ciudad de México, México
- Erick Francisco Puertas-Santamaría
- Design, Isolation, and Synthesis of Bioactive Molecules Research Group, Facultad de Ciencias Químicas, Universidad La Salle-México, Ciudad de México, México
- Fernando Manuel Mancha-Meléndez
- Design, Isolation, and Synthesis of Bioactive Molecules Research Group, Facultad de Ciencias Químicas, Universidad La Salle-México, Ciudad de México, México
- Eugene Bratoeff
- Departamento de Farmacia, Facultad de Química, Universidad Nacional Autónoma de México, Ciudad Universitaria, Coyoacán, Ciudad de México, México
- Marco A Loza-Mejía
- Design, Isolation, and Synthesis of Bioactive Molecules Research Group, Facultad de Ciencias Químicas, Universidad La Salle-México, Ciudad de México, México
- Juan Rodrigo Salazar
- Design, Isolation, and Synthesis of Bioactive Molecules Research Group, Facultad de Ciencias Químicas, Universidad La Salle-México, Ciudad de México, México
|
47
|
A TastePeptides-Meta system including an umami/bitter classification model Umami_YYDS, a TastePeptidesDB database and an open-source package Auto_Taste_ML. Food Chem 2022; 405:134812. [DOI: 10.1016/j.foodchem.2022.134812] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/15/2022] [Revised: 10/25/2022] [Accepted: 10/28/2022] [Indexed: 11/11/2022]
|
48
|
Sauer S, Matter H, Hessler G, Grebner C. Optimizing interactions to protein binding sites by integrating docking-scoring strategies into generative AI methods. Front Chem 2022; 10:1012507. [PMID: 36339033 PMCID: PMC9629386 DOI: 10.3389/fchem.2022.1012507] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/05/2022] [Accepted: 09/20/2022] [Indexed: 11/14/2022] Open
Abstract
The identification and optimization of promising lead molecules is essential for drug discovery. Recently, artificial intelligence (AI) based generative methods have provided complementary approaches for generating molecules under specific design constraints of relevance in drug design. The goal of our study is to incorporate protein 3D information directly into generative design by flexible docking plus an adapted protein-ligand scoring function, thereby moving towards automated structure-based design. First, the protein-ligand scoring function RFXscore, integrating individual scoring terms, ligand descriptors, and combined terms, was derived using the PDBbind database and internal data. Next, design results for different workflows are compared to solely ligand-based reward schemes. Our newly proposed, optimal workflow for structure-based generative design is shown to produce promising results, especially for exploration scenarios where diverse structures fitting a protein binding site are requested. The best results are obtained using docking followed by RFXscore, while, depending on the exact application scenario, it was also found useful to combine this approach with other metrics that bias structure generation into "drug-like" chemical space, such as target-activity machine learning models.
Affiliation(s)
- Christoph Grebner
- Synthetic Molecular Design, Integrated Drug Discovery, Sanofi, Frankfurt, Germany
|
49
|
Halder AK, Haghbakhsh R, Voroshylova IV, Duarte ARC, Cordeiro MNDS. Predicting the Surface Tension of Deep Eutectic Solvents: A Step Forward in the Use of Greener Solvents. Molecules 2022; 27:4896. [PMID: 35956845 PMCID: PMC9370217 DOI: 10.3390/molecules27154896] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 07/11/2022] [Revised: 07/28/2022] [Accepted: 07/28/2022] [Indexed: 11/16/2022]
Abstract
Deep eutectic solvents (DESs) are an important class of green solvents that have been developed as an alternative to toxic solvents. However, the large-scale industrial application of DESs requires fine-tuning their physicochemical properties. Among others, surface tension is one such property that has to be considered when designing novel DESs. In this work, we present the results of a detailed evaluation of Quantitative Structure-Property Relationships (QSPR) modeling efforts designed to predict the surface tension of DESs, following the Organization for Economic Co-operation and Development (OECD) guidelines. The data set used comprises a large number of structurally diverse binary DESs, and the models were built systematically through rigorous validation methods, including ‘mixtures-out’- and ‘compounds-out’-based data splitting. The most predictive individual QSPR model found is shown to be statistically robust, besides providing valuable information about the structural and physicochemical features responsible for the surface tension of DESs. Furthermore, the intelligent consensus prediction strategy applied to multiple predictive models led to consensus models with similar statistical robustness to the individual QSPR model. The benefits of the present work also stand out from its reproducibility, since it relies on fully specified computational procedures and on publicly available tools. Finally, our results not only guide the future design and screening of novel DESs with a desirable surface tension but also lay out strategies for efficiently setting up in silico-based models for binary mixtures.
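The two validation schemes named in this abstract can be sketched as follows. This is one common reading of 'mixtures-out' and 'compounds-out' splitting, not the authors' procedure, which the abstract does not detail. The record layout, the field names `hba` and `hbd`, and the component labels are hypothetical placeholders, and no property values are included.

```python
import random


def mixture_key(record):
    # A binary mixture is identified by its unordered pair of components.
    return tuple(sorted((record["hba"], record["hbd"])))


def mixtures_out_split(records, test_fraction=0.25, seed=0):
    """Hold out whole mixtures: a test mixture never appears in training."""
    keys = sorted({mixture_key(r) for r in records})
    random.Random(seed).shuffle(keys)
    n_test = max(1, round(len(keys) * test_fraction))
    test_keys = set(keys[:n_test])
    train = [r for r in records if mixture_key(r) not in test_keys]
    test = [r for r in records if mixture_key(r) in test_keys]
    return train, test


def compounds_out_split(records, held_out):
    """Hold out compounds: any mixture containing one goes to the test set."""
    held_out = set(held_out)
    train = [r for r in records if not set(mixture_key(r)) & held_out]
    test = [r for r in records if set(mixture_key(r)) & held_out]
    return train, test


# Placeholder records: 2 x 3 mixtures, each measured at two temperatures.
records = [{"hba": a, "hbd": d, "temp_K": t}
           for a in ("HBA1", "HBA2")
           for d in ("HBD1", "HBD2", "HBD3")
           for t in (298, 318)]

mix_train, mix_test = mixtures_out_split(records)
cmp_train, cmp_test = compounds_out_split(records, held_out={"HBD3"})
```

Grouping by mixture keeps repeated measurements of the same solvent (for example, at different temperatures) from leaking between training and test sets; the compounds-out variant is stricter, since it asks the model to extrapolate to components it has never seen.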
Affiliation(s)
- Amit Kumar Halder
- LAQV@REQUIMTE, Department of Chemistry and Biochemistry, Faculty of Sciences, University of Porto, 4169-007 Porto, Portugal
- Dr. B. C. Roy College of Pharmacy and Allied Health Sciences, Dr. Meghnad Saha Sarani, Bidhannagar, Durgapur 713212, WB, India
- Correspondence: (A.K.H.); (M.N.D.S.C.); Tel.: +351-2240-2502 (M.N.D.S.C.)
- Reza Haghbakhsh
- Department of Chemical Engineering, Faculty of Engineering, University of Isfahan, Isfahan 81746-73441, Iran
- LAQV@REQUIMTE, Department of Chemistry, NOVA School of Science and Technology, 2829-516 Caparica, Portugal
- Iuliia V. Voroshylova
- LAQV@REQUIMTE, Department of Chemistry and Biochemistry, Faculty of Sciences, University of Porto, 4169-007 Porto, Portugal
- Ana Rita C. Duarte
- LAQV@REQUIMTE, Department of Chemistry, NOVA School of Science and Technology, 2829-516 Caparica, Portugal
- Maria Natalia D. S. Cordeiro
- LAQV@REQUIMTE, Department of Chemistry and Biochemistry, Faculty of Sciences, University of Porto, 4169-007 Porto, Portugal
- Correspondence: (A.K.H.); (M.N.D.S.C.); Tel.: +351-2240-2502 (M.N.D.S.C.)
|
50
|
Integration of Ligand-Based and Structure-Based Methods for the Design of Small-Molecule TLR7 Antagonists. Molecules 2022; 27:4026. [PMID: 35807273 PMCID: PMC9268101 DOI: 10.3390/molecules27134026] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 04/29/2022] [Revised: 05/31/2022] [Accepted: 06/01/2022] [Indexed: 12/30/2022] Open
Abstract
Toll-like receptor 7 (TLR7) is activated in response to the binding of single-stranded RNA. Its over-activation has been implicated in several autoimmune disorders, and thus, it is an established therapeutic target in such circumstances. TLR7 small-molecule antagonists are not yet available for therapeutic use. We conducted a ligand-based drug design of new TLR7 antagonists through a concerted effort encompassing 2D-QSAR, 3D-QSAR, and pharmacophore modelling of 54 reported TLR7 antagonists. The developed 2D-QSAR model showed an excellent correlation (R² training: 0.86; R² test: 0.78) between the experimental and estimated activities. The ligand-based drug design approach utilizing the 3D-QSAR model (R² training: 0.95; R² test: 0.84) demonstrated a significant contribution of electrostatic potential and steric fields towards TLR7 antagonism. This consolidated approach, along with a pharmacophore model with high correlation (R training: 0.94; R test: 0.92), was used to design quinazoline-core-based hTLR7 antagonists. Subsequently, the newly designed molecules were subjected to molecular docking onto the previously proposed binding model and a molecular dynamics study for a better understanding of their binding pattern. The toxicity profiles and drug-likeness characteristics of the designed compounds were evaluated with in silico ADMET predictions. This ligand-based study contributes towards a better understanding of lead optimization and the future development of potent TLR7 antagonists.
|