1
Geci R, Gadaleta D, de Lomana MG, Ortega-Vallbona R, Colombo E, Serrano-Candelas E, Paini A, Kuepfer L, Schaller S. Systematic evaluation of high-throughput PBK modelling strategies for the prediction of intravenous and oral pharmacokinetics in humans. Arch Toxicol 2024; 98:2659-2676. [PMID: 38722347 PMCID: PMC11272695 DOI: 10.1007/s00204-024-03764-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/12/2024] [Accepted: 04/23/2024] [Indexed: 07/26/2024]
Abstract
Physiologically based kinetic (PBK) modelling offers a mechanistic basis for predicting the pharmaco-/toxicokinetics of compounds and thereby provides critical information for integrating toxicity and exposure data to replace animal testing with in vitro or in silico methods. However, traditional PBK modelling depends on animal and human data, which limits its usefulness for non-animal methods. To address this limitation, high-throughput PBK modelling aims to rely exclusively on in vitro and in silico data for model generation. Here, we evaluate a variety of in silico tools and different strategies to parameterise PBK models with input values from various sources in a high-throughput manner. We gather 2000+ publicly available human in vivo concentration-time profiles of 200+ compounds (IV and oral administration), as well as in silico, in vitro and in vivo determined compound-specific parameters required for the PBK modelling of these compounds. Then, we systematically evaluate all possible PBK model parameterisation strategies in PK-Sim and quantify their prediction accuracy against the collected in vivo concentration-time profiles. Our results show that even simple, generic high-throughput PBK modelling can provide accurate predictions of the pharmacokinetics of most compounds (87% of Cmax and 84% of AUC within tenfold). Nevertheless, we also observe major differences in prediction accuracies between the different parameterisation strategies, as well as between different compounds. Finally, we outline a strategy for high-throughput PBK modelling that relies exclusively on freely available tools. Our findings contribute to a more robust understanding of the reliability of high-throughput PBK modelling, which is essential to establish the confidence necessary for its utilisation in Next-Generation Risk Assessment.
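The "within tenfold" figures quoted in the abstract are fold-error statistics. As a minimal sketch of how such a share might be computed (the function name and the example numbers are hypothetical and not taken from the paper; the authors' exact procedure may differ):

```python
def fraction_within_fold(predicted, observed, fold=10.0):
    """Return the share of predictions whose fold error is at most `fold`.

    Fold error is taken here as max(pred/obs, obs/pred), so over- and
    under-prediction are treated symmetrically. This definition is an
    assumption for illustration.
    """
    if len(predicted) != len(observed) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    hits = 0
    for p, o in zip(predicted, observed):
        if p <= 0 or o <= 0:
            raise ValueError("values must be strictly positive")
        fold_error = max(p / o, o / p)
        if fold_error <= fold:
            hits += 1
    return hits / len(predicted)


# Hypothetical Cmax values (arbitrary units), for illustration only.
predicted_cmax = [1.0, 50.0, 0.05, 2.0]
observed_cmax = [2.0, 4.0, 1.0, 2.0]
share = fraction_within_fold(predicted_cmax, observed_cmax)
```

A symmetric ratio is used so that a tenfold over-prediction and a tenfold under-prediction count equally.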
Affiliation(s)
- René Geci
  - esqLABS GmbH, Saterland, Germany
  - Institute for Systems Medicine with Focus on Organ Interaction, University Hospital RWTH Aachen, Aachen, Germany
- Marina García de Lomana
  - Machine Learning Research, Research and Development, Pharmaceuticals, Bayer AG, Berlin, Germany
- Erika Colombo
  - Istituto di Ricerche Farmacologiche Mario Negri IRCCS, Milan, Italy
- Lars Kuepfer
  - Institute for Systems Medicine with Focus on Organ Interaction, University Hospital RWTH Aachen, Aachen, Germany
2
Llompart P, Minoletti C, Baybekov S, Horvath D, Marcou G, Varnek A. Will we ever be able to accurately predict solubility? Sci Data 2024; 11:303. [PMID: 38499581 PMCID: PMC10948805 DOI: 10.1038/s41597-024-03105-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/01/2023] [Accepted: 02/29/2024] [Indexed: 03/20/2024] Open
Abstract
Accurate prediction of thermodynamic solubility by machine learning remains a challenge. Recent models often display good performance, but their reliability may be deceiving when used prospectively. This study investigates the origins of these discrepancies, following three directions: a historical perspective, an analysis of the aqueous solubility dataverse and data quality. We investigated over 20 years of published solubility datasets and models, highlighting overlooked datasets and the overlaps between popular sets. We benchmarked recently published models on a novel curated solubility dataset and report poor performance. We also propose a workflow to curate aqueous solubility data, aiming at producing useful models for bench chemists. Our results demonstrate that some state-of-the-art models are not ready for public usage because they lack a well-defined applicability domain and overlook historical data sources. We report the impact of factors influencing the utility of the models: interlaboratory standard deviation, ionic state of the solute and data sources. The models obtained herein and the quality-assessed datasets are publicly available.
Affiliation(s)
- P Llompart
  - Laboratory of Chemoinformatics, UMR7140, University of Strasbourg, Strasbourg, France
  - IDD/CADD, Sanofi, Vitry-Sur-Seine, France
- S Baybekov
  - Laboratory of Chemoinformatics, UMR7140, University of Strasbourg, Strasbourg, France
- D Horvath
  - Laboratory of Chemoinformatics, UMR7140, University of Strasbourg, Strasbourg, France
- G Marcou
  - Laboratory of Chemoinformatics, UMR7140, University of Strasbourg, Strasbourg, France
- A Varnek
  - Laboratory of Chemoinformatics, UMR7140, University of Strasbourg, Strasbourg, France
3
Szternfeld P, Demoury C, Brian W, Michelet JY, Van Leeuw V, Van Hoeck E, Joly L. Modelling the pesticide transfer during tea and herbal tea infusions by the identification of critical infusion parameters. Food Chem 2023; 429:136893. [PMID: 37480773 DOI: 10.1016/j.foodchem.2023.136893] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/27/2023] [Revised: 07/13/2023] [Accepted: 07/13/2023] [Indexed: 07/24/2023]
Abstract
Pesticide residues in tea and herbal tea often exceed EU maximum residue limits. Consideration of the transfer of pesticides from the leaves to the brew (quantified by transfer factors) is essential to assess the associated risk. This study identified infusion parameters influencing the transfer behaviour of 61 pesticides and elaborated a predictive model for pesticides with unknown transfer factors in black, green, herbal and flavoured teas. Tea type and the presence of flavours were the criteria that most influenced the pesticide transfer. Interestingly, infusion parameters that are individual and area dependent, such as infusion time, temperature and water hardness, did not play a significant role. Beta regression models developed to characterise pesticide behaviour during infusion showed good predictions for most pesticides and revealed that log P was the main physico-chemical parameter to estimate the pesticide transfer. The transfer factors database and validated models are valuable tools for improving risk assessment.
Affiliation(s)
- Philippe Szternfeld
  - Service Organic Contaminants and Additives, Department of Chemical and Physical Health Risks, Sciensano, 14 rue Juliette Wytsman, 1050 Brussels, Belgium
- Claire Demoury
  - Service Risk and Health Impact Assessment, Department of Chemical and Physical Health Risks, Sciensano, 14 rue Juliette Wytsman, 1050 Brussels, Belgium
- Wendy Brian
  - Service Organic Contaminants and Additives, Department of Chemical and Physical Health Risks, Sciensano, 14 rue Juliette Wytsman, 1050 Brussels, Belgium
- Jean-Yves Michelet
  - Service Organic Contaminants and Additives, Department of Chemical and Physical Health Risks, Sciensano, 14 rue Juliette Wytsman, 1050 Brussels, Belgium
- Virginie Van Leeuw
  - Service Organic Contaminants and Additives, Department of Chemical and Physical Health Risks, Sciensano, 14 rue Juliette Wytsman, 1050 Brussels, Belgium
- Els Van Hoeck
  - Service Organic Contaminants and Additives, Department of Chemical and Physical Health Risks, Sciensano, 14 rue Juliette Wytsman, 1050 Brussels, Belgium
- Laure Joly
  - Service Organic Contaminants and Additives, Department of Chemical and Physical Health Risks, Sciensano, 14 rue Juliette Wytsman, 1050 Brussels, Belgium
4
Tang N. Insights into Chemical Structure-Based Modeling for New Sweetener Discovery. Foods 2023; 12:2563. [PMID: 37444301 DOI: 10.3390/foods12132563] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/26/2023] [Revised: 06/27/2023] [Accepted: 06/29/2023] [Indexed: 07/15/2023] Open
Abstract
The search for novel, natural, high-sweetness, low-calorie sweeteners remains open and challenging. In the present study, structure-based machine learning modeling and the sweetness recognition mechanism were investigated to assist this process. It was found that whether or not a compound was sweet was closely related to molecular connectivity and composition (the number of hydrogen bond acceptors and donors), tpsaEfficiency, structural complexity, and shape (nAtomP and Fsp3), while the relative sweetness of sweet compounds was determined more by molecular properties (tpsaEfficiency and Log P), structural complexity and composition (nAtomP and ATSm 1). The built machine learning models exhibited very good performance for classifying the sweet/non-sweet compounds and predicting the relative sweetness of the compounds. Moreover, a specific binding pocket was found for sweet compounds, and the sweet compounds mainly interacted with the VFT domain of the T1R2-T1R3 through hydrogen bonds. In addition, the results indicated that among the sweet compounds, those that were sweeter bound to the VFT domain more strongly than those that had low sweetness. This study provides very useful information for developing new sweeteners.
Affiliation(s)
- Ning Tang
  - Beijing Key Laboratory of Functional Food from Plant Resources, College of Food Science and Nutritional Engineering, China Agricultural University, Beijing 100083, China
5
Song S, Wang Y, Tian X, He W, Chen F, Wu J, Zhang Q. Predicting the Melting Point of Energetic Molecules Using a Learnable Graph Neural Fingerprint Model. J Phys Chem A 2023; 127:4328-4337. [PMID: 37141395 DOI: 10.1021/acs.jpca.3c00112] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Indexed: 05/06/2023]
Abstract
Melting point prediction for organic molecules has drawn widespread attention from both academic and industrial communities. In this work, a learnable graph neural fingerprint (GNF) was employed to develop a melting point prediction model using a dataset of over 90,000 organic molecules. The GNF model exhibited a significant advantage, with a mean absolute error (MAE) of 25.0 K, when compared to other featurization methods. Furthermore, by integrating prior knowledge through a customized descriptor set (i.e., CDS) into GNF, the accuracy of the resulting model, GNF_CDS, improved to 24.7 K, surpassing the performance of previously reported models for a wide range of structurally diverse organic compounds. Moreover, the generalizability of the GNF_CDS model was significantly improved with a decreased MAE of 17 K for an independent dataset containing melt-castable energetic molecules. This work clearly demonstrates that prior knowledge is still beneficial for modeling molecular properties despite the powerful learning capability of graph neural networks, especially in specific fields where chemical data are lacking.
Affiliation(s)
- Siwei Song
  - Institute of Chemical Materials, China Academy of Engineering Physics, Mianyang, Sichuan 621000, China
- Yi Wang
  - Institute of Chemical Materials, China Academy of Engineering Physics, Mianyang, Sichuan 621000, China
- Xiaolan Tian
  - Institute of Chemical Materials, China Academy of Engineering Physics, Mianyang, Sichuan 621000, China
- Wei He
  - School of Aeronautics and Astronautics, Sichuan University, Chengdu, Sichuan 610065, China
- Fang Chen
  - Institute of Chemical Materials, China Academy of Engineering Physics, Mianyang, Sichuan 621000, China
- Junnan Wu
  - Institute of Chemical Materials, China Academy of Engineering Physics, Mianyang, Sichuan 621000, China
- Qinghua Zhang
  - School of Astronautics, Northwestern Polytechnic University, Xi'an, Shaanxi 710072, China
6
Thermochemical Transition in Non-Hydrogen-Bonded Polymers and Theory of Latent Decomposition. Polymers (Basel) 2022; 14:5054. [PMID: 36501449 PMCID: PMC9737646 DOI: 10.3390/polym14235054] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/31/2022] [Revised: 11/15/2022] [Accepted: 11/18/2022] [Indexed: 11/23/2022] Open
Abstract
Although thermosets and various biopolymers cannot be softened without being decomposed, the vast majority of thermoplastics are believed to exhibit thermal transitions solely related to physical alterations of their structure, a behavior typical of low molecular weight substances. In this study, Differential Scanning Calorimetry (DSC), Fourier Transform Infrared Spectroscopy (FTIR) and Thermogravimetry (TGA) were used to study the softening of four common non-hydrogen-bonded thermoplastic polymers (polypropylene, polypropylene-grafted-maleic anhydride, poly(vinyl chloride) and polystyrene) along with a hydrogen-bonded polymer as a reference, namely, poly(vinyl alcohol). It is shown that the softening of these polymers is a thermochemical transition. Based on fundamental concepts of statistical thermodynamics, it is proposed that the thermal transition behavior of all kinds of polymers is qualitatively the same: polymers cannot be softened without being decomposed (in resemblance with their incapability to boil) and the only difference between the various types of polymers is quantitative and lies in the extent of decomposition during softening. Decomposition seems to reach a local maximum during softening; however, it is predicted that polymers constantly decompose even at room temperature and, by heating, (sensible) decomposition is not initiated but simply accelerated. The term "latent decomposition" is proposed to describe this concept.
7
Carrera GVSM. The Melting Point Profile of Organic Molecules: A Chemoinformatic Approach. Advanced Theory and Simulations 2022. [DOI: 10.1002/adts.202200503] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/08/2022]
Affiliation(s)
- Gonçalo V. S. M. Carrera
  - Chemistry Department, LAQV-REQUIMTE, NOVA School of Science and Technology, Caparica 2829-516, Portugal
8
Tsioptsias C, Tsivintzelis I. On the Thermodynamic Thermal Properties of Quercetin and Similar Pharmaceuticals. Molecules 2022; 27:6630. [PMID: 36235166 PMCID: PMC9571029 DOI: 10.3390/molecules27196630] [Citation(s) in RCA: 9] [Impact Index Per Article: 4.5] [Received: 09/20/2022] [Revised: 10/02/2022] [Accepted: 10/03/2022] [Indexed: 11/16/2022]
Abstract
The thermodynamic properties of pharmaceuticals are of major importance since they are involved in drug design, processing, optimization and modelling. In this study, a long-standing confusion regarding the thermodynamic properties of flavonoids and similar pharmaceuticals is recognized and clarified. As a case study, the thermal behavior of quercetin is examined with various techniques. It is shown that quercetin exhibits neither a glass transition nor a melting point; on the contrary, it exhibits various thermochemical transitions (structural relaxation occurring simultaneously with decomposition). Inevitably, the physical meaning of the reported experimental values of the thermodynamic properties, such as the heat of fusion and heat capacity, is questioned. The discussion of this behavior focuses on the weakening of the chemical bonds. The interpretations along with the literature data suggest that the thermochemical transition might be exhibited by various flavonoids and other similar pharmaceuticals, and is related to the difficulty in the prediction/modelling of their melting point.
9
Tsioptsias C, Spartali C, Marras SI, Ntampou X, Tsivintzelis I, Panayiotou C. Thermochemical Transition in Low Molecular Weight Substances: The Example of the Silybin Flavonoid. Molecules 2022; 27:6345. [PMID: 36234879 PMCID: PMC9572013 DOI: 10.3390/molecules27196345] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 08/31/2022] [Revised: 09/21/2022] [Accepted: 09/22/2022] [Indexed: 11/16/2022] Open
Abstract
Silybin is a complex organic molecule with high bioactivity, extracted from the plant Silybum. As a pharmaceutical substance, silybin’s bioactivity has drawn considerable attention, while its other properties, e.g., thermodynamic properties and thermal stability, have been less studied. Silybin has been reported to exhibit a melting point, and values for its heat of fusion have been provided. In this work, differential scanning calorimetry, thermogravimetry including derivative thermogravimetry, infrared spectroscopy, and microscopy were used to provide evidence that silybin exhibits a thermochemical transition, i.e., softening occurring simultaneously with decomposition. Data from the available literature in combination with critical discussion of the results in a general framework suggest that thermochemical transition is a broad effect exhibited by various forms of matter (small molecules, macromolecules, natural, synthetic, organic, inorganic). The increased formation of hydrogen bonding contributes to this behavior through a dual influence: (a) inhibition of melting and (b) facilitation of decomposition due to weakening of chemical bonds.
Affiliation(s)
- Costas Tsioptsias
  - Department of Chemical Engineering, Aristotle University of Thessaloniki, University Campus, 54124 Thessaloniki, Greece
- Christina Spartali
  - Department of Biochemistry and Biotechnology, University of Thessaly, 41500 Larissa, Greece
- Sotirios I. Marras
  - Department of Biochemistry and Biotechnology, University of Thessaly, 41500 Larissa, Greece
- Xanthi Ntampou
  - Department of Chemical Engineering, Aristotle University of Thessaloniki, University Campus, 54124 Thessaloniki, Greece
- Ioannis Tsivintzelis
  - Department of Chemical Engineering, Aristotle University of Thessaloniki, University Campus, 54124 Thessaloniki, Greece
- Costas Panayiotou
  - Department of Chemical Engineering, Aristotle University of Thessaloniki, University Campus, 54124 Thessaloniki, Greece
- Correspondence: (C.T.); (I.T.); (C.P.)
10
Okezue MA, Byrn SJ, Clase KL. Determining the solubilities for benzoate, nicotinate, hydrochloride, and malonate salts of bedaquiline. Int J Pharm 2022; 627:122229. [PMID: 36162611 DOI: 10.1016/j.ijpharm.2022.122229] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/25/2022] [Revised: 09/11/2022] [Accepted: 09/18/2022] [Indexed: 11/16/2022]
Abstract
Determining the solubility of a compound is important for predicting its oral bioavailability, the medium to be used for dissolution, and solvents for cleaning during manufacturing. The solubilities of the newly synthesized benzoate, hydrochloride, nicotinate, and malonate salts of bedaquiline were elucidated, and the plausible reasons for the differences observed in their experimental aqueous solubilities were highlighted. The shake flask method was used to determine the experimental solubilities of the bedaquiline free base and all the salts in water, 0.01 N HCl, and pH 6.8 buffer. The molar and mole fraction solubility estimates of the salts were determined using equations for ideal and non-ideal situations. Furthermore, the relative contributions of the lattice and activity coefficient to the overall aqueous solubility of the salts were predicted graphically. The new salts, ranked hydrochloride [0.6437 mg/mL] > malonate [0.0268 mg/mL] > nicotinate [0.0024 mg/mL] > benzoate [0.0004 mg/mL], showed improved aqueous solubility over the free base. The general solubility equation [GSE] fairly predicted the solubilities for the benzoate and malonate salts, but the ideal solubility equations provided poor estimates of their experimental values. Based on the ideal solubility estimates, the crystal lattice contributions of the salts ranked malonate > nicotinate > HCl > benzoate. However, using the activity coefficient values, the order of hydrophobicity of the bedaquiline salts was: benzoate > nicotinate > malonate > HCl. The salt forms of bedaquiline offered additional solubility as a function of their crystallinity and hydrophobicity.
Affiliation(s)
- Mercy A Okezue
  - Department of Industrial & Physical Pharmacy, Purdue University, West Lafayette, IN, USA
- Stephen J Byrn
  - Department of Industrial & Physical Pharmacy, Purdue University, West Lafayette, IN, USA
- Kari L Clase
  - School of Agricultural & Biological Engineering, Biotechnology Innovation & Regulatory Science, Purdue University, West Lafayette, IN, USA
11
Tsioptsias C, Tsivintzelis I. Insights on thermodynamic thermal properties and infrared spectroscopic band assignments of gallic acid. J Pharm Biomed Anal 2022; 221:115065. [PMID: 36162278 DOI: 10.1016/j.jpba.2022.115065] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Received: 07/27/2022] [Revised: 09/10/2022] [Accepted: 09/16/2022] [Indexed: 02/07/2023]
Abstract
Gallic acid (3,4,5-trihydroxybenzoic acid) is a popular nutraceutical found in various natural sources. Confusion regarding its thermodynamic properties, e.g., melting point, can be detected in the reported literature values. Similar issues exist for the assignment of its spectroscopic bands in the region of hydroxyl stretching vibrations. In this study, thermal analysis techniques, infrared spectroscopy and X-ray diffraction were used to study the thermal behavior of gallic acid. It is shown that gallic acid exhibits various thermochemical transitions (solid-solid and solid-liquid transitions). The value of the specific heat of the thermal transition around 90 °C indicates that this effect is related not only to water removal but also to decomposition. The absence of significant/exclusive water removal at 90 °C suggests that water present in the structure of gallic acid is strongly bound, while the main pathway for the decomposition around 90 °C seems to be dehydration through an esterification reaction between -COOH and -OH groups of gallic acid. Recrystallization of gallic acid from a methanol-heavy water solvent mixture leads to the incorporation of heavy water in its structure. The comparative evaluation of the recrystallized and raw gallic acid allows for a proper spectroscopic band assignment of various vibrations. The thermal effect around 260 °C is a typical thermochemical transition and not a melting point. The extensive polymorphism of gallic acid and the respective solid-solid transformations are also related to partial decomposition.
Affiliation(s)
- C Tsioptsias
  - Laboratory of Physical Chemistry, Department of Chemical Engineering, Aristotle University of Thessaloniki, 54124 Thessaloniki, Greece
- I Tsivintzelis
  - Laboratory of Physical Chemistry, Department of Chemical Engineering, Aristotle University of Thessaloniki, 54124 Thessaloniki, Greece
12
Avdeef A, Kansy M. Trends in PhysChem Properties of Newly Approved Drugs over the Last Six Years; Predicting Solubility of Drugs Approved in 2021. J Solution Chem 2022. [DOI: 10.1007/s10953-022-01199-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/01/2022]
13
Xiouras C, Cameli F, Quilló GL, Kavousanakis ME, Vlachos DG, Stefanidis GD. Applications of Artificial Intelligence and Machine Learning Algorithms to Crystallization. Chem Rev 2022; 122:13006-13042. [PMID: 35759465 DOI: 10.1021/acs.chemrev.2c00141] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Indexed: 12/28/2022]
Abstract
Artificial intelligence and specifically machine learning applications are nowadays used in a variety of scientific applications and cutting-edge technologies, where they have a transformative impact. Such an assembly of statistical and linear algebra methods making use of large data sets is becoming more and more integrated into chemistry and crystallization research workflows. This review aims to present, for the first time, a holistic overview of machine learning and cheminformatics applications as a novel, powerful means to accelerate the discovery of new crystal structures, predict key properties of organic crystalline materials, simulate, understand, and control the dynamics of complex crystallization process systems, as well as contribute to high throughput automation of chemical process development involving crystalline materials. We critically review the advances in these new, rapidly emerging research areas, raising awareness in issues such as the bridging of machine learning models with first-principles mechanistic models, data set size, structure, and quality, as well as the selection of appropriate descriptors. At the same time, we propose future research at the interface of applied mathematics, chemistry, and crystallography. Overall, this review aims to increase the adoption of such methods and tools by chemists and scientists across industry and academia.
Affiliation(s)
- Christos Xiouras
  - Chemical Process R&D, Crystallization Technology Unit, Janssen R&D, Turnhoutseweg 30, 2340 Beerse, Belgium
- Fabio Cameli
  - Department of Chemical and Biomolecular Engineering, University of Delaware, 150 Academy Street, Newark, Delaware 19716, United States
- Gustavo Lunardon Quilló
  - Chemical Process R&D, Crystallization Technology Unit, Janssen R&D, Turnhoutseweg 30, 2340 Beerse, Belgium
  - Chemical and BioProcess Technology and Control, Department of Chemical Engineering, Faculty of Engineering Technology, KU Leuven, Gebroeders de Smetstraat 1, 9000 Ghent, Belgium
- Mihail E Kavousanakis
  - School of Chemical Engineering, National Technical University of Athens, Heroon Polytechniou 9, 15780 Zografou, Greece
- Dionisios G Vlachos
  - Department of Chemical and Biomolecular Engineering, University of Delaware, 150 Academy Street, Newark, Delaware 19716, United States
- Georgios D Stefanidis
  - School of Chemical Engineering, National Technical University of Athens, Heroon Polytechniou 9, 15780 Zografou, Greece
  - Laboratory for Chemical Technology, Ghent University, Tech Lane Ghent Science Park 125, B-9052 Ghent, Belgium
14
Avdeef A, Kansy M. Predicting Solubility of Newly-Approved Drugs (2016–2020) with a Simple ABSOLV and GSE(Flexible-Acceptor) Consensus Model Outperforming Random Forest Regression. J Solution Chem 2022; 51:1020-1055. [PMID: 35153342 PMCID: PMC8818506 DOI: 10.1007/s10953-022-01141-7] [Citation(s) in RCA: 10] [Impact Index Per Article: 5.0] [Received: 09/08/2021] [Accepted: 11/10/2021] [Indexed: 11/24/2022]
Abstract
This study applies the ‘Flexible-Acceptor’ variant of the General Solubility Equation, GSE(Φ,B), to the prediction of the aqueous intrinsic solubility, log10 S0, of recently FDA-approved (2016–2020) ‘small-molecule’ new molecular entities (NMEs). The novel equation had been shown to predict the solubility of drugs beyond Lipinski’s ‘Rule of 5’ chemical space (bRo5) to a precision nearly matching that of the Random Forest Regression (RFR) machine learning method. Since then, it was found that the GSE(Φ,B) appears to work well not only for bRo5 NMEs, but also for Ro5 drugs. To put GSE(Φ,B) in context, Yalkowsky’s GSE(classic), Abraham’s ABSOLV, and Breiman’s RFR models were also applied to predict log10 S0 of 72 newly approved NMEs for which useable reported solubility values could be accessed (nearly 60% from FDA New Drug Application published reports). Except for GSE(classic), the prediction models were retrained with an enlarged version of the Wiki-pS0 database (nearly 400 added log10 S0 entries since our recent previous study). Thus, these four models were further validated by the additional independent solubility measurements which the newly approved drugs introduced. The prediction methods ranked RFR ~ GSE(Φ,B) > ABSOLV > GSE(classic) in performance. It was further demonstrated that the biases generated in the four separate models could be nearly eliminated in a consensus model based on the average of just two of the methods: GSE(Φ,B) and ABSOLV. The resulting consensus prediction equation is simple in form and can be easily incorporated into spreadsheet calculations. Even more significantly, it slightly outperformed the RFR method.
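To make the two ingredients named in the abstract concrete, the sketch below shows Yalkowsky's classic General Solubility Equation, log10 S0 = 0.5 − 0.01(MP − 25) − log P (MP in °C, S0 in mol/L), and a plain two-model average. The abstract does not give the form of GSE(Φ,B) or the ABSOLV coefficients, so the consensus function simply takes two already-computed predictions; the function names and the numeric inputs are hypothetical.

```python
def gse_classic(log_p, melting_point_c):
    """Classic GSE estimate of intrinsic solubility, log10 S0 (mol/L).

    For compounds that melt at or below 25 degrees C the melting-point
    term is set to zero, as is usual for this equation.
    """
    mp_term = max(melting_point_c - 25.0, 0.0)
    return 0.5 - 0.01 * mp_term - log_p


def consensus_log_s0(prediction_a, prediction_b):
    """Unweighted average of two log10 S0 predictions.

    In the abstract the two inputs would be GSE(Phi,B) and ABSOLV;
    here they are simply numbers supplied by the caller.
    """
    return (prediction_a + prediction_b) / 2.0


# Hypothetical compound: log P = 2.0, melting point 125 degrees C.
estimate = gse_classic(log_p=2.0, melting_point_c=125.0)

# Hypothetical predictions from two other models, averaged.
averaged = consensus_log_s0(-2.5, -3.5)
```

Because both functions are single arithmetic expressions, the same calculation fits in a spreadsheet cell, which matches the abstract's remark about spreadsheet use.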
15
Structural modification aimed for improving solubility of lead compounds in early phase drug discovery. Bioorg Med Chem 2022; 56:116614. [DOI: 10.1016/j.bmc.2022.116614] [Citation(s) in RCA: 7] [Impact Index Per Article: 3.5] [Received: 10/06/2021] [Revised: 12/15/2021] [Accepted: 01/06/2022] [Indexed: 12/19/2022]
16
Li Y, Xu Y, Yu Y. CRNNTL: Convolutional Recurrent Neural Network and Transfer Learning for QSAR Modeling in Organic Drug and Material Discovery. Molecules 2021; 26:7257. [PMID: 34885843 PMCID: PMC8658888 DOI: 10.3390/molecules26237257] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 10/08/2021] [Revised: 11/25/2021] [Accepted: 11/26/2021] [Indexed: 11/16/2022] Open
Abstract
Molecular latent representations, derived from autoencoders (AEs), have been widely used for drug or material discovery over the past couple of years. In particular, a variety of machine learning methods based on latent representations have shown excellent performance on quantitative structure–activity relationship (QSAR) modeling. However, their sequential features have not been considered in most cases. In addition, data scarcity is still the main obstacle for deep learning strategies, especially for bioactivity datasets. In this study, we propose the convolutional recurrent neural network and transfer learning (CRNNTL) method, inspired by the applications of polyphonic sound detection and electrocardiogram classification. Our model takes advantage of both convolutional and recurrent neural networks for feature extraction, as well as the data augmentation method. According to QSAR modeling on 27 datasets, CRNNTL can outperform or compete with state-of-the-art methods in both drug and material properties. In addition, the performance on one isomer-based dataset indicates that its excellent performance results from an improved ability in global feature extraction while the ability in local feature extraction is maintained. Then, the transfer learning results show that CRNNTL can overcome data scarcity when choosing relative source datasets. Finally, the high versatility of our model is shown by using different latent representations as inputs from other types of AEs.
Affiliation(s)
- Yaqin Li
- West China Tianfu Hospital, Sichuan University, Chengdu 610041, China
- Correspondence: (Y.L.); (Y.Y.)
| | - Yongjin Xu
- Department of Chemistry and Molecular Biology, University of Gothenburg, Kemivägen 10, 41296 Gothenburg, Sweden;
| | - Yi Yu
- Department of Chemistry and Molecular Biology, University of Gothenburg, Kemivägen 10, 41296 Gothenburg, Sweden;
- Correspondence: (Y.L.); (Y.Y.)
|
17
|
Haywood AL, Redshaw J, Hanson-Heine MWD, Taylor A, Brown A, Mason AM, Gärtner T, Hirst JD. Kernel Methods for Predicting Yields of Chemical Reactions. J Chem Inf Model 2021; 62:2077-2092. [PMID: 34699222 DOI: 10.1021/acs.jcim.1c00699] [Citation(s) in RCA: 14] [Impact Index Per Article: 4.7] [Indexed: 12/18/2022]
Abstract
The use of machine learning methods for the prediction of reaction yield is an emerging area. We demonstrate the applicability of support vector regression (SVR) for predicting reaction yields, using combinatorial data. Molecular descriptors used in regression tasks related to chemical reactivity have often been based on time-consuming, computationally demanding quantum chemical calculations, usually density functional theory. Structure-based descriptors (molecular fingerprints and molecular graphs) are quicker and easier to calculate and are applicable to any molecule. In this study, SVR models built on structure-based descriptors were compared to models built on quantum chemical descriptors. The models were evaluated along the dimension of each reaction component in a set of Buchwald-Hartwig amination reactions. The structure-based SVR models outperformed the quantum chemical SVR models, along the dimension of each reaction component. The applicability of the models was assessed with respect to similarity to training. Prospective predictions of unseen Buchwald-Hartwig reactions are presented for synthetic assessment, to validate the generalizability of the models, with particular interest along the aryl halide dimension.
Affiliation(s)
- Alexe L Haywood
- School of Chemistry, University of Nottingham, University Park, Nottingham NG7 2RD, U.K
| | - Joseph Redshaw
- School of Chemistry, University of Nottingham, University Park, Nottingham NG7 2RD, U.K
| | | | - Adam Taylor
- GlaxoSmithKline, Gunnels Wood Road, Stevenage SG1 2NY, U.K
| | - Alex Brown
- GlaxoSmithKline, Gunnels Wood Road, Stevenage SG1 2NY, U.K
| | - Andrew M Mason
- GlaxoSmithKline, Gunnels Wood Road, Stevenage SG1 2NY, U.K
| | - Thomas Gärtner
- Machine Learning Research Unit, TU Wien Informatics, Vienna 1040, Austria
| | - Jonathan D Hirst
- School of Chemistry, University of Nottingham, University Park, Nottingham NG7 2RD, U.K
|
18
|
Koutsoukos S, Philippi F, Malaret F, Welton T. A review on machine learning algorithms for the ionic liquid chemical space. Chem Sci 2021; 12:6820-6843. [PMID: 34123314 PMCID: PMC8153233 DOI: 10.1039/d1sc01000j] [Citation(s) in RCA: 49] [Impact Index Per Article: 16.3] [Received: 02/19/2021] [Accepted: 04/28/2021] [Indexed: 01/05/2023]
Abstract
There are thousands of papers published every year investigating the properties and possible applications of ionic liquids. Industrial use of these exceptional fluids requires adequate understanding of their physical properties, in order to create the ionic liquid that will optimally suit the application. Computational property prediction arose from the urgent need to minimise the time and cost that would be required to experimentally test different combinations of ions. This review discusses the use of machine learning algorithms as property prediction tools for ionic liquids (either as standalone methods or in conjunction with molecular dynamics simulations), presents common problems of training datasets and proposes ways that could lead to more accurate and efficient models.
Affiliation(s)
- Spyridon Koutsoukos
- Department of Chemistry, Molecular Sciences Research Hub, Imperial College London White City Campus London W12 0BZ UK
| | - Frederik Philippi
- Department of Chemistry, Molecular Sciences Research Hub, Imperial College London White City Campus London W12 0BZ UK
| | - Francisco Malaret
- Department of Chemical Engineering, Imperial College London South Kensington Campus London SW7 2AZ UK
| | - Tom Welton
- Department of Chemistry, Molecular Sciences Research Hub, Imperial College London White City Campus London W12 0BZ UK
|
19
|
Mi W, Chen H, Zhu DA, Zhang T, Qian F. Melting point prediction of organic molecules by deciphering the chemical structure into a natural language. Chem Commun (Camb) 2021; 57:2633-2636. [PMID: 33587048 DOI: 10.1039/d0cc07384a] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Indexed: 11/21/2022]
Abstract
Establishing quantitative structure-property relationships for the rational design of small molecule drugs at the early discovery stage is highly desirable. Using natural language processing (NLP), we proposed a machine learning model to process the line notation of small organic molecules, allowing the prediction of their melting points. The model prediction accuracy benefits from training upon different canonicalized SMILES forms of the same molecules and does not decrease with increasing size, complexity, and structural flexibility. When a combination of two different canonicalized SMILES forms is used to train the model, the prediction accuracy improves. Largely distinguished from the previous fragment-based or descriptor-based models, the prediction accuracy of this NLP-based model does not decrease with increasing size, complexity, and structural flexibility of molecules. By representing the chemical structure as a natural language, this NLP-based model offers a potential tool for quantitative structure-property prediction for drug discovery and development.
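Treating a SMILES string as a "sentence" first requires splitting it into tokens. The abstract does not say how the authors tokenized their input, so the following is only a minimal sketch under that assumption: the regular expression and the helper name `tokenize_smiles` are illustrative and not taken from the paper.

```python
import re
from typing import List

# Illustrative token pattern (an assumption, not the authors' scheme):
# bracket atoms, two-letter halogens, two-digit ring closures, single-letter
# atoms, ring-closure digits, and bond/branch/stereo symbols.
SMILES_TOKEN = re.compile(
    r"\[[^\]]+\]|Br|Cl|%\d{2}|[A-Za-z]|\d|[=#\-\+\(\)/\\\.@:~\*\$]"
)


def tokenize_smiles(smiles: str) -> List[str]:
    """Split a SMILES string into tokens, failing loudly on unknown characters."""
    tokens = SMILES_TOKEN.findall(smiles)
    # findall silently skips characters it cannot match, so check for losses.
    if "".join(tokens) != smiles:
        raise ValueError(f"Unrecognized characters in SMILES: {smiles!r}")
    return tokens


if __name__ == "__main__":
    print(tokenize_smiles("CC(=O)Cl"))              # acetyl chloride
    print(tokenize_smiles("c1ccccc1[N+](=O)[O-]"))  # nitrobenzene
```

The resulting token lists can then be mapped to integer indices and fed to any sequence model; different SMILES strings for the same molecule simply produce different token sequences.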
Affiliation(s)
- Weiming Mi
- Department of Automation, Tsinghua University, Beijing National Research Center for Information Science and Technology, Beijing 100084, P. R. China.
|
20
|
Abstract
Molecular dynamics (MD) simulations have become increasingly useful in the modern drug development process. In this review, we give a broad overview of the current application possibilities of MD in drug discovery and pharmaceutical development. Starting from the target validation step of the drug development process, we give several examples of how MD studies can give important insights into the dynamics and function of identified drug targets such as sirtuins, RAS proteins, or intrinsically disordered proteins. The role of MD in antibody design is also reviewed. In the lead discovery and lead optimization phases, MD facilitates the evaluation of the binding energetics and kinetics of the ligand-receptor interactions, therefore guiding the choice of the best candidate molecules for further development. The importance of considering the biological lipid bilayer environment in the MD simulations of membrane proteins is also discussed, using G-protein coupled receptors and ion channels as well as the drug-metabolizing cytochrome P450 enzymes as relevant examples. Lastly, we discuss the emerging role of MD simulations in facilitating the pharmaceutical formulation development of drugs and candidate drugs. Specifically, we look at how MD can be used in studying the crystalline and amorphous solids, the stability of amorphous drug or drug-polymer formulations, and drug solubility. Moreover, since nanoparticle drug formulations are of great interest in the field of drug delivery research, different applications of nano-particle simulations are also briefly summarized using multiple recent studies as examples. In the future, the role of MD simulations in facilitating the drug development process is likely to grow substantially with the increasing computer power and advancements in the development of force fields and enhanced MD methodologies.
|
21
|
Qiu J, Li J, Albrecht J, Janey J. Simple Method (CHEM-SP) to Predict Solubility from 2-D Chemical Structures. Org Process Res Dev 2020. [DOI: 10.1021/acs.oprd.0c00404] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Indexed: 11/29/2022]
Affiliation(s)
- Jun Qiu
- Product Development, Bristol-Myers Squibb Company, One Squibb Drive, New Brunswick, New Jersey 08903, United States
| | - Jun Li
- Product Development, Bristol-Myers Squibb Company, One Squibb Drive, New Brunswick, New Jersey 08903, United States
| | - Jacob Albrecht
- Product Development, Bristol-Myers Squibb Company, One Squibb Drive, New Brunswick, New Jersey 08903, United States
| | - Jacob Janey
- Product Development, Bristol-Myers Squibb Company, One Squibb Drive, New Brunswick, New Jersey 08903, United States
|
22
|
Feng Z, Cao J, Zhang Q, Lin L. The drug likeness analysis of anti-inflammatory clerodane diterpenoids. Chin Med 2020; 15:126. [PMID: 33298100 PMCID: PMC7727157 DOI: 10.1186/s13020-020-00407-w] [Citation(s) in RCA: 9] [Impact Index Per Article: 2.3] [Received: 06/21/2020] [Accepted: 11/26/2020] [Indexed: 12/14/2022]
Abstract
Inflammation is an active defense response of the body against external stimuli. Long-term low-grade inflammation is considered an aggravating factor in aging, cancer, neurodegeneration and metabolic disorders. The glucocorticoids and non-steroidal anti-inflammatory drugs in clinical use are not suitable for chronic inflammation. Therefore, there is an urgent need to discover and develop new, effective and safe drugs to attenuate inflammation. Clerodane diterpenoids, a class of bicyclic diterpenoids, are widely distributed in plants of the Labiatae, Euphorbiaceae and Verbenaceae families, as well as in fungi, bacteria, and marine sponges. Dozens of anti-inflammatory clerodane diterpenoids have been identified in different assays, both in vitro and in vivo. In the current review, recent research progress on anti-inflammatory clerodane diterpenoids is summarized and their druglikeness is analyzed, providing a basis for the further development of anti-inflammatory drugs.
Affiliation(s)
- Zheling Feng
- State Key Laboratory of Quality Research in Chinese Medicine, Institute of Chinese Medical Sciences, University of Macau, Avenida da Universidade, Taipa, Macau, 999078, People's Republic of China
| | - Jun Cao
- State Key Laboratory of Quality Research in Chinese Medicine, Institute of Chinese Medical Sciences, University of Macau, Avenida da Universidade, Taipa, Macau, 999078, People's Republic of China
| | - Qingwen Zhang
- State Key Laboratory of Quality Research in Chinese Medicine, Institute of Chinese Medical Sciences, University of Macau, Avenida da Universidade, Taipa, Macau, 999078, People's Republic of China
| | - Ligen Lin
- State Key Laboratory of Quality Research in Chinese Medicine, Institute of Chinese Medical Sciences, University of Macau, Avenida da Universidade, Taipa, Macau, 999078, People's Republic of China.
|
23
|
Li L, Yin XH, Diao KS. Improving the Solubility and Bioavailability of Pemafibrate via a New Polymorph Form II. ACS Omega 2020; 5:26245-26252. [PMID: 33073151 PMCID: PMC7557989 DOI: 10.1021/acsomega.0c04005] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 08/19/2020] [Accepted: 09/21/2020] [Indexed: 06/11/2023]
Abstract
Pemafibrate is a new-generation anti-hyperlipidemia drug. However, its poor solubility in water (0.410 mg/mL at 25 °C) has limited its oral bioavailability. In this study, we aimed to improve the solubility and consequently the oral bioavailability of pemafibrate via a new polymorph. A new polymorph, Form II, was successfully obtained by controlling the crystallization temperature and was characterized by multiple analytical methods. The thermodynamic properties of Form I and Form II are almost the same: the melting points of crystal Form I [differential scanning calorimetry (DSC) onset: 97.5 °C, melting entropy: -76 J/g] and crystal Form II (DSC onset: 96.6 °C, melting entropy: -80 J/g) are very close, and the crystallinity of both is very high. In pure water, the intrinsic dissolution rate (IDR) and powder solubility of Form II are about 1.9 times those of Form I. In medium, the IDR characterization was performed in a pH 6.8 buffer. The solubility of Form II in 0.1 M HCl (pH 1.0) and phosphate buffer (pH 6.8) was investigated, and the results showed that the solubility of Form II was 2.1 and 2.0 times that of Form I, respectively. The crystal structure of Form II shows that the hydrophilic carboxyl groups of the compound are arranged outside the unit cell, which may be the reason for the increased solubility. We also studied the pharmacokinetics in beagle dogs. The mean AUC0-24h of Form II is about 2.6 times that of Form I, indicating that the solubility and bioavailability of pemafibrate can indeed be improved by forming the new polymorph Form II. Form II may become an ideal solid form of the active pharmaceutical ingredient suitable for pharmaceutical preparations, and it warrants further study.
Affiliation(s)
- Long Li
- Sichuan Kelun Pharmaceutical Research Institute Co., Ltd., Chengdu 610000, China
| | - Xian-Hong Yin
- College of Chemistry and Chemical Engineering, Guangxi University for Nationalities, Nanning 530006, China
| | - Kai-Sheng Diao
- College of Chemistry and Chemical Engineering, Guangxi University for Nationalities, Nanning 530006, China
|
24
|
Abstract
This study describes a novel nonlinear variant of the well-known Yalkowsky general solubility equation (GSE). The modified equation can be trained with small molecules, mostly from the Lipinski Rule of 5 (Ro5) chemical space, to predict the intrinsic aqueous solubility, S0, of large molecules (MW > 800 Da) from beyond the rule of 5 (bRo5) space, to an accuracy almost equal to that of a recently described random forest regression (RFR) machine learning analysis. The new approach replaces the GSE constant factors in the intercept (0.5), the octanol-water log P (-1.0), and melting point, mp (-0.01) terms with simple exponential functions incorporating the sum descriptor, Φ+B (Kier Φ molecular flexibility and Abraham H-bond acceptor potential). The constants in the modified three-variable (log P, mp, Φ+B) equation were determined by partial least-squares (PLS) refinement using a small-molecule log S0 training set (n = 6541) of mostly druglike molecules. In this "flexible-acceptor" GSE(Φ,B) model, the coefficient of log P (normally fixed at -1.0) varies smoothly from -1.1 for rigid nonionizable molecules (Φ+B = 0) to -0.39 for typically flexible (Φ ∼ 20, B ∼ 6) large molecules. The intercept (traditionally fixed at +0.5) varies smoothly from +1.9 for completely inflexible small molecules to -2.2 for typically flexible large molecules. The mp coefficient (-0.007) remains practically constant, near the traditional value (-0.01) for most molecules, which suggests that the small-to-large molecule continuum is mainly solvation responsive, apparently with only minor changes in the crystal lattice contributions. 
For a test set of 32 large molecules (e.g., cyclosporine A, gramicidin A, leuprolide, nafarelin, oxytocin, vancomycin, and mostly natural-product-derived therapeutics used in infectious/viral diseases, in immunosuppression, and in oncology) the modified equation predicted the intrinsic solubility with a root-mean-square error of 1.10 log unit, compared to 3.0 by the traditional GSE, and 1.07 by RFR.
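For orientation, the traditional GSE that the abstract uses as its baseline can be written out directly from the constants quoted above (intercept 0.5, log P coefficient -1.0, mp coefficient -0.01). The sketch below covers only that classic form together with the root-mean-square error used for the comparison. The exponential functions of the modified GSE(Φ,B) model are not given in the abstract and are not reproduced here. The function names, and the convention of applying no melting-point penalty below 25 °C, are assumptions of this example.

```python
import math
from typing import Sequence


def gse_log_s0(log_p: float, mp_celsius: float) -> float:
    """Classic general solubility equation:
    log S0 = 0.5 - 1.0 * log P - 0.01 * (mp - 25), with S0 in mol/L.
    """
    # Assumed convention: compounds melting at or below 25 °C are treated
    # as liquids, so the crystal-lattice term is zero for them.
    mp_term = max(mp_celsius - 25.0, 0.0)
    return 0.5 - 1.0 * log_p - 0.01 * mp_term


def rmse(predicted: Sequence[float], observed: Sequence[float]) -> float:
    """Root-mean-square error between predicted and observed log S0 values."""
    if len(predicted) != len(observed) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(p - o) ** 2 for p, o in zip(predicted, observed)]
    return math.sqrt(sum(squared) / len(squared))


if __name__ == "__main__":
    # Hypothetical compound: log P = 2.0, melting point 125 °C.
    print(gse_log_s0(2.0, 125.0))  # 0.5 - 2.0 - 1.0 = -2.5
```

An RMSE of 1.10 log units, as reported for the modified equation, corresponds to a typical error of roughly a factor of 10^1.10 ≈ 13 in S0.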
Affiliation(s)
- Alex Avdeef
- in-ADME Research, 1732 First Avenue, no. 102, New York 10128, United States
|
25
|
Duncan KM, Casey A, Gobrogge CA, Trousdale RC, Piontek SM, Cook MJ, Steel WH, Walker RA. Coumarin Partitioning in Model Biological Membranes: Limitations of log P as a Predictor. J Phys Chem B 2020; 124:8299-8308. [DOI: 10.1021/acs.jpcb.0c06109] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Indexed: 01/08/2023]
Affiliation(s)
- Katelyn M. Duncan
- Department of Chemistry and Biochemistry, Montana State University, Bozeman, Montana 59717, United States
| | - Aoife Casey
- Department of Chemistry and Biochemistry, Montana State University, Bozeman, Montana 59717, United States
| | - Christine A. Gobrogge
- Department of Chemistry and Biochemistry, Montana State University, Bozeman, Montana 59717, United States
| | - Rhys C. Trousdale
- Department of Chemistry and Biochemistry, Montana State University, Bozeman, Montana 59717, United States
| | - Stefan M. Piontek
- Department of Chemistry and Biochemistry, Montana State University, Bozeman, Montana 59717, United States
| | - Matthew J. Cook
- Department of Chemistry and Biochemistry, Montana State University, Bozeman, Montana 59717, United States
| | - William H. Steel
- Department of Chemistry, York College of Pennsylvania, York, Pennsylvania 17403, United States
| | - Robert A. Walker
- Department of Chemistry and Biochemistry, Montana State University, Bozeman, Montana 59717, United States
- Montana Materials Science Program, Montana State University, Bozeman, Montana 59717, United States
|
26
|
Llinas A, Oprisiu I, Avdeef A. Findings of the Second Challenge to Predict Aqueous Solubility. J Chem Inf Model 2020; 60:4791-4803. [DOI: 10.1021/acs.jcim.0c00701] [Citation(s) in RCA: 12] [Impact Index Per Article: 3.0] [Indexed: 02/06/2023]
Affiliation(s)
- Antonio Llinas
- DMPK, Research and Early Development, Respiratory & Immunology (R&I), BioPharmaceuticals R&D, AstraZeneca, Gothenburg SE 431 50, Sweden
| | - Ioana Oprisiu
- Data Science & Artificial Intelligence, Imaging & Data Analytics, Clinical Pharmacology & Safety Sciences, R&D, AstraZeneca, Gothenburg SE 431 50, Sweden
| | - Alex Avdeef
- in-ADME Research, 1732 First Avenue, #102, New York, New York 10128, United States
|
27
|
Falcón-Cano G, Molina C, Cabrera-Pérez MÁ. ADME prediction with KNIME: In silico aqueous solubility consensus model based on supervised recursive random forest approaches. ADMET and DMPK 2020; 8:251-273. [PMID: 35300309 PMCID: PMC8915604 DOI: 10.5599/admet.852] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Received: 05/18/2020] [Revised: 08/01/2020] [Indexed: 12/12/2022]
Abstract
In-silico prediction of aqueous solubility plays an important role during the drug discovery and development processes. For many years, the limited performance of in-silico solubility models has been attributed to the lack of high-quality solubility data for pharmaceutical molecules. However, some studies suggest that the poor accuracy of solubility prediction is not related to the quality of the experimental data and that more precise methodologies (algorithms and/or sets of descriptors) are required for predicting the aqueous solubility of pharmaceutical molecules. In this study, a large and diverse database was generated with aqueous solubility values collected from two public sources; two new recursive machine-learning approaches were developed for data cleaning and variable selection, and a consensus model based on regression and classification algorithms was created. The modeling protocol, which includes the curation of chemical and experimental data, was implemented in KNIME, with the aim of obtaining an automated workflow for predictions on new databases. Finally, we compared several methods or models available in the literature with our consensus model, which gave results comparable to, or even better than, those of previously published models.
Affiliation(s)
- Gabriela Falcón-Cano
- Unit of Modeling and Experimental Biopharmaceutics. Centro de Bioactivos Químicos. Universidad Central “Marta Abreu” de las Villas. Santa Clara 54830, Villa Clara, Cuba
| | | | - Miguel Ángel Cabrera-Pérez
- Unit of Modeling and Experimental Biopharmaceutics. Centro de Bioactivos Químicos. Universidad Central “Marta Abreu” de las Villas. Santa Clara 54830, Villa Clara, Cuba
- Department of Pharmacy and Pharmaceutical Technology, University of Valencia, Burjassot 46100, Valencia, Spain
- Department of Engineering, Area of Pharmacy and Pharmaceutical Technology, Miguel Hernández University, 03550 Sant Joan d'Alacant, Alicante, Spain
|
28
|
McDonagh JL, Swope WC, Anderson RL, Johnston MA, Bray DJ. What can digitisation do for formulated product innovation and development? Polym Int 2020. [DOI: 10.1002/pi.6056] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Indexed: 12/19/2022]
Affiliation(s)
| | | | | | | | - David J Bray
- The Hartree Centre STFC Daresbury Laboratory Warrington WA4 4AD UK
|
29
|
Wyttenbach N, Niederquell A, Kuentz M. Machine Estimation of Drug Melting Properties and Influence on Solubility Prediction. Mol Pharm 2020; 17:2660-2671. [DOI: 10.1021/acs.molpharmaceut.0c00355] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.0] [Indexed: 12/22/2022]
Affiliation(s)
- Nicole Wyttenbach
- Roche Pharmaceutical Research & Early Development, Pre-Clinical CMC, Roche Innovation Center Basel, F. Hoffmann-La Roche Ltd., Grenzacherstrasse 124, 4000 Basel, Switzerland
| | - Andreas Niederquell
- University of Applied Sciences and Arts Northwestern Switzerland, Institute of Pharma Technology, Hofackerstr. 30, CH-4132 Muttenz, Switzerland
| | - Martin Kuentz
- University of Applied Sciences and Arts Northwestern Switzerland, Institute of Pharma Technology, Hofackerstr. 30, CH-4132 Muttenz, Switzerland
|
30
|
Rana P, Kogut S, Wen X, Akhlaghi F, Aleo MD. Most Influential Physicochemical and In Vitro Assay Descriptors for Hepatotoxicity and Nephrotoxicity Prediction. Chem Res Toxicol 2020; 33:1780-1790. [PMID: 32338883 DOI: 10.1021/acs.chemrestox.0c00040] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Indexed: 12/19/2022]
Abstract
Drug-induced organ injury is a major reason for drug candidate attrition in preclinical and clinical drug development. The liver, kidneys, and heart have been recognized as the most common organ systems affected in safety-related attrition or the subject of black box warnings and postmarket drug withdrawals. In silico physicochemical property calculations and in vitro assays have been utilized separately in the early stages of the drug discovery and development process to predict drug safety. In this study, we combined physicochemical properties and in vitro cytotoxicity assays including mitochondrial dysfunction to build organ-specific univariate and multivariable logistic regression models to achieve odds ratios for the prediction of clinical hepatotoxicity, nephrotoxicity, and cardiotoxicity using 215 marketed drugs. The multivariable hepatotoxic predictive model showed an odds ratio of 6.2 (95% confidence interval (CI) 1.7-22.8) or 7.5 (95% CI 3.2-17.8) for mitochondrial inhibition or drug plasma Cmax >1 μM for drugs associated with liver injury, respectively. The multivariable nephrotoxicity predictive model showed an odds ratio of 5.8 (95% CI 2.0-16.9), 6.4 (95% CI 1.1-39.3), or 15.9 (95% CI 2.8-89.0) for drug plasma Cmax >1 μM, mitochondrial inhibition, or hydrogen-bond-acceptor atoms >7 for drugs associated with kidney injury, respectively. Conversely, drugs with a total polar surface area ≥75 Å² were 79% (odds ratio 0.21, 95% CI 0.061-0.74) less likely to be associated with kidney injury. Drugs belonging to the extended clearance classification system (ECCS) class 4, where renal secretion is the primary clearance mechanism (low permeability drugs that are bases/neutrals), were 4 (95% CI 1.8-9.5) times more likely to be associated with kidney injury with this data set.
Alternatively, ECCS class 2 drugs, where hepatic metabolism is the primary clearance mechanism (high permeability drugs that are bases/neutrals), were 77% less likely (odds ratio 0.23, 95% CI 0.095-0.54) to be associated with kidney injury. A cardiotoxicity model was poorly defined using any of these drug physicochemical attributes. Combining in silico physicochemical property descriptors with in vitro toxicity assays can be used to build predictive toxicity models to select small molecule therapeutics with less potential to cause liver and kidney organ toxicity.
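The percentages quoted alongside the odds ratios follow from simple arithmetic: an odds ratio of 0.21 means the odds are 79% lower, and 0.23 means 77% lower. The sketch below reproduces that conversion and shows how an odds ratio relates to a logistic regression coefficient. The function names are illustrative, and the fitted coefficients themselves are not given in the abstract. Strictly speaking, these figures describe a change in odds, which the abstract phrases loosely as "less likely".

```python
import math


def odds_ratio_from_coefficient(beta: float) -> float:
    """Odds ratio implied by a logistic regression coefficient: OR = exp(beta)."""
    return math.exp(beta)


def percent_change_in_odds(odds_ratio: float) -> float:
    """Percent change in odds relative to the reference group.

    Negative values mean lower odds, positive values mean higher odds.
    """
    if odds_ratio <= 0:
        raise ValueError("an odds ratio must be positive")
    return (odds_ratio - 1.0) * 100.0


if __name__ == "__main__":
    for label, ratio in [("polar surface area >= 75 A^2", 0.21), ("ECCS class 2", 0.23)]:
        print(f"{label}: {percent_change_in_odds(ratio):.0f}% change in odds")
```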
Affiliation(s)
- Payal Rana
- Drug Safety Research and Development, Pfizer, Inc., Eastern Point Road, Groton, Connecticut 06340, United States
| | - Stephen Kogut
- College of Pharmacy, University of Rhode Island, Kingston, Rhode Island 02881, United States
| | - Xuerong Wen
- College of Pharmacy, University of Rhode Island, Kingston, Rhode Island 02881, United States
| | - Fatemeh Akhlaghi
- College of Pharmacy, University of Rhode Island, Kingston, Rhode Island 02881, United States
| | - Michael D Aleo
- Drug Safety Research and Development, Pfizer, Inc., Eastern Point Road, Groton, Connecticut 06340, United States
|
31
|
Fioressi SE, Bacelo DE, Aranda JF, Duchowicz PR. Prediction of the aqueous solubility of diverse compounds by 2D-QSPR. J Mol Liq 2020. [DOI: 10.1016/j.molliq.2020.112572] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/25/2022]
|
32
|
Li L, Yin XH, Diao KS. Improving the solubility and bioavailability of anti-hepatitis B drug PEC via PEC–fumaric acid cocrystal. RSC Adv 2020; 10:36125-36134. [PMID: 35517067 PMCID: PMC9056957 DOI: 10.1039/d0ra06608g] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Received: 07/30/2020] [Accepted: 09/23/2020] [Indexed: 11/21/2022]
Abstract
A cocrystal of PEC with fumaric acid (FUA) (PEC–FUA, 1 : 1) was successfully obtained and characterized. The mean AUC0–24 h of the cocrystal is about 4.2 times that of free PEC.
Affiliation(s)
- Long Li
- Sichuan Kelun Pharmaceutical Research Institute Co., Ltd
- Chengdu 610000
- China
| | - Xian-Hong Yin
- College of Chemistry and Chemical Engineering
- Guangxi University for Nationalities
- Nanning
- China
| | - Kai-Sheng Diao
- College of Chemistry and Chemical Engineering
- Guangxi University for Nationalities
- Nanning
- China
|
33
|
Modeling Physico-Chemical ADMET Endpoints with Multitask Graph Convolutional Networks. Molecules 2019; 25:44. [PMID: 31877719 PMCID: PMC6982787 DOI: 10.3390/molecules25010044] [Citation(s) in RCA: 45] [Impact Index Per Article: 9.0] [Received: 12/01/2019] [Revised: 12/19/2019] [Accepted: 12/20/2019] [Indexed: 11/19/2022]
Abstract
Simple physico-chemical properties, like logD, solubility, or melting point, can reveal a great deal about how a compound under development might later behave. These data are typically measured for most compounds in drug discovery projects in a medium throughput fashion. Collecting and assembling all the Bayer in-house data related to these properties allowed us to apply powerful machine learning techniques to predict the outcome of those assays for new compounds. In this paper, we report our finding that, especially for predicting physicochemical ADMET endpoints, a multitask graph convolutional approach appears a highly competitive choice. For seven endpoints of interest, we compared the performance of that approach to fully connected neural networks and different single task models. The new model shows increased predictive performance compared to previous modeling methods and will allow early prioritization of compounds even before they are synthesized. In addition, our model follows the generalized solubility equation without being explicitly trained under this constraint.
|
34
|
Performance-advantaged ether diesel bioblendstock production by a priori design. Proc Natl Acad Sci U S A 2019; 116:26421-26430. [PMID: 31843899 DOI: 10.1073/pnas.1911107116] [Citation(s) in RCA: 30] [Impact Index Per Article: 6.0] [Indexed: 01/20/2023]
Abstract
Lignocellulosic biomass offers a renewable carbon source which can be anaerobically digested to produce short-chain carboxylic acids. Here, we assess fuel properties of oxygenates accessible from catalytic upgrading of these acids a priori for their potential to serve as diesel bioblendstocks. Ethers derived from C2 and C4 carboxylic acids are identified as advantaged fuel candidates with significantly improved ignition quality (>56% cetane number increase) and reduced sooting (>86% yield sooting index reduction) when compared to commercial petrodiesel. The prescreening process informed conversion pathway selection toward a C11 branched ether, 4-butoxyheptane, which showed promise for fuel performance and health- and safety-related attributes. A continuous, solvent-free production process was then developed using metal oxide acidic catalysts to provide improved thermal stability, water tolerance, and yields. Liter-scale production of 4-butoxyheptane enabled fuel property testing to confirm predicted fuel properties, while incorporation into petrodiesel at 20 vol % demonstrated 10% improvement in ignition quality and 20% reduction in intrinsic sooting tendency. Storage stability of the pure bioblendstock and 20 vol % blend was confirmed with a common fuel antioxidant, as was compatibility with elastomeric components within existing engine and fueling infrastructure. Technoeconomic analysis of the conversion process identified major cost drivers to guide further research and development. Life-cycle analysis determined the potential to reduce greenhouse gas emissions by 50 to 271% relative to petrodiesel, depending on treatment of coproducts.
|
35
|
Esaki T, Ohashi R, Watanabe R, Natsume-Kitatani Y, Kawashima H, Nagao C, Komura H, Mizuguchi K. Constructing an In Silico Three-Class Predictor of Human Intestinal Absorption With Caco-2 Permeability and Dried-DMSO Solubility. J Pharm Sci 2019; 108:3630-3639. [DOI: 10.1016/j.xphs.2019.07.014] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.2] [Received: 04/15/2019] [Revised: 07/06/2019] [Accepted: 07/17/2019] [Indexed: 01/03/2023]
|
36
|
Lai T, Pencheva K, Chow E, Docherty R. De-Risking Early-Stage Drug Development With a Bespoke Lattice Energy Predictive Model: A Materials Science Informatics Approach to Address Challenges Associated With a Diverse Chemical Space. J Pharm Sci 2019; 108:3176-3186. [DOI: 10.1016/j.xphs.2019.06.010] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Received: 02/22/2019] [Revised: 05/24/2019] [Accepted: 06/12/2019] [Indexed: 01/11/2023]
|
37
|
Precipitation of test chemicals in reaction solutions used in the amino acid derivative reactivity assay and the direct peptide reactivity assay. J Pharmacol Toxicol Methods 2019; 100:106624. [PMID: 31445998 DOI: 10.1016/j.vascn.2019.106624] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.8] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/19/2019] [Revised: 08/05/2019] [Accepted: 08/07/2019] [Indexed: 11/22/2022]
Abstract
The Amino acid Derivative Reactivity Assay (ADRA) was developed by the authors as an in chemico alternative to animal testing for skin sensitization potential. Although ADRA is based on the same scientific principles as the Direct Peptide Reactivity Assay (DPRA), a comparison of the results from these two test methods shows a far lower incidence of precipitation of test chemicals in reaction solutions for ADRA than for DPRA. Specifically, a comparison of the results for 82 test chemicals that were tested using both DPRA and ADRA showed that while there were 30 chemicals tested using DPRA for which precipitation was found in the reaction solution, there were just three chemicals tested using ADRA for which even slight turbidity was found in the reaction solution. In contrast to the fact that many DPRA test chemicals with a n-Octanol/Water Partition Coefficient (LogKow) of 2.0 or higher exhibited precipitation, there were only three ADRA test chemicals that exhibited turbidity, and these were all highly hydrophobic with a LogKow of greater than 6.0. Moreover, one of the DPRA test chemicals that exhibited precipitation also gave a false negative result, suggesting that anytime a test chemical exhibits precipitation in the reaction solution during DPRA testing the results must be interpreted with the greatest care, although all false positives are not caused by precipitation of test chemicals. Therefore, since relatively few ADRA test chemicals exhibited precipitation relative to DPRA, we consider ADRA to be an extremely useful means of testing a wide variety of chemical substances.
|
38
|
Yang X, Wang Y, Byrne R, Schneider G, Yang S. Concepts of Artificial Intelligence for Computer-Assisted Drug Discovery. Chem Rev 2019; 119:10520-10594. [PMID: 31294972 DOI: 10.1021/acs.chemrev.8b00728] [Citation(s) in RCA: 351] [Impact Index Per Article: 70.2] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 02/05/2023]
Abstract
Artificial intelligence (AI), and, in particular, deep learning as a subcategory of AI, provides opportunities for the discovery and development of innovative drugs. Various machine learning approaches have recently (re)emerged, some of which may be considered instances of domain-specific AI which have been successfully employed for drug discovery and design. This review provides a comprehensive portrayal of these machine learning techniques and of their applications in medicinal chemistry. After introducing the basic principles, alongside some application notes, of the various machine learning algorithms, the current state-of-the art of AI-assisted pharmaceutical discovery is discussed, including applications in structure- and ligand-based virtual screening, de novo drug design, physicochemical and pharmacokinetic property prediction, drug repurposing, and related aspects. Finally, several challenges and limitations of the current methods are summarized, with a view to potential future directions for AI-assisted drug discovery and design.
Affiliation(s)
- Xin Yang: State Key Laboratory of Biotherapy and Cancer Center, West China Hospital, Sichuan University, Chengdu, Sichuan 610041, China
- Yifei Wang: State Key Laboratory of Biotherapy and Cancer Center, West China Hospital, Sichuan University, Chengdu, Sichuan 610041, China
- Ryan Byrne: ETH Zurich, Department of Chemistry and Applied Biosciences, Vladimir-Prelog-Weg 4, CH-8093 Zurich, Switzerland
- Gisbert Schneider: ETH Zurich, Department of Chemistry and Applied Biosciences, Vladimir-Prelog-Weg 4, CH-8093 Zurich, Switzerland
- Shengyong Yang: State Key Laboratory of Biotherapy and Cancer Center, West China Hospital, Sichuan University, Chengdu, Sichuan 610041, China
|
39
|
Abdelaziz A, Zaitsau DH, Kuratieva NV, Verevkin SP, Schick C. Melting of nucleobases. Getting the cutting edge of "Walden's Rule". Phys Chem Chem Phys 2019; 21:12787-12797. [PMID: 30888011 DOI: 10.1039/c9cp00716d] [Citation(s) in RCA: 13] [Impact Index Per Article: 2.6] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/21/2022]
Abstract
Walden's Rule is an empirical observation of an invariant fusion entropy during fusion of non-associated organic compounds. For the five nucleobases, adenine, thymine, cytosine, guanine, and uracil, surprisingly high fusion temperatures and enthalpies have been measured using a specially developed fast scanning calorimetry method that prevents decomposition. Even when nucleobases admittedly possess very high fusion temperatures, e.g. the value of 862 K measured for guanine really exceeds all expectations of the feasible dimension of the fusion temperature for such a relatively small and simple organic molecule. Hirshfeld surface analysis has been applied in order to find out an explanation for such extremely unusual thermal behavior of nucleobases. We rationalized the observed trends in terms of fusion entropy (Walden's constant = 56.5 J K⁻¹ mol⁻¹) as the entropic penalty of fusion not only for "non-associated", as proposed by Walden in 1908, but also for "ideal associated" systems like nucleobases.
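As a rough illustration of the relation behind Walden's Rule, the fusion entropy is the fusion enthalpy divided by the fusion temperature, ΔS_fus = ΔH_fus / T_fus. The sketch below uses only the constant and the guanine fusion temperature quoted in the abstract. The function names are hypothetical, and the enthalpy it prints is what the rule would imply, not a value reported by the authors.

```python
# Walden's constant as quoted in the abstract, in J K^-1 mol^-1.
WALDEN_CONSTANT = 56.5


def fusion_entropy(delta_h_fus, t_fus):
    """Fusion entropy (J K^-1 mol^-1) from enthalpy (J/mol) and temperature (K)."""
    if t_fus <= 0:
        raise ValueError("fusion temperature must be positive")
    return delta_h_fus / t_fus


def walden_enthalpy_estimate(t_fus, constant=WALDEN_CONSTANT):
    """Fusion enthalpy (J/mol) implied by a constant fusion entropy."""
    if t_fus <= 0:
        raise ValueError("fusion temperature must be positive")
    return constant * t_fus


if __name__ == "__main__":
    # 862 K is the guanine fusion temperature mentioned in the abstract.
    estimate = walden_enthalpy_estimate(862.0)
    print(f"Implied fusion enthalpy: {estimate / 1000:.1f} kJ/mol")
```

Dividing the implied enthalpy by the same temperature recovers the constant, which is all the rule asserts.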
Affiliation(s)
- A Abdelaziz: University of Rostock, Institute of Physics, Albert-Einstein-Str. 23-24, 18051 Rostock, Germany; University of Rostock, Faculty of Interdisciplinary Research, Competence Centre CALOR, Albert-Einstein-Str. 25, 18051 Rostock, Germany
- D H Zaitsau: University of Rostock, Institute of Chemistry, Dr-Lorenz-Weg 2, 18059 Rostock, Germany
- N V Kuratieva: Nikolaev Institute of Inorganic Chemistry of Siberian Branch of Russian Academy of Sciences, 630090 Novosibirsk, Russia
- S P Verevkin: University of Rostock, Faculty of Interdisciplinary Research, Competence Centre CALOR, Albert-Einstein-Str. 25, 18051 Rostock, Germany; University of Rostock, Institute of Chemistry, Dr-Lorenz-Weg 2, 18059 Rostock, Germany; Kazan Federal University, 18 Kremlyovskaya Street, Kazan 420008, Russian Federation
- C Schick: University of Rostock, Institute of Physics, Albert-Einstein-Str. 23-24, 18051 Rostock, Germany; University of Rostock, Faculty of Interdisciplinary Research, Competence Centre CALOR, Albert-Einstein-Str. 25, 18051 Rostock, Germany; Kazan Federal University, 18 Kremlyovskaya Street, Kazan 420008, Russian Federation
|
40
|
Nedyalkova MA, Madurga S, Tobiszewski M, Simeonov V. Calculating the Partition Coefficients of Organic Solvents in Octanol/Water and Octanol/Air. J Chem Inf Model 2019; 59:2257-2263. [PMID: 31042037 DOI: 10.1021/acs.jcim.9b00212] [Citation(s) in RCA: 26] [Impact Index Per Article: 5.2] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/29/2022]
Abstract
Partition coefficients define how a solute is distributed between two immiscible phases at equilibrium. The experimental estimation of partition coefficients in a complex system can be an expensive, difficult, and time-consuming process. Here a computational strategy to predict the distributions of a set of solutes in two relevant phase equilibria is presented. The octanol/water and octanol/air partition coefficients are predicted for a group of polar solvents using density functional theory (DFT) calculations in combination with a solvation model based on density (SMD) and are in excellent agreement with experimental data. Thus, the use of quantum-chemical calculations to predict partition coefficients from free energies should be a valuable alternative for unknown solvents. The obtained results indicate that the SMD continuum model in conjunction with any of the three DFT functionals (B3LYP, M06-2X, and M11) agrees with the observed experimental values. The highest correlation to experimental data for the octanol/water partition coefficients was reached by the M11 functional; for the octanol/air partition coefficient, the M06-2X functional yielded the best performance. To the best of our knowledge, this is the first computational approach for the prediction of octanol/air partition coefficients by DFT calculations, which has remarkable accuracy and precision.
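The abstract refers to predicting partition coefficients from free energies. A minimal sketch of the standard thermodynamic relation follows, assuming solvation free energies in kcal/mol for the same solute in each phase: log P = (ΔG_solv,water − ΔG_solv,octanol) / (RT ln 10). The function name and the example energies are hypothetical and are not taken from the paper.

```python
import math

# Gas constant in kcal/(mol K).
R_KCAL = 1.987204e-3


def log_p_octanol_water(dg_solv_octanol, dg_solv_water, temperature=298.15):
    """log10 of the octanol/water partition coefficient.

    Both arguments are solvation free energies in kcal/mol. A more negative
    value in octanol than in water gives a positive log P.
    """
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    transfer_free_energy = dg_solv_octanol - dg_solv_water  # water -> octanol
    return -transfer_free_energy / (R_KCAL * temperature * math.log(10))


if __name__ == "__main__":
    # Illustrative (made-up) solvation free energies in kcal/mol.
    print(round(log_p_octanol_water(-6.0, -4.0), 2))
```

At 298.15 K, RT ln 10 is about 1.36 kcal/mol, so each 1.36 kcal/mol of transfer free energy shifts log P by one unit.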
Affiliation(s)
- Miroslava A Nedyalkova: Inorganic Chemistry Department, Faculty of Chemistry and Pharmacy, University of Sofia, Sofia 1164, Bulgaria
- Sergio Madurga: Departament de Ciència de Materials i Química Física and Institut de Química Teòrica i Computacional (IQTCUB), Universitat de Barcelona, 08028 Barcelona, Catalonia, Spain
- Marek Tobiszewski: Department of Analytical Chemistry, Faculty of Chemistry, Gdańsk University of Technology (GUT), 80-233 Gdańsk, Poland
- Vasil Simeonov: Analytical Chemistry Department, Faculty of Chemistry and Pharmacy, University of Sofia, Sofia 1164, Bulgaria
|
41
|
Llinas A, Avdeef A. Solubility Challenge Revisited after Ten Years, with Multilab Shake-Flask Data, Using Tight (SD ∼ 0.17 log) and Loose (SD ∼ 0.62 log) Test Sets. J Chem Inf Model 2019; 59:3036-3040. [DOI: 10.1021/acs.jcim.9b00345] [Citation(s) in RCA: 22] [Impact Index Per Article: 4.4] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/28/2022]
Affiliation(s)
- Antonio Llinas: DMPK, Respiratory, Inflammation and Autoimmunity, IMED Biotech Unit, AstraZeneca, Gothenburg, SE-43183 Sweden
- Alex Avdeef: in-ADME Research, 1732 First Avenue, #102, New York, New York 10128, United States
|
42
|
Hossain S, Kabedev A, Parrow A, Bergström CAS, Larsson P. Molecular simulation as a computational pharmaceutics tool to predict drug solubility, solubilization processes and partitioning. Eur J Pharm Biopharm 2019; 137:46-55. [PMID: 30771454 PMCID: PMC6434319 DOI: 10.1016/j.ejpb.2019.02.007] [Citation(s) in RCA: 57] [Impact Index Per Article: 11.4] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/30/2018] [Revised: 02/05/2019] [Accepted: 02/13/2019] [Indexed: 01/12/2023]
Abstract
In this review we will discuss how computational methods, and in particular classical molecular dynamics simulations, can be used to calculate solubility of pharmaceutically relevant molecules and systems. To the extent possible, we focus on the non-technical details of these calculations, and try to show also the added value of a more thorough and detailed understanding of the solubilization process obtained by using computational simulations. Although the main focus is on classical molecular dynamics simulations, we also provide the reader with some insights into other computational techniques, such as the COSMO-method, and also discuss Flory-Huggins theory and solubility parameters. We hope that this review will serve as a valuable starting point for any pharmaceutical researcher, who has not yet fully explored the possibilities offered by computational approaches to solubility calculations.
Affiliation(s)
- Shakhawath Hossain: Department of Pharmacy, Uppsala Biomedical Center, Uppsala University, 751 23 Uppsala, Sweden; Swedish Drug Delivery Forum (SDDF), Uppsala University, Sweden
- Aleksei Kabedev: Department of Pharmacy, Uppsala Biomedical Center, Uppsala University, 751 23 Uppsala, Sweden
- Albin Parrow: Department of Pharmacy, Uppsala Biomedical Center, Uppsala University, 751 23 Uppsala, Sweden
- Christel A S Bergström: Department of Pharmacy, Uppsala Biomedical Center, Uppsala University, 751 23 Uppsala, Sweden; Swedish Drug Delivery Forum (SDDF), Uppsala University, Sweden
- Per Larsson: Department of Pharmacy, Uppsala Biomedical Center, Uppsala University, 751 23 Uppsala, Sweden; Swedish Drug Delivery Forum (SDDF), Uppsala University, Sweden
|
43
|
Smith BR, Ashton KM, Brodbelt A, Dawson T, Jenkinson MD, Hunt NT, Palmer DS, Baker MJ. Combining random forest and 2D correlation analysis to identify serum spectral signatures for neuro-oncology. Analyst 2016; 141:3668-78. [PMID: 26818218 DOI: 10.1039/c5an02452h] [Citation(s) in RCA: 35] [Impact Index Per Article: 5.8] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/21/2022]
Abstract
Fourier transform infrared (FTIR) spectroscopy has long been established as an analytical technique for the measurement of vibrational modes of molecular systems. More recently, FTIR has been used for the analysis of biofluids with the aim of becoming a tool to aid diagnosis. For the clinician, this represents a convenient, fast, non-subjective option for the study of biofluids and the diagnosis of disease states. The patient also benefits from this method, as the procedure for the collection of serum is much less invasive and stressful than traditional biopsy. This is especially true of patients in whom brain cancer is suspected. A brain biopsy is very unpleasant for the patient, potentially dangerous and can occasionally be inconclusive. We therefore present a method for the diagnosis of brain cancer from serum samples using FTIR and machine learning techniques. The scope of the study involved 433 patients from whom were collected 9 spectra each in the range 600-4000 cm⁻¹. To begin the development of the novel method, various pre-processing steps were investigated and ranked in terms of final accuracy of the diagnosis. Random forest machine learning was utilised as a classifier to separate patients into cancer or non-cancer categories based upon the intensities of wavenumbers present in their spectra. Generalised 2D correlational analysis was then employed to further augment the machine learning, and also to establish spectral features important for the distinction between cancer and non-cancer serum samples. Using these methods, sensitivities of up to 92.8% and specificities of up to 91.5% were possible. Furthermore, ratiometrics were also investigated in order to establish any correlations present in the dataset. We show a rapid, computationally light, accurate, statistically robust methodology for the identification of spectral features present in differing disease states. With current advances in IR technology, such as the development of rapid discrete frequency collection, this approach is of importance to enable future clinical translation and enables IR to achieve its potential.
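The sensitivity and specificity figures quoted in this abstract are standard confusion-matrix ratios. The sketch below shows how such figures are computed from binary labels, assuming 1 marks cancer and 0 marks non-cancer. The function name and the toy labels are hypothetical and unrelated to the study's data.

```python
def sensitivity_specificity(y_true, y_pred):
    """Return (sensitivity, specificity) for binary labels (1 = positive)."""
    if len(y_true) != len(y_pred):
        raise ValueError("label sequences must have the same length")
    tp = fn = tn = fp = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == 1 and pred == 1:
            tp += 1
        elif truth == 1 and pred == 0:
            fn += 1
        elif truth == 0 and pred == 0:
            tn += 1
        else:
            fp += 1
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one positive and one negative case")
    # Sensitivity: fraction of true positives found; specificity: fraction of
    # true negatives correctly rejected.
    return tp / (tp + fn), tn / (tn + fp)


if __name__ == "__main__":
    # Toy labels for illustration only.
    truth = [1, 1, 1, 0, 0, 0, 0, 1]
    predicted = [1, 1, 0, 0, 0, 1, 0, 1]
    print(sensitivity_specificity(truth, predicted))
```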
Affiliation(s)
- Benjamin R Smith: WestCHEM, Department of Pure and Applied Chemistry, University of Strathclyde, Thomas Graham Building, 295 Cathedral Street, Glasgow, Scotland G1 1XL, UK; WestCHEM, Department of Pure and Applied Chemistry, University of Strathclyde, Technology and Innovation Centre, 99 George Street, Glasgow G1 1RD, UK
- Katherine M Ashton: Neuropathology, Lancashire Teaching Hospitals NHS Trust, Royal Preston Hospital, Sharoe Green Lane, Fulwood, Preston, PR2 9HT, UK
- Andrew Brodbelt: Neurosurgery, The Walton Centre NHS Foundation Trust, Lower Lane, Fazakerley, Liverpool, L9 7LJ, UK
- Timothy Dawson: Neuropathology, Lancashire Teaching Hospitals NHS Trust, Royal Preston Hospital, Sharoe Green Lane, Fulwood, Preston, PR2 9HT, UK
- Michael D Jenkinson: Neurosurgery, The Walton Centre NHS Foundation Trust, Lower Lane, Fazakerley, Liverpool, L9 7LJ, UK
- Neil T Hunt: SUPA, Department of Physics, University of Strathclyde, 107 Rottenrow East, Glasgow, G4 0NG, UK
- David S Palmer: WestCHEM, Department of Pure and Applied Chemistry, University of Strathclyde, Thomas Graham Building, 295 Cathedral Street, Glasgow, Scotland G1 1XL, UK
- Matthew J Baker: WestCHEM, Department of Pure and Applied Chemistry, University of Strathclyde, Technology and Innovation Centre, 99 George Street, Glasgow G1 1RD, UK
|
44
|
Boobier S, Osbourn A, Mitchell JBO. Can human experts predict solubility better than computers? J Cheminform 2017; 9:63. [PMID: 29238891 PMCID: PMC5729181 DOI: 10.1186/s13321-017-0250-y] [Citation(s) in RCA: 36] [Impact Index Per Article: 5.1] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/15/2017] [Accepted: 12/02/2017] [Indexed: 11/10/2022] Open
Abstract
In this study, we design and carry out a survey, asking human experts to predict the aqueous solubility of druglike organic compounds. We investigate whether these experts, drawn largely from the pharmaceutical industry and academia, can match or exceed the predictive power of algorithms. Alongside this, we implement 10 typical machine learning algorithms on the same dataset. The best algorithm, a variety of neural network known as a multi-layer perceptron, gave an RMSE of 0.985 log S units and an R² of 0.706. We would not have predicted the relative success of this particular algorithm in advance. We found that the best individual human predictor generated an almost identical prediction quality with an RMSE of 0.942 log S units and an R² of 0.723. The collection of algorithms contained a higher proportion of reasonably good predictors, nine out of ten compared with around half of the humans. We found that, for either humans or algorithms, combining individual predictions into a consensus predictor by taking their median generated excellent predictivity. While our consensus human predictor achieved very slightly better headline figures on various statistical measures, the difference between it and the consensus machine learning predictor was both small and statistically insignificant. We conclude that human experts can predict the aqueous solubility of druglike molecules essentially equally well as machine learning algorithms. We find that, for either humans or algorithms, combining individual predictions into a consensus predictor by taking their median is a powerful way of benefitting from the wisdom of crowds.
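The median consensus described in this abstract can be sketched in a few lines: for each compound, take the median of all individual predictions, then score the result with RMSE against the observed values. The function names and the toy log S values below are hypothetical and do not come from the study.

```python
import math
from statistics import median


def consensus_median(predictions):
    """Per-compound median across predictors.

    `predictions` is a list of equal-length lists, one list per predictor.
    """
    return [median(values) for values in zip(*predictions)]


def rmse(predicted, observed):
    """Root-mean-square error between two equal-length sequences."""
    if len(predicted) != len(observed) or not observed:
        raise ValueError("sequences must be non-empty and of equal length")
    squared = sum((p - o) ** 2 for p, o in zip(predicted, observed))
    return math.sqrt(squared / len(observed))


if __name__ == "__main__":
    # Toy log S values for three compounds (illustration only).
    observed = [-2.0, -3.5, -1.0]
    predictors = [
        [-2.5, -3.0, -1.2],
        [-1.5, -4.5, -0.5],
        [-2.1, -3.6, -2.0],
    ]
    consensus = consensus_median(predictors)
    for row in predictors:
        print("individual RMSE:", round(rmse(row, observed), 3))
    print("consensus RMSE:", round(rmse(consensus, observed), 3))
```

In this toy case the consensus has a lower error than any single predictor, because the median discards each predictor's outlying guess. That outcome depends on the data and is not guaranteed in general.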
Affiliation(s)
- Samuel Boobier: Biomedical Sciences Research Complex and EaStCHEM School of Chemistry, University of St Andrews, St Andrews, KY16 9ST, Scotland, UK
- Anne Osbourn: Department of Metabolic Biology, John Innes Centre, Norwich Research Park, Norwich, NR4 7UH, UK
- John B O Mitchell: Biomedical Sciences Research Complex and EaStCHEM School of Chemistry, University of St Andrews, St Andrews, KY16 9ST, Scotland, UK
|
45
|
McDonagh JL, Silva AF, Vincent MA, Popelier PLA. Machine Learning of Dynamic Electron Correlation Energies from Topological Atoms. J Chem Theory Comput 2017; 14:216-224. [PMID: 29211469 DOI: 10.1021/acs.jctc.7b01157] [Citation(s) in RCA: 25] [Impact Index Per Article: 3.6] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/26/2023]
Abstract
We present an innovative method for predicting the dynamic electron correlation energy of an atom or a bond in a molecule utilizing topological atoms. Our approach uses the machine learning method Kriging (Gaussian Process Regression with a non-zero mean function) to predict these dynamic electron correlation energy contributions. The true energy values are calculated by partitioning the MP2 two-particle density-matrix via the Interacting Quantum Atoms (IQA) procedure. To our knowledge, this is the first time such energies have been predicted by a machine learning technique. We present here three important proof-of-concept cases: the water monomer, the water dimer, and the van der Waals complex H2···He. These cases represent the final step toward the design of a full IQA potential for molecular simulation. This final piece will enable us to consider situations in which dispersion is the dominant intermolecular interaction. The results from these examples suggest a new method by which dispersion potentials for molecular simulation can be generated.
Affiliation(s)
- James L McDonagh: Manchester Institute of Biotechnology, The University of Manchester, 131 Princess Street, Manchester M1 7DN, Great Britain
- Arnaldo F Silva: Manchester Institute of Biotechnology, The University of Manchester, 131 Princess Street, Manchester M1 7DN, Great Britain
- Mark A Vincent: School of Chemistry, The University of Manchester, Oxford Road, Manchester M13 9PL, Great Britain
- Paul L A Popelier: Manchester Institute of Biotechnology, The University of Manchester, 131 Princess Street, Manchester M1 7DN, Great Britain; School of Chemistry, The University of Manchester, Oxford Road, Manchester M13 9PL, Great Britain
|
46
|
|
47
|
Riniker S. Molecular Dynamics Fingerprints (MDFP): Machine Learning from MD Data To Predict Free-Energy Differences. J Chem Inf Model 2017; 57:726-741. [DOI: 10.1021/acs.jcim.6b00778] [Citation(s) in RCA: 47] [Impact Index Per Article: 6.7] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/29/2022]
Affiliation(s)
- Sereina Riniker: Laboratory of Physical Chemistry, ETH Zürich, Vladimir-Prelog-Weg 2, 8093 Zürich, Switzerland
|
48
|
Kim S, Jinich A, Aspuru-Guzik A. MultiDK: A Multiple Descriptor Multiple Kernel Approach for Molecular Discovery and Its Application to Organic Flow Battery Electrolytes. J Chem Inf Model 2017; 57:657-668. [DOI: 10.1021/acs.jcim.6b00332] [Citation(s) in RCA: 17] [Impact Index Per Article: 2.4] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/24/2022]
Affiliation(s)
- Sungjin Kim: Department of Chemistry and Chemical Biology, Harvard University, 12 Oxford Street, Cambridge, Massachusetts 02138, United States
- Adrián Jinich: Department of Chemistry and Chemical Biology, Harvard University, 12 Oxford Street, Cambridge, Massachusetts 02138, United States
- Alán Aspuru-Guzik: Department of Chemistry and Chemical Biology, Harvard University, 12 Oxford Street, Cambridge, Massachusetts 02138, United States
|
49
|
Abstract
Drug discovery is a multidisciplinary and multivariate optimization endeavor. As such, in silico screening tools have gained considerable importance to archive, analyze and exploit the vast and ever-increasing amount of experimental data generated throughout the process. The current review will focus on the computer-aided prediction of the numerous properties that need to be controlled during the discovery of a preliminary hit and its promotion to a viable clinical candidate. It does not pretend to the almost impossible task of an exhaustive report but will highlight a few key points that need to be collectively addressed both by chemists and biologists to fuel the drug discovery pipeline with innovative and safe drug candidates.
Affiliation(s)
- Didier Rognan: Laboratoire d'Innovation Thérapeutique, UMR 7200 CNRS-Université de Strasbourg, 74 route du Rhin, 67400 Illkirch, France
|
50
|
Zang Q, Mansouri K, Williams AJ, Judson RS, Allen DG, Casey WM, Kleinstreuer NC. In Silico Prediction of Physicochemical Properties of Environmental Chemicals Using Molecular Fingerprints and Machine Learning. J Chem Inf Model 2017; 57:36-49. [PMID: 28006899 PMCID: PMC6131700 DOI: 10.1021/acs.jcim.6b00625] [Citation(s) in RCA: 81] [Impact Index Per Article: 11.6] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/20/2022]
Abstract
There are little available toxicity data on the vast majority of chemicals in commerce. High-throughput screening (HTS) studies, such as those being carried out by the U.S. Environmental Protection Agency (EPA) ToxCast program in partnership with the federal Tox21 research program, can generate biological data to inform models for predicting potential toxicity. However, physicochemical properties are also needed to model environmental fate and transport, as well as exposure potential. The purpose of the present study was to generate an open-source quantitative structure-property relationship (QSPR) workflow to predict a variety of physicochemical properties that would have cross-platform compatibility to integrate into existing cheminformatics workflows. In this effort, decades-old experimental property data sets available within the EPA EPI Suite were reanalyzed using modern cheminformatics workflows to develop updated QSPR models capable of supplying computationally efficient, open, and transparent HTS property predictions in support of environmental modeling efforts. Models were built using updated EPI Suite data sets for the prediction of six physicochemical properties: octanol-water partition coefficient (logP), water solubility (logS), boiling point (BP), melting point (MP), vapor pressure (logVP), and bioconcentration factor (logBCF). The coefficient of determination (R²) between the estimated values and experimental data for the six predicted properties ranged from 0.826 (MP) to 0.965 (BP), with model performance for five of the six properties exceeding those from the original EPI Suite models. The newly derived models can be employed for rapid estimation of physicochemical properties within an open-source HTS workflow to inform fate and toxicity prediction models of environmental chemicals.
Affiliation(s)
- Qingda Zang: Integrated Laboratory Systems, Inc., Research Triangle Park, NC 27709, USA
- Kamel Mansouri: National Center for Computational Toxicology, Office of Research and Development, U.S. Environmental Protection Agency, Research Triangle Park, NC 27711, USA
- Antony J. Williams: National Center for Computational Toxicology, Office of Research and Development, U.S. Environmental Protection Agency, Research Triangle Park, NC 27711, USA
- Richard S. Judson: National Center for Computational Toxicology, Office of Research and Development, U.S. Environmental Protection Agency, Research Triangle Park, NC 27711, USA
- David G. Allen: Integrated Laboratory Systems, Inc., Research Triangle Park, NC 27709, USA
- Warren M. Casey: National Toxicology Program, National Institute of Environmental Health Sciences, Research Triangle Park, NC 27709, USA
- Nicole C. Kleinstreuer: National Toxicology Program, National Institute of Environmental Health Sciences, Research Triangle Park, NC 27709, USA
|