1
Wang K, Amidon GL, Smith DE. Physiological Dynamics in the Upper Gastrointestinal Tract and the Development of Gastrointestinal Absorption Models for the Immediate-Release Oral Dosage Forms in Healthy Adult Human. Pharm Res 2023; 40:2607-2626. [PMID: 37783928 DOI: 10.1007/s11095-023-03597-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/31/2023] [Accepted: 08/26/2023] [Indexed: 10/04/2023]
Abstract
This review is a revisit of various oral drug absorption models developed in the past decades, focusing on how to incorporate the physiological dynamics in the upper gastrointestinal (GI) tract. For immediate-release oral drugs, GI absorption is a critical input of drug exposure and subsequent human body response, yet difficult to model largely due to the complex GI environment. One of the biggest hurdles lies at capturing the high within-subject variability (WSV) of bioavailability measures, which can be mechanistically explained by the GI physiological dynamics. A thorough summary of how GI dynamics is handled in the absorption models would promote the development of mechanism-based oral drug absorption models, aid in the design of clinical studies regarding dosing regimens and bioequivalence studies based on WSV, and advance the decision-making on formulation selection.
Affiliation(s)
- Kai Wang
- Department of Pharmaceutical Sciences, University of Michigan, Ann Arbor, MI, 48109, USA.
- Gordon L Amidon
- Department of Pharmaceutical Sciences, University of Michigan, Ann Arbor, MI, 48109, USA
- David E Smith
- Department of Pharmaceutical Sciences, University of Michigan, Ann Arbor, MI, 48109, USA
2
Zhu T, Chen Y, Tao C. Multiple machine learning algorithms assisted QSPR models for aqueous solubility: Comprehensive assessment with CRITIC-TOPSIS. Sci Total Environ 2023; 857:159448. [PMID: 36252662 DOI: 10.1016/j.scitotenv.2022.159448] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 09/08/2022] [Revised: 10/06/2022] [Accepted: 10/11/2022] [Indexed: 06/16/2023]
Abstract
As an essential environmental property, the aqueous solubility quantifies the hydrophobicity of a compound. It could be further utilized to evaluate the ecological risk and toxicity of organic pollutants. Concerned about the proliferation of organic contaminants in water and the associated technical burden, researchers have developed QSPR models to predict aqueous solubility. However, there are no standard procedures or best practices on how to comprehensively evaluate models. Hence, the CRITIC-TOPSIS comprehensive assessment method was first-ever proposed according to a variety of statistical parameters in the environmental model research field. 39 models based on 13 ML algorithms (belonged to 4 tribes) and 3 descriptor screening methods, were developed to calculate aqueous solubility values (log Kws) for organic chemicals reliably and verify the effectiveness of the comprehensive assessment method. The evaluations were carried out for exhibiting better predictive accuracy and external competitiveness of the MLR-1, XGB-1, DNN-1, and kNN-1 models in contrast to other prediction models in each tribe. Further, XGB model based on SRM (XGB-1, C = 0.599) was selected as an optimal pathway for prediction of aqueous solubility. We hope that the proposed comprehensive evaluation approach could act as a promising tool for selecting the optimum environmental property prediction methods.
Affiliation(s)
- Tengyi Zhu
- School of Environmental Science and Engineering, Yangzhou University, Yangzhou 225127, Jiangsu, China.
- Ying Chen
- School of Environmental Science and Engineering, Yangzhou University, Yangzhou 225127, Jiangsu, China
- Cuicui Tao
- School of Environmental Science and Engineering, Yangzhou University, Yangzhou 225127, Jiangsu, China
3
Li M, Chen H, Zhang H, Zeng M, Chen B, Guan L. Prediction of the Aqueous Solubility of Compounds Based on Light Gradient Boosting Machines with Molecular Fingerprints and the Cuckoo Search Algorithm. ACS Omega 2022; 7:42027-42035. [PMID: 36440111 PMCID: PMC9685740 DOI: 10.1021/acsomega.2c03885] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 06/21/2022] [Accepted: 10/18/2022] [Indexed: 06/16/2023]
Abstract
Aqueous solubility is one of the most important physicochemical properties in drug discovery. At present, the prediction of aqueous solubility of compounds is still a challenging problem. Machine learning has shown great potential in solubility prediction. Most machine learning models largely rely on the setting of hyperparameters, and their performance can be improved by setting the hyperparameters in a better way. In this paper, we used MACCS fingerprints to represent the structural features and optimized the hyperparameters of the light gradient boosting machine (LightGBM) with the cuckoo search algorithm (CS). Based on the above representation and optimization, the CS-LightGBM model was established to predict the aqueous solubility of 2446 organic compounds and the obtained prediction results were compared with those obtained with the other six different machine learning models (RF, GBDT, XGBoost, LightGBM, SVR, and BO-LightGBM). The comparison results showed that the CS-LightGBM model had a better prediction performance than the other six different models. RMSE, MAE, and R² of the CS-LightGBM model were, respectively, 0.7785, 0.5117, and 0.8575. In addition, this model has good scalability and can be used to solve solubility prediction problems in other fields such as solvent selection and drug screening.
4
Kuroda M, Watanabe R, Esaki T, Kawashima H, Ohashi R, Sato T, Honma T, Komura H, Mizuguchi K. Utilizing public and private sector data to build better machine learning models for the prediction of pharmacokinetic parameters. Drug Discov Today 2022; 27:103339. [PMID: 35973660 DOI: 10.1016/j.drudis.2022.103339] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/23/2022] [Revised: 07/11/2022] [Accepted: 08/11/2022] [Indexed: 11/20/2022]
Abstract
One solution to compensate for the shortage of publicly available data is to collect more quality-controlled data from the private sector through public-private partnerships. However, several issues must be resolved before implementing such a system. Here, we review the technical aspects of public-private partnerships using our initiative in Japan as an example. In particular, we focus on the procedure for collecting data from multiple private sector companies and building prediction models and discuss how merging public and private sector datasets will help to improve the chemical space coverage and prediction performance. Teaser: Japan's first public-private consortium in pharmacokinetics has incorporated data from multiple pharmaceutical companies to create useful predictive models.
Affiliation(s)
- Masataka Kuroda
- Artificial Intelligence Centre for Health and Biomedical Research, National Institutes of Biomedical Innovation, Health and Nutrition (NIBIOHN), 7-6-8, Saito-Asagi, Ibaraki, Osaka 567-0085, Japan; Discovery Technology Laboratories, Mitsubishi Tanabe Pharma Corporation, 1000, Kamoshida-cho, Aoba-ku, Yokohama, Kanagawa 227-0033, Japan.
- Reiko Watanabe
- Artificial Intelligence Centre for Health and Biomedical Research, National Institutes of Biomedical Innovation, Health and Nutrition (NIBIOHN), 7-6-8, Saito-Asagi, Ibaraki, Osaka 567-0085, Japan; Institute for Protein Research, Osaka University, 3-2 Yamadaoka, Suita-shi, Osaka 565-0871, Japan
- Tsuyoshi Esaki
- Artificial Intelligence Centre for Health and Biomedical Research, National Institutes of Biomedical Innovation, Health and Nutrition (NIBIOHN), 7-6-8, Saito-Asagi, Ibaraki, Osaka 567-0085, Japan; The Centre for Data Science Education and Research, Shiga University, 1-1-1, Banba, Hikone, Shiga 522-8522, Japan
- Hitoshi Kawashima
- Artificial Intelligence Centre for Health and Biomedical Research, National Institutes of Biomedical Innovation, Health and Nutrition (NIBIOHN), 7-6-8, Saito-Asagi, Ibaraki, Osaka 567-0085, Japan
- Rikiya Ohashi
- Artificial Intelligence Centre for Health and Biomedical Research, National Institutes of Biomedical Innovation, Health and Nutrition (NIBIOHN), 7-6-8, Saito-Asagi, Ibaraki, Osaka 567-0085, Japan; Discovery Technology Laboratories, Mitsubishi Tanabe Pharma Corporation, 1000, Kamoshida-cho, Aoba-ku, Yokohama, Kanagawa 227-0033, Japan
- Tomohiro Sato
- RIKEN Center for Biosystems Dynamics Research, 1-7-22, Suehiro-cho, Tsurumi-ku, Yokohama, Kanagawa 230-0045, Japan
- Teruki Honma
- RIKEN Center for Biosystems Dynamics Research, 1-7-22, Suehiro-cho, Tsurumi-ku, Yokohama, Kanagawa 230-0045, Japan
- Hiroshi Komura
- Artificial Intelligence Centre for Health and Biomedical Research, National Institutes of Biomedical Innovation, Health and Nutrition (NIBIOHN), 7-6-8, Saito-Asagi, Ibaraki, Osaka 567-0085, Japan; University Research Administration Centre, Osaka Metropolitan University, 1-2-7, Asahi, Abeno-ku, Osaka 545-0051, Japan
- Kenji Mizuguchi
- Artificial Intelligence Centre for Health and Biomedical Research, National Institutes of Biomedical Innovation, Health and Nutrition (NIBIOHN), 7-6-8, Saito-Asagi, Ibaraki, Osaka 567-0085, Japan; Institute for Protein Research, Osaka University, 3-2 Yamadaoka, Suita-shi, Osaka 565-0871, Japan.
5
Sorkun MC, Koelman JVA, Er S. Pushing the limits of solubility prediction via quality-oriented data selection. iScience 2021; 24:101961. [PMID: 33437941 PMCID: PMC7788089 DOI: 10.1016/j.isci.2020.101961] [Citation(s) in RCA: 14] [Impact Index Per Article: 4.7] [Received: 10/16/2020] [Revised: 11/18/2020] [Accepted: 12/15/2020] [Indexed: 01/19/2023]
Abstract
Accurate prediction of the solubility of chemical substances in solvents remains a challenge. The sparsity of high-quality solubility data is recognized as the biggest hurdle in the development of robust data-driven methods for practical use. Nonetheless, the effects of the quality and quantity of data on aqueous solubility predictions have not yet been scrutinized. In this study, the roles of the size and the quality of data sets on the performances of the solubility prediction models are unraveled, and the concepts of actual and observed performances are introduced. In an effort to curtail the gap between actual and observed performances, a quality-oriented data selection method, which evaluates the quality of data and extracts the most accurate part of it through statistical validation, is designed. Applying this method on the largest publicly available solubility database and using a consensus machine learning approach, a top-performing solubility prediction model is achieved.
Affiliation(s)
- Murat Cihan Sorkun
- DIFFER - Dutch Institute for Fundamental Energy Research, De Zaale 20, 5612 AJ Eindhoven, the Netherlands
- CCER - Center for Computational Energy Research, De Zaale 20, 5612 AJ Eindhoven, the Netherlands
- Department of Applied Physics, Eindhoven University of Technology, 5600 MB Eindhoven, the Netherlands
- J.M. Vianney A. Koelman
- DIFFER - Dutch Institute for Fundamental Energy Research, De Zaale 20, 5612 AJ Eindhoven, the Netherlands
- CCER - Center for Computational Energy Research, De Zaale 20, 5612 AJ Eindhoven, the Netherlands
- Department of Applied Physics, Eindhoven University of Technology, 5600 MB Eindhoven, the Netherlands
- Süleyman Er
- DIFFER - Dutch Institute for Fundamental Energy Research, De Zaale 20, 5612 AJ Eindhoven, the Netherlands
- CCER - Center for Computational Energy Research, De Zaale 20, 5612 AJ Eindhoven, the Netherlands
6
QSPR models for water solubility of ammonium hexafluorosilicates: analysis of the effects of hydrogen bonds. Struct Chem 2020. [DOI: 10.1007/s11224-020-01652-3] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Indexed: 11/25/2022]
7
Falcón-Cano G, Molina C, Cabrera-Pérez MÁ. ADME prediction with KNIME: In silico aqueous solubility consensus model based on supervised recursive random forest approaches. ADMET DMPK 2020; 8:251-273. [PMID: 35300309 PMCID: PMC8915604 DOI: 10.5599/admet.852] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Received: 05/18/2020] [Revised: 08/01/2020] [Indexed: 12/12/2022]
Abstract
In-silico prediction of aqueous solubility plays an important role during the drug discovery and development processes. For many years, the limited performance of in-silico solubility models has been attributed to the lack of high-quality solubility data for pharmaceutical molecules. However, some studies suggest that the poor accuracy of solubility prediction is not related to the quality of the experimental data and that more precise methodologies (algorithms and/or set of descriptors) are required for predicting aqueous solubility for pharmaceutical molecules. In this study a large and diverse database was generated with aqueous solubility values collected from two public sources; two new recursive machine-learning approaches were developed for data cleaning and variable selection, and a consensus model based on regression and classification algorithms was created. The modeling protocol, which includes the curation of chemical and experimental data, was implemented in KNIME, with the aim of obtaining an automated workflow for the prediction of new databases. Finally, we compared several methods or models available in the literature with our consensus model, showing results comparable or even outperforming previous published models.
Affiliation(s)
- Gabriela Falcón-Cano
- Unit of Modeling and Experimental Biopharmaceutics. Centro de Bioactivos Químicos. Universidad Central "Marta Abreu" de las Villas. Santa Clara 54830, Villa Clara, Cuba
- Miguel Ángel Cabrera-Pérez
- Unit of Modeling and Experimental Biopharmaceutics. Centro de Bioactivos Químicos. Universidad Central "Marta Abreu" de las Villas. Santa Clara 54830, Villa Clara, Cuba; Department of Pharmacy and Pharmaceutical Technology, University of Valencia, Burjassot 46100, Valencia, Spain; Department of Engineering, Area of Pharmacy and Pharmaceutical Technology, Miguel Hernández University, 03550 Sant Joan d'Alacant, Alicante, Spain
8
Abstract
At the end of her academic career, the author summarizes the main aspects of QSAR modeling, giving comments and suggestions according to her 23 years' experience in QSAR research on environmental topics. The focus is mainly on Multiple Linear Regression, particularly Ordinary Least Squares, using a Genetic Algorithm for variable selection from various theoretical molecular descriptors, but the comments can be useful also for other QSAR methods. The need for rigorous validation, also external, and for applicability domain check to guarantee predictivity and reliability of QSAR models is particularly highlighted. The commented approach is the “predictive” one, based on chemometrics, and is usefully applied to the prioritization of environmental pollutants. All the discussed points and the author's ideas are implemented in the software QSARINS, as a legacy to the QSAR community.
9
Toropov AA, Toropova AP, Marzo M, Benfenati E. Use of the index of ideality of correlation to improve aquatic solubility model. J Mol Graph Model 2020; 96:107525. [DOI: 10.1016/j.jmgm.2019.107525] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.8] [Received: 09/05/2019] [Revised: 11/27/2019] [Accepted: 12/23/2019] [Indexed: 12/18/2022]
10
Toropova AP, Toropov AA, Carnesecchi E, Benfenati E, Dorne JL. The using of the Index of Ideality of Correlation (IIC) to improve predictive potential of models of water solubility for pesticides. Environ Sci Pollut Res Int 2020; 27:13339-13347. [PMID: 32020455 DOI: 10.1007/s11356-020-07820-6] [Citation(s) in RCA: 14] [Impact Index Per Article: 3.5] [Received: 09/26/2019] [Accepted: 01/21/2020] [Indexed: 06/10/2023]
Abstract
Models for water solubility of pesticides suggested in this manuscript are important data from point of view of ecologic engineering. The Index of Ideality of Correlation (IIC) of groups of quantitative structure-property relationships (QSPRs) for water solubility of pesticides related to the calibration sets was used to identify good in silico models. This comparison confirmed the high IIC set provides better statistical quality of the model for the validation set. Though there are large databases on solubility, the reliable prediction of the endpoint for new substances which are potential pesticides is an important ecologic task. Unfortunately, predictive models for various endpoints suffer overtraining, and the IIC serves to avoid or at least reduce this. Thus, the approach suggested has both theoretical and economic effects for ecology.
Affiliation(s)
- Alla P Toropova
- Laboratory of Environmental Chemistry and Toxicology, Department of Environmental Health Science, Istituto di Ricerche Farmacologiche Mario Negri IRCCS, Via Mario Negri 2, 20156, Milan, Italy.
- Andrey A Toropov
- Laboratory of Environmental Chemistry and Toxicology, Department of Environmental Health Science, Istituto di Ricerche Farmacologiche Mario Negri IRCCS, Via Mario Negri 2, 20156, Milan, Italy
- Edoardo Carnesecchi
- Laboratory of Environmental Chemistry and Toxicology, Department of Environmental Health Science, Istituto di Ricerche Farmacologiche Mario Negri IRCCS, Via Mario Negri 2, 20156, Milan, Italy
- Institute for Risk Assessment Sciences, Utrecht University, PO Box 80177, 3508 TD, Utrecht, The Netherlands
- Emilio Benfenati
- Laboratory of Environmental Chemistry and Toxicology, Department of Environmental Health Science, Istituto di Ricerche Farmacologiche Mario Negri IRCCS, Via Mario Negri 2, 20156, Milan, Italy
- Jean Lou Dorne
- Scientific Committee and Emerging Risks Unit, European Food Safety Authority, Via Carlo Magno 1A, 43126, Parma, Italy
11
Fioressi SE, Bacelo DE, Aranda JF, Duchowicz PR. Prediction of the aqueous solubility of diverse compounds by 2D-QSPR. J Mol Liq 2020. [DOI: 10.1016/j.molliq.2020.112572] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/25/2022]
12
Toropova AP, Toropov AA. Whether the Validation of the Predictive Potential of Toxicity Models is a Solved Task? Curr Top Med Chem 2019; 19:2643-2657. [PMID: 31702504 DOI: 10.2174/1568026619666191105111817] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.6] [Received: 07/18/2019] [Revised: 09/02/2019] [Accepted: 09/04/2019] [Indexed: 12/23/2022]
Abstract
Different kinds of biological activities are defined by complex biochemical interactions, which are termed as a "mathematical function" not only of the molecular structure but also for some additional circumstances, such as physicochemical conditions, interactions via energy and information effects between a substance and organisms, organs, cells. These circumstances lead to the great complexity of prediction for biochemical endpoints, since all "details" of corresponding phenomena are practically unavailable for the accurate registration and analysis. Researchers have not a possibility to carry out and analyse all possible ways of the biochemical interactions, which define toxicological or therapeutically attractive effects via direct experiment. Consequently, a compromise, i.e. the development of predictive models of the above phenomena, becomes necessary. However, the estimation of the predictive potential of these models remains a task that is solved only partially. This mini-review presents a collection of attempts to be used for the above-mentioned task, two special statistical indices are proposed, which may be a measure of the predictive potential of models. These indices are (i) Index of Ideality of Correlation; and (ii) Correlation Contradiction Index.
Affiliation(s)
- Alla P Toropova
- Laboratory of Environmental Chemistry and Toxicology, Department of Environmental Health Sciences, Istituto di Ricerche Farmacologiche Mario Negri IRCCS, Via La Masa 19, 20156 Milano, Italy
- Andrey A Toropov
- Laboratory of Environmental Chemistry and Toxicology, Department of Environmental Health Sciences, Istituto di Ricerche Farmacologiche Mario Negri IRCCS, Via La Masa 19, 20156 Milano, Italy
13
Gelmboldt V, Kravtsov V, Fonari M. Ammonium hexafluoridosilicates: Synthesis, structures, properties, applications. J Fluor Chem 2019. [DOI: 10.1016/j.jfluchem.2019.04.005] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.2] [Indexed: 10/27/2022]
14
Raevsky OA, Grigorev VY, Polianczyk DE, Raevskaja OE, Dearden JC. Aqueous Drug Solubility: What Do We Measure, Calculate and QSPR Predict? Mini Rev Med Chem 2019; 19:362-372. [PMID: 30058484 DOI: 10.2174/1389557518666180727164417] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.8] [Received: 04/06/2018] [Revised: 07/06/2018] [Accepted: 07/20/2018] [Indexed: 01/07/2023]
Abstract
Detailed critical analysis of publications devoted to QSPR of aqueous solubility is presented in the review with discussion of four types of aqueous solubility (three different thermodynamic solubilities with unknown solute structure, intrinsic solubility, solubility in physiological media at pH=7.4 and kinetic solubility), variety of molecular descriptors (from topological to quantum chemical), traditional statistical and machine learning methods as well as original QSPR models.
Affiliation(s)
- Oleg A Raevsky
- Department of Computer-Aided Molecular Design, Institute of Physiologically Active Compounds, Russian Academy of Science, Chernogolovka, Russian Federation
- Veniamin Y Grigorev
- Department of Computer-Aided Molecular Design, Institute of Physiologically Active Compounds, Russian Academy of Science, Chernogolovka, Russian Federation
- Daniel E Polianczyk
- Department of Computer-Aided Molecular Design, Institute of Physiologically Active Compounds, Russian Academy of Science, Chernogolovka, Russian Federation
- Olga E Raevskaja
- Department of Computer-Aided Molecular Design, Institute of Physiologically Active Compounds, Russian Academy of Science, Chernogolovka, Russian Federation
- John C Dearden
- School of Pharmacy and Biomolecular Sciences, Liverpool John Moores University, Liverpool, United Kingdom
15
Palmblad M. Visual and Semantic Enrichment of Analytical Chemistry Literature Searches by Combining Text Mining and Computational Chemistry. Anal Chem 2019; 91:4312-4316. [PMID: 30835438 PMCID: PMC6448173 DOI: 10.1021/acs.analchem.8b05818] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Indexed: 11/30/2022]
Abstract
The open-access scientific literature contains a wealth of information for meaningful text mining. However, this information is not always easy to retrieve. This technical note addresses the problem by a new flexible method combining in a single workflow existing resources for literature searches, text mining, and large-scale prediction of physicochemical and biological properties. The results are visualized as virtual mass spectra, chromatograms, or images in styles new to text mining but familiar to analytical chemistry. The method is demonstrated on comparisons of analytical-chemistry techniques and semantically enriched searches for proteins and their activities, but it may also be of general utility in experimental design, drug discovery, chemical syntheses, business intelligence, and historical studies. The method is realized in shareable scientific workflows using only freely available data, services, and software that scale to millions of publications and named chemical entities in the literature.
Affiliation(s)
- Magnus Palmblad
- Center for Proteomics and Metabolomics, Leiden University Medical Center, Postzone S3-P, Postbus 9600, 2300 RC Leiden, The Netherlands
16
Sou T, Bergström CAS. Automated assays for thermodynamic (equilibrium) solubility determination. Drug Discov Today Technol 2018; 27:11-19. [PMID: 30103859 DOI: 10.1016/j.ddtec.2018.04.004] [Citation(s) in RCA: 27] [Impact Index Per Article: 4.5] [Received: 01/10/2018] [Revised: 04/26/2018] [Accepted: 04/27/2018] [Indexed: 06/08/2023]
Abstract
Solubility is a crucial physicochemical property for drug candidates and is important in both drug discovery and development. Poor solubility is detrimental to absorption after oral administration and can mask compound activity in bioassays in various ways. Hence, solubility liabilities should ideally be identified as early as possible in the drug development process. With the increasing number of compounds as potential drug candidates, automated thermodynamic solubility assays for high throughput screening enabling rapid evaluation of a large number of compounds are becoming increasingly important. This review discusses the current status of the most widely used automated assays for thermodynamic solubility, followed by recent high throughput measurements of properties related to solubility (e.g. dissolution rate and supersaturation) and a brief overview of predictive computational methods for thermodynamic solubility reported in the literature.
Affiliation(s)
- Tomás Sou
- Department of Pharmacy, Uppsala University, BMC P.O. Box 580, SE-751 23 Uppsala, Sweden
- Christel A S Bergström
- Department of Pharmacy, Uppsala University, BMC P.O. Box 580, SE-751 23 Uppsala, Sweden.
17
Affiliation(s)
- Saeed Alqahtani
- Department of Clinical Pharmacy, College of Pharmacy, King Saud University, Riyadh, Saudi Arabia
18
Raevsky OA, Grigorev VY, Polianczyk DE, Raevskaja OE, Dearden JC. Six global and local QSPR models of aqueous solubility at pH = 7.4 based on structural similarity and physicochemical descriptors. SAR QSAR Environ Res 2017; 28:661-676. [PMID: 28891683 DOI: 10.1080/1062936x.2017.1368704] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.6] [Received: 06/14/2017] [Accepted: 08/14/2017] [Indexed: 06/07/2023]
Abstract
Aqueous solubility at pH = 7.4 is a very important property for medicinal chemists because this is the pH value of physiological media. The present work describes the application of three different methods (support vector machine (SVM), random forest (RF) and multiple linear regression (MLR)) and three local quantitative structure-property relationship (QSPR) models (regression corrected by nearest neighbours (RCNN), arithmetic mean property (AMP) and local regression property (LoReP)) to construct stable QSPRs with clear mechanistic interpretation. Our data set contained experimental values of aqueous solubility at pH = 7.4 of 387 chemicals (349 in the training set and 38 in the test set including 16 own measurements). The initial descriptor pool contained 210 physicochemical descriptors, calculated from the HYBOT, DRAGON, SYBYL and VolSurf+ programs. Six QSPRs with good statistics based on fundamentals of aqueous solubility and optimization of descriptor space were obtained. Those models have an RMSE close to experimental error (0.70), and are amenable to physical interpretation. The QSPR models developed in this study may be useful for medicinal chemists. Global MLR, RF and SVM models may be valuable for consideration of common factors that influence solubility. The RCNN, AMP and LoReP local models may be helpful for the optimization of aqueous solubility in small sets of related chemicals.
Affiliation(s)
- O A Raevsky
- Department of Computer-Aided Molecular Design, Russian Academy of Science, Chernogolovka, Russia
- V Y Grigorev
- Department of Computer-Aided Molecular Design, Russian Academy of Science, Chernogolovka, Russia
- D E Polianczyk
- Department of Computer-Aided Molecular Design, Russian Academy of Science, Chernogolovka, Russia
- O E Raevskaja
- Department of Computer-Aided Molecular Design, Russian Academy of Science, Chernogolovka, Russia
- J C Dearden
- School of Pharmacy and Biomolecular Sciences, Liverpool John Moores University, Liverpool, UK
19
Prediction of N-Methyl-D-Aspartate Receptor GluN1-Ligand Binding Affinity by a Novel SVM-Pose/SVM-Score Combinatorial Ensemble Docking Scheme. Sci Rep 2017; 7:40053. [PMID: 28059133 PMCID: PMC5216401 DOI: 10.1038/srep40053] [Citation(s) in RCA: 15] [Impact Index Per Article: 2.1] [Received: 08/19/2016] [Accepted: 11/30/2016] [Indexed: 01/24/2023]
Abstract
The glycine-binding site of the N-methyl-D-aspartate receptor (NMDAR) subunit GluN1 is a potential pharmacological target for neurodegenerative disorders. A novel combinatorial ensemble docking scheme using ligand and protein conformation ensembles and customized support vector machine (SVM)-based models to select the docked pose and to predict the docking score was generated for predicting the NMDAR GluN1-ligand binding affinity. The predicted root mean square deviation (RMSD) values in pose by SVM-Pose models were found to be in good agreement with the observed values (n = 30, r² = 0.928–0.988, = 0.894–0.954, RMSE = 0.002–0.412, s = 0.001–0.214), and the predicted pKi values by SVM-Score were found to be in good agreement with the observed values for the training samples (n = 24, r² = 0.967, = 0.899, RMSE = 0.295, s = 0.170) and test samples (n = 13, q² = 0.894, RMSE = 0.437, s = 0.202). When subjected to various statistical validations, the developed SVM-Pose and SVM-Score models consistently met the most stringent criteria. A mock test asserted the predictivity of this novel docking scheme. Collectively, this accurate novel combinatorial ensemble docking scheme can be used to predict the NMDAR GluN1-ligand binding affinity for facilitating drug discovery.
20
McDonagh JL, Palmer DS, Mourik TV, Mitchell JBO. Are the Sublimation Thermodynamics of Organic Molecules Predictable? J Chem Inf Model 2016; 56:2162-2179. [PMID: 27749062 DOI: 10.1021/acs.jcim.6b00033] [Citation(s) in RCA: 17] [Impact Index Per Article: 2.1] [Indexed: 01/08/2023]
Abstract
We compare a range of computational methods for the prediction of sublimation thermodynamics (enthalpy, entropy, and free energy of sublimation). These include a model from theoretical chemistry that utilizes crystal lattice energy minimization (with the DMACRYS program) and quantitative structure property relationship (QSPR) models generated by both machine learning (random forest and support vector machines) and regression (partial least squares) methods. Using these methods we investigate the predictability of the enthalpy, entropy and free energy of sublimation, with consideration of whether such a method may be able to improve solubility prediction schemes. Previous work has suggested that the major source of error in solubility prediction schemes involving a thermodynamic cycle via the solid state is in the modeling of the free energy change away from the solid state. Yet contrary to this conclusion other work has found that the inclusion of terms such as the enthalpy of sublimation in QSPR methods does not improve the predictions of solubility. We suggest the use of theoretical chemistry terms, detailed explicitly in the Methods section, as descriptors for the prediction of the enthalpy and free energy of sublimation. A data set of 158 molecules with experimental sublimation thermodynamics values and some CSD refcodes has been collected from the literature and is provided with their original source references.
Affiliation(s)
- James L McDonagh
- Manchester Institute of Biotechnology, The University of Manchester, 131 Princess Street, Manchester, M1 7DN, U.K.; School of Chemistry, University of St Andrews, North Haugh, St Andrews, Fife, Scotland, United Kingdom, KY16 9ST
- David S Palmer
- Department of Pure and Applied Chemistry, University of Strathclyde, Thomas Graham Building, 295 Cathedral Street, Glasgow, Scotland, United Kingdom, G1 1XL
- Tanja van Mourik
- School of Chemistry, University of St Andrews, North Haugh, St Andrews, Fife, Scotland, United Kingdom, KY16 9ST
- John B O Mitchell
- School of Chemistry, University of St Andrews, North Haugh, St Andrews, Fife, Scotland, United Kingdom, KY16 9ST