1
Panwar P, Yang Q, Martini A. Temperature-Dependent Density and Viscosity Prediction for Hydrocarbons: Machine Learning and Molecular Dynamics Simulations. J Chem Inf Model 2024; 64:2760-2774. [PMID: 37582234 DOI: 10.1021/acs.jcim.3c00231] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 08/17/2023]
Abstract
Machine learning-based predictive models allow rapid and reliable prediction of material properties and facilitate innovative materials design. Base oils used in the formulation of lubricant products are complex hydrocarbons of varying sizes and structure. This study developed Gaussian process regression-based models to accurately predict the temperature-dependent density and dynamic viscosity of 305 complex hydrocarbons. In our approach, strongly correlated/collinear predictors were trimmed, important predictors were selected by least absolute shrinkage and selection operator (LASSO) regularization and prior domain knowledge, hyperparameters were systematically optimized by Bayesian optimization, and the models were interpreted. The approach provided versatile and quantitative structure-property relationship (QSPR) models with relatively simple predictors for determining the dynamic viscosity and density of complex hydrocarbons at any temperature. In addition, we developed molecular dynamics simulation-based descriptors and evaluated the feasibility and versatility of dynamic descriptors from simulations for predicting the material properties. It was found that the models developed using a comparably smaller pool of dynamic descriptors performed similarly in predicting density and viscosity to models based on many more static descriptors. The best models were shown to predict density and dynamic viscosity with coefficient of determination (R²) values of 99.6% and 97.7%, respectively, for all data sets, including a test data set of 45 molecules. Finally, partial dependency plots (PDPs), individual conditional expectation (ICE) plots, local interpretable model-agnostic explanation (LIME) values, and trimmed model R² values were used to identify the most important static and dynamic predictors of the density and viscosity.
Affiliation(s)
- Pawan Panwar
  - Department of Mechanical Engineering, University of California Merced, 5200 North Lake Road, Merced, California 95343, United States
- Quanpeng Yang
  - Department of Mechanical Engineering, University of California Merced, 5200 North Lake Road, Merced, California 95343, United States
- Ashlie Martini
  - Department of Mechanical Engineering, University of California Merced, 5200 North Lake Road, Merced, California 95343, United States
2
Panwar P, Yang Q, Martini A. PyL3dMD: Python LAMMPS 3D molecular descriptors package. J Cheminform 2023; 15:69. [PMID: 37507792 PMCID: PMC10385924 DOI: 10.1186/s13321-023-00737-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/17/2023] [Accepted: 07/16/2023] [Indexed: 07/30/2023] Open
Abstract
Molecular descriptors characterize the biological, physical, and chemical properties of molecules and have long been used for understanding molecular interactions and facilitating materials design. Some of the most robust descriptors are derived from geometrical representations of molecules, called 3-dimensional (3D) descriptors. When calculated from molecular dynamics (MD) simulation trajectories, 3D descriptors can also capture the effects of operating conditions such as temperature or pressure. However, extracting 3D descriptors from MD trajectories is non-trivial, which hinders their wide use by researchers developing advanced quantitative-structure-property-relationship models using machine learning. Here, we describe a suite of open-source Python-based post-processing routines, called PyL3dMD, for calculating 3D descriptors from MD simulations. PyL3dMD is compatible with the popular simulation package LAMMPS and enables users to compute more than 2000 3D molecular descriptors from atomic trajectories generated by MD simulations. PyL3dMD is freely available via GitHub and can be easily installed and used as a highly flexible Python package on all major platforms (Windows, Linux, and macOS). A performance benchmark study used descriptors calculated by PyL3dMD to develop a neural network and the results showed that PyL3dMD is fast and efficient in calculating descriptors for large and complex molecular systems with long simulation durations. PyL3dMD facilitates the calculation of 3D molecular descriptors using MD simulations, making it a valuable tool for cheminformatics studies.
Affiliation(s)
- Pawan Panwar
  - Department of Mechanical Engineering, University of California Merced, 5200 North Lake Road, Merced, CA, 95343, USA
- Quanpeng Yang
  - Department of Mechanical Engineering, University of California Merced, 5200 North Lake Road, Merced, CA, 95343, USA
- Ashlie Martini
  - Department of Mechanical Engineering, University of California Merced, 5200 North Lake Road, Merced, CA, 95343, USA
3
Xiao ZJ, Chen JW, Wang Y, Wang ZY. In silico package models for deriving values of solute parameters in linear solvation energy relationships. SAR and QSAR in Environmental Research 2023; 34:21-37. [PMID: 36625152 DOI: 10.1080/1062936x.2022.2162576] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/25/2022] [Accepted: 12/20/2022] [Indexed: 06/17/2023]
Abstract
Environmental partitioning influences the fate, exposure and ecological risks of chemicals. Linear solvation energy relationship (LSER) models may serve as efficient tools for estimating environmental partitioning parameter values, which are lacking for many chemicals. Nonetheless, the scarcity of empirical solute parameter values has restricted the application of LSER models. This study developed and evaluated in silico methods and models to derive these values: excess molar refraction, molar volume and the logarithm of the hexadecane/air partition coefficient were computed from density functional theory, while the dipolarity/polarizability parameter and the solute H-bond acidity and basicity parameters were predicted by quantitative structure-activity relationship models developed with theoretical molecular descriptors. New LSER models for four physicochemical properties relevant to environmental partitioning (n-octanol/water partition coefficients, n-octanol/air partition coefficients, water solubilities, sub-cooled liquid vapour pressures) were constructed using the in silico solute parameter values and exhibited performance comparable to that of conventional LSER models using empirical solute parameter values. The package models for deriving the LSER solute parameter values, which require no instrumental determinations, may lay the foundation for high-throughput estimation of environmental partition parameter values of diverse organic chemicals.
Affiliation(s)
- Z J Xiao
  - Key Laboratory of Industrial Ecology and Environmental Engineering (Ministry of Education), Dalian Key Laboratory on Chemicals Risk Control and Pollution Prevention Technology, School of Environmental Science and Technology, Dalian University of Technology, Dalian, China
- J W Chen
  - Key Laboratory of Industrial Ecology and Environmental Engineering (Ministry of Education), Dalian Key Laboratory on Chemicals Risk Control and Pollution Prevention Technology, School of Environmental Science and Technology, Dalian University of Technology, Dalian, China
- Y Wang
  - Key Laboratory of Industrial Ecology and Environmental Engineering (Ministry of Education), Dalian Key Laboratory on Chemicals Risk Control and Pollution Prevention Technology, School of Environmental Science and Technology, Dalian University of Technology, Dalian, China
- Z Y Wang
  - Key Laboratory of Industrial Ecology and Environmental Engineering (Ministry of Education), Dalian Key Laboratory on Chemicals Risk Control and Pollution Prevention Technology, School of Environmental Science and Technology, Dalian University of Technology, Dalian, China
4
Bonnot K, Benoit P, Hoyau S, Mamy L, Patureau D, Servien R, Rapacioli M, Bessac F. Accuracy of Computational Chemistry Methods to Calculate Organic Contaminant Molecular Properties. ChemistrySelect 2022. [DOI: 10.1002/slct.202203586] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/24/2022]
Affiliation(s)
- Kevin Bonnot
  - INRAE, Univ. Montpellier, LBE, 102 avenue des Etangs, 11100 Narbonne, France
  - Université Paris-Saclay, INRAE, AgroParisTech, UMR ECOSYS, 78850 Thiverval-Grignon, France
- Pierre Benoit
  - Université Paris-Saclay, INRAE, AgroParisTech, UMR ECOSYS, 78850 Thiverval-Grignon, France
- Sophie Hoyau
  - Université de Toulouse; Laboratoire de Chimie et Physique Quantiques (UMR 5626), UPS, CNRS, 118 route de Narbonne, F-31062 Toulouse, France
- Laure Mamy
  - Université Paris-Saclay, INRAE, AgroParisTech, UMR ECOSYS, 78850 Thiverval-Grignon, France
- Dominique Patureau
  - INRAE, Univ. Montpellier, LBE, 102 avenue des Etangs, 11100 Narbonne, France
- Rémi Servien
  - INRAE, Univ. Montpellier, LBE, 102 avenue des Etangs, 11100 Narbonne, France
- Mathias Rapacioli
  - Université de Toulouse; Laboratoire de Chimie et Physique Quantiques (UMR 5626), UPS, CNRS, 118 route de Narbonne, F-31062 Toulouse, France
- Fabienne Bessac
  - Université de Toulouse; Laboratoire de Chimie et Physique Quantiques (UMR 5626), UPS, CNRS, 118 route de Narbonne, F-31062 Toulouse, France
  - Université de Toulouse; INPT; Ecole d'Ingénieurs de Purpan, 75 voie du TOEC, BP 57611, F-31076 Toulouse Cedex 03, France
5
Holt RA, Seybold PG. Computational Estimation of the Acidities of Pyrimidines and Related Compounds. Molecules (Basel, Switzerland) 2022; 27:molecules27020385. [PMID: 35056699 PMCID: PMC8782049 DOI: 10.3390/molecules27020385] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/25/2021] [Revised: 12/20/2021] [Accepted: 01/01/2022] [Indexed: 12/02/2022]
Abstract
Pyrimidines are key components in the genetic code of living organisms and the pyrimidine scaffold is also found in many bioactive and medicinal compounds. The acidities of these compounds, as represented by their pKa values, are of special interest since they determine the species that will prevail under different pH conditions. Here, a quantum chemical quantitative structure–activity relationship (QSAR) approach was employed to estimate these acidities. Density-functional theory calculations at the B3LYP/6-31+G(d,p) level and the SM8 aqueous solvent model were employed, and the energy difference ΔE_H2O between the parent compound and its dissociation product was used as a variation parameter. Excellent estimates for both the cation → neutral (pKa1, R² = 0.965) and neutral → anion (pKa2, R² = 0.962) dissociations were obtained. A commercial package from Advanced Chemical Design also yielded excellent results for these acidities.
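The regression step described in this abstract can be sketched as follows. This is only a minimal illustration, assuming a one-variable ordinary least-squares fit of experimental pKa against the computed energy difference; the function names `fit_line` and `r_squared`, the data points, and the energy units are hypothetical and are not taken from the paper.

```python
def fit_line(x, y):
    """Ordinary least-squares fit y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def r_squared(x, y, slope, intercept):
    """Coefficient of determination of the fitted line."""
    mean_y = sum(y) / len(y)
    ss_res = sum((yi - (slope * xi + intercept)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1.0 - ss_res / ss_tot


# Hypothetical (energy difference, experimental pKa) pairs, illustration only.
delta_e = [280.0, 285.0, 290.0, 295.0, 300.0]
pka_exp = [2.1, 3.9, 6.2, 7.8, 10.0]

slope, intercept = fit_line(delta_e, pka_exp)
r2 = r_squared(delta_e, pka_exp, slope, intercept)
print(f"pKa ~ {slope:.3f} * dE + {intercept:.2f}, R^2 = {r2:.3f}")
```

Once fitted, the same slope and intercept would be applied to the computed energy difference of a compound with no measured pKa to obtain an estimate.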
6
Dupeux T, Gaudin T, Marteau‐Roussy C, Aubry J, Nardello‐Rataj V. COSMO‐RS as an effective tool for predicting the physicochemical properties of fragrance raw materials. Flavour Frag J 2022. [DOI: 10.1002/ffj.3690] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Indexed: 12/14/2022]
Affiliation(s)
- Tristan Dupeux
  - Univ. Lille, CNRS, Centrale Lille, Univ. Artois, UMR 8181 – UCCS – Unité de Catalyse et Chimie du Solide, Lille, France
  - International Flavors & Fragrances (Fragrance Beauty Care), Neuilly‐sur‐Seine, France
- Théophile Gaudin
  - Univ. Lille, CNRS, Centrale Lille, Univ. Artois, UMR 8181 – UCCS – Unité de Catalyse et Chimie du Solide, Lille, France
- Jean‐Marie Aubry
  - Univ. Lille, CNRS, Centrale Lille, Univ. Artois, UMR 8181 – UCCS – Unité de Catalyse et Chimie du Solide, Lille, France
- Véronique Nardello‐Rataj
  - Univ. Lille, CNRS, Centrale Lille, Univ. Artois, UMR 8181 – UCCS – Unité de Catalyse et Chimie du Solide, Lille, France
7
Kim JY, Kim KB, Lee BM. Validation of Quantitative Structure-Activity Relationship (QSAR) and Quantitative Structure-Property Relationship (QSPR) approaches as alternatives to skin sensitization risk assessment. Journal of Toxicology and Environmental Health. Part A 2021; 84:945-959. [PMID: 34338166 DOI: 10.1080/15287394.2021.1956660] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Indexed: 06/13/2023]
Abstract
This study was conducted to validate the physicochemical properties of a total of 362 chemicals [305 skin sensitizers (212 in the previous study + 93 additional new chemicals), 57 non-skin sensitizers (38 in the previous study + 19 additional new chemicals)] for skin sensitization risk assessment using quantitative structure-activity relationship (QSAR)/quantitative structure-property relationship (QSPR) approaches. The average melting point (MP), surface tension (ST), and density (DS) of the 305 skin sensitizers and 57 non-sensitizers were used to determine the cutoff values distinguishing positive and negative sensitization, and correlation coefficients were employed to derive effective 3-fold concentration (EC3 (%)) values. QSAR models were also utilized to assess skin sensitization. The sensitivity, specificity, and accuracy were 80, 15, and 70%, respectively, for the Toxtree QSAR model; 88, 46, and 81%, respectively, for Vega; and 56, 61, and 56%, respectively, for Danish EPA QSAR. Surprisingly, the sensitivity, specificity, and accuracy were 60, 80, and 64%, respectively, when MP, ST, and DS (MP+ST+DS) were used in this study. Further, MP+ST+DS exhibited a sensitivity of 77%, specificity of 57%, and accuracy of 73% when the derived EC3 values were classified into local lymph node assay (LLNA) skin sensitizer and non-sensitizer categories. Thus, MP, ST, and DS may prove useful in predicting EC3 values, not only as an alternative approach to animal testing but also for skin sensitization risk assessment.
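The three reported metrics are linked: accuracy is the class-size-weighted average of sensitivity and specificity. The sketch below illustrates this relation using the class sizes and the rounded MP+ST+DS percentages quoted in the abstract. It assumes that every chemical received a prediction, which the abstract does not state, and the function name `accuracy_from_rates` is hypothetical.

```python
def accuracy_from_rates(sensitivity, specificity, n_positive, n_negative):
    """Overall accuracy implied by sensitivity and specificity.

    Assumes all n_positive + n_negative items were classified.
    """
    correct = sensitivity * n_positive + specificity * n_negative
    return correct / (n_positive + n_negative)


# Class sizes and rounded rates as quoted in the abstract (MP+ST+DS).
n_sensitizers = 305
n_non_sensitizers = 57
acc = accuracy_from_rates(0.60, 0.80, n_sensitizers, n_non_sensitizers)
print(f"implied accuracy: {acc:.1%}")
```

Because the dataset is dominated by sensitizers, accuracy tracks sensitivity much more closely than specificity; this is why a model with very low specificity can still show a high overall accuracy on this set.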
Affiliation(s)
- Ji Yun Kim
  - Division of Toxicology, College of Pharmacy, Sungkyunkwan University, Suwon, Gyeonggi-do, South Korea
- Kyu-Bong Kim
  - College of Pharmacy, Dankook University, Dandae-ro, Cheonan, Chungnam, South Korea
- Byung-Mu Lee
  - Division of Toxicology, College of Pharmacy, Sungkyunkwan University, Suwon, Gyeonggi-do, South Korea
8
Morency M, Néron S, Iftimie R, Wuest JD. Predicting pKa Values of Quinols and Related Aromatic Compounds with Multiple OH Groups. J Org Chem 2021; 86:14444-14460. [PMID: 34613729 DOI: 10.1021/acs.joc.1c01279] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Indexed: 11/28/2022]
Abstract
Quinonoid compounds play central roles as redox-active agents in photosynthesis and respiration and are also promising replacements for inorganic materials currently used in batteries. To design new quinonoid compounds and predict their state of protonation and redox behavior under various conditions, their pKa values must be known. Methods that can predict the pKa values of simple phenols cannot reliably handle complex analogues in which multiple OH groups are present and may form intramolecular hydrogen bonds. We have therefore developed a straightforward method based on a linear relationship between experimental pKa values and calculated differences in energy between quinols and their deprotonated forms. Simple adjustments allow reliable predictions of pKa values when intramolecular hydrogen bonds are present. Our approach has been validated by showing that predicted and experimental values for over 100 quinols and related compounds differ by an average of only 0.3 units. This accuracy makes it possible to select proper pKa values when experimental data vary, predict the acidity of quinols and related compounds before they are made, and determine the sites and orders of deprotonation in complex structures with multiple OH groups.
Affiliation(s)
- Mathieu Morency
  - Département de Chimie, Université de Montréal, Montréal, Québec H2V 0B3, Canada
- Sébastien Néron
  - Département de Chimie, Université de Montréal, Montréal, Québec H2V 0B3, Canada
- Radu Iftimie
  - Département de Chimie, Université de Montréal, Montréal, Québec H2V 0B3, Canada
- James D Wuest
  - Département de Chimie, Université de Montréal, Montréal, Québec H2V 0B3, Canada
9
Ebi T, Sen A, Dhital RN, Yamada YMA, Kaneko H. Design of Experimental Conditions with Machine Learning for Collaborative Organic Synthesis Reactions Using Transition-Metal Catalysts. ACS Omega 2021; 6:27578-27586. [PMID: 34693179 PMCID: PMC8529890 DOI: 10.1021/acsomega.1c04826] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 09/02/2021] [Accepted: 09/28/2021] [Indexed: 06/13/2023]
Abstract
To improve product yields in synthetic reactions, it is important to use appropriate catalysts. In this study, we used machine learning to design catalysts for a reaction system in which both Buchwald-Hartwig-type and Suzuki-Miyaura-type cross-coupling reactions proceed simultaneously. First, using an existing dataset, yield prediction models were constructed with machine learning between the experimental conditions, including the substrate and catalyst, and the yields of the two products. Seven methods for calculating both the substrate and catalyst descriptors were proposed, and the predictive ability of the yield prediction models was discussed in terms of the descriptors and machine learning methods. Then, the constructed models were used to predict the compound yields for new combinations of substrates and catalysts, and the predictions were experimentally validated with high reproducibility, confirming that machine learning can predict yields from experimental conditions with high accuracy. In addition, to design catalysts that will improve the yields in our dataset, we added datasets collected from scientific papers and designed catalyst ligands. The proposed catalyst candidates were tested in actual synthetic experiments, and the experimental results exceeded the existing yields.
Affiliation(s)
- Tomoya Ebi
  - Department of Applied Chemistry, School of Science and Technology, Meiji University, 1-1-1 Higashi-Mita, Tama-ku, Kawasaki, Kanagawa 214-8571, Japan
- Abhijit Sen
  - RIKEN Center for Sustainable Resource Science, 2-1 Hirosawa, Wako, Saitama 351-0198, Japan
- Raghu N. Dhital
  - RIKEN Center for Sustainable Resource Science, 2-1 Hirosawa, Wako, Saitama 351-0198, Japan
- Yoichi M. A. Yamada
  - RIKEN Center for Sustainable Resource Science, 2-1 Hirosawa, Wako, Saitama 351-0198, Japan
- Hiromasa Kaneko
  - Department of Applied Chemistry, School of Science and Technology, Meiji University, 1-1-1 Higashi-Mita, Tama-ku, Kawasaki, Kanagawa 214-8571, Japan
  - RIKEN Center for Sustainable Resource Science, 2-1 Hirosawa, Wako, Saitama 351-0198, Japan
10
Kaneko H. Examining variable selection methods for the predictive performance of regression models and the proportion of selected variables and selected random variables. Heliyon 2021; 7:e07356. [PMID: 34195450 PMCID: PMC8237311 DOI: 10.1016/j.heliyon.2021.e07356] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 02/22/2021] [Revised: 05/02/2021] [Accepted: 06/16/2021] [Indexed: 11/24/2022] Open
Abstract
The selection of descriptors, X, is crucial for improving the interpretation and prediction accuracy of a regression model. In this study, to evaluate variable (feature) selection methods, the prediction accuracy of models constructed using the selected X was determined, and the selection results were examined in terms of the number of selected X and the number of selected variables that are unrelated to the objective variable, y (for example, an activity or property). The variable selection methods included least absolute shrinkage and selection operator, genetic algorithm-based partial least squares, genetic algorithm-based support vector regression, and Boruta. Several regression analysis methods were used to test the prediction accuracy of the model constructed using the selected X. The characteristics of each variable selection method were analyzed using eight datasets. The results showed that even when variables unrelated to y were selected by variable selection and the number of unrelated variables was the same as the number of the original variables, a regression model with good accuracy, which ignores the influence of such noise variables, can be constructed by applying various regression analysis methods. Additionally, the variables related to y must not be deleted. These findings provide a basis for improving the variable selection methods.
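To make the first of the listed selection methods concrete, the sketch below shows how the least absolute shrinkage and selection operator (LASSO) can set the coefficient of a weakly related variable exactly to zero. It is a minimal pure-Python coordinate-descent illustration under stated assumptions (no intercept, no standardization, no all-zero columns), not the implementation used in the study; the toy data, the penalty value, and all function names are hypothetical.

```python
def soft_threshold(value, lam):
    """Shrink value toward zero by lam; return 0.0 inside [-lam, lam]."""
    if value > lam:
        return value - lam
    if value < -lam:
        return value + lam
    return 0.0


def lasso_coordinate_descent(X, y, lam, n_iter=100):
    """Minimise (1/2n)*||y - Xb||^2 + lam*||b||_1 by cyclic coordinate descent."""
    n, p = len(X), len(X[0])
    coef = [0.0] * p
    for _ in range(n_iter):
        for j in range(p):
            rho = 0.0   # column j dotted with the residual that excludes column j
            norm = 0.0  # squared norm of column j
            for i in range(n):
                partial = y[i] - sum(X[i][k] * coef[k] for k in range(p) if k != j)
                rho += X[i][j] * partial
                norm += X[i][j] ** 2
            coef[j] = soft_threshold(rho / n, lam) / (norm / n)
    return coef


# Toy data: y = 2.0 * x1 + 0.1 * x2, with orthogonal columns x1 and x2.
X = [[1.0, 1.0], [-1.0, 1.0], [1.0, -1.0], [-1.0, -1.0]]
y = [2.1, -1.9, 1.9, -2.1]

coef = lasso_coordinate_descent(X, y, lam=0.5)
selected = [j for j, c in enumerate(coef) if c != 0.0]
print("coefficients:", coef, "selected columns:", selected)
```

With this penalty the second column is dropped and the first coefficient is shrunk below its unpenalized value, which illustrates the trade-off the abstract points to: a stronger penalty removes more noise variables but also risks deleting variables that are genuinely related to y.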
Affiliation(s)
- Hiromasa Kaneko
  - Department of Applied Chemistry, School of Science and Technology, Meiji University, 1-1-1 Higashi-Mita, Tama-ku, Kawasaki, Kanagawa 214-8571, Japan
11
Fayet G, Rotureau P. Chemoinformatics for the Safety of Energetic and Reactive Materials at Ineris. Mol Inform 2020; 41:e2000190. [PMID: 33283975 DOI: 10.1002/minf.202000190] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 07/31/2020] [Accepted: 12/06/2020] [Indexed: 11/07/2022]
Abstract
The characterization of the physical hazards of substances is key information for managing the risks associated with their use, storage and transport. With decades of work in this area, Ineris develops and implements cutting-edge experimental facilities allowing such characterizations at different scales and under various conditions to study all of the dreaded accident scenarios. This review presents the more recent efforts by Ineris in the field of chemoinformatics to develop and use new predictive methods, as a complement to experiments, for the anticipation and management of industrial risks associated with energetic and reactive materials. An overview of the methods used for the development of Quantitative Structure-Property Relationships (QSPR) for physical hazards is presented and discussed with regard to the specificities of this class of properties. A review of models developed at Ineris is also provided, from the first tentative models on the explosivity of nitro compounds to the successful application to the flammability of organic mixtures. The use of QSPR models is then discussed. Good practices for the robust use of QSPR models are recalled, with specific comments related to physical hazards, notably for regulatory purposes. Dissemination and training efforts by Ineris are also presented. The potential offered by these predictive methods for in silico design and for the development of new intrinsically safer technologies in safety-by-design strategies is then discussed. Finally, challenges and perspectives for extending the application of chemoinformatics in the field of safety, in particular for the physical hazards of energetic and reactive substances, are proposed.
Affiliation(s)
- Guillaume Fayet
  - Ineris, Accidental Risk Division, Parc Technologique Alata, 60550, Verneuil-en-Halatte, France
- Patricia Rotureau
  - Ineris, Accidental Risk Division, Parc Technologique Alata, 60550, Verneuil-en-Halatte, France
12
Achary PGR, Toropova AP, Toropov AA. Prediction of the self‐accelerating decomposition temperature of organic peroxides. Process Safety Progress 2020. [DOI: 10.1002/prs.12189] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 01/28/2023]
Affiliation(s)
- Patnala Ganga Raju Achary
  - Department of Chemistry, Institute of Technical Education and Research (ITER), Siksha 'O' Anusandhan deemed to be University, Bhubaneswar, Odisha, India
- Alla P. Toropova
  - Department of Environmental Health Science, Laboratory of Environmental Chemistry and Toxicology, Istituto di Ricerche Farmacologiche Mario Negri IRCCS, Milan, Italy
- Andrey A. Toropov
  - Department of Environmental Health Science, Laboratory of Environmental Chemistry and Toxicology, Istituto di Ricerche Farmacologiche Mario Negri IRCCS, Milan, Italy
13
Li C, Li H, Zong HH, Huang Y, Gozin M, Sun CQ, Zhang L. Strategies for Achieving Balance between Detonation Performance and Crystal Stability of High-Energy-Density Materials. iScience 2020; 23:100944. [PMID: 32163898 PMCID: PMC7066234 DOI: 10.1016/j.isci.2020.100944] [Citation(s) in RCA: 12] [Impact Index Per Article: 3.0] [Received: 01/07/2020] [Revised: 02/18/2020] [Accepted: 02/21/2020] [Indexed: 01/07/2023] Open
Abstract
The performance-stability contradiction of high-energy-density materials (HEDMs) is a long-standing puzzle in the field of chemistry and material science. Bridging the gap that exists between the detonation performance of new HEDMs and their stability remains a formidable challenge. Achieving an optimal balance between the two contradictory factors is in significant demand for deep-well oil and gas drilling, space exploration, and other civil and defense applications. Herein, supercomputers and the latest quantitative computational strategies were employed and high-throughput quantum calculations were conducted for 67 reported HEDMs. Based on statistical analysis of large amounts of physico-chemical data, in-crystal interspecies interactions were identified as the factor that provokes the performance-stability contradiction of HEDMs. To design new HEDMs with both good detonation performance and high stability, the proposed systematic and comprehensive strategies must be satisfied, which could promote the development of crystal engineering of HEDMs to an era of theory-guided rational design of materials.
Affiliation(s)
- Chongyang Li
  - Key Laboratory of Low-dimensional Materials and Application Technology (Ministry of Education), School of Materials Science and Engineering, Xiangtan University, Xiangtan 411105, China; CAEP Software Center for High Performance Numerical Simulation, Beijing 100088, China
- Hui Li
  - Science and Technology on Combustion and Explosion Laboratory, Xi'an Modern Chemistry Research Institute, Xi'an 710065, China; School of Chemistry, Faculty of Exact Science, Tel Aviv University, Tel Aviv 69978, Israel
- He-Hou Zong
  - Institute of Chemical Materials, China Academy of Engineering Physics (CAEP), Mianyang 621900, China
- Yongli Huang
  - Key Laboratory of Low-dimensional Materials and Application Technology (Ministry of Education), School of Materials Science and Engineering, Xiangtan University, Xiangtan 411105, China
- Michael Gozin
  - School of Chemistry, Faculty of Exact Science, Tel Aviv University, Tel Aviv 69978, Israel
- Chang Q Sun
  - EBEAM, Yangtze Normal University, Chongqing 408100, China; NOVITAS, Nanyang Technological University, Singapore 639798, Singapore
- Lei Zhang
  - CAEP Software Center for High Performance Numerical Simulation, Beijing 100088, China; Institute of Applied Physics and Computational Mathematics, Beijing 100088, China
14
Duchowicz PR. QSPR studies on water solubility, octanol-water partition coefficient and vapour pressure of pesticides. SAR and QSAR in Environmental Research 2020; 31:135-148. [PMID: 31842624 DOI: 10.1080/1062936x.2019.1699602] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 11/06/2019] [Accepted: 11/27/2019] [Indexed: 06/10/2023]
Abstract
The assessment of the environmental fate and (eco)toxicological effects of pesticide compounds is of crucial importance. The present review focuses on Quantitative Structure-Property Relationships (QSPR) applications to three environmentally relevant physicochemical properties of pesticides, namely water solubility, octanol-water partition coefficient and vapour pressure, which can be used for assessing their environmental partitioning and transport as well as their exposure potential. This article reviews various interesting QSPR applications, with special emphasis on studies developed during the 2009-2019 period.
Affiliation(s)
- P R Duchowicz
  - Instituto de Investigaciones Fisicoquímicas Teóricas y Aplicadas (INIFTA), CONICET, UNLP, La Plata, Argentina
15
Fantke P, Aurisano N, Provoost J, Karamertzanis PG, Hauschild M. Toward effective use of REACH data for science and policy. Environment International 2020; 135:105336. [PMID: 31884133 DOI: 10.1016/j.envint.2019.105336] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.0] [Received: 06/18/2019] [Revised: 11/15/2019] [Accepted: 11/15/2019] [Indexed: 06/10/2023]
Affiliation(s)
- Peter Fantke
  - Quantitative Sustainability Assessment, Department of Technology, Management and Economics, Technical University of Denmark, Produktionstorvet 424, 2800 Kgs. Lyngby, Denmark
- Nicolò Aurisano
  - Quantitative Sustainability Assessment, Department of Technology, Management and Economics, Technical University of Denmark, Produktionstorvet 424, 2800 Kgs. Lyngby, Denmark
- Jeroen Provoost
  - Computational Assessment Unit, Directorate of Prioritisation and Integration, European Chemicals Agency, Annankatu 18, 00121 Helsinki, Finland
- Panagiotis G Karamertzanis
  - Computational Assessment Unit, Directorate of Prioritisation and Integration, European Chemicals Agency, Annankatu 18, 00121 Helsinki, Finland
- Michael Hauschild
  - Quantitative Sustainability Assessment, Department of Technology, Management and Economics, Technical University of Denmark, Produktionstorvet 424, 2800 Kgs. Lyngby, Denmark
16
Shin HK. Electron configuration-based neural network model to predict physicochemical properties of inorganic compounds. RSC Adv 2020; 10:33268-33278. [PMID: 35515036 PMCID: PMC9056678 DOI: 10.1039/d0ra05873d] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 07/06/2020] [Accepted: 09/01/2020] [Indexed: 11/21/2022] Open
Abstract
Registration, evaluation, and authorization of chemicals (REACH), the regulation of chemicals in use, imposes the characterization and report of the physicochemical properties of compounds. To cope with the financial burden of the experiments, the use of computational models is permitted for prediction of properties. Although a number of physicochemical property prediction models have been developed, their applicability domain is limited to organic molecules since most available data are concerned with organic molecules, and most of the molecular descriptors are restricted to organic molecule calculations. Prediction models developed for inorganic compounds were intended to predict endpoints relevant to novel material design. Therefore, no models were available for predicting endpoints of inorganic compounds that are significant to regulatory perspectives. In this study, boiling point, water solubility, melting point, and pyrolysis point prediction models were developed for inorganic compounds based on their composition. The electron configuration of each element in the molecule was used as a descriptor in this study. The dataset covered a wide range of endpoints and diverse elements in their structure. The performance of the models was measured using R², mean absolute error, and Spearman's correlation coefficient, and indicated good prediction accuracy of continuous endpoints and prioritization of inorganic compounds.
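Two of the performance measures named in this abstract serve different purposes: mean absolute error scores how close the predicted values are, while Spearman's correlation scores only whether compounds are ranked in the right order, which is what matters for prioritization. The sketch below computes both for a small set of values. It is illustrative only; the numbers are hypothetical, the function names are assumptions, and the rank formula used is the simple version that assumes there are no tied values.

```python
def mean_absolute_error(y_true, y_pred):
    """Average absolute difference between observed and predicted values."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def ranks(values):
    """Rank values from 1 (smallest); assumes there are no ties."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, index in enumerate(order, start=1):
        result[index] = rank
    return result


def spearman_rho(y_true, y_pred):
    """Spearman rank correlation using the no-ties formula."""
    n = len(y_true)
    d2 = sum((a - b) ** 2 for a, b in zip(ranks(y_true), ranks(y_pred)))
    return 1.0 - 6.0 * d2 / (n * (n ** 2 - 1))


# Hypothetical observed and predicted endpoint values, illustration only.
observed = [1.0, 2.0, 3.0, 4.0]
predicted = [1.5, 1.8, 3.9, 3.2]

mae = mean_absolute_error(observed, predicted)
rho = spearman_rho(observed, predicted)
print(f"MAE = {mae:.2f}, Spearman rho = {rho:.2f}")
```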
Affiliation(s)
- Hyun Kil Shin
- Toxicoinformatics Group, Department of Predictive Toxicology, Korea Institute of Toxicology, Daejeon, Republic of Korea
17
Keshavarz MH, Bagheri V. A Simple Correlation for Assessment of the Shock Wave Energy in Underwater Detonation. Z Anorg Allg Chem 2019. [DOI: 10.1002/zaac.201900221] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.0] [Indexed: 11/12/2022]
Affiliation(s)
- Vahid Bagheri
- Department of Chemistry; Malek-ashtar University of Technology; P.O. Box 83145/115 Shahin-Shahr Iran
18
de Morais e Silva L, Lorenzo VP, Lopes WS, Scotti L, Scotti MT. Predictive Computational Tools for Assessment of Ecotoxicological Activity of Organic Micropollutants in Various Water Sources in Brazil. Mol Inform 2019; 38:e1800156. [DOI: 10.1002/minf.201800156] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.0] [Received: 11/20/2018] [Accepted: 01/06/2019] [Indexed: 01/18/2023]
Affiliation(s)
- Luana de Morais e Silva
- Post-Graduate Program in Science and Environmental Technology, Department of Sanitary and Environmental Engineering, State University of Paraíba, 58429500 Campina Grande, PB, Brazil
- Vitor Prates Lorenzo
- Federal Institute of Education, Science and Technology Sertão Pernambucano, 56316686 Petrolina, Pernambuco, Brazil
- Wilton Silva Lopes
- Post-Graduate Program in Science and Environmental Technology, Department of Sanitary and Environmental Engineering, State University of Paraíba, 58429500 Campina Grande, PB, Brazil
- Luciana Scotti
- Post-Graduate Program in Natural and Synthetic Bioactive Products, Federal University of Paraíba, 58051-900 João Pessoa, PB, Brazil
- Marcus Tullius Scotti
- Post-Graduate Program in Natural and Synthetic Bioactive Products, Federal University of Paraíba, 58051-900 João Pessoa, PB, Brazil
19
Revealing Solid Properties of High-energy-density Molecular Cocrystals from the Cooperation of Hydrogen Bonding and Molecular Polarizability. Sci Rep 2019; 9:1257. [PMID: 30718589 PMCID: PMC6362133 DOI: 10.1038/s41598-018-37500-y] [Citation(s) in RCA: 12] [Impact Index Per Article: 2.4] [Received: 06/18/2018] [Accepted: 11/30/2018] [Indexed: 11/08/2022]
Abstract
In the domain of high-energy-density materials, the understanding of physicochemical properties has long been based primarily on molecular structures, whereas the crystal packing effect, which significantly affects solid properties, has seldom been considered. Here we predict the solid properties of six novel energetic cocrystals by taking the crystal packing effect into account using a quantum chemistry method. We find that hydrogen bonding increases the molecular polarizability and that their cooperation significantly changes the solid-state nature of the cocrystals compared with the pristine crystals and the gas-phase counterparts. For example, it stabilizes the multi-component molecular association by increasing the binding energy by 19-41% over the pristine crystals, improves the detonation performance by 5-10%, and reduces the sensitivity to external stimuli compared with the pure crystal or gas counterparts. Therefore, the solid nature of a cocrystal is not a simple combination of the pure crystalline properties of its components, and the heterogeneous molecular coupling effects must be considered to design improved functional cocrystals.
20
Geremia KL, Seybold PG. Computational estimation of the acidities of purines and indoles. J Mol Model 2019; 25:12. [PMID: 30607649 DOI: 10.1007/s00894-018-3892-4] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.4] [Received: 08/01/2018] [Accepted: 12/04/2018] [Indexed: 10/27/2022]
Abstract
Purines and related compounds are central ingredients in the genetic code and form the structural framework for many drugs and other bioactive compounds. A key feature of these compounds is their acidity, as expressed by their pKa values. For a proper understanding of the behaviors of these compounds, it is important to have a theoretical means for estimating their acidities. Here we present a quantum-chemical quantitative structure-activity relationship (QSAR) study of these compounds aimed at estimating the aqueous pKa values of purines and related compounds based on the energy differences in solution ΔE(H2O) between the parent compounds and their dissociation products. This method was applied to both the cation → neutral (pKa1) and neutral → anion (pKa2) dissociations of the compounds. Computations were performed using density functional theory at the B3LYP/6-31+G** level with the SM8 aqueous solvent model. Good-quality QSAR regression equations were obtained for both dissociations using the ΔE(H2O) descriptor. These equations were applied to estimate missing pKa values for compounds in this category, and should also be applicable to the acidities of other related heterocyclic compounds.
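A single-descriptor QSAR of the kind described above amounts to fitting a straight line, pKa ≈ a·ΔE(H2O) + b, by ordinary least squares. A minimal sketch follows; the function name `fit_line` and the numbers are placeholders chosen to lie exactly on a known line, not values from the study.

```python
def fit_line(x, y):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of squared deviations in x and the x-y cross term.
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Placeholder descriptor values and "experimental" pKa values (illustrative only).
delta_e = [0.0, 1.0, 2.0, 3.0]
pka = [1.0, 3.0, 5.0, 7.0]

slope, intercept = fit_line(delta_e, pka)

# Estimate a missing pKa from a new (hypothetical) descriptor value.
predicted = slope * 4.0 + intercept
```

Once the line is fitted on compounds with known pKa, the same equation can be applied to compounds whose pKa is missing, which is how the abstract describes the regression being used.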
Affiliation(s)
- Kara L Geremia
- Department of Chemistry, Wright State University, Dayton, OH, 45435, USA
- Paul G Seybold
- Department of Chemistry, Wright State University, Dayton, OH, 45435, USA
21
Al-Fakih AM, Algamal ZY, Lee MH, Aziz M. A penalized quantitative structure-property relationship study on melting point of energetic carbocyclic nitroaromatic compounds using adaptive bridge penalty. SAR and QSAR in Environmental Research 2018; 29:339-353. [PMID: 29493376 DOI: 10.1080/1062936x.2018.1439531] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Received: 12/02/2017] [Accepted: 02/07/2018] [Indexed: 06/08/2023]
Abstract
A penalized quantitative structure-property relationship (QSPR) model with adaptive bridge penalty for predicting the melting points of 92 energetic carbocyclic nitroaromatic compounds is proposed. To ensure the consistency of the descriptor selection of the proposed penalized adaptive bridge (PBridge), we proposed a ridge estimator ([Formula: see text]) as an initial weight in the adaptive bridge penalty. The Bayesian information criterion was applied to ensure the accurate selection of the tuning parameter ([Formula: see text]). The PBridge based model was internally and externally validated based on [Formula: see text], [Formula: see text], [Formula: see text], [Formula: see text], [Formula: see text], [Formula: see text], the Y-randomization test, [Formula: see text], [Formula: see text], [Formula: see text], [Formula: see text] and the applicability domain. The validation results indicate that the model is robust and not due to chance correlation. The descriptor selection and prediction performance of PBridge for the training dataset outperforms the other methods used. PBridge shows the highest [Formula: see text] of 0.959, [Formula: see text] of 0.953, [Formula: see text] of 0.949 and [Formula: see text] of 0.959, and the lowest [Formula: see text] and [Formula: see text]. For the test dataset, PBridge shows a higher [Formula: see text] of 0.945 and [Formula: see text] of 0.948, and a lower [Formula: see text] and [Formula: see text], indicating its better prediction performance. The results clearly reveal that the proposed PBridge is useful for constructing reliable and robust QSPRs for predicting melting points prior to synthesizing new organic compounds.
Affiliation(s)
- A M Al-Fakih
- Faculty of Science, Department of Chemistry, Universiti Teknologi Malaysia, Johor, Malaysia
- Faculty of Science, Department of Chemistry, Sana'a University, Sana'a, Yemen
- Z Y Algamal
- Department of Statistics and Informatics, University of Mosul, Mosul, Iraq
- M H Lee
- Faculty of Science, Department of Mathematical Sciences, Universiti Teknologi Malaysia, Johor, Malaysia
- M Aziz
- Faculty of Science, Department of Chemistry, Universiti Teknologi Malaysia, Johor, Malaysia
- Advanced Membrane Technology Centre, Universiti Teknologi Malaysia, Johor, Malaysia
22
Raevsky OA, Grigorev VY, Polianczyk DE, Raevskaja OE, Dearden JC. Six global and local QSPR models of aqueous solubility at pH = 7.4 based on structural similarity and physicochemical descriptors. SAR and QSAR in Environmental Research 2017; 28:661-676. [PMID: 28891683 DOI: 10.1080/1062936x.2017.1368704] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.6] [Received: 06/14/2017] [Accepted: 08/14/2017] [Indexed: 06/07/2023]
Abstract
Aqueous solubility at pH = 7.4 is a very important property for medicinal chemists because this is the pH value of physiological media. The present work describes the application of three different methods (support vector machine (SVM), random forest (RF) and multiple linear regression (MLR)) and three local quantitative structure-property relationship (QSPR) models (regression corrected by nearest neighbours (RCNN), arithmetic mean property (AMP) and local regression property (LoReP)) to construct stable QSPRs with clear mechanistic interpretation. Our data set contained experimental values of aqueous solubility at pH = 7.4 of 387 chemicals (349 in the training set and 38 in the test set including 16 own measurements). The initial descriptor pool contained 210 physicochemical descriptors, calculated from the HYBOT, DRAGON, SYBYL and VolSurf+ programs. Six QSPRs with good statistics based on fundamentals of aqueous solubility and optimization of descriptor space were obtained. Those models have an RMSE close to experimental error (0.70), and are amenable to physical interpretation. The QSPR models developed in this study may be useful for medicinal chemists. Global MLR, RF and SVM models may be valuable for consideration of common factors that influence solubility. The RCNN, AMP and LoReP local models may be helpful for the optimization of aqueous solubility in small sets of related chemicals.
Affiliation(s)
- O A Raevsky
- Department of Computer-Aided Molecular Design, Russian Academy of Science, Chernogolovka, Russia
- V Y Grigorev
- Department of Computer-Aided Molecular Design, Russian Academy of Science, Chernogolovka, Russia
- D E Polianczyk
- Department of Computer-Aided Molecular Design, Russian Academy of Science, Chernogolovka, Russia
- O E Raevskaja
- Department of Computer-Aided Molecular Design, Russian Academy of Science, Chernogolovka, Russia
- J C Dearden
- School of Pharmacy and Biomolecular Sciences, Liverpool John Moores University, Liverpool, UK
23
Korotkov AS, Gravit M. 3D-map modelling for the melting points prediction of intumescent flame-retardant coatings. SAR and QSAR in Environmental Research 2017; 28:677-689. [PMID: 28884596 DOI: 10.1080/1062936x.2017.1370725] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/28/2017] [Accepted: 08/20/2017] [Indexed: 06/07/2023]
Abstract
The applicability of 3D map modelling for melting point prediction was studied. The melting points in the ammonium polyphosphate-pentaerythritol-melamine chemical system of intumescent flame-retardant coatings over a wide range of concentrations were collected. The ternary diagram (triangle) of the melting points was plotted and an approximated 3D map was built for the range 205-345°C. The present work contains the thermal data for the observed ternary system and provides a new graphic system for making predictions for intumescent flame-retardant coatings. The applicability of the calculated 3D map for obtaining experimental samples of fire-retardant paints with a low melting point for thin steel constructions was shown.
Affiliation(s)
- A S Korotkov
- Laboratory of Optical Materials and Structures, Institute of Semiconductor Physics, Novosibirsk, Russia
- LLC, ChemCenter, Research and Development Department, Novosibirsk, Russia
- M Gravit
- St. Petersburg State Polytechnical University, Department of Construction of Unique Buildings and Structures, Saint-Petersburg, Russia
24
Conformations of n-alkyl-α/β-D-glucopyranoside surfactants: Impact on molecular properties. Comput Theor Chem 2017. [DOI: 10.1016/j.comptc.2016.12.020] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.7] [Indexed: 11/22/2022]
25
Abstract
It is widely accepted that modern QSAR began in the early 1960s. However, as long ago as 1816 scientists were making predictions about physical and chemical properties. The first investigations into the correlation of biological activities with physicochemical properties such as molecular weight and aqueous solubility began in 1841, almost 60 years before the important work of Overton and Meyer linking aquatic toxicity to lipid-water partitioning. Throughout the 20th century QSAR progressed, though there were many lean years. In 1962 came the seminal work of Corwin Hansch and co-workers, which stimulated a huge interest in the prediction of biological activities. Initially that interest lay largely within medicinal chemistry and drug design, but in the 1970s and 1980s, with increasing ecotoxicological concerns, QSAR modelling of environmental toxicities began to grow, especially once regulatory authorities became involved. Since then QSAR has continued to expand, with over 1400 publications annually from 2011 onwards.
26
Jafari M, Keshavarz MH, Noorbala MR, Kamalvand M. A Reliable Method for Prediction of the Condensed Phase Enthalpy of Formation of High Nitrogen Content Materials through their Gas Phase Information. ChemistrySelect 2016. [DOI: 10.1002/slct.201601184] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.5] [Indexed: 11/05/2022]
Affiliation(s)
- Mohammad Jafari
- Department of Chemistry; Malek-ashtar University of Technology; Shahin-shahr, P.O. Box 83145/115 Islamic Republic of Iran
- Mohammad Hossein Keshavarz
- Department of Chemistry; Malek-ashtar University of Technology; Shahin-shahr, P.O. Box 83145/115 Islamic Republic of Iran
- Mohammad Reza Noorbala
- Department of Chemistry, Faculty of Science; Yazd University; Yazd, P.O. Box 89195/741 Islamic Republic of Iran
- Mohammad Kamalvand
- Department of Chemistry, Faculty of Science; Yazd University; Yazd, P.O. Box 89195/741 Islamic Republic of Iran
27
Bergström CAS, Charman WN, Porter CJH. Computational prediction of formulation strategies for beyond-rule-of-5 compounds. Adv Drug Deliv Rev 2016; 101:6-21. [PMID: 26928657 DOI: 10.1016/j.addr.2016.02.005] [Citation(s) in RCA: 106] [Impact Index Per Article: 13.3] [Received: 12/21/2015] [Revised: 02/11/2016] [Accepted: 02/17/2016] [Indexed: 12/12/2022]
Abstract
The physicochemical properties of some contemporary drug candidates are moving towards higher molecular weight, and coincidentally also higher lipophilicity in the quest for biological selectivity and specificity. These physicochemical properties move the compounds towards beyond rule-of-5 (B-r-o-5) chemical space and often result in lower water solubility. For such B-r-o-5 compounds non-traditional delivery strategies (i.e. those other than conventional tablet and capsule formulations) typically are required to achieve adequate exposure after oral administration. In this review, we present the current status of computational tools for prediction of intestinal drug absorption, models for prediction of the most suitable formulation strategies for B-r-o-5 compounds and models to obtain an enhanced understanding of the interplay between drug, formulation and physiological environment. In silico models are able to identify the likely molecular basis for low solubility in physiologically relevant fluids such as gastric and intestinal fluids. With this baseline information, a formulation scientist can, at an early stage, evaluate different orally administered, enabling formulation strategies. Recent computational models have emerged that predict glass-forming ability and crystallisation tendency and therefore the potential utility of amorphous solid dispersion formulations. Further, computational models of loading capacity in lipids, and therefore the potential for formulation as a lipid-based formulation, are now available. Whilst such tools are useful for rapid identification of suitable formulation strategies, they do not reveal drug localisation and molecular interaction patterns between drug and excipients. For the latter, Molecular Dynamics simulations provide an insight into the interplay between drug, formulation and intestinal fluid. These different computational approaches are reviewed. 
Additionally, we analyse the molecular requirements of different targets, since these can provide an early signal that enabling formulation strategies will be required. Based on the analysis we conclude that computational biopharmaceutical profiling can be used to identify where non-conventional gateways, such as prediction of 'formulate-ability' during lead optimisation and early development stages, are important and may ultimately increase the number of orally tractable contemporary targets.
Affiliation(s)
- Christel A S Bergström
- Drug Delivery, Disposition and Dynamics, Monash Institute of Pharmaceutical Sciences, Monash University, 381 Royal Parade, Parkville, Victoria 3052, Australia; Department of Pharmacy, Uppsala University, Uppsala Biomedical Center, P.O. Box 580, SE-751 23 Uppsala, Sweden
- William N Charman
- Drug Delivery, Disposition and Dynamics, Monash Institute of Pharmaceutical Sciences, Monash University, 381 Royal Parade, Parkville, Victoria 3052, Australia
- Christopher J H Porter
- Drug Delivery, Disposition and Dynamics, Monash Institute of Pharmaceutical Sciences, Monash University, 381 Royal Parade, Parkville, Victoria 3052, Australia; ARC Centre of Excellence in Convergent Nano-Bio Science and Technology, Monash Institute of Pharmaceutical Sciences, Monash University, 381 Royal Parade, Parkville, Victoria 3052, Australia
28
Tetko IV, Lowe DM, Williams AJ. The development of models to predict melting and pyrolysis point data associated with several hundred thousand compounds mined from PATENTS. J Cheminform 2016; 8:2. [PMID: 26807157 PMCID: PMC4724158 DOI: 10.1186/s13321-016-0113-y] [Citation(s) in RCA: 50] [Impact Index Per Article: 6.3] [Received: 08/31/2015] [Accepted: 01/08/2016] [Indexed: 11/18/2022]
Abstract
BACKGROUND Melting point (MP) is an important property with regard to the solubility of chemical compounds. Its prediction from chemical structure remains a highly challenging task for quantitative structure-activity relationship studies. Success in this area of research critically depends on the availability of high quality MP data as well as accurate chemical structure representations in order to develop models. Currently, available datasets for MP predictions have been limited to around 50k molecules, while far more data are routinely generated following the synthesis of novel materials. Significant amounts of MP data are freely available within the patent literature and, if they were available in the appropriate form, could potentially be used to develop predictive models. RESULTS We have developed a pipeline for the automated extraction and annotation of chemical data from published PATENTS. Almost 300,000 data points have been collected and used to develop models to predict melting and pyrolysis (decomposition) points using tools available on the OCHEM modeling platform (http://ochem.eu). A number of technical challenges were simultaneously solved to develop models based on these data. These included the handling of sparse data matrices with >200,000,000,000 entries and parallel calculations using 32 × 6 cores per task with 13 descriptor sets totaling more than 700,000 descriptors. We showed that models developed using data collected from PATENTS had similar or better prediction accuracy compared to the highly curated data used in previous publications. The separation of data for chemicals that decomposed rather than melting, from compounds that did undergo a normal melting transition, was performed and models for both pyrolysis and MPs were developed. The accuracy of the consensus MP models for molecules from the drug-like region of chemical space was similar to their estimated experimental accuracy, 32 °C.
Last but not least, important structural features related to the pyrolysis of chemicals were identified, and a model to predict whether a compound will decompose instead of melting was developed. CONCLUSIONS We have shown that automated tools for the analysis of chemical information have reached a mature stage allowing for the extraction and collection of high quality data to enable the development of structure-activity relationship models. The developed models and data are publicly available at http://ochem.eu/article/99826.
Affiliation(s)
- Igor V. Tetko
- Institute of Structural Biology, Helmholtz Zentrum München für Gesundheit und Umwelt (HMGU), Ingolstädter Landstraße 1, b. 60w, 85764 Neuherberg, Germany
- BigChem GmbH, 85764 Neuherberg, Germany
- Daniel M. Lowe
- NextMove Software Limited, Innovation Centre (Unit 23), Cambridge Science Park, Cambridge, CB4 0EY UK
29
Nieto-Draghi C, Fayet G, Creton B, Rozanska X, Rotureau P, de Hemptinne JC, Ungerer P, Rousseau B, Adamo C. A General Guidebook for the Theoretical Prediction of Physicochemical Properties of Chemicals for Regulatory Purposes. Chem Rev 2015; 115:13093-164. [PMID: 26624238 DOI: 10.1021/acs.chemrev.5b00215] [Citation(s) in RCA: 70] [Impact Index Per Article: 7.8] [Indexed: 12/13/2022]
Affiliation(s)
- Carlos Nieto-Draghi
- IFP Energies nouvelles, 1 et 4 avenue de Bois-Préau, 92852 Rueil-Malmaison, France
- Guillaume Fayet
- INERIS, Parc Technologique Alata, BP2, 60550 Verneuil-en-Halatte, France
- Benoit Creton
- IFP Energies nouvelles, 1 et 4 avenue de Bois-Préau, 92852 Rueil-Malmaison, France
- Xavier Rozanska
- Materials Design S.A.R.L., 18, rue de Saisset, 92120 Montrouge, France
- Patricia Rotureau
- INERIS, Parc Technologique Alata, BP2, 60550 Verneuil-en-Halatte, France
- Philippe Ungerer
- Materials Design S.A.R.L., 18, rue de Saisset, 92120 Montrouge, France
- Bernard Rousseau
- Laboratoire de Chimie-Physique, Université Paris Sud, UMR 8000 CNRS, Bât. 349, 91405 Orsay Cedex, France
- Carlo Adamo
- Institut de Recherche Chimie Paris, PSL Research University, CNRS, Chimie Paristech, 11 rue P. et M. Curie, F-75005 Paris, France; Institut Universitaire de France, 103 Boulevard Saint Michel, F-75005 Paris, France
30
Gaudin T, Rotureau P, Fayet G. Mixture Descriptors toward the Development of Quantitative Structure–Property Relationship Models for the Flash Points of Organic Mixtures. Ind Eng Chem Res 2015. [DOI: 10.1021/acs.iecr.5b01457] [Citation(s) in RCA: 42] [Impact Index Per Article: 4.7] [Indexed: 11/29/2022]
Affiliation(s)
- Théophile Gaudin
- INERIS, Parc Technologique Alata, BP2, 60550 Verneuil-en-Halatte, France
- Patricia Rotureau
- INERIS, Parc Technologique Alata, BP2, 60550 Verneuil-en-Halatte, France
- Guillaume Fayet
- INERIS, Parc Technologique Alata, BP2, 60550 Verneuil-en-Halatte, France
31
Husch T, Yilmazer ND, Balducci A, Korth M. Large-scale virtual high-throughput screening for the identification of new battery electrolyte solvents: computing infrastructure and collective properties. Phys Chem Chem Phys 2015; 17:3394-401. [DOI: 10.1039/c4cp04338c] [Citation(s) in RCA: 48] [Impact Index Per Article: 5.3] [Indexed: 11/21/2022]
Abstract
A volunteer computing approach is presented for the purpose of screening a large number of molecular structures with respect to their suitability as new battery electrolyte solvents.
Affiliation(s)
- Tamara Husch
- Institute for Theoretical Chemistry, Ulm University, 89069 Ulm, Germany
- Martin Korth
- Institute for Theoretical Chemistry, Ulm University, 89069 Ulm, Germany
32
Dearden JC, Rowe PH. Use of artificial neural networks in the QSAR prediction of physicochemical properties and toxicities for REACH legislation. Methods Mol Biol 2015; 1260:65-88. [PMID: 25502376 DOI: 10.1007/978-1-4939-2239-0_5] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.7] [Indexed: 05/25/2023]
Abstract
With the introduction of the REACH legislation in the European Union, there is a requirement for property and toxicity data on chemicals produced in or imported into the EU at levels of 1 tonne/year or more. This has meant an increase in the in silico prediction of such data. One of the chief predictive approaches is QSAR (quantitative structure-activity relationships), which is widely used in many fields. A QSAR approach that is increasingly being used is that of artificial neural networks (ANNs), and this chapter discusses its application to the range of physicochemical properties and toxicities required by REACH. ANNs generally outperform the main QSAR approach of multiple linear regression (MLR), although other approaches such as support vector machines sometimes outperform ANNs. Most ANN QSARs reported to date comply with only two of the five OECD Guidelines for the Validation of (Q)SARs.
Affiliation(s)
- John C Dearden
- School of Pharmacy & Biomolecular Sciences, Liverpool John Moores University, Byrom Street, Liverpool, L3 3AF, UK
33
Carroll FA, Brown DM, Quina FH. Predicting Boiling Points and Flash Points of Monochloroalkanes from Structure. Ind Eng Chem Res 2014. [DOI: 10.1021/ie503162h] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 11/30/2022]
Affiliation(s)
- Felix A. Carroll
- Department of Chemistry, Davidson College, Davidson, North Carolina 28035, United States
- David M. Brown
- Department of Chemistry, Davidson College, Davidson, North Carolina 28035, United States
- Frank H. Quina
- Instituto de Química, Universidade de São Paulo, C.P. 26077, São Paulo 05513-970, Brazil
34
Tetko IV, Sushko Y, Novotarskyi S, Patiny L, Kondratov I, Petrenko AE, Charochkina L, Asiri AM. How accurately can we predict the melting points of drug-like compounds? J Chem Inf Model 2014; 54:3320-9. [PMID: 25489863 PMCID: PMC4702524 DOI: 10.1021/ci5005288] [Citation(s) in RCA: 58] [Impact Index Per Article: 5.8] [Indexed: 01/31/2023]
Abstract
This article contributes a highly accurate model for predicting the melting points (MPs) of medicinal chemistry compounds. The model was developed using the largest published data set, comprising more than 47k compounds. The distributions of MPs in drug-like and drug lead sets showed that >90% of molecules melt within [50,250]°C. The final model achieved an RMSE of less than 33 °C for molecules from this temperature interval, which is the most important for medicinal chemistry users. This performance was achieved using a consensus model that predicted MPs with significantly higher accuracy than the individual models. We found that compounds with reactive and unstable groups were overrepresented among outlying compounds. These compounds could decompose during storage or measurement, thus introducing experimental errors. While filtering the data by removing outliers generally increased the accuracy of individual models, it did not significantly affect the results of the consensus models. The three distance-to-model measures analyzed did not allow us to flag molecules whose MP values fell outside the applicability domain of the model. We believe that this negative result and the public availability of data from this article will encourage future studies to develop better approaches to define the applicability domain of models. The final model, MP data, and identified reactive groups are available online at http://ochem.eu/article/55638.
Affiliation(s)
- Igor V Tetko
- Helmholtz-Zentrum München - German Research Centre for Environmental Health (GmbH), Institute of Structural Biology, Munich 85764, Germany
35
Pan Y, Zhang Y, Jiang J, Ding L. Prediction of the self-accelerating decomposition temperature of organic peroxides using the quantitative structure–property relationship (QSPR) approach. J Loss Prev Process Ind 2014. [DOI: 10.1016/j.jlp.2014.06.007] [Citation(s) in RCA: 20] [Impact Index Per Article: 2.0] [Indexed: 01/27/2023]
36
Prana V, Rotureau P, Fayet G, André D, Hub S, Vicot P, Rao L, Adamo C. Prediction of the thermal decomposition of organic peroxides by validated QSPR models. Journal of Hazardous Materials 2014; 276:216-224. [PMID: 24887124 DOI: 10.1016/j.jhazmat.2014.05.009] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.0] [Received: 02/25/2014] [Revised: 04/15/2014] [Accepted: 05/05/2014] [Indexed: 06/03/2023]
Abstract
Organic peroxides are unstable chemicals which can easily decompose and may lead to explosion. Such a process can be characterized by physico-chemical parameters such as heat and temperature of decomposition, whose determination is crucial to manage related hazards. These thermal stability properties are also required within many regulatory frameworks related to chemicals in order to assess their hazardous properties. In this work, new quantitative structure-property relationship (QSPR) models were developed to accurately predict the thermal stability of organic peroxides from their molecular structure, in line with the OECD guidelines for regulatory acceptability of QSPRs. Based on the acquisition of 38 reference experimental data points using DSC (differential scanning calorimetry) apparatus under homogeneous experimental conditions, multi-linear models were derived for the prediction of the decomposition heat and the onset temperature using different types of molecular descriptors. Models were tested by internal and external validation tests and their applicability domains were defined and analyzed. Being rigorously validated, they presented the best performances in terms of fitting, robustness and predictive power, and the descriptors used in these models were linked to the peroxide bond, whose breaking represents the main decomposition mechanism of organic peroxides.
Affiliation(s)
- Vinca Prana
- Institut de Recherche de Chimie Paris, Chimie ParisTech CNRS, 11 rue P. et M. Curie, Paris 75005, France; Institut National de l'Environnement Industriel et des Risques (INERIS), Parc Technologique Alata, BP2, Verneuil-en-Halatte 60550, France
- Patricia Rotureau
- Institut National de l'Environnement Industriel et des Risques (INERIS), Parc Technologique Alata, BP2, Verneuil-en-Halatte 60550, France
- Guillaume Fayet
- Institut National de l'Environnement Industriel et des Risques (INERIS), Parc Technologique Alata, BP2, Verneuil-en-Halatte 60550, France
- David André
- ARKEMA, rue Henri Moissan, BP63, Pierre Benite 69493, France
- Serge Hub
- ARKEMA, rue Henri Moissan, BP63, Pierre Benite 69493, France
- Patricia Vicot
- Institut National de l'Environnement Industriel et des Risques (INERIS), Parc Technologique Alata, BP2, Verneuil-en-Halatte 60550, France
- Li Rao
- Institut de Recherche de Chimie Paris, Chimie ParisTech CNRS, 11 rue P. et M. Curie, Paris 75005, France
- Carlo Adamo
- Institut de Recherche de Chimie Paris, Chimie ParisTech CNRS, 11 rue P. et M. Curie, Paris 75005, France; Institut Universitaire de France, 103 Boulevard Saint Michel, Paris F-75005, France
|
37
|
Stenzel A, Goss KU, Endo S. Prediction of partition coefficients for complex environmental contaminants: Validation of COSMOtherm, ABSOLV, and SPARC. Environmental Toxicology and Chemistry 2014; 33:1537-43. [PMID: 24668883 DOI: 10.1002/etc.2587] [Citation(s) in RCA: 31] [Impact Index Per Article: 3.1] [Received: 01/20/2014] [Revised: 03/04/2014] [Accepted: 03/20/2014] [Indexed: 05/26/2023]
Abstract
Prediction of partition coefficients is essential for screening of environmentally relevant compounds. Prediction methods using only the molecular structure as input are especially useful for this purpose. In the present study, the authors validated 3 prediction methods (COSMOtherm, ABSOLV, and SPARC), which are based on more mechanistic approaches than most other quantitative structure-activity relationships. Validation was based on a consistent experimental data set of up to 270 compounds, mostly pesticides and flame retardants. The validation systems included 3 gas chromatographic (GC) columns and 4 liquid/liquid systems that represent all relevant types of intermolecular interactions. Results revealed that the overall prediction accuracy of COSMOtherm and ABSOLV is comparable, whereas SPARC performance is substantially lower than that of the other methods. For instance, the root mean squared error for the 4 liquid/liquid partition coefficients was 0.65 log units to 0.93 log units for COSMOtherm, 0.64 log units to 0.95 log units for ABSOLV, and 1.43 log units to 2.85 log units for SPARC. In addition, version and parameterization influences of COSMOtherm on the prediction accuracy were determined.
Affiliation(s)
- Angelika Stenzel
- Helmholtz Centre for Environmental Research UFZ, Leipzig, Germany
|
38
|
Fayet G, Rotureau P. Development of simple QSPR models for the impact sensitivity of nitramines. J Loss Prev Process Ind 2014. [DOI: 10.1016/j.jlp.2014.04.005] [Citation(s) in RCA: 26] [Impact Index Per Article: 2.6] [Indexed: 11/16/2022]
|
39
|
Fayet G, Rotureau P, Minisini B. Decomposition mechanisms of trinitroalkyl compounds: a theoretical study from aliphatic to aromatic nitro compounds. Phys Chem Chem Phys 2014; 16:6614-22. [PMID: 24569436 DOI: 10.1039/c3cp54719a] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.5] [Indexed: 12/18/2022]
Abstract
The chemical mechanisms involved in the decomposition of trinitroethyl compounds were studied for both aliphatic and aromatic derivatives using density functional theory calculations. First, in the case of 1,1,1-trinitrobutane, used as a reference molecule, two primary channels were highlighted among the five investigated: the breaking of the C-N bond and the HONO elimination. Then, the influence of various structural parameters on these two reactions was studied by changing the length of the carbon chain and by adding substituents or double bonds along the carbon chain. Although slight changes in activation energies were observed for most of these features, the competition between the two investigated reactions was not modified, and the breaking of the C-N bond remained the favoured mechanism. Finally, the reactions involving the trinitroalkyl fragments were found to be more competitive than reactions involving nitro groups linked to aromatic cycles in two aromatic systems (4-(1,1,1-trinitrobutyl)-nitrobenzene and 2-(1,1,1-trinitrobutyl)-nitrobenzene). This showed that aromatic nitro compounds with trinitroalkyl derivatives decompose from their alkyl part and may be regarded as aliphatic rather than aromatic with respect to the initiation of their decomposition process.
Affiliation(s)
- Guillaume Fayet
- Institut National de l'Environnement Industriel et des Risques (INERIS), Parc Technologique Alata, BP2, 60550 Verneuil-en-Halatte, France.
|
40
|
Raevsky OA, Grigor'ev VY, Polianczyk DE, Raevskaja OE, Dearden JC. Calculation of aqueous solubility of crystalline un-ionized organic chemicals and drugs based on structural similarity and physicochemical descriptors. J Chem Inf Model 2014; 54:683-91. [PMID: 24456022 DOI: 10.1021/ci400692n] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.9] [Indexed: 11/30/2022]
Abstract
Solubilities of crystalline organic compounds calculated according to AMP (arithmetic mean property) and LoReP (local one-parameter regression) models based on structural and physicochemical similarities are presented. We used data on water solubility of 2615 compounds in un-ionized form measured at 25±5 °C. The calculation results were compared with the equation based on the experimental data for lipophilicity and melting point. According to statistical criteria, the model based on structural and physicochemical similarities showed a better fit with the experimental data. An additional advantage of this model is that it uses only theoretical descriptors, and this provides means for calculating water solubility for both existing and not yet synthesized compounds.
Affiliation(s)
- Oleg A Raevsky
- Institute of Physiologically Active Compounds, Russian Academy of Science, Chernogolovka, Russia
|