1
Xiao Z, Zhu M, Chen J, You Z. Integrated Transfer Learning and Multitask Learning Strategies to Construct Graph Neural Network Models for Predicting Bioaccumulation Parameters of Chemicals. ENVIRONMENTAL SCIENCE & TECHNOLOGY 2024. [PMID: 39051472 DOI: 10.1021/acs.est.4c02421] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 07/27/2024]
Abstract
Accurate prediction of parameters related to the environmental exposure of chemicals is crucial for the sound management of chemicals. However, the lack of large data sets for training models may result in poor prediction accuracy and robustness. Herein, integrated transfer learning (TL) and multitask learning (MTL) was proposed for constructing a graph neural network (GNN) model (abbreviated as TL-MTL-GNN model) using n-octanol/water partition coefficients as a source domain. The TL-MTL-GNN model was trained to predict three bioaccumulation parameters based on enlarged data sets that cover 2496 compounds with at least one bioaccumulation parameter. Results show that the TL-MTL-GNN model outperformed single-task GNN models with and without the TL, as well as conventional machine learning models trained with molecular descriptors or fingerprints. Applicability domains were characterized by a state-of-the-art structure-activity landscape-based (abbreviated as ADSAL) methodology. The TL-MTL-GNN model coupled with the optimal ADSAL was employed to predict bioaccumulation parameters for around 60,000 chemicals, with more than 13,000 compounds identified as bioaccumulative chemicals. The high predictive accuracy and robustness of the TL-MTL-GNN model demonstrate the feasibility of integrating the TL and MTL strategy in modeling small-sized data sets. The strategy holds significant potential for addressing small data challenges in modeling environmental chemicals.
Affiliation(s)
- Zijun Xiao
- Key Laboratory of Industrial Ecology and Environmental Engineering (Ministry of Education), Dalian Key Laboratory on Chemicals Risk Control and Pollution Prevention Technology, School of Environmental Science and Technology, Dalian University of Technology, Dalian 116024, China
- Minghua Zhu
- Key Laboratory of Industrial Ecology and Environmental Engineering (Ministry of Education), Dalian Key Laboratory on Chemicals Risk Control and Pollution Prevention Technology, School of Environmental Science and Technology, Dalian University of Technology, Dalian 116024, China
- Key Laboratory of Integrated Regulation and Resources Development of Shallow Lakes of Ministry of Education, College of Environment, Hohai University, Nanjing 210098, China
- Jingwen Chen
- Key Laboratory of Industrial Ecology and Environmental Engineering (Ministry of Education), Dalian Key Laboratory on Chemicals Risk Control and Pollution Prevention Technology, School of Environmental Science and Technology, Dalian University of Technology, Dalian 116024, China
- Zecang You
- Key Laboratory of Industrial Ecology and Environmental Engineering (Ministry of Education), Dalian Key Laboratory on Chemicals Risk Control and Pollution Prevention Technology, School of Environmental Science and Technology, Dalian University of Technology, Dalian 116024, China
2
Zhu J, Huang Y, Yi Q, Bu L, Zhou S, Shi Z. Predicting reactivity dynamics of halogen species and trace organic contaminants using machine learning models. CHEMOSPHERE 2024; 346:140659. [PMID: 37949193 DOI: 10.1016/j.chemosphere.2023.140659] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 06/22/2023] [Revised: 11/04/2023] [Accepted: 11/06/2023] [Indexed: 11/12/2023]
Abstract
Reactions of reactive halogen species (Cl•, Br•, and Cl2•-) with trace organic contaminants (TrOCs) have received much attention in recent years, and their k values are fundamental parameters for understanding their reaction mechanisms. However, k values are usually unknown. In this study, we developed machine learning (ML)-based quantitative structure-activity relationship (QSAR) models to predict k values. We tested five algorithms, namely, random forest, neural network, XGBoost, support vector machine (SVM), and multilinear regression, using molecular descriptors (MDs) and molecular fingerprints (MFs) as inputs. The optimal algorithms were MD-XGBoost for Cl• and Br•, and MF-SVM for Cl2•-, respectively, with R2test values of 0.876, 0.743, and 0.853. We found that electron-withdrawing/donating groups tended to interfere with the reactivity of Cl2•- more than Cl• and Br•. This explains why MFs are better inputs for predictive models of Cl2•-, whereas MDs are more suitable for Cl• and Br•. Furthermore, we interpreted the models using SHAP analysis, and the results indicated that our models accurately predicted k values both statistically and mechanistically. Our models provide useful tools for obtaining unknown k values and help researchers understand the inherent relationships between the models.
Affiliation(s)
- Jingyi Zhu
- Hunan Engineering Research Center of Water Security Technology and Application, College of Civil Engineering, Hunan University, Changsha, 410082, PR China
- Yuanxi Huang
- Hunan Engineering Research Center of Water Security Technology and Application, College of Civil Engineering, Hunan University, Changsha, 410082, PR China
- Qihang Yi
- Hunan University Design and Research Institute Co., Ltd., Changsha, 410082, PR China
- Lingjun Bu
- Hunan Engineering Research Center of Water Security Technology and Application, College of Civil Engineering, Hunan University, Changsha, 410082, PR China.
- Shiqing Zhou
- Hunan Engineering Research Center of Water Security Technology and Application, College of Civil Engineering, Hunan University, Changsha, 410082, PR China
- Zhou Shi
- Hunan Engineering Research Center of Water Security Technology and Application, College of Civil Engineering, Hunan University, Changsha, 410082, PR China
3
Pandey V. Prediction of Environmental Fate and Toxicity of Insecticides Using Multi-Target QSAR Approach. Chem Biodivers 2024; 21:e202301213. [PMID: 38109053 DOI: 10.1002/cbdv.202301213] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/12/2023] [Accepted: 12/03/2023] [Indexed: 12/19/2023]
Abstract
Ecotoxicological risk assessments form the foundation of regulatory decisions for industrial chemicals used in various sectors. In this study, a multi-target QSAR (Mt-QSAR) model, built with a backpropagation neural network trained by the Levenberg-Marquardt (LM) algorithm, was constructed to be statistically robust, easily interpretable, and highly predictive on external data. It simultaneously predicts environmental fate, in the form of the octanol-water partition coefficient (LogP) and the bioconcentration factor (BCF), and acute oral toxicity in mammals and birds (LD50rat and LD50bird) for a wide range of chemical structural classes of insecticides. Principal component analysis was performed on descriptors selected by the SW-MLR method, and the selected PCs were used for constructing the SW-MLR-PCA-ANN model. The developed well-trained model (RMSE = 0.83, MPE = 0.004, CCC = 0.82, IIC = 0.78, R2 = 0.69) was statistically robust as indicated by the external validation parameters (RMSE = 0.93, MPE = 0.008, CCC = 0.77, IIC = 0.68, R2 = 0.61). The applicability domain (AD) of the developed Mt-QSAR model was also defined to identify the most reliable predictions. Finally, the missing values in the dataset for the aforementioned targets were predicted using the constructed Mt-QSAR model. The proposed approach can be used for simultaneous prediction of the environmental fate of new insecticides, especially ones that haven't been tested yet.
Affiliation(s)
- Vandana Pandey
- Department of Chemistry, Kurukshetra University, Kurukshetra, Haryana, 136119, India
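Two of the validation statistics quoted in the abstract above, RMSE and CCC, can be computed from paired observed and predicted values. The sketch below is illustrative only: it assumes CCC denotes Lin's concordance correlation coefficient, as is common in QSAR validation, and the function names and sample values are hypothetical rather than taken from the paper.

```python
import math


def rmse(observed, predicted):
    """Root-mean-square error between paired observed and predicted values."""
    n = len(observed)
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)


def concordance_cc(observed, predicted):
    """Lin's concordance correlation coefficient (assumed meaning of CCC).

    Uses population (divide-by-n) variances and covariance.
    """
    n = len(observed)
    mean_o = sum(observed) / n
    mean_p = sum(predicted) / n
    var_o = sum((o - mean_o) ** 2 for o in observed) / n
    var_p = sum((p - mean_p) ** 2 for p in predicted) / n
    cov = sum((o - mean_o) * (p - mean_p) for o, p in zip(observed, predicted)) / n
    return 2 * cov / (var_o + var_p + (mean_o - mean_p) ** 2)


# Hypothetical values: predictions perfectly correlated but shifted by +1.
obs = [1.0, 2.0, 3.0, 4.0]
shifted = [2.0, 3.0, 4.0, 5.0]
print(rmse(obs, shifted), concordance_cc(obs, shifted))
```

A constant offset leaves the Pearson correlation at 1 but lowers CCC, which is why CCC is often reported alongside R2 in external validation.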
4
Chen P, Hu Y, Chen G, Zhao N, Dou Z. Probing the bioconcentration and metabolism disruption of bisphenol A and its analogues in adult female zebrafish from integrated AutoQSAR and metabolomics studies. THE SCIENCE OF THE TOTAL ENVIRONMENT 2023; 905:167011. [PMID: 37704156 DOI: 10.1016/j.scitotenv.2023.167011] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/19/2023] [Revised: 08/31/2023] [Accepted: 09/10/2023] [Indexed: 09/15/2023]
Abstract
Many emerging bisphenol A (BPA) substitutes still await assessment of their bioconcentration and metabolic disruption potential. Computational methods, such as automated quantitative structure-activity relationship (AutoQSAR) modeling, are useful for filling data gaps in chemical risk assessment. It is not clear how AutoQSAR performs in predicting the bioconcentration factor (BCF) in adult zebrafish. Herein, AutoQSAR was used to predict the logBCFs of BPA, bisphenol AF (BPAF), bisphenol B, bisphenol F and bisphenol S (BPS). For the test set, a linear relationship was shown between the observed and predicted logBCFs with a slope of 0.97. The predicted logBCFs of these five bisphenols were quite close to their experimental data with a slope of 0.94, suggesting better performance than directed message passing neural networks and EPI Suite, with slopes of 0.69 and 0.61, respectively. Thus, AutoQSAR is powerful in modeling logBCFs in fish with minimal time and expertise. To link bioconcentration with metabolic effects, female zebrafish were exposed to BPA, BPAF and BPS for metabolomics analysis. BPA caused a significant disturbance in amino acid metabolism, while BPAF and BPS significantly altered another three metabolic pathways, showing chemical-specific responses. BPAF with the highest logBCF elicited the strongest metabolomic responses reflected by the metabolic effect level index, followed by BPA and BPS. Thus, BPAF and BPS elicited higher or similar metabolism disruption compared with BPA in female zebrafish, respectively, reflecting consequences of bioconcentration.
Affiliation(s)
- Pengyu Chen
- Jiangsu Province Engineering Research Center for Marine Bio-resources Sustainable Utilization, College of Oceanography, Hohai University, Nanjing 210024, China; Key Laboratory of Integrated Regulation and Resources Development of Shallow Lakes of Ministry of Education, Hohai University, Nanjing 210024, China.
- Yuxi Hu
- Jiangsu Province Engineering Research Center for Marine Bio-resources Sustainable Utilization, College of Oceanography, Hohai University, Nanjing 210024, China
- Geng Chen
- School of Pharmaceutical Science and Technology, Hangzhou Institute for Advanced Study, UCAS, Hangzhou 330106, China
- Na Zhao
- Jiangsu Province Engineering Research Center for Marine Bio-resources Sustainable Utilization, College of Oceanography, Hohai University, Nanjing 210024, China
- Zhichao Dou
- Jiangsu Province Engineering Research Center for Marine Bio-resources Sustainable Utilization, College of Oceanography, Hohai University, Nanjing 210024, China
5
Zhu T, Zhang Y, Li Y, Tao T, Tao C. Contribution of molecular structures and quantum chemistry technique to root concentration factor: An innovative application of interpretable machine learning. JOURNAL OF HAZARDOUS MATERIALS 2023; 459:132320. [PMID: 37604035 DOI: 10.1016/j.jhazmat.2023.132320] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/25/2023] [Revised: 08/03/2023] [Accepted: 08/15/2023] [Indexed: 08/23/2023]
Abstract
Root concentration factor (RCF) is a significant parameter to characterize uptake and accumulation of hazardous organic contaminants (HOCs) by plant roots. However, complex interactions among chemicals, plant roots and soil make it challenging to identify underlying mechanisms of uptake and accumulation of HOCs. Here, nine machine learning techniques were applied to investigate major factors controlling RCF based on variable combinations of molecular descriptors (MD), MACCS fingerprints, quantum chemistry descriptors (QCD) and three physicochemical properties related to chemical-soil-plant system. Compared to models with variables including MACCS fingerprints or solitary physicochemical properties, the XGBoost-6 model developed by the variable combination of MD, QCD and three physicochemical properties achieved the most remarkable performance, with R2 of 0.977. Model interpretation achieved by permutation variable importance and partial dependence plots revealed the vital importance of HOCs lipophilicity, lipid content of plant roots, soil organic matter content, the overall deformability and the molecular dispersive ability of HOCs for regulating RCF. The integration of MD and QCD with physicochemical properties could improve our knowledge of underlying mechanisms regarding HOCs accumulation in plant roots from innovative structural perspectives. Multiple variables combination-oriented performance improvement of model can be extended to other parameters prediction in environmental risk assessment field.
Affiliation(s)
- Tengyi Zhu
- School of Environmental Science and Engineering, Yangzhou University, Yangzhou 225127, Jiangsu, China.
- Yu Zhang
- School of Environmental Science and Engineering, Yangzhou University, Yangzhou 225127, Jiangsu, China
- Yi Li
- School of Environmental Science and Engineering, Yangzhou University, Yangzhou 225127, Jiangsu, China
- Tianyun Tao
- College of Agriculture, Yangzhou University, Yangzhou 225009, Jiangsu, China
- Cuicui Tao
- School of Environmental Science and Engineering, Yangzhou University, Yangzhou 225127, Jiangsu, China
6
Denaro G, Curcio L, Borri A, D'Orsi L, De Gaetano A. A dynamic integrated model for mercury bioaccumulation in marine organisms. ECOL INFORM 2023. [DOI: 10.1016/j.ecoinf.2023.102056] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/29/2023]
7
Yang L, Chen P, He K, Wang R, Chen G, Shan G, Zhu L. Predicting bioconcentration factor and estrogen receptor bioactivity of bisphenol a and its analogues in adult zebrafish by directed message passing neural networks. ENVIRONMENT INTERNATIONAL 2022; 169:107536. [PMID: 36152365 DOI: 10.1016/j.envint.2022.107536] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Received: 06/03/2022] [Revised: 08/23/2022] [Accepted: 09/19/2022] [Indexed: 06/16/2023]
Abstract
The bioconcentration factor (BCF) is a key parameter for bioavailability assessment of environmental pollutants in regulatory frameworks. The comparative toxicology and mechanism of action of congeners are also of concern. However, acquiring these data through field and laboratory experiments has limitations, while machine learning is emerging as a promising predictive tool to fill the gap. In this study, the Directed Message Passing Neural Network (DMPNN) was applied to predict logBCFs of bisphenol A (BPA) and its four analogues (bisphenol AF (BPAF), bisphenol B (BPB), bisphenol F (BPF) and bisphenol S (BPS)). For the test set, the Pearson correlation coefficient (PCC) and mean square error (MSE) were 0.85 and 0.52 respectively, suggesting a good predictive performance. The predicted logBCFs values by the DMPNN ranging from 0.35 (BPS) to 2.14 (BPAF) coincided well with those by the classical EPI Suite (BCFBAF model). Besides, estrogen receptor α (ERα) bioactivity of these bisphenols was also predicted well by the DMPNN, with a probability of 97.0 % (BPB) to 99.7 % (BPAF), which was validated by the extent of vitellogenin (VTG) induction in male zebrafish as a biomarker except BPS. Thus, with little need for expert knowledge, DMPNN is confirmed to be a useful tool to accurately predict logBCF and screen for estrogenic activity from molecular structures. Moreover, a gender difference was noted in the changes of three endpoints (logBCF, ER binding affinity and VTG levels), the rank order of which was BPAF > BPB > BPA > BPF > BPS consistently, and abnormal amino acid metabolism is featured as an omics signature of abnormal hormone protein expression.
Affiliation(s)
- Liping Yang
- Key Laboratory of Pollution Processes and Environmental Criteria, Ministry of Education, Tianjin Key Laboratory of Environmental Remediation and Pollution Control, College of Environmental Science and Engineering, Nankai University, Tianjin 300350, China
- Pengyu Chen
- Key Laboratory of Pollution Processes and Environmental Criteria, Ministry of Education, Tianjin Key Laboratory of Environmental Remediation and Pollution Control, College of Environmental Science and Engineering, Nankai University, Tianjin 300350, China; College of Oceanography, Hohai University, Nanjing 210098, China
- Keyan He
- Key Laboratory of Pollution Processes and Environmental Criteria, Ministry of Education, Tianjin Key Laboratory of Environmental Remediation and Pollution Control, College of Environmental Science and Engineering, Nankai University, Tianjin 300350, China
- Ruihan Wang
- College of Chemistry, Sichuan University, Chengdu, Sichuan 610064, China
- Geng Chen
- School of Pharmaceutical Science and Technology, Hangzhou Institute for Advanced Study, UCAS, Hangzhou 330106, China
- Guoqiang Shan
- Key Laboratory of Pollution Processes and Environmental Criteria, Ministry of Education, Tianjin Key Laboratory of Environmental Remediation and Pollution Control, College of Environmental Science and Engineering, Nankai University, Tianjin 300350, China.
- Lingyan Zhu
- Key Laboratory of Pollution Processes and Environmental Criteria, Ministry of Education, Tianjin Key Laboratory of Environmental Remediation and Pollution Control, College of Environmental Science and Engineering, Nankai University, Tianjin 300350, China
8
Bertato L, Chirico N, Papa E. Predicting the Bioconcentration Factor in Fish from Molecular Structures. TOXICS 2022; 10:toxics10100581. [PMID: 36287860 PMCID: PMC9610932 DOI: 10.3390/toxics10100581] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 08/28/2022] [Revised: 09/25/2022] [Accepted: 09/26/2022] [Indexed: 05/14/2023]
Abstract
The bioconcentration factor (BCF) is one of the metrics used to evaluate the potential of a substance to bioaccumulate into aquatic organisms. In this work, linear and non-linear regression QSARs were developed for the prediction of log BCF using different computational approaches, and starting from a large and structurally heterogeneous dataset. The new MLR-OLS and ANN regression models have good fitting with R2 values of 0.62 and 0.70, respectively, and comparable external predictivity with R2ext 0.64 and 0.65 (RMSEext of 0.78 and 0.76), respectively. Furthermore, linear and non-linear classification models were developed using the regulatory threshold BCF >2000. A class balanced subset was used to develop classification models which were applied to chemicals not used to create the QSARs. These classification models are characterized by external and internal accuracy up to 84% and 90%, respectively, and sensitivity and specificity up to 90% and 80%, respectively. QSARs presented in this work are validated according to regulatory requirements and their quality is in line with other tools available for the same endpoint and dataset, with the advantage of low complexity and easy application through the software QSAR-ME Profiler. These QSARs can be used as alternatives for, or in combination with, existing models to support bioaccumulation assessment procedures.
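The classification statistics in this abstract (accuracy, sensitivity, specificity) follow from applying the BCF > 2000 threshold to observed and predicted values. The sketch below is a minimal illustration of that bookkeeping, not the authors' workflow: the function names and sample log BCF values are hypothetical, and working on the log10 scale is an assumption.

```python
import math

BCF_THRESHOLD = 2000.0  # regulatory threshold named in the abstract
LOG_BCF_THRESHOLD = math.log10(BCF_THRESHOLD)  # about 3.30 on the log scale


def is_bioaccumulative(log_bcf):
    """Return True when a log10 BCF value exceeds the regulatory threshold."""
    return log_bcf > LOG_BCF_THRESHOLD


def classification_metrics(observed_log_bcf, predicted_log_bcf):
    """Accuracy, sensitivity and specificity of thresholded predictions."""
    tp = tn = fp = fn = 0
    for obs, pred in zip(observed_log_bcf, predicted_log_bcf):
        actual, guess = is_bioaccumulative(obs), is_bioaccumulative(pred)
        if actual and guess:
            tp += 1
        elif actual and not guess:
            fn += 1
        elif not actual and guess:
            fp += 1
        else:
            tn += 1
    total = tp + tn + fp + fn
    return {
        "accuracy": (tp + tn) / total,
        # Sensitivity: fraction of truly bioaccumulative chemicals flagged.
        "sensitivity": tp / (tp + fn) if (tp + fn) else float("nan"),
        # Specificity: fraction of non-bioaccumulative chemicals cleared.
        "specificity": tn / (tn + fp) if (tn + fp) else float("nan"),
    }


# Hypothetical log BCF values, for illustration only.
observed = [3.8, 3.5, 2.1, 1.0, 3.4, 2.9]
predicted = [3.6, 3.1, 2.4, 1.2, 3.5, 3.4]
metrics = classification_metrics(observed, predicted)
print(metrics)
```

Reporting sensitivity and specificity separately matters here because bioaccumulative chemicals are usually the minority class, which is also why the abstract mentions a class-balanced training subset.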
9
Liman W, Oubahmane M, Hdoufane I, Bjij I, Villemin D, Daoud R, Cherqaoui D, El Allali A. Monte Carlo Method and GA-MLR-Based QSAR Modeling of NS5A Inhibitors against the Hepatitis C Virus. Molecules 2022; 27:molecules27092729. [PMID: 35566079 PMCID: PMC9099611 DOI: 10.3390/molecules27092729] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Received: 03/03/2022] [Revised: 04/13/2022] [Accepted: 04/21/2022] [Indexed: 11/16/2022]
Abstract
Hepatitis C virus (HCV) is a serious disease that threatens human health. Despite consistent efforts to inhibit the virus, it has infected more than 58 million people, with 300,000 deaths per year. The HCV nonstructural protein NS5A plays a critical role in the viral life cycle, as it is a major contributor to the viral replication and assembly processes. Therefore, its importance is evident in all currently approved HCV combination treatments. The present study identifies new potential compounds for possible medical use against HCV using the quantitative structure–activity relationship (QSAR). In this context, a set of 36 NS5A inhibitors was used to build QSAR models using genetic algorithm multiple linear regression (GA-MLR) and Monte Carlo optimization and were implemented in the software CORAL. The Monte Carlo method was used to build QSAR models using SMILES-based optimal descriptors. Four splits were performed and 24 QSAR models were developed and verified through internal and external validation. The model created for split 3 produced a higher value of the determination coefficients using the validation set (R2 = 0.991 and Q2 = 0.943). In addition, this model provides interesting information about the structural features responsible for the increase and decrease of inhibitory activity, which were used to develop eight novel NS5A inhibitors. The constructed GA-MLR model with satisfactory statistical parameters (R2 = 0.915 and Q2 = 0.941) confirmed the predicted inhibitory activity for these compounds. The Absorption, Distribution, Metabolism, Elimination, and Toxicity (ADMET) predictions showed that the newly designed compounds were nontoxic and exhibited acceptable pharmacological properties. These results could accelerate the process of discovering new drugs against HCV.
Affiliation(s)
- Wissal Liman
- African Genome Center, Mohammed VI Polytechnic University, Ben Guerir 43150, Morocco; (W.L.); (R.D.)
- Mehdi Oubahmane
- Department of Chemistry, Faculty of Sciences Semlalia, BP 2390, Marrakech 40000, Morocco; (M.O.); (I.H.); (D.C.)
- Ismail Hdoufane
- Department of Chemistry, Faculty of Sciences Semlalia, BP 2390, Marrakech 40000, Morocco; (M.O.); (I.H.); (D.C.)
- Imane Bjij
- Institut Supérieur des Professions Infirmières et Techniques de Santé (ISPITS), Dakhla 73000, Morocco;
- Didier Villemin
- Ecole Nationale Supérieure d’Ingénieurs (ENSICAEN) Laboratoire de Chimie Moléculaire et Thioorganique, UMR 6507 CNRS, INC3M, FR3038, Labex EMC3, Labex SynOrg ENSICAEN & Université de Caen, 14118 Caen, France;
- Rachid Daoud
- African Genome Center, Mohammed VI Polytechnic University, Ben Guerir 43150, Morocco; (W.L.); (R.D.)
- Driss Cherqaoui
- Department of Chemistry, Faculty of Sciences Semlalia, BP 2390, Marrakech 40000, Morocco; (M.O.); (I.H.); (D.C.)
- Achraf El Allali
- African Genome Center, Mohammed VI Polytechnic University, Ben Guerir 43150, Morocco; (W.L.); (R.D.)
10
Xu P, Chen H, Li M, Lu W. New Opportunity: Machine Learning for Polymer Materials Design and Discovery. ADVANCED THEORY AND SIMULATIONS 2022. [DOI: 10.1002/adts.202100565] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Indexed: 12/13/2022]
Affiliation(s)
- Pengcheng Xu
- Materials Genome Institute Shanghai University Shanghai 200444 China
- Huimin Chen
- Department of Mathematics College of Sciences Shanghai University Shanghai 200444 China
- Minjie Li
- Department of Chemistry College of Sciences Shanghai University Shanghai 200444 China
- Wencong Lu
- Materials Genome Institute Shanghai University Shanghai 200444 China
- Department of Chemistry College of Sciences Shanghai University Shanghai 200444 China
11
Cysewski P, Jeliński T, Cymerman P, Przybyłek M. Solvent Screening for Solubility Enhancement of Theophylline in Neat, Binary and Ternary NADES Solvents: New Measurements and Ensemble Machine Learning. Int J Mol Sci 2021; 22:ijms22147347. [PMID: 34298966 PMCID: PMC8304713 DOI: 10.3390/ijms22147347] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Received: 05/17/2021] [Revised: 06/29/2021] [Accepted: 07/06/2021] [Indexed: 12/13/2022]
Abstract
Theophylline, a typical representative of active pharmaceutical ingredients, was selected to study the characteristics of experimental and theoretical solubility measured at 25 °C in a broad range of solvents, including neat, binary mixtures and ternary natural deep eutectics (NADES) prepared with choline chloride, polyols and water. There was a strong synergistic effect of organic solvents mixed with water, and among the experimentally studied binary systems, the one containing DMSO with water in unimolar proportions was found to be the most effective in theophylline dissolution. Likewise, for NADES, the addition of water (0.2 molar fraction) resulted in increased solubility compared to pure eutectics, with the highest solubilisation potential offered by the composition of choline chloride with glycerol. The ensemble of Statistica Automated Neural Networks (SANNs) developed using intermolecular interactions in pure systems has been found to be a very accurate model for solubility computations. This machine learning protocol was also applied as an extensive screening for potential solvents with higher solubility of theophylline. Such solvents were identified in all three subgroups, including neat solvents, binary mixtures and ternary NADES systems. Some methodological considerations of SANNs applications for future modelling were also provided. Although the developed protocol is focused exclusively on theophylline solubility, it also has general importance and can be used for the development of predictive models adequate for solvent screening of other compounds in a variety of systems. Formulation of such a model offers rational guidance for the selection of proper candidates as solubilisers in the designed solvents screening.