1. Kotli M, Piir G, Maran U. Pesticide effect on earthworm lethality via interpretable machine learning. Journal of Hazardous Materials 2024; 461:132577. [PMID: 37793249 DOI: 10.1016/j.jhazmat.2023.132577] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/19/2023] [Revised: 09/15/2023] [Accepted: 09/16/2023] [Indexed: 10/06/2023]
Abstract
Earthworms are among the most important animals (invertebrates) for soil health. Many chemical substances released into nature for agricultural development, such as pesticides, may have unwanted effects on these organisms. However, it is essential to first understand the extent of the impact of chemicals on soil health and then make the proper decisions for regulatory or commercial purposes. We hypothesize that there is an expressible quantitative structure-activity relationship (QSAR) between the structure of pesticide compounds and their acute toxicity to the earthworm species Eisenia fetida. Describing this relationship allows a better assessment of the impact of chemicals on this earthworm. To describe the relationship, a dataset of chemicals was collected from open-access sources to develop a mathematical model. A novel approach, combining a genetic algorithm and Bayesian optimization, was used to select structural features for the model and to optimize model parameters. The final QSAR classification model was created with the Random Forest algorithm and exhibited good prediction accuracy of 0.78 on the training set and 0.80 on the test set. The model representation follows FAIR principles and is available on QsarDB.org.
Affiliation(s)
- Mihkel Kotli: University of Tartu, Institute of Chemistry, Tartu, Estonia
- Geven Piir: University of Tartu, Institute of Chemistry, Tartu, Estonia
- Uko Maran: University of Tartu, Institute of Chemistry, Tartu, Estonia
2. Toots KM, Sild S, Leis J, Acree WE, Maran U. Machine Learning Quantitative Structure–Property Relationships as a Function of Ionic Liquid Cations for the Gas-Ionic Liquid Partition Coefficient of Hydrocarbons. Int J Mol Sci 2022; 23:7534. [PMID: 35886881 PMCID: PMC9323540 DOI: 10.3390/ijms23147534] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 04/26/2022] [Revised: 06/27/2022] [Accepted: 06/30/2022] [Indexed: 02/01/2023] [Open access]
Abstract
Ionic liquids (ILs) are known for their unique characteristics as solvents and electrolytes. Therefore, new ILs are being developed and adapted as innovative chemical environments for different applications in which their properties need to be understood on a molecular level. Computational data-driven methods provide a means of understanding properties at the molecular level, and quantitative structure–property relationships (QSPRs) provide the framework for this. This framework is commonly used to study the properties of molecules in ILs as an environment. The opposite approach, in which the property is considered as a function of the ionic liquid, has not been reported. The aim of the present study was to supplement this perspective with new knowledge and to develop QSPRs that would allow the understanding of molecular interactions in ionic liquids based on the structure of the cationic moiety. A wide range of applications in electrochemistry, separation and extraction chemistry depends on the partitioning of solutes between the ionic liquid and the surrounding environment, which is characterized by the gas-ionic liquid partition coefficient. To model this property as a function of the structure of the cationic counterpart, a series of ionic liquids with a common bis-(trifluoromethylsulfonyl)-imide anion, [Tf2N]−, was selected for benzene, hexane and cyclohexane. Multiple linear regression (MLR), support vector regression (SVR) and Gaussian process regression (GPR) machine learning approaches were used to derive data-driven models, and their performance was compared. The cross-validation coefficients of determination in the range 0.71–0.93, along with other performance statistics, indicated good accuracy of the models for all data series and machine learning methods. The analysis and interpretation of descriptors revealed that generally higher lipophilicity and dispersion interaction capability, and lower polarity, in the cations induce a higher partition coefficient for benzene, hexane, cyclohexane and hydrocarbons in general. The applicability domain analysis of the models concluded that there were no highly influential outliers and that the models are applicable to a wide selection of cation families with variable size, polarity and aliphatic or aromatic nature.
Affiliation(s)
- Karl Marti Toots: Department of Chemistry, University of Tartu, 14a Ravila Street, 50411 Tartu, Estonia
- Sulev Sild: Department of Chemistry, University of Tartu, 14a Ravila Street, 50411 Tartu, Estonia
- Jaan Leis: Department of Chemistry, University of Tartu, 14a Ravila Street, 50411 Tartu, Estonia
- William E. Acree: Department of Chemistry, University of North Texas, 1155 Union Circle Drive #305070, Denton, TX 76203, USA
- Uko Maran (corresponding author): Department of Chemistry, University of Tartu, 14a Ravila Street, 50411 Tartu, Estonia
3. Toots KM, Sild S, Leis J, Acree Jr. WE, Maran U. The quantitative structure-property relationships for the gas-ionic liquid partition coefficient of a large variety of organic compounds in three ionic liquids. J Mol Liq 2021. [DOI: 10.1016/j.molliq.2021.117573] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Indexed: 12/29/2022]
4. Piir G, Sild S, Maran U. Binary and multi-class classification for androgen receptor agonists, antagonists and binders. Chemosphere 2021; 262:128313. [PMID: 33182081 DOI: 10.1016/j.chemosphere.2020.128313] [Citation(s) in RCA: 15] [Impact Index Per Article: 5.0] [Received: 05/12/2020] [Revised: 08/24/2020] [Accepted: 09/10/2020] [Indexed: 06/11/2023]
Abstract
Androgens and the androgen receptor regulate a variety of biological effects in the human body. Impaired functioning of the androgen receptor may have different adverse health effects, from cancer to infertility. Therefore, it is important to determine whether new chemicals have any binding activity and act as androgen agonists or antagonists before commercial use. Due to the large number of chemicals that require experimental testing, computational methods are a viable alternative. Therefore, the aim of the present study was to develop predictive QSAR models for classifying compounds according to their activity at the androgen receptor. A large data set of chemicals from the CoMPARA project was used for this purpose, and random forest classification models were developed for androgen binding, agonistic, and antagonistic activity. In addition, a unique effort was made to develop a multi-class approach that discriminates between inactive compounds, agonists and antagonists simultaneously. For the evaluation set, the classification models predicted agonists with 80% accuracy; for antagonists and binders the respective accuracies were 72% and 78%. Combining agonists, antagonists and inactive compounds into a multi-class approach added complexity to the modelling task and resulted in 64% prediction accuracy for the evaluation set. Considering the size of the training data sets and their imbalance, the achieved evaluation accuracy is very good. The final classification models are available for exploring and predicting at the QsarDB repository (https://doi.org/10.15152/QDB.236).
Affiliation(s)
- Geven Piir: University of Tartu, Institute of Chemistry, Ravila 14A, Tartu, 50411, Estonia
- Sulev Sild: University of Tartu, Institute of Chemistry, Ravila 14A, Tartu, 50411, Estonia
- Uko Maran: University of Tartu, Institute of Chemistry, Ravila 14A, Tartu, 50411, Estonia
5. Zukić S, Maran U. Modelling of antiproliferative activity measured in HeLa cervical cancer cells in a series of xanthene derivatives. SAR and QSAR in Environmental Research 2020; 31:905-921. [PMID: 33236957 DOI: 10.1080/1062936x.2020.1839131] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 08/13/2020] [Accepted: 10/15/2020] [Indexed: 06/11/2023]
Abstract
Cancer remains one of the leading causes of death in humans, and new drug substances are therefore being developed. Thus, the anti-cancer activity of xanthene derivatives has become an important topic in the development of new and potent anti-cancer drug substances. A previously published novel series of xanthen-3-one and xanthen-1,8-dione derivatives was synthesized in one of our laboratories and showed anti-proliferative activity in HeLa cancer cell lines. This series serves as a good basis for developing a quantitative structure-activity relationship (QSAR) to study the relations between anti-proliferative activity and chemical structures. A QSAR model has been derived that relies only on two-dimensional molecular descriptors, providing mechanistic insight into the anti-proliferative activity of xanthene derivatives. The model is validated internally and externally, and additionally with the set of inactive compounds of the original data, confirming model applicability for the design and discovery of novel xanthene derivatives. The QSAR model is available at the QsarDB repository (http://dx.doi.org/10.15152/QDB.237).
Affiliation(s)
- S Zukić: Department of Pharmaceutical Chemistry, University of Sarajevo, Sarajevo, Bosnia and Herzegovina
- U Maran: Department of Chemistry, University of Tartu, Tartu, Estonia
6. Medina-Franco JL, Naveja JJ, López-López E. Reaching for the bright StARs in chemical space. Drug Discov Today 2019; 24:2162-2169. [PMID: 31557448 DOI: 10.1016/j.drudis.2019.09.013] [Citation(s) in RCA: 22] [Impact Index Per Article: 4.4] [Received: 08/10/2019] [Revised: 09/10/2019] [Accepted: 09/17/2019] [Indexed: 02/07/2023]
Abstract
Visualization of activity data in chemical space is common in drug discovery. Navigating the space in a systematic manner is not trivial, given its size and huge coverage. To this end, methods for data visualization have been developed charting biological activity into chemical space. Herein, we review the progress in different visualization approaches to explore the chemical space aiming at reaching insightful structure-activity relationships (SARs) in the chemical space. We discuss recent methods including consensus diversity plots, ChemMaps, and constellation plots. Several of the methods we review can be extended to analyze other properties of interest in medicinal chemistry, such as structure-toxicity relationships, and can be adapted to postprocess results of virtual screening (VS) of large compound libraries.
Affiliation(s)
- José L Medina-Franco: Department of Pharmacy, School of Chemistry, Universidad Nacional Autónoma de México, Avenida Universidad 3000, Mexico City 04510, Mexico
- J Jesús Naveja: Department of Pharmacy, School of Chemistry, Universidad Nacional Autónoma de México, Avenida Universidad 3000, Mexico City 04510, Mexico; PECEM, School of Medicine, Universidad Nacional Autónoma de México, Mexico City 04510, Mexico
- Edgar López-López: Department of Pharmacy, School of Chemistry, Universidad Nacional Autónoma de México, Avenida Universidad 3000, Mexico City 04510, Mexico
7. Asgaonkar KD, Patil SM, Chitre TS, Ghegade VN, Jadhav SR, Sande SS, Kulkarni AS. Comparative Docking Studies: A Drug Design Tool for Some Pyrazine-Thiazolidinone Based Derivatives for Anti-HIV Activity. Curr Comput Aided Drug Des 2019; 15:252-258. [DOI: 10.2174/1573409915666181219125944] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.6] [Received: 10/11/2018] [Revised: 11/22/2018] [Accepted: 12/16/2018] [Indexed: 11/22/2022]
Abstract
Background: Acquired immunodeficiency syndrome (AIDS) is caused by human immunodeficiency virus type 1 (HIV-1). The pyrazine and thiazolidinone pharmacophores have diverse biological activities, including anti-HIV activity.
Aims and Objectives: To study the binding behavior of pyrazine-thiazolidinone derivatives on four different crystal structures of HIV-1 RT. These molecules, which were already reported as anti-TB agents, were investigated for dual activity as anti-HIV and anti-TB agents.
Materials and Methods: In the present study we describe a comparative docking study of twenty-three derivatives of N-(4-oxo-2 substituted thiazolidin-3-yl) pyrazine-2-carbohydrazide. The binding pattern of these derivatives was gauged by molecular docking studies on four different receptors of the HIV-RT enzyme, bearing PDB codes 1ZD1, 1RT2, 1FKP and 1FK9, using the genetic algorithm docking method of the V. Life MDS software.
Result and Discussion: The studies revealed that hydrogen bonds, hydrophobic interactions and pi-pi interactions play a significant role in the binding of the molecules to the enzyme.
Conclusion: Most of the molecules showed good dock scores and binding energies with the anti-HIV receptors, but molecules 13 and 14 have the potential to act as both anti-tubercular and anti-HIV agents and hence can be further explored for dual activity.
Affiliation(s)
- Kalyani Dhirendra Asgaonkar: Department of Pharmaceutical Chemistry, All India Shri Shivaji Memorial Society’s College of Pharmacy, Kennedy Road, Pune-01, India
- Shital Manoj Patil: Department of Pharmaceutical Chemistry, All India Shri Shivaji Memorial Society’s College of Pharmacy, Kennedy Road, Pune-01, India
- Trupti Sameer Chitre: Department of Pharmaceutical Chemistry, All India Shri Shivaji Memorial Society’s College of Pharmacy, Kennedy Road, Pune-01, India
- Vaibhav Nanabhau Ghegade: Department of Pharmaceutical Chemistry, All India Shri Shivaji Memorial Society’s College of Pharmacy, Kennedy Road, Pune-01, India
- Saurabh Radhaji Jadhav: Department of Pharmaceutical Chemistry, All India Shri Shivaji Memorial Society’s College of Pharmacy, Kennedy Road, Pune-01, India
- Sajid Shaukat Sande: Department of Pharmaceutical Chemistry, All India Shri Shivaji Memorial Society’s College of Pharmacy, Kennedy Road, Pune-01, India
- Atharva Sudhakar Kulkarni: Department of Pharmaceutical Chemistry, All India Shri Shivaji Memorial Society’s College of Pharmacy, Kennedy Road, Pune-01, India
8. Kausar S, Falcao AO. An automated framework for QSAR model building. J Cheminform 2018; 10:1. [PMID: 29340790 PMCID: PMC5770354 DOI: 10.1186/s13321-017-0256-5] [Citation(s) in RCA: 44] [Impact Index Per Article: 7.3] [Received: 05/31/2017] [Accepted: 12/27/2017] [Indexed: 01/13/2023] [Open access]
Abstract
Background: In-silico tools based on quantitative structure–activity relationship (QSAR) models are widely used to screen huge databases of compounds in order to determine the biological properties of chemical molecules based on their chemical structure. The exponentially growing amount of data on synthesized and known chemicals demands computationally efficient, automated QSAR modeling tools that are available to researchers who may lack extensive knowledge of machine learning modeling. Thus, a fully automated and advanced modeling platform can be an important addition to the QSAR community.
Results: In the presented workflow, the process from data preparation to model building and validation has been completely automated. The focus was on the most critical modeling tasks (data curation, data set characteristics evaluation, variable selection and validation) that largely influence the performance of QSAR models. The workflow also includes the ability to quickly evaluate the feasibility of modeling a given data set. The developed framework was tested on data sets of thirty different problems. The best-optimized feature selection methodology in the developed workflow is able to remove 62–99% of all redundant data. On average, feature selection reduced the prediction error by about 19% and increased the percentage of variance explained (PVE) by 49% compared to models without feature selection. For models with a modelability score above 0.6, the average PVE score was 0.71. A strong correlation was verified between the modelability scores and the PVE of the models produced with variable selection.
Conclusions: We developed an extendable and highly customizable, fully automated QSAR modeling framework. The workflow does not require any advanced parameterization, nor does it depend on users' decisions or expertise in machine learning or programming. With just a given target or problem, the workflow follows an unbiased standard protocol to develop reliable QSAR models by directly accessing online manually curated databases or by using private data sets. Other distinctive features of the workflow include prior estimation of data modelability to avoid time-consuming modeling trials for non-modelable data sets, an efficient variable selection procedure, and the availability of output at each modeling task for diverse applications and reproduction of historical predictions. The results obtained on a selection of thirty QSAR problems suggest that the approach is capable of building reliable models even for challenging problems.
Electronic supplementary material: The online version of this article (10.1186/s13321-017-0256-5) contains supplementary material, which is available to authorized users.
Affiliation(s)
- Samina Kausar: LaSIGE, Departamento de Informática, Faculdade de Ciências, Universidade de Lisboa, 1749-016, Lisbon, Portugal; BioISI: Biosystems and Integrative Sciences Institute, Faculdade de Ciências, Universidade de Lisboa, 1749-016, Lisbon, Portugal
- Andre O Falcao: LaSIGE, Departamento de Informática, Faculdade de Ciências, Universidade de Lisboa, 1749-016, Lisbon, Portugal; BioISI: Biosystems and Integrative Sciences Institute, Faculdade de Ciências, Universidade de Lisboa, 1749-016, Lisbon, Portugal