1. Ren YY, Zhou LC, Yang L, Liu PY, Zhao BW, Liu HX. Predicting the aquatic toxicity mode of action using logistic regression and linear discriminant analysis. SAR and QSAR in Environmental Research 2016; 27:721-746. [PMID: 27653817 DOI: 10.1080/1062936x.2016.1229691] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.4] [Received: 06/02/2016] [Accepted: 08/22/2016] [Indexed: 06/06/2023]
Abstract
The paper highlights the use of the logistic regression (LR) method in constructing acceptable, statistically significant, robust and predictive models for classifying chemicals according to their aquatic toxic modes of action. The essentials of a reliable model were all considered carefully. The model predictors were selected by stepwise forward linear discriminant analysis (LDA) from a combined pool of experimental data and chemical structure-based descriptors calculated by the CODESSA and DRAGON software packages. Model predictive ability was validated both internally and externally. The applicability domain was checked by the leverage approach to verify prediction reliability. The obtained models are simple and easy to interpret. In general, LR performs much better than LDA and seems to be more attractive for the prediction of the more toxic compounds, i.e. compounds that exhibit excess toxicity versus non-polar narcotic compounds, and more reactive compounds versus less reactive compounds. In addition, model fit and regression diagnostics were assessed through the influence plot, which reflects the hat-values, studentized residuals and Cook's distance statistics of each sample. Overdispersion was also checked for the LR model. The relationships between the descriptors and the aquatic toxic behaviour of compounds are also discussed.
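The leverage check mentioned in the abstract above can be illustrated with a minimal sketch. It assumes a model with a single descriptor plus an intercept, for which the hat value has the closed form h = 1/n + (x - mean)² / Sxx. The warning cutoff h* = 3(p + 1)/n is a commonly used convention and is an assumption here, since the abstract does not state which cutoff was used. The descriptor values are hypothetical.

```python
def leverages(train_x, query_x):
    """Leverage of each query value for a one-descriptor model with intercept."""
    n = len(train_x)
    mean_x = sum(train_x) / n
    sxx = sum((x - mean_x) ** 2 for x in train_x)
    # Closed form of the hat value for simple linear regression.
    return [1.0 / n + (x - mean_x) ** 2 / sxx for x in query_x]


def warning_leverage(n_train, n_descriptors):
    # Assumed convention h* = 3(p + 1)/n; the abstract does not give the cutoff.
    return 3.0 * (n_descriptors + 1) / n_train


train = [float(v) for v in range(1, 11)]  # hypothetical descriptor values
queries = [5.5, 10.0, 20.0]               # hypothetical query compounds

h_star = warning_leverage(len(train), 1)
h = leverages(train, queries)
# A compound counts as inside the domain when its leverage does not exceed h*.
inside_domain = [value <= h_star for value in h]
```

With several descriptors the same idea applies, but the hat value is computed from the full descriptor matrix rather than from this one-dimensional shortcut.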
Affiliation(s)
- Y Y Ren: School of Environmental and Municipal Engineering, Lanzhou Jiaotong University, Lanzhou, P.R. China
- L C Zhou: College of Chemistry and Chemical Engineering, Lanzhou University, Lanzhou, P.R. China
- L Yang: School of Environmental and Municipal Engineering, Lanzhou Jiaotong University, Lanzhou, P.R. China
- P Y Liu: School of Environmental and Municipal Engineering, Lanzhou Jiaotong University, Lanzhou, P.R. China
- B W Zhao: School of Environmental and Municipal Engineering, Lanzhou Jiaotong University, Lanzhou, P.R. China
- H X Liu: School of Pharmacy, Lanzhou University, Lanzhou, P.R. China
2. Martin TM, Young DM, Lilavois CR, Barron MG. Comparison of global and mode of action-based models for aquatic toxicity. SAR and QSAR in Environmental Research 2015; 26:245-62. [PMID: 25783870 DOI: 10.1080/1062936x.2015.1018939] [Citation(s) in RCA: 29] [Impact Index Per Article: 3.2] [Indexed: 05/03/2023]
Abstract
The ability to estimate aquatic toxicity is a critical need for ecological risk assessment and chemical regulation. The consensus in the literature is that mode of action (MOA) based toxicity models yield the most toxicologically meaningful and, theoretically, the most accurate results. In this study, a two-step prediction methodology was developed to estimate acute aquatic toxicity from molecular structure. In the first step, one-against-the-rest linear discriminant analysis (LDA) models were used to predict the MOA. The LDA models were able to predict the MOA with 85.8-88.8% accuracy for broad and specific MOAs, respectively. In the second step, a multiple linear regression (MLR) model corresponding to the predicted MOA was used to predict the acute aquatic toxicity value. The MOA-based approach was found to yield similar external prediction accuracy (r² = 0.529-0.632) to a single global MLR model (r² = 0.551-0.562) fit to the entire training set. Overall, the global hierarchical clustering approach yielded a higher combination of accuracy and prediction coverage (r² = 0.572, coverage = 99.3%) than the other approaches. Utilizing multiple two-dimensional chemical descriptors in MLR models yielded comparable results to using only the octanol-water partition coefficient (log Kow).
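The two-step methodology described in the abstract above (predict the MOA, then apply the regression model for that MOA) might be sketched as follows. This is only an illustration of the control flow: the MOA names, discriminant weights and regression coefficients are hypothetical and not taken from the paper, and the second step is simplified to a linear model on log Kow alone rather than a full multi-descriptor MLR.

```python
# Hypothetical one-against-the-rest discriminants: MOA -> (intercept, weights).
DISCRIMINANTS = {
    "narcosis": (0.5, [0.8, -1.2]),
    "reactive": (-0.3, [-0.4, 1.5]),
}

# Hypothetical per-MOA linear models: MOA -> (intercept, slope on log Kow).
TOXICITY_MODELS = {
    "narcosis": (1.0, 0.9),
    "reactive": (2.0, 0.6),
}


def predict_moa(descriptors):
    """Step 1: pick the MOA whose one-vs-rest linear score is highest."""
    def score(moa):
        intercept, weights = DISCRIMINANTS[moa]
        return intercept + sum(w * d for w, d in zip(weights, descriptors))
    return max(DISCRIMINANTS, key=score)


def predict_toxicity(descriptors, log_kow):
    """Step 2: apply the regression model that belongs to the predicted MOA."""
    moa = predict_moa(descriptors)
    intercept, slope = TOXICITY_MODELS[moa]
    return moa, intercept + slope * log_kow


moa_a, tox_a = predict_toxicity([1.0, 0.0], log_kow=2.0)
moa_b, tox_b = predict_toxicity([0.0, 1.0], log_kow=2.0)
```

An error in step 1 sends the compound to the wrong regression model in step 2, which may be one reason a single global model can compete with the MOA-based approach.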
Affiliation(s)
- T M Martin: National Risk Management Research Laboratory, US Environmental Protection Agency, Cincinnati, OH, USA
3. Cassotti M, Ballabio D, Todeschini R, Consonni V. A similarity-based QSAR model for predicting acute toxicity towards the fathead minnow (Pimephales promelas). SAR and QSAR in Environmental Research 2015; 26:217-243. [PMID: 25780951 DOI: 10.1080/1062936x.2015.1018938] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.4] [Indexed: 06/04/2023]
Abstract
The REACH regulation demands information about the acute toxicity of chemicals towards fish and supports the use of QSAR models, provided they comply with the OECD principles. Existing models present some drawbacks that may limit their regulatory application. In this study, a dataset of 908 chemicals was used to develop a QSAR model to predict the 96-hour LC50 for the fathead minnow. Genetic algorithms combined with the k-nearest-neighbour method were applied to the training set (726 chemicals) and resulted in a model based on six molecular descriptors. An automated assessment of the applicability domain (AD) was carried out by comparing the average distance of each molecule from its nearest neighbours with a fixed threshold. The model had good and balanced performance in internal and external validation (182 test molecules), at the expense of a percentage of molecules falling outside the AD. Principal Component Analysis showed apparent correlations between model descriptors and toxicity.
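The applicability-domain rule described in the abstract above (average distance to the nearest neighbours compared with a fixed threshold) can be sketched as below. The sketch assumes Euclidean distance and an unweighted mean of the neighbours' toxicity values; the paper's actual distance measure, weighting, value of k and threshold are not given in the abstract, so every number here is hypothetical.

```python
import math


def knn_predict(train, query, k=3, ad_threshold=1.0):
    """Return (prediction, in_domain) for one query descriptor vector.

    train is a list of (descriptor_vector, toxicity) pairs.
    """
    # Rank training compounds by Euclidean distance to the query.
    ranked = sorted((math.dist(x, query), y) for x, y in train)
    nearest = ranked[:k]
    avg_distance = sum(d for d, _ in nearest) / len(nearest)
    prediction = sum(y for _, y in nearest) / len(nearest)
    # The query is inside the AD when its neighbours are close enough on average.
    return prediction, avg_distance <= ad_threshold


# Hypothetical training set with two descriptors per compound.
train = [
    ([0.0, 0.0], 1.0),
    ([1.0, 0.0], 2.0),
    ([0.0, 1.0], 3.0),
    ([5.0, 5.0], 9.0),
]

near_pred, near_in_domain = knn_predict(train, [0.0, 0.0])
far_pred, far_in_domain = knn_predict(train, [10.0, 10.0])
```

A prediction is still returned for the distant query, but the flag marks it as unreliable, which corresponds to the trade-off between performance and coverage noted in the abstract.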
Affiliation(s)
- M Cassotti: Department of Earth and Environmental Sciences, University of Milano-Bicocca, Milano, Italy
4. Villain J, Lozano S, Halm-Lemeille MP, Durrieu G, Bureau R. Quantile regression model for a diverse set of chemicals: application to acute toxicity for green algae. J Mol Model 2014; 20:2508. [PMID: 25431186 DOI: 10.1007/s00894-014-2508-x] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.1] [Received: 07/25/2014] [Accepted: 10/20/2014] [Indexed: 01/18/2023]
Abstract
The potential of quantile regression (QR) and quantile support vector machine regression (QSVMR) was analyzed for defining quantitative structure-activity relationship (QSAR) models for a diverse set of chemicals and a particular endpoint. This study focused on a specific sensitive endpoint (acute toxicity to algae) for which even a narcosis QSAR model has not been clearly established. An initial dataset including more than 401 ecotoxicological data points for one species of algae (Selenastrum capricornutum) was defined. This set corresponds to a large sample of chemicals ranging from classical organic chemicals to pesticides. From this original data set, the different subsets were selected on the basis of the toxic ratio (TR), a parameter based on the ratio between predicted and experimental values. The robustness of QR and QSVMR to outliers was clearly observed, demonstrating that this approach is of major interest for QSAR models of diverse sets of chemicals. We focused particularly on descriptors related to molecular surface properties.
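The robustness to outliers noted in the abstract above follows from the loss that quantile regression minimises, often called the pinball or check loss. The sketch below is a deliberately reduced case, fitting a single constant rather than a regression on descriptors, and the data values are hypothetical. It shows that the median fit (tau = 0.5) stays put when one extreme value is present, while the mean does not.

```python
def pinball_loss(values, estimate, tau):
    """Sum of check losses of a constant estimate at quantile level tau."""
    return sum(
        tau * (y - estimate) if y >= estimate else (1.0 - tau) * (estimate - y)
        for y in values
    )


def best_constant(values, tau):
    # Search only among observed values: the loss is piecewise linear with
    # kinks at the data points, so a minimiser always exists there.
    return min(sorted(set(values)), key=lambda c: pinball_loss(values, c, tau))


data = [1.0, 2.0, 3.0, 4.0, 100.0]  # hypothetical values with one outlier

median_fit = best_constant(data, tau=0.5)  # quantile (median) estimate
mean_fit = sum(data) / len(data)           # least-squares estimate
```

In a full quantile regression the constant is replaced by a linear function of the descriptors, and the same loss is minimised over its coefficients.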
5. Martin TM, Grulke CM, Young DM, Russom CL, Wang NY, Jackson CR, Barron MG. Prediction of Aquatic Toxicity Mode of Action Using Linear Discriminant and Random Forest Models. J Chem Inf Model 2013; 53:2229-39. [DOI: 10.1021/ci400267h] [Citation(s) in RCA: 27] [Impact Index Per Article: 2.5] [Indexed: 01/18/2023]
Affiliation(s)
- Todd M. Martin: National Risk Management Research Laboratory, U.S. Environmental Protection Agency, 26 West Martin Luther King Drive, Cincinnati, Ohio 45268, United States
- Christopher M. Grulke: National Exposure Research Laboratory, U.S. Environmental Protection Agency, Research Triangle Park, North Carolina 27711, United States
- Douglas M. Young: National Risk Management Research Laboratory, U.S. Environmental Protection Agency, 26 West Martin Luther King Drive, Cincinnati, Ohio 45268, United States
- Christine L. Russom: National Health and Environmental Effects Research Laboratory, U.S. Environmental Protection Agency, 6201 Congdon Boulevard, Duluth, Minnesota 55804, United States
- Nina Y. Wang: National Center for Environmental Assessment, U.S. Environmental Protection Agency, 26 West Martin Luther King Drive, Cincinnati, Ohio 45268, United States
- Crystal R. Jackson: National Health and Environmental Effects Research Laboratory, U.S. Environmental Protection Agency, 1 Sabine Island Drive, Gulf Breeze, Florida 32561, United States
- Mace G. Barron: National Health and Environmental Effects Research Laboratory, U.S. Environmental Protection Agency, 1 Sabine Island Drive, Gulf Breeze, Florida 32561, United States
6. Casalegno M, Sello G. Determination of toxicant mode of action by augmented top priority fragment class. J Chem Inf Model 2013; 53:1113-26. [PMID: 23621653 DOI: 10.1021/ci400130n] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.5] [Indexed: 11/29/2022]
Abstract
Theoretical models can be an efficient tool to assess compound toxicity as an alternative to experimental determinations. Their application must meet some requirements, including the possibility of understanding the rationale that supports the prediction; here, the determination of the mode of action (MOA) is important. A combination of similarity and reactivity analysis has been applied to group chemical compounds, with the aim of selecting groups that share structure and electronic state. The model is not based on experimental data but only on structural features. The result is a number of groups that contain similar compounds with similar reactivity and, possibly, similar MOA. Comparing these groups with the experimentally determined MOAs available for the EPAFHM database permits discussion of the validity of both the model and the experimental data.
Affiliation(s)
- Mosé Casalegno: Department of Chemistry, Materials, and Chemical Engineering, Giulio Natta, Milano, Italy
7. Medina-Franco JL. Scanning structure-activity relationships with structure-activity similarity and related maps: from consensus activity cliffs to selectivity switches. J Chem Inf Model 2012; 52:2485-93. [PMID: 22989212 DOI: 10.1021/ci300362x] [Citation(s) in RCA: 53] [Impact Index Per Article: 4.4] [Indexed: 01/18/2023]
Abstract
Systematic description of structure-activity relationships (SARs) of data sets and structure-property relationships (SPRs) is of paramount importance in medicinal chemistry and other research fields. To this end, structure-activity similarity (SAS) maps were among the first tools proposed to describe SARs using the concept of activity landscape modeling. One of the major goals of the SAS maps is to identify activity cliffs, defined as chemical compounds with highly similar structures but unexpectedly very different biological activity. Since the first publication of the SAS maps more than ten years ago, these tools have evolved and been adapted to analyze various types of compound collections, including structurally diverse and combinatorial sets with activity for one or multiple biological end points. The development of SAS maps has led to general concepts that are applicable to other activity landscape methods, such as "consensus activity cliffs" (activity cliffs common to a series of representations or descriptors) and "selectivity switches" (structural changes that completely invert the selectivity pattern of similar compounds against two biological end points). Herein, we review the development, practical applications, limitations, and perspectives of the SAS and related maps, which are intuitive and powerful informatics tools to computationally analyze SPRs.
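The activity-cliff definition in the abstract above (very similar structures, very different activity) lends itself to a short pairwise scan. The sketch below assumes that each compound is represented by a fingerprint stored as a set of on-bit indices and that similarity is measured with the Tanimoto coefficient. The similarity and activity-gap thresholds, the fingerprints and the activity values are all hypothetical, and the review itself may use other representations.

```python
from itertools import combinations


def tanimoto(a, b):
    """Tanimoto coefficient of two fingerprints given as sets of on bits."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def activity_cliffs(compounds, min_similarity=0.7, min_activity_gap=2.0):
    """Return name pairs that are structurally similar but differ in activity."""
    cliffs = []
    for (name_a, fp_a, act_a), (name_b, fp_b, act_b) in combinations(compounds, 2):
        similar = tanimoto(fp_a, fp_b) >= min_similarity
        different = abs(act_a - act_b) >= min_activity_gap
        if similar and different:
            cliffs.append((name_a, name_b))
    return cliffs


# Hypothetical compounds: (name, fingerprint on bits, activity such as pIC50).
compounds = [
    ("A", {1, 2, 3, 4}, 5.0),
    ("B", {1, 2, 3, 4, 5}, 8.5),
    ("C", {7, 8, 9}, 5.1),
]

cliffs = activity_cliffs(compounds)
```

Running the same scan with several fingerprint types and keeping only the pairs flagged by all of them would correspond to the "consensus activity cliffs" idea mentioned in the abstract.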
Affiliation(s)
- José L Medina-Franco: Torrey Pines Institute for Molecular Studies, 11350 SW Village Parkway, Port St. Lucie, Florida 34987, USA.
8. Fayet G, Del Rio A, Rotureau P, Joubert L, Adamo C. Predicting the Thermal Stability of Nitroaromatic Compounds Using Chemoinformatic Tools. Mol Inform 2011; 30:623-34. [DOI: 10.1002/minf.201000077] [Citation(s) in RCA: 23] [Impact Index Per Article: 1.8] [Received: 07/16/2010] [Accepted: 04/27/2011] [Indexed: 11/12/2022]
9. Lodhi H, Muggleton S, Sternberg MJE. Multi-class Mode of Action Classification of Toxic Compounds Using Logic Based Kernel Methods. Mol Inform 2010; 29:655-64. [DOI: 10.1002/minf.201000083] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 07/28/2010] [Accepted: 09/04/2010] [Indexed: 11/08/2022]