1
Anandhi G, Iyapparaja M. Systematic approaches to machine learning models for predicting pesticide toxicity. Heliyon 2024; 10:e28752. [PMID: 38576573 PMCID: PMC10990867 DOI: 10.1016/j.heliyon.2024.e28752] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/06/2023] [Revised: 03/13/2024] [Accepted: 03/24/2024] [Indexed: 04/06/2024]
Abstract
Pesticides play an important role in modern agriculture by protecting crops from pests and diseases. However, the negative consequences of pesticides, such as environmental contamination and adverse effects on human and ecological health, underscore the importance of accurate toxicity predictions. To address this issue, artificial intelligence models have emerged as valuable methods for predicting the toxicity of organic compounds. In this review article, we explore the application of machine learning (ML) to pesticide toxicity prediction. The review provides a detailed summary of recent developments, prediction models, and datasets used for pesticide toxicity prediction, and compares the results of several algorithms that predict the harmfulness of various classes of pesticides. It also identifies emerging trends and directions for future work, showcasing the transformative potential of machine learning in promoting safer pesticide usage and sustainable agriculture.
Affiliation(s)
- Ganesan Anandhi
- Department of Smart Computing, School of Computer Science Engineering and Information Systems, Vellore Institute of Technology, Vellore 632014, Tamil Nadu, India
- M. Iyapparaja
- Department of Smart Computing, School of Computer Science Engineering and Information Systems, Vellore Institute of Technology, Vellore 632014, Tamil Nadu, India
2
Boone KS, Di Toro DM, Davis CW, Parkerton TF, Redman A. In Silico Acute Aquatic Hazard Assessment and Prioritization Using a Grouped Target Site Model: A Case Study of Organic Substances Reported in Permian Basin Hydraulic Fracturing Operations. Environmental Toxicology and Chemistry 2024. [PMID: 38415890 DOI: 10.1002/etc.5826] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/31/2023] [Revised: 10/17/2023] [Accepted: 01/15/2024] [Indexed: 02/29/2024]
Abstract
Hydraulic fracturing (HF) is commonly used to enhance onshore recovery of oil and gas during production. This process involves the use of a variety of chemicals to support the physical extraction of oil and gas, maintain appropriate conditions downhole (e.g., redox conditions, pH), and limit microbial growth. The diversity of chemicals used in HF presents a significant challenge for risk assessment. The objective of the present study is to establish a transparent, reproducible procedure for estimating 5th percentile acute aquatic hazard concentrations (acute HC5s) for these substances and validating the estimates against existing toxicity data. A simplified, grouped target site model (gTSM) was developed using a database (n = 1696) of diverse compounds with known mode of action (MoA) information. Statistical significance testing was employed to reduce model complexity by combining 11 discrete MoAs into three general hazard groups. The new model was trained and validated using an 80:20 allocation of the experimental database. The gTSM predicts toxicity using a combination of target site water partition coefficients and hazard group-based critical target site concentrations. Model performance was comparable to the original TSM using 40% fewer parameters. Model predictions were judged to be sufficiently reliable and the gTSM was further used to prioritize a subset of reported Permian Basin HF substances for risk evaluation. The gTSM was applied to predict hazard groups, species acute toxicity, and acute HC5s for 186 organic compounds (neutral and ionic). Toxicity predictions and acute HC5 estimates were validated against measured acute toxicity data compiled for HF substances. This case study supports the gTSM as an efficient, cost-effective computational tool for rapid aquatic hazard assessment of diverse organic chemicals. Environ Toxicol Chem 2024;00:1-12. © 2024 ExxonMobil Petroleum and Chemical BV. Environmental Toxicology and Chemistry published by Wiley Periodicals LLC on behalf of SETAC.
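As an illustration of what a 5th percentile hazard concentration means, the sketch below fits a log-normal species sensitivity distribution to a set of acute LC50 values and reads off its 5th percentile. This is a generic textbook approach and an assumption on our part, not the gTSM procedure described in the abstract; the function name `hc5_lognormal` and the input values are hypothetical.

```python
import math
from statistics import NormalDist, mean, stdev


def hc5_lognormal(lc50_values):
    """Estimate an HC5 from acute LC50s (one per species, same units).

    Assumes log10(LC50) is normally distributed across species.
    Needs at least two values that are not all identical.
    """
    logs = [math.log10(v) for v in lc50_values]
    # Sample mean and sample standard deviation of the log-transformed data.
    dist = NormalDist(mu=mean(logs), sigma=stdev(logs))
    # 5th percentile on the log scale, transformed back to concentration.
    return 10 ** dist.inv_cdf(0.05)


if __name__ == "__main__":
    # Hypothetical LC50s (mg/L) for three species.
    print(hc5_lognormal([1.0, 10.0, 100.0]))
```

The result is in the same units as the inputs; with only a handful of species the estimate is highly uncertain, which is one reason model-based predictions are validated against measured data.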
Affiliation(s)
- Kathleen S Boone
- Department of Civil and Environmental Engineering, University of Delaware, Newark, Delaware, USA
- Dominic M Di Toro
- Department of Civil and Environmental Engineering, University of Delaware, Newark, Delaware, USA
- Craig W Davis
- ExxonMobil Biomedical Sciences, Annandale, New Jersey, USA
- Aaron Redman
- ExxonMobil Biomedical Sciences, Annandale, New Jersey, USA
3
Fuchsman P, Fetters K, O'Connor A. Target Lipid Model and Empirical Organic Carbon Partition Coefficients Predict Sediment Toxicity of Polychlorinated Biphenyls to Benthic Invertebrates. Environmental Toxicology and Chemistry 2023; 42:1134-1151. [PMID: 36808761 DOI: 10.1002/etc.5588] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 09/19/2022] [Revised: 10/19/2022] [Accepted: 02/13/2023] [Indexed: 06/18/2023]
Abstract
Quantifying causal exposure-response relationships for polychlorinated biphenyl (PCB) toxicity to benthic invertebrates can be an important component of contaminated sediment assessments, informing cleanup decisions and natural resource injury determinations. Building on prior analyses, we demonstrate that the target lipid model accurately predicts aquatic toxicity of PCBs to invertebrates, providing a means to account for effects of PCB mixture composition on the toxicity of bioavailable PCBs. We also incorporate updated data on PCB partitioning between particles and interstitial water in field-collected sediments, to better account for effects of PCB mixture composition on PCB bioavailability. To validate the resulting model, we compare its predictions with sediment toxicity data from spiked sediment toxicity tests and a variety of recent case studies from sites where PCBs are the primary sediment contaminant. The updated model should provide a useful tool for both screening-level and in-depth risk analyses for PCBs in sediment, and it should aid in diagnosing potential contributing factors at sites where sediment toxicity and benthic community impairment are observed. Environ Toxicol Chem 2023;42:1134-1151. © 2023 SETAC.
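The two ingredients named in the abstract, a target lipid model for aqueous toxicity and an organic carbon partition coefficient for bioavailability, can be chained as in the sketch below. The log-linear form of the target lipid model and the equilibrium partitioning step are the standard textbook forms; all parameter values are hypothetical placeholders, not the values fitted in the paper.

```python
import math


def tlm_lc50(log_kow, critical_body_burden, slope):
    """Aqueous LC50 (mmol/L) from a log-linear target lipid model.

    log10(LC50) = slope * log10(Kow) + log10(critical body burden),
    with the critical body burden in mmol/kg lipid. The slope is negative.
    """
    return 10 ** (slope * log_kow + math.log10(critical_body_burden))


def sediment_effect_conc(water_effect_conc, k_oc, f_oc):
    """Sediment concentration (mmol/kg dry weight) in equilibrium with a
    given interstitial water concentration (mmol/L).

    k_oc: organic carbon partition coefficient (L/kg organic carbon)
    f_oc: mass fraction of organic carbon in the sediment
    """
    return water_effect_conc * k_oc * f_oc


if __name__ == "__main__":
    # Hypothetical inputs chosen only to make the arithmetic easy to follow.
    lc50 = tlm_lc50(log_kow=6.0, critical_body_burden=100.0, slope=-1.0)
    print(lc50, sediment_effect_conc(lc50, k_oc=1e6, f_oc=0.02))
```

Using an empirical, field-derived `k_oc` rather than one estimated from Kow is the point the abstract makes about accounting for bioavailability in field-collected sediments.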
4
Droge STJ, Hodges G, Bonnell M, Gutsell S, Roberts J, Teixeira A, Barrett EL. Using membrane-water partition coefficients in a critical membrane burden approach to aid the identification of neutral and ionizable chemicals that induce acute toxicity below narcosis levels. Environmental Science: Processes & Impacts 2023; 25:621-647. [PMID: 36779707 DOI: 10.1039/d2em00391k] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/18/2023]
Abstract
The risk assessment of thousands of chemicals used in our society benefits from adequate grouping of chemicals based on the mode and mechanism of toxic action (MoA). We measure the phospholipid membrane-water distribution ratio (DMLW) using a chromatographic assay (IAM-HPLC) for 121 neutral and ionized organic chemicals and screen other methods to derive DMLW. We use IAM-HPLC based DMLW as a chemical property to distinguish between baseline narcosis and specific MoA, for reported acute toxicity endpoints on two separate sets of chemicals. The first set comprised 94 chemicals from the US EPA's acute fish toxicity database: 47 categorized as narcosis MoA, 27 with specific MoA, and 20 predominantly ionic chemicals with mostly unknown MoA. The narcosis MoA chemicals clustered around the median narcosis critical membrane burden (CMBnarc) of 140 mmol kg⁻¹ lipid, with a lower limit of 14 mmol kg⁻¹ lipid, including all chemicals labelled Narcosis_I and Narcosis_II. This maximum 'toxic ratio' (TR) between CMBnarc and the lower limit narcosis endpoint is thus 10. For 23/28 specific MoA chemicals a TR >10 was derived, indicative of a specific adverse effect pathway related to acute toxicity. For 10/12 cations categorized as "unsure amines", the TR <10 suggests that these affect fish via narcosis MoA. The second set comprised 29 herbicides, including 17 dissociated acids, and evaluated the TR for acute toxic effect concentrations to likely sensitive aquatic plant species (green algae and macrophytes Lemna and Myriophyllum), and non-target animal species (invertebrates and fish). For 21/29 herbicides, a TR >10 indicated a specific toxic mode of action other than narcosis for at least one of these aquatic primary producers. Fish and invertebrate TRs were mostly <10, particularly for neutral herbicides, but for acidic herbicides a TR >10 indicated specific adverse effects in non-target animals. The established critical membrane approach to derive the TR provides a useful contribution to the weight of evidence to bin a chemical as having a narcosis MoA or less likely to have acute toxicity caused by a more specific adverse effect pathway. After proper calibration, the chromatographic assay provides consistent and efficient experimental input for both neutral and ionizable chemicals to this approach.
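The toxic ratio arithmetic in the abstract can be sketched as follows. The median narcosis burden of 140 mmol/kg lipid and the TR threshold of 10 come from the abstract; the step that converts an aqueous effect concentration into a membrane burden by multiplying with DMLW is our assumption about how a critical membrane burden is obtained, and the function names are hypothetical.

```python
CMB_NARC_MEDIAN = 140.0  # mmol/kg lipid, median narcosis burden from the abstract
TR_THRESHOLD = 10.0      # TR above this suggests a specific MoA (from the abstract)


def membrane_burden(lc50_mmol_per_l, d_mlw_l_per_kg):
    """Membrane concentration (mmol/kg lipid) at the aqueous LC50.

    Assumption: burden = aqueous effect concentration * membrane-water
    distribution ratio.
    """
    return lc50_mmol_per_l * d_mlw_l_per_kg


def toxic_ratio(burden, cmb_narc=CMB_NARC_MEDIAN):
    """Ratio of the median narcosis burden to the observed burden."""
    return cmb_narc / burden


def classify(tr, threshold=TR_THRESHOLD):
    """Coarse weight-of-evidence label based on the TR alone."""
    return "specific MoA likely" if tr > threshold else "consistent with narcosis"


if __name__ == "__main__":
    # Hypothetical chemical: LC50 of 0.001 mmol/L and DMLW of 1000 L/kg.
    tr = toxic_ratio(membrane_burden(0.001, 1000.0))
    print(tr, classify(tr))
```

A chemical sitting at the lower narcosis limit of 14 mmol/kg lipid gives a TR of exactly 10, which is why 10 serves as the cut-off between narcosis and a more specific pathway.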
Affiliation(s)
- Steven T J Droge
- Department of Freshwater and Marine Ecology (FAME), Institute for Biodiversity and Ecosystem Dynamics (IBED), Universiteit van Amsterdam (UvA), Science Park 904, 1098XH Amsterdam, The Netherlands
- Geoff Hodges
- Safety and Environmental Assurance Centre, Unilever, Colworth Science Park, Sharnbrook, Bedfordshire, UK
- Mark Bonnell
- Environment and Climate Change Canada, Ecological Assessment Division, Science and Risk Assessment Directorate, Gatineau, Quebec, Canada
- Steve Gutsell
- Safety and Environmental Assurance Centre, Unilever, Colworth Science Park, Sharnbrook, Bedfordshire, UK
- Jayne Roberts
- Safety and Environmental Assurance Centre, Unilever, Colworth Science Park, Sharnbrook, Bedfordshire, UK
- Alexandre Teixeira
- Safety and Environmental Assurance Centre, Unilever, Colworth Science Park, Sharnbrook, Bedfordshire, UK
- Elin L Barrett
- Safety and Environmental Assurance Centre, Unilever, Colworth Science Park, Sharnbrook, Bedfordshire, UK
5
Recent advances for estimating environmental properties for small molecules from chromatographic measurements and the solvation parameter model. J Chromatogr A 2023; 1687:463682. [PMID: 36502643 DOI: 10.1016/j.chroma.2022.463682] [Citation(s) in RCA: 14] [Impact Index Per Article: 14.0] [Received: 10/24/2022] [Revised: 11/24/2022] [Accepted: 11/24/2022] [Indexed: 11/30/2022]
Abstract
The transfer of neutral compounds between immiscible phases in chromatographic or environmental systems can be described by six solute properties (solute descriptors) using the solvation parameter model. The solute descriptors are size (McGowan's characteristic volume), V, excess molar refraction, E, dipolarity/polarizability, S, hydrogen-bond acidity and basicity, A and B, and the gas-liquid partition constant on n-hexadecane at 298.15 K, L. V and E for liquids are accessible by calculation but the other descriptors and E for solids are determined experimentally by chromatographic, liquid-liquid partition, and solubility measurements. These solute descriptors are available for several thousand compounds in the Abraham solute descriptor databases and for several hundred compounds in the WSU experimental solute descriptor database. In the first part of this review, we highlight features important in defining each descriptor, their experimental determination, compare descriptor quality for the two organized descriptor databases, and methods for estimating Abraham solute descriptors. In the second part we focus on recent applications of the solvation parameter model to characterize environmental systems and its use for the identification of surrogate chromatographic models for estimating environmental properties.
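For reference, the solvation parameter model mentioned above is usually written in two forms, one for transfer between condensed phases and one for transfer from the gas phase to a condensed phase. The capital letters are the solute descriptors defined in the abstract; the lower-case system constants and the symbol SP for the modeled property (for example a partition constant or retention factor) follow the common convention and are not spelled out in the abstract itself.

```latex
% Transfer between two condensed phases
\log SP = c + eE + sS + aA + bB + vV

% Transfer from the gas phase to a condensed phase
\log SP = c + eE + sS + aA + bB + lL
```

The system constants c, e, s, a, b, v, and l are obtained by multiple linear regression over solutes with known descriptors, which is what makes a calibrated chromatographic system usable as a surrogate for an environmental partitioning process.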
6
Gao F, Zhang W, Baccarelli AA, Shen Y. Predicting chemical ecotoxicity by learning latent space chemical representations. Environment International 2022; 163:107224. [PMID: 35395577 PMCID: PMC9044254 DOI: 10.1016/j.envint.2022.107224] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 01/04/2022] [Revised: 03/29/2022] [Accepted: 03/30/2022] [Indexed: 05/31/2023]
Abstract
In silico prediction of chemical ecotoxicity (HC50) represents an important complement to improve in vivo and in vitro toxicological assessment of manufactured chemicals. Recent application of machine learning models to predict chemical HC50 yields variable prediction performance that depends on effectively learning chemical representations from high-dimension data. To improve HC50 prediction performance, we developed an autoencoder model by learning latent space chemical embeddings. This novel approach achieved state-of-the-art prediction performance of HC50 with R² of 0.668 ± 0.003 and mean absolute error (MAE) of 0.572 ± 0.001, and outperformed other dimension reduction methods including principal component analysis (PCA) (R² = 0.601 ± 0.031 and MAE = 0.629 ± 0.005), kernel PCA (R² = 0.631 ± 0.008 and MAE = 0.625 ± 0.006), and uniform manifold approximation and projection dimensionality reduction (R² = 0.400 ± 0.008 and MAE = 0.801 ± 0.002). A simple linear layer with chemical embeddings learned from the autoencoder model performed better than random forest (R² = 0.663 ± 0.007 and MAE = 0.591 ± 0.008), fully connected neural network (R² = 0.614 ± 0.016 and MAE = 0.610 ± 0.008), least absolute shrinkage and selection operator (R² = 0.617 ± 0.037 and MAE = 0.619 ± 0.007), and ridge regression (R² = 0.638 ± 0.007 and MAE = 0.613 ± 0.005) using unlearned raw input features. Our results highlighted the usefulness of learning latent chemical representations, and our autoencoder model provides an alternative approach for robust HC50 prediction.
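The two metrics used to compare the models above, the coefficient of determination and the mean absolute error, can be computed as in this minimal sketch. The function names and the example values are hypothetical, and the abstract does not state how the reported means and standard deviations were aggregated across runs.

```python
def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot.

    Undefined (division by zero) if all y_true values are identical.
    """
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def mean_absolute_error(y_true, y_pred):
    """Average absolute difference between observed and predicted values."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


if __name__ == "__main__":
    # Hypothetical observed and predicted log-scale toxicity values.
    observed = [1.0, 2.0, 3.0]
    predicted = [1.1, 1.9, 3.2]
    print(r2_score(observed, predicted), mean_absolute_error(observed, predicted))
```

A higher R² and a lower MAE both indicate a better fit, so the autoencoder result quoted above (R² of 0.668, MAE of 0.572) improves on the baselines in both respects.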
Affiliation(s)
- Feng Gao
- Department of Environmental Health Sciences, Mailman School of Public Health, Columbia University, New York, NY 10032, United States
- Wei Zhang
- Department of Plant, Soil and Microbial Sciences, Michigan State University, East Lansing, MI 48823, United States
- Andrea A Baccarelli
- Department of Environmental Health Sciences, Mailman School of Public Health, Columbia University, New York, NY 10032, United States
- Yike Shen
- Department of Environmental Health Sciences, Mailman School of Public Health, Columbia University, New York, NY 10032, United States