1
Viljanen M, Minnema J, Wassenaar PNH, Rorije E, Peijnenburg W. What is the ecotoxicity of a given chemical for a given aquatic species? Predicting interactions between species and chemicals using recommender system techniques. SAR and QSAR in Environmental Research 2023; 34:765-788. [PMID: 37670728 DOI: 10.1080/1062936x.2023.2254225] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/21/2023] [Accepted: 08/27/2023] [Indexed: 09/07/2023]
Abstract
Ecotoxicological safety assessment of chemicals requires toxicity data on multiple species, despite the general desire to minimize animal testing. Predictive models, specifically machine learning (ML) methods, are among the tools capable of resolving this apparent contradiction, as they allow toxicity patterns to be generalized across chemicals and species. However, despite the availability of large public toxicity datasets, the data are highly sparse, which complicates model development. The aim of this study is to provide insights into how ML can predict toxicity using a large but sparse dataset. We developed models to predict LC50 values, based on experimental LC50 data covering 2431 organic chemicals and 1506 aquatic species from the ECOTOX database. Several well-known ML techniques were evaluated and a new ML model was developed, inspired by recommender systems. This new model involves a simple linear model that learns low-rank interactions between species and chemicals using factorization machines. We evaluated the predictive performance of the developed models in two validation settings: 1) predicting unseen chemical-species pairs, and 2) predicting unseen chemicals. The results of this study show that ML models can accurately predict LC50 values in both validation settings. Moreover, we show that the novel factorization machine approach can match well-tuned, complex ML approaches.
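As an illustration of the low-rank interaction idea mentioned in the abstract, the following is a minimal sketch of a generic second-order factorization machine prediction, not the authors' implementation. It assumes each observation is encoded as a one-hot species indicator concatenated with a one-hot chemical indicator; the function names, the random parameters, and the toy sizes are hypothetical, and the features actually used in the paper may differ.

```python
import numpy as np

def fm_predict(x, w0, w, V):
    """Generic second-order factorization machine prediction for one feature vector x.

    w0: global bias, w: per-feature linear weights, V: (n_features, rank) latent factors.
    """
    linear = w0 + x @ w
    # Pairwise interactions computed with the standard identity
    # sum_{i<j} <V_i, V_j> x_i x_j = 0.5 * sum_f [(sum_i V_if x_i)^2 - sum_i V_if^2 x_i^2]
    xv = x @ V
    pairwise = 0.5 * np.sum(xv ** 2 - (x ** 2) @ (V ** 2))
    return linear + pairwise

def one_hot_pair(species_idx, chemical_idx, n_species, n_chemicals):
    """Hypothetical encoding: one-hot species block followed by one-hot chemical block."""
    x = np.zeros(n_species + n_chemicals)
    x[species_idx] = 1.0
    x[n_species + chemical_idx] = 1.0
    return x

# Toy sizes and random parameters, for illustration only (not fitted to any data).
rng = np.random.default_rng(0)
n_species, n_chemicals, rank = 4, 5, 2
w0 = 0.1
w = rng.normal(size=n_species + n_chemicals)
V = rng.normal(size=(n_species + n_chemicals, rank))

x = one_hot_pair(1, 3, n_species, n_chemicals)
pred = fm_predict(x, w0, w, V)
```

With this one-hot encoding the prediction reduces to a bias, a species effect, a chemical effect, and the inner product of the species and chemical latent vectors, which is the low-rank interaction term.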
Affiliation(s)
- M Viljanen
- Department of Statistics, Data Science and Modelling, National Institute of Public Health and the Environment, Bilthoven, The Netherlands
- J Minnema
- Center for Safety of Substances and Products, National Institute of Public Health and the Environment, Bilthoven, The Netherlands
- P N H Wassenaar
- Center for Safety of Substances and Products, National Institute of Public Health and the Environment, Bilthoven, The Netherlands
- E Rorije
- Center for Safety of Substances and Products, National Institute of Public Health and the Environment, Bilthoven, The Netherlands
- W Peijnenburg
- Center for Safety of Substances and Products, National Institute of Public Health and the Environment, Bilthoven, The Netherlands
- Institute of Environmental Sciences (CML), Leiden University, Leiden, The Netherlands
2
Tice RR, Bassan A, Amberg A, Anger LT, Beal MA, Bellion P, Benigni R, Birmingham J, Brigo A, Bringezu F, Ceriani L, Crooks I, Cross K, Elespuru R, Faulkner DM, Fortin MC, Fowler P, Frericks M, Gerets HHJ, Jahnke GD, Jones DR, Kruhlak NL, Lo Piparo E, Lopez-Belmonte J, Luniwal A, Luu A, Madia F, Manganelli S, Manickam B, Mestres J, Mihalchik-Burhans AL, Neilson L, Pandiri A, Pavan M, Rider CV, Rooney JP, Trejo-Martin A, Watanabe-Sailor KH, White AT, Woolley D, Myatt GJ. In Silico Approaches In Carcinogenicity Hazard Assessment: Current Status and Future Needs. Computational Toxicology (Amsterdam, Netherlands) 2021; 20. [PMID: 35368437 DOI: 10.1016/j.comtox.2021.100191] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Indexed: 02/07/2023]
Abstract
Historically, identifying carcinogens has relied primarily on tumor studies in rodents, which require enormous resources in both money and time. In silico models have been developed for predicting rodent carcinogens but have not yet found general regulatory acceptance, in part due to the lack of a generally accepted protocol for performing such an assessment as well as limitations in predictive performance and scope. There remains a need for additional, improved in silico carcinogenicity models, especially ones that are more human-relevant, for use in research and regulatory decision-making. As part of an international effort to develop in silico toxicological protocols, a consortium of toxicologists, computational scientists, and regulatory scientists across several industries and governmental agencies evaluated the extent to which in silico models exist for each of the recently defined 10 key characteristics (KCs) of carcinogens. This position paper summarizes the current status of in silico tools for the assessment of each KC and identifies the data gaps that need to be addressed before a comprehensive in silico carcinogenicity protocol can be developed for regulatory use.
Affiliation(s)
- Raymond R Tice
- RTice Consulting, Hillsborough, North Carolina, 27278, USA
- Alexander Amberg
- Sanofi Preclinical Safety, Industriepark Höchst, 65926 Frankfurt, Germany
- Lennart T Anger
- Genentech, Inc., South San Francisco, California, 94080, USA
- Marc A Beal
- Healthy Environments and Consumer Safety Branch, Health Canada, Government of Canada, Ottawa, Ontario, Canada K1A 0K9
- Jeffrey Birmingham
- GlaxoSmithKline, David Jack Centre for R&D, Ware, Hertfordshire, SG12 0DP, United Kingdom
- Alessandro Brigo
- Roche Pharmaceutical Research & Early Development, Pharmaceutical Sciences, Roche Innovation, Center Basel, F. Hoffmann-La Roche Ltd, CH-4070, Basel, Switzerland
- Lidia Ceriani
- Humane Society International, 1000 Brussels, Belgium
- Ian Crooks
- British American Tobacco (Investments) Ltd, GR&D Centre, Southampton, SO15 8TL, United Kingdom
- Rosalie Elespuru
- Food and Drug Administration, Center for Devices and Radiological Health, Silver Spring, Maryland, 20993, USA
- David M Faulkner
- Chemical Sciences Division, Lawrence Berkeley National Laboratory, Berkeley, California, USA
- Marie C Fortin
- Department of Pharmacology and Toxicology, Ernest Mario School of Pharmacy, Rutgers University, Piscataway, New Jersey, 08855, USA
- Paul Fowler
- FSTox Consulting (Genetic Toxicology), Northamptonshire, United Kingdom
- Gloria D Jahnke
- Division of the National Toxicology Program, National Institute of Environmental Health Sciences, Research Triangle Park, North Carolina, 27709, USA
- Naomi L Kruhlak
- Food and Drug Administration, Center for Drug Evaluation and Research, Silver Spring, Maryland, 20993, USA
- Elena Lo Piparo
- Chemical Food Safety Group, Nestlé Research, CH-1000 Lausanne 26, Switzerland
- Juan Lopez-Belmonte
- Cuts Ice Ltd Chemical Food Safety Group, Nestlé Research, CH-1000 Lausanne 26, Switzerland
- Amarjit Luniwal
- North American Science Associates (NAMSA) Inc., Minneapolis, Minnesota, 55426, USA
- Alice Luu
- Healthy Environments and Consumer Safety Branch, Health Canada, Government of Canada, Ottawa, Ontario, Canada K1A 0K9
- Federica Madia
- European Commission, Joint Research Centre (JRC), Ispra, Italy
- Serena Manganelli
- Chemical Food Safety Group, Nestlé Research, CH-1000 Lausanne 26, Switzerland
- Jordi Mestres
- IMIM Institut Hospital Del Mar d'Investigacions Mèdiques and Universitat Pompeu Fabra, Doctor Aiguader 88, Parc de Recerca Biomèdica, 08003 Barcelona, Spain; and Chemotargets SL, Baldiri Reixac 4, Parc Científic de Barcelona, 08028, Barcelona, Spain
- Louise Neilson
- Broughton Nicotine Services, Oak Tree House, Earby, Lancashire, BB18 6JZ United Kingdom
- Arun Pandiri
- Division of the National Toxicology Program, National Institute of Environmental Health Sciences, Research Triangle Park, North Carolina, 27709, USA
- Cynthia V Rider
- Division of the National Toxicology Program, National Institute of Environmental Health Sciences, Research Triangle Park, North Carolina, 27709, USA
- John P Rooney
- Integrated Laboratory Systems, LLC., Morrisville, North Carolina, 27560, USA
- Karen H Watanabe-Sailor
- School of Mathematical and Natural Sciences, Arizona State University, West Campus, Glendale, Arizona, 85306, USA
- Angela T White
- GlaxoSmithKline, David Jack Centre for R&D, Ware, Hertfordshire, SG12 0DP, United Kingdom
3
Perez-Castillo Y, Sánchez-Rodríguez A, Tejera E, Cruz-Monteagudo M, Borges F, Cordeiro MNDS, Le-Thi-Thu H, Pham-The H. A desirability-based multi objective approach for the virtual screening discovery of broad-spectrum anti-gastric cancer agents. PLoS One 2018; 13:e0192176. [PMID: 29420638 PMCID: PMC5805264 DOI: 10.1371/journal.pone.0192176] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.8] [Received: 09/20/2017] [Accepted: 01/17/2018] [Indexed: 01/09/2023]
Abstract
Gastric cancer is the third leading cause of cancer-related mortality worldwide and, despite advances in prevention, diagnosis and therapy, it is still regarded as a global health concern. The efficacy of the therapies for gastric cancer is limited by a poor response to currently available therapeutic regimens. One of the reasons that may explain these poor clinical outcomes is the highly heterogeneous nature of this disease. It is therefore essential to discover new molecular agents capable of targeting various gastric cancer subtypes simultaneously. Here, we present a multi-objective approach for the ligand-based virtual screening discovery of chemical compounds simultaneously active against the gastric cancer cell lines AGS, NCI-N87 and SNU-1. The proposed approach relies on a novel methodology based on the development of ensemble models for the bioactivity prediction against each individual gastric cancer cell line. The methodology includes the aggregation of one ensemble per cell line using a desirability-based algorithm into virtual screening protocols. Our research leads to the proposal of a multi-targeted virtual screening protocol able to achieve high enrichment of known chemicals with anti-gastric cancer activity. Specifically, our results indicate that, using the proposed protocol, it is possible to retrieve almost 20 times more multi-targeted compounds in the first 1% of the ranked list than would be expected from a uniform distribution of the active ones in the virtual screening database. More importantly, the proposed protocol attains an outstanding initial enrichment of known multi-targeted anti-gastric cancer agents.
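The comparison against a uniform distribution of actives in the first 1% of the ranked list corresponds to what is commonly called an enrichment factor. The sketch below shows the standard definition on made-up data; it is an assumption that the paper computes enrichment this way, and the function name, scores, and active labels are hypothetical.

```python
import numpy as np

def enrichment_factor(scores, is_active, fraction=0.01):
    """Ratio of the active rate in the top `fraction` of the ranking to the overall active rate."""
    scores = np.asarray(scores, dtype=float)
    is_active = np.asarray(is_active, dtype=bool)
    # Number of compounds in the top slice (at least one).
    n_top = max(1, int(np.ceil(fraction * scores.size)))
    # Indices of the highest-scoring compounds, best first.
    top = np.argsort(-scores, kind="stable")[:n_top]
    hit_rate_top = is_active[top].mean()
    hit_rate_all = is_active.mean()
    return hit_rate_top / hit_rate_all

# Made-up screening library: 1000 compounds, 10 actives, 2 of them ranked in the top 1%.
scores = np.arange(1000, 0, -1)           # compound 0 has the highest score
is_active = np.zeros(1000, dtype=bool)
is_active[[0, 1]] = True                  # two actives inside the top 10
is_active[100:108] = True                 # eight actives further down the list

ef = enrichment_factor(scores, is_active, fraction=0.01)
```

In this toy case the top 1% holds 2 actives among 10 compounds, against an overall active rate of 10 in 1000, giving an enrichment factor of 20.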
Affiliation(s)
- Yunierkis Perez-Castillo
- Escuela de Ciencias Físicas y Matemáticas, Universidad de Las Américas, Quito, Ecuador
- * E-mail: (YPC); (HPT)
- Eduardo Tejera
- Facultad de Ingenieria y Ciencias Agropecuarias, Universidad de Las Américas, Quito, Ecuador
- Maykel Cruz-Monteagudo
- CIQUP/Departamento de Química e Bioquímica, Faculdade de Ciências, Universidade do Porto, Porto, Portugal
- REQUIMTE/Departamento de Química e Bioquímica, Faculdade de Ciências, Universidade do Porto, Porto, Portugal
- Department of General Education, West Coast University—Miami Campus, Doral, Florida, United States of America
- Fernanda Borges
- CIQUP/Departamento de Química e Bioquímica, Faculdade de Ciências, Universidade do Porto, Porto, Portugal
- M. Natália D. S. Cordeiro
- REQUIMTE/Departamento de Química e Bioquímica, Faculdade de Ciências, Universidade do Porto, Porto, Portugal
- Huong Le-Thi-Thu
- VNU School of Medicine and Pharmacy, Vietnam National University, Hanoi, Vietnam
- Hai Pham-The
- Hanoi University of Pharmacy, Hanoi, Vietnam
- * E-mail: (YPC); (HPT)
4
Zhang W, Ji L, Chen Y, Tang K, Wang H, Zhu R, Jia W, Cao Z, Liu Q. When drug discovery meets web search: Learning to Rank for ligand-based virtual screening. J Cheminform 2015; 7:5. [PMID: 25705262 PMCID: PMC4333300 DOI: 10.1186/s13321-015-0052-z] [Citation(s) in RCA: 22] [Impact Index Per Article: 2.4] [Received: 09/24/2014] [Accepted: 01/07/2015] [Indexed: 11/30/2022]
Abstract
Background: The rapid increase in the emergence of novel chemical substances creates a substantial demand for more sophisticated computational methodologies for drug discovery. In this study, the idea of Learning to Rank from web search was applied to drug virtual screening, where it offers two unique capabilities: 1) it can identify compounds for novel targets when not enough training data are available for these targets, and 2) it can integrate heterogeneous data when compound affinities are measured on different platforms. Results: A standard pipeline was designed to carry out Learning to Rank in virtual screening. Six Learning to Rank algorithms were investigated based on two public datasets collected from Binding Database and the newly published Community Structure-Activity Resource benchmark dataset. The results demonstrate that Learning to Rank is an efficient computational strategy for drug virtual screening, particularly due to its novel use in cross-target virtual screening and heterogeneous data integration. Conclusions: To the best of our knowledge, we have introduced here the first application of Learning to Rank in virtual screening. The experimental workflow and algorithm assessment designed in this study will provide a standard protocol for other similar studies. All the datasets as well as the implementations of Learning to Rank algorithms are available at http://www.tongji.edu.cn/~qiliu/lor_vs.html. [Figure: The analogy between web search and ligand-based drug discovery]
Affiliation(s)
- Wei Zhang
- Department of Central Laboratory, Shanghai Tenth People's Hospital, School of Life Sciences and Technology, Tongji University, Shanghai, China
- Lijuan Ji
- Huai'an Second People's Hospital affiliated to Xuzhou Medical College, Huai'an, China
- Yanan Chen
- Department of Central Laboratory, Shanghai Tenth People's Hospital, School of Life Sciences and Technology, Tongji University, Shanghai, China
- Kailin Tang
- Department of Central Laboratory, Shanghai Tenth People's Hospital, School of Life Sciences and Technology, Tongji University, Shanghai, China
- Haiping Wang
- Department of Central Laboratory, Shanghai Tenth People's Hospital, School of Life Sciences and Technology, Tongji University, Shanghai, China; Department of Computer Science, Hefei University of Technology, Hefei, 230009 China
- Ruixin Zhu
- Department of Central Laboratory, Shanghai Tenth People's Hospital, School of Life Sciences and Technology, Tongji University, Shanghai, China
- Wei Jia
- R & D Information, AstraZeneca, Shanghai, China
- Zhiwei Cao
- Department of Central Laboratory, Shanghai Tenth People's Hospital, School of Life Sciences and Technology, Tongji University, Shanghai, China
- Qi Liu
- Department of Central Laboratory, Shanghai Tenth People's Hospital, School of Life Sciences and Technology, Tongji University, Shanghai, China
6
Ammad-ud-din M, Georgii E, Gönen M, Laitinen T, Kallioniemi O, Wennerberg K, Poso A, Kaski S. Integrative and Personalized QSAR Analysis in Cancer by Kernelized Bayesian Matrix Factorization. J Chem Inf Model 2014; 54:2347-59. [DOI: 10.1021/ci500152b] [Citation(s) in RCA: 77] [Impact Index Per Article: 7.7] [Indexed: 02/08/2023]
Affiliation(s)
- Muhammad Ammad-ud-din
- Helsinki Institute for Information Technology HIIT, Department of Information and Computer Science, Aalto University, P.O. Box 15400, Espoo 00076, Finland
- Elisabeth Georgii
- Helsinki Institute for Information Technology HIIT, Department of Information and Computer Science, Aalto University, P.O. Box 15400, Espoo 00076, Finland
- Mehmet Gönen
- Helsinki Institute for Information Technology HIIT, Department of Information and Computer Science, Aalto University, P.O. Box 15400, Espoo 00076, Finland
- Tuomo Laitinen
- School of Pharmacy, Faculty of Health Sciences, University of Eastern Finland, P.O. Box 1627, Kuopio 70211, Finland
- Olli Kallioniemi
- Institute for Molecular Medicine Finland FIMM, University of Helsinki, P.O. Box 20, Helsinki 00014, Finland
- Krister Wennerberg
- Institute for Molecular Medicine Finland FIMM, University of Helsinki, P.O. Box 20, Helsinki 00014, Finland
- Antti Poso
- School of Pharmacy, Faculty of Health Sciences, University of Eastern Finland, P.O. Box 1627, Kuopio 70211, Finland
- Division of Molecular Oncology of Solid Tumors, Department of Internal Medicine 1, University Hospital Tuebingen, Otfried Mueller-Strasse 10, 72076 Tuebingen, Germany
- Samuel Kaski
- Helsinki Institute for Information Technology HIIT, Department of Information and Computer Science, Aalto University, P.O. Box 15400, Espoo 00076, Finland
- Helsinki Institute for Information Technology HIIT, Department of Computer Science, University of Helsinki, P.O. Box 68, Helsinki 00014, Finland