1. Mastrolorito F, Togo MV, Gambacorta N, Trisciuzzi D, Giannuzzi V, Bonifazi F, Liantonio A, Imbrici P, De Luca A, Altomare CD, Ciriaco F, Amoroso N, Nicolotti O. TISBE: A Public Web Platform for the Consensus-Based Explainable Prediction of Developmental Toxicity. Chem Res Toxicol 2024; 37:323-339. [PMID: 38200616 DOI: 10.1021/acs.chemrestox.3c00310] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/12/2024]
Abstract
Despite being extremely relevant for the protection of prenatal and neonatal health, the developmental toxicity (Dev Tox) is a highly complex endpoint whose molecular rationale is still largely unknown. The lack of availability of high-quality data as well as robust nontesting methods makes its understanding even more difficult. Thus, the application of new explainable alternative methods is of utmost importance, with Dev Tox being one of the most animal-intensive research themes of regulatory toxicology. Descending from TIRESIA (Toxicology Intelligence and Regulatory Evaluations for Scientific and Industry Applications), the present work describes TISBE (TIRESIA Improved on Structure-Based Explainability), a new public web platform implementing four fundamental advancements for in silico analyses: a three times larger dataset, a transparent XAI (explainable artificial intelligence) framework employing a fragment-based fingerprint coding, a novel consensus classifier based on five independent machine learning models, and a new applicability domain (AD) method based on a double top-down approach for better estimating the prediction reliability. The training set (TS) includes as many as 1008 chemicals annotated with experimental toxicity values. Based on a 5-fold cross-validation, a median value of 0.410 for the Matthews correlation coefficient was calculated; TISBE was very effective, with a median value of sensitivity and specificity equal to 0.984 and 0.274, respectively. TISBE was applied on two external pools made of 1484 bioactive compounds and 85 pediatric drugs taken from ChEMBL (Chemical European Molecular Biology Laboratory) and TEDDY (Task-Force in Europe for Drug Development in the Young) repositories, respectively. Notably, TISBE gives users the option to clearly spot the molecular fragments responsible for the toxicity or the safety of a given chemical query and is available for free at https://prometheus.farmacia.uniba.it/tisbe.
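For readers less familiar with the statistics reported above, the following minimal sketch shows how sensitivity, specificity, and the Matthews correlation coefficient follow from a binary confusion matrix. The counts are hypothetical and chosen only to give round numbers; they are not taken from the TISBE study, and the function name `binary_metrics` is an assumption made for illustration.

```python
import math

def binary_metrics(tp, fn, tn, fp):
    """Return (sensitivity, specificity, mcc) from confusion-matrix counts."""
    # Sensitivity: fraction of true positives recovered.
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    # Specificity: fraction of true negatives recovered.
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    # MCC combines all four cells of the confusion matrix.
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return sensitivity, specificity, mcc

# Hypothetical counts (not from the paper).
sens, spec, mcc = binary_metrics(tp=90, fn=10, tn=30, fp=70)
print(f"sensitivity={sens:.3f} specificity={spec:.3f} MCC={mcc:.3f}")
```

Because the coefficient uses all four cells, a classifier with high sensitivity and low specificity can still obtain only a moderate value, as in this toy case.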
Affiliation(s)
- Fabrizio Mastrolorito
  - Dipartimento di Farmacia-Scienze del Farmaco, Università degli Studi di Bari Aldo Moro, 70125 Bari, Italy
- Maria Vittoria Togo
  - Dipartimento di Farmacia-Scienze del Farmaco, Università degli Studi di Bari Aldo Moro, 70125 Bari, Italy
- Nicola Gambacorta
  - Dipartimento di Farmacia-Scienze del Farmaco, Università degli Studi di Bari Aldo Moro, 70125 Bari, Italy
- Daniela Trisciuzzi
  - Dipartimento di Farmacia-Scienze del Farmaco, Università degli Studi di Bari Aldo Moro, 70125 Bari, Italy
- Viviana Giannuzzi
  - Fondazione per la Ricerca Farmacologica Gianni Benzi Onlus, 70010 Valenzano (BA), Italy
- Fedele Bonifazi
  - Fondazione per la Ricerca Farmacologica Gianni Benzi Onlus, 70010 Valenzano (BA), Italy
- Antonella Liantonio
  - Dipartimento di Farmacia-Scienze del Farmaco, Università degli Studi di Bari Aldo Moro, 70125 Bari, Italy
- Paola Imbrici
  - Dipartimento di Farmacia-Scienze del Farmaco, Università degli Studi di Bari Aldo Moro, 70125 Bari, Italy
- Annamaria De Luca
  - Dipartimento di Farmacia-Scienze del Farmaco, Università degli Studi di Bari Aldo Moro, 70125 Bari, Italy
- Cosimo Damiano Altomare
  - Dipartimento di Farmacia-Scienze del Farmaco, Università degli Studi di Bari Aldo Moro, 70125 Bari, Italy
- Fulvio Ciriaco
  - Dipartimento di Chimica, Università degli Studi di Bari Aldo Moro, 70125 Bari, Italy
- Nicola Amoroso
  - Dipartimento di Farmacia-Scienze del Farmaco, Università degli Studi di Bari Aldo Moro, 70125 Bari, Italy
  - Istituto Nazionale di Fisica Nucleare, Sezione di Bari, 70125 Bari, Italy
- Orazio Nicolotti
  - Dipartimento di Farmacia-Scienze del Farmaco, Università degli Studi di Bari Aldo Moro, 70125 Bari, Italy
2. Lamens A, Bajorath J. Generation of Molecular Counterfactuals for Explainable Machine Learning Based on Core-Substituent Recombination. ChemMedChem 2024; 19:e202300586. [PMID: 37983655 DOI: 10.1002/cmdc.202300586] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/27/2023] [Revised: 11/20/2023] [Accepted: 11/20/2023] [Indexed: 11/22/2023]
Abstract
The use of black box machine learning models whose decisions cannot be understood limits the acceptance of predictions in interdisciplinary research and camouflages artificial learning characteristics leading to predictions for other than anticipated reasons. Consequently, there is increasing interest in explainable artificial intelligence to rationalize predictions and uncover potential pitfalls. Among others, relevant approaches include feature attribution methods to identify molecular structures determining predictions and counterfactuals (CFs) or contrastive explanations. CFs are defined as variants of test instances with minimal modifications leading to opposing predictions. In medicinal chemistry, CFs have thus far only been little investigated although they are particularly intuitive from a chemical perspective. We introduce a new methodology for the systematic generation of CFs that is centered on well-defined structural analogues of test compounds. The approach is transparent, computationally straightforward, and shown to provide a wealth of CFs for test sets. The method is made freely available.
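The definition of a counterfactual given above (a minimally modified variant of a test instance that receives the opposite prediction) can be illustrated with a toy sketch. Everything here is hypothetical: compounds are reduced to a (core, substituent) pair of labels, `toy_model` is an arbitrary stand-in classifier, and the single-exchange enumeration is an assumption made for illustration, not the authors' published procedure.

```python
def toy_model(core, substituent):
    """Hypothetical stand-in classifier: 1 = active, 0 = inactive."""
    return 1 if (core == "indole" and substituent in {"F", "Cl"}) else 0

def counterfactuals(test_compound, cores, substituents, predict):
    """Return analogues that differ in exactly one part (core or
    substituent) and whose prediction is opposite to the test compound's."""
    core, sub = test_compound
    original = predict(core, sub)
    # Analogues with an exchanged core, then with an exchanged substituent.
    analogues = [(c, sub) for c in cores if c != core]
    analogues += [(core, s) for s in substituents if s != sub]
    return [a for a in analogues if predict(*a) != original]

# Hypothetical building blocks (labels only, no real chemistry involved).
cores = ["indole", "benzene", "pyridine"]
substituents = ["F", "Cl", "OH", "CH3"]

cfs = counterfactuals(("indole", "F"), cores, substituents, toy_model)
for cf in cfs:
    print(cf)
```

In this toy setting the analogue ("indole", "Cl") is not reported, because its prediction matches that of the test compound and it therefore does not qualify as a counterfactual.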
Affiliation(s)
- Alec Lamens
  - Department of Life Science Informatics and Data Science B-IT, LIMES Program Unit Chemical Biology and Medicinal Chemistry, Rheinische Friedrich-Wilhelms-Universität, Friedrich-Hirzebruch-Allee 5/6, 53115, Bonn, Germany
- Jürgen Bajorath
  - Department of Life Science Informatics and Data Science B-IT, LIMES Program Unit Chemical Biology and Medicinal Chemistry, Rheinische Friedrich-Wilhelms-Universität, Friedrich-Hirzebruch-Allee 5/6, 53115, Bonn, Germany
  - Lamarr Institute for Machine Learning and Artificial Intelligence, Rheinische Friedrich-Wilhelms-Universität Bonn, Friedrich-Hirzebruch-Allee 5/6, 53115, Bonn, Germany
3. Gambacorta N, Ciriaco F, Amoroso N, Altomare CD, Bajorath J, Nicolotti O. CIRCE: Web-Based Platform for the Prediction of Cannabinoid Receptor Ligands Using Explainable Machine Learning. J Chem Inf Model 2023; 63:5916-5926. [PMID: 37675493 DOI: 10.1021/acs.jcim.3c00914] [Citation(s) in RCA: 9] [Impact Index Per Article: 9.0] [Indexed: 09/08/2023]
Abstract
The endocannabinoid system, which includes cannabinoid receptor 1 and 2 subtypes (CB1R and CB2R, respectively), is responsible for the onset of various pathologies including neurodegeneration, cancer, neuropathic and inflammatory pain, obesity, and inflammatory bowel disease. Given the high similarity of CB1R and CB2R, generating subtype-selective ligands is still an open challenge. In this work, the Cannabinoid Iterative Revaluation for Classification and Explanation (CIRCE) compound prediction platform has been generated based on explainable machine learning to support the design of selective CB1R and CB2R ligands. Multilayer classifiers were combined with Shapley value analysis to facilitate explainable predictions. In test calculations, CIRCE predictions reached ∼80% accuracy and structural features determining ligand predictions were rationalized. CIRCE was designed as a web-based prediction platform that is made freely available as a part of our study.
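As background on the Shapley value analysis mentioned above, the sketch below computes exact Shapley values for a small toy value function by averaging each feature's marginal contribution over all feature orderings. The feature names and the value function are hypothetical and unrelated to the CIRCE models; exact enumeration is only practical for a handful of features, and real applications typically rely on approximations.

```python
from itertools import permutations

def shapley_values(value, features):
    """Exact Shapley values: average marginal contribution of each
    feature over every ordering in which features can be added."""
    contrib = {f: 0.0 for f in features}
    orderings = list(permutations(features))
    for order in orderings:
        present = set()
        for f in order:
            before = value(frozenset(present))
            present.add(f)
            contrib[f] += value(frozenset(present)) - before
    return {f: c / len(orderings) for f, c in contrib.items()}

def toy_value(subset):
    """Hypothetical model output when only the features in `subset` are present."""
    score = 0.0
    if "ring" in subset:
        score += 0.5
    if "amide" in subset:
        score += 0.3
    if "ring" in subset and "amide" in subset:
        score += 0.2  # interaction term, shared equally by both features
    return score

phi = shapley_values(toy_value, ["ring", "amide", "methyl"])
for name, val in phi.items():
    print(f"{name}: {val:.3f}")
```

The values sum to the difference between the output with all features present and the output with none, which is the efficiency property that makes Shapley values usable as additive feature attributions.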
Affiliation(s)
- Nicola Gambacorta
  - Dipartimento di Farmacia Scienze del Farmaco, Università degli Studi di Bari "Aldo Moro", Via E. Orabona, 4, I-70125 Bari, Italy
  - Department of Life Science Informatics and Data Science, B-IT, LIMES Program Unit Chemical Biology and Medicinal Chemistry, Rheinische Friedrich-Wilhelms-Universität, Friedrich-Hirzebruch-Allee 5/6, D-53115 Bonn, Germany
- Fulvio Ciriaco
  - Dipartimento di Chimica, Università degli Studi di Bari "Aldo Moro", Via E. Orabona, 4, I-70125 Bari, Italy
- Nicola Amoroso
  - Dipartimento di Farmacia Scienze del Farmaco, Università degli Studi di Bari "Aldo Moro", Via E. Orabona, 4, I-70125 Bari, Italy
- Cosimo Damiano Altomare
  - Dipartimento di Farmacia Scienze del Farmaco, Università degli Studi di Bari "Aldo Moro", Via E. Orabona, 4, I-70125 Bari, Italy
- Jürgen Bajorath
  - Department of Life Science Informatics and Data Science, B-IT, LIMES Program Unit Chemical Biology and Medicinal Chemistry, Rheinische Friedrich-Wilhelms-Universität, Friedrich-Hirzebruch-Allee 5/6, D-53115 Bonn, Germany
- Orazio Nicolotti
  - Dipartimento di Farmacia Scienze del Farmaco, Università degli Studi di Bari "Aldo Moro", Via E. Orabona, 4, I-70125 Bari, Italy
4. Muegge I, Hu Y. How do we further enhance 2D fingerprint similarity searching for novel drug discovery? Expert Opin Drug Discov 2022; 17:1173-1176. [PMID: 36150044 DOI: 10.1080/17460441.2022.2128332] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/11/2023]
Affiliation(s)
- Yuan Hu
  - Alkermes, Inc, Waltham, Massachusetts, USA
5. Devillers J, Sartor V, Devillers H. Predicting mosquito repellents for clothing application from molecular fingerprint-based artificial neural network SAR models. SAR and QSAR in Environmental Research 2022; 33:729-751. [PMID: 36106833 DOI: 10.1080/1062936x.2022.2124014] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/22/2022] [Accepted: 09/06/2022] [Indexed: 06/15/2023]
Abstract
Spraying repellents on clothing limits toxicity and allergy problems that can occur when the repellents are directly applied to skin. This also allows the use of higher doses to ensure longer lasting effects. As the number of repellents available on the market is limited, it is necessary to propose new ones, especially by using in silico methods that reduce costs and time. In this context SAR models were built from a dataset of 2027 chemicals for which repellent activity on clothing was measured against Aedes aegypti. The interest of using either the ECFP or MACCS fingerprints as input neurons of a three-layer perceptron was evaluated. Transformation of MACCS bit strings into disjunctive tables led to interesting results. Models obtained with both types of fingerprints were compared to a model including physicochemical and topological descriptors.
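The transformation of bit strings into disjunctive tables mentioned above can be sketched as follows, assuming the term refers to complete disjunctive coding, in which every categorical variable is expanded into one indicator column per category. For a binary fingerprint bit this yields two columns, (absent, present). The abstract does not describe the authors' exact procedure, and the four-bit input is a hypothetical excerpt rather than a real MACCS key set.

```python
def to_disjunctive(bits):
    """Expand each binary bit into two indicator columns: (absent, present)."""
    row = []
    for b in bits:
        if b not in (0, 1):
            raise ValueError("fingerprint bits must be 0 or 1")
        row.extend([1 - b, b])
    return row

# Hypothetical 4-bit excerpt of a fingerprint (not real MACCS keys).
fingerprint = [1, 0, 0, 1]
encoded = to_disjunctive(fingerprint)
print(encoded)
```

With this coding, the absence of a substructure gets its own active input neuron instead of a zero input, so the input layer doubles in width.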
Affiliation(s)
- V Sartor
  - Laboratoire des IMRCP, Université de Toulouse, CNRS UMR 5623, Université Toulouse III - Paul Sabatier, Toulouse, France
- H Devillers
  - SPO, Univ Montpellier, INRAE, Institut Agro, Montpellier, France