1
Trunzer M, Teigão J, Huth F, Poller B, Desrayaud S, Rodríguez-Pérez R, Faller B. Improving In Vitro-In Vivo Extrapolation of Clearance Using Rat Liver Microsomes for Highly Plasma Protein-Bound Molecules. Drug Metab Dispos 2024; 52:345-354. [PMID: 38360916 DOI: 10.1124/dmd.123.001597] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/03/2023] [Revised: 02/07/2024] [Accepted: 02/12/2024] [Indexed: 02/17/2024] Open
Abstract
It is common practice in drug discovery and development to predict in vivo hepatic clearance from in vitro incubations with liver microsomes or hepatocytes using the well-stirred model (WSM). When applying the WSM to a set of approximately 3000 Novartis research compounds, 73% of neutral and basic compounds (extended clearance classification system [ECCS] class 2) were well-predicted within 3-fold. In contrast, only 44% (ECCS class 1A) or 34% (ECCS class 1B) of acids were predicted within 3-fold. To explore the hypothesis that the higher degree of plasma protein binding for acids contributes to the in vitro-in vivo correlation (IVIVC) disconnect, 68 proprietary compounds were incubated with rat liver microsomes in the presence and absence of 5% plasma. A minor impact of plasma on clearance IVIVC was found for moderately bound compounds (fraction unbound in plasma [fup] ≥1%). However, addition of plasma significantly improved the IVIVC for highly bound compounds (fup <1%), as indicated by an increase in the average fold error from 0.10 to 0.36. Correlating fup with the scaled unbound intrinsic clearance ratio in the presence or absence of plasma allowed the establishment of an empirical, nonlinear correction equation that depends on fup. Taken together, estimation of the metabolic clearance of highly bound compounds was enhanced by the addition of plasma to microsomal incubations. For standard incubations in buffer only, application of an empirical correction provided improved clearance predictions.
SIGNIFICANCE STATEMENT: Application of the well-stirred liver model for clearance in vitro-in vivo extrapolation (IVIVE) in rat generally underpredicts the clearance of acids, and the strong protein binding of acids is suspected to be one responsible factor. Unbound intrinsic in vitro clearance (CLint,u) determinations using rat liver microsomes supplemented with 5% plasma resulted in an improved IVIVE. An empirical equation was derived that can be applied to correct CLint,u values as a function of the fraction unbound in plasma (fup) and the measured CLint in buffer.
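As a rough illustration of the quantities named in this abstract, the sketch below computes a well-stirred-model clearance and checks whether a prediction falls within 3-fold of an observation. It uses the textbook form of the well-stirred model and takes fold error as predicted divided by observed, which is an assumption consistent with values below 1 indicating underprediction. The function names, the blood flow value, and all numbers are hypothetical; the paper's empirical fup-dependent correction is not given in the abstract and is not reproduced here.

```python
def well_stirred_clearance(cl_int_u, fu, q_h):
    """Hepatic clearance under the textbook well-stirred model.

    cl_int_u : scaled unbound intrinsic clearance (same units as q_h)
    fu       : fraction unbound (0 to 1)
    q_h      : hepatic blood flow
    """
    return q_h * fu * cl_int_u / (q_h + fu * cl_int_u)


def fold_error(predicted, observed):
    """Predicted / observed clearance; values below 1 are underpredictions."""
    return predicted / observed


def within_fold(predicted, observed, fold=3.0):
    """True if the prediction is within `fold` of the observation in either direction."""
    return 1.0 / fold <= fold_error(predicted, observed) <= fold


# Illustrative numbers only (not taken from the paper).
Q_H = 55.0  # assumed hepatic blood flow, mL/min/kg
cl_pred = well_stirred_clearance(cl_int_u=1000.0, fu=0.01, q_h=Q_H)
print(round(cl_pred, 2))                     # 8.46
print(within_fold(cl_pred, observed=20.0))   # True
print(within_fold(2.0, observed=20.0))       # False
```

Because fu multiplies the intrinsic clearance, a very small fraction unbound drives the predicted clearance down sharply, which is one way to see why highly bound compounds are the ones most prone to underprediction.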
Affiliation(s)
- Markus Trunzer
  - Pharmacokinetic Sciences, Novartis Pharma AG, Basel, Switzerland
- Joana Teigão
  - Pharmacokinetic Sciences, Novartis Pharma AG, Basel, Switzerland
- Felix Huth
  - Pharmacokinetic Sciences, Novartis Pharma AG, Basel, Switzerland
- Birk Poller
  - Pharmacokinetic Sciences, Novartis Pharma AG, Basel, Switzerland
- Bernard Faller
  - Pharmacokinetic Sciences, Novartis Pharma AG, Basel, Switzerland
2
Fluetsch A, Trunzer M, Gerebtzoff G, Rodríguez-Pérez R. Deep Learning Models Compared to Experimental Variability for the Prediction of CYP3A4 Time-Dependent Inhibition. Chem Res Toxicol 2024; 37:549-560. [PMID: 38501689 DOI: 10.1021/acs.chemrestox.3c00305] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/20/2024]
Abstract
Most drugs are mainly metabolized by cytochrome P450 (CYP450), which can lead to drug-drug interactions (DDI). Specifically, time-dependent inhibition (TDI) of the CYP3A4 isoenzyme has been associated with clinically relevant DDI. To overcome potential DDI issues, high-throughput in vitro assays were established to assess the TDI of CYP3A4 during the discovery and lead optimization phases. However, in silico machine learning models would enable an earlier and larger-scale assessment of potential TDI liabilities. For CYP inhibition, most modeling efforts have focused on highly imbalanced and small data sets. Moreover, assay variability is rarely considered, although it is key to understanding a model's quality and suitability for decision-making. In this work, machine learning models were built for the prediction of TDI of CYP3A4, evaluated prospectively, and compared to the variability of the experimental assay. Different modeling strategies were investigated to assess their influence on the model's performance. Through multitask learning, additional data sets were leveraged for model building, coming from public databases, in-house CYP-related assays, or other pharmaceutical companies (federated learning). Apart from the numerical prediction of inactivation rates of CYP3A4 TDI, three-class predictions were carried out, giving a negative (inactivation rate kobs < 0.01 min⁻¹), weak positive (0.01 ≤ kobs ≤ 0.025 min⁻¹), or positive (kobs > 0.025 min⁻¹) output. The final multitask graph neural network model achieved misclassification rates of 8 and 7% for positive and negative TDI, respectively. Importantly, the presented deep learning-based predictions had a similar precision to the reproducibility of in vitro experiments and thus offered great opportunities for drug design, early de-risking of DDI potential, and selection of experiments. To facilitate CYP inhibition modeling efforts in the public domain, the developed model was used to annotate ∼16 000 publicly available structures, and a surrogate data set is shared as Supporting Information.
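The three-class scheme described in this abstract can be written as a small threshold function. This is only a sketch of the binning rule as stated in the text; the function name and class labels as Python strings are illustrative choices, and nothing here reflects how the authors' models produce the underlying kobs estimates.

```python
def classify_tdi(kobs_per_min):
    """Bin a CYP3A4 inactivation rate (kobs, in 1/min) into the three classes
    quoted in the abstract: negative, weak positive, positive."""
    if kobs_per_min < 0.01:
        return "negative"
    if kobs_per_min <= 0.025:
        return "weak positive"
    return "positive"


for k in (0.005, 0.01, 0.025, 0.03):
    print(k, classify_tdi(k))
```

Both boundary values (0.01 and 0.025 min⁻¹) fall in the weak positive class, matching the inclusive inequalities given in the abstract.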
Affiliation(s)
- Andrin Fluetsch
  - Novartis Biomedical Research, Novartis Campus, CH-4002 Basel, Switzerland
- Markus Trunzer
  - Novartis Biomedical Research, Novartis Campus, CH-4002 Basel, Switzerland
- Grégori Gerebtzoff
  - Novartis Biomedical Research, Novartis Campus, CH-4002 Basel, Switzerland
3
Fluetsch A, Di Lascio E, Gerebtzoff G, Rodríguez-Pérez R. Adapting Deep Learning QSPR Models to Specific Drug Discovery Projects. Mol Pharm 2024; 21:1817-1826. [PMID: 38373038 DOI: 10.1021/acs.molpharmaceut.3c01124] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/20/2024]
Abstract
Medicinal chemistry and drug design efforts can be assisted by machine learning (ML) models that relate the molecular structure to compound properties. Such quantitative structure-property relationship models are generally trained on large data sets that include diverse chemical series (global models). In the pharmaceutical industry, these ML global models are available across discovery projects as an "out-of-the-box" solution to assist in drug design, synthesis prioritization, and experiment selection. However, drug discovery projects typically focus on confined parts of the chemical space (e.g., chemical series), where global models might not be applicable. Local ML models are sometimes generated to focus on specific projects or series. Herein, ML-based global models, local models, and hybrid global-local strategies were benchmarked. Analyses were done for more than 300 drug discovery projects at Novartis and ten absorption, distribution, metabolism, and excretion (ADME) assays. In this work, hybrid global-local strategies based on transfer learning approaches were proposed to leverage both historical ADME data (global) and project-specific data (local) to adapt model predictions. Fine-tuning a pretrained global ML model (used for weights' initialization, WI) was the top-performing method. Average improvements of mean absolute errors across all assays were 16% and 27% compared with global and local models, respectively. Interestingly, when the effect of training set size was analyzed, WI fine-tuning was found to be successful even in low-data scenarios (e.g., ∼10 molecules per project). Taken together, this work highlights the potential of domain adaptation in the field of molecular property predictions to refine existing pretrained models on a new compound data distribution.
Affiliation(s)
- Andrin Fluetsch
  - Novartis Biomedical Research, Novartis Campus, Basel 4002, Switzerland
- Elena Di Lascio
  - Novartis Biomedical Research, Novartis Campus, Basel 4002, Switzerland
4
Lluch-Bernal M, Pedrosa M, Domínguez-Ortega J, Colque-Bayona M, Correa-Borit J, Phillips-Anglés E, Gómez-Traseira C, Quirce S, Rodríguez-Pérez R. Sensitization to Quercus ilex pollen is clinically relevant in patients with seasonal pollen allergy. J Investig Allergol Clin Immunol 2024; 34:0. [PMID: 38381081 DOI: 10.18176/jiaci.0998] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/22/2024] Open
Affiliation(s)
- M Lluch-Bernal
  - Allergy Research Group, Hospital La Paz Institute for Health Research (IdiPAZ), Madrid, Spain
  - Department of Allergy, La Paz University Hospital, Madrid, Spain
- M Pedrosa
  - Allergy Research Group, Hospital La Paz Institute for Health Research (IdiPAZ), Madrid, Spain
  - Department of Allergy, La Paz University Hospital, Madrid, Spain
  - Centro de Investigación Biomédica en Red de Enfermedades Raras CIBERER, Madrid, Spain
- J Domínguez-Ortega
  - Allergy Research Group, Hospital La Paz Institute for Health Research (IdiPAZ), Madrid, Spain
  - Department of Allergy, La Paz University Hospital, Madrid, Spain
  - Centro de Investigación Biomédica en Red de Enfermedades Respiratorias CIBERES, Madrid, Spain
- M Colque-Bayona
  - Allergy Research Group, Hospital La Paz Institute for Health Research (IdiPAZ), Madrid, Spain
- J Correa-Borit
  - Allergy Research Group, Hospital La Paz Institute for Health Research (IdiPAZ), Madrid, Spain
- E Phillips-Anglés
  - Allergy Research Group, Hospital La Paz Institute for Health Research (IdiPAZ), Madrid, Spain
  - Department of Allergy, La Paz University Hospital, Madrid, Spain
- C Gómez-Traseira
  - Allergy Research Group, Hospital La Paz Institute for Health Research (IdiPAZ), Madrid, Spain
  - Department of Allergy, La Paz University Hospital, Madrid, Spain
- S Quirce
  - Allergy Research Group, Hospital La Paz Institute for Health Research (IdiPAZ), Madrid, Spain
  - Department of Allergy, La Paz University Hospital, Madrid, Spain
  - Centro de Investigación Biomédica en Red de Enfermedades Respiratorias CIBERES, Madrid, Spain
- R Rodríguez-Pérez
  - Department of Allergy, La Paz University Hospital, Madrid, Spain
  - Centro de Investigación Biomédica en Red de Enfermedades Respiratorias CIBERES, Madrid, Spain
5
Narváez-Fernández E, Pose K, Caballero ML, Rodríguez-Pérez R, Quirce S. Occupational asthma and food allergy due to soybean in a bakery worker. J Investig Allergol Clin Immunol 2023; 34:0. [PMID: 37905416 DOI: 10.18176/jiaci.0958] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/02/2023] Open
Affiliation(s)
- K Pose
  - Department of Allergy, Hospital Universitario La Paz, Madrid, Spain
- M L Caballero
  - Department of Allergy, Hospital Universitario La Paz, Madrid, Spain
  - Hospital La Paz Institute for Health Research, IdiPAZ, Madrid, Spain
  - CIBERES, CIBER of Respiratory Diseases, Madrid, Spain
- R Rodríguez-Pérez
  - Department of Allergy, Hospital Universitario La Paz, Madrid, Spain
  - Hospital La Paz Institute for Health Research, IdiPAZ, Madrid, Spain
  - CIBERES, CIBER of Respiratory Diseases, Madrid, Spain
- S Quirce
  - Department of Allergy, Hospital Universitario La Paz, Madrid, Spain
  - Hospital La Paz Institute for Health Research, IdiPAZ, Madrid, Spain
  - CIBERES, CIBER of Respiratory Diseases, Madrid, Spain
6
Rodríguez-Pérez R, Del Pozuelo S, Pulido E, Brigido C, Carretero P, Caballero ML. Chemiluminescence-based IgE dot-blot assay to diagnose a case of anaphylaxis caused by Prontosan. J Investig Allergol Clin Immunol 2023; 34:0. [PMID: 37669080 DOI: 10.18176/jiaci.0933] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 09/06/2023] Open
Affiliation(s)
- R Rodríguez-Pérez
  - Allergy Research Group, La Paz Hospital Institute for Health Research (IdiPAZ), Madrid, Spain
- S Del Pozuelo
  - Department of Allergy, Hospital Universitario, Burgos, Spain
- E Pulido
  - Allergy Research Group, La Paz Hospital Institute for Health Research (IdiPAZ), Madrid, Spain
- C Brigido
  - Department of Allergy, Hospital Universitario, Burgos, Spain
- P Carretero
  - Department of Allergy, Hospital Universitario, Burgos, Spain
- M L Caballero
  - Allergy Research Group, La Paz Hospital Institute for Health Research (IdiPAZ), Madrid, Spain
  - Department of Allergy, La Paz University Hospital, Madrid, Spain
7
Amara K, Rodríguez-Pérez R, Jiménez-Luna J. Explaining compound activity predictions with a substructure-aware loss for graph neural networks. J Cheminform 2023; 15:67. [PMID: 37491407 PMCID: PMC10369817 DOI: 10.1186/s13321-023-00733-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/04/2023] [Accepted: 07/08/2023] [Indexed: 07/27/2023] Open
Abstract
Explainable machine learning is increasingly used in drug discovery to help rationalize compound property predictions. Feature attribution techniques are popular choices to identify which molecular substructures are responsible for a predicted property change. However, established molecular feature attribution methods have so far displayed low performance for popular deep learning algorithms such as graph neural networks (GNNs), especially when compared with simpler modeling alternatives such as random forests coupled with atom masking. To mitigate this problem, a modification of the regression objective for GNNs is proposed to specifically account for common core structures between pairs of molecules. The presented approach shows higher accuracy on a recently-proposed explainability benchmark. This methodology has the potential to assist with model explainability in drug discovery pipelines, particularly in lead optimization efforts where specific chemical series are investigated.
Affiliation(s)
- Kenza Amara
  - Microsoft Research AI4Science, 21 Station Rd., Cambridge, CB1 2FB UK
  - Department of Computer Science, ETH Zurich, Andreasstrasse 5, 8050 Zurich, Switzerland
- José Jiménez-Luna
  - Microsoft Research AI4Science, 21 Station Rd., Cambridge, CB1 2FB UK
8
Di Lascio E, Gerebtzoff G, Rodríguez-Pérez R. Systematic Evaluation of Local and Global Machine Learning Models for the Prediction of ADME Properties. Mol Pharm 2023; 20:1758-1767. [PMID: 36745394 DOI: 10.1021/acs.molpharmaceut.2c00962] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/07/2023]
Abstract
Machine learning (ML) has become an indispensable tool to predict absorption, distribution, metabolism, and excretion (ADME) properties in pharmaceutical research. ML algorithms are trained on molecular structures and corresponding ADME assay data to develop quantitative structure-property relationship (QSPR) models. Traditional QSPR models were trained on compound sets of limited size. With the advent of more complex ML algorithms and data availability, training sets have become larger and more diverse. The most common training approaches consist of training a model either with a small set of similar compounds, namely, compounds designed for the same drug discovery project or chemical series (local model approach), or with a larger set of diverse compounds (global model approach). Global models are built with all experimental data available for an assay, combining compound data from different projects and disease areas. Despite the ML progress made so far, the choice of the appropriate data composition for building ML models is still unclear. Herein, a systematic evaluation of local and global ML models was performed for 10 different experimental assays and 112 drug discovery projects. Results show a consistently superior performance of global models for ADME property predictions. Diagnostic analyses were also carried out to investigate the influence of training set size, structural diversity, and data shift on the relative performance of local and global ML models. Training set size and structural diversity did not have an impact on the relative performance of the methods. Instead, data shift helped to identify the projects with larger performance differences between local and global models. Results presented in this work can be leveraged to improve ML-based ADME property predictions and thus decision-making in drug discovery projects.
Affiliation(s)
- Elena Di Lascio
  - Novartis Institutes for Biomedical Research, Novartis Campus, Basel CH-4002, Switzerland
- Grégori Gerebtzoff
  - Novartis Institutes for Biomedical Research, Novartis Campus, Basel CH-4002, Switzerland
9
Rodríguez-Pérez R, Trunzer M, Schneider N, Faller B, Gerebtzoff G. Multispecies Machine Learning Predictions of In Vitro Intrinsic Clearance with Uncertainty Quantification Analyses. Mol Pharm 2023; 20:383-394. [PMID: 36437712 DOI: 10.1021/acs.molpharmaceut.2c00680] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/29/2022]
Abstract
In pharmaceutical research, compounds are optimized for metabolic stability to avoid a too fast elimination of the drug. Intrinsic clearance (CLint) measured in liver microsomes or hepatocytes is an important parameter during lead optimization. In this work, machine learning models were developed to relate the compound structure to microsomal metabolic stability and predict CLint for new compounds. A multitask (MT) learning architecture was introduced to model the CLint of six species simultaneously, giving as a result a multispecies machine learning model. MT graph neural network (MT-GNN) regression was identified as the top-performing method, and an ensemble of 10 MT-GNN models was evaluated prospectively. Geometric mean fold errors were consistently smaller than 2-fold. Moreover, high precision values were obtained in the prediction of "high" (>300 μL/min/mg) and "low" (<100 μL/min/mg) CLint compounds. Precision values ranged from 80 to 94% for low CLint predictions and from 75 to 97% for high CLint predictions, depending on the species. Uncertainty on experimental values and model predictions was systematically quantified. Experimental variability (aleatoric uncertainty) of all historical Novartis in vitro clearance experiments was analyzed. Interestingly, MT-GNN models' performance approached assays' experimental variability. Moreover, uncertainty estimation in predictions (epistemic uncertainty) enabled identifying predictions associated with lower and higher error. Taken together, our manuscript combines a multispecies deep learning model and large-scale uncertainty analyses to improve CLint predictions and facilitate early informed decisions for compound prioritization.
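The two evaluation measures mentioned in this abstract can be sketched as follows. The geometric mean fold error is written in its commonly used form, 10 raised to the mean absolute log10 ratio, which is an assumption since the abstract does not spell out the formula. The "medium" label for values between the two quoted thresholds and all example numbers are hypothetical.

```python
import math


def gmfe(predicted, observed):
    """Geometric mean fold error: 10 ** mean(|log10(pred / obs)|)."""
    logs = [abs(math.log10(p / o)) for p, o in zip(predicted, observed)]
    return 10 ** (sum(logs) / len(logs))


def clint_class(clint):
    """Bin CLint (uL/min/mg) using the thresholds quoted in the abstract."""
    if clint > 300:
        return "high"
    if clint < 100:
        return "low"
    return "medium"  # label for the in-between range is an assumption


def precision(predicted, observed, label):
    """Fraction of compounds predicted as `label` whose measured class is also `label`."""
    hits = [clint_class(o) == label
            for p, o in zip(predicted, observed) if clint_class(p) == label]
    return sum(hits) / len(hits) if hits else float("nan")


# Illustrative values only.
pred = [50.0, 80.0, 400.0]
obs = [60.0, 150.0, 500.0]
print(round(gmfe([10.0, 100.0], [20.0, 50.0]), 3))  # 2.0
print(precision(pred, obs, "low"))                   # 0.5
print(precision(pred, obs, "high"))                  # 1.0
```

Taking the absolute value of the log ratio means over- and underpredictions of the same size count equally, so a GMFE below 2 indicates that predictions are, on average, within 2-fold of the measurement.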
Affiliation(s)
- Markus Trunzer
  - Novartis Institutes for Biomedical Research, Novartis Campus, Basel CH-4002, Switzerland
- Nadine Schneider
  - Novartis Institutes for Biomedical Research, Novartis Campus, Basel CH-4002, Switzerland
- Bernard Faller
  - Novartis Institutes for Biomedical Research, Novartis Campus, Basel CH-4002, Switzerland
- Grégori Gerebtzoff
  - Novartis Institutes for Biomedical Research, Novartis Campus, Basel CH-4002, Switzerland
10
Bajorath J, Chávez-Hernández AL, Duran-Frigola M, Fernández-de Gortari E, Gasteiger J, López-López E, Maggiora GM, Medina-Franco JL, Méndez-Lucio O, Mestres J, Miranda-Quintana RA, Oprea TI, Plisson F, Prieto-Martínez FD, Rodríguez-Pérez R, Rondón-Villarreal P, Saldívar-Gonzalez FI, Sánchez-Cruz N, Valli M. Chemoinformatics and artificial intelligence colloquium: progress and challenges in developing bioactive compounds. J Cheminform 2022; 14:82. [PMID: 36461094 PMCID: PMC9716667 DOI: 10.1186/s13321-022-00661-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/09/2022] [Accepted: 11/25/2022] [Indexed: 12/03/2022] Open
Abstract
We report the main conclusions of the first Chemoinformatics and Artificial Intelligence Colloquium, Mexico City, June 15-17, 2022. Fifteen lectures were presented during a virtual public event with speakers from industry, academia, and not-for-profit organizations. Twelve hundred and ninety students and academics from more than 60 countries took part. During the meeting, applications, challenges, and opportunities in drug discovery, de novo drug design, ADME-Tox (absorption, distribution, metabolism, excretion, and toxicity) property predictions, organic chemistry, peptides, and antibiotic resistance were discussed. The program and the recordings of all sessions are freely available at https://www.difacquim.com/english/events/2022-colloquium/ .
Affiliation(s)
- Jürgen Bajorath
  - Department of Life Science Informatics, B-IT, LIMES Program Unit Chemical Biology and Medicinal Chemistry, Rheinische Friedrich-Wilhelms-Universität, Friedrich-Hirzebruch-Allee 5/6, 53113 Bonn, Germany
- Ana L. Chávez-Hernández
  - DIFACQUIM Research Group, Department of Pharmacy, School of Chemistry, National Autonomous University of Mexico, 04510 Mexico City, Mexico
- Miquel Duran-Frigola
  - Ersilia Open Source Initiative, Cambridge, UK
  - Joint IRB-BSC-CRG Programme in Computational Biology, Institute for Research in Biomedicine (IRB Barcelona), The Barcelona Institute of Science and Technology, Barcelona, Catalonia Spain
- Eli Fernández-de Gortari
  - Nanosafety Laboratory, International Iberian Nanotechnology Laboratory, 4715-330 Braga, Portugal
- Johann Gasteiger
  - Computer-Chemie-Centrum, University of Erlangen-Nuremberg, Erlangen, Germany
- Edgar López-López
  - DIFACQUIM Research Group, Department of Pharmacy, School of Chemistry, National Autonomous University of Mexico, 04510 Mexico City, Mexico
  - Department of Pharmacology, Center for Research and Advanced Studies of the National Polytechnic Institute (CINVESTAV), 07360 Mexico City, Mexico
- Gerald M. Maggiora
  - BIO5 Institute, University of Arizona, Tucson, AZ 85721 USA
- José L. Medina-Franco
  - DIFACQUIM Research Group, Department of Pharmacy, School of Chemistry, National Autonomous University of Mexico, 04510 Mexico City, Mexico
- Jordi Mestres
  - Chemotargets SL, Baldiri Reixac 4, Parc Cientific de Barcelona (PCB), 08028 Barcelona, Catalonia Spain
  - Research Group on Systems Pharmacology, Research Program on Biomedical Informatics (GRIB), IMIM Hospital del Mar Medical Research Institute and University Pompeu Fabra, Parc de Recerca Biomedica (PRBB), 08003 Barcelona, Catalonia Spain
- Tudor I. Oprea
  - Department of Internal Medicine, University of New Mexico School of Medicine, Albuquerque, NM 87131 USA
  - Department of Rheumatology and Inflammation Research, Institute of Medicine, Sahlgrenska Academy at Gothenburg University, 40530 Gothenburg, Sweden
  - Novo Nordisk Foundation Center for Protein Research, Faculty of Health and Medical Sciences, University of Copenhagen, 2200 Copenhagen, Denmark
  - Present Address: Roivant Discovery Sciences, Inc., 451 D Street, Boston, MA 02210 USA
- Fabien Plisson
  - Department of Biotechnology and Biochemistry, Center for Research and Advanced Studies of the National Polytechnic Institute (CINVESTAV-IPN), Irapuato Unit, 36824 Irapuato, Gto Mexico
- Fernando D. Prieto-Martínez
  - Chemistry Institute, National Autonomous University of Mexico, 04510 Mexico City, Mexico
- Raquel Rodríguez-Pérez
  - Novartis Institutes for Biomedical Research, 4002 Basel, Switzerland
- Paola Rondón-Villarreal
  - Universidad de Santander, Facultad de Ciencias Médicas y de la Salud, Instituto de Investigación Masira, Calle 70 No. 55-210, 680003 Santander, Bucaramanga Colombia
- Fernanda I. Saldívar-Gonzalez
  - DIFACQUIM Research Group, Department of Pharmacy, School of Chemistry, National Autonomous University of Mexico, 04510 Mexico City, Mexico
- Norberto Sánchez-Cruz
  - Chemotargets SL, Baldiri Reixac 4, Parc Cientific de Barcelona (PCB), 08028 Barcelona, Catalonia Spain
  - Instituto de Química, Unidad Mérida, Universidad Nacional Autónoma de México, Carretera Mérida-Tetiz Km. 4.5, Yucatán, 97357 Ucú, Mexico
- Marilia Valli
  - Nuclei of Bioassays, Biosynthesis and Ecophysiology of Natural Products (NuBBE), Department of Organic Chemistry, Institute of Chemistry, São Paulo State University-UNESP, Araraquara, Brazil
11
Mastropietro A, Pasculli G, Feldmann C, Rodríguez-Pérez R, Bajorath J. EdgeSHAPer: Bond-Centric Shapley Value-Based Explanation Method for Graph Neural Networks. iScience 2022; 25:105043. [PMID: 36134335 PMCID: PMC9483788 DOI: 10.1016/j.isci.2022.105043] [Citation(s) in RCA: 8] [Impact Index Per Article: 4.0] [Received: 06/21/2022] [Revised: 08/17/2022] [Accepted: 08/25/2022] [Indexed: 11/29/2022] Open
Abstract
Graph neural networks (GNNs) recursively propagate signals along the edges of an input graph, integrate node feature information with graph structure, and learn object representations. Like other deep neural network models, GNNs have a notorious black box character. For GNNs, only a few approaches are available to rationalize model decisions. We introduce EdgeSHAPer, a generally applicable method for explaining GNN-based models. The approach is devised to assess edge importance for predictions. To this end, EdgeSHAPer makes use of the Shapley value concept from game theory. For proof-of-concept, EdgeSHAPer is applied to compound activity prediction, a central task in drug discovery. EdgeSHAPer's edge centricity is relevant for molecular graphs where edges represent chemical bonds. Combined with feature mapping, EdgeSHAPer produces intuitive explanations for compound activity predictions. Compared to a popular node-centric and another edge-centric GNN explanation method, EdgeSHAPer reveals higher resolution in differentiating features determining predictions and identifies minimal pertinent positive feature sets.
Highlights:
- EdgeSHAPer is a new methodology for explaining graph neural network models
- Edge centricity represents a characteristic feature of the approach
- EdgeSHAPer is generally applicable, including molecular predictions
- EdgeSHAPer produces explanations of compound predictions at a high resolution
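To make the Shapley value concept concrete for edges, the sketch below computes exact Shapley values by enumerating every ordering of a small edge set and averaging each edge's marginal contribution. This is a generic textbook computation on a made-up value function standing in for a model output; it is not the EdgeSHAPer implementation, and exhaustive enumeration is only feasible for a handful of edges.

```python
from itertools import permutations
from math import factorial


def shapley_values(players, value):
    """Exact Shapley values: average marginal contribution of each player
    over all orderings. `value` maps a frozenset of players to a number."""
    totals = {p: 0.0 for p in players}
    for order in permutations(players):
        coalition = frozenset()
        for p in order:
            extended = coalition | {p}
            totals[p] += value(extended) - value(coalition)
            coalition = extended
    n_orders = factorial(len(players))
    return {p: t / n_orders for p, t in totals.items()}


# Hypothetical toy "model": the score rises by 1.0 only when bonds a-b and b-c
# are both present, and by 0.2 whenever bond c-d is present.
def toy_score(edges):
    score = 0.0
    if {"a-b", "b-c"} <= edges:
        score += 1.0
    if "c-d" in edges:
        score += 0.2
    return score


phi = shapley_values(["a-b", "b-c", "c-d"], toy_score)
print({edge: round(v, 3) for edge, v in phi.items()})
# {'a-b': 0.5, 'b-c': 0.5, 'c-d': 0.2}
```

The two bonds that only matter jointly split their shared contribution equally, and the attributions sum to the difference between the score with all edges and the score with none, which is the efficiency property that makes Shapley values attractive for attribution.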
12
Hamzic S, Lewis R, Desrayaud S, Soylu C, Fortunato M, Gerebtzoff G, Rodríguez-Pérez R. Predicting In Vivo Compound Brain Penetration Using Multi-task Graph Neural Networks. J Chem Inf Model 2022; 62:3180-3190. [PMID: 35738004 DOI: 10.1021/acs.jcim.2c00412] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/30/2022]
Abstract
Assessing whether compounds penetrate the brain can become critical in drug discovery, either to prevent adverse events or to reach the biological target. Generally, pre-clinical in vivo studies measuring the ratio of brain and blood concentrations (Kp) are required to estimate the brain penetration potential of a new drug entity. In this work, we developed machine learning models to predict in vivo compound brain penetration (as LogKp) from chemical structure. Our results show the benefit of including in vitro experimental data as auxiliary tasks in multi-task graph neural network (MT-GNN) models. MT-GNNs outperformed single-task (ST) models solely trained on in vivo brain penetration data. The best-performing MT-GNN regression model achieved a coefficient of determination of 0.42 and a mean absolute error of 0.39 (2.5-fold) on a prospective validation set and outperformed all tested ST models. To facilitate decision-making, compounds were classified into brain-penetrant or non-penetrant, achieving a Matthews correlation coefficient of 0.66. Taken together, our findings indicate that the inclusion of in vitro assay data as MT-GNN auxiliary tasks improves in vivo brain penetration predictions and prospective compound prioritization.
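Two of the figures in this abstract can be related with a short calculation: a mean absolute error on a log scale corresponds to a fold error, and the Matthews correlation coefficient follows from a binary confusion matrix. The sketch assumes LogKp is a base-10 logarithm, which is consistent with 0.39 log units being reported as about 2.5-fold. The confusion-matrix counts are made up for illustration.

```python
import math


def log_error_to_fold(mae_log10):
    """Convert a mean absolute error in log10 units to a fold error."""
    return 10 ** mae_log10


def matthews_corrcoef(tp, tn, fp, fn):
    """Matthews correlation coefficient from binary confusion-matrix counts."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0


print(round(log_error_to_fold(0.39), 1))            # 2.5
# Hypothetical counts: 40 true positives, 40 true negatives, 10 of each error.
print(matthews_corrcoef(tp=40, tn=40, fp=10, fn=10))  # 0.6
```

Unlike plain accuracy, the Matthews coefficient uses all four cells of the confusion matrix, so it stays informative when penetrant and non-penetrant compounds are unevenly represented.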
Affiliation(s)
- Seid Hamzic
  - Novartis Institutes for Biomedical Research, Novartis Campus, CH-4002 Basel, Switzerland
- Richard Lewis
  - Novartis Institutes for Biomedical Research, Novartis Campus, CH-4002 Basel, Switzerland
- Sandrine Desrayaud
  - Novartis Institutes for Biomedical Research, Novartis Campus, CH-4002 Basel, Switzerland
- Cihan Soylu
  - Novartis Institutes for BioMedical Research Inc., 181 Massachusetts Avenue, Cambridge, Massachusetts 02139, United States
- Mike Fortunato
  - Novartis Institutes for BioMedical Research Inc., 181 Massachusetts Avenue, Cambridge, Massachusetts 02139, United States
- Grégori Gerebtzoff
  - Novartis Institutes for Biomedical Research, Novartis Campus, CH-4002 Basel, Switzerland
- Raquel Rodríguez-Pérez
  - Novartis Institutes for Biomedical Research, Novartis Campus, CH-4002 Basel, Switzerland
13
Abstract
In chemoinformatics and medicinal chemistry, machine learning has evolved into an important approach. In recent years, increasing computational resources and new deep learning algorithms have put machine learning onto a new level, addressing previously unmet challenges in pharmaceutical research. In silico approaches for compound activity predictions, de novo design, and reaction modeling have been further advanced by new algorithmic developments and the emergence of big data in the field. Herein, novel applications of machine learning and deep learning in chemoinformatics and medicinal chemistry are reviewed. Opportunities and challenges for new methods and applications are discussed, placing emphasis on proper baseline comparisons, robust validation methodologies, and new applicability domains. Expected final online publication date for the Annual Review of Biomedical Data Science, Volume 5 is August 2022. Please see http://www.annualreviews.org/page/journal/pubdates for revised estimates.
Affiliation(s)
- Raquel Rodríguez-Pérez
  - Department of Life Science Informatics, B-IT (Bonn-Aachen International Center for Information Technology), Chemical Biology and Medicinal Chemistry Program Unit, LIMES (Life and Medical Sciences Institute), Rheinische Friedrich-Wilhelms-Universität, Bonn, Germany
  - Current affiliation: Novartis Institutes for Biomedical Research, Novartis Campus, Basel, Switzerland
- Filip Miljković
  - Department of Life Science Informatics, B-IT (Bonn-Aachen International Center for Information Technology), Chemical Biology and Medicinal Chemistry Program Unit, LIMES (Life and Medical Sciences Institute), Rheinische Friedrich-Wilhelms-Universität, Bonn, Germany
  - Current affiliation: Data Science and AI, Imaging and Data Analytics, Clinical Pharmacology and Safety Sciences, R&D AstraZeneca, Gothenburg, Sweden
- Jürgen Bajorath
  - Department of Life Science Informatics, B-IT (Bonn-Aachen International Center for Information Technology), Chemical Biology and Medicinal Chemistry Program Unit, LIMES (Life and Medical Sciences Institute), Rheinische Friedrich-Wilhelms-Universität, Bonn, Germany

14
Rodríguez-Pérez R, Bajorath J. Evolution of Support Vector Machine and Regression Modeling in Chemoinformatics and Drug Discovery. J Comput Aided Mol Des 2022; 36:355-362. [PMID: 35304657 PMCID: PMC9325859 DOI: 10.1007/s10822-022-00442-9] [Citation(s) in RCA: 12] [Impact Index Per Article: 6.0] [Received: 02/08/2022] [Accepted: 02/15/2022] [Indexed: 11/05/2022]
Abstract
The support vector machine (SVM) algorithm is one of the most widely used machine learning (ML) methods for predicting active compounds and molecular properties. In chemoinformatics and drug discovery, SVM has been a state-of-the-art ML approach for more than a decade. A unique attribute of SVM is that it operates in feature spaces of increasing dimensionality. Hence, SVM conceptually departs from the paradigm of low dimensionality that applies to many other methods for chemical space navigation. The SVM approach is applicable to compound classification and ranking, multi-class predictions, and, in algorithmically modified form, regression modeling. In the emerging era of deep learning (DL), SVM retains its relevance as one of the premier ML methods in chemoinformatics, for reasons discussed herein. We describe the SVM methodology including strengths and weaknesses and discuss selected applications that have contributed to the evolution of SVM as a premier approach for compound classification, property predictions, and virtual compound screening.
Affiliation(s)
- Raquel Rodríguez-Pérez
  - Department of Life Science Informatics, B-IT, LIMES Program Unit Chemical Biology and Medicinal Chemistry, Rheinische Friedrich-Wilhelms-Universität, Friedrich-Hirzebruch-Allee 6, D-53115, Bonn, Germany
  - Novartis Institutes for Biomedical Research, Novartis Campus, CH-4002, Basel, Switzerland
- Jürgen Bajorath
  - Department of Life Science Informatics, B-IT, LIMES Program Unit Chemical Biology and Medicinal Chemistry, Rheinische Friedrich-Wilhelms-Universität, Friedrich-Hirzebruch-Allee 6, D-53115, Bonn, Germany
  - Novartis Institutes for Biomedical Research, Novartis Campus, CH-4002, Basel, Switzerland

15
Rodríguez-Pérez R, Carretero P, Brigido C, Nin-Valencia A, Carpio-Hernández D, Tomás M, Quirce S, Caballero ML. The new Api m 11.0301 isoallergen from Apis mellifera is a food allergen from honey. J Investig Allergol Clin Immunol 2022; 32:492-493. [PMID: 35234637 DOI: 10.18176/jiaci.0799] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Indexed: 11/20/2022] Open
Affiliation(s)
- R Rodríguez-Pérez
  - Allergy Research Group, La Paz Hospital Institute for Health Research (IdiPAZ), Madrid, Spain
- P Carretero
  - Department of Allergy, Hospital Universitario, Burgos, Spain
- C Brigido
  - Department of Allergy, Hospital Universitario, Burgos, Spain
- A Nin-Valencia
  - Department of Allergy, La Paz University Hospital, Madrid, Spain
- M Tomás
  - Allergy Research Group, La Paz Hospital Institute for Health Research (IdiPAZ), Madrid, Spain
  - Department of Allergy, La Paz University Hospital, Madrid, Spain
- S Quirce
  - Allergy Research Group, La Paz Hospital Institute for Health Research (IdiPAZ), Madrid, Spain
  - Department of Allergy, La Paz University Hospital, Madrid, Spain
- M L Caballero
  - Allergy Research Group, La Paz Hospital Institute for Health Research (IdiPAZ), Madrid, Spain
  - Department of Allergy, La Paz University Hospital, Madrid, Spain

16
Abstract
The prediction of compound properties from chemical structure is a main task for machine learning (ML) in medicinal chemistry. ML is often applied to large data sets in applications such as compound screening, virtual library enumeration, or generative chemistry. Albeit desirable, a detailed understanding of ML model decisions is typically not required in these cases. By contrast, compound optimization efforts rely on small data sets to identify structural modifications leading to desired property profiles. In this situation, if ML is applied, one is usually reluctant to make decisions based on predictions that cannot be rationalized. Only a few ML methods are interpretable. However, to yield insights into complex ML model decisions, explanatory approaches can be applied. Herein, methodologies for better understanding of ML models or explaining individual predictions are reviewed, and current challenges in integrating ML into medicinal chemistry programs as well as future opportunities are discussed.
Affiliation(s)
- Raquel Rodríguez-Pérez
  - Department of Life Science Informatics and Data Science, B-IT, LIMES Program Unit Chemical Biology and Medicinal Chemistry, Rheinische Friedrich-Wilhelms-Universität, Friedrich-Hirzebruch-Allee 6, D-53115 Bonn, Germany
  - Novartis Institutes for Biomedical Research, Novartis Campus, CH-4002 Basel, Switzerland
- Jürgen Bajorath
  - Department of Life Science Informatics and Data Science, B-IT, LIMES Program Unit Chemical Biology and Medicinal Chemistry, Rheinische Friedrich-Wilhelms-Universität, Friedrich-Hirzebruch-Allee 6, D-53115 Bonn, Germany

17
Miljković F, Rodríguez-Pérez R, Bajorath J. Impact of Artificial Intelligence on Compound Discovery, Design, and Synthesis. ACS Omega 2021; 6:33293-33299. [PMID: 34926881 PMCID: PMC8674916 DOI: 10.1021/acsomega.1c05512] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Received: 10/04/2021] [Accepted: 11/18/2021] [Indexed: 05/17/2023]
Abstract
As in other areas, artificial intelligence (AI) is heavily promoted in different scientific fields, including chemistry. Although chemistry traditionally tends to be a conservative field and slower than others to adopt new concepts, AI is increasingly being investigated across chemical disciplines. In medicinal chemistry, supported by computer-aided drug design and cheminformatics, computational methods have long been employed to aid in the search for and optimization of active compounds. We are currently witnessing a multitude of AI-related publications in the medicinal-chemistry-relevant literature and anticipate that the numbers will further increase. Often, advances through AI promoted in such reports are difficult to reconcile or remain questionable, which hampers the acceptance of computational work in interdisciplinary environments. Herein we attempt to highlight selected investigations in which AI has shown promise to impact medicinal chemistry in areas such as compound design and synthesis.
Affiliation(s)
- Filip Miljković
  - Department of Life Science Informatics and Data Science, B-IT, LIMES Program Unit Chemical Biology and Medicinal Chemistry, Rheinische Friedrich-Wilhelms-Universität, Friedrich-Hirzebruch-Allee 6, D-53115 Bonn, Germany
  - Data Science and AI, Imaging and Data Analytics, Clinical Pharmacology & Safety Sciences, R&D, AstraZeneca, SE-431 83 Gothenburg, Sweden
- Raquel Rodríguez-Pérez
  - Department of Life Science Informatics and Data Science, B-IT, LIMES Program Unit Chemical Biology and Medicinal Chemistry, Rheinische Friedrich-Wilhelms-Universität, Friedrich-Hirzebruch-Allee 6, D-53115 Bonn, Germany
  - Novartis Institutes for Biomedical Research, Novartis Campus, CH-4002 Basel, Switzerland
- Jürgen Bajorath
  - Department of Life Science Informatics and Data Science, B-IT, LIMES Program Unit Chemical Biology and Medicinal Chemistry, Rheinische Friedrich-Wilhelms-Universität, Friedrich-Hirzebruch-Allee 6, D-53115 Bonn, Germany
  - Phone: 49-228-7369-100

18
Rodríguez-Pérez R, Bajorath J. Feature importance correlation from machine learning indicates functional relationships between proteins and similar compound binding characteristics. Sci Rep 2021; 11:14245. [PMID: 34244588 PMCID: PMC8270985 DOI: 10.1038/s41598-021-93771-y] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Received: 04/15/2021] [Accepted: 06/30/2021] [Indexed: 11/08/2022] Open
Abstract
Machine learning is widely applied in drug discovery research to predict molecular properties and aid in the identification of active compounds. Herein, we introduce a new approach that uses model-internal information from compound activity predictions to uncover relationships between target proteins. On the basis of a large-scale analysis generating and comparing machine learning models for more than 200 proteins, feature importance correlation analysis is shown to detect similar compound binding characteristics. Furthermore, rather unexpectedly, the analysis also reveals functional relationships between proteins that are independent of active compounds and binding characteristics. Feature importance correlation analysis does not depend on specific representations, algorithms, or metrics and is generally applicable as long as predictive models can be derived. Moreover, the approach does not require or involve explainable or interpretable machine learning, but only access to feature weights or importance values. On the basis of our findings, the approach represents a new facet of machine learning in drug discovery with potential for practical applications.
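As a rough illustration of the idea in this abstract, the sketch below correlates the feature importance vectors of per-target models. The abstract states that the approach does not depend on a particular metric, so the choice of Pearson correlation is an assumption made here, and the target names and importance values are invented for illustration only.

```python
import math
from itertools import combinations


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length vectors."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if norm_x == 0 or norm_y == 0:
        return 0.0  # a constant vector carries no correlation signal
    return cov / (norm_x * norm_y)


# Hypothetical importance values over the same five features,
# one vector per target-specific model.
importances = {
    "target_A": [0.40, 0.30, 0.15, 0.10, 0.05],
    "target_B": [0.38, 0.32, 0.12, 0.11, 0.07],
    "target_C": [0.05, 0.10, 0.15, 0.30, 0.40],
}

# Pairwise comparison: high values suggest that two models rely on
# similar features, which the abstract links to related targets.
for a, b in combinations(importances, 2):
    print(a, b, round(pearson(importances[a], importances[b]), 3))
```

Only the importance values are needed as input, which is consistent with the abstract's point that no explainable or interpretable modeling is required.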
Affiliation(s)
- Raquel Rodríguez-Pérez
  - Department of Life Science Informatics, B-IT, LIMES Program Unit Chemical Biology and Medicinal Chemistry, Rheinische Friedrich-Wilhelms-Universität, Friedrich-Hirzebruch-Allee 6, 53115, Bonn, Germany
  - Novartis Institutes for Biomedical Research, Novartis Campus, 4002, Basel, Switzerland
- Jürgen Bajorath
  - Department of Life Science Informatics, B-IT, LIMES Program Unit Chemical Biology and Medicinal Chemistry, Rheinische Friedrich-Wilhelms-Universität, Friedrich-Hirzebruch-Allee 6, 53115, Bonn, Germany

19
Galati S, Yonchev D, Rodríguez-Pérez R, Vogt M, Tuccinardi T, Bajorath J. Predicting Isoform-Selective Carbonic Anhydrase Inhibitors via Machine Learning and Rationalizing Structural Features Important for Selectivity. ACS Omega 2021; 6:4080-4089. [PMID: 33585783 PMCID: PMC7876851 DOI: 10.1021/acsomega.0c06153] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Received: 12/17/2020] [Accepted: 01/14/2021] [Indexed: 05/03/2023]
Abstract
Carbonic anhydrases (CAs) catalyze the physiological hydration of carbon dioxide and are among the most intensely studied pharmaceutical target enzymes. A hallmark of CA inhibition is the complexation of the catalytic zinc cation in the active site. Human (h) CA isoforms belonging to different families are implicated in a wide range of diseases and are of very high interest for therapeutic intervention. Given the conserved catalytic mechanisms and high similarity of many hCA isoforms, a major challenge for CA-based therapy is achieving inhibitor selectivity for hCA isoforms that are associated with specific pathologies over other widely distributed isoforms, such as hCA I or hCA II, that are of critical relevance for the integrity of many physiological processes. To address this challenge, we have attempted to predict compounds that are selective for isoform hCA IX, which is a tumor-associated protein and implicated in metastasis, over hCA II on the basis of a carefully curated data set of selective and nonselective inhibitors. Machine learning achieved surprisingly high accuracy in predicting hCA IX-selective inhibitors. The results were further investigated, and compound features determining successful predictions were identified. These features were then studied on the basis of X-ray structures of hCA isoform-inhibitor complexes and found to include substructures that explain compound selectivity. Our findings lend credence to selectivity predictions and indicate that the machine learning models derived herein have considerable potential to aid in the identification of new hCA IX-selective compounds.
Affiliation(s)
- Salvatore Galati
  - Department of Life Science Informatics, B-IT, LIMES Program Unit Chemical Biology and Medicinal Chemistry, Rheinische Friedrich-Wilhelms-Universität, Friedrich-Hirzebruch-Allee 6, D-53115 Bonn, Germany
  - Department of Pharmacy, University of Pisa, 56126 Pisa, Italy
- Dimitar Yonchev
  - Department of Life Science Informatics, B-IT, LIMES Program Unit Chemical Biology and Medicinal Chemistry, Rheinische Friedrich-Wilhelms-Universität, Friedrich-Hirzebruch-Allee 6, D-53115 Bonn, Germany
- Raquel Rodríguez-Pérez
  - Department of Life Science Informatics, B-IT, LIMES Program Unit Chemical Biology and Medicinal Chemistry, Rheinische Friedrich-Wilhelms-Universität, Friedrich-Hirzebruch-Allee 6, D-53115 Bonn, Germany
- Martin Vogt
  - Department of Life Science Informatics, B-IT, LIMES Program Unit Chemical Biology and Medicinal Chemistry, Rheinische Friedrich-Wilhelms-Universität, Friedrich-Hirzebruch-Allee 6, D-53115 Bonn, Germany
- Tiziano Tuccinardi
  - Department of Pharmacy, University of Pisa, 56126 Pisa, Italy
  - Phone: 39-050-2219595
- Jürgen Bajorath
  - Department of Life Science Informatics, B-IT, LIMES Program Unit Chemical Biology and Medicinal Chemistry, Rheinische Friedrich-Wilhelms-Universität, Friedrich-Hirzebruch-Allee 6, D-53115 Bonn, Germany
  - Phone: 49-228-7369-100

20
Rodríguez-Pérez R, Miljković F, Bajorath J. Assessing the information content of structural and protein-ligand interaction representations for the classification of kinase inhibitor binding modes via machine learning and active learning. J Cheminform 2020; 12:36. [PMID: 33431025 PMCID: PMC7245824 DOI: 10.1186/s13321-020-00434-7] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.8] [Received: 12/21/2019] [Accepted: 04/27/2020] [Indexed: 12/27/2022] Open
Abstract
For kinase inhibitors, X-ray crystallography has revealed different types of binding modes. Currently, more than 2000 kinase inhibitors with known binding modes are available, which makes it possible to derive and test machine learning models for the prediction of inhibitors with different binding modes. We have addressed this prediction task to evaluate and compare the information content of distinct molecular representations including protein–ligand interaction fingerprints (IFPs) and compound structure-based structural fingerprints (i.e., atom environment/fragment fingerprints). IFPs were designed to capture binding mode-specific interaction patterns at different resolution levels. Accurate predictions of kinase inhibitor binding modes were achieved with random forests using both representations. The performance of IFPs was consistently superior to atom environment fingerprints, albeit only by less than 10%. An active learning strategy applying information entropy-based selection of training instances was applied as a diagnostic approach to assess the relative information content of distinct representations. IFPs were found to capture more binding mode-relevant information than atom environment fingerprints, leading to highly predictive models even when training instances were randomly selected. By contrast, for atom environment fingerprints, the derivation of accurate models via active learning depended on entropy-based selection of informative training compounds. Notably, higher information content of IFPs confirmed by active learning only resulted in small improvements in global prediction accuracy compared to models derived using atom environment fingerprints. For practical applications, prediction of binding modes of new kinase inhibitors on the basis of chemical structure is highly attractive.
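The entropy-based selection step mentioned in this abstract can be sketched as follows. The abstract does not give the exact selection scheme, so this is one common variant under stated assumptions: each pool compound has predicted class probabilities from the current model, and the compounds with the highest Shannon entropy are added to the training set next. The compound identifiers and probabilities are hypothetical.

```python
import math


def shannon_entropy(probs):
    """Shannon entropy (in bits) of a predicted class probability vector."""
    return -sum(p * math.log2(p) for p in probs if p > 0)


def select_most_uncertain(pool_probs, k):
    """Return the k pool identifiers whose predictions have highest entropy."""
    ranked = sorted(
        pool_probs,
        key=lambda cid: shannon_entropy(pool_probs[cid]),
        reverse=True,
    )
    return ranked[:k]


# Hypothetical predicted probabilities over four binding-mode classes.
pool = {
    "cpd_1": [0.97, 0.01, 0.01, 0.01],  # confident prediction
    "cpd_2": [0.25, 0.25, 0.25, 0.25],  # maximally uncertain
    "cpd_3": [0.60, 0.30, 0.05, 0.05],
}

print(select_most_uncertain(pool, k=2))  # ['cpd_2', 'cpd_3']
```

Comparing learning curves from this selection against random selection is what makes the strategy usable as a diagnostic: a representation that performs well even with random selection carries more task-relevant information.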
Affiliation(s)
- Raquel Rodríguez-Pérez
  - Department of Life Science Informatics, B-IT, LIMES Program Unit Chemical Biology and Medicinal Chemistry, Rheinische Friedrich-Wilhelms-Universität, Endenicher Allee 19c, 53115, Bonn, Germany
- Filip Miljković
  - Department of Life Science Informatics, B-IT, LIMES Program Unit Chemical Biology and Medicinal Chemistry, Rheinische Friedrich-Wilhelms-Universität, Endenicher Allee 19c, 53115, Bonn, Germany
- Jürgen Bajorath
  - Department of Life Science Informatics, B-IT, LIMES Program Unit Chemical Biology and Medicinal Chemistry, Rheinische Friedrich-Wilhelms-Universität, Endenicher Allee 19c, 53115, Bonn, Germany

21
Rodríguez-Pérez R, Bajorath J. Interpretation of Compound Activity Predictions from Complex Machine Learning Models Using Local Approximations and Shapley Values. J Med Chem 2019; 63:8761-8777. [PMID: 31512867 DOI: 10.1021/acs.jmedchem.9b01101] [Citation(s) in RCA: 128] [Impact Index Per Article: 25.6] [Indexed: 01/27/2023]
Abstract
In qualitative or quantitative studies of structure-activity relationships (SARs), machine learning (ML) models are trained to recognize structural patterns that differentiate between active and inactive compounds. Understanding model decisions is challenging but of critical importance to guide compound design. Moreover, the interpretation of ML results provides an additional level of model validation based on expert knowledge. A number of complex ML approaches, especially deep learning (DL) architectures, have distinctive black-box character. Herein, a locally interpretable explanatory method termed Shapley additive explanations (SHAP) is introduced for rationalizing activity predictions of any ML algorithm, regardless of its complexity. Models resulting from random forest (RF), nonlinear support vector machine (SVM), and deep neural network (DNN) learning are interpreted, and structural patterns determining the predicted probability of activity are identified and mapped onto test compounds. The results indicate that SHAP has high potential for rationalizing predictions of complex ML models.
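To make the Shapley value concept behind SHAP concrete, the sketch below computes exact Shapley values for a toy model by enumerating all feature subsets. This is not the SHAP library and not the authors' workflow: SHAP relies on approximations because exact enumeration scales exponentially with the number of features. The toy model, its three binary features, and the convention that absent features are set to 0 are all assumptions made for illustration.

```python
from itertools import combinations
from math import factorial


def shapley_values(value_fn, n_features):
    """Exact Shapley values; value_fn maps a frozenset of present features to a score."""
    phi = [0.0] * n_features
    for i in range(n_features):
        others = [j for j in range(n_features) if j != i]
        for size in range(len(others) + 1):
            # Weight of a coalition of this size in the Shapley average.
            weight = (
                factorial(size)
                * factorial(n_features - size - 1)
                / factorial(n_features)
            )
            for subset in combinations(others, size):
                coalition = frozenset(subset)
                marginal = value_fn(coalition | {i}) - value_fn(coalition)
                phi[i] += weight * marginal
    return phi


def toy_model(present):
    """Hypothetical activity score with an interaction between features 0 and 2."""
    x = [1.0 if j in present else 0.0 for j in range(3)]
    return 0.2 * x[0] + 0.5 * x[1] + 0.3 * x[0] * x[2]


phi = shapley_values(toy_model, n_features=3)
print([round(v, 3) for v in phi])  # [0.35, 0.5, 0.15]
```

The contributions sum to the difference between the prediction with all features present and the prediction with none, and the interaction term is split equally between the two features involved. Per-feature contributions of this kind are what can be mapped back onto substructures of a test compound.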
Affiliation(s)
- Raquel Rodríguez-Pérez
  - Department of Life Science Informatics, B-IT, LIMES Program Unit Chemical Biology and Medicinal Chemistry, Rheinische Friedrich-Wilhelms-Universität, Endenicher Allee 19c, D-53115 Bonn, Germany
  - Department of Medicinal Chemistry, Boehringer Ingelheim Pharma GmbH & Co. KG, Birkendorfer Straße 65, 88397 Biberach an der Riß, Germany
- Jürgen Bajorath
  - Department of Life Science Informatics, B-IT, LIMES Program Unit Chemical Biology and Medicinal Chemistry, Rheinische Friedrich-Wilhelms-Universität, Endenicher Allee 19c, D-53115 Bonn, Germany

22
Miljković F, Rodríguez-Pérez R, Bajorath J. Machine Learning Models for Accurate Prediction of Kinase Inhibitors with Different Binding Modes. J Med Chem 2019; 63:8738-8748. [PMID: 31469557 DOI: 10.1021/acs.jmedchem.9b00867] [Citation(s) in RCA: 30] [Impact Index Per Article: 6.0] [Indexed: 12/13/2022]
Abstract
Noncovalent inhibitors of protein kinases have different modes of action. They bind to the active or inactive form of kinases, compete with ATP, stabilize inactive kinase conformations, or act through allosteric sites. Accordingly, kinase inhibitors have been classified on the basis of different binding modes. For medicinal chemistry, it would be very useful to derive mechanistic hypotheses for newly discovered inhibitors. Therefore, we have applied different machine learning approaches to generate models for predicting different classes of kinase inhibitors including types I, I1/2, and II as well as allosteric inhibitors. These models were built on the basis of compounds with binding modes confirmed by X-ray crystallography and yielded unexpectedly accurate and stable predictions without the need for deep learning. The results indicate that the new machine learning models have considerable potential for practical applications. Therefore, our data sets and models are made freely available.
Affiliation(s)
- Filip Miljković
  - Department of Life Science Informatics, B-IT, LIMES Program Unit Chemical Biology and Medicinal Chemistry, Rheinische Friedrich-Wilhelms-Universität, Endenicher Allee 19c, D-53115 Bonn, Germany
- Raquel Rodríguez-Pérez
  - Department of Life Science Informatics, B-IT, LIMES Program Unit Chemical Biology and Medicinal Chemistry, Rheinische Friedrich-Wilhelms-Universität, Endenicher Allee 19c, D-53115 Bonn, Germany
  - Department of Medicinal Chemistry, Boehringer Ingelheim Pharma GmbH & Co. KG, Birkendorfer Strasse 65, 88397 Biberach/Riß, Germany
- Jürgen Bajorath
  - Department of Life Science Informatics, B-IT, LIMES Program Unit Chemical Biology and Medicinal Chemistry, Rheinische Friedrich-Wilhelms-Universität, Endenicher Allee 19c, D-53115 Bonn, Germany

23
Rodríguez-Pérez R, Bajorath J. Prediction of Compound Profiling Matrices, Part II: Relative Performance of Multitask Deep Learning and Random Forest Classification on the Basis of Varying Amounts of Training Data. ACS Omega 2018; 3:12033-12040. [PMID: 30320286 PMCID: PMC6175492 DOI: 10.1021/acsomega.8b01682] [Citation(s) in RCA: 17] [Impact Index Per Article: 2.8] [Received: 07/17/2018] [Accepted: 09/12/2018] [Indexed: 05/28/2023]
Abstract
Currently, there is a high level of interest in deep learning and multitask learning in many scientific fields including the life sciences and chemistry. Herein, we investigate the performance of multitask deep neural networks (MT-DNNs) compared to random forest (RF) classification, a standard method in machine learning, in predicting compound profiling experiments. Predictions were carried out on a large profiling matrix extracted from biological screening data. For model building, submatrices with varying data density of 5-100% were generated to investigate the influence of data sparseness on prediction performance. MT-DNN models were directly compared to RF models, and control calculations were also carried out using single-task DNNs (ST-DNNs). On the basis of compound recall, the performance of ST-DNN was consistently lower than that of the other methods. Compared to RF, MT-DNN models only yielded better prediction performance for individual assays in the profiling matrix when training data were very sparse. However, when the matrix density increased to at least 25-45%, per-assay RF models met or partly exceeded the prediction performance of MT-DNN models. When the average performances of RF and MT-DNN over the grid of all targets were compared, MT-DNN was slightly superior to RF, which was a likely consequence of multitask learning. Overall, there was no consistent advantage of MT-DNN over standard RF classification in predicting the results of compound profiling assays under varying conditions. In the presence of very sparse training data, prediction performance was limited. Under these challenging conditions, MT-DNN was the preferred approach. When more training data became available and prediction performance increased, RF performance was not inferior to MT-DNN.
Affiliation(s)
- Raquel Rodríguez-Pérez
  - Department of Life Science Informatics, B-IT, LIMES Program Unit Chemical Biology and Medicinal Chemistry, Rheinische Friedrich-Wilhelms-Universität, Endenicher Allee 19c, D-53115 Bonn, Germany
  - Department of Medicinal Chemistry, Boehringer Ingelheim Pharma GmbH & Co. KG, Birkendorfer Str. 65, 88397 Biberach/Riß, Germany
- Jürgen Bajorath
  - Department of Life Science Informatics, B-IT, LIMES Program Unit Chemical Biology and Medicinal Chemistry, Rheinische Friedrich-Wilhelms-Universität, Endenicher Allee 19c, D-53115 Bonn, Germany

24
Rodríguez-Pérez R, Fernández L, Marco S. Overoptimism in cross-validation when using partial least squares-discriminant analysis for omics data: a systematic study. Anal Bioanal Chem 2018; 410:5981-5992. [PMID: 29959482 DOI: 10.1007/s00216-018-1217-1] [Citation(s) in RCA: 23] [Impact Index Per Article: 3.8] [Received: 04/11/2018] [Revised: 06/13/2018] [Accepted: 06/21/2018] [Indexed: 01/29/2023]
Abstract
Advances in analytical instrumentation have provided the possibility of examining thousands of genes, peptides, or metabolites in parallel. However, the costly and time-consuming data acquisition process causes a general lack of samples. From a data analysis perspective, omics data are characterized by high dimensionality and small sample counts. In many scenarios, the analytical aim is to differentiate between two conditions or classes by combining an analytical method with a tailored qualitative predictive model built from available examples collected in a dataset. For this purpose, partial least squares-discriminant analysis (PLS-DA) is frequently employed in omics research. Recently, there has been growing concern about the uncritical use of this method, since it is prone to overfitting and may aggravate problems of false discoveries. In many applications involving a small number of subjects or samples, predictive model performance estimation is based only on cross-validation (CV) results, with a strong preference for reporting results using leave-one-out (LOO). The combination of PLS-DA for high-dimensional data and small sample conditions, together with a weak validation methodology, is a recipe for unreliable estimations of model performance. In this work, we present a systematic study of the impact of the dataset size, the dimensionality, and the CV technique used on PLS-DA overoptimism when performance estimation is done in cross-validation. Firstly, by using synthetic data generated from the same probability distribution and with randomly assigned binary labels, we have obtained a dataset where the true classification rate (CR) is 50%. As expected, our results confirm that internal validation provides overoptimistic estimations of the classification accuracy (i.e., overfitting). We have characterized the CR estimator in terms of bias and variance depending on the internal CV technique used and the sample-to-dimensionality ratio. In small sample conditions, due to the large bias and variance of the estimator, the occurrence of extremely good CRs is common. We have found that overfitting peaks when the sample size in the training subset approaches the feature vector dimensionality minus one. In these conditions, the models are neither under- nor overdetermined, with a unique solution. This effect is particularly intense for LOO and peaks higher in small sample conditions. Overoptimism decreases beyond this point, where the abundance of noise produces a regularization effect leading to less complex models. In terms of overfitting, our study ranks CV methods as follows: bootstrap produces the most accurate estimator of the CR, followed by bootstrapped Latin partitions, random subsampling, and K-fold; finally, the very popular LOO provides the worst results. Simulation results are further confirmed in real datasets from mass spectrometry and microarrays.
Affiliation(s)
- Raquel Rodríguez-Pérez
  - Signal and Information Processing for Sensing Systems, Institute for Bioengineering of Catalonia, The Barcelona Institute for Science and Technology, Baldiri Reixac 4-8, 08028, Barcelona, Spain
- Luis Fernández
  - Signal and Information Processing for Sensing Systems, Institute for Bioengineering of Catalonia, The Barcelona Institute for Science and Technology, Baldiri Reixac 4-8, 08028, Barcelona, Spain
  - Department of Electronics and Biomedical Engineering, University of Barcelona, Martí i Franqués 1, 08028, Barcelona, Spain
- Santiago Marco
  - Signal and Information Processing for Sensing Systems, Institute for Bioengineering of Catalonia, The Barcelona Institute for Science and Technology, Baldiri Reixac 4-8, 08028, Barcelona, Spain
  - Department of Electronics and Biomedical Engineering, University of Barcelona, Martí i Franqués 1, 08028, Barcelona, Spain

25
Rodríguez-Pérez R, Miyao T, Jasial S, Vogt M, Bajorath J. Prediction of Compound Profiling Matrices Using Machine Learning. ACS Omega 2018; 3:4713-4723. [PMID: 30023899 PMCID: PMC6045364 DOI: 10.1021/acsomega.8b00462] [Citation(s) in RCA: 20] [Impact Index Per Article: 3.3] [Received: 03/12/2018] [Accepted: 04/20/2018] [Indexed: 05/25/2023]
Abstract
Screening of compound libraries against panels of targets yields profiling matrices. Such matrices typically contain structurally diverse screening compounds, large numbers of inactives, and small numbers of hits per assay. As such, they represent interesting and challenging test cases for computational screening and activity predictions. In this work, modeling of large compound profiling matrices extracted from publicly available screening data was attempted. Different machine learning methods including deep learning were compared, and different prediction strategies were explored. Prediction accuracy varied for assays with different numbers of active compounds, and alternative machine learning approaches often produced comparable results. Deep learning did not further increase the prediction accuracy of standard methods such as random forests or support vector machines. Target-based random forest models were prioritized and yielded successful predictions of active compounds for many assays.

26
Rodríguez-Pérez R, Cortés R, Guamán A, Pardo A, Torralba Y, Gómez F, Roca J, Barberà JA, Cascante M, Marco S. Instrumental drift removal in GC-MS data for breath analysis: the short-term and long-term temporal validation of putative biomarkers for COPD. J Breath Res 2018; 12:036007. [PMID: 29292699 DOI: 10.1088/1752-7163/aaa492] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.0] [Indexed: 11/12/2022]
Abstract
Breath analysis holds the promise of a non-invasive technique for the diagnosis of diverse respiratory conditions including chronic obstructive pulmonary disease (COPD) and lung cancer. Breath contains small metabolites that may be putative biomarkers of these conditions. However, the discovery of reliable biomarkers is a considerable challenge in the presence of both clinical and instrumental confounding factors. Among the latter, instrumental time drifts are highly relevant, as they question the short- and long-term validity of predictive models. In this work we present a methodology to counter instrumental drifts using information from interleaved blanks for a case study of GC-MS data from breath samples. The proposed method includes feature filtering, and additive, multiplicative and multivariate drift corrections, the latter being based on component correction. Biomarker discovery was based on genetic algorithms in a filter configuration using Fisher's ratio computed in the partial least squares-discriminant analysis subspace as a figure of merit. Using our protocol, we have been able to find nine peaks that provide a statistically significant area under the ROC curve of 0.75 for COPD discrimination. The method developed has been successfully validated using blind samples in short-term temporal validation. However, the attempt to use this model for patient screening six months later was not successful. This negative result highlights the importance of increasing validation rigor when reporting biomarker discovery results.
Affiliation(s)
- Raquel Rodríguez-Pérez
- Signal and Information Processing for Sensing Systems, Institute for Bioengineering of Catalonia (IBEC), The Barcelona Institute of Science and Technology, Barcelona, Spain
|
27
|
Rodríguez-Pérez R, Vogt M, Bajorath J. Support Vector Machine Classification and Regression Prioritize Different Structural Features for Binary Compound Activity and Potency Value Prediction. ACS Omega 2017; 2:6371-6379. [PMID: 30023518 PMCID: PMC6045367 DOI: 10.1021/acsomega.7b01079] [Citation(s) in RCA: 44] [Impact Index Per Article: 6.3] [Received: 07/27/2017] [Accepted: 09/22/2017] [Indexed: 05/15/2023]
Abstract
In computational chemistry and chemoinformatics, the support vector machine (SVM) algorithm is among the most widely used machine learning methods for the identification of new active compounds. In addition, support vector regression (SVR) has become a preferred approach for modeling nonlinear structure-activity relationships and predicting compound potency values. For the closely related SVM and SVR methods, fingerprints (i.e., bit string or feature set representations of chemical structure and properties) are generally preferred descriptors. Herein, we have compared SVM and SVR calculations for the same compound data sets to evaluate which features are responsible for predictions. On the basis of systematic feature weight analysis, rather surprising results were obtained. Fingerprint features were frequently identified that contributed differently to the corresponding SVM and SVR models. The overlap between the feature sets determining the predictive performance of SVM and SVR was very small. Furthermore, features were identified that had opposite effects on SVM and SVR predictions. Feature weight analysis in combination with feature mapping also made it possible to interpret individual predictions, thus balancing the black box character of SVM/SVR modeling.
|
28
|
Rodríguez-Pérez R, Vogt M, Bajorath J. Influence of Varying Training Set Composition and Size on Support Vector Machine-Based Prediction of Active Compounds. J Chem Inf Model 2017; 57:710-716. [PMID: 28376613 PMCID: PMC5417594 DOI: 10.1021/acs.jcim.7b00088] [Citation(s) in RCA: 23] [Impact Index Per Article: 3.3] [Indexed: 11/30/2022]
Abstract
Support vector machine (SVM) modeling is one of the most popular machine learning approaches in chemoinformatics and drug design. The influence of training set composition and size on predictions currently is an underinvestigated issue in SVM modeling. In this study, we have derived SVM classification and ranking models for a variety of compound activity classes under systematic variation of the number of positive and negative training examples. With increasing numbers of negative training compounds, SVM classification calculations became increasingly accurate and stable. However, this was only the case if a required threshold of positive training examples was also reached. In addition, consideration of class weights and optimization of cost factors substantially aided in balancing the calculations for increasing numbers of negative training examples. Taken together, the results of our analysis have practical implications for SVM learning and the prediction of active compounds. For all compound classes under study, top recall performance and independence of compound recall of training set composition was achieved when 250–500 active and 500–1000 randomly selected inactive training instances were used. However, as long as ∼50 known active compounds were available for training, increasing numbers of 500–1000 randomly selected negative training examples significantly improved model performance and gave very similar results for different training sets.
Affiliation(s)
- Raquel Rodríguez-Pérez
- Department of Life Science Informatics, B-IT, LIMES Program Unit Chemical Biology and Medicinal Chemistry, Rheinische Friedrich-Wilhelms-Universität, Dahlmannstrasse 2, D-53113 Bonn, Germany
- Martin Vogt
- Department of Life Science Informatics, B-IT, LIMES Program Unit Chemical Biology and Medicinal Chemistry, Rheinische Friedrich-Wilhelms-Universität, Dahlmannstrasse 2, D-53113 Bonn, Germany
- Jürgen Bajorath
- Department of Life Science Informatics, B-IT, LIMES Program Unit Chemical Biology and Medicinal Chemistry, Rheinische Friedrich-Wilhelms-Universität, Dahlmannstrasse 2, D-53113 Bonn, Germany
|
29
|
García-Urquijo A, Rodríguez-Rodríguez J, Rodríguez-Pérez R, Lorenzo-Manzanas, Hernández-González G. Staphylococcus aureus en quemaduras: estudio de incidencia, tendencia y pronóstico. Cir plást iberolatinoam 2015. [DOI: 10.4321/s0376-78922015000200002] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/11/2022] Open
|
30
|
Garcia Alonso M, Caballero ML, Umpierrez A, Lluch-Bernal M, Knaute T, Rodríguez-Pérez R. Relationships between T cell and IgE/IgG4 epitopes of the Anisakis simplex major allergen Ani s 1. Clin Exp Allergy 2015; 45:994-1005. [DOI: 10.1111/cea.12474] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.2] [Received: 08/18/2014] [Revised: 11/05/2014] [Accepted: 12/07/2014] [Indexed: 02/06/2023]
Affiliation(s)
- M. Garcia Alonso
- Hospital La Paz Institute for Health Research; IdiPaz; Madrid Spain
- A. Umpierrez
- Allergy Department; Hospital La Paz; IdiPaz; Madrid Spain
- T. Knaute
- JPT Peptide Technologies; Berlin Germany
|
31
|
Mauriz E, Laliena A, Vallejo D, Tuñón MJ, Rodríguez-López JM, Rodríguez-Pérez R, García-Fernández MC. Effects of a low-fat diet with antioxidant supplementation on biochemical markers of multiple sclerosis long-term care residents. NUTR HOSP 2013; 28:2229-35. [PMID: 24506405 DOI: 10.3305/nutr hosp.v28in06.6983] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.6] [Indexed: 12/01/2022] Open
Abstract
INTRODUCTION: Multiple sclerosis (MS) treatment options are primarily limited to immunomodulatory therapies in non-progressive forms of MS. Nutrition intervention studies suggest that diet may be considered as a complementary treatment to control disease progression. Therefore, dietary intervention may help to improve wellness and ameliorate symptoms of MS patients. OBJECTIVES: To assess the effect of a low-fat diet with antioxidant supplementation on biochemical markers of institutionalized patients with progressive forms of multiple sclerosis. METHODS: A randomized prospective placebo-controlled study involving 9 participants, 5 of them assigned to the intervention group (low-fat diet and antioxidant supplementation) and the other 4 to the placebo group (low-fat diet). The effect of the dietary intervention, involving diet modification and antioxidant supplementation, was examined for 42 days by measuring anthropometric and biochemical parameters and oxidative stress markers in blood at baseline (day 0), intermediate (day 15) and end (day 42) stages of the treatment. RESULTS: The intervention group obtained C-reactive protein levels significantly lower than those observed in the corresponding placebo group at the end of the study. Values of the oxidative stress and inflammatory markers isoprostane 8-iso-PGF2α and interleukin IL-6 also diminished after dietary intervention in the intervention group. Catalase activity increased significantly in the intervention group prior to antioxidant supplementation. No significant differences were observed in other oxidative stress markers. CONCLUSIONS: The results suggest that diet and dietary supplements are involved in cell metabolism modulation and MS-related inflammatory processes. Consequently, low-fat diets and antioxidant supplements may be used as complementary therapies for treatment of multiple sclerosis.
Affiliation(s)
- Elba Mauriz
- Institute of Food Science and Technology (ICTAL). University of León. Spain. State Reference Centre (CRE) of Disability and Dependency. San Andrés del Rabanedo. León. Spain.
- A Laliena
- Institute of Biomedicine (IBIOMED), University of León. Spain
- D Vallejo
- Institute of Biomedicine (IBIOMED), University of León. Spain
- M J Tuñón
- Institute of Biomedicine (IBIOMED), University of León. Spain
|
32
|
Iparraguirre A, Rodríguez-Pérez R, Juste S, Ledesma A, Moneo I, Caballero ML. Selective allergy to lobster in a case of primary sensitization to house dust mites. J Investig Allergol Clin Immunol 2009; 19:409-413. [PMID: 19862942] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/28/2023] Open
Abstract
Allergy to only 1 kind of seafood is uncommon. We report a case of selective allergy to lobster. We studied a 30-year-old man who suffered generalized urticaria, facial erythema, and pharyngeal pruritus after eating lobster. He had a more than 10-year history of mild persistent asthma and sensitization to house dust mites. The study was performed by skin prick test, prick-prick test, oral food challenge, specific immunoglobulin (Ig) E determinations by CAP (Phadia, Uppsala, Sweden) and ADVIA-Centaur (ALK-Abelló, Madrid, Spain), and IgE-immunoblotting. The patient's serum recognized 2 allergens of around 198 kDa and 2 allergens of around 65 kDa from the lobster extract, allergens of around 15, 90, and 120 kDa from Dermatophagoides pteronyssinus extract, and allergens of around 15 and 65 kDa from Dermatophagoides farinae extract. Serum did not recognize purified shrimp tropomyosin. Immunoblot-inhibition assay results indicated cross-reactivity between lobster and mite allergens. This is the first report of selective allergy to lobster.
Affiliation(s)
- A Iparraguirre
- Department of Allergology, Hospital General Yagüe, Burgos, Spain
|
33
|
Bascones O, Rodríguez-Pérez R, Juste S, Moneo I, Caballero ML. Lettuce-induced anaphylaxis. Identification of the allergen involved. J Investig Allergol Clin Immunol 2009; 19:154-7. [PMID: 19476020] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/27/2023] Open
Abstract
BACKGROUND: Only 2 allergenic proteins have been described in lettuce allergy: a 16-kDa protein (putative profilin) and a lipid transfer protein (LTP) named Lac s 1. OBJECTIVE: Our aim was to identify the allergens involved in the anaphylactic reactions of 2 patients who had eaten lettuce. METHODS: The study was performed by immunoglobulin (Ig) E immunodetection and immunodetection-inhibition assays. RESULTS: Both patients' sera showed specific IgE binding to a single protein from the crude lettuce extract (apparent molecular weight of 14 kDa). To characterize the allergen detected, the lettuce extract underwent proteolytic digestion and heat treatment; the allergen was highly resistant to both. The patients' sera also recognized the major peach allergen Pru p 3 by immunodetection. When the lettuce allergen was incubated with both Pru p 3 from peach peel and recombinant Pru p 3, the immunodetection-inhibition assay indicated that patients were sensitized to the lettuce LTP Lac s 1. CONCLUSIONS: The allergen involved in the lettuce-induced anaphylaxis of our patients was the LTP Lac s 1.
Affiliation(s)
- O Bascones
- Department of Allergology, Hospital General Yagüe, Burgos, Spain
|
34
|
Vicente-Serrano J, Caballero ML, Rodríguez-Pérez R, Carretero P, Pérez R, Blanco JG, Juste S, Moneo I. Sensitization to serum albumins in children allergic to cow's milk and epithelia. Pediatr Allergy Immunol 2007; 18:503-7. [PMID: 17680908 DOI: 10.1111/j.1399-3038.2007.00548.x] [Citation(s) in RCA: 45] [Impact Index Per Article: 2.6] [Indexed: 11/30/2022]
Abstract
Patients with persistent milk allergy and specific immunoglobulin E (IgE) to bovine serum albumin (BSA) have a greater risk of rhinoconjunctivitis and asthma because of animal dander. The aim was to prove the cross-reactivity between serum albumin (SA) of different mammals in milk, meat, and epithelia and to determine whether heat treatment of meats decreases the allergenicity of albumins. The study was performed using SDS-PAGE and IgE-immunoblotting using sera from eight patients sensitized to milk, BSA, and animal danders. Sera from non-allergic subjects and from subjects allergic only to animal dander served as controls. With one exception, all patients' sera recognized SA in different meats (beef, lamb, deer, and pork), epithelia (dog, cat, and cow), and cow's milk. Some patients were even sensitized only to SA in meat and epithelia. Subjects allergic only to dander recognized other proteins in epithelia but not SA. No patients reacted to SA from heated meat extracts. Serum albumin is an important allergen involved in milk, meat, and epithelia allergy. The first contact with SA was through cow's milk and patients developed sensitization to epithelia SA even without direct contact with animals. Patients with both BSA and cow's milk allergy must avoid raw meats and furry pets.
|
35
|
Zurera-Cosano G, García-Gimeno R, Rodríguez-Pérez R, Hervás-Martínez C. Performance of response surface model for prediction of Leuconostoc mesenteroides growth parameters under different experimental conditions. Food Control 2006. [DOI: 10.1016/j.foodcont.2005.02.003] [Citation(s) in RCA: 45] [Impact Index Per Article: 2.5] [Indexed: 11/30/2022]
|
36
|
García-Gimeno RM, Hervás-Martínez C, Rodríguez-Pérez R, Zurera-Cosano G. Modelling the growth of Leuconostoc mesenteroides by Artificial Neural Networks. Int J Food Microbiol 2005; 105:317-32. [PMID: 16054719 DOI: 10.1016/j.ijfoodmicro.2005.04.013] [Citation(s) in RCA: 50] [Impact Index Per Article: 2.6] [Received: 10/08/2004] [Accepted: 04/18/2005] [Indexed: 11/30/2022]
Abstract
The combined effect of temperature (10.5 to 24.5 °C), pH level (5.5 to 7.5), sodium chloride level (0.25% to 6.25%) and sodium nitrite level (0 to 200 ppm) on the predicted specific growth rate (Gr), lag-time (Lag) and maximum population density (yEnd) of Leuconostoc mesenteroides under aerobic and anaerobic conditions was studied using an Artificial Neural Network-based model (ANN) in comparison with Response Surface Methodology (RS). For both aerobic and anaerobic conditions, two types of ANN model were elaborated: unidimensional for each of the growth parameters, and multidimensional, in which the three parameters Gr, Lag, and yEnd are combined. Although in general no significant statistical differences were observed between the two types of model, we opted for the unidimensional model, because it obtained the lowest mean value for the standard error of prediction (SEP) for generalisation. The ANN models developed provided reliable estimates for the three kinetic parameters studied; the SEP values in aerobic conditions were 2.82% for Gr, 6.05% for Lag and 10% for yEnd, a higher degree of accuracy than those of the RS model (Gr: 9.54%; Lag: 8.89%; yEnd: 10.27%). Similar results were observed for anaerobic conditions. During external validation, a higher degree of accuracy (Af) and bias (Bf) were observed for the ANN model compared with the RS model. ANN predictive growth models are a valuable tool, enabling swift determination of L. mesenteroides growth parameters.
Affiliation(s)
- R M García-Gimeno
- Department of Food Science and Technology, University of Córdoba, Campus Rabanales, Edif. Darwin, 14014 Córdoba, Spain.
|
37
|
Pérez-Guillé G, Camacho-Vieyra A, Toledo-López A, Guillé-Pérez A, Flores-Pérez J, Rodríguez-Pérez R, Juárez-Olguín H, Lares-Asseff I. Patterns of drug consumption in relation with the pathologies of elderly Mexican subjects resident in nursing homes. J Pharm Pharm Sci 2001; 4:159-66. [PMID: 11466173] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/20/2023]
Abstract
PURPOSE: To describe the patterns of drugs consumed by the male and female elderly living in Mexican private and public nursing homes. METHODS: Three hundred and fifty elderly participants from four nursing homes (2 private and 2 public) were selected for the six-month study: 108 subjects were excluded; the remaining 242 were between 65 and 100 years old; 123 were females and 119 males. A complete clinical history was taken and clinical files were reviewed. RESULTS: Of the 242 elderly studied, 193 took diverse medications and 28.5% were at risk of some type of drug interaction. The groups of drugs most frequently consumed were vitamins and anti-anemic medications, followed by cardiovascular drugs. Females consumed a greater number of drugs. They also consumed more drugs simultaneously. CONCLUSIONS: There is a need to monitor the elderly for their patterns of drug use.
Affiliation(s)
- G Pérez-Guillé
- Department of Pharmacology and Toxicology, "Dr. Joaquín Cravioto" Research Tower - SSA, Mexico City
|