1
|
Jmel H, Sarno S, Giuliani C, Boukhalfa W, Abdelhak S, Luiselli D, Kefi R. Genetic diversity of variants involved in drug response among Tunisian and Italian populations toward personalized medicine. Sci Rep 2024; 14:5842. [PMID: 38462643 PMCID: PMC10925599 DOI: 10.1038/s41598-024-55239-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/01/2023] [Accepted: 02/21/2024] [Indexed: 03/12/2024]
Abstract
Adverse drug reactions (ADRs) are a significant contributor to morbidity and mortality and impose a substantial financial burden. Genetic ancestry plays a crucial role in drug response. The aim of this study is to characterize the genetic variability of selected pharmacogenes involved in ADRs in Tunisians and Italians, with a comparative analysis against global populations. A cohort of 135 healthy Tunisians and 737 Italians was genotyped using a SNP array. Variants located in 25 Very Important Pharmacogenes implicated in ADRs were extracted from the genotyping data. Distribution analysis of common variants in the Tunisian and Italian populations, in comparison to 24 publicly available worldwide populations, was performed using PLINK and R software. Results from Principal Component and ADMIXTURE analyses showed a high genetic similarity among Mediterranean populations, distinguishing them from Sub-Saharan African and Asian populations. The Fst comparative analysis identified 27 variants exhibiting significant differentiation between the studied populations. Among these variants, four SNPs (rs622342, rs3846662, rs7294 and rs5215), located in the SLC22A1, HMGCR, VKORC1 and KCNJ11 genes, respectively, are reported to be associated with ethnic variability in drug responses. In conclusion, correlating the frequencies of genotype risk variants with their associated ADRs would enhance drug outcomes and the implementation of personalized medicine in the studied populations.
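As a rough illustration of the Fst comparison described in this abstract, the sketch below computes Wright's fixation index for a single biallelic SNP from allele frequencies in two populations. It is a simplified textbook estimator and not necessarily the one PLINK applies; the function name `wright_fst` and the allele frequencies in the usage lines are hypothetical, and only the sample sizes (135 and 737) come from the abstract.

```python
def wright_fst(p1: float, p2: float, n1: int, n2: int) -> float:
    """Wright's Fst for one biallelic SNP in two populations.

    p1, p2: frequency of the same allele in population 1 and 2.
    n1, n2: sample sizes, used only as weights (an assumption of this sketch).
    """
    w1 = n1 / (n1 + n2)
    w2 = n2 / (n1 + n2)
    # Expected heterozygosity in the pooled population (H_T).
    p_bar = w1 * p1 + w2 * p2
    h_t = 2 * p_bar * (1 - p_bar)
    if h_t == 0:
        # The allele is fixed or absent everywhere: no differentiation to measure.
        return 0.0
    # Weighted mean of within-population expected heterozygosity (H_S).
    h_s = w1 * 2 * p1 * (1 - p1) + w2 * 2 * p2 * (1 - p2)
    return (h_t - h_s) / h_t


# Hypothetical allele frequencies; sample sizes taken from the abstract.
fst = wright_fst(0.30, 0.45, 135, 737)
print(f"Fst = {fst:.4f}")
```

Values near 0 indicate similar allele frequencies in the two groups, and values near 1 indicate nearly fixed differences.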
Affiliation(s)
- Haifa Jmel
- Laboratory of Biomedical Genomics and Oncogenetics, Institut Pasteur de Tunis, Tunis, Tunisia
- University of Tunis El Manar, Tunis, Tunisia
- Genetic Typing DNA Service Pasteur Institute, Institut Pasteur de Tunis, Tunis, Tunisia
| | - Stefania Sarno
- Laboratory of Molecular Anthropology & Centre for Genome Biology, Department of Biological, Geological and Environmental Sciences (BiGeA), University of Bologna, Bologna, Italy
| | - Cristina Giuliani
- Laboratory of Molecular Anthropology & Centre for Genome Biology, Department of Biological, Geological and Environmental Sciences (BiGeA), University of Bologna, Bologna, Italy
| | - Wided Boukhalfa
- Laboratory of Biomedical Genomics and Oncogenetics, Institut Pasteur de Tunis, Tunis, Tunisia
- University of Tunis El Manar, Tunis, Tunisia
| | - Sonia Abdelhak
- Laboratory of Biomedical Genomics and Oncogenetics, Institut Pasteur de Tunis, Tunis, Tunisia
- University of Tunis El Manar, Tunis, Tunisia
| | - Donata Luiselli
- Laboratory of Ancient DNA (aDNALab), Department of Cultural Heritage (DBC), University of Bologna, Ravenna, Italy
| | - Rym Kefi
- Laboratory of Biomedical Genomics and Oncogenetics, Institut Pasteur de Tunis, Tunis, Tunisia.
- University of Tunis El Manar, Tunis, Tunisia.
- Genetic Typing DNA Service Pasteur Institute, Institut Pasteur de Tunis, Tunis, Tunisia.
| |
|
2
|
Bresso E, Monnin P, Bousquet C, Calvier FE, Ndiaye NC, Petitpain N, Smaïl-Tabbone M, Coulet A. Investigating ADR mechanisms with Explainable AI: a feasibility study with knowledge graph mining. BMC Med Inform Decis Mak 2021; 21:171. [PMID: 34039343 PMCID: PMC8157660 DOI: 10.1186/s12911-021-01518-6] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 12/15/2020] [Accepted: 05/05/2021] [Indexed: 11/13/2022]
Abstract
BACKGROUND Adverse drug reactions (ADRs) are statistically characterized within randomized clinical trials and postmarketing pharmacovigilance, but their molecular mechanism remains unknown in most cases. This is true even for hepatic or skin toxicities, which are classically monitored during drug design. Aside from clinical trials, many elements of knowledge about drug ingredients are available in open-access knowledge graphs, such as their properties, interactions, or involvements in pathways. In addition, drug classifications that label drugs as either causative or not for several ADRs have been established. METHODS We propose in this paper to mine knowledge graphs to identify biomolecular features that may enable automatically reproducing expert classifications that distinguish drugs that are causative or not for a given type of ADR. From an Explainable AI perspective, we explore simple classification techniques such as Decision Trees and Classification Rules because they provide human-readable models, which explain the classification itself but may also provide elements of explanation for the molecular mechanisms behind ADRs. In summary, (1) we mine a knowledge graph for features; (2) we train classifiers to distinguish, on the basis of extracted features, drugs associated or not with two commonly monitored ADRs: drug-induced liver injuries (DILI) and severe cutaneous adverse reactions (SCAR); (3) we isolate features that are both efficient in reproducing expert classifications and interpretable by experts (i.e., Gene Ontology terms, drug targets, or pathway names); and (4) we manually evaluate in a mini-study how they may be explanatory. RESULTS Extracted features reproduce with good fidelity the classifications of drugs causative or not for DILI and SCAR (Accuracy = 0.74 and 0.81, respectively). Experts fully agreed that 73% and 38% of the most discriminative features are possibly explanatory for DILI and SCAR, respectively; and partially agreed (2/3) for 90% and 77% of them. CONCLUSION Knowledge graphs provide sufficiently diverse features to enable simple and explainable models to distinguish between drugs that are causative or not for ADRs. In addition to explaining classifications, the most discriminative features appear to be good candidates for investigating ADR mechanisms further.
Affiliation(s)
- Emmanuel Bresso
- Université de Lorraine, CNRS, Inria, LORIA, Nancy, France
- Centre d’Investigations Cliniques Plurithématique 1433, Inserm 1116, CHRU de Nancy, Université de Lorraine, Nancy, France
| | - Pierre Monnin
- Université de Lorraine, CNRS, Inria, LORIA, Nancy, France
- Orange, Belfort, France
| | - Cédric Bousquet
- Service de santé publique et information médicale, CHU de Saint Etienne, Saint Etienne, France
- Sorbonne Université, Inserm, Université Paris 13, LIMICS, Paris, France
| | - François-Elie Calvier
- Service de santé publique et information médicale, CHU de Saint Etienne, Saint Etienne, France
| | | | - Nadine Petitpain
- Centre Régional de Pharmacovigilance, CHRU of Nancy, Nancy, France
| | | | - Adrien Coulet
- Université de Lorraine, CNRS, Inria, LORIA, Nancy, France
- Inria Paris, Paris, France
- Centre de Recherche des Cordeliers, INSERM, Sorbonne Université, Université de Paris, Paris, France
| |
|
3
|
Kang H, Li J, Wu M, Shen L, Hou L. Building a Pharmacogenomics Knowledge Model Toward Precision Medicine: Case Study in Melanoma. JMIR Med Inform 2020; 8:e20291. [PMID: 33084582 PMCID: PMC7641779 DOI: 10.2196/20291] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 05/15/2020] [Revised: 08/11/2020] [Accepted: 09/13/2020] [Indexed: 12/24/2022]
Abstract
BACKGROUND Many drugs do not work the same way for everyone owing to differences in their genes. Pharmacogenomics (PGx) aims to understand how genetic variants influence drug efficacy and toxicity. It is often considered one of the most actionable areas of the personalized medicine paradigm. However, little prior work has included in-depth explorations and descriptions of drug usage, dosage adjustment, and so on. OBJECTIVE We present a pharmacogenomics knowledge model to discover the hidden relationships between PGx entities such as drugs, genes, and diseases, especially details in precise medication. METHODS PGx open data such as DrugBank and RxNorm were integrated in this study, as well as drug labels published by the US Food and Drug Administration. We annotated 190 drug labels manually for entities and relationships. Based on the annotation results, we trained 3 different natural language processing models to complete entity recognition. Finally, the pharmacogenomics knowledge model was described in detail. RESULTS In entity recognition tasks, the Bidirectional Encoder Representations from Transformers-conditional random field model achieved better performance, with a micro-F1 score of 85.12%. The pharmacogenomics knowledge model in our study included 5 semantic types: drug, gene, disease, precise medication (population, daily dose, dose form, frequency, etc), and adverse reaction. Meanwhile, 26 semantic relationships were defined in detail. Taking melanoma caused by a BRAF gene mutation into consideration, the pharmacogenomics knowledge model covered 7 related drugs and 4846 triples were established in this case. All the corpora, relationship definitions, and triples were made publicly available. CONCLUSIONS We highlighted the pharmacogenomics knowledge model as a scalable framework for clinicians and clinical pharmacists to adjust drug dosage according to patient-specific genetic variation, and for pharmaceutical researchers to develop new drugs. In the future, a series of other antitumor drugs and automatic relation extractions will be taken into consideration to further enhance our framework with more PGx linked data.
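For readers unfamiliar with the micro-F1 metric reported in this abstract, the following is a minimal sketch of how it is commonly computed for entity recognition: true positives, false positives and false negatives are pooled across entity types before the F1 score is taken. The function name `micro_f1`, the entity-type labels and the counts are hypothetical and are not taken from the study.

```python
from typing import Dict, Tuple


def micro_f1(counts: Dict[str, Tuple[int, int, int]]) -> float:
    """Micro-averaged F1 from per-type (true pos., false pos., false neg.) counts."""
    tp = sum(c[0] for c in counts.values())
    fp = sum(c[1] for c in counts.values())
    fn = sum(c[2] for c in counts.values())
    # F1 = 2TP / (2TP + FP + FN), computed on the pooled counts.
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


# Hypothetical per-type counts, for illustration only.
example_counts = {
    "drug": (80, 10, 5),
    "gene": (40, 5, 10),
    "disease": (30, 5, 5),
}
print(f"micro-F1 = {micro_f1(example_counts):.4f}")
```

Because counts are pooled, frequent entity types weigh more heavily in the score than rare ones, which is the main difference from a macro-averaged F1.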
Affiliation(s)
- Hongyu Kang
- Institute of Medical Information & Library, Chinese Academy of Medical Sciences/Peking Union Medical College, Beijing, China
- Department of Biomedical Engineering, School of Life Science, Beijing Institute of Technology, Beijing, China
| | - Jiao Li
- Institute of Medical Information & Library, Chinese Academy of Medical Sciences/Peking Union Medical College, Beijing, China
| | - Meng Wu
- Institute of Medical Information & Library, Chinese Academy of Medical Sciences/Peking Union Medical College, Beijing, China
| | - Liu Shen
- Institute of Medical Information & Library, Chinese Academy of Medical Sciences/Peking Union Medical College, Beijing, China
| | - Li Hou
- Institute of Medical Information & Library, Chinese Academy of Medical Sciences/Peking Union Medical College, Beijing, China
| |
|
4
|
Besso MJ, Montivero L, Lacunza E, Argibay MC, Abba M, Furlong LI, Colas E, Gil-Moreno A, Reventos J, Bello R, Vazquez-Levin MH. Identification of early stage recurrence endometrial cancer biomarkers using bioinformatics tools. Oncol Rep 2020; 44:873-886. [PMID: 32705231 PMCID: PMC7388212 DOI: 10.3892/or.2020.7648] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Received: 08/01/2019] [Accepted: 04/22/2020] [Indexed: 01/08/2023]
Abstract
Endometrial cancer (EC) is the sixth most common cancer in women worldwide. Early diagnosis is critical in recurrent EC management. The present study aimed to identify biomarkers of EC early recurrence using a workflow that combined text and data mining databases (DisGeNET, Gene Expression Omnibus), a prioritization algorithm to select a set of putative candidates (ToppGene), protein-protein interaction network analyses (Search Tool for the Retrieval of Interacting Genes, cytoHubba), association analysis of selected genes with clinicopathological parameters, and survival analysis (Kaplan-Meier and Cox proportional hazard ratio analyses) using a cohort from The Cancer Genome Atlas. A total of 10 genes were identified, among which the targeting protein for Xklp2 (TPX2) was the most promising independent prognostic biomarker in stage I EC. TPX2 expression (mRNA and protein) was higher (P<0.0001 and P<0.001, respectively) in ETS variant transcription factor 5-overexpressing Hec1a and Ishikawa cells, a previously reported cell model of aggressive stage I EC. In EC biopsies, TPX2 mRNA expression levels were higher in high-grade tumors (grade 3) compared with grade 1–2 tumors (P<0.05), in tumors with deep myometrial invasion (>50% compared with <50%; P<0.01), and in intermediate-high recurrence risk tumors compared with low-risk tumors (P<0.05). Further validation studies in larger and independent EC cohorts will help confirm the prognostic value of TPX2.
Affiliation(s)
- María José Besso
- Laboratorio de Estudios de Interacción Celular en Reproducción y Cáncer, Instituto de Biología y Medicina Experimental (IBYME), Consejo Nacional de Investigaciones Científicas y Técnicas de Argentina (CONICET)‑Fundación IBYME (FIBYME), Ciudad Autónoma de Buenos Aires 1428ADN, Argentina
| | - Luciana Montivero
- Laboratorio de Estudios de Interacción Celular en Reproducción y Cáncer, Instituto de Biología y Medicina Experimental (IBYME), Consejo Nacional de Investigaciones Científicas y Técnicas de Argentina (CONICET)‑Fundación IBYME (FIBYME), Ciudad Autónoma de Buenos Aires 1428ADN, Argentina
| | - Ezequiel Lacunza
- Centro de Investigaciones Inmunológicas, Básicas y Aplicadas, Facultad de Ciencias Médicas, Universidad Nacional de La Plata, La Plata, Buenos Aires 1900, Argentina
| | - María Cecilia Argibay
- Laboratorio de Estudios de Interacción Celular en Reproducción y Cáncer, Instituto de Biología y Medicina Experimental (IBYME), Consejo Nacional de Investigaciones Científicas y Técnicas de Argentina (CONICET)‑Fundación IBYME (FIBYME), Ciudad Autónoma de Buenos Aires 1428ADN, Argentina
| | - Martín Abba
- Centro de Investigaciones Inmunológicas, Básicas y Aplicadas, Facultad de Ciencias Médicas, Universidad Nacional de La Plata, La Plata, Buenos Aires 1900, Argentina
| | - Laura Inés Furlong
- Integrative Biomedical Informatics Group, Research Programme on Biomedical Informatics, Hospital del Mar Medical Research Institute, Department of Experimental and Health Sciences, Universitat Pompeu Fabra, 08002 Barcelona, Spain
| | - Eva Colas
- Biomedical Research Group in Gynecology, Vall d'Hebron Research Institute (VHIR), Universitat Autónoma de Barcelona, CIBERONC, 08035 Barcelona, Spain
| | - Antonio Gil-Moreno
- Biomedical Research Group in Gynecology, Vall d'Hebron Research Institute (VHIR), Universitat Autónoma de Barcelona, CIBERONC, 08035 Barcelona, Spain
| | - Jaume Reventos
- Biomedical Research Group in Gynecology, Vall d'Hebron Research Institute (VHIR), Universitat Autónoma de Barcelona, CIBERONC, 08035 Barcelona, Spain
| | - Ricardo Bello
- Departamento de Metodología, Estadística y Matemática, Universidad de Tres de Febrero, Sáenz Peña, Buenos Aires B1674AHF, Argentina
| | - Mónica Hebe Vazquez-Levin
- Laboratorio de Estudios de Interacción Celular en Reproducción y Cáncer, Instituto de Biología y Medicina Experimental (IBYME), Consejo Nacional de Investigaciones Científicas y Técnicas de Argentina (CONICET)‑Fundación IBYME (FIBYME), Ciudad Autónoma de Buenos Aires 1428ADN, Argentina
| |
|
5
|
Monnin P, Legrand J, Husson G, Ringot P, Tchechmedjiev A, Jonquet C, Napoli A, Coulet A. PGxO and PGxLOD: a reconciliation of pharmacogenomic knowledge of various provenances, enabling further comparison. BMC Bioinformatics 2019; 20:139. [PMID: 30999867 PMCID: PMC6471679 DOI: 10.1186/s12859-019-2693-9] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.8] [Indexed: 12/01/2022]
Abstract
Background Pharmacogenomics (PGx) studies how genomic variations impact variations in drug response phenotypes. Knowledge in pharmacogenomics is typically composed of units that have the form of ternary relationships gene variant – drug – adverse event. Such a relationship states that an adverse event may occur for patients having the specified gene variant and being exposed to the specified drug. State-of-the-art knowledge in PGx is mainly available in reference databases such as PharmGKB and reported in scientific biomedical literature. However, PGx knowledge can also be discovered from clinical data, such as Electronic Health Records (EHRs), and in this case may either correspond to new knowledge or confirm state-of-the-art knowledge that lacks a “clinical counterpart” or validation. For this reason, there is a need for automatic comparison of knowledge units from distinct sources. Results In this article, we propose an approach, based on Semantic Web technologies, to represent and compare PGx knowledge units. To this end, we developed PGxO, a simple ontology that represents PGx knowledge units and their components. Combined with PROV-O, an ontology developed by the W3C to represent provenance information, PGxO enables encoding and associating provenance information to PGx relationships. Additionally, we introduce a set of rules to reconcile PGx knowledge, i.e. to identify when two relationships, potentially expressed using different vocabularies and levels of granularity, refer to the same, or to different knowledge units. We evaluated our ontology and rules by populating PGxO with knowledge units extracted from PharmGKB (2701), the literature (65,720) and from discoveries reported in EHR analysis studies (only 10, manually extracted); and by testing their similarity. We called PGxLOD (PGx Linked Open Data) the resulting knowledge base that represents and reconciles knowledge units of those various origins. Conclusions The proposed ontology and reconciliation rules constitute a first step toward a more complete framework for knowledge comparison in PGx. In this direction, the experimental instantiation of PGxO, named PGxLOD, illustrates the ability and difficulties of reconciling various existing knowledge sources.
Affiliation(s)
- Pierre Monnin
- Université de Lorraine, CNRS, Inria, LORIA, Nancy, 54000, France.
| | - Joël Legrand
- Université de Lorraine, CNRS, Inria, LORIA, Nancy, 54000, France
| | - Graziella Husson
- Université de Lorraine, CNRS, Inria, LORIA, Nancy, 54000, France
| | - Patrice Ringot
- Université de Lorraine, CNRS, Inria, LORIA, Nancy, 54000, France
| | | | - Clément Jonquet
- LIRMM, Université de Montpellier, CNRS, Montpellier, 34095, France; Stanford Center for Biomedical Informatics Research, Stanford University, Stanford, 94305, California, USA
| | - Amedeo Napoli
- Université de Lorraine, CNRS, Inria, LORIA, Nancy, 54000, France
| | - Adrien Coulet
- Université de Lorraine, CNRS, Inria, LORIA, Nancy, 54000, France; Stanford Center for Biomedical Informatics Research, Stanford University, Stanford, 94305, California, USA
| |
|
6
|
Abstract
Recent advances in technology have led to the exponential growth of scientific literature in biomedical sciences. This rapid increase in information has surpassed the threshold for manual curation efforts, necessitating the use of text mining approaches in the field of life sciences. One such application of text mining is in fostering in silico drug discovery such as drug target screening, pharmacogenomics, adverse drug event detection, etc. This chapter serves as an introduction to the applications of various text mining approaches in drug discovery. It is divided into two parts with the first half as an overview of text mining in the biosciences. The second half of the chapter reviews strategies and methods for four unique applications of text mining in drug discovery.
Affiliation(s)
- Si Zheng
- Institute of Medical Information and Library, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing, China
| | - Shazia Dharssi
- National Center for Biotechnology Information (NCBI), National Library of Medicine (NLM), National Institutes of Health (NIH), Bethesda, MD, USA
| | - Meng Wu
- Institute of Medical Information and Library, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing, China
| | - Jiao Li
- Institute of Medical Information and Library, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing, China
| | - Zhiyong Lu
- National Center for Biotechnology Information (NCBI), National Library of Medicine (NLM), National Institutes of Health (NIH), Bethesda, MD, USA.
| |
|
7
|
Vlietstra WJ, Vos R, Sijbers AM, van Mulligen EM, Kors JA. Using predicate and provenance information from a knowledge graph for drug efficacy screening. J Biomed Semantics 2018; 9:23. [PMID: 30189889 PMCID: PMC6127943 DOI: 10.1186/s13326-018-0189-6] [Citation(s) in RCA: 12] [Impact Index Per Article: 2.0] [Received: 04/10/2018] [Accepted: 08/01/2018] [Indexed: 12/11/2022]
Abstract
Background Biomedical knowledge graphs have become important tools to computationally analyse the comprehensive body of biomedical knowledge. They represent knowledge as subject-predicate-object triples, in which the predicate indicates the relationship between subject and object. A triple can also contain provenance information, which consists of references to the sources of the triple (e.g. scientific publications or database entries). Knowledge graphs have been used to classify drug-disease pairs for drug efficacy screening, but existing computational methods have often ignored predicate and provenance information. Using this information, we aimed to develop a supervised machine learning classifier and determine the added value of predicate and provenance information for drug efficacy screening. To ensure the biological plausibility of our method, we performed our research on the protein level, where drugs are represented by their drug target proteins, and diseases by their disease proteins. Results Using random forests with repeated 10-fold cross-validation, our method achieved an area under the ROC curve (AUC) of 78.1% and 74.3% for two reference sets. We benchmarked against a state-of-the-art knowledge-graph technique that does not use predicate and provenance information, obtaining AUCs of 65.6% and 64.6%, respectively. Classifiers that only used predicate information outperformed classifiers that only used provenance information, but using both performed best. Conclusion We conclude that both predicate and provenance information provide added value for drug efficacy screening.
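The triple representation described in this abstract can be sketched as a small data structure. The example below is an illustrative simplification, not the authors' implementation: it only looks at direct edges between one drug target protein and one disease protein, and the names `Triple` and `pair_features`, the feature keys, and the protein and source identifiers are all hypothetical.

```python
from collections import Counter
from dataclasses import dataclass
from typing import FrozenSet, Iterable


@dataclass(frozen=True)
class Triple:
    """A subject-predicate-object statement with optional provenance."""
    subject: str
    predicate: str
    obj: str
    # References to sources (e.g. publications or database entries).
    provenance: FrozenSet[str] = frozenset()


def pair_features(triples: Iterable[Triple], drug_protein: str,
                  disease_protein: str) -> Counter:
    """Count predicate types and provenance references linking two proteins.

    Assumption of this sketch: only direct edges, in either direction,
    between the two proteins are considered.
    """
    features: Counter = Counter()
    for t in triples:
        if {t.subject, t.obj} == {drug_protein, disease_protein}:
            features[f"predicate:{t.predicate}"] += 1
            features["provenance_count"] += len(t.provenance)
    return features


# Hypothetical identifiers, for illustration only.
graph = [
    Triple("proteinA", "inhibits", "proteinB", frozenset({"pub:1", "db:x"})),
    Triple("proteinB", "binds", "proteinA", frozenset({"pub:2"})),
    Triple("proteinA", "binds", "proteinC"),
]
print(pair_features(graph, "proteinA", "proteinB"))
```

Feature vectors of this kind could then be passed to a supervised classifier such as a random forest, which is the model family named in the abstract.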
Affiliation(s)
- Wytze J Vlietstra
- Department of Medical Informatics, Erasmus University Medical Centre, Rotterdam, 3015, GE, the Netherlands.
| | - Rein Vos
- Department of Medical Informatics, Erasmus University Medical Centre, Rotterdam, 3015, GE, the Netherlands; Department of Methodology and Statistics, Maastricht University, Maastricht, 6200, MD, the Netherlands
| | - Anneke M Sijbers
- Centre for Molecular and Biomolecular Informatics, Radboudumc, Nijmegen, 6525, GA, the Netherlands
| | - Erik M van Mulligen
- Department of Medical Informatics, Erasmus University Medical Centre, Rotterdam, 3015, GE, the Netherlands
| | - Jan A Kors
- Department of Medical Informatics, Erasmus University Medical Centre, Rotterdam, 3015, GE, the Netherlands
| |
|