1
Wang Y, Zuo J, Duan C, Peng H, Huang J, Zhao L, Zhang L, Dong Z. Large language models assisted multi-effect variants mining on cerebral cavernous malformation familial whole genome sequencing. Comput Struct Biotechnol J 2024; 23:843-858. [PMID: 38352937 PMCID: PMC10861960 DOI: 10.1016/j.csbj.2024.01.014] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/16/2023] [Revised: 01/04/2024] [Accepted: 01/19/2024] [Indexed: 02/16/2024]
Abstract
Cerebral cavernous malformation (CCM) is a polygenic disease with intricate genetic interactions contributing to quantitative pathogenesis across multiple factors. The principal pathogenic genes of CCM, specifically KRIT1, CCM2, and PDCD10, have been reported, accompanied by a growing wealth of genetic data related to mutations. Furthermore, numerous other molecules associated with CCM have been unearthed. However, tackling such massive volumes of unstructured data remained challenging until the advent of advanced large language models. In this study, we developed an automated analytical pipeline, called BRLM, specialized in biomedical text analysis related to single nucleotide variants (SNVs). To facilitate this, BioBERT was employed to vectorize the rich information of SNVs, while a deep residual network was used to discriminate the classes of the SNVs. BRLM was initially constructed on mutations from 12 different types of TCGA cancers, achieving an accuracy exceeding 99%. It was further examined for CCM mutations in familial sequencing data analysis, highlighting an upstream master regulator gene, fibroblast growth factor 1 (FGF1). With multi-omics characterization and validation in biological function, FGF1 was demonstrated to play a significant role in the development of CCMs, which proved the effectiveness of our model. The BRLM web server is available at http://1.117.230.196.
Affiliation(s)
- Yiqi Wang
  - College of Biomedicine and Health, College of Life Science and Technology, Huazhong Agricultural University, No. 1, Shizishan Street, Wuhan 430070, Hubei, China
  - Center for Neurological Disease Research, Taihe Hospital, Hubei University of Medicine, No. 32, Renmin South Road, Shiyan 442000, Hubei, China
  - Precision Medicine Research Center, Taihe Hospital, Hubei University of Medicine, No. 32, Renmin South Road, Shiyan 442000, Hubei, China
- Jinmei Zuo
  - Physical Examination Center, Taihe Hospital, Hubei University of Medicine, No. 32, Renmin South Road, Shiyan 442000, Hubei, China
- Chao Duan
  - College of Biomedicine and Health, College of Life Science and Technology, Huazhong Agricultural University, No. 1, Shizishan Street, Wuhan 430070, Hubei, China
  - Center for Neurological Disease Research, Taihe Hospital, Hubei University of Medicine, No. 32, Renmin South Road, Shiyan 442000, Hubei, China
- Hao Peng
  - Center for Neurological Disease Research, Taihe Hospital, Hubei University of Medicine, No. 32, Renmin South Road, Shiyan 442000, Hubei, China
  - Department of Neurosurgery, Taihe Hospital, Hubei University of Medicine, No. 32, Renmin South Road, Shiyan 442000, Hubei, China
- Jia Huang
  - The Second Clinical Medical College, Lanzhou University, No. 222, South Tianshui Road, Lanzhou 730030, Gansu, China
- Liang Zhao
  - Precision Medicine Research Center, Taihe Hospital, Hubei University of Medicine, No. 32, Renmin South Road, Shiyan 442000, Hubei, China
- Li Zhang
  - Center for Neurological Disease Research, Taihe Hospital, Hubei University of Medicine, No. 32, Renmin South Road, Shiyan 442000, Hubei, China
  - Department of Neurosurgery, Taihe Hospital, Hubei University of Medicine, No. 32, Renmin South Road, Shiyan 442000, Hubei, China
- Zhiqiang Dong
  - College of Biomedicine and Health, College of Life Science and Technology, Huazhong Agricultural University, No. 1, Shizishan Street, Wuhan 430070, Hubei, China
  - Center for Neurological Disease Research, Taihe Hospital, Hubei University of Medicine, No. 32, Renmin South Road, Shiyan 442000, Hubei, China
2
Gonzalez Hernandez F, Nguyen Q, Smith VC, Cordero JA, Ballester MR, Duran M, Solé A, Chotsiri P, Wattanakul T, Mundin G, Lilaonitkul W, Standing JF, Kloprogge F. Named entity recognition of pharmacokinetic parameters in the scientific literature. Sci Rep 2024; 14:23485. [PMID: 39379460 PMCID: PMC11461509 DOI: 10.1038/s41598-024-73338-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/16/2024] [Accepted: 09/16/2024] [Indexed: 10/10/2024]
Abstract
The development of accurate predictions for a new drug's absorption, distribution, metabolism, and excretion profiles in the early stages of drug development is crucial due to high candidate failure rates. The absence of comprehensive, standardised, and updated pharmacokinetic (PK) repositories limits pre-clinical predictions and often requires searching through the scientific literature for PK parameter estimates from similar compounds. While text mining offers promising advancements in automatic PK parameter extraction, accurate Named Entity Recognition (NER) of PK terms remains a bottleneck due to limited resources. This work addresses this gap by introducing novel corpora and language models specifically designed for effective NER of PK parameters. Leveraging active learning approaches, we developed an annotated corpus containing over 4000 entity mentions found across the PK literature on PubMed. To identify the most effective model for PK NER, we fine-tuned and evaluated different NER architectures on our corpus. Fine-tuning BioBERT exhibited the best results, achieving a strict F1 score of 90.37% in recognising PK parameter mentions, significantly outperforming heuristic approaches and models trained on existing corpora. To accelerate the development of end-to-end PK information extraction pipelines and improve pre-clinical PK predictions, the PK NER models and the labelled corpus were released open source at https://github.com/PKPDAI/PKNER.
Affiliation(s)
- Quang Nguyen
  - Institute of Health Informatics, University College London, London, UK
- Victoria C Smith
  - Institute of Health Informatics, University College London, London, UK
- Maria Rosa Ballester
  - Blanquerna School of Health Sciences, Ramon Llull University, Barcelona, Spain
  - Institut de Recerca Sant Pau Barcelona, Barcelona, Spain
- Màrius Duran
  - Blanquerna School of Health Sciences, Ramon Llull University, Barcelona, Spain
- Albert Solé
  - Blanquerna School of Health Sciences, Ramon Llull University, Barcelona, Spain
- Palang Chotsiri
  - Mahidol Oxford Tropical Medicine Research Unit, Faculty of Tropical Medicine, Mahidol University, Bangkok, Thailand
- Thanaporn Wattanakul
  - Mahidol Oxford Tropical Medicine Research Unit, Faculty of Tropical Medicine, Mahidol University, Bangkok, Thailand
- Gill Mundin
  - Department of Computer Science, University College London, London, UK
- Joseph F Standing
  - Great Ormond Street Institute for Child Health, University College London, London, UK
  - Department of Pharmacy, Great Ormond Street Hospital for Children, London, UK
- Frank Kloprogge
  - Institute for Global Health, University College London, London, UK
3
Geci R, Gadaleta D, de Lomana MG, Ortega-Vallbona R, Colombo E, Serrano-Candelas E, Paini A, Kuepfer L, Schaller S. Systematic evaluation of high-throughput PBK modelling strategies for the prediction of intravenous and oral pharmacokinetics in humans. Arch Toxicol 2024; 98:2659-2676. [PMID: 38722347 PMCID: PMC11272695 DOI: 10.1007/s00204-024-03764-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/12/2024] [Accepted: 04/23/2024] [Indexed: 07/26/2024]
Abstract
Physiologically based kinetic (PBK) modelling offers a mechanistic basis for predicting the pharmaco-/toxicokinetics of compounds and thereby provides critical information for integrating toxicity and exposure data to replace animal testing with in vitro or in silico methods. However, traditional PBK modelling depends on animal and human data, which limits its usefulness for non-animal methods. To address this limitation, high-throughput PBK modelling aims to rely exclusively on in vitro and in silico data for model generation. Here, we evaluate a variety of in silico tools and different strategies to parameterise PBK models with input values from various sources in a high-throughput manner. We gather 2000+ publicly available human in vivo concentration-time profiles of 200+ compounds (IV and oral administration), as well as in silico, in vitro and in vivo determined compound-specific parameters required for the PBK modelling of these compounds. Then, we systematically evaluate all possible PBK model parametrisation strategies in PK-Sim and quantify their prediction accuracy against the collected in vivo concentration-time profiles. Our results show that even simple, generic high-throughput PBK modelling can provide accurate predictions of the pharmacokinetics of most compounds (87% of Cmax and 84% of AUC within tenfold). Nevertheless, we also observe major differences in prediction accuracies between the different parameterisation strategies, as well as between different compounds. Finally, we outline a strategy for high-throughput PBK modelling that relies exclusively on freely available tools. Our findings contribute to a more robust understanding of the reliability of high-throughput PBK modelling, which is essential to establish the confidence necessary for its utilisation in Next-Generation Risk Assessment.
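The "within tenfold" accuracy figures in the abstract above can be illustrated with a minimal sketch. The abstract does not say how the fold error was computed, so this assumes the common symmetric definition (the larger of predicted/observed and observed/predicted). The function names and the sample values are hypothetical and are not taken from the paper.

```python
def fold_error(predicted: float, observed: float) -> float:
    """Symmetric fold error (assumed definition): always >= 1,
    whether the prediction is too high or too low."""
    if predicted <= 0 or observed <= 0:
        raise ValueError("predicted and observed values must be positive")
    ratio = predicted / observed
    return max(ratio, 1.0 / ratio)


def fraction_within_fold(predicted, observed, fold=10.0):
    """Fraction of predictions whose fold error does not exceed `fold`."""
    errors = [fold_error(p, o) for p, o in zip(predicted, observed)]
    return sum(e <= fold for e in errors) / len(errors)


# Hypothetical Cmax values, for illustration only.
predicted_cmax = [1.0, 5.0, 200.0, 0.5]
observed_cmax = [2.0, 4.0, 10.0, 0.5]
share = fraction_within_fold(predicted_cmax, observed_cmax)
```

Under this definition a twofold under-prediction and a twofold over-prediction count equally, which is why the ratio is taken in both directions.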
Affiliation(s)
- René Geci
  - esqLABS GmbH, Saterland, Germany
  - Institute for Systems Medicine with Focus on Organ Interaction, University Hospital RWTH Aachen, Aachen, Germany
- Marina García de Lomana
  - Machine Learning Research, Research and Development, Pharmaceuticals, Bayer AG, Berlin, Germany
- Erika Colombo
  - Istituto di Ricerche Farmacologiche Mario Negri IRCCS, Milan, Italy
- Lars Kuepfer
  - Institute for Systems Medicine with Focus on Organ Interaction, University Hospital RWTH Aachen, Aachen, Germany
4
Terranova N, Renard D, Shahin MH, Menon S, Cao Y, Hop CECA, Hayes S, Madrasi K, Stodtmann S, Tensfeldt T, Vaddady P, Ellinwood N, Lu J. Artificial Intelligence for Quantitative Modeling in Drug Discovery and Development: An Innovation and Quality Consortium Perspective on Use Cases and Best Practices. Clin Pharmacol Ther 2024; 115:658-672. [PMID: 37716910 DOI: 10.1002/cpt.3053] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 06/28/2023] [Accepted: 09/11/2023] [Indexed: 09/18/2023]
Abstract
Recent breakthroughs in artificial intelligence (AI) and machine learning (ML) have ushered in a new era of possibilities across various scientific domains. One area where these advancements hold significant promise is model-informed drug discovery and development (MID3). To foster a wider adoption and acceptance of these advanced algorithms, the Innovation and Quality (IQ) Consortium initiated the AI/ML working group in 2021 with the aim of promoting their acceptance among the broader scientific community as well as by regulatory agencies. By drawing insights from workshops organized by the working group and attended by key stakeholders across the biopharma industry, academia, and regulatory agencies, this white paper provides a perspective from the IQ Consortium. The range of applications covered in this white paper encompasses the following thematic topics: (i) AI/ML-enabled Analytics for Pharmacometrics and Quantitative Systems Pharmacology (QSP) Workflows; (ii) Explainable Artificial Intelligence and its Applications in Disease Progression Modeling; (iii) Natural Language Processing (NLP) in Quantitative Pharmacology Modeling; and (iv) AI/ML Utilization in Drug Discovery. Additionally, the paper offers a set of best practices to ensure an effective and responsible use of AI, including considering the context of use, explainability and generalizability of models, and having human-in-the-loop. We believe that embracing the transformative power of AI in quantitative modeling while adopting a set of good practices can unlock new opportunities for innovation, increase efficiency, and ultimately bring benefits to patients.
Affiliation(s)
- Nadia Terranova
  - Quantitative Pharmacology, Merck KGaA, Lausanne, Switzerland
- Didier Renard
  - Full Development Pharmacometrics, Novartis Pharma AG, Basel, Switzerland
- Sujatha Menon
  - Clinical Pharmacology, Pfizer Inc., Groton, Connecticut, USA
- Youfang Cao
  - Clinical Pharmacology and Translational Medicine, Eisai Inc., Nutley, New Jersey, USA
- Sean Hayes
  - Quantitative Pharmacology & Pharmacometrics, Merck & Co. Inc., Rahway, New Jersey, USA
- Kumpal Madrasi
  - Modeling & Simulation, Sanofi, Bridgewater, New Jersey, USA
- Sven Stodtmann
  - Pharmacometrics, AbbVie Deutschland GmbH & Co. KG, Ludwigshafen, Germany
- Pavan Vaddady
  - Quantitative Clinical Pharmacology, Daiichi Sankyo, Inc., Basking Ridge, New Jersey, USA
- James Lu
  - Clinical Pharmacology, Genentech Inc., South San Francisco, California, USA
5
Yang W, Mak W, Gwee A, Gu M, Wu Y, Shi Y, He Q, Xiang X, Han B, Zhu X. Establishment and Evaluation of a Parametric Population Pharmacokinetic Model Repository for Ganciclovir and Valganciclovir. Pharmaceutics 2023; 15:1801. [PMID: 37513988 PMCID: PMC10386724 DOI: 10.3390/pharmaceutics15071801] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Received: 05/11/2023] [Revised: 06/09/2023] [Accepted: 06/21/2023] [Indexed: 07/30/2023]
Abstract
(1) Background: Ganciclovir and valganciclovir are used for prophylaxis and treatment of cytomegalovirus infection. However, there is great interindividual variability in ganciclovir's pharmacokinetics (PK), highlighting the importance of individualized dosing. To facilitate model-informed precision dosing (MIPD), this study aimed to establish a parametric model repository of ganciclovir and valganciclovir by summarizing existing population pharmacokinetic information and analyzing the sources of variability. (2) Methods: A total of four databases were searched for published population PK models. We replicated these models, evaluated the impact of covariates on clearance, calculated the probability of target attainment for each model based on a predetermined dosing regimen, and developed an area under the concentration-time curve (AUC) calculator using maximum a posteriori Bayesian estimation. (3) Results: A total of 16 models, all one- or two-compartment models, were included. The most significant covariates were body size (weight and body surface area) and renal function. The results show that 5 mg/kg/12 h of ganciclovir could bring the AUC0-24h within 40-80 mg·h/L in 50.03% of pediatric patients but cause the AUC0-24h to exceed the exposure threshold for toxicity (120 mg·h/L) in 51.24% of adults. (4) Conclusions: Dosing regimens of ganciclovir and valganciclovir should be adjusted according to body size and renal function. This model repository has a broad range of potential applications in MIPD.
Affiliation(s)
- Wenyu Yang
  - Department of Clinical Pharmacy and Pharmacy Administration, School of Pharmacy, Fudan University, Shanghai 201203, China
  - Department of Pharmacy, Minhang Hospital, Fudan University, Shanghai 201199, China
- Wenyao Mak
  - Department of Clinical Pharmacy and Pharmacy Administration, School of Pharmacy, Fudan University, Shanghai 201203, China
- Amanda Gwee
  - Department of General Medicine, Royal Children's Hospital, Parkville, VIC 3052, Australia
  - Infectious Diseases Group, Murdoch Children's Research Institute, Parkville, VIC 3052, Australia
  - Department of Paediatrics, The University of Melbourne, Parkville, VIC 3010, Australia
- Meng Gu
  - Department of Clinical Pharmacy and Pharmacy Administration, School of Pharmacy, Fudan University, Shanghai 201203, China
  - Department of Pharmacy, Minhang Hospital, Fudan University, Shanghai 201199, China
- Yue Wu
  - Department of Clinical Pharmacy, Shenzhen Children's Hospital Affiliated to Shantou University Medical College, Shenzhen 518038, China
- Yufei Shi
  - Department of Clinical Pharmacy and Pharmacy Administration, School of Pharmacy, Fudan University, Shanghai 201203, China
- Qingfeng He
  - Department of Clinical Pharmacy and Pharmacy Administration, School of Pharmacy, Fudan University, Shanghai 201203, China
- Xiaoqiang Xiang
  - Department of Clinical Pharmacy and Pharmacy Administration, School of Pharmacy, Fudan University, Shanghai 201203, China
- Bing Han
  - Department of Pharmacy, Minhang Hospital, Fudan University, Shanghai 201199, China
- Xiao Zhu
  - Department of Clinical Pharmacy and Pharmacy Administration, School of Pharmacy, Fudan University, Shanghai 201203, China
6
Grzegorzewski J, Brandhorst J, König M. Physiologically based pharmacokinetic (PBPK) modeling of the role of CYP2D6 polymorphism for metabolic phenotyping with dextromethorphan. Front Pharmacol 2022; 13:1029073. [PMID: 36353484 PMCID: PMC9637881 DOI: 10.3389/fphar.2022.1029073] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 08/26/2022] [Accepted: 09/23/2022] [Indexed: 11/24/2022]
Abstract
The cytochrome P450 2D6 (CYP2D6) is a key xenobiotic-metabolizing enzyme involved in the clearance of many drugs. Genetic polymorphisms in CYP2D6 contribute to the large inter-individual variability in drug metabolism and could affect metabolic phenotyping of CYP2D6 probe substances such as dextromethorphan (DXM). To study this question, we (i) established an extensive pharmacokinetics dataset for DXM; and (ii) developed and validated a physiologically based pharmacokinetic (PBPK) model of DXM and its metabolites dextrorphan (DXO) and dextrorphan O-glucuronide (DXO-Glu) based on the data. Drug-gene interactions (DGI) were introduced by accounting for changes in CYP2D6 enzyme kinetics depending on activity score (AS), which in combination with AS for individual polymorphisms allowed us to model CYP2D6 gene variants. Variability in CYP3A4 and CYP2D6 activity was modeled based on in vitro data from human liver microsomes. Model predictions are in very good agreement with pharmacokinetics data for CYP2D6 polymorphisms, CYP2D6 activity as described by the AS system, and CYP2D6 metabolic phenotypes (UM, EM, IM, PM). The model was applied to investigate the genotype-phenotype association and the role of CYP2D6 polymorphisms for metabolic phenotyping using the urinary cumulative metabolic ratio (UCMR), DXM/(DXO + DXO-Glu). The effect of parameters on UCMR was studied via sensitivity analysis. Model predictions indicate very good robustness against the intervention protocol (i.e. application form, dosing amount, dissolution rate, and sampling time) and good robustness against physiological variation. The model is capable of estimating the UCMR dispersion within and across populations depending on activity scores. Moreover, the distribution of UCMR and the risk of genotype-phenotype mismatch could be estimated for populations with known CYP2D6 genotype frequencies. The model can be applied for individual prediction of UCMR and metabolic phenotype based on CYP2D6 genotype. 
Both the model and the database are freely available for reuse.
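The urinary cumulative metabolic ratio named in the abstract above, UCMR = DXM/(DXO + DXO-Glu), can be written out as a minimal sketch. The function name, the argument names, and the assumption that all three cumulative urinary amounts are expressed in the same molar unit are illustrative choices, not details given in the abstract.

```python
def ucmr(dxm: float, dxo: float, dxo_glu: float) -> float:
    """Urinary cumulative metabolic ratio, DXM / (DXO + DXO-Glu).

    Arguments are cumulative urinary amounts of dextromethorphan,
    dextrorphan and dextrorphan O-glucuronide over the collection
    interval, assumed here to share one molar unit.
    """
    metabolites = dxo + dxo_glu
    if metabolites <= 0:
        raise ValueError("total metabolite amount must be positive")
    return dxm / metabolites


# Hypothetical amounts, for illustration only.
example_ratio = ucmr(dxm=1.0, dxo=3.0, dxo_glu=1.0)
```

A larger ratio means more unchanged parent drug relative to its metabolites, which is the direction expected for lower CYP2D6 activity.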
Affiliation(s)
- Jan Grzegorzewski
  - Institute for Theoretical Biology, Institute of Biology, Humboldt University, Berlin, Germany
7
Grzegorzewski J, Bartsch F, Köller A, König M. Pharmacokinetics of Caffeine: A Systematic Analysis of Reported Data for Application in Metabolic Phenotyping and Liver Function Testing. Front Pharmacol 2022; 12:752826. [PMID: 35280254 PMCID: PMC8914174 DOI: 10.3389/fphar.2021.752826] [Citation(s) in RCA: 29] [Impact Index Per Article: 14.5] [Received: 08/03/2021] [Accepted: 12/03/2021] [Indexed: 01/13/2023]
Abstract
Caffeine is by far the most ubiquitous psychostimulant worldwide found in tea, coffee, cocoa, energy drinks, and many other beverages and food. Caffeine is almost exclusively metabolized in the liver by the cytochrome P-450 enzyme system to the main product paraxanthine and the additional products theobromine and theophylline. Besides its stimulating properties, two important applications of caffeine are metabolic phenotyping of cytochrome P450 1A2 (CYP1A2) and liver function testing. An open challenge in this context is to identify underlying causes of the large inter-individual variability in caffeine pharmacokinetics. Data is urgently needed to understand and quantify confounding factors such as lifestyle (e.g., smoking), the effects of drug-caffeine interactions (e.g., medication metabolized via CYP1A2), and the effect of disease. Here we report the first integrative and systematic analysis of data on caffeine pharmacokinetics from 141 publications and provide a comprehensive high-quality data set on the pharmacokinetics of caffeine, caffeine metabolites, and their metabolic ratios in human adults. The data set is enriched by meta-data on the characteristics of studied patient cohorts and subjects (e.g., age, body weight, smoking status, health status), the applied interventions (e.g., dosing, substance, route of application), measured pharmacokinetic time-courses, and pharmacokinetic parameters (e.g., clearance, half-life, area under the curve). We demonstrate via multiple applications how the data set can be used to solidify existing knowledge and gain new insights relevant for metabolic phenotyping and liver function testing based on caffeine. 
Specifically, we analyzed 1) the alteration of caffeine pharmacokinetics with smoking and use of oral contraceptives; 2) drug-drug interactions with caffeine as possible confounding factors of caffeine pharmacokinetics or source of adverse effects; 3) alteration of caffeine pharmacokinetics in disease; and 4) the applicability of caffeine as a salivary test substance by comparison of plasma and saliva data. In conclusion, our data set and analyses provide important resources which could enable more accurate caffeine-based metabolic phenotyping and liver function testing.