1
Xue L, He S, Singla RK, Qin Q, Ding Y, Liu L, Ding X, Bediaga-Bañeres H, Arrasate S, Durado-Sanchez A, Zhang Y, Shen Z, Shen B, Miao L, González-Díaz H. Machine learning guided prediction of warfarin blood levels for personalized medicine based on clinical longitudinal data from cardiac surgery patients: a prospective observational study. Int J Surg 2024; 110:6528-6540. [PMID: 38833337 PMCID: PMC11487003 DOI: 10.1097/js9.0000000000001734] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/19/2024] [Accepted: 05/19/2024] [Indexed: 06/06/2024]
Abstract
BACKGROUND Warfarin is a common oral anticoagulant, and its effects vary widely among individuals. Numerous dose-prediction algorithms have been reported based on cross-sectional data generated via multiple linear regression or machine learning. This study aimed to construct an information fusion perturbation theory and machine-learning prediction model of warfarin blood levels based on clinical longitudinal data from cardiac surgery patients. METHODS AND MATERIAL The data of 246 patients were obtained from electronic medical records. Continuous variables were processed by calculating the distance of the raw data from the moving average (MA Δv_ki(s_j)), and categorical variables in different attribute groups were processed using Euclidean distance (ED ‖Δv_k(s_j)‖). Regression and classification analyses were performed on the raw data, MA Δv_ki(s_j), and ED ‖Δv_k(s_j)‖. Different machine-learning algorithms were chosen from the STATISTICA and WEKA software. RESULTS The random forest (RF) algorithm was the best for predicting continuous outputs using the raw data. The correlation coefficients of the RF algorithm were 0.978 and 0.595 for the training and validation sets, respectively, and the mean absolute errors were 0.135 and 0.362 for the training and validation sets, respectively. The proportion of ideal predictions of the RF algorithm was 59.0%. General discriminant analysis (GDA) was the best algorithm for predicting the categorical outputs using the MA Δv_ki(s_j) data. The GDA algorithm's total true positive rate (TPR) was 95.4% and 95.6% for the training and validation sets, respectively, with MA Δv_ki(s_j) data. CONCLUSIONS An information fusion perturbation theory and machine-learning model for predicting warfarin blood levels was established. 
A model based on the RF algorithm could be used to predict the target international normalized ratio (INR), and a model based on the GDA algorithm could be used to predict the probability of being within the target INR range under different clinical scenarios.
Affiliation(s)
- Ling Xue
  - Department of Pharmacy, the First Affiliated Hospital of Soochow University
  - Department of Pharmacology, Faculty of Medicine, University of The Basque Country (UPV/EHU), Bilbao, Basque Country
- Shan He
  - Department of Organic and Inorganic Chemistry, Faculty of Science and Technology, University of The Basque Country (UPV/EHU), Bilbao, Basque Country, Spain
  - IKERDATA S.L., ZITEK, University of The Basque Country (UPV/EHU), Bilbao, Basque Country
- Rajeev K. Singla
  - Joint Laboratory of Artificial Intelligence for Critical Care Medicine, Department of Critical Care Medicine and Institutes for Systems Genetics, Frontiers Science Center for Disease-related Molecular Network, West China Hospital, Sichuan University, Chengdu, China
  - School of Pharmaceutical Sciences, Lovely Professional University, Phagwara, Punjab, India
- Qiong Qin
  - Department of Pharmacy, the First Affiliated Hospital of Soochow University
- Yinglong Ding
  - Department of Cardiovascular Surgery, the First Affiliated Hospital of Soochow University
  - Institute for Cardiovascular Science, Soochow University
- Linsheng Liu
  - Department of Pharmacy, the First Affiliated Hospital of Soochow University
- Xiaoliang Ding
  - Department of Pharmacy, the First Affiliated Hospital of Soochow University
- Harbil Bediaga-Bañeres
  - IKERDATA S.L., ZITEK, University of The Basque Country (UPV/EHU), Bilbao, Basque Country
  - Department of Painting, Faculty of Fine Arts, University of the Basque Country UPV/EHU, 48940, Leioa, Biscay
- Sonia Arrasate
  - Department of Organic and Inorganic Chemistry, Faculty of Science and Technology, University of The Basque Country (UPV/EHU), Bilbao, Basque Country, Spain
- Aliuska Durado-Sanchez
  - IKERDATA S.L., ZITEK, University of The Basque Country (UPV/EHU), Bilbao, Basque Country
  - Department of Public Law, Faculty of Law, University of The Basque Country (UPV/EHU), Leioa, Biscay, Basque Country
- Yuzhen Zhang
  - Department of Cardiology, the First Affiliated Hospital of Soochow University
- Zhenya Shen
  - Department of Cardiovascular Surgery, the First Affiliated Hospital of Soochow University
  - Institute for Cardiovascular Science, Soochow University
- Bairong Shen
  - Joint Laboratory of Artificial Intelligence for Critical Care Medicine, Department of Critical Care Medicine and Institutes for Systems Genetics, Frontiers Science Center for Disease-related Molecular Network, West China Hospital, Sichuan University, Chengdu, China
- Liyan Miao
  - Department of Pharmacy, the First Affiliated Hospital of Soochow University
  - Institute for Interdisciplinary Drug Research and Translational Sciences, Soochow University
- Humberto González-Díaz
  - Department of Organic and Inorganic Chemistry, Faculty of Science and Technology, University of The Basque Country (UPV/EHU), Bilbao, Basque Country, Spain
  - BIOFISIKA: Basque Center for Biophysics CSIC, University of The Basque Country (UPV/EHU), Bilbao, Basque Country
  - IKERBASQUE, Basque Foundation for Science, Bilbao, Basque Country, Spain
2
Xue L, Singla RK, He S, Arrasate S, González-Díaz H, Miao L, Shen B. Warfarin-A natural anticoagulant: A review of research trends for precision medication. Phytomedicine: International Journal of Phytotherapy and Phytopharmacology 2024; 128:155479. [PMID: 38493714 DOI: 10.1016/j.phymed.2024.155479] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/27/2023] [Revised: 01/29/2024] [Accepted: 02/22/2024] [Indexed: 03/19/2024]
Abstract
BACKGROUND Warfarin is a widely prescribed anticoagulant in the clinic. Its effect varies considerably among individuals, and many factors contribute to this variability. Mathematical models can quantify the impact of these factors on individual variability. PURPOSE The aim is to comprehensively analyze advanced warfarin dosing algorithms based on pharmacometric and machine learning models for personalized warfarin dosing. METHODS A bibliometric analysis of the literature retrieved from PubMed and Scopus was performed using VOSviewer. Relevant literature reporting the calculation of precise warfarin dosage was retrieved from the databases. The multiple linear regression (MLR) algorithm was excluded because a recent systematic review that mainly covered this algorithm has already been reported. The terms quantitative systems pharmacology, mechanistic model, physiologically based pharmacokinetic model, artificial intelligence, machine learning, pharmacokinetic, pharmacodynamic, pharmacokinetics, pharmacodynamics, and warfarin were entered into the PubMed query box as MeSH Terms or Title/Abstract terms, and the filters "humans" and "English" were then applied to retrieve the literature. RESULTS Bibliometric analysis revealed important co-occurring MeSH and index keywords. Further, the United States, China, and the United Kingdom were among the top countries contributing to this domain. Some studies have established personalized warfarin dosage models using pharmacometrics and machine learning-based algorithms. There were 54 related studies, including 14 pharmacometric models, 31 artificial intelligence models, and 9 model evaluations. Each model has its advantages and disadvantages. The pharmacometric model contains biological or pharmacological mechanisms in its structure. The process of pharmacometric model development is very time- and labor-intensive. 
Machine learning is a purely data-driven approach; its parameters are more mathematical and have less biological interpretation. However, it is faster, more efficient, and less time-consuming. Most published models of machine learning algorithms were established based on cross-sectional data sourced from the database. CONCLUSION Future research on personalized warfarin medication should focus on combining the advantages of machine learning and pharmacometrics algorithms to establish a more robust warfarin dosage algorithm. Randomized controlled trials should be performed to evaluate the established algorithm of warfarin dosage. Moreover, a more user-friendly and accessible warfarin precision medicine platform should be developed.
Affiliation(s)
- Ling Xue
  - Joint Laboratory of Artificial Intelligence for Critical Care Medicine, Department of Critical Care Medicine and Institutes for Systems Genetics, Frontiers Science Center for Disease-related Molecular Network, West China Hospital, Sichuan University, Chengdu, China
  - Department of Pharmacy, The First Affiliated Hospital of Soochow University, Suzhou, China
  - Department of Pharmacology, Faculty of Medicine, University of The Basque Country (UPV/EHU), Bilbao, Basque Country, Spain
- Rajeev K Singla
  - Joint Laboratory of Artificial Intelligence for Critical Care Medicine, Department of Critical Care Medicine and Institutes for Systems Genetics, Frontiers Science Center for Disease-related Molecular Network, West China Hospital, Sichuan University, Chengdu, China
  - School of Pharmaceutical Sciences, Lovely Professional University, Phagwara, Punjab-144411, India
- Shan He
  - IKERDATA S.L., ZITEK, University of The Basque Country (UPV/EHU), Rectorate Building, 48940, Bilbao, Basque Country, Spain
  - Department of Organic and Inorganic Chemistry, Faculty of Science and Technology, University of The Basque Country (UPV/EHU), P.O. Box 644, 48080, Bilbao, Basque Country, Spain
- Sonia Arrasate
  - Department of Organic and Inorganic Chemistry, Faculty of Science and Technology, University of The Basque Country (UPV/EHU), P.O. Box 644, 48080, Bilbao, Basque Country, Spain
- Humberto González-Díaz
  - Department of Organic and Inorganic Chemistry, Faculty of Science and Technology, University of The Basque Country (UPV/EHU), P.O. Box 644, 48080, Bilbao, Basque Country, Spain
  - BIOFISIKA: Basque Center for Biophysics CSIC, University of The Basque Country (UPV/EHU), Barrio Sarriena s/n, Leioa, Bizkaia 48940, Basque Country, Spain
  - IKERBASQUE, Basque Foundation for Science, 48011, Bilbao, Basque Country, Spain
- Liyan Miao
  - Department of Pharmacy, The First Affiliated Hospital of Soochow University, Suzhou, China
  - Institute for Interdisciplinary Drug Research and Translational Sciences, Soochow University, Suzhou, China
  - College of Pharmaceutical Sciences, Soochow University, Suzhou, China
- Bairong Shen
  - Joint Laboratory of Artificial Intelligence for Critical Care Medicine, Department of Critical Care Medicine and Institutes for Systems Genetics, Frontiers Science Center for Disease-related Molecular Network, West China Hospital, Sichuan University, Chengdu, China
3
Xu L, Guo C, Liu M. A weighted distance-based dynamic ensemble regression framework for gastric cancer survival time prediction. Artif Intell Med 2024; 147:102740. [PMID: 38184344 DOI: 10.1016/j.artmed.2023.102740] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/26/2022] [Revised: 10/28/2023] [Accepted: 11/28/2023] [Indexed: 01/08/2024]
Abstract
Accurate prediction of gastric cancer patient survival time is essential for clinical decision-making. However, unified static models lack specificity and flexibility in predictions owing to the varying survival outcomes among gastric cancer patients. We address these problems by using an ensemble learning approach and adaptively assigning greater weights to similar patients to make more targeted predictions when predicting an individual's survival time. We treat these problems as regression problems and introduce a weighted dynamic ensemble regression framework. To better identify similar patients, we devise a method to measure patient similarity, considering the diverse impacts of features. Subsequently, we use this measure to design both a weighted K-means clustering method and a fuzzy K-means sampling technique to group patients and train corresponding base regressors. To achieve more targeted predictions, we calculate the weight of each base regressor based on the similarity between the patient to be predicted and the patient clusters, culminating in the integration of the results. The model is validated on a dataset of 7791 patients, outperforming other models in terms of three evaluation metrics, namely, the root mean square error, mean absolute error, and the coefficient of determination. The weighted dynamic ensemble regression strategy can improve the baseline model by 1.75%, 2.12%, and 13.45% in terms of the three respective metrics while also mitigating the imbalanced survival time distribution issue. This enhanced performance has been statistically validated, even when tested on six public datasets with different sizes. By considering feature variations, patients with distinct survival profiles can be effectively differentiated, and the model predictive performance can be enhanced. The results generated by our proposed model can be invaluable in guiding decisions related to treatment plans and resource allocation. 
Furthermore, the model has the potential for broader applications in prognosis for other types of cancers or similar regression problems in various domains.
Affiliation(s)
- Liangchen Xu
  - Institute of Systems Engineering, Dalian University of Technology, Dalian 116024, China
- Chonghui Guo
  - Institute of Systems Engineering, Dalian University of Technology, Dalian 116024, China
- Mucan Liu
  - Institute of Systems Engineering, Dalian University of Technology, Dalian 116024, China
4
Iancu A, Leb I, Prokosch HU, Rödle W. Machine learning in medication prescription: A systematic review. Int J Med Inform 2023; 180:105241. [PMID: 37939541 DOI: 10.1016/j.ijmedinf.2023.105241] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/06/2023] [Revised: 09/17/2023] [Accepted: 09/27/2023] [Indexed: 11/10/2023]
Abstract
BACKGROUND Medication prescription is a complex process that could benefit from current research and development in machine learning through decision support systems. Pediatricians in particular are forced to prescribe medications "off-label" because children are still underrepresented in clinical studies, which leads to a high risk of incorrect doses and adverse drug effects. METHODS PubMed, IEEE Xplore and PROSPERO were searched, following the PRISMA statement, for relevant studies that developed and evaluated well-performing machine learning algorithms. Quality assessment was conducted in accordance with the IJMEDI checklist. Identified studies were reviewed in detail, including the variables required for predicting the correct dose, especially in pediatric medication prescription. RESULTS The search identified 656 studies, of which 64 were reviewed in detail and 36 met the inclusion criteria. According to the IJMEDI checklist, five studies were considered to be of high quality. Nineteen of the 36 studies dealt with the active substance warfarin. Overall, machine learning algorithms based on decision trees or regression methods showed superior predictive power compared with algorithms based on neural networks, support vector machines or other methods. The use of ensemble methods such as bagging or boosting generally enhanced the accuracy of the dose predictions. The required input and output variables of the algorithms were considerably heterogeneous and differed strongly among the respective substances. CONCLUSIONS By using machine learning algorithms, the prescription process could be simplified and dosing correctness could be enhanced. Despite the heterogeneous results among the different substances and cases and the lack of pediatric use cases, the identified approaches and required variables can serve as an excellent starting point for further development of algorithms predicting drug doses, particularly for children. 
Especially the combination of physiologically-based pharmacokinetic models with machine learning algorithms represents a great opportunity to enhance the predictive power and accuracy of the developed algorithms.
Affiliation(s)
- Alexa Iancu
  - Chair of Medical Informatics, Friedrich-Alexander-Universität Erlangen-Nürnberg (FAU), Wetterkreuz 15, 91058 Erlangen, Germany
- Ines Leb
  - Chair of Medical Informatics, Friedrich-Alexander-Universität Erlangen-Nürnberg (FAU), Wetterkreuz 15, 91058 Erlangen, Germany
- Hans-Ulrich Prokosch
  - Chair of Medical Informatics, Friedrich-Alexander-Universität Erlangen-Nürnberg (FAU), Wetterkreuz 15, 91058 Erlangen, Germany
- Wolfgang Rödle
  - Chair of Medical Informatics, Friedrich-Alexander-Universität Erlangen-Nürnberg (FAU), Wetterkreuz 15, 91058 Erlangen, Germany
5
Sonawane R, Patil H. A design and implementation of heart disease prediction model using data and ECG signal through hybrid clustering. Computer Methods in Biomechanics and Biomedical Engineering: Imaging & Visualization 2022. [DOI: 10.1080/21681163.2022.2156927] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/25/2022]
Affiliation(s)
- Ritesh Sonawane
  - Computer Engineering, S.S.V.P.S B.S.Deore College of Engineering, Dhule, Maharashtra, India
- Hitendra Patil
  - Computer Engineering, S.S.V.P.S B.S.Deore College of Engineering, Dhule, Maharashtra, India
6
Janssen A, Bennis FC, Mathôt RAA. Adoption of Machine Learning in Pharmacometrics: An Overview of Recent Implementations and Their Considerations. Pharmaceutics 2022; 14:1814. [PMID: 36145562 PMCID: PMC9502080 DOI: 10.3390/pharmaceutics14091814] [Citation(s) in RCA: 21] [Impact Index Per Article: 7.0] [Received: 07/20/2022] [Revised: 08/17/2022] [Accepted: 08/22/2022] [Indexed: 11/23/2022] Open
Abstract
Pharmacometrics is a multidisciplinary field utilizing mathematical models of physiology, pharmacology, and disease to describe and quantify the interactions between medication and patient. As these models become more and more advanced, the need for advanced data analysis tools grows. Recently, there has been much interest in the adoption of machine learning (ML) algorithms. These algorithms offer strong function approximation capabilities and might reduce the time spent on model development. However, ML tools are not yet an integral part of the pharmacometrics workflow. The goal of this work is to discuss how ML algorithms have been applied in four stages of the pharmacometrics pipeline: data preparation, hypothesis generation, predictive modelling, and model validation. We will also discuss considerations before the use of ML algorithms with respect to each topic. We conclude by summarizing applications that hold potential for adoption by pharmacometricians.
Affiliation(s)
- Alexander Janssen
  - Department of Clinical Pharmacology, Hospital Pharmacy, Amsterdam University Medical Center, 1105 Amsterdam, The Netherlands
- Frank C. Bennis
  - Quantitative Data Analytics Group, Department of Computer Science, Vrije Universiteit Amsterdam, 1081 Amsterdam, The Netherlands
- Ron A. A. Mathôt
  - Department of Clinical Pharmacology, Hospital Pharmacy, Amsterdam University Medical Center, 1105 Amsterdam, The Netherlands
7
Zhang F, Liu Y, Ma W, Zhao S, Chen J, Gu Z. Nonlinear Machine Learning in Warfarin Dose Prediction: Insights from Contemporary Modelling Studies. J Pers Med 2022; 12:717. [PMID: 35629140 PMCID: PMC9147332 DOI: 10.3390/jpm12050717] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 03/14/2022] [Revised: 04/26/2022] [Accepted: 04/28/2022] [Indexed: 02/01/2023] Open
Abstract
Objective: This study aimed to systematically assess the characteristics and risk of bias of previous studies that have investigated nonlinear machine learning algorithms for warfarin dose prediction. Methods: We systematically searched PubMed, Embase, Cochrane Library, Chinese National Knowledge Infrastructure (CNKI), China Biology Medicine (CBM), China Science and Technology Journal Database (VIP), and Wanfang Database up to March 2022. We assessed the general characteristics of the included studies with respect to the participants, predictors, model development, and model evaluation. The methodological quality of the studies was determined, and the risk of bias was evaluated using the Prediction model Risk of Bias Assessment Tool (PROBAST). Results: From a total of 8996 studies, 23 were assessed in this study, of which 23 (100%) were retrospective, and 11 studies focused on the Asian population. The most common demographic and clinical predictors were age (21/23, 91%), weight (17/23, 74%), height (12/23, 52%), and amiodarone combination (11/23, 48%), while CYP2C9 (14/23, 61%), VKORC1 (14/23, 61%), and CYP4F2 (5/23, 22%) were the most common genetic predictors. Of the included studies, the MAE ranged from 1.47 to 10.86 mg/week in model development studies, from 2.42 to 5.18 mg/week in model development with external validation (same data) studies, from 12.07 to 17.59 mg/week in model development with external validation (another data) studies, and from 4.40 to 4.84 mg/week in model external validation studies. All studies were evaluated as having a high risk of bias. Factors contributing to the risk of bias include inappropriate exclusion of participants (10/23, 43%), small sample size (15/23, 65%), poor handling of missing data (20/23, 87%), and incorrect method of selecting predictors (8/23, 35%). Conclusions: Most studies on nonlinear-machine-learning-based warfarin prediction models show poor methodological quality and have a high risk of bias. 
The analysis domain is the major contributor to the overall high risk of bias. External validity and model reproducibility are lacking in most studies. Future studies should focus on external validity, diminish risk of bias, and enhance real-world clinical relevance.
Affiliation(s)
- Fengying Zhang
  - Department of Evidence-Based Medicine and Clinical Epidemiology, West China Hospital, Sichuan University, Chengdu 610041, China
- Yan Liu
  - Department of Clinical Pharmacy, Xinhua Hospital, School of Medicine, Shanghai Jiao Tong University, Shanghai 200092, China
- Weijie Ma
  - Department of Evidence-Based Medicine and Clinical Epidemiology, West China Hospital, Sichuan University, Chengdu 610041, China
- Shengming Zhao
  - Department of Evidence-Based Medicine and Clinical Epidemiology, West China Hospital, Sichuan University, Chengdu 610041, China
- Jin Chen
  - Department of Evidence-Based Medicine and Clinical Epidemiology, West China Hospital, Sichuan University, Chengdu 610041, China
  - Correspondence: (J.C.); (Z.G.)
- Zhichun Gu
  - Department of Pharmacy, Ren Ji Hospital, School of Medicine, Shanghai Jiao Tong University, Shanghai 200127, China
  - Shanghai Anticoagulation Pharmacist Alliance, Shanghai Pharmaceutical Association, Shanghai 200040, China
  - Correspondence: (J.C.); (Z.G.)
8
Ma W, Li H, Dong L, Zhou Q, Fu B, Hou JL, Wang J, Qin W, Chen J. Warfarin maintenance dose prediction for Chinese after heart valve replacement by a feedforward neural network with equal stratified sampling. Sci Rep 2021; 11:13778. [PMID: 34215839 PMCID: PMC8253817 DOI: 10.1038/s41598-021-93317-2] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Received: 11/30/2020] [Accepted: 06/23/2021] [Indexed: 02/05/2023] Open
Abstract
Patients requiring low-dose warfarin are more likely to suffer bleeding due to overdose. The goal of this work is to improve the feedforward neural network model's precision in predicting the low maintenance dose for Chinese patients through training data construction. We built the model from a resampled dataset created by equal stratified sampling (maintaining the same sample number in three dose groups, with a total of 3639) and performed internal and external validations. Compared with the model trained on the raw dataset of 19,060 eligible cases, we improved the low-dose group's ideal prediction percentage from 0.7% to 9.6% and maintained the overall performance (76.4% vs. 75.6%) in external validation. We further built neural network models on single-dose subsets to investigate whether the subset samples were sufficient and whether the selected factors were appropriate. The training set sizes were 1340 and 1478 for the low- and high-dose subsets; the corresponding ideal prediction percentages were 70.2% and 75.1%. The training set size for the intermediate dose varied and was 1553, 6214, and 12,429; the corresponding ideal prediction percentages were 95.6%, 95.1%, and 95.3%. Our conclusion is that equal stratified sampling is an alternative approach worth considering in training data construction for building drug dosing models in the clinic.
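The equal stratified sampling idea described in this abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: it assumes that balancing is done by down-sampling every dose group to the size of the smallest one, and the function names and the dose cut-offs (1.5 and 4.5 mg/day) are hypothetical values chosen only for the example.

```python
import random
from collections import defaultdict


def equal_stratified_sample(records, group_of, seed=0):
    """Down-sample every stratum to the size of the smallest stratum."""
    strata = defaultdict(list)
    for rec in records:
        strata[group_of(rec)].append(rec)
    # Every group contributes the same number of samples.
    n_per_group = min(len(members) for members in strata.values())
    rng = random.Random(seed)
    balanced = []
    for key in sorted(strata):
        balanced.extend(rng.sample(strata[key], n_per_group))
    return balanced


def dose_group(dose_mg_per_day, low=1.5, high=4.5):
    """Assign a dose to a group; the cut-offs are hypothetical."""
    if dose_mg_per_day < low:
        return "low"
    if dose_mg_per_day > high:
        return "high"
    return "intermediate"


# Toy, imbalanced data: 2 low, 6 intermediate, 3 high doses.
doses = [1.0, 1.25, 2.5, 3.0, 3.0, 3.75, 3.0, 2.5, 5.0, 6.0, 5.5]
balanced = equal_stratified_sample(doses, dose_group)
group_counts = {g: sum(1 for d in balanced if dose_group(d) == g)
                for g in ("low", "intermediate", "high")}
```

With this toy input each group ends up with two samples, so the rare low-dose group is no longer outnumbered by the intermediate group during training.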
Affiliation(s)
- Weijie Ma
  - Department of Evidence-Based Medicine and Clinical Epidemiology, School of Medicine/West China Hospital, Sichuan University, No. 17, Section 3, Renmin South Road, Chengdu, 610041, Sichuan, China
- Hongying Li
  - College of Computer Science, Sichuan University, Chengdu, Sichuan, China
- Li Dong
  - Department of Cardiovascular Surgery, West China Hospital, Sichuan University, Chengdu, Sichuan, China
- Qin Zhou
  - Department of Nutrition, The Second Affiliated Hospital of Chongqing Medical University, Chongqing, China
- Bo Fu
  - Department of Cardiovascular Surgery, Tianjin Central Hospital, Tianjin, China
- Jiang-Long Hou
  - Department of Cardiovascular Surgery, West China Hospital, Sichuan University, Chengdu, Sichuan, China
- Jing Wang
  - Department of Career Development Division, The Fourth Affiliated Hospital of Anhui Medical University, Hefei, Anhui, China
- Wenzhe Qin
  - Department of Social Medicine and Health Management, Shandong University, Jinan, Shandong, China
- Jin Chen
  - Department of Evidence-Based Medicine and Clinical Epidemiology, School of Medicine/West China Hospital, Sichuan University, No. 17, Section 3, Renmin South Road, Chengdu, 610041, Sichuan, China
9
Zhang Y, Xie C, Xue L, Tao Y, Yue G, Jiang B. A post-hoc interpretable ensemble model to feature effect analysis in warfarin dose prediction for Chinese patients. IEEE J Biomed Health Inform 2021; 26:840-851. [PMID: 34166206 DOI: 10.1109/jbhi.2021.3092170] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/07/2022]
Abstract
To interpret the importance of clinical features and genotypes for warfarin daily dose prediction, we developed a post-hoc interpretable framework based on an ensemble predictive model. This framework includes permutation importance for global interpretation, and local interpretable model-agnostic explanation (LIME) and Shapley additive explanations (SHAP) for local explanation. The permutation importance globally ranks the importance of features on the whole data set. This can guide us to build a predictive model with fewer variables, so that the complexity of the final predictive model can be reduced. LIME and SHAP together explain how the predictive model gives the predicted dosage for specific samples. This helps clinicians prescribe accurate doses to patients using more effective clinical variables. Results showed that both the permutation importance and SHAP demonstrated that VKORC1, age, serum creatinine (SCr), left atrium (LA) size, CYP2C9 and weight were the most important features on the whole data set. In specific samples, both SHAP and LIME discovered that in Chinese patients, wild-type VKORC1-AA, mutant-type CYP2C9*3, age over 60, abnormal LA size, SCr within the normal range, and using amiodarone definitely required dosage reduction, whereas mutant-type VKORC1-AG/GG, younger age, SCr out of normal range, normal LA size, diabetes and heavy weight required a dosage increase.
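Permutation importance, the global interpretation step named in this abstract, can be sketched in a few lines of Python. This is a generic illustration of the technique rather than the authors' ensemble framework: the toy model `toy_predict`, the two made-up features, and the use of mean absolute error as the score are all assumptions made for the example.

```python
import random


def permutation_importance(predict, rows, targets, n_features,
                           n_repeats=10, seed=0):
    """Mean increase in MAE when one feature column is shuffled."""
    rng = random.Random(seed)

    def mae(data):
        return sum(abs(predict(r) - t)
                   for r, t in zip(data, targets)) / len(targets)

    baseline = mae(rows)
    importances = []
    for j in range(n_features):
        increases = []
        for _ in range(n_repeats):
            # Shuffle column j only, breaking its link to the target.
            column = [r[j] for r in rows]
            rng.shuffle(column)
            shuffled = [r[:j] + [v] + r[j + 1:]
                        for r, v in zip(rows, column)]
            increases.append(mae(shuffled) - baseline)
        importances.append(sum(increases) / n_repeats)
    return importances


def toy_predict(row):
    """Hypothetical model: the dose depends on feature 0 only."""
    return 5.0 - 0.03 * row[0]


# Two made-up features per row; only the first one drives the prediction.
rows = [[30, 60], [45, 72], [60, 80], [75, 55], [50, 90]]
targets = [toy_predict(r) for r in rows]
scores = permutation_importance(toy_predict, rows, targets, n_features=2)
```

A feature the model ignores gets an importance of zero, while shuffling a feature the model relies on increases the error, which is what allows features to be ranked on the whole data set.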
10
Gu ZC, Huang SR, Dong L, Zhou Q, Wang J, Fu B, Chen J. An Adapted Neural-Fuzzy Inference System Model Using Preprocessed Balance Data to Improve the Predictive Accuracy of Warfarin Maintenance Dosing in Patients After Heart Valve Replacement. Cardiovasc Drugs Ther 2021; 36:879-889. [PMID: 33877502 DOI: 10.1007/s10557-021-07191-1] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Accepted: 04/14/2021] [Indexed: 02/05/2023]
Abstract
BACKGROUND Tailoring warfarin use poses a challenge for physicians and pharmacists due to its narrow therapeutic window and substantial inter-individual variability. This study aimed to create an adapted neural-fuzzy inference system (ANFIS) model using preprocessed balance data to improve the predictive accuracy of warfarin maintenance dosing in Chinese patients undergoing heart valve replacement (HVR). METHODS This retrospective study enrolled patients who underwent HVR between June 1, 2012, and June 1, 2016, from 35 centers in China. The primary outcomes were the mean difference between predicted warfarin dose by ANFIS models and actual dose and the models' predictive accuracy, including the ideal predicted percentage, the mean absolute error (MAE), and the mean squared error (MSE). The eligible cases were divided into training, internal validation, and external validation groups. We explored input variables by univariate analysis of a general linear model and created two ANFIS models using imbalanced and balanced training sets. We finally compared the primary outcomes between the imbalanced and balanced ANFIS models in both internal and external validation sets. Stratified analyses were conducted across warfarin doses (low, medium, and high doses). RESULTS A total of 15,108 patients were included and grouped as follows: 12,086 in the imbalanced training set; 2820 in the balanced training set; 1511 in the internal validation set; and 1511 in the external validation set. Eight variables were explored as predictors related to warfarin maintenance doses, and imbalanced and balanced ANFIS models with multi-fuzzy rules were developed. 
The results showed a low mean difference between predicted and actual doses (< 0.3 mg/d for each model) and an accurate prediction property in both the imbalanced model (ideal prediction percentage, 74.39-78.16%; MAE, 0.37 mg/daily; MSE, 0.39 mg/daily) and the balanced model (ideal prediction percentage, 73.46-75.31%; MAE, 0.42 mg/daily; MSE, 0.43 mg/daily). Compared to the imbalanced model, the balanced model had a significantly higher prediction accuracy in the low-dose (14.46% vs. 3.01%; P < 0.001) and the high-dose warfarin groups (34.71% vs. 23.14%; P = 0.047). The results from the external validation cohort confirmed this finding. CONCLUSIONS The ANFIS model can accurately predict the warfarin maintenance dose in patients after HVR. Through data preprocessing, the balanced model contributed to improved prediction ability in the low- and high-dose warfarin groups.
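The accuracy measures reported in this abstract (MAE, MSE, and the ideal prediction percentage) can be computed as in the sketch below. The abstract does not define "ideal prediction"; the sketch assumes the convention often used in warfarin dosing studies, a predicted dose within 20% of the actual dose, and the `tolerance` parameter and toy doses are illustrative only.

```python
def mean_absolute_error(actual, predicted):
    """Average absolute difference between actual and predicted doses."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def mean_squared_error(actual, predicted):
    """Average squared difference; note the result is in squared units."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


def ideal_prediction_percentage(actual, predicted, tolerance=0.20):
    """Share of predictions within +/- tolerance of the actual dose.

    The 20% tolerance is an assumed convention, not taken from the abstract.
    """
    hits = sum(1 for a, p in zip(actual, predicted)
               if abs(p - a) <= tolerance * a)
    return 100.0 * hits / len(actual)


# Toy daily doses in mg/day (made-up values).
actual = [2.0, 3.0, 4.0, 5.0]
predicted = [2.5, 3.0, 3.0, 5.5]

mae = mean_absolute_error(actual, predicted)
mse = mean_squared_error(actual, predicted)
ideal = ideal_prediction_percentage(actual, predicted)
```

Because the tolerance is relative to the actual dose, the same absolute error counts against a low-dose patient more readily than against a high-dose patient, which is one reason stratified results by dose group are reported separately.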
Affiliation(s)
- Zhi-Chun Gu
  - Department of Pharmacy, Renji Hospital, School of Medicine, Shanghai Jiao Tong University, Shanghai, China
- Shou-Rui Huang
  - Department of Evidence-Based Medicine and Clinical Epidemiology, West China Hospital, Sichuan University, Chengdu, China
- Li Dong
  - Department of Cardiovascular Surgery, West China Hospital, Sichuan University, Chengdu, China
- Qin Zhou
  - Department of Nutrition, The Second Affiliated Hospital of Chongqing Medical University, Chongqing, China
- Jing Wang
  - Department of Career Development Division, The Fourth Affiliated Hospital of Anhui Medical University, Hefei, China
- Bo Fu
  - Department of Cardiovascular Surgery, Tianjin Central Hospital, Tianjin, China
- Jin Chen
  - Department of Evidence-Based Medicine and Clinical Epidemiology, West China Hospital, Sichuan University, Chengdu, China
|
11
|
Feng Y, Wang X, Zhang J. A heterogeneous ensemble learning method for neuroblastoma survival prediction. IEEE J Biomed Health Inform 2021; 26:1472-1483. [PMID: 33848254 DOI: 10.1109/jbhi.2021.3073056] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.0] [Indexed: 11/10/2022]
Abstract
Neuroblastoma is a pediatric cancer with high morbidity and mortality. Accurate survival prediction of patients with neuroblastoma plays an important role in the formulation of treatment plans. In this study, we proposed a heterogeneous ensemble learning method to predict the survival of neuroblastoma patients and extract decision rules from the proposed method to assist doctors in making decisions. After data preprocessing, five heterogeneous base learners were developed, which consisted of decision tree, random forest, support vector machine based on genetic algorithm, extreme gradient boosting and light gradient boosting machine. Subsequently, a heterogeneous feature selection method was devised to obtain the optimal feature subset of each base learner, and the optimal feature subset of each base learner guided the construction of the base learners as a priori knowledge. Furthermore, an area under curve-based ensemble mechanism was proposed to integrate the five heterogeneous base learners. Finally, the proposed method was compared with mainstream machine learning methods from different indicators, and valuable information was extracted by using the partial dependency plot analysis method and rule-extracted method from the proposed method. Experimental results show that the proposed method achieves an accuracy of 91.64%, recall of 91.14%, and AUC of 91.35% and is significantly better than the mainstream machine learning methods. In addition, interpretable rules with accuracy higher than 0.900 and predicted responses are extracted from the proposed method. Our study can effectively improve the performance of the clinical decision support system to improve the survival of neuroblastoma patients.
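The abstract above does not spell out how the area under curve-based ensemble mechanism combines the base learners. One common reading, shown here purely as an assumption and not as the authors' method, is to weight each learner's predicted probability by its validation AUC. The function names, labels and scores below are hypothetical.

```python
def auc(labels, scores):
    """AUC as the probability that a positive outscores a negative (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def auc_weighted_ensemble(model_scores, labels):
    """Return normalised AUC weights and the weighted-average score per sample."""
    aucs = [auc(labels, scores) for scores in model_scores]
    total = sum(aucs)
    weights = [a / total for a in aucs]
    combined = [
        sum(w * scores[i] for w, scores in zip(weights, model_scores))
        for i in range(len(labels))
    ]
    return weights, combined


# Hypothetical validation labels and predicted probabilities (not study data).
labels = [1, 0, 1, 0]
model_a = [0.9, 0.2, 0.8, 0.4]  # ranks every positive above every negative
model_b = [0.6, 0.7, 0.8, 0.1]  # misranks one positive/negative pair
weights, combined = auc_weighted_ensemble([model_a, model_b], labels)
```

In this toy case the better-ranking learner receives the larger weight. In practice the weights would be estimated on data held out from the final evaluation.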
|
12
|
Tao Y, Jiang B, Xue L, Xie C, Zhang Y. Evolutionary synthetic oversampling technique and cocktail ensemble model for warfarin dose prediction with imbalanced data. Neural Comput Appl 2021. [DOI: 10.1007/s00521-020-05568-1] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Indexed: 11/30/2022]
|
13
|
Tao Y, Zhang Y, Jiang B. DBCSMOTE: a clustering-based oversampling technique for data-imbalanced warfarin dose prediction. BMC Med Genomics 2020; 13:152. [PMID: 33087117 PMCID: PMC7579987 DOI: 10.1186/s12920-020-00781-2] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.0] [Indexed: 11/24/2022] Open
Abstract
BACKGROUND Warfarin, a vitamin K antagonist, is the most classical and widely used oral anticoagulant, with a reliable anticoagulant effect, wide clinical indications and low price. Warfarin dosage requirements vary widely between patients. In warfarin daily dosage prediction, imbalance in the dataset leads to inaccurate predictions for patients with rare genotypes, who usually have a large stable dosage requirement. To balance the dataset of patients treated with warfarin and improve the predictive accuracy, an appropriate partition of majority and minority groups, together with an oversampling method, is required. METHOD To solve this data-imbalance problem, we developed a clustering-based oversampling technique denoted DBCSMOTE, which combines density-based spatial clustering of applications with noise (DBSCAN) and the synthetic minority oversampling technique (SMOTE). DBCSMOTE automatically finds the minority groups by learning the association between samples in terms of the clinical features/genotypes and the warfarin dosage, and creates an extended dataset by adding new synthetic samples of the majority and minority groups. Two ensemble models, boosted regression tree (BRT) and random forest (RF), built on the extended dataset generated by DBCSMOTE, then predict the warfarin daily dosage. RESULTS DBCSMOTE and the comparison methods were tested on datasets derived from our hospital and the International Warfarin Pharmacogenetics Consortium (IWPC). DBCSMOTE-BRT obtained the highest R-squared (R2) of 0.424 and the smallest mean squared error (MSE) of 1.08. In terms of the percentage of patients whose predicted dose of warfarin is within 20% of the actual stable therapeutic dose (20%-p), DBCSMOTE-BRT achieved the largest value among the predictive models, 47.8%. 
More importantly, DBCSMOTE took about 68% less computational time to achieve the same or better performance than Evolutionary SMOTE, which had been the best oversampling method for warfarin dose prediction so far. DBCSMOTE was also found to be more effective when combined with BRT than with RF. CONCLUSION The genotypes CYP2C9 and VKORC1 clearly contribute to the predictive accuracy. Including left atrium diameter, glutamic pyruvic transaminase and serum creatinine in the model also improved the predictive accuracy; when congestive heart failure, diabetes mellitus and valve replacement were absent from DBCSMOTE-BRT/RF, its predictive accuracy decreased. The oversampling ratio and the number of minority clusters have a large impact on the effect of oversampling. In our tests, the predictive accuracy was high when the number of minority clusters was 6 to 8. The oversampling ratio should be large (> 1.2) for small minority clusters and small (< 0.2) for large minority clusters. If the dataset becomes larger, DBCSMOTE should be re-optimized and its BRT/RF model re-trained. DBCSMOTE-BRT/RF outperformed the commonly used tool Warfarindosing. Compared with the Evolutionary SMOTE-BRT and RF models, the DBCSMOTE-BRT and RF models need only a small amount of computational time to achieve the same or higher performance in many cases. In terms of predictive accuracy, RF is not as good as BRT. However, RF can still generate a highly accurate model as the dataset grows. The software "WarfarinSeer v2.0", a test version that packages DBCSMOTE-BRT/RF, could be a convenient tool for clinical application in warfarin treatment.
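The three evaluation measures reported in this abstract (20%-p, MSE and R-squared) can be sketched as follows. This is a minimal illustration using the standard definitions, which the abstract is assumed to follow. The function names and dose values are hypothetical and are not taken from the study.

```python
from statistics import mean


def within_20_percent(actual, predicted, tolerance=0.20):
    """Share of patients whose predicted dose lies within `tolerance` of the actual dose."""
    hits = sum(abs(p - a) <= tolerance * a for a, p in zip(actual, predicted))
    return hits / len(actual)


def mean_squared_error(actual, predicted):
    """Average squared difference between predicted and actual doses."""
    return mean((p - a) ** 2 for a, p in zip(actual, predicted))


def r_squared(actual, predicted):
    """Coefficient of determination: 1 - residual sum of squares / total sum of squares."""
    avg = mean(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - avg) ** 2 for a in actual)
    return 1 - ss_res / ss_tot


# Hypothetical daily doses in mg, for illustration only (not study data).
actual_doses = [2.0, 3.0, 4.0, 5.0]
predicted_doses = [2.5, 3.0, 3.5, 7.0]

share_within_20 = within_20_percent(actual_doses, predicted_doses)
mse = mean_squared_error(actual_doses, predicted_doses)
r2 = r_squared(actual_doses, predicted_doses)
```

Because the tolerance scales with the actual dose, a 0.5 mg error counts as a miss for a 2 mg patient but as a hit for a 4 mg patient.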
Affiliation(s)
- Yanyun Tao
  - Intelligent Transportation and Cognitive Computing Laboratory, Soochow University, Shizi Street 1, Suzhou, 215005, China
- Yuzhen Zhang
  - Cardiovascular Department, The First Affiliated Hospital of Soochow University, Shizi Street 100, Suzhou, 215005, China
- Bin Jiang
  - Cardiovascular Department, The First Affiliated Hospital of Soochow University, Shizi Street 100, Suzhou, 215005, China
|
14
|
The Prediction Model of Warfarin Individual Maintenance Dose for Patients Undergoing Heart Valve Replacement, Based on the Back Propagation Neural Network. Clin Drug Investig 2019; 40:41-53. [DOI: 10.1007/s40261-019-00850-0] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.0] [Indexed: 01/16/2023]
|