1
Krix S, DeLong LN, Madan S, Domingo-Fernández D, Ahmad A, Gul S, Zaliani A, Fröhlich H. MultiGML: Multimodal graph machine learning for prediction of adverse drug events. Heliyon 2023; 9:e19441. [PMID: 37681175 PMCID: PMC10481305 DOI: 10.1016/j.heliyon.2023.e19441] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 12/15/2022] [Revised: 08/22/2023] [Accepted: 08/23/2023] [Indexed: 09/09/2023] Open
Abstract
Adverse drug events constitute a major challenge for the success of clinical trials. Several computational strategies have been suggested to estimate the risk of adverse drug events in preclinical drug development. While these approaches have demonstrated high utility in practice, they are at the same time limited to specific information sources. Thus, many current computational approaches neglect a wealth of information which results from the integration of different data sources, such as biological protein function, gene expression, chemical compound structure, cell-based imaging and others. In this work we propose an integrative and explainable multi-modal Graph Machine Learning approach (MultiGML), which fuses knowledge graphs with multiple further data modalities to predict drug related adverse events and general drug target-phenotype associations. MultiGML demonstrates excellent prediction performance compared to alternative algorithms, including various traditional knowledge graph embedding techniques. MultiGML distinguishes itself from alternative techniques by providing in-depth explanations of model predictions, which point towards biological mechanisms associated with predictions of an adverse drug event. Hence, MultiGML could be a versatile tool to support decision making in preclinical drug development.
Affiliation(s)
- Sophia Krix
- Department of Bioinformatics, Fraunhofer Institute for Algorithms and Scientific Computing (SCAI), Schloss Birlinghoven, 53757, Sankt Augustin, Germany
- Bonn-Aachen International Center for Information Technology (B-IT), University of Bonn, 53115, Bonn, Germany
- Fraunhofer Center for Machine Learning, Germany
- Lauren Nicole DeLong
- Department of Bioinformatics, Fraunhofer Institute for Algorithms and Scientific Computing (SCAI), Schloss Birlinghoven, 53757, Sankt Augustin, Germany
- Artificial Intelligence and its Applications Institute, School of Informatics, University of Edinburgh, 10 Crichton Street, EH8 9AB, UK
- Sumit Madan
- Department of Bioinformatics, Fraunhofer Institute for Algorithms and Scientific Computing (SCAI), Schloss Birlinghoven, 53757, Sankt Augustin, Germany
- Department of Computer Science, University of Bonn, 53115, Bonn, Germany
- Daniel Domingo-Fernández
- Department of Bioinformatics, Fraunhofer Institute for Algorithms and Scientific Computing (SCAI), Schloss Birlinghoven, 53757, Sankt Augustin, Germany
- Fraunhofer Center for Machine Learning, Germany
- Enveda Biosciences, Boulder, CO, 80301, USA
- Ashar Ahmad
- Bonn-Aachen International Center for Information Technology (B-IT), University of Bonn, 53115, Bonn, Germany
- Grunenthal GmbH, 52099, Aachen, Germany
- Sheraz Gul
- Fraunhofer Institute for Translational Medicine and Pharmacology ITMP, Schnackenburgallee 114, 22525, Hamburg, Germany
- Fraunhofer Cluster of Excellence for Immune-Mediated Diseases CIMD, Schnackenburgallee 114, 22525, Hamburg, Germany
- Andrea Zaliani
- Fraunhofer Institute for Translational Medicine and Pharmacology ITMP, Schnackenburgallee 114, 22525, Hamburg, Germany
- Fraunhofer Cluster of Excellence for Immune-Mediated Diseases CIMD, Schnackenburgallee 114, 22525, Hamburg, Germany
- Holger Fröhlich
- Department of Bioinformatics, Fraunhofer Institute for Algorithms and Scientific Computing (SCAI), Schloss Birlinghoven, 53757, Sankt Augustin, Germany
- Bonn-Aachen International Center for Information Technology (B-IT), University of Bonn, 53115, Bonn, Germany
2
Mazuz E, Shtar G, Kutsky N, Rokach L, Shapira B. Pretrained transformer models for predicting the withdrawal of drugs from the market. Bioinformatics 2023; 39:btad519. [PMID: 37610328 PMCID: PMC10469107 DOI: 10.1093/bioinformatics/btad519] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/20/2023] [Revised: 07/24/2023] [Accepted: 08/22/2023] [Indexed: 08/24/2023] Open
Abstract
MOTIVATION The process of drug discovery is notoriously complex, costing an average of 2.6 billion dollars and taking ∼13 years to bring a new drug to the market. The success rate for new drugs is alarmingly low (around 0.0001%), and severe adverse drug reactions (ADRs) frequently occur, some of which may even result in death. Early identification of potential ADRs is critical to improve the efficiency and safety of the drug development process. RESULTS In this study, we employed pretrained large language models (LLMs) to predict the likelihood of a drug being withdrawn from the market due to safety concerns. Our method achieved an area under the curve (AUC) of over 0.75 through cross-database validation, outperforming classical machine learning models and graph-based models. Notably, our pretrained LLMs successfully identified over 50% of the drugs that were subsequently withdrawn, when predictions were made on a subset of drugs with inconsistent labeling between the training and test sets. AVAILABILITY AND IMPLEMENTATION The code and datasets are available at https://github.com/eyalmazuz/DrugWithdrawn.
Affiliation(s)
- Eyal Mazuz
- Department of Software and Information Systems Engineering, Ben-Gurion University of the Negev, P.O.B. 653, Beer-Sheva, 8410501, Israel
- Guy Shtar
- Department of Software and Information Systems Engineering, Ben-Gurion University of the Negev, P.O.B. 653, Beer-Sheva, 8410501, Israel
- Nir Kutsky
- Department of Software and Information Systems Engineering, Ben-Gurion University of the Negev, P.O.B. 653, Beer-Sheva, 8410501, Israel
- Lior Rokach
- Department of Software and Information Systems Engineering, Ben-Gurion University of the Negev, P.O.B. 653, Beer-Sheva, 8410501, Israel
- Bracha Shapira
- Department of Software and Information Systems Engineering, Ben-Gurion University of the Negev, P.O.B. 653, Beer-Sheva, 8410501, Israel
3
Zhao Y, Yu Y, Wang H, Li Y, Deng Y, Jiang G, Luo Y. Machine Learning in Causal Inference: Application in Pharmacovigilance. Drug Saf 2022; 45:459-476. [PMID: 35579811 PMCID: PMC9114053 DOI: 10.1007/s40264-022-01155-6] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Accepted: 02/09/2022] [Indexed: 01/28/2023]
Abstract
Monitoring adverse drug events or pharmacovigilance has been promoted by the World Health Organization to assure the safety of medicines through a timely and reliable information exchange regarding drug safety issues. We aim to discuss the application of machine learning methods as well as causal inference paradigms in pharmacovigilance. We first reviewed data sources for pharmacovigilance. Then, we examined traditional causal inference paradigms, their applications in pharmacovigilance, and how machine learning methods and causal inference paradigms were integrated to enhance the performance of traditional causal inference paradigms. Finally, we summarized issues with currently mainstream correlation-based machine learning models and how the machine learning community has tried to address these issues by incorporating causal inference paradigms. Our literature search revealed that most existing data sources and tasks for pharmacovigilance were not designed for causal inference. Additionally, pharmacovigilance was lagging in adopting machine learning-causal inference integrated models. We highlight several currently trending directions or gaps to integrate causal inference with machine learning in pharmacovigilance research. Finally, our literature search revealed that the adoption of causal paradigms can mitigate known issues with machine learning models. We foresee that the pharmacovigilance domain can benefit from the progress in the machine learning field.
Affiliation(s)
- Yiqing Zhao
- Department of Preventive Medicine, Feinberg School of Medicine, Northwestern University, 750 N Lake Shore Drive, Room 11-189, Chicago, IL, 60611, USA
- Yue Yu
- Department of Artificial Intelligence and Informatics, Mayo Clinic, Rochester, MN, 55902, USA
- Hanyin Wang
- Department of Preventive Medicine, Feinberg School of Medicine, Northwestern University, 750 N Lake Shore Drive, Room 11-189, Chicago, IL, 60611, USA
- Yikuan Li
- Department of Preventive Medicine, Feinberg School of Medicine, Northwestern University, 750 N Lake Shore Drive, Room 11-189, Chicago, IL, 60611, USA
- Yu Deng
- Department of Preventive Medicine, Feinberg School of Medicine, Northwestern University, 750 N Lake Shore Drive, Room 11-189, Chicago, IL, 60611, USA
- Guoqian Jiang
- Department of Artificial Intelligence and Informatics, Mayo Clinic, Rochester, MN, 55902, USA
- Yuan Luo
- Department of Preventive Medicine, Feinberg School of Medicine, Northwestern University, 750 N Lake Shore Drive, Room 11-189, Chicago, IL, 60611, USA.
4
Zhou L, Tang Y, Yan G. A New Estimation Method for the Biological Interaction Predicting Problems. IEEE/ACM Trans Comput Biol Bioinform 2022; 19:1415-1423. [PMID: 33406043 DOI: 10.1109/tcbb.2021.3049642] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/12/2023]
Abstract
For the past decades, computational methods have been developed to predict various interactions in biological problems. These methods usually treated prediction as a semi-supervised or positive-unlabeled (PU) learning problem. Researchers focused on the prediction of unlabeled samples and hoped to find novel interactions in the datasets they collected. However, most of the computational methods could only predict a small proportion of undiscovered interactions and the total number was unknown. In this paper, we developed an estimation method with deep learning to calculate the number of undiscovered interactions in the unlabeled samples, derived its asymptotic interval estimation, and applied it successfully to a compound synergism dataset, a drug-target interaction (DTI) dataset and a microRNA-disease interaction dataset. Moreover, this method could reveal which dataset contained more undiscovered interactions and thus guide experimental validation. Furthermore, we compared our method with some mixture proportion estimators and demonstrated the efficacy of our method. Finally, we proved that AUC and AUPR were related to the number of undiscovered interactions, which was regarded as another evaluation indicator for the computational methods.
5
Deng S, Sun Y, Zhao T, Hu Y, Zang T. A Review of Drug Side Effect Identification Methods. Curr Pharm Des 2020; 26:3096-3104. [PMID: 32532187 DOI: 10.2174/1381612826666200612163819] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 02/05/2020] [Accepted: 05/18/2020] [Indexed: 11/22/2022]
Abstract
Drug side effects have become an important indicator for evaluating the safety of drugs. There are two main factors in the frequent occurrence of drug safety problems; on the one hand, the clinical understanding of drug side effects is insufficient, leading to frequent adverse drug reactions, while on the other hand, due to the long-term period and complexity of clinical trials, side effects of approved drugs on the market cannot be reported in a timely manner. Therefore, many researchers have focused on developing methods to identify drug side effects. In this review, we summarize the methods of identifying drug side effects and common databases in this field. We classified methods of identifying side effects into four categories: biological experimental, machine learning, text mining and network methods. We point out the key points of each kind of method. In addition, we also explain the advantages and disadvantages of each method. Finally, we propose future research directions.
Affiliation(s)
- Shuai Deng
- College of Science, Beijing Forestry University, Beijing, China
- Yige Sun
- Microbiology Department, Harbin Medical University, Harbin, 150081, China
- Tianyi Zhao
- School of Life Science and Technology, Department of Computer Science and Technology, Harbin Institute of Technology, Harbin 150001, China
- Yang Hu
- School of Life Science and Technology, Department of Computer Science and Technology, Harbin Institute of Technology, Harbin 150001, China
- Tianyi Zang
- School of Life Science and Technology, Department of Computer Science and Technology, Harbin Institute of Technology, Harbin 150001, China
6
Li F, Yu H. An investigation of single-domain and multidomain medication and adverse drug event relation extraction from electronic health record notes using advanced deep learning models. J Am Med Inform Assoc 2019; 26:646-654. [PMID: 30938761 PMCID: PMC6562161 DOI: 10.1093/jamia/ocz018] [Citation(s) in RCA: 16] [Impact Index Per Article: 3.2] [Received: 07/07/2018] [Revised: 01/15/2019] [Accepted: 01/31/2019] [Indexed: 11/13/2022] Open
Abstract
OBJECTIVE We aim to evaluate the effectiveness of advanced deep learning models (eg, capsule network [CapNet], adversarial training [ADV]) for single-domain and multidomain relation extraction from electronic health record (EHR) notes. MATERIALS AND METHODS We built multiple deep learning models with increased complexity, namely a multilayer perceptron (MLP) model and a CapNet model for single-domain relation extraction and fully shared (FS), shared-private (SP), and adversarial training (ADV) modes for multidomain relation extraction. Our models were evaluated in 2 ways: first, we compared our models using our expert-annotated cancer (the MADE1.0 corpus) and cardio corpora; second, we compared our models with the systems in the MADE1.0 and i2b2 challenges. RESULTS Multidomain models outperform single-domain models by 0.7%-1.4% in F1 (t test P < .05), but the results of FS, SP, and ADV modes are mixed. Our results show that the MLP model generally outperforms the CapNet model by 0.1%-1.0% in F1. In the comparisons with other systems, the CapNet model achieves the state-of-the-art result (87.2% in F1) in the cancer corpus and the MLP model generally outperforms MedEx in the cancer, cardiovascular diseases, and i2b2 corpora. CONCLUSIONS Our MLP or CapNet model generally outperforms other state-of-the-art systems in medication and adverse drug event relation extraction. Multidomain models perform better than single-domain models. However, neither the SP nor the ADV mode can always outperform the FS mode significantly. Moreover, the CapNet model is not superior to the MLP model for our corpora.
Affiliation(s)
- Fei Li
- Department of Computer Science, University of Massachusetts Lowell, Lowell, Massachusetts, USA
- Center for Healthcare Organization and Implementation Research, Bedford Veterans Affairs Medical Center, Bedford, Massachusetts, USA
- Hong Yu
- Department of Computer Science, University of Massachusetts Lowell, Lowell, Massachusetts, USA
- Center for Healthcare Organization and Implementation Research, Bedford Veterans Affairs Medical Center, Bedford, Massachusetts, USA
7
Dey S, Luo H, Fokoue A, Hu J, Zhang P. Predicting adverse drug reactions through interpretable deep learning framework. BMC Bioinformatics 2018; 19:476. [PMID: 30591036 PMCID: PMC6300887 DOI: 10.1186/s12859-018-2544-0] [Citation(s) in RCA: 57] [Impact Index Per Article: 9.5] [Indexed: 01/07/2023] Open
Abstract
BACKGROUND Adverse drug reactions (ADRs) are unintended and harmful reactions caused by normal uses of drugs. Predicting and preventing ADRs in the early stage of the drug development pipeline can help to enhance drug safety and reduce financial costs. METHODS In this paper, we developed machine learning models including a deep learning framework which can simultaneously predict ADRs and identify the molecular substructures associated with those ADRs without defining the substructures a priori. RESULTS We compared the performance of our model with ten different state-of-the-art fingerprint models and found that neural fingerprints from the deep learning model outperformed all other methods in predicting ADRs. Via feature analysis on drug structures, we identified important molecular substructures that are associated with specific ADRs and assessed their associations via statistical analysis. CONCLUSIONS The deep learning model with feature analysis, substructure identification, and statistical assessment provides a promising solution for identifying risky components within molecular structures and can potentially help to improve drug safety evaluation.
Affiliation(s)
- Sanjoy Dey
- Center for Computational Health, IBM T.J. Watson Research Center, 1101 Kitchawan Road, Yorktown Heights, NY USA
- Heng Luo
- Center for Computational Health, IBM T.J. Watson Research Center, 1101 Kitchawan Road, Yorktown Heights, NY USA
- Achille Fokoue
- Cognitive Computing, IBM T.J. Watson Research Center, 1101 Kitchawan Road, Yorktown Heights, NY USA
- Jianying Hu
- Center for Computational Health, IBM T.J. Watson Research Center, 1101 Kitchawan Road, Yorktown Heights, NY USA
- Ping Zhang
- Center for Computational Health, IBM T.J. Watson Research Center, 1101 Kitchawan Road, Yorktown Heights, NY USA
8
Li F, Liu W, Yu H. Extraction of Information Related to Adverse Drug Events from Electronic Health Record Notes: Design of an End-to-End Model Based on Deep Learning. JMIR Med Inform 2018; 6:e12159. [PMID: 30478023 PMCID: PMC6288593 DOI: 10.2196/12159] [Citation(s) in RCA: 34] [Impact Index Per Article: 5.7] [Received: 09/12/2018] [Revised: 10/31/2018] [Accepted: 11/09/2018] [Indexed: 12/26/2022] Open
Abstract
BACKGROUND Pharmacovigilance and drug-safety surveillance are crucial for monitoring adverse drug events (ADEs), but the main ADE-reporting systems such as the Food and Drug Administration Adverse Event Reporting System face challenges such as underreporting. Therefore, as complementary surveillance, data on ADEs are extracted from electronic health record (EHR) notes via natural language processing (NLP). As NLP develops, many up-to-date machine-learning techniques are introduced in this field, such as deep learning and multi-task learning (MTL). However, only a few studies have focused on employing such techniques to extract ADEs. OBJECTIVE We aimed to design a deep learning model for extracting ADEs and related information such as medications and indications. Since extraction of ADE-related information includes two steps (named entity recognition and relation extraction), our second objective was to improve the deep learning model using multi-task learning between the two steps. METHODS We employed the dataset from the Medication, Indication and Adverse Drug Events (MADE) 1.0 challenge to train and test our models. This dataset consists of 1089 EHR notes of cancer patients and includes 9 entity types such as Medication, Indication, and ADE and 7 types of relations between these entities. To extract information from the dataset, we proposed a deep-learning model that uses a bidirectional long short-term memory (BiLSTM) conditional random field network to recognize entities and a BiLSTM-Attention network to extract relations. To further improve the deep-learning model, we employed three typical MTL methods, namely, hard parameter sharing, parameter regularization, and task relation learning, to build three MTL models, called HardMTL, RegMTL, and LearnMTL, respectively. RESULTS Since extraction of ADE-related information is a two-step task, the result of the second step (ie, relation extraction) was used to compare all models. We used microaveraged precision, recall, and F1 as evaluation metrics. Our deep learning model achieved state-of-the-art results (F1=65.9%), which is significantly higher than that (F1=61.7%) of the best system in the MADE1.0 challenge. HardMTL further improved the F1 by 0.8%, boosting the F1 to 66.7%, whereas RegMTL and LearnMTL failed to boost the performance. CONCLUSIONS Deep learning models can significantly improve the performance of ADE-related information extraction. MTL may be effective for named entity recognition and relation extraction, but it depends on the methods, data, and other factors. Our results can facilitate research on ADE detection, NLP, and machine learning.
Affiliation(s)
- Fei Li
- Department of Computer Science, University of Massachusetts Lowell, Lowell, MA, United States
- Center for Healthcare Organization and Implementation Research, Bedford Veterans Affairs Medical Center, Bedford, MA, United States
- Department of Medicine, University of Massachusetts Medical School, Worcester, MA, United States
- Weisong Liu
- Department of Computer Science, University of Massachusetts Lowell, Lowell, MA, United States
- Center for Healthcare Organization and Implementation Research, Bedford Veterans Affairs Medical Center, Bedford, MA, United States
- Department of Medicine, University of Massachusetts Medical School, Worcester, MA, United States
- Hong Yu
- Department of Computer Science, University of Massachusetts Lowell, Lowell, MA, United States
- Center for Healthcare Organization and Implementation Research, Bedford Veterans Affairs Medical Center, Bedford, MA, United States
- Department of Medicine, University of Massachusetts Medical School, Worcester, MA, United States
- School of Computer Science, University of Massachusetts, Amherst, MA, United States
9
Zhang W, Liu X, Chen Y, Wu W, Wang W, Li X. Feature-derived graph regularized matrix factorization for predicting drug side effects. Neurocomputing 2018. [DOI: 10.1016/j.neucom.2018.01.085] [Citation(s) in RCA: 16] [Impact Index Per Article: 2.7] [Indexed: 10/18/2022]
10
Meng HY, Luo ZH, Hu B, Jin WL, Yan CK, Li ZB, Xue YY, Liu Y, Luo YE, Xu LQ, Yang H. SNPs affecting the clinical outcomes of regularly used immunosuppressants. Pharmacogenomics 2018. [PMID: 29517418 DOI: 10.2217/pgs-2017-0182] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.8] [Indexed: 12/12/2022] Open
Abstract
Recent studies have suggested that genomic diversity may play a key role in different clinical outcomes, and the importance of SNPs is becoming increasingly clear. In this article, we summarize the bioactivity of SNPs that may affect the sensitivity to or possibility of drug reactions that occur among the signaling pathways of regularly used immunosuppressants, such as glucocorticoids, azathioprine, tacrolimus, mycophenolate mofetil, cyclophosphamide and methotrexate. The development of bioinformatics, including machine learning models, has enabled prediction of the proper immunosuppressant dosage with minimal adverse drug reactions for patients after organ transplantation or for those with autoimmune diseases. This article provides a theoretical basis for the personalized use of immunosuppressants in the future.
Affiliation(s)
- Huan-Yu Meng
- Department of Neurology, Xiangya Hospital of Central South University, Changsha, PR China
- Zhao-Hui Luo
- Department of Neurology, Xiangya Hospital of Central South University, Changsha, PR China
- Bo Hu
- Department of Neurology, Xiangya Hospital of Central South University, Changsha, PR China
- Wan-Lin Jin
- Department of Neurology, Xiangya Hospital of Central South University, Changsha, PR China
- Cheng-Kai Yan
- Department of Neurology, Xiangya Hospital of Central South University, Changsha, PR China
- Zhi-Bin Li
- Department of Neurology, Xiangya Hospital of Central South University, Changsha, PR China
- Yuan-Yuan Xue
- Department of Neurology, Xiangya Hospital of Central South University, Changsha, PR China
- Yu Liu
- Department of Neurology, Xiangya Hospital of Central South University, Changsha, PR China
- Yi-En Luo
- Department of Neurology, Xiangya Hospital of Central South University, Changsha, PR China
- Li-Qun Xu
- Department of Neurology, Xiangya Hospital of Central South University, Changsha, PR China
- Huan Yang
- Department of Neurology, Xiangya Hospital of Central South University, Changsha, PR China
11
Abstract
BACKGROUND Acute kidney injury (AKI), characterized by abrupt deterioration of renal function, is a common clinical event among hospitalized patients and is associated with high morbidity and mortality. AKI is defined in three stages, with Stage-3 being the most severe phase, which is irreversible. It is important to discover the true risk factors in order to identify high-risk AKI patients and allow better targeting of tailored interventions. However, Stage-3 AKI patients are very rare (only 0.2% of AKI patients), while a large number of features are available in the EHR (1917 potential risk features), a scenario that is infeasible for any correlation-based feature selection or modeling method. This study aims to discover the key factors and improve the detection of Stage-3 AKI. METHODS A causal discovery method (McDSL) is adopted to infer true causal relationships between information buried in the EHR (such as medications, diagnoses, laboratory tests, and comorbidities) and Stage-3 AKI risk. The research approach comprised two major phases: data collection and causal discovery. In the first phase, data are collected from the EHR (358 encounters and 891 risk factors). In the second phase, McDSL is employed to discover the causal risk factors of Stage-3 AKI, and five well-known machine learning models are built for predicting Stage-3 AKI with 10-fold cross-validation (predictive accuracy was measured by AUC, precision, recall and F-score). RESULTS McDSL is useful for further research on EHR data. It discovers four causal features, all of which are modifiable medications. Recent machine learning models are used to compare prediction performance, and the experimental results verify that the selected features are pivotal. CONCLUSIONS The features selected by McDSL enable us to achieve significant dimension reduction without sacrificing prediction accuracy, suggesting potential clinical use such as helping physicians develop better prevention and treatment strategies.
12
Hodos RA, Kidd BA, Khader S, Readhead BP, Dudley JT. In silico methods for drug repurposing and pharmacology. Wiley Interdiscip Rev Syst Biol Med 2016; 8:186-210. [PMID: 27080087 PMCID: PMC4845762 DOI: 10.1002/wsbm.1337] [Citation(s) in RCA: 181] [Impact Index Per Article: 22.6] [Received: 11/30/2015] [Revised: 02/08/2016] [Accepted: 02/11/2016] [Indexed: 12/18/2022]
Abstract
Data in the biological, chemical, and clinical domains are accumulating at ever-increasing rates and have the potential to accelerate and inform drug development in new ways. Challenges and opportunities now lie in developing analytic tools to transform these often complex and heterogeneous data into testable hypotheses and actionable insights. This is the aim of computational pharmacology, which uses in silico techniques to better understand and predict how drugs affect biological systems, which can in turn improve clinical use, avoid unwanted side effects, and guide selection and development of better treatments. One exciting application of computational pharmacology is drug repurposing: finding new uses for existing drugs. Already yielding many promising candidates, this strategy has the potential to improve the efficiency of the drug development process and reach patient populations with previously unmet needs such as those with rare diseases. While current techniques in computational pharmacology and drug repurposing often focus on just a single data modality such as gene expression or drug-target interactions, we argue that methods such as matrix factorization that can integrate data within and across diverse data types have the potential to improve predictive performance and provide a fuller picture of a drug's pharmacological action.
Affiliation(s)
- Rachel A Hodos
- New York University and Icahn School of Medicine at Mt. Sinai, New York, NY
- Brian A Kidd
- Icahn School of Medicine at Mt. Sinai, New York, NY
13
14
Zhang W, Liu F, Luo L, Zhang J. Predicting drug side effects by multi-label learning and ensemble learning. BMC Bioinformatics 2015; 16:365. [PMID: 26537615 PMCID: PMC4634905 DOI: 10.1186/s12859-015-0774-y] [Citation(s) in RCA: 100] [Impact Index Per Article: 11.1] [Received: 05/05/2015] [Accepted: 10/14/2015] [Indexed: 01/10/2023] Open
Abstract
BACKGROUND Predicting drug side effects is an important topic in drug discovery. Although several machine learning methods have been proposed to predict side effects, there is still room for improvement. Firstly, side effect prediction is a multi-label learning task, and we can adopt multi-label learning techniques for it. Secondly, drug-related features are associated with side effects, and feature dimensions have specific biological meanings. Recognizing critical dimensions and reducing irrelevant dimensions may help to reveal the causes of side effects. METHODS In this paper, we propose a novel method, the 'feature selection-based multi-label k-nearest neighbor method' (FS-MLKNN), which can simultaneously determine critical feature dimensions and construct high-accuracy multi-label prediction models. RESULTS Computational experiments demonstrate that FS-MLKNN leads to good performance as well as explainable results. To achieve better performance, we further develop an ensemble learning model by integrating individual feature-based FS-MLKNN models. When compared with other state-of-the-art methods, the ensemble method produces better performance on benchmark datasets. CONCLUSIONS In conclusion, FS-MLKNN and the ensemble method are promising tools for side effect prediction. The source code and datasets are available in Additional file 1.
Affiliation(s)
- Wen Zhang
- School of Computer, Wuhan University, Wuhan, 430072, China; Research Institute of Shenzhen, Wuhan University, Shenzhen, 518057, China.
- Feng Liu
- International School of Software, Wuhan University, Wuhan, 430072, China.
- Longqiang Luo
- School of Mathematics and Statistics, Wuhan University, Wuhan, 430072, China.
- Jingxia Zhang
- School of Mathematics and Statistics, Wuhan University, Wuhan, 430072, China.
15
In silico assessment of adverse drug reactions and associated mechanisms. Drug Discov Today 2015; 21:58-71. [PMID: 26272036 DOI: 10.1016/j.drudis.2015.07.018] [Citation(s) in RCA: 39] [Impact Index Per Article: 4.3] [Received: 04/13/2015] [Revised: 07/15/2015] [Accepted: 07/31/2015] [Indexed: 12/31/2022]
Abstract
During recent years, various in silico approaches have been developed to estimate chemical and biological drug features, for example chemical fragments, protein targets, pathways, among others, that correlate with adverse drug reactions (ADRs) and explain the associated mechanisms. These features have also been used for the creation of predictive models that enable estimation of ADRs during the early stages of drug development. In this review, we discuss various in silico approaches to predict these features for a certain drug, estimate correlations with ADRs, establish causal relationships between selected features and ADR mechanisms and create corresponding predictive models.
16
Zheng H, Wang H, Xu H, Wu Y, Zhao Z, Azuaje F. Linking Biochemical Pathways and Networks to Adverse Drug Reactions. IEEE Trans Nanobioscience 2014; 13:131-7. [DOI: 10.1109/tnb.2014.2319158] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.1] [Indexed: 11/07/2022]