1. McCorkle ML, Kisor DF, Freiermuth CE, Sprague JE. Systematic review of Pharmacogenomics Knowledgebase evidence for pharmacogenomic links to the dopamine reward pathway for heroin dependence. Pharmacogenomics 2021; 22:849-857. [PMID: 34424051 DOI: 10.2217/pgs-2021-0023] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Indexed: 11/21/2022] Open
Abstract
Genetics play an important role in opioid use disorder (OUD); however, few specific gene variants have been identified. Therefore, there is a need to further understand pharmacogenomic influences on the pharmacodynamics of opioids. The Pharmacogenomics Knowledgebase (PharmGKB), a database that links genetic variation and drug interaction in the body, was queried to identify polymorphisms associated with heroin dependence in the context of opioid-related disorders/OUD. Eight genes with 22 variants were identified as linked to increased risk of heroin dependence, with three genes and variants linked to decreased risk, although the level of evidence was moderate to low. Therefore, continued exploration of biomarker influences on OUD, reward pathways and other contributing circuitries is necessary to understand the true impact of genetics on OUD before integration into clinical guidelines.
Affiliation(s)
- David F Kisor
- Department of Pharmaceutical Sciences & Pharmacogenomics, College of Pharmacy, Natural & Health Sciences, Manchester University, Fort Wayne, IN 46845, USA
- Caroline E Freiermuth
- Department of Emergency Medicine, University of Cincinnati, Cincinnati, OH 45267, USA; Center for Addiction Research, University of Cincinnati College of Medicine, Cincinnati, OH 45267, USA
- Jon E Sprague
- The Ohio Attorney General's Office, Columbus, OH 43215, USA; The Ohio Attorney General's Center for the Future of Forensic Science, Bowling Green State University, Bowling Green, OH 43403, USA
2. Miftahutdinov Z, Kadurin A, Kudrin R, Tutubalina E. Medical Concept Normalization in Clinical Trials with Drug and Disease Representation Learning. Bioinformatics 2021; 37:3856-3864. [PMID: 34213526 PMCID: PMC8570806 DOI: 10.1093/bioinformatics/btab474] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 02/09/2021] [Revised: 06/02/2021] [Accepted: 07/01/2021] [Indexed: 11/18/2022] Open
Abstract
Motivation: Clinical trials are the essential stage of every drug development program for the treatment to become available to patients. Despite the importance of well-structured clinical trial databases and their tremendous value for drug discovery and development, such instances are very rare. Presently large-scale information on clinical trials is stored in clinical trial registers which are relatively structured, but the mappings to external databases of drugs and diseases are increasingly lacking. The precise production of such links would enable us to interrogate richer harmonized datasets for invaluable insights. Results: We present a neural approach for medical concept normalization of diseases and drugs. Our two-stage approach is based on Bidirectional Encoder Representations from Transformers (BERT). In the training stage, we optimize the relative similarity of mentions and concept names from a terminology via triplet loss. In the inference stage, we obtain the closest concept name representation in a common embedding space to a given mention representation. We performed a set of experiments on a dataset of abstracts and a real-world dataset of trial records with interventions and conditions mapped to drug and disease terminologies. The latter includes mentions associated with one or more concepts (in-KB) or zero (out-of-KB, nil prediction). Experiments show that our approach significantly outperforms baseline and state-of-the-art architectures. Moreover, we demonstrate that our approach is effective in knowledge transfer from the scientific literature to clinical trial data. Availability and implementation: We make code and data freely available at https://github.com/insilicomedicine/DILBERT.
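The two stages named in this abstract (triplet-loss training, then nearest-concept lookup) can be illustrated with a minimal sketch. This is not the DILBERT code: the 2-D vectors stand in for BERT embeddings, cosine distance and the `margin` value are assumptions, and the `nil_threshold` rule for out-of-KB mentions is a hypothetical choice, since the abstract does not say how nil predictions are made.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm


def triplet_loss(anchor, positive, negative, margin=0.2):
    """Hinge-style triplet loss on cosine distance (margin is an assumed value).

    The loss is zero once the mention (anchor) is closer to the correct
    concept name (positive) than to a wrong one (negative) by at least `margin`.
    """
    d_pos = 1.0 - cosine(anchor, positive)
    d_neg = 1.0 - cosine(anchor, negative)
    return max(0.0, d_pos - d_neg + margin)


def nearest_concept(mention_vec, concept_vecs, nil_threshold=0.5):
    """Return the id of the most similar concept, or None (nil prediction)
    when even the best match falls below the hypothetical threshold."""
    best_id, best_sim = None, -1.0
    for concept_id, vec in concept_vecs.items():
        sim = cosine(mention_vec, vec)
        if sim > best_sim:
            best_id, best_sim = concept_id, sim
    return best_id if best_sim >= nil_threshold else None


# Toy embedding space with two hypothetical concept ids.
concepts = {"C1": [1.0, 0.0], "C2": [0.0, 1.0]}
print(nearest_concept([0.9, 0.1], concepts))    # mention close to C1
print(nearest_concept([-1.0, -1.0], concepts))  # far from both concepts
```

In a real system the vectors would come from the fine-tuned encoder, and the lookup would typically use an approximate nearest-neighbour index instead of a linear scan.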
Affiliation(s)
- Artur Kadurin
- Insilico Medicine Hong Kong, Pak Shek Kok, Hong Kong
- Roman Kudrin
- Insilico Medicine Hong Kong, Pak Shek Kok, Hong Kong
3. Sadeghi SS, Keyvanpour MR. An Analytical Review of Computational Drug Repurposing. IEEE/ACM Transactions on Computational Biology and Bioinformatics 2021; 18:472-488. [PMID: 31403439 DOI: 10.1109/tcbb.2019.2933825] [Citation(s) in RCA: 13] [Impact Index Per Article: 4.3] [Indexed: 06/10/2023]
Abstract
Drug repurposing is a vital function in pharmaceutical fields and has gained popularity in recent years in both the pharmaceutical industry and research community. It refers to the process of discovering new uses and indications for existing or failed drugs. It is cost-effective and reliable in contrast to experimental drug discovery, which is a costly, time-consuming, and risky process and limited to a relatively small number of targets. Accordingly, a plethora of computational methodologies have been propounded to repurpose drugs on a large scale by utilizing available high throughput data. The available literature, however, lacks a contemporary and comprehensive analysis of the current computational drug repurposing methodologies. In this paper, we present a systematic analysis of computational drug repurposing which consists of three main sections. First, we categorize the computational drug repurposing methods based on their technical approach and artificial intelligence perspective and discuss the strengths and weaknesses of various methods. Second, some general criteria are recommended to analyze our proposed categorization. In the third and final section, a qualitative comparison is made between the approaches as a guide to understanding their preference to one another. Further, this systematic analysis can help in the efficient selection and improvement of drug repurposing techniques based on the nature of computational methods implemented on biological resources.
4. Shi W, Chen X, Deng L. A Review of Recent Developments and Progress in Computational Drug Repositioning. Curr Pharm Des 2021; 26:3059-3068. [PMID: 31951162 DOI: 10.2174/1381612826666200116145559] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 12/04/2019] [Accepted: 01/09/2020] [Indexed: 12/27/2022]
Abstract
Computational drug repositioning is an efficient approach towards discovering new indications for existing drugs. In recent years, with the accumulation of online health-related information and the extensive use of biomedical databases, computational drug repositioning approaches have achieved significant progress in drug discovery. In this review, we summarize recent advancements in drug repositioning. First, we present the available data sources that are conducive to identifying novel indications. We then provide a summary of the commonly used computing approaches. For each method, we briefly describe techniques, case studies, and evaluation criteria. Finally, we discuss the limitations of the existing computing approaches.
Affiliation(s)
- Wanwan Shi
- School of Computer Science and Engineering, Central South University, Changsha, China
- Xuegong Chen
- School of Computer Science and Engineering, Central South University, Changsha, China
- Lei Deng
- School of Computer Science and Engineering, Central South University, Changsha, China
5. Jarada TN, Rokne JG, Alhajj R. A review of computational drug repositioning: strategies, approaches, opportunities, challenges, and directions. J Cheminform 2020; 12:46. [PMID: 33431024 PMCID: PMC7374666 DOI: 10.1186/s13321-020-00450-7] [Citation(s) in RCA: 139] [Impact Index Per Article: 34.8] [Received: 02/19/2020] [Accepted: 07/13/2020] [Indexed: 01/13/2023] Open
Abstract
Drug repositioning is the process of identifying novel therapeutic potentials for existing drugs and discovering therapies for untreated diseases. Drug repositioning, therefore, plays an important role in optimizing the pre-clinical process of developing novel drugs by saving time and cost compared to the traditional de novo drug discovery processes. Since drug repositioning relies on data for existing drugs and diseases, the enormous growth of publicly available large-scale biological, biomedical, and electronic health-related data, along with high-performance computing capabilities, has accelerated the development of computational drug repositioning approaches. Multidisciplinary researchers and scientists have carried out numerous attempts, with different degrees of efficiency and success, to computationally study the potential of repositioning drugs to identify alternative drug indications. This study reviews recent advancements in the field of computational drug repositioning. First, we highlight different drug repositioning strategies and provide an overview of frequently used resources. Second, we summarize computational approaches that are extensively used in drug repositioning studies. Third, we present different computing and experimental models to validate computational methods. Fourth, we address prospective opportunities, including a few target areas. Finally, we discuss challenges and limitations encountered in computational drug repositioning and conclude with an outline of further research directions.
Affiliation(s)
- Tamer N Jarada
- Department of Computer Science, University of Calgary, Calgary, Alberta, Canada
- Jon G Rokne
- Department of Computer Science, University of Calgary, Calgary, Alberta, Canada
- Reda Alhajj
- Department of Computer Science, University of Calgary, Calgary, Alberta, Canada.
- Department of Computer Engineering, Istanbul Medipol University, Istanbul, Turkey.
6. Liu H, Zhang W, Song Y, Deng L, Zhou S. HNet-DNN: Inferring New Drug-Disease Associations with Deep Neural Network Based on Heterogeneous Network Features. J Chem Inf Model 2020; 60:2367-2376. [PMID: 32118415 DOI: 10.1021/acs.jcim.9b01008] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.8] [Indexed: 12/17/2022]
Abstract
Drug research and development is a time-consuming and high-cost task, pressing an urgent demand to identify novel indications of approved drugs, referred to as drug repositioning, which provides an economical and efficient way for drug discovery. With increasing volumes of large-scale chemical, genomic, and pharmacological data sets generated by the high-throughput technique, it is crucial to develop systematic and rational computational approaches to identify new indications of approved drugs. In this paper, we introduce HNet-DNN, which utilizes a deep neural network (DNN), to predict new drug-disease associations based on the features extracted from the drug-disease heterogeneous network. Instead of the straightforward concatenation of chemical and phenotypic features as the input of DNN, we used these raw features of drugs and diseases to construct a drug-drug similarity network and a disease-disease similarity network, and then built a drug-disease heterogeneous network by integrating known drug-disease associations. Subsequently, we extracted topological features for drug-disease associations from the heterogeneous network and used them to train a DNN model. Our intensive performance evaluations demonstrated that HNet-DNN effectively exploits the features of the heterogeneous network to boost the predictive performance of drug-disease associations. Compared with a couple of typical classifiers and competitive approaches, our method not only achieved state-of-the-art performance but also effectively alleviated the overfitting problem. Moreover, we ran HNet-DNN to predict new drug-disease associations and carried out case studies to verify the effectiveness of our method.
Affiliation(s)
- Hui Liu
- Aliyun School of Big Data, Changzhou University, 213164 Changzhou, China
- Wenhao Zhang
- Aliyun School of Big Data, Changzhou University, 213164 Changzhou, China
- Yinglong Song
- Shanghai Key Lab of Intelligent Information Processing, School of Computer Science, Fudan University, 200433 Shanghai, China
- Lei Deng
- School of Computer Science and Engineering, Central South University, 410075 Changsha, China
- Shuigeng Zhou
- Shanghai Key Lab of Intelligent Information Processing, School of Computer Science, Fudan University, 200433 Shanghai, China
7. Yan CK, Wang WX, Zhang G, Wang JL, Patel A. BiRWDDA: A Novel Drug Repositioning Method Based on Multisimilarity Fusion. J Comput Biol 2019; 26:1230-1242. [DOI: 10.1089/cmb.2019.0063] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.6] [Indexed: 12/27/2022] Open
Affiliation(s)
- Chao-Kun Yan
- School of Computer and Information Engineering, Henan University, Kaifeng, China
- Wen-Xiu Wang
- School of Computer and Information Engineering, Henan University, Kaifeng, China
- Ge Zhang
- School of Computer and Information Engineering, Henan University, Kaifeng, China
- Jian-Lin Wang
- School of Computer and Information Engineering, Henan University, Kaifeng, China
8. Mining heterogeneous network for drug repositioning using phenotypic information extracted from social media and pharmaceutical databases. Artif Intell Med 2019; 96:80-92. [DOI: 10.1016/j.artmed.2019.03.003] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.6] [Received: 06/01/2018] [Revised: 02/24/2019] [Accepted: 03/05/2019] [Indexed: 01/09/2023]
9. Yorifuji K, Uemura Y, Horibata S, Tsuji G, Suzuki Y, Miyagawa K, Nakayama K, Hirata KI, Kumagai S, Emoto N. CHST3 and CHST13 polymorphisms as predictors of bosentan-induced liver toxicity in Japanese patients with pulmonary arterial hypertension. Pharmacol Res 2018; 135:259-264. [PMID: 30118797 DOI: 10.1016/j.phrs.2018.08.011] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.7] [Received: 03/23/2018] [Revised: 08/09/2018] [Accepted: 08/13/2018] [Indexed: 01/24/2023]
Abstract
Bosentan, an endothelin receptor antagonist, has been widely used as a first-line drug for the treatment of pulmonary arterial hypertension (PAH). In addition, bosentan is approved for patients with digital ulcers related to systemic sclerosis. Liver dysfunction is a major adverse effect of bosentan and may lead to discontinuation of therapy. The purpose of this study was to identify genomic biomarkers to predict bosentan-induced liver injury. A total of 69 PAH patients were recruited into the study. An exploratory analysis of 1936 single-nucleotide polymorphisms (SNPs) in 231 genes involved in absorption, distribution, metabolism, and elimination of multiple medications was performed using Affymetrix DMET™ (Drug Metabolism Enzymes and Transporters) chips. We extracted 16 SNPs (P < 0.05) using the Jonckheere-Terpstra trend test and multiplex logistic analysis; we identified two SNPs in two genes, CHST3 and CHST13, which are responsible for proteoglycan sulfation and were significantly associated with bosentan-induced liver injury. We constructed a predictive model for bosentan-induced liver injury (area under the curve [AUC]: 0.89, sensitivity: 82.61%, specificity: 86.05%) via receiver operating characteristic (ROC) curve analysis using 2 SNPs and 2 non-genetic factors. Two SNPs were identified as potential predictive markers for bosentan-induced liver injury in Japanese patients with pulmonary arterial hypertension. This is the first pharmacogenomics study linking proteoglycan sulfating genes to drug-induced liver dysfunction, a frequently observed clinical adverse effect of bosentan therapy. These results may provide a way to personalize PAH medicine as well as provide novel mechanistic insights to drug-induced liver dysfunction.
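For readers unfamiliar with the metrics quoted in this abstract, the sketch below shows how sensitivity and specificity are computed from a confusion matrix. The counts are hypothetical: they were chosen only because they round to the reported percentages, and the abstract does not state the study's actual counts.

```python
def sensitivity_specificity(tp, fn, tn, fp):
    """Sensitivity = TP / (TP + FN); specificity = TN / (TN + FP)."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return sensitivity, specificity


# Hypothetical counts (not taken from the paper).
sens, spec = sensitivity_specificity(tp=19, fn=4, tn=37, fp=6)
print(f"sensitivity={sens:.2%}, specificity={spec:.2%}")
```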
Affiliation(s)
- Kennosuke Yorifuji
- Laboratory of Clinical Pharmaceutical Science, Kobe Pharmaceutical University, 4-19-1 Motoyama-kitamachi, Higashinada, Kobe 658-8558, Japan; The Shinko Institute for Medical Research, Shinko Hospital, 1-4-47, Wakinohama, Chuo, Kobe 651-0072, Japan; Department of Pharmacy, Shinko Hospital, 1-4-47, Wakinohama, Chuo, Kobe 651-0072, Japan
- Yuko Uemura
- The Shinko Institute for Medical Research, Shinko Hospital, 1-4-47, Wakinohama, Chuo, Kobe 651-0072, Japan
- Shinji Horibata
- The Shinko Institute for Medical Research, Shinko Hospital, 1-4-47, Wakinohama, Chuo, Kobe 651-0072, Japan; Department of Pharmacy, Shinko Hospital, 1-4-47, Wakinohama, Chuo, Kobe 651-0072, Japan
- Goh Tsuji
- The Shinko Institute for Medical Research, Shinko Hospital, 1-4-47, Wakinohama, Chuo, Kobe 651-0072, Japan; Center for Rheumatic Diseases, Shinko Hospital, 1-4-47, Wakinohama, Chuo, Kobe 651-0072, Japan
- Yoko Suzuki
- Laboratory of Clinical Pharmaceutical Science, Kobe Pharmaceutical University, 4-19-1 Motoyama-kitamachi, Higashinada, Kobe 658-8558, Japan; Division of Cardiovascular Medicine, Department of Internal Medicine, Kobe Graduate School of Medicine, 7-5-1 Kusunoki, Chuo, Kobe 650-0017, Japan
- Kazuya Miyagawa
- Laboratory of Clinical Pharmaceutical Science, Kobe Pharmaceutical University, 4-19-1 Motoyama-kitamachi, Higashinada, Kobe 658-8558, Japan; Division of Cardiovascular Medicine, Department of Internal Medicine, Kobe Graduate School of Medicine, 7-5-1 Kusunoki, Chuo, Kobe 650-0017, Japan
- Kazuhiko Nakayama
- Division of Cardiovascular Medicine, Department of Internal Medicine, Kobe Graduate School of Medicine, 7-5-1 Kusunoki, Chuo, Kobe 650-0017, Japan
- Ken-Ichi Hirata
- Division of Cardiovascular Medicine, Department of Internal Medicine, Kobe Graduate School of Medicine, 7-5-1 Kusunoki, Chuo, Kobe 650-0017, Japan
- Shunichi Kumagai
- The Shinko Institute for Medical Research, Shinko Hospital, 1-4-47, Wakinohama, Chuo, Kobe 651-0072, Japan; Center for Rheumatic Diseases, Shinko Hospital, 1-4-47, Wakinohama, Chuo, Kobe 651-0072, Japan
- Noriaki Emoto
- Laboratory of Clinical Pharmaceutical Science, Kobe Pharmaceutical University, 4-19-1 Motoyama-kitamachi, Higashinada, Kobe 658-8558, Japan; Division of Cardiovascular Medicine, Department of Internal Medicine, Kobe Graduate School of Medicine, 7-5-1 Kusunoki, Chuo, Kobe 650-0017, Japan.
10. Krallinger M, Rabal O, Lourenço A, Oyarzabal J, Valencia A. Information Retrieval and Text Mining Technologies for Chemistry. Chem Rev 2017; 117:7673-7761. [PMID: 28475312 DOI: 10.1021/acs.chemrev.6b00851] [Citation(s) in RCA: 111] [Impact Index Per Article: 15.9] [Indexed: 01/03/2023]
Abstract
Efficient access to chemical information contained in scientific literature, patents, technical reports, or the web is a pressing need shared by researchers and patent attorneys from different chemical disciplines. Retrieval of important chemical information in most cases starts with finding relevant documents for a particular chemical compound or family. Targeted retrieval of chemical documents is closely connected to the automatic recognition of chemical entities in the text, which commonly involves the extraction of the entire list of chemicals mentioned in a document, including any associated information. In this Review, we provide a comprehensive and in-depth description of fundamental concepts, technical implementations, and current technologies for meeting these information demands. A strong focus is placed on community challenges addressing systems performance, more particularly CHEMDNER and CHEMDNER patents tasks of BioCreative IV and V, respectively. Considering the growing interest in the construction of automatically annotated chemical knowledge bases that integrate chemical information and biological data, cheminformatics approaches for mapping the extracted chemical names into chemical structures and their subsequent annotation together with text mining applications for linking chemistry with biological information are also presented. Finally, future trends and current challenges are highlighted as a roadmap proposal for research in this emerging field.
Affiliation(s)
- Martin Krallinger
- Structural Computational Biology Group, Structural Biology and BioComputing Programme, Spanish National Cancer Research Centre, C/Melchor Fernández Almagro 3, Madrid E-28029, Spain
- Obdulia Rabal
- Small Molecule Discovery Platform, Molecular Therapeutics Program, Center for Applied Medical Research (CIMA), University of Navarra, Avenida Pio XII 55, Pamplona E-31008, Spain
- Anália Lourenço
- ESEI - Department of Computer Science, University of Vigo, Edificio Politécnico, Campus Universitario As Lagoas s/n, Ourense E-32004, Spain; Centro de Investigaciones Biomédicas (Centro Singular de Investigación de Galicia), Campus Universitario Lagoas-Marcosende, Vigo E-36310, Spain; CEB-Centre of Biological Engineering, University of Minho, Campus de Gualtar, Braga 4710-057, Portugal
- Julen Oyarzabal
- Small Molecule Discovery Platform, Molecular Therapeutics Program, Center for Applied Medical Research (CIMA), University of Navarra, Avenida Pio XII 55, Pamplona E-31008, Spain
- Alfonso Valencia
- Life Science Department, Barcelona Supercomputing Centre (BSC-CNS), C/Jordi Girona, 29-31, Barcelona E-08034, Spain; Joint BSC-IRB-CRG Program in Computational Biology, Parc Científic de Barcelona, C/ Baldiri Reixac 10, Barcelona E-08028, Spain; Institució Catalana de Recerca i Estudis Avançats (ICREA), Passeig de Lluís Companys 23, Barcelona E-08010, Spain
11. Su EW, Sanger TM. Systematic drug repositioning through mining adverse event data in ClinicalTrials.gov. PeerJ 2017; 5:e3154. [PMID: 28348935 PMCID: PMC5366063 DOI: 10.7717/peerj.3154] [Citation(s) in RCA: 24] [Impact Index Per Article: 3.4] [Received: 12/08/2016] [Accepted: 03/07/2017] [Indexed: 01/05/2023] Open
Abstract
Drug repositioning (i.e., drug repurposing) is the process of discovering new uses for marketed drugs. Historically, such discoveries were serendipitous. However, the rapid growth in electronic clinical data and text mining tools makes it feasible to systematically identify drugs with the potential to be repurposed. Described here is a novel method of drug repositioning by mining ClinicalTrials.gov. The text mining tools I2E (Linguamatics) and PolyAnalyst (Megaputer) were utilized. An I2E query extracts “Serious Adverse Events” (SAE) data from randomized trials in ClinicalTrials.gov. Through a statistical algorithm, a PolyAnalyst workflow ranks the drugs where the treatment arm has fewer predefined SAEs than the control arm, indicating that potentially the drug is reducing the level of SAE. Hypotheses could then be generated for the new use of these drugs based on the predefined SAE that is indicative of disease (for example, cancer).
Affiliation(s)
- Eric Wen Su
- Advanced Analytics Hub, Eli Lilly and Company, Indianapolis, IN, United States of America
- Todd M Sanger
- Advanced Analytics Hub, Eli Lilly and Company, Indianapolis, IN, United States of America
12. Zarin DA, Tse T, Williams RJ, Rajakannan T. Update on Trial Registration 11 Years after the ICMJE Policy Was Established. N Engl J Med 2017; 376:383-391. [PMID: 28121511 PMCID: PMC5813248 DOI: 10.1056/nejmsr1601330] [Citation(s) in RCA: 146] [Impact Index Per Article: 20.9] [Indexed: 01/22/2023]
Abstract
In the decade following the journal editors’ trial registration policy, a global trial reporting system (TRS) has arisen to supplement journal publication by increasing the transparency and accountability of the clinical research enterprise (CRE), which ultimately advances evidence-based medicine. Trial registration is a foundational component of the TRS. In this article, we assess the impact of trial registration on the CRE with respect to two key goals: (1) establishing a publicly accessible and structured public record of all trials and (2) ensuring access to date-stamped protocol details that change during a study. After characterizing the international trial registry landscape, we summarize the published evidence of the impact of the registration laws and policies on the CRE to date. We present three analyses using ClinicalTrials.gov registration data to illustrate approaches for assessing and monitoring the TRS: (1) timing of registration (i.e., prior to trial initiation [prospective] or after trial initiation [retrospective or “late”]); (2) degree of specificity and consistency of registered primary outcome measures compared to descriptions in study protocols and published articles; and (3) a survey of the published literature to characterize how ClinicalTrials.gov data has been used in research on the CRE. These findings suggest that, while the TRS is largely moving towards its goals, key stakeholders need to do more in the next decade.
Affiliation(s)
- Deborah A Zarin
- From the National Library of Medicine, National Institutes of Health, Department of Health and Human Services, Bethesda, MD
- Tony Tse
- From the National Library of Medicine, National Institutes of Health, Department of Health and Human Services, Bethesda, MD
- Rebecca J Williams
- From the National Library of Medicine, National Institutes of Health, Department of Health and Human Services, Bethesda, MD
- Thiyagu Rajakannan
- From the National Library of Medicine, National Institutes of Health, Department of Health and Human Services, Bethesda, MD
13. Liu H, Song Y, Guan J, Luo L, Zhuang Z. Inferring new indications for approved drugs via random walk on drug-disease heterogenous networks. BMC Bioinformatics 2016; 17:539. [PMID: 28155639 PMCID: PMC5259862 DOI: 10.1186/s12859-016-1336-7] [Citation(s) in RCA: 50] [Impact Index Per Article: 6.3] [Indexed: 01/03/2023] Open
Abstract
Background: Since traditional drug research and development is often time-consuming and high-risk, there is an increasing interest in establishing new medical indications for approved drugs, referred to as drug repositioning, which provides a relatively low-cost and high-efficiency approach for drug discovery. With the explosive growth of large-scale biochemical and phenotypic data, drug repositioning holds great potential for precision medicine in the post-genomic era. It is urgent to develop rational and systematic approaches to predict new indications for approved drugs on a large scale. Results: In this paper, we propose the two-pass random walks with restart on a heterogenous network, TP-NRWRH for short, to predict new indications for approved drugs. Rather than performing a random walk on a bipartite network, we integrated the drug-drug similarity network, disease-disease similarity network and known drug-disease association network into one heterogenous network, on which the two-pass random walks with restart are implemented. We have conducted performance evaluation on two datasets of drug-disease associations, and the results show that our method has higher performance than six existing methods. A case study on Alzheimer’s disease showed that nine of the top 10 predicted drugs are approved or investigational for neurodegenerative diseases. The experimental results show that our method achieves state-of-the-art performance in predicting new indications for approved drugs. Conclusions: We proposed a two-pass random walk with restart on the drug-disease heterogeneous network, referred to as TP-NRWRH, to predict new indications for approved drugs. Performance evaluation on two independent datasets showed that TP-NRWRH achieved higher performance than six existing methods on 10-fold cross validations. The case study on Alzheimer’s disease showed that nine of the top 10 predicted drugs are approved or investigational for neurodegenerative diseases. The results show that our method achieves state-of-the-art performance in predicting new indications for approved drugs.
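The core idea behind this abstract, a random walk with restart (RWR) over a combined drug-disease network, can be sketched as follows. This is a plain single-pass RWR on a toy graph, not the two-pass TP-NRWRH procedure the authors describe; the node names, edges, and the `restart` probability are all hypothetical.

```python
def column_normalize(adj):
    """Divide each column by its sum so every column is a probability vector."""
    n = len(adj)
    col_sums = [sum(adj[i][j] for i in range(n)) for j in range(n)]
    return [
        [adj[i][j] / col_sums[j] if col_sums[j] else 0.0 for j in range(n)]
        for i in range(n)
    ]


def random_walk_with_restart(adj, seed, restart=0.3, tol=1e-10, max_iter=1000):
    """Iterate p <- (1 - r) * W p + r * p0 until the L1 change drops below tol."""
    n = len(adj)
    w = column_normalize(adj)
    p0 = [1.0 if i == seed else 0.0 for i in range(n)]
    p = p0[:]
    for _ in range(max_iter):
        nxt = [
            (1.0 - restart) * sum(w[i][j] * p[j] for j in range(n)) + restart * p0[i]
            for i in range(n)
        ]
        if sum(abs(a - b) for a, b in zip(nxt, p)) < tol:
            return nxt
        p = nxt
    return p


# Toy heterogeneous network (hypothetical): drugs D0, D1 and diseases S0, S1.
# Edges: D0-D1 (drug similarity), D0-S0 (known association), S0-S1 (disease similarity).
nodes = ["D0", "D1", "S0", "S1"]
adj = [
    [0, 1, 1, 0],
    [1, 0, 0, 0],
    [1, 0, 0, 1],
    [0, 0, 1, 0],
]

# Seed the walk at drug D1 and rank the diseases by steady-state score.
scores = random_walk_with_restart(adj, seed=nodes.index("D1"))
ranked = sorted(["S0", "S1"], key=lambda s: scores[nodes.index(s)], reverse=True)
print(ranked)
```

Here the walk reaches S0 through the similar drug D0, so S0 is ranked above S1 as a candidate indication for D1; real methods weight edges by similarity scores instead of using a 0/1 adjacency matrix.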
Affiliation(s)
- Hui Liu
- Changzhou NO. 7 People's Hospital, Changzhou, Jiangsu, 213011, China; Changzhou University, Jiangsu, 213164, China
- Yinglong Song
- Shanghai Key Lab of Intelligent Information Processing, School of Computer Science, Fudan University, Shanghai, 200433, China
- Jihong Guan
- Department of Computer Science and Technology, Tongji University, Shanghai, 201804, China
- Libo Luo
- Changzhou NO. 7 People's Hospital, Changzhou, Jiangsu, 213011, China
- Ziheng Zhuang
- Changzhou NO. 7 People's Hospital, Changzhou, Jiangsu, 213011, China; Changzhou University, Jiangsu, 213164, China
14. Xu J, Lee HJ, Zeng J, Wu Y, Zhang Y, Huang LC, Johnson A, Holla V, Bailey AM, Cohen T, Meric-Bernstam F, Bernstam EV, Xu H. Extracting genetic alteration information for personalized cancer therapy from ClinicalTrials.gov. J Am Med Inform Assoc 2016; 23:750-7. [PMID: 27013523 DOI: 10.1093/jamia/ocw009] [Citation(s) in RCA: 18] [Impact Index Per Article: 2.3] [Received: 09/15/2015] [Accepted: 01/13/2016] [Indexed: 11/12/2022] Open
Abstract
OBJECTIVE: Clinical trials investigating drugs that target specific genetic alterations in tumors are important for promoting personalized cancer therapy. The goal of this project is to create a knowledge base of cancer treatment trials with annotations about genetic alterations from ClinicalTrials.gov. METHODS: We developed a semi-automatic framework that combines advanced text-processing techniques with manual review to curate genetic alteration information in cancer trials. The framework consists of a document classification system to identify cancer treatment trials from ClinicalTrials.gov and an information extraction system to extract gene and alteration pairs from the Title and Eligibility Criteria sections of clinical trials. By applying the framework to trials at ClinicalTrials.gov, we created a knowledge base of cancer treatment trials with genetic alteration annotations. We then evaluated each component of the framework against manually reviewed sets of clinical trials and generated descriptive statistics of the knowledge base. RESULTS AND DISCUSSION: The automated cancer treatment trial identification system achieved a high precision of 0.9944. Together with the manual review process, it identified 20 193 cancer treatment trials from ClinicalTrials.gov. The automated gene-alteration extraction system achieved a precision of 0.8300 and a recall of 0.6803. After validation by manual review, we generated a knowledge base of 2024 cancer trials that are labeled with specific genetic alteration information. Analysis of the knowledge base revealed the trend of increased use of targeted therapy for cancer, as well as the most frequent gene-alteration pairs of interest. We expect this knowledge base to be a valuable resource for physicians and patients who are seeking information about personalized cancer therapy.
Affiliation(s)
- Jun Xu
  - School of Biomedical Informatics, University of Texas Health Science Center at Houston, Houston, TX, USA
- Hee-Jin Lee
  - School of Biomedical Informatics, University of Texas Health Science Center at Houston, Houston, TX, USA
- Jia Zeng
  - Institute for Personalized Cancer Therapy, University of Texas MD Anderson Cancer Center, Houston, TX, USA
- Yonghui Wu
  - School of Biomedical Informatics, University of Texas Health Science Center at Houston, Houston, TX, USA
- Yaoyun Zhang
  - School of Biomedical Informatics, University of Texas Health Science Center at Houston, Houston, TX, USA
- Liang-Chin Huang
  - School of Biomedical Informatics, University of Texas Health Science Center at Houston, Houston, TX, USA
- Amber Johnson
  - Institute for Personalized Cancer Therapy, University of Texas MD Anderson Cancer Center, Houston, TX, USA
- Vijaykumar Holla
  - Institute for Personalized Cancer Therapy, University of Texas MD Anderson Cancer Center, Houston, TX, USA
- Ann M Bailey
  - Institute for Personalized Cancer Therapy, University of Texas MD Anderson Cancer Center, Houston, TX, USA
- Trevor Cohen
  - School of Biomedical Informatics, University of Texas Health Science Center at Houston, Houston, TX, USA
- Funda Meric-Bernstam
  - Institute for Personalized Cancer Therapy, University of Texas MD Anderson Cancer Center, Houston, TX, USA
  - Department of Investigational Cancer Therapeutics, University of Texas MD Anderson Cancer Center, Houston, TX, USA
- Elmer V Bernstam
  - School of Biomedical Informatics, University of Texas Health Science Center at Houston, Houston, TX, USA
  - Division of General Internal Medicine, Department of Internal Medicine, Medical School, University of Texas Health Science Center at Houston, Houston, TX, USA
- Hua Xu
  - School of Biomedical Informatics, University of Texas Health Science Center at Houston, Houston, TX, USA
|
15
|
Wei CH, Leaman R, Lu Z. SimConcept: a hybrid approach for simplifying composite named entities in biomedical text. IEEE J Biomed Health Inform 2015; 19:1385-91. [PMID: 25879978 PMCID: PMC4543296 DOI: 10.1109/jbhi.2015.2422651] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.6] [Indexed: 11/05/2022]
Abstract
One particular challenge in biomedical named entity recognition (NER) and normalization is the identification and resolution of composite named entities, where a single span refers to more than one concept (e.g., BRCA1/2). Previous NER and normalization studies have either ignored composite mentions, used simple ad hoc rules, or only handled coordination ellipsis, making a robust approach for handling multitype composite mentions greatly needed. To this end, we propose a hybrid method integrating a machine-learning model with a pattern identification strategy to identify the individual components of each composite mention. Our method, which we have named SimConcept, is the first to systematically handle many types of composite mentions. The technique achieves high performance in identifying and resolving composite mentions for three key biological entities: genes (90.42% in F-measure), diseases (86.47% in F-measure), and chemicals (86.05% in F-measure). Furthermore, our results show that using our SimConcept method can subsequently improve the performance of gene and disease concept recognition and normalization. SimConcept is available for download at: http://www.ncbi.nlm.nih.gov/CBBresearch/Lu/Demo/SimConcept/.
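To make the notion of a composite mention concrete, the sketch below expands one narrow pattern, a shared stem followed by slash-separated numeric suffixes, as in BRCA1/2. It is a toy, rule-only illustration and is not the SimConcept method, which the abstract describes as combining a machine-learning model with pattern identification. The function name `expand_slash_composite` and the regular expression are assumptions made for this example.

```python
import re

# Hypothetical pattern: alphabetic stem, a number, then one or more "/number" parts.
_COMPOSITE = re.compile(r"([A-Za-z]+)(\d+)((?:/\d+)+)")


def expand_slash_composite(mention: str) -> list[str]:
    """Split a mention such as 'BRCA1/2' into its individual components.

    Mentions that do not match the pattern are returned unchanged.
    """
    match = _COMPOSITE.fullmatch(mention)
    if match is None:
        return [mention]
    stem, first, rest = match.groups()
    # 'rest' looks like '/2' or '/3/4'; drop the leading slash before splitting.
    suffixes = [first] + rest.strip("/").split("/")
    return [stem + suffix for suffix in suffixes]


print(expand_slash_composite("BRCA1/2"))
```

A single rule like this misses the other composite types the abstract mentions, such as coordination ellipsis, which is the gap a learned model is meant to cover.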
|
16
|
Li J, Zheng S, Chen B, Butte AJ, Swamidass SJ, Lu Z. A survey of current trends in computational drug repositioning. Brief Bioinform 2015; 17:2-12. [PMID: 25832646 DOI: 10.1093/bib/bbv020] [Citation(s) in RCA: 338] [Impact Index Per Article: 37.6] [Received: 12/09/2014] [Indexed: 12/26/2022]
Abstract
Computational drug repositioning or repurposing is a promising and efficient tool for discovering new uses for existing drugs and holds great potential for precision medicine in the age of big data. The explosive growth of large-scale genomic and phenotypic data, as well as data on small molecular compounds with granted regulatory approval, is enabling new developments for computational repositioning. To achieve the shortest path toward new drug indications, advanced data processing and analysis strategies are critical for making sense of these heterogeneous molecular measurements. In this review, we show recent advancements in the critical areas of computational drug repositioning from multiple aspects. First, we summarize available data sources and the corresponding computational repositioning strategies. Second, we characterize the commonly used computational techniques. Third, we discuss validation strategies for repositioning studies, including both computational and experimental methods. Finally, we highlight potential opportunities and use-cases, including a few target areas such as cancers. We conclude with a brief discussion of the remaining challenges in computational drug repositioning.
|
17
|
A Semantic Web-based System for Mining Genetic Mutations in Cancer Clinical Trials. AMIA Jt Summits Transl Sci Proc 2015; 2015:142-6. [PMID: 26306257 PMCID: PMC4525254] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/22/2022]
Abstract
Textual eligibility criteria in clinical trial protocols contain important information about potential clinically relevant pharmacogenomic events. Manual curation for harvesting this evidence is intractable as it is error prone and time consuming. In this paper, we develop and evaluate a Semantic Web-based system that captures and manages mutation evidence and related contextual information from cancer clinical trials. The system has 2 main components: an NLP-based annotator and a Semantic Web ontology-based annotation manager. We evaluated the performance of the annotator in terms of precision and recall. We demonstrated the usefulness of the system by conducting case studies in retrieving relevant clinical trials using a collection of mutations identified from TCGA Leukemia patients and the Atlas of Genetics and Cytogenetics in Oncology and Haematology. In conclusion, our system using Semantic Web technologies provides an effective framework for extraction, annotation, standardization and management of genetic mutations in cancer clinical trials.
|
18
|
Khare R, Wei CH, Mao Y, Leaman R, Lu Z. tmBioC: improving interoperability of text-mining tools with BioC. Database (Oxford) 2014; 2014:bau073. [PMID: 25062914 PMCID: PMC4110697 DOI: 10.1093/database/bau073] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.8] [Received: 01/31/2014] [Revised: 06/30/2014] [Accepted: 07/01/2014] [Indexed: 02/05/2023]
Abstract
The lack of interoperability among biomedical text-mining tools is a major bottleneck in creating more complex applications. Despite the availability of numerous methods and techniques for various text-mining tasks, combining different tools requires substantial effort and time owing to heterogeneity and variety in data formats. In response, BioC is a recent proposal that offers a minimalistic approach to tool interoperability by stipulating minimal changes to existing tools and applications. BioC is a family of XML formats that define how to present text documents and annotations, and also provides easy-to-use functions to read/write documents in the BioC format. In this study, we introduce our text-mining toolkit, which is designed to perform several challenging and significant tasks in the biomedical domain, and repackage the toolkit into BioC to enhance its interoperability. Our toolkit consists of six state-of-the-art tools for named-entity recognition, normalization and annotation (PubTator) of genes (GenNorm), diseases (DNorm), mutations (tmVar), species (SR4GN) and chemicals (tmChem). Although developed within the same group, each tool is designed to process input articles and output annotations in a different format. We modify these tools and enable them to read/write data in the proposed BioC format. We find that, using the BioC family of formats and functions, only minimal changes were required to build the newer versions of the tools. The resulting BioC wrapped toolkit, which we have named tmBioC, consists of our tools in BioC, an annotated full-text corpus in BioC, and a format detection and conversion tool. Furthermore, through participation in the 2013 BioCreative IV Interoperability Track, we empirically demonstrate that the tools in tmBioC can be more efficiently integrated with each other as well as with external tools: our experimental results show that using BioC reduces the lines of code needed for text-mining tool integration by >60%.
The tmBioC toolkit is publicly available at http://www.ncbi.nlm.nih.gov/CBBresearch/Lu/Demo/tmTools/. Database URL: http://www.ncbi.nlm.nih.gov/CBBresearch/Lu/Demo/tmTools/.
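The abstract describes BioC as a family of XML formats for documents and annotations. The sketch below reads gene annotations from a small BioC-style collection using only the Python standard library. The sample document is invented for illustration, and the element names (`collection`, `document`, `passage`, `annotation`, `infon`, `location`) follow a common reading of the BioC layout; treat them as assumptions and check them against the official DTD before relying on this. The helper `read_annotations` is hypothetical and is not part of tmBioC.

```python
import xml.etree.ElementTree as ET

# Invented sample; real BioC files are produced by tools such as those in tmBioC.
SAMPLE = """<collection>
  <source>example</source>
  <date>2014-01-01</date>
  <key>example.key</key>
  <document>
    <id>0</id>
    <passage>
      <offset>0</offset>
      <text>BRCA1 mutations raise breast cancer risk.</text>
      <annotation id="T1">
        <infon key="type">Gene</infon>
        <location offset="0" length="5"/>
        <text>BRCA1</text>
      </annotation>
    </passage>
  </document>
</collection>"""


def read_annotations(xml_text: str) -> list[tuple]:
    """Return (document id, type, offset, length, text) for each annotation."""
    root = ET.fromstring(xml_text)
    results = []
    for doc in root.iter("document"):
        doc_id = doc.findtext("id")
        for ann in doc.iter("annotation"):
            # infon elements are key/value pairs attached to the annotation.
            infons = {i.get("key"): i.text for i in ann.findall("infon")}
            loc = ann.find("location")
            results.append((
                doc_id,
                infons.get("type"),
                int(loc.get("offset")),
                int(loc.get("length")),
                ann.findtext("text"),
            ))
    return results


for row in read_annotations(SAMPLE):
    print(row)
```

Because every tool reads and writes the same structure, a reader like this one can be reused unchanged across tools, which is the kind of saving in integration code the abstract reports.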
Affiliation(s)
- Ritu Khare
  - National Center for Biotechnology Information, National Institutes of Health, 8600 Rockville Pike, Bethesda, MD, USA
- Chih-Hsuan Wei
  - National Center for Biotechnology Information, National Institutes of Health, 8600 Rockville Pike, Bethesda, MD, USA
- Yuqing Mao
  - National Center for Biotechnology Information, National Institutes of Health, 8600 Rockville Pike, Bethesda, MD, USA
- Robert Leaman
  - National Center for Biotechnology Information, National Institutes of Health, 8600 Rockville Pike, Bethesda, MD, USA
- Zhiyong Lu
  - National Center for Biotechnology Information, National Institutes of Health, 8600 Rockville Pike, Bethesda, MD, USA
|
19
|
Wei CH, Leaman R, Lu Z. SimConcept: A Hybrid Approach for Simplifying Composite Named Entities in Biomedicine. ACM-BCB ... ... : the ... ACM Conference on Bioinformatics, Computational Biology and Biomedicine 2014; 2014:138-146. [PMID: 25844401 PMCID: PMC4384177 DOI: 10.1145/2649387.2649420] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.5] [Indexed: 10/24/2022]
Abstract
Many text-mining studies have focused on the issue of named entity recognition and normalization, especially in the field of biomedical natural language processing. However, entity recognition is a complicated and difficult task in biomedical text. One particular challenge is to identify and resolve composite named entities, where a single span refers to more than one concept (e.g., BRCA1/2). Most bioconcept recognition and normalization studies have either ignored this issue, used simple ad hoc rules, or only handled coordination ellipsis, which is only one of the many types of composite mentions studied in this work. No systematic methods for simplifying composite mentions have been previously reported, making a robust approach greatly needed. To this end, we propose a hybrid approach that integrates a machine learning model with a pattern identification strategy to identify the antecedent and conjunct regions of a concept mention, and then reassembles the composite mention using those identified regions. Our method, which we have named SimConcept, is the first method to systematically handle most types of composite mentions. Our method achieves high performance in identifying and resolving composite mentions for three fundamental biological entities: genes (89.29% in F-measure), diseases (85.52% in F-measure) and chemicals (84.04% in F-measure). Furthermore, our results show that using our SimConcept method can subsequently help improve the performance of gene and disease concept recognition and normalization.
Affiliation(s)
- Chih-Hsuan Wei
  - 8600 Rockville Pike, National Center for Biotechnology Information (NCBI), Bethesda, Maryland, USA, 20894
- Robert Leaman
  - 8600 Rockville Pike, National Center for Biotechnology Information (NCBI), Bethesda, Maryland, USA, 20894
- Zhiyong Lu
  - 8600 Rockville Pike, National Center for Biotechnology Information (NCBI), Bethesda, Maryland, USA, 20894
|
20
|
Application of genomics, proteomics and metabolomics in drug discovery, development and clinic. Ther Deliv 2013; 4:395-413. [PMID: 23442083 DOI: 10.4155/tde.13.4] [Citation(s) in RCA: 42] [Impact Index Per Article: 3.8] [Indexed: 12/14/2022]
Abstract
Genomics, proteomics and metabolomics are three areas that are routinely applied throughout the drug-development process as well as after a product enters the market. This review discusses all three 'omics, reporting on the key applications, techniques, recent advances and expectations of each. Genomics, mainly through the use of novel and next-generation sequencing techniques, has advanced areas of drug discovery and development through the comparative assessment of normal and diseased-state tissues, transcription and/or expression profiling, side-effect profiling, pharmacogenomics and the identification of biomarkers. Proteomics, through techniques including isotope coded affinity tags, stable isotopic labeling by amino acids in cell culture, isobaric tags for relative and absolute quantification, multidirectional protein identification technology, activity-based probes, protein/peptide arrays, phage displays and two-hybrid systems is utilized in multiple areas through the drug development pipeline including target and lead identification, compound optimization, throughout the clinical trials process and after market analysis. Metabolomics, although the most recent and least developed of the three 'omics considered in this review, provides a significant contribution to drug development through systems biology approaches. Already implemented to some degree in the drug-discovery industry and used in applications spanning target identification through to toxicological analysis, metabolic network understanding is essential in generating future discoveries.
|
21
|
Trugenberger CA, Wälti C, Peregrim D, Sharp ME, Bureeva S. Discovery of novel biomarkers and phenotypes by semantic technologies. BMC Bioinformatics 2013; 14:51. [PMID: 23402646 PMCID: PMC3605201 DOI: 10.1186/1471-2105-14-51] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.0] [Received: 06/22/2012] [Accepted: 02/01/2013] [Indexed: 11/13/2022]
Abstract
BACKGROUND: Biomarkers and target-specific phenotypes are important to targeted drug design and individualized medicine, thus constituting an important aspect of modern pharmaceutical research and development. More and more, the discovery of relevant biomarkers is aided by in silico techniques based on applying data mining and computational chemistry to large molecular databases. However, there is an even larger source of valuable information available that can potentially be tapped for such discoveries: repositories constituted by research documents. RESULTS: This paper reports on a pilot experiment to discover potential novel biomarkers and phenotypes for diabetes and obesity by self-organized text mining of about 120,000 PubMed abstracts, public clinical trial summaries, and internal Merck research documents. These documents were directly analyzed by the InfoCodex semantic engine, without prior human manipulations such as parsing. Recall and precision against established, but different, benchmarks lie in ranges up to 30% and 50% respectively. Retrieval of known entities missed by other traditional approaches could be demonstrated. Finally, the InfoCodex semantic engine was shown to discover new diabetes and obesity biomarkers and phenotypes. Amongst these were many interesting candidates with a high potential, although noticeable noise (uninteresting or obvious terms) was generated. CONCLUSIONS: The reported approach of employing autonomous self-organising semantic engines to aid biomarker discovery, supplemented by appropriate manual curation processes, shows promise. Conservatively, it could offer a faster alternative to vocabulary processes that depend on humans having to read and analyze all the texts. More optimistically, it could impact pharmaceutical research, for example by shortening the time-to-market of novel drugs or speeding up early recognition of dead ends and adverse reactions.
Affiliation(s)
- Carlo A Trugenberger
  - InfoCodex AG, Semantic Technologies, Bahnhofstrasse 50, Buchs (SG), CH-9470, Switzerland
- Christoph Wälti
  - InfoCodex AG, Semantic Technologies, Bahnhofstrasse 50, Buchs (SG), CH-9470, Switzerland
- David Peregrim
  - Merck Research Laboratories, 126 East Lincoln Avenue, Rahway, NJ 07065, USA
- Mark E Sharp
  - Merck Research Laboratories, 126 East Lincoln Avenue, Rahway, NJ 07065, USA
- Svetlana Bureeva
  - Thomson Reuters, 5901 Priestly Drive, STE 200, Carlsbad, CA, 92008, USA
|
22
|
The state of the art in text mining and natural language processing for pharmacogenomics. J Biomed Inform 2012. [DOI: 10.1016/j.jbi.2012.08.001] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.5] [Indexed: 11/23/2022]
|