1
Brnabic AJM, Curtis SE, Johnston JA, Lo A, Zagar AJ, Lipkovich I, Kadziola Z, Murray MH, Ryan T. Incidence of type 2 diabetes, cardiovascular disease and chronic kidney disease in patients with multiple sclerosis initiating disease-modifying therapies: Retrospective cohort study using a frequentist model averaging statistical framework. PLoS One 2024; 19:e0300708. [PMID: 38517926 PMCID: PMC10959335 DOI: 10.1371/journal.pone.0300708] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/15/2023] [Accepted: 03/04/2024] [Indexed: 03/24/2024]
Abstract
Researchers are increasingly using insights derived from large-scale, electronic healthcare data to inform drug development and provide human validation of novel treatment pathways and aid in drug repurposing/repositioning. The objective of this study was to determine whether treatment of patients with multiple sclerosis with dimethyl fumarate, an activator of the nuclear factor erythroid 2-related factor 2 (Nrf2) pathway, results in a change in incidence of type 2 diabetes and its complications. This retrospective cohort study used administrative claims data to derive four cohorts of adults with multiple sclerosis initiating dimethyl fumarate, teriflunomide, glatiramer acetate or fingolimod between January 2013 and December 2018. A causal inference frequentist model averaging framework based on machine learning was used to compare the time to first occurrence of a composite endpoint of type 2 diabetes, cardiovascular disease or chronic kidney disease, as well as each individual outcome, across the four treatment cohorts. There was a statistically significantly lower risk of incidence for dimethyl fumarate versus teriflunomide for the composite endpoint (restricted hazard ratio [95% confidence interval] 0.70 [0.55, 0.90]) and type 2 diabetes (0.65 [0.49, 0.98]), myocardial infarction (0.59 [0.35, 0.97]) and chronic kidney disease (0.52 [0.28, 0.86]). No differences for other individual outcomes or for dimethyl fumarate versus the other two cohorts were observed. This study effectively demonstrated the use of an innovative statistical methodology to test a clinical hypothesis using real-world data to perform early target validation for drug discovery. 
Although there was a trend among patients treated with dimethyl fumarate towards a decreased incidence of type 2 diabetes, cardiovascular disease and chronic kidney disease relative to other disease-modifying therapies, which was statistically significant for the comparison with teriflunomide, this study did not definitively support the hypothesis that Nrf2 activation provided additional metabolic disease benefit in patients with multiple sclerosis.
Affiliation(s)
- Alan J M Brnabic
  - Eli Lilly and Company, Lilly Corporate Center, Indianapolis, IN, United States of America
- Sarah E Curtis
  - Eli Lilly and Company, Lilly Corporate Center, Indianapolis, IN, United States of America
- Joseph A Johnston
  - Eli Lilly and Company, Lilly Corporate Center, Indianapolis, IN, United States of America
- Albert Lo
  - Eli Lilly and Company, Lilly Corporate Center, Indianapolis, IN, United States of America
- Anthony J Zagar
  - Eli Lilly and Company, Lilly Corporate Center, Indianapolis, IN, United States of America
- Ilya Lipkovich
  - Eli Lilly and Company, Lilly Corporate Center, Indianapolis, IN, United States of America
- Zbigniew Kadziola
  - Eli Lilly and Company, Lilly Corporate Center, Indianapolis, IN, United States of America
- Megan H Murray
  - Eli Lilly and Company, Lilly Corporate Center, Indianapolis, IN, United States of America
- Timothy Ryan
  - Eli Lilly and Company, Lilly Corporate Center, Indianapolis, IN, United States of America
2
Zhou JX, Torres VE. Drug repurposing in autosomal dominant polycystic kidney disease. Kidney Int 2023; 103:859-871. [PMID: 36870435 DOI: 10.1016/j.kint.2023.02.010] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/18/2022] [Revised: 01/23/2023] [Accepted: 02/07/2023] [Indexed: 03/06/2023]
Abstract
Autosomal dominant polycystic kidney disease is characterized by progressive kidney cyst formation that leads to kidney failure. Tolvaptan, a vasopressin 2 receptor antagonist, is the only drug approved to treat patients with autosomal dominant polycystic kidney disease who have rapid disease progression. The use of tolvaptan is limited by reduced tolerability from aquaretic effects and potential hepatotoxicity. Thus, the search for more effective drugs to slow down the progression of autosomal dominant polycystic kidney disease is urgent and challenging. Drug repurposing is a strategy for identifying new clinical indications for approved or investigational medications. Drug repurposing is increasingly becoming an attractive proposition because of its cost-efficiency and time-efficiency and known pharmacokinetic and safety profiles. In this review, we focus on the repurposing approaches to identify suitable drug candidates to treat autosomal dominant polycystic kidney disease and prioritization and implementation of candidates with high probability of success. Identification of drug candidates through understanding of disease pathogenesis and signaling pathways is highlighted.
Affiliation(s)
- Julie Xia Zhou
  - Division of Nephrology and Hypertension, Mayo Clinic, Rochester, Minnesota, USA; Mayo Clinic Robert M. and Billie Kelley Pirnie Translational Polycystic Kidney Disease Center, Rochester, Minnesota, USA
- Vicente E Torres
  - Division of Nephrology and Hypertension, Mayo Clinic, Rochester, Minnesota, USA; Mayo Clinic Robert M. and Billie Kelley Pirnie Translational Polycystic Kidney Disease Center, Rochester, Minnesota, USA
3
Ye Q, Guo NL. Inferencing Bulk Tumor and Single-Cell Multi-Omics Regulatory Networks for Discovery of Biomarkers and Therapeutic Targets. Cells 2022; 12:cells12010101. [PMID: 36611894 PMCID: PMC9818242 DOI: 10.3390/cells12010101] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/06/2022] [Revised: 12/22/2022] [Accepted: 12/24/2022] [Indexed: 12/28/2022]
Abstract
There are insufficient accurate biomarkers and effective therapeutic targets in current cancer treatment. Multi-omics regulatory networks in patient bulk tumors and single cells can shed light on molecular disease mechanisms. Integration of multi-omics data with large-scale patient electronic medical records (EMRs) can lead to the discovery of biomarkers and therapeutic targets. In this review, multi-omics data harmonization methods were introduced, and common approaches to molecular network inference were summarized. Our Prediction Logic Boolean Implication Networks (PLBINs) have advantages over other methods in constructing genome-scale multi-omics networks in bulk tumors and single cells in terms of computational efficiency, scalability, and accuracy. Based on the constructed multi-modal regulatory networks, graph theory network centrality metrics can be used in the prioritization of candidates for discovering biomarkers and therapeutic targets. Our approach to integrating multi-omics profiles in a patient cohort with large-scale patient EMRs such as the SEER-Medicare cancer registry combined with extensive external validation can identify potential biomarkers applicable in large patient populations. These methodologies form a conceptually innovative framework to analyze various available information from research laboratories and healthcare systems, accelerating the discovery of biomarkers and therapeutic targets to ultimately improve cancer patient survival outcomes.
Affiliation(s)
- Qing Ye
  - West Virginia University Cancer Institute, Morgantown, WV 26506, USA
  - Lane Department of Computer Science and Electrical Engineering, West Virginia University, Morgantown, WV 26506, USA
- Nancy Lan Guo
  - West Virginia University Cancer Institute, Morgantown, WV 26506, USA
  - Department of Occupational and Environmental Health Sciences, School of Public Health, West Virginia University, Morgantown, WV 26506, USA
  - Correspondence: Tel.: +1-304-293-6455
4
Computational drug repurposing based on electronic health records: a scoping review. NPJ Digit Med 2022; 5:77. [PMID: 35701544 PMCID: PMC9198008 DOI: 10.1038/s41746-022-00617-6] [Citation(s) in RCA: 8] [Impact Index Per Article: 4.0] [Received: 10/12/2021] [Accepted: 05/19/2022] [Indexed: 11/30/2022]
Abstract
Computational drug repurposing methods adapt artificial intelligence (AI) algorithms for the discovery of new applications of approved or investigational drugs. Among the heterogeneous datasets, electronic health record (EHR) datasets provide rich longitudinal and pathophysiological data that facilitate the generation and validation of drug repurposing. Here, we present an appraisal of recently published research on computational drug repurposing utilizing the EHR. Thirty-three research articles, retrieved from Embase, Medline, Scopus, and Web of Science between January 2000 and January 2022, were included in the final review. Four themes were presented: (1) publication venue, (2) data types and sources, (3) method for data processing and prediction, and (4) targeted disease, validation, and released tools. The review summarized the contribution of EHRs used in drug repurposing and revealed that their utilization is hindered by the validation, accessibility, and understanding of EHRs. These findings can support researchers in the utilization of medical data resources and the development of computational methods for drug repurposing.
5
Lee S, Jeon S, Kim HS. A Study on Methodologies of Drug Repositioning Using Biomedical Big Data: A Focus on Diabetes Mellitus. Endocrinol Metab (Seoul) 2022; 37:195-207. [PMID: 35413782 PMCID: PMC9081315 DOI: 10.3803/enm.2022.1404] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/11/2022] [Accepted: 03/21/2022] [Indexed: 11/11/2022]
Abstract
Drug repositioning is a strategy for identifying new applications of an existing drug that has been previously proven to be safe. Based on several examples of drug repositioning, we aimed to determine the methodologies and relevant steps associated with drug repositioning that should be pursued in the future. Reports on drug repositioning, retrieved from PubMed from January 2011 to December 2020, were classified based on an analysis of the methodology and reviewed by experts. Among various drug repositioning methods, the network-based approach was the most common (38.0%, 186/490 cases), followed by machine learning/deep learning-based (34.3%, 168/490 cases), text mining-based (7.1%, 35/490 cases), semantic-based (5.3%, 26/490 cases), and others (15.3%, 75/490 cases). Although drug repositioning offers several advantages, its implementation is curtailed by the need for prior, conclusive clinical proof. This approach requires the construction of various databases, and a deep understanding of the process underlying repositioning is quintessential. An in-depth understanding of drug repositioning could reduce the time, cost, and risks inherent to early drug development, providing reliable scientific evidence. Furthermore, regarding patient safety, drug repurposing might allow the discovery of new relationships between drugs and diseases.
Affiliation(s)
- Suehyun Lee
  - Department of Biomedical Informatics, Konyang University College of Medicine, Daejeon, Korea
  - Health Care Data Science Center, Konyang University Hospital, Daejeon, Korea
- Seongwoo Jeon
  - Health Care Data Science Center, Konyang University Hospital, Daejeon, Korea
- Hun-Sung Kim
  - Department of Medical Informatics, College of Medicine, The Catholic University of Korea, Seoul, Korea
  - Division of Endocrinology and Metabolism, Department of Internal Medicine, Seoul St. Mary’s Hospital, College of Medicine, The Catholic University of Korea, Seoul, Korea
  - Corresponding author: Hun-Sung Kim, Department of Medical Informatics, College of Medicine, The Catholic University of Korea, 222 Banpo-daero, Seocho-gu, Seoul 06591, Korea. Tel: +82-2-2258-8262, Fax: +82-2-2258-8297
6
Ouchi D, Giner-Soriano M, Gómez-Lumbreras A, Vedia Urgell C, Torres F, Morros R. SMOOTH algorithm: An automatic method to estimate the most likely drug combination in electronic health records. Development and validation study. (Preprint). JMIR Med Inform 2022; 10:e37976. [DOI: 10.2196/37976] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/14/2022] [Revised: 09/19/2022] [Accepted: 10/13/2022] [Indexed: 11/07/2022]
7
Teneralli RE, Kern DM, Cepeda MS, Gilbert JP, Drevets WC. Exploring real-world evidence to uncover unknown drug benefits and support the discovery of new treatment targets for depressive and bipolar disorders. J Affect Disord 2021; 290:324-333. [PMID: 34020207 DOI: 10.1016/j.jad.2021.04.096] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 10/15/2020] [Revised: 02/19/2021] [Accepted: 04/25/2021] [Indexed: 12/28/2022]
Abstract
BACKGROUND Major depressive and bipolar disorders are associated with impaired quality of life and high economic burden. Although progress has been made in our understanding of the underlying pathophysiology and the development of novel pharmacological treatments, a large unmet need remains for finding effective treatment options. The purpose of this study was to identify potential new mechanisms of action or treatment targets that could inform future research and development opportunities for major depressive and bipolar disorders. METHODS A self-controlled cohort study was conducted to examine associations between 1933 medications and incidence of major depressive and bipolar disorders across four US insurance claims databases. The presence of incident depressive or bipolar disorders was captured for each patient prior to or after drug exposure, and incidence rate ratios were calculated. Medications that demonstrated ≥50% reduction in risk for both depressive and bipolar disorders within two or more databases were evaluated as potential treatment targets. RESULTS Eight medications met our inclusion criteria, which fell into three treatment groups: drugs used in substance use disorders; drugs that affect the cholinergic system; and drugs used for the management of cardiovascular-related conditions. LIMITATIONS This study was not designed to confirm a causal association or to inform current clinical practice. Instead, this research and the methods employed were intended to be hypothesis generating and to help uncover potential treatment pathways that could warrant further investigation. CONCLUSIONS Several potential drug targets that could aid further research and discovery into novel treatments for depressive and bipolar disorders were identified.
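The screening rule in the Methods above (an incidence rate ratio showing at least a 50% reduction in two or more databases) can be illustrated with a minimal Python sketch. It is not the authors' implementation: the `ExposureCounts` record, the function names, the database labels, and all counts are hypothetical, and the sketch handles a single outcome, whereas the study required the criterion to hold for both depressive and bipolar disorders.

```python
from dataclasses import dataclass


@dataclass
class ExposureCounts:
    """Hypothetical per-database counts for one drug and one outcome."""
    events_before: int          # incident events in the pre-exposure window
    person_years_before: float  # follow-up time before exposure
    events_after: int           # incident events in the post-exposure window
    person_years_after: float   # follow-up time after exposure


def incidence_rate_ratio(c: ExposureCounts) -> float:
    """Post-exposure incidence rate divided by pre-exposure incidence rate."""
    if c.events_before == 0 or c.person_years_before <= 0 or c.person_years_after <= 0:
        raise ValueError("rate ratio is undefined for these counts")
    rate_before = c.events_before / c.person_years_before
    rate_after = c.events_after / c.person_years_after
    return rate_after / rate_before


def meets_screen(per_database: dict, max_irr: float = 0.5, min_databases: int = 2) -> bool:
    """True if the rate ratio is at or below max_irr in at least min_databases databases."""
    hits = sum(1 for c in per_database.values() if incidence_rate_ratio(c) <= max_irr)
    return hits >= min_databases


# Made-up counts for illustration only.
example = {
    "db_a": ExposureCounts(40, 1000.0, 15, 1000.0),  # IRR 0.375
    "db_b": ExposureCounts(30, 800.0, 14, 800.0),    # IRR about 0.467
    "db_c": ExposureCounts(20, 500.0, 18, 500.0),    # IRR 0.9
}
print(meets_screen(example))
```

A rate ratio of 0.5 or lower corresponds to the "≥50% reduction" wording; a real analysis would also need confidence intervals and calibration, which this sketch omits.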
Affiliation(s)
- Rachel E Teneralli
  - Janssen Research & Development, LLC., Epidemiology, Titusville, NJ, USA
- David M Kern
  - Janssen Research & Development, LLC., Epidemiology, Titusville, NJ, USA
- M Soledad Cepeda
  - Janssen Research & Development, LLC., Epidemiology, Titusville, NJ, USA
- James P Gilbert
  - Janssen Research & Development, LLC., Observational Health and Data Analytics, Raritan, NJ, USA
- Wayne C Drevets
  - Janssen Research & Development, LLC., Neuroscience, San Diego, CA, USA
8
Kern DM, Cepeda MS, Flores CM, Wittenberg GM. Application of Real-World Data and the REWARD Framework to Detect Unknown Benefits of Memantine and Identify Potential Disease Targets for New NMDA Receptor Antagonists. CNS Drugs 2021; 35:243-251. [PMID: 33537916 PMCID: PMC7907035 DOI: 10.1007/s40263-020-00789-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 12/27/2020] [Indexed: 11/30/2022]
Abstract
BACKGROUND Observational data may inform novel drug development programs by identifying previously unappreciated, clinical benefits of existing drugs. Several preclinical and clinical studies have suggested emergent therapeutic utility of drugs acting on the N-methyl-D-aspartate (NMDA) receptor, a subtype of glutamate receptors, including the antidementia drug memantine. METHODS Using a self-controlled cohort study design, the association of exposure to the NMDA receptor antagonist memantine with the incidence of all observed disease outcomes in four US administrative claims databases, spanning from January 2000 through January 2019, was assessed. The databases used in this study were the IBM MarketScan® Commercial Database (CCAE), the IBM MarketScan® Multi-State Medicaid Database (MDCD), the IBM MarketScan® Medicare Supplemental Database (MDCR), and the Optum© De-Identified Clinformatics® Data Mart Database. Outcomes were defined according to the unique Systematized Nomenclature of Medicine-Clinical Terms (SNOMED CT) classification system codes and required a diagnosis on two or more distinct dates. Of 20,953 outcomes assessed, only those for which memantine was associated with a ≥ 50% reduction in risk in two or more databases were included. A meta-analysis with random effects was used to pool data across the databases. RESULTS Overall, 312,336 patients were exposed to memantine during the study. After removing conditions related to dementia and memory loss, 60 outcomes met the threshold criteria. Results fell into five disease categories: mental disorders, substance use disorders, pain, gastrointestinal and colon disorders, and demyelinating disease. The bulk of findings fell into the first two groups, with 28 outcomes related to mental disorders and 24 related to substance use disorders. CONCLUSION The present results confirm that NMDA receptor antagonism may have broader therapeutic utility than previously recognized. 
Further observational and clinical research may be warranted to explore the therapeutic benefit of NMDA antagonists for the outcomes found in this study.
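The Methods state that a random-effects meta-analysis pooled results across the four databases, without naming the estimator. The sketch below assumes the DerSimonian-Laird method, a common choice, applied to log rate ratios; the function name and the input values are hypothetical and are not taken from the study.

```python
import math


def pool_random_effects(log_ratios, variances):
    """DerSimonian-Laird random-effects pooling (assumed estimator).

    log_ratios: per-database log rate ratios
    variances:  their within-database variances
    Returns (pooled log ratio, standard error, between-database variance tau^2).
    """
    k = len(log_ratios)
    if k == 0 or k != len(variances):
        raise ValueError("need matching, non-empty inputs")

    # Fixed-effect (inverse-variance) estimate, needed for Cochran's Q.
    w = [1.0 / v for v in variances]
    sum_w = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, log_ratios)) / sum_w

    # Cochran's Q and the method-of-moments estimate of tau^2.
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, log_ratios))
    c = sum_w - sum(wi ** 2 for wi in w) / sum_w
    tau2 = max(0.0, (q - (k - 1)) / c) if c > 0 else 0.0

    # Re-weight with tau^2 added to each within-database variance.
    w_star = [1.0 / (v + tau2) for v in variances]
    pooled = sum(wi * yi for wi, yi in zip(w_star, log_ratios)) / sum(w_star)
    se = math.sqrt(1.0 / sum(w_star))
    return pooled, se, tau2


# Made-up rate ratios and variances for four databases.
ratios = [0.42, 0.38, 0.55, 0.47]
pooled, se, tau2 = pool_random_effects([math.log(r) for r in ratios],
                                       [0.010, 0.020, 0.015, 0.030])
low, high = math.exp(pooled - 1.96 * se), math.exp(pooled + 1.96 * se)
print(f"pooled ratio {math.exp(pooled):.2f} (95% CI {low:.2f} to {high:.2f})")
```

Pooling on the log scale keeps the ratio estimates symmetric; the result is exponentiated only for reporting.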
Affiliation(s)
- David M. Kern
  - Janssen Research and Development, 1125 Trenton Harbourton Rd, Titusville, NJ 08560 USA
- M. Soledad Cepeda
  - Janssen Research and Development, 1125 Trenton Harbourton Rd, Titusville, NJ 08560 USA
- Christopher M. Flores
  - Janssen Research and Development, 1125 Trenton Harbourton Rd, Titusville, NJ 08560 USA
- Gayle M. Wittenberg
  - Janssen Research and Development, 1125 Trenton Harbourton Rd, Titusville, NJ 08560 USA
9
Zhang Y, Tayarani M, Al’Aref SJ, Beecy AN, Liu Y, Sholle E, RoyChoudhury A, Axsom KM, Gao HO, Pathak J, Ancker JS. Using electronic health records for population health sciences: a case study to evaluate the associations between changes in left ventricular ejection fraction and the built environment. JAMIA Open 2020; 3:386-394. [PMID: 33215073 PMCID: PMC7660965 DOI: 10.1093/jamiaopen/ooaa038] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 03/12/2020] [Revised: 07/16/2020] [Accepted: 08/20/2020] [Indexed: 11/18/2022]
Abstract
OBJECTIVE Electronic health record (EHR) data linked with address-based metrics using geographic information systems (GIS) are emerging data sources in population health studies. This study examined this approach through a case study on the associations between changes in ejection fraction (EF) and the built environment among heart failure (HF) patients. MATERIALS AND METHODS We identified 1287 HF patients with at least 2 left ventricular EF measurements that are minimally 1 year apart. EHR data were obtained at an academic medical center in New York for patients who visited between 2012 and 2017. Longitudinal clinical information was linked with address-based built environment metrics related to transportation, air quality, land use, and accessibility by GIS. The primary outcome is the increase in the severity of EF categories. Statistical analyses were performed using mixed-effects models, including a subgroup analysis of patients who initially had normal EF measurements. RESULTS Previously reported effects from the built environment among HF patients were identified. Increased daily nitrogen dioxide concentration was associated with the outcome while controlling for known HF risk factors including sex, comorbidities, and medication usage. In the subgroup analysis, the outcome was significantly associated with decreased distance to subway stops and increased distance to parks. CONCLUSIONS Population health studies using EHR data may drive efficient hypothesis generation and enable novel information technology-based interventions. The availability of more precise outcome measurements and home locations, and frequent collection of individual-level social determinants of health may further drive the use of EHR data in population health studies.
Affiliation(s)
- Yiye Zhang
  - Department of Population Health Sciences, Weill Cornell Medicine, Cornell University, New York City, New York, USA
  - Department of Emergency Medicine, Weill Cornell Medicine, Cornell University, New York City, New York, USA
- Mohammad Tayarani
  - School of Civil and Environmental Engineering, Cornell University, Ithaca, New York, USA
- Subhi J Al’Aref
  - Division of Cardiology, Department of Medicine, University of Arkansas for Medical Sciences, Little Rock, Arkansas, USA
- Ashley N Beecy
  - Division of Cardiology, Department of Medicine, Weill Cornell Medicine, New York, New York, USA
- Yifan Liu
  - Department of Population Health Sciences, Weill Cornell Medicine, Cornell University, New York City, New York, USA
- Evan Sholle
  - Department of Population Health Sciences, Weill Cornell Medicine, Cornell University, New York City, New York, USA
- Arindam RoyChoudhury
  - Department of Population Health Sciences, Weill Cornell Medicine, Cornell University, New York City, New York, USA
- Kelly M Axsom
  - Columbia University Irving Medical Center, New York, New York, USA
- Huaizhu Oliver Gao
  - School of Civil and Environmental Engineering, Cornell University, Ithaca, New York, USA
- Jyotishman Pathak
  - Department of Population Health Sciences, Weill Cornell Medicine, Cornell University, New York City, New York, USA
- Jessica S Ancker
  - Department of Population Health Sciences, Weill Cornell Medicine, Cornell University, New York City, New York, USA
10
Kim DH, Lee JE, Kim YG, Lee Y, Seo DW, Lee KH, Lee JH, Kim WS, Kim YH, Oh JS. High-Throughput Algorithm for Discovering New Drug Indications by Utilizing Large-Scale Electronic Medical Record Data. Clin Pharmacol Ther 2020; 108:1299-1307. [PMID: 32621536 DOI: 10.1002/cpt.1980] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Received: 04/27/2020] [Accepted: 06/24/2020] [Indexed: 12/28/2022]
Abstract
Drug repositioning is an effective way to mitigate the production problem in the pharmaceutical industry. Electronic medical record (EMR) databases harbor a large amount of data on drug prescriptions and laboratory test results and may thus be useful for finding new indications for existing drugs. Here, we present a novel high-throughput data-driven algorithm that identifies and prioritizes drug candidates that show significant effects on specific clinical indicators by utilizing large-scale EMR data. We chose four laboratory tests as clinical indicators: hemoglobin A1c (HbA1c), low-density lipoprotein (LDL) cholesterol, triglycerides (TGs), and high-density lipoprotein (HDL) cholesterol. From a 5-year EMR database, we generated datasets consisting of paired data with averaged measurement values during the on-drug and off-drug periods for each drug in each patient, adjusted for co-administered drug effects at each timepoint, and applied a one-sample t-test with the Bonferroni correction for statistical analysis. Among 1,774 drugs, 45 were associated with increases in HDL cholesterol, and 41, 146, and 65 were associated with reductions in HbA1c, LDL cholesterol, and TGs, respectively. We compared the list of candidate drugs with that of drugs indicated for relevant clinical conditions and found that the algorithm had high values for both sensitivity (range 0.95-1.00) and negative predictive value (range 0.95-1.00). Our algorithm was able to rediscover well-known drugs that are used for diabetes and dyslipidemia while revealing potential candidates that have no current indications but have shown promising results in the literature. Our algorithm may facilitate the repositioning of drugs with proven safety profiles.
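The sensitivity and negative predictive value reported above compare the drugs the algorithm flags against the drugs already indicated for the relevant condition. A minimal Python sketch of that comparison follows; the function name and the drug labels are hypothetical placeholders, and the abstract does not describe how the authors computed these figures in detail.

```python
def sensitivity_and_npv(flagged: set, indicated: set, all_drugs: set):
    """Compare flagged candidates with drugs already indicated for the condition.

    flagged:   drugs the algorithm marked as having a significant effect
    indicated: drugs with an existing indication (treated as true positives)
    all_drugs: every drug that was screened
    """
    true_pos = len(flagged & indicated)
    false_neg = len(indicated - flagged)            # indicated but not flagged
    true_neg = len(all_drugs - flagged - indicated) # neither flagged nor indicated
    if true_pos + false_neg == 0 or true_neg + false_neg == 0:
        raise ValueError("metric is undefined for these sets")
    sensitivity = true_pos / (true_pos + false_neg)
    npv = true_neg / (true_neg + false_neg)
    return sensitivity, npv


# Placeholder labels, not real drug names or study data.
screened = {"drug_a", "drug_b", "drug_c", "drug_d", "drug_e", "drug_f"}
flagged = {"drug_a", "drug_b", "drug_c"}
indicated = {"drug_a", "drug_b", "drug_d"}
print(sensitivity_and_npv(flagged, indicated, screened))
```

Flagged drugs without an existing indication (here `drug_c`) count as false positives under this comparison, yet they are exactly the repositioning candidates of interest, which may be why the abstract reports sensitivity and negative predictive value rather than precision.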
Affiliation(s)
- Do-Hoon Kim
  - Department of Information Medicine, Asan Medical Center, Seoul, Republic of Korea
- Jung-Eun Lee
  - Asan Institute for Life Sciences, Asan Medical Center, Seoul, Republic of Korea
- Yong-Gil Kim
  - Division of Rheumatology, Department of Internal Medicine, Asan Medical Center, University of Ulsan College of Medicine, Seoul, Republic of Korea
- Yura Lee
  - Department of Information Medicine, Asan Medical Center, Seoul, Republic of Korea
- Dong-Woo Seo
  - Department of Information Medicine, Asan Medical Center, Seoul, Republic of Korea
  - Department of Emergency Medicine, Asan Medical Center, University of Ulsan College of Medicine, Seoul, Republic of Korea
- Kye Hwa Lee
  - Department of Information Medicine, Asan Medical Center, Seoul, Republic of Korea
- Jae-Ho Lee
  - Department of Information Medicine, Asan Medical Center, Seoul, Republic of Korea
  - Department of Emergency Medicine, Asan Medical Center, University of Ulsan College of Medicine, Seoul, Republic of Korea
- Woo Sung Kim
  - Department of Information Medicine, Asan Medical Center, Seoul, Republic of Korea
  - Department of Pulmonary and Critical Care Medicine, University of Ulsan College of Medicine, Seoul, Republic of Korea
- Young-Hak Kim
  - Department of Information Medicine, Asan Medical Center, Seoul, Republic of Korea
  - Health Innovation Big Data Center, Asan Institute for Life Science, Asan Medical Center, Seoul, Republic of Korea
  - Department of Cardiology, Asan Medical Center, University of Ulsan College of Medicine, Seoul, Republic of Korea
- Ji Seon Oh
  - Department of Information Medicine, Asan Medical Center, Seoul, Republic of Korea
  - Health Innovation Big Data Center, Asan Institute for Life Science, Asan Medical Center, Seoul, Republic of Korea
11
Banerjee I, Sofela M, Yang J, Chen JH, Shah NH, Ball R, Mushlin AI, Desai M, Bledsoe J, Amrhein T, Rubin DL, Zamanian R, Lungren MP. Development and Performance of the Pulmonary Embolism Result Forecast Model (PERFORM) for Computed Tomography Clinical Decision Support. JAMA Netw Open 2019; 2:e198719. [PMID: 31390040 PMCID: PMC6686780 DOI: 10.1001/jamanetworkopen.2019.8719] [Citation(s) in RCA: 31] [Impact Index Per Article: 6.2] [Indexed: 12/19/2022]
Abstract
IMPORTANCE Pulmonary embolism (PE) is a life-threatening clinical problem, and computed tomographic imaging is the standard for diagnosis. Clinical decision support rules based on PE risk-scoring models have been developed to compute pretest probability but are underused and tend to underperform in practice, leading to persistent overuse of CT imaging for PE. OBJECTIVE To develop a machine learning model to generate a patient-specific risk score for PE by analyzing longitudinal clinical data as clinical decision support for patients referred for CT imaging for PE. DESIGN, SETTING, AND PARTICIPANTS In this diagnostic study, the proposed workflow for the machine learning model, the Pulmonary Embolism Result Forecast Model (PERFORM), transforms raw electronic medical record (EMR) data into temporal feature vectors and develops a decision analytical model targeted toward adult patients referred for CT imaging for PE. The model was tested on holdout patient EMR data from 2 large, academic medical practices. A total of 3397 annotated CT imaging examinations for PE from 3214 unique patients seen at Stanford University hospitals and clinics were used for training and validation. The models were externally validated on 240 unique patients seen at Duke University Medical Center. The comparison with clinical scoring systems was done on randomly selected 100 outpatient samples from Stanford University hospitals and clinics and 101 outpatient samples from Duke University Medical Center. MAIN OUTCOMES AND MEASURES Prediction performance of diagnosing acute PE was evaluated using ElasticNet, artificial neural networks, and other machine learning approaches on holdout data sets from both institutions, and performance of models was measured by area under the receiver operating characteristic curve (AUROC). RESULTS Of the 3214 patients included in the study, 1704 (53.0%) were women from Stanford University hospitals and clinics; mean (SD) age was 60.53 (19.43) years. 
The 240 patients from Duke University Medical Center used for validation included 132 women (55.0%); mean (SD) age was 70.2 (14.2) years. In the samples for clinical scoring system comparisons, the 100 outpatients from Stanford University hospitals and clinics included 67 women (67.0%); mean (SD) age was 57.74 (19.87) years, and the 101 patients from Duke University Medical Center included 59 women (58.4%); mean (SD) age was 73.06 (15.3) years. The best-performing model achieved an AUROC performance of predicting a positive PE study of 0.90 (95% CI, 0.87-0.91) on intrainstitutional holdout data with an AUROC of 0.71 (95% CI, 0.69-0.72) on an external data set from Duke University Medical Center; superior AUROC performance and cross-institutional generalization of the model of 0.81 (95% CI, 0.77-0.87) and 0.81 (95% CI, 0.73-0.82), respectively, were noted on holdout outpatient populations from both intrainstitutional and extrainstitutional data. CONCLUSIONS AND RELEVANCE The machine learning model, PERFORM, may consider multitudes of applicable patient-specific risk factors and dependencies to arrive at a PE risk prediction that generalizes to new population distributions. This approach might be used as an automated clinical decision-support tool for patients referred for CT PE imaging to improve CT use.
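Model performance above is reported as area under the receiver operating characteristic curve (AUROC). As a reminder of what that number measures, the sketch below computes AUROC as the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case, with ties counted as one half. The function name, labels, and scores are hypothetical and unrelated to the PERFORM data; the quadratic loop is chosen for clarity, not efficiency.

```python
def auroc(labels, scores):
    """AUROC via pairwise comparison of positive and negative scores.

    labels: 1 for a positive study, 0 for a negative study
    scores: model risk scores, higher meaning more likely positive
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    # Count pairs where the positive outranks the negative; ties count half.
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Made-up labels and scores for illustration.
example_labels = [1, 1, 0, 0, 1, 0]
example_scores = [0.9, 0.8, 0.7, 0.3, 0.4, 0.2]
print(round(auroc(example_labels, example_scores), 3))
```

An AUROC of 0.5 corresponds to ranking no better than chance, and 1.0 to every positive case outranking every negative case.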
Affiliation(s)
- Imon Banerjee
  - Department of Biomedical Data Science, Stanford University, Stanford, California
  - Department of Radiology, Stanford University, Stanford, California
- Miji Sofela
  - Duke University Health System, Duke University School of Medicine, Durham, North Carolina
- Jaden Yang
  - Quantitative Science Unit, Stanford University, Stanford, California
- Jonathan H. Chen
  - Department of Medicine (Biomedical Informatics), Stanford University, Stanford, California
- Nigam H. Shah
  - Department of Medicine (Biomedical Informatics), Stanford University, Stanford, California
- Robyn Ball
  - Quantitative Science Unit, Stanford University, Stanford, California
- Alvin I. Mushlin
  - Department of Medicine, Weill Cornell Medical College, Cornell University, Ithaca, New York
- Manisha Desai
  - Quantitative Science Unit, Stanford University, Stanford, California
- Joseph Bledsoe
  - Department of Emergency Medicine, Intermountain Medical Center, Salt Lake City, Utah
- Timothy Amrhein
  - Department of Radiology, Duke University School of Medicine, Durham, North Carolina
- Daniel L. Rubin
  - Department of Biomedical Data Science, Stanford University, Stanford, California
  - Department of Radiology, Stanford University, Stanford, California
- Roham Zamanian
  - Department of Medicine, Med/Pulmonary, and Critical Care Medicine, Stanford University, Stanford, California
12
|
Ru B, Li D, Hu Y, Yao L. Serendipity-A Machine-Learning Application for Mining Serendipitous Drug Usage From Social Media. IEEE Trans Nanobioscience 2019; 18:324-334. [PMID: 30951476 PMCID: PMC6650153 DOI: 10.1109/tnb.2019.2909094] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.2] [Indexed: 11/09/2022]
Abstract
Serendipitous drug usage refers to the unexpected relief of comorbid diseases or symptoms when taking medication for a different known indication. Historically, serendipity has contributed significantly to identifying many new drug indications. If patient-reported serendipitous drug usage in social media could be computationally identified, it could help generate and validate drug-repositioning hypotheses. We investigated deep neural network models for mining serendipitous drug usage from social media. We used the word2vec algorithm to construct word-embedding features from drug reviews posted in a WebMD patient forum. We adapted and redesigned the convolutional neural network, long short-term memory network, and convolutional long short-term memory network by adding contextual information extracted from drug-review posts, information-filtering tools, medical ontology, and medical knowledge. We trained, tuned, and evaluated our models with a gold-standard dataset of 15714 sentences (447 [2.8%] describing serendipitous drug usage). Additionally, we compared our deep neural networks to support vector machine, random forest, and AdaBoost.M1 algorithms. Context information helped to reduce the false-positive rate of deep neural network models. If we used an extremely imbalanced dataset with limited instances of serendipitous drug usage, deep neural network models did not outperform other machine-learning models with n-gram and context features. However, deep neural network models could more effectively use word embedding in feature construction, an advantage that makes them worthy of further investigation. Finally, we implemented natural-language processing and machine-learning methods in a web-based application to help scientists and software developers mine social media for serendipitous drug usage.
|
13
|
Glicksberg BS, Johnson KW, Dudley JT. The next generation of precision medicine: observational studies, electronic health records, biobanks and continuous monitoring. Hum Mol Genet 2019; 27:R56-R62. [PMID: 29659828 DOI: 10.1093/hmg/ddy114] [Citation(s) in RCA: 37] [Impact Index Per Article: 7.4] [Received: 03/13/2018] [Accepted: 03/27/2018] [Indexed: 02/06/2023] Open
Abstract
Precision medicine can utilize new techniques in order to more effectively translate research findings into clinical practice. In this article, we first explore the limitations of traditional study designs, which stem from (to name a few): massive cost for the assembly of large patient cohorts; non-representative patient data; and the astounding complexity of human biology. Second, we propose that harnessing electronic health records and mobile device biometrics coupled to longitudinal data may prove to be a solution to many of these problems by capturing a 'real world' phenotype. We envision that future biomedical research utilizing more precise approaches to patient care will utilize continuous and longitudinal data sources.
Affiliation(s)
- Benjamin S Glicksberg
  - Institute for Next Generation Healthcare, Department of Genetics and Genomic Sciences, Icahn Institute for Genomics and Multiscale Biology, Icahn School of Medicine at Mount Sinai, Mount Sinai Health System, New York City, NY 10029, USA
  - Institute for Computational Health Sciences, University of California San Francisco, San Francisco, CA 94158, USA
- Kipp W Johnson
  - Institute for Next Generation Healthcare, Department of Genetics and Genomic Sciences, Icahn Institute for Genomics and Multiscale Biology, Icahn School of Medicine at Mount Sinai, Mount Sinai Health System, New York City, NY 10029, USA
- Joel T Dudley
  - Institute for Next Generation Healthcare, Department of Genetics and Genomic Sciences, Icahn Institute for Genomics and Multiscale Biology, Icahn School of Medicine at Mount Sinai, Mount Sinai Health System, New York City, NY 10029, USA
|
14
|
Romano JD, Tatonetti NP. Informatics and Computational Methods in Natural Product Drug Discovery: A Review and Perspectives. Front Genet 2019; 10:368. [PMID: 31114606 PMCID: PMC6503039 DOI: 10.3389/fgene.2019.00368] [Citation(s) in RCA: 59] [Impact Index Per Article: 11.8] [Received: 12/12/2018] [Accepted: 04/05/2019] [Indexed: 12/17/2022] Open
Abstract
The discovery of new pharmaceutical drugs is one of the preeminent tasks (scientifically, economically, and socially) in biomedical research. Advances in informatics and computational biology have increased productivity at many stages of the drug discovery pipeline. Nevertheless, drug discovery has slowed, largely due to the reliance on small molecules as the primary source of novel hypotheses. Natural products (such as plant metabolites, animal toxins, and immunological components) comprise a vast and diverse source of bioactive compounds, some of which are supported by thousands of years of traditional medicine, and are largely disjoint from the set of small molecules used commonly for discovery. However, natural products possess unique characteristics that distinguish them from traditional small molecule drug candidates, requiring new methods and approaches for assessing their therapeutic potential. In this review, we investigate a number of state-of-the-art techniques in bioinformatics, cheminformatics, and knowledge engineering for data-driven drug discovery from natural products. We focus on methods that aim to bridge the gap between traditional small-molecule drug candidates and different classes of natural products. We also explore the current informatics knowledge gaps and other barriers that need to be overcome to fully leverage these compounds for drug discovery. Finally, we conclude with a "road map" of research priorities that seeks to realize this goal.
Affiliation(s)
- Joseph D. Romano
  - Department of Biomedical Informatics, Columbia University, New York, NY, United States
  - Department of Systems Biology, Columbia University, New York, NY, United States
  - Department of Medicine, Columbia University, New York, NY, United States
  - Data Science Institute, Columbia University, New York, NY, United States
- Nicholas P. Tatonetti
  - Department of Biomedical Informatics, Columbia University, New York, NY, United States
  - Department of Systems Biology, Columbia University, New York, NY, United States
  - Department of Medicine, Columbia University, New York, NY, United States
  - Data Science Institute, Columbia University, New York, NY, United States
|
15
|
Abstract
The surge of public disease and drug-related data availability has facilitated the application of computational methodologies to transform drug discovery. In the current chapter, we outline and detail the various resources and tools one can leverage in order to perform such analyses. We further describe in depth the in silico workflows of two recent studies that have identified possible novel indications of existing drugs. Lastly, we delve into the caveats and considerations of this process to enable other researchers to perform rigorous computational drug discovery experiments of their own.
|
16
|
Brown N, Cambruzzi J, Cox PJ, Davies M, Dunbar J, Plumbley D, Sellwood MA, Sim A, Williams-Jones BI, Zwierzyna M, Sheppard DW. Big Data in Drug Discovery. Progress in Medicinal Chemistry 2018; 57:277-356. [PMID: 29680150 DOI: 10.1016/bs.pmch.2017.12.003] [Citation(s) in RCA: 24] [Impact Index Per Article: 4.0] [Indexed: 12/02/2022]
Abstract
Interpretation of Big Data in the drug discovery community should enhance project timelines and reduce clinical attrition through improved early decision making. The issues we encounter start with the sheer volume of data and how we first ingest it before building an infrastructure to house it to make use of the data in an efficient and productive way. There are many problems associated with the data itself including general reproducibility, but often, it is the context surrounding an experiment that is critical to success. Help, in the form of artificial intelligence (AI), is required to understand and translate the context. On the back of natural language processing pipelines, AI is also used to prospectively generate new hypotheses by linking data together. We explain Big Data from the context of biology, chemistry and clinical trials, showcasing some of the impressive public domain sources and initiatives now available for interrogation.
Affiliation(s)
- Aaron Sim
  - BenevolentAI, London, United Kingdom
- Magdalena Zwierzyna
  - BenevolentAI, London, United Kingdom
  - Institute of Cardiovascular Science, University College London, London, United Kingdom
|
17
|
Low YS, Daugherty AC, Schroeder EA, Chen W, Seto T, Weber S, Lim M, Hastie T, Mathur M, Desai M, Farrington C, Radin AA, Sirota M, Kenkare P, Thompson CA, Yu PP, Gomez SL, Sledge GW, Kurian AW, Shah NH. Synergistic drug combinations from electronic health records and gene expression. J Am Med Inform Assoc 2017; 24:565-576. [PMID: 27940607 PMCID: PMC6080645 DOI: 10.1093/jamia/ocw161] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.9] [Indexed: 12/12/2022] Open
Abstract
Objective Using electronic health records (EHRs) and biomolecular data, we sought to discover drug pairs with synergistic repurposing potential. EHRs provide real-world treatment and outcome patterns, while complementary biomolecular data, including disease-specific gene expression and drug-protein interactions, provide mechanistic understanding. Method We applied Group Lasso INTERaction NETwork (glinternet), an overlap group lasso penalty on a logistic regression model, with pairwise interactions to identify variables and interacting drug pairs associated with reduced 5-year mortality using EHRs of 9945 breast cancer patients. We identified differentially expressed genes from 14 case-control human breast cancer gene expression datasets and integrated them with drug-protein networks. Drugs in the network were scored according to their association with breast cancer individually or in pairs. Lastly, we determined whether synergistic drug pairs found in the EHRs were enriched among synergistic drug pairs from gene-expression data using a method similar to gene set enrichment analysis. Results From EHRs, we discovered 3 drug-class pairs associated with lower mortality: anti-inflammatories and hormone antagonists, anti-inflammatories and lipid modifiers, and lipid modifiers and obstructive airway drugs. The first 2 pairs were also enriched among pairs discovered using gene expression data and are supported by molecular interactions in drug-protein networks and preclinical and epidemiologic evidence. Conclusions This is a proof-of-concept study demonstrating that a combination of complementary data sources, such as EHRs and gene expression, can corroborate discoveries and provide mechanistic insight into drug synergism for repurposing.
Affiliation(s)
- Yen S Low
  - Stanford Center for Biomedical Informatics Research, Stanford University, Stanford, CA, USA
- William Chen
  - Stanford Center for Biomedical Informatics Research, Stanford University, Stanford, CA, USA
- Tina Seto
  - Clinical Informatics, Stanford University
- Michael Lim
  - Department of Statistics, Stanford University
- Trevor Hastie
  - Department of Statistics, Stanford University
  - Department of Health Research and Policy, Stanford University
- Maya Mathur
  - Quantitative Sciences Unit, Stanford University
- Pragati Kenkare
  - Palo Alto Medical Foundation Research Institute, Palo Alto, CA, USA
- Peter P Yu
  - Palo Alto Medical Foundation Research Institute, Palo Alto, CA, USA
- Scarlett L Gomez
  - Department of Health Research and Policy, Stanford University
  - Cancer Prevention Institute of California, Fremont, CA, USA
- George W Sledge
  - Division of Oncology, Department of Medicine, Stanford University
- Allison W Kurian
  - Department of Health Research and Policy, Stanford University
  - Division of Oncology, Department of Medicine, Stanford University
- Nigam H Shah
  - Stanford Center for Biomedical Informatics Research, Stanford University, Stanford, CA, USA
|
18
|
Montvida O, Arandjelović O, Reiner E, Paul SK. Data Mining Approach to Estimate the Duration of Drug Therapy from Longitudinal Electronic Medical Records. 2017. [DOI: 10.2174/1875036201709010001] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.9] [Indexed: 01/19/2023]
Abstract
Background:
Electronic Medical Records (EMRs) from primary/ambulatory care systems present a new and promising source of information for conducting clinical and translational research.
Objectives:
To address the methodological and computational challenges in order to extract reliable medication information from raw data which is often complex, incomplete and erroneous. To assess whether the use of specific chaining fields of medication information may additionally improve the data quality.
Methods:
Guided by a range of challenges associated with missing and internally inconsistent data, we introduce two methods for the robust extraction of patient-level medication data. The first method relies on chaining fields to estimate duration of treatment (“chaining”), while the second disregards chaining fields and relies on the chronology of records (“continuous”). The Centricity EMR database was used to estimate treatment duration with both methods for two widely prescribed drugs among type 2 diabetes patients: insulin and glucagon-like peptide-1 receptor agonists.
Results:
At the individual patient level, the “chaining” approach could identify treatment alterations longitudinally and produced more robust estimates of treatment duration for individual drugs, while the “continuous” method was unable to capture those dynamics. At the population level, both methods produced similar estimates of average treatment duration; however, notable differences were observed at the individual-patient level.
Conclusion:
The proposed algorithms explicitly identify and handle longitudinal erroneous or missing entries and estimate treatment duration with specific drug(s) of interest, which makes them a valuable tool for future EMR based clinical and pharmaco-epidemiological studies. To improve accuracy of real-world based studies, implementing chaining fields of medication information is recommended.
|
19
|
A flexible data-driven comorbidity feature extraction framework. Comput Biol Med 2016; 73:165-72. [DOI: 10.1016/j.compbiomed.2016.04.014] [Citation(s) in RCA: 18] [Impact Index Per Article: 2.3] [Received: 12/15/2015] [Revised: 03/26/2016] [Accepted: 04/19/2016] [Indexed: 12/11/2022]
|
20
|
Miotto R, Li L, Kidd BA, Dudley JT. Deep Patient: An Unsupervised Representation to Predict the Future of Patients from the Electronic Health Records. Sci Rep 2016; 6:26094. [PMID: 27185194 PMCID: PMC4869115 DOI: 10.1038/srep26094] [Citation(s) in RCA: 607] [Impact Index Per Article: 75.9] [Received: 01/28/2016] [Accepted: 04/27/2016] [Indexed: 12/30/2022] Open
Abstract
Secondary use of electronic health records (EHRs) promises to advance clinical research and better inform clinical decision making. Challenges in summarizing and representing patient data prevent widespread practice of predictive modeling using EHRs. Here we present a novel unsupervised deep feature learning method to derive a general-purpose patient representation from EHR data that facilitates clinical predictive modeling. In particular, a three-layer stack of denoising autoencoders was used to capture hierarchical regularities and dependencies in the aggregated EHRs of about 700,000 patients from the Mount Sinai data warehouse. The result is a representation we name “deep patient”. We evaluated this representation as broadly predictive of health states by assessing the probability of patients to develop various diseases. We performed evaluation using 76,214 test patients comprising 78 diseases from diverse clinical domains and temporal windows. Our results significantly outperformed those achieved using representations based on raw EHR data and alternative feature learning strategies. Prediction performance for severe diabetes, schizophrenia, and various cancers were among the top performing. These findings indicate that deep learning applied to EHRs can derive patient representations that offer improved clinical predictions, and could provide a machine learning framework for augmenting clinical decision systems.
Affiliation(s)
- Riccardo Miotto
  - Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, NY, USA
  - Harris Center for Precision Wellness, Icahn School of Medicine at Mount Sinai, New York, NY, USA
  - Icahn Institute for Genomics and Multiscale Biology, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Li Li
  - Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, NY, USA
  - Harris Center for Precision Wellness, Icahn School of Medicine at Mount Sinai, New York, NY, USA
  - Icahn Institute for Genomics and Multiscale Biology, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Brian A Kidd
  - Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, NY, USA
  - Harris Center for Precision Wellness, Icahn School of Medicine at Mount Sinai, New York, NY, USA
  - Icahn Institute for Genomics and Multiscale Biology, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Joel T Dudley
  - Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, NY, USA
  - Harris Center for Precision Wellness, Icahn School of Medicine at Mount Sinai, New York, NY, USA
  - Icahn Institute for Genomics and Multiscale Biology, Icahn School of Medicine at Mount Sinai, New York, NY, USA
|
21
|
Hodos RA, Kidd BA, Khader S, Readhead BP, Dudley JT. In silico methods for drug repurposing and pharmacology. Wiley Interdisciplinary Reviews. Systems Biology and Medicine 2016; 8:186-210. [PMID: 27080087 PMCID: PMC4845762 DOI: 10.1002/wsbm.1337] [Citation(s) in RCA: 168] [Impact Index Per Article: 21.0] [Received: 11/30/2015] [Revised: 02/08/2016] [Accepted: 02/11/2016] [Indexed: 12/18/2022]
Abstract
Data in the biological, chemical, and clinical domains are accumulating at ever-increasing rates and have the potential to accelerate and inform drug development in new ways. Challenges and opportunities now lie in developing analytic tools to transform these often complex and heterogeneous data into testable hypotheses and actionable insights. This is the aim of computational pharmacology, which uses in silico techniques to better understand and predict how drugs affect biological systems, which can in turn improve clinical use, avoid unwanted side effects, and guide selection and development of better treatments. One exciting application of computational pharmacology is drug repurposing: finding new uses for existing drugs. Already yielding many promising candidates, this strategy has the potential to improve the efficiency of the drug development process and reach patient populations with previously unmet needs such as those with rare diseases. While current techniques in computational pharmacology and drug repurposing often focus on just a single data modality such as gene expression or drug-target interactions, we argue that methods such as matrix factorization that can integrate data within and across diverse data types have the potential to improve predictive performance and provide a fuller picture of a drug's pharmacological action.
Affiliation(s)
- Rachel A Hodos
  - New York University and Icahn School of Medicine at Mt. Sinai, New York, NY
- Brian A Kidd
  - Icahn School of Medicine at Mt. Sinai, New York, NY
|
22
|
Miotto R, Weng C. Case-based reasoning using electronic health records efficiently identifies eligible patients for clinical trials. J Am Med Inform Assoc 2015; 22:e141-50. [PMID: 25769682 PMCID: PMC4428438 DOI: 10.1093/jamia/ocu050] [Citation(s) in RCA: 54] [Impact Index Per Article: 6.0] [Received: 09/15/2014] [Accepted: 12/16/2014] [Indexed: 11/12/2022] Open
Abstract
Objective To develop a cost-effective, case-based reasoning framework for clinical research eligibility screening by only reusing the electronic health records (EHRs) of minimal enrolled participants to represent the target patient for each trial under consideration. Materials and Methods The EHR data—specifically diagnosis, medications, laboratory results, and clinical notes—of known clinical trial participants were aggregated to profile the “target patient” for a trial, which was used to discover new eligible patients for that trial. The EHR data of unseen patients were matched to this “target patient” to determine their relevance to the trial; the higher the relevance, the more likely the patient was eligible. Relevance scores were a weighted linear combination of cosine similarities computed over individual EHR data types. For evaluation, we identified 262 participants of 13 diversified clinical trials conducted at Columbia University as our gold standard. We ran a 2-fold cross validation with half of the participants used for training and the other half used for testing along with other 30 000 patients selected at random from our clinical database. We performed binary classification and ranking experiments. Results The overall area under the ROC curve for classification was 0.95, enabling the highlight of eligible patients with good precision. Ranking showed satisfactory results especially at the top of the recommended list, with each trial having at least one eligible patient in the top five positions. Conclusions This relevance-based method can potentially be used to identify eligible patients for clinical trials by processing patient EHR data alone without parsing free-text eligibility criteria, and shows promise of efficient “case-based reasoning” modeled only on minimal trial participants.
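The abstract above describes relevance scores as a weighted linear combination of cosine similarities computed over individual EHR data types. A minimal sketch of that scoring idea follows; the data-type names, feature vectors, and weights are made-up assumptions for illustration and are not taken from the paper.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length numeric vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def relevance(patient, target, weights):
    """Weighted sum of per-data-type cosine similarities.

    patient, target: dicts mapping a data-type name to a feature vector.
    weights: dict mapping the same names to non-negative weights (hypothetical values).
    """
    return sum(w * cosine(patient[name], target[name]) for name, w in weights.items())


# Hypothetical "target patient" profile and one unseen patient.
target_profile = {"diagnoses": [1, 0], "medications": [1, 1]}
unseen_patient = {"diagnoses": [1, 0], "medications": [1, 0]}
type_weights = {"diagnoses": 0.5, "medications": 0.5}

score = relevance(unseen_patient, target_profile, type_weights)
print(round(score, 4))
```

Ranking unseen patients by this score, highest first, corresponds to the ranking experiment the abstract mentions; how the authors built the vectors and chose the weights is not stated in the abstract.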
Affiliation(s)
- Chunhua Weng
  - Department of Biomedical Informatics, The Irving Institute for Clinical and Translational Research, Columbia University, New York, NY 10032, USA
|
23
|
Ben-Zion R, Pliskin N, Fink L. Critical Success Factors for Adoption of Electronic Health Record Systems: Literature Review and Prescriptive Analysis. Information Systems Management 2014. [DOI: 10.1080/10580530.2014.958024] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.2] [Indexed: 10/24/2022]
|
24
|
Boland MR, Hripcsak G, Shen Y, Chung WK, Weng C. Defining a comprehensive verotype using electronic health records for personalized medicine. J Am Med Inform Assoc 2013; 20:e232-8. [PMID: 24001516 PMCID: PMC3861934 DOI: 10.1136/amiajnl-2013-001932] [Citation(s) in RCA: 51] [Impact Index Per Article: 4.6] [Received: 04/15/2013] [Accepted: 08/12/2013] [Indexed: 11/04/2022] Open
Abstract
The burgeoning adoption of electronic health records (EHR) introduces a golden opportunity for studying individual manifestations of myriad diseases, which is called 'EHR phenotyping'. In this paper, we break down this concept by: relating it to phenotype definitions from Johannsen; comparing it to cohort identification and disease subtyping; introducing a new concept called 'verotype' (Latin: vere = true, actually) to represent the 'true' population of similar patients for treatment purposes through the integration of genotype, phenotype, and disease subtype (eg, specific glucose value pattern in patients with diabetes) information; analyzing the value of the 'verotype' concept for personalized medicine; and outlining the potential for using network-based approaches to reverse engineer clinical disease subtypes.
Affiliation(s)
- Mary Regina Boland
  - Department of Biomedical Informatics, Columbia University, New York, New York, USA
- George Hripcsak
  - Department of Biomedical Informatics, Columbia University, New York, New York, USA
- Yufeng Shen
  - Department of Biomedical Informatics, Columbia University, New York, New York, USA
  - Department of Systems Biology, Columbia University, New York, New York, USA
- Wendy K Chung
  - Department of Pediatrics, Columbia University, New York, New York, USA
  - Department of Medicine, Columbia University, New York, New York, USA
  - The Irving Institute for Clinical and Translational Research, Columbia University, New York, New York, USA
- Chunhua Weng
  - Department of Biomedical Informatics, Columbia University, New York, New York, USA
  - The Irving Institute for Clinical and Translational Research, Columbia University, New York, New York, USA
|
25
|
Hajiheydari N, Khakbaz SB, Farhadi H. Proposing a Business Model in Healthcare Industry. International Journal of Healthcare Information Systems and Informatics 2013. [DOI: 10.4018/jhisi.2013040104] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.5] [Indexed: 11/09/2022]
Abstract
In modern-day developing countries, there are certain key problems in the healthcare system that add to a patient’s confusion. An example of these difficulties relates to choosing an appropriate medical specialty and among specialists. Owing to the lack of structural healthcare services, there is the need for guidance in selecting the most appropriate diagnosis and medicine for patients with various symptoms or physical disabilities, the need to educate patients on self-treatment procedures, the need to reduce the high cost of treatment and diagnosis, the need to address boring procedures of diagnosis and treatment, the lack of adequate strategic planning due to the absence of valuable information about patients, the problems connected with unnecessary traffic congestion, and many more. Together, these problems create a great opportunity for the expert analysts to ameliorate the healthcare system in these countries by applying new methods, such as using web-based programs and data mining (DM). This article focuses on the use of software, healthcare data warehouse and the application of DM to generate models for solving the aforementioned problems.
|
26
|
Wang L, Ma C, Wipf P, Liu H, Su W, Xie XQ. TargetHunter: an in silico target identification tool for predicting therapeutic potential of small organic molecules based on chemogenomic database. AAPS Journal 2013; 15:395-406. [PMID: 23292636 DOI: 10.1208/s12248-012-9449-z] [Citation(s) in RCA: 131] [Impact Index Per Article: 11.9] [Received: 08/16/2012] [Accepted: 12/10/2012] [Indexed: 02/08/2023]
Abstract
Target identification of the known bioactive compounds and novel synthetic analogs is a very important research field in medicinal chemistry, biochemistry, and pharmacology. It is also a challenging and costly step towards chemical biology and phenotypic screening. In silico identification of potential biological targets for chemical compounds offers an alternative avenue for the exploration of ligand-target interactions and biochemical mechanisms, as well as for investigation of drug repurposing. Computational target fishing mines biologically annotated chemical databases and then maps compound structures into chemogenomical space in order to predict the biological targets. We summarize the recent advances and applications in computational target fishing, such as chemical similarity searching, data mining/machine learning, panel docking, and the bioactivity spectral analysis for target identification. We then describe in detail a new web-based target prediction tool, TargetHunter (http://www.cbligand.org/TargetHunter). This web portal implements a novel in silico target prediction algorithm, the Targets Associated with its MOst SImilar Counterparts, by exploring the largest chemogenomical database, ChEMBL. Prediction accuracy reached 91.1% from the top 3 guesses on a subset of high-potency compounds from the ChEMBL database, which outperformed a published algorithm, multiple-category models. TargetHunter also features an embedded geography tool, BioassayGeoMap, developed to allow the user to easily search for potential collaborators that can experimentally validate the predicted biological target(s) or off target(s). TargetHunter therefore provides a promising alternative to bridge the knowledge gap between biology and chemistry, and significantly boost the productivity of chemogenomics researchers for in silico drug design and discovery.
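The abstract names chemical similarity searching and a "most similar counterparts" strategy without giving details. The following is a generic nearest-neighbor sketch of that idea, not the TargetHunter algorithm itself: it assumes fingerprints are represented as sets of integer feature ids, uses the Tanimoto coefficient as the similarity measure, and uses placeholder compound and target names throughout.

```python
def tanimoto(a, b):
    """Tanimoto (Jaccard) coefficient between two fingerprint feature sets."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def predict_targets(query_fp, reference, top_n=3):
    """Return up to top_n targets taken from the reference compounds most similar to the query.

    reference: list of (fingerprint_set, [target names]) pairs (hypothetical annotated library).
    """
    ranked = sorted(reference, key=lambda rec: tanimoto(query_fp, rec[0]), reverse=True)
    targets = []
    for _fp, annotated in ranked:
        for name in annotated:
            if name not in targets:   # keep first (most similar) occurrence only
                targets.append(name)
        if len(targets) >= top_n:
            break
    return targets[:top_n]


# Placeholder library: three annotated compounds with made-up fingerprints.
library = [
    (frozenset({1, 2, 3, 5}), ["TARGET_A"]),
    (frozenset({1, 2, 9}), ["TARGET_B"]),
    (frozenset({7, 8}), ["TARGET_C"]),
]
query = frozenset({1, 2, 3, 4})
top_two = predict_targets(query, library, top_n=2)
print(top_two)
```

In practice the fingerprints would come from a cheminformatics toolkit and the library from an annotated bioactivity database such as the one the abstract mentions; both are outside the scope of this sketch.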
Affiliation(s)
- Lirong Wang
  - Department of Pharmaceutical Sciences, School of Pharmacy, Computational Chemical Genomics Screening Center, Pittsburgh, PA, USA
|
27
|
Mining electronic health records: towards better research applications and clinical care. Nat Rev Genet 2012; 13:395-405. [PMID: 22549152 DOI: 10.1038/nrg3208] [Citation(s) in RCA: 706] [Impact Index Per Article: 58.8] [Indexed: 02/07/2023]
Abstract
Clinical data describing the phenotypes and treatment of patients represents an underused data source that has much greater research potential than is currently realized. Mining of electronic health records (EHRs) has the potential for establishing new patient-stratification principles and for revealing unknown disease correlations. Integrating EHR data with genetic data will also give a finer understanding of genotype-phenotype relationships. However, a broad range of ethical, legal and technical reasons currently hinder the systematic deposition of these data in EHRs and their mining. Here, we consider the potential for furthering medical research and clinical care using EHR data and the challenges that must be overcome before this is a reality.
|
28
|
Cavalla D, Singal C. Retrospective clinical analysis for drug rescue: for new indications or stratified patient groups. Drug Discov Today 2011; 17:104-9. [PMID: 22001144 DOI: 10.1016/j.drudis.2011.09.019] [Citation(s) in RCA: 19] [Impact Index Per Article: 1.5] [Received: 08/02/2011] [Revised: 09/20/2011] [Accepted: 09/30/2011] [Indexed: 12/19/2022]
Abstract
The increasing realization that many existing drugs do indeed provide opportunities for additional therapeutic indications suggests we should not only be alert for this potential among marketed drugs but also within the pool of developmental drugs, of which (owing to attrition) there are many more examples in existence. We present examples of drug repurposing by retrospective clinical trial analysis and suggest that this strategy presents a promising way of rescuing failed developmental candidates. We contend that the commercial barriers to successful drug rescue are less problematic than for drug repurposing. We indicate practical means for mining data from past clinical trials, either for new indications or for specific patient groups.
|