51
Johnson KW, De Freitas JK, Glicksberg BS, Bobe JR, Dudley JT. Evaluation of patient re-identification using laboratory test orders and mitigation via latent space variables. Pacific Symposium on Biocomputing 2019; 24:415-426. [PMID: 30864342] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/09/2023]
Abstract
Anonymized electronic health records (EHR) are often used for biomedical research. One persistent concern with this type of research is the risk for re-identification of patients from their purportedly anonymized data. Here, we use the EHR of 731,850 de-identified patients to demonstrate that the average patient is unique from all others 98.4% of the time simply by examining what laboratory tests have been ordered for them. By the time a patient has visited the hospital on two separate days, they are unique in 72.3% of cases. We further present a computational study to identify how accurately the records from a single day of care can be used to re-identify patients from a set of 99 other patients. We show that, given a single visit's laboratory orders (even without result values) for a patient, we can re-identify the patient at least 25% of the time. Furthermore, we can place this patient among the top 10 most similar patients 47% of the time. Finally, we present a proof-of-concept technique using a variational autoencoder to encode laboratory results into a lower-dimensional latent space. We demonstrate that releasing latent-space-encoded laboratory orders significantly improves privacy compared to releasing raw laboratory orders (<5% re-identification), while preserving information contained within the laboratory orders (AUC of >0.9 for recreating encoded values). Our findings have potential consequences for the public release of anonymized laboratory tests to the biomedical research community. We note that our findings do not imply that laboratory tests alone are personally identifiable. In the attack scenario presented here, re-identification would require a threat actor to possess an external source of laboratory values which are linked to personal identifiers at the start.
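The uniqueness measure described in this abstract can be illustrated with a minimal sketch. It is not the authors' code: the function name `unique_fraction`, the patient IDs, and the test names are all hypothetical, and it assumes that "unique" means no other patient has exactly the same set of ordered tests.

```python
from collections import Counter

def unique_fraction(orders_by_patient):
    """Return the fraction of patients whose set of ordered tests
    is not shared with any other patient (hypothetical definition)."""
    if not orders_by_patient:
        return 0.0
    # Order and repetition of tests are ignored: each patient is a set of test names.
    signatures = {pid: frozenset(tests) for pid, tests in orders_by_patient.items()}
    counts = Counter(signatures.values())
    unique = sum(1 for sig in signatures.values() if counts[sig] == 1)
    return unique / len(signatures)

# Made-up toy data: p1 and p2 share the same set of orders, p3 and p4 do not.
toy = {
    "p1": ["CBC", "BMP"],
    "p2": ["BMP", "CBC"],
    "p3": ["CBC", "HbA1c"],
    "p4": ["TSH"],
}
print(unique_fraction(toy))  # 0.5
```

On real data, the same count would be taken over each patient's laboratory orders. The abstract does not specify how orders were aggregated, so the set-based signature here is only one plausible choice.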
52
Haure-Mirande JV, Wang M, Audrain M, Fanutza T, Kim SH, Heja S, Readhead B, Dudley JT, Blitzer RD, Schadt EE, Zhang B, Gandy S, Ehrlich ME. Correction: Integrative approach to sporadic Alzheimer's disease: deficiency of TYROBP in cerebral Aβ amyloidosis mouse normalizes clinical phenotype and complement subnetwork molecular pathology without reducing Aβ burden. Mol Psychiatry 2019; 24:472. [PMID: 30464330 PMCID: PMC7608234 DOI: 10.1038/s41380-018-0301-4] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Indexed: 11/24/2022]
Abstract
This article was originally published under standard licence, but has now been made available under a CC BY 4.0 license. The PDF and HTML versions of the paper have been modified accordingly.
53
Miotto R, Wang F, Wang S, Jiang X, Dudley JT. Deep learning for healthcare: review, opportunities and challenges. Brief Bioinform 2018; 19:1236-1246. [PMID: 28481991 PMCID: PMC6455466 DOI: 10.1093/bib/bbx044] [Citation(s) in RCA: 746] [Impact Index Per Article: 124.3] [Received: 12/05/2016] [Revised: 02/19/2017] [Indexed: 02/07/2023]
Abstract
Gaining knowledge and actionable insights from complex, high-dimensional and heterogeneous biomedical data remains a key challenge in transforming health care. Various types of data have been emerging in modern biomedical research, including electronic health records, imaging, -omics, sensor data and text, which are complex, heterogeneous, poorly annotated and generally unstructured. Traditional data mining and statistical learning approaches typically need to first perform feature engineering to obtain effective and more robust features from those data, and then build prediction or clustering models on top of them. Both steps are challenging when the data are complicated and sufficient domain knowledge is lacking. The latest advances in deep learning technologies provide new effective paradigms to obtain end-to-end learning models from complex data. In this article, we review the recent literature on applying deep learning technologies to advance the health care domain. Based on the analyzed work, we suggest that deep learning approaches could be the vehicle for translating big biomedical data into improved human health. However, we also note limitations and needs for improved methods development and applications, especially in terms of ease-of-understanding for domain experts and citizen scientists. We discuss such challenges and suggest developing holistic and meaningful interpretable architectures to bridge deep learning models and human interpretability.
54
Readhead B, Hartley BJ, Eastwood BJ, Collier DA, Evans D, Farias R, He C, Hoffman G, Sklar P, Dudley JT, Schadt EE, Savić R, Brennand KJ. Author Correction: Expression-based drug screening of neural progenitor cells from individuals with schizophrenia. Nat Commun 2018; 9:4926. [PMID: 30451900 PMCID: PMC6242834 DOI: 10.1038/s41467-018-07326-3] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Indexed: 11/09/2022]
55
Smith MR, Yevoo P, Sadahiro M, Austin C, Amarasiriwardena C, Awawda M, Arora M, Dudley JT, Morishita H. Integrative bioinformatics identifies postnatal lead (Pb) exposure disrupts developmental cortical plasticity. Sci Rep 2018; 8:16388. [PMID: 30401819 PMCID: PMC6219596 DOI: 10.1038/s41598-018-34592-4] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.5] [Received: 04/16/2018] [Accepted: 10/22/2018] [Indexed: 11/23/2022]
Abstract
Given that thousands of chemicals released into the environment have the potential capacity to harm neurodevelopment, there is an urgent need to systematically evaluate their toxicity. Neurodevelopment is marked by critical periods of plasticity wherein neural circuits are refined by the environment to optimize behavior and function. If chemicals perturb these critical periods, neurodevelopment can be permanently altered. Focusing on 214 human neurotoxicants, we applied an integrative bioinformatics approach using publicly available data to identify dozens of neurotoxicant signatures that disrupt a transcriptional signature of a critical period for brain plasticity. This identified lead (Pb) as a critical period neurotoxicant and we confirmed in vivo that Pb partially suppresses critical period plasticity at a time point analogous to exposure associated with autism. This work demonstrates the utility of a novel informatics approach to systematically identify neurotoxicants that disrupt childhood neurodevelopment and can be extended to assess other environmental chemicals.
56
Readhead B, Hartley BJ, Eastwood BJ, Collier DA, Evans D, Farias R, He C, Hoffman G, Sklar P, Dudley JT, Schadt EE, Savić R, Brennand KJ. Expression-based drug screening of neural progenitor cells from individuals with schizophrenia. Nat Commun 2018; 9:4412. [PMID: 30356048 PMCID: PMC6200740 DOI: 10.1038/s41467-018-06515-4] [Citation(s) in RCA: 47] [Impact Index Per Article: 7.8] [Received: 03/23/2018] [Accepted: 09/07/2018] [Indexed: 12/11/2022]
Abstract
A lack of biologically relevant screening models hinders the discovery of better treatments for schizophrenia (SZ) and other neuropsychiatric disorders. Here we compare the transcriptional responses of 8 commonly used cancer cell lines (CCLs) directly with that of human induced pluripotent stem cell (hiPSC)-derived neural progenitor cells (NPCs) from 12 individuals with SZ and 12 controls across 135 drugs, generating 4320 unique drug-response transcriptional signatures. We identify those drugs that reverse post-mortem SZ-associated transcriptomic signatures, several of which also differentially regulate neuropsychiatric disease-associated genes in a cell type (hiPSC NPC vs. CCL) and/or a diagnosis (SZ vs. control)-dependent manner. Overall, we describe a proof-of-concept application of transcriptomic drug screening to hiPSC-based models, demonstrating that the drug-induced gene expression differences observed with patient-derived hiPSC NPCs are enriched for SZ biology, thereby revealing a major advantage of incorporating cell type and patient-specific platforms in drug discovery. Unbiased large scale screening of small molecules for drug discovery in psychiatric disease is technically challenging and financially costly. Here, Readhead and colleagues integrate in silico and in vitro approaches to design and conduct transcriptomic drug screening in schizophrenia patient-derived neural cells, in order to survey novel pathologies and points of intervention.
57
Baida G, Bhalla P, Yemelyanov A, Stechschulte LA, Shou W, Readhead B, Dudley JT, Sánchez ER, Budunova I. Deletion of the glucocorticoid receptor chaperone FKBP51 prevents glucocorticoid-induced skin atrophy. Oncotarget 2018; 9:34772-34783. [PMID: 30410676 PMCID: PMC6205168 DOI: 10.18632/oncotarget.26194] [Citation(s) in RCA: 19] [Impact Index Per Article: 3.2] [Received: 07/27/2018] [Accepted: 09/15/2018] [Indexed: 01/20/2023]
Abstract
FKBP51 (FK506-binding protein 51) is a known co-chaperone and regulator of the glucocorticoid receptor (GR), which usually attenuates its activity. FKBP51 is one of the major GR target genes in skin, but its role in the clinical effects of glucocorticoids is not known. Here, we used FKBP51 knockout (KO) mice to determine FKBP51's role in the major adverse effect of topical glucocorticoids, skin atrophy. Unexpectedly, we found that all skin compartments (epidermis, dermis, dermal adipose and CD34+ stem cells) in FKBP51 KO animals were much more resistant to glucocorticoid-induced hypoplasia. Furthermore, despite the absence of inhibitory FKBP51, the basal level of expression and glucocorticoid activation of GR target genes were not increased in FKBP51 KO skin or CRISPR/Cas9-edited FKBP51 KO HaCaT human keratinocytes. FKBP51 is known to negatively regulate Akt and mTOR. We found a significant increase in Akt (Ser473) and mTOR (Ser2448) phosphorylation and downstream pro-growth signaling in FKBP51-deficient keratinocytes in vivo and in vitro. As Akt/mTOR-GR crosstalk is usually negative in skin, our results suggest that Akt/mTOR activation could be responsible for the lack of increased GR function and resistance of FKBP51 KO mice to steroid-induced skin atrophy.
58
Shameer K, Perez-Rodriguez MM, Bachar R, Li L, Johnson A, Johnson KW, Glicksberg BS, Smith MR, Readhead B, Scarpa J, Jebakaran J, Kovatch P, Lim S, Goodman W, Reich DL, Kasarskis A, Tatonetti NP, Dudley JT. Pharmacological risk factors associated with hospital readmission rates in a psychiatric cohort identified using prescriptome data mining. BMC Med Inform Decis Mak 2018; 18:79. [PMID: 30255805 PMCID: PMC6156906 DOI: 10.1186/s12911-018-0653-3] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.5] [Indexed: 12/18/2022]
Abstract
BACKGROUND Worldwide, over 14% of individuals hospitalized for psychiatric reasons have readmissions to hospitals within 30 days after discharge. Predicting patients at risk and leveraging accelerated interventions can reduce the rates of early readmission, a negative clinical outcome (i.e., a treatment failure) that affects the quality of life of patients. To implement individualized interventions, it is necessary to predict those individuals at highest risk for 30-day readmission. In this study, our aim was to conduct a data-driven investigation to find the pharmacological factors influencing 30-day all-cause, intra- and interdepartmental readmissions after an index psychiatric admission, using the compendium of prescription data (prescriptome) from electronic medical records (EMR). METHODS The data scientists in the project received a deidentified database from the Mount Sinai Data Warehouse, which was used to perform all analyses. Data was stored in a secured MySQL database, normalized and indexed using a unique hexadecimal identifier associated with the data for psychiatric illness visits. We used Bayesian logistic regression models to evaluate the association of prescription data with 30-day readmission risk. We constructed individual models and compiled results after adjusting for covariates, including drug exposure, age, and gender. We also performed a digital comorbidity survey using EMR data combined with the estimation of shared genetic architecture using genomic annotations to disease phenotypes. RESULTS Using an automated, data-driven approach, we identified prescription medications, side effects (primary side effects), and drug-drug interaction-induced side effects (secondary side effects) associated with readmission risk in a cohort of 1275 patients using prescriptome analytics. In our study, we identified 28 drugs associated with risk for readmission among psychiatric patients. Based on prescription data, Pravastatin had the highest risk of readmission (OR = 13.10; 95% CI (2.82, 60.8)). We also identified enrichment of primary side effects (n = 4006) and secondary side effects (n = 36) induced by prescription drugs in the subset of readmitted patients (n = 89) compared to the non-readmitted subgroup (n = 1186). Digital comorbidity analyses and shared genetic analyses further reveal that cardiovascular disease and psychiatric conditions are comorbid and share functional gene modules (cardiomyopathy and anxiety disorder: shared genes (n = 37; P = 1.06815E-06)). CONCLUSIONS Large scale prescriptome data is now available from EMRs and accessible for analytics that could improve healthcare outcomes. Such analyses could also drive hypothesis and data-driven research. In this study, we explored the utility of prescriptome data to identify factors driving readmission in a psychiatric cohort. Converging digital health data from EMRs and systems biology investigations reveal that a subset of patient populations with significant cardiovascular comorbidities is more likely to be readmitted. Further, the genetic architecture of psychiatric illness also suggests overlap with cardiovascular diseases. In summary, assessment of medications, side effects, and drug-drug interactions in a clinical setting as well as genomic information using a data mining approach could help to identify factors that lower readmission rates in patients with mental illness.
59
Laganà A, Beno I, Melnekoff D, Leshchenko V, Madduri D, Ramdas D, Sanchez L, Niglio S, Perumal D, Kidd BA, Miotto R, Shaknovich R, Chari A, Cho HJ, Barlogie B, Jagannath S, Dudley JT, Parekh S. Precision Medicine for Relapsed Multiple Myeloma on the Basis of an Integrative Multiomics Approach. JCO Precis Oncol 2018; 2018. [PMID: 30706044 PMCID: PMC6350920 DOI: 10.1200/po.18.00019] [Citation(s) in RCA: 16] [Impact Index Per Article: 2.7] [Indexed: 12/25/2022]
Abstract
Purpose Multiple myeloma (MM) is a malignancy of plasma cells, with a median survival of 6 years. Despite recent therapeutic advancements, relapse remains mostly inevitable, and the disease is fatal in the majority of patients. A major challenge in the treatment of patients with relapsed MM is the timely identification of treatment options in a personalized manner. Current approaches in precision oncology aim at matching specific DNA mutations to drugs, but incorporation of genome-wide RNA profiles has not yet been clinically assessed. Methods We have developed a novel computational platform for precision medicine of relapsed and/or refractory MM on the basis of DNA and RNA sequencing. Our approach expands on the traditional DNA-based approaches by integrating somatic mutations and copy number alterations with RNA-based drug repurposing and pathway analysis. We tested our approach in a pilot precision medicine clinical trial with 64 patients with relapsed and/or refractory MM. Results We generated treatment recommendations in 63 of 64 patients. Twenty-six patients had treatment implemented, and 21 were assessable. Of these, 11 received a drug that was based on RNA findings, eight received a drug that was based on DNA, and two received a drug that was based on both RNA and DNA. Sixteen of the 21 evaluable patients had a clinical response (ie, reduction of disease marker ≥ 25%), giving a clinical benefit rate of 76% and an overall response rate of 66%, with five patients having ongoing responses at the end of the trial. The median duration of response was 131 days. Conclusion Our results show that a comprehensive sequencing approach can identify viable options in patients with relapsed and/or refractory myeloma, and they represent proof of principle of how RNA sequencing can contribute beyond DNA mutation analysis to the development of a reliable drug recommendation tool.
60
Vashisht R, Jung K, Schuler A, Banda JM, Park RW, Jin S, Li L, Dudley JT, Johnson KW, Shervey MM, Xu H, Wu Y, Natrajan K, Hripcsak G, Jin P, Van Zandt M, Reckard A, Reich CG, Weaver J, Schuemie MJ, Ryan PB, Callahan A, Shah NH. Association of Hemoglobin A1c Levels With Use of Sulfonylureas, Dipeptidyl Peptidase 4 Inhibitors, and Thiazolidinediones in Patients With Type 2 Diabetes Treated With Metformin: Analysis From the Observational Health Data Sciences and Informatics Initiative. JAMA Netw Open 2018; 1:e181755. [PMID: 30646124 PMCID: PMC6324274 DOI: 10.1001/jamanetworkopen.2018.1755] [Citation(s) in RCA: 41] [Impact Index Per Article: 6.8] [Indexed: 12/15/2022]
Abstract
IMPORTANCE Consensus around an efficient second-line treatment option for type 2 diabetes (T2D) remains ambiguous. The availability of electronic medical records and insurance claims data, which capture routine medical practice, accessed via the Observational Health Data Sciences and Informatics network presents an opportunity to generate evidence for the effectiveness of second-line treatments. OBJECTIVE To identify which drug classes among sulfonylureas, dipeptidyl peptidase 4 (DPP-4) inhibitors, and thiazolidinediones are associated with reduced hemoglobin A1c (HbA1c) levels and lower risk of myocardial infarction, kidney disorders, and eye disorders in patients with T2D treated with metformin as a first-line therapy. DESIGN, SETTING, AND PARTICIPANTS Three retrospective, propensity-matched, new-user cohort studies with replication across 8 sites were performed from 1975 to 2017. Medical data of 246 558 805 patients from multiple countries from the Observational Health Data Sciences and Informatics (OHDSI) initiative were included and medical data sets were transformed into a unified common data model, with analysis done using open-source analytical tools. Participants included patients with T2D receiving metformin with at least 1 prior HbA1c laboratory test who were then prescribed either sulfonylureas, DPP-4 inhibitors, or thiazolidinediones. Data analysis was conducted from 2015 to 2018. EXPOSURES Treatment with sulfonylureas, DPP-4 inhibitors, or thiazolidinediones starting at least 90 days after the initial prescription of metformin. MAIN OUTCOMES AND MEASURES The primary outcome is the first observation of the reduction of HbA1c level to 7% of total hemoglobin or less after prescription of a second-line drug. Secondary outcomes are myocardial infarction, kidney disorder, and eye disorder after prescription of a second-line drug. RESULTS A total of 246 558 805 patients (126 977 785 women [51.5%]) were analyzed. Effectiveness of sulfonylureas, DPP-4 inhibitors, and thiazolidinediones prescribed after metformin to lower HbA1c level to 7% or less of total hemoglobin remained indistinguishable in patients with T2D. Patients treated with sulfonylureas compared with DPP-4 inhibitors had a small increased consensus hazard ratio of myocardial infarction (1.12; 95% CI, 1.02-1.24) and eye disorders (1.15; 95% CI, 1.11-1.19) in the meta-analysis. Hazard of observing kidney disorders after treatment with sulfonylureas, DPP-4 inhibitors, or thiazolidinediones was equally likely. CONCLUSIONS AND RELEVANCE The examined drug classes did not differ in lowering HbA1c and in hazards of kidney disorders in patients with T2D treated with metformin as a first-line therapy. Sulfonylureas had a small, higher observed hazard of myocardial infarction and eye disorders compared with DPP-4 inhibitors in the meta-analysis. The OHDSI collaborative network can be used to conduct a large international study examining the effectiveness of second-line treatment choices made in clinical management of T2D.
61
Patel R, Scheinfeldt LB, Sanderford MD, Lanham TR, Tamura K, Platt A, Glicksberg BS, Xu K, Dudley JT, Kumar S. Adaptive Landscape of Protein Variation in Human Exomes. Mol Biol Evol 2018; 35:2015-2025. [PMID: 29846678 PMCID: PMC6063297 DOI: 10.1093/molbev/msy107] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.3] [Indexed: 12/26/2022]
Abstract
The human genome contains hundreds of thousands of missense mutations. However, only a handful of these variants are known to be adaptive, which implies that adaptation through protein sequence change is an extremely rare phenomenon in human evolution. Alternatively, existing methods may lack the power to pinpoint adaptive variation. We have developed and applied an Evolutionary Probability Approach (EPA) to discover candidate adaptive polymorphisms (CAPs) through the discordance between allelic evolutionary probabilities and their observed frequencies in human populations. EPA reveals thousands of missense CAPs, which suggest that a large number of previously optimal alleles experienced a reversal of fortune in the human lineage. We explored nonadaptive mechanisms to explain CAPs, including the effects of demography, mutation rate variability, and negative and positive selective pressures in modern humans. Many nonadaptive hypotheses were tested, but failed to explain the data, which suggests that a large proportion of CAP alleles have increased in frequency due to beneficial selection. This suggestion is supported by the fact that a vast majority of adaptive missense variants discovered previously in humans are CAPs, and hundreds of CAP alleles are protective in genotype-phenotype association data. Our integrated phylogenomic and population genetic EPA approach predicts the existence of thousands of nonneutral candidate variants in the human proteome. We expect this collection to be enriched in beneficial variation. The EPA approach can be applied to discover candidate adaptive variation in any protein, population, or species for which allele frequency data and reliable multispecies alignments are available.
62
Shameer K, Glicksberg BS, Hodos R, Johnson KW, Badgeley MA, Readhead B, Tomlinson MS, O’Connor T, Miotto R, Kidd BA, Chen R, Ma’ayan A, Dudley JT. Systematic analyses of drugs and disease indications in RepurposeDB reveal pharmacological, biological and epidemiological factors influencing drug repositioning. Brief Bioinform 2018; 19:656-678. [PMID: 28200013 PMCID: PMC6192146 DOI: 10.1093/bib/bbw136] [Citation(s) in RCA: 56] [Impact Index Per Article: 9.3] [Received: 08/19/2016] [Revised: 11/29/2016] [Indexed: 12/22/2022]
Abstract
Increase in global population and growing disease burden due to the emergence of infectious diseases (Zika virus), multidrug-resistant pathogens, drug-resistant cancers (cisplatin-resistant ovarian cancer) and chronic diseases (arterial hypertension) necessitate effective therapies to improve health outcomes. However, the rapid increase in drug development cost demands innovative and sustainable drug discovery approaches. Drug repositioning, the discovery of new or improved therapies by reevaluation of approved or investigational compounds, solves a significant gap in the public health setting and improves the productivity of drug development. As the number of drug repurposing investigations increases, a new opportunity has emerged to understand factors driving drug repositioning through systematic analyses of drugs, drug targets and associated disease indications. However, such analyses have so far been hampered by the lack of a centralized knowledgebase, benchmarking data sets and reporting standards. To address these knowledge and clinical needs, here, we present RepurposeDB, a collection of repurposed drugs, drug targets and diseases, which was assembled, indexed and annotated from public data. RepurposeDB combines information on 253 drugs [small molecules (74.30%) and protein drugs (25.29%)] and 1125 diseases. Using RepurposeDB data, we identified pharmacological (chemical descriptors, physicochemical features and absorption, distribution, metabolism, excretion and toxicity properties), biological (protein domains, functional process, molecular mechanisms and pathway cross talks) and epidemiological (shared genetic architectures, disease comorbidities and clinical phenotype similarities) factors mediating drug repositioning. Collectively, RepurposeDB is developed as the reference database for drug repositioning investigations. The pharmacological, biological and epidemiological principles of drug repositioning identified from the meta-analyses could augment therapeutic development.
63
Readhead B, Haure-Mirande JV, Funk CC, Richards MA, Shannon P, Haroutunian V, Sano M, Liang WS, Beckmann ND, Price ND, Reiman EM, Schadt EE, Ehrlich ME, Gandy S, Dudley JT. Multiscale Analysis of Independent Alzheimer's Cohorts Finds Disruption of Molecular, Genetic, and Clinical Networks by Human Herpesvirus. Neuron 2018; 99:64-82.e7. [PMID: 29937276 PMCID: PMC6551233 DOI: 10.1016/j.neuron.2018.05.023] [Citation(s) in RCA: 420] [Impact Index Per Article: 70.0] [Received: 12/12/2017] [Revised: 03/05/2018] [Accepted: 05/15/2018] [Indexed: 12/13/2022]
Abstract
Investigators have long suspected that pathogenic microbes might contribute to the onset and progression of Alzheimer's disease (AD) although definitive evidence has not been presented. Whether such findings represent a causal contribution, or reflect opportunistic passengers of neurodegeneration, is also difficult to resolve. We constructed multiscale networks of the late-onset AD-associated virome, integrating genomic, transcriptomic, proteomic, and histopathological data across four brain regions from human post-mortem tissue. We observed increased human herpesvirus 6A (HHV-6A) and human herpesvirus 7 (HHV-7) from subjects with AD compared with controls. These results were replicated in two additional, independent and geographically dispersed cohorts. We observed regulatory relationships linking viral abundance and modulators of APP metabolism, including induction of APBB2, APPBP2, BIN1, BACE1, CLU, PICALM, and PSEN1 by HHV-6A. This study elucidates networks linking molecular, clinical, and neuropathological features with viral activity and is consistent with viral activity constituting a general feature of AD.
64
Brouwer J, Cheng WY, Bauer-Mehren A, Maisel D, Lechner K, Andersson E, Dudley JT, Milletti F. Abstract 1027: Regulatory T-cell genes drive altered immune microenvironment in adult solid cancers and allow for immune contextual patient subtyping. Cancer Res 2018. [DOI: 10.1158/1538-7445.am2018-1027] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/16/2022]
Abstract
The tumor microenvironment is an important factor in cancer immunotherapy response. To further understand how a tumor affects the local immune system, we analyzed immune gene expression differences between matching normal and tumor tissues. We analyzed previously published and new gene expression data from solid cancers and isolated immune cell populations. We also determined the correlation between CD8, FoxP3 immunohistochemistry (IHC) and immune-related genes. Across solid TCGA cancers, we observed that regulatory T-cells (Tregs) were one of the main drivers of immune gene expression differences between normal and tumor tissues. A tumor-specific CD8 signature had slightly lower scores in tumor tissues compared to normal of most (12 of 16) cancers, while a Treg signature score was higher in tumor tissues of all cancers except liver. We clustered TCGA colorectal samples (626 patients) and a new separate testing data set (60 patients) into two groups according to Treg gene signature expression. The High Treg cluster had more colorectal tumors that were Consensus Molecular Subtype 1/4, right-sided and microsatellite-instable, compared to the Low Treg cluster. Finally, we determined the correlation between CD8, FoxP3 immunohistochemistry (IHC) and our gene signatures and found that in this small data set correlation between signature and IHC overall was low, but samples in the High Treg cluster had significantly more CD8+ and FoxP3+ cells compared to the Low Treg cluster. We conclude that high Treg signature expression scores correlate with high overall immune gene expression. Using this novel way of classifying patients, more colorectal tumors with high immune activation were identified compared to other colorectal subtyping methods. Further research will reveal if this Treg-based subtyping improves the identification of patients that may benefit from cancer immunotherapy.
Citation Format: Jurriaan Brouwer, Wei-Yi Cheng, Anna Bauer-Mehren, Daniela Maisel, Katharina Lechner, Emilia Andersson, Joel T. Dudley, Francesca Milletti. Regulatory T-cell genes drive altered immune microenvironment in adult solid cancers and allow for immune contextual patient subtyping [abstract]. In: Proceedings of the American Association for Cancer Research Annual Meeting 2018; 2018 Apr 14-18; Chicago, IL. Philadelphia (PA): AACR; Cancer Res 2018;78(13 Suppl):Abstract nr 1027.
65
Fullard JF, Giambartolomei C, Hauberg ME, Xu K, Voloudakis G, Shao Z, Bare C, Dudley JT, Mattheisen M, Robakis NK, Haroutunian V, Roussos P. Open chromatin profiling of human postmortem brain infers functional roles for non-coding schizophrenia loci. Hum Mol Genet 2018; 29:5047114. [PMID: 29982455 PMCID: PMC7530524 DOI: 10.1093/hmg/ddy229] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Indexed: 11/13/2022]
|
66
|
Johnson KW, Dudley JT, Bobe JR. A 72-Year-Old Patient with Longstanding, Untreated Familial Hypercholesterolemia but no Coronary Artery Calcification: A Case Report. Cureus 2018; 10:e2452. [PMID: 29888156 PMCID: PMC5991918 DOI: 10.7759/cureus.2452] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/30/2022] Open
Abstract
Familial hypercholesterolemia (FH) is a genetic disease associated with persistently elevated levels of low-density lipoprotein cholesterol (LDL-C), which ultimately leads to greatly increased rates of atherosclerosis and cardiovascular disease. Atherosclerosis progression can be clinically approximated through measurement of coronary artery calcification (CAC). CAC can be measured via electron beam computed tomography (EBCT), multi-slice computed tomography (MSCT), or contrast-enhanced CT coronary angiography (CTCA). Here, we present the case of a 72-year-old man with known FH and established hypercholesterolemia who has consistently tested negative for any significant CAC.
|
67
|
Lesovaya E, Agarwal S, Readhead B, Vinokour E, Baida G, Bhalla P, Kirsanov K, Yakubovskaya M, Platanias LC, Dudley JT, Budunova I. Rapamycin Modulates Glucocorticoid Receptor Function, Blocks Atrophogene REDD1, and Protects Skin from Steroid Atrophy. J Invest Dermatol 2018; 138:1935-1944. [PMID: 29596905 DOI: 10.1016/j.jid.2018.02.045] [Citation(s) in RCA: 18] [Impact Index Per Article: 3.0] [Received: 09/28/2017] [Revised: 02/16/2018] [Accepted: 02/21/2018] [Indexed: 12/20/2022]
Abstract
Glucocorticoids have excellent therapeutic properties; however, they cause significant adverse atrophogenic effects. The mTORC1 inhibitor REDD1 has been recently identified as a key mediator of glucocorticoid-induced atrophy. We performed computational screening of a connectivity map database to identify putative REDD1 inhibitors. The top selected candidates included rapamycin, which was unexpected because it inhibits pro-proliferative mTOR signaling. Indeed, rapamycin inhibited REDD1 induction by the glucocorticoids dexamethasone, clobetasol propionate, and fluocinolone acetonide in keratinocytes, lymphoid cells, and mouse skin. We also showed blunting of glucocorticoid-induced REDD1 induction by either a catalytic inhibitor of mTORC1/2 (OSI-027) or genetic inhibition of mTORC1, highlighting the role of mTOR in glucocorticoid receptor signaling. Moreover, rapamycin inhibited glucocorticoid receptor phosphorylation, nuclear translocation, and loading on glucocorticoid-responsive elements in the REDD1 promoter. Using microarrays, we quantified the global effect of rapamycin on gene expression regulation by fluocinolone acetonide in human keratinocytes. Rapamycin inhibited activation of glucocorticoid receptor target genes yet enhanced the repression of pro-proliferative and proinflammatory genes. Remarkably, rapamycin protected skin against glucocorticoid-induced atrophy but had no effect on the glucocorticoid anti-inflammatory activity in different in vivo models, suggesting the clinical potential of combining rapamycin with glucocorticoids for the treatment of inflammatory diseases.
|
68
|
Lee HC, Kosoy R, Becker CE, Dudley JT, Kidd BA. Automated cell type discovery and classification through knowledge transfer. Bioinformatics 2018; 33:1689-1695. [PMID: 28158442 PMCID: PMC5447237 DOI: 10.1093/bioinformatics/btx054] [Citation(s) in RCA: 35] [Impact Index Per Article: 5.8] [Received: 10/25/2016] [Accepted: 01/24/2017] [Indexed: 01/30/2023] Open
Abstract
Motivation: Recent advances in mass cytometry allow simultaneous measurements of up to 50 markers at single-cell resolution. However, the high dimensionality of mass cytometry data introduces computational challenges for automated data analysis and hinders translation of new biological understanding into clinical applications. Previous studies have applied machine learning to facilitate processing of mass cytometry data. However, manual inspection is still unavoidable and is becoming a barrier to reliable large-scale analysis. Results: We present a new algorithm called Automated Cell-type Discovery and Classification (ACDC) that fully automates the classification of canonical cell populations and highlights novel cell types in mass cytometry data. Evaluations on real-world data show ACDC provides accurate and reliable estimations compared to manual gating results. Additionally, ACDC automatically classifies previously ambiguous cell types to facilitate discovery. Our findings suggest that ACDC substantially improves both reliability and interpretability of results obtained from high-dimensional mass cytometry profiling data. Availability and Implementation: A Python package (Python 3) and analysis scripts for reproducing the results are available at https://bitbucket.org/dudleylab/acdc. Supplementary information: Supplementary data are available at Bioinformatics online.
|
69
|
Hall JB, Cong Z, Imamura-Kawasawa Y, Kidd BA, Dudley JT, Thiboutot DM, Nelson AM. Isolation and Identification of the Follicular Microbiome: Implications for Acne Research. J Invest Dermatol 2018; 138:2033-2040. [PMID: 29548797 DOI: 10.1016/j.jid.2018.02.038] [Citation(s) in RCA: 50] [Impact Index Per Article: 8.3] [Received: 06/26/2017] [Revised: 02/12/2018] [Accepted: 02/12/2018] [Indexed: 12/12/2022]
Abstract
Our understanding of the microbiome and the role of Propionibacterium acnes in skin homeostasis and acne pathogenesis is evolving. Multiple methods for sampling and identifying the skin's microbiome exist, and understanding the differences between the abilities of various methods to characterize the microbial landscape is warranted. This study compared the microbial diversity of samples obtained from the cheeks of 20 volunteers, collected by surface swab, pore strips, and cyanoacrylate glue follicular biopsy, all sequenced with 16S rRNA sequencing (V1-V3) and whole-genome metagenomic sequencing. The sequencing method of choice influenced the detection of microbial profiles as whole-genome sequencing captured more species diversity, including viruses, compared with 16S sequencing. The relative abundance of bacterial or fungal species and overall diversity did not differ between sampling methods. However, the viral composition of the skin's surface is unique compared with the follicle, suggesting distinct viral niches within the skin. P. acnes bacteria, ribotypes, and bacteriophages were identified equally by all sampling methods indicating that the sampling method, whether for the skin's surface or follicle, does not impact P. acnes-related characterization and that all may be equally useful for acne-related research studies.
|
70
|
Salem Omar AM, Shameer K, Narula S, Abdel Rahman MA, Rifaie O, Narula J, Dudley JT, Sengupta PP. Artificial Intelligence-Based Assessment of Left Ventricular Filling Pressures From 2-Dimensional Cardiac Ultrasound Images. JACC Cardiovasc Imaging 2018; 11:509-510. [DOI: 10.1016/j.jcmg.2017.05.003] [Citation(s) in RCA: 23] [Impact Index Per Article: 3.8] [Received: 12/26/2016] [Revised: 05/01/2017] [Accepted: 05/01/2017] [Indexed: 11/17/2022]
|
71
|
Kishibe M, Baida G, Bhalla P, Lavker RM, Schlosser B, Iinuma S, Yoshida S, Dudley JT, Budunova I. Important role of kallikrein 6 for the development of keratinocyte proliferative resistance to topical glucocorticoids. Oncotarget 2018; 7:69479-69488. [PMID: 27283773 PMCID: PMC5342492 DOI: 10.18632/oncotarget.9926] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.7] [Received: 04/05/2016] [Accepted: 05/13/2016] [Indexed: 11/25/2022] Open
Abstract
One of the major adverse effects of topical glucocorticoids is cutaneous atrophy often followed by development of resistance to steroids (tachyphylaxis). Previously we showed that after two weeks, interfollicular mouse keratinocytes acquired resistance to anti-proliferative effects of glucocorticoid fluocinolone acetonide (FA). One of the top genes activated by FA during tachyphylaxis was Klk6 encoding kallikrein-related peptidase 6, known to enhance keratinocyte proliferation. KLK6 was also strongly induced by chronic glucocorticoids in human skin. Double immunostaining showed that KLK6+ keratinocytes, localized in suprabasal layer of mouse skin, were frequently adjacent to proliferating 5-bromo-2'-deoxyuridine-positive basal keratinocytes. We used KLK6 knockout (KO) mice to evaluate KLK6 role in skin regeneration after steroid-induced atrophy. KLK6 KOs had thinner epidermis and decreased keratinocyte proliferation. The keratinocytes in wild type and KLK6 KO epidermis were equally sensitive to acute anti-proliferative effect of FA. However, the development of proliferative resistance during chronic treatment was reduced in KO epidermis. This was not due to the changes in glucocorticoid receptor (GR) expression or function as GR protein level and induction of GR-target genes were similar in wild type and KLK6 KO skin. Overall, these results suggest a novel mechanism of epidermal regeneration after glucocorticoid-induced atrophy via KLK6 activation.
|
72
|
Shameer K, Johnson KW, Glicksberg BS, Dudley JT, Sengupta PP. Machine learning in cardiovascular medicine: are we there yet? Heart 2018; 104:1156-1164. [PMID: 29352006 DOI: 10.1136/heartjnl-2017-311198] [Citation(s) in RCA: 229] [Impact Index Per Article: 38.2] [Received: 07/10/2017] [Revised: 12/19/2017] [Accepted: 12/21/2017] [Indexed: 12/11/2022] Open
Abstract
Artificial intelligence (AI) broadly refers to analytical algorithms that iteratively learn from data, allowing computers to find hidden insights without being explicitly programmed where to look. These include a family of operations encompassing several terms like machine learning, cognitive learning, deep learning and reinforcement learning-based methods that can be used to integrate and interpret complex biomedical and healthcare data in scenarios where traditional statistical methods may not be able to perform. In this review article, we discuss the basics of machine learning algorithms and what potential data sources exist; evaluate the need for machine learning; and examine the potential limitations and challenges of implementing machine learning in the context of cardiovascular medicine. The most promising avenues for AI in medicine are the development of automated risk prediction algorithms which can be used to guide clinical care; use of unsupervised learning techniques to more precisely phenotype complex disease; and the implementation of reinforcement learning algorithms to intelligently augment healthcare providers. The utility of a machine learning-based predictive model will depend on factors including data heterogeneity, data depth, data breadth, nature of modelling task, choice of machine learning and feature selection algorithms, and orthogonal evidence. A critical understanding of the strengths and limitations of various methods and tasks amenable to machine learning is vital. By leveraging the growing corpus of big data in medicine, we detail pathways by which machine learning may facilitate optimal development of patient-specific models for improving diagnoses, intervention and outcome in cardiovascular medicine.
|
73
|
Smith MR, Glicksberg BS, Li L, Chen R, Morishita H, Dudley JT. Loss-of-function of neuroplasticity-related genes confers risk for human neurodevelopmental disorders. Pacific Symposium on Biocomputing 2018; 23:68-79. [PMID: 29218870 PMCID: PMC5728668] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/07/2023]
Abstract
The high and increasing prevalence of neurodevelopmental disorders places enormous personal and economic burdens on society. Given the growing realization that the roots of neurodevelopmental disorders often lie in early childhood, there is an urgent need to identify childhood risk factors. Neurodevelopment is marked by periods of heightened experience-dependent neuroplasticity wherein neural circuitry is optimized by the environment. If these critical periods are disrupted, development of normal brain function can be permanently altered, leading to neurodevelopmental disorders. Here, we aim to systematically identify human variants in neuroplasticity-related genes that confer risk for neurodevelopmental disorders. Historically, this knowledge has been limited by a lack of techniques to identify genes related to neurodevelopmental plasticity in a high-throughput manner and a lack of methods to systematically identify mutations in these genes that confer risk for neurodevelopmental disorders. Using an integrative genomics approach, we determined loss-of-function (LOF) variants in putative plasticity genes, identified from transcriptional profiles of brain from mice with elevated plasticity, that were associated with neurodevelopmental disorders. From five shared differentially expressed genes found in two mouse models of juvenile-like elevated plasticity (juvenile wild-type or adult Lynx1-/- relative to adult wild-type) that were also genotyped in the Mount Sinai BioMe Biobank, we identified multiple associations between LOF genes and increased risk for neurodevelopmental disorders, including epilepsy and schizophrenia, across 10,510 patients linked to the Mount Sinai Electronic Medical Records (EMR). This work demonstrates a novel approach to identify neurodevelopmental risk genes and points toward a promising avenue to discover new drug targets to address the unmet therapeutic needs of neurodevelopmental disease.
|
74
|
Glicksberg BS, Miotto R, Johnson KW, Shameer K, Li L, Chen R, Dudley JT. Automated disease cohort selection using word embeddings from Electronic Health Records. Pacific Symposium on Biocomputing 2018; 23:145-156. [PMID: 29218877 PMCID: PMC5788312] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/07/2023]
Abstract
Accurate and robust cohort definition is critical to biomedical discovery using Electronic Health Records (EHR). Similar to prospective study designs, high-quality EHR-based research requires rigorous selection criteria to designate case/control status particular to each disease. Electronic phenotyping algorithms, which are manually built and validated per disease, have been successful in filling this need. However, these approaches are time-consuming, so only a relatively small number of disease algorithms have been developed. Methodologies that automatically learn features from EHRs have been used for cohort selection as well. To date, however, there has been no systematic analysis of how these methods perform against current gold standards. Accordingly, this paper compares the performance of a state-of-the-art automated feature learning method in extracting research-grade cohorts for five diseases against their established electronic phenotyping algorithms. In particular, we use word2vec to create unsupervised embeddings of the phenotype space within an EHR system. Using medical concepts as a query, we then rank patients by their proximity in the embedding space and automatically extract putative disease cohorts via a distance threshold. Experimental evaluation shows promising results, with an average F-score of 0.57 and AUC-ROC of 0.98. However, we noticed that results varied considerably between diseases, thus necessitating further investigation and/or phenotype-specific refinement of the approach before it can be readily deployed across all diseases.
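The ranking-and-threshold step described in this abstract can be illustrated with a minimal sketch. It assumes that each patient and the query concept are already represented as vectors in the same embedding space. The use of cosine distance, the function names `cosine_distance` and `select_cohort`, the toy vectors, and the threshold value are all hypothetical choices for illustration; the abstract does not specify them.

```python
import math

def cosine_distance(u, v):
    """Cosine distance (1 - cosine similarity) between two vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)

def select_cohort(query_vec, patient_vecs, threshold):
    """Rank patients by distance to the query concept and keep those
    within the distance threshold, closest first."""
    ranked = sorted(
        (cosine_distance(query_vec, vec), pid)
        for pid, vec in patient_vecs.items()
    )
    return [pid for dist, pid in ranked if dist <= threshold]

# Toy 2-dimensional embeddings (hypothetical values, not from the paper).
concept = [1.0, 0.0]
patients = {"p1": [0.9, 0.1], "p2": [0.0, 1.0], "p3": [0.7, 0.7]}
cohort = select_cohort(concept, patients, threshold=0.2)
```

In this toy setup, loosening the threshold admits patients whose vectors point further from the query concept, which is the trade-off a distance cut-off controls.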
|
75
|
Johnson KW, Glicksberg BS, Hodos RA, Shameer K, Dudley JT. Causal inference on electronic health records to assess blood pressure treatment targets: an application of the parametric g formula. Pacific Symposium on Biocomputing 2018; 23:180-191. [PMID: 29218880 PMCID: PMC5728675] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/07/2023]
Abstract
Hypertension is a major risk factor for ischemic cardiovascular disease and cerebrovascular disease, which are, respectively, the first and second most common causes of morbidity and mortality across the globe. To alleviate the risks of hypertension, there are a number of effective antihypertensive drugs available. However, the optimal treatment blood pressure goal for antihypertensive therapy remains an area of controversy. The results of the recent Systolic Blood Pressure Intervention Trial (SPRINT), which found benefits for intensive lowering of systolic blood pressure, have been debated for several reasons. We aimed to assess the benefits of treating to four different blood pressure targets and to compare our results to those of SPRINT using a method for causal inference called the parametric g formula. We applied this method to blood pressure measurements obtained from the electronic health records of approximately 200,000 patients who visited the Mount Sinai Hospital in New York, NY. We simulated the effect of four clinically relevant dynamic treatment regimes, assessing the effectiveness of treating to four different blood pressure targets: 150 mmHg, 140 mmHg, 130 mmHg, and 120 mmHg. In contrast to current American Heart Association guidelines and in concordance with SPRINT, we find that targeting 120 mmHg systolic blood pressure is significantly associated with decreased incidence of major adverse cardiovascular events. Causal inference methods applied to electronic health records are a powerful and flexible technique, and medicine may benefit from their increased usage.
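To give a feel for the idea behind the parametric g formula, the sketch below shows a heavily simplified single-time-point version: fit a parametric outcome model, then average its predictions over the observed confounder distribution with treatment set to each value. It is not the dynamic-regime analysis described in the abstract. The simulated data, the linear outcome model, and the name `g_formula_effect` are assumptions made purely for illustration, and NumPy is assumed to be installed.

```python
import numpy as np

def g_formula_effect(treatment, confounder, outcome):
    """Point-treatment g-formula (standardization) with a linear outcome model.

    Fits outcome ~ 1 + treatment + confounder by least squares, then compares
    the mean predicted outcome with everyone treated vs. everyone untreated.
    """
    n = len(outcome)
    design = np.column_stack([np.ones(n), treatment, confounder])
    beta, *_ = np.linalg.lstsq(design, outcome, rcond=None)
    treated = np.column_stack([np.ones(n), np.ones(n), confounder])
    untreated = np.column_stack([np.ones(n), np.zeros(n), confounder])
    return float(np.mean(treated @ beta) - np.mean(untreated @ beta))

# Simulated data (hypothetical): the confounder raises both the chance of
# treatment and the outcome, so a naive group comparison is biased.
rng = np.random.default_rng(0)
n = 5000
confounder = rng.normal(size=n)
treatment = (rng.random(n) < 1.0 / (1.0 + np.exp(-confounder))).astype(float)
outcome = 2.0 * treatment + 1.5 * confounder + rng.normal(size=n)

effect = g_formula_effect(treatment, confounder, outcome)
naive = outcome[treatment == 1].mean() - outcome[treatment == 0].mean()
```

Because the simulation builds in a true effect of 2.0, the standardized estimate should land near that value, while the naive difference in group means is inflated by confounding.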
|