1
Medvedeva IV, Stokes ME, Eisinger D, LaBrie ST, Ai J, Trotter MWB, Schafer P, Yang R. Large-scale Analyses of Disease Biomarkers and Apremilast Pharmacodynamic Effects. Sci Rep 2020; 10:605. [PMID: 31953524 PMCID: PMC6969165 DOI: 10.1038/s41598-020-57542-5] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.8] [Received: 04/03/2019] [Accepted: 12/12/2019] [Indexed: 01/06/2023]
Abstract
Finding biomarkers that provide a shared link between disease severity, drug-induced pharmacodynamic effects and response status in human trials can benefit patients in two ways: elucidating the current therapeutic mechanism of action, and back-translating to fast-track development of next-generation therapeutics. Both opportunities are predicated on proactive generation of human molecular profiles that capture longitudinal trajectories before and after pharmacological intervention. Here, we present the largest plasma proteomic biomarker dataset available to date and the corresponding analyses from placebo-controlled Phase III clinical trials of the phosphodiesterase type 4 inhibitor apremilast in psoriasis (PSOR), psoriatic arthritis (PsA), and ankylosing spondylitis (AS) from 526 subjects overall. Using approximately 150 plasma analytes tracked across three time points, we identified IL-17A and KLK-7 as biomarkers for disease severity and apremilast pharmacodynamic effect in psoriasis patients. The combined decline rate of KLK-7, PEDF, MDC and ANGPTL4 by Week 16 represented biomarkers for the responder subgroup, shedding light on therapeutic mechanisms. In ankylosing spondylitis patients, IL-6 and LRG-1 were identified as biomarkers with concordance to disease severity. Apremilast-induced LRG-1 increase was consistent with the overall lack of efficacy in ankylosing spondylitis. Taken together, these findings expanded the mechanistic knowledge base of apremilast and provided translational foundations to accelerate future efforts including compound differentiation, combination, and repurposing.
Affiliation(s)
- Irina V Medvedeva
- Celgene Corporation, Informatics & Predictive Sciences, Cambridge, 02140, USA
- Matthew E Stokes
- Celgene Corporation, Informatics & Predictive Sciences, Cambridge, 02140, USA
- Jing Ai
- Celgene Corporation, Informatics & Predictive Sciences, Cambridge, 02140, USA
- Matthew W B Trotter
- Celgene Corporation, Celgene Institute for Translational Research Europe (CITRE), Sevilla, 41092, Spain
- Peter Schafer
- Celgene Corporation, Translational Development, Summit, 07901, USA
- Robert Yang
- Celgene Corporation, Informatics & Predictive Sciences, Cambridge, 02140, USA
2
Baker N, Boobis A, Burgoon L, Carney E, Currie R, Fritsche E, Knudsen T, Laffont M, Piersma AH, Poole A, Schneider S, Daston G. Building a developmental toxicity ontology. Birth Defects Res 2018; 110:502-518. [DOI: 10.1002/bdr2.1189] [Citation(s) in RCA: 22] [Impact Index Per Article: 3.1] [Indexed: 11/06/2022]
Affiliation(s)
- Nancy Baker
- Lockheed Martin, Research Triangle Park; Piedmont North Carolina
- Alan Boobis
- Department of Medicine; Imperial College London; London United Kingdom
- Lyle Burgoon
- U.S. Army Engineer Research and Development Center; Raleigh-Durham North Carolina
- Thomas Knudsen
- U.S. Environmental Protection Agency; Research Triangle Park; Piedmont North Carolina
- Madeleine Laffont
- European Centre for Ecotoxicology and Toxicology of Chemicals (ECETOC); Brussels Belgium
- Aldert H. Piersma
- Center for Health Protection; National Institute for Public Health and the Environment (RIVM), Bilthoven, and Institute for Risk Assessment Sciences (IRAS), Utrecht University; Utrecht The Netherlands
- Alan Poole
- European Centre for Ecotoxicology and Toxicology of Chemicals (ECETOC); Brussels Belgium
- George Daston
- Central Product Safety Department; The Procter & Gamble Company; Mason Ohio
3
Application of text mining in the biomedical domain. Methods 2015; 74:97-106. [PMID: 25641519 DOI: 10.1016/j.ymeth.2015.01.015] [Citation(s) in RCA: 79] [Impact Index Per Article: 7.9] [Received: 05/08/2014] [Revised: 01/21/2015] [Accepted: 01/23/2015] [Indexed: 12/12/2022]
Abstract
In recent years the amount of experimental data produced in biomedical research and the number of papers published in this field have grown rapidly. To keep up to date with developments in their field of interest and to interpret the outcome of experiments in light of all available literature, researchers turn more and more to automated literature mining. As a consequence, text mining tools have evolved considerably in number and quality and nowadays can be used to address a variety of research questions, ranging from de novo drug target discovery to enhanced biological interpretation of the results from high-throughput experiments. In this paper we introduce the most important techniques that are used for text mining and give an overview of the text mining tools that are currently being used and the types of problems to which they are typically applied.
4
Fleuren WWM, Toonen EJM, Verhoeven S, Frijters R, Hulsen T, Rullmann T, van Schaik R, de Vlieg J, Alkema W. Identification of new biomarker candidates for glucocorticoid induced insulin resistance using literature mining. BioData Min 2013; 6:2. [PMID: 23379763 PMCID: PMC3577498 DOI: 10.1186/1756-0381-6-2] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.4] [Received: 06/08/2012] [Accepted: 01/02/2013] [Indexed: 02/08/2023]
Abstract
BACKGROUND Glucocorticoids are potent anti-inflammatory agents used for the treatment of diseases such as rheumatoid arthritis, asthma, inflammatory bowel disease and psoriasis. Unfortunately, usage is limited because of metabolic side-effects, e.g. insulin resistance, glucose intolerance and diabetes. To gain more insight into the mechanisms behind glucocorticoid induced insulin resistance, it is important to understand which genes play a role in the development of insulin resistance and which genes are affected by glucocorticoids. Medline abstracts contain many studies about insulin resistance and the molecular effects of glucocorticoids and thus are a good resource to study these effects. RESULTS We developed CoPubGene, a method to automatically identify gene-disease associations in Medline abstracts. We used this method to create a literature network of genes related to insulin resistance and to evaluate the importance of the genes in this network for glucocorticoid induced metabolic side effects and anti-inflammatory processes. With this approach we found several genes that already are considered markers of GC induced IR, such as phosphoenolpyruvate carboxykinase (PCK) and glucose-6-phosphatase, catalytic subunit (G6PC). In addition, we found genes involved in steroid synthesis that have not yet been recognized as mediators of GC induced IR. CONCLUSIONS With this approach we are able to construct a robust, informative literature network of insulin resistance related genes that gave new insights into the mechanisms behind GC induced IR. The method has been set up in a generic way so it can be applied to a wide variety of disease networks.
Affiliation(s)
- Wilco WM Fleuren
- Computational Drug Discovery (CDD), CMBI, NCMLS, Radboud University Nijmegen Medical Centre, P.O. Box 9101, 6500 HB, Nijmegen, The Netherlands
- Netherlands Bioinformatics Centre (NBIC), P.O. Box 9101, 6500 HB, Nijmegen, The Netherlands
- Erik JM Toonen
- Department of Medicine, Radboud University Nijmegen Medical Centre, Nijmegen, The Netherlands
- Raoul Frijters
- Computational Drug Discovery (CDD), CMBI, NCMLS, Radboud University Nijmegen Medical Centre, P.O. Box 9101, 6500 HB, Nijmegen, The Netherlands
- Present address: Rijk Zwaan Nederland BV, Fijnaart, The Netherlands
- Tim Hulsen
- Computational Drug Discovery (CDD), CMBI, NCMLS, Radboud University Nijmegen Medical Centre, P.O. Box 9101, 6500 HB, Nijmegen, The Netherlands
- Present address: Philips Research Europe, Eindhoven, The Netherlands
- Jacob de Vlieg
- Computational Drug Discovery (CDD), CMBI, NCMLS, Radboud University Nijmegen Medical Centre, P.O. Box 9101, 6500 HB, Nijmegen, The Netherlands
- Netherlands eScience Center, Amsterdam, The Netherlands
- Wynand Alkema
- Computational Drug Discovery (CDD), CMBI, NCMLS, Radboud University Nijmegen Medical Centre, P.O. Box 9101, 6500 HB, Nijmegen, The Netherlands
- Present address: NIZO Food Research BV, Ede, The Netherlands
5
Spjuth O, Eklund M, Ahlberg Helgee E, Boyer S, Carlsson L. Integrated Decision Support for Assessing Chemical Liabilities. J Chem Inf Model 2011; 51:1840-7. [DOI: 10.1021/ci200242c] [Citation(s) in RCA: 28] [Impact Index Per Article: 2.0] [Indexed: 11/28/2022]
Affiliation(s)
- Ola Spjuth
- Department of Pharmaceutical Biosciences, Uppsala University, Uppsala, Sweden
- Martin Eklund
- Department of Pharmaceutical Biosciences, Uppsala University, Uppsala, Sweden
- Ernst Ahlberg Helgee
- Computational Toxicology, Global Safety Assessment, AstraZeneca R&D, Mölndal, Sweden
- Scott Boyer
- Computational Toxicology, Global Safety Assessment, AstraZeneca R&D, Mölndal, Sweden
- Lars Carlsson
- Computational Toxicology, Global Safety Assessment, AstraZeneca R&D, Mölndal, Sweden
6
Boutros PC, Moffat ID, Okey AB, Pohjanvirta R. mRNA levels in control rat liver display strain-specific, hereditary, and AHR-dependent components. PLoS One 2011; 6:e18337. [PMID: 21760882 PMCID: PMC3132743 DOI: 10.1371/journal.pone.0018337] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.5] [Received: 09/08/2010] [Accepted: 03/04/2011] [Indexed: 02/05/2023]
Abstract
The rat is a major model organism in toxicogenomics and pharmacogenomics. Hepatic mRNA profiles after treatment with xenobiotic chemicals are used to predict and understand drug toxicity and mechanisms. Surprisingly, neither the inter- and intra-strain variability of mRNA abundances in control rats nor the heritability of rat mRNA abundances has yet been established. We address these issues by studying five populations: the popular Sprague-Dawley strain, sub-strains of Long-Evans and Wistar rats, and two lines derived from crosses between the Long-Evans and Wistar sub-strains. Using three independent techniques--variance analysis, linear modelling, and unsupervised pattern recognition--we characterize extensive intra- and inter-strain variability in mRNA levels. We find that both sources of variability are non-random and are enriched for specific functional groups. Specific transcription-factor binding-sites are enriched in their promoter regions and these genes occur in "islands" scattered throughout the rat genome. Using the two lines generated by crossbreeding we tested heritability of hepatic mRNA levels: the majority of rat genes appear to exhibit directional genetics, with only a few interacting loci. Finally, a comparison of inter-strain heterogeneity between mouse and rat orthologs shows more heterogeneity in rats than mice; thus rat and mouse heterogeneity are uncorrelated. Our results establish that control hepatic mRNA levels are relatively homogeneous within rat strains but highly variable between strains. This variability may be related to increased activity of specific transcription-factors and has clear functional consequences. Future studies may take advantage of this phenomenon by surveying panels of rat strains.
Affiliation(s)
- Paul C Boutros
- Department of Pharmacology and Toxicology, University of Toronto, Toronto, Canada.
7
Fleuren WWM, Verhoeven S, Frijters R, Heupers B, Polman J, van Schaik R, de Vlieg J, Alkema W. CoPub update: CoPub 5.0 a text mining system to answer biological questions. Nucleic Acids Res 2011; 39:W450-4. [PMID: 21622961 PMCID: PMC3125746 DOI: 10.1093/nar/gkr310] [Citation(s) in RCA: 30] [Impact Index Per Article: 2.1] [Indexed: 01/07/2023]
Abstract
In this article, we present CoPub 5.0, a publicly available text mining system, which uses Medline abstracts to calculate robust statistics for keyword co-occurrences. CoPub was initially developed for the analysis of microarray data, but we broadened the scope by implementing new technology and new thesauri. In CoPub 5.0, we integrated existing CoPub technology with new features, and provided a new advanced interface, which can be used to answer a variety of biological questions. CoPub 5.0 allows searching for keywords of interest and their relations to curated thesauri and provides highlighting and sorting mechanisms, using its statistics, to retrieve the most important abstracts in which the terms co-occur. It also provides a way to search for indirect relations between genes, drugs, pathways and diseases, following an ABC principle, in which A and C have no direct connection but are connected via shared B intermediates. With CoPub 5.0, it is possible to create, annotate and analyze networks using the layout and highlight options of Cytoscape web, allowing for literature-based systems biology. Finally, operations of the CoPub 5.0 Web service make it possible to use the CoPub technology in bioinformatics workflows. CoPub 5.0 can be accessed through the CoPub portal at http://www.copub.org.
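The abstract does not say which statistic CoPub 5.0 computes for keyword co-occurrences. As a rough illustration of the general idea only, the sketch below scores keyword pairs with pointwise mutual information (PMI) over a toy set of abstracts. The function name `cooccurrence_pmi`, the choice of PMI, and the example data are all assumptions made for illustration; they are not CoPub's actual method or API.

```python
import math
from collections import Counter
from itertools import combinations


def cooccurrence_pmi(docs):
    """Score keyword pairs by pointwise mutual information (PMI).

    docs: list of sets, each holding the keywords found in one abstract.
    Returns {(a, b): pmi} for every pair that co-occurs at least once,
    with each pair stored in sorted order.
    NOTE: PMI is an illustrative stand-in, not CoPub's own statistic.
    """
    n = len(docs)
    single = Counter()  # number of abstracts mentioning each keyword
    pair = Counter()    # number of abstracts mentioning both keywords
    for keywords in docs:
        single.update(keywords)
        pair.update(combinations(sorted(keywords), 2))
    return {
        (a, b): math.log2((count / n) / ((single[a] / n) * (single[b] / n)))
        for (a, b), count in pair.items()
    }


# Hypothetical toy corpus: each set is the keyword content of one abstract.
abstracts = [
    {"IL6", "inflammation"},
    {"IL6", "inflammation"},
    {"IL6", "insulin resistance"},
    {"PCK1", "insulin resistance"},
]
scores = cooccurrence_pmi(abstracts)
```

A positive score means the pair co-occurs more often than expected if the two keywords were independent; a negative score means less often.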
Affiliation(s)
- Wilco W M Fleuren
- Computational Drug Discovery, CMBI, NCMLS, Radboud University Nijmegen Medical Centre, 6500 HB Nijmegen, The Netherlands.
8
Hakvoort TBM, Moerland PD, Frijters R, Sokolović A, Labruyère WT, Vermeulen JLM, Ver Loren van Themaat E, Breit TM, Wittink FRA, van Kampen AHC, Verhoeven AJ, Lamers WH, Sokolović M. Interorgan coordination of the murine adaptive response to fasting. J Biol Chem 2011; 286:16332-43. [PMID: 21393243 DOI: 10.1074/jbc.m110.216986] [Citation(s) in RCA: 40] [Impact Index Per Article: 2.9] [Indexed: 11/06/2022]
Abstract
Starvation elicits a complex adaptive response in an organism. No information on transcriptional regulation of metabolic adaptations is available. We, therefore, studied the gene expression profiles of brain, small intestine, kidney, liver, and skeletal muscle in mice that were subjected to 0-72 h of fasting. Functional-category enrichment, text mining, and network analyses were employed to scrutinize the overall adaptation, aiming to identify responsive pathways, processes, and networks, and their regulation. The observed transcriptomics response did not follow the accepted "carbohydrate-lipid-protein" succession of expenditure of energy substrates. Instead, these processes were activated simultaneously in different organs during the entire period. The most prominent changes occurred in lipid and steroid metabolism, especially in the liver and kidney. They were accompanied by suppression of the immune response and cell turnover, particularly in the small intestine, and by increased proteolysis in the muscle. The brain was extremely well protected from the sequels of starvation. 60% of the identified overconnected transcription factors were organ-specific, 6% were common for 4 organs, with nuclear receptors as protagonists, accounting for almost 40% of all transcriptional regulators during fasting. The common transcription factors were PPARα, HNF4α, GCRα, AR (androgen receptor), SREBP1 and -2, FOXOs, EGR1, c-JUN, c-MYC, SP1, YY1, and ETS1. Our data strongly suggest that the control of metabolism in four metabolically active organs is exerted by transcription factors that are activated by nutrient signals and serves, at least partly, to prevent irreversible brain damage.
Affiliation(s)
- Theodorus B M Hakvoort
- Tytgat Institute for Liver and Intestinal Research (formerly AMC Liver Center), Academic Medical Center, University of Amsterdam, Amsterdam, The Netherlands
9
Deftereos SN, Andronis C, Friedla EJ, Persidis A, Persidis A. Drug repurposing and adverse event prediction using high-throughput literature analysis. Wiley Interdiscip Rev Syst Biol Med 2011; 3:323-34. [PMID: 21416632 DOI: 10.1002/wsbm.147] [Citation(s) in RCA: 80] [Impact Index Per Article: 5.7] [Indexed: 01/06/2023]
Abstract
Drug repurposing is the process of using existing drugs in indications other than the ones they were originally designed for. It is an area of significant recent activity due to the mounting costs of traditional drug development and scarcity of new chemical entities brought to the market by bio-pharmaceutical companies. By selecting drugs that already satisfy basic toxicity, ADME and related criteria, drug repurposing promises to deliver significant value at reduced cost and in dramatically shorter time frames than is normally the case for the drug development process. The same process that results in drug repurposing can also be used for the prediction of adverse events of known or novel drugs. The analytics method is based on the description of the mechanism of action of a drug, which is then compared to the molecular mechanisms underlying all known adverse events. This review will focus on those approaches to drug repurposing and adverse event prediction that are based on the biomedical literature. Such approaches typically begin with an analysis of the literature and aim to reveal indirect relationships among seemingly unconnected biomedical entities such as genes, signaling pathways, physiological processes, and diseases. Networks of associations of these entities allow the uncovering of the molecular mechanisms underlying a disease, better understanding of the biological effects of a drug and the evaluation of its benefit/risk profile. In silico results can be tested in relevant cellular and animal models and, eventually, in clinical trials.
10
Frijters R, van Vugt M, Smeets R, van Schaik R, de Vlieg J, Alkema W. Literature mining for the discovery of hidden connections between drugs, genes and diseases. PLoS Comput Biol 2010; 6. [PMID: 20885778 PMCID: PMC2944780 DOI: 10.1371/journal.pcbi.1000943] [Citation(s) in RCA: 120] [Impact Index Per Article: 8.0] [Received: 03/15/2010] [Accepted: 08/26/2010] [Indexed: 01/19/2023]
Abstract
The scientific literature represents a rich source for retrieval of knowledge on associations between biomedical concepts such as genes, diseases and cellular processes. A commonly used method to establish relationships between biomedical concepts from literature is co-occurrence. Apart from its use in knowledge retrieval, the co-occurrence method is also well-suited to discover new, hidden relationships between biomedical concepts following a simple ABC-principle, in which A and C have no direct relationship, but are connected via shared B-intermediates. In this paper we describe CoPub Discovery, a tool that mines the literature for new relationships between biomedical concepts. Statistical analysis using ROC curves showed that CoPub Discovery performed well over a wide range of settings and keyword thesauri. We subsequently used CoPub Discovery to search for new relationships between genes, drugs, pathways and diseases. Several of the newly found relationships were validated using independent literature sources. In addition, new predicted relationships between compounds and cell proliferation were validated and confirmed experimentally in an in vitro cell proliferation assay. The results show that CoPub Discovery is able to identify novel associations between genes, drugs, pathways and diseases that have a high probability of being biologically valid. This makes CoPub Discovery a useful tool to unravel the mechanisms behind disease, to find novel drug targets, or to find novel applications for existing drugs. The biomedical literature is an important source of knowledge on the function of genes and on the mechanisms by which these genes regulate cellular processes. Several text mining approaches have been developed to leverage this rich source of information by automatically extracting associations between concepts such as genes, diseases and drugs from a large body of text. 
Here, we describe a new method that extracts novel, not yet recognized associations between genes, diseases, drugs and cellular processes from the biomedical literature. Our method is built on the assumption that even if two concepts do not have a direct connection in the literature, they may be functionally related if they are both connected to an overlapping set of concepts. Using this approach we predicted several novel connections between genes, diseases, drugs and pathways. Our results imply that our method is able to predict novel relationships from literature and, most importantly, that these newly identified relationships are biologically relevant. Our method can aid the drug discovery process, where it can be used to find novel drug targets, increase insight into the mode of action of a drug, or find novel applications for known drugs.
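The ABC principle described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the CoPub Discovery implementation: the function name `abc_candidates`, the dictionary-of-sets representation of co-occurrence, and the toy terms are all hypothetical, and the abstract's statistical scoring is replaced here by a plain count of shared intermediates.

```python
from collections import defaultdict


def abc_candidates(cooccurs, a):
    """Find hidden A-C links via shared B intermediates.

    cooccurs: dict mapping each term to the set of terms it directly
              co-occurs with in the literature (assumed symmetric).
    a:        the starting term (A).
    Returns {c: set of B terms} for every C that is reachable through
    at least one B but has no direct co-occurrence with A.
    """
    direct = cooccurs.get(a, set())
    found = defaultdict(set)
    for b in direct:
        for c in cooccurs.get(b, set()):
            if c != a and c not in direct:
                found[c].add(b)
    return dict(found)


# Hypothetical co-occurrence graph; the names are placeholders, not real findings.
graph = {
    "drug_X": {"gene_1", "gene_2"},
    "gene_1": {"drug_X", "disease_Y"},
    "gene_2": {"drug_X", "disease_Y", "disease_Z"},
    "disease_Y": {"gene_1", "gene_2"},
    "disease_Z": {"gene_2"},
}
hidden = abc_candidates(graph, "drug_X")
# Rank C terms by how many B intermediates support them (a simplification).
ranked = sorted(hidden, key=lambda c: len(hidden[c]), reverse=True)
```

In this toy graph, `disease_Y` is supported by two intermediates and `disease_Z` by one, so `disease_Y` ranks first. A real system would weight each A-B and B-C link by a co-occurrence statistic rather than counting intermediates.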
Affiliation(s)
- Raoul Frijters
- Computational Drug Discovery (CDD), Nijmegen Centre for Molecular Life Sciences (NCMLS), Radboud University Nijmegen Medical Centre, Nijmegen, The Netherlands
- Marianne van Vugt
- Department of Immune Therapeutics, Schering-Plough, Oss, The Netherlands
- Ruben Smeets
- Department of Immune Therapeutics, Schering-Plough, Oss, The Netherlands
- René van Schaik
- Department of Molecular Design & Informatics, Schering-Plough, Oss, The Netherlands
- Jacob de Vlieg
- Computational Drug Discovery (CDD), Nijmegen Centre for Molecular Life Sciences (NCMLS), Radboud University Nijmegen Medical Centre, Nijmegen, The Netherlands
- Department of Molecular Design & Informatics, Schering-Plough, Oss, The Netherlands
- Wynand Alkema
- Department of Molecular Design & Informatics, Schering-Plough, Oss, The Netherlands
11
Feingold BJ, Vegosen L, Davis M, Leibler J, Peterson A, Silbergeld EK. A niche for infectious disease in environmental health: rethinking the toxicological paradigm. Environ Health Perspect 2010; 118:1165-72. [PMID: 20385515 PMCID: PMC2920090 DOI: 10.1289/ehp.0901866] [Citation(s) in RCA: 35] [Impact Index Per Article: 2.3] [Received: 12/22/2009] [Accepted: 04/12/2010] [Indexed: 05/21/2023]
Abstract
OBJECTIVE In this review we highlight the need to expand the scope of environmental health research, which now focuses largely on the study of toxicants, to incorporate infectious agents. We provide evidence that environmental health research would be strengthened through finding common ground with the tools and approaches of infectious disease research. DATA SOURCES AND EXTRACTION We conducted a literature review for examples of interactions between toxic agents and infectious diseases, as well as the role of these interactions as risk factors in classic "environmental" diseases. We investigated existing funding sources and research mandates in the United States from the National Science Foundation and the National Institutes of Health, particularly the National Institute of Environmental Health Sciences. DATA SYNTHESIS We adapted the toxicological paradigm to guide reintegration of infectious disease into environmental health research and to identify common ground between these two fields as well as opportunities for improving public health through interdisciplinary research. CONCLUSIONS Environmental health encompasses complex disease processes, many of which involve interactions among multiple risk factors, including toxicant exposures, pathogens, and susceptibility. Funding and program mandates for environmental health studies should be expanded to include pathogens in order to capture the true scope of these overlapping risks, thus creating more effective research investments with greater relevance to the complexity of real-world exposures and multifactorial health outcomes. We propose a new model that integrates the toxicology and infectious disease paradigms to facilitate improved collaboration and communication by providing a framework for interdisciplinary research. Pathogens should be part of environmental health research planning and funding allocation, as well as applications such as surveillance and policy development.
Affiliation(s)
- Beth J Feingold
- Department of Environmental Health Sciences, Johns Hopkins Bloomberg School of Public Health, Baltimore, Maryland, USA.
12
Hettne KM, van Mulligen EM, Schuemie MJ, Schijvenaars BJ, Kors JA. Rewriting and suppressing UMLS terms for improved biomedical term identification. J Biomed Semantics 2010; 1:5. [PMID: 20618981 PMCID: PMC2895736 DOI: 10.1186/2041-1480-1-5] [Citation(s) in RCA: 19] [Impact Index Per Article: 1.3] [Received: 07/15/2009] [Accepted: 03/31/2010] [Indexed: 11/17/2022]
Abstract
Background Identification of terms is essential for biomedical text mining. We concentrate here on the use of vocabularies for term identification, specifically the Unified Medical Language System (UMLS). To make the UMLS more suitable for biomedical text mining we implemented and evaluated nine term rewrite and eight term suppression rules. The rules rely on UMLS properties that have been identified in previous work by others, together with an additional set of new properties discovered by our group during our work with the UMLS. Our work complements the earlier work in that we measure the impact on the number of terms identified by the different rules on a MEDLINE corpus. The number of uniquely identified terms and their frequency in MEDLINE were computed before and after applying the rules. The 50 most frequently found terms together with a sample of 100 randomly selected terms were evaluated for every rule. Results Five of the nine rewrite rules were found to generate additional synonyms and spelling variants that correctly corresponded to the meaning of the original terms and seven out of the eight suppression rules were found to suppress only undesired terms. Using the five rewrite rules that passed our evaluation, we were able to identify 1,117,772 new occurrences of 14,784 rewritten terms in MEDLINE. Without the rewriting, we recognized 651,268 terms belonging to 397,414 concepts; with rewriting, we recognized 666,053 terms belonging to 410,823 concepts, which is an increase of 2.8% in the number of terms and an increase of 3.4% in the number of concepts recognized. Using the seven suppression rules, a total of 257,118 undesired terms were suppressed in the UMLS, notably decreasing its size. 7,397 terms were suppressed in the corpus. Conclusions We recommend applying the five rewrite rules and seven suppression rules that passed our evaluation when the UMLS is to be used for biomedical term identification in MEDLINE.
A software tool to apply these rules to the UMLS is freely available at http://biosemantics.org/casper.
Affiliation(s)
- Kristina M Hettne
- Department of Medical Informatics, Erasmus University Medical Center, Rotterdam, the Netherlands.
13
Boobis AR, Cohen SM, Doerrer NG, Galloway SM, Haley PJ, Hard GC, Hess FG, Macdonald JS, Thibault S, Wolf DC, Wright J. A Data-Based Assessment of Alternative Strategies for Identification of Potential Human Cancer Hazards. Toxicol Pathol 2009; 37:714-32. [DOI: 10.1177/0192623309343779] [Citation(s) in RCA: 40] [Impact Index Per Article: 2.5] [Indexed: 01/07/2023]
Abstract
The two-year cancer bioassay in rodents remains the primary testing strategy for in-life screening of compounds that might pose a potential cancer hazard. Yet experimental evidence shows that cancer is often secondary to a biological precursor effect, the mode of action is sometimes not relevant to humans, and key events leading to cancer in rodents from nongenotoxic agents usually occur well before tumorigenesis and at the same or lower doses than those producing tumors. The International Life Sciences Institute (ILSI) Health and Environmental Sciences Institute (HESI) hypothesized that the signals of importance for human cancer hazard identification can be detected in shorter-term studies. Using the National Toxicology Program (NTP) database, a retrospective analysis was conducted on sixteen chemicals with liver, lung, or kidney tumors in two-year rodent cancer bioassays, and for which short-term data were also available. For nongenotoxic compounds, results showed that cellular changes indicative of a tumorigenic endpoint can be identified for many, but not all, of the chemicals producing tumors in two-year studies after thirteen weeks utilizing conventional endpoints. Additional endpoints are needed to identify some signals not detected with routine evaluation. This effort defined critical questions that should be explored to improve the predictivity of human carcinogenic risk.
Affiliation(s)
- Nancy G. Doerrer
- ILSI Health and Environmental Sciences Institute, Washington, D.C., 20005 USA
- Douglas C. Wolf
- U.S. Environmental Protection Agency, Research Triangle Park, NC, 27713 USA
14
Janga SC, Tzakos A. Structure and organization of drug-target networks: insights from genomic approaches for drug discovery. Mol Biosyst 2009; 5:1536-48. [DOI: 10.1039/b908147j] [Citation(s) in RCA: 77] [Impact Index Per Article: 4.8] [Indexed: 12/25/2022]
15
Plant N. Can systems toxicology identify common biomarkers of non-genotoxic carcinogenesis? Toxicology 2008; 254:164-9. [PMID: 18674585 DOI: 10.1016/j.tox.2008.07.001] [Citation(s) in RCA: 12] [Impact Index Per Article: 0.7] [Received: 05/06/2008] [Revised: 06/30/2008] [Accepted: 07/01/2008] [Indexed: 10/25/2022]
Abstract
For the rapid development of safe, efficacious chemicals it is important that any potential liabilities are identified as early as possible in the discovery/development pipeline. Once they are identified, it is possible to make rational decisions on whether to progress a chemical and/or series further; one such liability is chemical carcinogenesis, a highly undesirable characteristic in a novel chemical entity. Chemical carcinogens may be roughly divided into two classes: those that elicit their actions through direct damage to DNA (genotoxic carcinogens) and those that cause carcinogenesis through mechanisms that do not involve direct damage of the DNA by the agent (non-genotoxic carcinogens). Whereas the former group can be identified by in vitro screens to a good degree of accuracy, the latter group is far more problematic due to its diverse modes of action. This review will focus on the latter class of chemical carcinogens, examining how modern '-omic' technologies have begun to identify signatures that may represent sensitive, early markers for these processes. In addition to their use in signature generation, the role of -omic level approaches in delineating molecular mechanisms of action will also be discussed.
Affiliation(s)
- Nick Plant
- Centre for Toxicology, Faculty of Health and Medical Sciences, University of Surrey, Guildford, Surrey GU2 7XH, UK.
16
Frijters R, Heupers B, van Beek P, Bouwhuis M, van Schaik R, de Vlieg J, Polman J, Alkema W. CoPub: a literature-based keyword enrichment tool for microarray data analysis. Nucleic Acids Res 2008; 36:W406-10. [PMID: 18442992 PMCID: PMC2447728 DOI: 10.1093/nar/gkn215] [Citation(s) in RCA: 66] [Impact Index Per Article: 3.9] [Indexed: 11/17/2022]
Abstract
Medline is a rich information source, from which links between genes and keywords describing biological processes, pathways, drugs, pathologies and diseases can be extracted. We developed a publicly available tool called CoPub that uses the information in the Medline database for the biological interpretation of microarray data. CoPub allows batch input of multiple human, mouse or rat genes and produces lists of keywords from several biomedical thesauri that are significantly correlated with the set of input genes. These lists link to Medline abstracts in which the co-occurring input genes and correlated keywords are highlighted. Furthermore, CoPub can graphically visualize differentially expressed genes and over-represented keywords in a network, providing detailed insight into the relationships between genes and keywords, and revealing the most influential genes as highly connected hubs. CoPub is freely accessible at http://services.nbic.nl/cgi-bin/copub/CoPub.pl.
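The abstract does not specify how CoPub decides that a keyword is over-represented for a gene set. One common way to test over-representation in general is an upper-tail hypergeometric test, sketched below as an assumption rather than as CoPub's actual statistic. The function name `enrichment_p` and all numbers in the example are hypothetical.

```python
from math import comb


def enrichment_p(total_genes, linked_genes, input_genes, overlap):
    """Upper-tail hypergeometric p-value, P(X >= overlap).

    total_genes:  number of genes in the background set
    linked_genes: background genes that co-occur with the keyword
    input_genes:  size of the submitted gene list
    overlap:      submitted genes that co-occur with the keyword
    NOTE: a generic over-representation test, not CoPub's own method.
    """
    upper = min(linked_genes, input_genes)
    favourable = sum(
        comb(linked_genes, i) * comb(total_genes - linked_genes, input_genes - i)
        for i in range(overlap, upper + 1)
    )
    return favourable / comb(total_genes, input_genes)


# Hypothetical example: 10 background genes, 3 linked to the keyword,
# 3 genes submitted, and all 3 of them linked to the keyword.
p_value = enrichment_p(10, 3, 3, 3)
```

A small p-value suggests the keyword is linked to more of the input genes than chance alone would explain. When many keywords are tested at once, a multiple-testing correction would normally be applied as well.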
Affiliation(s)
- Raoul Frijters
- Computational Drug Discovery (CDD), Nijmegen Centre for Molecular Life Sciences (NCMLS), Radboud University Nijmegen Medical Centre, Nijmegen, The Netherlands