1
Zhu TF, Qian R, Wei X, Lu AP, Cao DS. PatentNetML: A Novel Framework for Predicting Key Compounds in Patents Using Network Science and Machine Learning. J Med Chem 2024; 67:1347-1359. [PMID: 38181431 DOI: 10.1021/acs.jmedchem.3c01893] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/07/2024]
Abstract
Patents play a crucial role in drug research and development, providing early access to unpublished data and offering unique insights. Identifying key compounds in patents is essential to finding novel lead compounds. This study collected a comprehensive data set comprising 1555 patents, encompassing 1000 key compounds, to explore innovative approaches for predicting these key compounds. Our novel PatentNetML framework integrated network science and machine learning algorithms, combining network measures, ADMET properties, and physicochemical properties, to construct robust classification models to identify key compounds. Through a model interpretation and an analysis of three compelling case studies, we showcase the potential of PatentNetML in unveiling hidden patterns and connections within diverse patents. While our framework is pioneering, we acknowledge its limitations when applied to patents that deviate from the assumed central pattern. This work serves as a promising foundation for future research endeavors aimed at efficiently identifying promising drug candidates and expediting drug discovery in the pharmaceutical industry.
Affiliation(s)
- Ting-Fei Zhu
  - Xiangya School of Pharmaceutical Sciences, Central South University, Changsha 410003, Hunan, China
  - School of Chinese Medicine, Hong Kong Baptist University, Hong Kong SAR 999077, China
- Rong Qian
  - Xiangya School of Pharmaceutical Sciences, Central South University, Changsha 410003, Hunan, China
  - School of Chinese Medicine, Hong Kong Baptist University, Hong Kong SAR 999077, China
- Xiao Wei
  - Xiangya School of Pharmaceutical Sciences, Central South University, Changsha 410003, Hunan, China
- Ai-Ping Lu
  - School of Chinese Medicine, Hong Kong Baptist University, Hong Kong SAR 999077, China
  - Guangdong-Hong Kong-Macau Joint Lab on Chinese Medicine and Immune Disease Research, Guangzhou 510000, China
- Dong-Sheng Cao
  - Xiangya School of Pharmaceutical Sciences, Central South University, Changsha 410003, Hunan, China
  - School of Chinese Medicine, Hong Kong Baptist University, Hong Kong SAR 999077, China
2
Martinez-Sevillano M, Falaguera MJ, Mestres J. CIPSI: An open chemical intellectual property service for medicinal chemists. Mol Inform 2024; 43:e202300221. [PMID: 38010631 DOI: 10.1002/minf.202300221] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/30/2023] [Revised: 11/22/2023] [Accepted: 11/23/2023] [Indexed: 11/29/2023]
Abstract
The availability of patent chemical data offers public access to a chemical space that is not well covered by other sources collecting small molecules from scholarly literature. However, open applications to facilitate the search and analysis of biologically-relevant molecular structures present in patents are still largely missing. We have developed CIPSI, an open Chemical Intellectual Property Service @ IMIM to assist medicinal chemists in searching and analysing molecules in SureChEMBL patents. The current version contains 6,240,500 molecules from 236,689 pharmacological patents, of which 5,949,214 are confidently assigned to core chemical structures reminiscent of the Markush structure in the patent claim. The platform includes some graphical tools to facilitate comparative patent analyses between drugs, chemical substructures, and company assignees. CIPSI is available at https://cipsi.org.
Affiliation(s)
- Maria Martinez-Sevillano
  - Systems Pharmacology, Research Group on Biomedical Informatics (GRIB), IMIM Hospital del Mar Medical Research Institute, Doctor Aiguader 88, 08028, Barcelona, Spain
- Maria J Falaguera
  - European Molecular Biology Laboratory - European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, CB10 1SD, UK
  - Open Targets, Wellcome Genome Campus, Hinxton, CB10 1SD, UK
- Jordi Mestres
  - Systems Pharmacology, Research Group on Biomedical Informatics (GRIB), IMIM Hospital del Mar Medical Research Institute, Doctor Aiguader 88, 08028, Barcelona, Spain
  - Institut de Quimica Computacional i Catalisi, Facultat de Ciencies, Universitat de Girona, Maria Aurelia Capmany 69, 17003, Girona, Spain
3
Ohms J. Validity of PubChem compounds supplied by Patentscope or SureChEMBL. World Patent Information 2022. [DOI: 10.1016/j.wpi.2022.102134] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/30/2022]
4
Škuta C, Southan C, Bartůněk P. Will the chemical probes please stand up? RSC Med Chem 2021; 12:1428-1441. [PMID: 34447939 PMCID: PMC8372204 DOI: 10.1039/d1md00138h] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Received: 04/23/2021] [Accepted: 06/28/2021] [Indexed: 12/22/2022]
Abstract
In 2005, the NIH Molecular Libraries Program (MLP) undertook the identification of tool compounds to expand biological insights, now termed small-molecule chemical probes. This inspired other organisations to initiate similar efforts from 2010 onwards. As a central focus of the Probes & Drugs portal (P&D), we have standardised, integrated and compared sets of declared probe compounds harvested from 12 different sources. This turned out to be challenging and revealed unexpected anomalies. Results in this work address key questions including: a) individual and total structure counts, b) overlaps between sources, c) comparisons with selected PubChem sources and d) the probe coverage of druggable targets. In addition, we developed new high-level scoring schemes to filter collections down to probes of higher quality. This generated 548 high-quality chemical probes (HQCP) covering 447 distinct protein targets. This HQCP collection has been added to the P&D portal and will be regularly updated as established sources expand and new ones release data.
Affiliation(s)
- Ctibor Škuta
  - CZ-OPENSCREEN, National Infrastructure for Chemical Biology, Institute of Molecular Genetics of the Czech Academy of Sciences, Vídeňská 1083, 142 20 Prague 4, Czech Republic
- Christopher Southan
  - Deanery of Biomedical Sciences, University of Edinburgh, Edinburgh EH8 9XD, UK
- Petr Bartůněk
  - CZ-OPENSCREEN, National Infrastructure for Chemical Biology, Institute of Molecular Genetics of the Czech Academy of Sciences, Vídeňská 1083, 142 20 Prague 4, Czech Republic
5
Falaguera MJ, Mestres J. Identification of the Core Chemical Structure in SureChEMBL Patents. J Chem Inf Model 2021; 61:2241-2247. [PMID: 33929850 DOI: 10.1021/acs.jcim.1c00151] [Citation(s) in RCA: 9] [Impact Index Per Article: 3.0] [Indexed: 11/28/2022]
Abstract
The SureChEMBL database provides open access to 17 million chemical entities mentioned in 14 million patents published since 1970. However, alongside molecules covered by patent claims, the database is full of starting materials and intermediate products of little pharmacological relevance. Herein, we introduce a new filtering protocol to automatically select the core chemical structures best representing a congeneric series of pharmacologically relevant molecules in patents. The protocol is first validated against a selection of 890 SureChEMBL patents for which a total of 51,738 manually curated molecules are deposited in ChEMBL. Our protocol was able to select 92.5% of the molecules in ChEMBL from all 270,968 molecules in SureChEMBL for those patents. Subsequently, the protocol was applied to all 240,988 US pharmacological patents for which 9,111,706 molecules are available in SureChEMBL. The unsupervised filtering process selected 5,949,214 molecules (65.3% of the total number of molecules) that form highly congeneric chemical series in 188,795 of those patents (78.3% of the total number of patents). A SureChEMBL version enriched with molecules of pharmacological relevance is available for download at https://ftp.ebi.ac.uk/pub/databases/chembl/SureChEMBLccs.
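The two coverage percentages quoted in this abstract can be reproduced from the counts it reports. The short Python sketch below only illustrates that arithmetic; the variable and function names are hypothetical and are not taken from the paper or from SureChEMBL.

```python
# Counts quoted in the abstract above; the names are illustrative only.
total_molecules = 9_111_706      # molecules in the US pharmacological patents
selected_molecules = 5_949_214   # molecules kept by the filtering protocol
total_patents = 240_988          # US pharmacological patents processed
patents_with_series = 188_795    # patents with a congeneric chemical series


def percent(part: int, whole: int) -> float:
    """Return part/whole as a percentage rounded to one decimal place."""
    return round(100 * part / whole, 1)


molecule_share = percent(selected_molecules, total_molecules)
patent_share = percent(patents_with_series, total_patents)
print(molecule_share, patent_share)
```

Both values agree with the 65.3% and 78.3% stated in the abstract.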
Affiliation(s)
- Maria J Falaguera
  - Research Group on Systems Pharmacology, Research Program on Biomedical Informatics (GRIB), IMIM Hospital del Mar Medical Research Institute and University Pompeu Fabra, Parc de Recerca Biomèdica (PRBB), Doctor Aiguader 88, 08003 Barcelona, Catalonia, Spain
- Jordi Mestres
  - Research Group on Systems Pharmacology, Research Program on Biomedical Informatics (GRIB), IMIM Hospital del Mar Medical Research Institute and University Pompeu Fabra, Parc de Recerca Biomèdica (PRBB), Doctor Aiguader 88, 08003 Barcelona, Catalonia, Spain
6
Southan C. Opening up connectivity between documents, structures and bioactivity. Beilstein J Org Chem 2020; 16:596-606. [PMID: 32280387 PMCID: PMC7136548 DOI: 10.3762/bjoc.16.54] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 11/22/2019] [Accepted: 03/12/2020] [Indexed: 12/17/2022]
Abstract
Bioscientists reading papers or patents strive to discern the key relationships reported within a document "D" where a bioactivity "A" with a quantitative result "R" (e.g., an IC50) is reported for chemical structure "C" that modulates (e.g., inhibits) a protein target "P". A useful shorthand for this connectivity thus becomes DARCP. The problem at the core of this article is that the community has spent millions effectively burying these relationships in PDFs over many decades but must now spend millions more trying to get them back out. The key imperative for this is to increase the flow into structured open databases. The positive impacts will include expanded data mining opportunities for drug discovery and chemical biology. Over the last decade commercial sources have manually extracted DARCP from ≈300,000 documents encompassing ≈7 million compounds interacting with ≈10,000 targets. Over a similar time, the Guide to Pharmacology, BindingDB and ChEMBL have carried out analogous DARCP extractions. Although their expert-curated numbers are lower (i.e., ≈2 million compounds against ≈3700 human proteins), these open sources have the great advantage of being merged within PubChem. Parallel efforts have focused on the extraction of document-to-compound (D-C-only) connectivity. In the absence of molecular mechanism of action (mmoa) annotation, this is of less value but can be automatically extracted. This has been significantly accomplished for patents (e.g., by IBM, SureChEMBL and WIPO) for over 30 million compounds in PubChem. These have recently been joined by 1.4 million D-C submissions from three major chemistry publishers. In addition, both the European and US PubMed Central portals now add chemistry look-ups from abstracts and full-text papers. However, the fully automated extraction of DARCP has not yet been achieved. This stands in contrast to the ability of biocurators to discern these relationships in minutes.
Unfortunately, no journals have yet instigated a flow of author-specified DARCP directly into open databases. Progress may come from trends such as open science, open access (OA), findable, accessible, interoperable and reusable (FAIR), resource description framework (RDF) and WikiData. However, we will need to await the technical applicability in respect to DARCP capture to see if this opens up connectivity.
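The DARCP shorthand defined in this abstract maps naturally onto a small record type. The Python sketch below is a hypothetical illustration: the class name, field names and placeholder identifiers are assumptions and do not come from the article or from any of the databases it mentions. It also shows how a document-to-compound (D-C-only) link differs from a full DARCP relationship.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class DarcpRecord:
    """One document-activity-result-compound-protein relationship."""
    document: str                   # D: paper or patent identifier
    compound: str                   # C: chemical structure identifier
    activity: Optional[str] = None  # A: type of bioactivity, e.g. "IC50"
    result: Optional[str] = None    # R: quantitative value with units
    protein: Optional[str] = None   # P: protein target identifier

    def is_full_darcp(self) -> bool:
        """True when activity, result and protein accompany the D-C link."""
        return None not in (self.activity, self.result, self.protein)


# Placeholder identifiers, not real database entries.
full = DarcpRecord("DOC-1", "CPD-1", activity="IC50", result="10 nM", protein="PROT-1")
dc_only = DarcpRecord("DOC-1", "CPD-2")  # document-to-compound link only

print(full.is_full_darcp(), dc_only.is_full_darcp())
```

Making the A, R and P fields optional is a design choice for this sketch: it lets automatically extracted D-C-only links and expert-curated DARCP records share one type.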
Affiliation(s)
- Christopher Southan
  - Deanery of Biomedical Sciences, University of Edinburgh, Edinburgh, EH8 9XD, UK
  - TW2Informatics Ltd, Västra Frölunda, Gothenburg, 42166, Sweden
7
Southan C, Sharman JL, Faccenda E, Pawson AJ, Harding SD, Davies JA. Challenges of Connecting Chemistry to Pharmacology: Perspectives from Curating the IUPHAR/BPS Guide to PHARMACOLOGY. ACS Omega 2018; 3:8408-8420. [PMID: 30087946 PMCID: PMC6070956 DOI: 10.1021/acsomega.8b00884] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Received: 05/02/2018] [Accepted: 07/12/2018] [Indexed: 06/08/2023]
Abstract
Connecting chemistry to pharmacology has been an objective of Guide to PHARMACOLOGY (GtoPdb) and its precursor the International Union of Basic and Clinical Pharmacology Database (IUPHAR-DB) since 2003. This has been achieved by populating our database with expert-curated relationships between documents, assays, quantitative results, chemical structures, their locations within the documents, and the protein targets in the assays (D-A-R-C-P). A wide range of challenges associated with this are described in this perspective, using illustrative examples from GtoPdb entries. Our selection process begins with judgments of pharmacological relevance and scientific quality. Even though we have a stringent focus for our small-data extraction, we note that assessing the quality of papers has become more difficult over the last 15 years. We discuss ambiguity issues with the resolution of authors' descriptions of A-R-C-P entities to standardized identifiers. We also describe developments that have made this somewhat easier over the same period both in the publication ecosystem and recent enhancements of our internal processes. This perspective concludes with a look at challenges for the future, including the wider capture of mechanistic nuances and possible impacts of text mining on automated entity extraction.
8
Southan C. Caveat Usor: Assessing Differences between Major Chemistry Databases. ChemMedChem 2018; 13:470-481. [PMID: 29451740 PMCID: PMC5900829 DOI: 10.1002/cmdc.201700724] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.7] [Received: 11/20/2017] [Revised: 02/07/2018] [Indexed: 12/24/2022]
Abstract
The three databases of PubChem, ChemSpider, and UniChem capture the majority of open chemical structure records with February 2018 totals of 95, 63, and 154 million, respectively. Collectively, they constitute a massively enabling resource for cheminformatics, chemical biology, and drug discovery. As meta-portals, they subsume and link out to the major proportion of public bioactivity data extracted from the literature and screening center assay results. Therefore, they not only present three different entry points, but the many subsumed independent resources present a fourth entry point in the form of standalone databases. Because this creates a complex picture it is important for users to have at least some appreciation of differential content to enable utility judgments for the tasks at hand. This turns out to be challenging. By comparing the three resources in detail, this review assesses their differences, some of which are not obvious. This includes the fact that coverage is significantly different between the 587, 282, and 38 contributing sources, respectively. This not only presents the "who-has-what" question, but also the reason "why" any particular inclusion is considered valuable is rarely made explicit. Also confusing is that sources nominally in common (i.e., having the same submitter name) can have significantly different structure counts, not only in each of the three but also from their standalone instantiations. Assessing a series of examples indicates that differences in loading dates and structural standardization are the main causes of this inter-portal discordance.
Affiliation(s)
- Christopher Southan
  - IUPHAR/BPS Guide to PHARMACOLOGY, Deanery of Biomedical Sciences, University of Edinburgh, Edinburgh, EH8 9XD, UK
9
Ashenden SK, Kogej T, Engkvist O, Bender A. Innovation in Small-Molecule-Druggable Chemical Space: Where are the Initial Modulators of New Targets Published? J Chem Inf Model 2017; 57:2741-2753. [PMID: 29068231 DOI: 10.1021/acs.jcim.7b00295] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.9] [Indexed: 11/28/2022]
Abstract
It is well-established that the number of publications of novel small molecule modulators, and their associated targets, has increased over the years. This work focuses on publishing trends over the years with a particular focus on the comparison between patents and scientific literature which is accessible via the ChEMBL and GOSTAR databases. More precisely, the patents and scientific literature associated with bioactive molecules and their target annotations have been compared to identify where novelty (in the meaning of the first modulator of a protein target) originated from. Comparing the published date of the first small molecule modulator published in literature and patents for a particular target (with either identical or different structure) shows that modulators are usually published in both scientific literature and in patents (45%), or in scientific literature alone (51%), but rarely in patents only. When looking at the time when first modulators are published in both sources, 65% of the time they are disseminated in literature first. Finally, when analyzing just the novel small molecule modulators, regardless of the protein targets they have been published with, those structures representing novel chemistry tend to be published in patents first 61% of the time.
Affiliation(s)
- Stephanie K Ashenden
  - Centre for Molecular Informatics, Department of Chemistry, University of Cambridge, Cambridge, CB2 1EW, United Kingdom
- Thierry Kogej
  - Discovery Sciences, IMED Biotech Unit, AstraZeneca, Gothenburg 431 50 SE, Sweden
- Ola Engkvist
  - Discovery Sciences, IMED Biotech Unit, AstraZeneca, Gothenburg 431 50 SE, Sweden
- Andreas Bender
  - Centre for Molecular Informatics, Department of Chemistry, University of Cambridge, Cambridge, CB2 1EW, United Kingdom
10
Exploring sets of molecules from patents and relationships to other active compounds in chemical space networks. J Comput Aided Mol Des 2017; 31:779-788. [PMID: 28871390 DOI: 10.1007/s10822-017-0061-2] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.0] [Received: 07/08/2017] [Accepted: 08/31/2017] [Indexed: 10/18/2022]
Abstract
Patents from medicinal chemistry represent a rich source of novel compounds and activity data that appear only infrequently in the scientific literature. Moreover, patent information provides a primary focal point for drug discovery. Accordingly, text mining and image extraction approaches have become hot topics in patent analysis and repositories of patent data are being established. In this work, we have generated network representations using alternative similarity measures to systematically compare molecules from patents with other bioactive compounds, visualize similarity relationships, explore the chemical neighbourhood of patent molecules, and identify closely related compounds with different activities. The design of network representations that combine patent molecules and other bioactive compounds and view patent information in the context of current bioactive chemical space aids in the analysis of patents and further extends the use of molecular networks to explore structure-activity relationships.
11
Krallinger M, Rabal O, Lourenço A, Oyarzabal J, Valencia A. Information Retrieval and Text Mining Technologies for Chemistry. Chem Rev 2017; 117:7673-7761. [PMID: 28475312 DOI: 10.1021/acs.chemrev.6b00851] [Citation(s) in RCA: 111] [Impact Index Per Article: 15.9] [Indexed: 01/03/2023]
Abstract
Efficient access to chemical information contained in scientific literature, patents, technical reports, or the web is a pressing need shared by researchers and patent attorneys from different chemical disciplines. Retrieval of important chemical information in most cases starts with finding relevant documents for a particular chemical compound or family. Targeted retrieval of chemical documents is closely connected to the automatic recognition of chemical entities in the text, which commonly involves the extraction of the entire list of chemicals mentioned in a document, including any associated information. In this Review, we provide a comprehensive and in-depth description of fundamental concepts, technical implementations, and current technologies for meeting these information demands. A strong focus is placed on community challenges addressing systems performance, more particularly CHEMDNER and CHEMDNER patents tasks of BioCreative IV and V, respectively. Considering the growing interest in the construction of automatically annotated chemical knowledge bases that integrate chemical information and biological data, cheminformatics approaches for mapping the extracted chemical names into chemical structures and their subsequent annotation together with text mining applications for linking chemistry with biological information are also presented. Finally, future trends and current challenges are highlighted as a roadmap proposal for research in this emerging field.
Affiliation(s)
- Martin Krallinger
  - Structural Computational Biology Group, Structural Biology and BioComputing Programme, Spanish National Cancer Research Centre, C/Melchor Fernández Almagro 3, Madrid E-28029, Spain
- Obdulia Rabal
  - Small Molecule Discovery Platform, Molecular Therapeutics Program, Center for Applied Medical Research (CIMA), University of Navarra, Avenida Pio XII 55, Pamplona E-31008, Spain
- Anália Lourenço
  - ESEI - Department of Computer Science, University of Vigo, Edificio Politécnico, Campus Universitario As Lagoas s/n, Ourense E-32004, Spain
  - Centro de Investigaciones Biomédicas (Centro Singular de Investigación de Galicia), Campus Universitario Lagoas-Marcosende, Vigo E-36310, Spain
  - CEB-Centre of Biological Engineering, University of Minho, Campus de Gualtar, Braga 4710-057, Portugal
- Julen Oyarzabal
  - Small Molecule Discovery Platform, Molecular Therapeutics Program, Center for Applied Medical Research (CIMA), University of Navarra, Avenida Pio XII 55, Pamplona E-31008, Spain
- Alfonso Valencia
  - Life Science Department, Barcelona Supercomputing Centre (BSC-CNS), C/Jordi Girona, 29-31, Barcelona E-08034, Spain
  - Joint BSC-IRB-CRG Program in Computational Biology, Parc Científic de Barcelona, C/ Baldiri Reixac 10, Barcelona E-08028, Spain
  - Institució Catalana de Recerca i Estudis Avançats (ICREA), Passeig de Lluís Companys 23, Barcelona E-08010, Spain
12
Retrieving GPCR data from public databases. Curr Opin Pharmacol 2016; 30:38-43. [PMID: 27472010 DOI: 10.1016/j.coph.2016.07.002] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.5] [Received: 05/11/2016] [Revised: 06/30/2016] [Accepted: 07/03/2016] [Indexed: 01/29/2023]
Abstract
Improvements in databases have already impacted GPCR research. The purpose of the review is to give a snapshot of the GPCR data available and provide utility examples. Consequently, this review covers a small set of major databases, including UniProt for proteins, Ensembl for genes, ChEMBL for bioactive chemistry and SureChEMBL for patents. In addition, two portals are outlined, GPCRdb and the IUPHAR/BPS Guide to PHARMACOLOGY (GtoPdb) that are based on expert annotation. The former has an emphasis on structures, sequences, point mutations, analysis tools and visualisation. The latter focuses on endogenous GPCR ligands, pharmacological modulation, approved drugs, clinical candidates and tool compounds. Since data growth is accelerating, those embarking on GPCR projects should not only check databases but also recent journal and patent publications.
13
Southan C, Sharman JL, Benson HE, Faccenda E, Pawson AJ, Alexander SPH, Buneman OP, Davenport AP, McGrath JC, Peters JA, Spedding M, Catterall WA, Fabbro D, Davies JA. The IUPHAR/BPS Guide to PHARMACOLOGY in 2016: towards curated quantitative interactions between 1300 protein targets and 6000 ligands. Nucleic Acids Res 2016; 44:D1054-68. [PMID: 26464438 PMCID: PMC4702778 DOI: 10.1093/nar/gkv1037] [Citation(s) in RCA: 987] [Impact Index Per Article: 123.4] [Received: 09/07/2015] [Revised: 09/25/2015] [Accepted: 09/29/2015] [Indexed: 01/05/2023]
Abstract
The IUPHAR/BPS Guide to PHARMACOLOGY (GtoPdb, http://www.guidetopharmacology.org) provides expert-curated molecular interactions between successful and potential drugs and their targets in the human genome. Developed by the International Union of Basic and Clinical Pharmacology (IUPHAR) and the British Pharmacological Society (BPS), this resource, and its earlier incarnation as IUPHAR-DB, is described in our 2014 publication. This update incorporates changes over the intervening seven database releases. The unique model of content capture is based on established and new target class subcommittees collaborating with in-house curators. Most information comes from journal articles, but we now also index kinase cross-screening panels. Targets are specified by UniProtKB IDs. Small molecules are defined by PubChem Compound Identifiers (CIDs); ligand capture also includes peptides and clinical antibodies. We have extended the capture of ligands and targets linked via published quantitative binding data (e.g. Ki, IC50 or Kd). The resulting pharmacological relationship network now defines a data-supported druggable genome encompassing 7% of human proteins. The database also provides an expanded substrate for the biennially published compendium, the Concise Guide to PHARMACOLOGY. This article covers content increase, entity analysis, revised curation strategies, new website features and expanded download options.
Affiliation(s)
- Christopher Southan
  - Centre for Integrative Physiology, University of Edinburgh, Edinburgh, EH8 9XD, UK
- Joanna L Sharman
  - Centre for Integrative Physiology, University of Edinburgh, Edinburgh, EH8 9XD, UK
- Helen E Benson
  - Centre for Integrative Physiology, University of Edinburgh, Edinburgh, EH8 9XD, UK
- Elena Faccenda
  - Centre for Integrative Physiology, University of Edinburgh, Edinburgh, EH8 9XD, UK
- Adam J Pawson
  - Centre for Integrative Physiology, University of Edinburgh, Edinburgh, EH8 9XD, UK
- Stephen P H Alexander
  - School of Biomedical Sciences, University of Nottingham Medical School, Nottingham, NG7 2UH, UK
- O Peter Buneman
  - Laboratory for Foundations of Computer Science, School of Informatics, University of Edinburgh, Edinburgh, EH8 9LE, UK
- John C McGrath
  - School of Life Sciences, University of Glasgow, Glasgow, G12 8QQ, UK
- John A Peters
  - Neuroscience Division, Medical Education Institute, Ninewells Hospital and Medical School, University of Dundee, Dundee, DD1 9SY, UK
- William A Catterall
  - Department of Pharmacology, University of Washington, Seattle, WA 98195-7280, USA
- Jamie A Davies
  - Centre for Integrative Physiology, University of Edinburgh, Edinburgh, EH8 9XD, UK
14
Warr WA. Many InChIs and quite some feat. J Comput Aided Mol Des 2015; 29:681-94. [PMID: 26081259 DOI: 10.1007/s10822-015-9854-3] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.1] [Received: 04/28/2015] [Accepted: 06/10/2015] [Indexed: 12/14/2022]
Affiliation(s)
- Wendy A Warr
  - Wendy Warr & Associates, Holmes Chapel, Crewe, Cheshire, CW4 7HZ, UK
15
Abstract
The current Ebola virus epidemic may provide some suggestions of how we can better prepare for the next pathogen outbreak. We propose several cost effective steps that could be taken that would impact the discovery and use of small molecule therapeutics including: 1. text mine the literature, 2. patent assignees and/or inventors should openly declare their relevant filings, 3. reagents and assays could be commoditized, 4. using manual curation to enhance database links, 5. engage database and curation teams, 6. consider open science approaches, 7. adapt the "box" model for shareable reference compounds, and 8. involve the physician's perspective.
Affiliation(s)
- Sean Ekins
  - Collaborations in Chemistry, 5616 Hilltop Needmore Road, Fuquay-Varina, NC, 27526, USA
  - Collaborative Drug Discovery, 1633 Bayshore Highway, Suite 342, Burlingame, CA, 94010, USA
- Christopher Southan
  - IUPHAR/BPS Guide to PHARMACOLOGY, Centre for Integrative Physiology, University of Edinburgh, Hugh Robson Building, Edinburgh, EH8 9XD, UK
- Megan Coffee
  - Center for Infectious Diseases and Emergency Readiness, University of California at Berkeley, 1918 University Ave, Berkeley, CA, 94704, USA