Reference Citation Analysis: Find an Article, Find a Category, Find a Journal, Find a Scholar

For: Karapetyan K, Batchelor C, Sharpe D, Tkachenko V, Williams AJ. The Chemical Validation and Standardization Platform (CVSP): large-scale automated validation of chemical structure datasets. J Cheminform 2015;7:30. [PMID: 26155308 PMCID: PMC4494041 DOI: 10.1186/s13321-015-0072-8] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.7] [Received: 10/28/2014] [Accepted: 04/28/2015] [Indexed: 11/10/2022] Open

Number

Cited by Other Article(s)

Mansouri K, Moreira-Filho JT, Lowe CN, Charest N, Martin T, Tkachenko V, Judson R, Conway M, Kleinstreuer NC, Williams AJ. Free and open-source QSAR-ready workflow for automated standardization of chemical structures in support of QSAR modeling. J Cheminform 2024;16:19. [PMID: 38378618 PMCID: PMC10880251 DOI: 10.1186/s13321-024-00814-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/29/2023] [Accepted: 02/10/2024] [Indexed: 02/22/2024] Open

Abstract

The rapid increase of publicly available chemical structures and associated experimental data presents a valuable opportunity to build robust QSAR models for applications in different fields. However, the common concern is the quality of both the chemical structure information and associated experimental data. This is especially true when those data are collected from multiple sources as chemical substance mappings can contain many duplicate structures and molecular inconsistencies. Such issues can impact the resulting molecular descriptors and their mappings to experimental data and, subsequently, the quality of the derived models in terms of accuracy, repeatability, and reliability. Herein we describe the development of an automated workflow to standardize chemical structures according to a set of standard rules and generate two and/or three-dimensional "QSAR-ready" forms prior to the calculation of molecular descriptors. The workflow was designed in the KNIME workflow environment and consists of three high-level steps. First, a structure encoding is read, and then the resulting in-memory representation is cross-referenced with any existing identifiers for consistency. Finally, the structure is standardized using a series of operations including desalting, stripping of stereochemistry (for two-dimensional structures), standardization of tautomers and nitro groups, valence correction, neutralization when possible, and then removal of duplicates. This workflow was initially developed to support collaborative modeling QSAR projects to ensure consistency of the results from the different participants. It was then updated and generalized for other modeling applications. This included modification of the "QSAR-ready" workflow to generate "MS-ready structures" to support the generation of substance mappings and searches for software applications related to non-targeted analysis mass spectrometry. 
Both QSAR and MS-ready workflows are freely available in KNIME, via standalone versions on GitHub, and as docker container resources for the scientific community.

Scientific contribution: This work pioneers an automated workflow in KNIME, systematically standardizing chemical structures to ensure their readiness for QSAR modeling and broader scientific applications. By addressing data quality concerns through desalting, stereochemistry stripping, and normalization, it optimizes molecular descriptors' accuracy and reliability. The freely available resources in KNIME, GitHub, and docker containers democratize access, benefiting collaborative research and advancing diverse modeling endeavors in chemistry and mass spectrometry.

Bento AP, Hersey A, Félix E, Landrum G, Gaulton A, Atkinson F, Bellis LJ, De Veij M, Leach AR. An open source chemical structure curation pipeline using RDKit. J Cheminform 2020;12:51. [PMID: 33431044 PMCID: PMC7458899 DOI: 10.1186/s13321-020-00456-1] [Citation(s) in RCA: 144] [Impact Index Per Article: 36.0] [Received: 06/11/2020] [Accepted: 08/24/2020] [Indexed: 11/13/2022] Open

Baker CM, Kidley NJ, Papachristos K, Hotson M, Carson R, Gravestock D, Pouliot M, Harrison J, Dowling A. Tautomer Standardization in Chemical Databases: Deriving Business Rules from Quantum Chemistry. J Chem Inf Model 2020;60:3781-3791. [PMID: 32644790 DOI: 10.1021/acs.jcim.0c00232] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Indexed: 01/18/2023]

Dalecki AG, Zorn KM, Clark AM, Ekins S, Narmore WT, Tower N, Rasmussen L, Bostwick R, Kutsch O, Wolschendorf F. High-throughput screening and Bayesian machine learning for copper-dependent inhibitors of Staphylococcus aureus. Metallomics 2020;11:696-706. [PMID: 30839007 DOI: 10.1039/c8mt00342d] [Citation(s) in RCA: 25] [Impact Index Per Article: 6.3] [Indexed: 12/20/2022]

Fantke P, Aurisano N, Provoost J, Karamertzanis PG, Hauschild M. Toward effective use of REACH data for science and policy. Environment International 2020;135:105336. [PMID: 31884133 DOI: 10.1016/j.envint.2019.105336] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.0] [Received: 06/18/2019] [Revised: 11/15/2019] [Accepted: 11/15/2019] [Indexed: 06/10/2023]

Grulke CM, Williams AJ, Thillanadarajah I, Richard AM. EPA's DSSTox database: History of development of a curated chemistry resource supporting computational toxicology research. Computational Toxicology 2019;12. [PMID: 33426407 PMCID: PMC7787967 DOI: 10.1016/j.comtox.2019.100096] [Citation(s) in RCA: 94] [Impact Index Per Article: 18.8] [Indexed: 12/13/2022]

Abstract

The US Environmental Protection Agency's (EPA) Distributed Structure-Searchable Toxicity (DSSTox) database, launched publicly in 2004, currently exceeds 875 K substances spanning hundreds of lists of interest to EPA and environmental researchers. From its inception, DSSTox has focused curation efforts on resolving chemical identifier errors and conflicts in the public domain towards the goal of assigning accurate chemical structures to data and lists of importance to the environmental research and regulatory community. Accurate structure-data associations, in turn, are necessary inputs to structure-based predictive models supporting hazard and risk assessments. In 2014, the legacy, manually curated DSSTox_V1 content was migrated to a MySQL data model, with modern cheminformatics tools supporting both manual and automated curation processes to increase efficiencies. This was followed by sequential auto-loads of filtered portions of three public datasets: EPA's Substance Registry Services (SRS), the National Library of Medicine's ChemID, and PubChem. This process was constrained by a key requirement of uniquely mapped identifiers (i.e., CAS RN, name and structure) for each substance, rejecting content where any two identifiers were conflicted either within or across datasets. This rejected content highlighted the degree of conflicting, inaccurate substance-structure ID mappings in the public domain, ranging from 12% (within EPA SRS) to 49% (across ChemID and PubChem). Substances successfully added to DSSTox from each auto-load were assigned to one of five qc_levels, conveying curator confidence in each dataset. This process enabled a significant expansion of DSSTox content to provide better coverage of the chemical landscape of interest to environmental scientists, while retaining focus on the accuracy of substance-structure-data associations. 
Currently, DSSTox serves as the core foundation of EPA's CompTox Chemicals Dashboard [https://comptox.epa.gov/dashboard], which provides public access to DSSTox content in support of a broad range of modeling and research activities within EPA and, increasingly, across the field of computational toxicology.

Lane T, Russo DP, Zorn KM, Clark AM, Korotcov A, Tkachenko V, Reynolds RC, Perryman AL, Freundlich JS, Ekins AS. Comparing and Validating Machine Learning Models for Mycobacterium tuberculosis Drug Discovery. Mol Pharm 2018;15:4346-4360. [PMID: 29672063 PMCID: PMC6167198 DOI: 10.1021/acs.molpharmaceut.8b00083] [Citation(s) in RCA: 64] [Impact Index Per Article: 10.7] [Indexed: 12/31/2022]

Abstract

Tuberculosis is a global health dilemma. In 2016, the WHO reported 10.4 million incidences and 1.7 million deaths. The need to develop new treatments for those infected with Mycobacterium tuberculosis (Mtb) has led to many large-scale phenotypic screens and many thousands of new active compounds identified in vitro. However, with limited funding, efforts to discover new active molecules against Mtb need to be more efficient. Several computational machine learning approaches have been shown to have good enrichment and hit rates. We have curated small molecule Mtb data and developed new models with a total of 18,886 molecules with activity cutoffs of 10 μM, 1 μM, and 100 nM. These data sets were used to evaluate different machine learning methods (including deep learning) and metrics and to generate predictions for additional molecules published in 2017. One Mtb model, a combined in vitro and in vivo data Bayesian model at a 100 nM activity, yielded the following metrics for 5-fold cross validation: accuracy = 0.88, precision = 0.22, recall = 0.91, specificity = 0.88, kappa = 0.31, and MCC = 0.41. We have also curated an evaluation set (n = 153 compounds) published in 2017, and when used to test our model, it showed comparable statistics (accuracy = 0.83, precision = 0.27, recall = 1.00, specificity = 0.81, kappa = 0.36, and MCC = 0.47). We have also compared these models with additional machine learning algorithms, showing that Bayesian machine learning models constructed with literature Mtb data generated by different laboratories generally were equivalent to or outperformed deep neural networks with external test sets. Finally, we have also compared our training and test sets to show they were suitably diverse and different in order to represent useful evaluation sets. Such Mtb machine learning models could help prioritize compounds for testing in vitro and in vivo.

McEachran AD, Mansouri K, Grulke C, Schymanski EL, Ruttkies C, Williams AJ. "MS-Ready" structures for non-targeted high-resolution mass spectrometry screening studies. J Cheminform 2018;10:45. [PMID: 30167882 PMCID: PMC6117229 DOI: 10.1186/s13321-018-0299-2] [Citation(s) in RCA: 54] [Impact Index Per Article: 9.0] [Received: 05/16/2018] [Accepted: 08/21/2018] [Indexed: 02/05/2023] Open

Brown N, Cambruzzi J, Cox PJ, Davies M, Dunbar J, Plumbley D, Sellwood MA, Sim A, Williams-Jones BI, Zwierzyna M, Sheppard DW. Big Data in Drug Discovery. Progress in Medicinal Chemistry 2018;57:277-356. [PMID: 29680150 DOI: 10.1016/bs.pmch.2017.12.003] [Citation(s) in RCA: 24] [Impact Index Per Article: 4.0] [Indexed: 12/02/2022]

Willighagen EL, Mayfield JW, Alvarsson J, Berg A, Carlsson L, Jeliazkova N, Kuhn S, Pluskal T, Rojas-Chertó M, Spjuth O, Torrance G, Evelo CT, Guha R, Steinbeck C. The Chemistry Development Kit (CDK) v2.0: atom typing, depiction, molecular formulas, and substructure searching. J Cheminform 2017;9:33. [PMID: 29086040 PMCID: PMC5461230 DOI: 10.1186/s13321-017-0220-4] [Citation(s) in RCA: 210] [Impact Index Per Article: 30.0] [Received: 10/25/2016] [Accepted: 05/16/2017] [Indexed: 12/15/2022] Open

CDK 2.0 provides new features and improved performance

Egon L Willighagen Department of Bioinformatics - BiGCaT, NUTRIM, Maastricht University, 6200 MD, Maastricht, The Netherlands.
John W Mayfield NextMove Software Ltd, Cambridge, CB4 0EY, UK
Jonathan Alvarsson Department of Pharmaceutical Biosciences, Uppsala University, 751 24, Uppsala, Sweden
Arvid Berg Department of Pharmaceutical Biosciences, Uppsala University, 751 24, Uppsala, Sweden
Lars Carlsson AstraZeneca, Innovative Medicines & Early Development, Quantitative Biology, Mölndal, Sweden
Nina Jeliazkova Ideaconsult Ltd, A. Kanchev 4, 1000, Sofia, Bulgaria
Stefan Kuhn Department of Informatics, University of Leicester, Leicester, UK
Tomáš Pluskal Whitehead Institute for Biomedical Research, 455 Main Street, Cambridge, MA, 02142, USA
Miquel Rojas-Chertó Química Clínica Aplicada, 43870, Amposta, Spain
Ola Spjuth Department of Pharmaceutical Biosciences, Uppsala University, 751 24, Uppsala, Sweden
Gilleain Torrance , 4 Hanway Place, W1T 1HD, London, UK
Chris T Evelo Department of Bioinformatics - BiGCaT, NUTRIM, Maastricht University, 6200 MD, Maastricht, The Netherlands
Rajarshi Guha National Center for Advancing Translational Sciences, 9800 Medical Center Drive, Rockville, MD, 20850, USA
Christoph Steinbeck Institute for Inorganic and Analytical Chemistry, Friedrich-Schiller-University, Lessingstr. 8, 07743, Jena, Germany

Krallinger M, Rabal O, Lourenço A, Oyarzabal J, Valencia A. Information Retrieval and Text Mining Technologies for Chemistry. Chem Rev 2017;117:7673-7761. [PMID: 28475312 DOI: 10.1021/acs.chemrev.6b00851] [Citation(s) in RCA: 111] [Impact Index Per Article: 15.9] [Indexed: 01/03/2023]

Goldmann D, Zdrazil B, Digles D, Ecker GF. Empowering pharmacoinformatics by linked life science data. J Comput Aided Mol Des 2017;31:319-328. [PMID: 27830428 PMCID: PMC5385323 DOI: 10.1007/s10822-016-9990-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/29/2016] [Accepted: 10/24/2016] [Indexed: 11/11/2022]

Sommer K, Friedrich NO, Bietz S, Hilbig M, Inhester T, Rarey M. UNICON: A Powerful and Easy-to-Use Compound Library Converter. J Chem Inf Model 2016;56:1105-11. [PMID: 27227368 DOI: 10.1021/acs.jcim.6b00069] [Citation(s) in RCA: 19] [Impact Index Per Article: 2.4] [Indexed: 11/29/2022]

Ball N, Cronin MTD, Shen J, Blackburn K, Booth ED, Bouhifd M, Donley E, Egnash L, Hastings C, Juberg DR, Kleensang A, Kleinstreuer N, Kroese ED, Lee AC, Luechtefeld T, Maertens A, Marty S, Naciff JM, Palmer J, Pamies D, Penman M, Richarz AN, Russo DP, Stuard SB, Patlewicz G, van Ravenzwaay B, Wu S, Zhu H, Hartung T. Toward Good Read-Across Practice (GRAP) guidance. ALTEX - Alternatives to Animal Experimentation 2016;33:149-66. [PMID: 26863606 PMCID: PMC5581000 DOI: 10.14573/altex.1601251] [Citation(s) in RCA: 116] [Impact Index Per Article: 14.5] [Received: 01/21/2016] [Accepted: 02/11/2016] [Indexed: 12/04/2022]

Hersey A, Chambers J, Bellis L, Patrícia Bento A, Gaulton A, Overington JP. Chemical databases: curation or integration by user-defined equivalence? Drug Discovery Today: Technologies 2015;14:17-24. [PMID: 26194583 PMCID: PMC6294287 DOI: 10.1016/j.ddtec.2015.01.005] [Citation(s) in RCA: 35] [Impact Index Per Article: 3.9] [Received: 08/20/2014] [Revised: 01/15/2015] [Accepted: 01/16/2015] [Indexed: 11/30/2022]

Martin Krallinger Structural Computational Biology Group, Structural Biology and BioComputing Programme, Spanish National Cancer Research Centre, C/Melchor Fernández Almagro 3, Madrid E-28029, Spain
Obdulia Rabal Small Molecule Discovery Platform, Molecular Therapeutics Program, Center for Applied Medical Research (CIMA), University of Navarra, Avenida Pio XII 55, Pamplona E-31008, Spain
Anália Lourenço ESEI - Department of Computer Science, University of Vigo, Edificio Politécnico, Campus Universitario As Lagoas s/n, Ourense E-32004, Spain; Centro de Investigaciones Biomédicas (Centro Singular de Investigación de Galicia), Campus Universitario Lagoas-Marcosende, Vigo E-36310, Spain; CEB-Centre of Biological Engineering, University of Minho, Campus de Gualtar, Braga 4710-057, Portugal
Julen Oyarzabal Small Molecule Discovery Platform, Molecular Therapeutics Program, Center for Applied Medical Research (CIMA), University of Navarra, Avenida Pio XII 55, Pamplona E-31008, Spain
Alfonso Valencia Life Science Department, Barcelona Supercomputing Centre (BSC-CNS), C/Jordi Girona, 29-31, Barcelona E-08034, Spain; Joint BSC-IRB-CRG Program in Computational Biology, Parc Científic de Barcelona, C/ Baldiri Reixac 10, Barcelona E-08028, Spain; Institució Catalana de Recerca i Estudis Avançats (ICREA), Passeig de Lluís Companys 23, Barcelona E-08010, Spain

Daria Goldmann Department of Pharmaceutical Chemistry, University of Vienna, Althanstraße 14, 1090, Vienna, Austria
Barbara Zdrazil Department of Pharmaceutical Chemistry, University of Vienna, Althanstraße 14, 1090, Vienna, Austria
Daniela Digles Department of Pharmaceutical Chemistry, University of Vienna, Althanstraße 14, 1090, Vienna, Austria
Gerhard F Ecker Department of Pharmaceutical Chemistry, University of Vienna, Althanstraße 14, 1090, Vienna, Austria.

Kai Sommer Center for Bioinformatics, Research Group for Computational Molecular Design, University of Hamburg, Bundesstraße 43, 20146 Hamburg, Germany
Nils-Ole Friedrich Center for Bioinformatics, Research Group for Computational Molecular Design, University of Hamburg, Bundesstraße 43, 20146 Hamburg, Germany
Stefan Bietz Center for Bioinformatics, Research Group for Computational Molecular Design, University of Hamburg, Bundesstraße 43, 20146 Hamburg, Germany
Matthias Hilbig Center for Bioinformatics, Research Group for Computational Molecular Design, University of Hamburg, Bundesstraße 43, 20146 Hamburg, Germany
Therese Inhester Center for Bioinformatics, Research Group for Computational Molecular Design, University of Hamburg, Bundesstraße 43, 20146 Hamburg, Germany
Matthias Rarey Center for Bioinformatics, Research Group for Computational Molecular Design, University of Hamburg, Bundesstraße 43, 20146 Hamburg, Germany

Nicholas Ball The Dow Chemical Company, Midland, MI, USA
Mark T D Cronin School of Pharmacy and Biomolecular Sciences, Liverpool John Moores University, Liverpool, UK
Jie Shen Research Institute for Fragrance Materials, Inc. Woodcliff Lake, NJ, USA
Karen Blackburn The Procter and Gamble Co., Cincinnati, OH, USA
Ewan D Booth Syngenta Ltd, Jealott's Hill International Research Centre, Bracknell, Berkshire, UK
Mounir Bouhifd Johns Hopkins Bloomberg School of Public Health, Center for Alternatives to Animal Testing (CAAT), Baltimore, MD, USA
Elizabeth Donley Stemina Biomarker Discovery Inc., Madison, WI, USA
Laura Egnash Stemina Biomarker Discovery Inc., Madison, WI, USA
Charles Hastings BASF SE, Ludwigshafen am Rhein, Germany, and Research Triangle Park, NC, USA
Daland R Juberg The Dow Chemical Company, Midland, MI, USA
Andre Kleensang Johns Hopkins Bloomberg School of Public Health, Center for Alternatives to Animal Testing (CAAT), Baltimore, MD, USA
Nicole Kleinstreuer National Toxicology Program Interagency Center for the Evaluation of Alternative Toxicological Methods, National Institute of Environmental Health Sciences, Research Triangle Park, NC, USA
E Dinant Kroese Risk Analysis for Products in Development, TNO Zeist, The Netherlands
Adam C Lee DuPont Haskell Global Centers for Health and Environmental Sciences, Newark, DE, USA
Thomas Luechtefeld Johns Hopkins Bloomberg School of Public Health, Center for Alternatives to Animal Testing (CAAT), Baltimore, MD, USA
Alexandra Maertens Johns Hopkins Bloomberg School of Public Health, Center for Alternatives to Animal Testing (CAAT), Baltimore, MD, USA
Sue Marty The Dow Chemical Company, Midland, MI, USA
Jorge M Naciff The Procter and Gamble Co., Cincinnati, OH, USA
Jessica Palmer Stemina Biomarker Discovery Inc., Madison, WI, USA
David Pamies Johns Hopkins Bloomberg School of Public Health, Center for Alternatives to Animal Testing (CAAT), Baltimore, MD, USA
Mike Penman Penman Consulting, Brussels, Belgium
Andrea-Nicole Richarz School of Pharmacy and Biomolecular Sciences, Liverpool John Moores University, Liverpool, UK
Daniel P Russo Department of Chemistry and Center for Computational and Integrative Biology, Rutgers University, Camden, NJ, USA
Sharon B Stuard The Procter and Gamble Co., Cincinnati, OH, USA
Grace Patlewicz US EPA/ORD, National Center for Computational Toxicology, Research Triangle Park, NC, USA
Bennard van Ravenzwaay Risk Analysis for Products in Development, TNO Zeist, The Netherlands
Shengde Wu The Procter and Gamble Co., Cincinnati, OH, USA
Hao Zhu Department of Chemistry and Center for Computational and Integrative Biology, Rutgers University, Camden, NJ, USA
Thomas Hartung Johns Hopkins Bloomberg School of Public Health, Center for Alternatives to Animal Testing (CAAT), Baltimore, MD, USA; University of Konstanz, CAAT-Europe, Konstanz, Germany

Anne Hersey European Molecular Biology Laboratory - European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, United Kingdom.
Jon Chambers European Molecular Biology Laboratory - European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, United Kingdom
Louisa Bellis European Molecular Biology Laboratory - European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, United Kingdom
A Patrícia Bento European Molecular Biology Laboratory - European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, United Kingdom
Anna Gaulton European Molecular Biology Laboratory - European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, United Kingdom
John P Overington European Molecular Biology Laboratory - European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, United Kingdom