Lipinski CA, Litterman NK, Southan C, Williams AJ, Clark AM, Ekins S. Parallel worlds of public and commercial bioactive chemistry data. J Med Chem 2014; 58:2068-76. [PMID: 25415348 PMCID: PMC4360371 DOI: 10.1021/jm5011308]
Abstract
The availability of structures and linked bioactivity data in databases
is powerfully enabling for drug discovery and chemical biology. However,
we now review some confounding issues with the divergent expansions
of public and commercial sources of chemical structures. These are
associated with not only expanding patent extraction but also increasingly
large vendor collections amassed via different selection criteria
between SciFinder from Chemical Abstracts Service (CAS) and major
public sources such as PubChem, ChemSpider, UniChem, and others. These
increasingly massive collections may include both real and virtual
compounds, as well as so-called prophetic compounds from patents.
We address a range of issues raised by the challenges faced in resolving
the NIH probe compounds. In addition, we highlight the confounding
of prior-art searching by virtual compounds that could impact the
composition of matter patentability of a new medicinal chemistry lead.
Finally, we propose some potential solutions.