Assessment of applicability domain for multivariate counter-propagation artificial neural network predictive models by minimum Euclidean distance space analysis: a case study.
Anal Chim Acta 2012; 759:28-42. [PMID: 23260674 DOI: 10.1016/j.aca.2012.11.002]
[Received: 07/09/2012] [Revised: 10/30/2012] [Accepted: 11/02/2012] [Indexed: 11/23/2022]
Abstract
Alongside validation, the applicability domain (AD) is one of the most important factors determining the quality and reliability of quantitative structure-activity relationship (QSAR) models. A variety of approaches to AD estimation have been devised, each applicable to particular types of QSAR models, and their practical use is extensively described in the literature. The present study introduces a novel, simple, and effective distance-based method for estimating the AD of developed and validated predictive counter-propagation artificial neural network (CP ANN) models, which exploits the Euclidean distance (ED) metric in the structure-representation vector space. The performance of the method was evaluated and explained in a case study using a pre-built and validated CP ANN model that predicts the transport activity of the transmembrane protein bilitranslocase for a diverse set of compounds. The method was then tested on two further datasets to confirm its performance in evaluating the AD of CP ANN models. The compounds identified as potential outliers, i.e., those outside the AD of the CP ANN model, were confirmed in a comparative AD assessment using the leverage approach. Moreover, the method offers a graphical depiction of the AD for fast and simple identification of extreme points.
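The general idea of a minimum-Euclidean-distance AD check, and of a leverage-based cross-check, can be illustrated with a minimal Python sketch. This is not the authors' procedure: the abstract does not state which reference vectors or which cutoff the method uses. The sketch assumes the reference vectors are training-set descriptor vectors, that the cutoff is a percentile of nearest-neighbour distances within the training set, and that leverage is computed on uncentred descriptors; the function names `min_euclidean_distances`, `ad_threshold`, and `leverages` are hypothetical.

```python
import numpy as np


def min_euclidean_distances(reference, query):
    """Distance from each query vector to its nearest reference vector."""
    # Pairwise differences via broadcasting: shape (n_query, n_reference, n_features).
    diff = query[:, None, :] - reference[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=2)).min(axis=1)


def ad_threshold(train, percentile=95.0):
    """Assumed cutoff: a percentile of nearest-neighbour distances in the training set."""
    diff = train[:, None, :] - train[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=2))
    # Exclude each compound's zero distance to itself.
    np.fill_diagonal(dist, np.inf)
    return np.percentile(dist.min(axis=1), percentile)


def leverages(train, query):
    """Leverage h = x^T (X^T X)^-1 x for each query vector (uncentred descriptors)."""
    xtx_inv = np.linalg.pinv(train.T @ train)
    return np.einsum("ij,jk,ik->i", query, xtx_inv, query)
```

A query compound would then be treated as inside the AD when its minimum distance does not exceed the cutoff, for example `min_euclidean_distances(train, query) <= ad_threshold(train)`, and a large leverage for the same compound would point to the same conclusion.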