Lötsch J, Malkusch S. Interpretation of cluster structures in pain-related phenotype data using explainable artificial intelligence (XAI). Eur J Pain 2020; 25:442-465. [PMID: 33064864 DOI: 10.1002/ejp.1683]
[Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Received: 04/07/2020] [Revised: 10/08/2020] [Accepted: 10/14/2020] [Indexed: 12/22/2022]
Abstract
BACKGROUND
In pain research and clinical practice, subjects are commonly subgrouped according to shared pain characteristics, often by computer-aided clustering. In response to a recent EU recommendation that computer-aided decision making should be transparent, we propose an approach that uses machine learning to (1) provide an understandable interpretation of a cluster structure and thereby (2) enable a transparent decision process that explains why a given person is placed in a particular cluster.
METHODS
Comprehensibility was achieved by transforming the interpretation problem into a classification problem: a sub-symbolic algorithm was used to estimate the importance of each pain measure for cluster assignment, followed by an item categorization technique to select the relevant variables. Subsequently, a symbolic algorithm, serving as explainable artificial intelligence (XAI), provided understandable rules for cluster assignment. The approach was tested using 100-fold cross-validation.
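The pipeline described above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and not taken from the paper: the data are synthetic, a nearest-centroid classifier with permutation importance stands in for the sub-symbolic algorithm, an above-average threshold stands in for the item categorization technique, a single-threshold rule stands in for the symbolic algorithm, and 5 folds are used instead of 100 to keep the toy example small.

```python
import random
from statistics import mean

def fit_centroids(X, y):
    """Sub-symbolic stand-in: one centroid per cluster label."""
    cents = {}
    for label in set(y):
        rows = [x for x, lab in zip(X, y) if lab == label]
        cents[label] = [mean(col) for col in zip(*rows)]
    return cents

def predict_centroid(cents, x):
    """Assign x to the label of the closest centroid (squared distance)."""
    return min(cents, key=lambda lab: sum((a - b) ** 2 for a, b in zip(x, cents[lab])))

def accuracy(pred, y):
    return sum(p == t for p, t in zip(pred, y)) / len(y)

def permutation_importance(X, y, rng):
    """Importance of a measure = accuracy lost when its column is shuffled."""
    cents = fit_centroids(X, y)
    base = accuracy([predict_centroid(cents, x) for x in X], y)
    drops = []
    for j in range(len(X[0])):
        col = [x[j] for x in X]
        rng.shuffle(col)
        shuffled = [x[:j] + [v] + x[j + 1:] for x, v in zip(X, col)]
        drops.append(base - accuracy([predict_centroid(cents, x) for x in shuffled], y))
    return drops

def fit_stump(X, y, j):
    """Symbolic stand-in: best single-threshold rule on measure j (two clusters)."""
    values = sorted({x[j] for x in X})
    labels = sorted(set(y))
    best = None
    for lo, hi in zip(values, values[1:]):
        t = (lo + hi) / 2
        for below, above in (labels, labels[::-1]):
            acc = accuracy([below if x[j] <= t else above for x in X], y)
            if best is None or acc > best[0]:
                best = (acc, t, below, above)
    return best

# Hypothetical toy data: 40 subjects, 3 measures, 2 clusters.
# Only measure 0 separates the clusters; measures 1 and 2 are noise.
rng = random.Random(0)
X, y = [], []
for i in range(40):
    label = "A" if i % 2 == 0 else "B"
    first = rng.uniform(0, 1) if label == "A" else rng.uniform(5, 6)
    X.append([first, rng.uniform(0, 1), rng.uniform(0, 1)])
    y.append(label)

# Step 1: estimate the importance of each measure for cluster assignment.
importances = permutation_importance(X, y, rng)

# Step 2: keep measures with above-average importance
# (a simple stand-in for the item categorization step).
selected = [j for j, imp in enumerate(importances) if imp > mean(importances)]

# Step 3: derive a human-readable rule on the first selected measure.
acc, threshold, below, above = fit_stump(X, y, selected[0])
rule = (f"IF measure_{selected[0]} <= {threshold:.2f} "
        f"THEN cluster {below} ELSE cluster {above}")

# Step 4: k-fold cross-validation of the rule (k = 5 here, not 100).
k = 5
fold_acc = []
for f in range(k):
    train = [i for i in range(len(X)) if i % k != f]
    test = [i for i in range(len(X)) if i % k == f]
    _, t, lo, hi = fit_stump([X[i] for i in train], [y[i] for i in train], selected[0])
    pred = [lo if X[i][selected[0]] <= t else hi for i in test]
    fold_acc.append(accuracy(pred, [y[i] for i in test]))
cv_accuracy = mean(fold_acc)

print(rule)
print("cross-validated accuracy:", cv_accuracy)
```

The printed rule is the kind of output the symbolic step is meant to deliver: a statement a reader can check by hand against a subject's measurements, in contrast to the centroid distances used by the sub-symbolic stand-in.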
RESULTS
The importance of the variables in the data set (6 pain-related characteristics of 82 healthy subjects) changed with the clustering scenario. Sub-symbolic classifiers achieved the highest median accuracy. A generalized post-hoc interpretation of the model's clustering strategies led to a loss of median accuracy. XAI models interpreted the cluster structure almost as correctly, with only a slight loss of accuracy.
CONCLUSIONS
Assessing variable importance in clustering is essential for understanding any cluster structure. XAI models are able to provide a human-understandable interpretation of the cluster structure. Model selection must be adapted individually to the clustering problem. The advantage of comprehensibility comes at the expense of accuracy.