1. Ihongbe IE, Fouad S, Mahmoud TF, Rajasekaran A, Bhatia B. Evaluating Explainable Artificial Intelligence (XAI) techniques in chest radiology imaging through a human-centered lens. PLoS One 2024; 19:e0308758. PMID: 39383147; PMCID: PMC11463756; DOI: 10.1371/journal.pone.0308758.
Abstract
The field of radiology imaging has seen a remarkable increase in the use of deep learning (DL) algorithms to support diagnostic and treatment decisions. This rise has led to the development of Explainable AI (XAI) systems that aim to improve the transparency of, and trust in, complex DL methods. However, XAI systems face challenges in gaining acceptance within the healthcare sector, mainly due to technical hurdles in using them in practice and the lack of human-centered evaluation and validation. In this study, we focus on visual XAI systems applied to DL-enabled diagnostic systems in chest radiography. In particular, we conduct a user study to evaluate two prominent visual XAI techniques from the human perspective. To this end, we created two clinical scenarios for diagnosing pneumonia and COVID-19 using DL techniques applied to chest X-ray and CT scans. The achieved accuracy rates were 90% for pneumonia and 98% for COVID-19. Subsequently, we employed two well-known XAI methods, Grad-CAM (Gradient-weighted Class Activation Mapping) and LIME (Local Interpretable Model-agnostic Explanations), to generate visual explanations elucidating the AI decision-making process. The visual explanations were then evaluated by medical professionals in a user study in terms of clinical relevance, coherency, and user trust. In general, participants expressed a positive perception of the use of XAI systems in chest radiography, but there was a noticeable lack of awareness regarding their value and practical aspects. Regarding preferences, Grad-CAM outperformed LIME in terms of coherency and trust, although concerns were raised about its clinical usability. Our findings highlight key user-driven explainability requirements, emphasizing the importance of multi-modal explainability and the need to raise awareness of XAI systems among medical practitioners. Inclusive design was also identified as crucial to ensure better alignment of these systems with user needs.
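As a rough illustration of the explanation pipeline this abstract describes, the sketch below generates a Grad-CAM heatmap and a LIME explanation for a Keras image classifier. It is a minimal example, not the authors' code: the DenseNet121 stand-in model, the final-convolution layer name, and the random placeholder image are assumptions; a real chest X-ray model and its own last convolutional layer would be substituted.

```python
# Hedged sketch: Grad-CAM and LIME for a Keras image classifier.
# DenseNet121 and the layer name are illustrative stand-ins, not the study's model.
import numpy as np
import tensorflow as tf
from lime import lime_image  # pip install lime

model = tf.keras.applications.DenseNet121(weights="imagenet")
LAST_CONV = "conv5_block16_concat"  # assumed name of the final convolutional layer

def grad_cam(img_batch, class_idx):
    """Gradient-weighted class activation map for a batch of images."""
    grad_model = tf.keras.Model(model.inputs,
                                [model.get_layer(LAST_CONV).output, model.output])
    with tf.GradientTape() as tape:
        conv_out, preds = grad_model(img_batch)
        score = preds[:, class_idx]
    grads = tape.gradient(score, conv_out)              # d(class score)/d(feature maps)
    weights = tf.reduce_mean(grads, axis=(1, 2))        # global-average-pool the gradients
    cam = tf.einsum("bhwc,bc->bhw", conv_out, weights)  # weighted sum of feature maps
    cam = tf.nn.relu(cam)                               # keep positive evidence only
    return (cam / (tf.reduce_max(cam) + 1e-8)).numpy()  # normalise to [0, 1]

def lime_explanation(img):
    """Superpixel-based LIME explanation for one (H, W, 3) image."""
    explainer = lime_image.LimeImageExplainer()
    return explainer.explain_instance(img.astype("double"),
                                      classifier_fn=lambda x: model.predict(x, verbose=0),
                                      top_labels=1, num_samples=1000)

img = np.random.rand(1, 224, 224, 3).astype("float32")  # placeholder for a preprocessed X-ray
heatmap = grad_cam(img, class_idx=0)
lime_exp = lime_explanation(img[0])
```

In practice, the Grad-CAM map would be upsampled and overlaid on the radiograph before being shown to clinicians, which is how saliency-style explanations are typically presented in user studies of this kind.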
Affiliation(s)
- Izegbua E. Ihongbe: School of Computer Science and Digital Technologies, Aston University, Birmingham, United Kingdom
- Shereen Fouad: School of Computer Science and Digital Technologies, Aston University, Birmingham, United Kingdom
- Taha F. Mahmoud: Medical Imaging Department, University Hospital of Sharjah, Sharjah, United Arab Emirates
- Arvind Rajasekaran: Sandwell and West Birmingham Hospitals NHS Trust, Birmingham, United Kingdom
- Bahadar Bhatia: Sandwell and West Birmingham Hospitals NHS Trust, Birmingham, United Kingdom; University of Leicester, Leicester, United Kingdom
2. Cecil J, Lermer E, Hudecek MFC, Sauer J, Gaube S. Explainability does not mitigate the negative impact of incorrect AI advice in a personnel selection task. Sci Rep 2024; 14:9736. PMID: 38679619; PMCID: PMC11056364; DOI: 10.1038/s41598-024-60220-5.
Abstract
Despite the rise of decision support systems enabled by artificial intelligence (AI) in personnel selection, their impact on decision-making processes is largely unknown. Consequently, we conducted five experiments (N = 1403 students and Human Resource Management (HRM) employees) investigating how people interact with AI-generated advice in a personnel selection task. In all pre-registered experiments, we presented correct and incorrect advice. In Experiments 1a and 1b, we manipulated the source of the advice (human vs. AI). In Experiments 2a, 2b, and 2c, we further manipulated the type of explainability of the AI advice (2a and 2b: heatmaps; 2c: charts). We hypothesized that accurate and explainable advice improves decision-making. Task performance, perceived advice quality, and confidence ratings were regressed on the independent variables. The results consistently showed that incorrect advice negatively impacted performance, as people failed to dismiss it (i.e., overreliance). Additionally, the effects of the source and explainability of advice on the dependent variables were limited. The lack of reduction in participants' overreliance on inaccurate advice when the systems' predictions were made more explainable highlights the complexity of human-AI interaction and the need for regulation and quality standards in HRM.
Affiliation(s)
- Julia Cecil: Department of Psychology, LMU Center for Leadership and People Management, LMU Munich, Munich, Germany
- Eva Lermer: Department of Psychology, LMU Center for Leadership and People Management, LMU Munich, Munich, Germany; Department of Business Psychology, Technical University of Applied Sciences Augsburg, Augsburg, Germany
- Matthias F C Hudecek: Department of Experimental Psychology, University of Regensburg, Regensburg, Germany
- Jan Sauer: Department of Business Administration, University of Applied Sciences Amberg-Weiden, Weiden, Germany
- Susanne Gaube: Department of Psychology, LMU Center for Leadership and People Management, LMU Munich, Munich, Germany; UCL Global Business School for Health, University College London, London, UK
3. Buzcu B, Tessa M, Tchappi I, Najjar A, Hulstijn J, Calvaresi D, Aydoğan R. Towards interactive explanation-based nutrition virtual coaching systems. Autonomous Agents and Multi-Agent Systems 2024; 38:5. PMID: 38261966; PMCID: PMC10798935; DOI: 10.1007/s10458-023-09634-5.
Abstract
Awareness of healthy lifestyles is increasing, opening the door to personalized, intelligent health coaching applications. A demand for more than mere suggestions and mechanistic interactions has drawn attention to nutrition virtual coaching systems (NVC) as a bridge between human-machine interaction and recommender, informative, persuasive, and argumentation systems. Because NVC can rely on opaque, data-driven mechanisms, it is crucial to enable NVC to explain their behavior, for example by engaging the user in discussions, via arguments, about dietary solutions and alternatives. By doing so, transparency, user acceptance, and engagement are expected to be boosted. This study focuses on NVC agents generating personalized food recommendations based on user-specific factors such as allergies, eating habits, lifestyles, and ingredient preferences. In particular, we propose a user-agent negotiation process entailing run-time feedback mechanisms to react to both recommendations and related explanations. Lastly, the study presents the findings of experiments conducted with multi-background participants to evaluate the acceptability and effectiveness of the proposed system. The results indicate that most participants value the opportunity to provide feedback and receive explanations for recommendations, and that users appreciate receiving information tailored to their needs. Furthermore, our interactive recommendation system outperformed the corresponding traditional recommendation system in terms of the number of agreements reached and negotiation rounds required.
Affiliation(s)
- Berk Buzcu: Computer Science, Özyeğin University, Istanbul, Turkey; University of Applied Sciences and Arts Western Switzerland (HES-SO Valais-Wallis), Sierre, Switzerland
- Melissa Tessa: Computer Science, High National School of Computer Science ESI ex-INI, Algiers, Algeria
- Igor Tchappi: University of Luxembourg, Esch-sur-Alzette, Luxembourg
- Amro Najjar: Luxembourg Institute of Science and Technology, Esch-sur-Alzette, Luxembourg; University of Luxembourg, Esch-sur-Alzette, Luxembourg
- Davide Calvaresi: University of Applied Sciences and Arts Western Switzerland (HES-SO Valais-Wallis), Sierre, Switzerland
- Reyhan Aydoğan: Computer Science, Özyeğin University, Istanbul, Turkey; Interactive Intelligence, Delft University of Technology, Delft, The Netherlands; University of Alcala, Alcala de Henares, Spain
4. Celar L, Byrne RMJ. How people reason with counterfactual and causal explanations for Artificial Intelligence decisions in familiar and unfamiliar domains. Mem Cognit 2023; 51:1481-1496. PMID: 36964302; PMCID: PMC10520145; DOI: 10.3758/s13421-023-01407-5.
Abstract
Few empirical studies have examined how people understand counterfactual explanations for other people's decisions, for example, "if you had asked for a lower amount, your loan application would have been approved". Yet many current Artificial Intelligence (AI) decision support systems rely on counterfactual explanations to improve human understanding and trust. We compared counterfactual explanations to causal ones, i.e., "because you asked for a high amount, your loan application was not approved", for an AI's decisions in a familiar domain (alcohol and driving) and an unfamiliar one (chemical safety) in four experiments (n = 731). Participants were shown inputs to an AI system, its decisions, and an explanation for each decision; they attempted to predict the AI's decisions, or to make their own decisions. Participants judged counterfactual explanations more helpful than causal ones, but counterfactuals did not improve the accuracy of their predictions of the AI's decisions more than causals (Experiment 1). However, counterfactuals improved the accuracy of participants' own decisions more than causals (Experiment 2). When the AI's decisions were correct (Experiments 1 and 2), participants considered explanations more helpful and made more accurate judgements in the familiar domain than in the unfamiliar one; but when the AI's decisions were incorrect, they considered explanations less helpful and made fewer accurate judgements in the familiar domain than the unfamiliar one, whether they predicted the AI's decisions (Experiment 3a) or made their own decisions (Experiment 3b). The results corroborate the proposal that counterfactuals provide richer information than causals, because their mental representation includes more possibilities.
Affiliation(s)
- Lenart Celar: School of Psychology and Institute of Neuroscience, Trinity College Dublin, University of Dublin, Dublin, Ireland
- Ruth M J Byrne: School of Psychology and Institute of Neuroscience, Trinity College Dublin, University of Dublin, Dublin, Ireland
5. Alfeo AL, Zippo AG, Catrambone V, Cimino MGCA, Toschi N, Valenza G. From local counterfactuals to global feature importance: efficient, robust, and model-agnostic explanations for brain connectivity networks. Computer Methods and Programs in Biomedicine 2023; 236:107550. PMID: 37086584; DOI: 10.1016/j.cmpb.2023.107550.
Abstract
BACKGROUND: Explainable artificial intelligence (XAI) can enhance trust in mental state classifications by providing explanations for the reasoning behind artificial intelligence (AI) model outputs, especially for high-dimensional and highly correlated brain signals. Feature importance and counterfactual explanations are two common approaches to generating these explanations, but both have drawbacks. While feature importance methods, such as Shapley additive explanations (SHAP), can be computationally expensive and sensitive to feature correlation, counterfactual explanations only explain a single outcome instead of the entire model. METHODS: To overcome these limitations, we propose a new procedure for computing global feature importance by aggregating local counterfactual explanations. This approach is specifically tailored to fMRI signals and is based on the hypothesis that instances close to the decision boundary and their counterfactuals mainly differ in the features identified as most important for the downstream classification task. We refer to this proposed feature importance measure as the Boundary Crossing Solo Ratio (BoCSoR), since it quantifies the frequency with which a change in each feature in isolation leads to a change in classification outcome, i.e., a crossing of the model's decision boundary. RESULTS AND CONCLUSIONS: Experimental results on synthetic data and publicly available fMRI data from the Human Connectome Project show that the proposed BoCSoR measure is more robust to feature correlation and less computationally expensive than state-of-the-art methods, while being equally effective in explaining the behavior of any AI model for brain signals. These properties are crucial for medical decision support systems, where many different features are often extracted from the same physiological measures and a gold standard is absent. Consequently, computing feature importance may become computationally expensive, and a high degree of mutual correlation among features is likely, leading to unreliable results from state-of-the-art XAI methods.
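To make the boundary-crossing idea concrete, here is a hedged sketch of a BoCSoR-style score: each feature of each instance is changed in isolation, and we count how often the model's prediction flips. The perturbation rule (moving the feature to the opposite-class mean) and the toy data are assumptions made for illustration, not the procedure or data from the paper.

```python
# Hedged sketch of a boundary-crossing feature-importance score in the spirit of BoCSoR.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

X, y = make_classification(n_samples=400, n_features=10, random_state=0)
model = RandomForestClassifier(random_state=0).fit(X, y)

def boundary_crossing_ratio(model, X, y):
    preds = model.predict(X)
    class_means = {c: X[y == c].mean(axis=0) for c in np.unique(y)}
    flips = np.zeros(X.shape[1])
    for i, x in enumerate(X):
        target_mean = class_means[1 - preds[i]]    # opposite class (binary case)
        for j in range(X.shape[1]):
            x_pert = x.copy()
            x_pert[j] = target_mean[j]             # change feature j in isolation
            if model.predict(x_pert[None, :])[0] != preds[i]:
                flips[j] += 1                      # the decision boundary was crossed
    return flips / X.shape[0]                      # per-feature crossing frequency

scores = boundary_crossing_ratio(model, X, y)
print(np.argsort(scores)[::-1][:3])                # indices of the most influential features
```

Features with a high crossing frequency are those whose isolated change most often pushes an instance across the decision boundary, which is the intuition behind the global importance measure described above.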
Affiliation(s)
- Antonio Luca Alfeo: Department of Information Engineering, University of Pisa, Largo Lucio Lazzarino 1, Pisa, 56126, Italy; Bioengineering & Robotics Research Center E. Piaggio, University of Pisa, Largo Lucio Lazzarino 1, Pisa, 56126, Italy
- Antonio G Zippo: Institute of Neuroscience, Consiglio Nazionale delle Ricerche, Via Raoul Follereau 3, Vedano al Lambro (MB), 20854, Italy
- Vincenzo Catrambone: Department of Information Engineering, University of Pisa, Largo Lucio Lazzarino 1, Pisa, 56126, Italy; Bioengineering & Robotics Research Center E. Piaggio, University of Pisa, Largo Lucio Lazzarino 1, Pisa, 56126, Italy
- Mario G C A Cimino: Department of Information Engineering, University of Pisa, Largo Lucio Lazzarino 1, Pisa, 56126, Italy; Bioengineering & Robotics Research Center E. Piaggio, University of Pisa, Largo Lucio Lazzarino 1, Pisa, 56126, Italy
- Nicola Toschi: Department of Biomedicine and Prevention, University of Rome Tor Vergata, Via Montpellier 1, Roma, 00133, Italy
- Gaetano Valenza: Department of Information Engineering, University of Pisa, Largo Lucio Lazzarino 1, Pisa, 56126, Italy; Bioengineering & Robotics Research Center E. Piaggio, University of Pisa, Largo Lucio Lazzarino 1, Pisa, 56126, Italy
6. Cau FM, Hauptmann H, Spano LD, Tintarev N. Effects of AI and Logic-Style Explanations on Users' Decisions under Different Levels of Uncertainty. ACM Transactions on Interactive Intelligent Systems 2023. DOI: 10.1145/3588320.
Abstract
Existing eXplainable Artificial Intelligence (XAI) techniques support people in interpreting AI advice. However, while previous work evaluates users' understanding of explanations, factors influencing the decision support are largely overlooked in the literature. This paper addresses this gap by studying the impact of user uncertainty, AI correctness, and the interaction between AI uncertainty and explanation logic-style for classification tasks. We conducted two separate studies: one requesting participants to recognise hand-written digits and one to classify the sentiment of reviews. To assess the decision making, we analysed task performance, agreement with the AI suggestion, and the user's reliance on the XAI interface elements. Participants made their decisions relying on three pieces of information in the XAI interface (image or text instance, AI prediction, and explanation). Each participant was shown one explanation style (between-participants design), following one of three styles of logical reasoning (inductive, deductive, or abductive). This allowed us to study how different levels of AI uncertainty influence the effectiveness of different explanation styles. The results show that user uncertainty and AI correctness significantly affected users' classification decisions on the analysed metrics. In both domains (images and text), users relied mainly on the instance to decide. Users were usually overconfident about their choices, and this was more pronounced for text. Furthermore, inductive-style explanations led to over-reliance on the AI advice in both domains; they were the most persuasive, even when the AI was incorrect. The abductive and deductive styles had more complex effects depending on the domain and the AI uncertainty level.
7. Allen B. Discovering Themes in Deep Brain Stimulation Research Using Explainable Artificial Intelligence. Biomedicines 2023; 11:771. PMID: 36979750; PMCID: PMC10045890; DOI: 10.3390/biomedicines11030771.
Abstract
Deep brain stimulation is a treatment that controls symptoms by changing brain activity. The complexity of how to best treat brain dysfunction with deep brain stimulation has spawned research into artificial intelligence approaches. Machine learning is a subset of artificial intelligence that uses computers to learn patterns in data and has many healthcare applications, such as an aid in diagnosis, personalized medicine, and clinical decision support. Yet, how machine learning models make decisions is often opaque. The spirit of explainable artificial intelligence is to use machine learning models that produce interpretable solutions. Here, we use topic modeling to synthesize recent literature on explainable artificial intelligence approaches to extracting domain knowledge from machine learning models relevant to deep brain stimulation. The results show that patient classification (i.e., diagnostic models, precision medicine) is the most common problem in deep brain stimulation studies that employ explainable artificial intelligence. Other topics concern attempts to optimize stimulation strategies and the importance of explainable methods. Overall, this review supports the potential for artificial intelligence to revolutionize deep brain stimulation by personalizing stimulation protocols and adapting stimulation in real time.
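The snippet below is a loose illustration (not the paper's pipeline) of the kind of topic-modeling workflow described above, fitting latent Dirichlet allocation to a corpus of abstracts; the toy corpus and the choice of two topics are assumptions.

```python
# Illustrative topic modeling over a toy corpus of abstracts with LDA.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.decomposition import LatentDirichletAllocation

abstracts = [
    "deep brain stimulation parameters optimized with machine learning",
    "explainable models classify patients for precision medicine",
    "interpretable feature importance supports clinical decision making",
]
vec = CountVectorizer(stop_words="english")
doc_term = vec.fit_transform(abstracts)            # document-term count matrix

lda = LatentDirichletAllocation(n_components=2, random_state=0).fit(doc_term)
terms = vec.get_feature_names_out()
for k, topic in enumerate(lda.components_):
    top_terms = [terms[i] for i in topic.argsort()[-5:][::-1]]
    print(f"Topic {k}: {', '.join(top_terms)}")    # most probable words per theme
```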
Affiliation(s)
- Ben Allen: Department of Psychology, University of Kansas, Lawrence, KS 66045, USA
8. Chiu TY, Le Ny J, David JP. Temporal Logic Explanations for Dynamic Decision Systems using Anchors and Monte Carlo Tree Search. Artif Intell 2023. DOI: 10.1016/j.artint.2023.103897.
9. del Castillo Torres G, Roig-Maimó MF, Mascaró-Oliver M, Amengual-Alcover E, Mas-Sansó R. Understanding How CNNs Recognize Facial Expressions: A Case Study with LIME and CEM. Sensors (Basel) 2022; 23:131. PMID: 36616728; PMCID: PMC9824600; DOI: 10.3390/s23010131.
Abstract
Recognizing facial expressions has been a persistent goal in the scientific community. Since the rise of artificial intelligence, convolutional neural networks (CNNs) have become a popular way to recognize facial expressions, as images can be used directly as input. Current CNN models can achieve high recognition rates, but they give no clue about their reasoning process. Explainable artificial intelligence (XAI) has been developed as a means to help interpret the results obtained by machine learning models. When dealing with images, one of the most used XAI techniques is LIME, which highlights the areas of the image that contribute to a classification. The CEM method appeared as an alternative to LIME, providing explanations in a way that is natural for human classification: besides highlighting what is sufficient to justify a classification, it also identifies what should be absent to maintain it and to distinguish it from another classification. This study presents the results of comparing LIME and CEM applied to complex images such as facial expression images. While CEM could be used to explain the results on images described by a reduced number of features, LIME would be the method of choice when dealing with images described by a huge number of features.
10. Tešić M, Hahn U. Can counterfactual explanations of AI systems' predictions skew lay users' causal intuitions about the world? If so, can we correct for that? Patterns (N Y) 2022; 3:100635. PMID: 36569554; PMCID: PMC9768678; DOI: 10.1016/j.patter.2022.100635.
Abstract
Counterfactual (CF) explanations have been employed as one of the modes of explainability in explainable artificial intelligence (XAI), both to increase the transparency of AI systems and to provide recourse. Cognitive science and psychology have pointed out that people regularly use CFs to express causal relationships. Most AI systems, however, are only able to capture associations or correlations in data, so interpreting them as causal would not be justified. In this perspective, we present two experiments (total n = 364) exploring the effects of CF explanations of AI systems' predictions on lay people's causal beliefs about the real world. In Experiment 1, we found that providing CF explanations of an AI system's predictions does indeed (unjustifiably) affect people's causal beliefs regarding the factors/features the AI uses, and that people are more likely to view those factors as causal in the real world. Inspired by the literature on misinformation and health warning messaging, Experiment 2 tested whether we can correct for this unjustified change in causal beliefs. We found that pointing out that AI systems capture correlations and not necessarily causal relationships can attenuate the effects of CF explanations on people's causal beliefs.
Affiliation(s)
- Marko Tešić: Birkbeck, University of London, London, UK (corresponding author)
11. Assessing the communication gap between AI models and healthcare professionals: Explainability, utility and trust in AI-driven clinical decision-making. Artif Intell 2022. DOI: 10.1016/j.artint.2022.103839.
12. Sha L, Camburu OM, Lukasiewicz T. Rationalizing Predictions by Adversarial Information Calibration. Artif Intell 2022. DOI: 10.1016/j.artint.2022.103828.
13. G-LIME: Statistical Learning for Local Interpretations of Deep Neural Networks using Global Priors. Artif Intell 2022. DOI: 10.1016/j.artint.2022.103823.
14. Leichtmann B, Humer C, Hinterreiter A, Streit M, Mara M. Effects of Explainable Artificial Intelligence on trust and human behavior in a high-risk decision task. Computers in Human Behavior 2022. DOI: 10.1016/j.chb.2022.107539.
15. Kaczmarek-Majer K, Casalino G, Castellano G, Dominiak M, Hryniewicz O, Kamińska O, Vessio G, Díaz-Rodríguez N. PLENARY: Explaining black-box models in natural language through fuzzy linguistic summaries. Inf Sci (N Y) 2022. DOI: 10.1016/j.ins.2022.10.010.
16. Verhagen RS, Neerincx MA, Tielman ML. The influence of interdependence and a transparent or explainable communication style on human-robot teamwork. Front Robot AI 2022; 9:993997. PMID: 36158603; PMCID: PMC9493028; DOI: 10.3389/frobt.2022.993997.
Abstract
Humans and robots are increasingly working together in human-robot teams. Teamwork requires communication, especially when interdependence between team members is high. In previous work, we identified a conceptual difference between sharing what you are doing (i.e., being transparent) and why you are doing it (i.e., being explainable). Although the latter might sound better, it is important to avoid information overload. Therefore, an online experiment (n = 72) was conducted to study the effect of the communication style of a robot (silent, transparent, explainable, or adaptive based on time pressure and relevancy) on human-robot teamwork. We examined the effects of these communication styles on trust in the robot, workload during the task, situation awareness, reliance on the robot, human contribution during the task, human communication frequency, and team performance. Moreover, we included two levels of interdependence between human and robot (high vs. low), since mutual dependency might influence which communication style is best. Participants collaborated with a virtual robot (RescueBot) during two simulated search and rescue tasks varying in their level of interdependence. Results confirm that, in general, robot communication results in more trust in and understanding of the robot, with no evidence of a higher workload when the robot communicates or adds explanations to being transparent. Providing explanations, however, did result in more reliance on RescueBot. Furthermore, compared to being silent, being explainable only results in higher situation awareness when interdependence is high. Results further show that high interdependence decreases trust, reliance, and team performance while increasing workload and situation awareness. High interdependence also increases human communication if the robot is not silent, increases the human rescue contribution if the robot does not provide explanations, and strengthens the positive association between situation awareness and team performance. From these results, we conclude that robot communication is crucial for human-robot teamwork and that important differences exist between being transparent, explainable, or adaptive. Our findings also highlight the fundamental importance of interdependence in studies on explainability in robots.
Affiliation(s)
- Ruben S. Verhagen: Interactive Intelligence, Intelligent Systems Department, Delft University of Technology, Delft, Netherlands (corresponding author)
- Mark A. Neerincx: Interactive Intelligence, Intelligent Systems Department, Delft University of Technology, Delft, Netherlands; Human-Machine Teaming, Netherlands Organization for Applied Scientific Research (TNO), Amsterdam, Netherlands
- Myrthe L. Tielman: Interactive Intelligence, Intelligent Systems Department, Delft University of Technology, Delft, Netherlands
17. Buijsman S. Defining Explanation and Explanatory Depth in XAI. Minds Mach (Dordr) 2022. DOI: 10.1007/s11023-022-09607-9.
Abstract
Explainable artificial intelligence (XAI) aims to help people understand black box algorithms, particularly their outputs. But what are these explanations, and when is one explanation better than another? The manipulationist definition of explanation from the philosophy of science offers good answers to these questions, holding that an explanation consists of a generalization that shows what happens in counterfactual cases. Furthermore, when it comes to explanatory depth, this account holds that a generalization with more abstract variables, broader scope, and/or greater accuracy is better. By applying these definitions and contrasting them with alternative definitions in the XAI literature, I hope to help clarify what a good explanation is for AI.
18. Conclusive local interpretation rules for random forests. Data Min Knowl Discov 2022. DOI: 10.1007/s10618-022-00839-y.
19. Herm LV, Heinrich K, Wanner J, Janiesch C. Stop ordering machine learning algorithms by their explainability! A user-centered investigation of performance and explainability. International Journal of Information Management 2022. DOI: 10.1016/j.ijinfomgt.2022.102538.
20. Explainable Artificial Intelligence in Data Science. Minds Mach (Dordr) 2022. DOI: 10.1007/s11023-022-09603-z.
Abstract
A widespread need to explain the behavior and outcomes of AI-based systems has emerged, due to their ubiquitous presence, providing renewed momentum to the relatively new research area of eXplainable AI (XAI). Nowadays, the importance of XAI lies in the fact that the increasing transfer of control to this kind of system for decision making, or at least its use for assisting executive stakeholders, already affects many sensitive realms (as in politics, the social sciences, or law). The handover of decision-making power to opaque AI systems makes explaining them mandatory, primarily in application scenarios where the stakeholders are unaware of both the advanced technology applied and the basic principles governing the technological solutions. The issue should not be reduced to a merely technical problem; the explainer is compelled to transmit richer knowledge about the system, including its role within the informational ecosystem in which he or she works. To achieve such an aim, the explainer could exploit, if necessary, practices from other scientific and humanistic areas. The first aim of the paper is to emphasize and justify the need for a multidisciplinary approach that benefits from part of the scientific and philosophical corpus on explaining, underscoring the particular nuances of the issue within the field of Data Science. The second objective is to develop arguments justifying the authors' bet on a more prominent role for ideas inspired, on the one hand, by formal techniques from Knowledge Representation and Reasoning and, on the other hand, by the modeling of human reasoning when facing the explanation. In this way, modeling practices for explaining would seek a sound balance between purely technical justification and explainer-explainee agreement.
21. Fokkema M, Iliescu D, Greiff S, Ziegler M. Machine Learning and Prediction in Psychological Assessment. European Journal of Psychological Assessment 2022. DOI: 10.1027/1015-5759/a000714.
Abstract
Modern prediction methods from machine learning (ML) and artificial intelligence (AI) are becoming increasingly popular, also in the field of psychological assessment. These methods provide unprecedented flexibility for modeling large numbers of predictor variables and non-linear associations between predictors and responses. In this paper, we look at what these methods may contribute to the assessment of criterion validity and what their possible drawbacks are. We apply a range of modern statistical prediction methods to a dataset for predicting the university major completed, based on the subscales and items of a scale for vocational preferences. The results indicate that logistic regression combined with regularization already performs strikingly well in terms of predictive accuracy. More sophisticated techniques for incorporating non-linearities can further contribute to predictive accuracy and validity, but often only marginally.
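A minimal sketch of the kind of comparison the abstract reports: penalized (regularized) logistic regression against a more flexible, non-linear learner, scored by cross-validated accuracy. The synthetic data and hyperparameters are illustrative assumptions, not the paper's vocational-preference dataset.

```python
# Hedged sketch: regularized logistic regression vs. a non-linear learner.
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import cross_val_score

X, y = make_classification(n_samples=600, n_features=40, n_informative=10,
                           n_classes=4, n_clusters_per_class=1, random_state=0)

logit = LogisticRegression(penalty="l2", C=0.5, max_iter=2000)  # L2-penalized multinomial model
gbm = GradientBoostingClassifier(random_state=0)                # non-linear comparison model

for name, clf in [("penalized logistic", logit), ("gradient boosting", gbm)]:
    acc = cross_val_score(clf, X, y, cv=5, scoring="accuracy").mean()
    print(f"{name}: {acc:.3f}")
```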
Affiliation(s)
- Marjolein Fokkema: Methodology and Statistics Department, Institute of Psychology, Leiden University, The Netherlands
- Dragos Iliescu: Faculty of Psychology and Educational Sciences, University of Bucharest, Romania
- Samuel Greiff: Department of Behavioural and Cognitive Sciences, University of Luxembourg, Luxembourg
22. Fox S. Behavioral Ethics Ecologies of Human-Artificial Intelligence Systems. Behav Sci (Basel) 2022; 12:bs12040103. PMID: 35447675; PMCID: PMC9029794; DOI: 10.3390/bs12040103.
Abstract
Historically, the evolution of behaviors often took place in environments that changed little over millennia. By contrast, today, rapid changes to behaviors and environments come from the introduction of artificial intelligence (AI) and the infrastructures that facilitate its application. Behavioral ethics is concerned with how interactions between individuals and their environments can lead people to questionable decisions and dubious actions, for example, interactions between an individual's self-regulatory resource depletion and organizational pressure to take non-ethical actions. In this paper, four fundamental questions of behavioral ecology are applied to analyze human behavioral ethics in human-AI systems: what is the function of a behavioral trait, how do behavioral traits evolve in populations, what are the mechanisms of behavioral traits, and how can they differ among individuals. This is achieved through reference to vehicle navigation systems and healthcare diagnostic systems, which are enabled by AI. Overall, the paper provides two main contributions: first, a behavioral ecology analysis of behavioral ethics; second, the application of behavioral ecology questions to identify opportunities and challenges for ethical human-AI systems.
Affiliation(s)
- Stephen Fox: VTT Technical Research Centre of Finland, FI-02150 Espoo, Finland
23. Zerilli J, Bhatt U, Weller A. How transparency modulates trust in artificial intelligence. Patterns (N Y) 2022; 3:100455. PMID: 35465233; PMCID: PMC9023880; DOI: 10.1016/j.patter.2022.100455.
Abstract
The study of human-machine systems is central to a variety of behavioral and engineering disciplines, including management science, human factors, robotics, and human-computer interaction. Recent advances in artificial intelligence (AI) and machine learning have brought the study of human-AI teams into sharper focus. An important set of questions for those designing human-AI interfaces concerns trust, transparency, and error tolerance. Here, we review the emerging literature on this important topic, identify open questions, and discuss some of the pitfalls of human-AI team research. We present opposition (extreme algorithm aversion or distrust) and loafing (extreme automation complacency or bias) as lying at opposite ends of a spectrum, with algorithmic vigilance representing an ideal mid-point. We suggest that, while transparency may be crucial for facilitating appropriate levels of trust in AI and thus for counteracting aversive behaviors and promoting vigilance, transparency should not be conceived solely in terms of the explainability of an algorithm. Dynamic task allocation, as well as the communication of confidence and performance metrics, among other strategies, may ultimately prove more useful to users than explanations from algorithms and significantly more effective in promoting vigilance. We further suggest that, while both aversive and appreciative attitudes are detrimental to optimal human-AI team performance, strategies to curb aversion are likely to be more important in the longer term than those attempting to mitigate appreciation. Our wider aim is to channel disparate efforts in human-AI team research into a common framework and to draw attention to the ecological validity of results in this field.
Affiliation(s)
- John Zerilli: Institute for Ethics in AI and Faculty of Law, University of Oxford, St Cross Building, St Cross Road, Oxford OX1 3U, UK
- Umang Bhatt: Leverhulme Centre for the Future of Intelligence and Department of Engineering, University of Cambridge, Trumpington Street, Cambridge CB2 1PZ, UK; The Alan Turing Institute, British Library, 96 Euston Road, London NW1 2DB, UK
- Adrian Weller: Leverhulme Centre for the Future of Intelligence and Department of Engineering, University of Cambridge, Trumpington Street, Cambridge CB2 1PZ, UK; The Alan Turing Institute, British Library, 96 Euston Road, London NW1 2DB, UK
25. Week-Wise Student Performance Early Prediction in Virtual Learning Environment Using a Deep Explainable Artificial Intelligence. Applied Sciences (Basel) 2022. DOI: 10.3390/app12041885.
Abstract
Early prediction of students' learning performance and analysis of student behavior in a virtual learning environment (VLE) are crucial to minimize the high failure rate in online courses during the COVID-19 pandemic. Nevertheless, traditional machine learning models fail to predict student performance in the early weeks due to the lack of week-wise student activity data (i.e., spatiotemporal feature issues). Furthermore, the imbalanced data distribution in the VLE impacts the prediction model's performance. Thus, there are severe challenges in handling spatiotemporal features, imbalanced data sets, and the lack of explainability needed to enhance confidence in the prediction system. Therefore, an intelligent framework for explainable student performance prediction (ESPP) is proposed in this study to provide interpretability of the prediction results. First, the framework utilized a time-series weekly student activity data set and dealt with the imbalanced VLE data distribution using a hybrid data sampling method. Then, a combination of a convolutional neural network (CNN) and long short-term memory (LSTM) was employed to extract spatiotemporal features and develop the early-prediction deep learning (DL) model. Finally, the DL model was explained by visualizing and analyzing typical predictions, students' activity maps, and feature importance. The numerical results of cross-validation showed that the proposed DL models (i.e., the combined CNN-LSTM and ConvLSTM) performed better in the early prediction cases than the baseline LSTM, support vector machine (SVM), and logistic regression (LR) models.
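A hedged sketch of a CNN-LSTM early-prediction model of the kind described here: a 1-D convolution extracts patterns from weekly activity features, an LSTM models week-to-week dynamics, and a sigmoid head outputs an at-risk probability. The shapes, layer sizes, and synthetic data are assumptions for illustration, not the ESPP framework itself.

```python
# Hedged sketch: CNN-LSTM classifier over week-wise student activity sequences.
import numpy as np
import tensorflow as tf

N_WEEKS, N_ACTIVITY_FEATURES = 10, 20              # e.g., clicks per VLE activity type per week
X = np.random.rand(256, N_WEEKS, N_ACTIVITY_FEATURES).astype("float32")  # placeholder data
y = np.random.randint(0, 2, size=(256,))           # placeholder pass/fail labels

model = tf.keras.Sequential([
    tf.keras.Input(shape=(N_WEEKS, N_ACTIVITY_FEATURES)),
    tf.keras.layers.Conv1D(32, kernel_size=3, padding="same", activation="relu"),  # local weekly patterns
    tf.keras.layers.LSTM(32),                       # temporal (week-to-week) dynamics
    tf.keras.layers.Dense(1, activation="sigmoid")  # at-risk probability
])
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
model.fit(X, y, epochs=2, batch_size=32, verbose=0)
```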
26. A Systematic Review of Explainable Artificial Intelligence in Terms of Different Application Domains and Tasks. Applied Sciences (Basel) 2022. DOI: 10.3390/app12031353.
Abstract
Artificial intelligence (AI) and machine learning (ML) have recently improved radically and are now employed in almost every application domain to develop automated or semi-automated systems. Highly accurate models are being developed, but often with a paucity of explainability and interpretability; to facilitate greater human acceptance of these systems, explainable artificial intelligence (XAI) has therefore experienced significant growth over the last couple of years. The literature shows evidence from numerous studies on the philosophy and methodologies of XAI. Nonetheless, there is an evident scarcity of secondary studies on the application domains and tasks, let alone review studies following prescribed guidelines, that could enable researchers to understand current trends in XAI and guide future research on domain- and application-specific method development. Therefore, this paper presents a systematic literature review (SLR) of recent developments in XAI methods and evaluation metrics across different application domains and tasks. The study considers 137 articles published in recent years and identified through prominent bibliographic databases. This systematic synthesis of research articles resulted in several analytical findings: XAI methods are mostly developed for safety-critical domains worldwide, deep learning and ensemble models are exploited more than other types of AI/ML models, visual explanations are more acceptable to end-users, and robust evaluation metrics are being developed to assess the quality of explanations. Research studies have addressed the addition of explanations to widely used AI/ML models for expert users. However, more attention is required to generate explanations for general users in sensitive domains such as finance and the judicial system.
27. Woensel WV, Scioscia F, Loseto G, Seneviratne O, Patton E, Abidi S, Kagal L. Explainable Clinical Decision Support: Towards Patient-Facing Explanations for Education and Long-Term Behavior Change. Artif Intell Med 2022. DOI: 10.1007/978-3-031-09342-5_6.
29. Help Me Learn! Architecture and Strategies to Combine Recommendations and Active Learning in Manufacturing. Information 2021. DOI: 10.3390/info12110473.
Abstract
This research work describes an architecture for building a system that guides a user from a forecast generated by a machine learning model through a sequence of decision-making steps. The system is demonstrated in a manufacturing demand forecasting use case and can be extended to other domains. In addition, the system provides the means for knowledge acquisition by gathering data from users. Finally, it implements an active learning component and compares multiple strategies to recommend media news to the user. We compare such strategies through a set of experiments to understand how they balance learning and provide accurate media news recommendations to the user. The media news aims to provide additional context to demand forecasts and enhance judgment on decision-making.
30. “That's (not) the output I expected!” On the role of end user expectations in creating explanations of AI systems. Artif Intell 2021. DOI: 10.1016/j.artint.2021.103507.