1. Caldwell S, Sweetser P, O'Donnell N, Knight MJ, Aitchison M, Gedeon T, Johnson D, Brereton M, Gallagher M, Conroy D. An Agile New Research Framework for Hybrid Human-AI Teaming: Trust, Transparency, and Transferability. ACM Transactions on Interactive Intelligent Systems 2022. [DOI: 10.1145/3514257]
Abstract
We propose a new research framework by which the nascent discipline of human-AI teaming can be explored within experimental environments in preparation for transferal to real-world contexts. We examine the existing literature and unanswered research questions through the lens of an Agile approach to construct our proposed framework. Our framework aims to provide a structure for understanding the macro features of this research landscape, supporting holistic research into the acceptability of human-AI teaming to human team members and the affordances of AI team members. The framework has the potential to enhance decision-making and performance of hybrid human-AI teams. Further, our framework proposes the application of Agile methodology for research management and knowledge discovery. We propose a transferability pathway for hybrid teaming to be initially tested in a safe environment, such as a real-time strategy video game, with elements of lessons learned that can be transferred to real-world situations.
Affiliation(s)
- Tom Gedeon, Australian National University, Australia
2. Chong L, Zhang G, Goucher-Lambert K, Kotovsky K, Cagan J. Human confidence in artificial intelligence and in themselves: The evolution and impact of confidence on adoption of AI advice. Computers in Human Behavior 2022. [DOI: 10.1016/j.chb.2021.107018]
3. Chacon A, Kausel EE, Reyes T. A longitudinal approach for understanding algorithm use. Journal of Behavioral Decision Making 2022. [DOI: 10.1002/bdm.2275]
Affiliation(s)
- Alvaro Chacon, School of Engineering, Pontificia Universidad Católica de Chile, Santiago, Chile
- Edgar E. Kausel, School of Management, Pontificia Universidad Católica de Chile, Santiago, Chile
- Tomas Reyes, School of Engineering, Pontificia Universidad Católica de Chile, Santiago, Chile
4. Daronnat S, Azzopardi L, Halvey M, Dubiel M. Inferring Trust From Users' Behaviours: Agents' Predictability Positively Affects Trust, Task Performance and Cognitive Load in Human-Agent Real-Time Collaboration. Frontiers in Robotics and AI 2021; 8:642201. [PMID: 34307467] [PMCID: PMC8295498] [DOI: 10.3389/frobt.2021.642201]
Abstract
Collaborative virtual agents help human operators to perform tasks in real time. For this collaboration to be effective, human operators must appropriately trust the agent(s) they are interacting with. Multiple factors influence trust, such as the context of interaction, prior experiences with automated systems and the quality of the help offered by agents in terms of its transparency and performance. Most of the literature on trust in automation has identified the performance of the agent as a key factor influencing trust. However, other work has shown that the behavior of the agent, the type of the agent's errors, and the predictability of the agent's actions can influence the likelihood of the user's reliance on the agent and the efficiency of task completion. Our work focuses on how agents' predictability affects cognitive load, performance and users' trust in a real-time human-agent collaborative task. We used an interactive aiming task where participants had to collaborate with different agents that varied in terms of their predictability and performance. The setup used behavioral information (such as task performance and reliance on the agent) as well as standardized survey instruments to estimate participants' reported trust in the agent, cognitive load and perception of task difficulty. Thirty participants took part in our lab-based study. Our results showed that agents with more predictable behaviors had a more positive impact on task performance, reliance and trust while reducing cognitive workload. In addition, we investigated the human-agent trust relationship by creating models that could predict participants' trust ratings using interaction data. We found that we could reliably estimate participants' reported trust in the agents using information related to performance, task difficulty and reliance. This study provides insight into the behavioral factors that are most meaningful for anticipating complacent or distrusting attitudes toward automation. With this work, we seek to pave the way for the development of trust-aware agents capable of responding more appropriately to users by being able to monitor the components of the human-agent relationship that are most salient for trust calibration.
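The abstract reports that trust ratings could be reliably estimated from performance, task difficulty and reliance, but does not specify the model. Below is a minimal sketch of one plausible approach, assuming per-participant features with hypothetical names (hit_rate, reliance, difficulty) and a simple linear model; it is not the paper's actual pipeline.

```python
# Illustrative sketch only: the paper does not publish its model code.
# Feature names and the synthetic data below are hypothetical stand-ins.
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
n = 30  # participants, matching the study's sample size

# Synthetic data, directionally consistent with the abstract: trust rises
# with performance and reliance, falls with perceived difficulty.
hit_rate = rng.uniform(0.3, 0.95, n)
reliance = rng.uniform(0.0, 1.0, n)
difficulty = rng.uniform(1, 7, n)
trust = 2 + 4 * hit_rate + 2 * reliance - 0.3 * difficulty + rng.normal(0, 0.5, n)

X = np.column_stack([hit_rate, reliance, difficulty])
model = LinearRegression().fit(X, trust)
print("coefficients:", model.coef_)
print("cross-validated R^2:", cross_val_score(model, X, trust, cv=5).mean())
```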
Affiliation(s)
- Sylvain Daronnat, Department of Computer and Information Sciences, University of Strathclyde, Glasgow, United Kingdom
- Leif Azzopardi, Department of Computer and Information Sciences, University of Strathclyde, Glasgow, United Kingdom
- Martin Halvey, Department of Computer and Information Sciences, University of Strathclyde, Glasgow, United Kingdom
- Mateusz Dubiel, Department of Computer and Information Sciences, University of Strathclyde, Glasgow, United Kingdom
5. Douer N, Meyer J. Theoretical, Measured, and Subjective Responsibility in Aided Decision Making. ACM Transactions on Interactive Intelligent Systems 2021. [DOI: 10.1145/3425732]
Abstract
When humans interact with intelligent systems, their causal responsibility for outcomes becomes equivocal. We analyze the descriptive abilities of a newly developed responsibility quantification model (ResQu) to predict actual human responsibility and perceptions of responsibility in the interaction with intelligent systems. In two laboratory experiments, participants performed a classification task. They were aided by classification systems with different capabilities. We compared the predicted theoretical responsibility values to the actual measured responsibility participants took on and to their subjective rankings of responsibility. The model predictions were strongly correlated with both measured and subjective responsibility. Participants’ behavior with each system was influenced by the system and human capabilities, but also by the subjective perceptions of these capabilities and the perception of the participant's own contribution. A bias existed only when participants with poor classification capabilities relied less than optimally on a system that had superior classification capabilities and assumed higher-than-optimal responsibility. The study implies that when humans interact with advanced intelligent systems, with capabilities that greatly exceed their own, their comparative causal responsibility will be small, even if formally the human is assigned major roles. Simply putting a human into the loop does not ensure that the human will meaningfully contribute to the outcomes. The results demonstrate the descriptive value of the ResQu model to predict behavior and perceptions of responsibility by considering the characteristics of the human, the intelligent system, the environment, and some systematic behavioral biases. The ResQu model is a new quantitative method that can be used in system design and can guide policy and legal decisions regarding human responsibility in events involving intelligent systems.
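The abstract does not reproduce the ResQu formula. As a rough, hypothetical illustration of the underlying information-theoretic idea (quantifying the human's unique contribution to reducing outcome uncertainty; this is not necessarily the exact ResQu definition), one can ask how much uncertainty about the outcome Z the human's action Y resolves beyond what the aid's output X already resolves:

```python
# Hypothetical illustration of an information-theoretic responsibility
# share. The exact ResQu formula is in the cited paper; this only sketches
# the general idea, and all probabilities below are made up.
import numpy as np

def entropy(p):
    """Shannon entropy in bits; ignores zero-probability cells."""
    p = p[p > 0]
    return -np.sum(p * np.log2(p))

# Joint distribution P[x, y, z] over aid output X, human action Y, outcome Z.
P = np.array([
    [[0.30, 0.02], [0.05, 0.03]],   # X = 0
    [[0.04, 0.06], [0.02, 0.48]],   # X = 1
])
P /= P.sum()

H_Z = entropy(P.sum(axis=(0, 1)))                        # H(Z)
H_Z_given_X = sum(entropy(P[x].sum(axis=0) / P[x].sum()) * P[x].sum()
                  for x in range(P.shape[0]))            # H(Z|X)
H_Z_given_XY = sum(entropy(P[x, y] / P[x, y].sum()) * P[x, y].sum()
                   for x in range(P.shape[0])
                   for y in range(P.shape[1]))           # H(Z|X,Y)

# Share of outcome uncertainty uniquely resolved by the human, given the aid.
human_share = (H_Z_given_X - H_Z_given_XY) / H_Z
print(f"illustrative human responsibility share: {human_share:.2f}")
```

With a highly capable aid, H(Z|X) is already small, so the human's share shrinks, mirroring the abstract's point that a human in the loop may contribute little when the system's capabilities greatly exceed their own.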
6. Zellner M, Abbas AE, Budescu DV, Galstyan A. A survey of human judgement and quantitative forecasting methods. Royal Society Open Science 2021; 8:201187. [PMID: 33972849] [PMCID: PMC8074796] [DOI: 10.1098/rsos.201187]
Abstract
This paper's top-level goal is to provide an overview of research conducted in the many academic domains concerned with forecasting. By providing a summary encompassing these domains, this survey connects them, establishing a common ground for future discussions. To this end, we survey literature on human judgement and quantitative forecasting as well as hybrid methods that involve both humans and algorithmic approaches. The survey starts with key search terms that identified more than 280 publications in the fields of computer science, operations research, risk analysis, decision science, psychology and forecasting. Results show an almost 10-fold increase in the application-focused forecasting literature between the 1990s and the current decade, with a clear rise of quantitative, data-driven forecasting models. Comparative studies of quantitative methods and human judgement show that (1) neither method is universally superior, and (2) the better method varies as a function of factors such as availability, quality, extent and format of data, suggesting that (3) the two approaches can complement each other to yield more accurate and resilient models. We also identify four research thrusts in the human/machine-forecasting literature: (i) the choice of the appropriate quantitative model, (ii) the nature of the interaction between quantitative models and human judgement, (iii) the training and incentivization of human forecasters, and (iv) the combination of multiple forecasts (both algorithmic and human) into one. This review surveys current research in all four areas and argues that future research in the field of human/machine forecasting needs to consider all of them when investigating predictive performance. We also address some of the ethical dilemmas that might arise due to the combination of quantitative models with human judgement.
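The survey's fourth thrust, combining multiple forecasts into one, is easy to make concrete. A minimal sketch of one common baseline, weighting each source by its inverse historical error (the survey covers many more sophisticated schemes; all numbers here are hypothetical):

```python
# Sketch of a forecast-combination baseline: weight each source (human or
# algorithmic) by the inverse of its historical mean absolute error.
# Illustrative only; source names and values are hypothetical.
past_mae  = {"human_panel": 4.2, "arima_model": 3.1, "ml_model": 2.5}
forecasts = {"human_panel": 118.0, "arima_model": 112.0, "ml_model": 109.0}

weights = {name: 1.0 / mae for name, mae in past_mae.items()}
total = sum(weights.values())
combined = sum(weights[name] * forecasts[name] for name in forecasts) / total
print(f"combined forecast: {combined:.1f}")  # lower-error sources count more
```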
Affiliation(s)
- Ali E. Abbas, University of Southern California, Los Angeles, CA, USA
- Aram Galstyan, University of Southern California, Los Angeles, CA, USA
7. Berger B, Adam M, Rühr A, Benlian A. Watch Me Improve—Algorithm Aversion and Demonstrating the Ability to Learn. Business & Information Systems Engineering 2020. [DOI: 10.1007/s12599-020-00678-5]
Abstract
Owing to advancements in artificial intelligence (AI) and specifically in machine learning, information technology (IT) systems can support humans in an increasing number of tasks. Yet, previous research indicates that people often prefer human support to support by an IT system, even if the latter provides superior performance – a phenomenon called algorithm aversion. A possible cause of algorithm aversion put forward in literature is that users lose trust in IT systems they become familiar with and perceive to err, for example, making forecasts that turn out to deviate from the actual value. Therefore, this paper evaluates the effectiveness of demonstrating an AI-based system’s ability to learn as a potential countermeasure against algorithm aversion in an incentive-compatible online experiment. The experiment reveals how the nature of an erring advisor (i.e., human vs. algorithmic), its familiarity to the user (i.e., unfamiliar vs. familiar), and its ability to learn (i.e., non-learning vs. learning) influence a decision maker’s reliance on the advisor’s judgement for an objective and non-personal decision task. The results reveal no difference in the reliance on unfamiliar human and algorithmic advisors, but differences in the reliance on familiar human and algorithmic advisors that err. Demonstrating an advisor’s ability to learn, however, offsets the effect of familiarity. Therefore, this study contributes to an enhanced understanding of algorithm aversion and is one of the first to examine how users perceive whether an IT system is able to learn. The findings provide theoretical and practical implications for the employment and design of AI-based systems.
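Reliance on an advisor's judgement in judge-advisor experiments of this kind is often operationalized as weight of advice (WOA). The abstract does not name the paper's exact measure, so the following is a sketch under that assumption:

```python
# Sketch of the weight-of-advice (WOA) measure common in judge-advisor
# studies: WOA = (final - initial) / (advice - initial). The abstract does
# not state the paper's exact reliance measure, so treat this as an
# assumption. WOA = 1 means full adoption of the advice, 0 means none.
def weight_of_advice(initial: float, advice: float, final: float) -> float:
    if advice == initial:                  # advice offers nothing to shift toward
        return 0.0
    woa = (final - initial) / (advice - initial)
    return max(0.0, min(1.0, woa))         # clip to [0, 1], as is conventional

# A forecaster first estimates 100, the advisor says 140, and the final
# answer is 130: the advisor received 75% of the weight.
print(weight_of_advice(100, 140, 130))     # 0.75
```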
8. An experience-based contrast effect when relying on a decision aid. SN Applied Sciences 2020. [DOI: 10.1007/s42452-020-03236-6]
9. Juravle G, Boudouraki A, Terziyska M, Rezlescu C. Trust in artificial intelligence for medical diagnoses. Progress in Brain Research 2020; 253:263-282. [PMID: 32771128] [DOI: 10.1016/bs.pbr.2020.06.006]
Abstract
We present two online experiments investigating trust in artificial intelligence (AI) as a primary and secondary medical diagnosis tool and one experiment testing two methods to increase trust in AI. Participants in Experiment 1 read hypothetical scenarios of low- and high-risk diseases, followed by two sequential diagnoses, and estimated their trust in the medical findings. In three between-participants groups, the first and second diagnoses were given by: human and AI, AI and human, and human and human doctors, respectively. In Experiment 2 we examined whether people expect higher standards of performance from AI than from human doctors in order to trust AI treatment recommendations. In Experiment 3 we investigated the possibility of increasing trust in AI diagnoses by: (i) informing our participants that the AI outperforms the human doctor, and (ii) nudging them to prefer AI diagnoses in a choice between AI and human doctors. Results indicate lower trust in AI overall, as well as lower trust in diagnoses of high-risk diseases. Participants trusted AI doctors less than humans for first diagnoses, and they were also less likely to trust a second opinion from an AI doctor for high-risk diseases. Surprisingly, results highlight that people hold comparable standards of performance for AI and human doctors and that trust in AI does not increase when people are told the AI outperforms the human doctor. Importantly, we find that the gap in trust between AI and human diagnoses is eliminated when people are nudged to select AI in a free-choice paradigm between human and AI diagnoses, with trust in AI diagnoses significantly increased when participants could choose their doctor. These findings isolate control over one's medical practitioner as a valid candidate for future trust-related research on medical diagnosis and highlight a promising path to smoother acceptance of AI diagnoses amongst patients.
Affiliation(s)
- Georgiana Juravle, Faculty of Psychology and Educational Sciences, Alexandru Ioan Cuza University, Iasi, Romania
- Andriana Boudouraki, School of Computer Science, University of Nottingham, Nottingham, United Kingdom
- Miglena Terziyska, Department of Experimental Psychology, University College London, London, United Kingdom
- Constantin Rezlescu, Department of Experimental Psychology, University College London, London, United Kingdom
10. Alexander V, Blinder C, Zak PJ. Why trust an algorithm? Performance, cognition, and neurophysiology. Computers in Human Behavior 2018. [DOI: 10.1016/j.chb.2018.07.026]
11. Sutherland SC, Harteveld C, Young ME. Effects of the Advisor and Environment on Requesting and Complying With Automated Advice. ACM Transactions on Interactive Intelligent Systems 2016. [DOI: 10.1145/2905370]
Abstract
Given the rapid technological advances in our society and the increase in artificial and automated advisors with whom we interact on a daily basis, it is becoming increasingly necessary to understand how users interact with and why they choose to request and follow advice from these types of advisors. More specifically, it is necessary to understand errors in advice utilization. In the present study, we propose a methodological framework for studying interactions between users and automated or other artificial advisors. Specifically, we propose the use of virtual environments and the tarp technique for stimulus sampling, ensuring sufficient sampling of important extreme values and the stimulus space between those extremes. We use this proposed framework to identify the impact of several factors on when and how advice is used. Additionally, because these interactions take place in different environments, we explore the impact of where the interaction takes place on the decision to interact. We varied the cost of advice, the reliability of the advisor, and the predictability of the environment to better understand the impact of these factors on the overutilization of suboptimal advisors and underutilization of optimal advisors. We found that less predictable environments, more reliable advisors, and lower costs for advice led to overutilization, whereas more predictable environments and less reliable advisors led to underutilization. Moreover, once advice was received, users took longer to make a final decision, suggesting less confidence and trust in the advisor when the reliability of the advisor was lower, the environment was less predictable, and the advice was not consistent with the environmental cues. These results contribute to a more complete understanding of advice utilization and trust in advisors.
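The abstract describes ensuring sufficient sampling of extreme stimulus values plus coverage of the space between them, but does not define the tarp technique itself. The sketch below merely illustrates one simple way to oversample extremes while covering the interior; it is our assumption for illustration, not the tarp technique:

```python
# Hypothetical illustration of extreme-plus-interior stimulus sampling.
# This is NOT the tarp technique from the paper, just a simple scheme that
# guarantees the extremes are sampled while covering the space between.
import random

def sample_stimuli(low: float, high: float, n: int, extreme_share: float = 0.3):
    """Return n stimulus values in [low, high]; a fixed share sits at the
    two extremes, the rest are drawn uniformly from the interior."""
    n_extreme = int(n * extreme_share)
    extremes = [low if i % 2 == 0 else high for i in range(n_extreme)]
    interior = [random.uniform(low, high) for _ in range(n - n_extreme)]
    return extremes + interior

# e.g. advisor reliability levels between 50% and 100% for 20 trials
print(sample_stimuli(0.5, 1.0, 20))
```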