1. Jessup SA, Alarcon GM, Willis SM, Lee MA. A closer look at how experience, task domain, and self-confidence influence reliance towards algorithms. Applied Ergonomics 2024; 121:104363. [PMID: 39096745 DOI: 10.1016/j.apergo.2024.104363]
Abstract
Prior research has demonstrated experience with a forecasting algorithm decreases reliance behaviors (i.e., the action of relying on the algorithm). However, the influence of model experience on reliance intentions (i.e., an intention or willingness to rely on the algorithm) has not been explored. Additionally, other factors such as self-confidence and domain knowledge are posited to influence algorithm reliance. The objective of this research was to examine how experience with a statistical model, task domain (used car sales, college grade point average (GPA), GitHub pull requests), and self-confidence influence reliance intentions, reliance behaviors, and perceived accuracy of one's own estimates and the model's estimates. Participants (N = 347) were recruited online and completed a forecasting task. Results indicate that there was a statistically significant effect of self-confidence and task domain on reliance intentions, reliance behaviors, and perceived accuracy. However, unlike previous findings, model experience did not significantly influence reliance behavior, nor did it lead to significant changes in reliance intentions or perceived accuracy of oneself or the model. Our data suggest that factors such as task domain and self-confidence influence algorithm use more so than model experience. Individual differences and situational factors should be considered important aspects that influence forecasters' decisions to rely on predictions from a model or to instead use their own estimates, which can lead to sub-optimal performance.
Affiliation(s)
- Sarah A Jessup: Consortium of Universities, Wright-Patterson AFB, OH, United States
- Gene M Alarcon: Air Force Research Laboratory, Wright-Patterson AFB, OH, United States
- Sasha M Willis: General Dynamics Information Technology, Dayton, OH, United States
- Michael A Lee: General Dynamics Information Technology, Dayton, OH, United States
2. Malle BF, Scheutz M, Cusimano C, Voiklis J, Komatsu T, Thapa S, Aladia S. People's judgments of humans and robots in a classic moral dilemma. Cognition 2024; 254:105958. [PMID: 39362054 DOI: 10.1016/j.cognition.2024.105958]
Abstract
How do ordinary people evaluate robots that make morally significant decisions? Previous work has found both equal and different evaluations, and different ones in either direction. In 13 studies (N = 7670), we asked people to evaluate humans and robots that make decisions in norm conflicts (variants of the classic trolley dilemma). We examined several conditions that may influence whether moral evaluations of human and robot agents are the same or different: the type of moral judgment (norms vs. blame); the structure of the dilemma (side effect vs. means-end); salience of particular information (victim, outcome); culture (Japan vs. US); and encouraged empathy. Norms for humans and robots are broadly similar, but blame judgments show a robust asymmetry under one condition: Humans are blamed less than robots specifically for inaction decisions (here, refraining from sacrificing one person for the good of many). This asymmetry may emerge because people appreciate that the human faces an impossible decision and deserves mitigated blame for inaction; when evaluating a robot, such appreciation appears to be lacking. However, our evidence for this explanation is mixed. We discuss alternative explanations and offer methodological guidance for future work into people's moral judgment of robots and humans.
3. Glickman M, Sharot T. AI-induced hyper-learning in humans. Curr Opin Psychol 2024; 60:101900. [PMID: 39348730 DOI: 10.1016/j.copsyc.2024.101900]
Abstract
Humans evolved to learn from one another. Today, however, learning opportunities often emerge from interactions with AI systems. Here, we argue that learning from AI systems resembles learning from other humans, but may be faster and more efficient. Such 'hyper learning' can occur because AI: (i) provides a high signal-to-noise ratio that facilitates learning, (ii) has greater data processing ability, enabling it to generate persuasive arguments, and (iii) is perceived (in some domains) to have superior knowledge compared to humans. As a result, humans more quickly adopt biases from AI, are often more easily persuaded by AI, and exhibit novel problem-solving strategies after interacting with AI. Greater awareness of AI's influences is needed to mitigate the potential negative outcomes of human-AI interactions.
Affiliation(s)
- Moshe Glickman: Affective Brain Lab, Department of Experimental Psychology, University College London, London, UK; Max Planck UCL Centre for Computational Psychiatry and Ageing Research, University College London, London, UK
- Tali Sharot: Affective Brain Lab, Department of Experimental Psychology, University College London, London, UK; Max Planck UCL Centre for Computational Psychiatry and Ageing Research, University College London, London, UK; Department of Brain and Cognitive Sciences, Massachusetts Institute of Technology, Cambridge, MA, USA
4. Love J, Gronau QF, Palmer G, Eidels A, Brown SD. In human-machine trust, humans rely on a simple averaging strategy. Cogn Res Princ Implic 2024; 9:58. [PMID: 39218841 PMCID: PMC11366733 DOI: 10.1186/s41235-024-00583-5]
Abstract
With the growing role of artificial intelligence (AI) in our lives, attention is increasingly turning to the way that humans and AI work together. A key aspect of human-AI collaboration is how people integrate judgements or recommendations from machine agents, when they differ from their own judgements. We investigated trust in human-machine teaming using a perceptual judgement task based on the judge-advisor system. Participants (n = 89) estimated a perceptual quantity, then received a recommendation from a machine agent. The participants then made a second response which combined their first estimate and the machine's recommendation. The degree to which participants shifted their second response in the direction of the recommendations provided a measure of their trust in the machine agent. We analysed the role of advice distance in people's willingness to change their judgements. When a recommendation falls a long way from their initial judgement, do people come to doubt their own judgement, trusting the recommendation more, or do they doubt the machine agent, trusting the recommendation less? We found that although some participants exhibited these behaviours, the most common response was neither of these tendencies, and a simple model based on averaging accounted best for participants' trust behaviour. We discuss implications for theories of trust and human-machine teaming.
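For readers unfamiliar with the judge-advisor paradigm this abstract builds on, the sketch below illustrates the two quantities described in prose: a weight-of-advice measure of how far the second estimate moves toward the machine's recommendation, and the prediction of a pure averaging strategy. It is a minimal, illustrative Python sketch with hypothetical numbers and function names, not code or data from the cited study.

```python
import numpy as np

# Illustrative sketch (not from the cited paper): in a judge-advisor task, trust is
# often summarized as the "weight of advice" (WOA), i.e., how far the final estimate
# moved from the initial estimate toward the advice.
def weight_of_advice(initial, advice, final):
    initial, advice, final = map(np.asarray, (initial, advice, final))
    return (final - initial) / (advice - initial)

# A pure averaging strategy, as described in the abstract, predicts a final estimate
# halfway between one's own estimate and the machine's recommendation (WOA = 0.5),
# regardless of how far away the advice falls.
def averaging_prediction(initial, advice):
    return (np.asarray(initial) + np.asarray(advice)) / 2.0

initial = np.array([40.0, 55.0, 70.0])   # hypothetical first estimates
advice  = np.array([60.0, 50.0, 10.0])   # hypothetical machine recommendations
final   = np.array([50.0, 52.5, 40.0])   # hypothetical second estimates

print(weight_of_advice(initial, advice, final))   # -> [0.5 0.5 0.5]
print(averaging_prediction(initial, advice))      # -> [50.  52.5 40. ]
```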
Affiliation(s)
- Jonathon Love: Psychological Sciences, University of Newcastle, University Drive, Callaghan, NSW, 2308, Australia
- Quentin F Gronau: Psychological Sciences, University of Newcastle, University Drive, Callaghan, NSW, 2308, Australia
- Gemma Palmer: Psychological Sciences, University of Newcastle, University Drive, Callaghan, NSW, 2308, Australia
- Ami Eidels: Psychological Sciences, University of Newcastle, University Drive, Callaghan, NSW, 2308, Australia
- Scott D Brown: Psychological Sciences, University of Newcastle, University Drive, Callaghan, NSW, 2308, Australia
5. Tey KS, Mazar A, Tomaino G, Duckworth AL, Ungar LH. People judge others more harshly after talking to bots. PNAS Nexus 2024; 3:pgae397. [PMID: 39319325 PMCID: PMC11421659 DOI: 10.1093/pnasnexus/pgae397]
Abstract
People now commonly interact with Artificial Intelligence (AI) agents. How do these interactions shape how humans perceive each other? In two preregistered studies (total N = 1,261), we show that people evaluate other humans more harshly after interacting with an AI (compared with an unrelated purported human). In Study 1, participants who worked on a creative task with AIs (versus purported humans) subsequently rated another purported human's work more negatively. Study 2 replicated this effect and demonstrated that the results hold even when participants believed their evaluation would not be shared with the purported human. Exploratory analyses of participants' conversations show that prior to their human evaluations they were more demanding, more instrumental and displayed less positive affect towards AIs (versus purported humans). These findings point to a potentially worrisome side effect of the exponential rise in human-AI interactions.
Affiliation(s)
- Kian Siong Tey: Department of Management and Strategy, University of Hong Kong, Hong Kong, Hong Kong
- Asaf Mazar: Wharton School of Business, University of Pennsylvania, Philadelphia, PA 19104, USA
- Geoff Tomaino: Marketing Department, University of Florida, Gainesville, FL 32611, USA
- Angela L Duckworth: Department of Psychology and Wharton School of Business, University of Pennsylvania, Philadelphia, PA 19104, USA
- Lyle H Ungar: Computer and Information Science Department, University of Pennsylvania, Philadelphia, PA 19104, USA
6. Steyvers M, Kumar A. Three challenges for AI-assisted decision-making. Perspect Psychol Sci 2024.
Abstract
Artificial intelligence (AI) has the potential to improve human decision-making by providing decision recommendations and problem-relevant information to assist human decision-makers. However, the full realization of the potential of human-AI collaboration continues to face several challenges. First, the conditions that support complementarity (i.e., situations in which the performance of a human with AI assistance exceeds the performance of an unassisted human or the AI in isolation) must be understood. This task requires humans to be able to recognize situations in which the AI should be leveraged and to develop new AI systems that can learn to complement the human decision-maker. Second, human mental models of the AI, which contain both expectations of the AI and reliance strategies, must be accurately assessed. Third, the effects of different design choices for human-AI interaction must be understood, including both the timing of AI assistance and the amount of model information that should be presented to the human decision-maker to avoid cognitive overload and ineffective reliance strategies. In response to each of these three challenges, we present an interdisciplinary perspective based on recent empirical and theoretical findings and discuss new research directions.
Affiliation(s)
- Mark Steyvers: Department of Cognitive Sciences, University of California, Irvine
- Aakriti Kumar: Department of Cognitive Sciences, University of California, Irvine
7. Moosavi A, Huang S, Vahabi M, Motamedivafa B, Tian N, Mahmood R, Liu P, Sun CL. Prospective Human Validation of Artificial Intelligence Interventions in Cardiology: A Scoping Review. JACC: Advances 2024; 3:101202. [PMID: 39372457 PMCID: PMC11450923 DOI: 10.1016/j.jacadv.2024.101202]
Abstract
Background Despite the potential of artificial intelligence (AI) in enhancing cardiovascular care, its integration into clinical practice is limited by a lack of evidence on its effectiveness with respect to human experts or gold standard practices in real-world settings. Objectives The purpose of this study was to identify AI interventions in cardiology that have been prospectively validated against human expert benchmarks or gold standard practices, assessing their effectiveness, and identifying future research areas. Methods We systematically reviewed Scopus and MEDLINE to identify peer-reviewed publications that involved prospective human validation of AI-based interventions in cardiology from January 2015 to December 2023. Results Of 2,351 initial records, 64 studies were included. Among these studies, 59 (92.2%) were published after 2020. A total of 11 (17.2%) randomized controlled trials were published. AI interventions in 44 articles (68.75%) reported definite clinical or operational improvements over human experts. These interventions were mostly used in imaging (n = 14, 21.9%), ejection fraction (n = 10, 15.6%), arrhythmia (n = 9, 14.1%), and coronary artery disease (n = 12, 18.8%) application areas. Convolutional neural networks were the most common predictive model (n = 44, 69%), and images were the most used data type (n = 38, 54.3%). Only 22 (34.4%) studies made their models or data accessible. Conclusions This review identifies the potential of AI in cardiology, with models often performing equally well as human counterparts for specific and clearly scoped tasks suitable for such models. Nonetheless, the limited number of randomized controlled trials emphasizes the need for continued validation, especially in real-world settings that closely examine joint human AI decision-making.
Affiliation(s)
- Amirhossein Moosavi: Telfer School of Management, University of Ottawa, Ottawa, Ontario, Canada; University of Ottawa Heart Institute, University of Ottawa, Ottawa, Ontario, Canada
- Steven Huang: University of Ottawa Heart Institute, University of Ottawa, Ottawa, Ontario, Canada
- Maryam Vahabi: Telfer School of Management, University of Ottawa, Ottawa, Ontario, Canada; University of Ottawa Heart Institute, University of Ottawa, Ottawa, Ontario, Canada
- Bahar Motamedivafa: Telfer School of Management, University of Ottawa, Ottawa, Ontario, Canada; University of Ottawa Heart Institute, University of Ottawa, Ottawa, Ontario, Canada
- Nelly Tian: Marshall School of Business, University of Southern California, Los Angeles, California, USA
- Rafid Mahmood: Telfer School of Management, University of Ottawa, Ottawa, Ontario, Canada
- Peter Liu: University of Ottawa Heart Institute, University of Ottawa, Ottawa, Ontario, Canada
- Christopher L.F. Sun: Telfer School of Management, University of Ottawa, Ottawa, Ontario, Canada; University of Ottawa Heart Institute, University of Ottawa, Ottawa, Ontario, Canada
8. Candrian C. How Terminology Affects Users' Responses to System Failures. Human Factors 2024; 66:2082-2103. [PMID: 37734726 PMCID: PMC11141081 DOI: 10.1177/00187208231202572]
Abstract
OBJECTIVE The objective of our research is to advance the understanding of behavioral responses to a system's error. By examining trust as a dynamic variable and drawing from attribution theory, we explain the underlying mechanism and suggest how terminology can be used to mitigate the so-called algorithm aversion. In this way, we show that the use of different terms may shape consumers' perceptions and provide guidance on how these differences can be mitigated. BACKGROUND Previous research has interchangeably used various terms to refer to a system and results regarding trust in systems have been ambiguous. METHODS Across three studies, we examine the effect of different system terminology on consumer behavior following a system failure. RESULTS Our results show that terminology crucially affects user behavior. Describing a system as "AI" (i.e., self-learning and perceived as more complex) instead of as "algorithmic" (i.e., a less complex rule-based system) leads to more favorable behavioral responses by users when a system error occurs. CONCLUSION We suggest that in cases when a system's characteristics do not allow for it to be called "AI," users should be provided with an explanation of why the system's error occurred, and task complexity should be pointed out. We highlight the importance of terminology, as this can unintentionally impact the robustness and replicability of research findings. APPLICATION This research offers insights for industries utilizing AI and algorithmic systems, highlighting how strategic terminology use can shape user trust and response to errors, thereby enhancing system acceptance.
9. Williams GY, Lim S. Psychology of AI: How AI impacts the way people feel, think, and behave. Curr Opin Psychol 2024; 58:101835. [PMID: 39047330 DOI: 10.1016/j.copsyc.2024.101835]
Abstract
Over the past decade, artificial intelligence (AI) technologies have transformed numerous facets of our lives. In this article, we summarize key themes in emerging AI research in behavioral science. In doing so, we aim to unravel the multifaceted impacts of AI on people's emotions, cognition, and behaviors, offering nuanced insights into this rapidly evolving landscape. This article concludes by proposing promising avenues for future research, outlining areas for further exploration and methodological approaches to consider.
Affiliation(s)
- Sarah Lim: Department of Business Administration, University of Illinois at Urbana-Champaign, USA
10. Mariadassou S, Klesse AK, Boegershausen J. Averse to what: Consumer aversion to algorithmic labels, but not their outputs? Curr Opin Psychol 2024; 58:101839. [PMID: 38996629 DOI: 10.1016/j.copsyc.2024.101839]
Abstract
Inspired by significant technical advancements, a rapidly growing stream of research explores human lay beliefs and reactions surrounding AI tools, which employ algorithms to mimic elements of human intelligence. This literature predominantly documents negative reactions to these tools or the underlying algorithms, often referred to as algorithm aversion or, alternatively, a preference for humans. This article proposes a third interpretation: people may be averse to their labels, but appreciative of their output. This perspective offers three core insights for how we study people's reactions to algorithms. Research would benefit from (1) carefully considering the labeling of AI tools, (2) broadening the scope of study to include interactions with these tools, and (3) accounting for their technical configuration.
Affiliation(s)
- Shwetha Mariadassou: Erasmus University Rotterdam, Rotterdam School of Management, the Netherlands
- Anne-Kathrin Klesse: Erasmus University Rotterdam, Rotterdam School of Management, the Netherlands
11. Trail M. Child welfare predictive risk models and legal decision making. Child Abuse & Neglect 2024; 154:106943. [PMID: 39018749 DOI: 10.1016/j.chiabu.2024.106943]
Abstract
BACKGROUND Child welfare agencies around the world have experimented with algorithmic predictive modeling as a method to assist in decision making regarding foster child risk, removal and placement. OBJECTIVE Thus far, all of the predictive risk models have been confined to the employees of the various child welfare agencies at the early removal stages and none have been used by attorneys in legal arguments or by judges in making child welfare legal decisions. This study will show the effects of a predictive model on legal decision making within a child welfare context. PARTICIPANTS AND SETTING Lawyers, judges and law students with experience in child welfare or juvenile law were recruited to take an online randomized vignette survey. METHODS The survey consisted of two vignettes describing complex foster child removal and placement legal decisions where participants were exposed to one of three randomized predictive risk model scores. They were then asked follow up questions regarding their decisions to see if the risk models changed their answers. RESULTS Using structural equation modeling, high predictive model risk scores showed consistent ability to change legal decisions about removal and placement across both vignettes. Medium and low scores, though less consistent, also significantly influenced legal decision making. CONCLUSIONS Child welfare legal decision making can be affected by the use of a predictive risk model, which has implications for the development and use of these models as well as legal education for attorneys and judges in the field.
12. Li Y, Wu J, Xue J, Zhang X. Peer or tutor? The congruity effects of service robot role and service type on usage intention. Acta Psychol (Amst) 2024; 248:104429. [PMID: 39088994 DOI: 10.1016/j.actpsy.2024.104429]
Abstract
The invention of service robots has reduced labor costs and improved enterprises' efficiency and service quality. However, it remains difficult to efficiently enhance consumers' intention to use robots through robot design alone. Based on the social roles conveyed by anthropomorphic cues, service robots can be divided into peer robots (e.g., kind and amiable friends) and tutor robots (e.g., authoritative and professional experts). From a matching perspective, this paper investigates (1) whether robot role and service type affect consumers' intention to use service robots in different ways, and (2) how cognitive trust and affective trust play a mediating role in this process. The authors conducted an online scenario-based experiment and collected a valid sample of 332 consumers. The results show that participants are more willing to use the tutor robot in the utilitarian service scenario and the peer robot in the hedonic service scenario. In addition, cognitive trust and affective trust show a matching mediation effect. Specifically, for the utilitarian service, cognitive trust mediates the effect of robot role on consumers' intention to adopt the robots, while the mediating effect of affective trust is not significant. For the hedonic service, affective trust mediates the effect of robot role on intention to use, whereas the mediating effect of cognitive trust is not significant.
Affiliation(s)
- Yuxuan Li: College of Business Administration, Chengdu Jincheng College, Chengdu 611731, China
- Jifei Wu: School of Marxism, Sun Yat-Sen University, Guangzhou 510275, China
- Jiaolong Xue: Business School, Sichuan University, Chengdu 610064, China
- Xiangyun Zhang: School of Business, Sun Yat-Sen University, Guangzhou 510275, China
13. Rosholm M, Bodilsen ST, Michel B, Nielsen AS. Predictive risk modeling for child maltreatment detection and enhanced decision-making: Evidence from Danish administrative data. PLoS One 2024; 19:e0305974. [PMID: 38985689 PMCID: PMC11236184 DOI: 10.1371/journal.pone.0305974]
Abstract
Child maltreatment is a widespread problem with significant costs for both victims and society. In this retrospective cohort study, we develop predictive risk models using Danish administrative data to predict removal decisions among referred children and assess the effectiveness of caseworkers in identifying children at risk of maltreatment. The study analyzes 195,639 referrals involving 102,309 children that Danish Child Protection Services received from April 2016 to December 2017. We implement four machine learning models of increasing complexity, incorporating extensive background information on each child and their family. Our best-performing model exhibits robust predictive power, with an AUC-ROC score exceeding 87%, indicating its ability to consistently rank referred children based on their likelihood of being removed. Additionally, we find strong positive correlations between the model's predictions and various adverse child outcomes, such as crime, physical and mental health issues, and school absenteeism. Furthermore, we demonstrate that predictive risk models can enhance caseworkers' decision-making processes by reducing classification errors and identifying at-risk children at an earlier stage, enabling timely interventions and potentially improving outcomes for vulnerable children.
Affiliation(s)
- Michael Rosholm: Department of Economics and Business Economics, Aarhus University, Aarhus, Denmark; TrygFonden's Centre for Child Research, Aarhus University, Aarhus, Denmark; IZA Institute of Labor Economics, Bonn, Germany; Centre for Integrated Register-based Research, Aarhus University, Aarhus, Denmark
- Simon Tranberg Bodilsen: Department of Economics and Business Economics, Aarhus University, Aarhus, Denmark; TrygFonden's Centre for Child Research, Aarhus University, Aarhus, Denmark; Centre for Integrated Register-based Research, Aarhus University, Aarhus, Denmark
- Bastien Michel: TrygFonden's Centre for Child Research, Aarhus University, Aarhus, Denmark; School of Economics and Management, Nantes University, Nantes, France
- Albeck Søren Nielsen: Department of Economics and Business Economics, Aarhus University, Aarhus, Denmark; TrygFonden's Centre for Child Research, Aarhus University, Aarhus, Denmark; Centre for Integrated Register-based Research, Aarhus University, Aarhus, Denmark
14. Introzzi L, Zonca J, Cabitza F, Cherubini P, Reverberi C. Enhancing human-AI collaboration: The case of colonoscopy. Dig Liver Dis 2024; 56:1131-1139. [PMID: 37940501 DOI: 10.1016/j.dld.2023.10.018]
Abstract
Diagnostic errors impact patient health and healthcare costs. Artificial Intelligence (AI) shows promise in mitigating this burden by supporting Medical Doctors in decision-making. However, the mere display of excellent or even superhuman performance by AI in specific tasks does not guarantee a positive impact on medical practice. Effective AI assistance should target the primary causes of human errors and foster effective collaborative decision-making with human experts who remain the ultimate decision-makers. In this narrative review, we apply these principles to the specific scenario of AI assistance during colonoscopy. By unraveling the neurocognitive foundations of the colonoscopy procedure, we identify multiple bottlenecks in perception, attention, and decision-making that contribute to diagnostic errors, shedding light on potential interventions to mitigate them. Furthermore, we explored how existing AI devices fare in clinical practice and whether they achieved an optimal integration with the human decision-maker. We argue that to foster optimal Human-AI collaboration, future research should expand our knowledge of factors influencing AI's impact, establish evidence-based cognitive models, and develop training programs based on them. These efforts will enhance human-AI collaboration, ultimately improving diagnostic accuracy and patient outcomes. The principles illuminated in this review hold more general value, extending their relevance to a wide array of medical procedures and beyond.
Affiliation(s)
- Luca Introzzi: Department of Psychology, Università Milano - Bicocca, Milano, Italy
- Joshua Zonca: Department of Psychology, Università Milano - Bicocca, Milano, Italy; Milan Center for Neuroscience, Università Milano - Bicocca, Milano, Italy
- Federico Cabitza: Department of Informatics, Systems and Communication, Università Milano - Bicocca, Milano, Italy; IRCCS Istituto Ortopedico Galeazzi, Milano, Italy
- Paolo Cherubini: Department of Brain and Behavioral Sciences, Università Statale di Pavia, Pavia, Italy
- Carlo Reverberi: Department of Psychology, Università Milano - Bicocca, Milano, Italy; Milan Center for Neuroscience, Università Milano - Bicocca, Milano, Italy
15. Zhao J, Wang Y, Mancenido MV, Chiou EK, Maciejewski R. Evaluating the Impact of Uncertainty Visualization on Model Reliance. IEEE Transactions on Visualization and Computer Graphics 2024; 30:4093-4107. [PMID: 37028077 DOI: 10.1109/tvcg.2023.3251950]
Abstract
Machine learning models have gained traction as decision support tools for tasks that require processing copious amounts of data. However, to achieve the primary benefits of automating this part of decision-making, people must be able to trust the machine learning model's outputs. In order to enhance people's trust and promote appropriate reliance on the model, visualization techniques such as interactive model steering, performance analysis, model comparison, and uncertainty visualization have been proposed. In this study, we tested the effects of two uncertainty visualization techniques in a college admissions forecasting task, under two task difficulty levels, using Amazon's Mechanical Turk platform. Results show that (1) people's reliance on the model depends on the task difficulty and level of machine uncertainty and (2) ordinal forms of expressing model uncertainty are more likely to calibrate model usage behavior. These outcomes emphasize that reliance on decision support tools can depend on the cognitive accessibility of the visualization technique and perceptions of model performance and task difficulty.
16. van der Zander QEW, Roumans R, Kusters CHJ, Dehghani N, Masclee AAM, de With PHN, van der Sommen F, Snijders CCP, Schoon EJ. Appropriate trust in artificial intelligence for the optical diagnosis of colorectal polyps: The role of human/artificial intelligence interaction. Gastrointest Endosc 2024:S0016-5107(24)03324-8. [PMID: 38942330 DOI: 10.1016/j.gie.2024.06.029]
Abstract
BACKGROUND AND AIMS Computer-aided diagnosis (CADx) for the optical diagnosis of colorectal polyps has been thoroughly investigated. However, studies on human-artificial intelligence interaction are lacking. Our aim was to investigate endoscopists' trust in CADx by evaluating whether communicating a calibrated algorithm confidence score improved trust. METHODS Endoscopists optically diagnosed 60 colorectal polyps. Initially, endoscopists diagnosed the polyps without CADx assistance (initial diagnosis). Immediately afterward, the same polyp was again shown with a CADx prediction: either only a prediction (benign or premalignant) or a prediction accompanied by a calibrated confidence score (0-100). A confidence score of 0 indicated a benign prediction, 100 a (pre)malignant prediction. In half of the polyps, CADx was mandatory, and for the other half, CADx was optional. After reviewing the CADx prediction, endoscopists made a final diagnosis. Histopathology was used as the gold standard. Endoscopists' trust in CADx was measured as CADx prediction utilization: the willingness to follow CADx predictions when the endoscopists initially disagreed with the CADx prediction. RESULTS Twenty-three endoscopists participated. Presenting CADx predictions increased the endoscopists' diagnostic accuracy (69.3% initial vs 76.6% final diagnosis, P < .001). The CADx prediction was used in 36.5% (n = 183 of 501) of disagreements. Adding a confidence score led to lower CADx prediction utilization, except when the confidence score surpassed 60. Mandatory CADx decreased CADx prediction utilization compared to optional CADx. Appropriate trust (using correct or disregarding incorrect CADx predictions) was 48.7% (n = 244 of 501). CONCLUSIONS Appropriate trust was common, and CADx prediction utilization was highest for the optional CADx without confidence scores. These results underscore the importance of a better understanding of human-artificial intelligence interaction.
Affiliation(s)
- Quirine E W van der Zander: Department of Gastroenterology and Hepatology, Maastricht University Medical Center, Maastricht, The Netherlands; GROW, School for Oncology and Reproduction, Maastricht University, Maastricht, The Netherlands
- Rachel Roumans: Human-Technology Interaction, Eindhoven University of Technology, Eindhoven, The Netherlands
- Carolus H J Kusters: Department of Electrical Engineering, Eindhoven University of Technology, Eindhoven, The Netherlands
- Nikoo Dehghani: Department of Electrical Engineering, Eindhoven University of Technology, Eindhoven, The Netherlands
- Ad A M Masclee: Department of Gastroenterology and Hepatology, Maastricht University Medical Center, Maastricht, The Netherlands
- Peter H N de With: Department of Electrical Engineering, Eindhoven University of Technology, Eindhoven, The Netherlands
- Fons van der Sommen: Department of Electrical Engineering, Eindhoven University of Technology, Eindhoven, The Netherlands
- Chris C P Snijders: Human-Technology Interaction, Eindhoven University of Technology, Eindhoven, The Netherlands
- Erik J Schoon: Department of Gastroenterology and Hepatology, Maastricht University Medical Center, Maastricht, The Netherlands; Division of Gastroenterology and Hepatology, Catharina Hospital Eindhoven, Eindhoven, The Netherlands
17. Schlund R, Zitek EM. Algorithmic versus human surveillance leads to lower perceptions of autonomy and increased resistance. Communications Psychology 2024; 2:53. [PMID: 39242768 PMCID: PMC11332184 DOI: 10.1038/s44271-024-00102-8]
Abstract
Past research indicates that people tend to react adversely to surveillance, but does it matter if advanced technologies such as artificial intelligence conduct surveillance rather than humans? Across four experiments (Study 1, N = 107; Study 2, N = 157; Study 3, N = 117; Study 4, N = 814), we examined how participants reacted to monitoring and evaluation by human or algorithmic surveillance when recalling instances of surveillance from their lives (Study 1), generating ideas (Studies 2 and 3), or imagining working in a call center (Study 4). Our results revealed that participants subjected to algorithmic (v. human) surveillance perceived they had less autonomy (Studies 1, 3, and 4), criticized the surveillance more (Studies 1-3), performed worse (Studies 2 and 3), and reported greater intentions to resist (Studies 1 and 4). Framing the purpose of the algorithmic surveillance as developmental, and thus informational, as opposed to evaluative, mitigated the perception of decreased autonomy and level of resistance (Study 4).
Affiliation(s)
- Rachel Schlund: ILR School, Department of Organizational Behavior, Cornell University, Ithaca, NY, USA
- Emily M Zitek: ILR School, Department of Organizational Behavior, Cornell University, Ithaca, NY, USA
18. Proksch S, Schühle J, Streeb E, Weymann F, Luther T, Kimmerle J. The impact of text topic and assumed human vs. AI authorship on competence and quality assessment. Front Artif Intell 2024; 7:1412710. [PMID: 38881953 PMCID: PMC11176609 DOI: 10.3389/frai.2024.1412710]
Abstract
Background While Large Language Models (LLMs) are considered positively with respect to technological progress and abilities, people are rather opposed to machines making moral decisions. But the circumstances under which algorithm aversion or algorithm appreciation are more likely to occur with respect to LLMs have not yet been sufficiently investigated. Therefore, the aim of this study was to investigate how texts with moral or technological topics, allegedly written either by a human author or by ChatGPT, are perceived. Methods In a randomized controlled experiment, n = 164 participants read six texts, three of which had a moral and three a technological topic (predictor text topic). The alleged author of each text was randomly either labeled "ChatGPT" or "human author" (predictor authorship). We captured three dependent variables: assessment of author competence, assessment of content quality, and participants' intention to submit the text in a hypothetical university course (sharing intention). We hypothesized interaction effects, that is, we expected ChatGPT to score lower than alleged human authors for moral topics and higher than alleged human authors for technological topics and vice versa. Results We only found a small interaction effect for perceived author competence, p = 0.004, d = 0.40, but not for the other dependent variables. However, ChatGPT was consistently devalued compared to alleged human authors across all dependent variables: there were main effects of authorship for assessment of the author competence, p < 0.001, d = 0.95; for assessment of content quality, p < 0.001, d = 0.39; as well as for sharing intention, p < 0.001, d = 0.57. There was also a small main effect of text topic on the assessment of text quality, p = 0.002, d = 0.35. Conclusion These results are more in line with previous findings on algorithm aversion than with algorithm appreciation. We discuss the implications of these findings for the acceptance of the use of LLMs for text composition.
Affiliation(s)
- Sebastian Proksch: Department of Psychology, Eberhard Karls University Tuebingen, Tuebingen, Germany
- Julia Schühle: Department of Psychology, Eberhard Karls University Tuebingen, Tuebingen, Germany
- Elisabeth Streeb: Department of Psychology, Eberhard Karls University Tuebingen, Tuebingen, Germany
- Finn Weymann: Department of Psychology, Eberhard Karls University Tuebingen, Tuebingen, Germany
- Teresa Luther: Knowledge Construction Lab, Leibniz-Institut fuer Wissensmedien, Tuebingen, Germany
- Joachim Kimmerle: Department of Psychology, Eberhard Karls University Tuebingen, Tuebingen, Germany; Knowledge Construction Lab, Leibniz-Institut fuer Wissensmedien, Tuebingen, Germany
19. Cecil J, Lermer E, Hudecek MFC, Sauer J, Gaube S. Explainability does not mitigate the negative impact of incorrect AI advice in a personnel selection task. Sci Rep 2024; 14:9736. [PMID: 38679619 PMCID: PMC11056364 DOI: 10.1038/s41598-024-60220-5]
Abstract
Despite the rise of decision support systems enabled by artificial intelligence (AI) in personnel selection, their impact on decision-making processes is largely unknown. Consequently, we conducted five experiments (N = 1403 students and Human Resource Management (HRM) employees) investigating how people interact with AI-generated advice in a personnel selection task. In all pre-registered experiments, we presented correct and incorrect advice. In Experiments 1a and 1b, we manipulated the source of the advice (human vs. AI). In Experiments 2a, 2b, and 2c, we further manipulated the type of explainability of AI advice (2a and 2b: heatmaps and 2c: charts). We hypothesized that accurate and explainable advice improves decision-making. The independent variables were regressed on task performance, perceived advice quality and confidence ratings. The results consistently showed that incorrect advice negatively impacted performance, as people failed to dismiss it (i.e., overreliance). Additionally, we found that the effects of source and explainability of advice on the dependent variables were limited. The lack of reduction in participants' overreliance on inaccurate advice when the systems' predictions were made more explainable highlights the complexity of human-AI interaction and the need for regulation and quality standards in HRM.
Affiliation(s)
- Julia Cecil: Department of Psychology, LMU Center for Leadership and People Management, LMU Munich, Munich, Germany
- Eva Lermer: Department of Psychology, LMU Center for Leadership and People Management, LMU Munich, Munich, Germany; Department of Business Psychology, Technical University of Applied Sciences Augsburg, Augsburg, Germany
- Matthias F C Hudecek: Department of Experimental Psychology, University of Regensburg, Regensburg, Germany
- Jan Sauer: Department of Business Administration, University of Applied Sciences Amberg-Weiden, Weiden, Germany
- Susanne Gaube: Department of Psychology, LMU Center for Leadership and People Management, LMU Munich, Munich, Germany; UCL Global Business School for Health, University College London, London, UK
20. Wahn B, Schmitz L. A bonus task boosts people's willingness to offload cognition to an algorithm. Cogn Res Princ Implic 2024; 9:24. [PMID: 38652184 PMCID: PMC11039595 DOI: 10.1186/s41235-024-00550-0]
Abstract
With the increased sophistication of technology, humans have the possibility to offload a variety of tasks to algorithms. Here, we investigated whether the extent to which people are willing to offload an attentionally demanding task to an algorithm is modulated by the availability of a bonus task and by the knowledge about the algorithm's capacity. Participants performed a multiple object tracking (MOT) task which required them to visually track targets on a screen. Participants could offload an unlimited number of targets to a "computer partner". If participants decided to offload the entire task to the computer, they could instead perform a bonus task which resulted in additional financial gain; however, this gain was conditional on a high performance accuracy in the MOT task. Thus, participants should only offload the entire task if they trusted the computer to perform accurately. We found that participants were significantly more willing to completely offload the task if they were informed beforehand that the computer's accuracy was flawless (Experiment 1 vs. 2). Participants' offloading behavior was not significantly affected by whether the bonus task was incentivized or not (Experiment 2 vs. 3). These results combined with those from our previous study (Wahn et al. in PLoS ONE 18:e0286102, 2023), which did not include a bonus task but was identical otherwise, show that the human willingness to offload an attentionally demanding task to an algorithm is considerably boosted by the availability of a bonus task (even if not incentivized) and by the knowledge about the algorithm's capacity.
Affiliation(s)
- Basil Wahn: Institute of Educational Research, Ruhr University Bochum, Bochum, Germany; Department of Cognitive Psychology and Ergonomics, Technische Universität Berlin, Berlin, Germany
- Laura Schmitz: Department of Neurology, University Medical Center Hamburg-Eppendorf, Hamburg, Germany
21. Messeri L, Crockett MJ. Artificial intelligence and illusions of understanding in scientific research. Nature 2024; 627:49-58. [PMID: 38448693 DOI: 10.1038/s41586-024-07146-0]
Abstract
Scientists are enthusiastically imagining ways in which artificial intelligence (AI) tools might improve research. Why are AI tools so attractive and what are the risks of implementing them across the research pipeline? Here we develop a taxonomy of scientists' visions for AI, observing that their appeal comes from promises to improve productivity and objectivity by overcoming human shortcomings. But proposed AI solutions can also exploit our cognitive limitations, making us vulnerable to illusions of understanding in which we believe we understand more about the world than we actually do. Such illusions obscure the scientific community's ability to see the formation of scientific monocultures, in which some types of methods, questions and viewpoints come to dominate alternative approaches, making science less innovative and more vulnerable to errors. The proliferation of AI tools in science risks introducing a phase of scientific enquiry in which we produce more but understand less. By analysing the appeal of these tools, we provide a framework for advancing discussions of responsible knowledge production in the age of AI.
Affiliation(s)
- Lisa Messeri: Department of Anthropology, Yale University, New Haven, CT, USA
- M J Crockett: Department of Psychology, Princeton University, Princeton, NJ, USA; University Center for Human Values, Princeton University, Princeton, NJ, USA
22. Sele D, Chugunova M. Putting a human in the loop: Increasing uptake, but decreasing accuracy of automated decision-making. PLoS One 2024; 19:e0298037. [PMID: 38335162 PMCID: PMC10857587 DOI: 10.1371/journal.pone.0298037]
Abstract
Automated decision-making is gaining traction, prompting discussions on regulation with calls for human oversight. Understanding how human involvement affects the acceptance of algorithmic recommendations and the accuracy of resulting decisions is vital. In an online experiment (N = 292), for a prediction task, participants chose a recommendation stemming either from an algorithm or another participant. In a between-subject design, we varied whether the prediction was delegated completely or whether the recommendation could be adjusted. In 66% of cases, participants preferred to delegate the decision to an algorithm over an equally accurate human. The preference for an algorithm increased by 7 percentage points if participants could monitor and adjust the recommendations. Participants followed algorithmic recommendations more closely. Importantly, they were less likely to intervene with the least accurate recommendations. Hence, in our experiment, the human-in-the-loop design increases the uptake but decreases the accuracy of the decisions.
Affiliation(s)
- Daniela Sele: Center for Law & Economics, ETH Zurich, Zurich, Switzerland
- Marina Chugunova: Max Planck Institute for Innovation and Competition, Munich, Germany
23. Chai F, Ma J, Wang Y, Zhu J, Han T. Grading by AI makes me feel fairer? How different evaluators affect college students' perception of fairness. Front Psychol 2024; 15:1221177. [PMID: 38371704 PMCID: PMC10869489 DOI: 10.3389/fpsyg.2024.1221177]
Abstract
Introduction In the field of education, new technologies have enhanced the objectivity and scientificity of educational evaluation. However, concerns have been raised about the fairness of evaluators, such as artificial intelligence (AI) algorithms. This study aimed to assess college students' perceptions of fairness in educational evaluation scenarios through three studies using experimental vignettes. Methods Three studies were conducted involving 172 participants in Study 1, 149 in Study 2, and 145 in Study 3. Different evaluation contexts were used in each study to assess the influence of evaluators on students' perception of fairness. Information transparency and explanations for evaluation outcomes were also examined as potential moderators. Results Study 1 found that different evaluators could significantly influence the perception of fairness under three evaluation contexts. Students perceived AI algorithms as fairer evaluators than teachers. Study 2 revealed that information transparency was a mediator, indicating that students perceived higher fairness with AI algorithms due to increased transparency compared with teachers. Study 3 revealed that the explanation of evaluation outcomes moderated the effect of evaluator on students' perception of fairness. Specifically, when provided with explanations for evaluation results, the effect of evaluator on students' perception of fairness was lessened. Discussion This study emphasizes the importance of information transparency and comprehensive explanations in the evaluation process, which is more crucial than solely focusing on the type of evaluators. It also draws attention to potential risks like algorithmic hegemony and advocates for ethical considerations, including privacy regulations, in integrating new technologies into educational evaluation systems. Overall, this study provides valuable theoretical insights and practical guidance for conducting fairer educational evaluations in the era of new technologies.
Affiliation(s)
- Fangyuan Chai: Graduate School of Education, Beijing Foreign Studies University, Beijing, China
- Jiajia Ma: Graduate School of Education, Beijing Foreign Studies University, Beijing, China
- Yi Wang: Graduate School of Education, Beijing Foreign Studies University, Beijing, China
- Jun Zhu: Graduate School of Education, Beijing Foreign Studies University, Beijing, China
- Tingting Han: School of Marxism, Hubei University of Economics, Wuhan, Hubei, China
24. Thuillard S, Audergon L, Kotalova T, Sonderegger A, Sauer J. Human and machine-induced social stress in complex work environments: Effects on performance and subjective state. Applied Ergonomics 2024; 115:104179. [PMID: 37984084 DOI: 10.1016/j.apergo.2023.104179]
Abstract
Social stress at work can lead to severe consequences. As a result of technological developments, social stress will increasingly be induced by machines. It is therefore crucial to understand how machine-induced social stress affects operators. The present study aimed to compare human- and machine-induced social stress with regard to its effect on primary and secondary task performance, and on subjective state (e.g., self-esteem, mood and justice). Ninety participants worked on a high-fidelity simulation of a complex work environment, on which they had received extensive training (2 h 15 min). Social stress was induced by a human or a machine using a combination of negative performance feedback and ostracism. Results indicate that social stress did not affect performance, affect or state self-esteem. Machine-induced and human-induced social stress overall had similar effects, except for the latter impairing perceived justice. We discuss implications of these results for automation at the workplace and outline future research directions.
Affiliation(s)
- S Thuillard: Université de Fribourg, Rue P.-A. de Faucigny, 1700, Fribourg, Switzerland
- L Audergon: Université de Fribourg, Rue P.-A. de Faucigny, 1700, Fribourg, Switzerland
- T Kotalova: Université de Fribourg, Rue P.-A. de Faucigny, 1700, Fribourg, Switzerland
- A Sonderegger: Bern University of Applied Sciences, Business School, Institute for New Work, Brückenstrasse 73, 3005, Bern, Switzerland
- J Sauer: Université de Fribourg, Rue P.-A. de Faucigny, 1700, Fribourg, Switzerland
25. Nguyen T. ChatGPT in Medical Education: A Precursor for Automation Bias? JMIR Medical Education 2024; 10:e50174. [PMID: 38231545 PMCID: PMC10831594 DOI: 10.2196/50174]
Abstract
Artificial intelligence (AI) in health care has the promise of providing accurate and efficient results. However, AI can also be a black box, where the logic behind its results is nonrational. There are concerns if these questionable results are used in patient care. As physicians have the duty to provide care based on their clinical judgment in addition to their patients' values and preferences, it is crucial that physicians validate the results from AI. Yet, there are some physicians who exhibit a phenomenon known as automation bias, where there is an assumption from the user that AI is always right. This is a dangerous mindset, as users exhibiting automation bias will not validate the results, given their trust in AI systems. Several factors impact a user's susceptibility to automation bias, such as inexperience or being born in the digital age. In this editorial, I argue that these factors and a lack of AI education in the medical school curriculum cause automation bias. I also explore the harms of automation bias and why prospective physicians need to be vigilant when using AI. Furthermore, it is important to consider what attitudes are being taught to students when introducing ChatGPT, which could be some students' first time using AI, prior to their use of AI in the clinical setting. Therefore, in attempts to avoid the problem of automation bias in the long-term, in addition to incorporating AI education into the curriculum, as is necessary, the use of ChatGPT in medical education should be limited to certain tasks. Otherwise, having no constraints on what ChatGPT should be used for could lead to automation bias.
Collapse
Affiliation(s)
- Tina Nguyen
- The University of Texas Medical Branch, Galveston, TX, United States
| |
Collapse
|
26
|
Semujanga B, Parent-Rocheleau X. Time-Based Stress and Procedural Justice: Can Transparency Mitigate the Effects of Algorithmic Compensation in Gig Work? INTERNATIONAL JOURNAL OF ENVIRONMENTAL RESEARCH AND PUBLIC HEALTH 2024; 21:86. [PMID: 38248549 PMCID: PMC10815495 DOI: 10.3390/ijerph21010086] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Received: 11/01/2023] [Revised: 01/02/2024] [Accepted: 01/09/2024] [Indexed: 01/23/2024]
Abstract
The gig economy has led to a new management style, using algorithms to automate managerial decisions. Algorithmic management has aroused the interest of researchers, particularly regarding the prevalence of precarious working conditions and the health issues related to gig work. Despite algorithmically driven remuneration mechanisms' influence on work conditions, few studies have focused on the compensation dimension of algorithmic management. We investigate the effects of algorithmic compensation on gig workers in relation to perceptions of procedural justice and time-based stress, two important predictors of work-related health problems. Also, this study examines the moderating effect of algorithmic transparency in these relationships. Survey data were collected from 962 gig workers via a research panel. The results of hierarchical multiple regression analysis show that the degree of exposure to algorithmic compensation is positively related to time-based stress. However, contrary to our expectations, algorithmic compensation is also positively associated with procedural justice perceptions and our results indicate that this relation is enhanced at higher levels of perceived algorithmic transparency. Furthermore, transparency does not play a role in the relationship between algorithmic compensation and time-based stress. These findings suggest that perceived algorithmic transparency makes algorithmic compensation even fairer but does not appear to make it less stressful.
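As an illustration of the moderation analysis described above, the following is a minimal sketch of a hierarchical regression in which an interaction term is entered after the main effects. The variable names (exposure, transparency, time_stress) and the simulated data are assumptions for illustration, not the authors' dataset or code.

    # Sketch of a hierarchical (two-step) moderation regression on simulated data.
    # Hypothetical variable names; not the authors' actual analysis code.
    import numpy as np
    import pandas as pd
    import statsmodels.formula.api as smf

    rng = np.random.default_rng(0)
    n = 962  # sample size reported in the abstract
    df = pd.DataFrame({
        "exposure": rng.normal(size=n),       # exposure to algorithmic compensation
        "transparency": rng.normal(size=n),   # perceived algorithmic transparency
    })
    df["time_stress"] = 0.3 * df["exposure"] + rng.normal(size=n)

    # Step 1: main effects only
    step1 = smf.ols("time_stress ~ exposure + transparency", data=df).fit()
    # Step 2: add the interaction term to test moderation
    step2 = smf.ols("time_stress ~ exposure * transparency", data=df).fit()

    print("R^2 step 1:", round(step1.rsquared, 3), "R^2 step 2:", round(step2.rsquared, 3))
    print("interaction coefficient:", round(step2.params["exposure:transparency"], 3))

A significant interaction coefficient (and an R-squared increase from step 1 to step 2) would indicate moderation of the exposure-stress relation by perceived transparency.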
Collapse
Affiliation(s)
- Benjamin Semujanga
- Department of Human Resources Management, HEC Montréal, 3000 Côte Ste-Catherine, Montréal, QC H3T 2A7, Canada;
| | | |
Collapse
|
27
|
He X, Zheng X, Ding H. Existing Barriers Faced by and Future Design Recommendations for Direct-to-Consumer Health Care Artificial Intelligence Apps: Scoping Review. J Med Internet Res 2023; 25:e50342. [PMID: 38109173 PMCID: PMC10758939 DOI: 10.2196/50342] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/01/2023] [Revised: 09/20/2023] [Accepted: 11/28/2023] [Indexed: 12/19/2023] Open
Abstract
BACKGROUND Direct-to-consumer (DTC) health care artificial intelligence (AI) apps hold the potential to bridge the spatial and temporal disparities in health care resources, but they also come with individual and societal risks due to AI errors. Furthermore, the manner in which consumers interact directly with health care AI is reshaping traditional physician-patient relationships. However, the academic community lacks a systematic comprehension of the research overview for such apps. OBJECTIVE This paper systematically delineated and analyzed the characteristics of included studies, identified existing barriers and design recommendations for DTC health care AI apps mentioned in the literature and also provided a reference for future design and development. METHODS This scoping review followed the Preferred Reporting Items for Systematic Reviews and Meta-Analyses Extension for Scoping Reviews guidelines and was conducted according to Arksey and O'Malley's 5-stage framework. Peer-reviewed papers on DTC health care AI apps published until March 27, 2023, in Web of Science, Scopus, the ACM Digital Library, IEEE Xplore, PubMed, and Google Scholar were included. The papers were analyzed using Braun and Clarke's reflective thematic analysis approach. RESULTS Of the 2898 papers retrieved, 32 (1.1%) covering this emerging field were included. The included papers were recently published (2018-2023), and most (23/32, 72%) were from developed countries. The medical field was mostly general practice (8/32, 25%). In terms of users and functionalities, some apps were designed solely for single-consumer groups (24/32, 75%), offering disease diagnosis (14/32, 44%), health self-management (8/32, 25%), and health care information inquiry (4/32, 13%). Other apps connected to physicians (5/32, 16%), family members (1/32, 3%), nursing staff (1/32, 3%), and health care departments (2/32, 6%), generally to alert these groups to abnormal conditions of consumer users. In addition, 8 barriers and 6 design recommendations related to DTC health care AI apps were identified. Some more subtle obstacles that are particularly worth noting and corresponding design recommendations in consumer-facing health care AI systems, including enhancing human-centered explainability, establishing calibrated trust and addressing overtrust, demonstrating empathy in AI, improving the specialization of consumer-grade products, and expanding the diversity of the test population, were further discussed. CONCLUSIONS The booming DTC health care AI apps present both risks and opportunities, which highlights the need to explore their current status. This paper systematically summarized and sorted the characteristics of the included studies, identified existing barriers faced by, and made future design recommendations for such apps. To the best of our knowledge, this is the first study to systematically summarize and categorize academic research on these apps. Future studies conducting the design and development of such systems could refer to the results of this study, which is crucial to improve the health care services provided by DTC health care AI apps.
Collapse
Affiliation(s)
- Xin He
- School of Mechanical Science and Engineering, Huazhong University of Science and Technology, Wuhan, China
| | - Xi Zheng
- School of Mechanical Science and Engineering, Huazhong University of Science and Technology, Wuhan, China
| | - Huiyuan Ding
- School of Mechanical Science and Engineering, Huazhong University of Science and Technology, Wuhan, China
| |
Collapse
|
28
|
Kim K. Maximizers' Reactance to Algorithm-Recommended Options: The Moderating Role of Autotelic vs. Instrumental Choices. Behav Sci (Basel) 2023; 13:938. [PMID: 37998684 PMCID: PMC10669481 DOI: 10.3390/bs13110938] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/27/2023] [Revised: 11/01/2023] [Accepted: 11/12/2023] [Indexed: 11/25/2023] Open
Abstract
The previous literature has provided mixed findings regarding whether consumers appreciate or are opposed to algorithms. The primary goal of this paper is to address these inconsistencies by identifying the maximizing tendency as a critical moderating variable. In Study 1, it was found that maximizers, individuals who strive for the best possible outcomes, exhibit greater reactance toward algorithm-recommended choices than satisficers, those who are satisfied with a good-enough option. This increased reactance also resulted in decreased algorithm adoption intention. Study 2 replicated and extended the findings from Study 1 by identifying the moderating role of choice goals. Maximizers are more likely to experience reactance to algorithm-recommended options when the act of choosing itself is intrinsically motivating and meaningful (i.e., autotelic choices) compared to when the decision is merely a means to an end (i.e., instrumental choices). The results of this research contribute to a nuanced understanding of how consumers with different decision-making styles navigate the landscape of choice in the digital age. Furthermore, it offers practical insights for firms that utilize algorithmic recommendations in their businesses.
Collapse
Affiliation(s)
- Kaeun Kim
- Department of Business Administration, Dong-A University, Busan 49236, Republic of Korea
| |
Collapse
|
29
|
Böhm R, Jörling M, Reiter L, Fuchs C. People devalue generative AI's competence but not its advice in addressing societal and personal challenges. COMMUNICATIONS PSYCHOLOGY 2023; 1:32. [PMID: 39242905 PMCID: PMC11332189 DOI: 10.1038/s44271-023-00032-x] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Subscribe] [Scholar Register] [Received: 01/27/2023] [Accepted: 10/17/2023] [Indexed: 09/09/2024]
Abstract
The release of ChatGPT and related tools has made generative artificial intelligence (AI) easily accessible for the broader public. We conducted four preregistered experimental studies (total N = 3308; participants from the US) to investigate people's perceptions of generative AI and the advice it generates on how to address societal and personal challenges. The results indicate that when individuals are (vs. are not) aware that the advice was generated by AI, they devalue the author's competence but not the content or the intention to share and follow the advice on how to address societal challenges (Study 1) and personal challenges (Studies 2a and 2b). Study 3 further shows that individuals' preference to receive advice from AI (vs. human experts) increases when they have gained positive experience with generative AI advice in the past. The results are discussed regarding the nature of AI aversion in the context of generative AI and beyond.
Collapse
Affiliation(s)
- Robert Böhm
- Faculty of Psychology, University of Vienna, Universitätsstrasse 7, 1010, Vienna, Austria.
- Department of Psychology and Copenhagen Center for Social Data Science (SODAS), University of Copenhagen, Øster Farimagsgade 2A, 1353, Copenhagen K, Denmark.
| | - Moritz Jörling
- Marketing Department, EMLyon Business School, 23 Av. Guy de Collongue, 69130, Écully, France
| | - Leonhard Reiter
- Faculty of Psychology, University of Vienna, Universitätsstrasse 7, 1010, Vienna, Austria
| | - Christoph Fuchs
- Faculty of Business, Economics, and Statistics, University of Vienna, Oskar-Morgenstern-Platz 1, 1090, Vienna, Austria
| |
Collapse
|
30
|
Rojahn J, Palu A, Skiena S, Jones JJ. American public opinion on artificial intelligence in healthcare. PLoS One 2023; 18:e0294028. [PMID: 37943752 PMCID: PMC10635466 DOI: 10.1371/journal.pone.0294028] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/09/2023] [Accepted: 10/15/2023] [Indexed: 11/12/2023] Open
Abstract
Billions of dollars are being invested into developing medical artificial intelligence (AI) systems and yet public opinion of AI in the medical field seems to be mixed. Although high expectations for the future of medical AI do exist in the American public, anxiety and uncertainty about what it can do and how it works is widespread. Continuing evaluation of public opinion on AI in healthcare is necessary to ensure alignment between patient attitudes and the technologies adopted. We conducted a representative-sample survey (total N = 203) to measure the trust of the American public towards medical AI. Primarily, we contrasted preferences for AI and human professionals to be medical decision-makers. Additionally, we measured expectations for the impact and use of medical AI in the future. We present four noteworthy results: (1) The general public strongly prefers human medical professionals make medical decisions, while at the same time believing they are more likely to make culturally biased decisions than AI. (2) The general public is more comfortable with a human reading their medical records than an AI, both now and "100 years from now." (3) The general public is nearly evenly split between those who would trust their own doctor to use AI and those who would not. (4) Respondents expect AI will improve medical treatment but more so in the distant future than immediately.
Collapse
Affiliation(s)
- Jessica Rojahn
- Department of Sociology, Stony Brook University, Stony Brook, New York, United States of America
| | - Andrea Palu
- Department of Sociology, Stony Brook University, Stony Brook, New York, United States of America
| | - Steven Skiena
- Department of Computer Science, Stony Brook University, Stony Brook, New York, United States of America
| | - Jason J. Jones
- Department of Sociology, Stony Brook University, Stony Brook, New York, United States of America
- Institute for Advanced Computational Science, Stony Brook University, Stony Brook, New York, United States of America
| |
Collapse
|
31
|
Vicente L, Matute H. Humans inherit artificial intelligence biases. Sci Rep 2023; 13:15737. [PMID: 37789032 PMCID: PMC10547752 DOI: 10.1038/s41598-023-42384-8] [Citation(s) in RCA: 5] [Impact Index Per Article: 5.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/22/2023] [Accepted: 09/09/2023] [Indexed: 10/05/2023] Open
Abstract
Artificial intelligence recommendations are sometimes erroneous and biased. In our research, we hypothesized that people who perform a (simulated) medical diagnostic task assisted by a biased AI system will reproduce the model's bias in their own decisions, even when they move to a context without AI support. In three experiments, participants completed a medical-themed classification task with or without the help of a biased AI system. The biased recommendations by the AI influenced participants' decisions. Moreover, when those participants, assisted by the AI, moved on to perform the task without assistance, they made the same errors as the AI had made during the previous phase. Thus, participants' responses mimicked AI bias even when the AI was no longer making suggestions. These results provide evidence of human inheritance of AI bias.
Collapse
Affiliation(s)
- Lucía Vicente
- Department of Psychology, Deusto University, Avenida Universidades 24, 48007, Bilbao, Spain
| | - Helena Matute
- Department of Psychology, Deusto University, Avenida Universidades 24, 48007, Bilbao, Spain.
| |
Collapse
|
32
|
Rainey C, Villikudathil AT, McConnell J, Hughes C, Bond R, McFadden S. An experimental machine learning study investigating the decision-making process of students and qualified radiographers when interpreting radiographic images. PLOS DIGITAL HEALTH 2023; 2:e0000229. [PMID: 37878569 PMCID: PMC10599497 DOI: 10.1371/journal.pdig.0000229] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Received: 03/15/2023] [Accepted: 07/29/2023] [Indexed: 10/27/2023]
Abstract
AI is becoming more prevalent in healthcare and is predicted to be further integrated into workflows to ease the pressure on an already stretched service. The National Health Service in the UK has prioritised AI and digital health as part of its Long Term Plan. Few studies have examined the human interaction with such systems in healthcare, despite reports of biases being present with the use of AI in other technologically advanced fields, such as finance and aviation. Understanding is needed of how certain user characteristics may impact how radiographers engage with AI systems in use in the clinical setting, to mitigate against problems before they arise. The aim of this study was to determine correlations between skills, confidence in AI, and perceived knowledge amongst student and qualified radiographers in the UK healthcare system. A machine learning-based AI model was built to predict in advance whether the interpreter was a student (n = 67) or a qualified radiographer (n = 39), using important variables from a feature selection technique named Boruta. A survey, which required the participant to interpret a series of plain radiographic examinations with and without AI assistance, was created on the Qualtrics survey platform and promoted via social media (Twitter/LinkedIn), thereby adopting convenience and snowball sampling. The survey was open to all UK radiographers, including students and retired radiographers. Pearson's correlation analysis revealed that males who were proficient in their profession were more likely than females to trust AI. Trust in AI was negatively correlated with age and with level of experience. A machine learning model was built; the best model predicted whether the image interpreter was a student or a qualified radiographer with an area under the curve of 0.93 and a prediction accuracy of 93%. Further testing in prospective validation cohorts using a larger sample size is required to determine the clinical utility of the proposed machine learning model.
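As an illustration of the modelling pipeline mentioned above (Boruta feature selection followed by a classifier evaluated with the area under the ROC curve), the following is a minimal sketch using the third-party boruta and scikit-learn packages on synthetic data. The feature set, classifier choice, and sample construction are assumptions, not the authors' pipeline.

    # Sketch: Boruta feature selection + random-forest classifier evaluated with AUC,
    # on synthetic data standing in for the survey responses (not the study data).
    import numpy as np
    from boruta import BorutaPy                      # third-party 'boruta' package
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.model_selection import train_test_split
    from sklearn.metrics import roc_auc_score

    rng = np.random.default_rng(42)
    X = rng.normal(size=(106, 10))                   # 106 respondents, 10 candidate features
    y = (X[:, 0] + 0.5 * X[:, 1] + rng.normal(size=106) > 0).astype(int)  # 1 = qualified

    rf = RandomForestClassifier(n_estimators=200, random_state=42)
    selector = BorutaPy(rf, n_estimators="auto", random_state=42)
    selector.fit(X, y)                               # flags 'confirmed' important features
    # Keep confirmed features; fall back to all features if none are confirmed.
    X_sel = X[:, selector.support_] if selector.support_.any() else X

    X_tr, X_te, y_tr, y_te = train_test_split(
        X_sel, y, test_size=0.3, random_state=42, stratify=y
    )
    clf = RandomForestClassifier(n_estimators=200, random_state=42).fit(X_tr, y_tr)
    print("AUC:", round(roc_auc_score(y_te, clf.predict_proba(X_te)[:, 1]), 3))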
Collapse
Affiliation(s)
- Clare Rainey
- Faculty of Life and Health Sciences, School of Health Sciences, Ulster University, York Street, Belfast, Northern Ireland, United Kingdom
| | - Angelina T. Villikudathil
- Faculty of Life and Health Sciences, School of Health Sciences, Ulster University, York Street, Belfast, Northern Ireland, United Kingdom
| | | | - Ciara Hughes
- Faculty of Life and Health Sciences, School of Health Sciences, Ulster University, York Street, Belfast, Northern Ireland, United Kingdom
| | - Raymond Bond
- Faculty of Computing, School of Computing, Engineering and the Built Environment, Ulster University, York Street, Belfast, Northern Ireland, United Kingdom
| | - Sonyia McFadden
- Faculty of Life and Health Sciences, School of Health Sciences, Ulster University, York Street, Belfast, Northern Ireland, United Kingdom
| |
Collapse
|
33
|
Du M. Machine vs. human, who makes a better judgment on innovation? Take GPT-4 for example. Front Artif Intell 2023; 6:1206516. [PMID: 37680588 PMCID: PMC10482032 DOI: 10.3389/frai.2023.1206516] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/18/2023] [Accepted: 08/02/2023] [Indexed: 09/09/2023] Open
Abstract
Introduction Human decision-making is a complex process that is often influenced by various external and internal factors. One such factor is noise, random, and irrelevant influences that can skew outcomes. Methods This essay uses the CAT test and computer simulations to measure creativity. Results Evidence indicates that humans are intrinsically prone to noise, leading to inconsistent and, at times, inaccurate decisions. In contrast, simple rules demonstrate a higher level of accuracy and consistency, while artificial intelligence demonstrates an even higher capability to process vast data and employ logical algorithms. Discussion The potential of AI, particularly its intuitive capabilities, might be surpassing human intuition in specific decision-making scenarios. This raises crucial questions about the future roles of humans and machines in decision-making spheres, especially in domains where precision is paramount.
Collapse
Affiliation(s)
- Mark Du
- Department of Computer Science, National Taiwan University, New Taipei, Taiwan
| |
Collapse
|
34
|
McKee KR, Bai X, Fiske ST. Humans perceive warmth and competence in artificial intelligence. iScience 2023; 26:107256. [PMID: 37520710 PMCID: PMC10371826 DOI: 10.1016/j.isci.2023.107256] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/13/2022] [Revised: 05/04/2023] [Accepted: 06/27/2023] [Indexed: 08/01/2023] Open
Abstract
Artificial intelligence (A.I.) increasingly suffuses everyday life. However, people are frequently reluctant to interact with A.I. systems. This challenges both the deployment of beneficial A.I. technology and the development of deep learning systems that depend on humans for oversight, direction, and regulation. Nine studies (N = 3,300) demonstrate that social-cognitive processes guide human interactions across a diverse range of real-world A.I. systems. Across studies, perceived warmth and competence emerge prominently in participants' impressions of A.I. systems. Judgments of warmth and competence systematically depend on human-A.I. interdependence and autonomy. In particular, participants perceive systems that optimize interests aligned with human interests as warmer and systems that operate independently from human direction as more competent. Finally, a prisoner's dilemma game shows that warmth and competence judgments predict participants' willingness to cooperate with a deep-learning system. These results underscore the generality of intent detection to perceptions of a broad array of algorithmic actors.
Collapse
Affiliation(s)
| | - Xuechunzi Bai
- Department of Psychology, Princeton University, Princeton, NJ 08540, USA
- School of Public and International Affairs, Princeton University, Princeton, NJ 08540, USA
| | - Susan T. Fiske
- Department of Psychology, Princeton University, Princeton, NJ 08540, USA
- School of Public and International Affairs, Princeton University, Princeton, NJ 08540, USA
| |
Collapse
|
35
|
Pozzi G. Testimonial injustice in medical machine learning. JOURNAL OF MEDICAL ETHICS 2023; 49:536-540. [PMID: 36635066 DOI: 10.1136/jme-2022-108630] [Citation(s) in RCA: 15] [Impact Index Per Article: 15.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 09/05/2022] [Accepted: 01/02/2023] [Indexed: 06/17/2023]
Abstract
Machine learning (ML) systems play an increasingly relevant role in medicine and healthcare. As their applications move ever closer to patient care and cure in clinical settings, ethical concerns about the responsibility of their use come to the fore. I analyse an aspect of responsible ML use that bears not only an ethical but also a significant epistemic dimension. I focus on ML systems' role in mediating patient-physician relations. I thereby consider how ML systems may silence patients' voices and relativise the credibility of their opinions, which undermines their overall credibility status without valid moral and epistemic justification. More specifically, I argue that withholding credibility due to how ML systems operate can be particularly harmful to patients and, apart from adverse outcomes, qualifies as a form of testimonial injustice. I make my case for testimonial injustice in medical ML by considering ML systems currently used in the USA to predict patients' risk of misusing opioids (automated Prediction Drug Monitoring Programmes, PDMPs for short). I argue that the locus of testimonial injustice in ML-mediated medical encounters is found in the fact that these systems are treated as markers of trustworthiness on which patients' credibility is assessed. I further show how ML-based PDMPs exacerbate and further propagate social inequalities at the expense of vulnerable social groups.
Collapse
Affiliation(s)
- Giorgia Pozzi
- Technology, Policy and Management, Delft University of Technology, Delft, The Netherlands
| |
Collapse
|
36
|
Grassini S. Development and validation of the AI attitude scale (AIAS-4): a brief measure of general attitude toward artificial intelligence. Front Psychol 2023; 14:1191628. [PMID: 37554139 PMCID: PMC10406504 DOI: 10.3389/fpsyg.2023.1191628] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/22/2023] [Accepted: 06/16/2023] [Indexed: 08/10/2023] Open
Abstract
The rapid advancement of artificial intelligence (AI) has generated an increasing demand for tools that can assess public attitudes toward AI. This study describes the development and validation of the AI Attitude Scale (AIAS), a concise self-report instrument designed to evaluate public perceptions of AI technology. The first version of the AIAS comprises five items, including one reverse-scored item, which aim to gauge individuals' beliefs about AI's influence on their lives, careers, and humanity overall. The scale is designed to capture attitudes toward AI, focusing on the perceived utility and potential impact of the technology on society and humanity. The psychometric properties of the scale were investigated using diverse samples in two separate studies. An exploratory factor analysis was initially conducted on the preliminary 5-item version of the scale. This exploratory validation study revealed the need to divide the scale into two factors. While the results demonstrated satisfactory internal consistency for the overall scale and its correlation with related psychometric measures, separate analyses for each factor showed robust internal consistency for Factor 1 but insufficient internal consistency for Factor 2. As a result, a second version of the scale was developed and validated, omitting the item that displayed weak correlation with the remaining items in the questionnaire. The refined final 1-factor, 4-item AIAS demonstrated superior overall internal consistency compared with the initial 5-item scale and the proposed factors. Further confirmatory factor analyses, performed on a different sample of participants, confirmed that the 1-factor model (4 items) of the AIAS exhibited an adequate fit to the data, providing additional evidence for the scale's structural validity and generalizability across diverse populations. In conclusion, the analyses reported in this article suggest that the developed and validated 4-item AIAS can be a valuable instrument for researchers and professionals working on AI development who seek to understand and study users' general attitudes toward AI.
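As an illustration of the scale-validation steps mentioned above (exploratory factor analysis and internal consistency), the following is a minimal sketch using the third-party factor_analyzer package and a hand-rolled Cronbach's alpha on simulated item responses; it is not the AIAS data or the authors' analysis.

    # Sketch: exploratory factor analysis and Cronbach's alpha for a short scale,
    # on simulated 5-item responses (not the AIAS data).
    import numpy as np
    import pandas as pd
    from factor_analyzer import FactorAnalyzer        # third-party 'factor_analyzer' package

    rng = np.random.default_rng(1)
    latent = rng.normal(size=300)                     # one simulated latent attitude factor
    items = pd.DataFrame(
        {f"item{i}": latent + rng.normal(scale=0.8, size=300) for i in range(1, 6)}
    )

    fa = FactorAnalyzer(n_factors=1, rotation=None)
    fa.fit(items)
    print("Factor loadings:\n", fa.loadings_)

    def cronbach_alpha(df):
        """Cronbach's alpha from item variances and total-score variance."""
        k = df.shape[1]
        item_var = df.var(axis=0, ddof=1).sum()
        total_var = df.sum(axis=1).var(ddof=1)
        return (k / (k - 1)) * (1 - item_var / total_var)

    print("Cronbach's alpha:", round(cronbach_alpha(items), 3))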
Collapse
Affiliation(s)
- Simone Grassini
- Department of Psychosocial Science, University of Bergen, Bergen, Norway
- Cognitive and Behavioral Neuroscience Lab, University of Stavanger, Stavanger, Norway
| |
Collapse
|
37
|
Maslej MM, Kloiber S, Ghassemi M, Yu J, Hill SL. Out with AI, in with the psychiatrist: a preference for human-derived clinical decision support in depression care. Transl Psychiatry 2023; 13:210. [PMID: 37328465 DOI: 10.1038/s41398-023-02509-z] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Received: 01/17/2023] [Revised: 05/27/2023] [Accepted: 06/02/2023] [Indexed: 06/18/2023] Open
Abstract
Advancements in artificial intelligence (AI) are enabling the development of clinical support tools (CSTs) in psychiatry to facilitate the review of patient data and inform clinical care. To promote their successful integration and prevent over-reliance, it is important to understand how psychiatrists will respond to information provided by AI-based CSTs, particularly if it is incorrect. We conducted an experiment to examine psychiatrists' perceptions of AI-based CSTs for treating major depressive disorder (MDD) and to determine whether perceptions interacted with the quality of CST information. Eighty-three psychiatrists read clinical notes about a hypothetical patient with MDD and reviewed two CSTs embedded within a single dashboard: the note's summary and a treatment recommendation. Psychiatrists were randomised to believe the source of CSTs was either AI or another psychiatrist, and across four notes, CSTs provided either correct or incorrect information. Psychiatrists rated the CSTs on various attributes. Ratings for note summaries were less favourable when psychiatrists believed the notes were generated with AI as compared to another psychiatrist, regardless of whether the notes provided correct or incorrect information. A smaller preference for psychiatrist-generated information emerged in ratings of attributes that reflected the summary's accuracy or its inclusion of important information from the full clinical note. Ratings for treatment recommendations were also less favourable when their perceived source was AI, but only when recommendations were correct. There was little evidence that clinical expertise or familiarity with AI impacted results. These findings suggest that psychiatrists prefer human-derived CSTs. This preference was less pronounced for ratings that may have prompted a deeper review of CST information (i.e. a comparison with the full clinical note to evaluate the summary's accuracy or completeness, assessing an incorrect treatment recommendation), suggesting a role of heuristics. Future work should explore other contributing factors and downstream implications for integrating AI into psychiatric care.
Collapse
Affiliation(s)
- Marta M Maslej
- Krembil Centre for Neuroinformatics, Centre for Addiction and Mental Health, Toronto, ON, Canada.
| | - Stefan Kloiber
- Campbell Family Mental Health Research Institute, Centre for Addiction and Mental Health, Toronto, ON, Canada
- Department of Psychiatry, University of Toronto, Toronto, ON, Canada
| | - Marzyeh Ghassemi
- Institute for Medical Engineering & Science, Massachusetts Institute of Technology, Cambridge, MA, USA
- Department of Electrical Engineering and Computer Science, Massachusetts Institute of Technology, Cambridge, MA, USA
- Vector Institute for Artificial Intelligence, Toronto, ON, Canada
| | - Joanna Yu
- Krembil Centre for Neuroinformatics, Centre for Addiction and Mental Health, Toronto, ON, Canada
- Vector Institute for Artificial Intelligence, Toronto, ON, Canada
| | - Sean L Hill
- Krembil Centre for Neuroinformatics, Centre for Addiction and Mental Health, Toronto, ON, Canada
- Campbell Family Mental Health Research Institute, Centre for Addiction and Mental Health, Toronto, ON, Canada
- Department of Psychiatry, University of Toronto, Toronto, ON, Canada
- Vector Institute for Artificial Intelligence, Toronto, ON, Canada
| |
Collapse
|
38
|
Lee JH, Hong H, Nam G, Hwang EJ, Park CM. Effect of Human-AI Interaction on Detection of Malignant Lung Nodules on Chest Radiographs. Radiology 2023; 307:e222976. [PMID: 37367443 DOI: 10.1148/radiol.222976] [Citation(s) in RCA: 20] [Impact Index Per Article: 20.0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 06/28/2023]
Abstract
Background The factors affecting radiologists' diagnostic determinations in artificial intelligence (AI)-assisted image reading remain underexplored. Purpose To assess how AI diagnostic performance and reader characteristics influence detection of malignant lung nodules during AI-assisted reading of chest radiographs. Materials and Methods This retrospective study consisted of two reading sessions from April 2021 to June 2021. Based on the first session without AI assistance, 30 readers were assigned to two groups with equivalent areas under the free-response receiver operating characteristic curve (AUFROCs). In the second session, each group reinterpreted radiographs assisted by either a high- or low-accuracy AI model (blinded to the fact that two different AI models were used). Reader performance for detecting lung cancer and reader susceptibility (changing the original reading following the AI suggestion) were compared. A generalized linear mixed model was used to identify the factors influencing AI-assisted detection performance, including readers' attitudes and experiences of AI and Grit score. Results Of the 120 chest radiographs assessed, 60 were obtained in patients with lung cancer (mean age, 67 years ± 12 [SD]; 32 male; 63 cancers) and 60 in controls (mean age, 67 years ± 12; 36 male). Readers included 20 thoracic radiologists (5-18 years of experience) and 10 radiology residents (2-3 years of experience). Use of the high-accuracy AI model improved readers' detection performance to a greater extent than use of the low-accuracy AI model (area under the receiver operating characteristic curve, 0.77 to 0.82 vs 0.75 to 0.75; AUFROC, 0.71 to 0.79 vs 0.7 to 0.72). Readers who used the high-accuracy AI showed a higher susceptibility (67%, 224 of 334 cases) to changing their diagnosis based on the AI suggestions than those using the low-accuracy AI (59%, 229 of 386 cases). Accurate readings at the first session, correct AI suggestions, high-accuracy AI, and diagnostic difficulty were associated with accurate AI-assisted readings, but readers' characteristics were not. Conclusion An AI model with high diagnostic accuracy led to improved performance of radiologists in detecting lung cancer on chest radiographs and increased radiologists' susceptibility to AI suggestions. © RSNA, 2023. Supplemental material is available for this article.
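As a simplified illustration of comparing detection performance with and without AI assistance, the following sketch computes ROC AUC on synthetic reader scores with scikit-learn; it stands in for, and does not reproduce, the study's AUFROC and generalized linear mixed-model analyses.

    # Sketch: reader performance before vs. after AI assistance, summarised as ROC AUC
    # on synthetic confidence scores (not the study data or its AUFROC analysis).
    import numpy as np
    from sklearn.metrics import roc_auc_score

    rng = np.random.default_rng(7)
    truth = rng.integers(0, 2, size=120)                           # 1 = cancer present
    score_unaided = truth * 0.8 + rng.normal(scale=1.0, size=120)  # reader alone
    score_ai_assisted = truth * 1.2 + rng.normal(scale=1.0, size=120)  # reader + AI

    print("AUC without AI:", round(roc_auc_score(truth, score_unaided), 3))
    print("AUC with AI:   ", round(roc_auc_score(truth, score_ai_assisted), 3))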
Collapse
Affiliation(s)
- Jong Hyuk Lee
- From the Department of Radiology (J.H.L., E.J.H., C.M.P.) and Medical Research Collaborating Center (H.H.), Seoul National University Hospital, Seoul, Korea; Lunit, Seoul, Korea (G.N.); Institute of Medical and Biological Engineering and Institute of Radiation Medicine, Seoul National University Medical Research Center, Seoul, Korea (C.M.P.); and Department of Radiology, Seoul National University College of Medicine, 101 Daehak-ro, Jongno-gu, Seoul 03080, Korea (C.M.P.)
| | - Hyunsook Hong
- From the Department of Radiology (J.H.L., E.J.H., C.M.P.) and Medical Research Collaborating Center (H.H.), Seoul National University Hospital, Seoul, Korea; Lunit, Seoul, Korea (G.N.); Institute of Medical and Biological Engineering and Institute of Radiation Medicine, Seoul National University Medical Research Center, Seoul, Korea (C.M.P.); and Department of Radiology, Seoul National University College of Medicine, 101 Daehak-ro, Jongno-gu, Seoul 03080, Korea (C.M.P.)
| | - Gunhee Nam
- From the Department of Radiology (J.H.L., E.J.H., C.M.P.) and Medical Research Collaborating Center (H.H.), Seoul National University Hospital, Seoul, Korea; Lunit, Seoul, Korea (G.N.); Institute of Medical and Biological Engineering and Institute of Radiation Medicine, Seoul National University Medical Research Center, Seoul, Korea (C.M.P.); and Department of Radiology, Seoul National University College of Medicine, 101 Daehak-ro, Jongno-gu, Seoul 03080, Korea (C.M.P.)
| | - Eui Jin Hwang
- From the Department of Radiology (J.H.L., E.J.H., C.M.P.) and Medical Research Collaborating Center (H.H.), Seoul National University Hospital, Seoul, Korea; Lunit, Seoul, Korea (G.N.); Institute of Medical and Biological Engineering and Institute of Radiation Medicine, Seoul National University Medical Research Center, Seoul, Korea (C.M.P.); and Department of Radiology, Seoul National University College of Medicine, 101 Daehak-ro, Jongno-gu, Seoul 03080, Korea (C.M.P.)
| | - Chang Min Park
- From the Department of Radiology (J.H.L., E.J.H., C.M.P.) and Medical Research Collaborating Center (H.H.), Seoul National University Hospital, Seoul, Korea; Lunit, Seoul, Korea (G.N.); Institute of Medical and Biological Engineering and Institute of Radiation Medicine, Seoul National University Medical Research Center, Seoul, Korea (C.M.P.); and Department of Radiology, Seoul National University College of Medicine, 101 Daehak-ro, Jongno-gu, Seoul 03080, Korea (C.M.P.)
| |
Collapse
|
39
|
Cassidy T, Chapman L. Noise and transient ischaemic attacks - A challenge? J R Coll Physicians Edinb 2023; 53:132-134. [PMID: 36883336 DOI: 10.1177/14782715231161500] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 03/09/2023] Open
Abstract
Consistency in medical decision-making is ideally expected. This includes consistency between different clinicians, so that the same patient will receive the same diagnosis regardless of the assessing clinician. It also encompasses reliability within an individual clinician, meaning that at any given time or in any given context we apply the same process and principles to ensure the decisions we make do not deviate significantly from our peers or indeed from our own past decisions. However, consistency in decision-making can be challenged when working within a busy healthcare system. We discuss the concept of 'noise' and explore how it affects decision-making in acute presentations of transient neurology, where doctors can differ in terms of their diagnostic decisions.
Collapse
Affiliation(s)
- Tim Cassidy
- St. Vincent's University Hospital, Dublin, Ireland
| | - Lucy Chapman
- St. Vincent's University Hospital, Dublin, Ireland
| |
Collapse
|
40
|
Wang S, Kim KJ. Content Moderation on Social Media: Does It Matter Who and Why Moderates Hate Speech? CYBERPSYCHOLOGY, BEHAVIOR AND SOCIAL NETWORKING 2023. [PMID: 37140448 DOI: 10.1089/cyber.2022.0158] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 05/05/2023]
Abstract
Artificial intelligence (AI) has been increasingly integrated into content moderation to detect and remove hate speech on social media. An online experiment (N = 478) was conducted to examine how moderation agents (AI vs. human vs. human-AI collaboration) and removal explanations (with vs. without) affect users' perceptions and acceptance of removal decisions for hate speech targeting social groups with certain characteristics, such as religion or sexual orientation. The results showed that individuals exhibit consistent levels of perceived trustworthiness and acceptance of removal decisions regardless of the type of moderation agent. When explanations for the content takedown were provided, removal decisions made jointly by humans and AI were perceived as more trustworthy than the same decisions made by humans alone, which increased users' willingness to accept the verdict. However, this moderated mediation effect was only significant when Muslims, not homosexuals, were the target of hate speech.
Collapse
Affiliation(s)
- Sai Wang
- Department of Interactive Media, School of Communication, Hong Kong Baptist University, Kowloon, Hong Kong
| | - Ki Joon Kim
- Department of Media and Communication, City University of Hong Kong, Kowloon, Hong Kong
| |
Collapse
|
41
|
Robertson C, Woods A, Bergstrand K, Findley J, Balser C, Slepian MJ. Diverse patients' attitudes towards Artificial Intelligence (AI) in diagnosis. PLOS DIGITAL HEALTH 2023; 2:e0000237. [PMID: 37205713 DOI: 10.1371/journal.pdig.0000237] [Citation(s) in RCA: 5] [Impact Index Per Article: 5.0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Received: 02/27/2022] [Accepted: 03/20/2023] [Indexed: 05/21/2023]
Abstract
Artificial intelligence (AI) has the potential to improve diagnostic accuracy. Yet people are often reluctant to trust automated systems, and some patient populations may be particularly distrusting. We sought to determine how diverse patient populations feel about the use of AI diagnostic tools, and whether framing and informing the choice affects uptake. To construct and pretest our materials, we conducted structured interviews with a diverse set of actual patients. We then conducted a pre-registered (osf.io/9y26x), randomized, blinded survey experiment in factorial design. A survey firm provided n = 2675 responses, oversampling minoritized populations. Clinical vignettes were randomly manipulated in eight variables with two levels each: disease severity (leukemia versus sleep apnea), whether AI is proven more accurate than human specialists, whether the AI clinic is personalized to the patient through listening and/or tailoring, whether the AI clinic avoids racial and/or financial biases, whether the Primary Care Physician (PCP) promises to explain and incorporate the advice, and whether the PCP nudges the patient towards AI as the established, recommended, and easy choice. Our main outcome measure was selection of AI clinic or human physician specialist clinic (binary, "AI uptake"). We found that with weighting representative to the U.S. population, respondents were almost evenly split (52.9% chose human doctor and 47.1% chose AI clinic). In unweighted experimental contrasts of respondents who met pre-registered criteria for engagement, a PCP's explanation that AI has proven superior accuracy increased uptake (OR = 1.48, CI 1.24-1.77, p < .001), as did a PCP's nudge towards AI as the established choice (OR = 1.25, CI: 1.05-1.50, p = .013), as did reassurance that the AI clinic had trained counselors to listen to the patient's unique perspectives (OR = 1.27, CI: 1.07-1.52, p = .008). Disease severity (leukemia versus sleep apnea) and other manipulations did not affect AI uptake significantly. Compared to White respondents, Black respondents selected AI less often (OR = .73, CI: .55-.96, p = .023) and Native Americans selected it more often (OR: 1.37, CI: 1.01-1.87, p = .041). Older respondents were less likely to choose AI (OR: .99, CI: .987-.999, p = .03), as were those who identified as politically conservative (OR: .65, CI: .52-.81, p < .001) or viewed religion as important (OR: .64, CI: .52-.77, p < .001). For each unit increase in education, the odds are 1.10 greater for selecting an AI provider (OR: 1.10, CI: 1.03-1.18, p = .004). While many patients appear resistant to the use of AI, accuracy information, nudges and a listening patient experience may help increase acceptance. To ensure that the benefits of AI are secured in clinical practice, future research on best methods of physician incorporation and patient decision making is required.
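As an illustration of how odds ratios like those reported above can be obtained, the following is a minimal sketch of a logistic regression of AI uptake on two of the manipulated factors, using statsmodels on simulated data; the factor names and effect sizes are assumptions, not the study's data.

    # Sketch: logistic regression yielding odds ratios (with 95% CIs) for AI uptake,
    # with simulated data and hypothetical factor names (not the study dataset).
    import numpy as np
    import pandas as pd
    import statsmodels.formula.api as smf

    rng = np.random.default_rng(3)
    n = 2675
    df = pd.DataFrame({
        "accuracy_info": rng.integers(0, 2, size=n),   # PCP explains AI's superior accuracy
        "nudge": rng.integers(0, 2, size=n),           # PCP nudges towards AI
    })
    logit_p = -0.1 + 0.4 * df["accuracy_info"] + 0.2 * df["nudge"]
    df["chose_ai"] = rng.binomial(1, 1 / (1 + np.exp(-logit_p)))   # 1 = chose AI clinic

    model = smf.logit("chose_ai ~ accuracy_info + nudge", data=df).fit(disp=0)
    odds_ratios = np.exp(model.params)                  # exponentiated coefficients = ORs
    ci = np.exp(model.conf_int())
    print(pd.concat([odds_ratios.rename("OR"),
                     ci.rename(columns={0: "2.5%", 1: "97.5%"})], axis=1))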
Collapse
Affiliation(s)
- Christopher Robertson
- University of Arizona, Tucson, Arizona, United States of America
- Boston University, Boston, Massachusetts, United States of America
| | - Andrew Woods
- University of Arizona, Tucson, Arizona, United States of America
| | - Kelly Bergstrand
- University of Texas at Arlington, Arlington Texas, United States of America
| | - Jess Findley
- University of Arizona, Tucson, Arizona, United States of America
| | - Cayley Balser
- University of Arizona, Tucson, Arizona, United States of America
| | - Marvin J Slepian
- University of Arizona, Tucson, Arizona, United States of America
| |
Collapse
|
42
|
Morin-Martel A. Machine learning in bail decisions and judges' trustworthiness. AI & SOCIETY 2023:1-12. [PMID: 37358945 PMCID: PMC10120473 DOI: 10.1007/s00146-023-01673-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/16/2022] [Accepted: 04/11/2023] [Indexed: 06/28/2023]
Abstract
The use of AI algorithms in criminal trials has been the subject of very lively ethical and legal debates recently. While there are concerns over the lack of accuracy and the harmful biases that certain algorithms display, new algorithms seem more promising and might lead to more accurate legal decisions. Algorithms seem especially relevant for bail decisions, because such decisions involve statistical data to which human reasoners struggle to give adequate weight. While getting the right legal outcome is a strong desideratum of criminal trials, advocates of the relational theory of procedural justice give us good reason to think that fairness and perceived fairness of legal procedures have a value that is independent from the outcome. According to this literature, one key aspect of fairness is trustworthiness. In this paper, I argue that using certain algorithms to assist bail decisions could increase three different aspects of judges' trustworthiness: (1) actual trustworthiness, (2) rich trustworthiness, and (3) perceived trustworthiness.
Collapse
|
43
|
Yu L, Li Y, Fan F. Employees' Appraisals and Trust of Artificial Intelligences' Transparency and Opacity. Behav Sci (Basel) 2023; 13:bs13040344. [PMID: 37102857 PMCID: PMC10135857 DOI: 10.3390/bs13040344] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/21/2023] [Revised: 04/16/2023] [Accepted: 04/18/2023] [Indexed: 04/28/2023] Open
Abstract
Artificial intelligence (AI) is being increasingly used as a decision agent in enterprises. Employees' appraisals of AI affect the smooth progress of AI-employee cooperation. This paper studies (1) whether employees' challenge appraisals, threat appraisals, and trust in AI differ between AI transparency and AI opacity; (2) how AI transparency affects employees' trust in AI through employee appraisals (challenge and threat appraisals); and (3) whether and how employees' domain knowledge about AI moderates the relationship between AI transparency and appraisals. A total of 375 participants with work experience were recruited for an online hypothetical scenario experiment. The results showed that AI transparency (vs. opacity) led to higher challenge appraisals and trust and lower threat appraisals. However, under both AI transparency and opacity, employees believed that AI decisions brought more challenges than threats. In addition, we found a parallel mediating effect of challenge appraisals and threat appraisals: AI transparency promotes employees' trust in AI by increasing employees' challenge appraisals and reducing employees' threat appraisals. Finally, employees' domain knowledge about AI moderated the relationship between AI transparency and appraisals. Specifically, domain knowledge negatively moderated the positive effect of AI transparency on challenge appraisals, and domain knowledge positively moderated the negative effect of AI transparency on threat appraisals.
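As an illustration of the parallel mediation described above (transparency to challenge and threat appraisals to trust), the following is a minimal product-of-coefficients sketch with a percentile bootstrap, using statsmodels on simulated data; it is not the authors' analysis.

    # Sketch: parallel mediation (transparency -> challenge & threat appraisals -> trust)
    # via product-of-coefficients with a percentile bootstrap, on simulated data.
    import numpy as np
    import pandas as pd
    import statsmodels.formula.api as smf

    rng = np.random.default_rng(5)
    n = 375
    transparency = rng.integers(0, 2, size=n)                 # 0 = opaque, 1 = transparent
    challenge = 0.5 * transparency + rng.normal(size=n)
    threat = -0.4 * transparency + rng.normal(size=n)
    trust = 0.6 * challenge - 0.5 * threat + rng.normal(size=n)
    df = pd.DataFrame(dict(transparency=transparency, challenge=challenge,
                           threat=threat, trust=trust))

    def indirect_effects(d):
        # a-paths: transparency -> each mediator; b-paths from the outcome model
        a1 = smf.ols("challenge ~ transparency", data=d).fit().params["transparency"]
        a2 = smf.ols("threat ~ transparency", data=d).fit().params["transparency"]
        m = smf.ols("trust ~ transparency + challenge + threat", data=d).fit()
        return a1 * m.params["challenge"], a2 * m.params["threat"]

    boot = np.array([indirect_effects(df.sample(n, replace=True)) for _ in range(500)])
    print("indirect via challenge, 95% CI:", np.percentile(boot[:, 0], [2.5, 97.5]))
    print("indirect via threat, 95% CI:   ", np.percentile(boot[:, 1], [2.5, 97.5]))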
Collapse
Affiliation(s)
- Liangru Yu
- School of Economics and Management, Chongqing University of Posts and Telecommunications, Chongqing 400065, China
| | - Yi Li
- School of Economics and Management, Chongqing University of Posts and Telecommunications, Chongqing 400065, China
| | - Fan Fan
- Faculty of Collaborative Regional Innovation, Ehime University, Matsuyama 790-8566, Ehime, Japan
| |
Collapse
|
44
|
Jaffé ME, Douneva M, Bartlome R, Greifeneder R. A million reasons or just one? How coin flips impact the number of relevant reasons for decisions. EUROPEAN JOURNAL OF SOCIAL PSYCHOLOGY 2023. [DOI: 10.1002/ejsp.2941] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 04/05/2023]
Affiliation(s)
- Mariela E. Jaffé
- University of Basel Basel Switzerland
- University Psychiatric Clinics Basel Basel Switzerland
| | | | | | | |
Collapse
|
45
|
Schaap G, Bosse T, Hendriks Vettehen P. The ABC of algorithmic aversion: not agent, but benefits and control determine the acceptance of automated decision-making. AI & SOCIETY 2023. [DOI: 10.1007/s00146-023-01649-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 03/30/2023]
Abstract
While algorithmic decision-making (ADM) is projected to increase exponentially in the coming decades, the academic debate on whether people are ready to accept, trust, and use ADM as opposed to human decision-making is ongoing. The current research aims at reconciling conflicting findings on ‘algorithmic aversion’ in the literature. It does so by investigating algorithmic aversion while controlling for two important characteristics that are often associated with ADM: increased benefits (monetary and accuracy) and decreased user control. Across three high-powered (total N = 1192), preregistered 2 (agent: algorithm/human) × 2 (benefits: high/low) × 2 (control: user control/no control) between-subjects experiments, and two domains (finance and dating), the results were quite consistent: there is little evidence for a default aversion against algorithms and in favor of human decision makers. Instead, users accept or reject decisions and decisional agents based on their predicted benefits and the ability to exercise control over the decision.
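As an illustration of the 2 × 2 × 2 between-subjects design described above, the following is a minimal factorial ANOVA sketch using statsmodels on simulated data with hypothetical factor names; it is not the preregistered analysis.

    # Sketch: 2x2x2 between-subjects ANOVA on acceptance, with simulated data and
    # hypothetical factor names (agent, benefits, control); not the authors' analyses.
    import numpy as np
    import pandas as pd
    import statsmodels.formula.api as smf
    from statsmodels.stats.anova import anova_lm

    rng = np.random.default_rng(9)
    n = 1192
    df = pd.DataFrame({
        "agent": rng.choice(["algorithm", "human"], size=n),
        "benefits": rng.choice(["high", "low"], size=n),
        "control": rng.choice(["user", "none"], size=n),
    })
    # Simulated effects: benefits and control matter, agent does not.
    df["acceptance"] = (
        0.6 * (df["benefits"] == "high") + 0.4 * (df["control"] == "user")
        + rng.normal(size=n)
    )

    model = smf.ols("acceptance ~ C(agent) * C(benefits) * C(control)", data=df).fit()
    print(anova_lm(model, typ=2))   # main effects and interactions table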
Collapse
|
46
|
Bercean BA, Birhala A, Ardelean PG, Barbulescu I, Benta MM, Rasadean CD, Costachescu D, Avramescu C, Tenescu A, Iarca S, Buburuzan AS, Marcu M, Birsasteanu F. Evidence of a cognitive bias in the quantification of COVID-19 with CT: an artificial intelligence randomised clinical trial. Sci Rep 2023; 13:4887. [PMID: 36966179 PMCID: PMC10039355 DOI: 10.1038/s41598-023-31910-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/25/2022] [Accepted: 03/19/2023] [Indexed: 03/27/2023] Open
Abstract
Chest computed tomography (CT) has played a valuable, distinct role in the screening, diagnosis, and follow-up of COVID-19 patients. The quantification of COVID-19 pneumonia on CT has proven to be an important predictor of the treatment course and outcome of the patient although it remains heavily reliant on the radiologist's subjective perceptions. Here, we show that with the adoption of CT for COVID-19 management, a new type of psychophysical bias has emerged in radiology. A preliminary survey of 40 radiologists and a retrospective analysis of CT data from 109 patients from two hospitals revealed that radiologists overestimated the percentage of lung involvement by 10.23 ± 4.65% and 15.8 ± 6.6%, respectively. In the subsequent randomised controlled trial, artificial intelligence (AI) decision support reduced the absolute overestimation error (P < 0.001) from 9.5% ± 6.6 (No-AI analysis arm, n = 38) to 1.0% ± 5.2 (AI analysis arm, n = 38). These results indicate a human perception bias in radiology that has clinically meaningful effects on the quantitative analysis of COVID-19 on CT. The objectivity of AI was shown to be a valuable complement in mitigating the radiologist's subjectivity, reducing the overestimation tenfold. Trial registration: https://Clinicaltrial.gov; identifier: NCT05282056; date of registration: 01/02/2022.
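As a simplified illustration of the two-arm comparison reported above, the following sketch runs a Welch's t-test on simulated overestimation errors roughly matching the reported means and SDs; it is not the trial data or analysis.

    # Sketch: two-arm comparison of overestimation error, simulated to roughly match
    # the reported values (9.5 +/- 6.6 vs 1.0 +/- 5.2, n = 38 per arm); not trial data.
    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(11)
    no_ai_arm = rng.normal(loc=9.5, scale=6.6, size=38)   # overestimation error, %
    ai_arm = rng.normal(loc=1.0, scale=5.2, size=38)

    t, p = stats.ttest_ind(no_ai_arm, ai_arm, equal_var=False)  # Welch's t-test
    print(f"mean No-AI = {no_ai_arm.mean():.1f}%, mean AI = {ai_arm.mean():.1f}%, p = {p:.2g}")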
Collapse
Affiliation(s)
- Bogdan A Bercean
- Rayscape, 5, Nicolae Iorga, 010431, Bucharest, Romania.
- Politehnica University of Timișoara, 2, Victoriei Square, 300006, Timisoara, Romania.
| | | | - Paula G Ardelean
- Rayscape, 5, Nicolae Iorga, 010431, Bucharest, Romania
- Department of Radiology, Pius Brinzeu County Emergency Hospital, 156, Liviu Rebreanu, 300723, Timisoara, Romania
| | - Ioana Barbulescu
- Rayscape, 5, Nicolae Iorga, 010431, Bucharest, Romania
- Department of Radiology, Pius Brinzeu County Emergency Hospital, 156, Liviu Rebreanu, 300723, Timisoara, Romania
| | - Marius M Benta
- Rayscape, 5, Nicolae Iorga, 010431, Bucharest, Romania
- Department of Radiology, Pius Brinzeu County Emergency Hospital, 156, Liviu Rebreanu, 300723, Timisoara, Romania
| | - Cristina D Rasadean
- Rayscape, 5, Nicolae Iorga, 010431, Bucharest, Romania
- Department of Radiology, Pius Brinzeu County Emergency Hospital, 156, Liviu Rebreanu, 300723, Timisoara, Romania
| | - Dan Costachescu
- Rayscape, 5, Nicolae Iorga, 010431, Bucharest, Romania
- Victor Babeş University of Medicine and Pharmacy, 2, Eftimie Murgu Square, 300041, Timisoara, Romania
| | - Cristian Avramescu
- Rayscape, 5, Nicolae Iorga, 010431, Bucharest, Romania
- Politehnica University of Timișoara, 2, Victoriei Square, 300006, Timisoara, Romania
| | - Andrei Tenescu
- Rayscape, 5, Nicolae Iorga, 010431, Bucharest, Romania
- Politehnica University of Timișoara, 2, Victoriei Square, 300006, Timisoara, Romania
| | - Stefan Iarca
- Rayscape, 5, Nicolae Iorga, 010431, Bucharest, Romania
| | - Alexandru S Buburuzan
- Rayscape, 5, Nicolae Iorga, 010431, Bucharest, Romania
- The University of Manchester, Oxford Rd, Manchester, M13 9PL, UK
| | - Marius Marcu
- Politehnica University of Timișoara, 2, Victoriei Square, 300006, Timisoara, Romania
| | - Florin Birsasteanu
- Department of Radiology, Pius Brinzeu County Emergency Hospital, 156, Liviu Rebreanu, 300723, Timisoara, Romania
- Victor Babeş University of Medicine and Pharmacy, 2, Eftimie Murgu Square, 300041, Timisoara, Romania
| |
Collapse
|
47
|
Panigutti C, Beretta A, Fadda D, Giannotti F, Pedreschi D, Perotti A, Rinzivillo S. Co-design of human-centered, explainable AI for clinical decision support. ACM T INTERACT INTEL 2023. [DOI: 10.1145/3587271] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 03/16/2023]
Abstract
eXplainable AI (XAI) involves two intertwined but separate challenges: the development of techniques to extract explanations from black-box AI models, and the way such explanations are presented to users, i.e., the explanation user interface. Despite its importance, the second aspect has received limited attention so far in the literature. Effective AI explanation interfaces are fundamental for allowing human decision-makers to take advantage and oversee high-risk AI systems effectively. Following an iterative design approach, we present the first cycle of prototyping-testing-redesigning of an explainable AI technique, and its explanation user interface for clinical Decision Support Systems (DSS). We first present an XAI technique that meets the technical requirements of the healthcare domain: sequential, ontology-linked patient data, and multi-label classification tasks. We demonstrate its applicability to explain a clinical DSS, and we design a first prototype of an explanation user interface. Next, we test such a prototype with healthcare providers and collect their feedback, with a two-fold outcome: first, we obtain evidence that explanations increase users’ trust in the XAI system, and second, we obtain useful insights on the perceived deficiencies of their interaction with the system, so that we can re-design a better, more human-centered explanation interface.
Collapse
Affiliation(s)
- Cecilia Panigutti
- Università di Pisa, Italy and European Commission, Joint Research Centre (JRC), Italy
| | | | | | | | | | | | | |
Collapse
|
48
|
Bauer K, von Zahn M, Hinz O. Expl(AI)ned: The Impact of Explainable Artificial Intelligence on Users’ Information Processing. INFORMATION SYSTEMS RESEARCH 2023. [DOI: 10.1287/isre.2023.1199] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 03/06/2023]
Abstract
Although future regulations increasingly advocate that AI applications must be interpretable by users, we know little about how such explainability can affect human information processing. By conducting two experimental studies, we help to fill this gap. We show that explanations pave the way for AI systems to reshape users' understanding of the world around them. Specifically, state-of-the-art explainability methods evoke mental model adjustments that are subject to confirmation bias, allowing misconceptions and mental errors to persist and even accumulate. Moreover, mental model adjustments create spillover effects that alter users' behavior in related but distinct domains where they do not have access to an AI system. These spillover effects of mental model adjustments risk manipulating user behavior, promoting discriminatory biases, and biasing decision making. The reported findings serve as a warning that the indiscriminate use of modern explainability methods as an isolated measure to address AI systems' black-box problems can lead to unintended, unforeseen problems because it creates a new channel through which AI systems can influence human behavior in various domains.
Collapse
Affiliation(s)
- Kevin Bauer
- Information Systems Department, University of Mannheim, 68161 Mannheim, Germany
| | - Moritz von Zahn
- Information Systems Department, Goethe University, 60323 Frankfurt am Main, Germany
| | - Oliver Hinz
- Information Systems Department, Goethe University, 60323 Frankfurt am Main, Germany
| |
Collapse
|
49
|
Westphal M, Vössing M, Satzger G, Yom-Tov GB, Rafaeli A. Decision control and explanations in human-AI collaboration: Improving user perceptions and compliance. COMPUTERS IN HUMAN BEHAVIOR 2023. [DOI: 10.1016/j.chb.2023.107714] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 02/18/2023]
|
50
|
Gain-loss separability in human- but not computer-based changes of mind. COMPUTERS IN HUMAN BEHAVIOR 2023. [DOI: 10.1016/j.chb.2023.107712] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 02/16/2023]
|