1
Ziereis A, Schacht A. Validation of scrambling methods for vocal affect bursts. Behav Res Methods 2024; 56:3089-3101. [PMID: 37673809 PMCID: PMC11133081 DOI: 10.3758/s13428-023-02222-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 08/14/2023] [Indexed: 09/08/2023]
Abstract
Studies on perception and cognition require sound methods allowing us to disentangle the basic sensory processing of physical stimulus properties from the cognitive processing of stimulus meaning. Similar to the scrambling of images, the scrambling of auditory signals is aimed at creating stimulus instances that are unrecognizable but have comparable low-level features. In the present study, we generated scrambled stimuli of short vocalizations taken from the Montreal Affective Voices database (Belin et al., Behav Res Methods, 40(2):531-539, 2008) by applying four different scrambling methods (frequency-, phase-, and two time-scrambling transformations). The original stimuli and their scrambled versions were judged by 60 participants for the apparency of a human voice, gender, and valence of the expressions, or, if no human voice was detected, for the valence of the subjective response to the stimulus. The human-likeness ratings were reduced for all scrambled versions relative to the original stimuli, albeit to a lesser extent for phase-scrambled versions of neutral bursts. For phase-scrambled neutral bursts, valence ratings were equivalent to those of the original neutral burst. All other scrambled versions were rated as slightly unpleasant, indicating that they should be used with caution due to their potential aversiveness.
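As a rough illustration of what phase scrambling does, the sketch below keeps the magnitude of each frequency component and replaces its phase with a random value. It is not the authors' implementation: the function names (`dft`, `inverse_dft`, `phase_scramble`), the naive O(n²) transform, and the choice to leave the DC and Nyquist bins untouched are all assumptions made for this example.

```python
import cmath
import math
import random


def dft(signal):
    """Naive discrete Fourier transform (O(n^2)); adequate for short demo signals."""
    n = len(signal)
    return [
        sum(signal[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def inverse_dft(spectrum):
    """Naive inverse discrete Fourier transform."""
    n = len(spectrum)
    return [
        sum(spectrum[k] * cmath.exp(2j * math.pi * k * t / n) for k in range(n)) / n
        for t in range(n)
    ]


def phase_scramble(signal, seed=None):
    """Return a real signal with the same magnitude spectrum but random phases."""
    rng = random.Random(seed)
    n = len(signal)
    spectrum = dft(signal)
    scrambled = list(spectrum)
    # Randomize the phase of each positive-frequency bin and write the complex
    # conjugate into its mirror bin, so the inverse transform stays real-valued.
    for k in range(1, (n - 1) // 2 + 1):
        phase = rng.uniform(-math.pi, math.pi)
        value = abs(spectrum[k]) * cmath.exp(1j * phase)
        scrambled[k] = value
        scrambled[n - k] = value.conjugate()
    # The DC bin (k = 0) and, for even n, the Nyquist bin (k = n / 2) are left as-is.
    return [x.real for x in inverse_dft(scrambled)]


# Hypothetical test signal: two sinusoids sampled at 32 points.
tone = [math.sin(2 * math.pi * 3 * t / 32) + 0.5 * math.sin(2 * math.pi * 5 * t / 32)
        for t in range(32)]
scrambled_tone = phase_scramble(tone, seed=1)
```

Because only the phases change, the scrambled signal has the same long-term spectrum as the original while its temporal structure is disrupted, which is the property the abstract refers to as comparable low-level features.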
Affiliation(s)
- Annika Ziereis
- Department for Cognition, Emotion and Behavior, Affective Neuroscience and Psychophysiology Laboratory, Institute of Psychology, University of Göttingen, Göttingen, Germany.
- Anne Schacht
- Department for Cognition, Emotion and Behavior, Affective Neuroscience and Psychophysiology Laboratory, Institute of Psychology, University of Göttingen, Göttingen, Germany
2
Huang G, Moore RK. Using social robots for language learning: are we there yet? Journal of China Computer-Assisted Language Learning 2023; 3:208-230. [PMID: 38013743 PMCID: PMC10464067 DOI: 10.1515/jccall-2023-0013] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/18/2023] [Accepted: 05/29/2023] [Indexed: 11/29/2023]
Abstract
Along with the development of speech and language technologies and growing market interest, social robots have attracted more academic and commercial attention in recent decades. Their multimodal embodiment offers a broad range of possibilities, which have gained importance in the education sector. It has also led to a new technology-based field of language education: robot-assisted language learning (RALL). RALL has developed rapidly in second language learning, especially driven by the need to compensate for the shortage of first-language tutors. There are many implementation cases and studies of social robots, from early government-led attempts in Japan and South Korea to increasing research interests in Europe and worldwide. Compared with RALL used for English as a foreign language (EFL), however, there are fewer studies on applying RALL for teaching Chinese as a foreign language (CFL). One potential reason is that RALL is not well-known in the CFL field. This scope review paper attempts to fill this gap by addressing the balance between classroom implementation and research frontiers of social robots. The review first introduces the technical tool used in RALL, namely the social robot, at a high level. It then presents a historical overview of the real-life implementation of social robots in language classrooms in East Asia and Europe. It then provides a summary of the evaluation of RALL from the perspectives of L2 learners, teachers and technology developers. The overall goal of this paper is to gain insights into RALL's potential and challenges and identify a rich set of open research questions for applying RALL to CFL. It is hoped that the review may inform interdisciplinary analysis and practice for scientific research and front-line teaching in future.
3
Yorgancigil E, Yildirim F, Urgen BA, Erdogan SB. An Exploratory Analysis of the Neural Correlates of Human-Robot Interactions With Functional Near Infrared Spectroscopy. Front Hum Neurosci 2022; 16:883905. [PMID: 35923750 PMCID: PMC9339604 DOI: 10.3389/fnhum.2022.883905] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/25/2022] [Accepted: 06/22/2022] [Indexed: 11/13/2022] Open
Abstract
Functional near infrared spectroscopy (fNIRS) has been gaining increasing interest as a practical mobile functional brain imaging technology for understanding the neural correlates of social cognition and emotional processing in the human prefrontal cortex (PFC). Considering the cognitive complexity of human-robot interactions, the aim of this study was to explore the neural correlates of emotional processing of congruent and incongruent pairs of human and robot audio-visual stimuli in the human PFC with fNIRS methodology. Hemodynamic responses from the PFC region of 29 subjects were recorded with fNIRS during an experimental paradigm which consisted of auditory and visual presentation of human and robot stimuli. Distinct neural responses to human and robot stimuli were detected at the dorsolateral prefrontal cortex (DLPFC) and orbitofrontal cortex (OFC) regions. Presentation of robot voice elicited significantly less hemodynamic response than presentation of human voice in a left OFC channel. Meanwhile, processing of human faces elicited significantly higher hemodynamic activity when compared to processing of robot faces in two left DLPFC channels and a left OFC channel. Significant correlation between the hemodynamic and behavioral responses for the face-voice mismatch effect was found in the left OFC. Our results highlight the potential of fNIRS for unraveling the neural processing of human and robot audio-visual stimuli, which might enable optimization of social robot designs and contribute to elucidation of the neural processing of human and robot stimuli in the PFC in naturalistic conditions.
Affiliation(s)
- Emre Yorgancigil
- Department of Medical Engineering, Acibadem Mehmet Ali Aydinlar University, Istanbul, Turkey
- *Correspondence: Emre Yorgancigil
- Funda Yildirim
- Cognitive Science Master's Program, Yeditepe University, Istanbul, Turkey
- Department of Computer Engineering, Yeditepe University, Istanbul, Turkey
- Burcu A. Urgen
- Department of Psychology, Bilkent University, Ankara, Turkey
- Neuroscience Graduate Program, Bilkent University, Ankara, Turkey
- Aysel Sabuncu Brain Research Center, National Magnetic Resonance Research Center (UMRAM), Ankara, Turkey
- Sinem Burcu Erdogan
- Department of Medical Engineering, Acibadem Mehmet Ali Aydinlar University, Istanbul, Turkey
4
Schreibelmayr S, Mara M. Robot Voices in Daily Life: Vocal Human-Likeness and Application Context as Determinants of User Acceptance. Front Psychol 2022; 13:787499. [PMID: 35645911 PMCID: PMC9136288 DOI: 10.3389/fpsyg.2022.787499] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 09/30/2021] [Accepted: 03/24/2022] [Indexed: 11/13/2022] Open
Abstract
The growing popularity of speech interfaces goes hand in hand with the creation of synthetic voices that sound ever more human. Previous research has been inconclusive about whether anthropomorphic design features of machines are more likely to be associated with positive user responses or, conversely, with uncanny experiences. To avoid detrimental effects of synthetic voice design, it is therefore crucial to explore what level of human realism human interactors prefer and whether their evaluations may vary across different domains of application. In a randomized laboratory experiment, 165 participants listened to one of five female-sounding robot voices, each with a different degree of human realism. We assessed how much participants anthropomorphized the voice (by subjective human-likeness ratings, a name-giving task and an imagination task), how pleasant and how eerie they found it, and to what extent they would accept its use in various domains. Additionally, participants completed Big Five personality measures and a tolerance of ambiguity scale. Our results indicate a positive relationship between human-likeness and user acceptance, with the most realistic sounding voice scoring highest in pleasantness and lowest in eeriness. Participants were also more likely to assign real human names to the voice (e.g., "Julia" instead of "T380") if it sounded more realistic. In terms of application context, participants overall indicated lower acceptance of the use of speech interfaces in social domains (care, companionship) than in others (e.g., information & navigation), though the most human-like voice was rated significantly more acceptable in social applications than the remaining four. While most personality factors did not prove influential, openness to experience was found to moderate the relationship between voice type and user acceptance such that individuals with higher openness scores rated the most human-like voice even more positively. Study results are discussed in the light of the presented theory and in relation to open research questions in the field of synthetic voice design.
5
Diel A, Weigelt S, MacDorman KF. A Meta-analysis of the Uncanny Valley's Independent and Dependent Variables. ACM Transactions on Human-Robot Interaction 2022. [DOI: 10.1145/3470742] [Citation(s) in RCA: 7] [Impact Index Per Article: 3.5] [Indexed: 10/20/2022]
Abstract
The uncanny valley (UV) effect is a negative affective reaction to human-looking artificial entities. It hinders comfortable, trust-based interactions with android robots and virtual characters. Despite extensive research, a consensus has not formed on its theoretical basis or methodologies. We conducted a meta-analysis to assess operationalizations of human likeness (independent variable) and the UV effect (dependent variable). Of 468 studies, 72 met the inclusion criteria. These studies employed 10 different stimulus creation techniques, 39 affect measures, and 14 indirect measures. Based on 247 effect sizes, a three-level meta-analysis model revealed the UV effect had a large effect size, Hedges' g = 1.01 [0.80, 1.22]. A mixed-effects meta-regression model with creation technique as the moderator variable revealed face distortion produced the largest effect size, g = 1.46 [0.69, 2.24], followed by distinct entities, g = 1.20 [1.02, 1.38], realism render, g = 0.99 [0.62, 1.36], and morphing, g = 0.94 [0.64, 1.24]. Affective indices producing the largest effects were threatening, likable, aesthetics, familiarity, and eeriness, and indirect measures were dislike frequency, categorization reaction time, like frequency, avoidance, and viewing duration. This meta-analysis, the first on the UV effect, provides a methodological foundation and design principles for future research.
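For readers unfamiliar with the effect-size metric reported above, the following minimal sketch computes Hedges' g and an approximate 95% confidence interval for two independent groups. It uses the standard small-sample correction and large-sample variance approximation; the function name `hedges_g` and the input values are hypothetical, and the meta-analysis itself pooled effects with a three-level model, which this sketch does not reproduce.

```python
import math


def hedges_g(mean1, sd1, n1, mean2, sd2, n2):
    """Hedges' g for two independent groups, with an approximate 95% CI."""
    df = n1 + n2 - 2
    # Pooled standard deviation across the two groups.
    pooled_sd = math.sqrt(((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df)
    d = (mean1 - mean2) / pooled_sd            # Cohen's d
    correction = 1 - 3 / (4 * df - 1)          # small-sample correction factor J
    g = correction * d
    # Large-sample approximation of the variance of d, scaled by J^2 for g.
    var_d = (n1 + n2) / (n1 * n2) + d ** 2 / (2 * (n1 + n2))
    half_width = 1.96 * math.sqrt(correction ** 2 * var_d)
    return g, (g - half_width, g + half_width)


# Hypothetical ratings: group 1 mean 5.0, group 2 mean 4.0, both SD 1.0, n = 20 each.
g, (ci_low, ci_high) = hedges_g(5.0, 1.0, 20, 4.0, 1.0, 20)
```

The bracketed ranges in the abstract, such as [0.80, 1.22], are intervals of this kind around each pooled estimate.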
Affiliation(s)
- Alexander Diel
- School of Psychology, Cardiff University, Cardiff, United Kingdom
- Sarah Weigelt
- Department of Vision, Visual Impairments & Blindness, Faculty of Rehabilitation Sciences, Technical University of Dortmund, Dortmund, Germany
- Karl F. MacDorman
- School of Informatics and Computing, Indiana University, Indianapolis, IN, USA
6
Diel A, MacDorman KF. Creepy cats and strange high houses: Support for configural processing in testing predictions of nine uncanny valley theories. J Vis 2021; 21:1. [PMID: 33792617 PMCID: PMC8024776 DOI: 10.1167/jov.21.4.1] [Citation(s) in RCA: 14] [Impact Index Per Article: 4.7] [Indexed: 11/24/2022] Open
Abstract
In 1970, Masahiro Mori proposed the uncanny valley (UV), a region in a human-likeness continuum where an entity risks eliciting a cold, eerie, repellent feeling. Recent studies have shown that this feeling can be elicited by entities modeled not only on humans but also nonhuman animals. The perceptual and cognitive mechanisms underlying the UV effect are not well understood, although many theories have been proposed to explain them. To test the predictions of nine classes of theories, a within-subjects experiment was conducted with 136 participants. The theories' predictions were compared with ratings of 10 classes of stimuli on eeriness and coldness indices. One type of theory, configural processing, predicted eight out of nine significant effects. Atypicality, in its extended form, in which the uncanny valley effect is amplified by the stimulus appearing more human, also predicted eight. Threat avoidance predicted seven; atypicality, perceptual mismatch, and mismatch+ predicted six; category+, novelty avoidance, mate selection, and psychopathy avoidance predicted five; and category uncertainty predicted three. Empathy's main prediction was not supported. Given that the number of significant effects predicted depends partly on our choice of hypotheses, a detailed consideration of each result is advised. We do, however, note the methodological value of examining many competing theories in the same experiment.
Affiliation(s)
- Alexander Diel
- School of Psychology, Cardiff University, Cardiff, United Kingdom; Indiana University School of Informatics and Computing, Indianapolis, IN, USA
- Karl F MacDorman
- Indiana University School of Informatics and Computing, Indianapolis, IN, USA
7
Chattopadhyay D, Ma T, Sharifi H, Martyn-Nemeth P. Computer-Controlled Virtual Humans in Patient-Facing Systems: Systematic Review and Meta-Analysis. J Med Internet Res 2020; 22:e18839. [PMID: 32729837 PMCID: PMC7426801 DOI: 10.2196/18839] [Citation(s) in RCA: 21] [Impact Index Per Article: 5.3] [Received: 03/23/2020] [Revised: 05/08/2020] [Accepted: 05/20/2020] [Indexed: 02/06/2023] Open
Abstract
BACKGROUND: Virtual humans (VH) are computer-generated characters that appear humanlike and simulate face-to-face conversations using verbal and nonverbal cues. Unlike formless conversational agents, like smart speakers or chatbots, VH bring together the capabilities of both a conversational agent and an interactive avatar (computer-represented digital characters). Although their use in patient-facing systems has garnered substantial interest, it is unknown to what extent VH are effective in health applications. OBJECTIVE: The purpose of this review was to examine the effectiveness of VH in patient-facing systems. The design and implementation characteristics of these systems were also examined. METHODS: Electronic bibliographic databases were searched for peer-reviewed articles with relevant key terms. Studies were included in the systematic review if they designed or evaluated VH in patient-facing systems. Of the included studies, those that used a randomized controlled trial to evaluate VH were included in the meta-analysis; they were then summarized using the PICOTS framework (population, intervention, comparison group, outcomes, time frame, setting). Summary effect sizes, using random-effects models, were calculated, and the risk of bias was assessed. RESULTS: Among the 8,125 unique records identified, 53 articles describing 33 unique systems were qualitatively and systematically reviewed. Two distinct design categories emerged: simple VH and VH augmented with health sensors and trackers. Of the 53 articles, 16 (26 studies) with 44 primary and 22 secondary outcomes were included in the meta-analysis. Meta-analysis of the 44 primary outcome measures revealed a significant difference between intervention and control conditions, favoring the VH intervention (SMD = .166, 95% CI .039-.292, P=.012), but with evidence of some heterogeneity, I²=49.3%. There were more cross-sectional (k=15) than longitudinal studies (k=11). The intervention was delivered using a personal computer in most studies (k=18), followed by a tablet (k=4), mobile kiosk (k=2), head-mounted display (k=1), and a desktop computer in a community center (k=1). CONCLUSIONS: We offer evidence for the efficacy of VH in patient-facing systems. Considering that studies included different population and outcome types, more focused analysis is needed in the future. Future studies also need to identify what features of virtual human interventions contribute toward their effectiveness.
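To make the reported summary statistics more concrete, the sketch below shows one common way to pool standardized mean differences under a random-effects model and to quantify heterogeneity with I². It assumes the DerSimonian-Laird estimator, which the abstract does not name, and the function name `random_effects_pool` and all input values are hypothetical; none of the numbers come from the review.

```python
import math


def random_effects_pool(effects, variances):
    """DerSimonian-Laird random-effects pooling; returns (estimate, se, tau2, i2)."""
    k = len(effects)
    weights = [1.0 / v for v in variances]              # fixed-effect weights
    sum_w = sum(weights)
    fixed_mean = sum(w * y for w, y in zip(weights, effects)) / sum_w
    # Cochran's Q: weighted squared deviations from the fixed-effect mean.
    q = sum(w * (y - fixed_mean) ** 2 for w, y in zip(weights, effects))
    df = k - 1
    c = sum_w - sum(w ** 2 for w in weights) / sum_w
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0     # between-study variance
    i2 = max(0.0, (q - df) / q) if q > 0 else 0.0       # share of variation due to heterogeneity
    re_weights = [1.0 / (v + tau2) for v in variances]  # random-effects weights
    sum_re = sum(re_weights)
    estimate = sum(w * y for w, y in zip(re_weights, effects)) / sum_re
    se = math.sqrt(1.0 / sum_re)
    return estimate, se, tau2, i2


# Hypothetical standardized mean differences and their sampling variances.
estimate, se, tau2, i2 = random_effects_pool([0.1, 0.3, 0.5], [0.01, 0.01, 0.01])
ci = (estimate - 1.96 * se, estimate + 1.96 * se)
```

An I² near 50%, as reported above, indicates that roughly half of the observed variation in effects is attributed to differences between studies rather than sampling error.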
Affiliation(s)
- Debaleena Chattopadhyay
- Department of Computer Science, University of Illinois at Chicago, Chicago, IL, United States
- Tengteng Ma
- Department of Information and Decision Sciences, University of Illinois at Chicago, Chicago, IL, United States
- Hasti Sharifi
- Department of Computer Science, University of Illinois at Chicago, Chicago, IL, United States
- Pamela Martyn-Nemeth
- Department of Biobehavioral Health Science, University of Illinois at Chicago, Chicago, IL, United States
8
9
MacDorman KF, Chattopadhyay D. Reducing consistency in human realism increases the uncanny valley effect; increasing category uncertainty does not. Cognition 2015; 146:190-205. [PMID: 26435049 DOI: 10.1016/j.cognition.2015.09.019] [Citation(s) in RCA: 112] [Impact Index Per Article: 12.4] [Received: 12/23/2014] [Revised: 09/21/2015] [Accepted: 09/22/2015] [Indexed: 12/01/2022]
Abstract
Human replicas may elicit unintended cold, eerie feelings in viewers, an effect known as the uncanny valley. Masahiro Mori, who proposed the effect in 1970, attributed it to inconsistencies in the replica's realism with some of its features perceived as human and others as nonhuman. This study aims to determine whether reducing realism consistency in visual features increases the uncanny valley effect. In three rounds of experiments, 548 participants categorized and rated humans, animals, and objects that varied from computer animated to real. Two sets of features were manipulated to reduce realism consistency. (For humans, the sets were eyes-eyelashes-mouth and skin-nose-eyebrows.) Reducing realism consistency caused humans and animals, but not objects, to appear eerier and colder. However, the predictions of a competing theory, proposed by Ernst Jentsch in 1906, were not supported: the most ambiguous representations, those eliciting the greatest category uncertainty, were neither the eeriest nor the coldest.
Affiliation(s)
- Karl F MacDorman
- Indiana University School of Informatics and Computing, 535 West Michigan St., Indianapolis, IN 46202, USA.
- Debaleena Chattopadhyay
- Indiana University School of Informatics and Computing, 535 West Michigan St., Indianapolis, IN 46202, USA