1
Kamiloğlu RG, Sauter DA. Sounds like a fight: listeners can infer behavioural contexts from spontaneous nonverbal vocalisations. Cogn Emot 2024; 38:277-295. PMID: 37997898; PMCID: PMC11057848; DOI: 10.1080/02699931.2023.2285854.
Abstract
When we hear another person laugh or scream, can we tell the kind of situation they are in - for example, whether they are playing or fighting? Nonverbal expressions are theorised to vary systematically across behavioural contexts. Perceivers might be sensitive to these putative systematic mappings and thereby correctly infer contexts from others' vocalisations. Here, in two pre-registered experiments, we test the prediction that listeners can accurately deduce production contexts (e.g. being tickled, discovering threat) from spontaneous nonverbal vocalisations, like sighs and grunts. In Experiment 1, listeners (total n = 3120) matched 200 nonverbal vocalisations to one of 10 contexts using yes/no response options. Using signal detection analysis, we show that listeners were accurate at matching vocalisations to nine of the contexts. In Experiment 2, listeners (n = 337) categorised the production contexts by selecting from 10 response options in a forced-choice task. By analysing unbiased hit rates, we show that participants categorised all 10 contexts at better-than-chance levels. Together, these results demonstrate that perceivers can infer contexts from nonverbal vocalisations at rates exceeding chance, suggesting that listeners are sensitive to systematic mappings between acoustic structures in vocalisations and behavioural contexts.
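For readers unfamiliar with the two accuracy measures named above, here is a minimal base-R sketch of both; all counts are invented for illustration and are not taken from the study.

```r
# Experiment 1: yes/no matching, analysed with signal detection theory.
# d' = qnorm(hit rate) - qnorm(false-alarm rate)
hits <- 80; misses <- 20   # "yes" responses to matching vocalisations
fas  <- 30; crs    <- 70   # "yes" responses to non-matching vocalisations
d_prime <- qnorm(hits / (hits + misses)) - qnorm(fas / (fas + crs))

# Experiment 2: forced choice, analysed with Wagner's (1993) unbiased hit
# rate Hu = (correct responses)^2 / (row total * column total), which
# corrects raw accuracy for how often each response category is used.
confusion <- matrix(c(40, 10,  5,    # rows = true context,
                       8, 35, 12,    # columns = chosen context
                       2, 15, 38), nrow = 3, byrow = TRUE)
Hu <- diag(confusion)^2 / (rowSums(confusion) * colSums(confusion))
```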
Affiliation(s)
- Roza G. Kamiloğlu
- Department of Psychology, University of Amsterdam, Amsterdam, The Netherlands
- Department of Experimental and Applied Psychology, Vrije Universiteit Amsterdam, Amsterdam, The Netherlands
- Disa A. Sauter
- Department of Psychology, University of Amsterdam, Amsterdam, The Netherlands
2
Anikin A, Canessa-Pollard V, Pisanski K, Massenet M, Reby D. Beyond speech: Exploring diversity in the human voice. iScience 2023; 26:108204. PMID: 37908309; PMCID: PMC10613903; DOI: 10.1016/j.isci.2023.108204.
Abstract
Humans have evolved voluntary control over vocal production for speaking and singing, while preserving the phylogenetically older system of spontaneous nonverbal vocalizations such as laughs and screams. To test for systematic acoustic differences between these vocal domains, we analyzed a broad, cross-cultural corpus representing over 2 h of speech, singing, and nonverbal vocalizations. We show that, while speech is relatively low-pitched and tonal with mostly regular phonation, singing and especially nonverbal vocalizations vary enormously in pitch and often display harsh-sounding, irregular phonation owing to nonlinear phenomena. The evolution of complex supralaryngeal articulatory spectro-temporal modulation has been critical for speech, yet has not significantly constrained laryngeal source modulation. In contrast, articulation is very limited in nonverbal vocalizations, which predominantly contain minimally articulated open vowels and rapid temporal modulation in the roughness range. We infer that vocal source modulation works best for conveying affect, while vocal filter modulation mainly facilitates semantic communication.
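One acoustic property highlighted above, rapid temporal modulation in the roughness range, can be estimated from the amplitude envelope; below is a rough base-R sketch on a toy signal (the 30-150 Hz band and all parameter values are illustrative assumptions, not the paper's exact pipeline).

```r
sr <- 16000                                       # sample rate, Hz
t  <- seq(0, 1, length.out = sr)                  # 1 s of signal
x  <- sin(2*pi*200*t) * (1 + 0.8*sin(2*pi*70*t))  # 200 Hz tone, 70 Hz AM

env   <- abs(x)                        # crude rectified amplitude envelope
env   <- env - mean(env)               # remove DC before the FFT
spec  <- Mod(fft(env))[1:(sr/2)]       # envelope spectrum
freqs <- seq_along(spec) - 1           # ~1 Hz bins for a 1 s window
band  <- freqs >= 30 & freqs <= 150    # "roughness" range
freqs[band][which.max(spec[band])]     # dominant modulation rate: ~70 Hz
```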
Affiliation(s)
- Andrey Anikin
- Division of Cognitive Science, Lund University, Lund, Sweden
- ENES Bioacoustics Research Lab, CRNL, University of Saint-Etienne, CNRS, Inserm, 23 rue Michelon, 42023 Saint-Etienne, France
- Valentina Canessa-Pollard
- ENES Bioacoustics Research Lab, CRNL, University of Saint-Etienne, CNRS, Inserm, 23 rue Michelon, 42023 Saint-Etienne, France
- Psychology, Institute of Psychology, Business and Human Sciences, University of Chichester, Chichester, West Sussex PO19 6PE, UK
- Katarzyna Pisanski
- ENES Bioacoustics Research Lab, CRNL, University of Saint-Etienne, CNRS, Inserm, 23 rue Michelon, 42023 Saint-Etienne, France
- CNRS French National Centre for Scientific Research, DDL Dynamics of Language Lab, University of Lyon 2, 69007 Lyon, France
- Institute of Psychology, University of Wrocław, Dawida 1, 50-527 Wrocław, Poland
- Mathilde Massenet
- ENES Bioacoustics Research Lab, CRNL, University of Saint-Etienne, CNRS, Inserm, 23 rue Michelon, 42023 Saint-Etienne, France
- David Reby
- ENES Bioacoustics Research Lab, CRNL, University of Saint-Etienne, CNRS, Inserm, 23 rue Michelon, 42023 Saint-Etienne, France
3
Johnson KT, Narain J, Quatieri T, Maes P, Picard RW. ReCANVo: A database of real-world communicative and affective nonverbal vocalizations. Sci Data 2023; 10:523. PMID: 37543663; PMCID: PMC10404278; DOI: 10.1038/s41597-023-02405-7.
Abstract
Nonverbal vocalizations, such as sighs, grunts, and yells, are informative expressions within typical verbal speech. Likewise, individuals who produce 0-10 spoken words or word approximations ("minimally speaking" individuals) convey rich affective and communicative information through nonverbal vocalizations even without verbal speech. Yet, despite their rich content, little to no data exists on the vocal expressions of this population. Here, we present ReCANVo: Real-World Communicative and Affective Nonverbal Vocalizations - a novel dataset of non-speech vocalizations labeled by function from minimally speaking individuals. The ReCANVo database contains over 7000 vocalizations spanning communicative and affective functions from eight minimally speaking individuals, along with communication profiles for each participant. Vocalizations were recorded in real-world settings and labeled in real-time by a close family member who knew the communicator well and had access to contextual information while labeling. ReCANVo is a novel database of nonverbal vocalizations from minimally speaking individuals, the largest available dataset of nonverbal vocalizations, and one of the only affective speech datasets collected amidst daily life across contexts.
Affiliation(s)
- Kristina T Johnson
- Massachusetts Institute of Technology, MIT Media Lab, Cambridge, MA, USA.
- Jaya Narain
- Massachusetts Institute of Technology, MIT Media Lab, Cambridge, MA, USA.
- Thomas Quatieri
- Massachusetts Institute of Technology, Lincoln Laboratory, Lexington, MA, USA
- Pattie Maes
- Massachusetts Institute of Technology, MIT Media Lab, Cambridge, MA, USA
- Rosalind W Picard
- Massachusetts Institute of Technology, MIT Media Lab, Cambridge, MA, USA
4
Anikin A. The honest sound of physical effort. PeerJ 2023; 11:e14944. PMID: 37033726; PMCID: PMC10078454; DOI: 10.7717/peerj.14944.
Abstract
Acoustic correlates of physical effort are still poorly understood, even though effort is vocally communicated in a variety of contexts with crucial fitness consequences, including both confrontational and reproductive social interactions. In this study, 33 lay participants spoke during a brief but intense isometric hold (L-sit), first without any voice-related instructions, and were then asked either to conceal their effort or to imitate it without actually performing the exercise. Listeners in two perceptual experiments then rated 383 recordings on perceived level of effort (n = 39 listeners) or categorized them as relaxed speech, actual effort, pretended effort, or concealed effort (n = 102 listeners). As expected, vocal effort increased compared to baseline, but the accompanying acoustic changes (increased loudness, pitch, and tense voice quality) were under voluntary control, so that they could be largely suppressed or imitated at will. In contrast, vocal tremor at approximately 10 Hz was most pronounced under actual load, and its experimental addition to relaxed baseline recordings created the impression of concealed effort. In sum, a brief episode of intense physical effort causes pronounced vocal changes, some of which are difficult to control. Listeners can thus estimate the true level of exertion, whether to judge the condition of their opponent in a fight or to monitor a partner's investment into cooperative physical activities.
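As a toy version of the tremor manipulation described above, the base-R sketch below adds ~10 Hz amplitude modulation to a stand-in baseline signal; the study's actual manipulation and parameter values may well differ.

```r
sr    <- 16000
t     <- seq(0, 1, length.out = sr)
voice <- sin(2*pi*150*t)        # stand-in for a relaxed baseline recording
tremor_rate  <- 10              # Hz, the modulation rate reported above
tremor_depth <- 0.3             # modulation depth, arbitrary choice
with_tremor  <- voice * (1 + tremor_depth * sin(2*pi*tremor_rate*t))
```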
5
Hadjimichael D, Tsoukas H. Phronetic improvisation: A virtue ethics perspective. Management Learning 2022. DOI: 10.1177/13505076221111855.
Abstract
Traditional approaches to organizational improvisation treat it as a merely functional response to environmental constraints and unforeseen disruptions, neglecting its moral dimension, especially the valued ends improvisers aim to achieve. We attempt to address this gap by drawing on virtue ethics. In particular, we explore how phronetic improvisation is accomplished by drawing on the diary of an emergency-room physician, in which she describes her (and colleagues') experience of dealing with COVID-19 in a New York hospital during the first spike in March-April 2020. We argue that improvisation is phronetic insofar as practitioners actively care for the valued ends of their practice. In particular, practitioners seek to phronetically fulfil the internal goods of their practice, while complying with institutional demands, in the context of coping with situational exigencies. Phronetic improvisation involves paying attention to what is salient in the situation at hand, while informed by an open-ended commitment to valued ends, constrained by scarce resources, and driven by a willingness to meet what is at stake through adapting general knowledge to situational demands. Such an inventive process may involve reshaping the original internal goods of the practice in light of important institutional constraints.
6
Anikin A, Reby D. Ingressive phonation conveys arousal in human nonverbal vocalizations. Bioacoustics 2022. DOI: 10.1080/09524622.2022.2039295.
Affiliation(s)
- Andrey Anikin
- Division of Cognitive Science, Lund University, Lund, Sweden
- ENES Sensory Neuro-Ethology Lab, CRNL, Jean Monnet University of Saint Étienne, St-Étienne, France
- David Reby
- ENES Sensory Neuro-Ethology Lab, CRNL, Jean Monnet University of Saint Étienne, St-Étienne, France
7
Anikin A, Pisanski K, Reby D. Static and dynamic formant scaling conveys body size and aggression. R Soc Open Sci 2022; 9:211496. PMID: 35242348; PMCID: PMC8753157; DOI: 10.1098/rsos.211496.
Abstract
When producing intimidating aggressive vocalizations, humans and other animals often extend their vocal tracts to lower their voice resonance frequencies (formants) and thus sound big. Is acoustic size exaggeration more effective when the vocal tract is extended before, or during, the vocalization, and how do listeners interpret within-call changes in apparent vocal tract length? We compared perceptual effects of static and dynamic formant scaling in aggressive human speech and nonverbal vocalizations. Acoustic manipulations corresponded to elongating or shortening the vocal tract either around (Experiment 1) or from (Experiment 2) its resting position. Gradual formant scaling that preserved average frequencies conveyed the impression of smaller size and greater aggression, regardless of the direction of change. Vocal tract shortening from the original length conveyed smaller size and less aggression, whereas vocal tract elongation conveyed larger size and more aggression, and these effects were stronger for static than for dynamic scaling. Listeners familiarized with the speaker's natural voice were less often 'fooled' by formant manipulations when judging speaker size, but paid more attention to formants when judging aggressive intent. Thus, within-call vocal tract scaling conveys emotion, but a better way to sound large and intimidating is to keep the vocal tract consistently extended.
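The formant-scaling logic rests on the textbook uniform-tube approximation F_n = (2n - 1) * c / (4L); the base-R sketch below shows the arithmetic with illustrative values (these are not the paper's stimulus parameters).

```r
c_sound <- 35000                  # speed of sound in the vocal tract, cm/s
vtl     <- 17                     # vocal tract length, cm
n       <- 1:4
formants <- (2*n - 1) * c_sound / (4 * vtl)  # ~515, 1544, 2574, 3603 Hz

# Elongating the tract by 10% scales all formants down by the same factor,
# which is what makes a vocalizer sound larger:
formants_long <- formants / 1.10

# Conversely, apparent vocal tract length can be read off measured formants:
apparent_vtl <- mean((2*n - 1) * c_sound / (4 * formants_long))  # 18.7 cm
```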
Affiliation(s)
- Andrey Anikin
- Division of Cognitive Science, Lund University, Lund, Sweden
- ENES Sensory Neuro-Ethology Lab, CRNL, Jean Monnet University of Saint Étienne, UMR 5293, 42023 St-Étienne, France
- Katarzyna Pisanski
- ENES Sensory Neuro-Ethology Lab, CRNL, Jean Monnet University of Saint Étienne, UMR 5293, 42023 St-Étienne, France
- David Reby
- ENES Sensory Neuro-Ethology Lab, CRNL, Jean Monnet University of Saint Étienne, UMR 5293, 42023 St-Étienne, France
8
Sivasathiaseelan H, Marshall CR, Benhamou E, van Leeuwen JEP, Bond RL, Russell LL, Greaves C, Moore KM, Hardy CJD, Frost C, Rohrer JD, Scott SK, Warren JD. Laughter as a paradigm of socio-emotional signal processing in dementia. Cortex 2021; 142:186-203. PMID: 34273798; PMCID: PMC8438290; DOI: 10.1016/j.cortex.2021.05.020.
Abstract
Laughter is a fundamental communicative signal in our relations with other people and is used to convey a diverse repertoire of social and emotional information. It is therefore potentially a useful probe of impaired socio-emotional signal processing in neurodegenerative diseases. Here we investigated the cognitive and affective processing of laughter in forty-seven patients representing all major syndromes of frontotemporal dementia, a disease spectrum characterised by severe socio-emotional dysfunction (twenty-two with behavioural variant frontotemporal dementia, twelve with semantic variant primary progressive aphasia, thirteen with nonfluent-agrammatic variant primary progressive aphasia), in relation to fifteen patients with typical amnestic Alzheimer's disease and twenty healthy age-matched individuals. We assessed cognitive labelling (identification) and valence rating (affective evaluation) of samples of spontaneous (mirthful and hostile) and volitional (posed) laughter versus two auditory control conditions (a synthetic laughter-like stimulus and spoken numbers). Neuroanatomical associations of laughter processing were assessed using voxel-based morphometry of patients' brain MR images. While all dementia syndromes were associated with impaired identification of laughter subtypes relative to healthy controls, this was significantly more severe overall in frontotemporal dementia than in Alzheimer's disease and particularly in the behavioural and semantic variants, which also showed abnormal affective evaluation of laughter. Over the patient cohort, laughter identification accuracy was correlated with measures of daily-life socio-emotional functioning. Certain striking syndromic signatures emerged, including enhanced liking for hostile laughter in behavioural variant frontotemporal dementia, impaired processing of synthetic laughter in the nonfluent-agrammatic variant (consistent with a generic complex auditory perceptual deficit) and enhanced liking for numbers ('numerophilia') in the semantic variant. Across the patient cohort, overall laughter identification accuracy correlated with regional grey matter in a core network encompassing inferior frontal and cingulo-insular cortices; and more specific correlates of laughter identification accuracy were delineated in cortical regions mediating affective disambiguation (identification of hostile and posed laughter in orbitofrontal cortex) and authenticity (social intent) decoding (identification of mirthful and posed laughter in anteromedial prefrontal cortex) (all p < .05 after correction for multiple voxel-wise comparisons over the whole brain). These findings reveal a rich diversity of cognitive and affective laughter phenotypes in canonical dementia syndromes and suggest that laughter is an informative probe of neural mechanisms underpinning socio-emotional dysfunction in neurodegenerative disease.
Affiliation(s)
- Harri Sivasathiaseelan
- Dementia Research Centre, UCL Queen Square Institute of Neurology, University College London, London, United Kingdom.
- Charles R Marshall
- Dementia Research Centre, UCL Queen Square Institute of Neurology, University College London, London, United Kingdom; Preventive Neurology Unit, Wolfson Institute of Preventive Medicine, Queen Mary University of London, London, United Kingdom
- Elia Benhamou
- Dementia Research Centre, UCL Queen Square Institute of Neurology, University College London, London, United Kingdom
- Janneke E P van Leeuwen
- Dementia Research Centre, UCL Queen Square Institute of Neurology, University College London, London, United Kingdom
- Rebecca L Bond
- Dementia Research Centre, UCL Queen Square Institute of Neurology, University College London, London, United Kingdom
- Lucy L Russell
- Dementia Research Centre, UCL Queen Square Institute of Neurology, University College London, London, United Kingdom
- Caroline Greaves
- Dementia Research Centre, UCL Queen Square Institute of Neurology, University College London, London, United Kingdom
- Katrina M Moore
- Dementia Research Centre, UCL Queen Square Institute of Neurology, University College London, London, United Kingdom
- Chris J D Hardy
- Dementia Research Centre, UCL Queen Square Institute of Neurology, University College London, London, United Kingdom
- Chris Frost
- Dementia Research Centre, UCL Queen Square Institute of Neurology, University College London, London, United Kingdom; Department of Medical Statistics, Faculty of Epidemiology and Population Health, London School of Hygiene and Tropical Medicine, London, United Kingdom
- Jonathan D Rohrer
- Dementia Research Centre, UCL Queen Square Institute of Neurology, University College London, London, United Kingdom
- Sophie K Scott
- Institute of Cognitive Neuroscience, UCL Queen Square Institute of Neurology, University College London, London, United Kingdom
- Jason D Warren
- Dementia Research Centre, UCL Queen Square Institute of Neurology, University College London, London, United Kingdom
9
Anikin A, Pisanski K, Massenet M, Reby D. Harsh is large: nonlinear vocal phenomena lower voice pitch and exaggerate body size. Proc Biol Sci 2021; 288:20210872. PMID: 34229494; PMCID: PMC8261225; DOI: 10.1098/rspb.2021.0872.
Abstract
A lion's roar, a dog's bark, an angry yell in a pub brawl: what do these vocalizations have in common? They all sound harsh due to nonlinear vocal phenomena (NLP)—deviations from regular voice production, hypothesized to lower perceived voice pitch and thereby exaggerate the apparent body size of the vocalizer. To test this yet uncorroborated hypothesis, we synthesized human nonverbal vocalizations, such as roars, groans and screams, with and without NLP (amplitude modulation, subharmonics and chaos). We then measured their effects on nearly 700 listeners' perceptions of three psychoacoustic (pitch, timbre, roughness) and three ecological (body size, formidability, aggression) characteristics. In an explicit rating task, all NLP lowered perceived voice pitch, increased voice darkness and roughness, and caused vocalizers to sound larger, more formidable and more aggressive. Key results were replicated in an implicit associations test, suggesting that the ‘harsh is large’ bias will arise in ecologically relevant confrontational contexts that involve a rapid, and largely implicit, evaluation of the opponent's size. In sum, nonlinearities in human vocalizations can flexibly communicate both formidability and intention to attack, suggesting they are not a mere byproduct of loud vocalizing, but rather an informative acoustic signal well suited for intimidating potential opponents.
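As a minimal illustration of one nonlinear phenomenon manipulated above, the base-R sketch below adds a subharmonic at half the fundamental to a plain tone; the study itself used full parametric voice synthesis, so treat this as a toy example.

```r
sr <- 16000
t  <- seq(0, 0.5, length.out = sr/2)     # 0.5 s
f0 <- 300                                # fundamental frequency, Hz
plain <- sin(2*pi*f0*t)
sub_depth <- 0.4                         # subharmonic strength, illustrative
harsh <- plain + sub_depth * sin(2*pi*(f0/2)*t)  # subharmonic at f0/2
harsh <- harsh / max(abs(harsh))         # renormalize the amplitude
```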
Affiliation(s)
- Andrey Anikin
- Division of Cognitive Science, Lund University, 22100 Lund, Sweden
- Equipe de Neuro-Ethologie Sensorielle, CNRS and University of Saint Étienne, UMR 5293, 42023 St-Étienne, France
- Katarzyna Pisanski
- Equipe de Neuro-Ethologie Sensorielle, CNRS and University of Saint Étienne, UMR 5293, 42023 St-Étienne, France
- CNRS, French National Centre for Scientific Research, Laboratoire de Dynamique du Langage, University of Lyon 2, 69007 Lyon, France
- Mathilde Massenet
- Equipe de Neuro-Ethologie Sensorielle, CNRS and University of Saint Étienne, UMR 5293, 42023 St-Étienne, France
- David Reby
- Equipe de Neuro-Ethologie Sensorielle, CNRS and University of Saint Étienne, UMR 5293, 42023 St-Étienne, France
10
Holz N, Larrouy-Maestri P, Poeppel D. The paradoxical role of emotional intensity in the perception of vocal affect. Sci Rep 2021; 11:9663. PMID: 33958630; PMCID: PMC8102532; DOI: 10.1038/s41598-021-88431-0.
Abstract
Vocalizations including laughter, cries, moans, or screams constitute a potent source of information about the affective states of others. It is typically conjectured that the higher the intensity of the expressed emotion, the better the classification of affective information. However, attempts to map the relation between affective intensity and inferred meaning are controversial. Based on a newly developed stimulus database of carefully validated non-speech expressions ranging across the entire intensity spectrum from low to peak, we show that the intuition is false. Based on three experiments (N = 90), we demonstrate that intensity in fact has a paradoxical role. Participants were asked to rate and classify the authenticity, intensity and emotion, as well as valence and arousal of the wide range of vocalizations. Listeners are clearly able to infer expressed intensity and arousal; in contrast, and surprisingly, emotion category and valence have a perceptual sweet spot: moderate and strong emotions are clearly categorized, but peak emotions are maximally ambiguous. This finding, which converges with related observations from visual experiments, raises interesting theoretical challenges for the emotion communication literature.
Affiliation(s)
- N Holz
- Department of Neuroscience, Max Planck Institute for Empirical Aesthetics, Frankfurt/M, Germany.
- P Larrouy-Maestri
- Department of Neuroscience, Max Planck Institute for Empirical Aesthetics, Frankfurt/M, Germany
- Max Planck NYU Center for Language, Music, and Emotion, Frankfurt/M, Germany
- D Poeppel
- Department of Neuroscience, Max Planck Institute for Empirical Aesthetics, Frankfurt/M, Germany
- Max Planck NYU Center for Language, Music, and Emotion, Frankfurt/M, Germany
- Department of Psychology, New York University, New York, NY, USA
11
Woodard K, Plate RC, Morningstar M, Wood A, Pollak SD. Categorization of Vocal Emotion Cues Depends on Distributions of Input. Affect Sci 2021; 2:301-310. PMID: 33870212; PMCID: PMC8035059; DOI: 10.1007/s42761-021-00038-w.
Abstract
Learners use the distributional properties of stimuli to identify environmentally relevant categories in a range of perceptual domains, including words, shapes, faces, and colors. We examined whether similar processes may also operate on affective information conveyed through the voice. In Experiment 1, we tested how adults (18-22-year-olds) and children (8-10-year-olds) categorized affective states communicated by vocalizations varying continuously from "calm" to "upset." We found that the threshold for categorizing both verbal (i.e., spoken word) and nonverbal (i.e., a yell) vocalizations as "upset" depended on the statistical distribution of the stimuli participants encountered. In Experiment 2, we replicated and extended these findings in adults using vocalizations that conveyed multiple negative affect states. These results suggest perceivers flexibly and rapidly update their interpretation of affective vocal cues based upon context.
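A categorization threshold like the one described above is commonly estimated by fitting a logistic psychometric function; here is a base-R sketch on simulated data (the continuum, boundary, and sample sizes are invented).

```r
set.seed(1)
morph  <- rep(seq(0, 1, length.out = 11), each = 20)  # calm (0) to upset (1)
p_true <- plogis((morph - 0.6) * 10)                  # true boundary at 0.6
upset  <- rbinom(length(morph), 1, p_true)            # simulated responses

fit <- glm(upset ~ morph, family = binomial)
boundary <- -coef(fit)[1] / coef(fit)[2]   # morph level of 50% "upset"
# Refitting after skewing the stimulus distribution (e.g., oversampling calm
# exemplars) shows how the boundary tracks the input distribution.
```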
Affiliation(s)
- Kristina Woodard
- Department of Psychology, University of Wisconsin – Madison, 1500 Highland Avenue, Madison, WI 53705 USA
- Rista C. Plate
- Department of Psychology, University of Wisconsin – Madison, 1500 Highland Avenue, Madison, WI 53705 USA
- Department of Psychology, University of Pennsylvania, 3720 Walnut Street, Philadelphia, PA 19104 USA
- Adrienne Wood
- Department of Psychology, University of Virginia, 485 McCormick Rd, Charlottesville, VA 22904 USA
- Seth D. Pollak
- Department of Psychology, University of Wisconsin – Madison, 1500 Highland Avenue, Madison, WI 53705 USA
12
Frühholz S, Dietziker J, Staib M, Trost W. Neurocognitive processing efficiency for discriminating human non-alarm rather than alarm scream calls. PLoS Biol 2021; 19:e3000751. PMID: 33848299; PMCID: PMC8043411; DOI: 10.1371/journal.pbio.3000751.
Abstract
Across many species, scream calls signal the affective significance of events to other agents. Scream calls were often thought to be of a generic alarming and fearful nature, signaling potential threats, with instantaneous, involuntary, and accurate recognition by perceivers. However, scream calls are more diverse in their affective signaling than being limited to fearfully alarming a threat, and thus the broader sociobiological relevance of various scream types is unclear. Here we used four different psychoacoustic, perceptual decision-making, and neuroimaging experiments in humans. First, we demonstrate the existence of at least six psychoacoustically distinctive types of scream calls of both alarming and non-alarming nature, rather than there being only screams caused by fear or aggression. Second, based on perceptual and processing sensitivity measures for decision-making during scream recognition, we found that alarm screams (with some exceptions) were overall discriminated the worst, were responded to the slowest, and were associated with lower perceptual sensitivity for their recognition compared with non-alarm screams. Third, the neural processing of alarm compared with non-alarm screams during an implicit processing task elicited only minimal neural signal and connectivity in perceivers, contrary to the frequent assumption of a threat processing bias in the primate neural system. These findings show that scream calls are more diverse in their signaling and communicative nature in humans than previously assumed. In contrast to the commonly observed threat processing bias in perceptual discrimination and neural processing, non-alarm screams, and positive screams in particular, appear to be processed more efficiently in both speeded discrimination and implicit neural processing.
Affiliation(s)
- Sascha Frühholz
- Cognitive and Affective Neuroscience Unit, University of Zurich, Zurich, Switzerland
- Neuroscience Center Zurich, University of Zurich and ETH Zurich, Zurich, Switzerland
- Department of Psychology, University of Oslo, Oslo, Norway
- Center for the Interdisciplinary Study of Language Evolution, University of Zurich, Zurich, Switzerland
- Joris Dietziker
- Cognitive and Affective Neuroscience Unit, University of Zurich, Zurich, Switzerland
- Matthias Staib
- Cognitive and Affective Neuroscience Unit, University of Zurich, Zurich, Switzerland
- Wiebke Trost
- Cognitive and Affective Neuroscience Unit, University of Zurich, Zurich, Switzerland
13
Engelberg JWM, Schwartz JW, Gouzoules H. The emotional canvas of human screams: patterns and acoustic cues in the perceptual categorization of a basic call type. PeerJ 2021; 9:e10990. PMID: 33854835; PMCID: PMC7953872; DOI: 10.7717/peerj.10990.
Abstract
Screams occur across taxonomically widespread species, typically in antipredator situations, and are strikingly similar acoustically. In nonhuman primates, however, screams have taken on acoustically varied forms in association with more contextually complex functions related to agonistic recruitment. Humans scream in an even broader range of contexts, but the extent to which acoustic variation allows listeners to perceive different emotional meanings remains unknown. We investigated how listeners responded to 30 contextually diverse human screams on six different emotion prompts, as well as how selected acoustic cues predicted these responses. We found that acoustic variation in screams was associated with the perception of different emotions from these calls. Emotion ratings generally fell along two dimensions: one contrasting perceived anger, frustration, and pain with surprise and happiness, roughly associated with call duration and roughness, and one related to perceived fear, associated with call fundamental frequency. Listeners were more likely to rate screams highly in emotion prompts matching the source context, suggesting that some screams conveyed information about emotional context, although screams from happiness contexts (n = 11) more often yielded higher ratings of fear. We discuss the implications of these findings for the role and evolution of nonlinguistic vocalizations in human communication, including consideration of how the expanded diversity in calls such as human screams might represent a derived function of language.
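Recovering a low-dimensional structure from emotion ratings, as described above, is the job of dimensionality reduction; the base-R sketch below shows the mechanics with principal component analysis on placeholder data (random numbers, not the study's ratings, and not necessarily the authors' exact method).

```r
set.seed(1)
ratings <- matrix(runif(30 * 6), nrow = 30,   # rows = screams
                  dimnames = list(NULL, c("anger", "frustration", "pain",
                                          "surprise", "happiness", "fear")))
pca <- prcomp(ratings, scale. = TRUE)
summary(pca)          # variance explained per component
pca$rotation[, 1:2]   # loadings: how each emotion maps onto PC1 and PC2
```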
Affiliation(s)
- Jay W. Schwartz
- Department of Psychology, Emory University, Atlanta, GA, USA
- Psychological Sciences Department, Western Oregon University, Monmouth, OR, USA
14
Anikin A, Pisanski K, Reby D. Do nonlinear vocal phenomena signal negative valence or high emotion intensity? R Soc Open Sci 2020; 7:201306. PMID: 33489278; PMCID: PMC7813245; DOI: 10.1098/rsos.201306.
Abstract
Nonlinear vocal phenomena (NLPs) are commonly reported in animal calls and, increasingly, in human vocalizations. These perceptually harsh and chaotic voice features function to attract attention and convey urgency, but they may also signal aversive states. To test whether NLPs enhance the perception of negative affect or only signal high arousal, we added subharmonics, sidebands or deterministic chaos to 48 synthetic human nonverbal vocalizations of ambiguous valence: gasps of fright/surprise, moans of pain/pleasure, roars of frustration/achievement and screams of fear/delight. In playback experiments (N = 900 listeners), we compared their perceived valence and emotion intensity in positive or negative contexts or in the absence of any contextual cues. Primarily, NLPs increased the perceived aversiveness of vocalizations regardless of context. To a smaller extent, they also increased the perceived emotion intensity, particularly when the context was negative or absent. However, NLPs also enhanced the perceived intensity of roars of achievement, indicating that their effects can generalize to positive emotions. In sum, a harsh voice with NLPs strongly tips the balance towards negative emotions when a vocalization is ambiguous, but with sufficiently informative contextual cues, NLPs may be re-evaluated as expressions of intense positive affect, underlining the importance of context in nonverbal communication.
Affiliation(s)
- Andrey Anikin
- Division of Cognitive Science, Lund University, Lund, Sweden
- Equipe de Neuro-Ethologie Sensorielle (ENES) / Centre de Recherche en Neurosciences de Lyon (CRNL), University of Lyon/Saint-Etienne, CNRS UMR5292, INSERM UMR_S 1028, Saint-Etienne, France
- Katarzyna Pisanski
- Equipe de Neuro-Ethologie Sensorielle (ENES) / Centre de Recherche en Neurosciences de Lyon (CRNL), University of Lyon/Saint-Etienne, CNRS UMR5292, INSERM UMR_S 1028, Saint-Etienne, France
- David Reby
- Equipe de Neuro-Ethologie Sensorielle (ENES) / Centre de Recherche en Neurosciences de Lyon (CRNL), University of Lyon/Saint-Etienne, CNRS UMR5292, INSERM UMR_S 1028, Saint-Etienne, France
15
Abstract
Numerous species use different forms of communication in order to successfully interact in their respective environment. This article seeks to elucidate limitations of the classical conduit metaphor by investigating communication from the perspectives of biology and artificial neural networks. First, communication is a biological natural phenomenon, found to be fruitfully grounded in an organism’s embodied structures and memory system, where specific abilities are tied to procedural, semantic, and episodic long-term memory as well as to working memory. Second, the account explicates differences between non-verbal and verbal communication and shows how artificial neural networks can communicate by means of ontologically non-committal modelling. This approach enables new perspectives of communication to emerge regarding both sender and receiver. It is further shown that communication features gradient properties that are plausibly divided into a reflexive and a reflective form, parallel to knowledge and reflection.
16
Abstract
Researchers examining nonverbal communication of emotions are becoming increasingly interested in differentiations between different positive emotional states like interest, relief, and pride. But despite the importance of the voice in communicating emotion in general and positive emotion in particular, there is to date no systematic review of what characterizes vocal expressions of different positive emotions. Furthermore, integration and synthesis of current findings are lacking. In this review, we comprehensively review studies (N = 108) investigating acoustic features relating to specific positive emotions in speech prosody and nonverbal vocalizations. We find that happy voices are generally loud with considerable variability in loudness, have high and variable pitch, and are high in the first two formant frequencies. When specific positive emotions are directly compared with each other, pitch mean, loudness mean, and speech rate differ across positive emotions, with patterns mapping onto clusters of emotions, so-called emotion families. For instance, pitch is higher for epistemological emotions (amusement, interest, relief), moderate for savouring emotions (contentment and pleasure), and lower for a prosocial emotion (admiration). Some, but not all, of the differences in acoustic patterns also map on to differences in arousal levels. We end by pointing to limitations in extant work and making concrete proposals for future research on positive emotions in the voice.
17
Bolló H, Kovács K, Lefter R, Gombos F, Kubinyi E, Topál J, Kis A. REM versus Non-REM sleep disturbance specifically affects inter-specific emotion processing in family dogs (Canis familiaris). Sci Rep 2020; 10:10492. PMID: 32591578; PMCID: PMC7319983; DOI: 10.1038/s41598-020-67092-5.
Abstract
Dogs have outstanding capabilities to read human emotional expressions, both vocal and facial. It has also been shown that positively versus negatively valenced dog-human social interactions substantially affect dogs' subsequent sleep. In the present study, we manipulated dogs' (N = 15, in a within subject design) sleep structure by specifically disrupting REM versus Non-REM sleep, while maintaining equal sleep efficiency (monitored via non-invasive polysomnography). We found that both the number of awakenings as well as relative Non-REM (but not relative REM) duration influenced dogs' viewing patterns in a task where sad and happy human faces were simultaneously projected with sad or happy human voice playbacks. In accordance with the emotion laterality hypothesis, the interaction between sound valence and Non-REM sleep duration was specific to images projected to the left (regardless of image-sound congruency). These results reveal the first evidence of a causal link between sleep structure and inter-specific emotion-processing in the family dog.
Affiliation(s)
- Henrietta Bolló
- Doctoral School of Psychology, ELTE Eötvös Loránd University, Budapest, Hungary.
- Institute of Psychology, ELTE Eötvös Loránd University, Budapest, Hungary.
- Institute of Cognitive Neuroscience and Psychology, Research Centre for Natural Sciences, Budapest, Hungary.
- Krisztina Kovács
- Institute of Cognitive Neuroscience and Psychology, Research Centre for Natural Sciences, Budapest, Hungary
- Radu Lefter
- Alexandru Ioan Cuza University, Iasi, Romania
- Ferenc Gombos
- Pázmány Péter Catholic University, Budapest, Hungary
- MTA-PPKE Adolescent Development Research Group, Budapest, Hungary
- Enikő Kubinyi
- ELTE Eötvös Loránd University, Budapest, Hungary
- Department of Ethology, ELTE Eötvös Loránd University, Budapest, Hungary
- József Topál
- Institute of Cognitive Neuroscience and Psychology, Research Centre for Natural Sciences, Budapest, Hungary
- Anna Kis
- Institute of Cognitive Neuroscience and Psychology, Research Centre for Natural Sciences, Budapest, Hungary
18
Abstract
Vocal affect is a subcomponent of emotion programs that coordinate a variety of physiological and psychological systems. Emotional vocalizations comprise a suite of vocal behaviors shaped by evolution to solve adaptive social communication problems. The acoustic forms of vocal emotions are often explicable with reference to the communicative functions they serve. An adaptationist approach to vocal emotions requires that we distinguish between evolved signals and byproduct cues, and understand vocal affect as a collection of multiple strategic communicative systems subject to the evolutionary dynamics described by signaling theory. We should expect variability across disparate societies in vocal emotion according to culturally evolved pragmatic rules, and universals in vocal production and perception to the extent that form–function relationships are present.
Affiliation(s)
- Gregory A. Bryant
- Department of Communication, Center for Behavior, Evolution, and Culture, University of California, Los Angeles, USA
19
Artificial sounds following biological rules: A novel approach for non-verbal communication in HRI. Sci Rep 2020; 10:7080. PMID: 32341387; PMCID: PMC7184580; DOI: 10.1038/s41598-020-63504-8.
Abstract
Emotionally expressive non-verbal vocalizations can play a major role in human-robot interactions. Humans can assess the intensity and emotional valence of animal vocalizations based on simple acoustic features such as call length and fundamental frequency. These simple encoding rules are suggested to be general across terrestrial vertebrates. To test the degree of this generalizability, our aim was to synthesize a set of artificial sounds by systematically changing call length and fundamental frequency, and to examine how emotional valence and intensity are attributed to them by humans. Starting from sine wave sounds, we generated sound samples in seven categories of increasing complexity by incorporating different characteristics of animal vocalizations. We used an online questionnaire to measure the perceived emotional valence and intensity of the sounds in a two-dimensional model of emotions. The results show that sounds with low fundamental frequency and shorter call lengths were considered to have a more positive valence, and samples with high fundamental frequency were rated as more intense across all categories, regardless of sound complexity. We conclude that applying these basic rules of vocal emotion encoding can be a good starting point for the development of novel non-verbal vocalizations for artificial agents.
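A base-R sketch of generating such a stimulus grid by crossing fundamental frequency with call length (all values are illustrative; the study's seven categories added further layers of acoustic complexity on top of this).

```r
sr      <- 16000
f0s     <- c(200, 400, 600)   # fundamental frequencies, Hz
lengths <- c(0.1, 0.3, 0.6)   # call lengths, s
grid    <- expand.grid(f0 = f0s, dur = lengths)
stimuli <- lapply(seq_len(nrow(grid)), function(i) {
  t <- seq(0, grid$dur[i], by = 1/sr)
  sin(2*pi*grid$f0[i]*t)      # one sine-based call per grid cell
})
# Per the findings above, low-f0 and short calls should be rated as more
# positive, and high-f0 calls as more intense.
```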
20
Abstract
To ensure that listeners pay attention and do not habituate, emotionally intense vocalizations may be under evolutionary pressure to exploit processing biases in the auditory system by maximising their bottom-up salience. This "salience code" hypothesis was tested using 128 human nonverbal vocalizations representing eight emotions: amusement, anger, disgust, effort, fear, pain, pleasure, and sadness. As expected, within each emotion category salience ratings derived from pairwise comparisons strongly correlated with perceived emotion intensity. For example, while laughs as a class were less salient than screams of fear, salience scores almost perfectly explained the perceived intensity of both amusement and fear considered separately. Validating self-rated salience evaluations, high- vs. low-salience sounds caused 25% more recall errors in a short-term memory task, whereas emotion intensity had no independent effect on recall errors. Furthermore, the acoustic characteristics of salient vocalizations were similar to those previously described for non-emotional sounds (greater duration and intensity, high pitch, bright timbre, rapid modulations, and variable spectral characteristics), confirming that vocalizations were not salient merely because of their emotional content. The acoustic code in nonverbal communication is thus aligned with sensory biases, offering a general explanation for some non-arbitrary properties of human and animal high-arousal vocalizations.
Affiliation(s)
- Andrey Anikin
- Division of Cognitive Science, Lund University, Lund, Sweden
21
Anikin A. A Moan of Pleasure Should Be Breathy: The Effect of Voice Quality on the Meaning of Human Nonverbal Vocalizations. Phonetica 2020; 77:327-349. PMID: 31962309; PMCID: PMC7592904; DOI: 10.1159/000504855.
Abstract
Prosodic features, such as intonation and voice intensity, have a well-documented role in communicating emotion, but less is known about the role of laryngeal voice quality in speech and particularly in nonverbal vocalizations such as laughs and moans. Potentially, however, variations in voice quality between tense and breathy may convey rich information about the speaker's physiological and affective state. In this study breathiness was manipulated in synthetic human nonverbal vocalizations by adjusting the relative strength of upper harmonics and aspiration noise. In experiment 1 (28 prototypes × 3 manipulations = 84 sounds), otherwise identical vocalizations with tense versus breathy voice quality were associated with higher arousal (general alertness), higher dominance, and lower valence (unpleasant states). Ratings on discrete emotions in experiment 2 (56 × 3 = 168 sounds) confirmed that breathiness was reliably associated with positive emotions, particularly in ambiguous vocalizations (gasps and moans). The spectral centroid did not fully account for the effect of manipulation, confirming that the perceived change in voice quality was more specific than a general shift in timbral brightness. Breathiness is thus involved in communicating emotion with nonverbal vocalizations, possibly due to changes in low-level auditory salience and perceived vocal effort.
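The manipulation described above has two ingredients, weaker upper harmonics and added aspiration noise; the base-R sketch below builds a tense and a breathy version and computes the spectral centroid used as a control measure (all slopes and amplitudes are illustrative).

```r
sr <- 16000
t  <- seq(0, 0.5, length.out = sr/2)
f0 <- 220
make_voice <- function(rolloff_db, noise_amp) {
  # Sum 10 harmonics whose amplitudes fall off by rolloff_db per harmonic,
  # plus white noise standing in for aspiration.
  harmonics <- sapply(1:10, function(h)
    10^(rolloff_db * (h - 1) / 20) * sin(2*pi*f0*h*t))
  rowSums(harmonics) + noise_amp * rnorm(length(t))
}
tense   <- make_voice(rolloff_db = -3,  noise_amp = 0)
breathy <- make_voice(rolloff_db = -12, noise_amp = 0.3)

centroid <- function(x, sr) {            # spectral centroid, Hz
  spec  <- Mod(fft(x))[1:(length(x)/2)]
  freqs <- seq(0, sr/2, length.out = length(spec))
  sum(freqs * spec) / sum(spec)
}
centroid(tense, sr); centroid(breathy, sr)
```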
22
Smit I, Szabo D, Kubinyi E. Age-related positivity effect on behavioural responses of dogs to human vocalisations. Sci Rep 2019; 9:20201. PMID: 31882873; PMCID: PMC6934484; DOI: 10.1038/s41598-019-56636-z.
Abstract
Age-related changes in the brain can alter how emotions are processed. In humans, valence-specific changes in attention and memory have been reported with increasing age: older people are less attentive toward, and experience fewer, negative emotions, while processing of positive emotions remains intact. Little is yet known about this "positivity effect" in non-human animals. We tested young (n = 21, 1-5 years) and old (n = 19, >10 years) family dogs with positive (laugh), negative (cry), and neutral (hiccup, cough) human vocalisations and investigated age-related differences in their behavioural reactions. Only dogs with intact hearing were analysed, and the selected sound samples were balanced regarding mean and fundamental frequencies between valence categories. Compared to young dogs, old individuals reacted more slowly only to the negative sounds, and there was no significant difference in the duration of the reactions between groups. The selective response of the aged dogs to the sound stimuli suggests that the results cannot be explained by general cognitive and/or perceptual decline and supports the presence of an age-related positivity effect in dogs, too. Similarities in emotional processing between humans and dogs may imply analogous changes in subcortical emotional processing in the canine brain during ageing.
Affiliation(s)
- Iris Smit
- Department of Ethology, Eötvös Loránd University, Budapest, 1117, Hungary.
- HAS University of Applied Sciences, 's-Hertogenbosch, 5223DE, The Netherlands.
- Dora Szabo
- Department of Ethology, Eötvös Loránd University, Budapest, 1117, Hungary
- Enikő Kubinyi
- Department of Ethology, Eötvös Loránd University, Budapest, 1117, Hungary
23
Was That a Scream? Listener Agreement and Major Distinguishing Acoustic Features. J Nonverbal Behav 2019. DOI: 10.1007/s10919-019-00325-y.
24
The Jena Speaker Set (JESS)-A database of voice stimuli from unfamiliar young and old adult speakers. Behav Res Methods 2019; 52:990-1007. PMID: 31637667; DOI: 10.3758/s13428-019-01296-0.
Abstract
Here we describe the Jena Speaker Set (JESS), a free database for unfamiliar adult voice stimuli, comprising voices from 61 young (18-25 years) and 59 old (60-81 years) female and male speakers uttering various sentences, syllables, read text, semi-spontaneous speech, and vowels. Listeners rated two voice samples (short sentences) per speaker for attractiveness, likeability, two measures of distinctiveness ("deviation"-based [DEV] and "voice in the crowd"-based [VITC]), regional accent, and age. Interrater reliability was high, with Cronbach's α between .82 and .99. Young voices were generally rated as more attractive than old voices, but particularly so when male listeners judged female voices. Moreover, young female voices were rated as more likeable than both young male and old female voices. Young voices were judged to be less distinctive than old voices according to the DEV measure, with no differences in the VITC measure. In age ratings, listeners almost perfectly discriminated young from old voices; additionally, young female voices were perceived as being younger than young male voices. Correlations between the rating dimensions above demonstrated (among other things) that DEV-based distinctiveness was strongly negatively correlated with rated attractiveness and likeability. By contrast, VITC-based distinctiveness was uncorrelated with rated attractiveness and likeability in young voices, although a moderate negative correlation was observed for old voices. Overall, the present results demonstrate systematic effects of vocal age and gender on impressions based on the voice and inform as to the selection of suitable voice stimuli for further research into voice perception, learning, and memory.
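Cronbach's α, the interrater-reliability statistic reported above, is straightforward to compute directly; a base-R sketch on simulated ratings:

```r
cronbach_alpha <- function(x) {   # x: rows = stimuli, columns = raters
  k <- ncol(x)
  k / (k - 1) * (1 - sum(apply(x, 2, var)) / var(rowSums(x)))
}
set.seed(1)
true_score <- rnorm(50)                                   # 50 stimuli
ratings <- sapply(1:10, function(r) true_score + rnorm(50, sd = 0.4))
cronbach_alpha(ratings)   # high agreement across raters -> alpha near 1
```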
25
Abstract
Language is a cornerstone of human culture, yet the evolution of this cognitively demanding ability is shrouded in mystery. Studying how different species demonstrate this trait can provide clues to its evolutionary route. Indeed, recent decades saw ample scientific attempts to compare human speech, the prominent behavioral manifestation of language, with other animals' vocalizations. Diligent studies have found only elementary parallels to speech in other animals, fortifying the belief that language is uniquely human. But have we really tested this uniqueness claim? Surprisingly, a truly impartial comparison between human speech and other animals' vocalizations has hardly ever been conducted. Here, I illustrate how treating humans as an equal species in vocal-communication research would be expected to provide no evidence for human superiority in this realm. Thus, novel balanced and unbiased comparative studies are vital for identifying any unique component of human speech and language.
Affiliation(s)
- Yosef Prat
- School of Zoology, Faculty of Life Sciences, Tel-Aviv University
26
Cowen A, Sauter D, Tracy JL, Keltner D. Mapping the Passions: Toward a High-Dimensional Taxonomy of Emotional Experience and Expression. Psychol Sci Public Interest 2019; 20:69-90. PMID: 31313637; PMCID: PMC6675572; DOI: 10.1177/1529100619850176.
Abstract
What would a comprehensive atlas of human emotions include? For 50 years, scientists have sought to map emotion-related experience, expression, physiology, and recognition in terms of the "basic six": anger, disgust, fear, happiness, sadness, and surprise. Claims about the relationships between these six emotions and prototypical facial configurations have provided the basis for a long-standing debate over the diagnostic value of expression (for review and latest installment in this debate, see Barrett et al., p. 1). Building on recent empirical findings and methodologies, we offer an alternative conceptual and methodological approach that reveals a richer taxonomy of emotion. Dozens of distinct varieties of emotion are reliably distinguished by language, evoked in distinct circumstances, and perceived in distinct expressions of the face, body, and voice. Traditional models, both the basic six and the affective-circumplex model (valence and arousal), capture a fraction of the systematic variability in emotional response. In contrast, emotion-related responses (e.g., the smile of embarrassment, triumphant postures, sympathetic vocalizations, blends of distinct expressions) can be explained by richer models of emotion. Given these developments, we discuss why tests of a basic-six model of emotion are not tests of the diagnostic value of facial expression more generally. Determining the full extent of what facial expressions can tell us, marginally and in conjunction with other behavioral and contextual cues, will require mapping the high-dimensional, continuous space of facial, bodily, and vocal signals onto richly multifaceted experiences using large-scale statistical modeling and machine-learning methods.
Affiliation(s)
- Alan Cowen
- Department of Psychology, University of California, Berkeley
- Disa Sauter
- Faculty of Social and Behavioural Sciences, University of Amsterdam
- Dacher Keltner
- Department of Psychology, University of California, Berkeley
27
Engelberg JWM, Schwartz JW, Gouzoules H. Do human screams permit individual recognition? PeerJ 2019; 7:e7087. PMID: 31275746; PMCID: PMC6596410; DOI: 10.7717/peerj.7087.
Abstract
The recognition of individuals through vocalizations is a highly adaptive ability in the social behavior of many species, including humans. However, the extent to which nonlinguistic vocalizations such as screams permit individual recognition in humans remains unclear. Using a same-different vocalizer discrimination task, we investigated participants' ability to correctly identify whether pairs of screams were produced by the same person or two different people, a critical prerequisite to individual recognition. Despite prior theory-based contentions that screams are not acoustically well-suited to conveying identity cues, listeners discriminated individuals at above-chance levels by their screams, including both acoustically modified and unmodified exemplars. We found that vocalizer gender explained some variation in participants' discrimination abilities and response times, but participant attributes (gender, experience, empathy) did not. Our findings are consistent with abundant evidence from nonhuman primates, suggesting that both human and nonhuman screams convey cues to caller identity, thus supporting the thesis of evolutionary continuity in at least some aspects of scream function across primate species.
Affiliation(s)
- Jay W Schwartz
- Department of Psychology, Emory University, Atlanta, GA, USA
28
Abstract
Voice synthesis is a useful method for investigating the communicative role of different acoustic features. Although many text-to-speech systems are available, researchers of human nonverbal vocalizations and bioacousticians may profit from a dedicated simple tool for synthesizing and manipulating natural-sounding vocalizations. Soundgen (https://CRAN.R-project.org/package=soundgen) is an open-source R package that synthesizes nonverbal vocalizations based on meaningful acoustic parameters, which can be specified from the command line or in an interactive app. This tool was validated by comparing the perceived emotion, valence, arousal, and authenticity of 60 recorded human nonverbal vocalizations (screams, moans, laughs, and so on) and their approximate synthetic reproductions. Each synthetic sound was created by manually specifying only a small number of high-level control parameters, such as syllable length and a few anchors for the intonation contour. Nevertheless, the valence and arousal ratings of synthetic sounds were similar to those of the original recordings, and the authenticity ratings were comparable, maintaining parity with the originals for less complex vocalizations. Manipulating the precise acoustic characteristics of synthetic sounds may shed light on the salient predictors of emotion in the human voice. More generally, soundgen may prove useful for any studies that require precise control over the acoustic features of nonspeech sounds, including research on animal vocalizations and auditory perception.
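A minimal usage sketch, assuming the CRAN package is installed; sylLen (syllable length in ms) and pitch (intonation anchors in Hz) are documented high-level parameters, but exact argument names and defaults should be checked against the installed package version.

```r
# install.packages("soundgen")
library(soundgen)
moan <- soundgen(sylLen = 400,               # syllable length, ms
                 pitch = c(220, 300, 180))   # intonation anchors, Hz
# soundgen_app()   # the interactive app mentioned in the abstract
```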
Affiliation(s)
- Andrey Anikin
- Division of Cognitive Science, Department of Philosophy, Lund University, Box 192, SE-221 00, Lund, Sweden.
29
Anikin A. The perceptual effects of manipulating nonlinear phenomena in synthetic nonverbal vocalizations. BIOACOUSTICS 2019. [DOI: 10.1080/09524622.2019.1581839] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 10/27/2022]
Affiliation(s)
- Andrey Anikin
- Division of Cognitive Science, Department of Philosophy, Lund University, Lund, Sweden
30
Engelberg JW, Gouzoules H. The credibility of acted screams: Implications for emotional communication research. Q J Exp Psychol (Hove) 2018; 72:1889-1902. [PMID: 30514163 DOI: 10.1177/1747021818816307] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/15/2022]
Abstract
Researchers have long relied on acted material to study emotional expression and perception in humans. It has been suggested, however, that certain aspects of natural expressions are difficult or impossible to produce voluntarily outside of their associated emotional contexts, and that acted expressions tend to be overly intense caricatures. From an evolutionary perspective, listeners' abilities to distinguish acted from natural expressions likely depend on the type of expression in question, the costs entailed in its production, and elements of receiver psychology. Here, we investigated these issues as they relate to human screams. We also examined whether listeners' abilities to distinguish acted from natural screams might vary as a function of individual differences in emotional processing and empathy. Using a forced-choice categorisation task, we found that listeners could not distinguish acted from natural exemplars, suggesting that actors can produce dramatisations of screams resembling natural vocalisations. Intensity ratings did not differ between acted and natural screams, nor did individual differences in emotional processing significantly predict performance. Scream duration predicted both the probability that an exemplar was categorised as acted and the probability that participants classified that scream accurately. These findings are discussed with respect to potential evolutionary implications and their practical relevance to future research using acted screams.
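As an illustration of the duration finding, here is a sketch of the kind of logistic model that could relate scream duration to the probability of an "acted" judgement. The data are simulated and the effect size invented; this is not the study's analysis.

```python
# Hypothetical illustration (simulated data, not the study's):
# logistic regression of "acted" judgements on scream duration.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(1)
duration = rng.uniform(0.3, 2.5, 400)        # scream duration in seconds
# Simulate a weak positive effect of duration on "acted" judgements.
p = 1 / (1 + np.exp(-(-1.0 + 0.8 * duration)))
judged_acted = rng.binomial(1, p)

model = sm.Logit(judged_acted, sm.add_constant(duration)).fit()
print(model.summary())   # the slope estimates the duration effect (log-odds)
```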
31
Oliva M, Anikin A. Pupil dilation reflects the time course of emotion recognition in human vocalizations. Sci Rep 2018; 8:4871. [PMID: 29559673 PMCID: PMC5861097 DOI: 10.1038/s41598-018-23265-x] [Citation(s) in RCA: 46] [Impact Index Per Article: 7.7] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/08/2017] [Accepted: 03/08/2018] [Indexed: 11/13/2022] Open
Abstract
The processing of emotional signals usually causes an increase in pupil size, and this effect has been largely attributed to autonomic arousal prompted by the stimuli. Changes in pupil size have also been associated with decision making during non-emotional perceptual tasks. Therefore, in this study we investigated the relationship between pupil size fluctuations and the process of emotion recognition. Participants heard human nonverbal vocalizations (e.g., laughing, crying) and indicated the emotional state of the speakers as soon as they had identified it. The results showed that during emotion recognition, the time course of pupil response was driven by the decision-making process. In particular, peak pupil dilation betrayed the time of emotional selection. In addition, pupil response revealed properties of the decisions, such as the perceived emotional valence and the confidence in the assessment. Because pupil dilation (under isoluminance conditions) is almost exclusively promoted by norepinephrine (NE) release from the locus coeruleus (LC), the results suggest an important role of the LC-NE system during emotion processing.
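A minimal sketch of the key measurement, locating peak pupil dilation and its latency in a baseline-corrected trace, might look as follows; the trace, sampling rate, and baseline window are all invented for illustration.

```python
# A minimal sketch (synthetic trace): locate peak pupil dilation and
# its latency, the quantity linked above to the moment of selection.
import numpy as np

rng = np.random.default_rng(0)
sr = 60                                    # hypothetical eye-tracker Hz
t = np.arange(0, 4, 1 / sr)                # 4 s after stimulus onset
trace = (0.3 * np.exp(-((t - 1.8) ** 2) / 0.5)
         + 0.01 * rng.standard_normal(t.size))

baseline = trace[t < 0.2].mean()           # early baseline (first 200 ms)
dilation = trace - baseline
peak_idx = dilation.argmax()
print(f"peak dilation {dilation[peak_idx]:.3f} a.u. "
      f"at {t[peak_idx]:.2f} s after onset")
```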
Affiliation(s)
- Manuel Oliva
- Lund University, Cognitive Science, Lund, SE-22100, Sweden
- Andrey Anikin
- Lund University, Cognitive Science, Lund, SE-22100, Sweden
32
Anikin A, Lima CF. Perceptual and acoustic differences between authentic and acted nonverbal emotional vocalizations. Q J Exp Psychol (Hove) 2018; 71:622-641. [PMID: 27937389 DOI: 10.1080/17470218.2016.1270976] [Citation(s) in RCA: 17] [Impact Index Per Article: 2.8] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 10/20/2022]
Abstract
Most research on nonverbal emotional vocalizations is based on actor portrayals, but how similar are they to the vocalizations produced spontaneously in everyday life? Perceptual and acoustic differences have been discovered between spontaneous and volitional laughs, but little is known about other emotions. We compared 362 acted vocalizations from seven corpora with 427 authentic vocalizations using acoustic analysis, and 278 vocalizations (139 authentic and 139 acted) were also tested in a forced-choice authenticity detection task (N = 154 listeners). Target emotions were: achievement, amusement, anger, disgust, fear, pain, pleasure, and sadness. Listeners distinguished between authentic and acted vocalizations with accuracy levels above chance across all emotions (overall accuracy 65%). Accuracy was highest for vocalizations of achievement, anger, fear, and pleasure, which also displayed the largest differences in acoustic characteristics. In contrast, both perceptual and acoustic differences between authentic and acted vocalizations of amusement, disgust, and sadness were relatively small. Acoustic predictors of authenticity included higher and more variable pitch, lower harmonicity, and less regular temporal structure. The existence of perceptual and acoustic differences between authentic and acted vocalizations for all analysed emotions suggests that it may be useful to include spontaneous expressions in datasets for psychological research and affective computing.
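Two of the reported acoustic predictors, pitch level and pitch variability, can be estimated with off-the-shelf tools; a hedged sketch using librosa's pYIN tracker is below. The file name is a placeholder, and harmonicity (e.g., HNR) would require another tool such as Praat/parselmouth, so it is omitted.

```python
# Sketch of extracting two reported authenticity cues, mean pitch and
# pitch variability, with librosa's pYIN tracker.
# "vocalization.wav" is a placeholder file name.
import numpy as np
import librosa

y, sr = librosa.load("vocalization.wav", sr=None)
f0, voiced_flag, _ = librosa.pyin(y, fmin=75, fmax=1600, sr=sr)
f0 = f0[voiced_flag]           # keep voiced frames only

print(f"mean f0: {np.nanmean(f0):.1f} Hz, "
      f"f0 SD: {np.nanstd(f0):.1f} Hz")
```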
Affiliation(s)
- Andrey Anikin
- Division of Cognitive Science, Department of Philosophy, Lund University, Lund, Sweden
- César F Lima
- Institute of Cognitive Neuroscience, University College London, London, UK; Center for Psychology, University of Porto, Porto, Portugal; Instituto Universitário de Lisboa (ISCTE-IUL), Lisboa, Portugal
33
Anikin A, Bååth R, Persson T. Human Non-linguistic Vocal Repertoire: Call Types and Their Meaning. JOURNAL OF NONVERBAL BEHAVIOR 2017; 42:53-80. [PMID: 29497221 PMCID: PMC5816134 DOI: 10.1007/s10919-017-0267-y] [Citation(s) in RCA: 28] [Impact Index Per Article: 4.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/05/2022]
Abstract
Recent research on human nonverbal vocalizations has led to considerable progress in our understanding of vocal communication of emotion. However, in contrast to studies of animal vocalizations, this research has focused mainly on the emotional interpretation of such signals. The repertoire of human nonverbal vocalizations as acoustic types, and the mapping between acoustic and emotional categories, thus remain underexplored. In a cross-linguistic naming task (Experiment 1), verbal categorization of 132 authentic (non-acted) human vocalizations by English-, Swedish- and Russian-speaking participants revealed the same major acoustic types: laugh, cry, scream, moan, and possibly roar and sigh. The association between call type and perceived emotion was systematic but non-redundant: listeners associated every call type with a limited, but in some cases relatively wide, range of emotions. The speed and consistency of naming the call type predicted the speed and consistency of inferring the caller’s emotion, suggesting that acoustic and emotional categorizations are closely related. However, participants preferred to name the call type before naming the emotion. Furthermore, nonverbal categorization of the same stimuli in a triad classification task (Experiment 2) was more compatible with classification by call type than by emotion, indicating the former’s greater perceptual salience. These results suggest that acoustic categorization may precede attribution of emotion, highlighting the need to distinguish between the overt form of nonverbal signals and their interpretation by the perceiver. Both within- and between-call acoustic variation can then be modeled explicitly, bringing research on human nonverbal vocalizations more in line with the work on animal communication.
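The logic of the triad classification test in Experiment 2 can be captured in a few lines: for each triad, check whether the pair left grouped together shares a call type or an emotion. The toy stimuli and responses below are invented for illustration, not the study's data.

```python
# Toy illustration of the triad-test logic: does the grouping keep
# together the pair sharing a call type or the pair sharing an emotion?
triads = [
    # (call types of items A/B/C, emotions of A/B/C, index picked as odd one out)
    (("laugh", "laugh", "scream"), ("joy", "amusement", "joy"), 2),
    (("cry", "moan", "cry"), ("sadness", "sadness", "fear"), 1),
    (("scream", "roar", "scream"), ("fear", "anger", "anger"), 1),
]

by_call = by_emotion = 0
for calls, emotions, odd in triads:
    kept = [i for i in range(3) if i != odd]      # the pair left grouped
    by_call += calls[kept[0]] == calls[kept[1]]
    by_emotion += emotions[kept[0]] == emotions[kept[1]]

print(f"groupings consistent with call type: {by_call}/{len(triads)}, "
      f"with emotion: {by_emotion}/{len(triads)}")
```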
Affiliation(s)
- Andrey Anikin
- Division of Cognitive Science, Department of Philosophy, Lund University, Box 192, 221 00 Lund, Sweden
- Rasmus Bååth
- Division of Cognitive Science, Department of Philosophy, Lund University, Box 192, 221 00 Lund, Sweden
- Tomas Persson
- Division of Cognitive Science, Department of Philosophy, Lund University, Box 192, 221 00 Lund, Sweden