1
Payne B, Addlesee A, Rieser V, McGettigan C. Self-ownership, not self-production, modulates bias and agency over a synthesised voice. Cognition 2024; 248:105804. PMID: 38678806. DOI: 10.1016/j.cognition.2024.105804.
Abstract
Voices are fundamentally social stimuli, and their importance to the self may be underpinned by how far they can be used to express the self and achieve communicative goals. This paper examines how self-bias and agency over a synthesised voice are altered when that voice is used to represent the self in social interaction. To enable participants to use a new voice, a novel two-player game was created, in which participants communicated online using a text-to-speech (TTS) synthesised voice. We then measured self-bias and sense of agency attributed to this synthesised voice, comparing participants who had used their new voice to interact with another person (n = 44) to a control group of participants (n = 44) who had been only briefly exposed to the voices. We predicted that the new, synthesised self-voice would be more perceptually prioritised after it had been self-produced, and further that participants' sense of agency over the voice would be increased if they had experienced self-producing the voice, relative to those who only owned it. Contrary to these hypotheses, the results indicated that both experimental and control participants similarly prioritised the new synthesised voice and experienced a similar degree of agency over it, relative to voices owned by others. Critically, then, being able to produce the new voice in a social interaction modulated neither the bias towards it nor participants' sense of agency over it. These results suggest that merely having ownership over a new voice may be sufficient to generate a perceptual bias and a sense of agency over it.
Affiliation(s)
- Bryony Payne
- Department of Speech, Hearing and Phonetic Sciences, University College London, United Kingdom; Department of Psychology, King's College London, United Kingdom.
- Angus Addlesee
- School of Mathematical and Computer Sciences, Heriot-Watt University, United Kingdom
- Verena Rieser
- School of Mathematical and Computer Sciences, Heriot-Watt University, United Kingdom
- Carolyn McGettigan
- Department of Speech, Hearing and Phonetic Sciences, University College London, United Kingdom
2
Belyk M, Carignan C, McGettigan C. An open-source toolbox for measuring vocal tract shape from real-time magnetic resonance images. Behav Res Methods 2024; 56:2623-2635. PMID: 37507650. PMCID: PMC10990993. DOI: 10.3758/s13428-023-02171-9.
Abstract
Real-time magnetic resonance imaging (rtMRI) is a technique that provides high-contrast videographic data of human anatomy in motion. Applied to the vocal tract, it is a powerful method for capturing the dynamics of speech and other vocal behaviours by imaging structures internal to the mouth and throat. These images provide a means of studying the physiological basis for speech, singing, expressions of emotion, and swallowing that are otherwise not accessible for external observation. However, taking quantitative measurements from these images is notoriously difficult. We introduce a signal processing pipeline that produces outlines of the vocal tract from the lips to the larynx as a quantification of the dynamic morphology of the vocal tract. Our approach performs simple tissue classification constrained to a researcher-specified region of interest; this combination facilitates feature extraction while retaining the domain-specific expertise of a human analyst. We demonstrate that this pipeline generalises well across datasets covering behaviours such as speech, vocal size exaggeration, laughter, and whistling, and produces reliable outcomes across analysts, particularly among users with domain-specific expertise. With this article, we make this pipeline available for immediate use by the research community, and further suggest that it may contribute to the continued development of fully automated methods based on deep learning algorithms.
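The core idea of the pipeline, intensity-based tissue classification restricted to a researcher-specified region of interest, can be illustrated in a few lines. Below is a minimal sketch assuming a single rtMRI frame held as a NumPy array; the threshold, ROI, and function name are illustrative placeholders, not the toolbox's actual implementation.

```python
import numpy as np
from skimage.measure import find_contours

def vocal_tract_outline(frame, roi_mask, thresh):
    """Classify bright (tissue) vs dark (airway) pixels inside an ROI,
    then trace the tissue-airway boundary as contours."""
    tissue = np.logical_and(roi_mask, frame > thresh)  # classify only within the ROI
    # Contours at level 0.5 trace the boundary between tissue (1) and airway (0)
    return find_contours(tissue.astype(float), 0.5)

frame = np.random.rand(64, 64)                 # placeholder for one rtMRI frame
roi = np.zeros((64, 64), dtype=bool)
roi[20:50, 10:40] = True                       # analyst-drawn region of interest
outlines = vocal_tract_outline(frame, roi, 0.5)
```

Constraining classification to the ROI is what keeps the analyst's anatomical expertise in the loop while the thresholding itself stays fully automatic.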
Affiliation(s)
- Michel Belyk
- Department of Psychology, Edge Hill University, Ormskirk, UK.
- Christopher Carignan
- Department of Speech, Hearing and Phonetic Sciences, University College London, London, UK
- Carolyn McGettigan
- Department of Speech, Hearing and Phonetic Sciences, University College London, London, UK
3
Guldner S, Lavan N, Lally C, Wittmann L, Nees F, Flor H, McGettigan C. Human talkers change their voices to elicit specific trait percepts. Psychon Bull Rev 2024; 31:209-222. PMID: 37507647. PMCID: PMC10866754. DOI: 10.3758/s13423-023-02333-y.
Abstract
The voice is a variable and dynamic social tool with functional relevance for self-presentation, for example, during a job interview or courtship. Talkers adjust their voices flexibly to their situational or social environment. Here, we investigated how effectively intentional voice modulations can evoke trait impressions in listeners (Experiment 1), whether these trait impressions are recognizable (Experiment 2), and whether they meaningfully influence social interactions (Experiment 3). We recorded 40 healthy adult speakers whilst they spoke neutrally and whilst they produced vocal expressions of six social traits (e.g., likeability, confidence). Multivariate ratings from 40 listeners showed that vocal modulations amplified specific trait percepts (Experiments 1 and 2), which could be explained by two principal components relating to perceived affiliation and competence. Moreover, vocal modulations increased the likelihood of listeners choosing the voice to be suitable for corresponding social goals (i.e., a confident rather than likeable voice to negotiate a promotion; Experiment 3). These results indicate that talkers modulate their voice along a common trait space for social navigation. Moreover, beyond reactive voice changes, vocal behaviour can be strategically used by talkers to communicate subtle information about themselves to listeners. These findings advance our understanding of non-verbal vocal behaviour for social communication.
Affiliation(s)
- Stella Guldner
- Department of Child and Adolescent Psychiatry and Psychotherapy, Central Institute of Mental Health, Medical Faculty Mannheim, Heidelberg University, Mannheim, Germany.
- Institute of Cognitive and Clinical Neuroscience, Central Institute of Mental Health, Medical Faculty Mannheim, Heidelberg University, Mannheim, Germany.
- Nadine Lavan
- Department of Psychology, Queen Mary University of London, London, UK
- Clare Lally
- Department of Speech, Hearing and Phonetic Sciences, University College London, London, UK
- Lisa Wittmann
- Institute of Psychology, University of Regensburg, Regensburg, Germany
- Frauke Nees
- Institute of Medical Psychology and Medical Sociology, University Medical Centre Schleswig Holstein, Kiel University, Kiel, Germany
- Herta Flor
- Institute of Cognitive and Clinical Neuroscience, Central Institute of Mental Health, Medical Faculty Mannheim, Heidelberg University, Mannheim, Germany
- Carolyn McGettigan
- Department of Speech, Hearing and Phonetic Sciences, University College London, London, UK
4
Bradshaw AR, Lametti DR, Shiller DM, Jasmin K, Huang R, McGettigan C. Speech motor adaptation during synchronous and metronome-timed speech. J Exp Psychol Gen 2023; 152:3476-3489. PMID: 37616075. DOI: 10.1037/xge0001459.
Abstract
Sensorimotor integration during speech has been investigated by altering the sound of a speaker's voice in real time; in response, the speaker learns to change their production of speech sounds in order to compensate (adaptation). However, this line of research has been predominantly limited to very simple speaking contexts, typically involving (a) repetitive production of single words and (b) production of speech while alone, without the usual exposure to other voices. This study investigated adaptation to a real-time perturbation of the first and second formants during production of sentences either in synchrony with a prerecorded voice (synchronous speech group) or alone (solo speech group). Experiment 1 (n = 30) found no significant difference in the average magnitude of compensatory formant changes between the groups; however, synchronous speech resulted in increased between-individual variability in such formant changes. Participants also showed acoustic-phonetic convergence to the voice they were synchronizing with prior to introduction of the feedback alteration. Furthermore, the extent to which the changes required for convergence agreed with those required for adaptation was positively correlated with the magnitude of subsequent adaptation. Experiment 2 tested an additional group with a metronome-timed speech task (n = 15) and found a similar pattern of increased between-participant variability in formant changes. These findings demonstrate that speech motor adaptation can be measured robustly at the group level during performance of more complex speaking tasks; however, further work is needed to resolve whether self-voice adaptation and other-voice convergence reflect additive or interactive effects during sensorimotor control of speech.
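As a rough illustration of how compensatory adaptation is commonly quantified in such paradigms, the sketch below compares mean produced F1 before and during an upward F1 perturbation. The trial values are invented and the calculation is a generic simplification, not the study's analysis pipeline.

```python
import numpy as np

baseline_f1 = np.array([700.0, 705.0, 698.0, 702.0])  # pre-perturbation productions (Hz)
hold_f1     = np.array([660.0, 655.0, 650.0, 648.0])  # productions under an upward F1 shift

# Adaptation is the change from baseline that opposes the perturbation:
# under an upward shift, speakers compensate by lowering produced F1.
adaptation = baseline_f1.mean() - hold_f1.mean()
print(f"Compensatory F1 change: {adaptation:.1f} Hz")  # positive = opposing the shift
```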
Affiliation(s)
- Abigail R Bradshaw
- Department of Speech, Hearing and Phonetic Sciences, University College London
- Douglas M Shiller
- School of Speech-Language Pathology and Audiology, Université de Montréal
- Kyle Jasmin
- Department of Psychology, Royal Holloway, University of London
- Ruiling Huang
- Department of Speech, Hearing and Phonetic Sciences, University College London
- Carolyn McGettigan
- Department of Speech, Hearing and Phonetic Sciences, University College London
5
Patel B, Zhang Z, McGettigan C, Belyk M. Speech With Pauses Sounds Deceptive to Listeners With and Without Hearing Impairment. J Speech Lang Hear Res 2023; 66:3735-3744. PMID: 37672786. DOI: 10.1044/2023_jslhr-22-00618.
Abstract
PURPOSE: Communication is as much persuasion as it is the transfer of information. This creates a tension between the interests of the speaker and those of the listener, as dishonest speakers naturally attempt to hide deceptive speech and listeners are faced with the challenge of sorting truths from lies. Listeners with hearing impairment in particular may have differing levels of access to the acoustical cues that give away deceptive speech. A greater tendency toward speech pauses has been hypothesized to result from the cognitive demands of lying convincingly. Higher vocal pitch has also been hypothesized to mark the increased anxiety of a dishonest speaker.
METHOD: Listeners with or without hearing impairments heard short utterances from natural conversations, some of which had been digitally manipulated to contain either increased pausing or raised vocal pitch. Listeners were asked to guess whether each statement was a lie in a two-alternative forced-choice task. Participants were also asked explicitly which cues they believed had influenced their decisions.
RESULTS: Statements were more likely to be perceived as a lie when they contained pauses, but not when vocal pitch was raised. This pattern held regardless of hearing ability. In contrast, both groups of listeners self-reported using vocal pitch cues to identify deceptive statements, though at lower rates than pauses.
CONCLUSIONS: Listeners may have only partial awareness of the cues that influence their impression of dishonesty. Listeners with hearing impairment may place greater weight on acoustical cues according to the differing degrees of access provided by hearing aids.
SUPPLEMENTAL MATERIAL: https://doi.org/10.23641/asha.24052446
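For illustration, one of the two stimulus manipulations described here, raising vocal pitch, can be approximated with an off-the-shelf phase-vocoder pitch shift. This is a generic sketch with placeholder file names, not necessarily the tool or settings the authors used.

```python
import librosa
import soundfile as sf

y, sr = librosa.load("utterance.wav", sr=None)           # hypothetical input file
y_up = librosa.effects.pitch_shift(y, sr=sr, n_steps=2)  # raise pitch by 2 semitones
sf.write("utterance_highpitch.wav", y_up, sr)
```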
Affiliation(s)
- Bindiya Patel
- Department of Audiological Sciences, University College London, United Kingdom
- Ziyun Zhang
- Department of Speech, Hearing and Phonetic Sciences, University College London, United Kingdom
- Carolyn McGettigan
- Department of Speech, Hearing and Phonetic Sciences, University College London, United Kingdom
- Michel Belyk
- Department of Psychology, Edge Hill University, Ormskirk, United Kingdom
6
Abstract
When hearing a voice, listeners can form a detailed impression of the person behind the voice. Existing models of voice processing focus primarily on one aspect of person perception - identity recognition from familiar voices - but do not account for the perception of other person characteristics (e.g., sex, age, personality traits). Here, we present a broader perspective, proposing that listeners have a common perceptual goal of perceiving who they are hearing, whether the voice is familiar or unfamiliar. We outline and discuss a model - the Person Perception from Voices (PPV) model - that achieves this goal via a common mechanism of recognising a familiar person, persona, or set of speaker characteristics. Our PPV model aims to provide a more comprehensive account of how listeners perceive the person they are listening to, using an approach that incorporates and builds on aspects of the hierarchical frameworks and prototype-based mechanisms proposed within existing models of voice identity recognition.
Affiliation(s)
- Nadine Lavan
- Department of Experimental and Biological Psychology, Queen Mary University of London, London, UK
- Carolyn McGettigan
- Department of Speech, Hearing, and Phonetic Sciences, University College London, London, UK
7
Wang H, Chen R, Yan Y, McGettigan C, Rosen S, Adank P. Perceptual Learning of Noise-Vocoded Speech Under Divided Attention. Trends Hear 2023; 27:23312165231192297. PMID: 37547940. PMCID: PMC10408355. DOI: 10.1177/23312165231192297.
Abstract
Speech perception performance for degraded speech can improve with practice or exposure. Such perceptual learning is thought to be reliant on attention, and theoretical accounts like the predictive coding framework suggest a key role for attention in supporting learning. However, it is unclear whether speech perceptual learning requires undivided attention. We evaluated the role of divided attention in speech perceptual learning in two online experiments (N = 336). Experiment 1 tested the reliance of perceptual learning on undivided attention. Participants completed a speech recognition task where they repeated forty noise-vocoded sentences in a between-group design. Participants performed the speech task alone or concurrently with a domain-general visual task (dual task) at one of three difficulty levels. We observed perceptual learning under divided attention for all four groups, moderated by dual-task difficulty. Listeners in the easy and intermediate visual conditions improved as much as the single-task group. Those who completed the most challenging visual task showed faster learning and achieved similar ending performance compared to the single-task group. Experiment 2 tested whether learning relies on domain-specific or domain-general processes. Participants completed a single speech task or performed this task together with a dual task aiming to recruit domain-specific (lexical or phonological) or domain-general (visual) processes. All secondary task conditions produced patterns and amounts of learning comparable to the single speech task. Our results demonstrate that the impact of divided attention on perceptual learning is not strictly dependent on domain-general or domain-specific processes, and that speech perceptual learning persists under divided attention.
Affiliation(s)
- Han Wang
- Department of Speech, Hearing and Phonetic Sciences, University College London, London, UK
- Rongru Chen
- Department of Speech, Hearing and Phonetic Sciences, University College London, London, UK
- Yu Yan
- Department of Speech, Hearing and Phonetic Sciences, University College London, London, UK
- Carolyn McGettigan
- Department of Speech, Hearing and Phonetic Sciences, University College London, London, UK
- Stuart Rosen
- Department of Speech, Hearing and Phonetic Sciences, University College London, London, UK
- Patti Adank
- Department of Speech, Hearing and Phonetic Sciences, University College London, London, UK
8
Belyk M, McGettigan C. Real-time magnetic resonance imaging reveals distinct vocal tract configurations during spontaneous and volitional laughter. Philos Trans R Soc Lond B Biol Sci 2022; 377:20210511. PMID: 36126659. PMCID: PMC9489295. DOI: 10.1098/rstb.2021.0511.
Abstract
A substantial body of acoustic and behavioural evidence points to the existence of two broad categories of laughter in humans: spontaneous laughter that is emotionally genuine and somewhat involuntary, and volitional laughter that is produced on demand. In this study, we tested the hypothesis that these are also physiologically distinct vocalizations, by measuring and comparing them using real-time magnetic resonance imaging (rtMRI) of the vocal tract. Following Ruch and Ekman (2001, in Emotions, Qualia, and Consciousness, ed. A. Kaszniak, pp. 426-443), we further predicted that spontaneous laughter should be relatively less speech-like (i.e. less articulate) than volitional laughter. We collected rtMRI data from five adult human participants during spontaneous laughter, volitional laughter and spoken vowels. We report distinguishable vocal tract shapes during the vocalic portions of these three vocalization types, where volitional laughs were intermediate between spontaneous laughs and vowels. Inspection of local features within the vocal tract across the different vocalization types offers some additional support for Ruch and Ekman's predictions. We discuss our findings in light of a dual pathway hypothesis for the neural control of human volitional and spontaneous vocal behaviours, identifying tongue shape and velum lowering as potential biomarkers of spontaneous laughter to be investigated in future research. This article is part of the theme issue 'Cracking the laugh code: laughter through the lens of biology, psychology and neuroscience'.
Affiliation(s)
- Michel Belyk
- Department of Psychology, Edge Hill University, Ormskirk L39 4QP, UK
- Department of Speech, Hearing and Phonetic Sciences, University College London, London WC1N 1PF, UK
- Carolyn McGettigan
- Department of Speech, Hearing and Phonetic Sciences, University College London, London WC1N 1PF, UK
9
O'Neill C, Burke D, McGettigan C, Purcell C, Kunzmann A, Eatock M. P-190 The Geriatric 8 score is associated with risk of hospitalisation and 6-month survival in patients with incurable pancreatic cancer receiving gemcitabine and capecitabine. Ann Oncol 2022. DOI: 10.1016/j.annonc.2022.04.280.
10
Zhang Z, McGettigan C, Belyk M. Speech timing cues reveal deceptive speech in social deduction board games. PLoS One 2022; 17:e0263852. PMID: 35148352. PMCID: PMC8836341. DOI: 10.1371/journal.pone.0263852.
Abstract
The faculty of language allows humans to state falsehoods in their choice of words. However, while what is said might easily uphold a lie, how it is said may reveal deception. Hence, some features of the voice that are difficult for liars to control may keep speech mostly, if not always, honest. Previous research has identified that speech timing and voice pitch cues can predict the truthfulness of speech, but this evidence has come primarily from laboratory experiments, which sacrifice ecological validity for experimental control. We obtained ecologically valid recordings of deceptive speech while observing natural utterances from players of a popular social deduction board game, in which players are assigned roles that either induce honest or dishonest interactions. When speakers chose to lie, they were prone to longer and more frequent pauses in their speech. This finding is in line with theoretical predictions that lying is more cognitively demanding. However, lying was not reliably associated with vocal pitch. This contradicts predictions that increased physiological arousal from lying might increase muscular tension in the larynx, but is consistent with human specialisations that grant Homo sapiens sapiens an unusual degree of control over the voice relative to other primates. The present study demonstrates the utility of social deduction board games as a means of making naturalistic observations of human behaviour from semi-structured social interactions.
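Speech timing cues of the kind analysed here, pause frequency and duration, can be extracted with simple energy-based silence detection. The sketch below shows one generic approach; the frame length, hop, silence threshold, and minimum pause duration are illustrative assumptions, not the authors' settings.

```python
import numpy as np

def silent_frames(signal, sr, frame_ms=25, hop_ms=10, db_floor=-40.0):
    """Flag low-energy frames as candidate pause frames (True = silent)."""
    frame = int(sr * frame_ms / 1000)
    hop = int(sr * hop_ms / 1000)
    n = max(0, 1 + (len(signal) - frame) // hop)
    rms = np.array([np.sqrt(np.mean(signal[i * hop:i * hop + frame] ** 2))
                    for i in range(n)])
    db = 20 * np.log10(rms / (rms.max() + 1e-12) + 1e-12)  # dB relative to peak
    return db < db_floor

def pause_stats(silent, hop_ms=10, min_ms=200):
    """Count pauses (silent runs of at least min_ms) and their total duration."""
    runs, length = [], 0
    for s in silent:
        if s:
            length += 1
        elif length:
            runs.append(length)
            length = 0
    if length:
        runs.append(length)
    durations = [r * hop_ms for r in runs if r * hop_ms >= min_ms]
    return len(durations), sum(durations)  # number of pauses, total pause time (ms)
```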
Affiliation(s)
- Ziyun Zhang
- Department of Speech, Hearing and Phonetic Sciences, University College London, London, United Kingdom
- Carolyn McGettigan
- Department of Speech, Hearing and Phonetic Sciences, University College London, London, United Kingdom
- Michel Belyk
- Department of Speech, Hearing and Phonetic Sciences, University College London, London, United Kingdom
- Department of Psychology, Edge Hill University, Ormskirk, United Kingdom
11
Abstract
Humans are vocal modulators par excellence. This ability is supported in part by the dual representation of the laryngeal muscles in the motor cortex. Movement, however, is not the product of motor cortex alone but of a broader motor network. This network consists of brain regions that contain somatotopic maps that parallel the organization in motor cortex. We therefore present a novel hypothesis that the dual laryngeal representation is repeated throughout the broader motor network. In support of the hypothesis, we review existing literature that demonstrates the existence of network-wide somatotopy and present initial evidence for the hypothesis' plausibility. Understanding how this uniquely human phenotype in motor cortex interacts with broader brain networks is an important step toward understanding how humans evolved the ability to speak. We further suggest that this system may provide a means to study how individual components of the nervous system evolved within the context of neuronal networks. This article is part of the theme issue 'Voice modulation: from origin and mechanism to social impact (Part I)'.
Affiliation(s)
- Michel Belyk
- Department of Speech, Hearing and Phonetic Sciences, University College London, London WC1N 1PJ, UK
- Department of Psychology, Edge Hill University, Ormskirk L39 4QP, UK
- Nicole Eichert
- Wellcome Centre for Integrative Neuroimaging, Centre for Functional MRI of the Brain (FMRIB), Nuffield Department of Clinical Neurosciences, John Radcliffe Hospital, University of Oxford, Oxford OX3 9DU, UK
- Carolyn McGettigan
- Department of Speech, Hearing and Phonetic Sciences, University College London, London WC1N 1PJ, UK
12
Waters S, Kanber E, Lavan N, Belyk M, Carey D, Cartei V, Lally C, Miquel M, McGettigan C. Singers show enhanced performance and neural representation of vocal imitation. Philos Trans R Soc Lond B Biol Sci 2021; 376:20200399. PMID: 34719245. PMCID: PMC8558773. DOI: 10.1098/rstb.2020.0399.
Abstract
Humans have a remarkable capacity to finely control the muscles of the larynx, via distinct patterns of cortical topography and innervation that may underpin our sophisticated vocal capabilities compared with non-human primates. Here, we investigated the behavioural and neural correlates of laryngeal control, and their relationship to vocal expertise, using an imitation task that required adjustments of larynx musculature during speech. Highly trained human singers and non-singer control participants modulated voice pitch and vocal tract length (VTL) to mimic auditory speech targets, while undergoing real-time anatomical scans of the vocal tract and functional scans of brain activity. Multivariate analyses of speech acoustics, larynx movements and brain activation data were used to quantify vocal modulation behaviour and to search for neural representations of the two modulated vocal parameters during the preparation and execution of speech. We found that singers showed more accurate task-relevant modulations of speech pitch and VTL (i.e. larynx height, as measured with vocal tract MRI) during speech imitation; this was accompanied by stronger representation of VTL within a region of the right somatosensory cortex. Our findings suggest a common neural basis for enhanced vocal control in speech and song. This article is part of the theme issue 'Voice modulation: from origin and mechanism to social impact (Part I)'.
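While VTL was measured here directly from vocal tract MRI, a common acoustic proxy estimates apparent VTL from formant spacing under a uniform-tube approximation, VTL = (2k - 1) * c / (4 * Fk), averaged over formants k. The sketch below applies this textbook formula as an illustration of the parameter being modulated; it is not the study's MRI-based measurement.

```python
import numpy as np

def apparent_vtl(formants_hz, c=35000.0):
    """Estimate apparent vocal tract length (cm) from formant frequencies,
    assuming a uniform tube closed at the glottis; c = speed of sound (cm/s)."""
    ks = np.arange(1, len(formants_hz) + 1)
    return float(np.mean((2 * ks - 1) * c / (4 * np.asarray(formants_hz))))

print(f"{apparent_vtl([500, 1500, 2500]):.1f} cm")  # 17.5 cm for an idealised adult male
```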
Affiliation(s)
- Sheena Waters
- Department of Psychology, Royal Holloway, University of London, Egham TW20 0EX, UK
- Wolfson Institute of Preventive Medicine, Barts and The London School of Medicine and Dentistry, Charterhouse Square, London EC1M 6BQ, UK
- Elise Kanber
- Department of Psychology, Royal Holloway, University of London, Egham TW20 0EX, UK
- Speech, Hearing and Phonetic Sciences, University College London, 2 Wakefield Street, London WC1N 1PF, UK
- Nadine Lavan
- Speech, Hearing and Phonetic Sciences, University College London, 2 Wakefield Street, London WC1N 1PF, UK
- Department of Biological and Experimental Psychology, Queen Mary University of London, Mile End Road, Bethnal Green, London E1 4NS, UK
- Michel Belyk
- Speech, Hearing and Phonetic Sciences, University College London, 2 Wakefield Street, London WC1N 1PF, UK
- Daniel Carey
- Department of Psychology, Royal Holloway, University of London, Egham TW20 0EX, UK
- Data & AI, Novartis Pharmaceuticals, Novartis Global Service Center, 203 Merrion Road, Dublin 4 D04 NN12, Ireland
- Valentina Cartei
- Equipe de Neuro-Ethologie Sensorielle (ENES), Centre de Recherche en Neurosciences de Lyon, Université de Lyon/Saint-Etienne, 21 rue du Docteur Paul Michelon, 42100 Saint-Etienne, France
- Department of Psychology, Institute of Education, Health and Social Sciences, University of Chichester, College Lane, Chichester, West Sussex PO19 6PE, UK
- Clare Lally
- Department of Psychology, Royal Holloway, University of London, Egham TW20 0EX, UK
- Speech, Hearing and Phonetic Sciences, University College London, 2 Wakefield Street, London WC1N 1PF, UK
- Marc Miquel
- Department of Clinical Physics, Barts Health NHS Trust, London EC1A 7BE, UK
- William Harvey Research Institute, Queen Mary University of London, London EC1M 6BQ, UK
- Carolyn McGettigan
- Department of Psychology, Royal Holloway, University of London, Egham TW20 0EX, UK
- Speech, Hearing and Phonetic Sciences, University College London, 2 Wakefield Street, London WC1N 1PF, UK
13
Abstract
Joint speech behaviours where speakers produce speech in unison are found in a variety of everyday settings, and have clinical relevance as a temporary fluency-enhancing technique for people who stutter. It is currently unknown whether such synchronisation of speech timing among two speakers is also accompanied by alignment in their vocal characteristics, for example in acoustic measures such as pitch. The current study investigated this by testing whether convergence in voice fundamental frequency (F0) between speakers could be demonstrated during synchronous speech. Sixty participants across two online experiments were audio recorded whilst reading a series of sentences, first on their own, and then in synchrony with another speaker (the accompanist) in a number of between-subject conditions. Experiment 1 demonstrated significant convergence in participants' F0 to a pre-recorded accompanist voice, in the form of both upward (high F0 accompanist condition) and downward (low and extra-low F0 accompanist conditions) changes in F0. Experiment 2 demonstrated that such convergence was not seen during a visual synchronous speech condition, in which participants spoke in synchrony with silent video recordings of the accompanist. An audiovisual condition in which participants were able to both see and hear the accompanist in pre-recorded videos did not result in greater convergence in F0 compared to synchronisation with the pre-recorded voice alone. These findings suggest the need for models of speech motor control to incorporate interactions between self- and other-speech feedback during speech production, and suggest a novel hypothesis for the mechanisms underlying the fluency-enhancing effects of synchronous speech in people who stutter.
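As an illustration of how F0 convergence of this kind can be quantified, the sketch below extracts median F0 with Praat (via the parselmouth package) and signs the solo-to-synchronous shift relative to the accompanist. File names are placeholders, and this simple measure is an assumption rather than the paper's exact analysis.

```python
import numpy as np
import parselmouth  # Python interface to Praat

def median_f0(wav_path):
    """Median F0 (Hz) over voiced frames of a recording."""
    pitch = parselmouth.Sound(wav_path).to_pitch()
    f0 = pitch.selected_array['frequency']
    return float(np.median(f0[f0 > 0]))  # 0 Hz marks unvoiced frames

solo = median_f0("participant_solo.wav")  # hypothetical file names
synced = median_f0("participant_sync.wav")
accompanist = median_f0("accompanist.wav")

# Positive values indicate movement toward the accompanist's F0
convergence = (synced - solo) * np.sign(accompanist - solo)
```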
Affiliation(s)
- Abigail R. Bradshaw
- Department of Speech, Hearing & Phonetic Sciences, University College London, London, United Kingdom
- Carolyn McGettigan
- Department of Speech, Hearing & Phonetic Sciences, University College London, London, United Kingdom
14
Abstract
Previous research suggests that familiarity with a voice can afford benefits for voice and speech perception. However, even familiar voice perception has been reported to be error-prone, especially in the face of challenges such as reduced verbal cues and acoustic distortions. It has been hypothesized that such findings may arise due to listeners not being "familiar enough" with the voices used in laboratory studies, and thus being inexperienced with their full vocal repertoire. Extending this idea, voice perception based on highly familiar voices - acquired via substantial, naturalistic experience - should therefore be more robust than voice perception from less familiar voices. We investigated this proposal by contrasting voice perception of personally familiar voices (participants' romantic partners) versus lab-trained voices in challenging experimental tasks. Specifically, we tested how differences in familiarity may affect voice-identity perception from nonverbal vocalizations and acoustically modulated speech. Large benefits for the personally familiar voice over a less familiar, lab-trained voice were found for identity recognition, with listeners displaying both highly accurate yet more conservative recognition of personally familiar voices. However, no familiar-voice benefits were found for speech perception in background noise. Our findings suggest that listeners have fine-tuned representations of highly familiar voices that result in more robust and accurate voice recognition despite challenging listening contexts, yet these advantages may not always extend to speech perception. We conclude that familiarity with voices is indeed on a continuum, with identity perception for personally familiar voices being highly accurate.
Affiliation(s)
- Elise Kanber
- Department of Speech, Hearing and Phonetic Sciences, University College London
- Nadine Lavan
- Department of Speech, Hearing and Phonetic Sciences, University College London
- Carolyn McGettigan
- Department of Speech, Hearing and Phonetic Sciences, University College London
15
Bradshaw AR, Lametti DR, McGettigan C. The Role of Sensory Feedback in Developmental Stuttering: A Review. Neurobiol Lang (Camb) 2021; 2:308-334. PMID: 37216145. PMCID: PMC10158644. DOI: 10.1162/nol_a_00036.
Abstract
Developmental stuttering is a neurodevelopmental disorder that severely affects speech fluency. Multiple lines of evidence point to a role of sensory feedback in the disorder; this has led to a number of theories proposing different disruptions to the use of sensory feedback during speech motor control in people who stutter. The purpose of this review was to bring together evidence from studies using altered auditory feedback paradigms with people who stutter, in order to evaluate the predictions of these different theories. This review highlights converging evidence for particular patterns of differences in the responses of people who stutter to feedback perturbations. The implications for hypotheses on the nature of the disruption to sensorimotor control of speech in the disorder are discussed, with reference to neurocomputational models of speech control (predominantly, the DIVA model; Guenther et al., 2006; Tourville et al., 2008). While some consistent patterns are emerging from this evidence, it is clear that more work in this area is needed with developmental samples in particular, in order to tease apart differences related to symptom onset from those related to compensatory strategies that develop with experience of stuttering.
Affiliation(s)
- Abigail R. Bradshaw
- Department of Speech, Hearing & Phonetic Sciences, University College London, UK
- Carolyn McGettigan
- Department of Speech, Hearing & Phonetic Sciences, University College London, UK
16
Abstract
When presented with voices, we make rapid, automatic judgements of social traits such as trustworthiness—and such judgements are highly consistent across listeners. However, it remains unclear whether voice-based first impressions actually influence behaviour towards a voice’s owner, and—if they do—whether and how they interact over time with the voice owner’s observed actions to further influence the listener’s behaviour. This study used an investment game paradigm to investigate (1) whether voices judged to differ in relevant social traits accrued different levels of investment and/or (2) whether first impressions of the voices interacted with the behaviour of their apparent owners to influence investments over time. Results show that participants were responding to their partner’s behaviour. Crucially, however, there were no effects of voice. These findings suggest that, at least under some conditions, social traits perceived from the voice alone may not influence trusting behaviours in the context of a virtual interaction.
Affiliation(s)
- Sarah Knight
- Department of Psychology, University of York, York, UK
- Speech, Hearing and Phonetic Sciences, University College London, London, UK
- Department of Psychology, Royal Holloway, University of London, London, UK
- Nadine Lavan
- Speech, Hearing and Phonetic Sciences, University College London, London, UK
- Department of Psychology, Royal Holloway, University of London, London, UK
- Ilaria Torre
- Division of Robotics, Perception and Learning, KTH Royal Institute of Technology, Stockholm, Sweden
- Carolyn McGettigan
- Speech, Hearing and Phonetic Sciences, University College London, London, UK
- Department of Psychology, Royal Holloway, University of London, London, UK
17
Lavan N, Mileva M, Burton AM, Young AW, McGettigan C. Trait evaluations of faces and voices: Comparing within- and between-person variability. J Exp Psychol Gen 2021; 150:1854-1869. PMID: 33734774. DOI: 10.1037/xge0001019.
Abstract
Human faces and voices are rich sources of information that can vary in many different ways. Most of the literature on face/voice perception has focused on understanding how people look and sound different to each other (between-person variability). However, recent studies highlight the ways in which the same person can look and sound different on different occasions (within-person variability). Across three experiments, we examined how within- and between-person variability relate to one another for social trait impressions by collecting trait ratings attributed to multiple face images and voice recordings of the same people. We find that within-person variability in social trait evaluations is at least as great as between-person variability. Using different stimulus sets across experiments, trait impressions of voices are consistently more variable within people than between people - a pattern that is only evident occasionally when judging faces. Our findings highlight the importance of understanding within-person variability, showing how judgments of the same person can vary widely on different encounters and quantifying how this pattern differs for voice and face perception. The work consequently has implications for theoretical models proposing that voices can be considered "auditory faces" and imposes limitations on the "kernel of truth" hypothesis of trait evaluations.
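The central comparison, within- versus between-person variability in trait ratings, can be illustrated with a toy variance decomposition. The data below are invented and the decomposition is a simplification of the paper's analyses.

```python
import numpy as np

# person -> mean trustworthiness rating for each of several stimuli of that person
ratings = {
    "A": [4.1, 2.8, 5.0, 3.3],
    "B": [3.9, 4.2, 3.7, 4.0],
    "C": [2.2, 3.1, 1.9, 2.6],
}

within = np.mean([np.var(v, ddof=1) for v in ratings.values()])   # across stimuli, same person
between = np.var([np.mean(v) for v in ratings.values()], ddof=1)  # across person averages
print(f"within-person variance: {within:.2f}, between-person variance: {between:.2f}")
```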
Affiliation(s)
- Nadine Lavan
- Department of Speech, Hearing and Phonetic Sciences
18
Bradshaw AR, McGettigan C. Instrumental learning in social interactions: Trait learning from faces and voices. Q J Exp Psychol (Hove) 2021; 74:1344-1359. PMID: 33596727. PMCID: PMC8261770. DOI: 10.1177/1747021821999663.
Abstract
Recent research suggests that reinforcement learning may underlie trait formation in social interactions with faces. The current study investigated whether the same learning mechanisms could be engaged for trait learning from voices. On each trial of a training phase, participants (N = 192) chose from pairs of human or slot machine targets that varied in the (1) reward value and (2) generosity of their payouts. Targets were either auditory (voices or tones; Experiment 1) or visual (faces or icons; Experiment 2) and were presented sequentially before payout feedback. A test phase measured participant choice behaviour, and a post-test recorded their target preference ratings. For auditory targets, we found a significant effect of reward only on target choices, but saw higher preference ratings for more generous humans and slot machines. For visual targets, findings from previous studies were replicated: participants learned about both generosity and reward, but generosity was prioritised in the human condition. These findings provide one of the first demonstrations of reinforcement learning of reward with auditory stimuli in a social learning task, but suggest that the use of auditory targets does alter learning in this paradigm. Conversely, reinforcement learning of reward and trait information with visual stimuli remains intact even when sequential presentation introduces a delay in feedback.
Affiliation(s)
- Abigail R Bradshaw
- Department of Speech, Hearing and Phonetic Sciences, University College London, London, UK
- Department of Experimental Psychology, University of Oxford, Oxford, UK
- Carolyn McGettigan
- Department of Speech, Hearing and Phonetic Sciences, University College London, London, UK
19
Garrido L, Tsantani M, Storrs K, McGettigan C, Kriegeskorte N. Distinct identity information encoded in FFA and OFA. J Vis 2020. DOI: 10.1167/jov.20.11.536.
20
Payne B, Lavan N, Knight S, McGettigan C. Perceptual prioritization of self-associated voices. Br J Psychol 2020; 112:585-610. PMID: 33068323. DOI: 10.1111/bjop.12479.
Abstract
Information associated with the self is prioritized relative to information associated with others and is therefore processed more quickly and accurately. Across three experiments, we examined whether a new externally-generated voice could become associated with the self and thus be prioritized in perception. In the first experiment, participants learned associations between three unfamiliar voices and three identities (self, friend, stranger). Participants then made speeded judgements of whether voice-identity pairs were correctly matched, or not. A clear self-prioritization effect was found, with participants showing quicker and more accurate responses to the newly self-associated voice relative to either the friend or stranger voice. In two further experiments, we tested whether this prioritization effect increased if the self-voice was gender-matched to the identity of the participant (Experiment 2) or if the self-voice was chosen by the participant (Experiment 3). Gender-matching did not significantly influence prioritization; the self-voice was prioritized similarly whether or not it matched the gender identity of the listener. However, we observed that choosing the self-voice did interact with prioritization (Experiment 3): the self-voice became more prominent, via lesser prioritization of the other identities, when it was chosen relative to when it was not. Our findings have implications for the design and selection of individuated synthetic voices used for assistive communication devices, suggesting that agency in choosing a new vocal identity may modulate the distinctiveness of that voice relative to others.
Affiliation(s)
- Bryony Payne
- Department of Speech, Hearing and Phonetic Sciences, University College London, UK
- Nadine Lavan
- Department of Speech, Hearing and Phonetic Sciences, University College London, UK
- Department of Psychology, Queen Mary University of London, UK
- Sarah Knight
- Department of Speech, Hearing and Phonetic Sciences, University College London, UK
- Department of Psychology, University of York, UK
- Carolyn McGettigan
- Department of Speech, Hearing and Phonetic Sciences, University College London, UK
21
Johnson J, McGettigan C, Lavan N. Comparing unfamiliar voice and face identity perception using identity sorting tasks. Q J Exp Psychol (Hove) 2020; 73:1537-1545. PMID: 32530364. PMCID: PMC7534197. DOI: 10.1177/1747021820938659.
Abstract
Identity sorting tasks, in which participants sort multiple naturally varying stimuli of usually two identities into perceived identities, have recently gained popularity in voice and face processing research. In both modalities, participants who are unfamiliar with the identities tend to perceive multiple stimuli of the same identity as different people and thus fail to "tell people together." These similarities across modalities suggest that modality-general mechanisms may underpin sorting behaviour. In this study, participants completed a voice sorting and a face sorting task. Taking an individual differences approach, we asked whether participants' performance on voice and face sorting of unfamiliar identities is correlated. Participants additionally completed a voice discrimination (Bangor Voice Matching Test) and a face discrimination task (Glasgow Face Matching Test). Using these tasks, we tested whether performance on sorting related to explicit identity discrimination. Performance on the voice sorting and face sorting tasks was correlated, suggesting that common modality-general processes underpin these tasks. However, no significant correlations were found between sorting and discrimination performance, with the exception of significant relationships between performance on "same identity" trials and "telling people together" for voices and faces. The reported relationships were, however, relatively weak overall, suggesting the presence of additional modality-specific and task-specific processes.
Affiliation(s)
- Justine Johnson
- Department of Speech, Hearing and Phonetic Sciences, University College London, London, UK
- Carolyn McGettigan
- Department of Speech, Hearing and Phonetic Sciences, University College London, London, UK
- Nadine Lavan
- Department of Speech, Hearing and Phonetic Sciences, University College London, London, UK
22
Guldner S, Nees F, McGettigan C. Vocomotor and Social Brain Networks Work Together to Express Social Traits in Voices. Cereb Cortex 2020; 30:6004-6020. PMID: 32577719. DOI: 10.1093/cercor/bhaa175.
Abstract
Voice modulation is important when navigating social interactions-tone of voice in a business negotiation is very different from that used to comfort an upset child. While voluntary vocal behavior relies on a cortical vocomotor network, social voice modulation may require additional social cognitive processing. Using functional magnetic resonance imaging, we investigated the neural basis for social vocal control and whether it involves an interplay of vocal control and social processing networks. Twenty-four healthy adult participants modulated their voice to express social traits along the dimensions of the social trait space (affiliation and competence) or to express body size (control for vocal flexibility). Naïve listener ratings showed that vocal modulations were effective in evoking social trait ratings along the two primary dimensions of the social trait space. Whereas basic vocal modulation engaged the vocomotor network, social voice modulation specifically engaged social processing regions including the medial prefrontal cortex, superior temporal sulcus, and precuneus. Moreover, these regions showed task-relevant modulations in functional connectivity to the left inferior frontal gyrus, a core vocomotor control network area. These findings highlight the impact of the integration of vocal motor control and social information processing for socially meaningful voice modulation.
Affiliation(s)
- Stella Guldner
- Department of Cognitive and Clinical Neuroscience, Central Institute of Mental Health, Medical Faculty Mannheim, Heidelberg University, Mannheim 68159, Germany
- Graduate School of Economic and Social Sciences, University of Mannheim, Mannheim 68159, Germany
- Department of Speech, Hearing and Phonetic Sciences, University College London, London, UK
- Frauke Nees
- Department of Cognitive and Clinical Neuroscience, Central Institute of Mental Health, Medical Faculty Mannheim, Heidelberg University, Mannheim 68159, Germany
- Institute of Medical Psychology and Medical Sociology, University Medical Center Schleswig Holstein, Kiel University, Kiel 24105, Germany
- Carolyn McGettigan
- Department of Speech, Hearing and Phonetic Sciences, University College London, London, UK
- Department of Psychology, Royal Holloway, University of London, Egham TW20 0EX, UK
23
Lavan N, Mileva M, McGettigan C. How does familiarity with a voice affect trait judgements? Br J Psychol 2020; 112:282-300. PMID: 32445499. DOI: 10.1111/bjop.12454.
Abstract
From only a single spoken word, listeners can form a wealth of first impressions of a person's character traits and personality based on their voice. However, due to the substantial within-person variability in voices, these trait judgements are likely to be highly stimulus-dependent for unfamiliar voices: the same person may sound very trustworthy in one recording but less trustworthy in another. How trait judgements differ when listeners are familiar with a voice is unclear: are listeners who are familiar with the voices as susceptible to the effects of within-person variability? Does the semantic knowledge listeners have about a familiar person influence their judgements? In the current study, we tested the effect of familiarity on listeners' trait judgements from variable voices across three experiments. Using a between-subjects design, we contrasted trait judgements by listeners who were familiar with a set of voices - either through laboratory-based training or through watching a TV show - with listeners who were unfamiliar with the voices. We predicted that familiarity with the voices would reduce variability in trait judgements for variable voice recordings from the same identity (cf. Mileva, Kramer, & Burton, 2019, Perception, 48, 471, for faces). However, across the three experiments and two types of measures used to assess variability, we found no compelling evidence to suggest that trait impressions were systematically affected by familiarity.
Affiliation(s)
- Nadine Lavan
- Department of Speech, Hearing and Phonetic Sciences, University College London, UK
- Mila Mileva
- Department of Psychology, University of York, UK
- School of Psychology, University of Plymouth, UK
- Carolyn McGettigan
- Department of Speech, Hearing and Phonetic Sciences, University College London, UK
24
Lavan N, Merriman SE, Ladwa P, Burston LFK, Knight S, McGettigan C. 'Please sort these voice recordings into 2 identities': Effects of task instructions on performance in voice sorting studies. Br J Psychol 2019; 111:556-569. PMID: 31328792. DOI: 10.1111/bjop.12416.
Abstract
We investigated the effects of two types of task instructions on performance on a voice sorting task by listeners who were either familiar or unfamiliar with the voices. Listeners were asked to sort 15 naturally varying stimuli from two voice identities into perceived identities. Half of the listeners sorted the recordings freely into as many identities as they perceived; the other half were forced to sort stimuli into two identities only. As reported in previous studies, unfamiliar listeners formed more clusters than familiar listeners: listeners perceived different naturally varying stimuli from the same identity as coming from different identities, while being highly accurate at telling apart the stimuli from different voices. We further show that a change in task instructions - forcing listeners to sort stimuli into two identities only - helped unfamiliar listeners to overcome this selective failure at 'telling people together'. This improvement, however, came at the cost of an increase in errors in telling people apart. For familiar listeners, similar non-significant trends were apparent. Therefore, even when informed about the correct number of identities, listeners may fail to accurately perceive identity, further highlighting that voice identity perception in the context of natural within-person variability is a challenging task. We discuss our results in terms of similarities and differences to findings in the face perception literature and their importance in applied settings, such as forensic voice identification.
Affiliation(s)
- Nadine Lavan
- Department of Speech, Hearing and Phonetic Sciences, University College London, UK
- Department of Psychology, Royal Holloway, University of London, UK
- Paayal Ladwa
- Department of Psychology, Royal Holloway, University of London, UK
- Luke F K Burston
- Department of Psychology, Royal Holloway, University of London, UK
- Sarah Knight
- Department of Speech, Hearing and Phonetic Sciences, University College London, UK
- Department of Psychology, Royal Holloway, University of London, UK
- Carolyn McGettigan
- Department of Speech, Hearing and Phonetic Sciences, University College London, UK
- Department of Psychology, Royal Holloway, University of London, UK
25
Lavan N, Knight S, Hazan V, McGettigan C. The effects of high variability training on voice identity learning. Cognition 2019; 193:104026. PMID: 31323377. DOI: 10.1016/j.cognition.2019.104026.
Abstract
High variability training has been shown to benefit the learning of new face identities. In three experiments, we investigated whether this is also the case for voice identity learning. In Experiment 1a, we contrasted high variability training sets - which included stimuli extracted from a number of different recording sessions, speaking environments and speaking styles - with low variability stimulus sets that only included a single speaking style (read speech) extracted from one recording session (see Ritchie & Burton, 2017 for faces). Listeners were tested on an old/new recognition task using read sentences (i.e. test materials fully overlapped with the low variability training stimuli) and we found a high variability disadvantage. In Experiment 1b, listeners were trained in a similar way, however, now there was no overlap in speaking style or recording session between training sets and test stimuli. Here, we found a high variability advantage. In Experiment 2, variability was manipulated in terms of the number of unique items as opposed to number of unique speaking styles. Here, we contrasted the high variability training sets used in Experiment 1a with low variability training sets that included the same breadth of styles, but fewer unique items; instead, individual items were repeated (see Murphy, Ipser, Gaigg, & Cook, 2015 for faces). We found only weak evidence for a high variability advantage, which could be explained by stimulus-specific effects. We propose that high variability advantages may be particularly pronounced when listeners are required to generalise from trained stimuli to different-sounding, previously unheard stimuli. We discuss these findings in the context of mechanisms thought to underpin advantages for high variability training.
Affiliation(s)
- Nadine Lavan
- Department of Speech, Hearing and Phonetic Sciences, University College London, United Kingdom; Department of Psychology, Royal Holloway, University of London, United Kingdom
- Sarah Knight
- Department of Speech, Hearing and Phonetic Sciences, University College London, United Kingdom
- Valerie Hazan
- Department of Speech, Hearing and Phonetic Sciences, University College London, United Kingdom
- Carolyn McGettigan
- Department of Speech, Hearing and Phonetic Sciences, University College London, United Kingdom; Department of Psychology, Royal Holloway, University of London, United Kingdom
26
Abstract
Models of voice perception propose that identities are encoded relative to an abstracted average or prototype. While there is some evidence for norm-based coding when learning to discriminate different voices, little is known about how the representation of an individual's voice identity is formed through variable exposure to that voice. In two experiments, we show evidence that participants form abstracted representations of individual voice identities based on averages, despite having never been exposed to these averages during learning. We created 3 perceptually distinct voice identities, fully controlling their within-person variability. Listeners first learned to recognise these identities based on ring-shaped distributions located around the perimeter of within-person voice spaces - crucially, these distributions were missing their centres. At test, listeners' accuracy for old/new judgements was higher for stimuli located on an untrained distribution nested around the centre of each ring-shaped distribution compared to stimuli on the trained ring-shaped distribution.
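The design is easiest to picture as sampling from a ring (annulus) in a low-dimensional voice space, then testing on the untrained centre. A rough sketch with invented dimensions, radii and counts (the paper built its voice spaces from controlled acoustic manipulations):

```python
# Sample a ring-shaped training distribution (centre missing) and a nested
# central test distribution in a 2-D "voice space". All values are invented.
import numpy as np

rng = np.random.default_rng(0)

def ring(n, centre, r_inner, r_outer):
    angles = rng.uniform(0, 2 * np.pi, n)
    radii = rng.uniform(r_inner, r_outer, n)
    return centre + np.column_stack((radii * np.cos(angles),
                                     radii * np.sin(angles)))

centre = np.array([0.0, 0.0])
train = ring(40, centre, r_inner=0.6, r_outer=1.0)        # trained ring
test_centre = ring(20, centre, r_inner=0.0, r_outer=0.3)  # untrained centre

# Distance of each untrained test item to its nearest trained exemplar:
d = np.linalg.norm(test_centre[:, None, :] - train[None, :, :], axis=-1)
print("mean distance to nearest trained item:", d.min(axis=1).mean().round(3))
```

The key property of the design is visible here: every central test item lies off the trained distribution, so accurate old/new judgements for them imply an abstracted (average-like) representation rather than memory for trained exemplars.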
Affiliation(s)
- Nadine Lavan
- Department of Speech, Hearing and Phonetic Sciences, University College London, London, WC1N 1PF, UK
- Department of Psychology, Royal Holloway, University of London, Egham, TW20 0EX, UK
- Sarah Knight
- Department of Speech, Hearing and Phonetic Sciences, University College London, London, WC1N 1PF, UK
- Department of Psychology, Royal Holloway, University of London, Egham, TW20 0EX, UK
- Carolyn McGettigan
- Department of Speech, Hearing and Phonetic Sciences, University College London, London, WC1N 1PF, UK
- Department of Psychology, Royal Holloway, University of London, Egham, TW20 0EX, UK
27
Einarsson G, Sherrard L, Zorn B, Hatch J, McGettigan C, Bradbury I, Campbell C, Johnston E, O'Neill K, McIlreavey L, McGrath S, Gilpin D, Murray M, Lavelle G, McElvaney G, Wolfgang M, Boucher R, Muhlebach M, Elborn J, Tunney M. P140 Microbial community composition in cystic fibrosis patients during treatment for pulmonary exacerbation. J Cyst Fibros 2019. [DOI: 10.1016/s1569-1993(19)30434-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [What about the content of this article? (0)] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 10/26/2022]
28
Lavan N, Burston LF, Ladwa P, Merriman SE, Knight S, McGettigan C. Breaking voice identity perception: Expressive voices are more confusable for listeners. Q J Exp Psychol (Hove) 2019; 72:2240-2248. [PMID: 30808271 DOI: 10.1177/1747021819836890] [Citation(s) in RCA: 23] [Impact Index Per Article: 4.6] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/17/2022]
Abstract
The human voice is a highly flexible instrument for self-expression, yet voice identity perception is largely studied using controlled speech recordings. Using two voice-sorting tasks with naturally varying stimuli, we compared the performance of listeners who were familiar and unfamiliar with the TV show Breaking Bad. Listeners organised audio clips of speech with (1) low expressiveness and (2) high expressiveness into perceived identities. We predicted that increased expressiveness (e.g., shouting, strained voice) would significantly impair performance. Overall, while unfamiliar listeners were less able to generalise identity across exemplars, the two groups performed equivalently well at telling voices apart when dealing with low-expressiveness stimuli. However, high vocal expressiveness significantly impaired telling apart in both groups: this led to increased misidentifications, where sounds from one character were assigned to the other. These misidentifications were highly consistent for familiar listeners but less consistent for unfamiliar listeners. Our data suggest that vocal flexibility has powerful effects on identity perception, where changes in the acoustic properties of vocal signals introduced by expressiveness lead to effects apparent in familiar and unfamiliar listeners alike. At the same time, expressiveness appears to have affected other aspects of voice identity processing selectively in one listener group but not the other, thus revealing complex interactions of stimulus properties and listener characteristics (i.e., familiarity) in identity processing.
Affiliation(s)
- Nadine Lavan
- Department of Speech, Hearing and Phonetic Sciences, University College London, London, UK; Department of Psychology, Royal Holloway, University of London, London, UK
- Luke F K Burston
- Department of Psychology, Royal Holloway, University of London, London, UK
- Paayal Ladwa
- Department of Psychology, Royal Holloway, University of London, London, UK
- Siobhan E Merriman
- Department of Psychology, Royal Holloway, University of London, London, UK
- Sarah Knight
- Department of Speech, Hearing and Phonetic Sciences, University College London, London, UK
- Carolyn McGettigan
- Department of Speech, Hearing and Phonetic Sciences, University College London, London, UK; Department of Psychology, Royal Holloway, University of London, London, UK
29
Abstract
In two experiments, we explore how speaker sex recognition is affected by vocal flexibility, introduced by volitional and spontaneous vocalizations. In Experiment 1, participants judged speaker sex from two spontaneous vocalizations, laughter and crying, and volitionally produced vowels. Striking effects of speaker sex emerged: for male vocalizations, listeners' performance was significantly impaired for spontaneous vocalizations (laughter and crying) compared to a volitional baseline (repeated vowels), a pattern that was also reflected in longer reaction times for spontaneous vocalizations. Further, performance was less accurate for laughter than for crying. For female vocalizations, a different pattern emerged. In Experiment 2, we largely replicated the findings of Experiment 1 using spontaneous laughter, volitional laughter and (volitional) vowels: here, performance for male vocalizations was impaired for spontaneous laughter compared to both volitional laughter and vowels, providing further evidence that differences in volitional control over vocal production may modulate our ability to accurately perceive speaker sex from vocal signals. For both experiments, acoustic analyses showed relationships between stimulus fundamental frequency (F0) and the participants' responses. The higher the F0 of a vocal signal, the more likely listeners were to perceive a vocalization as being produced by a female speaker, an effect that was more pronounced for vocalizations produced by males. We discuss the results in terms of the availability of salient acoustic cues across different vocalizations.
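The reported F0-response relationship is the kind of pattern a logistic model captures: the probability of a 'female' judgement rises with F0. A self-contained sketch on simulated judgements (not the study's data or analysis code):

```python
# Fit a logistic regression of "female" responses on stimulus F0, by plain
# gradient descent so no external fitting library is needed. Data simulated.
import numpy as np

rng = np.random.default_rng(1)
f0 = rng.uniform(80, 320, 400)                       # Hz, hypothetical stimuli
p_true = 1 / (1 + np.exp(-(f0 - 180) / 25))          # higher F0 -> "female"
response = rng.random(400) < p_true                  # simulated judgements

x = (f0 - f0.mean()) / f0.std()                      # standardise predictor
w, b = 0.0, 0.0
for _ in range(2000):
    p = 1 / (1 + np.exp(-(w * x + b)))               # predicted P("female")
    w -= 0.5 * ((p - response) * x).mean()           # gradient steps on the
    b -= 0.5 * (p - response).mean()                 # cross-entropy loss

print(f"slope on standardised F0: {w:.2f} (positive = 'female' more likely at high F0)")
```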
Affiliation(s)
- Nadine Lavan
- Department of Psychology, Royal Holloway, University of London, Egham Hill, Egham, TW20 0EX, UK
- Department of Speech, Hearing and Phonetic Sciences, University College London, London, UK
- Abigail Domone
- Department of Psychology, Royal Holloway, University of London, Egham Hill, Egham, TW20 0EX, UK
- Betty Fisher
- Department of Psychology, Royal Holloway, University of London, Egham Hill, Egham, TW20 0EX, UK
- Noa Kenigzstein
- Department of Psychology, Royal Holloway, University of London, Egham Hill, Egham, TW20 0EX, UK
- Carolyn McGettigan
- Department of Psychology, Royal Holloway, University of London, Egham Hill, Egham, TW20 0EX, UK
- Department of Speech, Hearing and Phonetic Sciences, University College London, London, UK
30
Tsantani M, Kriegeskorte N, McGettigan C, Garrido L. Faces and voices in the brain: RSA reveals modality-general person-identity representations in the STS. J Vis 2018. [DOI: 10.1167/18.10.1139] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/24/2022] Open
Affiliation(s)
- Maria Tsantani
- Division of Psychology, Department of Life Sciences, Brunel University London
- Lúcia Garrido
- Division of Psychology, Department of Life Sciences, Brunel University London
31
Agnew ZK, Banissy MJ, McGettigan C, Walsh V, Scott SK. Investigating the Neural Basis of Theta Burst Stimulation to Premotor Cortex on Emotional Vocalization Perception: A Combined TMS-fMRI Study. Front Hum Neurosci 2018; 12:150. [PMID: 29867402 PMCID: PMC5962765 DOI: 10.3389/fnhum.2018.00150] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.7] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/16/2017] [Accepted: 04/04/2018] [Indexed: 12/01/2022] Open
Abstract
Previous studies have established a role for premotor cortex in the processing of auditory emotional vocalizations. Inhibitory continuous theta burst transcranial magnetic stimulation (cTBS) applied to right premotor cortex selectively increases the reaction time in a same-different task, implying a causal role for right ventral premotor cortex (PMv) in the processing of emotional sounds. However, little is known about the functional networks to which PMv contributes across the cortical hemispheres. In light of these data, the present study aimed to investigate how and where in the brain cTBS affects activity during the processing of auditory emotional vocalizations. Using functional neuroimaging, we report that inhibitory cTBS applied to the right premotor cortex (compared to a vertex control site) results in three distinct response profiles: following stimulation of PMv, widespread frontoparietal cortices, including a site close to the target site, and the parahippocampal gyrus displayed an increase in activity, whereas the reverse response profile was apparent in a set of midline structures and right IFG. A third response profile was seen in left supramarginal gyrus, in which activity was greater post-stimulation at both stimulation sites. Finally, whilst previous studies have shown a condition-specific behavioral effect following cTBS to premotor cortex, we did not find a condition-specific neural change in BOLD response. These data demonstrate a complex relationship between cTBS and activity in widespread neural networks and are discussed in relation to both emotional processing and the neural basis of cTBS.
Affiliation(s)
- Zarinah K Agnew
- Institute of Cognitive Neuroscience, University College London, London, United Kingdom; Otolaryngology-Head & Neck Surgery Clinic, University of California, San Francisco, San Francisco, CA, United States
- Michael J Banissy
- Institute of Cognitive Neuroscience, University College London, London, United Kingdom; Department of Psychology, Goldsmiths, University of London, London, United Kingdom
- Vincent Walsh
- Institute of Cognitive Neuroscience, University College London, London, United Kingdom
- Sophie K Scott
- Institute of Cognitive Neuroscience, University College London, London, United Kingdom
32
Agnew ZK, McGettigan C, Banks B, Scott SK. Group and individual variability in speech production networks during delayed auditory feedback. J Acoust Soc Am 2018; 143:3009. [PMID: 29857719 PMCID: PMC5963950 DOI: 10.1121/1.5026500] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.7] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [MESH Headings] [Track Full Text] [Figures] [Subscribe] [Scholar Register] [Received: 05/21/2016] [Revised: 02/05/2018] [Accepted: 02/12/2018] [Indexed: 06/08/2023]
Abstract
Altering reafferent sensory information can have a profound effect on motor output. Introducing a short delay [delayed auditory feedback (DAF)] during speech production results in modulations of voice and loudness, and produces a range of speech dysfluencies. The ability of speakers to resist the effects of delayed feedback is variable, yet it is unclear what neural processes underlie differences in susceptibility to DAF. Here, susceptibility to DAF is investigated by looking at the neural basis of within- and between-subject changes in speech fluency under 50 and 200 ms delay conditions. Using functional magnetic resonance imaging, networks involved in producing speech under two levels of DAF were identified, lying largely within networks active during normal speech production. Independent of condition, fluency ratings were associated with midbrain activity corresponding to periaqueductal grey matter. Across-subject variability in the ability to produce normal-sounding speech under a 200 ms delay was associated with activity in ventral sensorimotor cortices, whereas the ability to produce normal-sounding speech under a 50 ms delay was associated with left inferior frontal gyrus activity. These data indicate that, whilst overlapping cortical mechanisms are engaged for speaking under different delay conditions, susceptibility to different temporal delays in speech feedback may involve different processes.
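DAF itself is simple to implement: microphone input is written into a ring buffer and played back a fixed number of samples later. A minimal sketch, assuming the third-party sounddevice library and not optimised for real-time use; the 50 and 200 ms values mirror the delay conditions above:

```python
# Delayed auditory feedback via a ring buffer: each output sample is the
# input sample stored exactly delay_samples earlier. Illustrative only.
import numpy as np
import sounddevice as sd  # assumed third-party dependency

FS = 44100
DELAY_MS = 200                                  # try 50 vs 200, as in the study
delay_samples = int(FS * DELAY_MS / 1000)
buffer = np.zeros((delay_samples, 1), dtype="float32")
write_pos = 0

def callback(indata, outdata, frames, time, status):
    global write_pos
    for i in range(frames):
        outdata[i] = buffer[write_pos]          # sample stored N samples ago
        buffer[write_pos] = indata[i]           # overwrite with current input
        write_pos = (write_pos + 1) % delay_samples

with sd.Stream(samplerate=FS, channels=1, callback=callback):
    sd.sleep(10000)                             # 10 s of delayed feedback
```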
Affiliation(s)
- Z K Agnew
- Institute for Cognitive Neuroscience, University College London, 17 Queen Square, London WC1N 3AR, United Kingdom
- C McGettigan
- Institute for Cognitive Neuroscience, University College London, 17 Queen Square, London WC1N 3AR, United Kingdom
- B Banks
- Institute for Cognitive Neuroscience, University College London, 17 Queen Square, London WC1N 3AR, United Kingdom
- S K Scott
- Institute for Cognitive Neuroscience, University College London, 17 Queen Square, London WC1N 3AR, United Kingdom
33
Carey D, Miquel ME, Evans BG, Adank P, McGettigan C. Vocal Tract Images Reveal Neural Representations of Sensorimotor Transformation During Speech Imitation. Cereb Cortex 2018; 27:3064-3079. [PMID: 28334401 PMCID: PMC5939209 DOI: 10.1093/cercor/bhx056] [Citation(s) in RCA: 15] [Impact Index Per Article: 2.5] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/12/2016] [Indexed: 12/23/2022] Open
Abstract
Imitating speech necessitates the transformation from sensory targets to vocal tract motor output, yet little is known about the representational basis of this process in the human brain. Here, we address this question by using real-time MR imaging (rtMRI) of the vocal tract and functional MRI (fMRI) of the brain in a speech imitation paradigm. Participants trained on imitating a native vowel and a similar nonnative vowel that required lip rounding. Later, participants imitated these vowels and an untrained vowel pair during separate fMRI and rtMRI runs. Univariate fMRI analyses revealed that regions including left inferior frontal gyrus were more active during sensorimotor transformation (ST) and production of nonnative vowels, compared with native vowels; further, ST for nonnative vowels activated somatomotor cortex bilaterally, compared with ST of native vowels. Using representational similarity analysis (RSA) models constructed from participants' vocal tract images and from stimulus formant distances, we found that RSA searchlight analyses of the fMRI data showed that either type of model could be represented in somatomotor, temporal, cerebellar, and hippocampal neural activation patterns during ST. We thus provide the first evidence of widespread and robust cortical and subcortical neural representation of vocal tract and/or formant parameters during prearticulatory ST.
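The core RSA step can be sketched compactly: a model representational dissimilarity matrix (RDM) is built from stimulus parameters (here, invented F1/F2 formant values) and compared against a neural RDM, which is simulated below; a searchlight analysis repeats this comparison at many brain locations:

```python
# Build a model RDM from formant values and correlate it with a (simulated)
# neural RDM - the general logic of searchlight RSA. Values are invented.
import numpy as np
from scipy.spatial.distance import pdist
from scipy.stats import spearmanr

# Hypothetical F1/F2 (Hz) for four vowel targets
formants = np.array([[280, 2250],   # /i/-like
                     [300, 1600],   # rounded, non-native-like vowel
                     [650, 1000],   # /a/-like
                     [350,  800]])  # /u/-like

model_rdm = pdist(formants, metric="euclidean")   # condition-pair distances

rng = np.random.default_rng(2)
neural_rdm = model_rdm + rng.normal(0, 150, model_rdm.shape)  # noisy "data"

rho, p = spearmanr(model_rdm, neural_rdm)
print(f"model-neural RDM correlation: rho = {rho:.2f}, p = {p:.3f}")
```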
Affiliation(s)
- Daniel Carey
- Department of Psychology, Royal Holloway, University of London, London TW20 0EX, UK; Combined Universities Brain Imaging Centre, Royal Holloway, University of London, London TW20 0EX, UK; The Irish Longitudinal Study on Ageing (TILDA), Department of Medical Gerontology, Trinity College Dublin, Dublin, Ireland
- Marc E Miquel
- William Harvey Research Institute, Queen Mary, University of London, London EC1M 6BQ, UK; Clinical Physics, Barts Health NHS Trust, London EC1A 7BE, UK
- Bronwen G Evans
- Department of Speech, Hearing & Phonetic Sciences, University College London, London WC1E 6BT, UK
- Patti Adank
- Department of Speech, Hearing & Phonetic Sciences, University College London, London WC1E 6BT, UK
- Carolyn McGettigan
- Department of Psychology, Royal Holloway, University of London, London TW20 0EX, UK; Combined Universities Brain Imaging Centre, Royal Holloway, University of London, London TW20 0EX, UK; Institute of Cognitive Neuroscience, University College London, London WC1N 3AR, UK
34
35
Carey D, Miquel ME, Evans BG, Adank P, McGettigan C. Functional brain outcomes of L2 speech learning emerge during sensorimotor transformation. Neuroimage 2017; 159:18-31. [PMID: 28669904 DOI: 10.1016/j.neuroimage.2017.06.053] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.6] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/25/2017] [Revised: 06/20/2017] [Accepted: 06/21/2017] [Indexed: 11/18/2022] Open
Abstract
Sensorimotor transformation (ST) may be a critical process in mapping perceived speech input onto non-native (L2) phonemes, in support of subsequent speech production. Yet, little is known concerning the role of ST with respect to L2 speech, particularly where learned L2 phones (e.g., vowels) must be produced in more complex lexical contexts (e.g., multi-syllabic words). Here, we charted the behavioral and neural outcomes of producing trained L2 vowels at word level, using a speech imitation paradigm and functional MRI. We asked whether participants would be able to faithfully imitate trained L2 vowels when they occurred in non-words of varying complexity (one or three syllables). Moreover, we related individual differences in imitation success during training to BOLD activation during ST (i.e., pre-imitation listening), and during later imitation. We predicted that superior temporal and peri-Sylvian speech regions would show increased activation as a function of item complexity and non-nativeness of vowels, during ST. We further anticipated that pre-scan acoustic learning performance would predict BOLD activation for non-native (vs. native) speech during ST and imitation. We found individual differences in imitation success for training on the non-native vowel tokens in isolation; these were preserved in a subsequent task, during imitation of mono- and trisyllabic words containing those vowels. fMRI data revealed a widespread network involved in ST, modulated by both vowel nativeness and utterance complexity: superior temporal activation increased monotonically with complexity, showing greater activation for non-native than native vowels when presented in isolation and in trisyllables, but not in monosyllables. Individual differences analyses showed that learning versus lack of improvement on the non-native vowel during pre-scan training predicted increased ST activation for non-native compared with native items, at insular cortex, pre-SMA/SMA, and cerebellum. Our results hold implications for the importance of ST as a process underlying successful imitation of non-native speech.
Affiliation(s)
- Daniel Carey
- Department of Psychology, Royal Holloway, University of London, TW20 0EX, UK; Combined Universities Brain Imaging Centre, Royal Holloway, University of London, TW20 0EX, UK; The Irish Longitudinal Study on Ageing (TILDA), Dept. Medical Gerontology, TCD, Dublin, Ireland
- Marc E Miquel
- William Harvey Research Institute, Queen Mary, University of London, EC1M 6BQ, UK; Clinical Physics, Barts Health NHS Trust, London, EC1A 7BE, UK
- Bronwen G Evans
- Department of Speech, Hearing & Phonetic Sciences, University College London, WC1E 6BT, UK
- Patti Adank
- Department of Speech, Hearing & Phonetic Sciences, University College London, WC1E 6BT, UK
- Carolyn McGettigan
- Department of Psychology, Royal Holloway, University of London, TW20 0EX, UK; Combined Universities Brain Imaging Centre, Royal Holloway, University of London, TW20 0EX, UK; Institute of Cognitive Neuroscience, University College London, WC1N 3AR, UK
36
Abstract
We present an investigation of the perception of authenticity in audiovisual laughter, in which we contrast spontaneous and volitional samples and examine the contributions of unimodal affective information to multimodal percepts. In a pilot study, we demonstrate that listeners perceive spontaneous laughs as more authentic than volitional ones, both in unimodal (audio-only, visual-only) and multimodal contexts (audiovisual). In the main experiment, we show that the discriminability of volitional and spontaneous laughter is enhanced for multimodal laughter. Analyses of relationships between affective ratings and the perception of authenticity show that, while both unimodal percepts significantly predict evaluations of audiovisual laughter, it is auditory affective cues that have the greater influence on multimodal percepts. We discuss differences and potential mismatches in emotion signalling through voices and faces, in the context of spontaneous and volitional behaviour, and highlight issues that should be addressed in future studies of dynamic multimodal emotion processing.
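The modality comparison described here amounts to regressing audiovisual ratings on the two unimodal ratings and comparing coefficients. A sketch on simulated ratings (numpy only; not the authors' analysis):

```python
# Ordinary least squares: do audio-only or visual-only authenticity ratings
# better predict audiovisual ratings? All ratings are simulated.
import numpy as np

rng = np.random.default_rng(3)
n = 120
audio = rng.uniform(1, 7, n)                              # audio-only ratings
video = rng.uniform(1, 7, n)                              # visual-only ratings
av = 0.6 * audio + 0.25 * video + rng.normal(0, 0.5, n)   # audiovisual ratings

X = np.column_stack((np.ones(n), audio, video))           # intercept + predictors
beta, *_ = np.linalg.lstsq(X, av, rcond=None)
print(f"intercept={beta[0]:.2f}, audio beta={beta[1]:.2f}, visual beta={beta[2]:.2f}")
```

In this simulation the audio coefficient dominates by construction, mirroring the paper's finding that auditory affective cues carry more weight in multimodal percepts.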
Affiliation(s)
- Nadine Lavan
- Department of Psychology, Royal Holloway, University of London, Egham, UK
- Carolyn McGettigan
- Department of Psychology, Royal Holloway, University of London, Egham, UK
- Institute of Cognitive Neuroscience, University College London, London, UK
37
Spence C, Einarsson G, Lee A, McGettigan C, Johnston E, Verleden S, Vanaudenaerde B, McDonough J, Lammertyn E, Dupont L, Elborn J, Gilpin D, Tunney M. WS03.6 Estimation of total bacterial load in explanted cystic fibrosis (CF) lungs via qPCR. J Cyst Fibros 2017. [DOI: 10.1016/s1569-1993(17)30173-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [What about the content of this article? (0)] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 10/19/2022]
38
McGettigan C, Johnston E, Elborn J, Downey D, Tunney M, Gilpin D. 149 Comparison of culture and quantitative PCR for bacterial quantification in CF sputum. J Cyst Fibros 2017. [DOI: 10.1016/s1569-1993(17)30513-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [What about the content of this article? (0)] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 10/19/2022]
39
McGettigan C, Jasmin K, Eisner F, Agnew ZK, Josephs OJ, Calder AJ, Jessop R, Lawson RP, Spielmann M, Scott SK. You talkin' to me? Communicative talker gaze activates left-lateralized superior temporal cortex during perception of degraded speech. Neuropsychologia 2017; 100:51-63. [PMID: 28400328 PMCID: PMC5446325 DOI: 10.1016/j.neuropsychologia.2017.04.013] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.7] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/20/2016] [Revised: 04/05/2017] [Accepted: 04/07/2017] [Indexed: 11/13/2022]
Abstract
Neuroimaging studies of speech perception have consistently indicated a left-hemisphere dominance in the temporal lobes' responses to intelligible auditory speech signals (McGettigan and Scott, 2012). However, there are important communicative cues that cannot be extracted from auditory signals alone, including the direction of the talker's gaze. Previous work has implicated the superior temporal cortices in processing gaze direction, with evidence for predominantly right-lateralized responses (Carlin & Calder, 2013). The aim of the current study was to investigate whether the lateralization of responses to talker gaze differs in an auditory communicative context. Participants in a functional MRI experiment watched and listened to videos of spoken sentences in which the auditory intelligibility and talker gaze direction were manipulated factorially. We observed a left-dominant temporal lobe sensitivity to the talker's gaze direction, in which the left anterior superior temporal sulcus/gyrus and temporal pole showed an enhanced response to direct gaze – further investigation revealed that this pattern of lateralization was modulated by auditory intelligibility. Our results suggest flexibility in the distribution of neural responses to social cues in the face within the context of a challenging speech perception task.
Highlights: Talker gaze is an important social cue during speech comprehension. Neural responses to gaze were measured during perception of degraded sentences. Gaze direction modulated activation in left-lateralized superior temporal cortex. Left lateralization became stronger when speech was less intelligible. Results suggest task-dependent flexibility in cortical responses to gaze.
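Lateralization in such studies is often quantified with a simple index over homologous left- and right-hemisphere responses. A toy illustration with invented values (actual analyses derive such values from contrast estimates in regions of interest):

```python
# Lateralization index: LI = (L - R) / (L + R), positive = left-dominant.
# The response magnitudes below are invented for illustration.
def lateralization_index(left, right):
    return (left - right) / (left + right)

# Hypothetical superior temporal responses to direct gaze
print(f"LI = {lateralization_index(left=1.8, right=1.1):.2f}")
```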
Affiliation(s)
- Carolyn McGettigan
- Department of Psychology, Royal Holloway, University of London, Egham Hill, Egham TW20 0EX, UK; Institute of Cognitive Neuroscience, University College London, 17 Queen Square, London WC1N 3AR, UK
- Kyle Jasmin
- Institute of Cognitive Neuroscience, University College London, 17 Queen Square, London WC1N 3AR, UK
- Frank Eisner
- Institute of Cognitive Neuroscience, University College London, 17 Queen Square, London WC1N 3AR, UK; Donders Institute, Radboud University, Montessorilaan 3, 6525 HR Nijmegen, Netherlands
- Zarinah K Agnew
- Institute of Cognitive Neuroscience, University College London, 17 Queen Square, London WC1N 3AR, UK; Department of Otolaryngology, University of California, San Francisco, 513 Parnassus Avenue, San Francisco, CA, USA
- Oliver J Josephs
- Institute of Cognitive Neuroscience, University College London, 17 Queen Square, London WC1N 3AR, UK; Wellcome Trust Centre for Neuroimaging, Institute of Neurology, University College London, 12 Queen Square, London WC1N 3BG, UK
- Andrew J Calder
- MRC Cognition and Brain Sciences Unit, 15 Chaucer Road, Cambridge CB2 7EF, UK
- Rosemary Jessop
- Institute of Cognitive Neuroscience, University College London, 17 Queen Square, London WC1N 3AR, UK
- Rebecca P Lawson
- Institute of Cognitive Neuroscience, University College London, 17 Queen Square, London WC1N 3AR, UK; Wellcome Trust Centre for Neuroimaging, Institute of Neurology, University College London, 12 Queen Square, London WC1N 3BG, UK
- Mona Spielmann
- Institute of Cognitive Neuroscience, University College London, 17 Queen Square, London WC1N 3AR, UK
- Sophie K Scott
- Institute of Cognitive Neuroscience, University College London, 17 Queen Square, London WC1N 3AR, UK
40
Abstract
[Correction Notice: An Erratum for this article was reported in Vol 17(6) of Emotion (see record 2017-18585-001). In the article, the copyright attribution was incorrectly listed and the Creative Commons CC-BY license disclaimer was incorrectly omitted from the author note. The correct copyright is "© 2017 The Author(s)" and the omitted disclaimer is below. All versions of this article have been corrected. "This article has been published under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/3.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited. Copyright for this article is retained by the author(s). Author(s) grant(s) the American Psychological Association the exclusive right to publish the article and identify itself as the original publisher."]
Emotions are a vital component of social communication, carried across a range of modalities and via different perceptual signals such as specific muscle contractions in the face and in the upper respiratory system. Previous studies have found that emotion recognition impairments after brain damage depend on the modality of presentation: recognition from faces may be impaired whereas recognition from voices remains preserved, and vice versa. On the other hand, there is also evidence for shared neural activation during emotion processing in both modalities. In a behavioral study, we investigated whether there are shared representations in the recognition of emotions from faces and voices. We used a within-subjects design in which participants rated the intensity of facial expressions and nonverbal vocalizations for each of the 6 basic emotion labels. For each participant and each modality, we then computed a representation matrix with the intensity ratings of each emotion. These matrices allowed us to examine the patterns of confusions between emotions and to characterize the representations of emotions within each modality. We then compared the representations across modalities by computing the correlations of the representation matrices across faces and voices. We found highly correlated matrices across modalities, which suggests similar representations of emotions across faces and voices. We also showed that these results could not be explained by commonalities between low-level visual and acoustic properties of the stimuli. We thus propose that there are similar or shared coding mechanisms for emotions which may act independently of modality, despite their distinct perceptual inputs. (PsycINFO Database Record (c) 2017 APA, all rights reserved).
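The matrix-correlation logic is straightforward to sketch: each modality yields an emotion-by-rating-scale matrix, and the two matrices are flattened and correlated. The ratings below are invented, not the study's data:

```python
# Correlate per-modality emotion representation matrices across faces and
# voices. Rows = presented emotion, columns = rated intensity on each of the
# six basic-emotion scales. All ratings are simulated.
import numpy as np
from scipy.stats import pearsonr

emotions = ["anger", "disgust", "fear", "happiness", "sadness", "surprise"]
rng = np.random.default_rng(4)

pattern = np.eye(6) * 5 + rng.random((6, 6))      # shared diagonal structure
faces = pattern + rng.normal(0, 0.3, (6, 6))      # face-rating matrix
voices = pattern + rng.normal(0, 0.3, (6, 6))     # voice-rating matrix

r, p = pearsonr(faces.ravel(), voices.ravel())
print(f"face-voice representation correlation: r = {r:.2f}")
```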
Affiliation(s)
- Lisa Katharina Kuhn
- Experimental Neuropsychology Unit, Department of Psychology, Saarland University
- Taeko Wydell
- Department of Life Sciences, Brunel University London
- Nadine Lavan
- Department of Psychology, Royal Holloway, University of London
- Lúcia Garrido
- Department of Life Sciences, Brunel University London
41
Lavan N, Scott SK, McGettigan C. Impaired generalization of speaker identity in the perception of familiar and unfamiliar voices. J Exp Psychol Gen 2016; 145:1604-1614. [DOI: 10.1037/xge0000223] [Citation(s) in RCA: 26] [Impact Index Per Article: 3.3] [Reference Citation Analysis] [What about the content of this article? (0)] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/08/2022]
42
Carey D, McGettigan C. Magnetic resonance imaging of the brain and vocal tract: Applications to the study of speech production and language learning. Neuropsychologia 2016; 98:201-211. [PMID: 27288115 DOI: 10.1016/j.neuropsychologia.2016.06.003] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.6] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/27/2016] [Revised: 06/02/2016] [Accepted: 06/05/2016] [Indexed: 10/21/2022]
Abstract
The human vocal system is highly plastic, allowing for the flexible expression of language, mood and intentions. However, this plasticity is not stable throughout the life span, and it is well documented that adult learners encounter greater difficulty than children in acquiring the sounds of foreign languages. Researchers have used magnetic resonance imaging (MRI) to interrogate the neural substrates of vocal imitation and learning, and the correlates of individual differences in phonetic "talent". In parallel, a growing body of work using MR technology to directly image the vocal tract in real time during speech has offered primarily descriptive accounts of phonetic variation within and across languages. In this paper, we review the contribution of neural MRI to our understanding of vocal learning, and give an overview of vocal tract imaging and its potential to inform the field. We propose methods by which our understanding of speech production and learning could be advanced through the combined measurement of articulation and brain activity using MRI - specifically, we describe a novel paradigm, developed in our laboratory, that uses both MRI techniques to map, for the first time, directly between neural, articulatory and acoustic data in the investigation of vocalisation. This non-invasive, multimodal imaging method could be used to track central and peripheral correlates of spoken language learning and speech recovery in clinical settings, as well as provide insights into potential sites for targeted neural interventions.
Affiliation(s)
- Daniel Carey
- Department of Psychology, Royal Holloway, University of London, Egham, UK
- Carolyn McGettigan
- Department of Psychology, Royal Holloway, University of London, Egham, UK
43
Pisanski K, Cartei V, McGettigan C, Raine J, Reby D. Voice Modulation: A Window into the Origins of Human Vocal Control? Trends Cogn Sci 2016; 20:304-318. [PMID: 26857619 DOI: 10.1016/j.tics.2016.01.002] [Citation(s) in RCA: 96] [Impact Index Per Article: 12.0] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/22/2015] [Revised: 01/05/2016] [Accepted: 01/07/2016] [Indexed: 11/17/2022]
Abstract
An unresolved issue in comparative approaches to speech evolution is the apparent absence of an intermediate vocal communication system between human speech and the less flexible vocal repertoires of other primates. We argue that humans' ability to modulate nonverbal vocal features evolutionarily linked to expression of body size and sex (fundamental and formant frequencies) provides a largely overlooked window into the nature of this intermediate system. Recent behavioral and neural evidence indicates that humans' vocal control abilities, commonly assumed to subserve speech, extend to these nonverbal dimensions. This capacity appears in continuity with context-dependent frequency modulations recently identified in other mammals, including primates, and may represent a living relic of early vocal control abilities that led to articulated human speech.
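F0, one of the nonverbal dimensions discussed here, can be estimated from a voiced frame by autocorrelation. A minimal sketch on a synthetic 120 Hz waveform (real analyses would use recorded vocalizations and a robust pitch tracker):

```python
# Estimate fundamental frequency from one voiced frame via autocorrelation:
# the first strong peak at a plausible lag gives the period. Input is a
# synthetic, harmonic-rich 120 Hz square wave rather than real speech.
import numpy as np

fs = 16000
t = np.arange(int(0.05 * fs)) / fs                 # 50 ms frame
frame = np.sign(np.sin(2 * np.pi * 120 * t))       # 120 Hz square wave

ac = np.correlate(frame, frame, mode="full")[len(frame) - 1:]
lo, hi = int(fs / 400), int(fs / 60)               # search lags for 60-400 Hz
lag = lo + int(np.argmax(ac[lo:hi]))
print(f"estimated F0: {fs / lag:.1f} Hz")          # close to 120 Hz
```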
Affiliation(s)
- Katarzyna Pisanski
- Mammal Vocal Communication and Cognition Research Group, School of Psychology, University of Sussex, Brighton, UK; Institute of Psychology, University of Wrocław, Wrocław, Poland
- Valentina Cartei
- Mammal Vocal Communication and Cognition Research Group, School of Psychology, University of Sussex, Brighton, UK
- Carolyn McGettigan
- Royal Holloway Vocal Communication Laboratory, Department of Psychology, Royal Holloway, University of London, Egham, UK
- Jordan Raine
- Mammal Vocal Communication and Cognition Research Group, School of Psychology, University of Sussex, Brighton, UK
- David Reby
- Mammal Vocal Communication and Cognition Research Group, School of Psychology, University of Sussex, Brighton, UK
44
Abstract
Spoken conversations typically take place in noisy environments, and different kinds of masking sounds place differing demands on cognitive resources. Previous studies, examining the modulation of neural activity associated with the properties of competing sounds, have shown that additional speech streams engage the superior temporal gyrus. However, the absence of a condition in which target speech was heard without additional masking made it difficult to identify brain networks specific to masking and to ascertain the extent to which competing speech was processed equivalently to target speech. In this study, we scanned young healthy adults with continuous fMRI while they listened to stories masked by sounds that differed in their similarity to speech. We show that auditory attention and control networks are activated during attentive listening to masked speech in the absence of an overt behavioral task. We demonstrate that competing speech is processed predominantly in the left hemisphere within the same pathway as target speech, but is not treated equivalently within that stream, and that individuals who perform better on speech-in-noise tasks activate the left mid-posterior superior temporal gyrus more. Finally, we identify neural responses associated with the onset of sounds in the auditory environment; activity was found within right-lateralized frontal regions, consistent with a phasic alerting response. Taken together, these results provide a comprehensive account of the neural processes involved in listening in noise.
Affiliation(s)
- Zarinah K Agnew
- University College London; University of California, San Francisco
45
Lima CF, Lavan N, Evans S, Agnew Z, Halpern AR, Shanmugalingam P, Meekings S, Boebinger D, Ostarek M, McGettigan C, Warren JE, Scott SK. Feel the Noise: Relating Individual Differences in Auditory Imagery to the Structure and Function of Sensorimotor Systems. Cereb Cortex 2015; 25:4638-50. [PMID: 26092220 PMCID: PMC4816805 DOI: 10.1093/cercor/bhv134] [Citation(s) in RCA: 39] [Impact Index Per Article: 4.3] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/16/2022] Open
Abstract
Humans can generate mental auditory images of voices or songs, sometimes perceiving them almost as vividly as perceptual experiences. The functional networks supporting auditory imagery have been described, but less is known about the systems associated with interindividual differences in auditory imagery. Combining voxel-based morphometry and fMRI, we examined the structural basis of interindividual differences in how auditory images are subjectively perceived, and explored associations between auditory imagery, sensory-based processing, and visual imagery. Vividness of auditory imagery correlated with gray matter volume in the supplementary motor area (SMA), parietal cortex, medial superior frontal gyrus, and middle frontal gyrus. An analysis of functional responses to different types of human vocalizations revealed that the SMA and parietal sites that predict imagery are also modulated by sound type. Using representational similarity analysis, we found that higher representational specificity of heard sounds in SMA predicts vividness of imagery, indicating a mechanistic link between sensory- and imagery-based processing in sensorimotor cortex. Vividness of imagery in the visual domain also correlated with SMA structure, and with auditory imagery scores. Altogether, these findings provide evidence for a signature of imagery in brain structure, and highlight a common role of perceptual–motor interactions for processing heard and internally generated auditory information.
Affiliation(s)
- César F Lima
- Institute of Cognitive Neuroscience, University College London, London, UK; Center for Psychology, University of Porto, Porto, Portugal
- Nadine Lavan
- Institute of Cognitive Neuroscience, University College London, London, UK; Department of Psychology, Royal Holloway, University of London, London, UK
- Zarinah Agnew
- Institute of Cognitive Neuroscience, University College London, London, UK; Department of Otolaryngology, University of California, San Francisco, USA
- Carolyn McGettigan
- Institute of Cognitive Neuroscience, University College London, London, UK; Department of Psychology, Royal Holloway, University of London, London, UK
- Jane E Warren
- Faculty of Brain Sciences, University College London, London, UK
46
Adank P, McGettigan C, Kotz SAE. Editorial: Current research and emerging directions on the cognitive and neural organization of speech processing. Front Hum Neurosci 2015; 9:305. [PMID: 26074806 PMCID: PMC4444830 DOI: 10.3389/fnhum.2015.00305] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Key Words] [Track Full Text] [Download PDF] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/16/2015] [Accepted: 05/12/2015] [Indexed: 12/02/2022] Open
Affiliation(s)
- Patti Adank
- Division of Psychology and Language Sciences, Speech, Hearing and Phonetic Sciences, University College London, London, UK
- Sonja A E Kotz
- Max Planck Institute, Leipzig, Germany; School of Psychological Sciences, University of Manchester, Manchester, UK
47
McGettigan C. The social life of voices: studying the neural bases for the expression and perception of the self and others during spoken communication. Front Hum Neurosci 2015; 9:129. [PMID: 25852517 PMCID: PMC4365687 DOI: 10.3389/fnhum.2015.00129] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.9] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Key Words] [Track Full Text] [Download PDF] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/09/2014] [Accepted: 02/25/2015] [Indexed: 11/24/2022] Open
Affiliation(s)
- Carolyn McGettigan
- Department of Psychology, Royal Holloway, University of London, Egham, UK
48
Abstract
Laughter is often considered to be the product of humour. However, laughter is a social emotion, occurring most often in interactions, where it is associated with bonding, agreement, affection, and emotional regulation. Laughter is underpinned by complex neural systems, allowing it to be used flexibly. In humans and chimpanzees, social (voluntary) laughter is distinctly different from evoked (involuntary) laughter, a distinction which is also seen in brain imaging studies of laughter.
Affiliation(s)
- Nadine Lavan
- Department of Psychology, Royal Holloway, University of London, UK
- Sinead Chen
- Institute of Cognitive Neuroscience, UCL, London, UK
49
Agnew Z, van de Koot H, McGettigan C, Scott S. Do sentences with unaccusative verbs involve syntactic movement? Evidence from neuroimaging. Lang Cogn Neurosci 2014; 29:1035-1045. [PMID: 25210717 PMCID: PMC4151820 DOI: 10.1080/23273798.2014.887125] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.8] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Key Words] [Grants] [Track Full Text] [Figures] [Subscribe] [Scholar Register] [Received: 09/12/2012] [Accepted: 12/23/2013] [Indexed: 06/03/2023]
Abstract
This study focuses on the neural processing of English sentences containing unergative, unaccusative and transitive verbs. We demonstrate common responses in bilateral superior temporal gyri in response to listening to sentences containing unaccusative and transitive verbs compared to unergative verbs; we did not detect any activation that was specific to unaccusatives. Our findings indicate that the neural processing of unaccusative and transitive verbs is highly similar, and very different from the processing of unergative verbs. We discuss the consequences of these results for the linguistic analysis of movement phenomena.
Affiliation(s)
- Z.K. Agnew
- Institute for Cognitive Neuroscience, UCL, 17 Queen Square, London WC1N 3AR, UK
- H. van de Koot
- Research Department of Linguistics, UCL, 2 Wakefield Street, London WC1N 1PF, UK
- C. McGettigan
- Institute for Cognitive Neuroscience, UCL, 17 Queen Square, London WC1N 3AR, UK
- S.K. Scott
- Institute for Cognitive Neuroscience, UCL, 17 Queen Square, London WC1N 3AR, UK
50
Lavan N, Lima CF, Harvey H, Scott SK, McGettigan C. I thought that I heard you laughing: Contextual facial expressions modulate the perception of authentic laughter and crying. Cogn Emot 2014; 29:935-44. [DOI: 10.1080/02699931.2014.957656] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.6] [Reference Citation Analysis] [What about the content of this article? (0)] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 10/24/2022]