1. Rathcke T, Smit E, Zheng Y, Canzi M. Perception of temporal structure in speech is influenced by body movement and individual beat perception ability. Atten Percept Psychophys 2024. PMID: 38769276; DOI: 10.3758/s13414-024-02893-8.
Abstract
The subjective experience of time flow in speech deviates from the sound acoustics in substantial ways. The present study focuses on the perceptual tendency to regularize time intervals, found in speech but not in other types of sounds with a similar temporal structure. We investigate to what extent individual beat perception ability is responsible for perceptual regularization and whether the effect can be eliminated through the involvement of body movement during listening. Participants performed a musical beat perception task and compared spoken sentences to their drumbeat-based versions, either after passive listening or after listening and moving along with the beat of the sentences. The results show that interval regularization prevails in listeners with low beat perception ability performing a passive listening task and is eliminated in an active listening task involving body movement. Body movement also helped to promote a veridical percept of temporal structure in speech at the group level. We suggest that body movement engages an internal timekeeping mechanism, promoting the fidelity of auditory encoding even in sounds of high temporal complexity and irregularity, such as natural speech.
Affiliation(s)
- Tamara Rathcke: Department of Linguistics, University of Konstanz, Konstanz, 78464, Baden-Württemberg, Germany
- Eline Smit: Department of Linguistics, University of Konstanz, Konstanz, 78464, Baden-Württemberg, Germany; The MARCS Institute for Brain, Behaviour and Development, Western Sydney University, Penrith, 2751, NSW, Australia
- Yue Zheng: Department of Psychology, University of York, York, YO10 5DD, UK; Department of Hearing Sciences, University of Nottingham, Nottingham, NG7 2RD, UK
- Massimiliano Canzi: Department of Linguistics, University of Konstanz, Konstanz, 78464, Baden-Württemberg, Germany
2. Cacciante L, Pregnolato G, Salvalaggio S, Federico S, Kiper P, Smania N, Turolla A. Language and gesture neural correlates: A meta-analysis of functional magnetic resonance imaging studies. Int J Lang Commun Disord 2024; 59:902-912. PMID: 37971416; DOI: 10.1111/1460-6984.12987.
Abstract
BACKGROUND: Humans often use co-speech gestures to promote effective communication. Attention has been paid to the cortical areas engaged in the processing of co-speech gestures.
AIMS: To investigate the neural network underpinning the processing of co-speech gestures and to observe whether there is a relationship between areas involved in language and gesture processing.
METHODS & PROCEDURES: We planned to include studies with neurotypical and/or stroke participants who underwent a bimodal task (i.e., processing of co-speech gestures with the related speech) and a unimodal task (i.e., speech or gesture alone) during a functional magnetic resonance imaging (fMRI) session. After a database search, abstract and full-text screening were conducted. Qualitative and quantitative data were extracted, and a meta-analysis was performed with the software GingerALE 3.0.2, with contrast analyses of uni- and bimodal tasks.
MAIN CONTRIBUTION: The database search produced 1024 records. After the screening process, 27 studies were included in the review, and data from 15 of them were quantitatively analysed through meta-analysis. The meta-analysis found three clusters with significant activation: the left middle frontal gyrus and inferior frontal gyrus, and the bilateral middle occipital gyrus and inferior temporal gyrus.
CONCLUSIONS: There is a close link at the neural level for the semantic processing of auditory and visual information during communication. These findings encourage the integration of co-speech gestures into aphasia treatment as a strategy to foster effective communication for people with aphasia.
WHAT THIS PAPER ADDS: What is already known on this subject: gestures are an integral part of human communication, and they may have a relationship with speech processing at the neural level. What this paper adds to the existing knowledge: during the processing of bi- and unimodal communication, areas related to semantic and multimodal processing are activated, suggesting a close link between co-speech gestures and spoken language at the neural level. Clinical implications: knowledge of the neural networks underlying gesture and speech processing will allow the adoption of model-based neurorehabilitation programs that foster recovery from aphasia by strengthening the specific functions of these brain networks.
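The core of the ALE (activation likelihood estimation) statistic that GingerALE computes is simple: each study's reported foci are smoothed with a 3D Gaussian to form a modeled activation map, and the per-study maps are combined as a probabilistic union. A minimal numpy sketch of that core idea follows; the grid size, smoothing width, and foci are hypothetical, and the real software adds empirically calibrated kernels and permutation-based thresholding.

```python
import numpy as np

def ma_map(shape, foci, sigma=2.0):
    """Modeled activation (MA) map for one study: a 3D Gaussian centred on
    each reported focus (voxel coordinates), combined by voxelwise max."""
    ii, jj, kk = np.indices(shape)
    maps = [np.exp(-((ii - x)**2 + (jj - y)**2 + (kk - z)**2) / (2 * sigma**2))
            for x, y, z in foci]
    return np.max(maps, axis=0)

def ale_map(shape, studies, sigma=2.0):
    """ALE statistic: the probabilistic union 1 - prod_i(1 - MA_i) over studies."""
    ale = np.zeros(shape)
    for foci in studies:
        ale = 1.0 - (1.0 - ale) * (1.0 - ma_map(shape, foci, sigma))
    return ale

# two hypothetical studies reporting foci in a toy 40x40x30 voxel grid
studies = [[(10, 12, 8)], [(11, 12, 8), (30, 25, 20)]]
ale = ale_map((40, 40, 30), studies)
print("peak ALE:", ale.max().round(3))  # highest where the studies' foci overlap
```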
Affiliation(s)
- Luisa Cacciante: Laboratory of Healthcare Innovation Technology, IRCCS San Camillo Hospital, Venice, Italy
- Giorgia Pregnolato: Laboratory of Healthcare Innovation Technology, IRCCS San Camillo Hospital, Venice, Italy
- Silvia Salvalaggio: Laboratory of Computational Neuroimaging, IRCCS San Camillo Hospital, Venice, Italy; Padova Neuroscience Center, Università degli Studi di Padova, Padua, Italy
- Sara Federico: Laboratory of Healthcare Innovation Technology, IRCCS San Camillo Hospital, Venice, Italy
- Pawel Kiper: Laboratory of Healthcare Innovation Technology, IRCCS San Camillo Hospital, Venice, Italy
- Nicola Smania: Department of Neurosciences, Biomedicine and Movement Sciences, University of Verona, Verona, Italy
- Andrea Turolla: Department of Biomedical and Neuromotor Sciences (DIBINEM), Alma Mater Studiorum Università di Bologna, Bologna, Italy; Unit of Occupational Medicine, IRCCS Azienda Ospedaliero-Universitaria di Bologna, Bologna, Italy
3. Nirme J, Gulz A, Haake M, Gullberg M. Early or synchronized gestures facilitate speech recall - a study based on motion capture data. Front Psychol 2024; 15:1345906. PMID: 38596333; PMCID: PMC11002957; DOI: 10.3389/fpsyg.2024.1345906.
Abstract
Introduction: Temporal coordination between speech and gestures has been thoroughly studied in natural production. In most cases gesture strokes precede or coincide with the stressed syllable in words that they are semantically associated with.
Methods: To understand whether processing of speech and gestures is attuned to such temporal coordination, we investigated the effect of delaying, preposing, or eliminating individual gestures on the memory for words in an experimental study in which 83 participants watched video sequences of naturalistic 3D-animated speakers generated from motion capture data. A target word in the sequence appeared (a) with a gesture presented in its original position, synchronized with speech, (b) temporally shifted 500 ms before or (c) after the original position, or (d) with the gesture eliminated. Participants were asked to retell the videos in a free recall task. The strength of recall was operationalized as the inclusion of the target word in the free recall.
Results: Both eliminated and delayed gesture strokes resulted in reduced recall rates compared to synchronized strokes, whereas there was no difference between advanced (preposed) and synchronized strokes. An item-level analysis also showed that the greater the interval between the onsets of delayed strokes and stressed syllables in target words, the greater the negative effect was on recall.
Discussion: These results indicate that speech-gesture synchrony affects memory for speech, and that temporal patterns that are common in production lead to the best recall. Importantly, the study also showcases a procedure for using motion capture-based 3D-animated speakers to create an experimental paradigm for the study of speech-gesture comprehension.
Affiliation(s)
- Jens Nirme: Lund University Cognitive Science, Lund, Sweden
- Agneta Gulz: Lund University Cognitive Science, Lund, Sweden
- Marianne Gullberg: Centre for Languages and Literature and Lund University Humanities Lab, Lund University, Lund, Sweden
4. Ter Bekke M, Drijvers L, Holler J. Hand Gestures Have Predictive Potential During Conversation: An Investigation of the Timing of Gestures in Relation to Speech. Cogn Sci 2024; 48:e13407. PMID: 38279899; DOI: 10.1111/cogs.13407.
Abstract
During face-to-face conversation, transitions between speaker turns are incredibly fast. These fast turn exchanges seem to involve next speakers predicting upcoming semantic information, such that next-turn planning can begin before a current turn is complete. Given that face-to-face conversation also involves communicative bodily signals, an important question is how bodily signals such as co-speech hand gestures play into these processes of prediction and fast responding. In this corpus study, we found that hand gestures that depict or refer to semantic information started before the corresponding information in speech, which held both for the onset of the gesture as a whole and for the onset of the stroke (the most meaningful part of the gesture). This early timing potentially allows listeners to use the gestural information to predict the corresponding semantic information to be conveyed in speech. Moreover, we provided further evidence that questions with gestures received faster responses than questions without gestures. However, we found no evidence for the idea that how much a gesture precedes its lexical affiliate (i.e., its predictive potential) relates to how fast responses were given. The findings presented here highlight the importance of the temporal relation between speech and gesture and help to illuminate the potential mechanisms underpinning multimodal language processing during face-to-face conversation.
Affiliation(s)
- Marlijn Ter Bekke: Donders Institute for Brain, Cognition and Behaviour, Radboud University; Max Planck Institute for Psycholinguistics
- Linda Drijvers: Donders Institute for Brain, Cognition and Behaviour, Radboud University; Max Planck Institute for Psycholinguistics
- Judith Holler: Donders Institute for Brain, Cognition and Behaviour, Radboud University; Max Planck Institute for Psycholinguistics
5. Cairney BE, West SH, Haebig E, Cox CR, Lucas HD. Interpretations of meaningful and ambiguous hand gestures in autistic and non-autistic adults: A norming study. Behav Res Methods 2023. PMID: 38012511; DOI: 10.3758/s13428-023-02268-1.
Abstract
Gestures are ubiquitous in human communication, and a growing but inconsistent body of research suggests that people with autism spectrum disorder (ASD) may process co-speech gestures differently from neurotypical individuals. To facilitate research on this topic, we created a database of 162 gesture videos that have been normed for comprehensibility by both autistic and non-autistic raters. These videos portray an actor performing silent gestures that range from highly meaningful (e.g., iconic gestures) to ambiguous or meaningless. Each video was rated for meaningfulness and given a one-word descriptor by 40 autistic and 40 non-autistic adults, and analyses were conducted to assess the level of within- and across-group agreement. Across gestures, the meaningfulness ratings provided by raters with and without ASD correlated at r > 0.90, indicating a very high level of agreement. Overall, autistic raters produced a more diverse set of verbal labels for each gesture than did non-autistic raters. However, measures of within-gesture semantic similarity among the responses provided by each group did not differ, suggesting that increased variability within the ASD group may have occurred at the lexical rather than semantic level. This study is the first to compare gesture naming between autistic and non-autistic individuals, and the resulting dataset is the first gesture stimulus set for which both groups were equally represented in the norming process. This database also has broad applicability to other areas of research related to gesture processing and comprehension. The video database and accompanying norming data are available on the Open Science Framework.
Affiliation(s)
- Brianna E Cairney: Department of Psychology, Louisiana State University, 236 Audubon Hall, Baton Rouge, LA, 70803, USA
- Stanley H West: Department of Psychology, Louisiana State University, 236 Audubon Hall, Baton Rouge, LA, 70803, USA
- Eileen Haebig: Department of Communication Sciences & Disorders, Louisiana State University, Baton Rouge, LA, 70803, USA
- Christopher R Cox: Department of Psychology, Louisiana State University, 236 Audubon Hall, Baton Rouge, LA, 70803, USA
- Heather D Lucas: Department of Psychology, Louisiana State University, 236 Audubon Hall, Baton Rouge, LA, 70803, USA
6. Clough S, Padilla VG, Brown-Schmidt S, Duff MC. Intact speech-gesture integration in narrative recall by adults with moderate-severe traumatic brain injury. Neuropsychologia 2023; 189:108665. PMID: 37619936; PMCID: PMC10592037; DOI: 10.1016/j.neuropsychologia.2023.108665.
Abstract
PURPOSE: Real-world communication is situated in rich multimodal contexts, containing speech and gesture. Speakers often convey unique information in gesture that is not present in the speech signal (e.g., saying "He searched for a new recipe" while making a typing gesture). We examined the narrative retellings of participants with and without moderate-severe traumatic brain injury (TBI) across three timepoints over two online Zoom sessions to investigate whether people with TBI can integrate information from co-occurring speech and gesture and whether information from gesture persists across delays.
METHODS: Sixty participants with TBI and 60 non-injured peers watched videos of a narrator telling four short stories. On key details, the narrator produced complementary gestures that conveyed unique information. Participants retold the stories at three timepoints: immediately after, 20 minutes later, and one week later. We examined the words participants used when retelling these key details, coding them as a Speech Match (e.g., "He searched for a new recipe"), a Gesture Match (e.g., "He searched for a new recipe online"), or Other (e.g., "He looked for a new recipe"). We also examined whether participants produced representative gestures themselves when retelling these details.
RESULTS: Despite recalling fewer story details, participants with TBI were as likely as non-injured peers to report information from gesture in their narrative retellings. All participants were more likely to report information from gesture and to produce representative gestures themselves one week later compared to immediately after hearing the story.
CONCLUSION: We demonstrated that speech-gesture integration in narrative retellings is intact after TBI. This finding has exciting implications for the utility of gesture to support comprehension and memory after TBI and expands our understanding of naturalistic multimodal language processing in this population.
Affiliation(s)
- Sharice Clough: Department of Hearing and Speech Sciences, Vanderbilt University Medical Center, United States
- Victoria-Grace Padilla: Department of Hearing and Speech Sciences, Vanderbilt University Medical Center, United States
- Sarah Brown-Schmidt: Department of Psychology and Human Development, Vanderbilt University, United States
- Melissa C Duff: Department of Hearing and Speech Sciences, Vanderbilt University Medical Center, United States
7. Cavicchio F, Busà MG. The Role of Representational Gestures and Speech Synchronicity in Auditory Input by L2 and L1 Speakers. J Psycholinguist Res 2023; 52:1721-1735. PMID: 37171686; DOI: 10.1007/s10936-023-09947-2.
Abstract
Speech and gesture are two integrated and temporally coordinated systems. Manual gestures can help second language (L2) speakers with vocabulary learning and word retrieval. However, it remains under-investigated whether the synchronisation of speech and gesture helps listeners compensate for difficulties in processing L2 aural information. In this paper, we tested, in two behavioural experiments, how L2 speakers process speech and gesture asynchronies in comparison to native (L1) speakers. L2 speakers responded significantly faster when gestures and the semantically relevant speech were synchronous than when they were asynchronous, and they responded significantly slower than L1 speakers regardless of speech/gesture synchronisation. L1 speakers, on the other hand, showed no significant difference between asynchronous and synchronous integration of gestures and speech. We conclude that gesture-speech asynchrony affects L2 speakers more than L1 speakers.
Affiliation(s)
- Maria Grazia Busà: Dipartimento di Studi Linguistici e Letterari, Università degli Studi di Padova, Padova, Italy
8. Zhao W. TMS reveals a two-stage priming circuit of gesture-speech integration. Front Psychol 2023; 14:1156087. PMID: 37228338; PMCID: PMC10203497; DOI: 10.3389/fpsyg.2023.1156087.
Abstract
Introduction: Naturalistically, multisensory information from gesture and speech is intrinsically integrated to enable coherent comprehension. Such cross-modal semantic integration is temporally misaligned, with the onset of gesture preceding the relevant speech segment, and it has been proposed that gestures prime subsequent speech. However, questions remain regarding the roles and time courses of the two sources of information during integration.
Methods: In two between-subject experiments with healthy college students, we segmented the gesture-speech integration period into 40-ms time windows (TWs) based on two separate division criteria, while interrupting activity in the integration nodes, the left posterior middle temporal gyrus (pMTG) and the left inferior frontal gyrus (IFG), with double-pulse transcranial magnetic stimulation (TMS). In Experiment 1, we created fixed time-advances of gesture over speech and divided the TWs from the onset of speech. In Experiment 2, we differentiated the processing stages of gesture and speech and segmented the TWs relative to the speech lexical identification point (IP), with speech onset occurring at the gesture semantic discrimination point (DP).
Results: A TW-selective interruption of the pMTG and IFG was found only in Experiment 2, with the pMTG involved in TW1 (-120 to -80 ms relative to the speech IP), TW2 (-80 to -40 ms), TW6 (80 to 120 ms), and TW7 (120 to 160 ms), and the IFG involved in TW3 (-40 to 0 ms) and TW6. No significant disruption of gesture-speech integration was found in Experiment 1.
Discussion: We determined that after the representation of gesture has been established, gesture-speech integration occurs such that speech is first primed in a phonological processing stage before gestures are unified with speech to form a coherent meaning. Our findings provide new insights into multisensory speech and co-speech gesture integration by tracking the causal contributions of the two sources of information.
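To make the windowing concrete, here is a small sketch of how contiguous 40-ms time windows can be laid out around a reference event such as the speech identification point (IP); the seven windows and the three pre-IP windows mirror the TW1-TW7 scheme described above, but the function itself is illustrative, not the author's code.

```python
def time_windows(n=7, width=0.040, ref=0.0, n_pre=3):
    """Lay out `n` contiguous windows of `width` seconds around a reference
    time (e.g., the speech identification point at ref=0), with `n_pre`
    windows falling before it."""
    start = ref - n_pre * width
    return [(start + i * width, start + (i + 1) * width) for i in range(n)]

for i, (lo, hi) in enumerate(time_windows(), start=1):
    print(f"TW{i}: {lo * 1000:+.0f} to {hi * 1000:+.0f} ms")  # TW1: -120 to -80 ms, ...
```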
9. Liao Q, Gao M, Weng X, Hu Q. Processing different types of iconicity in Chinese transferred epithet comprehension: An ERP study. Front Psychol 2022; 13:1032029. PMID: 36619018; PMCID: PMC9813483; DOI: 10.3389/fpsyg.2022.1032029.
Abstract
Transferred epithet can be regarded as a reflection of semantic markedness, since the modifier and the modified conflict with each other and lead to semantic deviation; yet the corresponding processing mechanism remains understudied. The present study examined the neurocognitive mechanism of Chinese transferred epithet comprehension by employing the ERP technique from the perspective of the Iconicity of Markedness. Participants read materials with different types of semantic markedness, namely unmarked linguistic expressions (literal sentences) and marked linguistic expressions (transferred epithets), and then judged whether the targets were words or pseudo-words. In terms of semantic markedness, the targets were words reflecting the unmarked semantic meaning of literal sentences and the marked semantic meaning of transferred epithets, respectively. Target words following transferred epithets elicited a larger N400 and a smaller LPC than those following literal sentences. These results suggest that processing sentences with marked and unmarked iconicity involves different neural mechanisms, with the former requiring more cognitive effort to extract similarity features.
Affiliation(s)
- Qiaoyun Liao: Institute of Linguistics, Shanghai International Studies University, Shanghai, China
- Mengting Gao: Institute of Linguistics, Shanghai International Studies University, Shanghai, China
- Xin Weng: Institute of Linguistics, Shanghai International Studies University, Shanghai, China
- Quan Hu: School of English Studies, Sichuan International Studies University, Chongqing, China
10. Li M, Chen X, Zhu J, Chen F. Audiovisual Mandarin Lexical Tone Perception in Quiet and Noisy Contexts: The Influence of Visual Cues and Speech Rate. J Speech Lang Hear Res 2022; 65:4385-4403. PMID: 36269618; DOI: 10.1044/2022_jslhr-22-00024.
Abstract
PURPOSE: Drawing on the theory of embodied cognition, which proposes tight interactions between perception, motor processes, and cognition, this study tested the hypothesis that speech rate-altered Mandarin lexical tone perception in quiet and noisy environments is affected by bodily dynamic cross-modal information.
METHOD: Fifty-three adult listeners completed a Mandarin tone perception task with 720 tone stimuli in auditory-only (AO), auditory-facial (AF), and auditory-facial-plus-gestural (AFG) modalities, at fast, normal, and slow speech rates, under quiet and noisy conditions. In the AF and AFG modalities, both congruent and incongruent audiovisual information was presented. Generalized linear mixed-effects models were constructed to analyze the accuracy of tone perception across the different conditions.
RESULTS: In Mandarin tone perception, the enhancement from AF and AFG cues across the three speech rates was significantly larger than that from the AO cue in the adverse context of noise, yet additional metaphoric gestures did not differ significantly from the facial information alone. Furthermore, auditory tone perception at the fast speech rate was significantly better than at the normal speech rate when the auditory and visual channels were incongruent in quiet.
CONCLUSIONS: This study provides compelling evidence that integrated audiovisual information plays a vital role not only in improving lexical tone perception in noise but also in modulating the effects of speech rate on Mandarin tone perception in quiet for native listeners. Our findings, which support the theory of embodied cognition, have implications for speech and hearing rehabilitation in both young and old clinical populations.
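As a rough illustration of the kind of accuracy model described here, the sketch below fits a logistic regression with modality, noise, and rate as categorical predictors on simulated data. All variable names and effect sizes are invented, and statsmodels' plain `logit` omits the by-listener random effects of a true generalized linear mixed-effects model (which would be added in, e.g., lme4 or a Bayesian mixed GLM).

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(0)
n = 600
toy = pd.DataFrame({
    "modality": rng.choice(["AO", "AF", "AFG"], n),
    "rate": rng.choice(["fast", "normal", "slow"], n),
    "noise": rng.choice(["quiet", "noise"], n),
})
# invented effects: noise hurts accuracy; visual cues (AF/AFG) help most in noise
p = (0.85
     - 0.25 * (toy["noise"] == "noise")
     + 0.15 * ((toy["modality"] != "AO") & (toy["noise"] == "noise")))
toy["correct"] = (rng.random(n) < p).astype(int)

fit = smf.logit("correct ~ C(modality) * C(noise) + C(rate)", data=toy).fit()
print(fit.params)  # log-odds for each condition contrast
```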
Affiliation(s)
- Manhong Li: School of Foreign Languages, Hunan University, Changsha, China; School of Foreign Languages, Hunan First Normal University, Changsha, China
- Xiaoxiang Chen: School of Foreign Languages, Hunan University, Changsha, China
- Jiaqiang Zhu: Research Centre for Language, Cognition, and Neuroscience, Department of Chinese and Bilingual Studies, The Hong Kong Polytechnic University, China
- Fei Chen: School of Foreign Languages, Hunan University, Changsha, China
11. Trujillo JP, Levinson SC, Holler J. A multi-scale investigation of the human communication system's response to visual disruption. R Soc Open Sci 2022; 9:211489. PMID: 35425638; PMCID: PMC9006025; DOI: 10.1098/rsos.211489.
Abstract
In human communication, when speech is disrupted, the visual channel (e.g., manual gestures) can compensate to ensure successful communication. Whether speech also compensates when the visual channel is disrupted is an open question, and one that significantly bears on the status of the gestural modality. We test whether gesture and speech are dynamically co-adapted to meet communicative needs. To this end, we parametrically reduced visibility during casual conversational interaction and measured the effects on speakers' communicative behaviour using motion tracking and manual annotation for kinematic and acoustic analyses. We found that visual signalling effort was flexibly adapted in response to a decrease in visual quality (especially motion energy, gesture rate, size, velocity, and hold-time). Interestingly, speech was also affected: speech intensity increased in response to reduced visual quality (particularly in speech-gesture utterances, but independently of kinematics). Our findings highlight that multimodal communicative behaviours are flexibly adapted at multiple scales of measurement and question the notion that gesture plays an inferior role to speech.
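For readers unfamiliar with kinematic analysis of motion-tracking data, the sketch below computes the kind of low-level gesture features mentioned here (size, velocity, hold-time) from a wrist-position trajectory. The sampling rate, hold threshold, and trajectory are hypothetical, and the paper's actual feature definitions may differ.

```python
import numpy as np

def gesture_kinematics(positions, fs=60.0, hold_thresh=0.05):
    """Simple kinematic features from wrist positions (n_samples x 3, metres).

    fs          -- sampling rate of the motion tracker in Hz
    hold_thresh -- speed (m/s) below which the hand counts as held still
    """
    velocity = np.diff(positions, axis=0) * fs        # m/s per axis
    speed = np.linalg.norm(velocity, axis=1)          # scalar speed per sample
    size = np.ptp(positions, axis=0).max()            # largest spatial extent (m)
    hold_time = np.sum(speed < hold_thresh) / fs      # seconds near standstill
    return {"peak_speed": speed.max(), "size": size, "hold_time": hold_time}

# hypothetical 2-second wrist trajectory sampled at 60 Hz
t = np.linspace(0, 2, 120)
trajectory = np.column_stack([0.2 * np.sin(2 * np.pi * t), 0.1 * t, np.zeros_like(t)])
print(gesture_kinematics(trajectory))
```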
Affiliation(s)
- James P. Trujillo: Donders Institute for Brain, Cognition and Behaviour, Radboud University Nijmegen, The Netherlands; Max Planck Institute for Psycholinguistics, Wundtlaan 1, 6525 XD Nijmegen, The Netherlands
- Stephen C. Levinson: Max Planck Institute for Psycholinguistics, Wundtlaan 1, 6525 XD Nijmegen, The Netherlands
- Judith Holler: Donders Institute for Brain, Cognition and Behaviour, Radboud University Nijmegen, The Netherlands; Max Planck Institute for Psycholinguistics, Wundtlaan 1, 6525 XD Nijmegen, The Netherlands
12. Wu YC, Müller HM, Coulson S. Visuospatial Working Memory and Understanding Co-Speech Iconic Gestures: Do Gestures Help to Paint a Mental Picture? Discourse Process 2022. DOI: 10.1080/0163853x.2022.2028087.
Affiliation(s)
- Ying Choon Wu: Institute for Neural Computation, University of California, San Diego
- Horst M. Müller: Faculty of Linguistics and Literary Studies, Bielefeld University
- Seana Coulson: Cognitive Science Department, University of California, San Diego
13. Manfredi M, Boggio PS. Neural correlates of sex differences in communicative gestures and speech comprehension: A preliminary study. Soc Neurosci 2021; 16:653-667. PMID: 34697990; DOI: 10.1080/17470919.2021.1997800.
Abstract
The goal of this study was to investigate whether the semantic processing of the audiovisual combination of communicative gestures with speech differs between men and women. We recorded event-related brain potentials in women and men during the presentation of communicative gestures that were either congruent or incongruent with the speech. Our results showed that incongruent gestures elicited an N400 effect over frontal sites compared to congruent ones in both groups. Moreover, females showed an earlier N2 response to incongruent stimuli than to congruent ones, while a larger sustained negativity and late positivity in response to incongruent stimuli were observed only in males. These results suggest that women rapidly recognize and process audiovisual combinations of communicative gestures and speech (as early as 300 ms), whereas men analyze them at later stages of the process.
Affiliation(s)
- Mirella Manfredi: Department of Psychology, University of Zurich, Zurich, Switzerland
- Paulo Sergio Boggio: Social and Cognitive Neuroscience Laboratory, Center for Biological Science and Health, Mackenzie Presbyterian University, São Paulo, Brazil
14. The role of iconic gestures and mouth movements in face-to-face communication. Psychon Bull Rev 2021; 29:600-612. PMID: 34671936; PMCID: PMC9038814; DOI: 10.3758/s13423-021-02009-5.
Abstract
Human face-to-face communication is multimodal: it comprises speech as well as visual cues, such as articulatory and limb gestures. In the current study, we assess how iconic gestures and mouth movements influence audiovisual word recognition. We presented video clips of an actress uttering single words accompanied, or not, by more or less informative iconic gestures. For each word we also measured the informativeness of the mouth movements from a separate lipreading task. We manipulated whether gestures were congruent or incongruent with the speech, and whether the words were audible or noise-vocoded. The task was to decide whether the speech from the video matched a previously seen picture. We found that congruent iconic gestures aided word recognition, especially in the noise-vocoded condition, and the effect was larger (in terms of reaction times) for more informative gestures. Moreover, more informative mouth movements facilitated performance in challenging listening conditions when the speech was accompanied by gestures (either congruent or incongruent), suggesting an enhancement when both cues are present relative to just one. We also observed a trend whereby more informative mouth movements sped up word recognition across clarity conditions, but only when gestures were absent. We conclude that listeners use, and dynamically weight, the informativeness of gestures and mouth movements available during face-to-face communication.
15. Perniss P, Vinson D, Vigliocco G. Making Sense of the Hands and Mouth: The Role of "Secondary" Cues to Meaning in British Sign Language and English. Cogn Sci 2021; 44:e12868. PMID: 32619055; DOI: 10.1111/cogs.12868.
Abstract
Successful face-to-face communication involves multiple channels, notably hand gestures in addition to speech for spoken language, and mouth patterns in addition to manual signs for sign language. In four experiments, we assess the extent to which comprehenders of British Sign Language (BSL) and English rely, respectively, on cues from the hands and the mouth in accessing meaning. We created congruent and incongruent combinations of BSL manual signs and mouthings and English speech and gesture by video manipulation and asked participants to carry out a picture-matching task. When participants were instructed to pay attention only to the primary channel, incongruent "secondary" cues still affected performance, showing that these are reliably used for comprehension. When both cues were relevant, the languages diverged: Hand gestures continued to be used in English, but mouth movements did not in BSL. Moreover, non-fluent speakers and signers varied in the use of these cues: Gestures were found to be more important for non-native than native speakers; mouth movements were found to be less important for non-fluent signers. We discuss the results in terms of the information provided by different communicative channels, which combine to provide meaningful information.
Affiliation(s)
- David Vinson: Division of Psychology and Language Sciences, University College London
16. Kandana Arachchige KG, Simoes Loureiro I, Blekic W, Rossignol M, Lefebvre L. The Role of Iconic Gestures in Speech Comprehension: An Overview of Various Methodologies. Front Psychol 2021; 12:634074. PMID: 33995189; PMCID: PMC8118122; DOI: 10.3389/fpsyg.2021.634074.
Abstract
Iconic gesture-speech integration is a relatively recent field of investigation with numerous researchers studying its various aspects, and the results obtained are just as diverse. The definition of iconic gestures is often overlooked in the interpretation of results. Furthermore, while most behavioural studies have demonstrated an advantage of bimodal presentation, brain activity studies show a diversity of results regarding the brain regions involved in processing this integration. Clinical studies also yield mixed results, some suggesting parallel processing channels, others a unique and integrated channel. This review aims to draw attention to the methodological variations in research on iconic gesture-speech integration and how they impact conclusions about the underlying phenomena. It also attempts to draw together the findings from other relevant research and suggests potential areas for further investigation, in order to better understand the processes at play during gesture-speech integration.
Affiliation(s)
- Wivine Blekic: Cognitive Psychology and Neuropsychology, University of Mons, Mons, Belgium
- Mandy Rossignol: Cognitive Psychology and Neuropsychology, University of Mons, Mons, Belgium
- Laurent Lefebvre: Cognitive Psychology and Neuropsychology, University of Mons, Mons, Belgium
17. Manfredi M, Cohn N, Ribeiro B, Sanchez Pinho P, Fernandes Rodrigues Pereira E, Boggio PS. The electrophysiology of audiovisual processing in visual narratives in adolescents with autism spectrum disorder. Brain Cogn 2021; 151:105730. PMID: 33892434; DOI: 10.1016/j.bandc.2021.105730.
Abstract
We investigated the semantic processing of multimodal audiovisual combinations of visual narratives with auditory descriptive words and auditory sounds in individuals with autism spectrum disorder (ASD). To this aim, we recorded ERPs to critical auditory words and sounds associated with events in a visual narrative that were either semantically congruent or incongruent with the climactic visual event. A similar N400 effect was found in both adolescents with ASD and neurotypical adolescents (ages 9-16) when integrating different types of auditory information (i.e., words and sounds) into a visual narrative. This result suggests that verbal information processing in adolescents with ASD may be facilitated by direct association with meaningful visual information. In addition, we observed differences in the scalp distribution of later brain responses between ASD and neurotypical adolescents, suggesting that adolescents with ASD differ from neurotypical adolescents at later stages of processing the multimodal combination of visual narratives with auditory information. In conclusion, the semantic processing of verbal information, typically impaired in individuals with ASD, can be facilitated when that information is embedded in meaningful visual information.
Affiliation(s)
- Mirella Manfredi: Social and Cognitive Neuroscience Laboratory, Center for Biological Science and Health, Mackenzie Presbyterian University, São Paulo, Brazil; Department of Psychology, University of Zurich, Zurich, Switzerland
- Neil Cohn: Department of Communication and Cognition, Tilburg University, Tilburg, Netherlands
- Beatriz Ribeiro: Social and Cognitive Neuroscience Laboratory, Center for Biological Science and Health, Mackenzie Presbyterian University, São Paulo, Brazil
- Pamella Sanchez Pinho: Social and Cognitive Neuroscience Laboratory, Center for Biological Science and Health, Mackenzie Presbyterian University, São Paulo, Brazil
- Paulo Sergio Boggio: Social and Cognitive Neuroscience Laboratory, Center for Biological Science and Health, Mackenzie Presbyterian University, São Paulo, Brazil
18. Drijvers L, Jensen O, Spaak E. Rapid invisible frequency tagging reveals nonlinear integration of auditory and visual information. Hum Brain Mapp 2021; 42:1138-1152. PMID: 33206441; PMCID: PMC7856646; DOI: 10.1002/hbm.25282.
Abstract
During communication in real-life settings, the brain integrates information from auditory and visual modalities to form a unified percept of our environment. In the current magnetoencephalography (MEG) study, we used rapid invisible frequency tagging (RIFT) to generate steady-state evoked fields and investigated the integration of audiovisual information in a semantic context. We presented participants with videos of an actress uttering action verbs (auditory; tagged at 61 Hz) accompanied by a gesture (visual; tagged at 68 Hz, using a projector with a 1,440 Hz refresh rate). Integration difficulty was manipulated by lower-order auditory factors (clear/degraded speech) and higher-order visual factors (congruent/incongruent gesture). We identified MEG spectral peaks at the individual (61/68 Hz) tagging frequencies. We furthermore observed a peak at the intermodulation frequency of the auditorily and visually tagged signals (f_visual - f_auditory = 7 Hz), specifically when lower-order integration was easiest because signal quality was optimal. This intermodulation peak is a signature of nonlinear audiovisual integration and was strongest in the left inferior frontal gyrus and left temporal regions, areas known to be involved in speech-gesture integration. The enhanced power at the intermodulation frequency thus reflects the ease of lower-order audiovisual integration and demonstrates that speech-gesture information interacts in higher-order language areas. Furthermore, we provide a proof of principle of the use of RIFT to study the integration of audiovisual stimuli, in relation to, for instance, semantic context.
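The intermodulation logic is easy to demonstrate: only a nonlinear (here, multiplicative) combination of two tagged signals produces energy at the difference frequency f_visual - f_auditory = 7 Hz. The toy simulation below is not the authors' MEG pipeline; the sampling rate, duration, and interaction strength are invented.

```python
import numpy as np

fs, dur = 1000, 10.0                      # sampling rate (Hz) and duration (s)
t = np.arange(0, dur, 1 / fs)
f_aud, f_vis = 61.0, 68.0                 # tagging frequencies from the study
aud = np.sin(2 * np.pi * f_aud * t)
vis = np.sin(2 * np.pi * f_vis * t)

signals = {
    "linear (aud + vis)": aud + vis,                    # no interaction
    "nonlinear (+ aud*vis)": aud + vis + 0.3 * aud * vis,
}
freqs = np.fft.rfftfreq(t.size, 1 / fs)
bin_7hz = np.argmin(np.abs(freqs - (f_vis - f_aud)))    # difference frequency
for name, sig in signals.items():
    amp = np.abs(np.fft.rfft(sig)) / t.size
    print(f"{name}: amplitude at 7 Hz = {amp[bin_7hz]:.4f}")
# only the multiplicative mix shows a peak at 7 Hz (and at 129 Hz, the sum)
```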
Affiliation(s)
- Linda Drijvers: Donders Institute for Brain, Cognition, and Behaviour, Centre for Cognition, Montessorilaan 3, Radboud University, Nijmegen, The Netherlands; Max Planck Institute for Psycholinguistics, Nijmegen, The Netherlands
- Ole Jensen: School of Psychology, Centre for Human Brain Health, University of Birmingham, Birmingham, United Kingdom
- Eelke Spaak: Donders Institute for Brain, Cognition, and Behaviour, Centre for Cognitive Neuroimaging, Kapittelweg 29, Radboud University, Nijmegen, The Netherlands
19. Construing events first-hand: Gesture viewpoints interact with speech to shape the attribution and memory of agency. Mem Cognit 2021; 49:884-894. PMID: 33415717; DOI: 10.3758/s13421-020-01135-0.
Abstract
Beyond conveying objective content about objects and actions, what can co-speech iconic gestures reveal about a speaker's subjective relationship to that content? The present study explores this question by investigating how gesture viewpoints can inform a listener's construal of a speaker's agency. Forty native English speakers watched videos of an actor uttering sentences with different viewpoints (low agency or high agency) conveyed through both speech and gesture. Participants were asked to (1) rate the speaker's responsibility for the action described in each video (encoding task) and (2) complete a surprise memory test of the spoken sentences (recall task). For the encoding task, participants rated responsibility near ceiling when agency in speech was high, with a slight dip when accompanied by gestures of low agency. When agency in speech was low, responsibility ratings were raised markedly when accompanied by gestures of high agency. In the recall task, participants produced more incorrect recall of spoken agency when the viewpoints expressed through speech and gesture were inconsistent with one another. Our findings suggest that, beyond conveying objective content, co-speech iconic gestures can also guide listeners in gauging a speaker's agentic relationship to actions and events.
20. He Y, Luell S, Muralikrishnan R, Straube B, Nagels A. Gesture's body orientation modulates the N400 for visual sentences primed by gestures. Hum Brain Mapp 2020; 41:4901-4911. PMID: 32808721; PMCID: PMC7643362; DOI: 10.1002/hbm.25166.
Abstract
Body orientation of gesture conveys social-communicative intention and may thus influence how gestures are perceived and comprehended together with auditory speech during face-to-face communication. To date, despite the emergence of neuroscientific literature on the role of body orientation in hand action perception, few studies have directly investigated the role of body orientation in the interaction between gesture and language. To address this research question, we carried out an electroencephalography (EEG) experiment presenting participants (n = 21) with 5-s videos of frontal and lateral communicative hand gestures (e.g., raising a hand), followed by visually presented sentences that were either congruent or incongruent with the gesture (e.g., "the mountain is high/low…"). Participants underwent a semantic probe task, judging whether a target word was related or unrelated to the gesture-sentence event. EEG results suggest that, during the perception phase of hand gestures, while both frontal and lateral gestures elicited a power decrease in both the alpha (8-12 Hz) and beta (16-24 Hz) bands, lateral versus frontal gestures elicited a reduced power decrease in the beta band, source-located to the medial prefrontal cortex. For sentence comprehension, at the critical word whose meaning was congruent/incongruent with the gesture prime, frontal gestures elicited an N400 effect for gesture-sentence incongruency. More importantly, this incongruency effect was significantly reduced for lateral gestures. These findings suggest that body orientation plays an important role in gesture perception, and that its inferred social-communicative intention may influence gesture-language interaction at the semantic level.
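The alpha- and beta-band power changes reported here are typically computed as average spectral power within a frequency band. A minimal sketch using SciPy's Welch estimator follows; the simulated signal and parameters are placeholders, and the study's own time-frequency and source analyses are considerably more involved.

```python
import numpy as np
from scipy.signal import welch

def band_power(signal, fs, band):
    """Mean power spectral density within a frequency band (Welch estimate)."""
    freqs, psd = welch(signal, fs=fs, nperseg=2 * fs)
    lo, hi = band
    return psd[(freqs >= lo) & (freqs <= hi)].mean()

fs = 500                                   # hypothetical EEG sampling rate (Hz)
t = np.arange(0, 5, 1 / fs)
rng = np.random.default_rng(3)
eeg = (np.sin(2 * np.pi * 10 * t)          # alpha-band component
       + 0.5 * np.sin(2 * np.pi * 20 * t)  # beta-band component
       + rng.normal(0, 1, t.size))         # broadband noise

print("alpha (8-12 Hz):", band_power(eeg, fs, (8, 12)))
print("beta (16-24 Hz):", band_power(eeg, fs, (16, 24)))
```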
Affiliation(s)
- Yifei He: Department of Psychiatry and Psychotherapy, Philipps-University Marburg, Marburg, Germany
- Svenja Luell: Department of General Linguistics, Johannes-Gutenberg University Mainz, Mainz, Germany
- R. Muralikrishnan: Department of Neuroscience, Max Planck Institute for Empirical Aesthetics, Frankfurt, Germany
- Benjamin Straube: Department of Psychiatry and Psychotherapy, Philipps-University Marburg, Marburg, Germany
- Arne Nagels: Department of General Linguistics, Johannes-Gutenberg University Mainz, Mainz, Germany
21. Vigliocco G, Krason A, Stoll H, Monti A, Buxbaum LJ. Multimodal comprehension in left hemisphere stroke patients. Cortex 2020; 133:309-327. PMID: 33161278; PMCID: PMC8105917; DOI: 10.1016/j.cortex.2020.09.025.
Abstract
Hand gestures, imagistically related to the content of speech, are ubiquitous in face-to-face communication. Here we investigated the processing of speech accompanied by gestures in people with aphasia (PWA), using lesion-symptom mapping. Twenty-nine PWA and 15 matched controls were shown a picture of an object/action and then a video clip of a speaker producing speech and/or gestures in one of the following combinations: speech-only, gesture-only, congruent speech-gesture, and incongruent speech-gesture. Participants' task was to indicate, in different blocks, whether the picture and the word matched (speech task), or whether the picture and the gesture matched (gesture task). Multivariate lesion analysis with support vector regression lesion-symptom mapping (SVR-LSM) showed that the benefit for congruent speech-gesture was associated with (1) lesioned voxels in anterior fronto-temporal regions, including the inferior frontal gyrus (IFG), and sparing of posterior temporal cortex and lateral temporal-occipital regions (pTC/LTO) for the speech task, and (2) conversely, lesions to pTC/LTO and sparing of anterior regions for the gesture task. The two tasks did not share overlapping voxels. Costs from incongruent speech-gesture pairings were associated with lesioned voxels in these same anterior (for the speech task) and posterior (for the gesture task) regions, but crucially, also shared voxels in the superior temporal gyrus (STG) and middle temporal gyrus (MTG), including the anterior temporal lobe. These results suggest that the IFG and pTC/LTO contribute to extracting semantic information from speech and gesture, respectively; however, they are not causally involved in integrating information from the two modalities. In contrast, regions in anterior STG/MTG are associated with performance in both tasks and may thus be critical to speech-gesture integration. These conclusions are further supported by associations between performance in the experimental tasks and performance in tests assessing lexical-semantic processing and gesture recognition.
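Conceptually, SVR-LSM fits one multivariate support vector regression from binary lesion maps (one feature per voxel) to a continuous behavioural score, then reads the per-voxel weights back out as a statistical map. The sketch below shows only that core step on fabricated data; real SVR-LSM pipelines add lesion-volume correction and permutation-based thresholding.

```python
import numpy as np
from sklearn.svm import SVR

rng = np.random.default_rng(2)
n_patients, n_voxels = 29, 500
lesion = (rng.random((n_patients, n_voxels)) < 0.15).astype(float)  # toy lesion maps

weights_true = np.zeros(n_voxels)
weights_true[:20] = 1.0                      # voxels whose damage "matters"
score = lesion @ weights_true + rng.normal(0, 0.5, n_patients)      # behavioural score

svr = SVR(kernel="linear", C=1.0).fit(lesion, score)
beta = svr.coef_.ravel()                     # one weight per voxel: the LSM map
print("top voxels:", np.argsort(-np.abs(beta))[:5])  # mostly among the first 20
```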
Affiliation(s)
- Gabriella Vigliocco: Experimental Psychology, University College London, UK; Cognition and Action Laboratory, Moss Rehabilitation Research Institute, Elkins Park, PA, USA
- Anna Krason: Experimental Psychology, University College London, UK
- Harrison Stoll: Cognition and Action Laboratory, Moss Rehabilitation Research Institute, Elkins Park, PA, USA
- Laurel J Buxbaum: Cognition and Action Laboratory, Moss Rehabilitation Research Institute, Elkins Park, PA, USA
22. Momsen J, Gordon J, Wu YC, Coulson S. Verbal working memory and co-speech gesture processing. Brain Cogn 2020; 146:105640. PMID: 33171343; DOI: 10.1016/j.bandc.2020.105640.
Abstract
Multimodal discourse requires an assembly of cognitive processes that are uniquely recruited for language comprehension in social contexts. In this study, we investigated the role of verbal working memory for the online integration of speech and iconic gestures. Participants memorized and rehearsed a series of auditorily presented digits in low (one digit) or high (four digits) memory load conditions. To observe how verbal working memory load impacts online discourse comprehension, ERPs were recorded while participants watched discourse videos containing either congruent or incongruent speech-gesture combinations during the maintenance portion of the memory task. While expected speech-gesture congruity effects were found in the low memory load condition, high memory load trials elicited enhanced frontal positivities that indicated a unique interaction between online speech-gesture integration and the availability of verbal working memory resources. This work contributes to an understanding of discourse comprehension by demonstrating that language processing in a multimodal context is subject to the relationship between cognitive resource availability and the degree of controlled processing required for task performance. We suggest that verbal working memory is less important for speech-gesture integration than it is for mediating speech processing under high task demands.
Affiliation(s)
- Jacob Momsen: Joint Doctoral Program in Language and Communicative Disorders, San Diego State University and UC San Diego, United States
- Jared Gordon: Cognitive Science Department, UC San Diego, United States
- Ying Choon Wu: Swartz Center for Computational Neuroscience, UC San Diego, United States
- Seana Coulson: Joint Doctoral Program in Language and Communicative Disorders, San Diego State University and UC San Diego, United States; Cognitive Science Department, UC San Diego, United States
23. Morett LM, Landi N, Irwin J, McPartland JC. N400 amplitude, latency, and variability reflect temporal integration of beat gesture and pitch accent during language processing. Brain Res 2020; 1747:147059. PMID: 32818527; PMCID: PMC7493208; DOI: 10.1016/j.brainres.2020.147059.
Abstract
This study examines how across-trial (average) and trial-by-trial (variability in) amplitude and latency of the N400 event-related potential (ERP) reflect temporal integration of pitch accent and beat gesture. Thirty native English speakers viewed videos of a talker producing sentences with beat gesture co-occurring with a pitch accented focus word (synchronous), beat gesture co-occurring with the onset of a subsequent non-focused word (asynchronous), or the absence of beat gesture (no beat). Across trials, increased amplitude and earlier latency were observed when beat gesture was temporally asynchronous with pitch accenting than when it was temporally synchronous with pitch accenting or absent. Moreover, temporal asynchrony of beat gesture relative to pitch accent increased trial-by-trial variability of N400 amplitude and latency and influenced the relationship between across-trial and trial-by-trial N400 latency. These results indicate that across-trial and trial-by-trial amplitude and latency of the N400 ERP reflect temporal integration of beat gesture and pitch accent during language comprehension, supporting extension of the integrated systems hypothesis of gesture-speech processing and neural noise theories to focus processing in typical adult populations.
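To illustrate the distinction between across-trial (average) and trial-by-trial (variability) measures, the sketch below extracts a per-trial N400 amplitude and latency from single-electrode epochs and summarizes their means and standard deviations. The data, window, and peak definition are simplified placeholders for the study's actual single-trial pipeline.

```python
import numpy as np

def n400_measures(epochs, times, window=(0.3, 0.5)):
    """Per-trial N400 amplitude/latency plus across-trial summaries.

    epochs -- (n_trials, n_samples) voltage at one electrode, in microvolts
    times  -- (n_samples,) epoch time axis in seconds
    """
    mask = (times >= window[0]) & (times <= window[1])
    segment = epochs[:, mask]
    amp = segment.min(axis=1)                      # N400 is a negativity
    lat = times[mask][segment.argmin(axis=1)]      # time of each trial's minimum
    return {"mean_amp": amp.mean(), "amp_variability": amp.std(ddof=1),
            "mean_lat": lat.mean(), "lat_variability": lat.std(ddof=1)}

# fabricated data: 30 one-second epochs at 250 Hz with an N400-like dip at 400 ms
rng = np.random.default_rng(1)
times = np.linspace(0, 1, 250)
epochs = rng.normal(0, 2, (30, 250)) - 5 * np.exp(-((times - 0.4) ** 2) / 0.005)
print(n400_measures(epochs, times))
```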
Affiliation(s)
- Nicole Landi: Haskins Laboratories, University of Connecticut, United States
- Julia Irwin: Haskins Laboratories, Southern Connecticut State University, United States
24. Alviar C, Dale R, Dewitt A, Kello C. Multimodal Coordination of Sound and Movement in Music and Speech. Discourse Process 2020. DOI: 10.1080/0163853x.2020.1768500.
Affiliation(s)
- Camila Alviar: Cognitive and Information Sciences, University of California, Merced
- Rick Dale: Department of Communication, University of California, Los Angeles
- Akeiylah Dewitt: Cognitive and Information Sciences, University of California, Merced
- Christopher Kello: Cognitive and Information Sciences, University of California, Merced
25. Drijvers L, Özyürek A. Non-native Listeners Benefit Less from Gestures and Visible Speech than Native Listeners During Degraded Speech Comprehension. Lang Speech 2020; 63:209-220. PMID: 30795715; PMCID: PMC7254629; DOI: 10.1177/0023830919831311.
Abstract
Native listeners benefit from both visible speech and iconic gestures to enhance degraded speech comprehension (Drijvers & Özyürek, 2017). We tested how highly proficient non-native listeners benefit from these visual articulators compared to native listeners. We presented videos of an actress uttering a verb in clear, moderately, or severely degraded speech, while her lips were blurred, visible, or visible and accompanied by a gesture. Our results revealed that, unlike native listeners, non-native listeners were less likely to benefit from the combined enhancement of visible speech and gestures, especially as the benefit from visible speech alone was minimal when the signal quality was insufficient.
Affiliation(s)
- Linda Drijvers: Centre for Language Studies, Donders Institute for Brain, Cognition and Behaviour, Radboud University, Wundtlaan 1, Nijmegen, 6525 XD, The Netherlands
- Asli Özyürek: Centre for Language Studies, Radboud University, Nijmegen, The Netherlands; Donders Institute for Brain, Cognition, and Behaviour, Radboud University, Nijmegen, The Netherlands; Max Planck Institute for Psycholinguistics, Nijmegen, The Netherlands
Collapse
|
26
|
Joue G, Boven L, Willmes K, Evola V, Demenescu LR, Hassemer J, Mittelberg I, Mathiak K, Schneider F, Habel U. Metaphor processing is supramodal semantic processing: The role of the bilateral lateral temporal regions in multimodal communication. BRAIN AND LANGUAGE 2020; 205:104772. [PMID: 32126372 DOI: 10.1016/j.bandl.2020.104772] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 01/21/2019] [Revised: 01/26/2020] [Accepted: 02/09/2020] [Indexed: 06/10/2023]
Abstract
This paper presents an fMRI study on healthy adults' understanding of metaphors in multimodal communication. We investigated metaphors expressed either only in coverbal gestures ("monomodal metaphors") or in speech with accompanying gestures ("multimodal metaphors"). Monomodal metaphoric gestures convey metaphoric information not expressed in the accompanying speech (e.g., saying the non-metaphoric utterance, "She felt bad" while dropping down the hand with palm facing up; here, the gesture alone indicates metaphoricity), whereas coverbal gestures in multimodal metaphors indicate metaphoricity redundant to the speech (e.g., saying the metaphoric utterance, "Her spirits fell" while dropping the hand with palm facing up). In other words, in monomodal metaphors, gestures add information not spoken, whereas the gestures in multimodal metaphors can be redundant to the spoken content. Understanding and integrating the information in each modality, here spoken and visual, is important in multimodal communication, but most prior studies have only considered multimodal metaphors where the gesture is redundant to what is spoken. Our participants watched audiovisual clips of an actor speaking while gesturing. We found that abstract metaphor comprehension recruited the lateral superior/middle temporal cortices, regardless of the modality in which the conceptual metaphor was expressed. These results suggest that abstract metaphors, regardless of modality, involve resources implicated in general semantic processing and are consistent with the role of these areas in supramodal semantic processing as well as the theory of embodied cognition.
Affiliation(s)
- Gina Joue, Human Technology Center, RWTH Aachen University, Theaterplatz 14, 52056 Aachen, Germany; Department of Psychiatry, Psychotherapy and Psychosomatics, School of Medicine, RWTH Aachen University, Pauwelsstraße 30, 52074 Aachen, Germany; Department of Systems Neuroscience, University Medical Center Hamburg-Eppendorf, 20246 Hamburg, Germany
- Linda Boven, School of Medicine, RWTH Aachen University, Pauwelsstraße 30, 52074 Aachen, Germany
- Klaus Willmes, Section Neuropsychology, Department of Neurology, School of Medicine, RWTH Aachen University, Pauwelsstraße 30, 52074 Aachen, Germany
- Vito Evola, Human Technology Center, RWTH Aachen University, Theaterplatz 14, 52056 Aachen, Germany; Bonn-Aachen International Center for Information Technology, Dahlmannstraße 2, 53113 Bonn, Germany; Faculty of Social Sciences and Humanities, New University of Lisbon, Portugal
- Liliana R Demenescu, Department of Psychiatry, Psychotherapy and Psychosomatics, School of Medicine, RWTH Aachen University, Pauwelsstraße 30, 52074 Aachen, Germany
- Julius Hassemer, Human Technology Center, RWTH Aachen University, Theaterplatz 14, 52056 Aachen, Germany
- Irene Mittelberg, Human Technology Center, RWTH Aachen University, Theaterplatz 14, 52056 Aachen, Germany
- Klaus Mathiak, Department of Psychiatry, Psychotherapy and Psychosomatics, School of Medicine, RWTH Aachen University, Pauwelsstraße 30, 52074 Aachen, Germany; JARA, Translational Brain Medicine, 52425 Jülich, Germany; Institute of Neuroscience and Medicine (INM-1), Research Center Jülich, 52425 Jülich, Germany
- Frank Schneider, Department of Psychiatry, Psychotherapy and Psychosomatics, School of Medicine, RWTH Aachen University, Pauwelsstraße 30, 52074 Aachen, Germany
- Ute Habel, Department of Psychiatry, Psychotherapy and Psychosomatics, School of Medicine, RWTH Aachen University, Pauwelsstraße 30, 52074 Aachen, Germany

27
King JPJ, Loy JE, Rohde H, Corley M. Interpreting nonverbal cues to deception in real time. PLoS One 2020; 15:e0229486. [PMID: 32150573 PMCID: PMC7062244 DOI: 10.1371/journal.pone.0229486] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/15/2019] [Accepted: 01/28/2020] [Indexed: 11/18/2022] Open
Abstract
When questioning the veracity of an utterance, we perceive certain non-linguistic behaviours to indicate that a speaker is being deceptive. Recent work has highlighted that listeners’ associations between speech disfluency and dishonesty are detectable at the earliest stages of reference comprehension, suggesting that the manner of spoken delivery influences pragmatic judgements concurrently with the processing of lexical information. Here, we investigate the integration of a speaker’s gestures into judgements of deception, and ask if and when associations between nonverbal cues and deception emerge. Participants saw and heard a video of a potentially dishonest speaker describe treasure hidden behind an object, while also viewing images of both the named object and a distractor object. Their task was to click on the object behind which they believed the treasure to actually be hidden. Eye and mouse movements were recorded. Experiment 1 investigated listeners’ associations between visual cues and deception, using a variety of static and dynamic cues. Experiment 2 focused on adaptor gestures. We show that a speaker’s nonverbal behaviour can have a rapid and direct influence on listeners’ pragmatic judgements, supporting the idea that communication is fundamentally multimodal.
Affiliation(s)
- Josiah P. J. King, Department of Psychology, PPLS, University of Edinburgh, Edinburgh, Scotland, United Kingdom
- Jia E. Loy, Centre for Language Evolution, PPLS, University of Edinburgh, Edinburgh, Scotland, United Kingdom
- Hannah Rohde, Department of Linguistics and English Language, PPLS, University of Edinburgh, Edinburgh, Scotland, United Kingdom
- Martin Corley, Department of Psychology, PPLS, University of Edinburgh, Edinburgh, Scotland, United Kingdom

28
Abstract
Digitally animated characters are promising tools in research studying how we integrate information from speech and visual sources such as gestures because they allow specific gesture features to be manipulated in isolation. We present an approach combining motion capture and 3D-animated characters that allows us to manipulate natural individual gesture strokes for experimental purposes, for example to temporally shift and present gestures in ecologically valid sequences. We exemplify how such stimuli can be used in an experiment investigating implicit detection of speech–gesture (a)synchrony, and discuss the general applicability of the workflow for research in this domain.

29
Drijvers L, Vaitonytė J, Özyürek A. Degree of Language Experience Modulates Visual Attention to Visible Speech and Iconic Gestures During Clear and Degraded Speech Comprehension. Cogn Sci 2019; 43:e12789. [PMID: 31621126 PMCID: PMC6790953 DOI: 10.1111/cogs.12789] [Citation(s) in RCA: 21] [Impact Index Per Article: 4.2] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/22/2018] [Revised: 07/12/2019] [Accepted: 08/19/2019] [Indexed: 11/27/2022]
Abstract
Visual information conveyed by iconic hand gestures and visible speech can enhance speech comprehension under adverse listening conditions for both native and non-native listeners. However, how a listener allocates visual attention to these articulators during speech comprehension is unknown. We used eye-tracking to investigate whether and how native and highly proficient non-native listeners of Dutch allocated overt eye gaze to visible speech and gestures during clear and degraded speech comprehension. Participants watched video clips of an actress uttering a clear or degraded (6-band noise-vocoded) action verb while performing a gesture or not, and were asked to indicate the word they heard in a cued-recall task. Gestural enhancement was the largest (i.e., a relative reduction in reaction time cost) when speech was degraded for all listeners, but it was stronger for native listeners. Both native and non-native listeners mostly gazed at the face during comprehension, but non-native listeners gazed more often at gestures than native listeners. However, only native but not non-native listeners' gaze allocation to gestures predicted gestural benefit during degraded speech comprehension. We conclude that non-native listeners might gaze at gesture more as it might be more challenging for non-native listeners to resolve the degraded auditory cues and couple those cues to phonological information that is conveyed by visible speech. This diminished phonological knowledge might hinder the use of semantic information that is conveyed by gestures for non-native compared to native listeners. Our results demonstrate that the degree of language experience impacts overt visual attention to visual articulators, resulting in different visual benefits for native versus non-native listeners.
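The "6-band noise-vocoded" manipulation mentioned above can be illustrated with a standard noise-vocoding sketch: the speech is split into frequency bands, each band's amplitude envelope is extracted, and the envelopes modulate band-limited noise. The band edges, filter order, and 30 Hz envelope cutoff below are illustrative assumptions, not this study's stimulus parameters (and the sampling rate must exceed twice the highest band edge).

```python
import numpy as np
from scipy.signal import butter, sosfiltfilt

def noise_vocode(speech, fs, n_bands=6, f_lo=100.0, f_hi=8000.0, env_cutoff=30.0):
    """Minimal noise vocoder: keep band envelopes, replace fine structure with noise."""
    edges = np.logspace(np.log10(f_lo), np.log10(f_hi), n_bands + 1)  # log-spaced bands
    noise = np.random.randn(len(speech))
    env_sos = butter(4, env_cutoff, btype="low", fs=fs, output="sos")
    out = np.zeros_like(speech)
    for lo, hi in zip(edges[:-1], edges[1:]):
        band_sos = butter(4, [lo, hi], btype="band", fs=fs, output="sos")
        band = sosfiltfilt(band_sos, speech)
        env = np.clip(sosfiltfilt(env_sos, np.abs(band)), 0.0, None)  # amplitude envelope
        carrier = sosfiltfilt(band_sos, noise)                        # band-limited noise
        out += env * carrier
    return out / (np.max(np.abs(out)) + 1e-12)  # normalize to avoid clipping
```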
Affiliation(s)
- Linda Drijvers, Donders Institute for Brain, Cognition, and Behaviour, Radboud University
- Julija Vaitonytė, Department of Cognitive and Artificial Intelligence (School of Humanities and Digital Sciences), Tilburg University
- Asli Özyürek, Donders Institute for Brain, Cognition, and Behaviour, Radboud University; Centre for Language Studies, Radboud University; Max Planck Institute for Psycholinguistics

30
Drijvers L, van der Plas M, Özyürek A, Jensen O. Native and non-native listeners show similar yet distinct oscillatory dynamics when using gestures to access speech in noise. Neuroimage 2019; 194:55-67. [DOI: 10.1016/j.neuroimage.2019.03.032] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.8] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/09/2018] [Revised: 03/12/2019] [Accepted: 03/15/2019] [Indexed: 11/30/2022] Open

31
Holler J, Levinson SC. Multimodal Language Processing in Human Communication. Trends Cogn Sci 2019; 23:639-652. [PMID: 31235320 DOI: 10.1016/j.tics.2019.05.006] [Citation(s) in RCA: 106] [Impact Index Per Article: 21.2] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/07/2019] [Revised: 05/17/2019] [Accepted: 05/21/2019] [Indexed: 11/25/2022]
Abstract
The natural ecology of human language is face-to-face interaction comprising the exchange of a plethora of multimodal signals. Trying to understand the psycholinguistic processing of language in its natural niche raises new issues, first and foremost the binding of multiple, temporally offset signals under tight time constraints posed by a turn-taking system. This might be expected to overload and slow our cognitive system, but the reverse is in fact the case. We propose cognitive mechanisms that may explain this phenomenon and call for a multimodal, situated psycholinguistic framework to unravel the full complexities of human language processing.
Affiliation(s)
- Judith Holler, Max Planck Institute for Psycholinguistics, Nijmegen, The Netherlands; Donders Institute for Brain, Cognition and Behaviour, Radboud University Nijmegen, Nijmegen, The Netherlands
- Stephen C Levinson, Max Planck Institute for Psycholinguistics, Nijmegen, The Netherlands; Centre for Language Studies, Radboud University Nijmegen, Nijmegen, The Netherlands

32
Manfredi M, Cohn N, De Araújo Andreoli M, Boggio PS. Listening beyond seeing: Event-related potentials to audiovisual processing in visual narrative. BRAIN AND LANGUAGE 2018; 185:1-8. [PMID: 29986168 DOI: 10.1016/j.bandl.2018.06.008] [Citation(s) in RCA: 15] [Impact Index Per Article: 2.5] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 12/08/2017] [Revised: 06/28/2018] [Accepted: 06/28/2018] [Indexed: 06/08/2023]
Abstract
Every day we integrate meaningful information coming from different sensory modalities, and previous work has debated whether conceptual knowledge is represented in modality-specific neural stores specialized for specific types of information, and/or in an amodal, shared system. In the current study, we investigated semantic processing through a cross-modal paradigm which asked whether auditory semantic processing could be modulated by the constraints of context built up across a meaningful visual narrative sequence. We recorded event-related brain potentials (ERPs) to auditory words and sounds associated with events in visual narratives-i.e., seeing images of someone spitting while hearing either a word (Spitting!) or a sound (the sound of spitting)-which were either semantically congruent or incongruent with the climactic visual event. Our results showed that both incongruent sounds and words evoked an N400 effect; however, the distribution of the N400 effect to words (centro-parietal) differed from that of sounds (frontal). In addition, the N400 to words had an earlier latency than the N400 to sounds. Despite these differences, a sustained late frontal negativity followed the N400s and did not differ between modalities. These results support the idea that semantic memory balances a distributed cortical network accessible from multiple modalities, yet also engages amodal processing insensitive to specific modalities.
Affiliation(s)
- Mirella Manfredi, Social and Cognitive Neuroscience Laboratory, Center for Biological Science and Health, Mackenzie Presbyterian University, São Paulo, Brazil
- Neil Cohn, Tilburg Center for Cognition and Communication, Tilburg University, Tilburg, Netherlands
- Mariana De Araújo Andreoli, Social and Cognitive Neuroscience Laboratory, Center for Biological Science and Health, Mackenzie Presbyterian University, São Paulo, Brazil
- Paulo Sergio Boggio, Social and Cognitive Neuroscience Laboratory, Center for Biological Science and Health, Mackenzie Presbyterian University, São Paulo, Brazil

33
Drijvers L, Özyürek A, Jensen O. Alpha and Beta Oscillations Index Semantic Congruency between Speech and Gestures in Clear and Degraded Speech. J Cogn Neurosci 2018; 30:1086-1097. [DOI: 10.1162/jocn_a_01301] [Citation(s) in RCA: 15] [Impact Index Per Article: 2.5] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/04/2022]
Abstract
Previous work revealed that visual semantic information conveyed by gestures can enhance degraded speech comprehension, but the mechanisms underlying these integration processes under adverse listening conditions remain poorly understood. We used MEG to investigate how oscillatory dynamics support speech–gesture integration when integration load is manipulated by auditory (e.g., speech degradation) and visual semantic (e.g., gesture congruency) factors. Participants were presented with videos of an actress uttering an action verb in clear or degraded speech, accompanied by a matching (mixing gesture + “mixing”) or mismatching (drinking gesture + “walking”) gesture. In clear speech, alpha/beta power was more suppressed in the left inferior frontal gyrus and motor and visual cortices when integration load increased in response to mismatching versus matching gestures. In degraded speech, beta power was less suppressed over posterior STS and medial temporal lobe for mismatching compared with matching gestures, showing that integration load was lowest when speech was degraded and mismatching gestures could not be integrated and disambiguate the degraded signal. Our results thus provide novel insights on how low-frequency oscillatory modulations in different parts of the cortex support the semantic audiovisual integration of gestures in clear and degraded speech: When speech is clear, the left inferior frontal gyrus and motor and visual cortices engage because higher-level semantic information increases semantic integration load. When speech is degraded, posterior STS/middle temporal gyrus and medial temporal lobe are less engaged because integration load is lowest when visual semantic information does not aid lexical retrieval and speech and gestures cannot be integrated.
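Alpha/beta power suppression of this kind is typically estimated with a time-frequency decomposition. The sketch below implements one common choice, complex Morlet wavelet convolution, for a single channel and trial; the frequency grid, seven-cycle wavelets, and baseline interval are assumptions for illustration rather than details of this MEG study.

```python
import numpy as np

def morlet_power(signal, fs, freqs, n_cycles=7):
    """Time-frequency power via complex Morlet wavelet convolution.

    signal : 1D array (one channel, one trial)
    freqs  : frequencies in Hz, e.g. np.arange(8, 31) for the alpha/beta range
    """
    power = np.empty((len(freqs), len(signal)))
    for i, f in enumerate(freqs):
        sigma_t = n_cycles / (2 * np.pi * f)                 # wavelet width in seconds
        t = np.arange(-4 * sigma_t, 4 * sigma_t, 1 / fs)
        wavelet = np.exp(2j * np.pi * f * t) * np.exp(-t**2 / (2 * sigma_t**2))
        wavelet /= np.sqrt(np.sum(np.abs(wavelet) ** 2))     # energy normalization
        power[i] = np.abs(np.convolve(signal, wavelet, mode="same")) ** 2
    return power

# "Suppression" is usually expressed as relative change from a pre-stimulus
# baseline (here assumed to be the first 0.5 s of the epoch):
# rel = power / power[:, :int(0.5 * fs)].mean(axis=1, keepdims=True) - 1
```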
Affiliation(s)
- Asli Özyürek, Radboud University; Max Planck Institute for Psycholinguistics, Nijmegen, The Netherlands

34
He Y, Nagels A, Schlesewsky M, Straube B. The Role of Gamma Oscillations During Integration of Metaphoric Gestures and Abstract Speech. Front Psychol 2018; 9:1348. [PMID: 30104995 PMCID: PMC6077537 DOI: 10.3389/fpsyg.2018.01348] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.8] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/31/2017] [Accepted: 07/13/2018] [Indexed: 11/13/2022] Open
Abstract
Metaphoric (MP) co-speech gestures are commonly used during daily communication. They convey abstract information through clearly concrete gestural forms (e.g., raising a hand for "the level of the football game is high"). Understanding MP co-speech gestures therefore requires multisensory integration of abstract speech and concrete gestures at the semantic level. While semantic gesture-speech integration has been extensively investigated using functional magnetic resonance imaging, evidence from electroencephalography (EEG) is rare. In the current study, we conducted an EEG experiment investigating the processing of MP vs. iconic (IC) co-speech gestures in different contexts, in order to reveal the oscillatory signature of MP gesture integration. German participants (n = 20) viewed video clips of an actor performing both types of gestures, accompanied by either comprehensible German or incomprehensible Russian (R) speech, or speaking German sentences without any gestures. Time-frequency analysis of the EEG data showed that, when gestures were accompanied by comprehensible German speech, MP gestures elicited decreased gamma band power (50–70 Hz) between 500 and 700 ms over parietal electrodes compared to IC gestures, and the source of this effect was localized to the right middle temporal gyrus. This difference is likely to reflect integration processes, as it was reduced in the R language and no-gesture conditions. Our findings provide the first empirical evidence for a functional relationship between gamma band oscillations and higher-level semantic processes in a multisensory setting.
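One standard way (not necessarily the authors') to quantify a band-limited effect such as the 50–70 Hz gamma decrease between 500 and 700 ms is to band-pass filter the signal and square its Hilbert envelope. The filter order, single-channel input, and window arguments below are assumptions; the sampling rate must exceed twice the upper band edge.

```python
import numpy as np
from scipy.signal import butter, sosfiltfilt, hilbert

def gamma_power(eeg, fs, band=(50.0, 70.0), window=(0.5, 0.7), t0=0.0):
    """Mean gamma-band power in a post-stimulus window via the Hilbert envelope.

    eeg : 1D array, one channel of one epoch; t0 is the epoch start in seconds.
    """
    sos = butter(4, band, btype="band", fs=fs, output="sos")
    envelope = np.abs(hilbert(sosfiltfilt(sos, eeg)))  # instantaneous amplitude
    t = t0 + np.arange(len(eeg)) / fs
    mask = (t >= window[0]) & (t <= window[1])         # 500-700 ms of interest
    return np.mean(envelope[mask] ** 2)
```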
Affiliation(s)
- Yifei He, Translational Neuroimaging Lab, Department of Psychiatry and Psychotherapy, Marburg Center for Mind, Brain and Behavior, Philipps-University Marburg, Marburg, Germany
- Arne Nagels, Department of General Linguistics, Johannes Gutenberg University Mainz, Mainz, Germany
- Matthias Schlesewsky, School of Psychology, Social Work and Social Policy, University of South Australia, Adelaide, SA, Australia
- Benjamin Straube, Translational Neuroimaging Lab, Department of Psychiatry and Psychotherapy, Marburg Center for Mind, Brain and Behavior, Philipps-University Marburg, Marburg, Germany

35
Perniss P. Why We Should Study Multimodal Language. Front Psychol 2018; 9:1109. [PMID: 30002643 PMCID: PMC6032889 DOI: 10.3389/fpsyg.2018.01109] [Citation(s) in RCA: 22] [Impact Index Per Article: 3.7] [Reference Citation Analysis] [Key Words] [Track Full Text] [Download PDF] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/15/2017] [Accepted: 06/11/2018] [Indexed: 12/21/2022] Open
Affiliation(s)
- Pamela Perniss, School of Humanities, University of Brighton, Brighton, United Kingdom

36
Hilverman C, Clough SA, Duff MC, Cook SW. Patients with hippocampal amnesia successfully integrate gesture and speech. Neuropsychologia 2018; 117:332-338. [PMID: 29932960 DOI: 10.1016/j.neuropsychologia.2018.06.012] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/01/2018] [Revised: 06/12/2018] [Accepted: 06/15/2018] [Indexed: 10/28/2022]
Abstract
During conversation, people integrate information from co-speech hand gestures with information in spoken language. For example, after hearing the sentence, "A piece of the log flew up and hit Carl in the face" while viewing a gesture directed at the nose, people tend to later report that the log hit Carl in the nose (information only in gesture) rather than in the face (information in speech). The cognitive and neural mechanisms that support the integration of gesture with speech are unclear. One possibility is that the hippocampus - known for its role in relational memory and information integration - is necessary for integrating gesture and speech. To test this possibility, we examined how patients with hippocampal amnesia and healthy and brain-damaged comparison participants express information from gesture in a narrative retelling task. Participants watched videos of an experimenter telling narratives that included hand gestures containing supplementary information. Participants were asked to retell the narratives, and their spoken retellings were assessed for the presence of information from gesture. For features that had been accompanied by supplementary gesture, patients with amnesia retold fewer of these features overall and produced fewer retellings that matched the speech from the narrative. Yet their retellings included features conveying information that had been present uniquely in gesture, in amounts that were not reliably different from those of the comparison groups. Thus, a functioning hippocampus is not necessary for gesture-speech integration over short timescales. Providing unique information in gesture may enhance communication for individuals with declarative memory impairment, possibly via non-declarative memory mechanisms.
Affiliation(s)
- Caitlin Hilverman, Department of Hearing and Speech Sciences, Vanderbilt University Medical Center, Nashville, United States
- Sharice A Clough, Department of Communication Sciences and Disorders, University of Iowa, Iowa City, IA, United States
- Melissa C Duff, Department of Hearing and Speech Sciences, Vanderbilt University Medical Center, Nashville, United States
- Susan Wagner Cook, DeLTA Center, University of Iowa, Iowa City, IA, United States; Department of Psychological and Brain Sciences, University of Iowa, Iowa City, IA, United States

37
Drijvers L, Özyürek A. Native language status of the listener modulates the neural integration of speech and iconic gestures in clear and adverse listening conditions. BRAIN AND LANGUAGE 2018; 177-178:7-17. [PMID: 29421272 DOI: 10.1016/j.bandl.2018.01.003] [Citation(s) in RCA: 23] [Impact Index Per Article: 3.8] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 01/28/2017] [Revised: 01/05/2018] [Accepted: 01/15/2018] [Indexed: 06/08/2023]
Abstract
Native listeners neurally integrate iconic gestures with speech, which can enhance degraded speech comprehension. However, it is unknown how non-native listeners neurally integrate speech and gestures, as they might process visual semantic context differently than natives. We recorded EEG while native and highly-proficient non-native listeners watched videos of an actress uttering an action verb in clear or degraded speech, accompanied by a matching ('to drive'+driving gesture) or mismatching gesture ('to drink'+mixing gesture). Degraded speech elicited an enhanced N400 amplitude compared to clear speech in both groups, revealing an increase in neural resources needed to resolve the spoken input. A larger N400 effect was found in clear speech for non-natives compared to natives, but in degraded speech only for natives. Non-native listeners might thus process gesture more strongly than natives when speech is clear, but need more auditory cues to facilitate access to gestural semantic information when speech is degraded.
Affiliation(s)
- Linda Drijvers, Radboud University, Centre for Language Studies, Erasmusplein 1, 6525 HT Nijmegen, The Netherlands; Radboud University, Donders Institute for Brain, Cognition, and Behaviour, Montessorilaan 3, 6525 HR Nijmegen, The Netherlands
- Asli Özyürek, Radboud University, Centre for Language Studies, Erasmusplein 1, 6525 HT Nijmegen, The Netherlands; Radboud University, Donders Institute for Brain, Cognition, and Behaviour, Montessorilaan 3, 6525 HR Nijmegen, The Netherlands; Max Planck Institute for Psycholinguistics, Wundtlaan 1, 6525 XD Nijmegen, The Netherlands

38
Drijvers L, Özyürek A, Jensen O. Hearing and seeing meaning in noise: Alpha, beta, and gamma oscillations predict gestural enhancement of degraded speech comprehension. Hum Brain Mapp 2018; 39:2075-2087. [PMID: 29380945 PMCID: PMC5947738 DOI: 10.1002/hbm.23987] [Citation(s) in RCA: 40] [Impact Index Per Article: 6.7] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/25/2017] [Revised: 01/09/2018] [Accepted: 01/19/2018] [Indexed: 11/10/2022] Open
Abstract
During face‐to‐face communication, listeners integrate speech with gestures. The semantic information conveyed by iconic gestures (e.g., a drinking gesture) can aid speech comprehension in adverse listening conditions. In this magnetoencephalography (MEG) study, we investigated the spatiotemporal neural oscillatory activity associated with gestural enhancement of degraded speech comprehension. Participants watched videos of an actress uttering clear or degraded speech, accompanied by a gesture or not and completed a cued‐recall task after watching every video. When gestures semantically disambiguated degraded speech comprehension, an alpha and beta power suppression and a gamma power increase revealed engagement and active processing in the hand‐area of the motor cortex, the extended language network (LIFG/pSTS/STG/MTG), medial temporal lobe, and occipital regions. These observed low‐ and high‐frequency oscillatory modulations in these areas support general unification, integration and lexical access processes during online language comprehension, and simulation of and increased visual attention to manual gestures over time. All individual oscillatory power modulations associated with gestural enhancement of degraded speech comprehension predicted a listener's correct disambiguation of the degraded verb after watching the videos. Our results thus go beyond the previously proposed role of oscillatory dynamics in unimodal degraded speech comprehension and provide first evidence for the role of low‐ and high‐frequency oscillations in predicting the integration of auditory and visual information at a semantic level.
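The claim that oscillatory modulations predicted correct disambiguation is the kind of relationship a trial-level classifier can probe. The sketch below is a hypothetical setup with simulated data: the three predictor columns, the effect weights (negative for alpha/beta suppression, positive for gamma increase, mirroring the direction of the reported effects), and the cross-validation scheme are all invented for illustration and do not reproduce the study's statistics.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)

# Hypothetical trial-level predictors: alpha, beta, and gamma power change.
X = rng.normal(size=(200, 3))
# Simulated accuracy labels driven by those predictors plus noise.
y = ((X @ np.array([-0.8, -0.6, 0.9]) + rng.normal(size=200)) > 0).astype(int)

model = LogisticRegression()
scores = cross_val_score(model, X, y, cv=5)  # out-of-sample accuracy
print(f"cross-validated accuracy: {scores.mean():.2f}")
```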
Affiliation(s)
- Linda Drijvers, Radboud University, Centre for Language Studies, Erasmusplein 1, 6525 HT, Nijmegen, The Netherlands; Radboud University, Donders Institute for Brain, Cognition, and Behaviour, Montessorilaan 3, 6525 HR, Nijmegen, The Netherlands
- Asli Özyürek, Radboud University, Centre for Language Studies, Erasmusplein 1, 6525 HT, Nijmegen, The Netherlands; Radboud University, Donders Institute for Brain, Cognition, and Behaviour, Montessorilaan 3, 6525 HR, Nijmegen, The Netherlands; Max Planck Institute for Psycholinguistics, Wundtlaan 1, 6525 XD, Nijmegen, The Netherlands
- Ole Jensen, School of Psychology, Centre for Human Brain Health, University of Birmingham, Hills Building, Birmingham, B15 2TT, United Kingdom

39
Lönnqvist L, Loukusa S, Hurtig T, Mäkinen L, Siipo A, Väyrynen E, Palo P, Laukka S, Mämmelä L, Mattila ML, Ebeling H. How Young Adults with Autism Spectrum Disorder Watch and Interpret Pragmatically Complex Scenes. Q J Exp Psychol (Hove) 2017; 70:2331-2346. [DOI: 10.1080/17470218.2016.1233988] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.3] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/23/2023]
Abstract
The aim of the current study was to investigate subtle characteristics of social perception and interpretation in high-functioning individuals with autism spectrum disorders (ASDs), and to study the relation between watching and interpreting. As a novel approach, we combined moment-by-moment eye tracking with verbal assessment. Sixteen young adults with ASD and 16 neurotypical control participants watched a video depicting a complex communication situation while their eye movements were tracked. The participants also completed a verbal task with questions related to the pragmatic content of the video. We compared verbal task scores and eye movements between groups, and assessed correlations between task performance and eye movements. Individuals with ASD had more difficulty than the controls in interpreting the video, and during two short moments there were significant group differences in eye movements. Additionally, we found significant correlations between verbal task scores and moment-level eye movements in the ASD group, but not among the controls. We concluded that participants with ASD had slight difficulties in understanding the pragmatic content of the video stimulus and attending to social cues, and that the connection between pragmatic understanding and eye movements was more pronounced for participants with ASD than for neurotypical participants.
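A moment-by-moment group comparison of gaze, as described here, can be sketched as fixation proportions on an area of interest (AOI) per time bin, tested across groups. The 200 ms bins, the `in_aoi` predicate, the 60 Hz sampling rate, and the uncorrected per-bin t-tests are illustrative assumptions, not this study's analysis.

```python
import numpy as np
from scipy.stats import ttest_ind

def aoi_timecourse(gaze_xy, in_aoi, fs, bin_s=0.2):
    """Per-bin proportion of gaze samples inside an area of interest.

    gaze_xy : (n_samples, 2) screen coordinates for one participant
    in_aoi  : function mapping (n, 2) coordinates -> boolean array
    """
    hits = in_aoi(gaze_xy).astype(float)
    n_bin = int(bin_s * fs)
    n = (len(hits) // n_bin) * n_bin
    return hits[:n].reshape(-1, n_bin).mean(axis=1)  # one proportion per bin

# Group comparison, one (uncorrected) test per time bin, assuming hypothetical
# lists of gaze recordings and an in_face AOI predicate:
# asd = np.stack([aoi_timecourse(g, in_face, 60) for g in asd_gaze])
# ctl = np.stack([aoi_timecourse(g, in_face, 60) for g in control_gaze])
# t, p = ttest_ind(asd, ctl, axis=0)
```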
Affiliation(s)
- Linda Lönnqvist, Logopedics, Faculty of Humanities, and Child Language Research Center, University of Oulu, Oulu, Finland
- Soile Loukusa, Logopedics, Faculty of Humanities, and Child Language Research Center, University of Oulu, Oulu, Finland
- Tuula Hurtig, PEDEGO Research Unit, Child Psychiatry, University of Oulu, Oulu, Finland; Clinic of Child Psychiatry, Oulu University Hospital, Oulu, Finland; Neuroscience Research Unit, Psychiatry, University of Oulu, Oulu, Finland
- Leena Mäkinen, Logopedics, Faculty of Humanities, and Child Language Research Center, University of Oulu, Oulu, Finland
- Antti Siipo, Learning Research Laboratory, Research Unit of Psychology, Faculty of Education, University of Oulu, Oulu, Finland
- Eero Väyrynen, BME Research Group, Department of Computer Science and Engineering, University of Oulu, Oulu, Finland
- Pertti Palo, CASL Research Centre, Queen Margaret University, Edinburgh, UK
- Seppo Laukka, Learning Research Laboratory, Research Unit of Psychology, Faculty of Education, University of Oulu, Oulu, Finland
- Laura Mämmelä, Clinic of Child Psychiatry, Oulu University Hospital, Oulu, Finland; Department of Psychology, Faculty of Social Sciences, University of Jyväskylä, Jyväskylä, Finland
- Marja-Leena Mattila, PEDEGO Research Unit, Child Psychiatry, University of Oulu, Oulu, Finland; Clinic of Child Psychiatry, Oulu University Hospital, Oulu, Finland
- Hanna Ebeling, PEDEGO Research Unit, Child Psychiatry, University of Oulu, Oulu, Finland; Clinic of Child Psychiatry, Oulu University Hospital, Oulu, Finland

40
Congdon EL, Novack MA, Brooks N, Hemani-Lopez N, O'Keefe L, Goldin-Meadow S. Better together: Simultaneous presentation of speech and gesture in math instruction supports generalization and retention. LEARNING AND INSTRUCTION 2017; 50:65-74. [PMID: 29051690 PMCID: PMC5642925 DOI: 10.1016/j.learninstruc.2017.03.005] [Citation(s) in RCA: 28] [Impact Index Per Article: 4.0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/07/2023]
Abstract
When teachers gesture during instruction, children retain and generalize what they are taught (Goldin-Meadow, 2014). But why does gesture have such a powerful effect on learning? Previous research shows that children learn most from a math lesson when teachers present one problem-solving strategy in speech while simultaneously presenting a different, but complementary, strategy in gesture (Singer & Goldin-Meadow, 2005). One possibility is that gesture is powerful in this context because it presents information simultaneously with speech. Alternatively, gesture may be effective simply because it involves the body, in which case the timing of information presented in speech and gesture may be less important for learning. Here we find evidence for the importance of simultaneity: 3rd grade children retain and generalize what they learn from a math lesson better when given instruction containing simultaneous speech and gesture than when given instruction containing sequential speech and gesture. Interpreting these results in the context of theories of multimodal learning, we find that gesture capitalizes on its synchrony with speech to promote learning that lasts and can be generalized.

41
Manfredi M, Cohn N, Kutas M. When a hit sounds like a kiss: An electrophysiological exploration of semantic processing in visual narrative. BRAIN AND LANGUAGE 2017; 169:28-38. [PMID: 28242517 PMCID: PMC5465314 DOI: 10.1016/j.bandl.2017.02.001] [Citation(s) in RCA: 19] [Impact Index Per Article: 2.7] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Received: 08/26/2016] [Revised: 02/02/2017] [Accepted: 02/07/2017] [Indexed: 06/06/2023]
Abstract
Researchers have long questioned whether information presented through different sensory modalities involves distinct or shared semantic systems. We investigated uni-sensory cross-modal processing by recording event-related brain potentials to words replacing the climactic event in a visual narrative sequence (comics). We compared Onomatopoeic words, which phonetically imitate action sounds (Pow!), with Descriptive words, which describe an action (Punch!), that were (in)congruent within their sequence contexts. Across two experiments, larger N400s appeared to Anomalous Onomatopoeic or Descriptive critical panels than to their congruent counterparts, reflecting a difficulty in semantic access/retrieval. Also, Descriptive words evinced a greater late frontal positivity compared to Onomatopoeic words, suggesting that, though plausible, they may be less predictable/expected in visual narratives. Our results indicate that uni-sensory cross-modal integration of word/letter-symbol strings within visual narratives elicits ERP patterns typically observed for written sentence processing, thereby suggesting the engagement of similar domain-independent integration/interpretation mechanisms.
Affiliation(s)
- Mirella Manfredi, Department of Psychology, University of Milano-Bicocca, Milan, Italy; Social and Cognitive Neuroscience Laboratory, Center for Biological Science and Health, Mackenzie Presbyterian University, São Paulo, Brazil
- Neil Cohn, Department of Cognitive Science, University of California, San Diego, La Jolla, CA, USA; Tilburg Center for Cognition and Communication, Tilburg University, Tilburg, Netherlands
- Marta Kutas, Department of Cognitive Science, University of California, San Diego, La Jolla, CA, USA

42
Gunter TC, Weinbrenner JED. When to Take a Gesture Seriously: On How We Use and Prioritize Communicative Cues. J Cogn Neurosci 2017; 29:1355-1367. [PMID: 28358659 DOI: 10.1162/jocn_a_01125] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.9] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/04/2022]
Abstract
When people talk, their speech is often accompanied by gestures. Although it is known that co-speech gestures can influence face-to-face communication, it is currently unclear to what extent they are actively used and under which premises they are prioritized to facilitate communication. We investigated these open questions in two experiments that varied how pointing gestures disambiguate the utterances of an interlocutor. Participants, whose event-related brain responses were measured, watched a video in which an actress was interviewed about, for instance, classical literature (e.g., Goethe and Shakespeare). While responding, the actress pointed systematically to the left side to refer to, for example, Goethe, or to the right to refer to Shakespeare. Her final statement was ambiguous and combined with a pointing gesture. The P600 pattern found in Experiment 1 revealed that, when pointing was unreliable, gestures were only monitored for their cue validity and not used for reference tracking related to the ambiguity. However, when pointing was a valid cue (Experiment 2), it was used for reference tracking, as indicated by a reduced N400 for pointing. In summary, these findings suggest that a general prioritization mechanism is in use that constantly monitors and evaluates the use of communicative cues against communicative priors on the basis of accumulated error information.
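The proposed prioritization mechanism, monitoring a cue's validity against accumulated error information, can be given a toy formalization as a Beta-Bernoulli reliability tracker. This is our illustrative reading, not a model from the paper; the threshold idea in the final comment is likewise an assumption.

```python
# Toy cue-reliability tracker: a Beta-Bernoulli estimate of how often the
# pointing gesture has correctly disambiguated the referent so far.
class CueReliability:
    def __init__(self, a=1.0, b=1.0):
        self.a, self.b = a, b  # uniform prior over cue validity

    def update(self, cue_was_valid: bool):
        if cue_was_valid:
            self.a += 1.0
        else:
            self.b += 1.0

    @property
    def validity(self):
        return self.a / (self.a + self.b)  # expected probability the cue is right

tracker = CueReliability()
for outcome in [True, True, False, True]:  # hypothetical trial history
    tracker.update(outcome)
print(f"estimated cue validity: {tracker.validity:.2f}")  # 0.67 here

# A listener might rely on pointing for reference tracking only when the
# estimated validity exceeds some threshold (e.g., 0.6).
```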
Affiliation(s)
- Thomas C Gunter, Max Planck Institute for Human Cognitive and Brain Sciences, Leipzig, Germany

43

44
Peeters D, Snijders TM, Hagoort P, Özyürek A. Linking language to the visual world: Neural correlates of comprehending verbal reference to objects through pointing and visual cues. Neuropsychologia 2017; 95:21-29. [DOI: 10.1016/j.neuropsychologia.2016.12.004] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.6] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/05/2016] [Revised: 11/25/2016] [Accepted: 12/05/2016] [Indexed: 10/20/2022]

45
Vauclair J, Cochet H. La communication gestuelle : Une voie royale pour le développement du langage [Gestural communication: A royal road to language development]. ENFANCE 2016. [DOI: 10.3917/enf1.164.0419] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/14/2022]

46
Biau E, Morís Fernández L, Holle H, Avila C, Soto-Faraco S. Hand gestures as visual prosody: BOLD responses to audio–visual alignment are modulated by the communicative nature of the stimuli. Neuroimage 2016; 132:129-137. [DOI: 10.1016/j.neuroimage.2016.02.018] [Citation(s) in RCA: 25] [Impact Index Per Article: 3.1] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/24/2014] [Revised: 12/16/2015] [Accepted: 02/09/2016] [Indexed: 11/15/2022] Open

47
Giezen MR, Emmorey K. Semantic Integration and Age of Acquisition Effects in Code-Blend Comprehension. JOURNAL OF DEAF STUDIES AND DEAF EDUCATION 2016; 21:213-221. [PMID: 26657077 PMCID: PMC4886315 DOI: 10.1093/deafed/env056] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Received: 09/18/2015] [Revised: 11/09/2015] [Accepted: 11/09/2015] [Indexed: 06/05/2023]
Abstract
Semantic and lexical decision tasks were used to investigate the mechanisms underlying code-blend facilitation: the finding that hearing bimodal bilinguals comprehend signs in American Sign Language (ASL) and spoken English words more quickly when they are presented together simultaneously than when each is presented alone. More robust facilitation effects were observed for semantic decision than for lexical decision, suggesting that lexical integration of signs and words within a code-blend occurs primarily at the semantic level, rather than at the level of form. Early bilinguals exhibited greater facilitation effects than late bilinguals for English (the dominant language) in the semantic decision task, possibly because early bilinguals are better able to process early visual cues from ASL signs and use these to constrain English word recognition. Comprehension facilitation via semantic integration of words and signs is consistent with co-speech gesture research demonstrating facilitative effects of gesture integration on language comprehension.

48
Wu YC, Coulson S. Iconic Gestures Facilitate Discourse Comprehension in Individuals With Superior Immediate Memory for Body Configurations. Psychol Sci 2015; 26:1717-27. [DOI: 10.1177/0956797615597671] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/28/2014] [Accepted: 07/03/2015] [Indexed: 11/16/2022] Open
Abstract
To understand a speaker’s gestures, people may draw on kinesthetic working memory (KWM)—a system for temporarily remembering body movements. The present study explored whether sensitivity to gesture meaning was related to differences in KWM capacity. KWM was evaluated through sequences of novel movements that participants viewed and reproduced with their own bodies. Gesture sensitivity was assessed through a priming paradigm. Participants judged whether multimodal utterances containing congruent, incongruent, or no gestures were related to subsequent picture probes depicting the referents of those utterances. Individuals with low KWM were primarily inhibited by incongruent speech-gesture primes, whereas those with high KWM showed facilitation—that is, they were able to identify picture probes more quickly when preceded by congruent speech and gestures than by speech alone. Group differences were most apparent for discourse with weakly congruent speech and gestures. Overall, speech-gesture congruency effects were positively correlated with KWM abilities, which may help listeners match spatial properties of gestures to concepts evoked by speech.
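The positive correlation between congruency effects and KWM can be sketched in a few lines: per-participant facilitation (speech-alone RT minus congruent speech-gesture RT) correlated with KWM span. The numbers below are hypothetical toy values, and Pearson's r is one reasonable choice of statistic, not necessarily the authors'.

```python
import numpy as np
from scipy.stats import pearsonr

# Hypothetical per-participant mean reaction times (ms) and KWM spans.
rt_speech_only = np.array([720, 690, 750, 705, 735])
rt_congruent   = np.array([700, 640, 745, 665, 690])
kwm_span       = np.array([3, 5, 2, 4, 4])

facilitation = rt_speech_only - rt_congruent  # positive = gestures helped
r, p = pearsonr(kwm_span, facilitation)
print(f"r = {r:.2f}, p = {p:.3f}")
```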
Affiliation(s)
- Seana Coulson, Department of Cognitive Science, University of California, San Diego

49
Obermeier C, Gunter TC. Multisensory integration: the case of a time window of gesture-speech integration. J Cogn Neurosci 2015; 27:292-307. [PMID: 25061929 DOI: 10.1162/jocn_a_00688] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.8] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/04/2022]
Abstract
This experiment investigates the integration of gesture and speech from a multisensory perspective. In a disambiguation paradigm, participants were presented with short videos of an actress uttering sentences like "She was impressed by the BALL, because the GAME/DANCE...." The ambiguous noun (BALL) was accompanied by an iconic gesture fragment containing information to disambiguate the noun toward its dominant or subordinate meaning. We used four different temporal alignments between noun and gesture fragment: the identification point (IP) of the noun was either prior to (+120 msec), synchronous with (0 msec), or lagging behind the end of the gesture fragment (-200 and -600 msec). ERPs triggered to the IP of the noun showed significant differences for the integration of dominant and subordinate gesture fragments in the -200, 0, and +120 msec conditions. The outcome of this integration was revealed at the target words. These data suggest a time window for direct semantic gesture-speech integration ranging from at least -200 up to +120 msec. Although the -600 msec condition did not show any signs of direct integration at the homonym, significant disambiguation was found at the target word. An explorative analysis suggested that gesture information was directly integrated at the verb, indicating that there are multiple positions in a sentence where direct gesture-speech integration takes place. Ultimately, this implies that in natural communication, where a gesture lasts for some time, several aspects of that gesture will have their specific and possibly distinct impact on different positions in an utterance.
Affiliation(s)
- Christian Obermeier, Max Planck Institute for Human Cognitive and Brain Sciences, Leipzig, Germany

50