1. Rühlemann C, Barthel M. Word frequency and cognitive effort in turns-at-talk: turn structure affects processing load in natural conversation. Front Psychol 2024; 15:1208029. [PMID: 38899128 PMCID: PMC11186443 DOI: 10.3389/fpsyg.2024.1208029] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/18/2023] [Accepted: 05/13/2024] [Indexed: 06/21/2024] Open
Abstract
Frequency distributions are known to widely affect psycholinguistic processes. The effects of word frequency in turns-at-talk, the nucleus of social action in conversation, have, by contrast, been largely neglected. This study probes into this gap by applying corpus-linguistic methods to the conversational component of the British National Corpus (BNC) and the Freiburg Multimodal Interaction Corpus (FreMIC). The latter includes continuous pupil size measures of participants in the recorded conversations, allowing for a systematic investigation of patterns in the contained speech and language on the one hand and their relation to concurrent processing costs they may incur in speakers and recipients on the other hand. We test a first hypothesis in this vein, analyzing whether word frequency distributions within turns-at-talk are correlated with interlocutors' processing effort during the production and reception of these turns. Turns are found to generally show a regular distribution pattern of word frequency, with highly frequent words in turn-initial positions, mid-range frequency words in turn-medial positions, and low-frequency words in turn-final positions. Speakers' pupil size tends to increase during the course of a turn at talk, reaching a climax toward the turn end. Notably, the observed decrease in word frequency within turns is inversely correlated with the observed increase in pupil size in speakers, but not in recipients, with steeper decreases in word frequency going along with steeper increases in pupil size in speakers. We discuss the implications of these findings for theories of speech processing, turn structure, and information packaging. Crucially, we propose that the intensification of processing effort in speakers during a turn at talk is owed to an informational climax, which entails a progression from high-frequency, low-information words through intermediate levels to low-frequency, high-information words. At least in English conversation, interlocutors seem to make use of this pattern as one way to achieve efficiency in conversational interaction, creating a regularly recurring distribution of processing load across speaking turns, which aids smooth turn transitions, content prediction, and effective information transfer.
Affiliation(s)
- Mathias Barthel
- Pragmatics Department, Leibniz Institute for the German Language (IDS), Mannheim, Germany
2. Böttcher M, Zellers M. Do you say uh or uhm? A cross-linguistic approach to filler particle use in heritage and majority speakers across three languages. Front Psychol 2024; 15:1305862. [PMID: 38566943 PMCID: PMC10986790 DOI: 10.3389/fpsyg.2024.1305862] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/13/2023] [Accepted: 02/26/2024] [Indexed: 04/04/2024] Open
Abstract
Filler particles like uhm in English or ähm in German show subtle language-specific differences and their variation in form is related to socio-linguistic variables like gender. The use of fillers in a second language has been shown to differ from monolinguals' filler particle use in both frequency and form in different language contexts. This study investigates the language-specific use of filler particles by bilingual heritage speakers in both their languages, looking at the dominant majority language in the society and their minority heritage language spoken at home. This is done based on heritage Russian and German data and majority German and English data from the RUEG corpus. Language-specific fillers were extracted from the corpus and analyzed for their occurrence and segmental form. The frequency analysis suggests an influence of bilingualism, age group, and formality of the situation on the filler frequency across all languages. The number of filler particles is higher in formal, older, and bilingual speech. The form analysis reveals an effect of language and gender on the type of filler particle. The vocalic-nasal filler particles (e.g., uhm) are more frequently found in German and English and in female speech of these languages. Heritage speakers of Russian in contact with German and English show higher use of vocalic-nasal forms also in their Russian while producing similar gender-related patterns to monolingual speakers in both their languages. The higher frequency of filler particles in formal situations, in older speakers, and in bilingual speech is discussed in relation to cognitive load, which is assumed to be higher in these contexts, while speech style, which differs between situations and social groups, is also considered as an explanation. The higher use of vocalic-nasal filler particles in German and English suggests language-specific filler particle preferences also related to the socio-linguistic variable gender in these languages. The results from heritage speakers suggest an influence on filler particle form in their heritage language, while also revealing socio-linguistic usage patterns related to gender which are produced by heritage speakers similarly to monolinguals in their respective language.
Affiliation(s)
- Marlene Böttcher
- Department of General Linguistics and Phonetics, Institute for Scandinavian Studies, Frisian Studies and General Linguistics, Kiel University, Kiel, Germany
3. Corps RE. What do we know about the mechanisms of response planning in dialog? Psychology of Learning and Motivation 2023. [DOI: 10.1016/bs.plm.2023.02.002] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/29/2023]
4. Pouw W, Holler J. Timing in conversation is dynamically adjusted turn by turn in dyadic telephone conversations. Cognition 2022; 222:105015. [DOI: 10.1016/j.cognition.2022.105015] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/06/2021] [Revised: 01/04/2022] [Accepted: 01/05/2022] [Indexed: 11/03/2022]
5. Haiduk F, Fitch WT. Understanding Design Features of Music and Language: The Choric/Dialogic Distinction. Front Psychol 2022; 13:786899. [PMID: 35529579 PMCID: PMC9075586 DOI: 10.3389/fpsyg.2022.786899] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 09/30/2021] [Accepted: 02/22/2022] [Indexed: 12/03/2022] Open
Abstract
Music and spoken language share certain characteristics: both consist of sequences of acoustic elements that are combinatorically combined, and these elements partition the same continuous acoustic dimensions (frequency, formant space and duration). However, the resulting categories differ sharply: scale tones and note durations of small integer ratios appear in music, while speech uses phonemes, lexical tone, and non-isochronous durations. Why did music and language diverge into the two systems we have today, differing in these specific features? We propose a framework based on information theory and a reverse-engineering perspective, suggesting that design features of music and language are a response to their differential deployment along three different continuous dimensions. These include the familiar propositional-aesthetic ('goal') and repetitive-novel ('novelty') dimensions, and a dialogic-choric ('interactivity') dimension that is our focus here. Specifically, we hypothesize that music exhibits specializations enhancing coherent production by several individuals concurrently (the 'choric' context). In contrast, language is specialized for exchange in tightly coordinated turn-taking ('dialogic' contexts). We examine the evidence for our framework, both from humans and non-human animals, and conclude that many proposed design features of music and language follow naturally from their use in distinct dialogic and choric communicative contexts. Furthermore, the hybrid nature of intermediate systems like poetry, chant, or solo lament follows from their deployment in the less typical interactive context.
Affiliation(s)
- Felix Haiduk
- Department of Behavioral and Cognitive Biology, University of Vienna, Vienna, Austria
- W. Tecumseh Fitch
- Department of Behavioral and Cognitive Biology, University of Vienna, Vienna, Austria
- Vienna Cognitive Science Hub, University of Vienna, Vienna, Austria
6. Castellucci GA, Kovach CK, Howard MA, Greenlee JDW, Long MA. A speech planning network for interactive language use. Nature 2022; 602:117-122. [PMID: 34987226 PMCID: PMC9990513 DOI: 10.1038/s41586-021-04270-z] [Citation(s) in RCA: 33] [Impact Index Per Article: 16.5] [Received: 09/29/2020] [Accepted: 11/19/2021] [Indexed: 11/09/2022]
Abstract
During conversation, people take turns speaking by rapidly responding to their partners while simultaneously avoiding interruption. Such interactions display a remarkable degree of coordination, as gaps between turns are typically about 200 milliseconds, approximately the duration of an eyeblink. These latencies are considerably shorter than those observed in simple word-production tasks, which indicates that speakers often plan their responses while listening to their partners. Although a distributed network of brain regions has been implicated in speech planning, the neural dynamics underlying the specific preparatory processes that enable rapid turn-taking are poorly understood. Here we use intracranial electrocorticography to precisely measure neural activity as participants perform interactive tasks, and we observe a functionally and anatomically distinct class of planning-related cortical dynamics. We localize these responses to a frontotemporal circuit centred on the language-critical caudal inferior frontal cortex (Broca's region) and the caudal middle frontal gyrus, a region not normally implicated in speech planning. Using a series of motor tasks, we then show that this planning network is more active when preparing speech as opposed to non-linguistic actions. Finally, we delineate planning-related circuitry during natural conversation that is nearly identical to the network mapped with our interactive tasks, and we find this circuit to be most active before participant speech during unconstrained turn-taking. Therefore, we have identified a speech planning network that is central to natural language generation during social interaction.
Affiliation(s)
- Gregg A Castellucci
- NYU Neuroscience Institute and Department of Otolaryngology, New York University Langone Medical Center, New York, NY, USA
- Center for Neural Science, New York University, New York, NY, USA
- Matthew A Howard
- Department of Neurosurgery, University of Iowa, Iowa City, IA, USA
- Michael A Long
- NYU Neuroscience Institute and Department of Otolaryngology, New York University Langone Medical Center, New York, NY, USA
- Center for Neural Science, New York University, New York, NY, USA
7. Holler J, Alday PM, Decuyper C, Geiger M, Kendrick KH, Meyer AS. Competition Reduces Response Times in Multiparty Conversation. Front Psychol 2021; 12:693124. [PMID: 34603124 PMCID: PMC8481383 DOI: 10.3389/fpsyg.2021.693124] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 04/09/2021] [Accepted: 08/04/2021] [Indexed: 11/13/2022] Open
Abstract
Natural conversations are characterized by short transition times between turns. This holds in particular for multi-party conversations. The short turn transitions in everyday conversations contrast sharply with the much longer speech onset latencies observed in laboratory studies where speakers respond to spoken utterances. There are many factors that facilitate speech production in conversational compared to laboratory settings. Here we highlight one of them, the impact of competition for turns. In multi-party conversations, speakers often compete for turns. In quantitative corpus analyses of multi-party conversation, the fastest response determines the recorded turn transition time. In contrast, in dyadic conversations such competition for turns is much less likely to arise, and in laboratory experiments with individual participants it does not arise at all. Therefore, all responses tend to be recorded. Thus, competition for turns may reduce the recorded mean turn transition times in multi-party conversations for a simple statistical reason: slow responses are not included in the means. We report two studies illustrating this point. We first report the results of simulations showing how much the response times in a laboratory experiment would be reduced if, for each trial, instead of recording all responses, only the fastest responses of several participants responding independently on the trial were recorded. We then present results from a quantitative corpus analysis comparing turn transition times in dyadic and triadic conversations. There was no significant group size effect in question-response transition times, where the present speaker often selects the next one, thus reducing competition between speakers. But, as predicted, triads showed shorter turn transition times than dyads for the remaining turn transitions, where competition for the floor was more likely to arise. Together, these data show that turn transition times in conversation should be interpreted in the context of group size, turn transition type, and social setting.
Affiliation(s)
- Judith Holler
- Max Planck Institute for Psycholinguistics, Nijmegen, Netherlands; Donders Institute for Brain, Cognition, and Behavior, Radboud University, Nijmegen, Netherlands
- Phillip M Alday
- Max Planck Institute for Psycholinguistics, Nijmegen, Netherlands
- Caitlin Decuyper
- Max Planck Institute for Psycholinguistics, Nijmegen, Netherlands
- Mareike Geiger
- Max Planck Institute for Psycholinguistics, Nijmegen, Netherlands; Donders Institute for Brain, Cognition, and Behavior, Radboud University, Nijmegen, Netherlands
- Antje S Meyer
- Max Planck Institute for Psycholinguistics, Nijmegen, Netherlands; Donders Institute for Brain, Cognition, and Behavior, Radboud University, Nijmegen, Netherlands
8. Krause PA, Kawamoto AH. Predicting One's Turn With Both Body and Mind: Anticipatory Speech Postures During Dyadic Conversation. Front Psychol 2021; 12:684248. [PMID: 34326798 PMCID: PMC8315268 DOI: 10.3389/fpsyg.2021.684248] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 03/23/2021] [Accepted: 06/18/2021] [Indexed: 11/17/2022] Open
Abstract
In natural conversation, turns are handed off quickly, with the mean downtime commonly ranging from 7 to 423 ms. To achieve this, speakers plan their upcoming speech as their partner’s turn unfolds, holding the audible utterance in abeyance until socially appropriate. The role played by prediction is debated, with some researchers claiming that speakers predict upcoming speech opportunities, and others claiming that speakers wait for detection of turn-final cues. The dynamics of articulatory triggering may speak to this debate. It is often assumed that the prepared utterance is held in a response buffer and then initiated all at once. This assumption is consistent with standard phonetic models in which articulatory actions must follow tightly prescribed patterns of coordination. This assumption has recently been challenged by single-word production experiments in which participants partly positioned their articulators to anticipate upcoming utterances, long before starting the acoustic response. The present study considered whether similar anticipatory postures arise when speakers in conversation await their next opportunity to speak. We analyzed a pre-existing audiovisual database of dyads engaging in unstructured conversation. Video motion tracking was used to determine speakers’ lip areas over time. When utterance-initial syllables began with labial consonants or included rounded vowels, speakers produced distinctly smaller lip areas (compared to other utterances), prior to audible speech. This effect was moderated by the number of words in the upcoming utterance; postures arose up to 3,000 ms before acoustic onset for short utterances of 1–3 words. We discuss the implications for models of conversation and phonetic control.
Affiliation(s)
- Peter A Krause
- Department of Psychology, California State University Channel Islands, Camarillo, CA, United States; Department of Psychology, University of California, Santa Cruz, Santa Cruz, CA, United States
- Alan H Kawamoto
- Department of Psychology, University of California, Santa Cruz, Santa Cruz, CA, United States
9. Jongman SR. The attentional demands of combining comprehension and production in conversation. Psychology of Learning and Motivation 2021. [DOI: 10.1016/bs.plm.2021.02.003] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Indexed: 11/29/2022]