1. Long M, Moore I, Mollica F, Rubio-Fernandez P. Contrast perception as a visual heuristic in the formulation of referential expressions. Cognition 2021; 217:104879. PMID: 34418775. DOI: 10.1016/j.cognition.2021.104879.
Abstract
We hypothesize that contrast perception works as a visual heuristic, such that when speakers perceive a significant degree of contrast in a visual context, they tend to produce the corresponding adjective to describe a referent. The contrast perception heuristic supports efficient audience design, allowing speakers to produce referential expressions with minimum expenditure of cognitive resources, while facilitating the listener's visual search for the referent. We tested the perceptual contrast hypothesis in three language-production experiments. Experiment 1 revealed that speakers overspecify color adjectives in polychrome displays, whereas in monochrome displays they overspecify other properties that are contrastive. Further support for the contrast perception hypothesis comes from a re-analysis of previous work, which confirmed that color contrast elicits color overspecification when detected in a given display, but not when detected across monochrome trials. Experiment 2 revealed that even atypical colors (which are often overspecified) are only mentioned if there is color contrast. In Experiment 3, participants named a target color faster in monochrome than in polychrome displays, suggesting that the effect of color contrast is not analogous to ease of production. We conclude that the tendency to overspecify color in polychrome displays is not a bottom-up effect driven by the visual salience of color as a property, but possibly a learned communicative strategy. We discuss the implications of our account for pragmatic theories of referential communication and models of audience design, challenging the view that overspecification is a form of egocentric behavior.
Affiliation(s)
- Isabelle Moore
- Psychology Department, University of Virginia, United States of America
- Francis Mollica
- Informatics Department, University of Edinburgh, United Kingdom
- Paula Rubio-Fernandez
- Philosophy Department, University of Oslo, Norway; Department of Brain and Cognitive Sciences, Massachusetts Institute of Technology, United States of America.
2. Kuperberg GR. Tea With Milk? A Hierarchical Generative Framework of Sequential Event Comprehension. Top Cogn Sci 2021; 13:256-298. PMID: 33025701. PMCID: PMC7897219. DOI: 10.1111/tops.12518.
Abstract
To make sense of the world around us, we must be able to segment a continual stream of sensory inputs into discrete events. In this review, I propose that in order to comprehend events, we engage hierarchical generative models that "reverse engineer" the intentions of other agents as they produce sequential action in real time. By generating probabilistic predictions for upcoming events, generative models ensure that we are able to keep up with the rapid pace at which perceptual inputs unfold. By tracking our certainty about other agents' goals and the magnitude of prediction errors at multiple temporal scales, generative models enable us to detect event boundaries by inferring when a goal has changed. Moreover, by adapting flexibly to the broader dynamics of the environment and our own comprehension goals, generative models allow us to optimally allocate limited resources. Finally, I argue that we use generative models not only to comprehend events but also to produce events (carry out goal-relevant sequential action) and to continually learn about new events from our surroundings. Taken together, this hierarchical generative framework provides new insights into how the human brain processes events so effortlessly while highlighting the fundamental links between event comprehension, production, and learning.
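The boundary-detection mechanism described above — inferring that a goal has changed when prediction error spikes — can be illustrated with a deliberately simplified sketch. The running-average predictor, threshold, and data below are invented for illustration; the review itself proposes a far richer hierarchical scheme.

```python
# Toy sketch of event segmentation via prediction error (illustrative only):
# a simple observer predicts the next observation from a running average over
# the current event, and posits an event boundary when the error spikes.

def segment(observations, threshold):
    """Return indices where a new event is inferred to begin."""
    boundaries = []
    event = [observations[0]]
    for i, obs in enumerate(observations[1:], start=1):
        prediction = sum(event) / len(event)   # running-average prediction
        error = abs(obs - prediction)          # prediction error
        if error > threshold:                  # large error => infer goal change
            boundaries.append(i)
            event = [obs]                      # start a new event model
        else:
            event.append(obs)
    return boundaries

# Two "events" with clearly different statistics:
stream = [1.0, 1.2, 0.9, 1.1, 5.0, 5.2, 4.9]
print(segment(stream, threshold=2.0))  # boundary inferred at index 4
```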
Affiliation(s)
- Gina R. Kuperberg
- Department of Psychology and Center for Cognitive Science, Tufts University
- Department of Psychiatry and the Athinoula A. Martinos Center for Biomedical Imaging, Massachusetts General Hospital, Harvard Medical School
3.
Abstract
Audience design refers to the situation in which speakers fashion their utterances so as to cater to the needs of their addressees. In this article, a range of audience design effects are reviewed, organized by a novel cognitive framework for understanding audience design effects. Within this framework, feedforward (or one-shot) production is responsible for feedforward audience design effects, or effects based on already known properties of the addressee (e.g., child versus adult status) or the message (e.g., that it includes meanings that might be confusable). Then, a forward modeling approach is described, whereby speakers independently generate communicatively relevant features to predict potential communicative effects. This can explain recurrent processing audience design effects, or effects based on features of the produced utterance itself or on idiosyncratic features of the addressee or communicative situation. Predictions from the framework are delineated.
Affiliation(s)
- Victor S Ferreira
- Department of Psychology and Center for Research in Language, University of California, San Diego, La Jolla, California 92093, USA.
4. Buz E, Tanenhaus MK, Jaeger TF. Dynamically adapted context-specific hyper-articulation: Feedback from interlocutors affects speakers' subsequent pronunciations. J Mem Lang 2016; 89:68-86. PMID: 27375344. PMCID: PMC4927008. DOI: 10.1016/j.jml.2015.12.009.
Abstract
We ask whether speakers can adapt their productions when feedback from their interlocutors suggests that previous productions were perceptually confusable. To address this question, we use a novel web-based task-oriented paradigm for speech recording, in which participants produce instructions towards a (simulated) partner with naturalistic response times. We manipulate (1) whether a target word with a voiceless plosive (e.g., pill) occurs in the presence of a voiced competitor (bill) or an unrelated word (food) and (2) whether or not the simulated partner occasionally misunderstands the target word. Speakers hyper-articulated the target word when a voiced competitor was present. Moreover, the size of the hyper-articulation effect was nearly doubled when partners occasionally misunderstood the instruction. A novel type of distributional analysis further suggests that hyper-articulation did not change the target of production, but rather reduced the probability of perceptually ambiguous or confusable productions. These results were obtained in the absence of explicit clarification requests, and persisted across words and over trials. Our findings suggest that speakers adapt their pronunciations based on the perceived communicative success of their previous productions in the current environment. We discuss why speakers make adaptive changes to their speech and what mechanisms might underlie speakers' ability to do so.
Affiliation(s)
- Esteban Buz
- Department of Brain and Cognitive Sciences, University of Rochester, United States
- Michael K. Tanenhaus
- Department of Brain and Cognitive Sciences, University of Rochester, United States
- Department of Linguistics, University of Rochester, United States
- T. Florian Jaeger
- Department of Brain and Cognitive Sciences, University of Rochester, United States
- Department of Linguistics, University of Rochester, United States
- Department of Computer Science, University of Rochester, United States
5. Brown M, Kuperberg GR. A Hierarchical Generative Framework of Language Processing: Linking Language Perception, Interpretation, and Production Abnormalities in Schizophrenia. Front Hum Neurosci 2015; 9:643. PMID: 26640435. PMCID: PMC4661240. DOI: 10.3389/fnhum.2015.00643.
Abstract
Language and thought dysfunction are central to the schizophrenia syndrome. They are evident in the major symptoms of psychosis itself, particularly as disorganized language output (positive thought disorder) and auditory verbal hallucinations (AVHs), and they also manifest as abnormalities in both high-level semantic and contextual processing and low-level perception. However, the literatures characterizing these abnormalities have largely been separate and have sometimes provided mutually exclusive accounts of aberrant language in schizophrenia. In this review, we propose that recent generative probabilistic frameworks of language processing can provide crucial insights that link these four lines of research. We first outline neural and cognitive evidence that real-time language comprehension and production normally involve internal generative circuits that propagate probabilistic predictions to perceptual cortices - predictions that are incrementally updated based on prediction error signals as new inputs are encountered. We then explain how disruptions to these circuits may compromise communicative abilities in schizophrenia by reducing the efficiency and robustness of both high-level language processing and low-level speech perception. We also argue that such disruptions may contribute to the phenomenology of thought-disordered speech and false perceptual inferences in the language system (i.e., AVHs). This perspective suggests a number of productive avenues for future research that may elucidate not only the mechanisms of language abnormalities in schizophrenia, but also promising directions for cognitive rehabilitation.
Affiliation(s)
- Meredith Brown
- Department of Psychiatry–Athinoula A. Martinos Center for Biomedical Imaging, Massachusetts General Hospital, Charlestown, MA, USA
- Department of Psychology, Tufts University, Medford, MA, USA
- Gina R. Kuperberg
- Department of Psychiatry–Athinoula A. Martinos Center for Biomedical Imaging, Massachusetts General Hospital, Charlestown, MA, USA
- Department of Psychology, Tufts University, Medford, MA, USA
6. Kuperberg GR, Jaeger TF. What do we mean by prediction in language comprehension? Lang Cogn Neurosci 2015; 31:32-59. PMID: 27135040. PMCID: PMC4850025. DOI: 10.1080/23273798.2015.1102299.
Abstract
We consider several key aspects of prediction in language comprehension: its computational nature, the representational level(s) at which we predict, whether we use higher level representations to predictively pre-activate lower level representations, and whether we 'commit' in any way to our predictions, beyond pre-activation. We argue that the bulk of behavioral and neural evidence suggests that we predict probabilistically and at multiple levels and grains of representation. We also argue that we can, in principle, use higher level inferences to predictively pre-activate information at multiple lower representational levels. We also suggest that the degree and level of predictive pre-activation might be a function of the expected utility of prediction, which, in turn, may depend on comprehenders' goals and their estimates of the relative reliability of their prior knowledge and the bottom-up input. Finally, we argue that all these properties of language understanding can be naturally explained and productively explored within a multi-representational hierarchical actively generative architecture whose goal is to infer the message intended by the producer, and in which predictions play a crucial role in explaining the bottom-up input.
Affiliation(s)
- Gina R. Kuperberg
- Department of Psychology and Center for Cognitive Science, Tufts University
- Department of Psychiatry and the Athinoula A. Martinos Center for Biomedical Imaging, Massachusetts General Hospital, Harvard Medical School
- T. Florian Jaeger
- Department of Brain and Cognitive Sciences, Department of Computer Science, Department of Linguistics, University of Rochester
7. Kleinschmidt DF, Jaeger TF. Robust speech perception: recognize the familiar, generalize to the similar, and adapt to the novel. Psychol Rev 2015; 122:148-203. PMID: 25844873. PMCID: PMC4744792. DOI: 10.1037/a0038695.
Abstract
Successful speech perception requires that listeners map the acoustic signal to linguistic categories. These mappings are not only probabilistic, but change depending on the situation. For example, one talker's /p/ might be physically indistinguishable from another talker's /b/ (cf. lack of invariance). We characterize the computational problem posed by such a subjectively nonstationary world and propose that the speech perception system overcomes this challenge by (a) recognizing previously encountered situations, (b) generalizing to other situations based on previous similar experience, and (c) adapting to novel situations. We formalize this proposal in the ideal adapter framework: (a) to (c) can be understood as inference under uncertainty about the appropriate generative model for the current talker, thereby facilitating robust speech perception despite the lack of invariance. We focus on 2 critical aspects of the ideal adapter. First, in situations that clearly deviate from previous experience, listeners need to adapt. We develop a distributional (belief-updating) learning model of incremental adaptation. The model provides a good fit against known and novel phonetic adaptation data, including perceptual recalibration and selective adaptation. Second, robust speech recognition requires that listeners learn to represent the structured component of cross-situation variability in the speech signal. We discuss how these 2 aspects of the ideal adapter provide a unifying explanation for adaptation, talker-specificity, and generalization across talkers and groups of talkers (e.g., accents and dialects). The ideal adapter provides a guiding framework for future investigations into speech perception and adaptation, and more broadly language comprehension.
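The incremental belief-updating component of the ideal adapter can be sketched as a conjugate normal-normal update over a talker's category mean (e.g., voice onset time for /p/). All numbers below are illustrative assumptions, not parameters fit in the paper.

```python
# Minimal sketch of distributional belief updating (illustrative assumptions):
# the listener holds a Gaussian belief about a talker's mean VOT for /p/
# and updates it incrementally after each observed token.

def update_belief(prior_mean, prior_var, obs, noise_var):
    """Conjugate normal-normal update of a belief about a category mean."""
    k = prior_var / (prior_var + noise_var)  # gain: trust in the new token
    return prior_mean + k * (obs - prior_mean), (1 - k) * prior_var

# Hypothetical prior: /p/ has ~60 ms VOT; this talker produces shorter VOTs.
mean, var = 60.0, 100.0
for vot in [45.0, 48.0, 44.0, 46.0]:
    mean, var = update_belief(mean, var, vot, noise_var=50.0)

# The belief shifts toward the talker's statistics (mean falls below 50 ms)
# and sharpens (variance shrinks), mirroring perceptual recalibration.
```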
Affiliation(s)
- T Florian Jaeger
- Departments of Brain and Cognitive Sciences, Computer Science, and Linguistics, University of Rochester
8. Seyfarth S. Word informativity influences acoustic duration: Effects of contextual predictability on lexical representation. Cognition 2014; 133:140-155. DOI: 10.1016/j.cognition.2014.06.013.
9. Fine AB, Jaeger TF, Farmer TA, Qian T. Rapid Expectation Adaptation during Syntactic Comprehension. PLoS One 2013; 8:e77661. PMID: 24204909. PMCID: PMC3813674. DOI: 10.1371/journal.pone.0077661.
Abstract
When we read or listen to language, we are faced with the challenge of inferring intended messages from noisy input. This challenge is exacerbated by considerable variability between and within speakers. Focusing on syntactic processing (parsing), we test the hypothesis that language comprehenders rapidly adapt to the syntactic statistics of novel linguistic environments (e.g., speakers or genres). Two self-paced reading experiments investigate changes in readers' syntactic expectations based on repeated exposure to sentences with temporary syntactic ambiguities (so-called "garden path sentences"). These sentences typically lead to a clear expectation violation signature when the temporary ambiguity is resolved to an a priori less expected structure (e.g., based on the statistics of the lexical context). We find that comprehenders rapidly adapt their syntactic expectations to converge towards the local statistics of novel environments. Specifically, repeated exposure to a priori unexpected structures can reduce, and even completely undo, their processing disadvantage (Experiment 1). The opposite is also observed: a priori expected structures become less expected (even eliciting garden paths) in environments where they are hardly ever observed (Experiment 2). Our findings suggest that, when changes in syntactic statistics are to be expected (e.g., when entering a novel environment), comprehenders can rapidly adapt their expectations, thereby overcoming the processing disadvantage that mistaken expectations would otherwise cause. Our findings take a step towards unifying insights from research in expectation-based models of language processing, syntactic priming, and statistical learning.
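The adaptation result can be caricatured as updating smoothed structure frequencies from exposure: as the locally frequent structure accumulates counts, its surprisal (a standard index of processing difficulty) drops. The structure labels and counts below are hypothetical, not the paper's model.

```python
# Toy sketch of expectation adaptation via count updating (hypothetical values).
import math

def surprisal(counts, structure):
    """Surprisal in bits under add-one-smoothed relative frequencies."""
    total = sum(counts.values()) + len(counts)
    return -math.log2((counts[structure] + 1) / total)

# Hypothetical prior experience favoring one parse of a temporary ambiguity.
counts = {"expected": 90, "unexpected": 10}
before = surprisal(counts, "unexpected")

# Repeated exposure to the a priori unexpected structure in a new environment...
for _ in range(40):
    counts["unexpected"] += 1
after = surprisal(counts, "unexpected")
# ...lowers its surprisal, mirroring the shrinking garden-path penalty.
```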
Affiliation(s)
- Alex B. Fine
- Department of Brain and Cognitive Sciences, University of Rochester, Rochester, New York, United States of America
- T. Florian Jaeger
- Department of Brain and Cognitive Sciences, University of Rochester, Rochester, New York, United States of America
- Department of Computer Science, University of Rochester, Rochester, New York, United States of America
- Thomas A. Farmer
- Department of Psychology, University of Iowa, Iowa City, Iowa, United States of America
- Ting Qian
- Department of Brain and Cognitive Sciences, University of Rochester, Rochester, New York, United States of America
10. Wedel A, Jackson S, Kaplan A. Functional load and the lexicon: Evidence that syntactic category and frequency relationships in minimal lemma pairs predict the loss of phoneme contrasts in language change. Lang Speech 2013; 56:395-417. PMID: 24416963. DOI: 10.1177/0023830913489096.
Abstract
All languages use individually meaningless, contrastive categories in combination to create distinct words. Despite their central role in communication, these "phoneme" contrasts can be lost over the course of language change. The century-old functional load hypothesis proposes that loss of a phoneme contrast will be inhibited in relation to the work that it does in distinguishing words. In a previous work we showed for the first time that a simple measure of functional load does significantly predict patterns of contrast loss within a diverse set of languages: the more minimal word pairs that a phoneme contrast distinguishes, the less likely those phonemes are to have merged over the course of language change. Here, we examine several lexical properties that are predicted to influence the uncertainty between word pairs in usage. We present evidence that (a) the lemma rather than surface-form count of minimal pairs is more predictive of merger; (b) the count of minimal lemma pairs that share a syntactic category is a stronger predictor of merger than the count of those with divergent syntactic categories, and (c) that the count of minimal lemma pairs with members of similar frequency is a stronger predictor of merger than that of those with more divergent frequencies. These findings support the broad hypothesis that properties of individual utterances influence long-term language change, and are consistent with findings suggesting that phonetic cues are modulated in response to lexical uncertainty within utterances.
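The simple functional-load measure referred to above — the number of minimal pairs a phoneme contrast distinguishes — can be computed directly. The toy lexicon below is invented, and letters stand in for phonemes; the paper works with lemma counts from real corpora.

```python
# Toy functional-load computation (illustrative lexicon, letters as phonemes).

def minimal_pairs(lexicon, p1, p2):
    """Return word pairs differing only in p1 vs p2 at a single position."""
    words = sorted(lexicon)
    pairs = []
    for i, w1 in enumerate(words):
        for w2 in words[i + 1:]:
            if len(w1) != len(w2):
                continue
            diffs = [(a, b) for a, b in zip(w1, w2) if a != b]
            if len(diffs) == 1 and set(diffs[0]) == {p1, p2}:
                pairs.append((w1, w2))
    return pairs

lexicon = {"pill", "bill", "pat", "bat", "pin", "tin", "food"}
# /p/-/b/ does real disambiguating work here (pill/bill, pat/bat), so on the
# functional load hypothesis this contrast should resist merger.
print(minimal_pairs(lexicon, "p", "b"))
```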
Affiliation(s)
- Andrew Wedel
- Department of Linguistics, University of Arizona, AZ 85721, USA.
- Scott Jackson
- Center for Advanced Study of Language, University of Maryland, USA
- Abby Kaplan
- Department of Linguistics, University of Utah, USA