1
Huber E, Sauppe S, Isasi-Isasmendi A, Bornkessel-Schlesewsky I, Merlo P, Bickel B. Surprisal From Language Models Can Predict ERPs in Processing Predicate-Argument Structures Only if Enriched by an Agent Preference Principle. Neurobiology of Language 2024; 5:167-200. PMID: 38645615; PMCID: PMC11025647; DOI: 10.1162/nol_a_00121.
Abstract
Language models based on artificial neural networks increasingly capture key aspects of how humans process sentences. Most notably, model-based surprisals predict event-related potentials such as N400 amplitudes during parsing. Assuming that these models represent realistic estimates of human linguistic experience, their success in modeling language processing raises the possibility that the human processing system relies on no other principles than the general architecture of language models and on sufficient linguistic input. Here, we test this hypothesis on N400 effects observed during the processing of verb-final sentences in German, Basque, and Hindi. By stacking Bayesian generalised additive models, we show that, in each language, N400 amplitudes and topographies in the region of the verb are best predicted when model-based surprisals are complemented by an Agent Preference principle that transiently interprets initial role-ambiguous noun phrases as agents, leading to reanalysis when this interpretation fails. Our findings demonstrate the need for this principle independently of usage frequencies and structural differences between languages. The principle has an unequal force, however. Compared to surprisal, its effect is weakest in German, stronger in Hindi, and still stronger in Basque. This gradient is correlated with the extent to which grammars allow unmarked NPs to be patients, a structural feature that boosts reanalysis effects. We conclude that language models gain more neurobiological plausibility by incorporating an Agent Preference. Conversely, theories of human processing profit from incorporating surprisal estimates in addition to principles like the Agent Preference, which arguably have distinct evolutionary roots.
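Surprisal, the predictor at issue here, is simply the negative log probability a language model assigns to a word given its context. A minimal sketch of the quantity (the next-word distribution below is invented for illustration, not taken from the paper's models):

```python
import math

def surprisal(p):
    """Surprisal in bits of a word with conditional probability p."""
    return -math.log2(p)

# Hypothetical next-word probabilities at the verb position;
# less expected continuations carry higher surprisal.
next_word_probs = {"expected": 0.25, "plausible": 0.05, "surprising": 0.01}

for word, p in next_word_probs.items():
    print(f"{word}: {surprisal(p):.2f} bits")
```

Under the hypothesis tested in this paper, such values alone should predict N400 amplitudes; the authors' finding is that they do not suffice without the added Agent Preference term.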
Affiliation(s)
- Eva Huber
  - Department of Comparative Language Science, University of Zurich, Zurich, Switzerland
  - Center for the Interdisciplinary Study of Language Evolution, University of Zurich, Zurich, Switzerland
- Sebastian Sauppe
  - Department of Comparative Language Science, University of Zurich, Zurich, Switzerland
  - Center for the Interdisciplinary Study of Language Evolution, University of Zurich, Zurich, Switzerland
  - Department of Psychology, University of Zurich, Zurich, Switzerland
- Arrate Isasi-Isasmendi
  - Department of Comparative Language Science, University of Zurich, Zurich, Switzerland
  - Center for the Interdisciplinary Study of Language Evolution, University of Zurich, Zurich, Switzerland
- Ina Bornkessel-Schlesewsky
  - Cognitive Neuroscience Laboratory, Australian Research Centre for Interactive and Virtual Environments, University of South Australia, Adelaide, Australia
- Paola Merlo
  - Department of Linguistics, University of Geneva, Geneva, Switzerland
  - University Center for Computer Science, University of Geneva, Geneva, Switzerland
- Balthasar Bickel
  - Department of Comparative Language Science, University of Zurich, Zurich, Switzerland
  - Center for the Interdisciplinary Study of Language Evolution, University of Zurich, Zurich, Switzerland
2
Moore JA, Wilms M, Gutierrez A, Ismail Z, Fakhar K, Hadaeghi F, Hilgetag CC, Forkert ND. Simulation of neuroplasticity in a CNN-based in-silico model of neurodegeneration of the visual system. Front Comput Neurosci 2023; 17:1274824. PMID: 38105786; PMCID: PMC10722164; DOI: 10.3389/fncom.2023.1274824.
Abstract
The aim of this work was to enhance the biological feasibility of a deep convolutional neural network-based in-silico model of neurodegeneration of the visual system by equipping it with a mechanism to simulate neuroplasticity. To this end, deep convolutional networks of multiple sizes were trained on object recognition tasks and progressively lesioned to simulate neurodegeneration of the visual cortex. The injured parts of the network remained injured while we investigated how added retraining steps could recover some of the model's baseline object recognition performance. The results showed that, with retraining, the model's object recognition abilities decline more smoothly and gradually with increasing injury levels than without retraining, and are therefore more similar to the longitudinal cognitive impairments of patients diagnosed with Alzheimer's disease (AD). Moreover, with retraining, the injured model exhibits internal activation patterns more similar to those of the healthy baseline model than the injured model without retraining does. Furthermore, we conducted this analysis on a network that had been extensively pruned, resulting in an optimized number of parameters or synapses. This network exhibited a remarkably similar capability to recover task performance despite decreasingly viable pathways through the network. In conclusion, adding a retraining step that simulates neuroplasticity to the in-silico setup improves the model's biological feasibility considerably and could prove valuable for testing different rehabilitation approaches in-silico.
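The core mechanic described here, a permanent lesion plus masked retraining, can be sketched in a few lines of NumPy. This is an illustrative toy, not the authors' implementation; the layer size, `injury_fraction`, and the fake gradient are all invented:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy weight matrix standing in for one convolutional layer.
weights = rng.normal(size=(8, 8))

# Simulate neurodegeneration: permanently zero out a fraction of weights.
injury_fraction = 0.25
mask = rng.random(weights.shape) >= injury_fraction  # False marks injured synapses
weights *= mask

# A "retraining" (neuroplasticity) step: gradient updates are masked too,
# so the injured parts of the network remain injured.
fake_gradient = rng.normal(size=weights.shape)
weights -= 0.01 * fake_gradient * mask

print(np.count_nonzero(weights[~mask]))  # injured weights stay at exactly zero
```

The design point is that recovery must come from re-routing through the surviving weights, never from healing the lesion itself.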
Affiliation(s)
- Jasmine A. Moore
  - Department of Radiology, University of Calgary, Calgary, AB, Canada
  - Hotchkiss Brain Institute, University of Calgary, Calgary, AB, Canada
  - Biomedical Engineering Program, University of Calgary, Calgary, AB, Canada
- Matthias Wilms
  - Department of Radiology, University of Calgary, Calgary, AB, Canada
  - Hotchkiss Brain Institute, University of Calgary, Calgary, AB, Canada
  - Alberta Children’s Hospital Research Institute, University of Calgary, Calgary, AB, Canada
- Alejandro Gutierrez
  - Department of Radiology, University of Calgary, Calgary, AB, Canada
  - Hotchkiss Brain Institute, University of Calgary, Calgary, AB, Canada
  - Biomedical Engineering Program, University of Calgary, Calgary, AB, Canada
- Zahinoor Ismail
  - Hotchkiss Brain Institute, University of Calgary, Calgary, AB, Canada
  - Department of Clinical Neurosciences, University of Calgary, Calgary, AB, Canada
- Kayson Fakhar
  - Institute of Computational Neuroscience, University Medical Center Hamburg-Eppendorf (UKE), Hamburg, Germany
- Fatemeh Hadaeghi
  - Institute of Computational Neuroscience, University Medical Center Hamburg-Eppendorf (UKE), Hamburg, Germany
- Claus C. Hilgetag
  - Institute of Computational Neuroscience, University Medical Center Hamburg-Eppendorf (UKE), Hamburg, Germany
  - Department of Health Sciences, Boston University, Boston, MA, United States
- Nils D. Forkert
  - Department of Radiology, University of Calgary, Calgary, AB, Canada
  - Hotchkiss Brain Institute, University of Calgary, Calgary, AB, Canada
  - Alberta Children’s Hospital Research Institute, University of Calgary, Calgary, AB, Canada
3
Lin C, Bulls LS, Tepfer LJ, Vyas AD, Thornton MA. Advancing Naturalistic Affective Science with Deep Learning. Affective Science 2023; 4:550-562. PMID: 37744976; PMCID: PMC10514024; DOI: 10.1007/s42761-023-00215-z.
Abstract
People express their own emotions and perceive others' emotions via a variety of channels, including facial movements, body gestures, vocal prosody, and language. Studying these channels of affective behavior offers insight into both the experience and perception of emotion. Prior research has predominantly focused on studying individual channels of affective behavior in isolation using tightly controlled, non-naturalistic experiments. This approach limits our understanding of emotion in more naturalistic contexts where different channels of information tend to interact. Traditional methods struggle to address this limitation: manually annotating behavior is time-consuming, making it infeasible to do at large scale; manually selecting and manipulating stimuli based on hypotheses may neglect unanticipated features, potentially generating biased conclusions; and common linear modeling approaches cannot fully capture the complex, nonlinear, and interactive nature of real-life affective processes. In this methodology review, we describe how deep learning can be applied to address these challenges to advance a more naturalistic affective science. First, we describe current practices in affective research and explain why existing methods face challenges in revealing a more naturalistic understanding of emotion. Second, we introduce deep learning approaches and explain how they can be applied to tackle three main challenges: quantifying naturalistic behaviors, selecting and manipulating naturalistic stimuli, and modeling naturalistic affective processes. Finally, we describe the limitations of these deep learning methods, and how these limitations might be avoided or mitigated. By detailing the promise and the peril of deep learning, this review aims to pave the way for a more naturalistic affective science.
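The claim that linear models miss the interactive nature of affective processes can be made concrete with a toy example in which a hypothetical rating depends only on the interaction of two behavioural channels. All names and data here are invented for illustration:

```python
import numpy as np

rng = np.random.default_rng(1)

# Two hypothetical behavioural channels (say, a facial and a vocal signal).
face = rng.uniform(-1, 1, 500)
voice = rng.uniform(-1, 1, 500)

# The "true" affective response depends purely on their interaction.
rating = face * voice

def r2(X, y):
    """Coefficient of determination of a least-squares fit of X to y."""
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ coef
    return 1 - resid.var() / y.var()

X_linear = np.column_stack([face, voice, np.ones_like(face)])
X_inter = np.column_stack([face, voice, face * voice, np.ones_like(face)])

print(f"linear-only R^2:      {r2(X_linear, rating):.3f}")  # near 0
print(f"with interaction R^2: {r2(X_inter, rating):.3f}")   # near 1
```

A deep network learns such interaction terms implicitly; the linear model sees two individually uninformative channels and explains essentially nothing.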
Affiliation(s)
- Chujun Lin
  - Department of Psychological and Brain Sciences, Dartmouth College, Hanover, NH, USA
- Landry S. Bulls
  - Department of Psychological and Brain Sciences, Dartmouth College, Hanover, NH, USA
- Lindsey J. Tepfer
  - Department of Psychological and Brain Sciences, Dartmouth College, Hanover, NH, USA
- Amisha D. Vyas
  - Department of Psychological and Brain Sciences, Dartmouth College, Hanover, NH, USA
- Mark A. Thornton
  - Department of Psychological and Brain Sciences, Dartmouth College, Hanover, NH, USA
4
Malhotra G, Dujmović M, Bowers JS. Feature blindness: A challenge for understanding and modelling visual object recognition. PLoS Comput Biol 2022; 18:e1009572. PMID: 35560155; PMCID: PMC9132323; DOI: 10.1371/journal.pcbi.1009572.
Abstract
Humans rely heavily on the shape of objects to recognise them. Recently, it has been argued that Convolutional Neural Networks (CNNs) can also show a shape-bias, provided their learning environment contains this bias. This has led to the proposal that CNNs provide good mechanistic models of shape-bias and, more generally, human visual processing. However, it is also possible that humans and CNNs show a shape-bias for very different reasons, namely, shape-bias in humans may be a consequence of architectural and cognitive constraints whereas CNNs show a shape-bias as a consequence of learning the statistics of the environment. We investigated this question by exploring shape-bias in humans and CNNs when they learn in a novel environment. We observed that, in this new environment, humans (i) focused on shape and overlooked many non-shape features, even when non-shape features were more diagnostic, (ii) learned based on only one out of multiple predictive features, and (iii) failed to learn when global features, such as shape, were absent. This behaviour contrasted with the predictions of a statistical inference model with no priors, showing the strong role that shape-bias plays in human feature selection. It also contrasted with CNNs that (i) preferred to categorise objects based on non-shape features, and (ii) increased reliance on these non-shape features as they became more predictive. This was the case even when the CNN was pre-trained to have a shape-bias and the convolutional backbone was frozen. These results suggest that shape-bias has a different source in humans and CNNs: while learning in CNNs is driven by the statistical properties of the environment, humans are highly constrained by their previous biases, which suggests that cognitive constraints play a key role in how humans learn to recognise novel objects.

Any object consists of hundreds of visual features that can be used to recognise it. How do humans select which feature to use?
Do we always choose features that are best at predicting the object? In a series of experiments using carefully designed stimuli, we find that humans frequently ignore many features that are clearly visible and highly predictive. This behaviour is statistically inefficient and we show that it contrasts with statistical inference models such as state-of-the-art neural networks. Unlike humans, these models learn to rely on the most predictive feature when trained on the same data. We argue that the reason underlying human behaviour may be a bias to look for features that are less hungry for cognitive resources and generalise better to novel instances. Models that incorporate cognitive constraints may not only allow us to better understand human vision but also help us develop machine learning models that are more robust to changes in incidental features of objects.
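The contrast with a prior-free statistical learner can be sketched with a toy: a learner with no built-in bias simply puts its weight on whichever feature is most diagnostic. Here a tiny logistic regression stands in for the statistical-inference model (not the authors' CNNs); the feature names and noise levels are invented:

```python
import numpy as np

rng = np.random.default_rng(2)
n = 1000
labels = rng.integers(0, 2, n)
signal = 2 * labels - 1  # class cue, coded as +/-1

# "shape": a noisy cue; "patch": a highly diagnostic non-shape cue.
shape = signal + rng.normal(0, 1.0, n)
patch = signal + rng.normal(0, 0.1, n)
X = np.column_stack([shape, patch])

# Tiny logistic regression trained by gradient descent.
w = np.zeros(2)
for _ in range(2000):
    p = 1 / (1 + np.exp(-(X @ w)))
    w -= 0.1 * X.T @ (p - labels) / n

print(w)  # the weight on the diagnostic "patch" feature dominates
```

Humans in the experiments do the opposite, sticking with shape even when the other feature is more predictive, which is exactly the divergence the paper reports.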
Affiliation(s)
- Gaurav Malhotra
  - School of Psychological Sciences, University of Bristol, Bristol, United Kingdom
- Marin Dujmović
  - School of Psychological Sciences, University of Bristol, Bristol, United Kingdom
- Jeffrey S. Bowers
  - School of Psychological Sciences, University of Bristol, Bristol, United Kingdom
5
Maier M, Blume F, Bideau P, Hellwich O, Abdel Rahman R. Knowledge-augmented face perception: Prospects for the Bayesian brain-framework to align AI and human vision. Conscious Cogn 2022; 101:103301. DOI: 10.1016/j.concog.2022.103301.
6
Fintz M, Osadchy M, Hertz U. Using deep learning to predict human decisions and using cognitive models to explain deep learning models. Sci Rep 2022; 12:4736. PMID: 35304572; PMCID: PMC8933393; DOI: 10.1038/s41598-022-08863-0.
Abstract
Deep neural network (DNN) models have the potential to provide new insights into the study of cognitive processes, such as human decision making, due to their high capacity and data-driven design. While these models may go beyond theory-driven models in predicting human behaviour, their opaque nature limits their ability to explain how an operation is carried out, undermining their usefulness as a scientific tool. Here we suggest using a DNN model as an exploratory tool to identify predictable and consistent human behaviour, and then using explicit, theory-driven models to characterise the high-capacity model. To demonstrate our approach, we trained an exploratory DNN model to predict human decisions in a four-armed bandit task. This model was more accurate than two explicit models: a reward-oriented model geared towards choosing the most rewarding option, and a reward-oblivious model trained to predict human decisions without information about rewards. Using experimental simulations, we characterised the exploratory model using the explicit models. The exploratory model converged with the reward-oriented model’s predictions when one option was clearly better than the others, but otherwise predicted pattern-based explorations akin to the reward-oblivious model’s predictions. These results suggest that predictable decision patterns that are not solely reward-oriented may contribute to human decisions. Importantly, we demonstrate how theory-driven cognitive models can be used to characterise the operation of DNNs, making DNNs a useful explanatory tool in scientific investigation.
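A reward-oriented baseline of the kind described can be sketched as a simple value-tracking agent on a four-armed bandit. This is a generic epsilon-greedy sketch, not the authors' model; the arm values, learning rate, and (for determinism) the use of expected rather than sampled rewards are our choices:

```python
import numpy as np

rng = np.random.default_rng(3)

# A toy reward-oriented model of a four-armed bandit: keep a running
# value estimate per arm and mostly pick the current best.
arm_values = np.array([0.2, 0.4, 0.6, 0.8])  # hypothetical expected payoffs
q = np.zeros(4)                              # value estimates
alpha, epsilon = 0.1, 0.1

for t in range(2000):
    if rng.random() < epsilon:       # occasional random exploration
        arm = int(rng.integers(4))
    else:                            # otherwise exploit the best estimate
        arm = int(np.argmax(q))
    # Incremental update toward the (expected) reward of the chosen arm.
    q[arm] += alpha * (arm_values[arm] - q[arm])

print(int(np.argmax(q)))  # the estimates home in on the most valuable arm
```

The paper's point is that human choices deviate from such a purely reward-driven policy in patterned ways, which a reward-oblivious model captures instead.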
Affiliation(s)
- Matan Fintz
  - Department of Computer Science, University of Haifa, Haifa, Israel
- Uri Hertz
  - Department of Cognitive Sciences, University of Haifa, 3498838, Haifa, Israel
7
Ogiela U, Snášel V. Predictive intelligence in evaluation of visual perception thresholds for visual pattern recognition and understanding. Inf Process Manag 2022. DOI: 10.1016/j.ipm.2022.102865.
8
Abstract
Deep learning (DL) techniques have revolutionised artificial systems’ performance on myriad tasks, from playing Go to medical diagnosis. Recent developments have extended such successes to natural language processing, an area once deemed beyond such systems’ reach. Despite their different goals (technological development vs. theoretical insight), these successes have suggested that such systems may be pertinent to theoretical linguistics. The competence/performance distinction presents a fundamental barrier to such inferences. While DL systems are trained on linguistic performance, linguistic theories are aimed at competence. Such a barrier has traditionally been sidestepped by assuming a fairly close correspondence: performance as competence plus noise. I argue this assumption is unmotivated. Competence and performance can differ arbitrarily. Thus, we should not expect DL models to illuminate linguistic theory.
9
10
Davis F, Altmann GTM. Finding event structure in time: What recurrent neural networks can tell us about event structure in mind. Cognition 2021; 213:104651. PMID: 33714544; DOI: 10.1016/j.cognition.2021.104651.
Abstract
Under a theory of event representations that defines events as dynamic changes in objects across both time and space, as in the proposal of Intersecting Object Histories (Altmann & Ekves, 2019), the encoding of changes in state is a fundamental first step in building richer representations of events. In other words, there is an inherent dynamic that is captured by our knowledge of events. In the present study, we evaluated the degree to which this dynamic is inferable from the linguistic signal alone, without access to visual, sensory, and embodied experience, using recurrent neural networks (RNNs). Recent literature exploring RNNs has largely focused on syntactic and semantic knowledge. We extend this domain of investigation to representations of events within RNNs. In three studies, we find preliminary evidence that RNNs capture, in their internal representations, the extent to which objects change state; for example, that chopping an onion changes the onion by more than just peeling it does. Moreover, the temporal relationship between state changes is encoded to some extent: RNNs are sensitive to the fact that chopping an onion and then weighing it, or first weighing it, entails that the onion being weighed is in a different state depending on the adverb. Our final study explored what factors influence the propagation of these rudimentary event representations into subsequent sentences. We conclude that while there is much still to be learned about the abilities of RNNs (especially with respect to the extent to which they encode objects as specific tokens), we still do not know what the equivalent representational dynamics in humans are. That is, we take the perspective that the exploration of computational models points us to important questions about the nature of the human mind.
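The basic mechanism that makes order sensitivity possible, a hidden state carried through the sentence, can be shown with a minimal Elman-style RNN cell. This sketch uses random, untrained weights and a made-up four-word vocabulary, so it only illustrates that different event orders yield different states, not that those states are meaningful:

```python
import numpy as np

rng = np.random.default_rng(4)

# Minimal Elman RNN cell: the hidden state is the network's running
# representation of the unfolding event description.
vocab = {"chop": 0, "weigh": 1, "the": 2, "onion": 3}
d_in, d_h = len(vocab), 16
W_xh = rng.normal(0, 0.5, (d_h, d_in))
W_hh = rng.normal(0, 0.5, (d_h, d_h))

def encode(tokens):
    h = np.zeros(d_h)
    for tok in tokens:
        x = np.eye(d_in)[vocab[tok]]       # one-hot input
        h = np.tanh(W_xh @ x + W_hh @ h)   # recurrent update
    return h

# Order matters: chopping then weighing vs. weighing then chopping
# produce different final hidden states.
h1 = encode(["chop", "the", "onion", "weigh", "the", "onion"])
h2 = encode(["weigh", "the", "onion", "chop", "the", "onion"])
print(np.allclose(h1, h2))  # False: the temporal order is encoded
```

Trained RNNs exploit exactly this property; the paper's studies probe whether the resulting states track the magnitude and timing of object state changes.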
Affiliation(s)
- Forrest Davis
  - Department of Linguistics, Cornell University, United States of America
- Gerry T. M. Altmann
  - Department of Psychological Sciences, CT Institute for the Brain and Cognitive Sciences, University of Connecticut, United States of America
11
A Novel Squeeze-and-Excitation W-Net for 2D and 3D Building Change Detection with Multi-Source and Multi-Feature Remote Sensing Data. Remote Sensing 2021. DOI: 10.3390/rs13030440.
Abstract
Building Change Detection (BCD) is one of the core issues in earth observation and has received extensive attention in recent years. With the rapid development of earth observation technology, the data sources for remote sensing change detection are continuously enriched, making it possible to describe the spatial details of ground objects more finely and to characterize them from multiple perspectives and levels. However, because multi-source remote sensing data rest on different physical mechanisms, BCD based on heterogeneous data is a challenge. Previous studies have mostly focused on BCD with homogeneous remote sensing data, while research that uses multi-source remote sensing data and considers multiple features for 2D and 3D BCD remains sporadic. In this article, we propose a novel and general squeeze-and-excitation W-Net, developed from U-Net and SE-Net. Its unique advantage is that it can be used for BCD of homogeneous and heterogeneous remote sensing data separately, and can also take both as input for 2D or 3D BCD, relying on its bidirectional symmetric end-to-end network architecture. Moreover, we use image features that are stable in performance and less affected by radiation differences and temporal changes. We introduce the squeeze-and-excitation module to explicitly model the interdependence between feature channels, so that the responses between feature channels are adaptively recalibrated to improve the information mining ability and detection accuracy of the model. To our knowledge, this is the first proposed network architecture that can simultaneously use multi-source and multi-feature remote sensing data for 2D and 3D BCD. Experimental results on two 2D data sets and two challenging 3D data sets demonstrate that the squeeze-and-excitation W-Net outperforms several traditional and state-of-the-art approaches. Both visual and quantitative analyses of the experimental results demonstrate the competitive performance of the proposed network. This shows that the proposed network and method are practical, physically justified, and have great potential for application in large-scale 2D and 3D BCD and in qualitative and quantitative research.
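The squeeze-and-excitation mechanism the abstract leans on can be sketched generically in NumPy: global-pool each channel ("squeeze"), pass the pooled vector through a small bottleneck with a sigmoid gate ("excitation"), then rescale the channels. This is a sketch of the standard SE block, not the authors' W-Net code; the channel count, reduction ratio, and weights are invented:

```python
import numpy as np

rng = np.random.default_rng(5)

def squeeze_excite(feature_maps, w1, w2):
    """Recalibrate channels by global context.

    feature_maps: array of shape (channels, height, width).
    """
    # Squeeze: global average pooling per channel.
    z = feature_maps.mean(axis=(1, 2))
    # Excitation: bottleneck MLP (ReLU then sigmoid) yields per-channel gates.
    s = 1 / (1 + np.exp(-(w2 @ np.maximum(w1 @ z, 0))))
    # Scale: reweight each channel by its gate in (0, 1).
    return feature_maps * s[:, None, None]

c, r = 8, 2  # channels and reduction ratio (both invented for the demo)
w1 = rng.normal(0, 0.1, (c // r, c))
w2 = rng.normal(0, 0.1, (c, c // r))
x = rng.normal(size=(c, 4, 4))
y = squeeze_excite(x, w1, w2)
print(y.shape)  # (8, 4, 4): same shape, channels adaptively rescaled
```

Because the gates depend on the pooled global context, each channel's response is recalibrated by what the other channels are seeing, which is the interdependence modelling the abstract describes.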