1. Millidge B, Tang M, Osanlouy M, Harper NS, Bogacz R. Predictive coding networks for temporal prediction. PLoS Comput Biol 2024; 20:e1011183. PMID: 38557984; PMCID: PMC11008833; DOI: 10.1371/journal.pcbi.1011183.
Abstract
One of the key problems the brain faces is inferring the state of the world from a sequence of dynamically changing stimuli, and it is not yet clear how the sensory system achieves this task. A well-established computational framework for describing perceptual processes in the brain is provided by the theory of predictive coding. Although the original proposals of predictive coding have discussed temporal prediction, later work developing this theory mostly focused on static stimuli, and key questions on neural implementation and computational properties of temporal predictive coding networks remain open. Here, we address these questions and present a formulation of the temporal predictive coding model that can be naturally implemented in recurrent networks, in which activity dynamics rely only on local inputs to the neurons, and learning only utilises local Hebbian plasticity. Additionally, we show that temporal predictive coding networks can approximate the performance of the Kalman filter in predicting behaviour of linear systems, and behave as a variant of a Kalman filter which does not track its own subjective posterior variance. Importantly, temporal predictive coding networks can achieve similar accuracy as the Kalman filter without performing complex mathematical operations, but just employing simple computations that can be implemented by biological networks. Moreover, when trained with natural dynamic inputs, we found that temporal predictive coding can produce Gabor-like, motion-sensitive receptive fields resembling those observed in real neurons in visual areas. In addition, we demonstrate how the model can be effectively generalized to nonlinear systems. Overall, models presented in this paper show how biologically plausible circuits can predict future stimuli and may guide research on understanding specific neural circuits in brain areas involved in temporal prediction.
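The claimed relationship to the Kalman filter can be illustrated numerically. The sketch below is not the authors' implementation; it assumes a one-dimensional linear-Gaussian system with fixed, known noise variances. A temporal predictive coding estimator relaxes its state estimate by gradient descent on precision-weighted prediction errors, using only simple local updates, and tracks the true state almost as well as a Kalman filter while never tracking a posterior variance:

```python
import numpy as np

rng = np.random.default_rng(0)
A, C, Q, R, T = 0.9, 1.0, 0.01, 0.1, 2000

# Simulate a linear-Gaussian system: x_t = A x_{t-1} + w_t,  y_t = C x_t + v_t
x = np.zeros(T)
y = np.zeros(T)
for t in range(1, T):
    x[t] = A * x[t - 1] + rng.normal(0.0, np.sqrt(Q))
    y[t] = C * x[t] + rng.normal(0.0, np.sqrt(R))

# Standard Kalman filter: tracks its posterior variance P explicitly.
xk, P = 0.0, 1.0
kf = np.zeros(T)
for t in range(1, T):
    xp, Pp = A * xk, A * A * P + Q
    K = Pp * C / (C * C * Pp + R)
    xk = xp + K * (y[t] - C * xp)
    P = (1.0 - K * C) * Pp
    kf[t] = xk

# Temporal predictive coding estimator: relax the state estimate by
# gradient descent on precision-weighted prediction errors (local updates,
# fixed variances -- no posterior variance is tracked).
xt = 0.0
tpc = np.zeros(T)
for t in range(1, T):
    xp = A * xt                       # top-down temporal prediction
    xh = xp
    for _ in range(200):              # iterative inference
        eps_y = (y[t] - C * xh) / R   # sensory prediction error
        eps_x = (xh - xp) / Q         # temporal prediction error
        xh += 0.005 * (C * eps_y - eps_x)
    xt = xh
    tpc[t] = xt

def mse(z):
    return float(np.mean((z[200:] - x[200:]) ** 2))

print(mse(kf), mse(tpc), mse(y))
```

Both filters track the state far better than the raw observations; the predictive coding estimator behaves like a Kalman filter with a static gain, matching the abstract's description of a variant that does not track its own subjective posterior variance.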
Affiliation(s)
- Beren Millidge
- MRC Brain Network Dynamics Unit, University of Oxford, Oxford, United Kingdom
- Mufeng Tang
- MRC Brain Network Dynamics Unit, University of Oxford, Oxford, United Kingdom
- Mahyar Osanlouy
- Auckland Bioengineering Institute, University of Auckland, Auckland, New Zealand
- Nicol S. Harper
- Department of Physiology, Anatomy and Genetics, University of Oxford, Oxford, United Kingdom
- Rafal Bogacz
- MRC Brain Network Dynamics Unit, University of Oxford, Oxford, United Kingdom
2. Kondapaneni N, Perona P. A number sense as an emergent property of the manipulating brain. Sci Rep 2024; 14:6858. PMID: 38514690; PMCID: PMC10958013; DOI: 10.1038/s41598-024-56828-2.
Abstract
The ability to understand and manipulate numbers and quantities emerges during childhood, but the mechanism through which humans acquire and develop this ability is still poorly understood. We explore this question through a model, assuming that the learner is able to pick up and place small objects from, and to, locations of its choosing, and will spontaneously engage in such undirected manipulation. We further assume that the learner's visual system will monitor the changing arrangements of objects in the scene and will learn to predict the effects of each action by comparing perception with a supervisory signal from the motor system. We model perception using standard deep networks for feature extraction and classification. Our main finding is that, from learning the task of action prediction, an unexpected image representation emerges exhibiting regularities that foreshadow the perception and representation of numbers and quantity. These include distinct categories for zero and the first few natural numbers, a strict ordering of the numbers, and a one-dimensional signal that correlates with numerical quantity. As a result, our model acquires the ability to estimate numerosity, i.e. the number of objects in the scene, as well as subitization, i.e. the ability to recognize at a glance the exact number of objects in small scenes. Remarkably, subitization and numerosity estimation extrapolate to scenes containing many objects, far beyond the three objects used during training. We conclude that important aspects of a facility with numbers and quantities may be learned with supervision from a simple pre-training task. Our observations suggest that cross-modal learning is a powerful learning mechanism that may be harnessed in artificial intelligence.
3. Jiang LP, Rao RPN. Dynamic predictive coding: A model of hierarchical sequence learning and prediction in the neocortex. PLoS Comput Biol 2024; 20:e1011801. PMID: 38330098; PMCID: PMC10880975; DOI: 10.1371/journal.pcbi.1011801.
Abstract
We introduce dynamic predictive coding, a hierarchical model of spatiotemporal prediction and sequence learning in the neocortex. The model assumes that higher cortical levels modulate the temporal dynamics of lower levels, correcting their predictions of dynamics using prediction errors. As a result, lower levels form representations that encode sequences at shorter timescales (e.g., a single step) while higher levels form representations that encode sequences at longer timescales (e.g., an entire sequence). We tested this model using a two-level neural network, where the top-down modulation creates low-dimensional combinations of a set of learned temporal dynamics to explain input sequences. When trained on natural videos, the lower-level model neurons developed space-time receptive fields similar to those of simple cells in the primary visual cortex while the higher-level responses spanned longer timescales, mimicking temporal response hierarchies in the cortex. Additionally, the network's hierarchical sequence representation exhibited both predictive and postdictive effects resembling those observed in visual motion processing in humans (e.g., in the flash-lag illusion). When coupled with an associative memory emulating the role of the hippocampus, the model allowed episodic memories to be stored and retrieved, supporting cue-triggered recall of an input sequence similar to activity recall in the visual cortex. When extended to three hierarchical levels, the model learned progressively more abstract temporal representations along the hierarchy. Taken together, our results suggest that cortical processing and learning of sequences can be interpreted as dynamic predictive coding based on a hierarchical spatiotemporal generative model of the visual world.
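The core computation, higher levels forming low-dimensional combinations of learned temporal dynamics to explain lower-level sequences, can be illustrated with a toy example. This sketch is illustrative rather than the authors' model: a higher level infers mixing weights w over a small bank of transition matrices V_k so that the modulated dynamics sum_k w_k V_k predict an observed sequence:

```python
import numpy as np

# Bank of "learned" lower-level dynamics: rotations in opposite directions.
def rot(a):
    return np.array([[np.cos(a), -np.sin(a)], [np.sin(a), np.cos(a)]])

V = np.stack([rot(0.3), rot(-0.3)])

# Lower-level sequence generated by the first dynamic (plus small noise).
rng = np.random.default_rng(1)
r = [np.array([1.0, 0.0])]
for _ in range(30):
    r.append(V[0] @ r[-1] + 0.01 * rng.normal(size=2))
r = np.array(r)

# Higher level infers mixing weights w so that sum_k w_k V_k explains the
# observed transitions, by gradient descent on one-step prediction error.
w = np.array([0.5, 0.5])
for _ in range(500):
    Vw = np.tensordot(w, V, axes=1)      # modulated transition matrix
    err = r[1:] - r[:-1] @ Vw.T          # one-step prediction errors
    grad = np.array([np.sum(err * (r[:-1] @ V[k].T)) for k in range(2)])
    w += 0.01 * grad
print(w)  # the weight on the generating dynamic should dominate
```

The inferred weights form a slowly varying, low-dimensional description of the sequence, while the bank of transition matrices captures the fast, step-by-step dynamics, mirroring the timescale separation described in the abstract.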
Affiliation(s)
- Linxing Preston Jiang
- Paul G. Allen School of Computer Science & Engineering, University of Washington, Seattle, Washington, United States of America
- Center for Neurotechnology, University of Washington, Seattle, Washington, United States of America
- Computational Neuroscience Center, University of Washington, Seattle, Washington, United States of America
- Rajesh P. N. Rao
- Paul G. Allen School of Computer Science & Engineering, University of Washington, Seattle, Washington, United States of America
- Center for Neurotechnology, University of Washington, Seattle, Washington, United States of America
- Computational Neuroscience Center, University of Washington, Seattle, Washington, United States of America
4. Song Y, Millidge B, Salvatori T, Lukasiewicz T, Xu Z, Bogacz R. Inferring neural activity before plasticity as a foundation for learning beyond backpropagation. Nat Neurosci 2024; 27:348-358. PMID: 38172438; PMCID: PMC7615830; DOI: 10.1038/s41593-023-01514-1.
Abstract
For both humans and machines, the essence of learning is to pinpoint which components in its information processing pipeline are responsible for an error in its output, a challenge that is known as 'credit assignment'. It has long been assumed that credit assignment is best solved by backpropagation, which is also the foundation of modern machine learning. Here, we set out a fundamentally different principle on credit assignment called 'prospective configuration'. In prospective configuration, the network first infers the pattern of neural activity that should result from learning, and then the synaptic weights are modified to consolidate the change in neural activity. We demonstrate that this distinct mechanism, in contrast to backpropagation, (1) underlies learning in a well-established family of models of cortical circuits, (2) enables learning that is more efficient and effective in many contexts faced by biological organisms and (3) reproduces surprising patterns of neural activity and behavior observed in diverse human and rat learning experiments.
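The infer-then-learn principle can be sketched in a few lines. The following is a minimal toy in the spirit of prospective configuration, not the paper's implementation (architecture, learning rates, and the linear task are all illustrative): the output layer is clamped to the target, the hidden activity is first relaxed to the pattern that should result from learning, and only then are the weights updated with local, Hebbian-style rules to consolidate that activity:

```python
import numpy as np

rng = np.random.default_rng(0)
# Tiny regression task: learn y = M x with a two-layer linear network.
M = np.array([[1.0, -2.0], [0.5, 1.5]])
X = rng.normal(size=(200, 2))
Y = X @ M.T

W1 = 0.1 * rng.normal(size=(2, 2))   # input -> hidden weights
W2 = 0.1 * rng.normal(size=(2, 2))   # hidden -> output weights

def loss():
    return float(np.mean((X @ W1.T @ W2.T - Y) ** 2))

before = loss()
for _ in range(200):
    for xs, ys in zip(X, Y):
        # 1) Infer first: clamp the output to the target and relax the
        #    hidden activity to minimise the squared prediction errors.
        h = W1 @ xs
        for _ in range(20):
            e1 = h - W1 @ xs          # hidden-layer prediction error
            e2 = ys - W2 @ h          # output-layer prediction error
            h += 0.1 * (W2.T @ e2 - e1)
        # 2) Then learn: local, Hebbian-style weight updates that
        #    consolidate the inferred (prospective) activity.
        e1 = h - W1 @ xs
        e2 = ys - W2 @ h
        W2 += 0.01 * np.outer(e2, h)
        W1 += 0.01 * np.outer(e1, xs)
after = loss()
print(before, after)
```

Note the ordering: neural activity changes before any synaptic weight does, in contrast to backpropagation, where the weight update is computed directly from the feedforward pass.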
Affiliation(s)
- Yuhang Song
- Department of Computer Science, University of Oxford, Oxford, UK
- Medical Research Council Brain Network Dynamics Unit, University of Oxford, Oxford, UK
- Fractile, Ltd., London, UK
- Beren Millidge
- Medical Research Council Brain Network Dynamics Unit, University of Oxford, Oxford, UK
- Tommaso Salvatori
- Department of Computer Science, University of Oxford, Oxford, UK
- Institute of Logic and Computation, Vienna University of Technology, Vienna, Austria
- VERSES AI Research Lab, Los Angeles, CA, USA
- Thomas Lukasiewicz
- Department of Computer Science, University of Oxford, Oxford, UK
- Institute of Logic and Computation, Vienna University of Technology, Vienna, Austria
- Zhenghua Xu
- Department of Computer Science, University of Oxford, Oxford, UK
- State Key Laboratory of Reliability and Intelligence of Electrical Equipment, School of Health Sciences and Biomedical Engineering, Hebei University of Technology, Tianjin, China
- Rafal Bogacz
- Medical Research Council Brain Network Dynamics Unit, University of Oxford, Oxford, UK
5. Halvagal MS, Zenke F. The combination of Hebbian and predictive plasticity learns invariant object representations in deep sensory networks. Nat Neurosci 2023; 26:1906-1915. PMID: 37828226; PMCID: PMC10620089; DOI: 10.1038/s41593-023-01460-y.
Abstract
Recognition of objects from sensory stimuli is essential for survival. To that end, sensory networks in the brain must form object representations invariant to stimulus changes, such as size, orientation and context. Although Hebbian plasticity is known to shape sensory networks, it fails to create invariant object representations in computational models, raising the question of how the brain achieves such processing. In the present study, we show that combining Hebbian plasticity with a predictive form of plasticity leads to invariant representations in deep neural network models. We derive a local learning rule that generalizes to spiking neural networks and naturally accounts for several experimentally observed properties of synaptic plasticity, including metaplasticity and spike-timing-dependent plasticity. Finally, our model accurately captures neuronal selectivity changes observed in the primate inferotemporal cortex in response to altered visual experience. Thus, we provide a plausible normative theory emphasizing the importance of predictive plasticity mechanisms for successful representational learning.
Affiliation(s)
- Manu Srinath Halvagal
- Friedrich Miescher Institute for Biomedical Research, Basel, Switzerland
- Faculty of Science, University of Basel, Basel, Switzerland
- Friedemann Zenke
- Friedrich Miescher Institute for Biomedical Research, Basel, Switzerland
- Faculty of Science, University of Basel, Basel, Switzerland
6. Singer Y, Taylor L, Willmore BDB, King AJ, Harper NS. Hierarchical temporal prediction captures motion processing along the visual pathway. eLife 2023; 12:e52599. PMID: 37844199; PMCID: PMC10629830; DOI: 10.7554/elife.52599.
Abstract
Visual neurons respond selectively to features that become increasingly complex from the eyes to the cortex. Retinal neurons prefer flashing spots of light, primary visual cortical (V1) neurons prefer moving bars, and those in higher cortical areas favor complex features like moving textures. Previously, we showed that V1 simple cell tuning can be accounted for by a basic model implementing temporal prediction - representing features that predict future sensory input from past input (Singer et al., 2018). Here, we show that hierarchical application of temporal prediction can capture how tuning properties change across at least two levels of the visual system. This suggests that the brain does not efficiently represent all incoming information; instead, it selectively represents sensory inputs that help in predicting the future. When applied hierarchically, temporal prediction extracts time-varying features that depend on increasingly high-level statistics of the sensory input.
Affiliation(s)
- Yosef Singer
- Department of Physiology, Anatomy and Genetics, University of Oxford, Oxford, United Kingdom
- Luke Taylor
- Department of Physiology, Anatomy and Genetics, University of Oxford, Oxford, United Kingdom
- Ben DB Willmore
- Department of Physiology, Anatomy and Genetics, University of Oxford, Oxford, United Kingdom
- Andrew J King
- Department of Physiology, Anatomy and Genetics, University of Oxford, Oxford, United Kingdom
- Nicol S Harper
- Department of Physiology, Anatomy and Genetics, University of Oxford, Oxford, United Kingdom
7. Lindeberg T. Covariance properties under natural image transformations for the generalised Gaussian derivative model for visual receptive fields. Front Comput Neurosci 2023; 17:1189949. PMID: 37398936; PMCID: PMC10311448; DOI: 10.3389/fncom.2023.1189949.
Abstract
The property of covariance, also referred to as equivariance, means that an image operator is well-behaved under image transformations: applying the operator to a transformed input image gives essentially the same result as applying the image transformation to the output of the operator on the original image. This paper presents a theory of geometric covariance properties in vision, developed for a generalised Gaussian derivative model of receptive fields in the primary visual cortex and the lateral geniculate nucleus, which, in turn, enable geometric invariance properties at higher levels in the visual hierarchy. It is shown how the studied generalised Gaussian derivative model for visual receptive fields obeys true covariance properties under spatial scaling transformations, spatial affine transformations, Galilean transformations and temporal scaling transformations. These covariance properties imply that a vision system, based on image and video measurements in terms of the receptive fields according to the generalised Gaussian derivative model, can, to first order of approximation, handle the image and video deformations between multiple views of objects delimited by smooth surfaces, as well as between multiple views of spatio-temporal events, under varying relative motions between the objects and events in the world and the observer. We conclude by describing implications of the presented theory for biological vision, regarding connections between the variabilities of the shapes of biological visual receptive fields and the variabilities of spatial and spatio-temporal image structures under natural image transformations.
Specifically, we formulate experimentally testable biological hypotheses, derived from the presented theory and based on geometric covariance properties, concerning the extent to which the shapes of biological receptive fields in the primary visual cortex span the variabilities of spatial and spatio-temporal image structures induced by natural image transformations. We also identify a corresponding need for measuring population statistics of receptive field characteristics.
Affiliation(s)
- Tony Lindeberg
- Computational Brain Science Lab, Division of Computational Science and Technology, KTH Royal Institute of Technology, Stockholm, Sweden
8. Sihn D, Kwon OS, Kim SP. Robust and efficient representations of dynamic stimuli in hierarchical neural networks via temporal smoothing. Front Comput Neurosci 2023; 17:1164595. PMID: 37398935; PMCID: PMC10307978; DOI: 10.3389/fncom.2023.1164595.
Abstract
Introduction: Efficient coding that minimizes the informational redundancy of neural representations is a widely accepted neural coding principle. Despite this benefit, maximizing efficiency in neural coding can make neural representations vulnerable to random noise. One way to achieve robustness against random noise is to smooth neural responses. However, it is not clear whether the smoothness of neural responses can maintain robust neural representations when dynamic stimuli are processed through a hierarchical brain structure, in which not only random noise but also systematic error due to temporal lag can be induced.
Methods: In the present study, we showed that smoothness via spatio-temporally efficient coding can achieve both efficiency and robustness by effectively dealing with noise and neural delay in the visual hierarchy when processing dynamic visual stimuli.
Results: The simulation results demonstrated that a hierarchical neural network whose bidirectional synaptic connections were learned through spatio-temporally efficient coding with natural scenes could elicit neural responses to visual moving bars similar to those to static bars with the identical position and orientation, indicating robust neural responses against erroneous neural information. This implies that spatio-temporally efficient coding locally preserves the structure of visual environments in the neural responses of hierarchical structures.
Discussion: The present results suggest the importance of a balance between efficiency and robustness in neural coding for visual processing of dynamic stimuli across hierarchical brain structures.
9. Lamberti M, Tripathi S, van Putten MJAM, Marzen S, le Feber J. Prediction in cultured cortical neural networks. PNAS Nexus 2023; 2:pgad188. PMID: 37383023; PMCID: PMC10299080; DOI: 10.1093/pnasnexus/pgad188.
Abstract
Theory suggests that networks of neurons may predict their input. Prediction may underlie most aspects of information processing and is believed to be involved in motor and cognitive control and decision-making. Retinal cells have been shown to be capable of predicting visual stimuli, and there is some evidence for prediction of input in the visual cortex and hippocampus. However, there is no proof that the ability to predict is a generic feature of neural networks. We investigated whether random in vitro neuronal networks can predict stimulation, and how prediction is related to short- and long-term memory. To answer these questions, we applied two different stimulation modalities. Focal electrical stimulation has been shown to induce long-term memory traces, whereas global optogenetic stimulation did not. We used mutual information to quantify how much activity recorded from these networks reduces the uncertainty of upcoming stimuli (prediction) or recent past stimuli (short-term memory). Cortical neural networks did predict future stimuli, with the majority of all predictive information provided by the immediate network response to the stimulus. Interestingly, prediction strongly depended on short-term memory of recent sensory inputs during focal as well as global stimulation. However, prediction required less short-term memory during focal stimulation. Furthermore, the dependency on short-term memory decreased during 20 h of focal stimulation, when long-term connectivity changes were induced. These changes are fundamental for long-term memory formation, suggesting that besides short-term memory the formation of long-term memory traces may play a role in efficient prediction.
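The mutual-information quantities used here can be computed with a simple plug-in estimator. The sketch below uses synthetic data and a hypothetical response model (not the study's recordings): a binary response that is correlated with the *next* stimulus carries high "prediction" information about upcoming stimuli and near-zero "memory" information about past ones, because the stimuli are independent across time:

```python
import numpy as np

def mutual_information(a, b):
    """Plug-in mutual information estimate (bits) for discrete sequences."""
    vals_a, vals_b = np.unique(a), np.unique(b)
    p_ab = np.array([[np.mean((a == va) & (b == vb)) for vb in vals_b]
                     for va in vals_a])
    p_a = p_ab.sum(axis=1, keepdims=True)
    p_b = p_ab.sum(axis=0, keepdims=True)
    nz = p_ab > 0
    return float(np.sum(p_ab[nz] * np.log2(p_ab[nz] / (p_a @ p_b)[nz])))

rng = np.random.default_rng(0)
stim = rng.integers(0, 2, size=20000)    # i.i.d. binary stimulus train
# A "predictive" response: matches the *next* stimulus 80% of the time.
nxt = np.roll(stim, -1)
resp = np.where(rng.random(20000) < 0.8, nxt, 1 - nxt)

mi_future = mutual_information(resp[:-1], stim[1:])   # prediction
mi_past = mutual_information(resp[1:], stim[:-1])     # short-term memory
print(mi_future, mi_past)
```

With an 80% match rate the prediction information approaches 1 - H(0.2) ≈ 0.28 bits per stimulus; real recordings would of course require bias correction for the finite sample size.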
Affiliation(s)
- Martina Lamberti
- Department of Clinical Neurophysiology, University of Twente, PO Box 217, 7500 AE Enschede, The Netherlands
- Shiven Tripathi
- Department of Electrical Engineering, Indian Institute of Technology, Kanpur 208016, India
- Michel J A M van Putten
- Department of Clinical Neurophysiology, University of Twente, PO Box 217, 7500 AE Enschede, The Netherlands
- Sarah Marzen
- W. M. Keck Science Department, Pitzer, Scripps, and Claremont McKenna College, Claremont, CA 91711, USA
10. Auksztulewicz R, Rajendran VG, Peng F, Schnupp JWH, Harper NS. Omission responses in local field potentials in rat auditory cortex. BMC Biol 2023; 21:130. PMID: 37254137; DOI: 10.1186/s12915-023-01592-4.
Abstract
BACKGROUND: Non-invasive recordings of gross neural activity in humans often show responses to omitted stimuli in steady trains of identical stimuli. This has been taken as evidence for the neural coding of prediction or prediction error. However, evidence for such omission responses from invasive recordings of cellular-scale responses in animal models is scarce. Here, we sought to characterise omission responses using extracellular recordings in the auditory cortex of anaesthetised rats. We profiled omission responses across local field potentials (LFP), analogue multiunit activity (AMUA), and single/multi-unit spiking activity, using stimuli that were fixed-rate trains of acoustic noise bursts where 5% of bursts were randomly omitted.
RESULTS: Significant omission responses were observed in LFP and AMUA signals, but not in spiking activity. These omission responses had a lower amplitude and longer latency than burst-evoked sensory responses, and omission response amplitude increased as a function of the number of preceding bursts.
CONCLUSIONS: Together, our findings show that omission responses are most robustly observed in LFP and AMUA signals (relative to spiking activity). This has implications for models of cortical processing that require many neurons to encode prediction errors in their spike output.
Affiliation(s)
- Ryszard Auksztulewicz
- Center for Cognitive Neuroscience Berlin, Free University Berlin, Berlin, Germany
- Dept of Neuroscience, City University of Hong Kong, Hong Kong, Hong Kong S.A.R.
- Fei Peng
- Dept of Neuroscience, City University of Hong Kong, Hong Kong, Hong Kong S.A.R.
11. Lindeberg T. A time-causal and time-recursive scale-covariant scale-space representation of temporal signals and past time. Biol Cybern 2023; 117:21-59. PMID: 36689001; PMCID: PMC10160219; DOI: 10.1007/s00422-022-00953-6.
Abstract
This article presents an overview of a theory for performing temporal smoothing on temporal signals in such a way that: (i) temporally smoothed signals at coarser temporal scales are guaranteed to constitute simplifications of corresponding temporally smoothed signals at any finer temporal scale (including the original signal) and (ii) the temporal smoothing process is both time-causal and time-recursive, in the sense that it does not require access to future information and can be performed with no other temporal memory buffer of the past than the resulting smoothed temporal scale-space representations themselves. For specific subsets of parameter settings for the classes of linear and shift-invariant temporal smoothing operators that obey this property, it is shown how temporal scale covariance can be additionally obtained, guaranteeing that if the temporal input signal is rescaled by a uniform temporal scaling factor, then also the resulting temporal scale-space representations of the rescaled temporal signal will constitute mere rescalings of the temporal scale-space representations of the original input signal, complemented by a shift along the temporal scale dimension. The resulting time-causal limit kernel that obeys this property constitutes a canonical temporal kernel for processing temporal signals in real-time scenarios when the regular Gaussian kernel cannot be used, because of its non-causal access to information from the future, and we cannot additionally require the temporal smoothing process to comprise a complementary memory of the past beyond the information contained in the temporal smoothing process itself, which in this way also serves as a multi-scale temporal memory of the past. We describe how the time-causal limit kernel relates to previously used temporal models, such as Koenderink's scale-time kernels and the ex-Gaussian kernel. 
We also give an overview of how the time-causal limit kernel can be used for modelling the temporal processing in models for spatio-temporal and spectro-temporal receptive fields, and how it more generally has a high potential for modelling neural temporal response functions in a purely time-causal and time-recursive way, that can also handle phenomena at multiple temporal scales in a theoretically well-founded manner. We detail how this theory can be efficiently implemented for discrete data, in terms of a set of recursive filters coupled in cascade. Hence, the theory is generally applicable for both: (i) modelling continuous temporal phenomena over multiple temporal scales and (ii) digital processing of measured temporal signals in real time. We conclude by stating implications of the theory for modelling temporal phenomena in biological, perceptual, neural and memory processes by mathematical models, as well as implications regarding the philosophy of time and perceptual agents. Specifically, we propose that for A-type theories of time, as well as for perceptual agents, the notion of a non-infinitesimal inner temporal scale of the temporal receptive fields has to be included in representations of the present. The inherent nonzero temporal delay of such time-causal receptive fields then implies a need for incorporating predictions from the actual time-delayed present in the layers of a perceptual hierarchy, so that a representation of the perceptual present can constitute a representation of the environment with timing properties closer to the actual present.
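A minimal sketch of such a cascade of first-order recursive filters (parameter values illustrative): each stage updates y[n] = y[n-1] + (x[n] - y[n-1])/(1 + mu), so it is time-causal and needs no memory of the past beyond its own previous output, and coarser scales are progressively smoother simplifications of finer ones:

```python
import numpy as np

def recursive_smooth(signal, mus):
    """Cascade of first-order recursive filters. Each stage
    y[n] = y[n-1] + (x[n] - y[n-1]) / (1 + mu) is time-causal and
    time-recursive: no memory of the past is needed beyond the stage's
    own previous output."""
    out = [np.asarray(signal, dtype=float)]
    for mu in mus:
        xstage = out[-1]
        ystage = np.zeros_like(xstage)
        prev = 0.0
        for n in range(len(xstage)):
            prev = prev + (xstage[n] - prev) / (1.0 + mu)
            ystage[n] = prev
        out.append(ystage)
    return out  # representations at increasingly coarse temporal scales

# A noisy step signal smoothed over a cascade of temporal scales.
rng = np.random.default_rng(0)
sig = (np.arange(400) >= 200).astype(float) + 0.2 * rng.normal(size=400)
scales = recursive_smooth(sig, mus=[1.0, 2.0, 4.0, 8.0])
# Coarser scales are simplifications: high-frequency content only decreases.
print([round(float(np.var(np.diff(s))), 4) for s in scales])
```

The whole multi-scale representation is maintained online from the current sample alone, which is exactly the real-time property the abstract emphasises; the specific time constants μ used here are arbitrary.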
Affiliation(s)
- Tony Lindeberg
- Computational Brain Science Lab, Division of Computational Science and Technology, KTH Royal Institute of Technology, 100 44 Stockholm, Sweden
12. Willmore BDB, King AJ. Adaptation in auditory processing. Physiol Rev 2023; 103:1025-1058. PMID: 36049112; PMCID: PMC9829473; DOI: 10.1152/physrev.00011.2022.
Abstract
Adaptation is an essential feature of auditory neurons, which reduces their responses to unchanging and recurring sounds and allows their response properties to be matched to the constantly changing statistics of sounds that reach the ears. As a consequence, processing in the auditory system highlights novel or unpredictable sounds and produces an efficient representation of the vast range of sounds that animals can perceive by continually adjusting the sensitivity and, to a lesser extent, the tuning properties of neurons to the most commonly encountered stimulus values. Together with attentional modulation, adaptation to sound statistics also helps to generate neural representations of sound that are tolerant to background noise and therefore plays a vital role in auditory scene analysis. In this review, we consider the diverse forms of adaptation that are found in the auditory system in terms of the processing levels at which they arise, the underlying neural mechanisms, and their impact on neural coding and perception. We also ask what the dynamics of adaptation, which can occur over multiple timescales, reveal about the statistical properties of the environment. Finally, we examine how adaptation to sound statistics is influenced by learning and experience and changes as a result of aging and hearing loss.
Affiliation(s)
- Ben D. B. Willmore
- Department of Physiology, Anatomy and Genetics, University of Oxford, Oxford, United Kingdom
- Andrew J. King
- Department of Physiology, Anatomy and Genetics, University of Oxford, Oxford, United Kingdom
13. Price BH, Gavornik JP. Efficient Temporal Coding in the Early Visual System: Existing Evidence and Future Directions. Front Comput Neurosci 2022; 16:929348. PMID: 35874317; PMCID: PMC9298461; DOI: 10.3389/fncom.2022.929348.
Abstract
While it is universally accepted that the brain makes predictions, there is little agreement about how this is accomplished and under which conditions. Accurate prediction requires neural circuits to learn and store spatiotemporal patterns observed in the natural environment, but it is not obvious how such information should be stored, or encoded. Information theory provides a mathematical formalism that can be used to measure the efficiency and utility of different coding schemes for data transfer and storage. This theory shows that codes become efficient when they remove predictable, redundant spatial and temporal information. Efficient coding has been used to understand retinal computations and may also be relevant to understanding more complicated temporal processing in visual cortex. However, the literature on efficient coding in cortex is varied and can be confusing since the same terms are used to mean different things in different experimental and theoretical contexts. In this work, we attempt to provide a clear summary of the theoretical relationship between efficient coding and temporal prediction, and review evidence that efficient coding principles explain computations in the retina. We then apply the same framework to computations occurring in early visuocortical areas, arguing that data from rodents is largely consistent with the predictions of this model. Finally, we review and respond to criticisms of efficient coding and suggest ways that this theory might be used to design future experiments, with particular focus on understanding the extent to which neural circuits make predictions from efficient representations of environmental statistics.
14
Ivanov AZ, King AJ, Willmore BDB, Walker KMM, Harper NS. Cortical adaptation to sound reverberation. eLife 2022; 11:75090. [PMID: 35617119 PMCID: PMC9213001 DOI: 10.7554/elife.75090] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/28/2021] [Accepted: 05/25/2022] [Indexed: 11/13/2022] Open
Abstract
In almost every natural environment, sounds are reflected by nearby objects, producing many delayed and distorted copies of the original sound, known as reverberation. Our brains usually cope well with reverberation, allowing us to recognize sound sources regardless of their environments. In contrast, reverberation can cause severe difficulties for speech recognition algorithms and hearing-impaired people. The present study examines how the auditory system copes with reverberation. We trained a linear model to recover a rich set of natural, anechoic sounds from their simulated reverberant counterparts. The model neurons achieved this by extending the inhibitory component of their receptive filters for more reverberant spaces, and did so in a frequency-dependent manner. These predicted effects were observed in the responses of auditory cortical neurons of ferrets in the same simulated reverberant environments. Together, these results suggest that auditory cortical neurons adapt to reverberation by adjusting their filtering properties in a manner consistent with dereverberation.
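The dereverberation idea described above (fit a linear filter over a history of the reverberant signal so as to recover the dry signal, which yields delayed inhibitory filter components) can be sketched on synthetic data. The reverb kernel, lag count, and signals below are assumptions for illustration, not the study's stimuli or model.

```python
import numpy as np

rng = np.random.default_rng(1)

# Invented stand-in for the setup: a "dry" signal convolved with a
# simple decaying reverb kernel, plus a little noise.
n = 5000
dry = rng.standard_normal(n)
reverb_kernel = np.array([1.0, 0.6, 0.35, 0.2, 0.1])
wet = np.convolve(dry, reverb_kernel)[:n] + 0.01 * rng.standard_normal(n)

# Linear dereverberation: learn a filter over a short history of the
# reverberant input that best reconstructs the dry signal (least squares).
lags = 8
X = np.stack([np.roll(wet, k) for k in range(lags)], axis=1)
X[:lags] = 0.0  # zero out rows whose history wraps around the start
coef, *_ = np.linalg.lstsq(X, dry, rcond=None)
estimate = X @ coef

# The learned filter has a positive leading tap followed by negative
# (inhibitory) taps that cancel the delayed reverberant copies.
print(coef[:3])
```

The negative later taps are the toy analogue of the extended inhibitory receptive-field components the study reports for more reverberant spaces.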
Affiliation(s)
- Aleksandar Z Ivanov
- Department of Physiology, Anatomy and Genetics, University of Oxford, Oxford, United Kingdom
- Andrew J King
- Department of Physiology, Anatomy and Genetics, University of Oxford, Oxford, United Kingdom
- Ben D B Willmore
- Department of Physiology, Anatomy and Genetics, University of Oxford, Oxford, United Kingdom
- Kerry M M Walker
- Department of Physiology, Anatomy and Genetics, University of Oxford, Oxford, United Kingdom
- Nicol S Harper
- Department of Physiology, Anatomy and Genetics, University of Oxford, Oxford, United Kingdom
15
Raj R, Dahlen D, Duyck K, Yu CR. Maximal Dependence Capturing as a Principle of Sensory Processing. Front Comput Neurosci 2022; 16:857653. [PMID: 35399919 PMCID: PMC8989953 DOI: 10.3389/fncom.2022.857653] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/18/2022] [Accepted: 02/15/2022] [Indexed: 11/13/2022] Open
Abstract
Sensory inputs conveying information about the environment are often noisy and incomplete, yet the brain can achieve remarkable consistency in recognizing objects. Presumably, transforming the varying input patterns into invariant object representations is pivotal for this cognitive robustness. In the classic hierarchical representation framework, early stages of sensory processing utilize independent components of environmental stimuli to ensure efficient information transmission. Representations in subsequent stages are based on increasingly complex receptive fields along a hierarchical network. This framework accurately captures the input structures; however, it is challenging to achieve invariance in representing different appearances of objects. Here we assess theoretical and experimental inconsistencies of the current framework. In its place, we propose that individual neurons encode objects by following the principle of maximal dependence capturing (MDC), which compels each neuron to capture the structural components that contain maximal information about specific objects. We implement the proposition in a computational framework incorporating dimension expansion and sparse coding, which achieves consistent representations of object identities under occlusion, corruption, or high noise conditions. The framework neither requires learning the corrupted forms nor comprises deep network layers. Moreover, it explains various receptive field properties of neurons. Thus, MDC provides a unifying principle for sensory processing.
Affiliation(s)
- Rishabh Raj
- Stowers Institute for Medical Research, Kansas City, MO, United States
- Dar Dahlen
- Stowers Institute for Medical Research, Kansas City, MO, United States
- Kyle Duyck
- Stowers Institute for Medical Research, Kansas City, MO, United States
- C. Ron Yu
- Stowers Institute for Medical Research, Kansas City, MO, United States
- Department of Anatomy and Cell Biology, University of Kansas Medical Center, Kansas City, KS, United States
16
Otsuka S, Nakagawa S, Furukawa S. Expectations of the timing and intensity of a stimulus propagate to the auditory periphery through the medial olivocochlear reflex. Cereb Cortex 2022; 32:5121-5131. [PMID: 35094068 PMCID: PMC9667176 DOI: 10.1093/cercor/bhac002] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/28/2021] [Revised: 12/28/2021] [Accepted: 12/29/2021] [Indexed: 12/27/2022] Open
Abstract
Expectations concerning the timing of a stimulus enhance attention at the time at which the event occurs, which confers significant sensory and behavioral benefits. Herein, we show that temporal expectations modulate even the sensory transduction in the auditory periphery via the descending pathway. We measured the medial olivocochlear reflex (MOCR), a sound-activated efferent feedback that controls outer hair cell motility and optimizes the dynamic range of the sensory system. The MOCR was noninvasively assessed using otoacoustic emissions. We found that the MOCR was enhanced by a visual cue presented at a fixed interval before a sound but was unaffected if the interval changed between trials. The MOCR was also stronger when the learned timing expectation matched the timing of the sound, but unvaried when the two did not match. This implies that the MOCR can be voluntarily controlled in a stimulus- and goal-directed manner. Moreover, we found that the MOCR was enhanced by the expectation of a strong, but not a weak, sound intensity. This asymmetrical enhancement could facilitate antimasking and noise-protective effects without disrupting the detection of faint signals. Therefore, the descending pathway conveys temporal and intensity expectations to modulate auditory processing.
Affiliation(s)
- Sho Otsuka
- Center for Frontier Medical Engineering, Chiba University, 1-33 Yayoicho, Inageku, Chiba 263-8522, Japan (corresponding author)
- Seiji Nakagawa
- Center for Frontier Medical Engineering, Chiba University, Chiba, Japan
- Shigeto Furukawa
- NTT Communication Science Laboratories, NTT Corporation, Kanagawa, Japan
17
Brodbeck C, Bhattasali S, Cruz Heredia AAL, Resnik P, Simon JZ, Lau E. Parallel processing in speech perception with local and global representations of linguistic context. eLife 2022; 11:72056. [PMID: 35060904 PMCID: PMC8830882 DOI: 10.7554/elife.72056] [Citation(s) in RCA: 26] [Impact Index Per Article: 13.0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/08/2021] [Accepted: 01/16/2022] [Indexed: 12/03/2022] Open
Abstract
Speech processing is highly incremental. It is widely accepted that human listeners continuously use the linguistic context to anticipate upcoming concepts, words, and phonemes. However, previous evidence supports two seemingly contradictory models of how a predictive context is integrated with the bottom-up sensory input: Classic psycholinguistic paradigms suggest a two-stage process, in which acoustic input initially leads to local, context-independent representations, which are then quickly integrated with contextual constraints. This contrasts with the view that the brain constructs a single coherent, unified interpretation of the input, which fully integrates available information across representational hierarchies, and thus uses contextual constraints to modulate even the earliest sensory representations. To distinguish these hypotheses, we tested magnetoencephalography responses to continuous narrative speech for signatures of local and unified predictive models. Results provide evidence that listeners employ both types of models in parallel. Two local context models uniquely predict some part of early neural responses, one based on sublexical phoneme sequences, and one based on the phonemes in the current word alone; at the same time, even early responses to phonemes also reflect a unified model that incorporates sentence-level constraints to predict upcoming phonemes. Neural source localization places the anatomical origins of the different predictive models in nonidentical parts of the superior temporal lobes bilaterally, with the right hemisphere showing a relative preference for more local models. These results suggest that speech processing recruits both local and unified predictive models in parallel, reconciling previous disparate findings. Parallel models might make the perceptual system more robust, facilitate processing of unexpected inputs, and serve a function in language acquisition.
Affiliation(s)
- Ellen Lau
- Department of Linguistics, University of Maryland
18
Schrimpf M, Blank IA, Tuckute G, Kauf C, Hosseini EA, Kanwisher N, Tenenbaum JB, Fedorenko E. The neural architecture of language: Integrative modeling converges on predictive processing. Proc Natl Acad Sci U S A 2021; 118:e2105646118. [PMID: 34737231 PMCID: PMC8694052 DOI: 10.1073/pnas.2105646118] [Citation(s) in RCA: 116] [Impact Index Per Article: 38.7] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Accepted: 09/03/2021] [Indexed: 01/30/2023] Open
Abstract
The neuroscience of perception has recently been revolutionized with an integrative modeling approach in which computation, brain function, and behavior are linked across many datasets and many computational models. By revealing trends across models, this approach yields novel insights into cognitive and neural mechanisms in the target domain. We here present a systematic study taking this approach to higher-level cognition: human language processing, our species' signature cognitive skill. We find that the most powerful "transformer" models predict nearly 100% of explainable variance in neural responses to sentences and generalize across different datasets and imaging modalities (functional MRI and electrocorticography). Models' neural fits ("brain score") and fits to behavioral responses are both strongly correlated with model accuracy on the next-word prediction task (but not other language tasks). Model architecture appears to substantially contribute to neural fit. These results provide computationally explicit evidence that predictive processing fundamentally shapes the language comprehension mechanisms in the human brain.
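As a rough illustration of the "brain score" idea above (fit model features to neural responses, then correlate held-out predictions with the recorded responses), here is a toy sketch on synthetic data. The ridge regression, penalty, and data sizes are my assumptions, not the paper's pipeline.

```python
import numpy as np

rng = np.random.default_rng(3)

# Synthetic stand-in: model features and neural responses that are
# linearly related plus noise. All numbers here are invented.
n_stim, n_feat = 200, 20
features = rng.standard_normal((n_stim, n_feat))
true_w = rng.standard_normal(n_feat)
neural = features @ true_w + 0.5 * rng.standard_normal(n_stim)

# Fit a ridge regression on a training split...
train, test = slice(0, 150), slice(150, 200)
lam = 1.0  # ridge penalty (assumed)
A = features[train].T @ features[train] + lam * np.eye(n_feat)
w = np.linalg.solve(A, features[train].T @ neural[train])

# ...and score the model by the correlation between held-out
# predictions and the actual responses.
pred = features[test] @ w
brain_score = np.corrcoef(pred, neural[test])[0, 1]
print(brain_score)
```

A model whose features genuinely predict the responses scores near the noise ceiling; an unrelated model scores near zero.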
Affiliation(s)
- Martin Schrimpf
- Department of Brain and Cognitive Sciences, Massachusetts Institute of Technology, Cambridge, MA 02139
- McGovern Institute for Brain Research, Massachusetts Institute of Technology, Cambridge, MA 02139
- Center for Brains, Minds and Machines, Massachusetts Institute of Technology, Cambridge, MA 02139
- Idan Asher Blank
- Department of Brain and Cognitive Sciences, Massachusetts Institute of Technology, Cambridge, MA 02139
- Department of Psychology, University of California, Los Angeles, CA 90095
- Greta Tuckute
- Department of Brain and Cognitive Sciences, Massachusetts Institute of Technology, Cambridge, MA 02139
- McGovern Institute for Brain Research, Massachusetts Institute of Technology, Cambridge, MA 02139
- Carina Kauf
- Department of Brain and Cognitive Sciences, Massachusetts Institute of Technology, Cambridge, MA 02139
- McGovern Institute for Brain Research, Massachusetts Institute of Technology, Cambridge, MA 02139
- Eghbal A Hosseini
- Department of Brain and Cognitive Sciences, Massachusetts Institute of Technology, Cambridge, MA 02139
- McGovern Institute for Brain Research, Massachusetts Institute of Technology, Cambridge, MA 02139
- Nancy Kanwisher
- Department of Brain and Cognitive Sciences, Massachusetts Institute of Technology, Cambridge, MA 02139
- McGovern Institute for Brain Research, Massachusetts Institute of Technology, Cambridge, MA 02139
- Center for Brains, Minds and Machines, Massachusetts Institute of Technology, Cambridge, MA 02139
- Joshua B Tenenbaum
- Department of Brain and Cognitive Sciences, Massachusetts Institute of Technology, Cambridge, MA 02139
- Center for Brains, Minds and Machines, Massachusetts Institute of Technology, Cambridge, MA 02139
- Evelina Fedorenko
- Department of Brain and Cognitive Sciences, Massachusetts Institute of Technology, Cambridge, MA 02139
- McGovern Institute for Brain Research, Massachusetts Institute of Technology, Cambridge, MA 02139
19
Mishra AP, Peng F, Li K, Harper NS, Schnupp JWH. Sensitivity of neural responses in the inferior colliculus to statistical features of sound textures. Hear Res 2021; 412:108357. [PMID: 34739889 DOI: 10.1016/j.heares.2021.108357] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Received: 05/11/2021] [Revised: 09/04/2021] [Accepted: 09/21/2021] [Indexed: 11/16/2022]
Abstract
Previous psychophysical studies have identified a hierarchy of time-averaged statistics that determine the identity of natural sound textures. However, it is unclear whether neurons in the inferior colliculus (IC) are sensitive to each of these statistical features. We used 13 representative sound textures spanning the space of 3 statistics extracted from over 200 natural textures. The synthetic textures were generated by incorporating the statistical features in a step-by-step manner, in which a particular statistical feature was changed while the other statistical features remained unchanged. Extracellular activity in response to the synthetic texture stimuli was recorded in the IC of anesthetized rats. Analysis of the transient and sustained multiunit activity after each transition of statistical feature showed that the IC units were sensitive to changes in all types of statistics, although to varying extents. For example, more neurons were sensitive to changes in variance than to changes in the modulation correlations. Our results suggest that sensitivity to these statistical features at subcortical levels contributes to the identification and discrimination of natural sound textures.
Affiliation(s)
- Ambika P Mishra
- Department of Neuroscience, City University of Hong Kong, Hong Kong SAR
- Fei Peng
- Department of Neuroscience, City University of Hong Kong, Hong Kong SAR
- Kongyan Li
- Department of Neuroscience, City University of Hong Kong, Hong Kong SAR
- Nicol S Harper
- Department of Physiology, Anatomy and Genetics, University of Oxford, Oxford, UK
- Jan W H Schnupp
- Department of Neuroscience, City University of Hong Kong, Hong Kong SAR
20
Primary visual cortex straightens natural video trajectories. Nat Commun 2021; 12:5982. [PMID: 34645787 PMCID: PMC8514453 DOI: 10.1038/s41467-021-25939-z] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/24/2020] [Accepted: 09/08/2021] [Indexed: 11/08/2022] Open
Abstract
Many sensory-driven behaviors rely on predictions about future states of the environment. Visual input typically evolves along complex temporal trajectories that are difficult to extrapolate. We test the hypothesis that spatial processing mechanisms in the early visual system facilitate prediction by constructing neural representations that follow straighter temporal trajectories. We recorded V1 population activity in anesthetized macaques while presenting static frames taken from brief video clips, and developed a procedure to measure the curvature of the associated neural population trajectory. We found that V1 populations straighten naturally occurring image sequences, but entangle artificial sequences that contain unnatural temporal transformations. We show that these effects arise in part from computational mechanisms that underlie the stimulus selectivity of V1 cells. Together, our findings reveal that the early visual system uses a set of specialized computations to build representations that can support prediction in the natural environment. Many behaviours depend on predictions about the environment. Here the authors find neural populations in primary visual cortex to straighten the temporal trajectories of natural video clips, facilitating the extrapolation of past observations.
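A simplified version of such a trajectory-curvature measure can be sketched as the mean angle between successive difference vectors of a sequence of population vectors. This is a plausible reading of the procedure described above, not the authors' exact definition.

```python
import numpy as np

def trajectory_curvature(frames):
    """Mean discrete curvature of a sequence of population vectors:
    the angle (degrees) between successive difference vectors.
    0 degrees corresponds to a perfectly straight trajectory."""
    diffs = np.diff(np.asarray(frames, dtype=float), axis=0)
    diffs /= np.linalg.norm(diffs, axis=1, keepdims=True)
    cosines = np.sum(diffs[:-1] * diffs[1:], axis=1)
    return np.degrees(np.arccos(np.clip(cosines, -1.0, 1.0))).mean()

# A straight trajectory has zero curvature...
line = [[t, 2 * t, 3 * t] for t in range(5)]
# ...while points on a circle bend by a constant angle per step.
circle = [[np.cos(a), np.sin(a), 0.0] for a in np.linspace(0, np.pi, 7)]

print(trajectory_curvature(line))    # close to 0 degrees
print(trajectory_curvature(circle))  # close to 30 degrees
```

"Straightening" in the study's sense would then show up as natural image sequences having lower curvature in the neural representation than in the pixel representation.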
21
Exploring the distribution of statistical feature parameters for natural sound textures. PLoS One 2021; 16:e0238960. [PMID: 34161323 PMCID: PMC8221478 DOI: 10.1371/journal.pone.0238960] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/25/2020] [Accepted: 06/03/2021] [Indexed: 11/19/2022] Open
Abstract
Sounds like “running water” and “buzzing bees” are classes of sounds which are a collective result of many similar acoustic events and are known as “sound textures”. A recent psychoacoustic study using sound textures has reported that natural-sounding textures can be synthesized from white noise by imposing statistical features such as marginals and correlations computed from the outputs of cochlear models responding to the textures. These outputs are the envelopes of bandpass filter responses, the ‘cochlear envelope’. This suggests that the perceptual qualities of many natural sounds derive directly from such statistical features, and raises the question of how these statistical features are distributed in the acoustic environment. To address this question, we collected a corpus of 200 sound textures from public online sources and analyzed the distributions of the textures’ marginal statistics (mean, variance, skew, and kurtosis), cross-frequency correlations and modulation power statistics. A principal component analysis of these parameters revealed a great deal of redundancy in the texture parameters. For example, just two marginal principal components, which can be thought of as measuring the sparseness or burstiness of a texture, capture as much as 64% of the variance of the 128-dimensional marginal parameter space, while the first two principal components of cochlear correlations capture as much as 88% of the variance in the 496 correlation parameters. Knowledge of the statistical distributions documented here may help guide the choice of acoustic stimuli with high ecological validity in future research.
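The PCA-based redundancy analysis described above can be illustrated with a small synthetic stand-in for the corpus: if a few latent factors drive many parameters, the first principal components capture most of the variance. The latent-factor construction and noise level below are invented, not the paper's data.

```python
import numpy as np

rng = np.random.default_rng(2)

# Invented stand-in for the corpus: 200 "textures" whose 128 marginal
# parameters are driven mostly by two latent factors (think
# sparseness/burstiness), plus independent noise.
n_textures, n_params, n_latent = 200, 128, 2
latent = rng.standard_normal((n_textures, n_latent))
loadings = rng.standard_normal((n_latent, n_params))
data = latent @ loadings + 0.3 * rng.standard_normal((n_textures, n_params))

# PCA via SVD of the centered data; squared singular values give the
# variance captured by each principal component.
centered = data - data.mean(axis=0)
s = np.linalg.svd(centered, compute_uv=False)
explained = s**2 / np.sum(s**2)

# With two dominant latent factors, the first two PCs capture most of
# the variance, mirroring the redundancy reported for the marginals.
print(explained[:2].sum())
```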
22
Shine JM, Müller EJ, Munn B, Cabral J, Moran RJ, Breakspear M. Computational models link cellular mechanisms of neuromodulation to large-scale neural dynamics. Nat Neurosci 2021; 24:765-776. [PMID: 33958801 DOI: 10.1038/s41593-021-00824-6] [Citation(s) in RCA: 75] [Impact Index Per Article: 25.0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/10/2020] [Accepted: 02/23/2021] [Indexed: 02/02/2023]
Abstract
Decades of neurobiological research have disclosed the diverse manners in which the response properties of neurons are dynamically modulated to support adaptive cognitive functions. This neuromodulation is achieved through alterations in the biophysical properties of the neuron. However, changes in cognitive function do not arise directly from the modulation of individual neurons, but are mediated by population dynamics in mesoscopic neural ensembles. Understanding this multiscale mapping is an important but nontrivial issue. Here, we bridge these different levels of description by showing how computational models parametrically map classic neuromodulatory processes onto systems-level models of neural activity. The ensuing critical balance of systems-level activity supports perception and action, although our knowledge of this mapping remains incomplete. In this way, quantitative models that link microscale neuronal neuromodulation to systems-level brain function highlight gaps in knowledge and suggest new directions for integrating theoretical and experimental work.
Affiliation(s)
- James M Shine
- Brain and Mind Center, The University of Sydney, Camperdown, New South Wales, Australia
- Center for Complex Systems, The University of Sydney, Camperdown, New South Wales, Australia
- Eli J Müller
- Brain and Mind Center, The University of Sydney, Camperdown, New South Wales, Australia
- Center for Complex Systems, The University of Sydney, Camperdown, New South Wales, Australia
- Brandon Munn
- Brain and Mind Center, The University of Sydney, Camperdown, New South Wales, Australia
- Center for Complex Systems, The University of Sydney, Camperdown, New South Wales, Australia
- Joana Cabral
- Life and Health Sciences Research Institute (ICVS), School of Medicine, University of Minho, Braga, Portugal
- Michael Breakspear
- School of Psychology, College of Engineering, Science and the Environment, University of Newcastle, Callaghan, New South Wales, Australia
- School of Medicine and Public Health, College of Health and Medicine, University of Newcastle, Callaghan, New South Wales, Australia
23
Lotter W, Kreiman G, Cox D. A neural network trained for prediction mimics diverse features of biological neurons and perception. Nat Mach Intell 2020; 2:210-219. [PMID: 34291193 PMCID: PMC8291226 DOI: 10.1038/s42256-020-0170-9] [Citation(s) in RCA: 34] [Impact Index Per Article: 8.5] [Reference Citation Analysis] [Abstract] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/11/2019] [Accepted: 03/13/2020] [Indexed: 11/09/2022]
Abstract
Recent work has shown that convolutional neural networks (CNNs) trained on image recognition tasks can serve as valuable models for predicting neural responses in primate visual cortex. However, these models typically require biologically-infeasible levels of labeled training data, so this similarity must at least arise via different paths. In addition, most popular CNNs are solely feedforward, lacking a notion of time and recurrence, whereas neurons in visual cortex produce complex time-varying responses, even to static inputs. Towards addressing these inconsistencies with biology, here we study the emergent properties of a recurrent generative network that is trained to predict future video frames in a self-supervised manner. Remarkably, the resulting model is able to capture a wide variety of seemingly disparate phenomena observed in visual cortex, ranging from single-unit response dynamics to complex perceptual motion illusions, even when subjected to highly impoverished stimuli. These results suggest potentially deep connections between recurrent predictive neural network models and computations in the brain, providing new leads that can enrich both fields.
Affiliation(s)
- Gabriel Kreiman
- Harvard University, Cambridge, MA, USA
- Boston Children’s Hospital, Harvard Medical School, Boston, MA, USA
- Center for Brains, Minds, and Machines (CBMM), Cambridge, MA, USA
- David Cox
- Harvard University, Cambridge, MA, USA
- MIT-IBM Watson AI Lab, Cambridge, MA, USA
- IBM Research, Cambridge, MA, USA
24
Rajendran VG, Harper NS, Schnupp JWH. Auditory cortical representation of music favours the perceived beat. R Soc Open Sci 2020; 7:191194. [PMID: 32269783 PMCID: PMC7137933 DOI: 10.1098/rsos.191194] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Received: 07/11/2019] [Accepted: 02/03/2020] [Indexed: 06/02/2023]
Abstract
Previous research has shown that musical beat perception is a surprisingly complex phenomenon involving widespread neural coordination across higher-order sensory, motor and cognitive areas. However, the question of how low-level auditory processing must necessarily shape these dynamics, and therefore perception, is not well understood. Here, we present evidence that the auditory cortical representation of music, even in the absence of motor or top-down activations, already favours the beat that will be perceived. Extracellular firing rates in the rat auditory cortex were recorded in response to 20 musical excerpts diverse in tempo and genre, for which musical beat perception had been characterized by the tapping behaviour of 40 human listeners. We found that firing rates in the rat auditory cortex were on average higher on the beat than off the beat. This 'neural emphasis' distinguished the beat that was perceived from other possible interpretations of the beat, was predictive of the degree of tapping consensus across human listeners, and was accounted for by a spectrotemporal receptive field model. These findings strongly suggest that the 'bottom-up' processing of music performed by the auditory system predisposes the timing and clarity of the perceived musical beat.
Affiliation(s)
- Vani G. Rajendran
- Auditory Neuroscience Group, Department of Physiology, Anatomy, and Genetics, University of Oxford, Oxford, UK
- Department of Biomedical Sciences, City University of Hong Kong, Kowloon Tong, Hong Kong
- Nicol S. Harper
- Auditory Neuroscience Group, Department of Physiology, Anatomy, and Genetics, University of Oxford, Oxford, UK
- Jan W. H. Schnupp
- Auditory Neuroscience Group, Department of Physiology, Anatomy, and Genetics, University of Oxford, Oxford, UK
- Department of Biomedical Sciences, City University of Hong Kong, Kowloon Tong, Hong Kong
25
Shain C, Blank IA, van Schijndel M, Schuler W, Fedorenko E. fMRI reveals language-specific predictive coding during naturalistic sentence comprehension. Neuropsychologia 2020; 138:107307. [PMID: 31874149 PMCID: PMC7140726 DOI: 10.1016/j.neuropsychologia.2019.107307] [Citation(s) in RCA: 87] [Impact Index Per Article: 21.8] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/12/2019] [Revised: 12/02/2019] [Accepted: 12/13/2019] [Indexed: 11/19/2022]
Abstract
Much research in cognitive neuroscience supports prediction as a canonical computation of cognition across domains. Is such predictive coding implemented by feedback from higher-order domain-general circuits, or is it locally implemented in domain-specific circuits? What information sources are used to generate these predictions? This study addresses these two questions in the context of language processing. We present fMRI evidence from a naturalistic comprehension paradigm (1) that predictive coding in the brain's response to language is domain-specific, and (2) that these predictions are sensitive both to local word co-occurrence patterns and to hierarchical structure. Using a recently developed continuous-time deconvolutional regression technique that supports data-driven hemodynamic response function discovery from continuous BOLD signal fluctuations in response to naturalistic stimuli, we found effects of prediction measures in the language network but not in the domain-general multiple-demand network, which supports executive control processes and has been previously implicated in language comprehension. Moreover, within the language network, surface-level and structural prediction effects were separable. The predictability effects in the language network were substantial, with the model capturing over 37% of explainable variance on held-out data. These findings indicate that human sentence processing mechanisms generate predictions about upcoming words using cognitive processes that are sensitive to hierarchical structure and specialized for language processing, rather than via feedback from high-level executive control mechanisms.
Affiliation(s)
- Idan Asher Blank
- University of California Los Angeles, 90024, USA; Massachusetts Institute of Technology, 02139, USA
- William Schuler
- The Ohio State University, 43210, USA; Massachusetts General Hospital, Program in Speech and Hearing Bioscience and Technology, 02115, USA
- Evelina Fedorenko
- Massachusetts General Hospital, Program in Speech and Hearing Bioscience and Technology, 02115, USA
26
Do domain-general executive resources play a role in linguistic prediction? Re-evaluation of the evidence and a path forward. Neuropsychologia 2020; 136:107258. [DOI: 10.1016/j.neuropsychologia.2019.107258] [Citation(s) in RCA: 24] [Impact Index Per Article: 6.0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/04/2019] [Revised: 11/07/2019] [Accepted: 11/07/2019] [Indexed: 12/13/2022]
27
Richards BA, Lillicrap TP, Beaudoin P, Bengio Y, Bogacz R, Christensen A, Clopath C, Costa RP, de Berker A, Ganguli S, Gillon CJ, Hafner D, Kepecs A, Kriegeskorte N, Latham P, Lindsay GW, Miller KD, Naud R, Pack CC, Poirazi P, Roelfsema P, Sacramento J, Saxe A, Scellier B, Schapiro AC, Senn W, Wayne G, Yamins D, Zenke F, Zylberberg J, Therien D, Kording KP. A deep learning framework for neuroscience. Nat Neurosci 2019; 22:1761-1770. [PMID: 31659335 PMCID: PMC7115933 DOI: 10.1038/s41593-019-0520-2] [Citation(s) in RCA: 376] [Impact Index Per Article: 75.2] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/21/2019] [Accepted: 09/23/2019] [Indexed: 11/08/2022]
Abstract
Systems neuroscience seeks explanations for how the brain implements a wide variety of perceptual, cognitive and motor tasks. Conversely, artificial intelligence attempts to design computational systems based on the tasks they will have to solve. In artificial neural networks, the three components specified by design are the objective functions, the learning rules and the architectures. With the growing success of deep learning, which utilizes brain-inspired architectures, these three designed components have increasingly become central to how we model, engineer and optimize complex artificial learning systems. Here we argue that a greater focus on these components would also benefit systems neuroscience. We give examples of how this optimization-based framework can drive theoretical and experimental progress in neuroscience. We contend that this principled perspective on systems neuroscience will help to generate more rapid progress.
Collapse
Affiliation(s)
- Blake A Richards
- Mila, Montréal, Quebec, Canada
- School of Computer Science, McGill University, Montréal, Quebec, Canada
- Department of Neurology & Neurosurgery, McGill University, Montréal, Quebec, Canada
- Canadian Institute for Advanced Research, Toronto, Ontario, Canada
- Timothy P Lillicrap
- DeepMind, Inc., London, UK
- Centre for Computation, Mathematics and Physics in the Life Sciences and Experimental Biology, University College London, London, UK
- Yoshua Bengio
- Mila, Montréal, Quebec, Canada
- Canadian Institute for Advanced Research, Toronto, Ontario, Canada
- Université de Montréal, Montréal, Quebec, Canada
- Rafal Bogacz
- MRC Brain Network Dynamics Unit, University of Oxford, Oxford, UK
- Amelia Christensen
- Department of Electrical Engineering, Stanford University, Stanford, CA, USA
- Claudia Clopath
- Department of Bioengineering, Imperial College London, London, UK
- Rui Ponte Costa
- Computational Neuroscience Unit, School of Computer Science, Electrical and Electronic Engineering, and Engineering Maths, University of Bristol, Bristol, UK
- Department of Physiology, Universität Bern, Bern, Switzerland
- Surya Ganguli
- Department of Applied Physics, Stanford University, Stanford, CA, USA
- Google Brain, Mountain View, CA, USA
- Colleen J Gillon
- Department of Biological Sciences, University of Toronto Scarborough, Toronto, Ontario, Canada
- Department of Cell & Systems Biology, University of Toronto, Toronto, Ontario, Canada
- Danijar Hafner
- Google Brain, Mountain View, CA, USA
- Department of Computer Science, University of Toronto, Toronto, Ontario, Canada
- Vector Institute, Toronto, Ontario, Canada
- Adam Kepecs
- Cold Spring Harbor Laboratory, Cold Spring Harbor, NY, USA
- Nikolaus Kriegeskorte
- Department of Psychology and Neuroscience, Columbia University, New York, NY, USA
- Zuckerman Mind Brain Behavior Institute, Columbia University, New York, NY, USA
- Peter Latham
- Gatsby Computational Neuroscience Unit, University College London, London, UK
- Grace W Lindsay
- Zuckerman Mind Brain Behavior Institute, Columbia University, New York, NY, USA
- Center for Theoretical Neuroscience, Columbia University, New York, NY, USA
- Kenneth D Miller
- Zuckerman Mind Brain Behavior Institute, Columbia University, New York, NY, USA
- Center for Theoretical Neuroscience, Columbia University, New York, NY, USA
- Department of Neuroscience, College of Physicians and Surgeons, Columbia University, New York, NY, USA
- Richard Naud
- University of Ottawa Brain and Mind Institute, Ottawa, Ontario, Canada
- Department of Cellular and Molecular Medicine, University of Ottawa, Ottawa, Ontario, Canada
- Christopher C Pack
- Department of Neurology & Neurosurgery, McGill University, Montréal, Quebec, Canada
- Panayiota Poirazi
- Institute of Molecular Biology and Biotechnology (IMBB), Foundation for Research and Technology-Hellas (FORTH), Heraklion, Crete, Greece
- Pieter Roelfsema
- Department of Vision & Cognition, Netherlands Institute for Neuroscience, Amsterdam, Netherlands
- João Sacramento
- Institute of Neuroinformatics, ETH Zürich and University of Zürich, Zürich, Switzerland
- Andrew Saxe
- Department of Experimental Psychology, University of Oxford, Oxford, UK
- Benjamin Scellier
- Mila, Montréal, Quebec, Canada
- Université de Montréal, Montréal, Quebec, Canada
- Anna C Schapiro
- Department of Psychology, University of Pennsylvania, Philadelphia, PA, USA
- Walter Senn
- Department of Physiology, Universität Bern, Bern, Switzerland
- Daniel Yamins
- Department of Psychology, Stanford University, Stanford, CA, USA
- Department of Computer Science, Stanford University, Stanford, CA, USA
- Wu Tsai Neurosciences Institute, Stanford University, Stanford, CA, USA
- Friedemann Zenke
- Friedrich Miescher Institute for Biomedical Research, Basel, Switzerland
- Centre for Neural Circuits and Behaviour, University of Oxford, Oxford, UK
- Joel Zylberberg
- Canadian Institute for Advanced Research, Toronto, Ontario, Canada
- Department of Physics and Astronomy, York University, Toronto, Ontario, Canada
- Center for Vision Research, York University, Toronto, Ontario, Canada
- Konrad P Kording
- Canadian Institute for Advanced Research, Toronto, Ontario, Canada
- Department of Bioengineering, University of Pennsylvania, Philadelphia, PA, USA
- Department of Neuroscience, University of Pennsylvania, Philadelphia, PA, USA
28
Abstract
It has been hypothesized that stimulus-aligned brain rhythms reflect predictions about upcoming input. New research shows that these rhythms bias subsequent speech perception, in line with a mechanism of prediction.
29
Whittington JCR, Bogacz R. Theories of Error Back-Propagation in the Brain. Trends Cogn Sci 2019; 23:235-250. [PMID: 30704969 PMCID: PMC6382460 DOI: 10.1016/j.tics.2018.12.005] [Citation(s) in RCA: 137] [Impact Index Per Article: 27.4] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/07/2018] [Revised: 12/13/2018] [Accepted: 12/28/2018] [Indexed: 12/14/2022]
Abstract
This review article summarises recently proposed theories on how neural circuits in the brain could approximate the error back-propagation algorithm used by artificial neural networks. Computational models implementing these theories achieve learning as efficient as artificial neural networks, but they use simple synaptic plasticity rules based on the activity of presynaptic and postsynaptic neurons. The models share similarities, such as including both feedforward and feedback connections, allowing information about error to propagate throughout the network. Furthermore, they incorporate experimental evidence on neural connectivity, responses, and plasticity. These models provide insights into how brain networks might be organised such that modification of synaptic weights at multiple levels of the cortical hierarchy leads to improved performance on tasks.
Affiliation(s)
- James C R Whittington
- MRC Brain Network Dynamics Unit, Nuffield Department of Clinical Neurosciences, University of Oxford, Oxford OX3 9DU, UK; Wellcome Centre for Integrative Neuroimaging, Centre for Functional Magnetic Resonance Imaging of the Brain, University of Oxford, Oxford OX3 9DU, UK
- Rafal Bogacz
- MRC Brain Network Dynamics Unit, Nuffield Department of Clinical Neurosciences, University of Oxford, Oxford OX3 9DU, UK
30
Sheikh AS, Harper NS, Drefs J, Singer Y, Dai Z, Turner RE, Lücke J. STRFs in primary auditory cortex emerge from masking-based statistics of natural sounds. PLoS Comput Biol 2019; 15:e1006595. [PMID: 30653497 PMCID: PMC6382252 DOI: 10.1371/journal.pcbi.1006595] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/09/2018] [Revised: 02/20/2019] [Accepted: 10/23/2018] [Indexed: 11/19/2022] Open
Abstract
We investigate how the neural processing in auditory cortex is shaped by the statistics of natural sounds. Hypothesising that auditory cortex (A1) represents the structural primitives out of which sounds are composed, we employ a statistical model to extract such components. The inputs to the model are cochleagrams, which approximate the non-linear transformations a sound undergoes from the outer ear, through the cochlea, to the auditory nerve. Cochleagram components do not superimpose linearly, but rather according to a rule which can be approximated using the max function. This is a consequence of the compression inherent in the cochleagram and the sparsity of natural sounds. Furthermore, cochleagrams do not have negative values. Cochleagrams are therefore not matched well by the assumptions of standard linear approaches such as sparse coding or ICA. We therefore consider a new encoding approach for natural sounds, which combines a model of early auditory processing with maximal causes analysis (MCA), a sparse coding model which captures both the non-linear combination rule and the non-negativity of the data. An efficient truncated EM algorithm is used to fit the MCA model to cochleagram data. We characterize the generative fields (GFs) inferred by MCA with respect to in vivo neural responses in A1 by applying reverse correlation to estimate the spectro-temporal receptive fields (STRFs) implied by the learned GFs. Despite the GFs being non-negative, the STRF estimates are found to contain both positive and negative subfields, where the negative subfields can be attributed to explaining-away effects as captured by the applied inference method. A direct comparison with ferret A1 shows many similar forms, and the spectral and temporal modulation tuning of both ferret and model STRFs shows similar ranges over the population. In summary, our model represents an alternative to linear approaches for biological auditory encoding, while capturing salient data properties and linking inhibitory subfields to explaining-away effects.

The information carried by natural sounds enters the cortex of mammals in a specific format: the cochleagram. Instead of representing the original pressure waveforms, the inner ear represents how the energy in a sound is distributed across frequency bands and how the energy distribution evolves over time. The generation of cochleagrams is highly non-linear, resulting in the dominance of one sound source per time-frequency bin under natural conditions (masking). Auditory cortex is believed to decompose cochleagrams into structural primitives, i.e., reappearing regular spectro-temporal subpatterns that make up cochleagram patterns (similar to edges in images). However, such a decomposition has so far only been modeled without considering masking and non-negativity. Here we apply a novel non-linear sparse coding model that can capture masking non-linearities and non-negativities. When trained on cochleagrams of natural sounds, the model gives rise to an encoding primarily based on spectro-temporally localized components. If stimulated by a sound, the encoding units compete to explain its contents. The competition is a direct consequence of the statistical sound model, and it results in neural responses being best described by spectro-temporal receptive fields (STRFs) with positive and negative subfields. The emerging STRFs show a higher similarity to experimentally measured STRFs than those of a model without masking, which provides evidence that cortical encoding is consistent with the masking-based sound statistics of cochleagrams. Furthermore, and more generally, our study suggests for the first time that negative subfields of STRFs may be direct evidence for explaining-away effects resulting from performing inference in an underlying statistical model.
Affiliation(s)
- Abdul-Saboor Sheikh
- Research Center Neurosensory Science, Cluster of Excellence Hearing4all, Department of Medical Physics and Acoustics, University of Oldenburg, Oldenburg, Germany
- Zalando Research, Zalando SE, Berlin, Germany
- Nicol S. Harper
- Institute of Biomedical Engineering, Department of Engineering Science, University of Oxford, Oxford, United Kingdom
- Department of Physiology, Anatomy and Genetics, University of Oxford, Oxford, United Kingdom
- Jakob Drefs
- Research Center Neurosensory Science, Cluster of Excellence Hearing4all, Department of Medical Physics and Acoustics, University of Oldenburg, Oldenburg, Germany
- Yosef Singer
- Department of Physiology, Anatomy and Genetics, University of Oxford, Oxford, United Kingdom
- Zhenwen Dai
- Department of Computer Science, University of Sheffield, Sheffield, United Kingdom
- Richard E. Turner
- Department of Engineering, University of Cambridge, Cambridge, United Kingdom
- Microsoft Research, Cambridge, United Kingdom
- Jörg Lücke
- Research Center Neurosensory Science, Cluster of Excellence Hearing4all, Department of Medical Physics and Acoustics, University of Oldenburg, Oldenburg, Germany
31
Abstract
Our ability to make sense of the auditory world results from neural processing that begins in the ear, goes through multiple subcortical areas, and continues in the cortex. The specific contribution of the auditory cortex to this chain of processing is far from understood. Although many of the properties of neurons in the auditory cortex resemble those of subcortical neurons, they show somewhat more complex selectivity for sound features, which is likely to be important for the analysis of natural sounds, such as speech, in real-life listening conditions. Furthermore, recent work has shown that auditory cortical processing is highly context-dependent, integrates auditory inputs with other sensory and motor signals, depends on experience, and is shaped by cognitive demands, such as attention. Thus, in addition to being the locus for more complex sound selectivity, the auditory cortex is increasingly understood to be an integral part of the network of brain regions responsible for prediction, auditory perceptual decision-making, and learning. In this review, we focus on three key areas that are contributing to this understanding: the sound features that are preferentially represented by cortical neurons, the spatial organization of those preferences, and the cognitive roles of the auditory cortex.
Affiliation(s)
- Andrew J King
- Department of Physiology, Anatomy & Genetics, University of Oxford, Oxford, OX1 3PT, UK
- Sundeep Teki
- Department of Physiology, Anatomy & Genetics, University of Oxford, Oxford, OX1 3PT, UK
- Ben D B Willmore
- Department of Physiology, Anatomy & Genetics, University of Oxford, Oxford, OX1 3PT, UK