1. Bun LM, Horwitz GD. Color and luminance processing in V1 complex cells and artificial neural networks. Color Research and Application 2023; 48:841-852. [PMID: 38145033 PMCID: PMC10746296 DOI: 10.1002/col.22903] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/31/2023] [Accepted: 09/03/2023] [Indexed: 12/26/2023]
Abstract
Object recognition by natural and artificial visual systems benefits from the identification of object boundaries. A useful cue for the detection of object boundaries is the superposition of luminance and color edges. To gain insight into the suitability of this cue for object recognition, we examined convolutional neural network models that had been trained to recognize objects in natural images. We focused specifically on units in the second convolutional layer whose activations are invariant to the spatial phase of a sinusoidal grating. Some of these units were tuned for a nonlinear combination of color and luminance, which is broadly consistent with a role in object boundary detection. Others were tuned for luminance alone, but very few were tuned for color alone. A literature review reveals that V1 complex cells have a similar distribution of tuning. We speculate that this pattern of sensitivity provides an efficient basis for object recognition, perhaps by mitigating the effects of lighting on luminance contrast polarity. The absence of a contrast polarity-invariant representation of chromaticity alone suggests that it is redundant with other representations.
Affiliation(s)
- Luke M. Bun
  - Department of Bioengineering
  - Washington National Primate Research Center
- Gregory D. Horwitz
  - Department of Bioengineering
  - Washington National Primate Research Center
  - Department of Physiology and Biophysics, University of Washington, Seattle, WA, 98195
2. Hosoya H, Hyvärinen A. Learning Visual Spatial Pooling by Strong PCA Dimension Reduction. Neural Comput 2016; 28:1249-64. [PMID: 27171856 DOI: 10.1162/neco_a_00843] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.6] [Indexed: 11/04/2022]
Abstract
In visual modeling, invariance properties of visual cells are often explained by a pooling mechanism, in which outputs of neurons with similar selectivities to some stimulus parameters are integrated so as to gain some extent of invariance to other parameters. For example, the classical energy model of phase-invariant V1 complex cells pools model simple cells preferring similar orientation but different phases. Prior studies, such as independent subspace analysis, have shown that phase-invariance properties of V1 complex cells can be learned from spatial statistics of natural inputs. However, those previous approaches assumed a squaring nonlinearity on the neural outputs to capture energy correlation; such nonlinearity is arguably unnatural from a neurobiological viewpoint but hard to change due to its tight integration into their formalisms. Moreover, they used somewhat complicated objective functions requiring expensive computations for optimization. In this study, we show that visual spatial pooling can be learned in a much simpler way using strong dimension reduction based on principal component analysis. This approach learns to ignore a large part of detailed spatial structure of the input and thereby estimates a linear pooling matrix. Using this framework, we demonstrate that pooling of model V1 simple cells learned in this way, even with nonlinearities other than squaring, can reproduce standard tuning properties of V1 complex cells. For further understanding, we analyze several variants of the pooling model and argue that a reasonable pooling can generally be obtained from any kind of linear transformation that retains several of the first principal components and suppresses the remaining ones. In particular, we show how the classic Wiener filtering theory leads to one such variant.
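The classical energy model mentioned in this abstract can be illustrated with a minimal sketch. It is not the PCA-based pooling method of the paper; it only shows the baseline idea of squaring and summing two model simple cells in quadrature. The 1D Gabor filters, their parameters (`sigma`, `period`, `half_width`), and all function names are assumptions chosen for illustration.

```python
import math

def gabor_pair(sigma=4.0, period=8.0, half_width=20):
    """Return an even (cosine) and odd (sine) 1D Gabor filter in quadrature."""
    w = 2 * math.pi / period
    xs = range(-half_width, half_width + 1)
    envelope = [math.exp(-x * x / (2 * sigma ** 2)) for x in xs]
    even = [g * math.cos(w * x) for g, x in zip(envelope, xs)]
    odd = [g * math.sin(w * x) for g, x in zip(envelope, xs)]
    return even, odd

def grating(phase, period=8.0, half_width=20):
    """Sinusoidal grating at the filters' preferred frequency."""
    w = 2 * math.pi / period
    return [math.cos(w * x + phase) for x in range(-half_width, half_width + 1)]

def dot(a, b):
    return sum(p * q for p, q in zip(a, b))

def simple_response(filt, stim):
    """Half-rectified linear response of a model simple cell."""
    return max(0.0, dot(filt, stim))

def complex_energy(even, odd, stim):
    """Energy model: sum of squared quadrature-pair responses."""
    return dot(even, stim) ** 2 + dot(odd, stim) ** 2

even, odd = gabor_pair()
phases = [k * math.pi / 8 for k in range(16)]
simple = [simple_response(even, grating(p)) for p in phases]
energy = [complex_energy(even, odd, grating(p)) for p in phases]

# The simple cell is strongly phase dependent; the energy is nearly constant.
print("simple-cell range:", min(simple), max(simple))
print("energy max/min ratio:", max(energy) / min(energy))
```

In this sketch the squaring nonlinearity is hard-coded, which is exactly the assumption the abstract argues can be relaxed when pooling is learned instead.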
Affiliation(s)
- Haruo Hosoya
  - Computational Neuroscience Laboratories, ATR International, Kyoto 619-0288, Japan, and Presto, Japan Science and Technology Agency, Saitama 332-0012, Japan
- Aapo Hyvärinen
  - Department of Computer Science and HIIT, University of Helsinki, Helsinki 00560, Finland
3. Lies JP, Häfner RM, Bethge M. Slowness and sparseness have diverging effects on complex cell learning. PLoS Comput Biol 2014; 10:e1003468. [PMID: 24603197 PMCID: PMC3945087 DOI: 10.1371/journal.pcbi.1003468] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.9] [Received: 01/02/2013] [Accepted: 12/19/2013] [Indexed: 11/18/2022]
Abstract
Following earlier studies which showed that a sparse coding principle may explain the receptive field properties of complex cells in primary visual cortex, it has been concluded that the same properties may be equally derived from a slowness principle. In contrast to this claim, we here show that slowness and sparsity drive the representations towards substantially different receptive field properties. To do so, we present complete sets of basis functions learned with slow subspace analysis (SSA) in case of natural movies as well as translations, rotations, and scalings of natural images. SSA directly parallels independent subspace analysis (ISA) with the only difference that SSA maximizes slowness instead of sparsity. We find a large discrepancy between the filter shapes learned with SSA and ISA. We argue that SSA can be understood as a generalization of the Fourier transform where the power spectrum corresponds to the maximally slow subspace energies in SSA. Finally, we investigate the trade-off between slowness and sparseness when combined in one objective function.
Affiliation(s)
- Jörn-Philipp Lies
  - Werner Reichardt Centre for Integrative Neuroscience, University of Tübingen, Tübingen, Germany
- Ralf M. Häfner
  - Swartz Center for Theoretical Neurobiology, Brandeis University, Waltham, Massachusetts, United States of America
- Matthias Bethge
  - Werner Reichardt Centre for Integrative Neuroscience, University of Tübingen, Tübingen, Germany
  - Bernstein Center for Computational Neuroscience, Tübingen, Germany
  - Max Planck Institute for Biological Cybernetics, Tübingen, Germany
4. Xu J, Yang Z, Tsien JZ. Emergence of visual saliency from natural scenes via context-mediated probability distributions coding. PLoS One 2010; 5:e15796. [PMID: 21209963 PMCID: PMC3012104 DOI: 10.1371/journal.pone.0015796] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.7] [Received: 11/09/2010] [Accepted: 11/23/2010] [Indexed: 11/19/2022]
Abstract
Visual saliency is the perceptual quality that makes some items in visual scenes stand out from their immediate contexts. Visual saliency plays important roles in natural vision in that saliency can direct eye movements, deploy attention, and facilitate tasks like object detection and scene understanding. A central unsolved issue is: What features should be encoded in the early visual cortex for detecting salient features in natural scenes? To explore this important issue, we propose a hypothesis that visual saliency is based on efficient encoding of the probability distributions (PDs) of visual variables in specific contexts in natural scenes, referred to as context-mediated PDs in natural scenes. In this concept, computational units in the model of the early visual system do not act as feature detectors but rather as estimators of the context-mediated PDs of a full range of visual variables in natural scenes, which directly give rise to a measure of visual saliency of any input stimulus. To test this hypothesis, we developed a model of the context-mediated PDs in natural scenes using a modified algorithm for independent component analysis (ICA) and derived a measure of visual saliency based on these PDs estimated from a set of natural scenes. We demonstrated that visual saliency based on the context-mediated PDs in natural scenes effectively predicts human gaze in free-viewing of both static and dynamic natural scenes. This study suggests that the computation based on the context-mediated PDs of visual variables in natural scenes may underlie the neural mechanism in the early visual cortex for detecting salient features in natural scenes.
Affiliation(s)
- Jinhua Xu
  - Brain and Behavior Discovery Institute, Georgia Health Sciences University, Augusta, Georgia, United States of America
  - Department of Computer Science and Technology, East China Normal University, Shanghai, China
- Zhiyong Yang
  - Brain and Behavior Discovery Institute, Georgia Health Sciences University, Augusta, Georgia, United States of America
  - Department of Ophthalmology, Georgia Health Sciences University, Augusta, Georgia, United States of America
- Joe Z. Tsien
  - Brain and Behavior Discovery Institute, Georgia Health Sciences University, Augusta, Georgia, United States of America
  - Department of Neurology, Georgia Health Sciences University, Augusta, Georgia, United States of America
5. Malo J, Laparra V. Psychophysically tuned divisive normalization approximately factorizes the PDF of natural images. Neural Comput 2010; 22:3179-206. [PMID: 20858127 DOI: 10.1162/neco_a_00046] [Citation(s) in RCA: 24] [Impact Index Per Article: 1.6] [Indexed: 11/04/2022]
Abstract
The conventional approach in computational neuroscience in favor of the efficient coding hypothesis goes from image statistics to perception. It has been argued that the behavior of the early stages of biological visual processing (e.g., spatial frequency analyzers and their nonlinearities) may be obtained from image samples and the efficient coding hypothesis using no psychophysical or physiological information. In this work we address the same issue in the opposite direction: from perception to image statistics. We show that psychophysically fitted image representation in V1 has appealing statistical properties, for example, approximate PDF factorization and substantial mutual information reduction, even though no statistical information is used to fit the V1 model. These results are complementary evidence in favor of the efficient coding hypothesis.
Affiliation(s)
- Jesús Malo
  - Image Processing Laboratory, Universitat de València, 46980 Paterna, València, Spain.
6. Relating BOLD fMRI and neural oscillations through convolution and optimal linear weighting. Neuroimage 2009; 49:1479-89. [PMID: 19778617 DOI: 10.1016/j.neuroimage.2009.09.020] [Citation(s) in RCA: 62] [Impact Index Per Article: 3.9] [Received: 05/26/2009] [Revised: 08/21/2009] [Accepted: 09/11/2009] [Indexed: 11/21/2022]
Abstract
The exact relationship between neural activity and BOLD fMRI is unknown. However, several recent findings, recorded invasively in both humans and monkeys, show a positive correlation of BOLD to high-frequency (30-150 Hz) oscillatory power changes and a negative correlation to low-frequency (8-30 Hz) power changes arising from cortical areas. In this study, we computed the time series correlation between BOLD GE-EPI fMRI at 7 T and neural activity measures from noninvasive MEG, using a time-frequency beamformer for source localisation. A sinusoidal drifting grating was presented visually for 4 s followed by a 20 s rest period in both recording modalities. The MEG time series were convolved with either a measured or canonical haemodynamic response function (HRF) for comparison with the measured BOLD data, and the BOLD data were deconvolved with either a measured or a canonical HRF for comparison with the measured MEG. In the visual cortex, the higher frequencies (mid-gamma=52-75 Hz and high-gamma=75-98 Hz) were positively correlated with BOLD whilst the lower frequencies (alpha=8-12 Hz and beta=12-25 Hz) were negatively correlated with BOLD. Furthermore, regression including all frequency bands predicted BOLD better than stimulus timing alone, although no individual frequency band predicted BOLD as well as stimulus timing. For this paradigm, there was, in general, no difference between using the SPM canonical HRF and using the subject-specific measured HRF. In conclusion, MEG replicates findings from invasive recordings with regard to time series correlations with BOLD data. Conversely, deconvolution of BOLD data provides a neural estimate which correlates well with measured neural effects as a function of neural oscillation frequency.
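The convolution step described in this abstract can be sketched as follows. This is an illustrative sketch only: the double-gamma kernel and its parameters (shapes 6 and 16, undershoot ratio 1/6) are assumed here as a generic canonical-style HRF and are not taken from the paper, the 0.5 s sampling step is arbitrary, and a boxcar for the 4 s on / 20 s rest stimulus stands in for a real MEG power time series.

```python
import math

def gamma_pdf(t, shape, scale=1.0):
    """Gamma probability density; zero for non-positive t."""
    if t <= 0:
        return 0.0
    return t ** (shape - 1) * math.exp(-t / scale) / (math.gamma(shape) * scale ** shape)

def double_gamma_hrf(t):
    """Assumed canonical-style HRF: early peak minus a late, smaller undershoot."""
    return gamma_pdf(t, 6.0) - gamma_pdf(t, 16.0) / 6.0

def convolve(signal, kernel, dt):
    """Causal discrete convolution, truncated to the length of the signal."""
    out = []
    for n in range(len(signal)):
        acc = 0.0
        for k in range(min(n + 1, len(kernel))):
            acc += kernel[k] * signal[n - k]
        out.append(acc * dt)
    return out

dt = 0.5                                   # seconds per sample (assumed)
n_samples = int(24 / dt)                   # one 4 s stimulus + 20 s rest trial
stimulus = [1.0 if i * dt < 4.0 else 0.0 for i in range(n_samples)]
hrf = [double_gamma_hrf(i * dt) for i in range(int(32 / dt))]

predicted_bold = convolve(stimulus, hrf, dt)
peak_index = max(range(n_samples), key=lambda i: predicted_bold[i])
peak_time = peak_index * dt
print("predicted BOLD peaks at about", peak_time, "s after stimulus onset")
```

The predicted response lags the stimulus by several seconds, which is why a neural time series is convolved with an HRF before it is correlated with measured BOLD data.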
7. Hyvärinen A, Köster U. Complex cell pooling and the statistics of natural images. Network (Bristol, England) 2007; 18:81-100. [PMID: 17852755 DOI: 10.1080/09548980701418942] [Citation(s) in RCA: 16] [Impact Index Per Article: 0.9] [Indexed: 05/17/2023]
Abstract
In previous work, we presented a statistical model of natural images that produced outputs similar to receptive fields of complex cells in primary visual cortex. However, a weakness of that model was that the structure of the pooling was assumed a priori and not learned from the statistical properties of natural images. Here, we present an extended model in which the pooling nonlinearity and the size of the subspaces are optimized rather than fixed, so we make far fewer assumptions about the pooling. Results on natural images indicate that the best probabilistic representation is formed when the size of the subspaces is relatively large, and that the likelihood is considerably higher than for a simple linear model with no pooling. Further, we show that the optimal nonlinearity for the pooling is squaring. We also highlight the importance of contrast gain control for the performance of the model. Our model is novel in that it is the first to analyze optimal subspace size and how this size is influenced by contrast normalization.
Affiliation(s)
- Aapo Hyvärinen
  - Basic Research Unit, Helsinki Institute for Information Technology, Department of Computer Science, University of Helsinki, Finland.
8. Rajkai C, Lakatos P, Chen CM, Pincze Z, Karmos G, Schroeder CE. Transient cortical excitation at the onset of visual fixation. Cereb Cortex 2007; 18:200-9. [PMID: 17494059 DOI: 10.1093/cercor/bhm046] [Citation(s) in RCA: 166] [Impact Index Per Article: 9.2] [Indexed: 11/14/2022]
Abstract
Primates actively examine the visual world by rapidly shifting gaze (fixation) over the elements in a scene. Despite this fact, we typically study vision by presenting stimuli with gaze held constant. To better understand the dynamics of natural vision, we examined how the onset of visual fixation affects ongoing neuronal activity in the absence of visual stimulation. We used multiunit activity and current source density measurements to index neuronal firing patterns and underlying synaptic processes in macaque V1. Initial averaging of neural activity synchronized to the onset of fixation suggested that a brief period of cortical excitation follows each fixation. Subsequent single-trial analyses revealed that 1) neuronal oscillation phase transits from random to a highly organized state just after the fixation onset, 2) this phase concentration is accompanied by increased spectral power in several frequency bands, and 3) visual response amplitude is enhanced at the specific oscillatory phase associated with fixation. We hypothesize that nonvisual inputs are used by the brain to increase cortical excitability at fixation onset, thus "priming" the system for new visual inputs generated at fixation. Despite remaining mechanistic questions, it appears that analysis of fixation-related responses may be useful in studying natural vision.
Affiliation(s)
- Csaba Rajkai
  - Cognitive Neuroscience and Schizophrenia Program, Nathan S. Kline Institute for Psychiatric Research, Orangeburg, NY 10962, USA
9.
Abstract
The brain extracts useful features from a maelstrom of sensory information, and a fundamental goal of theoretical neuroscience is to work out how it does so. One proposed feature extraction strategy is motivated by the observation that the meaning of sensory data, such as the identity of a moving visual object, is often more persistent than the activation of any single sensory receptor. This notion is embodied in the slow feature analysis (SFA) algorithm, which uses “slowness” as a heuristic by which to extract semantic information from multidimensional time series. Here, we develop a probabilistic interpretation of this algorithm, showing that inference and learning in the limiting case of a suitable probabilistic model yield exactly the results of SFA. Similar equivalences have proved useful in interpreting and extending comparable algorithms such as independent component analysis. For SFA, we use the equivalent probabilistic model as a conceptual springboard with which to motivate several novel extensions to the algorithm.
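The slowness heuristic behind SFA can be illustrated with a minimal sketch of plain linear SFA on a two-dimensional signal. This shows the deterministic algorithm only, not the probabilistic model developed in the paper. The toy sources (`sin(t)` and `sin(11t)`), the mixing weights, and all function names are assumptions made for illustration.

```python
import math

def eig_sym2(a, b, c):
    """Eigen-decomposition of [[a, b], [b, c]]; larger eigenvalue first."""
    theta = 0.5 * math.atan2(2 * b, a - c)
    cs, sn = math.cos(theta), math.sin(theta)
    lam_big = a * cs * cs + 2 * b * cs * sn + c * sn * sn
    lam_small = a * sn * sn - 2 * b * cs * sn + c * cs * cs
    return (lam_big, (cs, sn)), (lam_small, (-sn, cs))

def second_moments(rows):
    """Entries (a, b, c) of the 2x2 second-moment matrix of the rows."""
    n = len(rows)
    a = sum(r[0] * r[0] for r in rows) / n
    b = sum(r[0] * r[1] for r in rows) / n
    c = sum(r[1] * r[1] for r in rows) / n
    return a, b, c

def linear_sfa_2d(rows):
    """Return the slowest unit-variance linear feature of a 2D time series."""
    n = len(rows)
    m0 = sum(r[0] for r in rows) / n
    m1 = sum(r[1] for r in rows) / n
    centered = [(r[0] - m0, r[1] - m1) for r in rows]
    # Step 1: whiten, so every unit-norm projection has unit variance.
    (l1, v1), (l2, v2) = eig_sym2(*second_moments(centered))
    white = [((v1[0] * x + v1[1] * y) / math.sqrt(l1),
              (v2[0] * x + v2[1] * y) / math.sqrt(l2)) for x, y in centered]
    # Step 2: pick the direction whose temporal difference has least power.
    diffs = [(white[t + 1][0] - white[t][0], white[t + 1][1] - white[t][1])
             for t in range(n - 1)]
    _, (_, w) = eig_sym2(*second_moments(diffs))
    return [w[0] * p + w[1] * q for p, q in white]

def correlation(a, b):
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return cov / math.sqrt(va * vb)

n = 1000
ts = [2 * math.pi * k / n for k in range(n)]
slow = [math.sin(t) for t in ts]
fast = [math.sin(11 * t) for t in ts]
mixed = [(0.6 * s + 0.8 * f, 0.8 * s - 0.3 * f) for s, f in zip(slow, fast)]

slow_feature = linear_sfa_2d(mixed)
print("|corr| with slow source:", abs(correlation(slow_feature, slow)))
```

The sign of the recovered feature is arbitrary, so the comparison with the slow source uses the absolute correlation.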
10. Wyss R, König P, Verschure PFMJ. A model of the ventral visual system based on temporal stability and local memory. PLoS Biol 2006; 4:e120. [PMID: 16605306 PMCID: PMC1436026 DOI: 10.1371/journal.pbio.0040120] [Citation(s) in RCA: 76] [Impact Index Per Article: 4.0] [Received: 04/13/2005] [Accepted: 02/14/2006] [Indexed: 12/20/2022]
Abstract
The cerebral cortex is a remarkably homogeneous structure suggesting a rather generic computational machinery. Indeed, under a variety of conditions, functions attributed to specialized areas can be supported by other regions. However, a host of studies have laid out an ever more detailed map of functional cortical areas. This leaves us with the puzzle of whether different cortical areas are intrinsically specialized, or whether they differ mostly by their position in the processing hierarchy and their inputs but apply the same computational principles. Here we show that the computational principle of optimal stability of sensory representations combined with local memory gives rise to a hierarchy of processing stages resembling the ventral visual pathway when it is exposed to continuous natural stimuli. Early processing stages show receptive fields similar to those observed in the primary visual cortex. Subsequent stages are selective for increasingly complex configurations of local features, as observed in higher visual areas. The last stage of the model displays place fields as observed in entorhinal cortex and hippocampus. The results suggest that functionally heterogeneous cortical areas can be generated by only a few computational principles and highlight the importance of the variability of the input signals in forming functional specialization.
Affiliation(s)
- Reto Wyss
  - Institute of Neuroinformatics, University/ETH Zürich, Zürich, Switzerland
  - Computation and Neural Systems, California Institute of Technology, Division of Biology, Pasadena, California, United States of America
- Peter König
  - Institute of Neuroinformatics, University/ETH Zürich, Zürich, Switzerland
  - Institute of Cognitive Science, University Osnabrück, Neurobiopsychologie, Osnabrück, Germany
- Paul F. M. J. Verschure
  - Institute of Neuroinformatics, University/ETH Zürich, Zürich, Switzerland
  - ICREA and Technology Department, University Pompeu Fabra, Barcelona, Spain
11. Malo J, Gutiérrez J. V1 non-linear properties emerge from local-to-global non-linear ICA. Network (Bristol, England) 2006; 17:85-102. [PMID: 16613796 DOI: 10.1080/09548980500439602] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.4] [Indexed: 05/08/2023]
Abstract
It has been argued that the aim of non-linearities in different visual and auditory mechanisms may be to remove the relations between the coefficients of the signal after global linear ICA-like stages. Specifically, in Schwartz and Simoncelli (2001), it was shown that masking effects are reproduced by fitting the parameters of a particular non-linearity in order to remove the dependencies between the energy of wavelet coefficients. In this work, we present a different result that supports the same efficient encoding hypothesis. However, this result is more general because, instead of assuming any specific functional form for the non-linearity, we show that by using an unconstrained approach, masking-like behavior emerges directly from natural images. This result is an additional indication that Barlow's efficient encoding hypothesis may explain not only the shape of receptive fields of V1 sensors but also their non-linear behavior.
Affiliation(s)
- Jesús Malo
  - Dept. d'Optica, Facultat de Física, Universitat de València, Spain.
12. Batarseh KI. Energy levels of Moiré patterns: relation to human perception. Biological Cybernetics 2005; 93:248-55. [PMID: 16189673 DOI: 10.1007/s00422-005-0001-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/10/2004] [Accepted: 05/11/2005] [Indexed: 05/04/2023]
Abstract
Quantitative analyses were undertaken to obtain a priori information regarding the energy levels of the random-dot display or Moiré patterns as a function of the angle of rotation theta by employing classical Newtonian mechanics. The energy profiles for these patterns were found to be similar for 10 degrees < theta < 350 degrees, where the energies exhibited a maximum. For theta <= 10 degrees or theta >= 350 degrees, the profiles were found to be dramatically different, especially for the focus pattern, where the profile exhibited a downward spike. Specifically, it was found that the minimum energy levels correspond to the angles of rotation where the profiles are perceived by humans. These results may provide insights into the underlying mechanism responsible for the perception of these patterns and information processing in the brain, specifically in the cerebral cortex.
Affiliation(s)
- Kareem I Batarseh
  - Alpha-Omega Biologicals, 8610 Larkview Lane, Fairfax Station, VA 22039, USA.
13. Felsen G, Touryan J, Han F, Dan Y. Cortical sensitivity to visual features in natural scenes. PLoS Biol 2005; 3:e342. [PMID: 16171408 PMCID: PMC1233414 DOI: 10.1371/journal.pbio.0030342] [Citation(s) in RCA: 113] [Impact Index Per Article: 5.7] [Received: 04/15/2005] [Accepted: 08/03/2005] [Indexed: 11/18/2022]
Abstract
A central hypothesis concerning sensory processing is that the neuronal circuits are specifically adapted to represent natural stimuli efficiently. Here we show a novel effect in cortical coding of natural images. Using spike-triggered average or spike-triggered covariance analyses, we first identified the visual features selectively represented by each cortical neuron from its responses to natural images. We then measured the neuronal sensitivity to these features when they were present in either natural images or random stimuli. We found that in the responses of complex cells, but not of simple cells, the sensitivity was markedly higher for natural images than for random stimuli. Such elevated sensitivity leads to increased detectability of the visual features and thus an improved cortical representation of natural scenes. Interestingly, this effect is due not to the spatial power spectra of natural images, but to their phase regularities. These results point to a distinct visual-coding strategy that is mediated by contextual modulation of cortical responses tuned to the spatial-phase structure of natural scenes.
Affiliation(s)
- Gidon Felsen
  - Division of Neurobiology, Department of Molecular and Cell Biology, University of California, Berkeley, California, United States of America
  - Helen Wills Neuroscience Institute, University of California, Berkeley, California, United States of America
- Jon Touryan
  - Helen Wills Neuroscience Institute, University of California, Berkeley, California, United States of America
  - Group in Vision Science, University of California, Berkeley, California, United States of America
- Feng Han
  - Helen Wills Neuroscience Institute, University of California, Berkeley, California, United States of America
  - Group in Vision Science, University of California, Berkeley, California, United States of America
- Yang Dan
  - Division of Neurobiology, Department of Molecular and Cell Biology, University of California, Berkeley, California, United States of America
  - Helen Wills Neuroscience Institute, University of California, Berkeley, California, United States of America
  - Group in Vision Science, University of California, Berkeley, California, United States of America
14. Einhäuser W, Hipp J, Eggert J, Körner E, König P. Learning viewpoint invariant object representations using a temporal coherence principle. Biological Cybernetics 2005; 93:79-90. [PMID: 16021516 DOI: 10.1007/s00422-005-0585-8] [Citation(s) in RCA: 28] [Impact Index Per Article: 1.4] [Received: 02/19/2004] [Accepted: 05/23/2005] [Indexed: 05/03/2023]
Abstract
Invariant object recognition is arguably one of the major challenges for contemporary machine vision systems. In contrast, the mammalian visual system performs this task virtually effortlessly. How can we exploit our knowledge of the biological system to improve artificial systems? Our understanding of the mammalian early visual system has been augmented by the discovery that general coding principles could explain many aspects of neuronal response properties. How can such schemes be transferred to system level performance? In the present study we train cells on a particular variant of the general principle of temporal coherence, the "stability" objective. These cells are trained on unlabeled real-world images without a teaching signal. We show that after training, the cells form a representation that is largely independent of the viewpoint from which the stimulus is looked at. This finding includes generalization to previously unseen viewpoints. The achieved representation is better suited for viewpoint invariant object classification than the cells' input patterns. This capacity to facilitate viewpoint invariant classification is maintained even if training and classification take place in the presence of an (also unlabeled) distractor object. In summary, here we show that unsupervised learning using a general coding principle facilitates the classification of real-world objects that are not segmented from the background and undergo complex, non-isomorphic transformations.
Affiliation(s)
- Wolfgang Einhäuser
  - Institute of Neuroinformatics, University & ETH Zürich, Zürich, Switzerland.
15. Hipp J, Einhäuser W, Conradt J, König P. Learning of somatosensory representations for texture discrimination using a temporal coherence principle. Network (Bristol, England) 2005; 16:223-38. [PMID: 16411497 DOI: 10.1080/09548980500361582] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.2] [Indexed: 05/06/2023]
Abstract
In order to perform appropriate actions, animals need to quickly and reliably classify their sensory input. How can representations suitable for classification be acquired from statistical properties of the animal's natural environment? Akin to behavioural studies in rats, we investigate this question using texture discrimination by the vibrissae system as a model. To account for the rat's active sensing behaviour, we record whisker movements in a hardware model. Based on these signals, we determine the response of primary neurons, modelled as spatio-temporal filters. Using their output, we train a second layer of neurons to optimise a temporal coherence objective function. The performance in classifying textures using a single cell strongly correlates with the cell's temporal coherence; hence output cells outperform primary cells. Using a simple, unsupervised classifier, the performance on the output cell population is the same as when using a sophisticated supervised classifier on the primary cells. Our results demonstrate that the optimisation of temporal coherence yields a representation that facilitates subsequent classification by selectively conveying relevant information.
Affiliation(s)
- Joerg Hipp
  - Institute of Neuroinformatics, University of Zürich & Swiss Federal Institute of Technology (ETH), Zürich, Switzerland.