1
Paiton DM, Frye CG, Lundquist SY, Bowen JD, Zarcone R, Olshausen BA. Selectivity and robustness of sparse coding networks. J Vis 2020; 20:10. [PMID: 33237290 PMCID: PMC7691792 DOI: 10.1167/jov.20.12.10] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.5] [Indexed: 11/24/2022]
Abstract
We investigate how the population nonlinearities resulting from lateral inhibition and thresholding in sparse coding networks influence neural response selectivity and robustness. We show that when compared to pointwise nonlinear models, such population nonlinearities improve the selectivity to a preferred stimulus and protect against adversarial perturbations of the input. These findings are predicted from the geometry of the single-neuron iso-response surface, which provides new insight into the relationship between selectivity and adversarial robustness. Inhibitory lateral connections curve the iso-response surface outward in the direction of selectivity. Since adversarial perturbations are orthogonal to the iso-response surface, adversarial attacks tend to be aligned with directions of selectivity. Consequently, the network is less easily fooled by perceptually irrelevant perturbations to the input. Together, these findings point to benefits of integrating computational principles found in biological vision systems into artificial neural networks.
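The contrast between pointwise and population nonlinearities described above can be sketched with a two-unit network. This is a minimal illustration, assuming recurrent dynamics in the style of the locally competitive algorithm (the abstract does not name the inference scheme); the dictionary `Phi`, the threshold `lam`, the time constant `tau`, and all function names are hypothetical choices made for this example.

```python
def dot(a, b):
    """Inner product of two equal-length lists."""
    return sum(ai * bi for ai, bi in zip(a, b))


def soft_threshold(u, lam):
    """Thresholding nonlinearity: shrink toward zero by lam."""
    if u > lam:
        return u - lam
    if u < -lam:
        return u + lam
    return 0.0


def pointwise_response(x, Phi, lam):
    """Feedforward model: each unit thresholds its own filter output."""
    return [soft_threshold(dot(phi, x), lam) for phi in Phi]


def population_response(x, Phi, lam=0.1, tau=10.0, steps=2000):
    """Recurrent model: active units inhibit units with overlapping filters."""
    n = len(Phi)
    drive = [dot(phi, x) for phi in Phi]
    # Lateral inhibition strength is the overlap between filters (no self-inhibition).
    inhibition = [[dot(Phi[i], Phi[j]) if i != j else 0.0 for j in range(n)]
                  for i in range(n)]
    u = [0.0] * n  # internal (membrane-like) state
    for _ in range(steps):
        a = [soft_threshold(ui, lam) for ui in u]
        u = [u[i] + (drive[i] - u[i] - dot(inhibition[i], a)) / tau
             for i in range(n)]
    return [soft_threshold(ui, lam) for ui in u]


# Two unit-norm filters with overlap 0.8; the input matches the first exactly.
Phi = [[1.0, 0.0], [0.8, 0.6]]
x = [1.0, 0.0]
pointwise = pointwise_response(x, Phi, lam=0.1)
population = population_response(x, Phi, lam=0.1)
```

With these values the pointwise model leaves the partially matching second unit active (about 0.7), whereas in the recurrent model the better-matching first unit suppresses it to zero, which is one simple sense in which lateral inhibition sharpens selectivity.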
Affiliation(s)
- Dylan M Paiton
- Vision Science Graduate Group, University of California Berkeley, Berkeley, CA, USA; Redwood Center for Theoretical Neuroscience, University of California Berkeley, Berkeley, CA, USA.
- Charles G Frye
- Redwood Center for Theoretical Neuroscience, University of California Berkeley, Berkeley, CA, USA; Helen Wills Neuroscience Institute, University of California Berkeley, Berkeley, CA, USA.
- Sheng Y Lundquist
- Department of Computer Science, Portland State University, Portland, OR, USA.
- Joel D Bowen
- Vision Science Graduate Group, University of California Berkeley, Berkeley, CA, USA.
- Ryan Zarcone
- Redwood Center for Theoretical Neuroscience, University of California Berkeley, Berkeley, CA, USA; Biophysics, University of California Berkeley, Berkeley, CA, USA.
- Bruno A Olshausen
- Vision Science Graduate Group, University of California Berkeley, Berkeley, CA, USA; Redwood Center for Theoretical Neuroscience, University of California Berkeley, Berkeley, CA, USA; Helen Wills Neuroscience Institute, University of California Berkeley, Berkeley, CA, USA.
2
Capparelli F, Pawelzik K, Ernst U. Constrained inference in sparse coding reproduces contextual effects and predicts laminar neural dynamics. PLoS Comput Biol 2019; 15:e1007370. [PMID: 31581240 PMCID: PMC6793885 DOI: 10.1371/journal.pcbi.1007370] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.8] [Received: 02/19/2019] [Revised: 10/15/2019] [Accepted: 09/02/2019] [Indexed: 01/16/2023]
Abstract
When probed with complex stimuli that extend beyond their classical receptive field, neurons in primary visual cortex display complex and non-linear response characteristics. Sparse coding models reproduce some of the observed contextual effects, but still fail to provide a satisfactory explanation in terms of realistic neural structures and cortical mechanisms, since the connection scheme they propose consists only of interactions among neurons with overlapping input fields. Here we propose an extended generative model for visual scenes that includes spatial dependencies among different features. We derive a neurophysiologically realistic inference scheme under the constraint that neurons have direct access only to local image information. The scheme can be interpreted as a network in primary visual cortex where two neural populations are organized in different layers within orientation hypercolumns that are connected by local, short-range and long-range recurrent interactions. When trained with natural images, the model predicts a connectivity structure linking neurons with similar orientation preferences matching the typical patterns found for long-ranging horizontal axons and feedback projections in visual cortex. Subjected to contextual stimuli typically used in empirical studies, our model replicates several hallmark effects of contextual processing and predicts characteristic differences for surround modulation between the two model populations. In summary, our model provides a novel framework for contextual processing in the visual system proposing a well-defined functional role for horizontal axons and feedback projections.
Affiliation(s)
- Federica Capparelli
- Institute for Theoretical Physics, University of Bremen, Bremen, Germany
- Klaus Pawelzik
- Institute for Theoretical Physics, University of Bremen, Bremen, Germany
- Udo Ernst
- Institute for Theoretical Physics, University of Bremen, Bremen, Germany
3
Giraldo LGS, Schwartz O. Integrating Flexible Normalization into Midlevel Representations of Deep Convolutional Neural Networks. Neural Comput 2019; 31:2138-2176. [PMID: 31525314 DOI: 10.1162/neco_a_01226] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.0] [Indexed: 11/04/2022]
Abstract
Deep convolutional neural networks (CNNs) are becoming increasingly popular models to predict neural responses in visual cortex. However, contextual effects, which are prevalent in neural processing and in perception, are not explicitly handled by current CNNs, including those used for neural prediction. In primary visual cortex, neural responses are modulated by stimuli spatially surrounding the classical receptive field in rich ways. These effects have been modeled with divisive normalization approaches, including flexible models, where spatial normalization is recruited only to the degree that responses from center and surround locations are deemed statistically dependent. We propose a flexible normalization model applied to midlevel representations of deep CNNs as a tractable way to study contextual normalization mechanisms in midlevel cortical areas. This approach captures nontrivial spatial dependencies among midlevel features in CNNs, such as those present in textures and other visual stimuli, that arise from tiling high-order features geometrically. We expect that the proposed approach can make predictions about when spatial normalization might be recruited in midlevel cortical areas. We also expect this approach to be useful as part of the CNN tool kit, therefore going beyond more restrictive fixed forms of normalization.
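The idea of recruiting surround normalization only to the degree that center and surround are deemed dependent can be sketched as a gated divisive normalization. This is a simplified illustration, not the model from the paper: the function name, the constant `sigma`, and the gating weight `p_dependent` are assumptions for this example, and how the dependence estimate is obtained is treated as given.

```python
import math


def flexible_normalization(center, surround, p_dependent, sigma=0.1):
    """Divide the center response by a pool that includes the surround
    only in proportion to p_dependent (assumed to lie in [0, 1])."""
    center_energy = center ** 2
    surround_energy = sum(s ** 2 for s in surround)
    pool = sigma ** 2 + center_energy + p_dependent * surround_energy
    return center / math.sqrt(pool)


# Same center and surround responses, two different dependence estimates.
independent = flexible_normalization(1.0, [1.0, 1.0], p_dependent=0.0)
dependent = flexible_normalization(1.0, [1.0, 1.0], p_dependent=1.0)
```

When `p_dependent` is 0 the surround has no effect on the output, and as it approaches 1 the surround increasingly suppresses the center response.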
Affiliation(s)
- Odelia Schwartz
- Computer Science Department, University of Miami, Coral Gables, FL 33146, U.S.A.
4
Hansen BC, Field DJ, Greene MR, Olson C, Miskovic V. Towards a state-space geometry of neural responses to natural scenes: A steady-state approach. Neuroimage 2019; 201:116027. [PMID: 31325643 DOI: 10.1016/j.neuroimage.2019.116027] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.6] [Received: 04/05/2019] [Revised: 06/13/2019] [Accepted: 07/16/2019] [Indexed: 10/26/2022]
Abstract
Our understanding of information processing by the mammalian visual system has come through a variety of techniques ranging from psychophysics and fMRI to single unit recording and EEG. Each technique provides unique insights into the processing framework of the early visual system. Here, we focus on the nature of the information that is carried by steady state visual evoked potentials (SSVEPs). To study the information provided by SSVEPs, we presented human participants with a population of natural scenes and measured the relative SSVEP response. Rather than focus on particular features of this signal, we focused on the full state-space of possible responses and investigated how the evoked responses are mapped onto this space. Our results show that it is possible to map the relatively high-dimensional signal carried by SSVEPs onto a 2-dimensional space with little loss. We also show that a simple biologically plausible model can account for a high proportion of the explainable variance (~73%) in that space. Finally, we describe a technique for measuring the mutual information that is available about images from SSVEPs. The techniques introduced here represent a new approach to understanding the nature of the information carried by SSVEPs. Crucially, this approach is general and can provide a means of comparing results across different neural recording methods. Altogether, our study sheds light on the encoding principles of early vision and provides a much needed reference point for understanding subsequent transformations of the early visual response space to deeper knowledge structures that link different visual environments.
Affiliation(s)
- Bruce C Hansen
- Colgate University, Department of Psychological & Brain Sciences, Neuroscience Program, Hamilton, NY, USA.
- David J Field
- Cornell University, Department of Psychology, Ithaca, NY, USA
- Cassady Olson
- Colgate University, Department of Psychological & Brain Sciences, Neuroscience Program, Hamilton, NY, USA; Current Address: University of Chicago, Committee on Computational Neuroscience, Chicago, IL, USA
- Vladimir Miskovic
- State University of New York at Binghamton, Department of Psychology, Binghamton, NY, USA
5
Hu Q, Victor JD. Two-Dimensional Hermite Filters Simplify the Description of High-Order Statistics of Natural Images. Symmetry (Basel) 2016; 8. [PMID: 27713838 PMCID: PMC5050006 DOI: 10.3390/sym8090098] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.4] [Indexed: 11/27/2022]
Abstract
Natural image statistics play a crucial role in shaping biological visual systems, understanding their function and design principles, and designing effective computer-vision algorithms. High-order statistics are critical for conveying local features, but they are challenging to study, largely because their number and variety are large. Here, via the use of two-dimensional Hermite (TDH) functions, we identify a covert symmetry in high-order statistics of natural images that simplifies this task. This emerges from the structure of TDH functions, which are an orthogonal set of functions that are organized into a hierarchy of ranks. Specifically, we find that the shape (skewness and kurtosis) of the distribution of filter coefficients depends only on the projection of the function onto a 1-dimensional subspace specific to each rank. The characterization of natural image statistics provided by TDH filter coefficients reflects both their phase and amplitude structure, and we suggest an intuitive interpretation for the special subspace within each rank.
Affiliation(s)
- Qin Hu
- Microsoft Research, One Microsoft Way, Redmond, WA 98052, USA
- Jonathan D Victor
- Feil Family Brain and Mind Research Institute, Weill Cornell Medical College, 1300 York Ave., New York, NY 10065, USA
6
Golden JR, Vilankar KP, Wu MCK, Field DJ. Conjectures regarding the nonlinear geometry of visual neurons. Vision Res 2016; 120:74-92. [PMID: 26902730 DOI: 10.1016/j.visres.2015.10.015] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.5] [Received: 08/25/2014] [Revised: 09/16/2015] [Accepted: 10/10/2015] [Indexed: 12/01/2022]
Abstract
From the earliest stages of sensory processing, neurons show inherent non-linearities: the response to a complex stimulus is not a sum of the responses to a set of constituent basis stimuli. These non-linearities come in a number of forms and have been explained in terms of a number of functional goals. The family of spatial non-linearities has included interactions that occur both within and outside of the classical receptive field. They include saturation, cross-orientation inhibition, contrast normalization, end-stopping, and a variety of non-classical effects. In addition, neurons show a number of facilitatory and invariance-related effects such as those exhibited by complex cells (integration across position). Here, we describe an approach that attempts to explain many of the non-linearities under a single geometric framework. In line with Zetzsche and colleagues (e.g., Zetzsche et al., 1999) we propose that many of the principal non-linearities can be described by a geometry where the neural response space has a simple curvature. In this paper, we focus on the geometry that produces both increased selectivity (curving outward) and increased tolerance (curving inward). We demonstrate that overcomplete sparse coding with both low-dimensional synthetic data and high-dimensional natural scene data can result in curvature that is responsible for a variety of different known non-classical effects including end-stopping and gain control. We believe that this approach provides a more fundamental explanation of these non-linearities and does not require that one postulate a variety of explanations (e.g., that gain must be controlled or the ends of lines must be detected). In its standard form, sparse coding does not, however, produce invariance/tolerance represented by inward curvature. We speculate on some of the requirements needed to produce such curvature.
Affiliation(s)
- James R Golden
- Department of Psychology, Cornell University, Ithaca, NY, USA.
- Michael C K Wu
- Biophysics Graduate Group, University of California, Berkeley, CA, USA; Lithium Technologies Inc., San Francisco, CA, USA.
- David J Field
- Department of Psychology, Cornell University, Ithaca, NY, USA.
7
Elder JH, Victor J, Zucker SW. Understanding the statistics of the natural environment and their implications for vision. Vision Res 2016; 120:1-4. [PMID: 26851343 DOI: 10.1016/j.visres.2016.01.003] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.6] [Indexed: 11/29/2022]
Affiliation(s)
- James H Elder
- Department of Electrical Engineering & Computer Science, Department of Psychology, Centre for Vision Research, York University, 4700 Keele Street Toronto, Ontario M3J 1P3, Canada.
- Jonathan Victor
- Feil Family Brain and Mind Research Institute, Weill Cornell Medical College, 1300 York Avenue, New York, NY 10065, USA.
- Steven W Zucker
- Depts. of Computer Science and Biomedical Engineering, Yale University, 51 Prospect St., New Haven, CT 06520-8285, USA.
8
Seamons JWG, Barbosa MS, Bubna-Litic A, Maddess T. A lower bound on the number of mechanisms for discriminating fourth and higher order spatial correlations. Vision Res 2015; 108:41-8. [PMID: 25624152 DOI: 10.1016/j.visres.2014.12.023] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Received: 04/28/2014] [Revised: 11/20/2014] [Accepted: 12/03/2014] [Indexed: 11/26/2022]
Abstract
Research on single striate cortical neurons has often concentrated on their responses to stimuli defined by two-point correlations. Texture discrimination studies using a relatively small palette of isotrigon textures have indicated that we are sensitive to third and higher-order spatial correlations. To further evaluate the underlying mechanisms of texture discrimination, subjects discriminated random binary noise patterns from ten new isotrigon texture types. Factor analysis revealed that as few as three mechanisms may govern the detection of fourth and higher-order image structure. This supports the findings of previous studies using different isotrigon textures. The computation of higher-order correlations by the brain is neurophysiologically plausible. The mechanisms identified in this study may represent some short-range nonlinear combination of recursive and/or rectifying processes.
Affiliation(s)
- John W G Seamons
- Eccles Institute for Neuroscience, The John Curtin School of Medical Research, The Australian National University, Canberra, ACT 0200, Australia
- Marconi S Barbosa
- Eccles Institute for Neuroscience, The John Curtin School of Medical Research, The Australian National University, Canberra, ACT 0200, Australia
- Anton Bubna-Litic
- Eccles Institute for Neuroscience, The John Curtin School of Medical Research, The Australian National University, Canberra, ACT 0200, Australia
- Ted Maddess
- Eccles Institute for Neuroscience, The John Curtin School of Medical Research, The Australian National University, Canberra, ACT 0200, Australia.
9
Natural image sequences constrain dynamic receptive fields and imply a sparse code. Brain Res 2013; 1536:53-67. [PMID: 23933349 DOI: 10.1016/j.brainres.2013.07.056] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Received: 02/04/2013] [Revised: 07/28/2013] [Accepted: 07/31/2013] [Indexed: 11/22/2022]
Abstract
In their natural environment, animals experience a complex and dynamic visual scenery. Under such natural stimulus conditions, neurons in the visual cortex employ a spatially and temporally sparse code. For the input scenario of natural still images, previous work demonstrated that unsupervised feature learning combined with the constraint of sparse coding can predict physiologically measured receptive fields of simple cells in the primary visual cortex. This convincingly indicated that the mammalian visual system is adapted to the natural spatial input statistics. Here, we extend this approach to the time domain in order to predict dynamic receptive fields that can account for both spatial and temporal sparse activation in biological neurons. We rely on temporal restricted Boltzmann machines and suggest a novel temporal autoencoding training procedure. When tested on a dynamic multi-variate benchmark dataset this method outperformed existing models of this class. Learning features on a large dataset of natural movies allowed us to model spatio-temporal receptive fields for single neurons. They resemble temporally smooth transformations of previously obtained static receptive fields and are thus consistent with existing theories. A neuronal spike response model demonstrates how the dynamic receptive field facilitates temporal and population sparseness. We discuss the potential mechanisms and benefits of a spatially and temporally sparse representation of natural visual input.
10
Memisevic R. Learning to relate images. IEEE Transactions on Pattern Analysis and Machine Intelligence 2013; 35:1829-1846. [PMID: 23787339 DOI: 10.1109/tpami.2013.53] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.6] [Indexed: 06/02/2023]
Abstract
A fundamental operation in many vision tasks, including motion understanding, stereopsis, visual odometry, or invariant recognition, is establishing correspondences between images or between images and data from other modalities. Recently, there has been increasing interest in learning to infer correspondences from data using relational, spatiotemporal, and bilinear variants of deep learning methods. These methods use multiplicative interactions between pixels or between features to represent correlation patterns across multiple images. In this paper, we review the recent work on relational feature learning, and we provide an analysis of the role that multiplicative interactions play in learning to encode relations. We also discuss how square-pooling and complex cell models can be viewed as a way to represent multiplicative interactions and thereby as a way to encode relations.
Affiliation(s)
- Roland Memisevic
- Department of Computer Science and Operations Research, University of Montreal, Montreal.
11
Rajan K, Marre O, Tkačik G. Learning quadratic receptive fields from neural responses to natural stimuli. Neural Comput 2013; 25:1661-92. [PMID: 23607557 DOI: 10.1162/neco_a_00463] [Citation(s) in RCA: 23] [Impact Index Per Article: 2.1] [Indexed: 11/04/2022]
Abstract
Models of neural responses to stimuli with complex spatiotemporal correlation structure often assume that neurons are selective for only a small number of linear projections of a potentially high-dimensional input. In this review, we explore recent modeling approaches where the neural response depends on the quadratic form of the input rather than on its linear projection, that is, the neuron is sensitive to the local covariance structure of the signal preceding the spike. To infer this quadratic dependence in the presence of arbitrary (e.g., naturalistic) stimulus distribution, we review several inference methods, focusing in particular on two information theory-based approaches (maximization of stimulus energy and of noise entropy) and two likelihood-based approaches (Bayesian spike-triggered covariance and extensions of generalized linear models). We analyze the formal relationship between the likelihood-based and information-based approaches to demonstrate how they lead to consistent inference. We demonstrate the practical feasibility of these procedures by using model neurons responding to a flickering variance stimulus.
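The difference between a response that depends on a linear projection and one that depends on a quadratic form of the input can be made concrete with a short sketch. The filters `w1` and `w2`, the stimulus `x`, and the function names are hypothetical; building the quadratic kernel `Q` as a sum of outer products is just one simple choice for illustration, not one of the inference methods reviewed in the paper.

```python
def dot(a, b):
    """Inner product of two equal-length lists."""
    return sum(ai * bi for ai, bi in zip(a, b))


def linear_response(x, w):
    """Response driven by a single linear projection w . x."""
    return dot(w, x)


def quadratic_kernel(filters):
    """Build Q = sum_k w_k w_k^T from a list of filters."""
    n = len(filters[0])
    return [[sum(w[i] * w[j] for w in filters) for j in range(n)]
            for i in range(n)]


def quadratic_response(x, Q):
    """Response driven by the quadratic form x^T Q x."""
    n = len(x)
    return sum(x[i] * Q[i][j] * x[j] for i in range(n) for j in range(n))


w1 = [1.0, -1.0, 0.0]
w2 = [0.0, 1.0, -1.0]
Q = quadratic_kernel([w1, w2])

x = [0.5, -0.2, 0.1]
x_flipped = [-v for v in x]

lin, lin_flipped = linear_response(x, w1), linear_response(x_flipped, w1)
quad, quad_flipped = quadratic_response(x, Q), quadratic_response(x_flipped, Q)
```

Flipping the sign of the stimulus flips the linear response but leaves the quadratic response unchanged, which is why a quadratic model can capture sensitivity to local stimulus covariance that a single linear projection misses.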
Affiliation(s)
- Kanaka Rajan
- Joseph Henry Laboratories of Physics and Lewis-Sigler Institute for Integrative Genomics, Princeton University, Princeton, NJ 08544, USA.
12
Sensorimotor representation and knowledge-based reasoning for spatial exploration and localisation. Cogn Process 2008; 9:283-97. [PMID: 18461375 DOI: 10.1007/s10339-008-0214-2] [Citation(s) in RCA: 12] [Impact Index Per Article: 0.8] [Received: 07/21/2007] [Accepted: 04/16/2008] [Indexed: 11/27/2022]
13
Multiresolution wavelet framework models brightness induction effects. Vision Res 2008; 48:733-51. [PMID: 18241909 DOI: 10.1016/j.visres.2007.12.008] [Citation(s) in RCA: 25] [Impact Index Per Article: 1.6] [Received: 05/10/2007] [Revised: 12/04/2007] [Accepted: 12/13/2007] [Indexed: 10/22/2022]
Abstract
A new multiresolution wavelet model is presented here, which accounts for brightness assimilation and contrast effects in a unified framework, and includes known psychophysical and physiological attributes of the primate visual system (such as spatial frequency channels, oriented receptive fields, contrast sensitivity function, contrast non-linearities, and a unified set of parameters). Like other low-level models, such as the ODOG model [Blakeslee, B., & McCourt, M. E. (1999). A multiscale spatial filtering account of the white effect, simultaneous brightness contrast and grating induction. Vision Research, 39, 4361-4377], this formulation reproduces visual effects such as simultaneous contrast, the White effect, grating induction, the Todorović effect, Mach bands, the Chevreul effect and the Adelson-Logvinenko tile effects, but it also reproduces other previously unexplained effects such as the dungeon illusion, all using a single set of parameters.
14
Abstract
No sensory stimulus is an island unto itself; rather, it can only properly be interpreted in light of the stimuli that surround it in space and time. This can result in entertaining illusions and puzzling results in psychological and neurophysiological experiments. We concentrate on perhaps the best studied test case, namely orientation or tilt, which gives rise to the notorious tilt illusion and the adaptation tilt after-effect. We review the empirical literature and discuss the computational and statistical ideas that are battling to explain these conundrums, and thereby gain favour as more general accounts of cortical processing.
Affiliation(s)
- Odelia Schwartz
- Albert Einstein College of Medicine, Jack and Pearl Resnick Campus, 1300 Morris Park Avenue, Bronx, New York 10461 (718) 430-2000, USA.
15
Nuding U, Zetzsche C. Learning the selectivity of V2 and V4 neurons using non-linear multi-layer wavelet networks. Biosystems 2007; 89:273-9. [PMID: 17324497 DOI: 10.1016/j.biosystems.2006.04.025] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.1] [Received: 11/28/2005] [Accepted: 04/21/2006] [Indexed: 11/19/2022]
Abstract
We investigate a non-linear network with two processing stages optimized to reduce the statistical dependencies in natural images. This network serves as a model for the neural information processing in the higher visual areas of primates (visual cortices V2-V4). The resulting population is analyzed with regard to non-linear selectivity and invariance properties. We find units that are very selective with respect to the space spanned by all possible input signals and units that are invariant with respect to certain stimulus classes. In comparison to the measured distribution of selectivity in V2 neurons, the selectivity histogram of the network units shows an even more pronounced tendency towards higher selectivities. A special property of the system is the emergence of non-linear interactions between coefficients from different scales and orientations, which are necessary for the exploitation of higher-order statistical redundancies of natural images. We extend the concept to multi-layer systems and present some simulation results.
Affiliation(s)
- U Nuding
- Bernstein Center for Computational Neuroscience, Ludwig-Maximilians-Universität München, Germany
16
Maddess T, Nagai Y, Victor JD, Taylor RRL. Multilevel isotrigon textures. Journal of the Optical Society of America. A, Optics, Image Science, and Vision 2007; 24:278-93. [PMID: 17206245 DOI: 10.1364/josaa.24.000278] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.2] [Indexed: 05/13/2023]
Abstract
To date, only a small palette of isotrigon textures has been available to study how the brain uses higher-order spatial correlation information. We introduce several hundred new isotrigon textures. Special modulation properties are illustrated that can be used to extract neural responses to higher-order spatial correlations. We also ask how many textures make an adequate training set and how representative individual examples are of their texture class. Human discrimination of 90 of these patterns was quantified. Modeling those responses shows that humanlike performance can be obtained provided a fourth-order classifier is used, although more than one mechanism is required.
Affiliation(s)
- Ted Maddess
- ARC Centre of Excellence in Vision Science, Research School of Biological Sciences, Australian National University, Canberra ACT 0200, Australia.
17
Victor JD, Conte MM. Encoding and stability of image statistics in working memory. Vision Res 2006; 46:4152-62. [PMID: 16996557 DOI: 10.1016/j.visres.2006.07.023] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/16/2006] [Revised: 07/20/2006] [Accepted: 07/21/2006] [Indexed: 10/24/2022]
Abstract
Visual working memory contains a representation of certain image statistics (Victor & Conte, 2004), in addition to a pixel-by-pixel representation. Here, we show that the representation of statistics is more stable in time (up to 3000 ms) than the pixel-by-pixel representation, especially for changes in luminance and local high-order statistics, and is not affected by visual masking. Bilaterally symmetric arrays and arrays with local correlations are more readily encoded than random ones, but a change in the presence of bilateral symmetry, per se, contributes only modestly to the ability to detect that an array has changed.
Affiliation(s)
- Jonathan D Victor
- Department of Neurology and Neuroscience, Weill Medical College of Cornell University, 1300 York Avenue, New York, NY 10021, USA.
18
Schwartz O, Sejnowski TJ, Dayan P. Soft mixer assignment in a hierarchical generative model of natural scene statistics. Neural Comput 2006; 18:2680-718. [PMID: 16999575 PMCID: PMC2915771 DOI: 10.1162/neco.2006.18.11.2680] [Citation(s) in RCA: 26] [Impact Index Per Article: 1.4] [Indexed: 11/04/2022]
Abstract
Gaussian scale mixture models offer a top-down description of signal generation that captures key bottom-up statistical characteristics of filter responses to images. However, the pattern of dependence among the filters for this class of models is prespecified. We propose a novel extension to the gaussian scale mixture model that learns the pattern of dependence from observed inputs and thereby induces a hierarchical representation of these inputs. Specifically, we propose that inputs are generated by gaussian variables (modeling local filter structure), multiplied by a mixer variable that is assigned probabilistically to each input from a set of possible mixers. We demonstrate inference of both components of the generative model, for synthesized data and for different classes of natural images, such as a generic ensemble and faces. For natural images, the mixer variable assignments show invariances resembling those of complex cells in visual cortex; the statistics of the gaussian components of the model are in accord with the outputs of divisive normalization models. We also show how our model helps interrelate a wide range of models of image statistics and cortical processing.
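The generative step described above, gaussian variables multiplied by a mixer variable assigned to each input, can be sketched by sampling. This is a toy illustration under stated assumptions: the assignments are fixed by hand rather than inferred probabilistically as in the model, the mixer takes only two hypothetical values (0.5 or 2.0), and all names are invented for this example.

```python
import random


def sample_inputs(n_samples, assignments, n_mixers, seed=0):
    """Draw inputs x_i = v_{k(i)} * g_i, where g_i is a unit gaussian and
    k(i) is the (hand-fixed) mixer assignment of input i."""
    rng = random.Random(seed)
    samples = []
    for _ in range(n_samples):
        mixers = [rng.choice((0.5, 2.0)) for _ in range(n_mixers)]
        samples.append([mixers[k] * rng.gauss(0.0, 1.0) for k in assignments])
    return samples


def correlation(xs, ys):
    """Pearson correlation of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / n
    vx = sum((x - mx) ** 2 for x in xs) / n
    vy = sum((y - my) ** 2 for y in ys) / n
    return cov / (vx * vy) ** 0.5


# Inputs 0 and 1 share mixer 0; inputs 2 and 3 share mixer 1.
samples = sample_inputs(20000, assignments=[0, 0, 1, 1], n_mixers=2)
x0 = [s[0] for s in samples]
x1 = [s[1] for s in samples]
x2 = [s[2] for s in samples]

raw_corr = correlation(x0, x1)
shared_energy_corr = correlation([v * v for v in x0], [v * v for v in x1])
separate_energy_corr = correlation([v * v for v in x0], [v * v for v in x2])
```

Inputs that share a mixer are approximately uncorrelated in their raw values but clearly correlated in their energies, while inputs with different mixers show neither dependence; this magnitude dependence is the statistical signature that a learned mixer assignment is meant to capture.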
Affiliation(s)
- Odelia Schwartz
- Howard Hughes Medical Institute, Computational Neurobiology Lab, Salk Institute for Biological Studies, La Jolla, CA 92037, U.S.A.
- Terrence J. Sejnowski
- Howard Hughes Medical Institute, Computational Neurobiology Lab, Salk Institute for Biological Studies, La Jolla, CA 92037, and Department of Biology, University of California at San Diego, La Jolla, CA 92093, U.S.A.
- Peter Dayan
- Gatsby Computational Neuroscience Unit, University College, London WC1N 3AR, U.K.