1
Babenko VV, Yavna DV, Ermakov PN, Anokhina PV. Nonlocal contrast calculated by the second order visual mechanisms and its significance in identifying facial emotions. F1000Res 2023; 10:274. [PMID: 37767361 PMCID: PMC10521119 DOI: 10.12688/f1000research.28396.2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 08/15/2023] [Indexed: 09/29/2023]
Abstract
Background: Previously obtained results indicate that faces are preattentively detected in the visual scene very fast, and that information on facial expression is rapidly extracted at the lower levels of the visual system. At the same time, different facial attributes make different contributions to facial expression recognition. However, it is known that among the preattentive mechanisms there are none that would be selective for certain facial features, such as the eyes or mouth. The aim of our study was to identify a candidate for the role of such a mechanism. Our assumption was that the most informative areas of the image are those characterized by spatial heterogeneity, particularly by nonlocal contrast changes. These areas may be identified in the human visual system by the second-order visual mechanisms selective to contrast modulations of brightness gradients. Methods: We developed a software program imitating the operation of these mechanisms and finding areas of contrast heterogeneity in the image. Using this program, we extracted areas with maximum, minimum and medium contrast modulation amplitudes from the initial face images, and then used these to make three variants of one and the same face. The faces were shown to the observers along with other objects synthesized in the same way. The participants had to identify faces and define facial emotional expressions. Results: It was found that the greater the contrast modulation amplitude of the areas shaping the face, the more precisely the emotion is identified. Conclusions: The results suggest that areas with a greater increase in nonlocal contrast are more informative in facial images, and that the second-order visual mechanisms can claim the role of filters that detect areas of interest, attract visual attention and act as windows through which subsequent levels of visual processing receive valuable information.
Affiliation(s)
- Vitaly V. Babenko
- Department of Psychophysiology and Clinical Psychology, Academy of Psychology and Education Sciences, Southern Federal University, Rostov-on-Don, Russian Federation
- Denis V. Yavna
- Department of Psychophysiology and Clinical Psychology, Academy of Psychology and Education Sciences, Southern Federal University, Rostov-on-Don, Russian Federation
- Pavel N. Ermakov
- Department of Psychophysiology and Clinical Psychology, Academy of Psychology and Education Sciences, Southern Federal University, Rostov-on-Don, Russian Federation
- Polina V. Anokhina
- Department of Psychophysiology and Clinical Psychology, Academy of Psychology and Education Sciences, Southern Federal University, Rostov-on-Don, Russian Federation
2
Graham NV, Wolfson SS. Varying test-pattern duration to explore the dynamics of contrast-comparison and contrast-normalization processes. J Vis 2023; 23:15. [PMID: 36689217 PMCID: PMC9896861 DOI: 10.1167/jov.23.1.15] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/22/2022] [Accepted: 11/18/2022] [Indexed: 01/24/2023]
Abstract
In this paper, we examine the dynamics of contrast-comparison and contrast-normalization processes. Observers adapted (for 1 second) to a grid of Gabor patches at one contrast; then a test pattern (which varied in duration from 12 ms to 3012 ms) was shown; and then the adapt pattern was shown again (1 second). All the Gabor patches in all the adapt patterns had 50% contrast. The test pattern was the same as the adapt pattern except that the Gabor patches in the test pattern had two different contrasts; the test contrasts varied from row to row (horizontal test pattern) or column to column (vertical test pattern). The task was to identify the orientation of the contrast variation in the test pattern (in other words, the observer performed a second-order orientation identification task). The two contrasts in each test pattern were varied while keeping the difference between the two contrasts constant. We have previously found that the observer's performance is poor for test patterns containing contrasts both above and below the adapt patterns' contrast (what we have called the "straddle effect") when the test duration is approximately 100 ms. Here, we find the straddle effect persists at all test durations we used. Other features of the results varied dramatically with test duration. We find that a simple model containing contrast-comparison and contrast-normalization processes provides a good explanation for the psychophysical results. The results provide some insight into the dynamics of these processes.
Affiliation(s)
- Norma V Graham
- Department of Psychology, Columbia University, New York, NY, USA
- S Sabina Wolfson
- Department of Psychology, Columbia University, New York, NY, USA
3
Abstract
There is a large literature on lateral effects in pattern vision but no consensus about them or comprehensive model of them. This paper reviews the literature with a focus on the effects of parallel context in the central fovea. It describes seven experiments that measure detection and discrimination thresholds in annular and Gabor-pattern contexts at different separations. It presents a model of these effects, which is an elaboration of Foley's (1994) model. The model describes the results well, and it shows that lateral context affects the response to the target by both multiplicative excitation and additive inhibition. Both lateral effects extend for several wavelengths beyond the target. They vary in relative strength, producing near suppression and far enhancement of the response to the target. The model describes the detection and discrimination results well, and it also describes the results of experiments on lateral effects on perceived contrast. The model is consistent with the physiology of V1 cells.
Affiliation(s)
- John M Foley
- Department of Psychological and Brain Sciences, University of California, Santa Barbara, Santa Barbara, CA, USA
4
Richard B, Hansen BC, Johnson AP, Shafto P. Spatial summation of broadband contrast. J Vis 2019; 19:16. [PMID: 31100132 DOI: 10.1167/19.5.16] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.6] [Indexed: 11/24/2022]
Abstract
Spatial summation of luminance contrast signals has historically been psychophysically measured with stimuli isolated in spatial frequency (i.e., narrowband). Here, we revisit the study of spatial summation with noise patterns that contain the naturalistic 1/f^α distribution of contrast across spatial frequency. We measured amplitude spectrum slope (α) discrimination thresholds and tested whether sensitivity to α improved with stimulus size. Discrimination thresholds did decrease with an increase in stimulus size. These data were modeled with a summation model originally designed for narrowband stimuli (i.e., single detecting channel; Baker & Meese, 2011; Meese & Baker, 2011) that we modified to include summation across multiple, differently tuned spatial frequency channels. To fit our data, contrast gain control weights had to be inversely related to spatial frequency (1/f); thus low spatial frequencies received significantly more divisive inhibition than higher spatial frequencies, which is a similar finding to previous models of broadband contrast perception (Haun & Essock, 2010; Haun & Peli, 2013). We found summation across spatial frequency channels to occur prior to summation across space; channel summation was near linear, and summation across space was nonlinear. Our analysis demonstrates that classical psychophysical models can be adapted to computationally define visual mechanisms under broadband visual input, with the adapted models offering novel insight on the integration of signals across channels and space.
Affiliation(s)
- Bruno Richard
- Department of Mathematics and Computer Science, Rutgers University, Newark, NJ, USA
- Bruce C Hansen
- Department of Psychological and Brain Sciences, Neuroscience Program, Colgate University, Hamilton, NY, USA
- Aaron P Johnson
- Department of Psychology, Concordia University, Montreal, Quebec, Canada
- Patrick Shafto
- Department of Mathematics and Computer Science, Rutgers University, Newark, NJ, USA
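As an illustration of the broadband stimuli described in the abstract of entry 4, the following is a minimal sketch of how a noise image with a 1/f^α amplitude spectrum might be generated and how its slope α can be recovered. The function names, image size, and normalization are assumptions made for this example and are not taken from the paper.

```python
import numpy as np

def radial_frequency(size):
    """Radial spatial frequency (cycles per image) of each FFT coefficient."""
    fx = np.fft.fftfreq(size) * size
    return np.hypot(*np.meshgrid(fx, fx))

def make_one_over_f_noise(size=128, alpha=1.0, seed=0):
    """Hypothetical helper: noise image whose amplitude spectrum is 1/f^alpha."""
    rng = np.random.default_rng(seed)
    f = radial_frequency(size)
    f[0, 0] = 1.0                      # avoid division by zero at DC
    amplitude = 1.0 / f ** alpha
    amplitude[0, 0] = 0.0              # zero-mean image
    # Borrow the phase spectrum of real white noise so the result stays real.
    phase = np.angle(np.fft.fft2(rng.standard_normal((size, size))))
    image = np.real(np.fft.ifft2(amplitude * np.exp(1j * phase)))
    return image / image.std()         # unit RMS contrast (an arbitrary choice)

def estimate_alpha(image):
    """Fit the log-log slope of the amplitude spectrum below the Nyquist limit."""
    size = image.shape[0]
    f = radial_frequency(size)
    amp = np.abs(np.fft.fft2(image))
    mask = (f > 0) & (f < size / 2)
    slope, _ = np.polyfit(np.log(f[mask]), np.log(amp[mask]), 1)
    return -slope
```

Calling `estimate_alpha(make_one_over_f_noise(alpha=1.2))` should return a value close to 1.2, since the spectrum is constructed to follow the power law exactly; real photographs would only follow it approximately.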
5
Graham NV, Wolfson SS. Is the straddle effect in contrast perception limited to second-order spatial vision? J Vis 2018; 18:15. [PMID: 29904790 PMCID: PMC5976235 DOI: 10.1167/18.5.15] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/24/2022]
Abstract
Previous work on the straddle effect in contrast perception (Foley, 2011; Graham & Wolfson, 2007; Wolfson & Graham, 2007, 2009) has used visual patterns and observer tasks of the type known as spatially second-order. After adaptation of about 1 s to a grid of Gabor patches all at one contrast, a second-order test pattern composed of two different test contrasts can be easy or difficult to perceive correctly. When the two test contrasts are both a bit less (or both a bit greater) than the adapt contrast, observers perform very well. However, when the two test contrasts straddle the adapt contrast (i.e., one of the test contrasts is greater than the adapt contrast and the other is less), performance drops dramatically. To explain this drop in performance (the straddle effect), we have suggested a contrast-comparison process. We began to wonder: Are second-order patterns necessary for the straddle effect? Here we show that the answer is "no". We demonstrate the straddle effect using spatially first-order visual patterns and several different observer tasks. We also see the effect of contrast normalization using first-order visual patterns here, analogous to our prior findings with second-order visual patterns. We did find one difference between first- and second-order tasks: Performance in the first-order tasks was slightly lower. This slightly lower performance may be due to slightly greater memory load. For many visual scenes, the important quantity in human contrast processing may not be monotonic with physical contrast but may be something more like the unsigned difference between current contrast and recent average contrast.
Affiliation(s)
- Norma V Graham
- Department of Psychology, Columbia University, New York, NY, USA
- S Sabina Wolfson
- Department of Psychology, Columbia University, New York, NY, USA
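The closing idea of the abstract of entry 5, that the relevant quantity may be the unsigned difference between current contrast and recent average contrast, can be illustrated with a toy calculation. This is only a sketch under strong assumptions: pure full-wave rectification around the adapt contrast, no noise, and no contrast normalization. The function names are hypothetical and this is not the authors' model.

```python
def comparison_response(test_contrast, adapt_contrast):
    """Unsigned difference between the current and the recent (adapt) contrast."""
    return abs(test_contrast - adapt_contrast)

def pattern_signal(contrast_a, contrast_b, adapt_contrast):
    """Toy signal available for telling apart the two contrasts in a test pattern."""
    return abs(comparison_response(contrast_a, adapt_contrast)
               - comparison_response(contrast_b, adapt_contrast))

# All three test patterns have the same physical contrast difference (20%).
adapt = 50
above = pattern_signal(55, 75, adapt)     # both test contrasts above adapt
below = pattern_signal(25, 45, adapt)     # both test contrasts below adapt
straddle = pattern_signal(40, 60, adapt)  # test contrasts straddle adapt
```

Under these assumptions the straddling pattern yields no signal at all, because both of its contrasts differ from the adapt contrast by the same unsigned amount, while the other two patterns keep the full 20% difference.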
6
Abstract
Human contrast sensitivity for narrowband Gabor targets is suppressed when superimposed on narrowband masks of the same spatial frequency and orientation (referred to as overlay suppression), with suppression being broadly tuned to orientation and spatial frequency. Numerous behavioral and neurophysiological experiments have suggested that overlay suppression originates from the initial lateral geniculate nucleus (LGN) inputs to V1, which is consistent with the broad tuning typically reported for overlay suppression. However, recent reports have shown narrowly tuned anisotropic overlay suppression when narrowband targets are masked by broadband noise. Consequently, researchers have argued for an additional form of overlay suppression that involves cortical contrast gain control processes. The current study sought to further explore this notion behaviorally using narrowband and broadband masks, along with a computational neural simulation of the hypothesized underlying gain control processes in cortex. Additionally, we employed transcranial direct current stimulation (tDCS) in order to test whether cortical processes are involved in driving narrowly tuned anisotropic suppression. The behavioral results yielded anisotropic overlay suppression for both broadband and narrowband masks and could be replicated with our computational neural simulation of anisotropic gain control. Further, the anisotropic form of overlay suppression could be directly modulated by tDCS, which would not be expected if the suppression was primarily subcortical in origin. Altogether, the results of the current study provide further evidence in support of an additional overlay suppression process that originates in cortex and show that this form of suppression is also observable with narrowband masks.
7
Abstract
To understand how different spatial frequencies contribute to the overall perceived contrast of complex, broadband photographic images, we adapted the classification image paradigm. Using natural images as stimuli, we randomly varied relative contrast amplitude at different spatial frequencies and had human subjects determine which images had higher contrast. Then, we determined how the random variations corresponded with the human judgments. We found that the overall contrast of an image is disproportionately determined by how much contrast is between 1 and 6 c/°, around the peak of the contrast sensitivity function (CSF). We then employed the basic components of contrast psychophysics modeling to show that the CSF alone is not enough to account for our results and that an increase in gain control strength toward low spatial frequencies is necessary. One important consequence of this is that contrast constancy, the apparent independence of suprathreshold perceived contrast and spatial frequency, will not hold during viewing of natural images. We also found that images with darker low-luminance regions tended to be judged as having higher overall contrast, which we interpret as the consequence of darker local backgrounds resulting in higher band-limited contrast response in the visual system.
Affiliation(s)
- Andrew M Haun
- Schepens Eye Research Institute, Massachusetts Eye and Ear, Harvard Medical School, Boston, MA, USA
8
Westrick ZM, Henry CA, Landy MS. Inconsistent channel bandwidth estimates suggest winner-take-all nonlinearity in second-order vision. Vision Res 2013; 81:58-68. [PMID: 23416867 DOI: 10.1016/j.visres.2013.01.010] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.5] [Received: 11/07/2012] [Revised: 01/23/2013] [Accepted: 01/29/2013] [Indexed: 10/27/2022]
Abstract
The processing of texture patterns has been characterized by a model that postulates a first-stage linear filter to highlight a component texture, a pointwise rectification stage to convert contrast for the highlighted texture into mean response strength, followed by a second-stage linear filter to detect the texture-defined pattern. We estimated the spatial-frequency bandwidth of the second-stage filter mediating orientation discrimination of orientation-modulated second-order gratings by measuring threshold elevation in the presence of filtered noise added to the modulation signal. This experiment yielded no evidence for frequency tuning. A second experiment, in which subjects had to detect similar second-order gratings while judging their modulation frequency, produced bandwidth estimates of 1-1.5 octaves, similar to estimated bandwidths of first-order channels. We propose that an additional dominant-response-selection nonlinearity can account for these apparently contradictory results.
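The three-stage scheme named in the abstract of entry 8 (first-stage linear filter, pointwise rectification, second-stage linear filter) can be sketched as follows. This is a minimal illustration, not the authors' model: the Gabor parameters, the squaring rectifier, the contrast-modulated test stimulus, and all function names are assumptions chosen for the example, and the proposed winner-take-all stage is not included.

```python
import numpy as np

def grid(size):
    """Pixel coordinates centered on the image."""
    half = size // 2
    y, x = np.mgrid[-half:half, -half:half]
    return x, y

def gabor(size, freq, theta, sigma, phase=0.0):
    """Gabor kernel; freq in cycles/pixel, theta is the direction of variation."""
    x, y = grid(size)
    u = x * np.cos(theta) + y * np.sin(theta)
    return np.exp(-(x**2 + y**2) / (2 * sigma**2)) * np.cos(2 * np.pi * freq * u + phase)

def linear_filter(image, kernel):
    """Circular convolution via the FFT (kernel has the same size as the image)."""
    k = np.fft.fft2(np.fft.ifftshift(kernel))
    return np.real(np.fft.ifft2(np.fft.fft2(image) * k))

def frf_response(image, carrier_freq, carrier_theta, mod_freq, mod_theta):
    """Filter-rectify-filter: returns the RMS energy of the second-stage output."""
    size = image.shape[0]
    first = linear_filter(image, gabor(size, carrier_freq, carrier_theta, sigma=4.0))
    rectified = first ** 2                      # pointwise rectification (squaring)
    rectified = rectified - rectified.mean()
    even = linear_filter(rectified, gabor(size, mod_freq, mod_theta, sigma=16.0))
    odd = linear_filter(rectified, gabor(size, mod_freq, mod_theta, sigma=16.0,
                                         phase=np.pi / 2))
    return np.sqrt(np.mean(even**2 + odd**2))

def contrast_modulated_grating(size=128, carrier_freq=0.25, mod_freq=1 / 32,
                               mod_theta=np.pi / 2):
    """Vertical carrier grating whose contrast varies along mod_theta."""
    x, y = grid(size)
    v = x * np.cos(mod_theta) + y * np.sin(mod_theta)
    envelope = 1 + 0.8 * np.cos(2 * np.pi * mod_freq * v)
    return envelope * np.cos(2 * np.pi * carrier_freq * x)
```

For the default stimulus, a second-stage filter aligned with the contrast modulation (`mod_theta=np.pi / 2`) should respond far more strongly than one oriented orthogonally to it (`mod_theta=0.0`), even though the mean luminance carries no information about the modulation.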
9
The efficacy of local luminance amplitude in disambiguating the origin of luminance signals depends on carrier frequency: Further evidence for the active role of second-order vision in layer decomposition. Vision Res 2011; 51:496-507. [DOI: 10.1016/j.visres.2011.01.008] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.6] [Received: 10/16/2010] [Revised: 01/16/2011] [Accepted: 01/19/2011] [Indexed: 11/24/2022]
10
Graham NV. Beyond multiple pattern analyzers modeled as linear filters (as classical V1 simple cells): useful additions of the last 25 years. Vision Res 2011; 51:1397-430. [PMID: 21329718 DOI: 10.1016/j.visres.2011.02.007] [Citation(s) in RCA: 57] [Impact Index Per Article: 4.4] [Received: 05/03/2010] [Revised: 02/07/2011] [Accepted: 02/09/2011] [Indexed: 11/28/2022]
Abstract
This review briefly discusses processes that have been suggested in the last 25 years as important to the intermediate stages of visual processing of patterns. Five categories of processes are presented: (1) Higher-order processes including FRF structures; (2) Divisive contrast nonlinearities including contrast normalization; (3) Subtractive contrast nonlinearities including contrast comparison; (4) Non-classical receptive fields (surround suppression, cross-orientation inhibition); (5) Contour integration.
Affiliation(s)
- Norma V Graham
- Department of Psychology, Columbia University, NY, NY 10027, USA.
11
Markounikau V, Igel C, Grinvald A, Jancke D. A dynamic neural field model of mesoscopic cortical activity captured with voltage-sensitive dye imaging. PLoS Comput Biol 2010; 6:e1000919. [PMID: 20838578 PMCID: PMC2936513 DOI: 10.1371/journal.pcbi.1000919] [Citation(s) in RCA: 42] [Impact Index Per Article: 3.0] [Received: 08/20/2009] [Accepted: 08/04/2010] [Indexed: 11/18/2022]
Abstract
A neural field model is presented that captures the essential non-linear characteristics of activity dynamics across several millimeters of visual cortex in response to local flashed and moving stimuli. We account for physiological data obtained by voltage-sensitive dye (VSD) imaging which reports mesoscopic population activity at high spatio-temporal resolution. Stimulation included a single flashed square, a single flashed bar, the line-motion paradigm (for which psychophysical studies showed that flashing a square briefly before a bar produces a sensation of illusory motion within the bar), and moving-square controls. We consider a two-layer neural field (NF) model describing an excitatory and an inhibitory layer of neurons as a coupled system of non-linear integro-differential equations. Under the assumption that the aggregated activity of both layers is reflected by VSD imaging, our phenomenological model quantitatively accounts for the observed spatio-temporal activity patterns. Moreover, the model generalizes to novel similar stimuli as it matches activity evoked by moving squares of different speeds. Our results indicate that feedback from higher brain areas is not required to produce motion patterns in the case of the illusory line-motion paradigm. Physiological interpretation of the model suggests that a considerable fraction of the VSD signal may be due to inhibitory activity, supporting the notion that balanced intra-layer cortical interactions between inhibitory and excitatory populations play a major role in shaping dynamic stimulus representations in the early visual cortex.
12
Grabska-Barwińska A, Distler C, Hoffmann KP, Jancke D. Contrast independence of cardinal preference: stable oblique effect in orientation maps of ferret visual cortex. Eur J Neurosci 2009; 29:1258-70. [PMID: 19302161 DOI: 10.1111/j.1460-9568.2009.06656.x] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.0] [Indexed: 11/28/2022]
Abstract
The oblique effect was first described as enhanced detection and discrimination of cardinal orientations compared with oblique orientations. Such biases in visual processing are believed to originate from a functional adaptation to environmental statistics dominated by cardinal contours. At the neuronal level, the oblique orientation effect corresponds to the numerical overrepresentation and narrower tuning bandwidths of cortical neurons representing the cardinal axes. The anisotropic distribution of orientation preferences over large cortical regions was revealed with optical imaging, providing further evidence for the cortical oblique effect in several mammalian species. Our present study explores whether the dominant representation of cardinal contours persists at different stimulus contrasts. Performing intrinsic optical imaging in the ferret visual cortex and presenting drifting gratings at various orientations and contrasts (100%, 30% and 10%), we found that the overrepresentation of vertical and horizontal contours was invariant across stimulus contrasts. In addition, the responses to cardinal orientations were also more robust and evoked larger modulation depths than responses to oblique orientations. We conclude that orientation maps remain constant across the full range of contrast levels down to detection thresholds. Thus, a stable layout of the functional architecture dedicated to processing oriented edges seems to reflect a fundamental coding strategy of the early visual cortex.
13
Abstract
The tilt illusion is a paradigmatic example of contextual influences on perception. We analyze it in terms of a neural population model for the perceptual organization of visual orientation. In turn, this is based on a well-founded treatment of natural scene statistics, known as the Gaussian Scale Mixture model. This model is closely related to divisive gain control in neural processing and has been extensively applied in the image processing and statistical learning communities; however, its implications for contextual effects in biological vision have not been studied. In our model, oriented neural units associated with surround tilt stimuli participate in divisively normalizing the activities of the units representing a center stimulus, thereby changing their tuning curves. We show that through standard population decoding, these changes lead to the forms of repulsion and attraction observed in the tilt illusion. The issues in our model readily generalize to other visual attributes and contextual phenomena, and should lead to more rigorous treatments of contextual effects based on natural scene statistics.
Affiliation(s)
- Odelia Schwartz
- Dominick P. Purpura Department of Neuroscience, Albert Einstein College of Medicine, Bronx, NY 10461, USA.
14
Nagai Y, Taylor RRL, Loh YW, Maddess T. Discrimination of complex form by simple oscillator networks. Network 2009; 20:233-252. [PMID: 19919282 DOI: 10.3109/09548980903373879] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.2] [Indexed: 05/28/2023]
Abstract
Natural images are rich in higher order spatial correlations. Brain scanning, psychophysics and electrophysiology indicate that humans are sensitive to these image properties. A useful tool for exploring this sense is the set of isotrigon textures. Like natural images these textures have low dimensionality relative to random images, but like random images contain no average structure in their first to third order correlation functions. Thus, the structured appearance of these textures results from higher order correlations. One way to generate the higher order products inherent in higher order correlations is recursive nonlinear processing. We therefore decided to examine if very small oscillator networks could produce a profile of activity that matches human isotrigon discrimination performance across 53 isotrigon texture types. Human performance was measured in 23 subjects. The two best network types found contained as few as 4 oscillators. The input oscillators are of a novel cubic form and the final readout oscillator was a logistic oscillator. Mean readout oscillator activity matched human performance reasonably well even though the network parameters were fixed for all 53 texture types. Overall it appears that relatively simple, short range, and biologically plausible, recursive processing could provide the basis for discrimination of complex form.
Affiliation(s)
- Yoshinori Nagai
- Center for Information Science, Kokushikan University, Tokyo, Japan
15
Abstract
Recent work has revealed multiple pathways for cross-orientation suppression in cat and human vision. In particular, ipsiocular and interocular pathways appear to assert their influence before binocular summation in human but have different (1) spatial tuning, (2) temporal dependencies, and (3) adaptation after-effects. Here we use mask components that fall outside the excitatory passband of the detecting mechanism to investigate the rules for pooling multiple mask components within these pathways. We measured psychophysical contrast masking functions for vertical 1 cycle/deg sine-wave gratings in the presence of left or right oblique (±45 deg) 3 cycles/deg mask gratings with contrast C%, or a plaid made from their sum, where each component (i) had contrast 0.5Ci%. Masks and targets were presented to two eyes (binocular), one eye (monoptic), or different eyes (dichoptic). Binocular-masking functions superimposed when plotted against C, but in the monoptic and dichoptic conditions, the grating produced slightly more suppression than the plaid when Ci ≥ 16%. We tested contrast gain control models involving two types of contrast combination on the denominator: (1) spatial pooling of the mask after a local nonlinearity (to calculate either root mean square contrast or energy) and (2) “linear suppression” (Holmes & Meese, 2004, Journal of Vision, 4, 1080–1089), involving the linear sum of the mask component contrasts. Monoptic and dichoptic masking were typically better fit by the spatial pooling models, but binocular masking was not: it demanded strict linear summation of the Michelson contrast across mask orientation. Another scheme, in which suppressive pooling followed compressive contrast responses to the mask components (e.g., oriented cortical cells), was ruled out by all of our data. We conclude that the different processes that underlie monoptic and dichoptic masking use the same type of contrast pooling within their respective suppressive fields, but the effects do not sum to predict the binocular case.
16
Carrasco M, Loula F, Ho YX. How attention enhances spatial resolution: Evidence from selective adaptation to spatial frequency. Percept Psychophys 2006; 68:1004-12. [PMID: 17153194 DOI: 10.3758/bf03193361] [Citation(s) in RCA: 62] [Impact Index Per Article: 3.4] [Indexed: 11/08/2022]
Abstract
In this study, we investigated how spatial resolution and covert attention affect performance in a texture segmentation task in which performance peaks at midperiphery and drops at peripheral and central retinal locations. The central impairment is called the central performance drop (CPD; Kehrer, 1989). It has been established that attending to the target location improves performance in the periphery where resolution is too low for the task, but impairs it at central locations where resolution is too high. This is called the central attention impairment (CAI; Yeshurun & Carrasco, 1998, 2000). We employed a cuing procedure in conjunction with selective adaptation to explore (1) whether the CPD is due to the inhibition of low spatial frequency responses by high spatial frequency responses in central locations, and (2) whether the CAI is due to attention's shifting sensitivity to higher spatial frequencies. We found that adaptation to low spatial frequencies does not change performance in this texture segmentation task. However, adaptation to high spatial frequencies diminishes the CPD and eliminates the CAI. These results indicate that the CPD is primarily due to the dominance of high spatial frequency responses and that covert attention enhances spatial resolution by shifting sensitivity to higher spatial frequencies.
17
Petrov AA, Dosher BA, Lu ZL. Perceptual learning without feedback in non-stationary contexts: data and model. Vision Res 2006; 46:3177-97. [PMID: 16697434 DOI: 10.1016/j.visres.2006.03.022] [Citation(s) in RCA: 72] [Impact Index Per Article: 4.0] [Received: 07/26/2005] [Revised: 03/13/2006] [Accepted: 03/14/2006] [Indexed: 11/24/2022]
Abstract
The role of feedback in perceptual learning is probed in an orientation discrimination experiment under destabilizing non-stationary conditions, and explored in a neural-network model. Experimentally, perceptual learning was examined with periodic alteration of a strong external noise context. The speed of learning, the performance loss at each change in external noise context (switch cost), and the asymptotic accuracy d' without feedback were very similar or identical to those with feedback. However, lack of feedback led to higher decision bias (error responses matching the external noise context). In the model, the stimulus representations are constant, whereas the read-out connections to a decision unit learn by a Hebbian plasticity rule that may be augmented by additional feedback input and criterion control of decision bias.
Affiliation(s)
- Alexander A Petrov
- Department of Cognitive Sciences, University of California, Irvine, CA 92697, USA.
18
Abstract
The mechanisms of perceptual learning are analyzed theoretically, probed in an orientation-discrimination experiment involving a novel nonstationary context manipulation, and instantiated in a detailed computational model. Two hypotheses are examined: modification of early cortical representations versus task-specific selective reweighting. Representation modification seems neither functionally necessary nor implied by the available psychophysical and physiological evidence. Computer simulations and mathematical analyses demonstrate the functional and empirical adequacy of selective reweighting as a perceptual learning mechanism. The stimulus images are processed by standard orientation- and frequency-tuned representational units, divisively normalized. Learning occurs only in the "read-out" connections to a decision unit; the stimulus representations never change. An incremental Hebbian rule tracks the task-dependent predictive value of each unit, thereby improving the signal-to-noise ratio of their weighted combination. Each abrupt change in the environmental statistics induces a switch cost in the learning curves as the system temporarily works with suboptimal weights.
Affiliation(s)
- Alexander A Petrov
- Department of Cognitive Sciences, University of California, Irvine, CA, USA.
19
Chen J, Pappas TN, Mojsilović A, Rogowitz BE. Adaptive perceptual color-texture image segmentation. IEEE Trans Image Process 2005; 14:1524-36. [PMID: 16238058 DOI: 10.1109/tip.2005.852204] [Citation(s) in RCA: 25] [Impact Index Per Article: 1.3] [Indexed: 05/04/2023]
Abstract
We propose a new approach for image segmentation that is based on low-level features for color and texture. It is aimed at segmentation of natural scenes, in which the color and texture of each segment do not typically exhibit uniform statistical characteristics. The proposed approach combines knowledge of human perception with an understanding of signal characteristics in order to segment natural scenes into perceptually/semantically uniform regions. The proposed approach is based on two types of spatially adaptive low-level features. The first describes the local color composition in terms of spatially adaptive dominant colors, and the second describes the spatial characteristics of the grayscale component of the texture. Together, they provide a simple and effective characterization of texture that the proposed algorithm uses to obtain robust and, at the same time, accurate and precise segmentations. The resulting segmentations convey semantic information that can be used for content-based retrieval. The performance of the proposed algorithms is demonstrated in the domain of photographic images, including low-resolution, degraded, and compressed images.
20
|
Graham N, Wolfson SS. Is there opponent-orientation coding in the second-order channels of pattern vision? Vision Res 2005; 44:3145-75. [PMID: 15482802 DOI: 10.1016/j.visres.2004.07.018] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.3] [Received: 12/19/2003] [Revised: 03/05/2004] [Indexed: 10/26/2022]
Abstract
Is there opponency between orientation-selective processes in pattern perception, analogous to opponency between color mechanisms? Here we concentrate on possible opponency in second-order channels. We compare several possible second-order structures: SIGN-opponent-only channels in which there is no opponency between orientations (also called complex channels or filter-rectify-filter mechanisms); three structures we group under the name ORIENTATION-opponent; and finally BOTH-opponent channels which combine features of both SIGN-opponent-only and ORIENTATION-opponent channels but lead to predictions that are distinct from either of theirs. We measured observers' ability to segregate textures composed of checkerboard and striped arrangements of vertical and horizontal Gabor grating patches. The observers' performance was compared to model predictions from the alternative opponent structures. The experimental results are consistent with SIGN-opponent-only channels. The results rule out the ORIENTATION-opponent and BOTH-opponent structures. Further, when the models were expanded to include a contrast gain-control (inhibition among channels in a normalization network) the SIGN-opponent-only model was also able to explain a contrast-dependent effect we found, thus providing another piece of evidence that such normalization is an important process in human texture perception.
Affiliation(s)
- Norma Graham
- Department of Psychology, Columbia University, Mail Code 5501, New York, NY 10027, USA.
21
|
Motoyoshi I, Nishida S. Cross-orientation summation in texture segregation. Vision Res 2004; 44:2567-76. [PMID: 15358072 DOI: 10.1016/j.visres.2004.05.024] [Citation(s) in RCA: 17] [Impact Index Per Article: 0.9] [Received: 12/25/2003] [Revised: 05/25/2004] [Indexed: 11/17/2022]
Abstract
Human texture vision has been modeled as a filter-rectify-filter (FRF) process, in which '2nd-order' filters detect changes in the rectified outputs of luminance-based '1st-order' filters. This study tested the validity of the two basic assumptions of the standard FRF model, namely (a) that the 2nd-order filters are sensitive to spatial modulations in both contrast and orientation, and (b) that the 2nd-order filters are tuned to different 1st-order orientations. In the first experiment, we tested subthreshold summation between two orthogonal carrier orientations in detection of a texture region, which was defined by contrast modulations across regions in the two carrier orientations, while systematically varying the relative change magnitudes between the two orientations. The results showed that the detection thresholds were determined by spatial difference in the contrast integrated over the two orientations. Orientation difference did act as a segregation cue, but only when there were no differences in carrier contrast. This suggests that two mechanisms are involved in texture segregation: one that detects changes in luminance contrast and another that detects changes in orientation. To further analyze the latter mechanism, a second experiment measured cross-orientation summation in the detection of purely orientation-defined textures, using stimuli that were density modulations of two orientations presented among randomly-orientated distractors. Again, the relative modulation magnitudes between the two orientations were systematically varied. The results are consistent with the notions that (a) the dominant orientation is extracted from the 1st-order outputs before the 2nd-order process, and that (b) the 2nd-order, spatial comparison process integrates those dominant signals over different orientations.
Affiliation(s)
- Isamu Motoyoshi
- Human and Information Science Laboratory, NTT Communication Science Laboratories, NTT Corporation, 3-1 Morinosato-Wakamiya, Atsugi, Kanagawa, 243-0198, Japan.
22
|
Ellemberg D, Allen HA, Hess RF. Investigating local network interactions underlying first- and second-order processing. Vision Res 2004; 44:1787-97. [PMID: 15135994 DOI: 10.1016/j.visres.2004.02.012] [Citation(s) in RCA: 24] [Impact Index Per Article: 1.2] [Received: 09/04/2003] [Revised: 02/25/2004] [Indexed: 11/27/2022]
Abstract
We compared the spatial lateral interactions for first-order cues to those for second-order cues, and investigated spatial interactions between these two types of cues. We measured the apparent modulation depth of a target Gabor at fixation, in the presence and the absence of horizontally flanking Gabors. The Gabors' gratings were either added to (first-order) or multiplied with (second-order) binary 2-D noise. Apparent "contrast" or modulation depth (i.e., the perceived difference between the high and low luminance regions for the first-order stimulus, or between the high and low contrast regions for the second-order stimulus) was measured with a modulation depth-matching paradigm. For each observer, the first- and second-order Gabors were equated for apparent modulation depth without the flankers. Our results indicate that at the smallest inter-element spacing, the perceived reduction in modulation depth is significantly smaller for the second-order than for the first-order stimuli. Further, lateral interactions operate over shorter distances and the spatial frequency and orientation tuning of the suppression effect are broader for second- than first-order stimuli. Finally, first- and second-order information interact in an asymmetrical fashion; second-order flankers do not reduce the apparent modulation depth of the first-order target, whilst first-order flankers reduce the apparent modulation depth of the second-order target.
Affiliation(s)
- Dave Ellemberg
- Department of Ophthalmology, McGill Vision Research Unit, McGill University, 687 Pine Ave. West H4-14, Montreal, Que., Canada H3A 1A1.
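The distinction between adding a Gabor to noise (first-order) and multiplying it with noise (second-order) can be made concrete with a short sketch. This is an illustrative 1-D simplification, not the stimulus code used in the study: the study used binary 2-D noise, and the function names, sizes, wavelength, envelope width, and modulation depth below are assumptions.

```python
import math
import random

def gabor_profile(n, wavelength, sigma):
    """1-D Gabor: a cosine grating under a Gaussian envelope, centred on the array."""
    center = (n - 1) / 2.0
    return [
        math.exp(-((i - center) ** 2) / (2 * sigma ** 2))
        * math.cos(2 * math.pi * (i - center) / wavelength)
        for i in range(n)
    ]

def make_stimuli(n=64, wavelength=16.0, sigma=10.0, depth=0.5, seed=1):
    """Build first-order (additive) and second-order (multiplicative) stimuli."""
    rng = random.Random(seed)
    noise = [rng.choice([-1.0, 1.0]) for _ in range(n)]  # binary noise carrier
    gabor = gabor_profile(n, wavelength, sigma)
    # First-order: the Gabor is added to the noise, so it modulates local luminance.
    first_order = [nz + depth * g for nz, g in zip(noise, gabor)]
    # Second-order: the Gabor multiplies the noise, so it modulates local contrast
    # while the expected local luminance stays at zero.
    second_order = [nz * (1.0 + depth * g) for nz, g in zip(noise, gabor)]
    return noise, gabor, first_order, second_order

noise, gabor, first_order, second_order = make_stimuli()
print("first-order sample: ", [round(v, 2) for v in first_order[28:36]])
print("second-order sample:", [round(v, 2) for v in second_order[28:36]])
```

In the second-order stimulus the Gabor is recoverable only from the magnitude of the noise (its contrast envelope), which is why a purely linear luminance filter responds poorly to it.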
23
|
Johnson AP, Baker CL. First- and second-order information in natural images: a filter-based approach to image statistics. J Opt Soc Am A Opt Image Sci Vis 2004; 21:913-925. [PMID: 15191171 DOI: 10.1364/josaa.21.000913] [Citation(s) in RCA: 30] [Impact Index Per Article: 1.5] [Indexed: 05/24/2023]
Abstract
Previous analyses of natural image statistics have dealt mainly with their Fourier power spectra. Here we explore image statistics by examining responses to biologically motivated filters that are spatially localized and respond to first-order (luminance-defined) and second-order (contrast- or texture-defined) characteristics. We compare the distribution of natural image responses across filter parameters for first- and second-order information. We find that second-order information in natural scenes shows the same self-similarity previously described for first-order information but has substantially less orientational anisotropy. The magnitudes of the two kinds of information, as well as their mutual unsigned correlation, are much stronger for particular combinations of filter parameters in natural images but not in unstructured fractal images having the same power spectra.
Affiliation(s)
- Aaron P Johnson
- McGill Vision Research Unit, Department of Ophthalmology, 687 Pine Avenue West, Room H4-14, Montréal, Québec, Canada, H3A 1A1.
24
|
Abstract
Many current psychophysical models propose that visual processing in cortex is hierarchical, with nonlinearities sandwiched between linear stages of processing. In earlier publications, we proposed a model of this type to account for masking effects found with spatial frequency and orientation discriminations. Our model includes two nonlinear mechanisms that regulate contrast sensitivity in early cortical mechanisms. The first is a local within-pathway nonlinearity that accelerates at low contrasts but is compressive at high contrasts. The second is a pooled nonlinear gain control process that operates over a broad range of neurons with different tuning characteristics. Here, we test predictions of the model for spatial frequency discriminations. The model predicts that at low contrasts, adding a grating mask oriented parallel to test gratings will improve discrimination performance via operation of the within-pathway nonlinearity, analogous to the "dipper effect" found with contrast discriminations. Adding an orthogonally oriented mask is predicted to have no effect at low contrasts, where pooled gain control processes contribute little to performance. At high contrasts, the model predicts that performance will asymptote and become independent of contrast with either parallel or orthogonal masks. The results confirm model predictions.
Affiliation(s)
- Lynn A Olzak
- Department of Psychology, Miami University of Ohio, Oxford, OH 45056, USA.
25
|
Abstract
The segregation of texture patterns may be carried out by a set of linear spatial filters (to enhance one of the constituent textures), a nonlinearity (to convert the higher contrast of response to that constituent to a higher mean response), and finally subsequent ("second-order") linear spatial filters (to provide a strong response to the texture-defined edge itself). In this paper, the properties of such second-order filters are characterized. Observers were required to detect or discriminate textures that were modulated between predominantly horizontally oriented and predominantly vertically oriented noise patterns. Spatial summation for these patterns reached asymptote for a stimulus size of 15 x 15 deg. Modulation contrast sensitivity was nearly flat over a five-octave range of spatial frequency, but was bandpass when stated as efficiency (relative to an idealized observer confronted with the same task). Increment threshold showed the improved performance with a sub-threshold pedestal seen in the "dipper effect", but the typical Weber's law behavior at higher pedestal contrasts was not observed at the highest pedestal modulation contrasts achievable with our stimuli. Sub-threshold summation experiments indicate that second-order filters have a moderate bandwidth.
Affiliation(s)
- Michael S Landy
- Department of Psychology and Center for Neural Science, New York University, 6 Washington Place, 8th floor, New York, NY 10003, USA.
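The three-stage scheme described in this abstract (linear filter, nonlinearity, second linear filter) can be sketched in one dimension. This is a simplified analogue under stated assumptions, not the model from the paper: the texture here is contrast-modulated rather than orientation-modulated, full-wave rectification (`abs`) stands in for the nonlinearity, and both kernels and the helper names are hypothetical.

```python
def correlate_same(signal, kernel):
    """Zero-padded, same-length correlation of a 1-D signal with an odd-length kernel."""
    half = len(kernel) // 2
    out = []
    for i in range(len(signal)):
        acc = 0.0
        for k, kv in enumerate(kernel):
            j = i + k - half
            if 0 <= j < len(signal):
                acc += kv * signal[j]
        out.append(acc)
    return out

def filter_rectify_filter(signal, first_kernel, second_kernel):
    """First-stage filter, pointwise rectification, then a coarser second-stage filter."""
    first = correlate_same(signal, first_kernel)
    rectified = [abs(v) for v in first]  # assumed nonlinearity: full-wave rectification
    return correlate_same(rectified, second_kernel)

# Fine alternating texture: high contrast on the left half, low contrast on the right.
# The local mean luminance is zero on both sides of the border at index 20.
texture = [(1.0 if i < 20 else 0.2) * (-1) ** i for i in range(40)]
fine_kernel = [-0.5, 1.0, -0.5]               # responds to the fine alternation
edge_kernel = [-1.0] * 4 + [0.0] + [1.0] * 4  # coarse detector for a step in the envelope

response = filter_rectify_filter(texture, fine_kernel, edge_kernel)
interior = range(4, 36)  # skip positions affected by zero padding
border = min(interior, key=lambda i: response[i])
print("strongest texture-edge response at index", border)
```

The second-stage response is near zero inside each uniform region and peaks in magnitude at the texture border, even though the border is invisible to a filter that only measures local mean luminance.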
26
|
Abstract
Higher order spatial correlations can capture edge and object relationships. Isotrigon textures are useful for studying our sensitivity to these correlations. We determined human discrimination performance for 18 isotrigon texture types and compared it with outputs from statistical discriminant models. Some of the models employed versions of the Allan Variance in receptive field outputs. Physiologically plausible mechanisms for such calculations are presented. Two discriminant models emulated human performance well, one based upon a global variance measure, and the other based upon a localised variance with an orientation bias. The 18 texture types were also shown to contain characteristic mini-textures.
Affiliation(s)
- T Maddess
- Centre for Visual Sciences, Research School of Biological Sciences, Australian National University, Canberra ACT 0200, Australia.
27
|
Graham N, Wolfson SS. A note about preferred orientations at the first and second stages of complex (second-order) texture channels. J Opt Soc Am A Opt Image Sci Vis 2001; 18:2273-2281. [PMID: 11551062 DOI: 10.1364/josaa.18.002273] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.4] [Indexed: 05/23/2023]
Abstract
Complex (second-order) channels have been useful in explaining many of the phenomena of perceived texture segregation. These channels contain two stages of linear filtering with an intermediate pointwise nonlinearity. One unanswered question about these hypothetical channels is that of the relationship between the preferred orientations of the two stages of filtering. Is a particular orientation at the second stage equally likely to occur with all orientations at the first stage, or is there a bias in the "mapping" between the two stages' preferred orientations? In this study we consider two possible mappings: that where the orientations at the two stages are identical (called "consistent" here) and that where the orientations at the two stages are perpendicular ("inconsistent"). We explore these mappings using a texture-segregation task with textures composed of arrangements of grating-patch elements. The results imply that, to explain perceived texture segregation, complex channels with a consistent orientation mapping must be either somewhat more prevalent or more effective than those with an inconsistent mapping.
Affiliation(s)
- N Graham
- Department of Psychology, Columbia University, New York, New York 10027, USA.
28
|
Adini Y, Sagi D. Recurrent networks in human visual cortex: psychophysical evidence. J Opt Soc Am A Opt Image Sci Vis 2001; 18:2228-2236. [PMID: 11551058 DOI: 10.1364/josaa.18.002228] [Citation(s) in RCA: 27] [Impact Index Per Article: 1.2] [Indexed: 05/23/2023]
Abstract
To study the neuronal circuitry underlying visual spatial-integration processes, we measured the effect of short and long chains of proximal Gabor-signal (GS) flankers (sigma = lambda = 0.15 degrees) on the contrast-discrimination function of a foveal GS target. We found that the same pattern of lateral masks enhanced target detection with low-contrast pedestals and strongly suppressed the discrimination of a range of intermediate pedestal contrasts (pedestal contrast <30%). Increasing the number of the flankers reversed the suppressive effect. The data suggest that the main influence of the proximal flankers is maintained by activity-dependent interactions and not by linear spatial summation. With an increased number of flankers, we found a nonmonotonic relationship between the discrimination thresholds and the number of flankers, supporting the notion that the discrimination thresholds are mediated by excitatory-inhibitory recurrent networks that manifest the dynamics of large neuronal populations in the neocortex [Proc. Natl. Acad. Sci. USA 94, 10426 (1997)].
Affiliation(s)
- Y Adini
- Department of Neurobiology, Brain Research, The Weizmann Institute of Science, Rehovot, Israel.
29
|
Abstract
We describe a form of nonlinear decomposition that is well-suited for efficient encoding of natural signals. Signals are initially decomposed using a bank of linear filters. Each filter response is then rectified and divided by a weighted sum of rectified responses of neighboring filters. We show that this decomposition, with parameters optimized for the statistics of a generic ensemble of natural images or sounds, provides a good characterization of the nonlinear response properties of typical neurons in primary visual cortex or auditory nerve, respectively. These results suggest that nonlinear response properties of sensory neurons are not an accident of biological implementation, but have an important functional role.
Affiliation(s)
- O Schwartz
- Center for Neural Science, New York University, 4 Washington Place, Room 809, New York, New York 10003, USA
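The decomposition described in this abstract (rectify each linear filter response, then divide by a weighted sum of the rectified responses of neighboring filters) can be sketched as follows. This is a minimal illustration under stated assumptions: squaring is used as the rectifier, the additive constant `sigma ** 2` is included to keep the denominator positive, and the weights are arbitrary example values rather than parameters optimized for natural-signal statistics as in the paper.

```python
def divisive_normalization(responses, weights, sigma=0.1):
    """Rectify each filter response and divide by a weighted sum of rectified neighbours.

    responses: linear filter outputs, one per filter.
    weights:   weights[i][j] is the (assumed) influence of filter j on filter i.
    sigma:     hypothetical constant that prevents division by zero.
    """
    rectified = [r * r for r in responses]  # squaring as the rectifier (assumption)
    normalized = []
    for i, numerator in enumerate(rectified):
        # Pool the rectified responses of the other filters.
        pool = sum(
            weights[i][j] * rectified[j]
            for j in range(len(rectified))
            if j != i
        )
        normalized.append(numerator / (sigma ** 2 + pool))
    return normalized

linear_outputs = [1.0, 2.0, 0.0]
uniform_weights = [[0.5] * 3 for _ in range(3)]
print(divisive_normalization(linear_outputs, uniform_weights, sigma=1.0))
```

With these example values the first filter's rectified response of 1 is divided by 1 + 0.5 * 4 = 3, so a strongly active neighbor suppresses its normalized output.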