1. Canham T, Vazquez-Corral J, Mathieu E, Bertalmío M. Matching visual induction effects on screens of different size. J Vis 2021; 21:10. [PMID: 34144607 PMCID: PMC8237091 DOI: 10.1167/jov.21.6.10] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/24/2022]
Abstract
In the film industry, the same movie is expected to be watched on displays of vastly different sizes, from cinema screens to mobile phones. But visual induction, the perceptual phenomenon by which the appearance of a scene region is affected by its surroundings, will be different for the same image shown on two displays of different dimensions. This phenomenon presents a practical challenge for the preservation of the artistic intentions of filmmakers, because it can lead to shifts in image appearance between viewing destinations. In this work, we show that a neural field model based on the efficient representation principle can predict induction effects, and that regularizing its associated energy functional makes the model invertible while preserving its ability to represent induction. From this finding, we propose a method to preprocess an image in a screen-size dependent way so that its perception, in terms of visual induction, may remain constant across displays of different size. The potential of the method is demonstrated through psychophysical experiments on synthetic images and qualitative examples on natural images.
Affiliation(s)
- Trevor Canham
  - Department of Information and Communication Technologies, Universitat Pompeu Fabra, Barcelona, Spain
- Javier Vazquez-Corral
  - Computer Vision Center and the Computer Sciences Department at Universitat Autònoma de Barcelona, Cerdanyola del Vallès, Spain, http://www.jvazquez-corral.net
- Marcelo Bertalmío
  - Instituto de óptica, Spanish National Research Council (CSIC), Spain
2. A Cortical-Inspired Sub-Riemannian Model for Poggendorff-Type Visual Illusions. J Imaging 2021; 7:jimaging7030041. [PMID: 34460697 PMCID: PMC8321287 DOI: 10.3390/jimaging7030041] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 12/24/2020] [Revised: 01/27/2021] [Accepted: 02/11/2021] [Indexed: 11/20/2022]
Abstract
We consider Wilson-Cowan-type models for the mathematical description of orientation-dependent Poggendorff-like illusions. Our modelling improves two previously proposed cortical-inspired approaches by embedding the sub-Riemannian heat kernel into the neuronal interaction term, in agreement with the intrinsically anisotropic functional architecture of V1 based on both local and lateral connections. For the numerical realisation of both models, we consider standard gradient descent algorithms combined with Fourier-based approaches for the efficient computation of the sub-Laplacian evolution. Our numerical results show that the sub-Riemannian kernel reproduces visual misperceptions and inpainting-type biases more strongly than the previous approaches.
3. Bertalmío M, Calatroni L, Franceschi V, Franceschiello B, Gomez Villa A, Prandi D. Visual illusions via neural dynamics: Wilson-Cowan-type models and the efficient representation principle. J Neurophysiol 2020; 123:1606-1618. [PMID: 32159409 DOI: 10.1152/jn.00488.2019] [Citation(s) in RCA: 12] [Impact Index Per Article: 3.0] [Indexed: 11/22/2022]
Abstract
We reproduce suprathreshold perception phenomena, specifically visual illusions, with Wilson-Cowan (WC)-type models of neuronal dynamics. Our findings show that the ability to replicate the illusions considered is related to how well the neural activity equations comply with the efficient representation principle. Our first contribution consists in showing that the WC equations can reproduce a number of brightness and orientation-dependent illusions. Then we formally prove that there cannot be an energy functional that the WC dynamics are minimizing. This leads us to consider an alternative variational model, which has been previously employed for local histogram equalization (LHE) tasks. To adapt our model to the architecture of V1, we perform an extension that has an explicit dependence on local image orientation. Finally, we report several numerical experiments showing that LHE provides a better reproduction of visual illusions than the original WC formulation, and that its cortical extension is also capable of reproducing complex orientation-dependent illusions.
NEW & NOTEWORTHY: We show that the Wilson-Cowan equations can reproduce a number of brightness and orientation-dependent illusions. Then we formally prove that there cannot be an energy functional that the Wilson-Cowan equations are minimizing, making them suboptimal with respect to the efficient representation principle. We thus propose a slight modification that is consistent with this principle and show that it provides a better reproduction of visual illusions than the original Wilson-Cowan formulation. We also consider the cortical extension of both models to deal with more complex orientation-dependent illusions.
Affiliation(s)
- Marcelo Bertalmío
  - Departament de Tecnologies de la Informació i les Comunicacions, Universitat Pompeu Fabra, Barcelona, Spain
- Luca Calatroni
  - UCA, CNRS, INRIA, Laboratoire d'Informatique, Signaux et Systèmes de Sophia Antipolis, Sophia Antipolis, France
- Valentina Franceschi
  - Sorbonne Université, CNRS, Université de Paris, Inria, Laboratoire Jacques-Louis Lions (LJLL), Paris, France
- Benedetta Franceschiello
  - Department of Ophthalmology, Fondation Asile des Aveugles, The Laboratory for Investigative Neurophysiology, Department of Radiology, University Hospital Center and University of Lausanne (CHUV), Lausanne, Switzerland
- Alexander Gomez Villa
  - Departament de Tecnologies de la Informació i les Comunicacions, Universitat Pompeu Fabra, Barcelona, Spain
- Dario Prandi
  - Université Paris-Saclay, CNRS, CentraleSupélec, Laboratoire des Signaux et Systèmes, Gif-sur-Yvette, France
4. Song A, Faugeras O, Veltz R. A neural field model for color perception unifying assimilation and contrast. PLoS Comput Biol 2019; 15:e1007050. [PMID: 31173581 PMCID: PMC6583951 DOI: 10.1371/journal.pcbi.1007050] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.0] [Received: 11/23/2018] [Revised: 06/19/2019] [Accepted: 04/17/2019] [Indexed: 11/30/2022]
Abstract
We address the question of color-space interactions in the brain, by proposing a neural field model of color perception with spatial context for the visual area V1 of the cortex. Our framework reconciles two opposing perceptual phenomena, known as simultaneous contrast and chromatic assimilation. They have been previously shown to act synergistically, so that at some point in an image, the color seems perceptually more similar to that of adjacent neighbors, while being more dissimilar from that of remote ones. Thus, their combined effects are enhanced in the presence of a spatial pattern, and can be measured as larger shifts in color matching experiments. Our model supposes a hypercolumnar structure coding for colors in V1, and relies on the notion of color opponency introduced by Hering. The connectivity kernel of the neural field exploits the balance between attraction and repulsion in color and physical spaces, so as to reproduce the sign reversal in the influence of neighboring points. The color sensation at a point, defined from a steady state of the neural activities, is then extracted as a nonlinear percept conveyed by an assembly of neurons. It connects the cortical and perceptual levels, because we describe the search for a color match in asymmetric matching experiments as a mathematical projection on color sensations. We validate our color neural field alongside this color matching framework, by performing a multi-parameter regression to data produced by psychophysicists and ourselves. All the results show that we are able to explain the nonlinear behavior of shifts observed along one or two dimensions in color space, which cannot be done using a simple linear model.
The color perception produced by an image heavily depends on the spatial distribution of its colors. From this “color in context” phenomenon, extensively studied in psychophysics for decades, has arisen the question in neuroscience of how color and space interact in the brain. Visual signals are indeed processed in such a way that neighboring pixels make the perception at some point different from its real color, inducing a color shift. In this work, we propose to emulate perception in context by modeling the activity of color sensitive neurons with a neural field. Our framework unifies two antagonistic effects, assimilation and contrast, which have been suggested to occur simultaneously but at different scales. We use the notion of color opponency inspired by the work of Hering, so as to express these effects as a combination of attraction and repulsion in physical and color spaces. We introduce the concept of “color sensation”, and show how to rigorously link the neural field model to perceptual shifts, by considering color matching as a mathematical projection on color sensations. The results show that our model is able to reproduce some nontrivial behaviors of the color shifts observed in experiments.
Affiliation(s)
- Anna Song
  - Student at Département de Mathématiques et Applications, École Normale Supérieure, 45 rue d’Ulm, 75005, Paris, France
- Olivier Faugeras
  - MathNeuro Team, Inria Sophia Antipolis Méditerranée, 2004 Route des Lucioles-BP 93, 06902, Sophia Antipolis, France
  - TOSCA Team, Inria Sophia Antipolis Méditerranée, 2004 Route des Lucioles-BP 93, 06902, Sophia Antipolis, France
- Romain Veltz
  - MathNeuro Team, Inria Sophia Antipolis Méditerranée, 2004 Route des Lucioles-BP 93, 06902, Sophia Antipolis, France
5. Martinez-Garcia M, Bertalmío M, Malo J. In Praise of Artifice Reloaded: Caution With Natural Image Databases in Modeling Vision. Front Neurosci 2019; 13:8. [PMID: 30894796 PMCID: PMC6414813 DOI: 10.3389/fnins.2019.00008] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.8] [Received: 12/21/2017] [Accepted: 01/07/2019] [Indexed: 11/13/2022]
Abstract
Subjective image quality databases are a major source of raw data on how the visual system works in naturalistic environments. These databases describe the sensitivity of many observers to a wide range of distortions of different nature and intensity seen on top of a variety of natural images. Data of this kind seems to open a number of possibilities for the vision scientist to check the models in realistic scenarios. However, while these natural databases are great benchmarks for models developed in some other way (e.g., by using the well-controlled artificial stimuli of traditional psychophysics), they should be carefully used when trying to fit vision models. Given the high dimensionality of the image space, it is very likely that some basic phenomena are under-represented in the database. Therefore, a model fitted on these large-scale natural databases will not reproduce these under-represented basic phenomena that could otherwise be easily illustrated with well selected artificial stimuli. In this work we study a specific example of the above statement. A standard cortical model using wavelets and divisive normalization tuned to reproduce subjective opinion on a large image quality dataset fails to reproduce basic cross-masking. Here we outline a solution for this problem by using artificial stimuli and by proposing a modification that makes the model easier to tune. Then, we show that the modified model is still competitive in the large-scale database. Our simulations with these artificial stimuli show that when using steerable wavelets, the conventional unit norm Gaussian kernels in divisive normalization should be multiplied by high-pass filters to reproduce basic trends in masking. Basic visual phenomena may be misrepresented in large natural image datasets but this can be solved with model-interpretable stimuli. This is an additional argument in praise of artifice in line with Rust and Movshon (2005).
Affiliation(s)
- Marina Martinez-Garcia
  - Image Processing Lab, Universitat de València, Valencia, Spain
  - CSIC, Instituto de Neurociencias, Alicante, Spain
- Marcelo Bertalmío
  - Departamento de Tecnologías de la Información y las Comunicaciones, Universidad Pompeu Fabra, Barcelona, Spain
- Jesús Malo
  - Image Processing Lab, Universitat de València, Valencia, Spain
6. Derivatives and inverse of cascaded linear+nonlinear neural models. PLoS One 2018; 13:e0201326. [PMID: 30321175 PMCID: PMC6188639 DOI: 10.1371/journal.pone.0201326] [Citation(s) in RCA: 15] [Impact Index Per Article: 2.5] [Received: 11/29/2017] [Accepted: 07/11/2018] [Indexed: 11/20/2022]
Abstract
In vision science, cascades of Linear+Nonlinear transforms are very successful in modeling a number of perceptual experiences. However, the conventional literature is usually too focused on only describing the forward input-output transform. Instead, in this work we present the mathematics of such cascades beyond the forward transform, namely the Jacobian matrices and the inverse. The fundamental reason for this analytical treatment is that it offers useful analytical insight into the psychophysics, the physiology, and the function of the visual system. For instance, we show how the trends of the sensitivity (volume of the discrimination regions) and the adaptation of the receptive fields can be identified in the expression of the Jacobian w.r.t. the stimulus. This matrix also tells us which regions of the stimulus space are encoded more efficiently in multi-information terms. The Jacobian w.r.t. the parameters shows which aspects of the model have bigger impact in the response, and hence their relative relevance. The analytic inverse implies conditions for the response and model parameters to ensure appropriate decoding. From the experimental and applied perspective, (a) the Jacobian w.r.t. the stimulus is necessary in new experimental methods based on the synthesis of visual stimuli with interesting geometrical properties, (b) the Jacobian matrices w.r.t. the parameters are convenient to learn the model from classical experiments or alternative goal optimization, and (c) the inverse is a promising model-based alternative to blind machine-learning methods for neural decoding that do not include meaningful biological information. The theory is checked by building and testing a vision model that actually follows a modular Linear+Nonlinear program. Our illustrative derivable and invertible model consists of a cascade of modules that account for brightness, contrast, energy masking, and wavelet masking. 
To stress the generality of this modular setting we show examples where some of the canonical Divisive Normalization modules are substituted by equivalent modules such as the Wilson-Cowan interaction model (at the V1 cortex) or a tone-mapping model (at the retina).
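As a rough illustration of what an "analytic inverse" of a nonlinear module can look like, the sketch below uses a deliberately simplified divisive normalization, y = e / (b + H e), with non-negative inputs `e`, a positive semisaturation vector `b` and a non-negative interaction matrix `H`. This form, the function names and all parameter values are assumptions chosen for the example; they are not taken from the paper, whose modules may include exponents and other terms. Rearranging gives e = (I - D_y H)^{-1} D_y b, where D_y is the diagonal matrix built from y, so decoding requires I - D_y H to be nonsingular.

```python
import numpy as np

def divisive_normalization(e, b, H):
    """Forward transform y = e / (b + H e), element-wise division (simplified form)."""
    return e / (b + H @ e)

def invert_divisive_normalization(y, b, H):
    """Closed-form inverse e = (I - D_y H)^{-1} D_y b of the simplified form above."""
    D_y = np.diag(y)
    return np.linalg.solve(np.eye(len(y)) - D_y @ H, D_y @ b)

# Hypothetical toy setup: 8 units with a Gaussian interaction neighbourhood.
rng = np.random.default_rng(0)
n = 8
e = rng.uniform(0.1, 1.0, n)          # non-negative input energies
b = np.full(n, 0.5)                   # positive semisaturation constants
idx = np.arange(n)
H = np.exp(-0.5 * ((idx[:, None] - idx[None, :]) / 2.0) ** 2)
H /= H.sum(axis=1, keepdims=True)     # each row sums to one

y = divisive_normalization(e, b, H)
e_rec = invert_divisive_normalization(y, b, H)
print(np.max(np.abs(e_rec - e)))      # round-trip reconstruction error
```

With positive `b` and non-negative `e` and `H`, each response satisfies y ⊙ (H e) < e, which keeps the spectral radius of D_y H below one, so the linear system in the inverse is solvable in this toy setting.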
7. Retinal Lateral Inhibition Provides the Biological Basis of Long-Range Spatial Induction. PLoS One 2016; 11:e0168963. [PMID: 28030651 PMCID: PMC5193432 DOI: 10.1371/journal.pone.0168963] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.4] [Received: 07/20/2016] [Accepted: 12/05/2016] [Indexed: 11/19/2022]
Abstract
Retinal lateral inhibition is one of the conventional efficient coding mechanisms in the visual system. It is produced by interneurons that pool signals over a neighborhood of presynaptic feedforward cells and send inhibitory signals back to them. Thus, a retinal ganglion cell has a center-surround receptive-field (RF) profile that is classically represented as a difference-of-Gaussians (DOG), adequate for efficient spatial contrast coding. The DOG RF profile has been proposed as the origin of the psychophysical phenomenon of brightness induction, in which the perceived brightness of an object is affected by that of its vicinity, either shifting away from it (brightness contrast) or becoming more similar to it (brightness assimilation) depending on the size of the surfaces surrounding the object. While brightness contrast can be modeled using a DOG with a narrow surround, brightness assimilation requires a wide suppressive surround. Early retinal studies determined that the suppressive surround of a retinal ganglion cell is narrow (< 100–300 μm; ‘classic RF’), which led researchers to postulate that brightness assimilation must originate at some post-retinal, possibly cortical, stage where long-range interactions are feasible. However, more recent studies have reported that the retinal interneurons also exhibit a spatially wide component (> 500–1000 μm). In the current study, we examine the effect of this wide interneuron RF component in two biophysical retinal models and show that for both of the retinal models it explains the long-range effect evidenced in simultaneous brightness induction phenomena and that the spatial extent of this long-range effect of the retinal model responses matches that of perceptual data. 
These results suggest that the retinal lateral inhibition mechanism alone can regulate local as well as long-range spatial induction through the narrow and wide RF components of retinal interneurons, arguing against the existing view that spatial induction is operated by two separate local vs. long-range mechanisms.
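The center-surround idea in this abstract can be illustrated with a minimal one-dimensional difference-of-Gaussians (DOG) filter. The sketch below is not one of the biophysical retinal models the study examines: the kernel widths, the stimulus layout and the names `dog_kernel` and `patch_response` are assumptions chosen for the example, and it demonstrates only the brightness-contrast direction of induction, in which the same grey patch produces a larger response on a dark background than on a bright one.

```python
import numpy as np

def gaussian(x, sigma):
    """Discrete Gaussian normalized to unit sum over the sampled support."""
    g = np.exp(-0.5 * (x / sigma) ** 2)
    return g / g.sum()

def dog_kernel(half_width, sigma_center, sigma_surround, surround_weight=1.0):
    """Center Gaussian minus weighted surround Gaussian (hypothetical parameters)."""
    x = np.arange(-half_width, half_width + 1, dtype=float)
    return gaussian(x, sigma_center) - surround_weight * gaussian(x, sigma_surround)

def patch_response(background, kernel, patch=0.5, n=401, patch_half=20):
    """Filter a grey patch on a uniform background; return the response at the patch center."""
    stimulus = np.full(n, background, dtype=float)
    mid = n // 2
    stimulus[mid - patch_half: mid + patch_half + 1] = patch
    response = np.convolve(stimulus, kernel, mode="same")
    return response[mid]

kernel = dog_kernel(half_width=100, sigma_center=2.0, sigma_surround=30.0)
on_dark = patch_response(background=0.0, kernel=kernel)
on_bright = patch_response(background=1.0, kernel=kernel)
print(on_dark, on_bright)  # identical patch, different responses
```

Because the surround is subtracted, a bright background lowers the response to the patch and a dark background raises it. Reproducing assimilation, or the narrow versus wide surround components discussed in the abstract, would need a different configuration than this toy example.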
8. Rodríguez-Sánchez AJ, Fallah M, Leonardis A. Editorial: Hierarchical Object Representations in the Visual Cortex and Computer Vision. Front Comput Neurosci 2015; 9:142. [PMID: 26635595 PMCID: PMC4653288 DOI: 10.3389/fncom.2015.00142] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.4] [Received: 08/21/2015] [Accepted: 11/06/2015] [Indexed: 11/29/2022]
Affiliation(s)
- Antonio J Rodríguez-Sánchez
  - Intelligent and Interactive Systems, Department of Computer Science, University of Innsbruck, Innsbruck, Austria
- Mazyar Fallah
  - Visual Perception and Attention Laboratory, Centre for Vision Research, School of Kinesiology and Health Science, York University, Toronto, ON, Canada
- Aleš Leonardis
  - School of Computer Science, University of Birmingham, Birmingham, UK