1
Vafaii H, Yates JL, Butts DA. Hierarchical VAEs provide a normative account of motion processing in the primate brain. bioRxiv 2023:2023.09.27.559646. PMID: 37808629; PMCID: PMC10557690; DOI: 10.1101/2023.09.27.559646.
Abstract
The relationship between perception and inference, as postulated by Helmholtz in the 19th century, is paralleled in modern machine learning by generative models like Variational Autoencoders (VAEs) and their hierarchical variants. Here, we evaluate the role of hierarchical inference and its alignment with brain function in the domain of motion perception. We first introduce a novel synthetic data framework, Retinal Optic Flow Learning (ROFL), which enables control over motion statistics and their causes. We then present a new hierarchical VAE and test it against alternative models on two downstream tasks: (i) predicting ground truth causes of retinal optic flow (e.g., self-motion); and (ii) predicting the responses of neurons in the motion processing pathway of primates. We manipulate the model architectures (hierarchical versus non-hierarchical), loss functions, and the causal structure of the motion stimuli. We find that hierarchical latent structure in the model leads to several improvements. First, it improves the linear decodability of ground truth factors and does so in a sparse and disentangled manner. Second, our hierarchical VAE outperforms previous state-of-the-art models in predicting neuronal responses and exhibits sparse latent-to-neuron relationships. These results depend on the causal structure of the world, indicating that alignment between brains and artificial neural networks depends not only on architecture but also on matching ecologically relevant stimulus statistics. Taken together, our results suggest that hierarchical Bayesian inference underlies the brain's understanding of the world, and hierarchical VAEs can effectively model this understanding.
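The "linear decodability of ground truth factors" used as a benchmark here can be illustrated with a minimal numpy sketch: fit a closed-form ridge regression from latent codes to each ground-truth factor and report per-factor R². The function name, toy data, and plain ridge readout are my own assumptions, not the authors' code:

```python
import numpy as np

def linear_decodability(latents, factors, alpha=1e-3):
    """R^2 of a ridge regression from latent codes to each ground-truth
    factor -- one simple way to score how linearly decodable the causes
    of a stimulus are from a model's latent space."""
    Z = np.asarray(latents, dtype=float)     # (n_samples, n_latents)
    Y = np.asarray(factors, dtype=float)     # (n_samples, n_factors)
    Z = Z - Z.mean(axis=0)
    Yc = Y - Y.mean(axis=0)
    # Closed-form ridge solution: W = (Z^T Z + alpha I)^-1 Z^T Y
    W = np.linalg.solve(Z.T @ Z + alpha * np.eye(Z.shape[1]), Z.T @ Yc)
    resid = Yc - Z @ W
    ss_res = (resid ** 2).sum(axis=0)
    ss_tot = (Yc ** 2).sum(axis=0)
    return 1.0 - ss_res / ss_tot             # R^2 per factor

# Toy check: factors that are exact linear functions of the latents
# should be (near-)perfectly decodable.
rng = np.random.default_rng(0)
Z = rng.normal(size=(500, 8))
Y = Z @ rng.normal(size=(8, 2))              # linear ground truth
r2 = linear_decodability(Z, Y)
```

Per-factor R² near 1 indicates a linearly decodable factor; comparing such scores across architectures is one way to operationalize the paper's first finding.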
2
Ong J, Tavakkoli A, Strangman G, Zaman N, Kamran SA, Zhang Q, Ivkovic V, Lee AG. Neuro-ophthalmic Imaging and Visual Assessment Technology for Spaceflight Associated Neuro-ocular Syndrome (SANS). Surv Ophthalmol 2022;67:1443-1466. DOI: 10.1016/j.survophthal.2022.04.004.
3
Ayhan I, Doyle E, Zanker J. Measuring image distortions arising from age-related macular degeneration: An Iterative Amsler Grid (IAG). MedComm (Beijing) 2022;3:e107. PMID: 35281788; PMCID: PMC8906453; DOI: 10.1002/mco2.107.
Abstract
Metamorphopsia, perceived as distortion of a shape, is experienced in age-related macular degeneration (AMD): straight lines appear curved and wavy to patients with AMD and some other retinal pathologies. Conventional clinical assessment largely relies on asking patients to identify irregularities in Amsler Grids - a standardized set of equally spaced vertical and horizontal lines. Perceived distortions or gaps in the grid are a sign of macular pathology. Here, we developed an iterative Amsler Grid (IAG) procedure to obtain a quantifiable map of visual deformations. Horizontal and vertical line segments representing metamorphopsia are displayed on a computer screen. Participants use the computer mouse to adjust the orientation of line segments that appear distorted, over several iterations, until the segments appear straight. Control participants are able to reliably correct deformations that simulate metamorphopsia while maintaining central fixation. In a pilot experiment, we attempted to obtain deformation maps from a small number of AMD patients. Whereas some patients with extensive scotomas found the procedure challenging, others were comfortable using the IAG and generated deformation maps corresponding to their subjective reports. This procedure may potentially be used to quantify local distortions and map them reliably in patients with early AMD.
Affiliation(s)
- Inci Ayhan
- Department of Psychology, Boğaziçi University, Istanbul, Turkey
- Edward Doyle
- Department of Ophthalmology, Torbay Hospital, Torquay, UK
- Johannes Zanker
- Department of Psychology, Royal Holloway University of London, Egham, UK
4
Zaidel A, Laurens J, DeAngelis GC, Angelaki DE. Supervised Multisensory Calibration Signals Are Evident in VIP But Not MSTd. J Neurosci 2021;41:10108-10119. PMID: 34716232; PMCID: PMC8660052; DOI: 10.1523/jneurosci.0135-21.2021.
Abstract
Multisensory plasticity enables our senses to dynamically adapt to each other and the external environment, a fundamental operation that our brain performs continuously. We searched for neural correlates of adult multisensory plasticity in the dorsal medial superior temporal area (MSTd) and the ventral intraparietal area (VIP) in 2 male rhesus macaques using a paradigm of supervised calibration. We report little plasticity in neural responses in the relatively low-level multisensory cortical area MSTd. In contrast, neural correlates of plasticity are found in higher-level multisensory VIP, an area with strong decision-related activity. Accordingly, we observed systematic shifts of VIP tuning curves, which were reflected in the choice-related component of the population response. This is the first demonstration of neuronal calibration, together with behavioral calibration, in single sessions. These results lay the foundation for understanding multisensory neural plasticity, applicable broadly to maintaining accuracy for sensorimotor tasks.

SIGNIFICANCE STATEMENT: Multisensory plasticity is a fundamental and continual function of the brain that enables our senses to adapt dynamically to each other and to the external environment. Yet, very little is known about the neuronal mechanisms of multisensory plasticity. In this study, we searched for neural correlates of adult multisensory plasticity in the dorsal medial superior temporal area (MSTd) and the ventral intraparietal area (VIP) using a paradigm of supervised calibration. We found little plasticity in neural responses in the relatively low-level multisensory cortical area MSTd. By contrast, neural correlates of plasticity were found in VIP, a higher-level multisensory area with strong decision-related activity. This is the first demonstration of neuronal calibration, together with behavioral calibration, in single sessions.
Affiliation(s)
- Adam Zaidel
- Gonda Multidisciplinary Brain Research Center, Bar-Ilan University, Ramat Gan 5290002, Israel
- Jean Laurens
- Ernst Strüngmann Institute for Neuroscience in Cooperation with Max Planck Society, Frankfurt 60528, Germany
- Gregory C DeAngelis
- Department of Brain and Cognitive Sciences, Center for Visual Science, University of Rochester, Rochester, New York 14627
- Dora E Angelaki
- Center for Neural Science and Tandon School of Engineering, New York University, New York, New York 10003
5
Dear M, Harrison WJ. The influence of visual distortion on face recognition. Cortex 2021;146:238-249. PMID: 34915394; DOI: 10.1016/j.cortex.2021.10.008.
Abstract
A person's ability to recognise familiar faces is critical to their participation in many aspects of society. Following an acquired brain injury or retinal disease, however, faces can appear distorted, a phenomenon known as prosopometamorphopsia. Although case reports have described a variety of changes in the appearance of faces during prosopometamorphopsia, the influence of the disorder on face recognition has not been rigorously investigated. In the present report, we quantify how well healthy observers can recognise familiar faces that have been distorted using a parametric model of prosopometamorphopsia. Our results reveal that face recognition varies systematically with the parameters of visual distortion, which, importantly, interact with the size of the face in a nonlinear but highly predictable manner. Our findings demonstrate that prosopometamorphopsia can lead to a surprising range of changes in the appearance of faces. The impact of visual distortion on face recognition thus depends critically on the distance at which the face is viewed, which is likely to change across social and clinical contexts.
Affiliation(s)
- Micaela Dear
- Queensland Brain Institute, The University of Queensland, St Lucia, QLD, Australia; School of Psychology, The University of Queensland, St Lucia, QLD, Australia; Melbourne School of Psychological Science, University of Melbourne, Parkville, VIC, Australia
- William J Harrison
- Queensland Brain Institute, The University of Queensland, St Lucia, QLD, Australia; School of Psychology, The University of Queensland, St Lucia, QLD, Australia
6
Visual memorability in the absence of semantic content. Cognition 2021;212:104714. DOI: 10.1016/j.cognition.2021.104714.
7
Melnik N, Coates DR, Sayim B. Geometrically restricted image descriptors: A method to capture the appearance of shape. J Vis 2021;21:14. PMID: 33688921; PMCID: PMC7961119; DOI: 10.1167/jov.21.3.14.
Abstract
Shape perception varies depending on many factors. For example, presenting a stimulus in the periphery often yields a different appearance compared with its foveal presentation. However, how exactly shape appearance is altered under different conditions remains elusive. One reason for this is that studies typically measure identification performance, leaving details about target appearance unknown. The lack of appearance-based methods and the general difficulty of quantifying appearance complicate the investigation of shape appearance. Here, we introduce Geometrically Restricted Image Descriptors (GRIDs), a method to investigate the appearance of shapes. Stimuli in the GRID paradigm are shapes consisting of distinct line elements placed on a grid by connecting grid nodes. Each line is treated as a discrete target. Observers are asked to capture target appearance by placing lines on a freely viewed response grid. We used GRIDs to investigate the appearance of letters and letter-like shapes. Targets were presented at 10° eccentricity in the right visual field. Gaze-contingent stimulus presentation was used to prevent eye movements to the target. The data were analyzed by quantifying the differences between targets and responses with regard to overall accuracy, element discriminability, and several distinct error types. Our results show how shape appearance can be captured by GRIDs, and how a fine-grained analysis of stimulus parts provides quantifications of appearance typically not available in standard measures of performance. We propose that GRIDs are an effective tool to investigate the appearance of shapes.
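The target-versus-response comparison described above can be illustrated with a toy scorer: encode each line as an unordered pair of grid nodes, then count correct reproductions, omissions, and additions by set operations. This is a hypothetical sketch of the analysis idea, not the authors' implementation:

```python
def grid_score(target, response):
    """Compare two shapes drawn on a node grid, each a set of lines given
    as frozensets of two (row, col) nodes. Counts correctly reproduced,
    omitted, and added line elements."""
    target, response = set(target), set(response)
    hits = target & response
    return {"correct": len(hits),
            "omitted": len(target - response),
            "added": len(response - target),
            "accuracy": len(hits) / max(len(target), 1)}

# Target: a 3-line "T"; the response misses one stroke and adds a stray one.
target = {frozenset({(0, 0), (0, 1)}),
          frozenset({(0, 1), (0, 2)}),
          frozenset({(0, 1), (1, 1)})}
response = {frozenset({(0, 0), (0, 1)}),
            frozenset({(0, 1), (1, 1)}),
            frozenset({(1, 1), (2, 1)})}
score = grid_score(target, response)
```

Treating each line as a discrete element, as above, is what permits the per-element error taxonomy the abstract mentions.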
Affiliation(s)
- Natalia Melnik
- Institute of Psychology, University of Bern, Bern, Switzerland
- Daniel R Coates
- Institute of Psychology, University of Bern, Bern, Switzerland; College of Optometry, University of Houston, Houston, Texas, USA
- Bilge Sayim
- Institute of Psychology, University of Bern, Bern, Switzerland; Univ. Lille, CNRS, UMR 9193 - SCALab - Sciences Cognitives et Sciences Affectives, Lille, France (http://www.appearancelab.org/)
8
Kosovicheva A, Bex PJ. Gravitational effects of scene information in object localization. Sci Rep 2021;11:11520. PMID: 34075169; PMCID: PMC8169838; DOI: 10.1038/s41598-021-91006-8.
Abstract
We effortlessly interact with objects in our environment, but how do we know where something is? An object's apparent position does not simply correspond to its retinotopic location but is influenced by its surrounding context. In the natural environment, this context is highly complex, and little is known about how visual information in a scene influences the apparent location of the objects within it. We measured the influence of local image statistics (luminance, edges, object boundaries, and saliency) on the reported location of a brief target superimposed on images of natural scenes. For each image statistic, we calculated the difference between the image value at the physical center of the target and the value at its reported center, using observers' cursor responses, and averaged the resulting values across all trials. To isolate image-specific effects, difference scores were compared to a randomly-permuted null distribution that accounted for any response biases. The observed difference scores indicated that responses were significantly biased toward darker regions, luminance edges, object boundaries, and areas of high saliency, with relatively low shared variance among these measures. In addition, we show that the same image statistics were associated with observers' saccade errors, despite large differences in response time, and that some effects persisted when high-level scene processing was disrupted by 180° rotations and color negatives of the originals. Together, these results provide evidence for landmark effects within natural images, in which feature location reports are pulled toward low- and high-level informative content in the scene.
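The difference-score analysis described above can be sketched in numpy: sample an image-statistic map at each trial's reported and true target locations, and compare the mean difference against a null in which responses are re-paired with other trials' images (which preserves any overall response bias while breaking the image-specific content). All names and the toy ramp stimuli below are my own assumptions, not the authors' code:

```python
import numpy as np

def landmark_bias(maps, true_xy, reported_xy, n_perm=1000, seed=0):
    """Mean difference between an image-statistic map sampled at each
    trial's reported target location and at its true location, compared
    to a null in which responses are re-paired with other trials' maps.
    maps: (n_trials, H, W); *_xy: (n_trials, 2) integer (row, col).
    Returns (observed mean difference, two-sided permutation p-value)."""
    maps = np.asarray(maps, float)
    n = len(maps)
    def sample(ix, xy):
        return maps[ix, xy[:, 0], xy[:, 1]]
    idx = np.arange(n)
    observed = np.mean(sample(idx, reported_xy) - sample(idx, true_xy))
    rng = np.random.default_rng(seed)
    null = np.empty(n_perm)
    for i in range(n_perm):
        perm = rng.permutation(n)            # responses -> other images
        null[i] = np.mean(sample(perm, reported_xy) - sample(perm, true_xy))
    p = np.mean(np.abs(null) >= abs(observed))
    return observed, p

# Toy demo: each "image" is a luminance ramp of random direction, and the
# reported location is shifted 5 px toward the darker side.
rng = np.random.default_rng(1)
n, H, W = 200, 32, 32
signs = rng.choice([-1.0, 1.0], size=n)
maps = np.broadcast_to(
    signs[:, None, None] * np.arange(W, dtype=float), (n, H, W)).copy()
true_xy = np.full((n, 2), W // 2)
reported_xy = np.stack(
    [np.full(n, W // 2), (W // 2 - 5 * signs).astype(int)], axis=1)
obs, p = landmark_bias(maps, true_xy, reported_xy)
```

In the toy case the observed difference is negative (responses land on darker pixels) and far outside the permutation null, mirroring the "pulled toward darker regions" result.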
Affiliation(s)
- Anna Kosovicheva
- Department of Psychology, University of Toronto Mississauga, 3359 Mississauga Road, Mississauga, ON L5L 1C6, Canada; Department of Psychology, Northeastern University, 125 Nightingale Hall, 360 Huntington Ave., Boston, MA 02115, USA
- Peter J. Bex
- Department of Psychology, Northeastern University, 125 Nightingale Hall, 360 Huntington Ave., Boston, MA 02115, USA
9
Abstract
Detection of target objects in the surrounding environment is a common visual task. There is a vast psychophysical and modeling literature concerning the detection of targets in artificial and natural backgrounds. Most studies involve detection of additive targets or of some form of image distortion. Although much has been learned from these studies, the targets that most often occur under natural conditions are neither additive nor distorting; rather, they are opaque targets that occlude the backgrounds behind them. Here, we describe our efforts to measure and model detection of occluding targets in natural backgrounds. To systematically vary the properties of the backgrounds, we used the constrained sampling approach of Sebastian, Abrams, and Geisler (2017). Specifically, millions of calibrated gray-scale natural-image patches were sorted into a 3D histogram along the dimensions of luminance, contrast, and phase-invariant similarity to the target. Eccentricity psychometric functions (accuracy as a function of retinal eccentricity) were measured for four different occluding targets and 15 different combinations of background luminance, contrast, and similarity, with a different randomly sampled background on each trial. The complex pattern of results was consistent across the three subjects, and was largely explained by a principled model observer (with only a single efficiency parameter) that combines three image cues (pattern, silhouette, and edge) and four well-known properties of the human visual system (optical blur, blurring and downsampling by the ganglion cells, divisive normalization, intrinsic position uncertainty). The model also explains the thresholds for additive foveal targets in natural backgrounds reported in Sebastian et al. (2017).
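The constrained-sampling step described here (sorting image patches into a 3D histogram along luminance, contrast, and similarity to the target) can be sketched as below. For simplicity the similarity axis is plain correlation magnitude with the target template rather than the phase-invariant similarity of Sebastian, Abrams, and Geisler (2017); everything else is a hypothetical toy:

```python
import numpy as np

def bin_patches(patches, template, n_bins=(5, 5, 5)):
    """Sort image patches into a 3D histogram along mean luminance, RMS
    contrast, and similarity to a target template. Similarity here is
    plain correlation magnitude (a simplification of the phase-invariant
    measure used in the paper)."""
    P = np.asarray(patches, float).reshape(len(patches), -1)
    T = np.asarray(template, float).ravel()
    lum = P.mean(axis=1)
    con = P.std(axis=1)
    Pc = P - P.mean(axis=1, keepdims=True)
    Tc = T - T.mean()
    denom = np.linalg.norm(Pc, axis=1) * np.linalg.norm(Tc) + 1e-12
    sim = np.abs(Pc @ Tc) / denom
    hist = np.zeros(n_bins, dtype=int)
    idx = []
    for d, b in zip((lum, con, sim), n_bins):
        edges = np.linspace(d.min(), d.max() + 1e-12, b + 1)
        idx.append(np.clip(np.digitize(d, edges) - 1, 0, b - 1))
    np.add.at(hist, tuple(idx), 1)
    return hist

rng = np.random.default_rng(2)
patches = rng.random((1000, 8, 8))
template = rng.random((8, 8))
hist = bin_patches(patches, template)
```

Sampling one background per trial from a chosen histogram cell is what lets the background dimensions be varied independently, as in the 15 conditions above.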
10
Clayden AC, Fisher RB, Nuthmann A. On the relative (un)importance of foveal vision during letter search in naturalistic scenes. Vision Res 2020;177:41-55. PMID: 32957035; DOI: 10.1016/j.visres.2020.07.005.
Abstract
The importance of high-acuity foveal vision to visual search can be assessed by denying foveal vision using the gaze-contingent Moving Mask technique. Foveal vision was necessary to attain normal performance when searching for a target letter in alphanumeric displays (Perception & Psychophysics, 62 (2000) 576-585). In contrast, foveal vision was not necessary to correctly locate and identify medium-sized target objects in natural scenes (Journal of Experimental Psychology: Human Perception and Performance, 40 (2014) 342-360). To explore these task differences, we used grayscale pictures of real-world scenes which included a target letter (Experiment 1: T, Experiment 2: T or L). To reduce between-scene variability with regard to target salience, we developed the Target Embedding Algorithm (T.E.A.) to place the letter in a location for which there was a median change in local contrast when inserting the letter into the scene. The presence or absence of foveal vision was crossed with four target sizes. In both experiments, search performance decreased for smaller targets, and was impaired when searching the scene without foveal vision. For correct trials, the process of target localization remained completely unimpaired by the foveal scotoma, but it took longer to accept the target. We reasoned that the size of the target may affect the importance of foveal vision to the task, but the present data remain ambiguous. In summary, the data highlight the importance of extrafoveal vision for target localization, and the importance of foveal vision for target verification during letter-in-scene search.
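A toy rendering of the T.E.A. selection rule described above: insert the letter at each candidate position, measure the resulting change in local RMS contrast, and keep the position producing the median change. This is a sketch of the stated idea under my own assumptions (opaque letter pixels, RMS contrast), not the published algorithm:

```python
import numpy as np

def tea_location(scene, letter, candidates):
    """Among candidate (row, col) positions, return the one for which
    opaquely inserting the letter produces the median change in local
    RMS contrast."""
    scene = np.asarray(scene, float)
    letter = np.asarray(letter, float)
    h, w = letter.shape
    mask = letter > 0
    changes = []
    for r, c in candidates:
        patch = scene[r:r + h, c:c + w]
        patched = patch.copy()
        patched[mask] = letter[mask]            # opaque letter pixels
        changes.append(abs(patched.std() - patch.std()))
    order = np.argsort(changes)
    return candidates[order[len(order) // 2]]   # median-change position

rng = np.random.default_rng(3)
scene = rng.random((64, 64))
letter = np.zeros((7, 7))
letter[0, :] = 1.0                              # crude "T" glyph
letter[:, 3] = 1.0
candidates = [(r, c) for r in range(0, 57, 8) for c in range(0, 57, 8)]
loc = tea_location(scene, letter, candidates)
```

Choosing the median rather than the minimum or maximum change keeps target salience roughly comparable across scenes, which is the point of the algorithm.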
Affiliation(s)
- Adam C Clayden
- Psychology Department, School of Philosophy, Psychology and Language Sciences, University of Edinburgh, UK; School of Engineering, Arts, Science and Technology, University of Suffolk, UK
- Antje Nuthmann
- Psychology Department, School of Philosophy, Psychology and Language Sciences, University of Edinburgh, UK; Institute of Psychology, University of Kiel, Germany
11
Krarup TG, Nisted I, Christensen U, Kiilgaard JF, la Cour M. The tolerance of anisometropia. Acta Ophthalmol 2020;98:418-426. PMID: 31773911; DOI: 10.1111/aos.14310.
Abstract
PURPOSE: This study examines aniseikonia, the aniseikonia tolerance range (ATR), anisometropia and patient-reported outcomes (PRO) in an anisometropic population compared with a non-anisometropic population. The relationship between anisometropia and aniseikonia is determined, and the correlations of aniseikonia, anisometropia and ATR with PRO are described.
METHODS: One hundred and twenty-three patients with IOL-induced anisometropia ≥1 dioptre (D) (the anisometropic group) and 17 patients with IOL-induced anisometropia <1 D (the control group) were included. Best corrected visual acuity, aniseikonia, ATR and stereoacuity were examined, and two questionnaires were completed: the convergence insufficiency symptom survey (CISS) and the Visual Function Questionnaire (VFQ-39).
RESULTS: One hundred and thirteen patients had anisometropia >1 and <3 D, and 10 patients had anisometropia >3 D. There was no difference in PRO between the control group and the anisometropic group (Mann-Whitney, p-values VFQ: 0.96, CISS: 0.06), and no correlation between anisometropia and PRO (Spearman's rank correlation, p-values VFQ: 0.54, CISS: 0.57). Patients with a low ATR were more sensitive to anisometropia and had lower PRO than patients with a high ATR (Mann-Whitney, p-values VFQ: 0.0008, CISS: 0.11). A large tolerance of aniseikonia was observed.
CONCLUSION: No correlation between PRO and anisometropia or aniseikonia was found. Patients with a low ATR are at risk of visual complaints if they are exposed to IOL-induced anisometropia. ATR might be a future screening tool in cataract patients.
Affiliation(s)
| | - Ivan Nisted
- Faculty of Health Institute for Clinical Medicine Aarhus N Denmark
| | - Ulrik Christensen
- Department of Ophthalmology Rigshospitalet‐Glostrup Glostrup Denmark
| | | | - Morten la Cour
- Department of Ophthalmology Rigshospitalet‐Glostrup Glostrup Denmark
| |
12
Jennings BJ, Schmidtmann G, Wehbé F, Kingdom FAA, Farivar R. Detection of distortions in images of natural scenes in mild traumatic brain injury patients. Vision Res 2019;161:12-17. PMID: 31129288; DOI: 10.1016/j.visres.2019.05.004.
Abstract
Mild traumatic brain injuries (mTBI) frequently lead to impairment of visual functions, including blurred and/or distorted vision, due to the disruption of visual cortical mechanisms. Previous mTBI studies have focused on specific aspects of visual processing, e.g., stereopsis, using artificial, low-level stimuli (e.g., Gaussian patches and gratings). In the current study we investigated high-level visual processing by employing images of real-world natural scenes as our stimuli. An mTBI group and a control group of healthy observers were tasked with detecting sinusoidal distortions added to the natural scene stimuli, as a function of the distorting sinusoid's spatial frequency. The mTBI group was equally as sensitive to high-frequency distortions as the control group. However, sensitivity decreased more rapidly with decreasing distortion frequency in the mTBI group relative to the controls. These data reflect a deficit in the mTBI group's ability to spatially integrate over larger regions of the scene.
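Sinusoidal distortions of the kind used as stimuli here can be approximated by displacing each pixel row horizontally by a sinusoid of its vertical position, with amplitude and spatial frequency as the parameters of interest. This is a minimal nearest-neighbour sketch under my own assumptions, not the authors' stimulus code:

```python
import numpy as np

def sinusoidal_distort(img, amplitude=3.0, frequency=0.05, phase=0.0):
    """Shift each pixel row horizontally by a sinusoid of its vertical
    position (nearest-neighbour resampling). `frequency` is in cycles
    per pixel of image height."""
    img = np.asarray(img, float)
    H, W = img.shape
    shift = amplitude * np.sin(2 * np.pi * frequency * np.arange(H) + phase)
    out = np.empty_like(img)
    cols = np.arange(W)
    for r in range(H):
        src = np.clip(np.round(cols - shift[r]).astype(int), 0, W - 1)
        out[r] = img[r, src]
    return out

rng = np.random.default_rng(5)
img = rng.random((32, 32))
warped = sinusoidal_distort(img)
```

Varying `frequency` while holding `amplitude` fixed yields the distortion-frequency manipulation whose detectability the study measures; low frequencies bend the image over large regions, which is where the mTBI deficit appeared.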
Affiliation(s)
- B J Jennings
- Centre for Cognitive Neuroscience, Department of Life Sciences, College of Health and Life Sciences, Brunel University, London, UK; McGill Vision Research Unit, Department of Ophthalmology, McGill University, Montreal, Canada
- G Schmidtmann
- McGill Vision Research Unit, Department of Ophthalmology, McGill University, Montreal, Canada; Eye & Vision Research Group, Department of Optometry, University of Plymouth, Plymouth, UK
- F Wehbé
- McGill Vision Research Unit, Department of Ophthalmology, McGill University, Montreal, Canada
- F A A Kingdom
- McGill Vision Research Unit, Department of Ophthalmology, McGill University, Montreal, Canada
- R Farivar
- McGill Vision Research Unit, Department of Ophthalmology, McGill University, Montreal, Canada; Traumatic Brain Injury Program, Research Institute of the McGill University Health Centre, Montreal, Canada
13
Wallis TS, Funke CM, Ecker AS, Gatys LA, Wichmann FA, Bethge M. Image content is more important than Bouma's Law for scene metamers. eLife 2019;8:e42512. PMID: 31038458; PMCID: PMC6491040; DOI: 10.7554/eLife.42512.
Abstract
We subjectively perceive our visual field with high fidelity, yet peripheral distortions can go unnoticed and peripheral objects can be difficult to identify (crowding). Prior work showed that humans could not discriminate images synthesised to match the responses of a mid-level ventral visual stream model when information was averaged in receptive fields with a scaling of about half their retinal eccentricity. This result implicated ventral visual area V2, approximated ‘Bouma’s Law’ of crowding, and has subsequently been interpreted as a link between crowding zones, receptive field scaling, and our perceptual experience. However, this experiment never assessed natural images. We find that humans can easily discriminate real and model-generated images at V2 scaling, requiring scales at least as small as V1 receptive fields to generate metamers. We speculate that explaining why scenes look as they do may require incorporating segmentation and global organisational constraints in addition to local pooling.

As you read this digest, your eyes move to follow the lines of text. But now try to hold your eyes in one position, while reading the text on either side and below: it soon becomes clear that peripheral vision is not as good as we tend to assume. It is not possible to read text far away from the center of your line of vision, but you can see ‘something’ out of the corner of your eye. You can see that there is text there, even if you cannot read it, and you can see where your screen or page ends. So how does the brain generate peripheral vision, and why does it differ from what you see when you look straight ahead? One idea is that the visual system averages information over areas of the peripheral visual field. This gives rise to texture-like patterns, as opposed to images made up of fine details. Imagine looking at an expanse of foliage, gravel or fur, for example. Your eyes cannot make out the individual leaves, pebbles or hairs. Instead, you perceive an overall pattern in the form of a texture. Our peripheral vision may also consist of such textures, created when the brain averages information over areas of space. Wallis, Funke et al. have now tested this idea using an existing computer model that averages visual input in this way. By giving the model a series of photographs to process, Wallis, Funke et al. obtained images that should in theory simulate peripheral vision. If the model mimics the mechanisms that generate peripheral vision, then healthy volunteers should be unable to distinguish the processed images from the original photographs. But in fact, the participants could easily discriminate the two sets of images. This suggests that the visual system does not solely use textures to represent information in the peripheral visual field. Wallis, Funke et al. propose that other factors, such as how the visual system separates and groups objects, may instead determine what we see in our peripheral vision. This knowledge could ultimately benefit patients with eye diseases such as macular degeneration, a condition that causes loss of vision in the center of the visual field and forces patients to rely on their peripheral vision.
Affiliation(s)
- Thomas S. A. Wallis
- Werner Reichardt Center for Integrative Neuroscience, Eberhard Karls Universität Tübingen, Tübingen, Germany; Bernstein Center for Computational Neuroscience, Berlin, Germany
- Christina M Funke
- Werner Reichardt Center for Integrative Neuroscience, Eberhard Karls Universität Tübingen, Tübingen, Germany; Bernstein Center for Computational Neuroscience, Berlin, Germany
- Alexander S Ecker
- Werner Reichardt Center for Integrative Neuroscience, Eberhard Karls Universität Tübingen, Tübingen, Germany; Bernstein Center for Computational Neuroscience, Berlin, Germany; Center for Neuroscience and Artificial Intelligence, Baylor College of Medicine, Houston, United States; Institute for Theoretical Physics, Eberhard Karls Universität Tübingen, Tübingen, Germany
- Leon A Gatys
- Werner Reichardt Center for Integrative Neuroscience, Eberhard Karls Universität Tübingen, Tübingen, Germany
- Felix A Wichmann
- Neural Information Processing Group, Faculty of Science, Eberhard Karls Universität Tübingen, Tübingen, Germany
- Matthias Bethge
- Center for Neuroscience and Artificial Intelligence, Baylor College of Medicine, Houston, United States; Institute for Theoretical Physics, Eberhard Karls Universität Tübingen, Tübingen, Germany; Max Planck Institute for Biological Cybernetics, Tübingen, Germany
14
Valsecchi M, Koenderink J, van Doorn A, Gegenfurtner KR. Prediction shapes peripheral appearance. J Vis 2019;18(13):21. PMID: 30593064; DOI: 10.1167/18.13.21.
Abstract
Peripheral perception is limited in terms of visual acuity, contrast sensitivity, and positional uncertainty. In the present study we used an image-manipulation algorithm (the Eidolon Factory) based on a formal description of the visual field as a tool to investigate how peripheral stimuli appear in the presence of such limitations. Observers were asked to match central and peripheral stimuli, both configurations of superimposed geometric shapes and patches of natural images, in terms of the parameters controlling the amplitude of the perturbation (reach) and the cross-scale similarity of the perturbation (coherence). We found that observers systematically tended to report the peripheral stimuli as having shorter reach and higher coherence. This means that their matches both were less distorted and had sharper edges relative to the actual stimulus. Overall, the results indicate that the way we see objects in our peripheral visual field is complemented by our assumptions about the way the same objects would appear if they were viewed foveally.
Affiliation(s)
- Matteo Valsecchi
- Abteilung Allgemeine Psychologie, Justus-Liebig-Universität Giessen, Giessen, Germany
- Jan Koenderink
- Abteilung Allgemeine Psychologie, Justus-Liebig-Universität Giessen, Giessen, Germany; Experimental Psychology, KU Leuven, Leuven, Belgium; Experimental Psychology, Utrecht University, Utrecht, the Netherlands
- Andrea van Doorn
- Abteilung Allgemeine Psychologie, Justus-Liebig-Universität Giessen, Giessen, Germany; Experimental Psychology, KU Leuven, Leuven, Belgium; Experimental Psychology, Utrecht University, Utrecht, the Netherlands
- Karl R Gegenfurtner
- Abteilung Allgemeine Psychologie, Justus-Liebig-Universität Giessen, Giessen, Germany
15
|
Fruend I, Stalker E. Human sensitivity to perturbations constrained by a model of the natural image manifold. J Vis 2018; 18(11):20. [PMID: 30383190] [DOI: 10.1167/18.11.20] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.5]
Abstract
Humans are remarkably well tuned to the statistical properties of natural images. However, quantitative characterization of processing within the domain of natural images has been difficult because most parametric manipulations of a natural image make that image appear less natural. We used generative adversarial networks (GANs) to constrain parametric manipulations to remain within an approximation of the manifold of natural images. In the first experiment, seven observers decided which one of two synthetic perturbed images matched a synthetic unperturbed comparison image. Observers were significantly more sensitive to perturbations that were constrained to an approximate manifold of natural images than they were to perturbations applied directly in pixel space. Trial-by-trial errors were consistent with the idea that these perturbations disrupt configural aspects of visual structure used in image segmentation. In a second experiment, five observers discriminated paths along the image manifold as recovered by the GAN. Observers were remarkably good at this task, confirming that observers are tuned to fairly detailed properties of an approximate manifold of natural images. We conclude that human tuning to natural images is more general than detecting deviations from natural appearance, and that humans have, to some extent, access to detailed interrelations between natural images.
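The comparison at the heart of this design, perturbations applied in the latent space of a generative model versus directly in pixel space at matched pixel-space energy, can be sketched with a toy generator (`G`, its weights, and the step size are illustrative stand-ins, not the trained GAN from the study):

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-in for a GAN generator: a fixed nonlinear map from a 2-D
# latent space to a 16-pixel "image". The study used a trained GAN;
# this G is purely illustrative.
W1 = rng.normal(size=(8, 2))
W2 = rng.normal(size=(16, 8))

def G(z):
    return W2 @ np.tanh(W1 @ z)

z = rng.normal(size=2)
img = G(z)

# On-manifold perturbation: move in latent space, then render.
dz = rng.normal(size=2)
on_manifold = G(z + 0.1 * dz) - img

# Off-manifold perturbation: add pixel noise with the same energy, so
# the two perturbations are matched in pixel-space norm.
pix_noise = rng.normal(size=16)
off_manifold = pix_noise / np.linalg.norm(pix_noise) * np.linalg.norm(on_manifold)

assert np.isclose(np.linalg.norm(on_manifold), np.linalg.norm(off_manifold))
```

The experimental claim is then that observers detect `on_manifold` perturbations at smaller norms than `off_manifold` ones.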
Affiliation(s)
- Ingo Fruend: Centre for Vision Research and Department of Psychology, York University, Toronto, Ontario, Canada
- Elee Stalker: Department of Psychology, York University, Toronto, Ontario, Canada
16
Huang M, Shen Q, Ma Z, Bovik AC, Gupta P, Zhou R, Cao X. Modeling the perceptual quality of immersive images rendered on head mounted displays: resolution and compression. IEEE Trans Image Process 2018; 27:6039-6050. [PMID: 30106732] [DOI: 10.1109/tip.2018.2865089] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.2]
Abstract
We develop a model that expresses the joint impact of spatial resolution s and JPEG compression quality factor qf on immersive image quality. The model is expressed as the product of optimized exponential functions of these factors. The model is tested on a subjective database of immersive image contents rendered on a head mounted display (HMD). High Pearson correlation and Spearman correlation (> 0.95) and small relative root mean squared error (< 5.6%) are achieved between the model predictions and the subjective quality judgements. The immersive ground-truth images along with the rest of the database are made available for future research and comparisons.
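The model is described here only at the level of its functional form: quality as the product of optimized exponential functions of s and qf. A minimal sketch of that form (the coefficients below are made-up placeholders, not the values fitted to the subjective database):

```python
import numpy as np

def immersive_quality(s, qf, a1=0.8, b1=0.5, a2=0.9, b2=0.05):
    """Product of saturating-exponential terms in spatial resolution s
    and JPEG quality factor qf. Coefficients are illustrative
    placeholders, not the parameters fitted in the paper."""
    term_s = 1.0 - a1 * np.exp(-b1 * s)
    term_qf = 1.0 - a2 * np.exp(-b2 * qf)
    return term_s * term_qf

# Under this form, quality rises monotonically with both factors and
# saturates, which is the qualitative behaviour the model captures.
assert immersive_quality(4.0, 90) > immersive_quality(1.0, 10)
```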
17
Nightingale SJ, Wade KA, Watson DG. Can people identify original and manipulated photos of real-world scenes? Cogn Res Princ Implic 2017; 2:30. [PMID: 28776002] [PMCID: PMC5514174] [DOI: 10.1186/s41235-017-0067-2] [Citation(s) in RCA: 48] [Impact Index Per Article: 6.9]
Abstract
Advances in digital technology mean that the creation of visually compelling photographic fakes is growing at an incredible speed. The prevalence of manipulated photos in our everyday lives invites an important, yet largely unanswered, question: Can people detect photo forgeries? Previous research using simple computer-generated stimuli suggests people are poor at detecting geometrical inconsistencies within a scene. We do not know, however, whether such limitations also apply to real-world scenes that contain common properties that the human visual system is attuned to processing. In two experiments we asked people to detect and locate manipulations within images of real-world scenes. Subjects demonstrated a limited ability to discriminate original from manipulated images. Furthermore, across both experiments, even when subjects correctly detected manipulated images, they were often unable to locate the manipulation. People’s ability to detect manipulated images was positively correlated with the extent of disruption to the underlying structure of the pixels in the photo. We also explored whether manipulation type and individual differences were associated with people’s ability to identify manipulations. Taken together, our findings show, for the first time, that people have a poor ability to identify whether a real-world image is original or has been manipulated. The results have implications for professionals working with digital images in legal, media, and other domains.
Affiliation(s)
- Kimberley A Wade: Department of Psychology, University of Warwick, Coventry, CV4 7AL, UK
- Derrick G Watson: Department of Psychology, University of Warwick, Coventry, CV4 7AL, UK
18
Detecting distortions of peripherally presented letter stimuli under crowded conditions. Atten Percept Psychophys 2017; 79:850-862. [DOI: 10.3758/s13414-016-1245-x] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3]
19
Jennings BJ, Wang K, Menzies S, Kingdom FAA. Detection of chromatic and luminance distortions in natural scenes. J Opt Soc Am A Opt Image Sci Vis 2015; 32:1613-1622. [PMID: 26367428] [DOI: 10.1364/josaa.32.001613] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3]
Abstract
A number of studies have measured visual thresholds for detecting spatial distortions applied to images of natural scenes. In one study, Bex (J Vis 2010; 10(2):23, doi:10.1167/10.2.23) measured sensitivity to sinusoidal spatial modulations of image scale. Here, we measure sensitivity to sinusoidal scale distortions applied to the chromatic, luminance, or both layers of natural scene images. We first established that sensitivity does not depend on whether the undistorted comparison image was of the same or of a different scene. Next, we found that, when the luminance but not chromatic layer was distorted, performance was the same regardless of whether the chromatic layer was present, absent, or phase-scrambled; in other words, the chromatic layer, in whatever form, did not affect sensitivity to the luminance layer distortion. However, when the chromatic layer was distorted, sensitivity was higher when the luminance layer was intact compared to when absent or phase-scrambled. These detection threshold results complement the appearance of periodic distortions of the image scale: when the luminance layer is distorted visibly, the scene appears distorted, but when the chromatic layer is distorted visibly, there is little apparent scene distortion. We conclude that (a) observers have a built-in sense of how a normal image of a natural scene should appear, and (b) the detection of distortion in, as well as the apparent distortion of, natural scene images is mediated predominantly by the luminance layer and not the chromatic layer.
20
Abstract
Human vision has a remarkable ability to perceive two layers at the same retinal locations, a transparent layer in front of a background surface. Critical image cues to perceptual transparency, studied extensively in the past, are changes in luminance or color that could be caused by light absorptions and reflections by the front layer, but such image changes may not be clearly visible when the front layer consists of a pure transparent material such as water. Our daily experiences with transparent materials of this kind suggest that an alternative potential cue of visual transparency is image deformations of a background pattern caused by light refraction. Although previous studies have indicated that these image deformations, at least static ones, play little role in perceptual transparency, here we show that dynamic image deformations of the background pattern, which could be produced by light refraction on a moving liquid's surface, can produce a vivid impression of a transparent liquid layer without the aid of any other visual cues as to the presence of a transparent layer. Furthermore, a transparent liquid layer perceptually emerges even from a randomly generated dynamic image deformation as long as it is similar to real liquid deformations in its spatiotemporal frequency profile. Our findings indicate that the brain can perceptually infer the presence of "invisible" transparent liquids by analyzing the spatiotemporal structure of dynamic image deformation, for which it uses a relatively simple computation that does not require high-level knowledge about the detailed physics of liquid deformation.
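The stimulus construction implied here, a background deformation field that is band-pass in both space and time, can be sketched as follows (the pass-band centres and widths are arbitrary placeholders rather than the liquid-matched profile used in the study):

```python
import numpy as np

rng = np.random.default_rng(1)

# Random spatiotemporal volume: (time, y, x).
T, H, W = 16, 32, 32
noise = rng.normal(size=(T, H, W))

# Band-pass the noise in space and time with Gaussian annuli in the
# frequency domain. The pass-bands below are arbitrary placeholders;
# the paper's point is that the profile should resemble real liquid flow.
ft, fy, fx = np.meshgrid(np.fft.fftfreq(T), np.fft.fftfreq(H),
                         np.fft.fftfreq(W), indexing="ij")
f_spatial = np.sqrt(fy**2 + fx**2)
band = (np.exp(-((f_spatial - 0.1) / 0.05) ** 2)
        * np.exp(-((np.abs(ft) - 0.1) / 0.05) ** 2))
deform_x = np.real(np.fft.ifftn(np.fft.fftn(noise) * band))

# deform_x is one component of a deformation field: at each frame t,
# pixel (y, x) of the background is sampled from (y, x + deform_x[t, y, x]).
assert deform_x.shape == (T, H, W)
```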
21
Wallis TSA, Dorr M, Bex PJ. Sensitivity to gaze-contingent contrast increments in naturalistic movies: An exploratory report and model comparison. J Vis 2015; 15(8):3. [PMID: 26057546] [DOI: 10.1167/15.8.3] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.4]
Abstract
Sensitivity to luminance contrast is a prerequisite for all but the simplest visual systems. To examine contrast increment detection performance in a way that approximates the natural environmental input of the human visual system, we presented contrast increments gaze-contingently within naturalistic video freely viewed by observers. A band-limited contrast increment was applied to a local region of the video relative to the observer's current gaze point, and the observer made a forced-choice response to the location of the target (≈25,000 trials across five observers). We present exploratory analyses showing that performance improved as a function of the magnitude of the increment and depended on the direction of eye movements relative to the target location, the timing of eye movements relative to target presentation, and the spatiotemporal image structure at the target location. Contrast discrimination performance can be modeled by assuming that the underlying contrast response is an accelerating nonlinearity (arising from a nonlinear transducer or gain control). We implemented one such model and examined the posterior over model parameters, estimated using Markov-chain Monte Carlo methods. The parameters were poorly constrained by our data; parameters constrained using strong priors taken from previous research showed poor cross-validated prediction performance. Atheoretical logistic regression models were better constrained and provided similar prediction performance to the nonlinear transducer model. Finally, we explored the properties of an extended logistic regression that incorporates both eye movement and image content features. Models of contrast transduction may be better constrained by incorporating data from both artificial and natural contrast perception settings.
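The accelerating contrast response mentioned here is conventionally written as a Naka-Rushton / Legge-Foley style transducer. A minimal sketch (the exponents and semi-saturation constant are illustrative, not the posterior estimates discussed in the paper):

```python
def transducer(c, p=2.4, q=2.0, z=0.1):
    """Legge-Foley style contrast transducer: accelerating below the
    semi-saturation contrast z, compressive above it. Parameter values
    are illustrative placeholders, not fitted estimates."""
    return c ** p / (c ** q + z ** q)

# Accelerating near threshold: doubling a small contrast more than
# doubles the response.
assert transducer(0.02) > 2 * transducer(0.01)
# Compressive at high contrast: doubling a large contrast less than
# doubles the response.
assert transducer(0.8) < 2 * transducer(0.4)
```

This accelerating-then-compressive shape is what makes contrast increments easier to see on a low-contrast pedestal than on none (the "dipper"), the effect the study probed in natural movies.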
22
Abstract
Acuity is the most commonly used measure of visual function, and reductions in acuity are associated with most eye diseases. Metamorphopsia (a perceived distortion of visual space) is another common symptom of visual impairment and is currently assessed qualitatively using Amsler (1953) charts. In order to quantify the impact of metamorphopsia on acuity, we measured the effect of physical spatial distortion on letter recognition. Following earlier work showing that letter recognition is tuned to specific spatial frequency (SF) channels, we hypothesized that the effect of distortion might depend on the spatial scale of visual distortion just as it depends on the spatial scale of masking noise. Six normally sighted observers completed a 26-alternative forced-choice (26-AFC) Sloan letter identification task at five different viewing distances, with the letters undergoing different levels of spatial distortion. Distortion was controlled using spatially band-pass filtered noise that remapped pixel locations. Noise was varied over five spatial frequencies and five magnitudes. Performance was modeled with logistic regression and worsened linearly with increasing distortion magnitude and decreasing letter size. We found that the effect of distortion peaks at midrange retinal SFs, consistent with the tuning of a basic contrast sensitivity function, while object-centered distortion SF follows a similar pattern to letter recognition sensitivity and is tuned to approximately three cycles per letter (CPL). The interaction between letter size and distortion makes acuity an unreliable outcome for metamorphopsia assessment.
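The distortion procedure, spatially band-pass filtered noise used to remap pixel locations, can be sketched directly. The frequencies and magnitudes below are arbitrary examples, and the nearest-neighbour remapping is a simplification of whatever interpolation the study used:

```python
import numpy as np

rng = np.random.default_rng(2)

def bandpass_noise(shape, centre_freq, bandwidth):
    """Spatially band-pass filtered Gaussian noise (freqs in cycles/pixel)."""
    fy, fx = np.meshgrid(np.fft.fftfreq(shape[0]),
                         np.fft.fftfreq(shape[1]), indexing="ij")
    f = np.sqrt(fy ** 2 + fx ** 2)
    filt = np.exp(-0.5 * ((f - centre_freq) / bandwidth) ** 2)
    return np.real(np.fft.ifft2(np.fft.fft2(rng.normal(size=shape)) * filt))

def distort(img, magnitude, centre_freq=0.05, bandwidth=0.02):
    """Remap pixel locations with two band-pass noise fields whose
    standard deviation is calibrated to `magnitude` pixels."""
    dy = bandpass_noise(img.shape, centre_freq, bandwidth)
    dx = bandpass_noise(img.shape, centre_freq, bandwidth)
    dy *= magnitude / dy.std()
    dx *= magnitude / dx.std()
    yy, xx = np.mgrid[0:img.shape[0], 0:img.shape[1]]
    src_y = np.clip(np.round(yy + dy).astype(int), 0, img.shape[0] - 1)
    src_x = np.clip(np.round(xx + dx).astype(int), 0, img.shape[1] - 1)
    return img[src_y, src_x]  # nearest-neighbour remapping for simplicity

letter = np.zeros((64, 64))
letter[16:48, 28:36] = 1.0  # a crude vertical bar standing in for a letter
warped = distort(letter, magnitude=2.0)
assert warped.shape == letter.shape
```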
Affiliation(s)
- Emily Wiecek: Massachusetts Eye and Ear Infirmary, Boston, MA, USA; Department of Ophthalmology, Harvard Medical School, Boston, MA, USA; Institute of Ophthalmology, University College London, London, UK
- Steven C Dakin: Institute of Ophthalmology, University College London, London, UK; Biomedical Research Centre, Moorfields Eye Hospital, National Institute for Health Research, London, UK; Department of Optometry and Vision Science, University of Auckland, New Zealand
- Peter Bex: Massachusetts Eye and Ear Infirmary, Boston, MA, USA; Department of Ophthalmology, Harvard Medical School, Boston, MA, USA; Department of Psychology, Northeastern University, Boston, MA, USA
23
Wiecek E, Lashkari K, Dakin SC, Bex P. Novel quantitative assessment of metamorphopsia in maculopathy. Invest Ophthalmol Vis Sci 2014; 56:494-504. [PMID: 25406293] [PMCID: PMC4299468] [DOI: 10.1167/iovs.14-15394] [Citation(s) in RCA: 25] [Impact Index Per Article: 2.5]
Abstract
PURPOSE Patients with macular disease often report experiencing metamorphopsia (visual distortion). Although typically measured with Amsler charts, more quantitative assessments of perceived distortion are desirable to effectively monitor the presence, progression, and remediation of visual impairment. METHODS Participants with binocular (n = 33) and monocular (n = 50) maculopathy across seven disease groups, and control participants (n = 10) with no identifiable retinal disease completed a modified Amsler grid assessment (presented on a computer screen with eye tracking to ensure fixation compliance) and two novel assessments to measure metamorphopsia in the central 5° of visual field. A total of 81% (67/83) of participants completed a hyperacuity task where they aligned eight dots in the shape of a square, and 64% (32/50) of participants with monocular distortion completed a spatial alignment task using dichoptic stimuli. Ten controls completed all tasks. RESULTS Horizontal and vertical distortion magnitudes were calculated for each of the three assessments. Distortion magnitudes were significantly higher in patients than controls in all assessments. There was no significant difference in magnitude of distortion across different macular diseases. There were no significant correlations between overall magnitude of distortion among any of the three measures and no significant correlations in localized measures of distortion. CONCLUSIONS Three alternative quantifications of monocular spatial distortion in the central visual field generated uncorrelated estimates of visual distortion. It is therefore unlikely that metamorphopsia is caused solely by retinal displacement, but instead involves additional top-down information, knowledge about the scene, and perhaps, cortical reorganization.
Affiliation(s)
- Emily Wiecek: Schepens Eye Research Institute/Mass. Eye and Ear, Boston, Massachusetts, United States
- Kameran Lashkari: Schepens Eye Research Institute/Mass. Eye and Ear, Boston, Massachusetts, United States; Department of Ophthalmology, Harvard Medical School, Boston, Massachusetts, United States
- Steven C. Dakin: Institute of Ophthalmology, University College London, London, United Kingdom
- Peter Bex: Schepens Eye Research Institute/Mass. Eye and Ear, Boston, Massachusetts, United States; Department of Ophthalmology, Harvard Medical School, Boston, Massachusetts, United States; Department of Psychology, Northeastern University, Boston, Massachusetts, United States
24
Hwang AD, Peli E. Instability of the perceived world while watching 3D stereoscopic imagery: A likely source of motion sickness symptoms. Iperception 2014; 5:515-535. [PMID: 26034562] [PMCID: PMC4441027] [DOI: 10.1068/i0647] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.3]
Abstract
Watching 3D content using a stereoscopic display may cause various discomforting symptoms, including eye strain, blurred vision, double vision, and motion sickness. Numerous studies have reported motion-sickness-like symptoms during stereoscopic viewing, but no causal linkage between specific aspects of the presentation and the induced discomfort has been explicitly proposed. Here, we describe several ways in which stereoscopic capture, display, and viewing differ from natural viewing, resulting in static and, importantly, dynamic distortions that conflict with the expected stability and rigidity of the real world. This analysis provides a basis for suggested changes to display systems that may alleviate the symptoms, and suggestions for future studies to determine the relative contribution of the various effects to the unpleasant symptoms.
Affiliation(s)
- Alex D Hwang: Department of Ophthalmology, Schepens Eye Research Institute, Massachusetts Eye and Ear, Harvard Medical School, Boston, MA, USA
- Eli Peli: Department of Ophthalmology, Schepens Eye Research Institute, Massachusetts Eye and Ear, Harvard Medical School, Boston, MA, USA
25
Ugarte M, Shunmugam M, Laidlaw DAH, Williamson TH. Morphision: a method for subjective evaluation of metamorphopsia in patients with unilateral macular pathology (i.e., full thickness macular hole and epiretinal membrane). Indian J Ophthalmol 2013; 61:653-658. [PMID: 24008785] [PMCID: PMC3959082] [DOI: 10.4103/0301-4738.117804] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.0]
Abstract
BACKGROUND Clinical tests to quantify the spatial components of distortion in patients with full thickness macular holes (FTMH) and epiretinal membranes (ERM) are lacking. AIM To develop a test for subjective evaluation of visual distortion in the central visual field around fixation in patients with unilateral FTMH or ERM. SETTINGS AND DESIGN Prospective case-control study carried out at a tertiary referral center. MATERIALS AND METHODS Twenty-five patients with unilateral macular disease (13 macular epiretinal membranes, 12 full-thickness macular holes) and nine controls (without ocular pathology) underwent ophthalmological examination, with logMAR ETDRS visual acuity, near vision, and contrast sensitivity assessed. Macular optical coherence tomography and metamorphopsia assessment using the Morphision test were also carried out. This test consists of a set of modified Amsler charts for detection, identification, and subjective quantification of visual distortion in the central visual field around fixation. Morphision test content validity, construct validity, and reliability (test-retest method) were evaluated. Sixteen patients completed an unstructured survey on test performance and preference. RESULTS Every patient with unilateral FTMH or ERM identified a particular chart using the Morphision test (content validity). None of the normal subjects without symptoms of metamorphopsia identified any distortion (construct validity). Test-retest showed 100% consistency for frequency and 67% for amplitude. The mean amplitude difference between measurements was 0.02 degrees (SD = 0.038). The coefficient of repeatability was 0.075. The Morphision amplitude score correlated with visual acuity and with contrast sensitivity individually. CONCLUSIONS The Morphision test allowed detection and subjective quantification of metamorphopsia in the clinical setting in our patients with unilateral macular epiretinal membranes and full thickness macular holes.
Affiliation(s)
- Tom H Williamson: St. Thomas' Hospital, Lambeth Palace Road, London, SE1 7EH, United Kingdom
26
Zavitz E, Baker CL. Texture sparseness, but not local phase structure, impairs second-order segmentation. Vision Res 2013; 91:45-55. [PMID: 23942289] [DOI: 10.1016/j.visres.2013.07.018] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.1]
Abstract
Texture boundary segmentation is typically thought to reflect a comparison of differences in Fourier energy (i.e. low-order texture statistics) on either side of a boundary. However in a previous study (Arsenault, Yoonessi, & Baker, 2011) we showed that the distribution of energy within a natural texture (i.e. its higher-order statistical structure) also influences segmentation of contrast boundaries. Here we examine the influence of specific higher-order texture statistics on segmentation of contrast- and orientation-defined boundaries. Using naturalistic synthetic textures to manipulate the sparseness, global phase structure, and local phase alignments of carrier textures, we measure segmentation thresholds based on forced-choice judgments of boundary orientation. We find a similar pattern of results for both contrast and orientation boundaries: (1) randomizing all structure by globally phase scrambling the texture reduces segmentation thresholds substantially, (2) decreasing sparseness also reduces thresholds, and (3) removing local phase alignments has little or no effect on segmentation thresholds. We show that a two-stage filter model with an intermediate compressive nonlinearity and expansive output nonlinearity can account for these data using synthetic textures. Furthermore, the model parameter fits obtained using synthetic textures also predict the segmentation thresholds presented in Arsenault, Yoonessi, and Baker (2011) for natural and phase-scrambled natural texture carriers.
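The two-stage model referred to here is the standard filter-rectify-filter cascade. A minimal sketch with a compressive intermediate nonlinearity and an expansive output (filter sizes, frequencies, and exponents are illustrative placeholders, not the fitted parameters):

```python
import numpy as np

rng = np.random.default_rng(3)

def gabor(size, freq, theta):
    """Simple odd-symmetric Gabor filter."""
    y, x = np.mgrid[-size // 2:size // 2, -size // 2:size // 2].astype(float)
    xr = x * np.cos(theta) + y * np.sin(theta)
    envelope = np.exp(-(x**2 + y**2) / (2 * (size / 6) ** 2))
    return envelope * np.sin(2 * np.pi * freq * xr)

def fft_filter(img, kern):
    """Convolve via FFT (circular boundary), same output size."""
    K = np.fft.fft2(kern, s=img.shape)
    return np.real(np.fft.ifft2(np.fft.fft2(img) * K))

def frf_response(img, p=0.5, gamma=2.0):
    """Filter-rectify-filter cascade: fine-scale first-stage filter,
    compressive rectification (exponent p < 1), coarse second-stage
    filter, expansive output nonlinearity (gamma > 1). Exponents are
    illustrative, not the parameters fitted in the paper."""
    stage1 = fft_filter(img, gabor(16, freq=0.25, theta=0.0))
    rectified = np.abs(stage1) ** p           # compressive intermediate nonlinearity
    stage2 = fft_filter(rectified, gabor(64, freq=0.03, theta=np.pi / 2))
    return np.abs(stage2).mean() ** gamma     # expansive output

texture = rng.normal(size=(64, 64))
resp = frf_response(texture)
assert np.isfinite(resp) and resp >= 0
```

A compressive intermediate nonlinearity is what makes the cascade sensitive to sparseness: it weights how first-stage energy is distributed across the texture, not just its total.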
Affiliation(s)
- Elizabeth Zavitz: McGill Vision Research, Department of Ophthalmology, McGill University, Montreal, Quebec, Canada; Department of Physiology, Monash University, Clayton, Victoria, Australia
27
Zhang F, Lin W, Chen Z, Ngan KN. Additive log-logistic model for networked video quality assessment. IEEE Trans Image Process 2013; 22:1536-1547. [PMID: 23247855] [DOI: 10.1109/tip.2012.2233486] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1]
Abstract
Modeling subjective opinions on visual quality is a challenging problem that relates closely to many factors of human perception. In this paper, the additive log-logistic model (ALM) is proposed to formulate such a multidimensional nonlinear problem. The log-logistic function has flexible monotonic or nonmonotonic partial derivatives and thus is suitable for modeling various single impairment types. The proposed ALM metric adds the distortions due to each type of impairment in a log-logistic transformed space of subjective opinions. Features can be evaluated and selected by classic statistical inference, and the model parameters can be easily estimated. Cross-validations on five subjectively rated databases of the Telecommunication Standardization Sector of the International Telecommunication Union (ITU-T) confirm that: 1) based on the same features, the ALM outperforms support vector regression and the logistic model in quality prediction; and 2) the resulting no-reference quality metric based on impairment-relevant video parameters achieves high correlation with a total of 27,216 subjective opinions on 1,134 video clips, even compared with existing full-reference quality metrics based on pixel differences. The ALM metric won the model competition of ITU-T Study Group 12 (where the validation databases are independent of the training databases) and is thus being put forth into ITU-T Recommendation P.1202.2 for the consent of ITU-T.
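The core of the ALM is easy to sketch: each impairment contributes a log-logistic penalty in a transformed opinion space, and the penalties add. The impairment names and all parameter values below are invented for illustration and are not those of ITU-T P.1202.2:

```python
def log_logistic(x, alpha, beta):
    """Log-logistic function of a quality parameter x > 0. A negative
    beta makes the function decreasing in x."""
    return 1.0 / (1.0 + (x / alpha) ** (-beta))

def alm_quality(impairments, params, q_max=5.0, q_min=1.0):
    """Additive log-logistic model sketch: each impairment's penalty is
    a log-logistic function of its video parameter, and penalties add
    in the transformed space before mapping back to a 1-5 MOS scale.
    Impairment names and parameters are illustrative placeholders."""
    penalty = sum(log_logistic(x, *params[name])
                  for name, x in impairments.items())
    return max(q_min, q_max - (q_max - q_min) * min(penalty, 1.0))

params = {"packet_loss": (2.0, 1.5), "compression": (30.0, -2.0)}
# Higher packet loss and a lower compression-quality parameter both
# push predicted quality down.
good = alm_quality({"packet_loss": 0.1, "compression": 40.0}, params)
bad = alm_quality({"packet_loss": 5.0, "compression": 15.0}, params)
assert good > bad
```

The additive structure is the practical selling point: each impairment's penalty term can be fitted and validated separately before the terms are summed.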
Affiliation(s)
- Fan Zhang: Department of Research and Innovation, Technicolor (China) Technology Co. Ltd, Beijing 100192, China
28
McIlreavy L, Fiser J, Bex PJ. Impact of simulated central scotomas on visual search in natural scenes. Optom Vis Sci 2012; 89:1385-1394. [PMID: 22885785] [DOI: 10.1097/opx.0b013e318267a914] [Citation(s) in RCA: 26] [Impact Index Per Article: 2.4]
Abstract
PURPOSE In performing search tasks, the visual system encodes information across the visual field at a resolution inversely related to eccentricity and deploys saccades to place visually interesting targets upon the fovea, where resolution is highest. The serial process of fixation, punctuated by saccadic eye movements, continues until the desired target has been located. Loss of central vision restricts the ability to resolve the high spatial information of a target, interfering with this visual search process. We investigate oculomotor adaptations to central visual field loss with gaze-contingent artificial scotomas. METHODS Spatial distortions were placed at random locations in 25° square natural scenes. Gaze-contingent artificial central scotomas were updated at the screen rate (75 Hz) based on a 250 Hz eye tracker. Eight subjects searched the natural scene for the spatial distortion and indicated its location using a mouse-controlled cursor. RESULTS As the central scotoma size increased, the mean search time increased [F(3,28) = 5.27, p = 0.05], and the spatial distribution of gaze points during fixation increased significantly along the x [F(3,28) = 6.33, p = 0.002] and y [F(3,28) = 3.32, p = 0.034] axes. Oculomotor patterns of fixation duration, saccade size, and saccade duration did not change significantly, regardless of scotoma size. CONCLUSIONS There is limited automatic adaptation of the oculomotor system after simulated central vision loss.
Affiliation(s)
- Lee McIlreavy: Department of Ophthalmology, Harvard Medical School, Schepens Eye Research Institute, Boston, Massachusetts 02114, USA
29
Abstract
Natural textures have characteristic image statistics that make them discriminable from unnatural textures. For example, both contrast negation and texture synthesis alter the appearance of natural textures even though each manipulation preserves some features while disrupting others. Here, we examined the extent to which contrast negation and texture synthesis each introduce or remove critical perceptual features for discriminating unnatural textures from natural textures. We find that both manipulations remove information that observers use for distinguishing natural textures from transformed versions of the same patterns, but do so in different ways. Texture synthesis removes information that is relevant for discrimination in both abstract patterns and ecologically valid textures, and we also observe a category-dependent asymmetry for identifying an “oddball” real texture among synthetic distractors. Contrast negation exhibits no such asymmetry, and also does not impact discrimination performance in abstract patterns. We discuss our results in the context of the visual system’s tuning to ecologically relevant patterns and other results describing sensitivity to higher-order statistics in texture patterns.
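Of the two manipulations, contrast negation is simple to state precisely: it flips pixel polarity about the mean luminance while leaving the amplitude spectrum, and hence the second-order statistics, untouched. A short check of that invariance (the random texture is a stand-in for the study's stimuli):

```python
import numpy as np

rng = np.random.default_rng(4)
texture = rng.random((32, 32))

# Contrast negation: flip contrast polarity about the mean luminance.
negated = 2 * texture.mean() - texture

# The amplitude spectrum is unchanged: only phases flip (and the mean
# is preserved by construction). Any perceptual difference between a
# texture and its negative must therefore rest on higher-order
# statistics, which is the asymmetry the study probes.
amp = np.abs(np.fft.fft2(texture))
amp_neg = np.abs(np.fft.fft2(negated))
assert np.allclose(amp, amp_neg)
```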
Affiliation(s)
- Benjamin Balas: Department of Psychology, Center for Visual and Cognitive Neuroscience, North Dakota State University, Fargo, ND, USA
30
Wallis TSA, Bex PJ. Image correlates of crowding in natural scenes. J Vis 2012; 12(7):6. [PMID: 22798053] [PMCID: PMC4503217] [DOI: 10.1167/12.7.6] [Citation(s) in RCA: 24] [Impact Index Per Article: 2.0]
Abstract
Visual crowding is the inability to identify visible features when they are surrounded by other structure in the peripheral field. Since natural environments are replete with structure and most of our visual field is peripheral, crowding represents the primary limit on vision in the real world. However, little is known about the characteristics of crowding under natural conditions. Here we examine where crowding occurs in natural images. Observers were required to identify which of four locations contained a patch of "dead leaves" (synthetic, naturalistic contour structure) embedded into natural images. Threshold size for the dead leaves patch scaled with eccentricity in a manner consistent with crowding. Reverse correlation at multiple scales was used to determine local image statistics that correlated with task performance. Stepwise model selection revealed that local RMS contrast and edge density at the site of the dead leaves patch were of primary importance in predicting the occurrence of crowding once patch size and eccentricity had been considered. The absolute magnitudes of the regression weights for RMS contrast at different spatial scales varied in a manner consistent with receptive field sizes measured in striate cortex of primate brains. Our results are consistent with crowding models that are based on spatial averaging of features in the early stages of the visual system, and allow the prediction of where crowding is likely to occur in natural images.
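The analysis described, predicting trial-by-trial crowding from local image statistics with regression, can be sketched on synthetic data. Everything below, including the data, the coefficients, and the plain Newton-Raphson fit, is an illustrative stand-in for the paper's stepwise model selection:

```python
import numpy as np

rng = np.random.default_rng(5)

# Synthetic stand-in data: trials with local RMS contrast and edge
# density at the target site. The generative weights are made up; the
# study fitted such predictors to real identification outcomes.
n = 2000
X = np.column_stack([
    np.ones(n),            # intercept
    rng.uniform(0, 1, n),  # local RMS contrast
    rng.uniform(0, 1, n),  # local edge density
])
true_w = np.array([-2.0, 2.5, 3.0])   # more clutter -> more crowding
p = 1 / (1 + np.exp(-(X @ true_w)))
y = rng.random(n) < p                 # 1 = identification failed (crowded)

# Fit a logistic regression by Newton-Raphson (IRLS).
w = np.zeros(3)
for _ in range(25):
    mu = 1 / (1 + np.exp(-(X @ w)))
    grad = X.T @ (y - mu)
    H = X.T @ (X * (mu * (1 - mu))[:, None])
    w = w + np.linalg.solve(H, grad)

# Recovered weights should have the right sign: higher contrast and
# edge density at the target site predict more crowding.
assert w[1] > 0 and w[2] > 0
```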
Affiliation(s)
- Thomas S. A. Wallis: Schepens Eye Research Institute, Massachusetts Eye and Ear Infirmary, Department of Ophthalmology, Harvard Medical School, Boston, MA, USA; School of Psychology, The University of Western Australia, Perth, Australia
- Peter J. Bex: Schepens Eye Research Institute, Massachusetts Eye and Ear Infirmary, Department of Ophthalmology, Harvard Medical School, Boston, MA, USA