1. Stewart EEM, Fleming RW, Schütz AC. A simple optical flow model explains why certain object viewpoints are special. Proc Biol Sci 2024; 291:20240577. [PMID: 38981528] [DOI: 10.1098/rspb.2024.0577]
Abstract
A core challenge in perception is recognizing objects across the highly variable retinal input that occurs when objects are viewed from different directions (e.g. front versus side views). It has long been known that certain views are of particular importance, but it remains unclear why. We reasoned that characterizing the computations underlying visual comparisons between objects could explain the privileged status of certain qualitatively special views. We measured pose discrimination for a wide range of objects, finding large variations in performance depending on the object and the viewing angle, with front and back views yielding particularly good discrimination. Strikingly, a simple and biologically plausible computational model based on measuring the projected three-dimensional optical flow between views of objects accurately predicted both successes and failures of discrimination performance. This provides a computational account of why certain views have a privileged status.
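The abstract's key quantity, how much the projected image of an object changes between neighbouring viewpoints, can be sketched in a few lines. This is a toy illustration only, not the authors' model: the box-like point cloud, the orthographic projection, and the `flow_magnitude` helper are all our own assumptions.

```python
import numpy as np

def rotate_y(points, angle):
    """Rotate an (N, 3) point cloud about the vertical (y) axis."""
    c, s = np.cos(angle), np.sin(angle)
    R = np.array([[c, 0.0, s], [0.0, 1.0, 0.0], [-s, 0.0, c]])
    return points @ R.T

def flow_magnitude(points, view_deg, step_deg=5.0):
    """Mean displacement of orthographically projected points between two
    nearby views. Larger values mean the image changes faster with viewpoint,
    which should make pose discrimination easier around that view."""
    p1 = rotate_y(points, np.deg2rad(view_deg))[:, :2]             # keep x, y; drop depth
    p2 = rotate_y(points, np.deg2rad(view_deg + step_deg))[:, :2]
    return np.linalg.norm(p2 - p1, axis=1).mean()

# Toy "object": a box-like cloud that is longer in depth than in width.
rng = np.random.default_rng(0)
obj = rng.uniform(-1.0, 1.0, size=(500, 3)) * np.array([0.3, 0.5, 1.0])

front = flow_magnitude(obj, view_deg=0.0)     # front view
oblique = flow_magnitude(obj, view_deg=45.0)  # three-quarter view
```

For this elongated cloud the projected flow is largest around the front view, in line with the finding that front and back views support particularly good pose discrimination; the published model operates on rendered object surfaces rather than point clouds.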
Affiliation(s)
- Emma E M Stewart
- School of Biological and Behavioural Sciences, Queen Mary University London, London E1 4NS, UK
- Department of Experimental and Biological Psychology, Queen Mary University London, London E1 4NS, UK
- Centre for Brain and Behaviour, Queen Mary University London, London E1 4NS, UK
- Roland W Fleming
- Department of Experimental Psychology, Justus Liebig University Giessen, Giessen 35394, Germany
- Centre for Mind, Brain, and Behaviour (CMBB), University of Marburg and Justus Liebig University Giessen, Giessen 35032, Germany
- Alexander C Schütz
- Centre for Mind, Brain, and Behaviour (CMBB), University of Marburg and Justus Liebig University Giessen, Giessen 35032, Germany
- General and Experimental Psychology, University of Marburg, Marburg 35032, Germany
2. Rafal RD. Seeing without a Scene: Neurological Observations on the Origin and Function of the Dorsal Visual Stream. J Intell 2024; 12:50. [PMID: 38786652] [PMCID: PMC11121949] [DOI: 10.3390/jintelligence12050050]
Abstract
In all vertebrates, visual signals from each visual field project to the opposite midbrain tectum (called the superior colliculus in mammals). The tectum/colliculus computes visual salience to select targets for context-contingent visually guided behavior: a frog will orient toward a small, moving stimulus (insect prey) but away from a large, looming stimulus (a predator). In mammals, visual signals competing for behavioral salience are also transmitted to the visual cortex, where they are integrated with collicular signals and then projected via the dorsal visual stream to the parietal and frontal cortices. To control visually guided behavior, visual signals must be encoded in body-centered (egocentric) coordinates, and so visual signals must be integrated with information encoding eye position in the orbit, that is, where the individual is looking. Eye position information is derived from copies of eye movement signals transmitted from the colliculus to the frontal and parietal cortices. In the intraparietal cortex of the dorsal stream, eye movement signals from the colliculus are used to predict the sensory consequences of action. These eye position signals are integrated with retinotopic visual signals to generate scaffolding for a visual scene that contains goal-relevant objects that are seen to have spatial relationships with each other and with the observer. Patients with degeneration of the superior colliculus, although they can see, behave as though they are blind. Bilateral damage to the intraparietal cortex of the dorsal stream causes the visual scene to disappear, leaving awareness of only one object that is lost in space. This tutorial considers what we have learned from patients with damage to the colliculus, or to the intraparietal cortex, about how the phylogenetically older midbrain and the newer mammalian dorsal cortical visual stream jointly coordinate the experience of a spatially and temporally coherent visual scene.
Affiliation(s)
- Robert D Rafal
- Department of Psychological and Brain Sciences, University of Delaware, Newark, DE 19716, USA
3. Morgenstern Y, Storrs KR, Schmidt F, Hartmann F, Tiedemann H, Wagemans J, Fleming RW. High-level aftereffects reveal the role of statistical features in visual shape encoding. Curr Biol 2024; 34:1098-1106.e5. [PMID: 38218184] [PMCID: PMC10931819] [DOI: 10.1016/j.cub.2023.12.039]
Abstract
Visual shape perception is central to many everyday tasks, from object recognition to grasping and handling tools.1,2,3,4,5,6,7,8,9,10 Yet how shape is encoded in the visual system remains poorly understood. Here, we probed shape representations using visual aftereffects: perceptual distortions that occur following extended exposure to a stimulus.11,12,13,14,15,16,17 Such effects are thought to be caused by adaptation in neural populations that encode both simple, low-level stimulus characteristics17,18,19,20 and more abstract, high-level object features.21,22,23 To tease these two contributions apart, we used machine-learning methods to synthesize novel shapes in a multidimensional shape space, derived from a large database of natural shapes.24 Stimuli were carefully selected such that low-level and high-level adaptation models made distinct predictions about the shapes that observers would perceive following adaptation. We found that adaptation along vector trajectories in the high-level shape space predicted shape aftereffects better than simple low-level processes. Our findings reveal the central role of high-level statistical features in the visual representation of shape. The findings also hint that human vision is attuned to the distribution of shapes experienced in the natural environment.
Affiliation(s)
- Yaniv Morgenstern
- Erasmus University Rotterdam, Department of Psychology, Burgemeester Oudlaan 50, 3062PA Rotterdam, the Netherlands; University of Leuven (KU Leuven), Brain and Cognition, Tiensestraat 102, 3000 Leuven, Belgium.
- Katherine R Storrs
- Justus Liebig University Giessen, Department of Psychology, Otto-Behaghel-Str. 10, 35394 Giessen, Germany; University of Auckland, School of Psychology, 23 Symonds Street, Auckland 1010, New Zealand
- Filipp Schmidt
- Justus Liebig University Giessen, Department of Psychology, Otto-Behaghel-Str. 10, 35394 Giessen, Germany; University of Marburg and Justus Liebig University Giessen, Center for Mind, Brain and Behavior (CMBB), Hans-Meerwein-Str. 6, 35032 Marburg, Germany
- Frieder Hartmann
- Justus Liebig University Giessen, Department of Psychology, Otto-Behaghel-Str. 10, 35394 Giessen, Germany
- Henning Tiedemann
- Justus Liebig University Giessen, Department of Psychology, Otto-Behaghel-Str. 10, 35394 Giessen, Germany
- Johan Wagemans
- University of Leuven (KU Leuven), Brain and Cognition, Tiensestraat 102, 3000 Leuven, Belgium
- Roland W Fleming
- Justus Liebig University Giessen, Department of Psychology, Otto-Behaghel-Str. 10, 35394 Giessen, Germany; University of Marburg and Justus Liebig University Giessen, Center for Mind, Brain and Behavior (CMBB), Hans-Meerwein-Str. 6, 35032 Marburg, Germany
4. Hafri A, Green EJ, Firestone C. Compositionality in visual perception. Behav Brain Sci 2023; 46:e277. [PMID: 37766604] [DOI: 10.1017/s0140525x23001838]
Abstract
Quilty-Dunn et al.'s wide-ranging defense of the Language of Thought Hypothesis (LoTH) argues that vision traffics in abstract, structured representational formats. We agree: Vision, like language, is compositional. Just as words compose into phrases, many visual representations contain discrete constituents that combine in systematic ways. Here, we amass evidence extending this proposal, and explore its implications for how vision interfaces with the rest of the mind.
Affiliation(s)
- Alon Hafri
- Department of Linguistics and Cognitive Science, University of Delaware, Newark, DE, USA. https://pal.lingcogsci.udel.edu/
- E J Green
- Department of Linguistics and Philosophy, Massachusetts Institute of Technology, Cambridge, MA, USA. https://sites.google.com/site/greenedwinj/
- Chaz Firestone
- Department of Psychological and Brain Sciences, Johns Hopkins University, Baltimore, MD, USA. https://perception.jhu.edu/
5. Sun Z, Firestone C. Beautiful on the inside: Aesthetic preferences and the skeletal complexity of shapes. Perception 2022; 51:904-918. [DOI: 10.1177/03010066221124872]
Abstract
A plain, blank canvas does not look very beautiful; to make it aesthetically appealing requires adding structure and complexity. But how much structure is best? In other words, what is the relationship between beauty and complexity? It has long been hypothesized that complexity and beauty meet at a “sweet spot,” such that the most beautiful images are neither too simple nor too complex. Here, we take a novel experimental approach to this question, using an information-theoretic approach to object representation based on an internal “skeletal” structure. We algorithmically generated a library of two-dimensional polygons and manipulated their complexity by gradually smoothing out their features—essentially decreasing the amount of information in the objects. We then stylized these shapes as “paintings” by rendering them with artistic strokes, and “mounted” them on framed canvases hung in a virtual room. Participants were shown pairs of these mounted shapes (which possessed similar structures but varied in skeletal complexity) and chose which shape looked best by previewing each painting on the canvas. Experiment 1 revealed a “Goldilocks” effect: participants preferred paintings that were neither too simple nor too complex, such that moderately complex shapes were chosen as the most attractive paintings. Experiment 2 isolated the role of complexity per se: when the same shapes were scrambled (such that their structural complexity was undermined, while other visual features were preserved), the Goldilocks effect was dramatically diminished. These findings suggest a quadratic relationship between aesthetics and complexity in ways that go beyond previous measures of each and demonstrate the utility of information-theoretic approaches for exploring high-level aspects of visual experience.
6. Functional recursion of orientation cues in figure-ground separation. Vision Res 2022; 197:108047. [PMID: 35691090] [PMCID: PMC9262819] [DOI: 10.1016/j.visres.2022.108047]
Abstract
Visual texture is an important cue to figure-ground organization. While processing of texture differences is a prerequisite for the use of this cue to extract figure-ground organization, these stages are distinct processes. One potential indicator of this distinction is the possibility that texture statistics play a different role in the figure vs. in the ground. To determine whether this is the case, we probed figure-ground processing with a family of local image statistics that specified textures that varied in the strength and spatial scale of structure, and the extent to which features are oriented. For image statistics that generated approximately isotropic textures, the threshold for identification of figure-ground structure was determined by the difference in correlation strength in figure vs. ground, independent of whether the correlations were present in figure, ground, or both. However, for image statistics with strong orientation content, thresholds were up to two times higher for correlations in the ground, vs. the figure. This held equally for texture-defined objects with convex or concave boundaries, indicating that these threshold differences are driven by border ownership, not boundary shape. Similar threshold differences were found for presentation times ranging from 125 to 500 ms. These findings identify a qualitative difference in how texture is used for figure-ground analysis, vs. texture discrimination. Additionally, it reveals a functional recursion: texture differences are needed to identify tentative boundaries and consequent scene organization into figure and ground, but then scene organization modifies sensitivity to texture differences according to the figure-ground assignment.
7. Ayzenberg V, Lourenco S. Perception of an object's global shape is best described by a model of skeletal structure in human infants. eLife 2022; 11:e74943. [PMID: 35612898] [PMCID: PMC9132572] [DOI: 10.7554/elife.74943]
Abstract
Categorization of everyday objects requires that humans form representations of shape that are tolerant to variations among exemplars. Yet, how such invariant shape representations develop remains poorly understood. By comparing human infants (6-12 months; N=82) to computational models of vision using comparable procedures, we shed light on the origins and mechanisms underlying object perception. Following habituation to a never-before-seen object, infants classified other novel objects across variations in their component parts. Comparisons to several computational models of vision, including models of high-level and low-level vision, revealed that infants' performance was best described by a model of shape based on the skeletal structure. Interestingly, infants outperformed a range of artificial neural network models, selected for their massive object experience and biological plausibility, under the same conditions. Altogether, these findings suggest that robust representations of shape can be formed with little language or object experience by relying on the perceptually invariant skeletal structure.
Affiliation(s)
- Stella Lourenco
- Department of Psychology, Emory University, Atlanta, United States
8. Superordinate Categorization Based on the Perceptual Organization of Parts. Brain Sci 2022; 12:667. [PMID: 35625053] [PMCID: PMC9139997] [DOI: 10.3390/brainsci12050667]
Abstract
Plants and animals are among the most behaviorally significant superordinate categories for humans. Visually assigning objects to such high-level classes is challenging because highly distinct items must be grouped together (e.g., chimpanzees and geckos) while more similar items must sometimes be separated (e.g., stick insects and twigs). As both animals and plants typically possess complex multi-limbed shapes, the perceptual organization of shape into parts likely plays a crucial role in identifying them. Here, we identify a number of distinctive growth characteristics that affect the spatial arrangement and properties of limbs, yielding useful cues for differentiating plants from animals. We developed a novel algorithm based on shape skeletons to create many novel object pairs that differ in their part structure but are otherwise very similar. We found that particular part organizations cause stimuli to look systematically more like plants or animals. We then generated a further 110 sequences of shapes morphing from animal- to plant-like appearance by modifying three aspects of part structure: sprouting parts, curvedness of parts, and symmetry of part pairs. We found that all three parameters correlated strongly with human animal/plant judgments. Together our findings suggest that subtle changes in the properties and organization of parts can provide powerful cues in superordinate categorization.
9. Tiedemann H, Morgenstern Y, Schmidt F, Fleming RW. One-shot generalization in humans revealed through a drawing task. eLife 2022; 11:e75485. [PMID: 35536739] [PMCID: PMC9090327] [DOI: 10.7554/elife.75485]
Abstract
Humans have the amazing ability to learn new visual concepts from just a single exemplar. How we achieve this remains mysterious. State-of-the-art theories suggest observers rely on internal 'generative models', which not only describe observed objects, but can also synthesize novel variations. However, compelling evidence for generative models in human one-shot learning remains sparse. In most studies, participants merely compare candidate objects created by the experimenters, rather than generating their own ideas. Here, we overcame this key limitation by presenting participants with 2D 'Exemplar' shapes and asking them to draw their own 'Variations' belonging to the same class. The drawings reveal that participants inferred, and synthesized, genuine novel categories that were far more varied than mere copies. Yet, there was striking agreement between participants about which shape features were most distinctive, and these tended to be preserved in the drawn Variations. Indeed, swapping distinctive parts caused objects to swap apparent category. Our findings suggest that internal generative models are key to how humans generalize from single exemplars. When observers see a novel object for the first time, they identify its most distinctive features and infer a generative model of its shape, allowing them to mentally synthesize plausible variants.
Affiliation(s)
- Henning Tiedemann
- Department of Experimental Psychology, Justus Liebig University Giessen, Giessen, Germany
- Yaniv Morgenstern
- Department of Experimental Psychology, Justus Liebig University Giessen, Giessen, Germany; Laboratory of Experimental Psychology, University of Leuven (KU Leuven), Leuven, Belgium
- Filipp Schmidt
- Department of Experimental Psychology, Justus Liebig University Giessen, Giessen, Germany; Center for Mind, Brain and Behavior (CMBB), University of Marburg and Justus Liebig University Giessen, Giessen, Germany
- Roland W Fleming
- Department of Experimental Psychology, Justus Liebig University Giessen, Giessen, Germany; Center for Mind, Brain and Behavior (CMBB), University of Marburg and Justus Liebig University Giessen, Giessen, Germany
10. Schmidt F. The Perception of “Intelligent” Design in Visual Structure. Iperception 2022; 13:20416695221080184. [PMID: 35237402] [PMCID: PMC8883379] [DOI: 10.1177/20416695221080184]
Abstract
Many objects in our visual environment will appear to us either as a consequence of “intelligent” design—the purposeful action of an animal mind—or as a consequence of self-organization in response to nature's forces—for example, wind or gravity. Here, the origin of this distinction is studied by collecting human judgements about skeletal representations of objects, which reduce objects to their basic visual structure. The results suggest that humans attribute an animate origin to visual objects with basic structures exhibiting straight lines and right angles.
Affiliation(s)
- Filipp Schmidt
- Justus Liebig University Giessen, Germany
- Center for Mind, Brain and Behavior (CMBB), University of Marburg and Justus Liebig University Giessen, Germany
11. Wilder J, Rezanejad M, Dickinson S, Siddiqi K, Jepson A, Walther DB. Neural correlates of local parallelism during naturalistic vision. PLoS One 2022; 17:e0260266. [PMID: 35061699] [PMCID: PMC8782314] [DOI: 10.1371/journal.pone.0260266]
Abstract
Human observers can rapidly perceive complex real-world scenes. Grouping visual elements into meaningful units is an integral part of this process. Yet, so far, the neural underpinnings of perceptual grouping have only been studied with simple lab stimuli. We here uncover the neural mechanisms of one important perceptual grouping cue, local parallelism. Using a new, image-computable algorithm for detecting local symmetry in line drawings and photographs, we manipulated the local parallelism content of real-world scenes. We decoded scene categories from patterns of brain activity obtained via functional magnetic resonance imaging (fMRI) in 38 human observers while they viewed the manipulated scenes. Decoding was significantly more accurate for scenes containing strong local parallelism compared to weak local parallelism in the parahippocampal place area (PPA), indicating a central role of parallelism in scene perception. To investigate the origin of the parallelism signal we performed a model-based fMRI analysis of the public BOLD5000 dataset, looking for voxels whose activation time course matches that of the locally parallel content of the 4916 photographs viewed by the participants in the experiment. We found a strong relationship with average local symmetry in visual areas V1-4, PPA, and retrosplenial cortex (RSC). Notably, the parallelism-related signal peaked first in V4, suggesting V4 as the site for extracting parallelism from the visual input. We conclude that local parallelism is a perceptual grouping cue that influences neuronal activity throughout the visual hierarchy, presumably starting at V4. Parallelism plays a key role in the representation of scene categories in PPA.
Affiliation(s)
- Morteza Rezanejad
- University of Toronto, Toronto, Canada
- McGill University, Montreal, Canada
- Sven Dickinson
- University of Toronto, Toronto, Canada
- Samsung Toronto AI Research Center, Toronto, Canada
- Vector Institute, Toronto, Canada
- Allan Jepson
- University of Toronto, Toronto, Canada
- Samsung Toronto AI Research Center, Toronto, Canada
12. Ayzenberg V, Kamps FS, Dilks DD, Lourenco SF. Skeletal representations of shape in the human visual cortex. Neuropsychologia 2022; 164:108092. [PMID: 34801519] [PMCID: PMC9840386] [DOI: 10.1016/j.neuropsychologia.2021.108092]
Abstract
Shape perception is crucial for object recognition. However, it remains unknown exactly how shape information is represented and used by the visual system. Here, we tested the hypothesis that the visual system represents object shape via a skeletal structure. Using functional magnetic resonance imaging (fMRI) and representational similarity analysis (RSA), we found that a model of skeletal similarity explained significant unique variance in the response profiles of V3 and LO. Moreover, the skeletal model remained predictive in these regions even when controlling for other models of visual similarity that approximate low- to high-level visual features (i.e., Gabor-jet, GIST, HMAX, and AlexNet), and across different surface forms, a manipulation that altered object contours while preserving the underlying skeleton. Together, these findings shed light on shape processing in human vision, as well as the computational properties of V3 and LO. We discuss how these regions may support two putative roles of shape skeletons: namely, perceptual organization and object recognition.
Affiliation(s)
- Vladislav Ayzenberg
- Department of Psychology, Carnegie Mellon University, USA. Corresponding author: V. Ayzenberg
- Frederik S. Kamps
- Department of Brain and Cognitive Sciences, Massachusetts Institute of Technology, USA
- Stella F. Lourenco
- Department of Psychology, Emory University, USA. Corresponding author: S.F. Lourenco
13. Aizenman AM, Ehinger KA, Wick FA, Micheletto R, Park J, Jurgensen L, Wolfe JM. Hiding the Rabbit: Using a genetic algorithm to investigate shape guidance in visual search. J Vis 2022; 22:7. [PMID: 35024760] [PMCID: PMC8762685] [DOI: 10.1167/jov.22.1.7]
Abstract
During visual search, attention is guided by specific features, including shape. Our understanding of shape guidance is limited to specific attributes (closures and line terminations) that do not fully explain the richness of preattentive shape processing. We used a novel genetic algorithm method to explore shape space and to stimulate hypotheses about shape guidance. Initially, observers searched for targets among 12 random distractors defined, in radial frequency space, by the amplitude and phase of 10 radial frequencies. Reaction time (RT) was the measure of “fitness.” To evolve toward an easier search task, distractors with faster RTs survived to the next generation, “mated,” and produced offspring (new distractors for the next generation of search). To evolve a harder search, surviving distractors were those yielding longer RTs. Within eight generations of evolution, the method succeeds in producing visual searches either harder or easier than the starting search. In radial frequency space, easy distractors evolve amplitude × frequency spectra that are dissimilar to the target, whereas hard distractors evolve spectra that are more similar to the target. This method also works with naturally shaped targets (e.g., rabbit silhouettes). Interestingly, the most inefficient distractors featured a combination of a body and ear distractors that did not resemble the rabbit (visually or in spectrum). Adding extra ears to these distractors did not impact the search spectrally and instead made it easier to confirm a rabbit, once it was found. In general, these experiments show that shapes that are clearly distinct when attended are similar to each other preattentively.
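The evolutionary loop described in the abstract is easy to sketch. In this toy version (our own code, not the authors'), distractors are radial-frequency shapes defined by the amplitudes and phases of 10 radial frequencies; because we have no observers, a spectral distance to the target stands in for reaction time as the fitness signal:

```python
import numpy as np

rng = np.random.default_rng(1)
N_FREQ = 10  # radial frequencies 1..10, as in the study

def rf_contour(amps, phases, n_points=360):
    """Radius profile of a radial-frequency shape: r(t) = 1 + sum_k A_k cos(k*t + p_k)."""
    t = np.linspace(0.0, 2 * np.pi, n_points, endpoint=False)
    k = np.arange(1, N_FREQ + 1)[:, None]
    return 1.0 + (amps[:, None] * np.cos(k * t + phases[:, None])).sum(axis=0)

def make(amps=None, phases=None):
    return {"amps": rng.uniform(0.0, 0.2, N_FREQ) if amps is None else amps,
            "phases": rng.uniform(0.0, 2 * np.pi, N_FREQ) if phases is None else phases}

def fitness(distractor, target):
    """Stand-in for reaction time: amplitude-spectrum distance to the target
    (the study measured observers' RTs; we substitute this spectral proxy)."""
    return np.abs(distractor["amps"] - target["amps"]).sum()

def evolve(target, pop_size=12, generations=8, easier=True):
    """Keep the fittest half each generation, then 'mate' survivors to refill."""
    pop = [make() for _ in range(pop_size)]
    for _ in range(generations):
        # easier search: keep distractors most dissimilar to the target
        pop.sort(key=lambda d: fitness(d, target), reverse=easier)
        pop = pop[: pop_size // 2]
        while len(pop) < pop_size:
            a, b = rng.choice(pop_size // 2, size=2, replace=False)  # pick two survivors
            mix = rng.random(N_FREQ) < 0.5  # uniform crossover of frequency channels
            pop.append(make(np.where(mix, pop[a]["amps"], pop[b]["amps"]),
                            np.where(mix, pop[a]["phases"], pop[b]["phases"])))
    return pop

target = make()
easy_pop = evolve(target, easier=True)    # distractors evolved to speed search
hard_pop = evolve(target, easier=False)   # distractors evolved to slow it
contour = rf_contour(target["amps"], target["phases"])
```

As in the study, the easy population drifts toward spectra dissimilar to the target and the hard population toward similar ones; the real experiment used human reaction times, not a spectral proxy, as fitness.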
Affiliation(s)
- Farahnaz A Wick
- Harvard Medical School, Cambridge, MA, USA; Brigham and Women's Hospital, Cambridge, MA, USA
- Jeremy M Wolfe
- Harvard Medical School, Cambridge, MA, USA; Brigham and Women's Hospital, Cambridge, MA, USA
14. Baker N, Kellman PJ. Constant curvature modeling of abstract shape representation. PLoS One 2021; 16:e0254719. [PMID: 34339436] [PMCID: PMC8328290] [DOI: 10.1371/journal.pone.0254719]
Abstract
How abstract shape is perceived and represented poses crucial unsolved problems in human perception and cognition. Recent findings suggest that the visual system may encode contours as sets of connected constant curvature segments. Here we describe a model for how the visual system might recode a set of boundary points into a constant curvature representation. The model includes two free parameters that relate to the degree to which the visual system encodes shapes with high fidelity vs. the importance of simplicity in shape representations. We conducted two experiments to estimate these parameters empirically. Experiment 1 tested the limits of observers’ ability to discriminate a contour made up of two constant curvature segments from one made up of a single constant curvature segment. Experiment 2 tested observers’ ability to discriminate contours generated from cubic splines (which, mathematically, have no constant curvature segments) from constant curvature approximations of the contours, generated at various levels of precision. Results indicated a clear transition point at which discrimination becomes possible. The results were used to fix the two parameters in our model. In Experiment 3, we tested whether outputs from our parameterized model were predictive of perceptual performance in a shape recognition task. We generated shape pairs that had matched physical similarity but differed in representational similarity (i.e., the number of segments needed to describe the shapes) as assessed by our model. We found that pairs of shapes that were more representationally dissimilar were also easier to discriminate in a forced choice, same/different task. The results of these studies provide evidence for constant curvature shape representation in human visual perception and provide a testable model for how abstract shape descriptions might be encoded.
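The authors' two-parameter model is not reproduced here, but the core intuition, that a contour built from constant curvature segments needs only as many parameters as it has segments, can be illustrated with a small sketch (our own construction: discrete Menger curvature estimates plus a single split at the largest curvature jump):

```python
import numpy as np

def menger_curvature(p0, p1, p2):
    """Curvature of the circle through three points: 4 * Area / (a * b * c)."""
    a = np.linalg.norm(p1 - p0)
    b = np.linalg.norm(p2 - p1)
    c = np.linalg.norm(p2 - p0)
    cross = (p1[0] - p0[0]) * (p2[1] - p0[1]) - (p2[0] - p0[0]) * (p1[1] - p0[1])
    return 2.0 * abs(cross) / (a * b * c)

def curvature_profile(points):
    """Discrete curvature at each interior sample of an open contour."""
    return np.array([menger_curvature(points[i - 1], points[i], points[i + 1])
                     for i in range(1, len(points) - 1)])

def sse(ks):
    """Error of approximating a curvature run by a single constant value."""
    return ((ks - ks.mean()) ** 2).sum()

# Toy contour: a radius-1 arc joined with tangent continuity to a radius-2 arc.
t1 = np.linspace(0.0, np.pi / 2, 40)
arc1 = np.stack([np.cos(t1), np.sin(t1)], axis=1)                              # curvature 1.0
t2 = np.linspace(0.0, np.pi / 4, 40)
arc2 = np.stack([2 * np.cos(np.pi / 2 + t2), 2 * np.sin(np.pi / 2 + t2) - 1.0], axis=1)  # curvature 0.5
contour = np.vstack([arc1, arc2[1:]])                                          # drop duplicated join point

ks = curvature_profile(contour)
one_segment_error = sse(ks)
split = np.argmax(np.abs(np.diff(ks))) + 1       # split at the biggest curvature jump
two_segment_error = sse(ks[:split]) + sse(ks[split:])
```

The two-arc contour is approximated far better by two constant-curvature segments than by one; in the model's terms, its representational description needs two segments, and shapes needing more segments are representationally more complex.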
Affiliation(s)
- Nicholas Baker
- Department of Psychology, University of California Los Angeles, Los Angeles, California, United States of America
- Philip J. Kellman
- Department of Psychology, University of California Los Angeles, Los Angeles, California, United States of America
| |
Collapse
|
15
|
Morgenstern Y, Hartmann F, Schmidt F, Tiedemann H, Prokott E, Maiello G, Fleming RW. An image-computable model of human visual shape similarity. PLoS Comput Biol 2021; 17:e1008981. [PMID: 34061825 PMCID: PMC8195351 DOI: 10.1371/journal.pcbi.1008981] [Citation(s) in RCA: 11] [Impact Index Per Article: 3.7] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/25/2020] [Revised: 06/11/2021] [Accepted: 04/19/2021] [Indexed: 11/19/2022] Open
Abstract
Shape is a defining feature of objects, and human observers can effortlessly compare shapes to determine how similar they are. Yet, to date, no image-computable model can predict how visually similar or different shapes appear. Such a model would be an invaluable tool for neuroscientists and could provide insights into computations underlying human shape perception. To address this need, we developed a model (‘ShapeComp’), based on over 100 shape features (e.g., area, compactness, Fourier descriptors). When trained to capture the variance in a database of >25,000 animal silhouettes, ShapeComp accurately predicts human shape similarity judgments between pairs of shapes without fitting any parameters to human data. To test the model, we created carefully selected arrays of complex novel shapes using a Generative Adversarial Network trained on the animal silhouettes, which we presented to observers in a wide range of tasks. Our findings show that incorporating multiple ShapeComp dimensions facilitates the prediction of human shape similarity across a small number of shapes, and also captures much of the variance in the multiple arrangements of many shapes. ShapeComp outperforms both conventional pixel-based metrics and state-of-the-art convolutional neural networks, and can also be used to generate perceptually uniform stimulus sets, making it a powerful tool for investigating shape and object representations in the human brain. The ability to describe and compare shapes is crucial in many scientific domains from visual object recognition to computational morphology and computer graphics. Across disciplines, considerable effort has been devoted to the study of shape and its influence on object recognition, yet an important stumbling block is the quantitative characterization of shape similarity. 
Here we develop a psychophysically validated model that takes as input an object's shape boundary and provides a high-dimensional output that can be used for predicting visual shape similarity. With this precise control of shape similarity, the model's description of shape is a powerful tool that can be used across the neurosciences and artificial intelligence to test the role of shape in perception and the brain.
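The abstract names a few of ShapeComp's boundary-based features (area, compactness, Fourier descriptors). As a minimal sketch, and not the paper's actual feature set or code, a couple of these descriptors can be computed directly from a closed polygonal boundary:

```python
import math

def shape_descriptors(boundary):
    """Compute a few classic contour descriptors for a closed polygon.

    `boundary` is a list of (x, y) vertices traced in order around the
    shape. This sketches just three of the >100 features the abstract
    mentions; the real ShapeComp feature set is far richer.
    """
    n = len(boundary)
    # Shoelace formula for the enclosed area.
    area = 0.5 * abs(sum(
        boundary[i][0] * boundary[(i + 1) % n][1]
        - boundary[(i + 1) % n][0] * boundary[i][1]
        for i in range(n)))
    # Perimeter: sum of edge lengths around the closed contour.
    perimeter = sum(
        math.dist(boundary[i], boundary[(i + 1) % n])
        for i in range(n))
    # Compactness (isoperimetric ratio): 1.0 for a circle, smaller for
    # elongated or crenulated shapes.
    compactness = 4 * math.pi * area / perimeter ** 2
    return {"area": area, "perimeter": perimeter, "compactness": compactness}
```

For a unit square this yields area 1, perimeter 4, and compactness π/4, below the circle's 1.0, as expected for a less compact shape.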
Affiliation(s)
- Yaniv Morgenstern: Department of Experimental Psychology, Justus-Liebig University Giessen, Giessen, Germany
- Frieder Hartmann: Department of Experimental Psychology, Justus-Liebig University Giessen, Giessen, Germany
- Filipp Schmidt: Department of Experimental Psychology, Justus-Liebig University Giessen, Giessen, Germany
- Henning Tiedemann: Department of Experimental Psychology, Justus-Liebig University Giessen, Giessen, Germany
- Eugen Prokott: Department of Experimental Psychology, Justus-Liebig University Giessen, Giessen, Germany
- Guido Maiello: Department of Experimental Psychology, Justus-Liebig University Giessen, Giessen, Germany
- Roland W. Fleming: Department of Experimental Psychology, Justus-Liebig University Giessen, Giessen, Germany; Center for Mind, Brain and Behavior (CMBB), University of Marburg and Justus Liebig University Giessen, Giessen, Germany

16
Sun Z, Firestone C. Curious Objects: How Visual Complexity Guides Attention and Engagement. Cogn Sci 2021; 45:e12933. [PMID: 33873259 DOI: 10.1111/cogs.12933]
Abstract
Some things look more complex than others. For example, a crenulate and richly organized leaf may seem more complex than a plain stone. What is the nature of this experience, and why do we have it in the first place? Here, we explore how object complexity serves as an efficiently extracted visual signal that the object merits further exploration. We algorithmically generated a library of geometric shapes and determined their complexity by computing the cumulative surprisal of their internal skeletons, essentially quantifying the "amount of information" within each shape, and then used this approach to ask new questions about the perception of complexity. Experiments 1-3 asked what kind of mental process extracts visual complexity: a slow, deliberate, reflective process (as when we decide that an object is expensive or popular) or a fast, effortless, and automatic process (as when we see that an object is big or blue)? We placed simple and complex objects in visual search arrays and discovered that complex objects were easier to find among simple distractors than simple objects were among complex distractors, a classic search asymmetry indicating that complexity is prioritized in visual processing. Next, we explored the function of complexity: Why do we represent object complexity in the first place? Experiments 4-5 asked subjects to study serially presented objects in a self-paced manner (for a later memory test); subjects dwelled longer on complex objects than simple objects, even when object shape was completely task-irrelevant, suggesting a connection between visual complexity and exploratory engagement. Finally, Experiment 6 connected these implicit measures of complexity to explicit judgments. Collectively, these findings suggest that visual complexity is extracted efficiently and automatically, and even arouses a kind of "perceptual curiosity" about objects that encourages subsequent attentional engagement.
Affiliation(s)
- Zekun Sun: Department of Psychological & Brain Sciences, Johns Hopkins University
- Chaz Firestone: Department of Psychological & Brain Sciences, Johns Hopkins University

17
Abstract
How the brain reconstructs three-dimensional object shape from two-dimensional retinal light patterns remains a mystery. Most research has investigated how cues—such as shading, texture, or perspective—help us estimate visible surface points on the outside of objects. However, our findings show the brain achieves much more than this. Observers not only infer the visible outer surface but also the hidden internal structure of objects—seeing “beneath the skin.” Our findings suggest the brain parses shapes’ features according to their physical causes, potentially allowing us to separate a single continuous surface into multiple superimposed depth layers. This ability likely aids our interactions with objects, by indicating which surface locations are firmly supported from the inside and thus suitable for grasping. Three-dimensional (3D) shape perception is one of the most important functions of vision. It is crucial for many tasks, from object recognition to tool use, and yet how the brain represents shape remains poorly understood. Most theories focus on purely geometrical computations (e.g., estimating depths, curvatures, symmetries). Here, however, we find that shape perception also involves sophisticated inferences that parse shapes into features with distinct causal origins. Inspired by marble sculptures such as Strazza’s The Veiled Virgin (1850), which vividly depict figures swathed in cloth, we created composite shapes by wrapping unfamiliar forms in textile, so that the observable surface relief was the result of complex interactions between the underlying object and overlying fabric. Making sense of such structures requires segmenting the shape based on its causes, to distinguish whether lumps and ridges are due to the shrouded object or to the ripples and folds of the overlying cloth.
Three-dimensional scans of the objects with and without the textile provided ground-truth measures of the true physical surface reliefs, against which observers’ judgments could be compared. In a virtual painting task, participants indicated which surface ridges appeared to be caused by the hidden object and which were due to the drapery. In another experiment, participants indicated the perceived depth profile of both surface layers. Their responses reveal that they can robustly distinguish features belonging to the textile from those due to the underlying object. Together, these findings reveal the operation of visual shape-segmentation processes that parse shapes based on their causal origin.
18
Morgenstern Y, Schmidt F, Fleming RW. One-shot categorization of novel object classes in humans. Vision Res 2019; 165:98-108. [PMID: 31707254 DOI: 10.1016/j.visres.2019.09.005]
Abstract
One aspect of human vision unmatched by machines is the capacity to generalize from few samples. Observers tend to know when novel objects are in the same class despite large differences in shape, material or viewpoint. A major challenge in studying such generalization is that participants can see each novel sample only once. To overcome this, we used crowdsourcing to obtain responses from 500 human observers on 20 novel object classes, with each stimulus compared to 1 or 16 related objects. The results reveal that humans generalize from sparse data in highly systematic ways that depend on the number and variance of the samples. We compared human responses to 'ShapeComp', an image-computable model based on >100 shape descriptors, and 'AlexNet', a convolutional neural network that roughly matches humans at recognizing 1000 categories of real-world objects. With 16 samples, the models were consistent with human responses without free parameters. Thus, when there are a sufficient number of samples, observers rely on shallow but efficient processes based on a fixed set of features. With 1 sample, however, the models required different feature weights for each object. This suggests that one-shot categorization involves more sophisticated processes that actively identify the unique characteristics underlying each object class.
Affiliation(s)
- Yaniv Morgenstern: Department of Experimental Psychology, Justus-Liebig University Giessen, Giessen 35394, Germany
- Filipp Schmidt: Department of Experimental Psychology, Justus-Liebig University Giessen, Giessen 35394, Germany
- Roland W Fleming: Department of Experimental Psychology, Justus-Liebig University Giessen, Giessen 35394, Germany

19
Ayzenberg V, Lourenco SF. Skeletal descriptions of shape provide unique perceptual information for object recognition. Sci Rep 2019; 9:9359. [PMID: 31249321 PMCID: PMC6597715 DOI: 10.1038/s41598-019-45268-y]
Abstract
With seemingly little effort, humans can both identify an object across large changes in orientation and extend category membership to novel exemplars. Although researchers argue that object shape is crucial in these cases, there are open questions as to how shape is represented for object recognition. Here we tested whether the human visual system incorporates a three-dimensional skeletal descriptor of shape to determine an object's identity. Skeletal models not only provide a compact description of an object's global shape structure, but also provide a quantitative metric by which to compare the visual similarity between shapes. Our results showed that a model of skeletal similarity explained the greatest amount of variance in participants' object dissimilarity judgments when compared with other computational models of visual similarity (Experiment 1). Moreover, parametric changes to an object's skeleton led to proportional changes in perceived similarity, even when controlling for another model of structure (Experiment 2). Importantly, participants preferentially categorized objects by their skeletons across changes to local shape contours and non-accidental properties (Experiment 3). Our findings highlight the importance of skeletal structure in vision, not only as a shape descriptor, but also as a diagnostic cue of object identity.
20
Fleming RW, Schmidt F. Getting "fumpered": Classifying objects by what has been done to them. J Vis 2019; 19:15. [PMID: 30952166 DOI: 10.1167/19.4.15]
Abstract
Every object acquires its shape from some kind of generative process, such as manufacture, biological growth, or self-organization, in response to external forces. Inferring such generative processes from an observed shape is computationally challenging because a given process can lead to radically different shapes, and similar shapes can result from different generative processes. Here, we suggest that in some cases, generative processes endow objects with distinctive statistical features that observers can use to classify objects according to what has been done to them. We found that from the very first trials in an eight-alternative forced-choice classification task, observers were extremely good at classifying unfamiliar objects by the transformations that had shaped them. Further experiments show that the shape features underlying this ability are distinct from Euclidean shape similarity and that observers can separate and voluntarily respond to both aspects of objects. Our findings suggest that perceptual organization processes allow us to identify salient statistical shape features that are diagnostic of generative processes. By so doing, we can classify objects we have never seen before according to the processes that shaped them.
Affiliation(s)
- Roland W Fleming: Justus-Liebig-University Giessen, General Psychology, Gießen, Germany
- Filipp Schmidt: Justus-Liebig-University Giessen, General Psychology, Gießen, Germany

21
Destler N, Singh M, Feldman J. Shape discrimination along morph-spaces. Vision Res 2019; 158:189-199. [PMID: 30878276 DOI: 10.1016/j.visres.2019.03.002]
Abstract
We investigated the dimensions defining mental shape space, by measuring shape discrimination thresholds along "morph-spaces" defined by pairs of shapes. Given any two shapes, one can construct a morph-space by taking weighted averages of their boundary vertices (after normalization), creating a continuum of shapes ranging from the first shape to the second. Previous studies of morphs between highly familiar shape categories (e.g. truck and turkey) have shown elevated discrimination at category boundaries, reflecting a kind of "categorical perception" in shape space. Here, we use this technique to explore the underlying representation of unfamiliar shapes. Subjects were shown two shapes at nearby points along a morph-space, and asked to judge whether they were the same or different, with an adaptive procedure used to estimate discrimination thresholds at each point along the morph-space. We targeted several potentially important categorical distinctions, such as one- vs. two-part shapes, two- vs. three-part shapes, changes in symmetry structure, and other candidate distinctions. Observed discrimination thresholds showed substantial and systematic deviations from uniformity at different points along each shape continuum, meaning that subjects were consistently better at discriminating at certain points along each morph-space than at others. We introduce a shape similarity measure, based on Bayesian skeletal shape representations, which gives a good account of the observed variations in shape sensitivity.
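The morph-space construction the abstract describes (weighted averages of boundary vertices after normalization) can be sketched in a few lines. This is a minimal illustration assuming the two contours are already normalized and in point-to-point correspondence; the function name is hypothetical, not from the paper:

```python
def morph(shape_a, shape_b, w):
    """Linearly blend two contours with weight w in [0, 1].

    Both shapes are equal-length lists of (x, y) boundary vertices,
    assumed already normalized and brought into vertex correspondence
    (the paper's normalization step). w = 0 returns shape_a, w = 1
    returns shape_b, and intermediate w values sweep out the continuum
    between them.
    """
    if len(shape_a) != len(shape_b):
        raise ValueError("contours must have matching vertex counts")
    # Weighted average of corresponding boundary vertices.
    return [((1 - w) * ax + w * bx, (1 - w) * ay + w * by)
            for (ax, ay), (bx, by) in zip(shape_a, shape_b)]
```

Sampling `w` at fine steps then yields the morph-space along which discrimination thresholds were measured.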
Affiliation(s)
- Nathan Destler: Department of Psychology, Center for Cognitive Science, Rutgers University, New Brunswick, NJ, United States
- Manish Singh: Department of Psychology, Center for Cognitive Science, Rutgers University, New Brunswick, NJ, United States
- Jacob Feldman: Department of Psychology, Center for Cognitive Science, Rutgers University, New Brunswick, NJ, United States

22
Abstract
An intrinsic part of seeing objects is seeing how similar or different they are relative to one another. This experience requires that objects be mentally represented in a common format over which such comparisons can be carried out. What is that representational format? Objects could be compared in terms of their superficial features (e.g., degree of pixel-by-pixel overlap), but a more intriguing possibility is that they are compared on the basis of a deeper structure. One especially promising candidate that has enjoyed success in the computer vision literature is the shape skeleton: a geometric transformation that represents objects according to their inferred underlying organization. Despite several hints that shape skeletons are computed in human vision, it remains unclear how much they actually matter for subsequent performance. Here, we explore the possibility that shape skeletons help mediate the ability to extract visual similarity. Observers completed a same/different task in which two shapes could vary either in their skeletal structure (without changing superficial features such as size, orientation, and internal angular separation) or in large surface-level ways (without changing overall skeletal organization). Discrimination was better for skeletally dissimilar shapes: observers had difficulty appreciating even surprisingly large differences when those differences did not reorganize the underlying skeletons. This pattern also generalized beyond line drawings to 3-D volumes whose skeletons were less readily inferable from the shapes' visible contours. These results show how shape skeletons may influence the perception of similarity and, more generally, how they have important consequences for downstream visual processing.
23
Wilder J, Rezanejad M, Dickinson S, Siddiqi K, Jepson A, Walther DB. Local contour symmetry facilitates scene categorization. Cognition 2018; 182:307-317. [PMID: 30415132 DOI: 10.1016/j.cognition.2018.09.014]
Abstract
People are able to rapidly categorize briefly flashed images of real-world environments, even when they are reduced to line drawings. This setting allows for the study of time-limited perceptual grouping processes in the human visual system that are applicable to line drawings. Previous work (Wilder, Dickinson, Jepson, & Walther, 2018) showed that standard local features of individual contours, or junctions between contours, do not account for this rapid classification ability but, rather, the relative placement of these contours appeared to be important. Here we provide strong support for this observation by demonstrating that local ribbon symmetry between neighboring pairs of contours facilitates the categorization of complex real-world environments. To this end, we introduce a novel computational approach, based on the medial axis transform, for measuring the degree of local ribbon symmetry in a line drawing. We use this measure to separate the contour pixels for a given scene into the most ribbon symmetric half and the least ribbon symmetric half. We then show human observers the resulting half-images in a rapid-categorization experiment. Our results demonstrate that local ribbon symmetry facilitates the categorization of complex real-world environments. This is the first study of the role of local symmetry in inter-contour grouping for human scene classification. We conclude that local ribbon symmetry appears to play an important role in jump-starting the grouping of image content into meaningful units, even in flashed presentations.
24
Berdugo-Lattke ML, Gónzalez F, Rangel-Ch JO, Gómez F. P-type based dimensionality reduction for open contours of Colombian Páramo plant species. ECOL INFORM 2016. [DOI: 10.1016/j.ecoinf.2016.09.001]
25
Denisova K, Feldman J, Su X, Singh M. Investigating shape representation using sensitivity to part- and axis-based transformations. Vision Res 2016; 126:347-361. [PMID: 26325393 DOI: 10.1016/j.visres.2015.07.004]
Abstract
Part- and axis-based approaches organize shape representations in terms of simple parts and their spatial relationships. Shape transformations that alter qualitative part structure have been shown to be more detectable than those that preserve it. We compared sensitivity to various transformations that change quantitative properties of parts and their spatial relationships, while preserving qualitative part structure. Shape transformations involving changes in length, width, curvature, orientation and location were applied to a small part attached to a larger base of a two-part shape. Increment thresholds were estimated for each transformation using a 2IFC procedure. Thresholds were converted into common units of shape difference to enable comparisons across transformations. Higher sensitivity was consistently found for transformations involving a parameter of a single part (length, width, curvature) than those involving spatial relations between two parts (relative orientation and location), suggesting a single-part superiority effect. Moreover, sensitivity to shifts in part location - a biomechanically implausible shape transformation - was consistently poorest. The influence of region-based geometry was investigated via stereoscopic manipulation of figure and ground. Sensitivity was compared across positive parts (protrusions) and negative parts (indentations) for transformations involving a change in orientation or location. For changes in part orientation (biomechanically plausible), sensitivity was better for positive than negative parts; whereas for changes in part location (biomechanically implausible), no systematic difference was observed.
Affiliation(s)
- Kristina Denisova: Department of Psychology and Rutgers Center for Cognitive Science, Rutgers University, New Brunswick, NJ, United States
- Jacob Feldman: Department of Psychology and Rutgers Center for Cognitive Science, Rutgers University, New Brunswick, NJ, United States
- Xiaotao Su: Department of Psychology and Rutgers Center for Cognitive Science, Rutgers University, New Brunswick, NJ, United States
- Manish Singh: Department of Psychology and Rutgers Center for Cognitive Science, Rutgers University, New Brunswick, NJ, United States

26
Wilder J, Feldman J, Singh M. The role of shape complexity in the detection of closed contours. Vision Res 2015; 126:220-231. [PMID: 26505685 DOI: 10.1016/j.visres.2015.10.011]
Abstract
The detection of contours in noise has been extensively studied, but the detection of closed contours, such as the boundaries of whole objects, has received relatively little attention. Closed contours pose substantial challenges not present in the simple (open) case, because they form the outlines of whole shapes and thus take on a range of potentially important configural properties. In this paper we consider the detection of closed contours in noise as a probabilistic decision problem. Previous work on open contours suggests that contour complexity, quantified as the negative log probability (Description Length, DL) of the contour under a suitably chosen statistical model, impairs contour detectability; more complex (statistically surprising) contours are harder to detect. In this study we extended this result to closed contours, developing a suitable probabilistic model of whole shapes that gives rise to several distinct though interrelated measures of shape complexity. We asked subjects to detect either natural shapes (Exp. 1) or experimentally manipulated shapes (Exp. 2) embedded in noise fields. We found systematic effects of global shape complexity on detection performance, demonstrating how aspects of global shape and form influence the basic process of object detection.
Affiliation(s)
- John Wilder: Department of Computer Science, University of Toronto, Toronto, Canada
- Jacob Feldman: Department of Psychology, Center for Cognitive Science, Rutgers University - New Brunswick, USA
- Manish Singh: Department of Psychology, Center for Cognitive Science, Rutgers University - New Brunswick, USA

27
Abstract
It is well known that "smooth" chains of oriented elements (contours) are more easily detected amid background noise than more undulating (i.e., "less smooth") chains. Here, we develop a Bayesian framework for contour detection and show that it predicts that contour detection performance should decrease with the contour's complexity, quantified as the description length (DL; i.e., the negative logarithm of probability integrated along the contour). We tested this prediction in two experiments in which subjects were asked to detect simple open contours amid pixel noise. In Experiment 1, we demonstrate a consistent decline in performance with increasingly complex contours, as predicted by the Bayesian model. In Experiment 2, we confirmed that this effect is due to integrated complexity along the contour, and does not seem to depend on local stretches of linear structure. The results corroborate the probabilistic model of contours, and show how contour detection can be understood as a special case of a more general process: the identification of organized patterns in the environment.
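The DL measure defined above (negative log probability integrated along the contour) can be sketched under a simple assumed model: treat successive turning angles as draws from a zero-mean Gaussian, so smooth chains accumulate little surprisal and undulating ones accumulate a lot. The Gaussian turning-angle model and the `sigma` value are illustrative assumptions, not the paper's exact contour model:

```python
import math

def description_length(contour, sigma=0.3):
    """DL of an open contour as summed turning-angle surprisal.

    `contour` is an ordered list of (x, y) points. Each interior
    vertex's turning angle is scored under an assumed zero-mean
    Gaussian with standard deviation `sigma` (radians), and the
    negative log densities are accumulated along the chain.
    """
    def turning_angle(p, q, r):
        # Change in heading between segments p->q and q->r.
        a1 = math.atan2(q[1] - p[1], q[0] - p[0])
        a2 = math.atan2(r[1] - q[1], r[0] - q[0])
        d = a2 - a1
        # Wrap the difference into (-pi, pi].
        return math.atan2(math.sin(d), math.cos(d))

    dl = 0.0
    for p, q, r in zip(contour, contour[1:], contour[2:]):
        t = turning_angle(p, q, r)
        # Negative log of the Gaussian density at the observed turn.
        dl += 0.5 * (t / sigma) ** 2 + math.log(sigma * math.sqrt(2 * math.pi))
    return dl
```

Under this sketch a straight chain (all turning angles zero) receives a lower DL than a zigzag of the same length, matching the abstract's prediction that statistically surprising contours are harder to detect.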
28
Abstract
The primate brain successfully recognizes objects, even when they are partially occluded. To begin to elucidate the neural substrates of this perceptual capacity, we measured the responses of shape-selective neurons in visual area V4 while monkeys discriminated pairs of shapes under varying degrees of occlusion. We found that neuronal shape selectivity always decreased with increasing occlusion level, with some neurons being notably more robust to occlusion than others. The responses of neurons that maintained their selectivity across a wider range of occlusion levels were often sufficiently sensitive to support behavioral performance. Many of these same neurons were distinctively selective for the curvature of local boundary features and their shape tuning was well fit by a model of boundary curvature (curvature-tuned neurons). A significant subset of V4 neurons also signaled the animal's upcoming behavioral choices; these decision signals had short onset latencies that emerged progressively later for higher occlusion levels. The time course of the decision signals in V4 paralleled that of shape selectivity in curvature-tuned neurons: shape selectivity in curvature-tuned neurons, but not others, emerged earlier than the decision signals. These findings provide evidence for the involvement of contour-based mechanisms in the segmentation and recognition of partially occluded objects, consistent with psychophysical theory. Furthermore, they suggest that area V4 participates in the representation of the relevant sensory signals and the generation of decision signals underlying discrimination.
29
Abstract
A major challenge for visual recognition is to describe shapes flexibly enough to allow generalization over different views. Computer vision models have championed a potential solution in medial-axis shape skeletons—hierarchically arranged geometric structures that are robust to deformations like bending and stretching. In the experiments reported here, we exploited an old, unheralded, and exceptionally simple paradigm to reveal the presence and nature of shape skeletons in human vision. When participants independently viewed a shape on a touch-sensitive tablet computer and simply tapped the shape anywhere they wished, the aggregated touches formed the shape’s medial-axis skeleton. This pattern held across several shape variations, demonstrating profound and predictable influences of even subtle border perturbations and amodally filled-in regions. This phenomenon reveals novel properties of shape representation and demonstrates (in an unusually direct way) how deep and otherwise-hidden visual processes can directly control simple behaviors, even while observers are completely unaware of their existence.
30
Wagemans J, Feldman J, Gepshtein S, Kimchi R, Pomerantz JR, van der Helm PA, van Leeuwen C. A century of Gestalt psychology in visual perception: II. Conceptual and theoretical foundations. Psychol Bull 2012; 138:1218-52. [PMID: 22845750 PMCID: PMC3728284 DOI: 10.1037/a0029334]
Abstract
Our first review article (Wagemans et al., 2012) on the occasion of the centennial anniversary of Gestalt psychology focused on perceptual grouping and figure-ground organization. It concluded that further progress requires a reconsideration of the conceptual and theoretical foundations of the Gestalt approach, which is provided here. In particular, we review contemporary formulations of holism within an information-processing framework, allowing for operational definitions (e.g., integral dimensions, emergent features, configural superiority, global precedence, primacy of holistic/configural properties) and a refined understanding of its psychological implications (e.g., at the level of attention, perception, and decision). We also review 4 lines of theoretical progress regarding the law of Prägnanz: the brain's tendency to be attracted towards states corresponding to the simplest possible organization, given the available stimulation. The first considers the brain as a complex adaptive system and explains how self-organization solves the conundrum of trading between robustness and flexibility of perceptual states. The second specifies the economy principle in terms of optimization of neural resources, showing that elementary sensors working independently to minimize uncertainty can respond optimally at the system level. The third considers how Gestalt percepts (e.g., groups, objects) are optimal given the available stimulation, with optimality specified in Bayesian terms. Fourth, structural information theory explains how a Gestaltist visual system that focuses on internal coding efficiency yields external veridicality as a side effect. To answer the fundamental question of why things look as they do, a further synthesis of these complementary perspectives is required.
Affiliation(s)
- Johan Wagemans: University of Leuven (KU Leuven), Laboratory of Experimental Psychology, Tiensestraat 102, box 3711, BE-3000 Leuven, Belgium

31
Hartendorp MO, Van der Stigchel S, Wagemans J, Klugkist I, Postma A. The activation of alternative response candidates: when do doubts kick in? Acta Psychol (Amst) 2012; 139:38-45. [PMID: 22100134 DOI: 10.1016/j.actpsy.2011.10.013]
Abstract
In the current study, we investigated at which moment during visual object categorization alternative interpretations are most strongly activated. According to an early activation account, we are uncertain about how to interpret the visual information early in the categorization process. This uncertainty will vanish over time and therefore, the number of possible response candidates decreases over time. According to a late activation account, the visual information is categorized quickly, but after extensive viewing alternative interpretations become more strongly activated. Therefore, the number of possible response candidates increases over time. To increase perceptual uncertainty we used morphed figures composed of a dominant and nondominant object. The similarity rating between morphed figures and their nondominant object was taken as indicator for the activation of the nondominant response candidate: high similarity indicates that the nondominant object is relatively strongly activated as an alternative response candidate. Presentation times were varied in order to distinguish between the early and late activation account. Using a Bayesian model selection approach, we found support for the late activation account, but not for the early activation account. It thus seems that in a late stage of the categorization process the influence of the nondominant response candidate is strongest.