1
Cherian T, Arun SP. What do we see behind an occluder? Amodal completion of statistical properties in complex objects. Atten Percept Psychophys 2024:10.3758/s13414-024-02948-w. [PMID: 39461932 DOI: 10.3758/s13414-024-02948-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 07/30/2024] [Indexed: 10/28/2024]
Abstract
When a spiky object is occluded, we expect its spiky features to continue behind the occluder. Although many real-world objects contain complex features, it is unclear how more complex features are amodally completed and whether this process is automatic. To investigate this issue, we created pairs of displays with identical contour edges up to the point of occlusion, but with occluded portions exchanged. We then asked participants to search for oddball targets among distractors and asked whether relations between searches involving occluded displays would match better with relations between searches involving completions that are either globally consistent or inconsistent with the visible portions of these displays. Across two experiments involving simple and complex shapes, search times involving occluded displays matched better with those involving globally consistent compared with inconsistent displays. Analogous analyses on deep networks pretrained for object categorization revealed a similar pattern of results for simple but not complex shapes. Thus, deep networks seem to extrapolate simple occluded contours but not more complex contours. Taken together, our results show that amodal completion in humans is sophisticated and can be based on extrapolating global statistical properties.
Affiliation(s)
- Thomas Cherian
  - Centre for Neuroscience, Indian Institute of Science, Bengaluru, 560012, India
- S P Arun
  - Centre for Neuroscience, Indian Institute of Science, Bengaluru, 560012, India

2
Dillon MR. Divisive language. Behav Brain Sci 2024; 47:e124. [PMID: 38934439 DOI: 10.1017/s0140525x23003047] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/28/2024]
Abstract
What language devises, it might divide. By exploring the relations among the core geometries of the physical world, the abstract geometry of Euclid, and language, I give new insight into both the persistence of core knowledge into adulthood and our access to it through language. My extension of Spelke's language argument has implications for pedagogy, philosophy, and artificial intelligence.
Affiliation(s)
- Moira R Dillon
  - Department of Psychology, New York University, New York, NY, USA

3
Hafri A, Green EJ, Firestone C. Compositionality in visual perception. Behav Brain Sci 2023; 46:e277. [PMID: 37766604 DOI: 10.1017/s0140525x23001838] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 09/29/2023]
Abstract
Quilty-Dunn et al.'s wide-ranging defense of the Language of Thought Hypothesis (LoTH) argues that vision traffics in abstract, structured representational formats. We agree: Vision, like language, is compositional - just as words compose into phrases, many visual representations contain discrete constituents that combine in systematic ways. Here, we amass evidence extending this proposal, and explore its implications for how vision interfaces with the rest of the mind.
Affiliation(s)
- Alon Hafri
  - Department of Linguistics and Cognitive Science, University of Delaware, Newark, DE, USA; https://pal.lingcogsci.udel.edu/
- E J Green
  - Department of Linguistics and Philosophy, Massachusetts Institute of Technology, Cambridge, MA, USA; https://sites.google.com/site/greenedwinj/
- Chaz Firestone
  - Department of Psychological and Brain Sciences, Johns Hopkins University, Baltimore, MD, USA; https://perception.jhu.edu/

4
Wei L, Li X, Huang L, Liu Y, Hu L, Shen W, Ding Q, Liang P. An fMRI study of visual geometric shapes processing. Front Neurosci 2023; 17:1087488. [PMID: 37008223 PMCID: PMC10062448 DOI: 10.3389/fnins.2023.1087488] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/03/2022] [Accepted: 02/28/2023] [Indexed: 03/18/2023] Open
Abstract
Cross-modal correspondence has been consistently evidenced between shapes and other sensory attributes. Especially, the curvature of shapes may arouse the affective account, which may contribute to understanding the mechanism of cross-modal integration. Hence, the current study used the functional magnetic resonance imaging (fMRI) technique to examine brain activity’s specificity when people view circular and angular shapes. The circular shapes consisted of a circle and an ellipse, while the angular shapes consisted of a triangle and a star. Results show that the brain areas activated by circular shapes mainly involved the sub-occipital lobe, fusiform gyrus, sub and middle occipital gyrus, and cerebellar VI. The brain areas activated by angular shapes mainly involve the cuneus, middle occipital gyrus, lingual gyrus, and calcarine gyrus. The brain activation patterns of circular shapes did not differ significantly from those of angular shapes. Such a null finding was unexpected when previous cross-modal correspondence of shape curvature was considered. The different brain regions detected by circular and angular shapes and the potential explanations were discussed in the paper.
Affiliation(s)
- Liuqing Wei
  - Department of Psychology, Faculty of Education, Hubei University, Wuhan, China
  - Brain and Cognition Research Center, Faculty of Education, Hubei University, Wuhan, China
- Xueying Li
  - Department of Psychology, Faculty of Education, Hubei University, Wuhan, China
- Lina Huang
  - Imaging Department, Changshu No. 2 People’s Hospital, The Clinical Medical College Affiliated to Xuzhou Medical University, Changshu, China
- Yuansheng Liu
  - Department of Psychology, Faculty of Education, Hubei University, Wuhan, China
- Luming Hu
  - Department of Psychology, School of Arts and Sciences, Beijing Normal University, Zhuhai, China
- Wenbin Shen
  - Imaging Department, Changshu No. 2 People’s Hospital, The Clinical Medical College Affiliated to Xuzhou Medical University, Changshu, China
- Qingguo Ding
  - Imaging Department, Changshu No. 2 People’s Hospital, The Clinical Medical College Affiliated to Xuzhou Medical University, Changshu, China
  - *Correspondence: Qingguo Ding
- Pei Liang
  - Department of Psychology, Faculty of Education, Hubei University, Wuhan, China
  - Brain and Cognition Research Center, Faculty of Education, Hubei University, Wuhan, China
  - Imaging Department, Changshu No. 2 People’s Hospital, The Clinical Medical College Affiliated to Xuzhou Medical University, Changshu, China
  - *Correspondence: Pei Liang

5
Vannuscorps G, Galaburda A, Caramazza A. From intermediate shape-centered representations to the perception of oriented shapes: response to commentaries. Cogn Neuropsychol 2023; 40:71-94. [PMID: 37642330 DOI: 10.1080/02643294.2023.2250511] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/19/2023] [Revised: 07/14/2023] [Accepted: 07/31/2023] [Indexed: 08/31/2023]
Abstract
In this response paper, we start by addressing the main points made by the commentators on the target article's main theoretical conclusions: the existence and characteristics of the intermediate shape-centered representations (ISCRs) in the visual system, their emergence from edge detection mechanisms operating on different types of visual properties, and how they are eventually reunited in higher order frames of reference underlying conscious visual perception. We also address the much-commented issue of the possible neural mechanisms of the ISCRs. In the final section, we address more specific and general comments, questions, and suggestions which, albeit very interesting, were less directly focused on the main conclusions of the target paper.
Affiliation(s)
- Gilles Vannuscorps
  - Department of Psychology, Harvard University, Cambridge, MA, USA
  - Institute of Psychological Sciences, Université catholique de Louvain, Louvain-la-Neuve, Belgium
  - Institute of Neuroscience, Université catholique de Louvain, Louvain-la-Neuve, Belgium
  - Louvain Bionics, Université catholique de Louvain, Louvain-la-Neuve, Belgium
- Albert Galaburda
  - Department of Neurology, Harvard Medical School and Beth Israel Deaconess Medical Center, Boston, MA, USA
- Alfonso Caramazza
  - Department of Psychology, Harvard University, Cambridge, MA, USA
  - Center for Mind/Brain Sciences (CIMeC), Università degli Studi di Trento, Rovereto, Italy

6
Ayzenberg V, Simmons C, Behrmann M. Temporal asymmetries and interactions between dorsal and ventral visual pathways during object recognition. Cereb Cortex Commun 2023; 4:tgad003. [PMID: 36726794 PMCID: PMC9883614 DOI: 10.1093/texcom/tgad003] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/23/2022] [Revised: 12/30/2022] [Accepted: 01/02/2023] [Indexed: 01/15/2023] Open
Abstract
Despite their anatomical and functional distinctions, there is growing evidence that the dorsal and ventral visual pathways interact to support object recognition. However, the exact nature of these interactions remains poorly understood. Is the presence of identity-relevant object information in the dorsal pathway simply a byproduct of ventral input? Or, might the dorsal pathway be a source of input to the ventral pathway for object recognition? In the current study, we used high-density EEG (a technique with high temporal precision and spatial resolution sufficient to distinguish parietal and temporal lobes) to characterise the dynamics of dorsal and ventral pathways during object viewing. Using multivariate analyses, we found that category decoding in the dorsal pathway preceded that in the ventral pathway. Importantly, the dorsal pathway predicted the multivariate responses of the ventral pathway in a time-dependent manner, rather than the other way around. Together, these findings suggest that the dorsal pathway is a critical source of input to the ventral pathway for object recognition.
Affiliation(s)
- Vladislav Ayzenberg
  - Neuroscience Institute and Psychology Department, Carnegie Mellon University, Pittsburgh, PA 15213, USA
- Claire Simmons
  - School of Medicine, University of Pittsburgh, Pittsburgh, PA 15213, USA
- Marlene Behrmann
  - Neuroscience Institute and Psychology Department, Carnegie Mellon University, Pittsburgh, PA 15213, USA
  - Department of Ophthalmology, University of Pittsburgh, Pittsburgh, PA 15213, USA

7
Sablé-Meyer M, Ellis K, Tenenbaum J, Dehaene S. A language of thought for the mental representation of geometric shapes. Cogn Psychol 2022; 139:101527. [PMID: 36403385 DOI: 10.1016/j.cogpsych.2022.101527] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/22/2021] [Revised: 10/26/2022] [Accepted: 10/31/2022] [Indexed: 11/18/2022]
Abstract
In various cultures and at all spatial scales, humans produce a rich complexity of geometric shapes such as lines, circles or spirals. Here, we propose that humans possess a language of thought for geometric shapes that can produce line drawings as recursive combinations of a minimal set of geometric primitives. We present a programming language, similar to Logo, that combines discrete numbers and continuous integration to form higher-level structures based on repetition, concatenation and embedding, and we show that the simplest programs in this language generate the fundamental geometric shapes observed in human cultures. On the perceptual side, we propose that shape perception in humans involves searching for the shortest program that correctly draws the image (program induction). A consequence of this framework is that the mental difficulty of remembering a shape should depend on its minimum description length (MDL) in the proposed language. In two experiments, we show that encoding and processing of geometric shapes is well predicted by MDL. Furthermore, our hypotheses predict additive laws for the psychological complexity of repeated, concatenated or embedded shapes, which we confirm experimentally.
Affiliation(s)
- Mathias Sablé-Meyer
  - Unicog, CEA, INSERM, Université Paris-Saclay, NeuroSpin Center, 91191 Gif/Yvette, France; Collège de France, Université Paris-Sciences-Lettres (PSL), 75005 Paris, France
- Kevin Ellis
  - Cornell University, Ithaca, NY, United States
- Josh Tenenbaum
  - Massachusetts Institute of Technology, Cambridge, MA, United States
- Stanislas Dehaene
  - Unicog, CEA, INSERM, Université Paris-Saclay, NeuroSpin Center, 91191 Gif/Yvette, France; Collège de France, Université Paris-Sciences-Lettres (PSL), 75005 Paris, France

8
Sun Z, Firestone C. Beautiful on the inside: Aesthetic preferences and the skeletal complexity of shapes. Perception 2022; 51:904-918. [DOI: 10.1177/03010066221124872] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/03/2022]
Abstract
A plain, blank canvas does not look very beautiful; to make it aesthetically appealing requires adding structure and complexity. But how much structure is best? In other words, what is the relationship between beauty and complexity? It has long been hypothesized that complexity and beauty meet at a “sweet spot,” such that the most beautiful images are neither too simple nor too complex. Here, we take a novel experimental approach to this question, using an information-theoretic approach to object representation based on an internal “skeletal” structure. We algorithmically generated a library of two-dimensional polygons and manipulated their complexity by gradually smoothing out their features—essentially decreasing the amount of information in the objects. We then stylized these shapes as “paintings” by rendering them with artistic strokes, and “mounted” them on framed canvases hung in a virtual room. Participants were shown pairs of these mounted shapes (which possessed similar structures but varied in skeletal complexity) and chose which shape looked best by previewing each painting on the canvas. Experiment 1 revealed a “Goldilocks” effect: participants preferred paintings that were neither too simple nor too complex, such that moderately complex shapes were chosen as the most attractive paintings. Experiment 2 isolated the role of complexity per se: when the same shapes were scrambled (such that their structural complexity was undermined, while other visual features were preserved), the Goldilocks effect was dramatically diminished. These findings suggest a quadratic relationship between aesthetics and complexity in ways that go beyond previous measures of each and demonstrate the utility of information-theoretic approaches for exploring high-level aspects of visual experience.

9
Ayzenberg V, Behrmann M. Does the brain's ventral visual pathway compute object shape? Trends Cogn Sci 2022; 26:1119-1132. [PMID: 36272937 DOI: 10.1016/j.tics.2022.09.019] [Citation(s) in RCA: 19] [Impact Index Per Article: 9.5] [Received: 06/23/2022] [Revised: 09/22/2022] [Accepted: 09/26/2022] [Indexed: 11/11/2022]
Abstract
A rich behavioral literature has shown that human object recognition is supported by a representation of shape that is tolerant to variations in an object's appearance. Such 'global' shape representations are achieved by describing objects via the spatial arrangement of their local features, or structure, rather than by the appearance of the features themselves. However, accumulating evidence suggests that the ventral visual pathway - the primary substrate underlying object recognition - may not represent global shape. Instead, ventral representations may be better described as a basis set of local image features. We suggest that this evidence forces a reevaluation of the role of the ventral pathway in object perception and posits a broader network for shape perception that encompasses contributions from the dorsal pathway.
Affiliation(s)
- Vladislav Ayzenberg
  - Neuroscience Institute, Carnegie Mellon University, Pittsburgh, PA 15213, USA; Psychology Department, Carnegie Mellon University, Pittsburgh, PA 15213, USA
- Marlene Behrmann
  - Neuroscience Institute, Carnegie Mellon University, Pittsburgh, PA 15213, USA; Psychology Department, Carnegie Mellon University, Pittsburgh, PA 15213, USA; The Department of Ophthalmology, University of Pittsburgh, Pittsburgh, PA 15260, USA

10
Deane G. Machines That Feel and Think: The Role of Affective Feelings and Mental Action in (Artificial) General Intelligence. Artif Life 2022; 28:289-309. [PMID: 35881678 DOI: 10.1162/artl_a_00368] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/15/2023]
Abstract
What role do affective feelings (feelings/emotions/moods) play in adaptive behaviour? What are the implications of this for understanding and developing artificial general intelligence? Leading theoretical models of brain function are beginning to shed light on these questions. While artificial agents have excelled within narrowly circumscribed and specialised domains, domain-general intelligence has remained an elusive goal in artificial intelligence research. By contrast, humans and nonhuman animals are characterised by a capacity for flexible behaviour and general intelligence. In this article I argue that computational models of mental phenomena in predictive processing theories of the brain are starting to reveal the mechanisms underpinning domain-general intelligence in biological agents, and can inform the understanding and development of artificial general intelligence. I focus particularly on approaches to computational phenomenology in the active inference framework. Specifically, I argue that computational mechanisms of affective feelings in active inference (affective self-modelling) are revealing of how biological agents are able to achieve flexible behavioural repertoires and general intelligence. I argue that (i) affective self-modelling functions to "tune" organisms to the most tractable goals in the environmental context; and (ii) affective and agentic self-modelling is central to the capacity to perform mental actions in goal-directed imagination and creative cognition. I use this account as a basis to argue that general intelligence of the level and kind found in biological agents will likely require machines to be implemented with analogues of affective self-modelling.
Affiliation(s)
- George Deane
  - University of Edinburgh, School of Philosophy, Psychology, and Language Sciences

11
Izard V, Pica P, Spelke ES. Visual foundations of Euclidean geometry. Cogn Psychol 2022; 136:101494. [PMID: 35751917 DOI: 10.1016/j.cogpsych.2022.101494] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 03/25/2019] [Revised: 05/10/2022] [Accepted: 06/06/2022] [Indexed: 01/29/2023]
Abstract
Geometry defines entities that can be physically realized in space, and our knowledge of abstract geometry may therefore stem from our representations of the physical world. Here, we focus on Euclidean geometry, the geometry historically regarded as "natural". We examine whether humans possess representations describing visual forms in the same way as Euclidean geometry - i.e., in terms of their shape and size. One hundred and twelve participants from the U.S. (age 3-34 years), and 25 participants from the Amazon (age 5-67 years) were asked to locate geometric deviants in panels of 6 forms of variable orientation. Participants of all ages and from both cultures detected deviant forms defined in terms of shape or size, while only U.S. adults drew distinctions between mirror images (i.e. forms differing in "sense"). Moreover, irrelevant variations of sense did not disrupt the detection of a shape or size deviant, while irrelevant variations of shape or size did. At all ages and in both cultures, participants thus retained the same properties as Euclidean geometry in their analysis of visual forms, even in the absence of formal instruction in geometry. These findings show that representations of planar visual forms provide core intuitions on which humans' knowledge in Euclidean geometry could possibly be grounded.
Affiliation(s)
- Véronique Izard
  - Université Paris Cité, CNRS, Integrative Neuroscience and Cognition Center, F-75006 Paris, France
  - Department of Psychology, Harvard University, 33 Kirkland St, Cambridge, MA 02138, USA
- Pierre Pica
  - Instituto do Cérebro, Universidade Federal do Rio Grande do Norte, R. do Horto, Lagoa Nova, Natal, RN 59076-550, Brazil
  - UMR 7023, Structures Formelles du Langage, Université Paris 8, 2 rue de la Liberté, 93200 Saint-Denis, France
- Elizabeth S Spelke
  - Department of Psychology, Harvard University, 33 Kirkland St, Cambridge, MA 02138, USA; NSF-STC Center for Brains, Minds and Machines, 43 Vassar St, Cambridge, MA 02139, USA

12
Yousif SR. Redundancy and Reducibility in the Formats of Spatial Representations. Perspect Psychol Sci 2022; 17:1778-1793. [PMID: 35867333 DOI: 10.1177/17456916221077115] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/17/2022]
Abstract
Mental representations are the essence of cognition. Yet to understand how the mind works, one must understand not just the content of mental representations (i.e., what information is stored) but also the format of those representations (i.e., how that information is stored). But what does it mean for representations to be formatted? How many formats are there? Is it possible that the mind represents some pieces of information in multiple formats at once? To address these questions, I discuss a "case study" of representational format: the representation of spatial location. I review work (a) across species and across development, (b) across spatial scales, and (c) across levels of analysis (e.g., high-level cognitive format vs. low-level neural format). Along the way, I discuss the possibility that the same information may be organized in multiple formats simultaneously (e.g., that locations may be represented in both Cartesian and polar coordinates). Ultimately, I argue that seemingly "redundant" formats may support the flexible spatial behavior observed in humans and that researchers should approach the study of all mental representations with this possibility in mind.

13
Shape oriented object recognition on grasp using features from enclosure based exploratory procedure. Int J Intell Robot Appl 2022. [DOI: 10.1007/s41315-022-00244-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/18/2022]

14
Ayzenberg V, Behrmann M. The Dorsal Visual Pathway Represents Object-Centered Spatial Relations for Object Recognition. J Neurosci 2022; 42:4693-4710. [PMID: 35508386 PMCID: PMC9186804 DOI: 10.1523/jneurosci.2257-21.2022] [Citation(s) in RCA: 17] [Impact Index Per Article: 8.5] [Received: 11/12/2021] [Revised: 04/19/2022] [Accepted: 04/21/2022] [Indexed: 11/21/2022] Open
Abstract
Although there is mounting evidence that input from the dorsal visual pathway is crucial for object processes in the ventral pathway, the specific functional contributions of dorsal cortex to these processes remain poorly understood. Here, we hypothesized that dorsal cortex computes the spatial relations among an object's parts, a process crucial for forming global shape percepts, and transmits this information to the ventral pathway to support object categorization. Using fMRI with human participants (females and males), we discovered regions in the intraparietal sulcus (IPS) that were selectively involved in computing object-centered part relations. These regions exhibited task-dependent functional and effective connectivity with ventral cortex, and were distinct from other dorsal regions, such as those representing allocentric relations, 3D shape, and tools. In a subsequent experiment, we found that the multivariate response of posterior (p)IPS, defined on the basis of part-relations, could be used to decode object category at levels comparable to ventral object regions. Moreover, mediation and multivariate effective connectivity analyses further suggested that IPS may account for representations of part relations in the ventral pathway. Together, our results highlight specific contributions of the dorsal visual pathway to object recognition. We suggest that dorsal cortex is a crucial source of input to the ventral pathway and may support the ability to categorize objects on the basis of global shape.
Significance Statement
Humans categorize novel objects rapidly and effortlessly. Such categorization is achieved by representing an object's global shape structure, that is, the relations among object parts. Yet, despite their importance, it is unclear how part relations are represented neurally. Here, we hypothesized that object-centered part relations may be computed by the dorsal visual pathway, which is typically implicated in visuospatial processing. Using fMRI, we identified regions selective for the part relations in dorsal cortex. We found that these regions can support object categorization, and even mediate representations of part relations in the ventral pathway, the region typically thought to support object categorization. Together, these findings shed light on the broader network of brain regions that support object categorization.
Affiliation(s)
- Vladislav Ayzenberg
  - Neuroscience Institute and Psychology Department, Carnegie Mellon University, Pittsburgh, PA 15213
- Marlene Behrmann
  - Neuroscience Institute and Psychology Department, Carnegie Mellon University, Pittsburgh, PA 15213

15
Ayzenberg V, Lourenco S. Perception of an object's global shape is best described by a model of skeletal structure in human infants. eLife 2022; 11:e74943. [PMID: 35612898 PMCID: PMC9132572 DOI: 10.7554/elife.74943] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 10/22/2021] [Accepted: 05/09/2022] [Indexed: 11/13/2022] Open
Abstract
Categorization of everyday objects requires that humans form representations of shape that are tolerant to variations among exemplars. Yet, how such invariant shape representations develop remains poorly understood. By comparing human infants (6-12 months; N=82) to computational models of vision using comparable procedures, we shed light on the origins and mechanisms underlying object perception. Following habituation to a never-before-seen object, infants classified other novel objects across variations in their component parts. Comparisons to several computational models of vision, including models of high-level and low-level vision, revealed that infants' performance was best described by a model of shape based on the skeletal structure. Interestingly, infants outperformed a range of artificial neural network models, selected for their massive object experience and biological plausibility, under the same conditions. Altogether, these findings suggest that robust representations of shape can be formed with little language or object experience by relying on the perceptually invariant skeletal structure.
Affiliation(s)
- Stella Lourenco
  - Department of Psychology, Emory University, Atlanta, United States

16
Superordinate Categorization Based on the Perceptual Organization of Parts. Brain Sci 2022; 12:brainsci12050667. [PMID: 35625053 PMCID: PMC9139997 DOI: 10.3390/brainsci12050667] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/10/2022] [Revised: 05/13/2022] [Accepted: 05/17/2022] [Indexed: 12/10/2022] Open
Abstract
Plants and animals are among the most behaviorally significant superordinate categories for humans. Visually assigning objects to such high-level classes is challenging because highly distinct items must be grouped together (e.g., chimpanzees and geckos) while more similar items must sometimes be separated (e.g., stick insects and twigs). As both animals and plants typically possess complex multi-limbed shapes, the perceptual organization of shape into parts likely plays a crucial role in identifying them. Here, we identify a number of distinctive growth characteristics that affect the spatial arrangement and properties of limbs, yielding useful cues for differentiating plants from animals. We developed a novel algorithm based on shape skeletons to create many novel object pairs that differ in their part structure but are otherwise very similar. We found that particular part organizations cause stimuli to look systematically more like plants or animals. We then generated another 110 sequences of shapes morphing from animal- to plant-like appearance by modifying three aspects of part structure: sprouting parts, curvedness of parts, and symmetry of part pairs. We found that all three parameters correlated strongly with human animal/plant judgments. Together, our findings suggest that subtle changes in the properties and organization of parts can provide powerful cues in superordinate categorization.

17
Hart Y, Mahadevan L, Dillon MR. Euclid's Random Walk: Developmental Changes in the Use of Simulation for Geometric Reasoning. Cogn Sci 2022; 46:e13070. [PMID: 35085405 DOI: 10.1111/cogs.13070] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 11/20/2020] [Revised: 09/04/2021] [Accepted: 11/10/2021] [Indexed: 01/29/2023]
Abstract
Euclidean geometry has formed the foundation of architecture, science, and technology for millennia, yet the development of humans' intuitive reasoning about Euclidean geometry is not well understood. The present study explores the cognitive processes and representations that support the development of humans' intuitive reasoning about Euclidean geometry. One-hundred-twenty-five 7- to 12-year-old children and 30 adults completed a localization task in which they visually extrapolated missing parts of fragmented planar triangles and a reasoning task in which they answered verbal questions about the general properties of planar triangles. While basic Euclidean principles guided even young children's visual extrapolations, only older children and adults reasoned about triangles in ways that were consistent with Euclidean geometry. Moreover, a relation between visual extrapolation and reasoning appeared only in older children and adults. Reasoning consistent with Euclidean geometry may thus emerge when children abandon incorrect, axiomatic-based reasoning strategies and come to reason using mental simulations of visual extrapolations.
Affiliation(s)
- Yuval Hart
- Department of Psychology, The Hebrew University of Jerusalem; Paulson School of Engineering and Applied Sciences, Harvard University
| | - L Mahadevan
- Paulson School of Engineering and Applied Sciences, Harvard University; Department of Physics, Harvard University; Center for Brain Science, Harvard University; Department of Organismic and Evolutionary Biology, Harvard University
| | | |
|
18
|
Wilder J, Rezanejad M, Dickinson S, Siddiqi K, Jepson A, Walther DB. Neural correlates of local parallelism during naturalistic vision. PLoS One 2022; 17:e0260266. [PMID: 35061699 PMCID: PMC8782314 DOI: 10.1371/journal.pone.0260266] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/12/2021] [Accepted: 11/07/2021] [Indexed: 11/18/2022] Open
Abstract
Human observers can rapidly perceive complex real-world scenes. Grouping visual elements into meaningful units is an integral part of this process. Yet, so far, the neural underpinnings of perceptual grouping have only been studied with simple lab stimuli. We here uncover the neural mechanisms of one important perceptual grouping cue, local parallelism. Using a new, image-computable algorithm for detecting local symmetry in line drawings and photographs, we manipulated the local parallelism content of real-world scenes. We decoded scene categories from patterns of brain activity obtained via functional magnetic resonance imaging (fMRI) in 38 human observers while they viewed the manipulated scenes. Decoding was significantly more accurate for scenes containing strong local parallelism compared to weak local parallelism in the parahippocampal place area (PPA), indicating a central role of parallelism in scene perception. To investigate the origin of the parallelism signal we performed a model-based fMRI analysis of the public BOLD5000 dataset, looking for voxels whose activation time course matches that of the locally parallel content of the 4916 photographs viewed by the participants in the experiment. We found a strong relationship with average local symmetry in visual areas V1-4, PPA, and retrosplenial cortex (RSC). Notably, the parallelism-related signal peaked first in V4, suggesting V4 as the site for extracting parallelism from the visual input. We conclude that local parallelism is a perceptual grouping cue that influences neuronal activity throughout the visual hierarchy, presumably starting at V4. Parallelism plays a key role in the representation of scene categories in PPA.
Affiliation(s)
| | - Morteza Rezanejad
- University of Toronto, Toronto, Canada
- McGill University, Montreal, Canada
| | - Sven Dickinson
- University of Toronto, Toronto, Canada
- Samsung Toronto AI Research Center, Toronto, Canada
- Vector Institute, Toronto, Canada
| | | | - Allan Jepson
- University of Toronto, Toronto, Canada
- Samsung Toronto AI Research Center, Toronto, Canada
| | | |
|
19
|
|
20
|
Ayzenberg V, Kamps FS, Dilks DD, Lourenco SF. Skeletal representations of shape in the human visual cortex. Neuropsychologia 2022; 164:108092. [PMID: 34801519 PMCID: PMC9840386 DOI: 10.1016/j.neuropsychologia.2021.108092] [Citation(s) in RCA: 12] [Impact Index Per Article: 6.0] [Received: 05/22/2021] [Revised: 11/07/2021] [Accepted: 11/17/2021] [Indexed: 01/17/2023]
Abstract
Shape perception is crucial for object recognition. However, it remains unknown exactly how shape information is represented and used by the visual system. Here, we tested the hypothesis that the visual system represents object shape via a skeletal structure. Using functional magnetic resonance imaging (fMRI) and representational similarity analysis (RSA), we found that a model of skeletal similarity explained significant unique variance in the response profiles of V3 and LO. Moreover, the skeletal model remained predictive in these regions even when controlling for other models of visual similarity that approximate low- to high-level visual features (i.e., Gabor-jet, GIST, HMAX, and AlexNet), and across different surface forms, a manipulation that altered object contours while preserving the underlying skeleton. Together, these findings shed light on shape processing in human vision, as well as the computational properties of V3 and LO. We discuss how these regions may support two putative roles of shape skeletons: namely, perceptual organization and object recognition.
Affiliation(s)
- Vladislav Ayzenberg
- Department of Psychology, Carnegie Mellon University, USA. Corresponding author: V. Ayzenberg
| | - Frederik S. Kamps
- Department of Brain and Cognitive Sciences, Massachusetts Institute of Technology, USA
| | | | - Stella F. Lourenco
- Department of Psychology, Emory University, USA. Corresponding author: S.F. Lourenco
| |
|
21
|
Aizenman AM, Ehinger KA, Wick FA, Micheletto R, Park J, Jurgensen L, Wolfe JM. Hiding the Rabbit: Using a genetic algorithm to investigate shape guidance in visual search. J Vis 2022; 22:7. [PMID: 35024760 PMCID: PMC8762685 DOI: 10.1167/jov.22.1.7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/24/2022] Open
Abstract
During visual search, attention is guided by specific features, including shape. Our understanding of shape guidance is limited to specific attributes (closures and line terminations) that do not fully explain the richness of preattentive shape processing. We used a novel genetic algorithm method to explore shape space and to stimulate hypotheses about shape guidance. Initially, observers searched for targets among 12 random distractors defined, in radial frequency space, by the amplitude and phase of 10 radial frequencies. Reaction time (RT) was the measure of “fitness.” To evolve toward an easier search task, distractors with faster RTs survived to the next generation, “mated,” and produced offspring (new distractors for the next generation of search). To evolve a harder search, surviving distractors were those yielding longer RTs. Within eight generations of evolution, the method succeeds in producing visual searches either harder or easier than the starting search. In radial frequency space, easy distractors evolve amplitude × frequency spectra that are dissimilar to the target, whereas hard distractors evolve spectra that are more similar to the target. This method also works with naturally shaped targets (e.g., rabbit silhouettes). Interestingly, the most inefficient distractors featured a combination of a body and ear distractors that did not resemble the rabbit (visually or in spectrum). Adding extra ears to these distractors did not impact the search spectrally and instead made it easier to confirm a rabbit, once it was found. In general, these experiments show that shapes that are clearly distinct when attended are similar to each other preattentively.
Affiliation(s)
| | | | - Farahnaz A Wick
- Harvard Medical School, Cambridge, MA, USA; Brigham and Women's Hospital, Cambridge, MA, USA
| | | | | | | | - Jeremy M Wolfe
- Harvard Medical School, Cambridge, MA, USA; Brigham and Women's Hospital, Cambridge, MA, USA
| |
|
22
|
Ericson JD, Albert WS, Duane JN. Political affiliation moderates subjective interpretations of COVID-19 graphs. Big Data & Society 2022; 9:20539517221080678. [PMID: 35281347 PMCID: PMC8899844 DOI: 10.1177/20539517221080678] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Indexed: 05/20/2023]
Abstract
We examined the relationship between political affiliation, perceptual (percentage, slope) estimates, and subjective judgements of disease prevalence and mortality across three chart types. An online survey (N = 787) exposed separate groups of participants to charts displaying (a) COVID-19 data or (b) COVID-19 data labeled 'Influenza (Flu)'. Block 1 examined responses to cross-sectional mortality data (bar graphs, treemaps); results revealed that perceptual estimates comparing mortality in two countries were similar across political affiliations and chart types (all ps > .05), while subjective judgements revealed a disease x political party interaction (p < .05). Although Democrats and Republicans provided similar proportion estimates, Democrats interpreted mortality to be higher than Republicans; Democrats also interpreted mortality to be higher for COVID-19 than Influenza. Block 2 examined responses to time series (line graphs); Democrats and Republicans estimated greater slopes for COVID-19 trend lines than Influenza lines (p < .001); subjective judgements revealed a disease x political party interaction (p < .05). Democrats and Republicans indicated similar subjective rates of change for COVID-19 trends, and Democrats indicated lower subjective rates of change for Influenza than in any other condition. Thus, while Democrats and Republicans saw the graphs similarly in terms of percentages and line slopes, their subjective interpretations diverged. While we may see graphs of infectious disease data similarly from a purely mathematical or geometric perspective, our political affiliations may moderate how we subjectively interpret the data.
|
23
|
Vannuscorps G, Galaburda A, Caramazza A. The form of reference frames in vision: The case of intermediate shape-centered representations. Neuropsychologia 2021; 162:108053. [PMID: 34624257 DOI: 10.1016/j.neuropsychologia.2021.108053] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Received: 04/13/2021] [Revised: 08/10/2021] [Accepted: 10/01/2021] [Indexed: 12/01/2022]
Abstract
Although a great deal is known about the early sensory and the later, perceptual, stages of visual processing, far less is known about the nature of intermediate representational units and reference frames. Progress in understanding intermediate levels of representations in vision is hindered by the complexity of interactions among multiple levels of representation in the visual system, making it difficult to isolate and study the nature of each particular level. Nature occasionally provides the opportunity to peer inside complex systems by isolating components of a system through accidental damage or genetic modification of neural components. We have recently reported the case of a young woman who perceives 2D bounded regions of space as if they were plane-rotated by 90, 180 or 270° around their center, mirrored across their own axes, or both. This suggested that an intermediate stage of processing consists in representing mutually exclusive 2D bounded regions extracted from the retinal image in their own "shape-centered" perceptual frame. We proposed to refer to this level of representation as "intermediate shape-centered representation" (ISCR). Here, we used Davida's pattern of errors across 9 experiments as a tool for specifying in greater detail the geometrical properties of the reference frame in which elongated and/or symmetrical shapes are represented at the level of the ISCR. The nature of Davida's errors in these experiments suggests that ISCRs are represented in reference frames composed of orthogonal axes aligned with and centered on the most elongated segment of elongated shapes and, for symmetrical shapes deprived of a straight segment, aligned with their axis of symmetry, and centered on their centroid.
Affiliation(s)
- G Vannuscorps
- Department of Psychology, Harvard University, Cambridge, MA, 02138, USA; Institute of Psychological Sciences, Université Catholique de Louvain, 1348, Belgium; Institute of Neuroscience, Université Catholique de Louvain, 1348, Belgium; Louvain Bionics, Université Catholique de Louvain, 1348, Belgium.
| | - A Galaburda
- Department of Neurology, Harvard Medical School and Beth Israel Deaconess Medical Center, Boston, MA, 02215, USA
| | - A Caramazza
- Department of Psychology, Harvard University, Cambridge, MA, 02138, USA; Center for Mind/Brain Sciences (CIMeC), Università Degli Studi di Trento, Rovereto, 38068, Italy
| |
|
24
|
Pan Y, Yang H, Li M, Zhang J, Cui L. Grouping strategies in numerosity perception between intrinsic and extrinsic grouping cues. Sci Rep 2021; 11:17605. [PMID: 34475472 PMCID: PMC8413425 DOI: 10.1038/s41598-021-96944-x] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Received: 04/30/2021] [Accepted: 08/17/2021] [Indexed: 02/07/2023] Open
Abstract
The number of items in an array can be quickly and accurately estimated by dividing the array into subgroups, in a strategy termed "groupitizing." For example, when memorizing a telephone number, it is better to do so by dividing the number into several segments. Different forms of visual grouping can affect the precision of the enumeration of a large set of items. Previous studies have found that when groupitizing, enumeration precision is improved by grouping arrays using visual proximity and color similarity. Based on Gestalt theory, Palmer (Cognit Psychol 24:436, 1992) divided perceptual grouping into intrinsic (e.g., proximity, similarity) and extrinsic (e.g., connectedness, common region) principles. Studies have investigated groupitizing effects on intrinsic grouping. However, to the best of our knowledge, no study has explored groupitizing effects for extrinsic grouping cues. Therefore, this study explored whether extrinsic grouping cues differed from intrinsic grouping cues for groupitizing effects in numerosity perception. The results showed that both extrinsic and intrinsic grouping cues improved enumeration precision. However, extrinsic grouping was more accurate in terms of the sensory precision of the numerosity perception.
Affiliation(s)
- Yun Pan
- Key Laboratory of Basic Psychological and Cognitive Neuroscience, School of Psychology, Guizhou Normal University, Guiyang, China.
| | - Huanyu Yang
- Key Laboratory of Basic Psychological and Cognitive Neuroscience, School of Psychology, Guizhou Normal University, Guiyang, China.
- Education School, Yunnan University of Business Management, Kunming, China.
| | - Mengmeng Li
- Key Laboratory of Basic Psychological and Cognitive Neuroscience, School of Psychology, Guizhou Normal University, Guiyang, China
| | - Jian Zhang
- Key Laboratory of Basic Psychological and Cognitive Neuroscience, School of Psychology, Guizhou Normal University, Guiyang, China
| | - Lihua Cui
- Education School, Yunnan University of Business Management, Kunming, China
| |
|
25
|
Baker N, Kellman PJ. Constant curvature modeling of abstract shape representation. PLoS One 2021; 16:e0254719. [PMID: 34339436 PMCID: PMC8328290 DOI: 10.1371/journal.pone.0254719] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/23/2020] [Accepted: 07/01/2021] [Indexed: 11/19/2022] Open
Abstract
How abstract shape is perceived and represented poses crucial unsolved problems in human perception and cognition. Recent findings suggest that the visual system may encode contours as sets of connected constant curvature segments. Here we describe a model for how the visual system might recode a set of boundary points into a constant curvature representation. The model includes two free parameters that relate to the degree to which the visual system encodes shapes with high fidelity vs. the importance of simplicity in shape representations. We conducted two experiments to estimate these parameters empirically. Experiment 1 tested the limits of observers’ ability to discriminate a contour made up of two constant curvature segments from one made up of a single constant curvature segment. Experiment 2 tested observers’ ability to discriminate contours generated from cubic splines (which, mathematically, have no constant curvature segments) from constant curvature approximations of the contours, generated at various levels of precision. Results indicated a clear transition point at which discrimination becomes possible. The results were used to fix the two parameters in our model. In Experiment 3, we tested whether outputs from our parameterized model were predictive of perceptual performance in a shape recognition task. We generated shape pairs that had matched physical similarity but differed in representational similarity (i.e., the number of segments needed to describe the shapes) as assessed by our model. We found that pairs of shapes that were more representationally dissimilar were also easier to discriminate in a forced choice, same/different task. The results of these studies provide evidence for constant curvature shape representation in human visual perception and provide a testable model for how abstract shape descriptions might be encoded.
Affiliation(s)
- Nicholas Baker
- Department of Psychology, University of California Los Angeles, Los Angeles, California, United States of America
| | - Philip J. Kellman
- Department of Psychology, University of California Los Angeles, Los Angeles, California, United States of America
| |
|
26
|
Baker N, Garrigan P, Kellman PJ. Constant curvature segments as building blocks of 2D shape representation. J Exp Psychol Gen 2021; 150:1556-1580. [PMID: 33332142 PMCID: PMC8324180 DOI: 10.1037/xge0001007] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Indexed: 11/08/2022]
Abstract
How the visual system represents shape, and how shape representations might be computed by neural mechanisms, are fundamental and unanswered questions. Here, we investigated the hypothesis that 2-dimensional (2D) contour shapes are encoded structurally, as sets of connected constant curvature segments. We report 3 experiments investigating constant curvature segments as fundamental units of contour shape representations in human perception. Our results showed better performance in a path detection paradigm for constant curvature targets, as compared with locally matched targets that lacked this global regularity (Experiment 1), and that participants can learn to segment contours into constant curvature parts with different curvature values, but not into similarly different parts with linearly increasing curvatures (Experiment 2). We propose a neurally plausible model of contour shape representation based on constant curvature, built from oriented units known to exist in early cortical areas, and we confirmed the model's prediction that changes to the angular extent of a segment will be easier to detect than changes to relative curvature (Experiment 3). Together, these findings suggest the human visual system is specially adapted to detect and encode regions of constant curvature and support the notion that constant curvature segments are the building blocks from which abstract contour shape representations are composed. (PsycInfo Database Record (c) 2021 APA, all rights reserved).
Affiliation(s)
- Nicholas Baker
- Department of Psychology, University of California, Los Angeles
| | | | | |
|
27
|
Ciccione L, Dehaene S. Can humans perform mental regression on a graph? Accuracy and bias in the perception of scatterplots. Cogn Psychol 2021; 128:101406. [PMID: 34214734 DOI: 10.1016/j.cogpsych.2021.101406] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Received: 01/06/2021] [Revised: 06/17/2021] [Accepted: 06/18/2021] [Indexed: 11/28/2022]
Abstract
Despite the widespread use of graphs, little is known about how fast and how accurately we can extract information from them. Through a series of four behavioral experiments, we characterized human performance in "mental regression", i.e. the perception of statistical trends from scatterplots. When presented with a noisy scatterplot, even as briefly as 100 ms, human adults could accurately judge if it was increasing or decreasing, fit a regression line, and extrapolate outside the original data range, for both linear and non-linear functions. Performance was highly consistent across those three tasks of trend judgment, line fitting and extrapolation. Participants' linear trend judgments took into account the slope, the noise, and the number of data points, and were tightly correlated with the t-test classically used to evaluate the significance of a linear regression. However, they overestimated the absolute value of the regression slope. This bias was inconsistent with ordinary least squares (OLS) regression, which minimizes the sum of square deviations, but consistent with the use of Deming regression, which treats the x and y axes symmetrically and minimizes the Euclidean distance to the fitting line. We speculate that this fast but biased perception of scatterplots may be based on a "neuronal recycling" of the human visual capacity to identify the medial axis of a shape.
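The contrast this abstract draws between ordinary least squares and Deming regression can be made concrete with a small numerical sketch. The Python below is illustrative only and is not the authors' analysis code: the function names `ols_slope` and `deming_slope`, the synthetic data, and the choice `delta = 1.0` (equal error variance on both axes, i.e. orthogonal regression) are all assumptions introduced here. It shows that, for the same noisy scatterplot, the symmetric fit yields a slope at least as steep in absolute value as the OLS fit, which is the direction of the bias the abstract reports.

```python
import math
import random


def _centered_sums(xs, ys):
    """Return (sxx, syy, sxy): sums of squares and cross-products about the means."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxx, syy, sxy


def ols_slope(xs, ys):
    """Slope of the y-on-x least-squares line (minimizes vertical deviations)."""
    sxx, _, sxy = _centered_sums(xs, ys)
    return sxy / sxx


def deming_slope(xs, ys, delta=1.0):
    """Slope of the Deming regression line.

    delta is the assumed ratio of y-error variance to x-error variance;
    delta = 1.0 gives orthogonal regression, which minimizes the
    perpendicular (Euclidean) distance from the points to the line.
    Requires a non-zero covariance between xs and ys.
    """
    sxx, syy, sxy = _centered_sums(xs, ys)
    diff = syy - delta * sxx
    return (diff + math.sqrt(diff ** 2 + 4.0 * delta * sxy ** 2)) / (2.0 * sxy)


# Hypothetical noisy scatterplot: true slope 0.5 plus Gaussian noise on y.
random.seed(0)
xs = [i / 10 for i in range(50)]
ys = [0.5 * x + random.gauss(0.0, 1.0) for x in xs]

print("OLS slope:   ", ols_slope(xs, ys))
print("Deming slope:", deming_slope(xs, ys))
```

With noise-free data the two estimates coincide; they diverge as scatter increases, because the orthogonal fit always lies between the y-on-x line and the x-on-y line.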
Affiliation(s)
- Lorenzo Ciccione
- University Paris Sciences Lettres (PSL), 60 rue Mazarine, 75006 Paris, France; Cognitive Neuroimaging Unit, CEA, INSERM, Université Paris-Saclay, NeuroSpin Center, 91191 Gif/Yvette, France; Collège de France, Université Paris Sciences Lettres (PSL), 11 Place Marcelin Berthelot, 75005 Paris, France.
| | - Stanislas Dehaene
- Cognitive Neuroimaging Unit, CEA, INSERM, Université Paris-Saclay, NeuroSpin Center, 91191 Gif/Yvette, France; Collège de France, Université Paris Sciences Lettres (PSL), 11 Place Marcelin Berthelot, 75005 Paris, France
| |
|
28
|
Morgenstern Y, Hartmann F, Schmidt F, Tiedemann H, Prokott E, Maiello G, Fleming RW. An image-computable model of human visual shape similarity. PLoS Comput Biol 2021; 17:e1008981. [PMID: 34061825 PMCID: PMC8195351 DOI: 10.1371/journal.pcbi.1008981] [Citation(s) in RCA: 11] [Impact Index Per Article: 3.7] [Received: 09/25/2020] [Revised: 06/11/2021] [Accepted: 04/19/2021] [Indexed: 11/19/2022] Open
Abstract
Shape is a defining feature of objects, and human observers can effortlessly compare shapes to determine how similar they are. Yet, to date, no image-computable model can predict how visually similar or different shapes appear. Such a model would be an invaluable tool for neuroscientists and could provide insights into computations underlying human shape perception. To address this need, we developed a model (‘ShapeComp’), based on over 100 shape features (e.g., area, compactness, Fourier descriptors). When trained to capture the variance in a database of >25,000 animal silhouettes, ShapeComp accurately predicts human shape similarity judgments between pairs of shapes without fitting any parameters to human data. To test the model, we created carefully selected arrays of complex novel shapes using a Generative Adversarial Network trained on the animal silhouettes, which we presented to observers in a wide range of tasks. Our findings show that incorporating multiple ShapeComp dimensions facilitates the prediction of human shape similarity across a small number of shapes, and also captures much of the variance in the multiple arrangements of many shapes. ShapeComp outperforms both conventional pixel-based metrics and state-of-the-art convolutional neural networks, and can also be used to generate perceptually uniform stimulus sets, making it a powerful tool for investigating shape and object representations in the human brain. The ability to describe and compare shapes is crucial in many scientific domains from visual object recognition to computational morphology and computer graphics. Across disciplines, considerable effort has been devoted to the study of shape and its influence on object recognition, yet an important stumbling block is the quantitative characterization of shape similarity. 
Here we develop a psychophysically validated model that takes as input an object’s shape boundary and provides a high-dimensional output that can be used for predicting visual shape similarity. With this precise control of shape similarity, the model’s description of shape is a powerful tool that can be used across the neurosciences and artificial intelligence to test the role of shape in perception and the brain.
Affiliation(s)
- Yaniv Morgenstern
- Department of Experimental Psychology, Justus-Liebig University Giessen, Giessen, Germany
| | - Frieder Hartmann
- Department of Experimental Psychology, Justus-Liebig University Giessen, Giessen, Germany
| | - Filipp Schmidt
- Department of Experimental Psychology, Justus-Liebig University Giessen, Giessen, Germany
| | - Henning Tiedemann
- Department of Experimental Psychology, Justus-Liebig University Giessen, Giessen, Germany
| | - Eugen Prokott
- Department of Experimental Psychology, Justus-Liebig University Giessen, Giessen, Germany
| | - Guido Maiello
- Department of Experimental Psychology, Justus-Liebig University Giessen, Giessen, Germany
| | - Roland W. Fleming
- Department of Experimental Psychology, Justus-Liebig University Giessen, Giessen, Germany
- Center for Mind, Brain and Behavior (CMBB), University of Marburg and Justus Liebig University Giessen, Giessen, Germany
| |
|
29
|
Sun Z, Firestone C. Curious Objects: How Visual Complexity Guides Attention and Engagement. Cogn Sci 2021; 45:e12933. [PMID: 33873259 DOI: 10.1111/cogs.12933] [Citation(s) in RCA: 11] [Impact Index Per Article: 3.7] [Received: 11/06/2019] [Revised: 12/10/2020] [Accepted: 12/15/2020] [Indexed: 11/26/2022]
Abstract
Some things look more complex than others. For example, a crenulate and richly organized leaf may seem more complex than a plain stone. What is the nature of this experience-and why do we have it in the first place? Here, we explore how object complexity serves as an efficiently extracted visual signal that the object merits further exploration. We algorithmically generated a library of geometric shapes and determined their complexity by computing the cumulative surprisal of their internal skeletons-essentially quantifying the "amount of information" within each shape-and then used this approach to ask new questions about the perception of complexity. Experiments 1-3 asked what kind of mental process extracts visual complexity: a slow, deliberate, reflective process (as when we decide that an object is expensive or popular) or a fast, effortless, and automatic process (as when we see that an object is big or blue)? We placed simple and complex objects in visual search arrays and discovered that complex objects were easier to find among simple distractors than simple objects are among complex distractors-a classic search asymmetry indicating that complexity is prioritized in visual processing. Next, we explored the function of complexity: Why do we represent object complexity in the first place? Experiments 4-5 asked subjects to study serially presented objects in a self-paced manner (for a later memory test); subjects dwelled longer on complex objects than simple objects-even when object shape was completely task-irrelevant-suggesting a connection between visual complexity and exploratory engagement. Finally, Experiment 6 connected these implicit measures of complexity to explicit judgments. Collectively, these findings suggest that visual complexity is extracted efficiently and automatically, and even arouses a kind of "perceptual curiosity" about objects that encourages subsequent attentional engagement.
Affiliation(s)
- Zekun Sun
- Department of Psychological & Brain Sciences, Johns Hopkins University
| | - Chaz Firestone
- Department of Psychological & Brain Sciences, Johns Hopkins University
| |
|
30
|
Papale P, Leo A, Handjaras G, Cecchetti L, Pietrini P, Ricciardi E. Shape coding in occipito-temporal cortex relies on object silhouette, curvature, and medial axis. J Neurophysiol 2020; 124:1560-1570. [PMID: 33052726 DOI: 10.1152/jn.00212.2020] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Indexed: 01/26/2023] Open
Abstract
Object recognition relies on different transformations of the retinal input, carried out by the visual system, that range from local contrast to object shape and category. While some of those transformations are thought to occur at specific stages of the visual hierarchy, the features they represent are correlated (e.g., object shape and identity) and selectivity for the same feature overlaps in many brain regions. This may be explained either by collinearity across representations or may instead reflect the coding of multiple dimensions by the same cortical population. Moreover, orthogonal and shared components may differently impact distinctive stages of the visual hierarchy. We recorded functional MRI activity while participants passively attended to object images and employed a statistical approach that partitioned orthogonal and shared object representations to reveal their relative impact on brain processing. Orthogonal shape representations (silhouette, curvature, and medial axis) independently explained distinct and overlapping clusters of selectivity in the occipitotemporal and parietal cortex. Moreover, we show that the relevance of shared representations linearly increases moving from posterior to anterior regions. These results indicate that the visual cortex encodes shared relations between different features in a topographic fashion and that object shape is encoded along different dimensions, each representing orthogonal features. NEW & NOTEWORTHY: There are several possible ways of characterizing the shape of an object. Which shape description better describes our brain responses while we passively perceive objects? Here, we employed three competing shape models to explain brain representations when viewing real objects. We found that object shape is encoded in a multidimensional fashion and thus defined by the interaction of multiple features.
Affiliation(s)
- Paolo Papale
- Molecular Mind Laboratory, IMT School for Advanced Studies Lucca, Italy; Department of Vision and Cognition, Netherlands Institute for Neuroscience, Amsterdam, The Netherlands
| | - Andrea Leo
- Molecular Mind Laboratory, IMT School for Advanced Studies Lucca, Italy; Department of Translational Research and Advanced Technologies in Medicine and Surgery, University of Pisa, Pisa, Italy
| | - Giacomo Handjaras
- Molecular Mind Laboratory, IMT School for Advanced Studies Lucca, Italy
| | - Luca Cecchetti
- Molecular Mind Laboratory, IMT School for Advanced Studies Lucca, Italy
| | - Pietro Pietrini
- Molecular Mind Laboratory, IMT School for Advanced Studies Lucca, Italy
| | | |
|
31
|
On the robustness of skeleton detection against adversarial attacks. Neural Netw 2020; 132:416-427. [DOI: 10.1016/j.neunet.2020.09.018] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 06/21/2020] [Revised: 09/12/2020] [Accepted: 09/21/2020] [Indexed: 11/22/2022]
|
32
|
Ayzenberg V, Chen Y, Yousif SR, Lourenco SF. Skeletal representations of shape in human vision: Evidence for a pruned medial axis model. J Vis 2019; 19:6. [PMID: 31173631 PMCID: PMC6894409 DOI: 10.1167/19.6.6] [Citation(s) in RCA: 15] [Impact Index Per Article: 3.0] [Indexed: 11/24/2022] Open
Abstract
A representation of shape that is low dimensional and stable across minor disruptions is critical for object recognition. Computer vision research suggests that such a representation can be supported by the medial axis, a computational model for extracting a shape's internal skeleton. However, few studies have shown evidence of medial axis processing in humans, and even fewer have examined how the medial axis is extracted in the presence of disruptive contours. Here, we tested whether human skeletal representations of shape reflect the medial axis transform (MAT), a computation sensitive to all available contours, or a pruned medial axis, which ignores contours that may be considered "noise." Across three experiments, participants (N = 2062) were shown complete, perturbed, or illusory two-dimensional shapes on a tablet computer and were asked to tap the shapes anywhere once. When directly compared with another viable model of shape perception (based on principal axes), participants' collective responses were better fit by the medial axis, and a direct test of boundary avoidance suggested that this result was not likely because of a task-specific cognitive strategy (Experiment 1). Moreover, participants' responses reflected a pruned computation in shapes with small or large internal or external perturbations (Experiment 2) and under conditions of illusory contours (Experiment 3). These findings extend previous work by suggesting that humans extract a relatively stable medial axis of shapes. A relatively stable skeletal representation, reflected by a pruned model, may be well equipped to support real-world shape perception and object recognition.
Affiliation(s)
| | - Yunxiao Chen
- The London School of Economics and Political Science, London, UK
| | | | | |
|