1
Schmidt F, Tiedemann H, Fleming RW, Morgenstern Y. Inferring shape transformations in a drawing task. Mem Cognit 2023:10.3758/s13421-023-01452-0. [PMID: 37668880 DOI: 10.3758/s13421-023-01452-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 08/09/2023] [Indexed: 09/06/2023]
Abstract
Many objects and materials in our environment are subject to transformations that alter their shape. For example, branches bend in the wind, ice melts, and paper crumples. Still, we recognize objects and materials across these changes, suggesting we can distinguish an object's original features from those caused by the transformations ("shape scission"). Yet, if we truly understand transformations, we should not only be able to identify their signatures but also actively apply the transformations to new objects (i.e., through imagination or mental simulation). Here, we investigated this ability using a drawing task. On a tablet computer, participants viewed a sample contour and its transformed version, and were asked to apply the same transformation to a test contour by drawing what the transformed test shape should look like. Thus, they had to (i) infer the transformation from the shape differences, (ii) envisage its application to the test shape, and (iii) draw the result. Our findings show that drawings were more similar to the ground truth transformed test shape than to the original test shape, demonstrating the inference and reproduction of transformations from observation. However, this was only observed for relatively simple shapes. The ability was also modulated by transformation type and magnitude but not by the similarity between sample and test shapes. Together, our findings suggest that we can distinguish between representations of original object shapes and their transformations, and can use visual imagery to mentally apply nonrigid transformations to observed objects, showing how we not only perceive but also 'understand' shape.
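The comparison described in this abstract (is a drawing closer to the transformed test shape than to the original test shape?) can be illustrated with a minimal sketch. The abstract does not state which similarity measure the study used, so the mean point-to-point distance below, the shear transformation, and all function names are assumptions made purely for illustration.

```python
import math

def mean_point_distance(contour_a, contour_b):
    """Mean Euclidean distance between corresponding points of two contours.

    Assumes both contours are sampled at the same number of points in the
    same order (an illustrative simplification, not the study's method).
    """
    if len(contour_a) != len(contour_b):
        raise ValueError("contours must have the same number of points")
    total = sum(math.dist(p, q) for p, q in zip(contour_a, contour_b))
    return total / len(contour_a)

def closer_to_transformed(drawing, original, transformed):
    """True if the drawing is nearer the transformed shape than the original."""
    return (mean_point_distance(drawing, transformed)
            < mean_point_distance(drawing, original))

def shear(points, k):
    """Hypothetical example transformation: horizontal shear by factor k."""
    return [(x + k * y, y) for x, y in points]

# Toy data: a unit square as the test contour.
original = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
transformed = shear(original, 0.5)   # ground-truth transformed test shape
drawing = shear(original, 0.4)       # hypothetical participant response

result = closer_to_transformed(drawing, original, transformed)
```

In this toy case the drawing slightly underestimates the shear, yet it still lies closer to the transformed shape than to the untransformed one, which is the pattern the abstract reports for simple shapes.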
Affiliation(s)
- Filipp Schmidt
  - Department of Experimental Psychology, Justus Liebig University Giessen, Otto-Behaghel-Str. 10F, 35394, Giessen, Germany
  - Center for Mind, Brain and Behavior (CMBB), University of Marburg and Justus Liebig University, Giessen, Germany
- Henning Tiedemann
  - Department of Experimental Psychology, Justus Liebig University Giessen, Otto-Behaghel-Str. 10F, 35394, Giessen, Germany
- Roland W Fleming
  - Department of Experimental Psychology, Justus Liebig University Giessen, Otto-Behaghel-Str. 10F, 35394, Giessen, Germany
  - Center for Mind, Brain and Behavior (CMBB), University of Marburg and Justus Liebig University, Giessen, Germany
- Yaniv Morgenstern
  - Department of Experimental Psychology, Justus Liebig University Giessen, Otto-Behaghel-Str. 10F, 35394, Giessen, Germany
  - University of Leuven (KU Leuven), Leuven, Belgium
2
Shao R, Bi XJ, Chen Z. A novel hybrid transformer-CNN architecture for environmental microorganism classification. PLoS One 2022; 17:e0277557. [PMID: 36367879 PMCID: PMC9651547 DOI: 10.1371/journal.pone.0277557] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/14/2022] [Accepted: 10/30/2022] [Indexed: 11/13/2022] Open
Abstract
The success of vision transformers (ViTs) has given rise to their application in classification tasks of small environmental microorganism (EM) datasets. However, due to the lack of multi-scale feature maps and local feature extraction capabilities, the pure transformer architecture cannot achieve good results on small EM datasets. In this work, a novel hybrid model is proposed by combining the transformer with a convolutional neural network (CNN). Compared to traditional ViTs and CNNs, the proposed model achieves state-of-the-art performance when trained on small EM datasets. This is accomplished in two ways. 1) Instead of the original fixed-size feature maps of the transformer-based designs, a hierarchical structure is adopted to obtain multi-scale feature maps. 2) Two new blocks are introduced to the transformer’s two core sections, namely the convolutional parameter sharing multi-head attention block and the local feed-forward network block. These changes allow the model to extract more local features than traditional transformers. In particular, for classification on the sixth version of the EM dataset (EMDS-6), the proposed model outperforms the baseline Xception by 6.7 percentage points, while being 60 times smaller in parameter size. In addition, the proposed model also generalizes well on the WHOI dataset (accuracy of 99%) and constitutes a fresh approach to the use of transformers for visual classification tasks based on small EM datasets.
Affiliation(s)
- Ran Shao
  - College of Information and Communication Engineering, Harbin Engineering University, Harbin, China
  - College of Information and Communication Engineering, Harbin Vocational & Technical College, Harbin, China
- Xiao-Jun Bi
  - Department of Information Engineering, Minzu University of China, Beijing, China
- Zheng Chen
  - College of Information and Communication Engineering, Harbin Engineering University, Harbin, China
3
Abstract
Shape is an interesting property of objects because it is used in ordinary discourse in ways that seem to have little connection to how it is typically defined in mathematics. The present article describes how the concept of shape can be grounded within Euclidean and non-Euclidean geometry and also to human perception. It considers the formal methods that have been proposed for measuring the differences among shapes and how the performance of those methods compares with shape difference thresholds of human observers. It discusses how different types of shape change can be perceptually categorized. It also evaluates the specific data structures that have been used to represent shape in models of both human and machine vision, and it reviews the psychophysical evidence about the extent to which those models are consistent with human perception. Based on this review of the literature, we argue that shape is not one thing but rather a collection of many object attributes, some of which are more perceptually salient than others. Because the relative importance of these attributes can be context dependent, there is no obvious single definition of shape that is universally applicable in all situations.
Affiliation(s)
- James T Todd
  - Department of Psychology, The Ohio State University, Columbus, OH, USA
4
Chen Y, Wang Y, Guo S, Zhang X, Yan B. The causal future: The influence of shape features caused by external transformation on visual attention. J Vis 2021; 21:17. [PMID: 34694327 PMCID: PMC8556566 DOI: 10.1167/jov.21.11.17] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/24/2022] Open
Abstract
Previous studies have validated that participants can distinguish different origins of objects’ shape features, teasing apart features caused by transformation (causal history) from those of the original shape. Considering bite as a transformation example, two experiments were designed to investigate the effect of causal history on the allocation of visual attention. Participants were presented with regular and familiar complete or bitten shapes in Experiment 1 and unfamiliar and irregular complete or bitten shapes in Experiment 2 over a range of stimulus onset asynchronies (SOAs). The task was to identify different probes (i.e., punctuation marks) that equally appeared at four positions around these shapes. The results showed that complete regular shapes had no impact on participants’ reaction times to identify probes that appeared at the four different positions (Experiment 1), whereas complete irregular shapes would facilitate participants’ responses to the probes that appeared at the positions around the “head” of the irregular shape (Experiment 2) regardless of SOAs. When presented with bitten shapes, in the earlier phase of visual processing, participants’ response patterns resembled those found when complete shapes were presented. However, with longer SOAs, participants were faster in identifying probes that appeared at those positions that were around the nontransformed region of the bitten shapes. The results revealed that information about shape features caused by causal history could be incorporated, albeit relatively later, into the allocation of visual attention. The role of causal history in the speculation about one object's future development is discussed.
Affiliation(s)
- Yunyun Chen
  - Beijing Key Laboratory of Applied Experimental Psychology, National Demonstration Center for Experimental Psychology Education, Faculty of Psychology, Beijing Normal University, Beijing, China
- Yuying Wang
  - Shaanxi Key Laboratory of Behavior and Cognitive Neuroscience, School of Psychology, Shaanxi Normal University, Xi'an, Shaanxi, China
- Sen Guo
  - Shaanxi Key Laboratory of Behavior and Cognitive Neuroscience, School of Psychology, Shaanxi Normal University, Xi'an, Shaanxi, China
- Xuemin Zhang
  - Beijing Key Laboratory of Applied Experimental Psychology, National Demonstration Center for Experimental Psychology Education, Faculty of Psychology, Beijing Normal University, Beijing, China
  - State Key Laboratory of Cognitive Neuroscience and Learning and IDG/McGovern Institute for Brain Research, Beijing Normal University, Beijing, China
- Bihua Yan
  - Shaanxi Key Laboratory of Behavior and Cognitive Neuroscience, School of Psychology, Shaanxi Normal University, Xi'an, Shaanxi, China
5
Brenner E, Hurtado SS, Arias EA, Smeets JBJ, Fleming RW. Searching for Strangely Shaped Cookies - Is Taking a Bite Out of a Cookie Similar to Occluding Part of It? Perception 2020; 50:140-153. [PMID: 33377849 PMCID: PMC7879225 DOI: 10.1177/0301006620983729] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/16/2022]
Abstract
Does recognizing the transformations that gave rise to an object’s retinal image contribute to early object recognition? It might, because finding a partially occluded object among similar objects that are not occluded is more difficult than finding an object that has the same retinal image shape without evident occlusion. If this is because the occlusion is recognized as such, we might see something similar for other transformations. We confirmed that it is difficult to find a cookie with a section missing when this was the result of occlusion. It is not more difficult to find a cookie from which a piece has been bitten off than to find one that was baked in a similar shape. On the contrary, the bite marks help detect the bitten cookie. Thus, biting off a part of a cookie has very different effects on visual search than occluding part of it. These findings do not support the idea that observers rapidly and automatically compensate for the ways in which objects’ shapes are transformed to give rise to the objects’ retinal images. They are easy to explain in terms of detecting characteristic features in the retinal image that such transformations may hide or create.
6
The role of semantics in the perceptual organization of shape. Sci Rep 2020; 10:22141. [PMID: 33335146 PMCID: PMC7746709 DOI: 10.1038/s41598-020-79072-w] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 09/17/2020] [Accepted: 12/03/2020] [Indexed: 11/09/2022] Open
Abstract
Establishing correspondence between objects is fundamental for object constancy, similarity perception and identifying transformations. Previous studies measured point-to-point correspondence between objects before and after rigid and non-rigid shape transformations. However, we can also identify 'similar parts' on extremely different objects, such as butterflies and owls or lizards and whales. We measured point-to-point correspondence between such object pairs. In each trial, a dot was placed on the contour of one object, and participants had to place a dot on 'the corresponding location' of the other object. Responses show correspondence is established based on similarities between semantic parts (such as head, wings, or legs). We then measured correspondence between ambiguous objects with different labels (e.g., between 'duck' and 'rabbit' interpretations of the classic ambiguous figure). Despite identical geometries, correspondences were different across the interpretations, based on semantics (e.g., matching 'Head' to 'Head', 'Tail' to 'Tail'). We present a zero-parameter model based on labeled semantic part data (obtained from a different group of participants) that well explains our data and outperforms an alternative model based on contour curvature. This demonstrates how we establish correspondence between very different objects by evaluating similarity between semantic parts, combining perceptual organization and cognitive processes.
7
Schmidt F, Fleming RW, Valsecchi M. Softness and weight from shape: Material properties inferred from local shape features. J Vis 2020; 20:2. [PMID: 32492099 PMCID: PMC7416911 DOI: 10.1167/jov.20.6.2] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Indexed: 01/26/2023] Open
Abstract
Object shape is an important cue to material identity and for the estimation of material properties. Shape features can affect material perception at different levels: at a microscale (surface roughness), mesoscale (textures and local object shape), or megascale (global object shape) level. Examples for local shape features include ripples in drapery, clots in viscous liquids, or spiraling creases in twisted objects. Here, we set out to test the role of such shape features on judgments of material properties softness and weight. For this, we created a large number of novel stimuli with varying surface shape features. We show that those features have distinct effects on softness and weight ratings depending on their type, as well as amplitude and frequency, for example, increasing numbers and pointedness of spikes makes objects appear harder and heavier. By also asking participants to name familiar objects, materials, and transformations they associate with our stimuli, we can show that softness and weight judgments do not merely follow from semantic associations between particular stimuli and real-world object shapes. Rather, softness and weight are estimated from surface shape, presumably based on learned heuristics about the relationship between a particular expression of surface features and material properties. In line with this, we show that correlations between perceived softness or weight and surface curvature vary depending on the type of surface feature. We conclude that local shape features have to be considered when testing the effects of shape on the perception of material properties such as softness and weight.
8
Balas B, Auen A, Thrash J, Lammers S. Children's use of local and global visual features for material perception. J Vis 2020; 20:10. [PMID: 32097486 PMCID: PMC7343528 DOI: 10.1167/jov.20.2.10] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Indexed: 11/25/2022] Open
Abstract
Adults can rapidly recognize material properties in natural images, and children's performance in material categorization tasks suggests that this ability develops slowly during childhood. In the current study, we further examined the information children use to recognize materials during development by asking how the use of local versus global visual features for material perception changes in middle childhood. We recruited adults and 5- to 10-year-old children for three experiments that required participants to distinguish between shape-matched images of real and artificial food. Accurate performance in this task requires participants to distinguish between a wide range of material properties characteristic of each category, thus testing material perception abilities broadly. In two tasks, we applied distinct methods of image scrambling (block scrambling and diffeomorphic scrambling) to parametrically disrupt global appearance while preserving features in small spatial neighborhoods. In the third task, we used image blurring to parametrically disrupt local feature visibility. Our key question was whether or not participant age affected performance differently when local versus global appearance was disrupted. We found that although image blur led to disproportionately poorer performance in young children, this effect was reduced or absent when diffeomorphic scrambling was used. We interpret this outcome as evidence that the ability to recruit large-scale visual features for material perception may develop slowly during middle childhood.
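Block scrambling, one of the two scrambling methods named in this abstract, can be sketched as follows. This is a generic illustration, not the authors' stimulus code: the image representation (a list of rows of grayscale values), the block size, the seed, and the function name `block_scramble` are all assumptions. Diffeomorphic scrambling is not shown.

```python
import random

def block_scramble(image, block_size, seed=0):
    """Shuffle non-overlapping square blocks of a 2D image (list of rows).

    Pixels inside each block keep their arrangement, so small spatial
    neighborhoods are preserved while the global layout is disrupted.
    """
    height, width = len(image), len(image[0])
    if height % block_size or width % block_size:
        raise ValueError("image dimensions must be multiples of block_size")
    rows, cols = height // block_size, width // block_size

    # Destination block positions, and a shuffled copy used as sources.
    positions = [(r, c) for r in range(rows) for c in range(cols)]
    sources = positions[:]
    random.Random(seed).shuffle(sources)

    out = [[None] * width for _ in range(height)]
    for (dst_r, dst_c), (src_r, src_c) in zip(positions, sources):
        for dy in range(block_size):
            for dx in range(block_size):
                out[dst_r * block_size + dy][dst_c * block_size + dx] = (
                    image[src_r * block_size + dy][src_c * block_size + dx]
                )
    return out

# Toy 4x4 "image" whose pixel values are simply 0..15.
image = [[row * 4 + col for col in range(4)] for row in range(4)]
scrambled = block_scramble(image, block_size=2, seed=1)
```

Smaller blocks disrupt global appearance more strongly, which is one way such a manipulation can be varied parametrically.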
9
Abstract
How the brain reconstructs three-dimensional object shape from two-dimensional retinal light patterns remains a mystery. Most research has investigated how cues, such as shading, texture, or perspective, help us estimate visible surface points on the outside of objects. However, our findings show the brain achieves much more than this. Observers not only infer the visible outer surface but also the hidden internal structure of objects, seeing “beneath the skin.” Our findings suggest the brain parses shapes’ features according to their physical causes, potentially allowing us to separate a single continuous surface into multiple superimposed depth layers. This ability likely aids our interactions with objects, by indicating which surface locations are firmly supported from the inside and thus suitable for grasping.
Three-dimensional (3D) shape perception is one of the most important functions of vision. It is crucial for many tasks, from object recognition to tool use, and yet how the brain represents shape remains poorly understood. Most theories focus on purely geometrical computations (e.g., estimating depths, curvatures, symmetries). Here, however, we find that shape perception also involves sophisticated inferences that parse shapes into features with distinct causal origins. Inspired by marble sculptures such as Strazza’s The Veiled Virgin (1850), which vividly depict figures swathed in cloth, we created composite shapes by wrapping unfamiliar forms in textile, so that the observable surface relief was the result of complex interactions between the underlying object and overlying fabric. Making sense of such structures requires segmenting the shape based on their causes, to distinguish whether lumps and ridges are due to the shrouded object or to the ripples and folds of the overlying cloth. Three-dimensional scans of the objects with and without the textile provided ground-truth measures of the true physical surface reliefs, against which observers’ judgments could be compared. In a virtual painting task, participants indicated which surface ridges appeared to be caused by the hidden object and which were due to the drapery. In another experiment, participants indicated the perceived depth profile of both surface layers. Their responses reveal that they can robustly distinguish features belonging to the textile from those due to the underlying object. Together, these findings reveal the operation of visual shape-segmentation processes that parse shapes based on their causal origin.
10
Toscani M, Milojevic Z, Fleming RW, Gegenfurtner KR. Color consistency in the appearance of bleached fabrics. J Vis 2020; 20:11. [PMID: 32315403 PMCID: PMC7405726 DOI: 10.1167/jov.20.4.11] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 04/24/2019] [Accepted: 01/18/2020] [Indexed: 11/24/2022] Open
Abstract
Human observers are remarkably good at perceiving constant object color across illumination changes. However, there are numerous other factors that can modulate surface appearance, such as aging, bleaching, staining, or soaking. Despite this, we are often able to identify material properties across such transformations. Little is known about how and to what extent we can compensate for the accompanying color transformations. Here we investigated whether humans could reproduce the original color of bleached fabrics. We treated 12 different fabric samples with a commercial bleaching product. Bleaching increased luminance and decreased saturation. We presented photographs of the original and bleached samples on a computer screen and asked observers to match the fabric colors to an adjustable matching disk. Different groups of observers produced matches for original and bleached samples. One group of observers were instructed to match the color of the bleached samples as they were before bleaching (i.e., compensate for the effects of bleaching); another, to accurately match color appearance. Observers did compensate significantly for the effects of bleaching when instructed to do so, but not in the appearance match condition. Results of a second experiment suggest that observers achieve color consistency, at least in part, through a strategy based on local spatial differences within the bleached samples. According to the results of a third experiment, these local spatial differences are likely to be the perceptual image cues that allow participants to determine whether a sample is bleached. When the effect of bleaching was limited or uniformly distributed across a sample's surface, observers were uncertain about the bleaching magnitude and seemed to apply cognitive strategies to achieve color consistency.
Affiliation(s)
- Matteo Toscani
  - Department of Psychology, Giessen University, Giessen, Germany
- Zarko Milojevic
  - Department of Psychology, Giessen University, Giessen, Germany
11
Fleming RW, Schmidt F. Getting "fumpered": Classifying objects by what has been done to them. J Vis 2019; 19:15. [PMID: 30952166 DOI: 10.1167/19.4.15] [Citation(s) in RCA: 12] [Impact Index Per Article: 2.4] [Indexed: 11/24/2022] Open
Abstract
Every object acquires its shape from some kind of generative process, such as manufacture, biological growth, or self-organization, in response to external forces. Inferring such generative processes from an observed shape is computationally challenging because a given process can lead to radically different shapes, and similar shapes can result from different generative processes. Here, we suggest that in some cases, generative processes endow objects with distinctive statistical features that observers can use to classify objects according to what has been done to them. We found that from the very first trials in an eight-alternative forced-choice classification task, observers were extremely good at classifying unfamiliar objects by the transformations that had shaped them. Further experiments show that the shape features underlying this ability are distinct from Euclidean shape similarity and that observers can separate and voluntarily respond to both aspects of objects. Our findings suggest that perceptual organization processes allow us to identify salient statistical shape features that are diagnostic of generative processes. By so doing, we can classify objects we have never seen before according to the processes that shaped them.
Affiliation(s)
- Roland W Fleming
  - Justus-Liebig-University Giessen, General Psychology, Gießen, Germany
- Filipp Schmidt
  - Justus-Liebig-University Giessen, General Psychology, Gießen, Germany
12
Schmidt F, Phillips F, Fleming RW. Visual perception of shape-transforming processes: 'Shape Scission'. Cognition 2019; 189:167-180. [PMID: 30986590 DOI: 10.1016/j.cognition.2019.04.006] [Citation(s) in RCA: 13] [Impact Index Per Article: 2.6] [Received: 12/04/2018] [Revised: 04/04/2019] [Accepted: 04/05/2019] [Indexed: 10/27/2022]
Abstract
Shape-deforming processes (e.g., squashing, bending, twisting) can radically alter objects' shapes. After such a transformation, some features are due to the object's original form, while others are due to the transformation, yet it is challenging to separate the two. We tested whether observers can distinguish the causal origin of different features, teasing apart the characteristics of the original shape from those imposed by transformations, a process we call 'shape scission'. Using computer graphics, we created 8 unfamiliar objects and subjected each to 8 transformations (e.g., "twisted", "inflated", "melted"). One group of participants named transformations consistently. A second group arranged cards depicting the objects into classes according to either (i) the original shape or (ii) the type of transformation. They could do this almost perfectly, suggesting that they readily distinguish the causal origin of shape features. Another group used a digital painting interface to indicate which locations on the objects appeared transformed, with responses suggesting they can localise features caused by transformations. Finally, we parametrically varied the magnitude of the transformations, and asked another group to rate the degree of transformation. Ratings correlated strongly with transformation magnitude with a tendency to overestimate small magnitudes. Responses were predicted by both the magnitude and area affected by the transformation. Together, the findings suggest that observers can scission object shapes into original shape and transformation features and access the resulting representational layers at will.
Affiliation(s)
- Roland W Fleming
  - Justus Liebig University, Giessen, Germany
  - Center for Mind, Brain and Behavior, Giessen, Germany