1
Abstract
Holistic processing is often used as a construct to characterize face recognition. An important recent study by Gold, Mundy, and Tjan (2012) quantified holistic processing by computing a facial-feature integration index derived from an ideal observer model. This index was mathematically defined as the ratio of the squared psychophysical contrast sensitivity for recognizing a whole face to the sum of the squared contrast sensitivities for recognizing individual face parts (left eye, right eye, nose, and mouth). They observed that this index was not significantly different from 1, leading to the provocative conclusion that the perception of a face is no more than the sum of its parts. What may not be obvious to all readers of this work is that these conclusions were based on a collection of faces that shared essentially the same configuration of face parts. We tested whether the facial-feature integration index would also equal 1 when faces have a range of configurations mirroring the range of variability in real-world faces, using the same experimental procedure and calculating the same integration index as Gold et al. When tested on faces with the same configuration, we also observed an integration index similar to what Gold et al. reported. But when tested on faces with variable configurations, we observed an integration index significantly greater than 1. Combining our results with those of Gold et al. further clarifies the theoretical construct of holistic processing in face recognition and what it means for the whole to be greater than the sum of its parts.
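The integration index described in this abstract can be sketched in a few lines. This is a minimal illustration assuming the definition given above (squared whole-face sensitivity divided by the sum of squared part sensitivities); the function name and the sensitivity values are hypothetical and are not data from the study.

```python
def integration_index(whole_sensitivity, part_sensitivities):
    """Squared whole-face sensitivity over the sum of squared part sensitivities."""
    summed_squares = sum(s ** 2 for s in part_sensitivities)
    if summed_squares <= 0:
        raise ValueError("at least one part sensitivity must be positive")
    return whole_sensitivity ** 2 / summed_squares


# Hypothetical contrast sensitivities for left eye, right eye, nose, and mouth.
parts = [1.0, 2.0, 2.0, 4.0]  # squares sum to 25

# Whole-face sensitivity that matches the parts-based prediction: index of 1.
index_equal = integration_index(5.0, parts)

# Whole-face sensitivity above the prediction: index greater than 1.
index_greater = integration_index(6.0, parts)

print(index_equal, index_greater)
```

An index near 1 corresponds to the "no more than the sum of its parts" outcome, while a value above 1 corresponds to the pattern reported here for faces with variable configurations.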
2
Abstract
Why do faces become easier to recognize with repeated exposure? Previous research has suggested that familiarity may induce a qualitative shift in visual processing from an independent analysis of individual facial features to analysis that includes information about the relationships among features (Farah, Wilson, Drain, & Tanaka Psychological Review, 105, 482-498, 1998; Maurer, Grand, & Mondloch Trends in Cognitive Science, 6, 255-260, 2002). We tested this idea by using a "summation-at-threshold" technique (Gold, Mundy, & Tjan Psychological Science, 23, 427-434, 2012; Nandy & Tjan Journal of Vision, 8, 3.1-20, 2008), in which an observer's ability to recognize each individual facial feature shown independently is used to predict their ability to recognize all of the features shown in combination. We find that, although people are better overall at recognizing familiar as opposed to unfamiliar faces, their ability to integrate information across features is similar for unfamiliar and highly familiar faces and is well predicted by their ability to recognize each of the facial features shown in isolation. These results are consistent with the idea that familiarity has a quantitative effect on the efficiency with which information is extracted from individual features, rather than a qualitative effect on the process by which features are combined.
3
Gold JM. A perceptually completed whole is less than the sum of its parts. Psychol Sci 2014; 25:1206-17. [PMID: 24796662 DOI: 10.1177/0956797614530725] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.6] [Received: 10/22/2013] [Accepted: 03/14/2014] [Indexed: 11/15/2022]
Abstract
How efficiently do people integrate the disconnected image fragments that fall on their eyes when they view partly occluded objects? In the present study, I used a psychophysical summation-at-threshold technique to address this question by measuring discrimination performance with both isolated and combined features of physically fragmented but perceptually complete objects. If visual completion promotes superior integration efficiency, performance with a visually completed object should exceed what would be expected from performance with the individual object parts shown in isolation. Contrary to this prediction, results showed that discrimination performance with both static and moving versions of physically fragmented but perceptually complete objects was significantly worse than would be expected from performance with their constituent parts. These results present a challenge for future theories of visual completion.
4
Gold JM. Information processing correlates of a size-contrast illusion. Front Psychol 2014; 5:142. [PMID: 24600430 PMCID: PMC3928540 DOI: 10.3389/fpsyg.2014.00142] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.6] [Received: 11/17/2013] [Accepted: 02/04/2014] [Indexed: 12/02/2022]
Abstract
Perception is often influenced by context. A well-known class of perceptual context effects is perceptual contrast illusions, in which proximate stimulus regions interact to alter the perception of various stimulus attributes, such as perceived brightness, color and size. Although the phenomenal reality of contrast effects is well documented, in many cases the connection between these illusions and how information is processed by perceptual systems is not well understood. Here, we use noise as a tool to explore the information processing correlates of one such contrast effect: the Ebbinghaus–Titchener size-contrast illusion. In this illusion, the perceived size of a central dot is significantly altered by the sizes of a set of surrounding dots, such that the presence of larger surrounding dots tends to reduce the perceived size of the central dot (and vice versa). In our experiments, we first replicated previous results that have demonstrated the subjective reality of the Ebbinghaus–Titchener illusion. We then used visual noise in a detection task to probe the manner in which observers processed information when experiencing the illusion. By correlating the noise with observers' classification decisions, we found that the sizes of the surrounding contextual elements had a direct influence on the relative weight observers assigned to regions within and surrounding the central element. Specifically, observers assigned relatively more weight to the surrounding region and less weight to the central region in the presence of smaller surrounding contextual elements. These results offer new insights into the connection between the subjective experience of size-contrast illusions and their associated information processing correlates.
Affiliation(s)
- Jason M Gold
- Department of Psychological and Brain Sciences, Indiana University, Bloomington IN, USA
5
Dövencioğlu D, Ban H, Schofield AJ, Welchman AE. Perceptual integration for qualitatively different 3-D cues in the human brain. J Cogn Neurosci 2013; 25:1527-41. [PMID: 23647559 PMCID: PMC3785137 DOI: 10.1162/jocn_a_00417] [Citation(s) in RCA: 22] [Impact Index Per Article: 2.0] [Indexed: 11/04/2022]
Abstract
The visual system's flexibility in estimating depth is remarkable: We readily perceive 3-D structure under diverse conditions from the seemingly random dots of a "magic eye" stereogram to the aesthetically beautiful, but obviously flat, canvasses of the Old Masters. Yet, 3-D perception is often enhanced when different cues specify the same depth. This perceptual process is understood as Bayesian inference that improves sensory estimates. Despite considerable behavioral support for this theory, insights into the cortical circuits involved are limited. Moreover, extant work tested quantitatively similar cues, reducing some of the challenges associated with integrating computationally and qualitatively different signals. Here we address this challenge by measuring fMRI responses to depth structures defined by shading, binocular disparity, and their combination. We quantified information about depth configurations (convex "bumps" vs. concave "dimples") in different visual cortical areas using pattern classification analysis. We found that fMRI responses in dorsal visual area V3B/KO were more discriminable when disparity and shading concurrently signaled depth, in line with the predictions of cue integration. Importantly, by relating fMRI and psychophysical tests of integration, we observed a close association between depth judgments and activity in this area. Finally, using a cross-cue transfer test, we found that fMRI responses evoked by one cue afford classification of responses evoked by the other. This reveals a generalized depth representation in dorsal visual cortex that combines qualitatively different information in line with 3-D perception.
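The abstract does not spell out the integration rule, so the following is only a sketch of the standard maximum-likelihood formulation of cue combination, assuming two independent cues with Gaussian noise. The function name and the example values are hypothetical.

```python
def combine_cues(estimates, variances):
    """Reliability-weighted average of independent Gaussian cue estimates.

    Each cue is weighted by its reliability (inverse variance). The combined
    variance is the inverse of the summed reliabilities, so it is never
    larger than the variance of the best single cue.
    """
    reliabilities = [1.0 / v for v in variances]
    total = sum(reliabilities)
    combined = sum(r * x for r, x in zip(reliabilities, estimates)) / total
    return combined, 1.0 / total


# Hypothetical depth estimates (arbitrary units) from disparity and shading,
# assumed here to be equally reliable.
depth, variance = combine_cues([10.0, 14.0], [4.0, 4.0])
print(depth, variance)
```

With equally reliable cues the combined estimate falls midway between them and the variance halves, which is the sense in which combining cues "improves sensory estimates".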
Affiliation(s)
- Dicle Dövencioğlu
- School of Psychology, University of Birmingham, Edgbaston, Birmingham B15 2TT, UK.
6
Seeing and hearing a word: combining eye and ear is more efficient than combining the parts of a word. PLoS One 2013; 8:e64803. [PMID: 23734220 PMCID: PMC3667182 DOI: 10.1371/journal.pone.0064803] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.6] [Received: 09/20/2012] [Accepted: 04/18/2013] [Indexed: 11/21/2022]
Abstract
To understand why human sensitivity for complex objects is so low, we study how word identification combines eye and ear or parts of a word (features, letters, syllables). Our observers identify printed and spoken words presented concurrently or separately. When researchers measure threshold (energy of the faintest visible or audible signal) they may report either sensitivity (one over the human threshold) or efficiency (ratio of the best possible threshold to the human threshold). When the best possible algorithm identifies an object (like a word) in noise, its threshold is independent of how many parts the object has. But, with human observers, efficiency depends on the task. In some tasks, human observers combine parts efficiently, needing hardly more energy to identify an object with more parts. In other tasks, they combine inefficiently, needing energy nearly proportional to the number of parts, over a 60∶1 range. Whether presented to eye or ear, efficiency for detecting a short sinusoid (tone or grating) with few features is a substantial 20%, while efficiency for identifying a word with many features is merely 1%. Why? We show that the low human sensitivity for words is a cost of combining their many parts. We report a dichotomy between inefficient combining of adjacent features and efficient combining across senses. Joining our results with a survey of the cue-combination literature reveals that cues combine efficiently only if they are perceived as aspects of the same object. Observers give different names to adjacent letters in a word, and combine them inefficiently. Observers give the same name to a word’s image and sound, and combine them efficiently. The brain’s machinery optimally combines only cues that are perceived as originating from the same object. Presumably such cues each find their own way through the brain to arrive at the same object representation.
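The two measures defined in this abstract are simple ratios, sketched below. The function names and threshold energies are hypothetical; the numbers are chosen only to reproduce an efficiency of 1%, the figure quoted for word identification.

```python
def sensitivity(human_threshold):
    """Sensitivity: one over the human threshold energy."""
    return 1.0 / human_threshold


def efficiency(ideal_threshold, human_threshold):
    """Efficiency: best possible (ideal) threshold over the human threshold."""
    return ideal_threshold / human_threshold


# Hypothetical threshold energies in arbitrary units.
ideal_energy = 0.02   # threshold of the best possible algorithm
human_energy = 2.0    # threshold of a human observer

word_sensitivity = sensitivity(human_energy)
word_efficiency = efficiency(ideal_energy, human_energy)
print(word_sensitivity, word_efficiency)
```

Because the ideal threshold does not depend on the number of parts, a drop in efficiency as parts are added reflects a rise in the human threshold alone.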
7
Schmidtmann G, Gordon GE, Bennett DM, Loffler G. Detecting shapes in noise: tuning characteristics of global shape mechanisms. Front Comput Neurosci 2013; 7:37. [PMID: 23720625 PMCID: PMC3655279 DOI: 10.3389/fncom.2013.00037] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.5] [Received: 12/18/2012] [Accepted: 04/02/2013] [Indexed: 11/13/2022]
Abstract
The proportion of signal elements embedded in noise needed to detect a signal is a standard tool for investigating motion perception. This paradigm was applied to the shape domain to determine how local information is pooled into a global percept. Stimulus arrays consisted of oriented Gabor elements that sampled the circumference of concentric radial frequency (RF) patterns. Individual Gabors were oriented tangentially to the shape (signal) or randomly (noise). In different conditions, signal elements were located randomly within the entire array or constrained to fall along one of the concentric contours. Coherence thresholds were measured for RF patterns with various frequencies (number of corners) and amplitudes (“sharpness” of corners). Coherence thresholds (about 10% = 15 elements) were lowest for circular shapes. Manipulating shape frequency or amplitude showed a range where thresholds remain unaffected (frequency ≤ RF4; amplitude ≤ 0.05). Increasing either parameter caused thresholds to rise. Compared to circles, thresholds increased by approximately four times for RF13 and five times for amplitudes of 0.3. Confining the signals to individual contours significantly reduced the number of elements needed to reach threshold (between 4 and 6), independent of the total number of elements on the contour or contour shape. Finally, adding external noise to the orientation of the elements had a greater effect on detection thresholds than adding noise to their position. These results provide evidence for a series of highly sensitive, shape-specific analysers which sum information globally but only from within specific annuli. These global mechanisms are tuned to position and orientation of local elements from which they pool information. The overall performance for arrays of elements can be explained by the sensitivity of multiple, independent concentric shape detectors rather than a single detector integrating information widely across space (e.g. Glass pattern detector).
8
Melmer T, Amirshahi SA, Koch M, Denzler J, Redies C. From regular text to artistic writing and artworks: Fourier statistics of images with low and high aesthetic appeal. Front Hum Neurosci 2013; 7:106. [PMID: 23554592 PMCID: PMC3612693 DOI: 10.3389/fnhum.2013.00106] [Citation(s) in RCA: 21] [Impact Index Per Article: 1.9] [Received: 02/02/2013] [Accepted: 03/13/2013] [Indexed: 11/18/2022]
Abstract
The spatial characteristics of letters and their influence on readability and letter identification have been intensely studied during the last decades. There have been few studies, however, on statistical image properties that reflect more global aspects of text, for example properties that may relate to its aesthetic appeal. It has been shown that natural scenes and a large variety of visual artworks possess a scale-invariant Fourier power spectrum that falls off linearly with increasing frequency in log-log plots. We asked whether images of text share this property. As expected, the Fourier spectrum of images of regular typed or handwritten text is highly anisotropic, i.e., the spectral image properties in vertical, horizontal, and oblique orientations differ. Moreover, the spatial frequency spectra of text images are not scale-invariant in any direction. The decline is shallower in the low-frequency part of the spectrum for text than for aesthetic artworks, whereas, in the high-frequency part, it is steeper. These results indicate that, in general, images of regular text contain less global structure (low spatial frequencies) relative to fine detail (high spatial frequencies) than images of aesthetic artworks. Moreover, we studied images of text with artistic claim (ornate print and calligraphy) and ornamental art. For some measures, these images assume average values intermediate between regular text and aesthetic artworks. Finally, to answer the question of whether the statistical properties measured by us are universal amongst humans or are subject to intercultural differences, we compared images from three different cultural backgrounds (Western, East Asian, and Arabic). Results for different categories (regular text, aesthetic writing, ornamental art, and fine art) were similar across cultures.
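A spectrum that "falls off linearly in log-log plots" follows a power law, and its slope can be estimated with an ordinary least-squares fit on log-transformed values. The sketch below assumes the power spectrum has already been computed and averaged per spatial frequency; the function name and the synthetic spectrum (exponent of -2) are hypothetical and are not values reported in the study.

```python
import math


def loglog_slope(frequencies, power):
    """Least-squares slope of log10(power) against log10(frequency)."""
    xs = [math.log10(f) for f in frequencies]
    ys = [math.log10(p) for p in power]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


# Synthetic, illustrative spectrum: power proportional to frequency ** -2.
freqs = [1, 2, 4, 8, 16]
spectrum = [f ** -2 for f in freqs]

slope = loglog_slope(freqs, spectrum)
print(slope)
```

A spectrum that is not scale-invariant, as reported here for text, would show different slopes when the fit is restricted to the low-frequency and high-frequency ranges separately.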
Affiliation(s)
- Tamara Melmer
- Experimental Aesthetics Group, Institute of Anatomy I, University of Jena School of Medicine, Jena University Hospital, Jena, Germany
- Seyed A. Amirshahi
- Experimental Aesthetics Group, Institute of Anatomy I, University of Jena School of Medicine, Jena University Hospital, Jena, Germany
- Computer Vision Group, Department of Computer Science, Friedrich Schiller University, Jena, Germany
- Michael Koch
- Experimental Aesthetics Group, Institute of Anatomy I, University of Jena School of Medicine, Jena University Hospital, Jena, Germany
- Computer Vision Group, Department of Computer Science, Friedrich Schiller University, Jena, Germany
- Joachim Denzler
- Computer Vision Group, Department of Computer Science, Friedrich Schiller University, Jena, Germany
- Christoph Redies
- Experimental Aesthetics Group, Institute of Anatomy I, University of Jena School of Medicine, Jena University Hospital, Jena, Germany
9
Hegdé J, Thompson SK, Brady M, Kersten D. Object recognition in clutter: cortical responses depend on the type of learning. Front Hum Neurosci 2012; 6:170. [PMID: 22723774 PMCID: PMC3378082 DOI: 10.3389/fnhum.2012.00170] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.6] [Received: 01/06/2012] [Accepted: 05/24/2012] [Indexed: 11/28/2022]
Abstract
Theoretical studies suggest that the visual system uses prior knowledge of visual objects to recognize them in visual clutter, and posit that the strategies for recognizing objects in clutter may differ depending on whether or not the object was learned in clutter to begin with. We tested this hypothesis using functional magnetic resonance imaging (fMRI) of human subjects. We trained subjects to recognize naturalistic, yet novel objects in strong or weak clutter. We then tested subjects' recognition performance for both sets of objects in strong clutter. We found many brain regions that were differentially responsive to objects during object recognition depending on whether they were learned in strong or weak clutter. In particular, the responses of the left fusiform gyrus (FG) reliably reflected, on a trial-to-trial basis, subjects' object recognition performance for objects learned in the presence of strong clutter. These results indicate that the visual system does not use a single, general-purpose mechanism to cope with clutter. Instead, there are two distinct spatial patterns of activation whose responses are attributable not to the visual context in which the objects were seen, but to the context in which the objects were learned.
Affiliation(s)
- Jay Hegdé
- Department of Ophthalmology, Vision Discovery Institute, Brain and Behavior Discovery Institute, Georgia Health Sciences University, Augusta GA, USA
10
Gold JM, Mundy PJ, Tjan BS. The perception of a face is no more than the sum of its parts. Psychol Sci 2012; 23:427-34. [PMID: 22395131 DOI: 10.1177/0956797611427407] [Citation(s) in RCA: 59] [Impact Index Per Article: 4.9] [Indexed: 11/16/2022]
Abstract
When you see a person's face, how do you go about combining his or her facial features to make a decision about who that person is? Most current theories of face perception assert that the ability to recognize a human face is not simply the result of an independent analysis of individual features, but instead involves a holistic coding of the relationships among features. This coding is thought to enhance people's ability to recognize a face beyond what would be expected if each feature were shown in isolation. In the study reported here, we explicitly tested this idea by comparing human performance on facial-feature integration with that of an optimal Bayesian integrator. Contrary to the predictions of most current notions of face perception, our findings showed that human observers integrate facial features in a manner that is no better than would be predicted by their ability to use each individual feature when shown in isolation. That is, a face is perceived no better than the sum of its individual parts.
Affiliation(s)
- Jason M Gold
- Department of Psychological and Brain Sciences, Indiana University, Bloomington, IN 47405, USA.
11
Ban H, Preston TJ, Meeson A, Welchman AE. The integration of motion and disparity cues to depth in dorsal visual cortex. Nat Neurosci 2012; 15:636-43. [PMID: 22327475 PMCID: PMC3378632 DOI: 10.1038/nn.3046] [Citation(s) in RCA: 73] [Impact Index Per Article: 6.1] [Received: 10/24/2011] [Accepted: 01/10/2012] [Indexed: 11/16/2022]
Abstract
Humans exploit a range of visual depth cues to estimate three-dimensional structure. For example, the slant of a nearby tabletop can be judged by combining information from binocular disparity, texture and perspective. Behavioral tests show humans combine cues near-optimally, a feat that could depend on discriminating the outputs from cue-specific mechanisms or on fusing signals into a common representation. Although fusion is computationally attractive, it poses a substantial challenge, requiring the integration of quantitatively different signals. We used functional magnetic resonance imaging (fMRI) to provide evidence that dorsal visual area V3B/KO meets this challenge. Specifically, we found that fMRI responses are more discriminable when two cues (binocular disparity and relative motion) concurrently signal depth, and that information provided by one cue is diagnostic of depth indicated by the other. This suggests a cortical node important when perceiving depth, and highlights computations based on fusion in the dorsal stream.
Affiliation(s)
- Hiroshi Ban
- School of Psychology, University of Birmingham, Edgbaston, Birmingham, UK
12
Sun GJ, Chung STL, Tjan BS. Ideal observer analysis of crowding and the reduction of crowding through learning. J Vis 2010; 10:16. [PMID: 20616136 PMCID: PMC3096759 DOI: 10.1167/10.5.16] [Citation(s) in RCA: 24] [Impact Index Per Article: 1.7] [Indexed: 11/24/2022]
Abstract
Crowding is a prominent phenomenon in peripheral vision where nearby objects impede one's ability to identify a target of interest. The precise mechanism of crowding is not known. We used ideal observer analysis and a noise-masking paradigm to identify the functional mechanism of crowding. We tested letter identification in the periphery with and without flanking letters and found that crowding increases equivalent input noise and decreases sampling efficiency. Crowding effectively causes the signal from the target to be noisier and at the same time reduces the visual system's ability to make use of a noisy signal. After practicing identification of flanked letters without noise in the periphery for 6 days, subjects' performance for identifying flanked letters improved (reduction of crowding). Across subjects, the improvement was attributable to either a decrease in crowding-induced equivalent input noise or an increase in sampling efficiency, but seldom both. This pattern of results is consistent with a simple model whereby learning reduces crowding by adjusting the spatial extent of a perceptual window used to gather relevant input features. Following learning, subjects with inappropriately large windows reduced their window sizes; while subjects with inappropriately small windows increased their window sizes. The improvement in equivalent input noise and sampling efficiency persists for at least 6 months.
Affiliation(s)
- Gerald J Sun
- Department of Biological Sciences, University of Southern California, Los Angeles, CA, USA
13
Boucart M, Naili F, Despretz P, Defoort-Dhellemmes S, Fabre-Thorpe M. Implicit and explicit object recognition at very large visual eccentricities: No improvement after loss of central vision. Visual Cognition 2009. [DOI: 10.1080/13506280903287845] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.5] [Indexed: 10/20/2022]
Affiliation(s)
- Muriel Boucart
- Lab. Neurosciences Fonctionnelles & Pathologies, Université Lille-Nord de France, Lille, France
- Fatima Naili
- Lab. Neurosciences Fonctionnelles & Pathologies, Université Lille-Nord de France, Lille, France
- Pascal Despretz
- Lab. Neurosciences Fonctionnelles & Pathologies, Université Lille-Nord de France, Lille, France
14
Oruç I, Landy MS. Scale dependence and channel switching in letter identification. J Vis 2009; 9:4.1-19. [PMID: 19761337 DOI: 10.1167/9.9.4] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.1] [Received: 12/20/2008] [Accepted: 06/29/2009] [Indexed: 11/24/2022]
Abstract
Letters are broadband visual stimuli with information useful for discrimination over a wide range of spatial frequencies. Yet, recent evidence suggests that observers use only a single, fixed spatial-frequency channel to identify letters and that the scale of that channel, in units of letter size, is determined by the size of the letter (scale dependence). We report two letter-identification experiments using critical-band masking. With sufficiently high-amplitude, low- or high-pass masking noise, observers switched to a different range of spatial frequencies for the task. Thus, letter channels are not fixed for a given letter size. When an additional white-noise masker was added to the stimulus to flatten the contrast-sensitivity function, the letter channel used by the observer still depended on letter size, further supporting the hypothesis that letter identification is scale dependent.
Affiliation(s)
- Ipek Oruç
- Department of Ophthalmology and Visual Sciences, University of British Columbia, Vancouver, BC, Canada.
15
Gold JM, Tjan BS, Shotts M. Integration of Facial Information is Sub-Optimal. COGSCI ... ANNUAL CONFERENCE OF THE COGNITIVE SCIENCE SOCIETY. COGNITIVE SCIENCE SOCIETY (U.S.). CONFERENCE 2009; 2009:2897-2901. [PMID: 25309964 PMCID: PMC4189801] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/04/2023]
Abstract
How efficiently do we combine information across facial features when recognizing a face? Previous studies have suggested that the perception of a face is not simply the result of an independent analysis of individual facial features, but instead involves a coding of the relationships amongst features. This additional coding of the relationships amongst features is thought to enhance our ability to recognize a face. In our experiments, we tested whether an observer's ability to recognize a face is in fact better than what one would expect from their ability to recognize the individual facial features in isolation. We tested this by using a psychophysical summation-at-threshold technique that has been used extensively to measure how efficiently observers integrate information across spatial locations and spatial frequencies. Surprisingly, we found that observers integrated information across facial features less efficiently than would be predicted by their ability to recognize the individual parts.
Affiliation(s)
- Jason M. Gold
- Departments of Psychological and Brain Sciences and Cognitive Science, Indiana University, 1101 East 10th Street, Bloomington, IN 47405, USA
- Bosco S. Tjan
- Department of Psychology, University of Southern California, SGM 501, Los Angeles, CA 90089, USA
- Megan Shotts
- Department of Psychological and Brain Sciences, Indiana University, 1101 East 10th Street, Bloomington, IN 47405, USA