1.
Abstract
The proposed research is based on a real plastic injection factory that produces cutting boards. Most existing approaches to smart manufacturing try to build a total IoT solution that moves the plant toward the Industry 4.0 standard. Under cost considerations this is not acceptable to most factories, so we proposed a vision-based technology to solve their immediate problem. Real-time machine condition monitoring is important for making good products and for measuring line and factory productivity. The study focused on a vision-based data reader (VDR) for edge computing in smart factories. A simple camera attached to a Field Programmable Gate Array (FPGA) monitored the screen on the control panel of each machine. Each end device was preprogrammed to capture images and process the data on its own. A preprocessing step first normalized the illumination of the captured image, a saliency map was then generated to detect the region required for recognition, and finally digit recognition was performed and the recognized digits were sent to the IoT system. The most significant contribution of the proposed VDR system is its compact deep learning model for training and testing, which meets the cost and real-time requirements of monitoring in edge computing. To build the compact model, different convolution filters were tested to meet the performance requirement. Experiments in a real plastic cutting board factory showed that the proposed system improved manufacturing and achieved a high digit recognition accuracy of 97.56%. In addition, the prototype system had the advantages of low power and low latency.
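The abstract describes the VDR pipeline (image capture, illumination normalization, saliency-based region detection, digit recognition with a compact convolutional model) but does not publish the network itself. The sketch below is only a minimal illustration of what a compact digit classifier for panel-screen crops could look like; the framework (PyTorch), the 32x32 grayscale input, and all layer sizes are assumptions, not the authors' design.

```python
# Hypothetical sketch of a compact digit classifier for panel-display crops.
# Layer sizes and input resolution are illustrative assumptions, not the
# architecture used in the cited VDR paper.
import torch
import torch.nn as nn

class CompactDigitNet(nn.Module):
    def __init__(self, num_classes: int = 10):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 8, kernel_size=3, padding=1),   # small filter bank keeps compute low
            nn.ReLU(inplace=True),
            nn.MaxPool2d(2),                              # 32x32 -> 16x16
            nn.Conv2d(8, 16, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
            nn.MaxPool2d(2),                              # 16x16 -> 8x8
        )
        self.classifier = nn.Linear(16 * 8 * 8, num_classes)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        x = self.features(x)
        return self.classifier(x.flatten(1))

if __name__ == "__main__":
    # One grayscale 32x32 crop of a single digit from the panel screen.
    dummy = torch.randn(1, 1, 32, 32)
    logits = CompactDigitNet()(dummy)
    print(logits.shape)  # torch.Size([1, 10])
```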
2. Palmer CJ, Otsuka Y, Clifford CWG. A sparkle in the eye: Illumination cues and lightness constancy in the perception of eye contact. Cognition 2020;205:104419. PMID: 32826054. DOI: 10.1016/j.cognition.2020.104419.
Abstract
In social interactions, our sense of when we have eye contact with another person relies on the distribution of luminance across their eye region, reflecting the position of the darker iris within the lighter sclera of the human eye. This distribution of luminance can be distorted by the lighting conditions, consistent with the fundamental challenge that the visual system faces in distinguishing the nature of a surface from the pattern of light falling upon it. Here we perform a set of psychophysics experiments in human observers to investigate how illumination impacts on the perception of eye contact. First, we find that simple changes in the direction of illumination can produce systematic biases in our sense of when we have eye contact with another person. Second, we find that the visual system uses information about the lighting conditions to partially discount or 'explain away' the effects of illumination in this context, leading to a significantly more robust sense of when we have eye contact with another person. Third, we find that perceived eye contact is affected by specular reflections from the eye surface in addition to shading patterns, implicating eye glint as a potential cue to gaze direction. Overall, this illustrates how our interpretation of social signals relies on visual mechanisms that both compensate for the effects of illumination on retinal input and potentially exploit novel cues that illumination can produce.
Affiliations
- Colin J Palmer, School of Psychology, UNSW Sydney, New South Wales 2052, Australia
- Yumiko Otsuka, Department of Humanities and Social Sciences, Ehime University, Matsuyama, Ehime, Japan
3. Gillespie C, Vishwanath D. A shape-level flanker facilitation effect in contour integration and the role of shape complexity. Vision Res 2019;158:221-236. PMID: 30797765. DOI: 10.1016/j.visres.2019.02.002.
Abstract
The detection of an object in the visual field requires the visual system to integrate a variety of local features into a single object. How these local processes and their global integration is influenced by the presence of other shapes in the visual field is poorly understood. The detectability (contour integration) of a central target object in the form of a two dimensional Gaborized contour was compared in the presence or absence of nearby surrounding objects. A 2-AFC staircase procedure added orientation jitter to the constituent Gabor patches to determine the detectability of the target contour. The set of contours was generated using shape profiles of everyday objects and geometric forms. Experiment 1 examined the effect of three types of congruencies between the target and two flanking contours (contour shape, symmetry and familiarity). Experiment 2 investigated the effect of varying the number and spatial positions of the flankers. In addition, a measure of shape complexity (reciprocal of shape compactness) was used to assess the effects of contour complexity on detection. Across both experiments the detectability of the target contour increased when the target and flanker had the same shape and this was related to both the number of flankers and the complexity of the target shapes. Another factor that modulated this shape-level flanker facilitation effect was the presence of symmetry. The overall results are consistent with a contour integration process in which the visual system incorporates contextual information to extract the most likely smooth contour within a noise field.
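For reference, shape compactness is commonly defined as 4*pi*area/perimeter^2 (equal to 1 for a circle and smaller for more convoluted outlines), so the complexity measure named in the abstract is its reciprocal. The snippet below is a small sketch under that assumption; the paper's exact formulation may differ, and the example polygon is made up.

```python
# Sketch of a "reciprocal of compactness" complexity measure, assuming the
# common definition compactness = 4*pi*area / perimeter**2 for a closed polygon.
import math

def shape_complexity(points):
    """points: list of (x, y) vertices of a closed contour, in order."""
    n = len(points)
    area = 0.0
    perimeter = 0.0
    for i in range(n):
        x1, y1 = points[i]
        x2, y2 = points[(i + 1) % n]
        area += x1 * y2 - x2 * y1          # shoelace formula
        perimeter += math.hypot(x2 - x1, y2 - y1)
    area = abs(area) / 2.0
    compactness = 4.0 * math.pi * area / perimeter ** 2
    return 1.0 / compactness               # circle -> 1.0, complex shapes -> larger

square = [(0, 0), (1, 0), (1, 1), (0, 1)]
print(round(shape_complexity(square), 3))  # ~1.273 (= 4/pi)
```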
4. Smyth MM, Hay DC, Hitch GJ, Horton NJ. Serial Position Memory in the Visual-Spatial Domain: Reconstructing Sequences of Unfamiliar Faces. ACTA ACUST UNITED AC 2018;58:909-30. PMID: 16194941. DOI: 10.1080/02724980443000412.
Abstract
In two studies we presented pictures of unfamiliar faces one at a time, then presented the complete set at test and asked for serial reconstruction of the order of presentation. Serial position functions were similar to those found with verbal materials, with considerable primacy and one item recency, position errors that were mainly to the adjacent serial position, a visual similarity effect, and effects of articulatory suppression that did not interact with the serial position effect or with the similarity effect. Serial position effects were found when faces had been seen for as little as 300 ms and after a 6-s retention interval filled with articulatory suppression. Serial position effects found with unfamiliar faces are not based on verbal encoding strategies, and important elements of serial memory may be general across modalities.
Affiliations
- Mary M Smyth, Department of Psychology, Lancaster University, Lancaster, UK
5. Favelle S, Hill H, Claes P. About Face: Matching Unfamiliar Faces Across Rotations of View and Lighting. Iperception 2017;8:2041669517744221. PMID: 29225768. PMCID: PMC5714100. DOI: 10.1177/2041669517744221.
Abstract
Matching the identities of unfamiliar faces is heavily influenced by variations in their images. Changes to viewpoint and lighting direction during face perception are commonplace across yaw and pitch axes and can result in dramatic image differences. We report two experiments that, for the first time, factorially investigate the combined effects of lighting and view angle on matching performance for unfamiliar faces. The use of three-dimensional head models allowed control of both lighting and viewpoint. We found viewpoint effects in the yaw axis with little to no effect of lighting. However, for rotations about the pitch axis, there were both viewpoint and lighting effects and these interacted where lighting effects were found only for front views and views from below. The pattern of effects was similar regardless of whether view variation occurred as a result of head (Experiment 1) or camera (Experiment 2) suggesting that face matching is not purely image based. Along with face inversion effects in Experiment 1, the results of this study suggest that face perception is based on shape and surface information and draws on implicit knowledge of upright faces and ecological (top) lighting conditions.
Affiliations
- Simone Favelle, School of Psychology, University of Wollongong, Wollongong, New South Wales, Australia
- Harold Hill, School of Psychology, University of Wollongong, Wollongong, New South Wales, Australia
- Peter Claes, ESAT/PSI, Department of Electrical Engineering, KU Leuven, Belgium; Medical Imaging Research Center, UZ Leuven, Belgium
6. Stephan BCM, Caine D. What is in a View? The Role of Featural Information in the Recognition of Unfamiliar Faces across Viewpoint Transformation. Perception 2016;36:189-98. PMID: 17402663. DOI: 10.1068/p5627.
Abstract
In recognising a face the visual system shows a remarkable ability in overcoming changes in viewpoint. However, the mechanisms involved in solving this complex computational problem, particularly in terms of information processing, have not been clearly defined. Considerable evidence indicates that face recognition involves both featural and configural processing. In this study we examined the contribution of featural information across viewpoint change. Participants were familiarised with unknown faces and were later tested for recognition in complete or part-face format, across changes in view. A striking effect of viewpoint resulting in a reduction in profile recognition compared with the three-quarter and frontal views was found. However, a complete-face over part-face advantage independent of transformation was demonstrated across all views. A hierarchy of feature salience was also demonstrated. Findings are discussed in terms of the problem of object constancy as it applies to faces.
Affiliations
- Blossom C M Stephan, Department of Public Health and Primary Care, University of Cambridge, University Forvie Site, Robinson Way, Cambridge CB2 OSR, UK
7. Keyes H, Zalicks C. Socially Important Faces Are Processed Preferentially to Other Familiar and Unfamiliar Faces in a Priming Task across a Range of Viewpoints. PLoS One 2016;11:e0156350. PMID: 27219101. PMCID: PMC4878734. DOI: 10.1371/journal.pone.0156350.
Abstract
Using a priming paradigm, we investigate whether socially important faces are processed preferentially compared to other familiar and unfamiliar faces, and whether any such effects are affected by changes in viewpoint. Participants were primed with frontal images of personally familiar, famous or unfamiliar faces, and responded to target images of congruent or incongruent identity, presented in frontal, three quarter or profile views. We report that participants responded significantly faster to socially important faces (a friend's face) compared to other highly familiar (famous) faces or unfamiliar faces. Crucially, responses to famous and unfamiliar faces did not differ. This suggests that, when presented in the context of a socially important stimulus, socially unimportant familiar faces (famous faces) are treated in a similar manner to unfamiliar faces. This effect was not tied to viewpoint, and priming did not affect socially important face processing differently to other faces.
Affiliations
- Helen Keyes, Department of Psychology, Anglia Ruskin University, Cambridge, Cambridgeshire, United Kingdom
- Catherine Zalicks, Department of Psychology, Anglia Ruskin University, Cambridge, Cambridgeshire, United Kingdom
8. Royer J, Blais C, Barnabé-Lortie V, Carré M, Leclerc J, Fiset D. Efficient visual information for unfamiliar face matching despite viewpoint variations: It's not in the eyes! Vision Res 2016;123:33-40. PMID: 27179558. DOI: 10.1016/j.visres.2016.04.004.
Abstract
Faces are encountered in highly diverse angles in real-world settings. Despite this considerable diversity, most individuals are able to easily recognize familiar faces. The vast majority of studies in the field of face recognition have nonetheless focused almost exclusively on frontal views of faces. Indeed, a number of authors have investigated the diagnostic facial features for the recognition of frontal views of faces previously encoded in this same view. However, the nature of the information useful for identity matching when the encoded face and test face differ in viewing angle remains mostly unexplored. The present study addresses this issue using individual differences and bubbles, a method that pinpoints the facial features effectively used in a visual categorization task. Our results indicate that the use of features located in the center of the face, the lower left portion of the nose area and the center of the mouth, are significantly associated with individual efficiency to generalize a face's identity across different viewpoints. However, as faces become more familiar, the reliance on this area decreases, while the diagnosticity of the eye region increases. This suggests that a certain distinction can be made between the visual mechanisms subtending viewpoint invariance and face recognition in the case of unfamiliar face identification. Our results further support the idea that the eye area may only come into play when the face stimulus is particularly familiar to the observer.
Affiliations
- Jessica Royer, Département de Psychologie et Psychoéducation, Université du Québec en Outaouais, Gatineau, Canada; Centre de Recherche en Neuropsychologie et Cognition, Montréal, Canada
- Caroline Blais, Département de Psychologie et Psychoéducation, Université du Québec en Outaouais, Gatineau, Canada; Centre de Recherche en Neuropsychologie et Cognition, Montréal, Canada
- Vincent Barnabé-Lortie, School of Electrical Engineering and Computer Science, University of Ottawa, Ottawa, Canada
- Mélissa Carré, Département de Psychologie et Psychoéducation, Université du Québec en Outaouais, Gatineau, Canada
- Josiane Leclerc, Département de Psychologie et Psychoéducation, Université du Québec en Outaouais, Gatineau, Canada
- Daniel Fiset, Département de Psychologie et Psychoéducation, Université du Québec en Outaouais, Gatineau, Canada; Centre de Recherche en Neuropsychologie et Cognition, Montréal, Canada
9. Barton JJS, Corrow SL. Recognizing and identifying people: A neuropsychological review. Cortex 2015;75:132-150. PMID: 26773237. DOI: 10.1016/j.cortex.2015.11.023.
Abstract
Recognizing people is a classic example of a cognitive function that involves multiple processing stages and parallel routes of information. Neuropsychological data have provided important evidence for models of this process, particularly from case reports; however, the quality and extent of the data varies widely between studies. In this review we first discuss the requirements and logical basis of the types of neuropsychological evidence to support conclusions about the modules in this process. We then survey the adequacy of the current body of reports to address two key issues. First is the question of which cognitive operation generates a sense of familiarity: the current debate revolves around whether familiarity arises in modality-specific recognition units or later amodal processes. Key evidence on this point comes from the search for dissociations between familiarity for faces, voices and names. The second question is whether lesions can differentially affect the abilities to link diverse sources of person information (e.g., face, voice, name, biographic data). Dissociations of these linkages may favor a 'distributed-only' model of the organization of semantic knowledge, whereas a 'person-hub' model would predict uniform impairments of all linkages. While we conclude that there is reasonable evidence for dissociations in name, voice and face familiarity in regards to the first question, the evidence for or against dissociated linkages between information stores in regards to the second question is tenuous at best. We identify deficiencies in the current literature that should motivate and inform the design of future studies.
Affiliations
- Jason J S Barton, Human Vision and Eye Movement Laboratory, Department of Medicine (Neurology), Ophthalmology and Visual Sciences, University of British Columbia, Vancouver, Canada; Human Vision and Eye Movement Laboratory, Department of Psychology, University of British Columbia, Vancouver, Canada
- Sherryse L Corrow, Human Vision and Eye Movement Laboratory, Department of Medicine (Neurology), Ophthalmology and Visual Sciences, University of British Columbia, Vancouver, Canada; Human Vision and Eye Movement Laboratory, Department of Psychology, University of British Columbia, Vancouver, Canada
10. Liu CH, Chen W, Ward J. Effects of exposure to facial expression variation in face learning and recognition. Psychological Research 2015;79:1042-53. PMID: 25398479. PMCID: PMC4624836. DOI: 10.1007/s00426-014-0627-8.
Abstract
Facial expression is a major source of image variation in face images. Linking numerous expressions to the same face can be a huge challenge for face learning and recognition. It remains largely unknown what level of exposure to this image variation is critical for expression-invariant face recognition. We examined this issue in a recognition memory task, where the number of facial expressions of each face being exposed during a training session was manipulated. Faces were either trained with multiple expressions or a single expression, and they were later tested in either the same or different expressions. We found that recognition performance after learning three emotional expressions had no improvement over learning a single emotional expression (Experiments 1 and 2). However, learning three emotional expressions improved recognition compared to learning a single neutral expression (Experiment 3). These findings reveal both the limitation and the benefit of multiple exposures to variations of emotional expression in achieving expression-invariant face recognition. The transfer of expression training to a new type of expression is likely to depend on a relatively extensive level of training and a certain degree of variation across the types of expressions.
Affiliations
- Chang Hong Liu, Department of Psychology, Bournemouth University, Talbot Campus, Fern Barrow, Poole, BH12 5BB, UK
- Wenfeng Chen, Institute of Psychology, Chinese Academy of Sciences, Beijing, China
- James Ward, Department of Computer Science, University of Hull, Kingston upon Hull, UK
11. Frowd CD, Jones S, Fodarella C, Skelton F, Fields S, Williams A, Marsh JE, Thorley R, Nelson L, Greenwood L, Date L, Kearley K, McIntyre AH, Hancock PJ. Configural and featural information in facial-composite images. Sci Justice 2014;54:215-27. DOI: 10.1016/j.scijus.2013.11.001.
12. de Heering A, Maurer D. The effect of spatial frequency on perceptual learning of inverted faces. Vision Res 2013;86:107-14. PMID: 23643906. DOI: 10.1016/j.visres.2013.04.014.
Abstract
We investigated the efficacy of training adults to recognize full spectrum inverted faces presented with different viewpoints. To examine the role of different spatial frequencies in any learning, we also used high-pass filtered faces that preserved featural information and low-pass filtered faces that severely reduced that featural information. Although all groups got faster over the 2 days of training, there was more improvement in accuracy for the group exposed to full spectrum faces than in the two groups exposed to filtered faces, both of which improved more modestly and only when the same faces were shown on the 2 days of training. For the group exposed to the full spectrum range and, to a lesser extent, for those exposed to high frequency faces, training generalized to a new set of full spectrum faces of a different size in a different task, but did not lead to evidence of holistic processing or improved sensitivity to feature shape or spacing in inverted faces. Overall these results demonstrate that only 2h of practice in recognizing full-spectrum inverted faces presented from multiple points of view is sufficient to improve recognition of the trained faces and to generalize to novel instances. Perceptual learning also occurred for low and high frequency faces but to a smaller extent.
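As a rough illustration of the high-pass/low-pass manipulation described above, one common way to build such stimuli is to take a Gaussian low-pass version of an image and treat the residual as the high-pass version. The snippet is only a sketch: the cutoff (sigma) and the random array standing in for a face photograph are illustrative, not the filter settings used in the cited study.

```python
# Sketch of producing low-pass and high-pass versions of an image with a
# Gaussian filter; sigma is an arbitrary illustrative value, not the
# cycles-per-face cutoffs used in the cited experiments.
import numpy as np
from scipy.ndimage import gaussian_filter

rng = np.random.default_rng(0)
face = rng.random((128, 128))                 # stand-in for a grayscale face image

low_pass = gaussian_filter(face, sigma=4.0)   # keeps coarse configural structure
high_pass = face - low_pass                   # keeps fine featural detail

print(low_pass.std(), high_pass.std())
```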
13. Edelman S, Shahbazi R. Renewing the respect for similarity. Front Comput Neurosci 2012;6:45. PMID: 22811664. PMCID: PMC3396327. DOI: 10.3389/fncom.2012.00045.
Abstract
In psychology, the concept of similarity has traditionally evoked a mixture of respect, stemming from its ubiquity and intuitive appeal, and concern, due to its dependence on the framing of the problem at hand and on its context. We argue for a renewed focus on similarity as an explanatory concept, by surveying established results and new developments in the theory and methods of similarity-preserving associative lookup and dimensionality reduction-critical components of many cognitive functions, as well as of intelligent data management in computer vision. We focus in particular on the growing family of algorithms that support associative memory by performing hashing that respects local similarity, and on the uses of similarity in representing structured objects and scenes. Insofar as these similarity-based ideas and methods are useful in cognitive modeling and in AI applications, they should be included in the core conceptual toolkit of computational neuroscience. In support of this stance, the present paper (1) offers a discussion of conceptual, mathematical, computational, and empirical aspects of similarity, as applied to the problems of visual object and scene representation, recognition, and interpretation, (2) mentions some key computational problems arising in attempts to put similarity to use, along with their possible solutions, (3) briefly states a previously developed similarity-based framework for visual object representation, the Chorus of Prototypes, along with the empirical support it enjoys, (4) presents new mathematical insights into the effectiveness of this framework, derived from its relationship to locality-sensitive hashing (LSH) and to concomitant statistics, (5) introduces a new model, the Chorus of Relational Descriptors (ChoRD), that extends this framework to scene representation and interpretation, (6) describes its implementation and testing, and finally (7) suggests possible directions in which the present research program can be extended in the future.
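The abstract points to hashing schemes that respect local similarity (locality-sensitive hashing). A minimal sketch of one standard LSH family, sign-of-random-projection hashing, is shown below; the dimensionality, number of hash bits, and example vectors are arbitrary choices for illustration, not anything taken from the paper.

```python
# Minimal sketch of locality-sensitive hashing via random projections
# (sign-of-projection family), one way to implement the similarity-preserving
# associative lookup the abstract discusses. Dimensions and bit counts are
# illustrative assumptions.
import numpy as np

rng = np.random.default_rng(42)
dim, n_bits = 64, 16
planes = rng.normal(size=(n_bits, dim))      # random hyperplanes

def lsh_code(x: np.ndarray) -> tuple:
    """Binary code: which side of each random hyperplane x falls on."""
    return tuple((planes @ x > 0).astype(int))

a = rng.normal(size=dim)
b = a + 0.05 * rng.normal(size=dim)          # near-duplicate of a
c = rng.normal(size=dim)                     # unrelated vector

# Similar vectors tend to share most hash bits; dissimilar ones do not.
print(sum(i == j for i, j in zip(lsh_code(a), lsh_code(b))))  # high overlap
print(sum(i == j for i, j in zip(lsh_code(a), lsh_code(c))))  # roughly chance (~8)
```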
Affiliations
- Shimon Edelman, Department of Psychology, Cornell University, Ithaca, NY, USA
14. Hafiz F, Shafie AA, Mustafah YM. Face Recognition From Single Sample Per Person by Learning of Generic Discriminant Vectors. ACTA ACUST UNITED AC 2012. DOI: 10.1016/j.proeng.2012.07.199.
15. Hill H, Claes P, Corcoran M, Walters M, Johnston A, Clement JG. How Different is Different? Criterion and Sensitivity in Face-Space. Front Psychol 2011;2:41. PMID: 21738516. PMCID: PMC3125532. DOI: 10.3389/fpsyg.2011.00041.
Abstract
Not all detectable differences between face images correspond to a change in identity. Here we measure both sensitivity to change and the criterion difference that is perceived as a change in identity. Both measures are used to test between possible similarity metrics. Using a same/different task and the method of constant stimuli, criterion is specified as the 50% "different" point (P50) and sensitivity as the difference limen (DL). Stimuli and differences are defined within a "face-space" based on principal components analysis of measured differences in three-dimensional shape. In Experiment 1 we varied the views available. Criterion (P50) was lowest for identical full-face view comparisons that can be based on image differences. When comparing across views, P50 was the same for a static 45° change as for multiple animated views, although sensitivity (DL) was higher for the animated case, where it was as high as for identical views. Experiments 2 and 3 tested possible similarity metrics. Experiment 2 contrasted Euclidean and Mahalanobis distance by setting PC1 or PC2 to zero. DL did not differ between conditions, consistent with Mahalanobis distance. P50 was lower when PC2 changed, emphasizing that perceived changes in identity are not determined by the magnitude of Euclidean physical differences. Experiment 3 contrasted a distance-based with an angle-based similarity measure. We varied the distinctiveness of the faces being compared by varying distance from the origin, a manipulation that affects distances but not angles between faces. Angular P50 and DL were both constant for faces from 1 to 2 SD from the mean, consistent with an angular measure. We conclude that both criterion and sensitivity need to be considered and that an angular similarity metric based on standardized PC values provides the best metric for specifying which physical differences will be perceived as a change in identity.
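To make the three candidate metrics concrete, the sketch below computes Euclidean distance, Mahalanobis distance (here taken as Euclidean distance on PC values standardized by their variance), and the angular difference between two faces treated as vectors from the mean face. The PC variances and face vectors are invented for illustration; scaling a face away from the origin changes both distance measures but leaves the angle at zero, which is the property the abstract's Experiment 3 exploits.

```python
# Sketch of the candidate face-space similarity metrics discussed above, for
# faces represented as PCA coefficient vectors. PC variances and example
# vectors are made up for illustration.
import numpy as np

pc_var = np.array([4.0, 1.0, 0.25])          # assumed variance of each PC

def euclidean(a, b):
    return np.linalg.norm(a - b)

def mahalanobis(a, b):
    # With a diagonal PCA covariance this is Euclidean distance on
    # standardized (z-scored) PC values.
    return np.linalg.norm((a - b) / np.sqrt(pc_var))

def angle(a, b):
    # Angular difference between faces as vectors from the mean face (origin).
    cos = np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))
    return np.degrees(np.arccos(np.clip(cos, -1.0, 1.0)))

face_a = np.array([2.0, 0.5, 0.1])
face_b = np.array([4.0, 1.0, 0.2])           # same direction, more distinctive

print(euclidean(face_a, face_b), mahalanobis(face_a, face_b), angle(face_a, face_b))
# The angular measure is 0 here: scaling distinctiveness leaves identity unchanged.
```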
Affiliations
- Harold Hill, School of Psychology, University of Wollongong, Wollongong, NSW, Australia
16. Kemelmacher-Shlizerman I, Basri R. 3D face reconstruction from a single image using a single reference face shape. IEEE Transactions on Pattern Analysis and Machine Intelligence 2011;33:394-405. PMID: 21193812. DOI: 10.1109/tpami.2010.63.
Abstract
Human faces are remarkably similar in global properties, including size, aspect ratio, and location of main features, but can vary considerably in details across individuals, gender, race, or due to facial expression. We propose a novel method for 3D shape recovery of faces that exploits the similarity of faces. Our method obtains as input a single image and uses a mere single 3D reference model of a different person's face. Classical reconstruction methods from single images, i.e., shape-from-shading, require knowledge of the reflectance properties and lighting as well as depth values for boundary conditions. Recent methods circumvent these requirements by representing input faces as combinations (of hundreds) of stored 3D models. We propose instead to use the input image as a guide to "mold" a single reference model to reach a reconstruction of the sought 3D shape. Our method assumes Lambertian reflectance and uses harmonic representations of lighting. It has been tested on images taken under controlled viewing conditions as well as on uncontrolled images downloaded from the Internet, demonstrating its accuracy and robustness under a variety of imaging conditions and overcoming significant differences in shape between the input and reference individuals including differences in facial expressions, gender, and race.
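The reconstruction method assumes Lambertian reflectance with a low-order harmonic lighting model. As a rough sketch of that image-formation assumption (not of the paper's actual optimization), a zeroth-plus-first-order approximation renders intensity as albedo times an ambient term plus the dot product of a lighting vector with the surface normal; the toy albedo, normals, and lighting below are invented values.

```python
# Sketch of a Lambertian, low-order-lighting image model of the kind such
# reconstruction methods build on: intensity ~ albedo * (ambient + light . normal).
# The normals, albedo, and lighting vector are toy values, not a real face model.
import numpy as np

h, w = 4, 4
albedo = np.full((h, w), 0.8)                         # reflectance (rho)
normals = np.zeros((h, w, 3))
normals[..., 2] = 1.0                                 # flat patch facing the camera

l0 = 0.3                                              # ambient (0th-order) term
light = np.array([0.2, 0.1, 0.9])                     # 1st-order lighting direction

shading = l0 + normals @ light                        # per-pixel shading
image = albedo * shading
print(image[0, 0])                                    # 0.8 * (0.3 + 0.9) = 0.96
```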
17. A combinatorial study of pose effects in unfamiliar face recognition. Vision Res 2010;50:522-33. DOI: 10.1016/j.visres.2009.12.012.
18. Does matching of internal and external facial features depend on orientation and viewpoint? Acta Psychol (Amst) 2009;132:267-78. PMID: 19712921. DOI: 10.1016/j.actpsy.2009.07.011.
Abstract
Although it is recognized that external (hair, head and face outline, ears) and internal (eyes, eyebrows, nose, mouth) features contribute differently to face recognition it is unclear whether both feature classes predominantly stimulate different sensory pathways. We employed a sequential speed-matching task to study face perception with internal and external features in the context of intact faces, and at two levels of contextual congruency. Both internal and external features were matched faster and more accurately in the context of totally congruent/incongruent facial stimuli compared to just featurally congruent/incongruent faces. Matching of totally congruent/incongruent faces was not affected by the matching criteria, but was strongly modulated by orientation and viewpoint. On the contrary, matching of just featurally congruent/incongruent faces was found to depend on the feature class to be attended, with strong effects of orientation and viewpoint only for matching of internal features, but not of external features. The data support the notion that different processing mechanisms are involved for both feature types, with internal features being handled by configuration sensitive mechanisms whereas featural processing modes dominate when external features are the focus.
19. Perceptual learning modifies inversion effects for faces and textures. Vision Res 2009;49:2273-84. DOI: 10.1016/j.visres.2009.06.014.
20. Viewpoint invariance in the discrimination of upright and inverted faces. Vision Res 2008;48:2545-54. PMID: 18804486. DOI: 10.1016/j.visres.2008.08.019.
Abstract
Current models of face processing support an orientation-dependent expert face processing mechanism. However, even when upright, faces are encountered from different viewpoints, across which a face processing system must be able to generalize. Different computational models have generated competing predictions of how viewpoint variation might affect the perception of upright versus inverted faces. Our goal was to examine the interaction between viewpoint variation and orientation on face discrimination. Sixteen normal subjects performed an oddity paradigm requiring subjects to discriminate changes in three simultaneously viewed morphed faces presented either upright or inverted. In one type of trial all the faces were seen in frontal view; in the other all faces varied in viewpoint, rotated 45 degrees from each other. After the effects of orientation were adjusted for perceptual difficulty, there were only main effects of orientation and viewpoint, with no interaction between orientation and viewpoint. We conclude that the effect of viewpoint variation on the perceptual discrimination of faces is not different for upright versus inverted faces, indicating that its effects are independent of the expertise that exists for upright faces.
21. Schechner YY, Nayar SK, Belhumeur PN. Multiplexing for optimal lighting. IEEE Transactions on Pattern Analysis and Machine Intelligence 2007;29:1339-54. PMID: 17568139. DOI: 10.1109/tpami.2007.1151.
Abstract
Imaging of objects under variable lighting directions is an important and frequent practice in computer vision, machine vision, and image-based rendering. Methods for such imaging have traditionally used only a single light source per acquired image. They may result in images that are too dark and noisy, e.g., due to the need to avoid saturation of highlights. We introduce an approach that can significantly improve the quality of such images, in which multiple light sources illuminate the object simultaneously from different directions. These illumination-multiplexed frames are then computationally demultiplexed. The approach is useful for imaging dim objects, as well as objects having a specular reflection component. We give the optimal scheme by which lighting should be multiplexed to obtain the highest quality output, for signal-independent noise. The scheme is based on Hadamard codes. The consequences of imperfections such as stray light, saturation, and noisy illumination sources are then studied. In addition, the paper analyzes the implications of shot noise, which is signal-dependent, to Hadamard multiplexing. The approach facilitates practical lighting setups having high directional resolution. This is shown by a setup we devise, which is flexible, scalable, and programmable. We used it to demonstrate the benefit of multiplexing in experiments.
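A minimal sketch of the multiplexing idea follows: with n light sources, each acquired frame turns on the subset of sources given by one row of a Hadamard-derived S-matrix, and the single-source images are recovered by inverting that matrix. The sketch is noise-free and omits the paper's optimality, saturation, and shot-noise analysis; the source count and image size are arbitrary.

```python
# Sketch of illumination multiplexing with a Hadamard-derived S-matrix:
# each acquired frame has roughly half of the n light sources on, and the
# single-source images are recovered by demultiplexing (inverting the matrix).
import numpy as np
from scipy.linalg import hadamard

n = 7                                        # number of light sources (illustrative)
H = hadamard(n + 1)                          # Sylvester Hadamard matrix, order 8
S = (H[1:, 1:] == -1).astype(float)          # 7x7 S-matrix: 1 = source on in that frame

rng = np.random.default_rng(1)
single_source = rng.random((n, 16))          # ground-truth single-source images (16 pixels each)

multiplexed = S @ single_source              # what the camera actually records
recovered = np.linalg.solve(S, multiplexed)  # demultiplex back to single-source images

print(np.allclose(recovered, single_source)) # True in this noise-free case
```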
Affiliations
- Yoav Y Schechner, Department of Electrical Engineering, Technion-Israel Institute of Technology, Haifa, Israel
22. Hay DC, Smyth MM, Hitch GJ, Horton NJ. Serial position effects in short-term visual memory: a SIMPLE explanation? Mem Cognit 2007;35:176-90. PMID: 17533891. DOI: 10.3758/bf03195953.
Abstract
A version of Sternberg's (1966) short-term visual memory recognition paradigm with pictures of unfamiliar faces as stimuli was used in three experiments to assess the applicability of the distinctiveness-based SIMPLE model proposed by Brown, Neath, and Chater (2002). Initial simulations indicated that the amount of recency predicted increased as the parameter measuring the psychological distinctiveness of the stimulus material (c) increased and that the amount of primacy was dependent on the extent of proactive interference from previously presented stimuli. The data from Experiment 1, in which memory lists of four and five faces varying in visual similarity were used, confirmed the predicted extended recency effect. However, changes in visual similarity were not found to produce changes in c. In Experiments 2 and 3, the conditions that influence the magnitude of c were explored. These revealed that both the familiarity of the stimulus class before testing and changes in familiarity, due to perceptual learning, influenced distinctiveness, as indexed by the parameter c. Overall, the empirical data from all three experiments were well fit by SIMPLE.
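For context, the core of SIMPLE as usually described places memory traces on a log-transformed temporal dimension, lets pairwise similarity decay exponentially with distance at rate c, and scores retrieval of an item by its self-similarity relative to its summed similarity to all competitors. The sketch below follows that textbook description under assumed retention times; it is not the authors' fitted model or parameter values.

```python
# Hedged sketch of the core SIMPLE computation referenced in the abstract:
# items live on a log time axis, similarity falls off exponentially at rate c,
# and retrieval probability is self-similarity over summed similarity.
# Retention times and c are illustrative, not the experiment's values.
import math

def simple_retrieval_probs(retention_times, c=10.0):
    logs = [math.log(t) for t in retention_times]
    def sim(i, j):
        return math.exp(-c * abs(logs[i] - logs[j]))
    n = len(logs)
    return [sim(i, i) / sum(sim(i, j) for j in range(n)) for i in range(n)]

# Five list items probed after these delays (seconds); recent items are more
# separated in log time, which produces the extended recency effect.
delays = [10.0, 8.0, 6.0, 4.0, 2.0]
print([round(p, 2) for p in simple_retrieval_probs(delays, c=10.0)])
```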
Affiliations
- Dennis C Hay, Department of Psychology, Fylde College, Lancaster University, England
23. Loidolt M, Aust U, Steurer M, Troje NF, Huber L. Limits of dynamic object perception in pigeons: Dynamic stimulus presentation does not enhance perception and discrimination of complex shape. Learn Behav 2006;34:71-85. PMID: 16786886. DOI: 10.3758/bf03192873.
Abstract
A go/no-go procedure was used to train pigeons to discriminate pictures of human faces differing only in shape, with either static images or movies of human faces dynamically rotating in depth. On the basis of experimental findings in humans and some earlier studies on three-dimensional object perception in pigeons, we expected dynamic stimulus presentation to support the pigeon's perception of the complex morphology of a human face. However, the performance of the subjects presented with movies was either worse than (AVI format movies) or did not differ from (uncompressed dynamic presentation) that of the subjects trained with a single or with multiple static images of the faces. Furthermore, generalization tests to other presentation conditions and to novel static views revealed no promoting effect of dynamic training. Except for the subjects trained on multiple static views, performance dropped to chance level with views outside the training range. These results are in contrast to some prior reports from the literature, since they suggest that pigeons, unlike humans, have difficulty using the additional structural information provided by the dynamic presentation and integrating the multiple views into a three-dimensional object.
24.
25. Pourtois G, Schwartz S, Seghier ML, Lazeyras F, Vuilleumier P. Portraits or People? Distinct Representations of Face Identity in the Human Visual Cortex. J Cogn Neurosci 2005;17:1043-57. PMID: 16102236. DOI: 10.1162/0898929054475181.
Abstract
Humans can identify individual faces under different viewpoints, even after a single encounter. We determined brain regions responsible for processing face identity across view changes after variable delays with several intervening stimuli, using event-related functional magnetic resonance imaging during a long-term repetition priming paradigm. Unfamiliar faces were presented sequentially either in a frontal or three-quarter view. Each face identity was repeated once after an unpredictable lag, with either the same or another viewpoint. Behavioral data showed significant priming in response time, irrespective of view changes. Brain imaging results revealed a reduced response in the lateral occipital and fusiform cortex with face repetition. Bilateral face-selective fusiform areas showed view-sensitive repetition effects, generalizing only from three-quarter to front-views. More medial regions in the left (but not in the right) fusiform showed repetition effects across all types of viewpoint changes. These results reveal that distinct regions within the fusiform cortex hold view-sensitive or view-invariant traces of novel faces, and that face identity is represented in a view-sensitive manner in the functionally defined face-selective areas of both hemispheres. In addition, our finding of a better generalization after exposure to a 3/4-view than to a front-view demonstrates for the first time a neural substrate in the fusiform cortex for the common recognition advantage of three-quarter faces. This pattern provides new insights into the nature of face representation in the human visual system.
Affiliations
- Gilles Pourtois, Neurology and Imaging of Cognition, Clinic of Neurology and Department of Neurosciences, University of Geneva, Switzerland
26. Behrmann M, Avidan G, Marotta JJ, Kimchi R. Detailed Exploration of Face-related Processing in Congenital Prosopagnosia: 1. Behavioral Findings. J Cogn Neurosci 2005;17:1130-49. PMID: 16102241. DOI: 10.1162/0898929054475154.
Abstract
We show that five individuals with congenital prosopagnosia (CP) are impaired at face recognition and discrimination and do not exhibit the normal superiority for upright over inverted faces despite intact visual acuity, low-level vision and intelligence, and in the absence of any obvious neural concomitant. Interestingly, the deficit is not limited to faces: The CP individuals were also impaired at discriminating common objects and novel objects although to a lesser extent than discriminating faces. The perceptual deficit may be attributable to a more fundamental visual processing disorder; the CP individuals exhibited difficulty in deriving global configurations from simple visual stimuli, even with extended exposure duration and considerable perceptual support in the image. Deriving a global configuration from local components is more critical for faces than for other objects, perhaps accounting for the exaggerated deficit in face processing. These findings elucidate the psychological mechanisms underlying CP and support the link between configural and face processing.
Affiliations
- Marlene Behrmann, Department of Psychology, Carnegie Mellon University, Pittsburgh, PA 15213-3890, USA
27. Pourtois G, Schwartz S, Seghier ML, Lazeyras F, Vuilleumier P. View-independent coding of face identity in frontal and temporal cortices is modulated by familiarity: an event-related fMRI study. Neuroimage 2004;24:1214-24. PMID: 15670699. DOI: 10.1016/j.neuroimage.2004.10.038.
Abstract
Face recognition is a unique visual skill enabling us to recognize a large number of person identities, despite many differences in the visual image from one exposure to another due to changes in viewpoint, illumination, or simply passage of time. Previous familiarity with a face may facilitate recognition when visual changes are important. Using event-related fMRI in 13 healthy observers, we studied the brain systems involved in extracting face identity independent of modifications in visual appearance during a repetition priming paradigm in which two different photographs of the same face (either famous or unfamiliar) were repeated at varying delays. We found that functionally defined face-selective areas in the lateral fusiform cortex showed no repetition effects for faces across changes in image views, irrespective of pre-existing familiarity, suggesting that face representations formed in this region do not generalize across different visual images, even for well-known faces. Repetition of different but easily recognizable views of an unfamiliar face produced selective repetition decreases in a medial portion of the right fusiform gyrus, whereas distinct views of a famous face produced repetition decreases in left middle temporal and left inferior frontal cortex selectively, but no decreases in fusiform cortex. These findings reveal that different views of the same familiar face may not be integrated within a single representation at initial perceptual stages subserved by the fusiform face areas, but rather involve later processing stages where more abstract identity information is accessed.
Affiliations
- Gilles Pourtois, Department of Neurosciences, Neurology and Imaging of Cognition, Clinic of Neurology, University Hospital, University Medical Center, Switzerland
28. Ullman S, Bart E. Recognition invariance obtained by extended and invariant features. Neural Netw 2004;17:833-48. PMID: 15288901. DOI: 10.1016/j.neunet.2004.01.006.
Abstract
In performing recognition, the visual system shows a remarkable capacity to distinguish between significant and immaterial image changes, to learn from examples to recognize new classes of objects, and to generalize from known to novel objects. Here we focus on one aspect of this problem, the ability to recognize novel objects from different viewing directions. This problem of view-invariant recognition is difficult because the image of an object seen from a novel viewing direction can be substantially different from all previously seen images of the same object. We describe an approach to view-invariant recognition that uses extended features to generalize across changes in viewing directions. Extended features are equivalence classes of informative image fragments, which represent object parts under different viewing conditions. This representation is extracted during learning from images of moving objects, and it allows the visual system to generalize from a single view of a novel object, and to compensate for large changes in the viewing direction, without using three-dimensional information. We describe the model, its implementation and performance on natural face images, compare it to alternative approaches, discuss its biological plausibility, and its extension to other aspects of visual recognition. The results of the study suggest that the capacity of the recognition system to generalize to novel conditions in an efficient and flexible manner depends on the ongoing extraction of different families of informative features, acquired for different tasks and different object classes.
Affiliations
- Shimon Ullman, Department of Computer Science and Applied Mathematics, Weizmann Institute of Science, Rehovot 76100, Israel
29. Turati C, Sangrigoli S, Ruely J, de Schonen S. Evidence of the Face Inversion Effect in 4-Month-Old Infants. Infancy 2004;6:275-297. PMID: 33430531. DOI: 10.1207/s15327078in0602_8.
Abstract
This study tested the presence of the face inversion effect in 4-month-old infants using habituation to criterion followed by a novelty preference paradigm. Results of Experiment 1 confirmed previous findings, showing that when 1 single photograph of a face is presented in the habituation phase and when infants are required to recognize the same photograph, no differences in recognition performance with upright and inverted faces are found. However, Experiment 2 showed that, when infants are habituated to a face shown in a variety of poses and are required to recognize a new pose of the same face, infants' recognition performances were higher for upright than for inverted faces. Overall, results indicate that, under some experimental conditions, 4-month-olds process faces differently according to whether faces are presented upright or inverted.
Affiliations
- Chiara Turati, Dipartimento Psicologia dello Sviluppo e della Socializzazione, Università degli Studi di Padova, Padova, Italy
- Sandy Sangrigoli, Developmental Neurocognition Unit, LCD CNRS-Université René Descartes-Paris 5, Paris, France
- Josette Ruely, Developmental Neurocognition Unit, LCD CNRS-Université René Descartes-Paris 5, Paris, France
- Scania de Schonen, Developmental Neurocognition Unit, LCD CNRS-Université René Descartes-Paris 5, Paris, France
30. Jitsumori M, Makino H. Recognition of static and dynamic images of depth-rotated human faces by pigeons. Learn Behav 2004;32:145-56. PMID: 15281387. DOI: 10.3758/bf03196016.
Abstract
In three experiments, we examined pigeons' recognition of video images of human faces. In Experiment 1, pigeons were trained to discriminate between frontal views of human faces in a go/no-go discrimination procedure. They then showed substantial generalization to novel views, even though human faces change radically as viewpoint changes. In Experiment 2, the pigeons tested in Experiment 1 failed to transfer to the faces dynamically rotating in depth. In Experiment 3, the pigeons trained to discriminate the dynamic stimuli showed excellent transfer to the corresponding static views, but responses to the positive faces decreased at novel viewpoints outside the range spanned by the dynamic stimuli. These results suggest that pigeons are insensitive to the three-dimensional properties of video images. Consideration is given to the nature of the task, relating to the identification of three-dimensional objects and to perceptual classifications based on similarity judgments.
Affiliations
- Masako Jitsumori, Department of Cognitive and Information Sciences, Faculty of Letters, Chiba University, Chiba, Japan
31.
Abstract
The present study investigated whether facial expressions of emotion are recognized holistically, i.e., all at once as an entire unit, as faces are or featurally as other nonface stimuli. Evidence for holistic processing of faces comes from a reliable decrement in recognition performance when faces are presented inverted rather than upright. If emotion is recognized holistically, then recognition of facial expressions of emotion should be impaired by inversion. To test this, participants were shown schematic drawings of faces showing one of six emotions (surprise, sadness, anger, happiness, disgust, and fear) in either an upright or inverted orientation and were asked to indicate the emotion depicted. Participants were more accurate in the upright than in the inverted orientation, providing evidence in support of holistic recognition of facial emotion. Because recognition of facial expressions of emotion is important in social relationships, this research may have implications for treatment of some social disorders.
Affiliations
- Marte Fallshore, Department of Psychology, Central Washington University, Ellensburg 98926-7575, USA
32. Hole GJ, George PA, Eaves K, Rasek A. Effects of geometric distortions on face-recognition performance. Perception 2003;31:1221-40. PMID: 12430949. DOI: 10.1068/p3252.
Abstract
The importance of 'configural' processing for face recognition is now well established, but it remains unclear precisely what it entails. Through four experiments we attempted to clarify the nature of configural processing by investigating the effects of various affine transformations on the recognition of familiar faces. Experiment 1 showed that recognition was markedly impaired by inversion of faces, somewhat impaired by shearing or horizontally stretching them, but unaffected by vertical stretching of faces to twice their normal height. In experiment 2 we investigated vertical and horizontal stretching in more detail, and found no effects of either transformation. Two further experiments were performed to determine whether participants were recognising stretched faces by using configural information. Experiment 3 showed that nonglobal vertical stretching of faces (stretching either the top or the bottom half while leaving the remainder undistorted) impaired recognition, implying that configural information from the stretched part of the face was influencing the process of recognition--ie that configural processing involves global facial properties. In experiment 4 we examined the effects of Gaussian blurring on recognition of undistorted and vertically stretched faces. Faces remained recognisable even when they were both stretched and blurred, implying that participants were basing their judgments on configural information from these stimuli, rather than resorting to some strategy based on local featural details. The tolerance of spatial distortions in human face recognition suggests that the configural information used as a basis for face recognition is unlikely to involve information about the absolute position of facial features relative to each other, at least not in any simple way.
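As an illustration of the affine manipulations listed above (vertical and horizontal stretching, shearing, and picture-plane inversion), the sketch below applies them to a toy image with scipy.ndimage.affine_transform; the stand-in "face" array and the distortion magnitudes are arbitrary, not the stimuli used in these experiments.

```python
# Sketch of the kinds of affine distortions described in the abstract, applied
# to a toy image. The block pattern stands in for a face photograph and the
# distortion magnitudes are illustrative only.
import numpy as np
from scipy.ndimage import affine_transform

face = np.zeros((64, 64))
face[16:48, 24:40] = 1.0                                  # toy "face" region

# affine_transform maps output coordinates to input coordinates (rows, cols).
vert_stretch = affine_transform(face, np.diag([0.5, 1.0]),
                                output_shape=(128, 64))   # twice the height
horiz_stretch = affine_transform(face, np.diag([1.0, 0.5]),
                                 output_shape=(64, 128))  # twice the width
shear = affine_transform(face, np.array([[1.0, 0.3],
                                          [0.0, 1.0]]))   # vertical shear
inverted = np.rot90(face, 2)                              # upside-down (180-degree) face

print(vert_stretch.shape, horiz_stretch.shape, shear.shape, inverted.shape)
```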
Affiliations
- Graham J Hole, School of Cognitive and Computing Sciences, University of Sussex, Falmer, Brighton, UK
33. Marotta J, McKeeff T, Behrmann M. The effects of rotation and inversion on face processing in prosopagnosia. Cogn Neuropsychol 2002;19:31-47. DOI: 10.1080/02643290143000079.
34.
35.
Abstract
At a given instant we see only visible surfaces, not an object's complete 3D appearance. Thus, objects may be represented as discrete 'views' showing only those features visible from a limited range of viewpoints. We address how to define a view using Koenderink's (Koenderink & Van Doorn, Biol. Cybernet. 32 (1979) 211.) geometric method for enumerating complete sets of stable views as aspect graphs. Using objects with known aspect graphs, five experiments examined whether the perception of orientation is sensitive to the qualitative features that define aspect graphs. Highest sensitivity to viewpoint changes was observed at locations where the theory predicts qualitative transitions, although some transitions did not affect performance. Hypotheses about why humans ignore some transitions offer insights into mechanisms for object representation.
Affiliation(s)
- M J Tarr
- Department of Cognitive and Linguistic Sciences, Brown University, Box 1978, 02912, Providence, RI, USA.
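The aspect-graph idea summarised in the abstract above can be made concrete with a small sketch: for a convex object such as a cube, viewpoints are grouped by the set of faces they reveal, and the boundaries between groups are the qualitative transitions the experiments probed. This is an illustrative toy, not the stimuli or method of the study; the sampling density and numerical tolerance are arbitrary choices.

```python
# Illustrative toy (not the study's stimuli): group viewpoints around a cube
# by the set of visible faces.  For a convex object a face is visible when its
# outward normal points toward a distant viewer.
import numpy as np

normals = np.array([[1, 0, 0], [-1, 0, 0],
                    [0, 1, 0], [0, -1, 0],
                    [0, 0, 1], [0, 0, -1]], dtype=float)

def visible_faces(view_dir):
    """Indices of the cube faces visible from a distant viewpoint along view_dir."""
    v = view_dir / np.linalg.norm(view_dir)
    return frozenset(np.flatnonzero(normals @ v > 1e-9))

rng = np.random.default_rng(0)
sample_views = rng.normal(size=(20000, 3))
aspects = {visible_faces(v) for v in sample_views}

# Generic viewpoints see exactly three faces, giving 8 aspects; the one- and
# two-face views are the boundary (accidental) views where the qualitative
# transitions probed in the experiments occur.
print(len(aspects), "generic aspects sampled")
```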
36
Abstract
Evidence is given for a special, canonical status of one specific view in the identification of familiar faces. In the first experiment, subjects identified by name the fully frontal or profile poses of briefly familiarised individuals less efficiently than an intermediate pose. In addition, in a matching experiment using faces seen in different poses, it was found that one specific intermediate pose (corresponding to 22.5 degrees of angle from the full frontal view) was matched more efficiently in the right visual field (RVF) than in the left visual field (LVF). This finding supports the hypothesis of a superiority of the left hemisphere (LH) over the right hemisphere (RH) in processing a familiar face's canonical view. The other tested "noncanonical" views (i.e., full frontal, 45 degrees, and profile) of these same familiar faces were better matched in the LVF (i.e., the RH), especially at low levels of familiarity. We conclude that, for each familiar face, a viewer-centred representation of the canonical (22.5 degrees) view is stored in the LH's memory system, whereas multiple views of familiar faces are stored in a memory system of the RH. With increasing levels of familiarity, other views are encoded increasingly efficiently by the LH, and in fact for facial self-recognition the full-front view is superior to any of the other tested views. These findings taken together suggest that complementary lateralised memory subsystems in the two cerebral hemispheres store different sets, only partially overlapping, of view-centred face representations.
Affiliation(s)
- B Laeng
- Department of Psychology, University of Tromsø, Norway.
37
Abstract
Understanding how biological visual systems recognize objects is one of the ultimate goals in computational neuroscience. From the computational viewpoint of learning, different recognition tasks, such as categorization and identification, are similar, representing different trade-offs between specificity and invariance. Thus, the different tasks do not require different classes of models. We briefly review some recent trends in computational vision and then focus on feedforward, view-based models that are supported by psychophysical and physiological data.
Affiliation(s)
- M Riesenhuber
- Department of Brain and Cognitive Sciences, McGovern Institute for Brain Research, Center for Biological and Computational Learning and Artificial Intelligence Laboratory, Massachusetts Institute of Technology, Cambridge 02142, USA
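A toy sketch of the feedforward, view-based scheme the review describes: units with Gaussian tuning to stored example views feed a unit that takes the maximum over those afferents, trading specificity (tight tuning) against invariance (pooling). The feature vectors, tuning width, and function names below are illustrative assumptions, not the published model.

```python
# Toy sketch of a feedforward, view-based scheme (illustrative, not the
# published model): Gaussian-tuned units for stored views feed a MAX-pooling
# object unit, trading specificity for invariance.
import numpy as np

def view_unit(x, template, sigma=1.0):
    """Response of a unit tuned to one stored view (Gaussian in feature space)."""
    return np.exp(-np.sum((x - template) ** 2) / (2 * sigma ** 2))

def object_unit(x, stored_views, sigma=1.0):
    """View-tolerant object response: MAX over the view-tuned afferents."""
    return max(view_unit(x, v, sigma) for v in stored_views)

rng = np.random.default_rng(1)
stored = [rng.normal(size=16) for _ in range(4)]      # feature vectors of 4 stored views
probe = stored[2] + 0.1 * rng.normal(size=16)         # a slightly changed view of the same object

print(round(object_unit(probe, stored), 3))           # stays high despite the change
```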
38
Howard RW. Generalization and Transfer: An Interrelation of Paradigms and a Taxonomy of Knowledge Extension Processes. REVIEW OF GENERAL PSYCHOLOGY 2000. [DOI: 10.1037/1089-2680.4.3.211] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.2] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/08/2022]
Abstract
This article integrates work on generalization and transfer into a coherent framework. It analyzes the evolutionary problem with which generalization deals and then outlines a model of responses to a situation requiring generalization and a taxonomy of generalization and transfer processes. Key tenets are that many different generalization processes exist, that there are wide individual differences in which processes may occur, and that generalization of declarative knowledge usually involves concepts. Three experiments tested 2 tenets. One experiment suggested that generalization gradients found in the specialized paradigm used to study human stimulus generalization simply represent failure to perceptually discriminate between stimuli. The other experiments, involving a different paradigm, found step functions along dimensions instead of decremental gradients. They show that the traditional stimulus generalization paradigm is a type of concept learning paradigm. Participants generalize along a continuum by placing stimuli into categories. The experiments also show many different responses to a situation requiring generalization.
Affiliation(s)
- Robert W. Howard
- School of Education, University of New South Wales, Sydney, New South Wales, Australia
39
Abstract
People are excellent at identifying faces familiar to them, even from very low quality images, but are bad at recognizing, or even matching, unfamiliar faces. In this review we shall consider some of the factors that affect our abilities to match unfamiliar faces. Major differences in orientation (e.g. inversion) or greyscale information (e.g. negation) affect face processing dramatically, and such effects suggest that representations derived from unfamiliar faces are based on relatively low-level image descriptions. Consistent with this, even relatively minor differences in lighting and viewpoint create problems for human face matching, leading to potentially important problems for the use of images from security videos. The relationships between different parts of the face (its 'configuration') are as important to the impression created of an upright face as the local features themselves, suggesting further constraints on the representations derived from faces. We go on to consider the contribution of computer face-recognition systems to the understanding of the theory and the practical problems of face identification. Finally, we look to the future of research in this area that will incorporate motion and 3-D shape information.
40
Abstract
Two experiments examining repetition priming in face recognition are reported. They employed eight rather than the more usual two presentation trials so that the prediction made by Logan's (1988) instance model of power-function speedup of response time (RT) distributions could be examined. In Experiment 1, we presented the same photograph on each trial; in Experiment 2, we presented photographs of varying poses. Both experiments showed repetition priming effects for familiar and unfamiliar faces, power-function speedup for both the mean and the standard deviation of RT, and power-function speedup of the quantiles of the RT distributions. We argue that our findings are consistent with the predictions made by the instance model and provide an explanatory challenge for alternative theoretical approaches.
Affiliation(s)
- D C Hay
- Department of Psychology, Fylde College, Lancaster University, England.
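The prediction being tested is that response time falls as a power function of the number of presentations, RT(N) = a + b * N^(-c), for the mean, the standard deviation, and the quantiles of the RT distribution. The sketch below fits that function to hypothetical mean RTs over eight presentations; the numbers are invented purely for illustration.

```python
# Hedged sketch of the power-function speed-up predicted by the instance
# model, RT(N) = a + b * N**(-c), fitted to invented mean RTs over the eight
# presentation trials used in these experiments.
import numpy as np
from scipy.optimize import curve_fit

def power_law(n, a, b, c):
    return a + b * n ** (-c)

trials = np.arange(1, 9)                                        # presentations 1..8
mean_rt = np.array([820, 730, 690, 665, 650, 640, 633, 628])    # hypothetical ms, not real data

(a, b, c), _ = curve_fit(power_law, trials, mean_rt, p0=(600.0, 200.0, 0.5))
print(f"asymptote {a:.0f} ms, gain {b:.0f} ms, exponent {c:.2f}")
```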
41
Abstract
A fundamental capacity of the perceptual systems and the brain in general is to deal with the novel and the unexpected. In vision, we can effortlessly recognize a familiar object under novel viewing conditions, or recognize a new object as a member of a familiar class, such as a house, a face, or a car. This ability to generalize and deal efficiently with novel stimuli has long been considered a challenging example of brain-like computation that proved extremely difficult to replicate in artificial systems. In this paper we present an approach to generalization and invariant recognition. We focus our discussion on the problem of invariance to position in the visual field, but also sketch how similar principles could apply to other domains. The approach is based on the use of a large repertoire of partial generalizations that are built upon past experience. In the case of shift invariance, visual patterns are described as the conjunction of multiple overlapping image fragments. The invariance to the more primitive fragments is built into the system by past experience. Shift invariance of complex shapes is obtained from the invariance of their constituent fragments. We study aspects of this shift-invariance method by simulation and then consider its extensions to invariant perception and classification by brain-like structures.
Affiliation(s)
- S Ullman
- Department of Applied Mathematics & Computer Science, The Weizmann Institute of Science, Rehovot, Israel
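A minimal sketch of the fragment-based account of shift invariance described above: each stored fragment is matched at every image position and only the best match is kept (invariance at the fragment level), and the whole pattern is scored by the conjunction of its fragment detectors. The correlation measure, fragment sizes, and random image are assumptions made for the illustration, not the paper's implementation.

```python
# Minimal sketch (an illustration, not the paper's implementation) of
# fragment-based shift invariance: each fragment detector takes a MAX over
# image positions, and the pattern is scored by the conjunction (here, the
# minimum) of its fragment responses.
import numpy as np

def fragment_response(image, fragment):
    """Best normalised match of `fragment` over all positions in `image`."""
    fh, fw = fragment.shape
    ih, iw = image.shape
    f = (fragment - fragment.mean()) / (fragment.std() + 1e-8)
    best = -np.inf
    for r in range(ih - fh + 1):
        for c in range(iw - fw + 1):
            patch = image[r:r + fh, c:c + fw]
            p = (patch - patch.mean()) / (patch.std() + 1e-8)
            best = max(best, np.mean(p * f))     # normalised cross-correlation
    return best

def shape_response(image, fragments):
    """Conjunction of fragment detectors: every fragment must be found somewhere."""
    return min(fragment_response(image, f) for f in fragments)

rng = np.random.default_rng(2)
img = rng.random((32, 32))
fragments = [img[4:10, 4:10].copy(), img[20:26, 12:18].copy()]   # fragments cut from the pattern

shifted = np.roll(img, shift=(3, 5), axis=(0, 1))                # the same pattern, displaced
print(shape_response(shifted, fragments))                        # stays high despite the shift
```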
42
Gauthier I, Williams P, Tarr MJ, Tanaka J. Training 'greeble' experts: a framework for studying expert object recognition processes. Vision Res 1998; 38:2401-28. [PMID: 9798007 DOI: 10.1016/s0042-6989(97)00442-2] [Citation(s) in RCA: 265] [Impact Index Per Article: 10.2] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/19/2022]
Abstract
Twelve participants were trained to be experts at identifying a set of 'Greebles', novel objects that, like faces, all share a common spatial configuration. Tests comparing expert with novice performance revealed: (1) a surprising mix of generalizability and specificity in expert object recognition processes; and (2) that expertise is a multi-faceted phenomenon, neither adequately described by a single term nor adequately assessed by a single task. Greeble recognition by a simple neural-network model is also evaluated, and the model is found to account surprisingly well for both generalization and individuation using a single set of processes and representations.
Affiliation(s)
- I Gauthier
- Department of Psychology, Yale University, New Haven, CT 06520, USA.
43
Tarr MJ, Kersten D, Bülthoff HH. Why the visual recognition system might encode the effects of illumination. Vision Res 1998; 38:2259-75. [PMID: 9797998 DOI: 10.1016/s0042-6989(98)00041-8] [Citation(s) in RCA: 45] [Impact Index Per Article: 1.7] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/20/2023]
Abstract
A key problem in recognition is that the image of an object depends on the lighting conditions. We investigated whether recognition is sensitive to illumination using 3-D objects that were lit from either the left or right, varying both the shading and the cast shadows. In experiments 1 and 2 participants judged whether two sequentially presented objects were the same regardless of illumination. Experiment 1 used six objects that were easily discriminated and that were rendered with cast shadows. While no cost was found in sensitivity, there was a response time cost over a change in lighting direction. Experiment 2 included six additional objects that were similar to the original six objects, making recognition more difficult. The objects were rendered with cast shadows, no shadows, and, as a control, white shadows. With normal shadows a change in lighting direction produced costs in both sensitivity and response times. With white shadows there was a much larger cost in sensitivity and a comparable cost in response times. Without cast shadows there was no cost in either measure, but the overall performance was poorer. Experiment 3 used a naming task in which names were assigned to six objects rendered with cast shadows. Participants practised identifying the objects in two viewpoints lit from a single lighting direction. Viewpoint and illumination invariance were then tested over new viewpoints and illuminations. Costs in both sensitivity and response time were found for naming the familiar objects in unfamiliar lighting directions regardless of whether the viewpoint was familiar or unfamiliar. Together these results suggest that illumination effects such as shadow edges: (1) affect visual memory; (2) serve the function of making the three-dimensional shape unambiguous; and (3) are modeled with respect to object shape, rather than simply encoded in terms of their effects in the image.
Affiliation(s)
- M J Tarr
- Department of Cognitive and Linguistic Sciences, Brown University, Providence, RI 02912, USA.
44
O'Toole AJ, Edelman S, Bülthoff HH. Stimulus-specific effects in face recognition over changes in viewpoint. Vision Res 1998; 38:2351-63. [PMID: 9798004 DOI: 10.1016/s0042-6989(98)00042-x] [Citation(s) in RCA: 94] [Impact Index Per Article: 3.6] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 02/09/2023]
Abstract
Individual faces vary considerably in both the quality and quantity of the information they contain for recognition and for viewpoint generalization. In the present study, we assessed the typicality, recognizability, and viewpoint generalizability of individual faces using data from both human observers and from a computational model of face recognition across viewpoint change. The two-stage computational model incorporated a viewpoint alignment operation and a recognition-by-interpolation operation. An interesting aspect of this particular model is that the effects of typicality it predicts at the alignment and recognition stages dissociate, such that face typicality is beneficial for the success of the alignment process, but detrimental to the success of the recognition process. We applied a factor analysis to the covariance data for the human- and model-derived face measures across the different viewpoints and found two axes that appeared consistently across all viewpoints. Projection scores for individual faces on these axes (i.e. the extent to which a face's 'performance profile' matched the pattern of human- and model-derived scores on that axis) correlated across viewpoint changes to a much higher degree than did the raw recognizability scores of the faces. These results suggest that the stimulus information captured in the model measures may underlie distinct and dissociable aspects of the recognizability of individual faces across viewpoint change.
Affiliation(s)
- A J O'Toole
- School of Human Development GR4.1, University of Texas at Dallas, Richardson 75083-0688, USA.
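The analysis reported above, a factor analysis over human- and model-derived face measures with per-face projection scores on the recovered axes, can be sketched roughly as follows. The data matrix, its dimensions, and the use of scikit-learn's FactorAnalysis are illustrative assumptions; the study's actual measures and procedure are not reproduced here.

```python
# Hedged sketch of the style of analysis described: factor analysis over
# per-face measures (e.g. human recognizability plus model-derived scores at
# each viewpoint), yielding per-face projection scores on the recovered axes.
# The data matrix is random and purely illustrative.
import numpy as np
from sklearn.decomposition import FactorAnalysis

rng = np.random.default_rng(3)
n_faces, n_measures = 60, 8                       # hypothetical numbers of faces and measures
face_measures = rng.normal(size=(n_faces, n_measures))

fa = FactorAnalysis(n_components=2, random_state=0)
projection_scores = fa.fit_transform(face_measures)   # each face's score on the two axes

print(projection_scores.shape)                    # (60, 2)
```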
45
Cutzu F, Edelman S. Representation of object similarity in human vision: psychophysics and a computational model. Vision Res 1998; 38:2229-57. [PMID: 9797997 DOI: 10.1016/s0042-6989(97)00186-7] [Citation(s) in RCA: 40] [Impact Index Per Article: 1.5] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 10/18/2022]
Abstract
We report results from perceptual judgment, delayed matching to sample and long-term memory recall experiments, which indicate that the human visual system can support metrically veridical representations of similarities among 3D objects. In all the experiments, animal-like computer-rendered stimuli formed regular planar configurations in a common 70-dimensional parameter space. These configurations were fully recovered by multidimensional scaling from proximity tables derived from the subject data. We show that such faithful representation of similarity is possible if shapes are encoded by their similarities to a number of reference (prototypical) shapes, as in the computational model that accompanies the psychophysical data.
Affiliation(s)
- F Cutzu
- Department of Applied Mathematics and Computer Science, Weizmann Institute of Science, Rehovot, Israel
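Two computational ingredients mentioned in the abstract, recovery of a planar stimulus configuration from a proximity table by multidimensional scaling and encoding of shapes by their similarities to a few reference shapes, are sketched below. The random parameter-space plane, grid size, similarity bandwidth, and library calls are illustrative assumptions rather than the authors' procedure.

```python
# Hedged sketch of two ingredients from the abstract: (1) recover a planar
# configuration from a proximity table by MDS, (2) encode each shape by its
# similarities to a few reference ("prototypical") shapes.  Parameter vectors
# are random stand-ins for the animal-like stimuli.
import numpy as np
from scipy.spatial.distance import cdist
from sklearn.manifold import MDS

rng = np.random.default_rng(4)
plane = rng.normal(size=(2, 70))                      # a random 2-D plane in a 70-D parameter space
grid = np.array([[i, j] for i in range(3) for j in range(3)], dtype=float)
shapes = grid @ plane                                  # 9 stimuli forming a planar 3x3 grid in 70-D

proximities = cdist(shapes, shapes)                    # the "proximity table"
recovered = MDS(n_components=2, dissimilarity="precomputed",
                random_state=0).fit_transform(proximities)   # recovers the grid up to rotation

prototypes = shapes[[0, 4, 8]]                         # three reference shapes
similarity_code = np.exp(-cdist(shapes, prototypes) ** 2 / 400.0)  # each shape coded by prototype similarity
print(recovered.shape, similarity_code.shape)
```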
46
Abstract
Theories of visual object recognition must solve the problem of recognizing 3D objects given that perceivers only receive 2D patterns of light on their retinae. Recent findings from human psychophysics, neurophysiology and machine vision provide converging evidence for 'image-based' models in which objects are represented as collections of viewpoint-specific local features. This approach is contrasted with 'structural-description' models in which objects are represented as configurations of 3D volumes or parts. We then review recent behavioral results that address the biological plausibility of both approaches, as well as some of their computational advantages and limitations. We conclude that, although the image-based approach holds great promise, it has potential pitfalls that may be best overcome by including structural information. Thus, the most viable model of object recognition may be one that incorporates the most appealing aspects of both image-based and structural-description theories.
Affiliation(s)
- M J Tarr
- Brown University, Department of Cognitive and Linguistic Sciences, Providence, RI 02912, USA
47
Perrett DI, Oram MW, Ashbridge E. Evidence accumulation in cell populations responsive to faces: an account of generalisation of recognition without mental transformations. Cognition 1998; 67:111-45. [PMID: 9735538 DOI: 10.1016/s0010-0277(98)00015-8] [Citation(s) in RCA: 190] [Impact Index Per Article: 7.3] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 02/08/2023]
Abstract
In this paper we analyse the time course of neuronal activity in temporal cortex to the sight of the head and body. Previous studies have already demonstrated the impact of view, orientation and part occlusion on individual cells. We consider the cells as a population providing evidence in the form of neuronal activity for perceptual decisions related to recognition. The time course of neural responses to stimuli provides an explanation of the variation in speed of recognition across different viewing circumstances that is seen in behavioural experiments. A simple unifying explanation of the behavioural effects is that the speed of recognition of an object depends on the rate of accumulation of activity from neurones selective for the object, evoked by a particular viewing circumstance. This in turn depends on the extent to which the object has been seen previously under the particular circumstance. For any familiar object, more cells will be tuned to the configuration of the object's features present in the view or views most frequently experienced. Therefore, activity amongst the population of cells selective for the object's appearance will accumulate more slowly when the object is seen in an unusual view, orientation or size. This accounts for the increased time to recognise rotated views without the need to postulate 'mental rotation' or 'transformations' of novel views to align with neural representations of familiar views.
Affiliation(s)
- D I Perrett
- Psychological Laboratory, St. Andrews University, UK.
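The accumulation account summarised above can be caricatured with a short simulation: recognition is declared when pooled activity from cells selective for the object crosses a threshold, and unusual views recruit fewer tuned cells, so the threshold is reached later. Firing rates, cell counts, and the threshold below are arbitrary illustrative values, not parameters from the paper.

```python
# Toy simulation of the accumulation account (illustrative values only):
# recognition is declared when spike counts pooled over object-selective cells
# reach a threshold; unusual views drive fewer tuned cells, so the threshold
# is reached later.
import numpy as np

def recognition_time(n_cells, rate_hz=50, threshold=500, max_t_ms=2000, seed=0):
    """Milliseconds for pooled Poisson spike counts to reach `threshold`."""
    rng = np.random.default_rng(seed)
    total = 0
    for t in range(1, max_t_ms + 1):                   # 1 ms steps
        total += rng.poisson(n_cells * rate_hz / 1000.0)
        if total >= threshold:
            return t
    return max_t_ms

print("frequently seen view:", recognition_time(n_cells=100), "ms")   # roughly 100 ms
print("unusual view:        ", recognition_time(n_cells=40), "ms")    # roughly 250 ms
```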
48
Abstract
Visual object recognition is complicated by the fact that the same 3D object can give rise to a large variety of projected images that depend on the viewing conditions, such as viewing direction, distance, and illumination. This paper describes a computational approach that uses combinations of a small number of object views to deal with the effects of viewing direction. The first part of the paper is an overview of the approach based on previous work. It is then shown that, in agreement with psychophysical evidence, the view-combination approach can use views of different class members rather than multiple views of a single object, to obtain class-based generalization. A number of extensions to the basic scheme are considered, including the use of non-linear combinations, using 3D versus 2D information, and the role of coarse classification on the way to precise identification. Finally, psychophysical and biological aspects of the view-combination approach are discussed. Compared with approaches that treat object recognition as a symbolic high-level activity, in the view-combination approach the emphasis is on processes that are simpler and pictorial in nature.
Affiliation(s)
- S Ullman
- Weizmann Institute of Science, Department of Applied Mathematics and Computer Science, Rehovot, Israel.
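The core of the view-combination approach can be sketched for the orthographic case: the image coordinates of an object's points in a novel view lie (up to noise) in the span of the corresponding coordinates of two stored views, so recognition can ask how well a small linear combination of stored views reconstructs the input. The random 3-D point set and projection routine below are illustrative assumptions, not the paper's full scheme.

```python
# Minimal sketch of the linear view-combination idea under orthographic
# projection (illustrative): the x-coordinates of a novel view lie in the span
# of the coordinates of two stored views, so a small linear combination of
# stored views can reconstruct the input.
import numpy as np

rng = np.random.default_rng(5)
P = rng.normal(size=(20, 3))                       # 3-D feature points of the object

def orthographic(points, seed):
    """Project points with a random rotation, keeping the first two coordinates."""
    q, _ = np.linalg.qr(np.random.default_rng(seed).normal(size=(3, 3)))
    xy = points @ q.T
    return xy[:, 0], xy[:, 1]                      # image x and y coordinates

x1, y1 = orthographic(P, seed=10)                  # stored view 1
x2, y2 = orthographic(P, seed=11)                  # stored view 2
xn, _ = orthographic(P, seed=12)                   # novel view to be recognised

B = np.column_stack([x1, y1, x2, y2, np.ones(len(P))])
coeffs, *_ = np.linalg.lstsq(B, xn, rcond=None)    # best combination of the stored views
print("reconstruction error:", float(np.linalg.norm(B @ coeffs - xn)))   # near zero for the same object
```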
49
Abstract
Evidence for viewpoint-specific image-based object representations has been collected almost entirely using exemplar-specific recognition tasks. Recent results, however, implicate image-based processes in more categorical tasks, for instance when objects contain qualitatively different 3D parts. Although such discriminations approximate class-level recognition, they do not establish whether image-based representations can support generalization across members of an object class. This issue is critical to any theory of recognition, in that one hallmark of human visual competence is the ability to recognize unfamiliar instances of a familiar class. The present study addresses this question by testing whether viewpoint-specific representations for some members of a class facilitate the recognition of other members of that class. Experiment 1 demonstrates that familiarity with several members of a class of novel 3D objects generalizes in a viewpoint-dependent manner to cohort objects from the same class. Experiment 2 demonstrates that this generalization is based on the degree of familiarity and the degree of geometrical distinctiveness for particular viewpoints. Experiment 3 demonstrates that this generalization is restricted to visually-similar objects rather than all objects learned in a given context. These results support the hypothesis that image-based representations are viewpoint dependent, but that these representations generalize across members of perceptually-defined classes. More generally, these results provide evidence for a new approach to image-based recognition in which object classes are represented as clusters of visually-similar viewpoint-specific representations.
Affiliation(s)
- M J Tarr
- Department of Cognitive and Linguistic Sciences, Brown University, Providence, RI 02912, USA.
50
Intrator N, Edelman S. Competitive learning in biological and artificial neural computation. Trends Cogn Sci 1997; 1:268-72. [DOI: 10.1016/s1364-6613(97)01066-8] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/26/2022]