1
Huber LS, Geirhos R, Wichmann FA. The developmental trajectory of object recognition robustness: Children are like small adults but unlike big deep neural networks. J Vis 2023; 23:4. [PMID: 37410494 PMCID: PMC10337805 DOI: 10.1167/jov.23.7.4] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 05/03/2022] [Accepted: 05/10/2023] [Indexed: 07/07/2023] Open
Abstract
In laboratory object recognition tasks based on undistorted photographs, both adult humans and deep neural networks (DNNs) perform close to ceiling. Unlike adults, whose object recognition performance is robust against a wide range of image distortions, DNNs trained on standard ImageNet (1.3M images) perform poorly on distorted images. However, the last 2 years have seen impressive gains in DNN distortion robustness, predominantly achieved through ever-increasing large-scale datasets, orders of magnitude larger than ImageNet. Although this simple brute-force approach is very effective in achieving human-level robustness in DNNs, it raises the question of whether human robustness, too, is simply due to extensive experience with (distorted) visual input during childhood and beyond. Here we investigate this question by comparing the core object recognition performance of 146 children (aged 4-15 years) against adults and against DNNs. We find, first, that already 4- to 6-year-olds show remarkable robustness to image distortions and outperform DNNs trained on ImageNet. Second, we estimated the number of images children had been exposed to during their lifetime. Compared with various DNNs, children's high robustness requires relatively little data. Third, when recognizing objects, children, like adults but unlike DNNs, rely heavily on shape but not on texture cues. Together our results suggest that the remarkable robustness to distortions emerges early in the developmental trajectory of human object recognition and is unlikely to be the result of a mere accumulation of experience with distorted visual input. Even though current DNNs match human performance regarding robustness, they seem to rely on different and more data-hungry strategies to do so.
Affiliation(s)
- Lukas S Huber
  - Department of Psychology, University of Bern, Bern, Switzerland
  - Neural Information Processing Group, University of Tübingen, Tübingen, Germany
  - https://orcid.org/0000-0002-7755-6926
- Robert Geirhos
  - Neural Information Processing Group, University of Tübingen, Tübingen, Germany
  - https://orcid.org/0000-0001-7698-3187
- Felix A Wichmann
  - Neural Information Processing Group, University of Tübingen, Tübingen, Germany
  - https://orcid.org/0000-0002-2592-634X
2
Izard V, Pica P, Spelke ES. Visual foundations of Euclidean geometry. Cogn Psychol 2022; 136:101494. [PMID: 35751917 DOI: 10.1016/j.cogpsych.2022.101494] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 03/25/2019] [Revised: 05/10/2022] [Accepted: 06/06/2022] [Indexed: 01/29/2023]
Abstract
Geometry defines entities that can be physically realized in space, and our knowledge of abstract geometry may therefore stem from our representations of the physical world. Here, we focus on Euclidean geometry, the geometry historically regarded as "natural". We examine whether humans possess representations describing visual forms in the same way as Euclidean geometry - i.e., in terms of their shape and size. One hundred and twelve participants from the U.S. (age 3-34 years), and 25 participants from the Amazon (age 5-67 years) were asked to locate geometric deviants in panels of 6 forms of variable orientation. Participants of all ages and from both cultures detected deviant forms defined in terms of shape or size, while only U.S. adults drew distinctions between mirror images (i.e. forms differing in "sense"). Moreover, irrelevant variations of sense did not disrupt the detection of a shape or size deviant, while irrelevant variations of shape or size did. At all ages and in both cultures, participants thus retained the same properties as Euclidean geometry in their analysis of visual forms, even in the absence of formal instruction in geometry. These findings show that representations of planar visual forms provide core intuitions on which humans' knowledge in Euclidean geometry could possibly be grounded.
Affiliation(s)
- Véronique Izard
  - Université Paris Cité, CNRS, Integrative Neuroscience and Cognition Center, F-75006 Paris, France
  - Department of Psychology, Harvard University, 33 Kirkland St, Cambridge, MA 02138, USA
- Pierre Pica
  - Instituto do Cérebro, Universidade Federal do Rio Grande do Norte, R. do Horto, Lagoa Nova, Natal, RN 59076-550, Brazil
  - UMR 7023, Structures Formelles du Langage, Université Paris 8, 2 rue de la Liberté, 93200 Saint-Denis, France
- Elizabeth S Spelke
  - Department of Psychology, Harvard University, 33 Kirkland St, Cambridge, MA 02138, USA; NSF-STC Center for Brains, Minds and Machines, 43 Vassar St, Cambridge, MA 02139, USA
3
Montag JL, Jones MN, Smith LB. Quantity and Diversity: Simulating Early Word Learning Environments. Cogn Sci 2018; 42 Suppl 2:375-412. [PMID: 29411899 DOI: 10.1111/cogs.12592] [Citation(s) in RCA: 33] [Impact Index Per Article: 5.5] [Received: 08/07/2017] [Revised: 12/18/2017] [Accepted: 12/20/2017] [Indexed: 11/30/2022]
Abstract
The words in children's language learning environments are strongly predictive of cognitive development and school achievement. But how do we measure language environments and do so at the scale of the many words that children hear day in, day out? The quantity and quality of words in a child's input are typically measured in terms of total amount of talk and the lexical diversity in that talk. There are disagreements in the literature whether amount or diversity is the more critical measure of the input. Here we analyze the properties of a large corpus (6.5 million words) of speech to children and simulate learning environments that differ in amount of talk per unit time, lexical diversity, and the contexts of talk. The central conclusion is that what researchers need to theoretically understand, measure, and change is not the total amount of words, or the diversity of words, but the function that relates total words to the diversity of words, and how that function changes across different contexts of talk.
Affiliation(s)
- Michael N Jones
  - Department of Psychological and Brain Sciences, Indiana University
- Linda B Smith
  - Department of Psychological and Brain Sciences, Indiana University
4
Kuwabara M, Smith LB. Cultural differences in visual object recognition in 3-year-old children. J Exp Child Psychol 2016; 147:22-38. [PMID: 26985576 PMCID: PMC4854758 DOI: 10.1016/j.jecp.2016.02.006] [Citation(s) in RCA: 24] [Impact Index Per Article: 3.0] [Received: 11/04/2015] [Revised: 02/05/2016] [Accepted: 02/11/2016] [Indexed: 11/26/2022]
Abstract
Recent research indicates that culture penetrates fundamental processes of perception and cognition. Here, we provide evidence that these influences begin early and influence how preschool children recognize common objects. The three tasks (N=128) examined the degree to which nonface object recognition by 3-year-olds was based on individual diagnostic features versus more configural and holistic processing. Task 1 used a 6-alternative forced choice task in which children were asked to find a named category in arrays of masked objects where only three diagnostic features were visible for each object. U.S. children outperformed age-matched Japanese children. Task 2 presented pictures of objects to children piece by piece. U.S. children recognized the objects given fewer pieces than Japanese children, and the likelihood of recognition increased for U.S. children, but not Japanese children, when the piece added was rated by both U.S. and Japanese adults as highly defining. Task 3 used a standard measure of configural processing, asking the degree to which recognition of matching pictures was disrupted by the rotation of one picture. Japanese children's recognition was more disrupted by inversion than was that of U.S. children, indicating more configural processing by Japanese than U.S. children. The pattern suggests early cross-cultural differences in visual processing; findings that raise important questions about how visual experiences differ across cultures and about universal patterns of cognitive development.
Affiliation(s)
- Megumi Kuwabara
  - Child Development Program, California State University Dominguez Hills, 1000 E. Victoria Street, Carson, CA 90747
- Linda B. Smith
  - Department of Psychological and Brain Sciences, Indiana University, 1101 E. Tenth Street, Bloomington, IN 47405, USA
5
Augustine E, Jones SS, Smith LB, Longfield E. Relations among early object recognition skills: Objects and letters. J Cogn Dev 2015; 16:221-235. [PMID: 25969673 PMCID: PMC4426263 DOI: 10.1080/15248372.2013.815620] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Indexed: 10/26/2022]
Abstract
Human visual object recognition is multifaceted, with several domains of expertise. Developmental relations between young children's letter recognition and their 3-dimensional object recognition abilities are implicated on several grounds but have received little research attention. Here, we ask how preschoolers' success in recognizing letters relates to their ability to recognize 3-dimensional objects from sparse shape information alone. A relation is predicted because perception of the spatial relations is critical in both domains. Seventy-three 2½- to 4-year-old children completed a Letter Recognition task, measuring the ability to identify a named letter among 3 letters with similar shapes, and a "Shape Caricature Recognition" task, measuring recognition of familiar objects from sparse, abstract information about their part shapes and the spatial relations among those parts. Children also completed a control "Shape Bias" task, in which success depends on recognition of overall object shape but not of relational structure. Children's success in letter recognition was positively related to their shape caricature recognition scores, but not to their shape bias scores. The results suggest that letter recognition builds upon developing skills in attending to and representing the relational structure of object shape, and that these skills are common to both 2-dimensional and 3-dimensional object perception.
6
Borgström K, Torkildsen JVK, Lindgren M. Event-related potentials during word mapping to object shape predict toddlers' vocabulary size. Front Psychol 2015; 6:143. [PMID: 25762957 PMCID: PMC4327527 DOI: 10.3389/fpsyg.2015.00143] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.2] [Received: 11/19/2014] [Accepted: 01/27/2015] [Indexed: 11/13/2022] Open
Abstract
What role does attention to different object properties play in early vocabulary development? This longitudinal study using event-related potentials in combination with behavioral measures investigated 20- and 24-month-olds' (n = 38; n = 34; overlapping n = 24) ability to use object shape and object part information in word-object mapping. The N400 component was used to measure semantic priming by images containing shape or detail information. At 20 months, the N400 to words primed by object shape varied in topography and amplitude depending on vocabulary size, and these differences predicted productive vocabulary size at 24 months. At 24 months, when most of the children had vocabularies of several hundred words, the relation between vocabulary size and the N400 effect in a shape context was weaker. Detached object parts did not function as word primes regardless of age or vocabulary size, although the part-objects were identified behaviorally. The behavioral measure, however, also showed relatively poor recognition of the part-objects compared to the shape-objects. These three findings provide new support for the link between shape recognition and early vocabulary development.
7
Brito NH, Grenell A, Barr R. Specificity of the bilingual advantage for memory: examining cued recall, generalization, and working memory in monolingual, bilingual, and trilingual toddlers. Front Psychol 2014; 5:1369. [PMID: 25520686 PMCID: PMC4251311 DOI: 10.3389/fpsyg.2014.01369] [Citation(s) in RCA: 21] [Impact Index Per Article: 2.1] [Received: 04/29/2014] [Accepted: 11/10/2014] [Indexed: 11/13/2022] Open
Abstract
The specificity of the bilingual advantage in memory was examined by testing groups of monolingual, bilingual, and trilingual 24-month-olds on tasks tapping cued recall, memory generalization and working memory. For the cued recall and memory generalization conditions, there was a 24-h delay between time of encoding and time of retrieval. In addition to the memory tasks, parent-toddler dyads completed a picture-book reading task, in order to observe emotional responsiveness, and a parental report of productive vocabulary. Results indicated no difference between language groups on cued recall, working memory, emotional responsiveness, or productive vocabulary, but a significant difference was found in the memory generalization condition with only the bilingual group outperforming the baseline control group. These results replicate and extend results from past studies (Brito and Barr, 2012, 2014; Brito et al., 2014) and suggest a bilingual advantage specific to memory generalization.
Affiliation(s)
- Natalie H. Brito
  - Robert Wood Johnson Foundation Health and Society Scholars, Columbia University in the City of New York, New York, NY, USA
- Amanda Grenell
  - Department of Psychology, Georgetown University, Washington, DC, USA
- Rachel Barr
  - Department of Psychology, Georgetown University, Washington, DC, USA
8
James KH, Jones SS, Smith LB, Swain SN. Young Children's Self-Generated Object Views and Object Recognition. J Cogn Dev 2014; 15:393-401. [PMID: 25368545 DOI: 10.1080/15248372.2012.749481] [Citation(s) in RCA: 26] [Impact Index Per Article: 2.6] [Indexed: 10/27/2022]
9
Smith LB, Street S, Jones SS, James KH. Using the axis of elongation to align shapes: developmental changes between 18 and 24 months of age. J Exp Child Psychol 2014; 123:15-35. [PMID: 24650776 PMCID: PMC4030647 DOI: 10.1016/j.jecp.2014.01.009] [Citation(s) in RCA: 41] [Impact Index Per Article: 4.1] [Received: 04/24/2013] [Revised: 01/11/2014] [Accepted: 01/16/2014] [Indexed: 11/21/2022]
Abstract
An object's axis of elongation serves as an important frame of reference for forming three-dimensional representations of object shape. By several recent accounts, the formation of these representations is also related to experiences of acting on objects. Four experiments examined 18- to 24-month-olds' (N=103) sensitivity to the elongated axis in action tasks that required extracting, comparing, and physically rotating an object so that its major axis was aligned with that of a visual standard. In Experiments 1 and 2, the older toddlers precisely rotated both simple and complexly shaped three-dimensional objects in insertion tasks where the visual standard was the rectangular contour defining the opening in a box. The younger toddlers performed poorly. Experiments 3 and 4 provide evidence on emerging abilities in extracting and using the most extended axis as a frame of reference for shape comparison. Experiment 3 showed that 18-month-olds could rotate an object to align its major axis with the direction of their own hand motion, and Experiment 4 showed that they could align the major axis of one object with that of another object of the exact same three-dimensional shape. The results are discussed in terms of theories of the development of three-dimensional shape representations, visual object recognition, and the role of action in these developments.
Affiliation(s)
- Linda B Smith
  - Department of Psychological and Brain Sciences, Indiana University, Bloomington, IN 47405, USA
- Sandra Street
  - Department of Psychological and Brain Sciences, Indiana University, Bloomington, IN 47405, USA
- Susan S Jones
  - Department of Psychological and Brain Sciences, Indiana University, Bloomington, IN 47405, USA
- Karin H James
  - Department of Psychological and Brain Sciences, Indiana University, Bloomington, IN 47405, USA
10
Smith LB. It's all connected: Pathways in visual object recognition and early noun learning. Am Psychol 2013; 68:618-29. [PMID: 24320634 PMCID: PMC3858855 DOI: 10.1037/a0034185] [Citation(s) in RCA: 54] [Impact Index Per Article: 4.9] [Indexed: 11/08/2022]
Abstract
A developmental pathway may be defined as the route, or chain of events, through which a new structure or function forms. For many human behaviors, including object name learning and visual object recognition, these pathways are often complex and multicausal and include unexpected dependencies. This article presents three principles of development that suggest the value of a developmental psychology that explicitly seeks to trace these pathways and uses empirical evidence on developmental dependencies among motor development, action on objects, visual object recognition, and object name learning in 12- to 24-month-old infants to make the case. The article concludes with a consideration of the theoretical implications of this approach.
11
Representing part-whole relations in conceptual spaces. Cogn Process 2013; 15:127-42. [PMID: 24146391 DOI: 10.1007/s10339-013-0585-x] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.7] [Received: 11/22/2012] [Accepted: 10/07/2013] [Indexed: 10/26/2022]
Abstract
In this paper, we propose a cognitive semantic approach to represent part-whole relations. We base our proposal on the theory of conceptual spaces, focusing on prototypical structures in part-whole relations. Prototypical structures are not accounted for in traditional mereological formalisms. In our account, parts and wholes are represented in distinct conceptual spaces; parts are joined to form wholes in a structure space. The structure space allows systematic similarity judgments between wholes, taking into consideration shared parts and their configurations. A point in the structure space denotes a particular part structure; regions in the space represent different general types of part structures. We argue that the structural space can represent prototype effects: structural types are formed around typical arrangements of parts. We also show how structure space captures the variations in part structure of a given concept across different domains. In addition, we discuss how some taxonomies of part-whole relations can be understood within our framework.
12
Yee M, Jones SS, Smith LB. Changes in visual object recognition precede the shape bias in early noun learning. Front Psychol 2012; 3:533. [PMID: 23227015 PMCID: PMC3512352 DOI: 10.3389/fpsyg.2012.00533] [Citation(s) in RCA: 23] [Impact Index Per Article: 1.9] [Received: 07/31/2012] [Accepted: 11/12/2012] [Indexed: 11/13/2022] Open
Abstract
Two of the most formidable skills that characterize human beings are language and our prowess in visual object recognition. They may also be developmentally intertwined. Two experiments, a large sample cross-sectional study and a smaller sample 6-month longitudinal study of 18- to 24-month-olds, tested a hypothesized developmental link between changes in visual object representation and noun learning. Previous findings in visual object recognition indicate that children's ability to recognize common basic level categories from sparse structural shape representations of object shape emerges between the ages of 18 and 24 months, is related to noun vocabulary size, and is lacking in children with language delay. Other research shows in artificial noun learning tasks that during this same developmental period, young children systematically generalize object names by shape, that this shape bias predicts future noun learning, and is lacking in children with language delay. The two experiments examine the developmental relation between visual object recognition and the shape bias for the first time. The results show that developmental changes in visual object recognition systematically precede the emergence of the shape bias. The results suggest a developmental pathway in which early changes in visual object recognition that are themselves linked to category learning enable the discovery of higher-order regularities in category structure and thus the shape bias in novel noun learning tasks. The proposed developmental pathway has implications for understanding the role of specific experience in the development of both visual object recognition and the shape bias in early noun learning.
Affiliation(s)
- Meagan Yee
  - Department of Psychological and Brain Sciences, Indiana University, Bloomington, IN, USA
- Susan S. Jones
  - Department of Psychological and Brain Sciences, Indiana University, Bloomington, IN, USA
- Linda B. Smith
  - Department of Psychological and Brain Sciences, Indiana University, Bloomington, IN, USA