1. Fu X, Franchak JM, MacNeill LA, Gunther KE, Borjon JI, Yurkovic-Harding J, Harding S, Bradshaw J, Pérez-Edgar KE. Implementing mobile eye tracking in psychological research: A practical guide. Behav Res Methods 2024; 56:8269-8288. PMID: 39147949; PMCID: PMC11525247; DOI: 10.3758/s13428-024-02473-6.
Abstract
Eye tracking provides direct, temporally and spatially sensitive measures of eye gaze. It can capture visual attention patterns from infancy through adulthood. However, commonly used screen-based eye tracking (SET) paradigms are limited in their depiction of how individuals process information as they interact with the environment in "real life". Mobile eye tracking (MET) records participant-perspective gaze in the context of active behavior. Recent technological developments in MET hardware enable researchers to capture egocentric vision as early as infancy and across the lifespan. However, challenges remain in MET data collection, processing, and analysis. The present paper aims to provide an introduction and practical guide for researchers new to the field, to facilitate the use of MET in psychological research with a wide range of age groups. First, we provide a general introduction to MET. Next, we briefly review MET studies in adults and children that provide new insights into attention and its roles in cognitive and socioemotional functioning. We then discuss technical issues relating to MET data collection and provide guidelines for data quality inspection, gaze annotations, data visualization, and statistical analyses. Lastly, we conclude by discussing future directions for MET implementation. Open-source programs for MET data quality inspection, data visualization, and analysis are shared publicly.
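Editor's note: the data-quality inspection step can be made concrete with a small worked example. Below is a minimal sketch, not the authors' released open-source tooling, of one common MET quality check: the proportion of frames with a valid gaze estimate within a rolling window. The `valid` column name, the 30 Hz frame rate, and the 80% threshold are assumptions for illustration.

```python
import numpy as np
import pandas as pd

def gaze_validity(df, window_s=5.0, hz=30, min_valid=0.8):
    """Flag stretches of a mobile eye-tracking recording with poor data quality.

    Assumes one row per video frame with a boolean `valid` column marking
    frames where the eye model produced a usable gaze estimate.
    """
    win = int(window_s * hz)
    prop = df["valid"].rolling(win, min_periods=win).mean()
    return df.assign(valid_prop=prop, flagged=prop < min_valid)

# Toy recording: 60 s at 30 Hz with a simulated dropout between 20 s and 30 s.
rng = np.random.default_rng(0)
valid = rng.random(1800) > 0.05
valid[600:900] = rng.random(300) > 0.6
rec = gaze_validity(pd.DataFrame({"valid": valid}))
print(f"{rec['flagged'].mean():.0%} of frames fall in low-quality windows")
```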
Affiliation(s)
- Xiaoxue Fu: Department of Psychology, University of South Carolina, Columbia, SC, USA
- John M Franchak: Department of Psychology, University of California Riverside, Riverside, CA, USA
- Leigha A MacNeill: Department of Medical Social Sciences, Feinberg School of Medicine, Northwestern University, Chicago, IL, USA; Institute for Innovations in Developmental Sciences, Northwestern University, Evanston, IL, USA
- Kelley E Gunther: Neuroscience and Cognitive Science Program, University of Maryland, College Park, MD, USA
- Jeremy I Borjon: Department of Psychology, University of Houston, Houston, TX, USA; Texas Institute for Measurement, Evaluation, and Statistics, University of Houston, Houston, TX, USA; Texas Center for Learning Disorders, University of Houston, Houston, TX, USA
- Samuel Harding: Department of Psychology, University of South Carolina, Columbia, SC, USA
- Jessica Bradshaw: Department of Psychology, University of South Carolina, Columbia, SC, USA
- Koraly E Pérez-Edgar: Department of Psychology, The Pennsylvania State University, University Park, PA, USA
2. Davidson G, Orhan AE, Lake BM. Spatial relation categorization in infants and deep neural networks. Cognition 2024; 245:105690. PMID: 38330851; DOI: 10.1016/j.cognition.2023.105690.
Abstract
Spatial relations, such as above, below, between, and containment, are important mediators in children's understanding of the world (Piaget, 1954). The development of these relational categories in infancy has been extensively studied (Quinn, 2003) yet little is known about their computational underpinnings. Using developmental tests, we examine the extent to which deep neural networks, pretrained on a standard vision benchmark or egocentric video captured from one baby's perspective, form categorical representations for visual stimuli depicting relations. Notably, the networks did not receive any explicit training on relations. We then analyze whether these networks recover similar patterns to ones identified in development, such as reproducing the relative difficulty of categorizing different spatial relations and different stimulus abstractions. We find that the networks we evaluate tend to recover many of the patterns observed with the simpler relations of "above versus below" or "between versus outside", but struggle to match developmental findings related to "containment". We identify factors in the choice of model architecture, pretraining data, and experimental design that contribute to the extent the networks match developmental patterns, and highlight experimental predictions made by our modeling results. Our results open the door to modeling infants' earliest categorization abilities with modern machine learning tools and demonstrate the utility and productivity of this approach.
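Editor's note: one way to picture the paper's test for categorical structure without any relation training is a linear probe on a frozen, pretrained backbone. The sketch below is illustrative rather than the authors' exact protocol; an ImageNet-pretrained ResNet-18 stands in for the pretrained networks, and random tensors stand in for rendered "above" vs. "below" stimuli.

```python
import torch
import torchvision.models as models
from sklearn.linear_model import LogisticRegression

# Frozen backbone pretrained on a standard vision benchmark (ImageNet here);
# it never receives any explicit training on spatial relations.
backbone = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)
backbone.fc = torch.nn.Identity()  # expose the penultimate embedding
backbone.eval()

@torch.no_grad()
def embed(images):
    return backbone(images).numpy()

# Random tensors stand in for controlled relation displays here.
stimuli = torch.rand(64, 3, 224, 224)
labels = torch.randint(0, 2, (64,)).numpy()  # 0 = "above", 1 = "below"

# The probe receives no relation signal beyond this linear readout.
probe = LogisticRegression(max_iter=1000).fit(embed(stimuli[:48]), labels[:48])
print("held-out probe accuracy:", probe.score(embed(stimuli[48:]), labels[48:]))
```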
Affiliation(s)
- Guy Davidson: Center for Data Science, New York University, United States of America
- A Emin Orhan: Center for Data Science, New York University, United States of America
- Brenden M Lake: Center for Data Science, New York University, United States of America; Department of Psychology, New York University, United States of America
3. Linsley D, Serre T. Fixing the problems of deep neural networks will require better training data and learning algorithms. Behav Brain Sci 2023; 46:e400. PMID: 38054333; DOI: 10.1017/s0140525x23001589.
Abstract
Bowers et al. argue that deep neural networks (DNNs) are poor models of biological vision because they often learn to rival human accuracy by relying on strategies that differ markedly from those of humans. We show that this problem is worsening as DNNs become larger in scale and more accurate, and we prescribe methods for building DNNs that can reliably model biological vision.
Affiliation(s)
- Drew Linsley: Department of Cognitive, Linguistic & Psychological Sciences, Carney Institute for Brain Science, Brown University, Providence, RI, USA. https://sites.brown.edu/drewlinsley
- Thomas Serre: Department of Cognitive, Linguistic & Psychological Sciences, Carney Institute for Brain Science, Brown University, Providence, RI, USA. https://serre-lab.clps.brown.edu
4. Yovel G, Abudarham N. Why psychologists should embrace rather than abandon DNNs. Behav Brain Sci 2023; 46:e414. PMID: 38054326; DOI: 10.1017/s0140525x2300167x.
Abstract
Deep neural networks (DNNs) are powerful computational models, which generate complex, high-level representations that were missing in previous models of human cognition. By studying these high-level representations, psychologists can now gain new insights into the nature and origin of human high-level vision, which was not possible with traditional handcrafted models. Abandoning DNNs would be a huge oversight for psychological sciences.
Affiliation(s)
- Galit Yovel: School of Psychological Sciences, Tel Aviv University, Tel Aviv, Israel; Sagol School of Neuroscience, Tel Aviv University, Tel Aviv, Israel. https://people.socsci.tau.ac.il/mu/galityovel/
- Naphtali Abudarham: School of Psychological Sciences, Tel Aviv University, Tel Aviv, Israel
5. Bakopoulou M, Lorenz MG, Forbes SH, Tremlin R, Bates J, Samuelson LK. Vocabulary and automatic attention: The relation between novel words and gaze dynamics in noun generalization. Dev Sci 2023; 26:e13399. PMID: 37072679; PMCID: PMC10582201; DOI: 10.1111/desc.13399.
Abstract
Words direct visual attention in infants, children, and adults, presumably by activating representations of referents that then direct attention to matching stimuli in the visual scene. Novel, unknown words have also been shown to direct attention, likely via the activation of more general representations of naming events. To examine the critical issue of how novel words and visual attention interact to support word learning, we coded frame-by-frame the gaze of 17- to 31-month-old children (n = 66, 38 females) as they generalized novel nouns. We replicate prior findings of more attention to shape when generalizing novel nouns, and a relation to vocabulary development. However, we also find that following a naming event, children who produce fewer nouns take longer to look at the objects they eventually select and make more transitions between objects before making a generalization decision. Children who produce more nouns look to the objects they eventually select more quickly following the naming event and make fewer looking transitions. We discuss these findings in the context of prior proposals regarding children's few-shot category learning, and a developmental cascade of multiple perceptual, cognitive, and word-learning processes that may operate in cases of both typical development and language delay.
Research highlights:
- Examined how novel words guide visual attention by coding frame-by-frame where children look when asked to generalize novel names.
- Gaze patterns differed with vocabulary size: children with smaller vocabularies attended to generalization targets more slowly and did more comparison than those with larger vocabularies.
- Demonstrates a relationship between vocabulary size and attention to object properties during naming.
- This work has implications for looking-based tests of early cognition and for our understanding of children's few-shot category learning.
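Editor's note: the latency and transition measures described above can be computed directly from frame-by-frame gaze codes. The following is a minimal sketch assuming a simple coding scheme ("target", "distractor", "other") and 30 fps video; it is not the authors' coding pipeline.

```python
from itertools import groupby

def gaze_dynamics(frames, naming_frame, fps=30):
    """Summarize gaze dynamics after a naming event.

    `frames` is a frame-by-frame gaze code (one label per video frame).
    Returns the latency (s) to the first target look and the number of
    transitions between looked-at objects after the naming event.
    """
    post = frames[naming_frame:]
    latency = next((i / fps for i, f in enumerate(post) if f == "target"), None)
    runs = [label for label, _ in groupby(post) if label != "other"]
    transitions = max(len(runs) - 1, 0)
    return latency, transitions

# Hypothetical trial: naming at frame 10, one look away before selecting.
codes = ["other"] * 10 + ["distractor"] * 15 + ["target"] * 20 + ["distractor"] * 5
print(gaze_dynamics(codes, naming_frame=10))  # (0.5 s latency, 2 transitions)
```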
Affiliation(s)
- Megan G Lorenz: Department of Psychology, Augustana College, Rock Island, Illinois, USA
- Samuel H Forbes: Department of Psychology, Durham University, Durham, England
- Rachel Tremlin: School of Psychology, University of East Anglia, Norwich, England
- Jessica Bates: School of Psychology, University of East Anglia, Norwich, England
6. Application of Machine Learning in Intelligent Medical Image Diagnosis and Construction of Intelligent Service Process. Comput Intell Neurosci 2022; 2022:9152605. PMID: 36619816; PMCID: PMC9812610; DOI: 10.1155/2022/9152605.
Abstract
The introduction of digital technology in the healthcare industry is marked by ongoing difficulties with implementation and use. Progress in unifying different healthcare systems has been slow, and much of the globe still lacks a fully integrated healthcare system. As a result, it is critical and advantageous for healthcare providers to understand the fundamental ideas of AI in order to design and deliver their own AI-powered technology. AI is commonly defined as the capacity of machines to mimic human cognitive functions. By combining computer science, algorithms, machine learning, and data science, it can tackle jobs with performance equivalent or superior to that of humans. The healthcare system is a dynamic and evolving environment, and medical experts are constantly confronted with new issues, shifting duties, and frequent interruptions. Because of this variation, illness diagnosis often becomes a secondary concern for healthcare professionals. Furthermore, clinical interpretation of medical information is a cognitively demanding task, not just for seasoned experts but also for those with varying or limited skills, such as junior assistant doctors. In this paper, we present a comparative analysis of state-of-the-art deep learning methods for medical imaging diagnosis, evaluating important characteristics such as interpretability, visualization, semantic data, and quantification of logical relationships in medical data. A glaucoma diagnosis system is then discussed in detail via qualitative and quantitative approaches. Finally, applications and future prospects are discussed.
7. Saglietti L, Mannelli SS, Saxe A. An analytical theory of curriculum learning in teacher-student networks. J Stat Mech 2022; 2022:114014. PMID: 37817944; PMCID: PMC10561397; DOI: 10.1088/1742-5468/ac9b3c.
Abstract
In animals and humans, curriculum learning (presenting data in a curated order) is critical to rapid learning and effective pedagogy. A long history of experiments has demonstrated the impact of curricula in a variety of animals but, despite its ubiquitous presence, a theoretical understanding of the phenomenon is still lacking. Surprisingly, in contrast to animal learning, curriculum strategies are not widely used in machine learning, and recent simulation studies conclude that curricula are moderately effective or even ineffective in most cases. This stark difference in the importance of curriculum raises a fundamental theoretical question: when and why does curriculum learning help? In this work, we analyse a prototypical neural network model of curriculum learning in the high-dimensional limit, employing statistical physics methods. We study a task in which a sparse set of informative features is embedded amidst a large set of noisy features. We analytically derive average learning trajectories for simple neural networks on this task, which establish a clear speed benefit for curriculum learning in the online setting. However, when training experiences can be stored and replayed (for instance, during sleep), the advantage of curriculum in standard neural networks disappears, in line with observations from the deep learning literature. Inspired by synaptic consolidation techniques developed to combat catastrophic forgetting, we propose curriculum-aware algorithms that consolidate synapses at curriculum change points and investigate whether this can boost the benefits of curricula. We derive generalisation performance as a function of consolidation strength (implemented as an L2 regularisation/elastic coupling connecting learning phases), and show that curriculum-aware algorithms can yield a large improvement in test performance. Our reduced analytical descriptions help reconcile apparently conflicting empirical results, trace regimes where curriculum learning yields the largest gains, and provide experimentally accessible predictions for the impact of task parameters on curriculum benefits. More broadly, our results suggest that fully exploiting a curriculum may require explicit adjustments in the loss.
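Editor's note: the core algorithmic idea, an elastic coupling that penalizes drift from the weights reached at the curriculum change point, can be sketched in a few lines. The toy task below (a linear student, with "easy" batches that suppress the noisy features) is an assumption-laden stand-in for the paper's teacher-student setup, not its analytical model.

```python
import numpy as np

rng = np.random.default_rng(1)
d, d_inf = 100, 10                        # total vs. informative features
w_star = np.zeros(d)
w_star[:d_inf] = 1.0                      # sparse informative "teacher" weights

def batch(n, noise):
    """Easy examples (noise < 1) suppress the uninformative features."""
    X = rng.normal(size=(n, d))
    X[:, d_inf:] *= noise
    y = X @ w_star + 0.1 * rng.normal(size=n)
    return X, y

def train(phases, lam=0.0, lr=0.01, steps=2000):
    w = np.zeros(d)
    anchor = None
    for noise in phases:
        for _ in range(steps):
            X, y = batch(16, noise)
            grad = X.T @ (X @ w - y) / len(y)
            if anchor is not None:        # elastic coupling to the easy-phase weights
                grad += lam * (w - anchor)
            w -= lr * grad
        anchor = w.copy()                 # consolidate at the curriculum switch
    return w

X_test, y_test = batch(1000, noise=1.0)
for lam in (0.0, 1.0):
    w = train(phases=[0.2, 1.0], lam=lam)
    print(f"lam={lam}: test MSE={np.mean((X_test @ w - y_test) ** 2):.3f}")
```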
Affiliation(s)
- Luca Saglietti: Institute for Data Science and Analytics, Bocconi University, Italy
- Stefano Sarao Mannelli: Gatsby Computational Neuroscience Unit and Sainsbury Wellcome Centre, University College London, United Kingdom
- Andrew Saxe: Institute for Data Science and Analytics, Bocconi University, Italy; FAIR, Meta AI, United States of America
8. Baran OB, Cinbis RG. Semantics-driven attentive few-shot learning over clean and noisy samples. Neurocomputing 2022. DOI: 10.1016/j.neucom.2022.09.121.
9. Mahmoud S, Billing E, Svensson H, Thill S. Where to from here? On the future development of autonomous vehicles from a cognitive systems perspective. Cogn Syst Res 2022. DOI: 10.1016/j.cogsys.2022.09.005.
10. Matthews CM, Mondloch CJ, Lewis-Dennis F, Laurence S. Children's ability to recognize their parent's face improves with age. J Exp Child Psychol 2022; 223:105480. PMID: 35753197; DOI: 10.1016/j.jecp.2022.105480.
Abstract
Adults are experts at recognizing familiar faces across images that incorporate natural within-person variability in appearance (i.e., ambient images). Little is known about children's ability to do so. In the current study, we investigated whether 4- to 7-year-olds (n = 56) could recognize images of their own parent, a person with whom children have had abundant exposure in a variety of different contexts. Children were asked to identify images of their parent that were intermixed with images of other people. We included images of each parent taken both before and after their child was born, to manipulate how close the images were to the child's own experience. When viewing before-birth images, 4- and 5-year-olds were less sensitive to identity than were older children; sensitivity did not differ when viewing images taken after the child was born. These findings suggest that even with the most familiar face, 4- and 5-year-olds have difficulty recognizing instances that go beyond their direct experience. We discuss two factors that may contribute to the prolonged development of familiar face recognition.
Affiliation(s)
- Sarah Laurence: Keele University, Keele, Staffordshire ST5 5BG, UK; The Open University, Milton Keynes MK7 6AA, UK
11. Lessons from infant learning for unsupervised machine learning. Nat Mach Intell 2022. DOI: 10.1038/s42256-022-00488-2.
12. Nicholson DA, Prinz AA. Could simplified stimuli change how the brain performs visual search tasks? A deep neural network study. J Vis 2022; 22:3. PMID: 35675057; PMCID: PMC9187944; DOI: 10.1167/jov.22.7.3.
Abstract
Visual search is a complex behavior influenced by many factors. To control for these factors, many studies use highly simplified stimuli. However, the statistics of these stimuli are very different from the statistics of the natural images that the human visual system is optimized, by evolution and experience, to perceive. Could this difference change search behavior? If so, simplified stimuli may contribute to effects typically attributed to cognitive processes, such as selective attention. Here we use deep neural networks to test how optimizing models for the statistics of one distribution of images constrains performance on a task using images from a different distribution. We train four deep neural network architectures on one of three source datasets (natural images, faces, or X-ray images) and then adapt them to a visual search task using simplified stimuli. This adaptation produces models that exhibit performance limitations similar to humans, whereas models trained on the search task alone exhibit no such limitations. However, we also find that deep neural networks trained to classify natural images exhibit similar limitations when adapted to a search task that uses a different set of natural images. Therefore, the distribution of data alone cannot explain this effect. We discuss how future work might integrate an optimization-based approach into existing models of visual search behavior.
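Editor's note: the train-on-one-distribution, adapt-to-another logic is essentially transfer learning. A minimal sketch follows, with ImageNet weights standing in for the natural-image source dataset, random tensors for the simplified search displays, and a frozen backbone as one possible adaptation choice; the paper's exact adaptation procedure may differ.

```python
import torch
import torch.nn as nn
import torchvision.models as models

# Backbone optimized for one image distribution (ImageNet stands in for the
# "natural images" source dataset).
net = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)
net.fc = nn.Linear(net.fc.in_features, 2)  # target present vs. absent

# Adapt to the search task: keep the source-optimized features frozen and
# train only the new readout, so any limits inherited from the source
# statistics carry over to the search task.
for p in net.parameters():
    p.requires_grad = False
for p in net.fc.parameters():
    p.requires_grad = True

opt = torch.optim.Adam(net.fc.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()
search_displays = torch.rand(8, 3, 224, 224)  # stand-in simplified stimuli
present = torch.randint(0, 2, (8,))
for _ in range(5):
    opt.zero_grad()
    loss = loss_fn(net(search_displays), present)
    loss.backward()
    opt.step()
print("adapted-model loss:", float(loss))
```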
Affiliation(s)
- David A Nicholson: Department of Biology, O. Wayne Rollins Research Center, Emory University, Atlanta, Georgia, USA
- Astrid A Prinz: Department of Biology, O. Wayne Rollins Research Center, Emory University, Atlanta, Georgia, USA
13. Tiedemann H, Morgenstern Y, Schmidt F, Fleming RW. One-shot generalization in humans revealed through a drawing task. eLife 2022; 11:e75485. PMID: 35536739; PMCID: PMC9090327; DOI: 10.7554/eLife.75485.
Abstract
Humans have the amazing ability to learn new visual concepts from just a single exemplar. How we achieve this remains mysterious. State-of-the-art theories suggest observers rely on internal 'generative models', which not only describe observed objects but can also synthesize novel variations. However, compelling evidence for generative models in human one-shot learning remains sparse. In most studies, participants merely compare candidate objects created by the experimenters, rather than generating their own ideas. Here, we overcame this key limitation by presenting participants with 2D 'Exemplar' shapes and asking them to draw their own 'Variations' belonging to the same class. The drawings reveal that participants inferred, and synthesized, genuinely novel categories that were far more varied than mere copies. Yet there was striking agreement between participants about which shape features were most distinctive, and these tended to be preserved in the drawn Variations. Indeed, swapping distinctive parts caused objects to swap apparent category. Our findings suggest that internal generative models are key to how humans generalize from single exemplars. When observers see a novel object for the first time, they identify its most distinctive features and infer a generative model of its shape, allowing them to mentally synthesize plausible variants.
Affiliation(s)
- Henning Tiedemann: Department of Experimental Psychology, Justus Liebig University Giessen, Giessen, Germany
- Yaniv Morgenstern: Department of Experimental Psychology, Justus Liebig University Giessen, Giessen, Germany; Laboratory of Experimental Psychology, University of Leuven (KU Leuven), Leuven, Belgium
- Filipp Schmidt: Department of Experimental Psychology, Justus Liebig University Giessen, Giessen, Germany; Center for Mind, Brain and Behavior (CMBB), University of Marburg and Justus Liebig University Giessen, Giessen, Germany
- Roland W Fleming: Department of Experimental Psychology, Justus Liebig University Giessen, Giessen, Germany; Center for Mind, Brain and Behavior (CMBB), University of Marburg and Justus Liebig University Giessen, Giessen, Germany
14. Konkle T, Alvarez GA. A self-supervised domain-general learning framework for human ventral stream representation. Nat Commun 2022; 13:491. PMID: 35078981; PMCID: PMC8789817; DOI: 10.1038/s41467-022-28091-4.
Abstract
Anterior regions of the ventral visual stream encode substantial information about object categories. Are top-down category-level forces critical for arriving at this representation, or can this representation be formed purely through domain-general learning of natural image structure? Here we present a fully self-supervised model which learns to represent individual images, rather than categories, such that views of the same image are embedded nearby in a low-dimensional feature space, distinctly from other recently encountered views. We find that category information implicitly emerges in the local similarity structure of this feature space. Further, these models learn hierarchical features which capture the structure of brain responses across the human ventral visual stream, on par with category-supervised models. These results provide computational support for a domain-general framework guiding the formation of visual representation, where the proximate goal is not explicitly about category information, but is instead to learn unique, compressed descriptions of the visual world.
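Editor's note: the objective described here, embedding views of the same image nearby and away from other recently encountered views, belongs to the family of instance-level contrastive losses. Below is a minimal NT-Xent-style sketch of that family, with random embeddings as stand-ins; it is not the authors' exact loss or architecture.

```python
import torch
import torch.nn.functional as F

def instance_contrastive_loss(z1, z2, tau=0.1):
    """Instance-level contrastive loss: two views of the same image
    (matching rows of z1 and z2) are pulled together, while all other
    views in the batch act as negatives.
    """
    z = F.normalize(torch.cat([z1, z2]), dim=1)      # 2N x d, unit norm
    sim = z @ z.T / tau                              # pairwise similarities
    sim.fill_diagonal_(float("-inf"))                # exclude self-similarity
    n = z1.shape[0]
    targets = torch.cat([torch.arange(n, 2 * n), torch.arange(n)])
    return F.cross_entropy(sim, targets)

# Two "views" of 32 images in a 128-d embedding space (random stand-ins).
z1, z2 = torch.randn(32, 128), torch.randn(32, 128)
print(float(instance_contrastive_loss(z1, z2)))
```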
Affiliation(s)
- Talia Konkle: Department of Psychology & Center for Brain Science, Harvard University, Cambridge, MA, USA
- George A Alvarez: Department of Psychology & Center for Brain Science, Harvard University, Cambridge, MA, USA
15. Intuitive physics learning in a deep-learning model inspired by developmental psychology. Nat Hum Behav 2022; 6:1257-1267. PMID: 35817932; PMCID: PMC9489531; DOI: 10.1038/s41562-022-01394-8.
Abstract
'Intuitive physics' enables our pragmatic engagement with the physical world and forms a key component of 'common sense' aspects of thought. Current artificial intelligence systems pale in their understanding of intuitive physics, in comparison to even very young children. Here we address this gap between humans and machines by drawing on the field of developmental psychology. First, we introduce and open-source a machine-learning dataset designed to evaluate conceptual understanding of intuitive physics, adopting the violation-of-expectation (VoE) paradigm from developmental psychology. Second, we build a deep-learning system that learns intuitive physics directly from visual data, inspired by studies of visual cognition in children. We demonstrate that our model can learn a diverse set of physical concepts, which depends critically on object-level representations, consistent with findings from developmental psychology. We consider the implications of these results both for AI and for research on human cognition.
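Editor's note: the violation-of-expectation readout can be summarized compactly: a model that has acquired a physical concept should show greater prediction error for the impossible probe, the analogue of infants' longer looking. A minimal sketch of such a readout, with hypothetical per-frame errors standing in for a video-prediction model's output:

```python
import numpy as np

def voe_surprise(errors_expected, errors_violating):
    """Violation-of-expectation readout for a predictive model.

    Inputs are per-frame prediction errors for a physically possible probe
    and its matched impossible probe; the output is a relative surprise
    score, where values > 0 mean the model is "surprised" by the violation.
    """
    s_exp, s_vio = np.sum(errors_expected), np.sum(errors_violating)
    return (s_vio - s_exp) / (s_vio + s_exp)

# Stand-in per-frame errors; the violating probe's error spikes when the
# impossible event occurs (frame 40 onward).
rng = np.random.default_rng(0)
expected = rng.gamma(2.0, 0.1, size=60)
violating = np.concatenate([expected[:40], expected[40:] + 0.5])
print(f"relative surprise: {voe_surprise(expected, violating):.2f}")
```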
16. Mazzu-Nascimento T, Evangelista DN, Abubakar O, Roscani MG, Aguilar RS, Chachá SGF, Rosa PRD, Silva DF. Smartphone-Based Screening for Cardiovascular Diseases: A Trend? Int J Cardiovasc Sci 2021. DOI: 10.36660/ijcs.20210096.
17. Mendoza JK, Fausey CM. Quantifying Everyday Ecologies: Principles for Manual Annotation of Many Hours of Infants' Lives. Front Psychol 2021; 12:710636. PMID: 34552533; PMCID: PMC8450442; DOI: 10.3389/fpsyg.2021.710636.
Abstract
Everyday experiences are the experiences available to shape developmental change. Remarkable advances in devices used to record infants' and toddlers' everyday experiences, as well as in repositories to aggregate and share such recordings across teams of theorists, have yielded a potential gold mine of insights to spur next-generation theories of experience-dependent change. Making full use of these advances, however, currently requires manual annotation. Manually annotating many hours of everyday life is a dedicated pursuit requiring significant time and resources, and in many domains is an endeavor currently lacking foundational facts to guide potentially consequential implementation decisions. These realities make manual annotation a frequent barrier to discoveries, as theorists instead opt for narrower scoped activities. Here, we provide theorists with a framework for manually annotating many hours of everyday life designed to reduce both theoretical and practical overwhelm. We share insights based on our team's recent adventures in the previously uncharted territory of everyday music. We identify principles, and share implementation examples and tools, to help theorists achieve scalable solutions to challenges that are especially fierce when annotating extended timescales. These principles for quantifying everyday ecologies will help theorists collectively maximize return on investment in databases of everyday recordings and will enable a broad community of scholars—across institutions, skillsets, experiences, and working environments—to make discoveries about the experiences upon which development may depend.
Affiliation(s)
- Jennifer K Mendoza: Department of Psychology, University of Oregon, Eugene, OR, United States
- Caitlin M Fausey: Department of Psychology, University of Oregon, Eugene, OR, United States
18. O'Toole AJ, Castillo CD. Face recognition by humans and machines: Three fundamental advances from deep learning. Annu Rev Vis Sci 2021.
Abstract
Deep learning models currently achieve human levels of performance on real-world face recognition tasks. We review scientific progress in understanding human face processing using computational approaches based on deep learning. This review is organized around three fundamental advances. First, deep networks trained for face identification generate a representation that retains structured information about the face (e.g., identity, demographics, appearance, social traits, expression) and the input image (e.g., viewpoint, illumination). This forces us to rethink the universe of possible solutions to the problem of inverse optics in vision. Second, deep learning models indicate that high-level visual representations of faces cannot be understood in terms of interpretable features. This has implications for understanding neural tuning and population coding in the high-level visual cortex. Third, learning in deep networks is a multistep process that forces theoretical consideration of diverse categories of learning that can overlap, accumulate over time, and interact. Diverse learning types are needed to model the development of human face processing skills, cross-race effects, and familiarity with individual faces.
Affiliation(s)
- Alice J O'Toole: School of Behavioral and Brain Sciences, The University of Texas at Dallas, Richardson, Texas 75080, USA
- Carlos D Castillo: Department of Electrical and Computer Engineering, Johns Hopkins University, Baltimore, Maryland 21218, USA
19. Mendoza JK, Fausey CM. Everyday music in infancy. Dev Sci 2021; 24:e13122. PMID: 34170059; PMCID: PMC8596421; DOI: 10.1111/desc.13122.
Abstract
Infants enculturate to their soundscape over the first year of life, yet theories of how they do so rarely make contact with details about the sounds available in everyday life. Here, we report on properties of a ubiquitous early ecology in which foundational skills get built: music. We captured daylong recordings from 35 infants ages 6–12 months at home and fully double‐coded 467 h of everyday sounds for music and its features, tunes, and voices. Analyses of this first‐of‐its‐kind corpus revealed two distributional properties of infants’ everyday musical ecology. First, infants encountered vocal music in over half, and instrumental in over three‐quarters, of everyday music. Live sources generated one‐third, and recorded sources three‐quarters, of everyday music. Second, infants did not encounter each individual tune and voice in their day equally often. Instead, the most available identity cumulated to many more seconds of the day than would be expected under a uniform distribution. These properties of everyday music in human infancy are different from what is discoverable in environments highly constrained by context (e.g., laboratories) and time (e.g., minutes rather than hours). Together with recent insights about the everyday motor, language, and visual ecologies of infancy, these findings reinforce an emerging priority to build theories of development that address the opportunities and challenges of real input encountered by real learners.
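Editor's note: the second distributional property, that the most available tune cumulates to far more of the day than a uniform distribution would predict, corresponds to a simple comparison. A minimal sketch with hypothetical per-tune durations:

```python
from collections import Counter

def top_identity_share(seconds_by_tune):
    """Compare the most available tune's share of everyday music time
    with the share expected if all encountered tunes were heard equally.
    """
    total = sum(seconds_by_tune.values())
    top = max(seconds_by_tune.values())
    uniform = total / len(seconds_by_tune)
    return top / total, uniform / total

# Hypothetical seconds of each tune heard across one infant's day.
day = Counter({"tune_a": 900, "tune_b": 250, "tune_c": 120,
               "tune_d": 40, "tune_e": 10})
observed, expected = top_identity_share(day)
print(f"top tune: {observed:.0%} of music time vs {expected:.0%} under uniform")
```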
Affiliation(s)
- Caitlin M Fausey: Department of Psychology, University of Oregon, Eugene, Oregon, USA
20. Snow JC, Culham JC. The Treachery of Images: How Realism Influences Brain and Behavior. Trends Cogn Sci 2021; 25:506-519. PMID: 33775583; PMCID: PMC10149139; DOI: 10.1016/j.tics.2021.02.008.
Abstract
Although the cognitive sciences aim to ultimately understand behavior and brain function in the real world, for historical and practical reasons, the field has relied heavily on artificial stimuli, typically pictures. We review a growing body of evidence that both behavior and brain function differ between image proxies and real, tangible objects. We also propose a new framework for immersive neuroscience to combine two approaches: (i) the traditional build-up approach of gradually combining simplified stimuli, tasks, and processes; and (ii) a newer tear-down approach that begins with reality and compelling simulations such as virtual reality to determine which elements critically affect behavior and brain processing.
Affiliation(s)
- Jacqueline C Snow: Department of Psychology, University of Nevada Reno, Reno, NV 89557, USA
- Jody C Culham: Department of Psychology, University of Western Ontario, London, Ontario, N6A 5C2, Canada; Brain and Mind Institute, Western Interdisciplinary Research Building, University of Western Ontario, London, Ontario, N6A 3K7, Canada
21. Carvalho PF, Chen CH, Yu C. The distributional properties of exemplars affect category learning and generalization. Sci Rep 2021; 11:11263. PMID: 34050226; PMCID: PMC8163832; DOI: 10.1038/s41598-021-90743-0.
Abstract
What we learn about the world is affected by the input we receive. Many extant category learning studies use uniform distributions as input, in which each exemplar in a category is presented the same number of times. Another common assumption about the input used in previous studies is that exemplars from the same category form a roughly normal distribution. However, recent corpus studies suggest that real-world category input tends to be organized around skewed distributions. We conducted three experiments to examine the effects of the distributional properties of the input on category learning and generalization. Across all studies, skewed input distributions resulted in broader generalization than normal input distributions. Uniform distributions also resulted in broader generalization than normal input distributions. Our results not only suggest that current category learning theories may underestimate category generalization but also challenge current theories to explain category learning in the real world, with skewed distributions instead of the normal or uniform distributions often used in experimental studies.
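Editor's note: the three input regimes contrasted in the experiments (uniform, normal, and skewed exemplar frequencies) can be illustrated by sampling training sequences over a single feature dimension. The nine exemplar values and the Zipf-like weighting below are assumptions for illustration, not the authors' stimuli:

```python
import numpy as np

rng = np.random.default_rng(7)
exemplar_values = np.linspace(-2, 2, 9)  # 9 exemplars along one feature dimension

def sample(dist, n=200):
    """Draw a training sequence of exemplars under each input regime."""
    if dist == "uniform":
        probs = np.ones(9)
    elif dist == "normal":                # central exemplars dominate
        probs = np.exp(-exemplar_values ** 2 / 2)
    else:                                 # "skewed": a few exemplars dominate
        probs = 1.0 / np.arange(1, 10)
    probs = probs / probs.sum()
    return rng.choice(exemplar_values, size=n, p=probs)

for dist in ("uniform", "normal", "skewed"):
    seq = sample(dist)
    print(f"{dist:>7}: mean={seq.mean():+.2f}, sd={seq.std():.2f}")
```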
Affiliation(s)
- Paulo F Carvalho: Human-Computer Interaction Institute, Carnegie Mellon University, Pittsburgh, PA, USA
- Chi-Hsin Chen: Department of Otolaryngology-Head and Neck Surgery, The Ohio State University, Columbus, OH, USA
- Chen Yu: Department of Psychology, The University of Texas at Austin, Austin, TX, USA
22. Unsupervised learning predicts human perception and misperception of gloss. Nat Hum Behav 2021; 5:1402-1417. PMID: 33958744; PMCID: PMC8526360; DOI: 10.1038/s41562-021-01097-6.
Abstract
Reflectance, lighting and geometry combine in complex ways to create images. How do we disentangle these to perceive individual properties, such as surface glossiness? We suggest that brains disentangle properties by learning to model statistical structure in proximal images. To test this hypothesis, we trained unsupervised generative neural networks on renderings of glossy surfaces and compared their representations with human gloss judgements. The networks spontaneously cluster images according to distal properties such as reflectance and illumination, despite receiving no explicit information about these properties. Intriguingly, the resulting representations also predict the specific patterns of ‘successes' and ‘errors' in human perception. Linearly decoding specular reflectance from the model's internal code predicts human gloss perception better than ground truth, supervised networks or control models, and it predicts, on an image-by-image basis, illusions of gloss perception caused by interactions between material, shape and lighting. Unsupervised learning may underlie many perceptual dimensions in vision and beyond.
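Editor's note: the key test, linearly decoding specular reflectance from a model's internal code, can be sketched as follows. The 20-d latent code here is a synthetic stand-in that mixes reflectance with nuisance factors, as an unsupervised representation would; it is not the paper's network.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(3)

# Stand-ins: 500 images summarized by a 20-d "unsupervised" latent code in
# which reflectance is only partially disentangled from lighting/geometry.
reflectance = rng.uniform(0, 1, 500)
nuisance = rng.normal(size=(500, 19))
latent = np.column_stack([reflectance + 0.3 * rng.normal(size=500), nuisance])

# Linear decoding: a linear readout from the code recovers reflectance; in
# the paper it is this readout's successes and errors, not ground truth,
# that best predicted human gloss judgements.
decoder = LinearRegression().fit(latent[:400], reflectance[:400])
pred = decoder.predict(latent[400:])
print("decoding r:", np.corrcoef(pred, reflectance[400:])[0, 1].round(2))
```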
23. Zhuang C, Yan S, Nayebi A, Schrimpf M, Frank MC, DiCarlo JJ, Yamins DLK. Unsupervised neural network models of the ventral visual stream. Proc Natl Acad Sci U S A 2021; 118:e2014196118. PMID: 33431673; PMCID: PMC7826371; DOI: 10.1073/pnas.2014196118.
Abstract
Deep neural networks currently provide the best quantitative models of the response patterns of neurons throughout the primate ventral visual stream. However, such networks have remained implausible as a model of the development of the ventral stream, in part because they are trained with supervised methods requiring many more labels than are accessible to infants during development. Here, we report that recent rapid progress in unsupervised learning has largely closed this gap. We find that neural network models learned with deep unsupervised contrastive embedding methods achieve neural prediction accuracy in multiple ventral visual cortical areas that equals or exceeds that of models derived using today's best supervised methods and that the mapping of these neural network models' hidden layers is neuroanatomically consistent across the ventral stream. Strikingly, we find that these methods produce brain-like representations even when trained solely with real human child developmental data collected from head-mounted cameras, despite the fact that these datasets are noisy and limited. We also find that semisupervised deep contrastive embeddings can leverage small numbers of labeled examples to produce representations with substantially improved error-pattern consistency to human behavior. Taken together, these results illustrate a use of unsupervised learning to provide a quantitative model of a multiarea cortical brain system and present a strong candidate for a biologically plausible computational theory of primate sensory learning.
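Editor's note: neural prediction accuracy, the paper's central benchmark, is typically measured by fitting a cross-validated regularized linear readout from a model layer's activations to each neuron's responses. A minimal sketch with synthetic stand-ins for both:

```python
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(2)

# Stand-ins: one model layer's activations (1000 images x 256 units) and a
# single ventral-stream neuron driven by a sparse mix of those features
# plus noise.
acts = rng.normal(size=(1000, 256))
w = np.zeros(256)
w[rng.choice(256, 10, replace=False)] = rng.normal(size=10)
neuron = acts @ w + 0.5 * rng.normal(size=1000)

# Neural prediction accuracy: cross-validated fit of a regularized linear
# readout from model features to the neuron's responses.
scores = cross_val_score(Ridge(alpha=1.0), acts, neuron, cv=5, scoring="r2")
print(f"cross-validated R^2: {scores.mean():.2f}")
```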
Affiliation(s)
- Chengxu Zhuang: Department of Psychology, Stanford University, Stanford, CA 94305
- Siming Yan: Department of Computer Science, The University of Texas at Austin, Austin, TX 78712
- Aran Nayebi: Neurosciences PhD Program, Stanford University, Stanford, CA 94305
- Martin Schrimpf: Brain and Cognitive Sciences, Massachusetts Institute of Technology, Cambridge, MA 02139
- Michael C Frank: Department of Psychology, Stanford University, Stanford, CA 94305
- James J DiCarlo: Brain and Cognitive Sciences, Massachusetts Institute of Technology, Cambridge, MA 02139
- Daniel L K Yamins: Department of Psychology, Stanford University, Stanford, CA 94305; Department of Computer Science, Stanford University, Stanford, CA 94305; Wu Tsai Neurosciences Institute, Stanford University, Stanford, CA 94305
24. Learning sparse and meaningful representations through embodiment. Neural Netw 2020; 134:23-41. PMID: 33279863; DOI: 10.1016/j.neunet.2020.11.004.
Abstract
How do humans acquire a meaningful understanding of the world with little to no supervision or semantic labels provided by the environment? Here we investigate embodiment with a closed loop between action and perception as one key component in this process. We take a close look at the representations learned by a deep reinforcement learning agent that is trained with high-dimensional visual observations collected in a 3D environment with very sparse rewards. We show that this agent learns stable representations of meaningful concepts such as doors without receiving any semantic labels. Our results show that the agent learns to represent the action relevant information, extracted from a simulated camera stream, in a wide variety of sparse activation patterns. The quality of the representations learned shows the strength of embodied learning and its advantages over fully supervised approaches.
25. Yuan L, Xiang V, Crandall D, Smith L. Learning the generative principles of a symbol system from limited examples. Cognition 2020; 200:104243. DOI: 10.1016/j.cognition.2020.104243.
26. Wood JN, Wood SMW. One-shot learning of view-invariant object representations in newborn chicks. Cognition 2020; 199:104192. PMID: 32199170; DOI: 10.1016/j.cognition.2020.104192.
Abstract
Can newborn brains perform one-shot learning? To address this question, we reared newborn chicks in strictly controlled environments containing a single view of a single object, then tested their object recognition performance across 24 uniformly-spaced viewpoints. We found that chicks can build view-invariant object representations from a single view of an object: a case of one-shot learning in newborn brains. Chicks can also build the same view-invariant object representation from different views of an object, showing that newborn brains converge on common object representations from different sets of sensory inputs. Finally, by rearing chicks with larger numbers of object views, we found that chicks develop enhanced recognition for familiar views. These results illuminate the earliest stages of object recognition, revealing (1) powerful one-shot learning that builds invariant object representations from the first views of an object and (2) view-based learning that enriches object representations, producing enhanced recognition for familiar views.
Affiliation(s)
- Justin N Wood: Department of Informatics, Indiana University, 700 N Woodlawn Ave., Bloomington, IN 47408, United States of America
- Samantha M W Wood: Department of Informatics, Indiana University, 700 N Woodlawn Ave., Bloomington, IN 47408, United States of America
27. Frankenhuis WE, Nettle D, Dall SRX. A case for environmental statistics of early-life effects. Philos Trans R Soc Lond B Biol Sci 2020; 374:20180110. PMID: 30966883; PMCID: PMC6460088; DOI: 10.1098/rstb.2018.0110.
Abstract
There is enduring debate over the question of which early-life effects are adaptive and which ones are not. Mathematical modelling shows that early-life effects can be adaptive in environments that have particular statistical properties, such as reliable cues to current conditions and high autocorrelation of environmental states. However, few empirical studies have measured these properties, leading to an impasse. Progress, therefore, depends on research that quantifies cue reliability and autocorrelation of environmental parameters in real environments. These statistics may be different for social and non-social aspects of the environment. In this paper, we summarize evolutionary models of early-life effects. Then, we discuss empirical data on environmental statistics from a range of disciplines. We highlight cases where data on environmental statistics have been used to test competing explanations of early-life effects. We conclude by providing guidelines for new data collection and reflections on future directions. This article is part of the theme issue ‘Developing differences: early-life effects and evolutionary medicine'.
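Editor's note: the two environmental statistics the paper asks empiricists to measure, cue reliability and autocorrelation of environmental states, are both simple correlations once a time series is in hand. A minimal sketch on a simulated AR(1) environment (the 0.9 coefficient and the cue noise level are arbitrary illustrative choices):

```python
import numpy as np

def lag1_autocorrelation(env):
    """Autocorrelation of environmental states at lag 1: high values mean
    early-life conditions are informative about later conditions."""
    return np.corrcoef(env[:-1], env[1:])[0, 1]

def cue_reliability(cue, env):
    """Correlation between a cue and the current environmental state."""
    return np.corrcoef(cue, env)[0, 1]

# Simulate an autocorrelated environment (AR(1)) and a noisy cue to it.
rng = np.random.default_rng(5)
env = np.zeros(500)
for t in range(1, 500):
    env[t] = 0.9 * env[t - 1] + rng.normal()    # 0.9 = true autocorrelation
cue = env + rng.normal(scale=2.0, size=500)     # weakly reliable cue
print(f"autocorrelation: {lag1_autocorrelation(env):.2f}, "
      f"cue reliability: {cue_reliability(cue, env):.2f}")
```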
Affiliation(s)
- Willem E Frankenhuis: Behavioural Science Institute, Radboud University, Nijmegen 6500 HE, The Netherlands
- Daniel Nettle: Centre for Behaviour and Evolution and Institute of Neuroscience, Newcastle University, Newcastle upon Tyne NE1 7RU, UK
- Sasha R X Dall: Centre for Ecology and Conservation, University of Exeter, Penryn TR10 9FE, UK
28. Fleming RW, Storrs KR. Learning to see stuff. Curr Opin Behav Sci 2019.
Abstract
Materials with complex appearances, like textiles and foodstuffs, pose challenges for conventional theories of vision. But recent advances in unsupervised deep learning provide a framework for explaining how we learn to see them. We suggest that perception does not involve estimating physical quantities like reflectance or lighting. Instead, representations emerge from learning to encode and predict the visual input as efficiently and accurately as possible. Neural networks can be trained to compress natural images or to predict frames in movies without 'ground truth' data about the outside world. Yet, to succeed, such systems may automatically discover how to disentangle distal causal factors. Such 'statistical appearance models' potentially provide a coherent explanation of both failures and successes in perception.
29. Putting the variability–stability–flexibility pattern to use: Adapting instruction to how children develop. New Ideas Psychol 2019. DOI: 10.1016/j.newideapsych.2019.04.003.
30. Wood SM, Wood JN. Using automation to combat the replication crisis: A case study from controlled-rearing studies of newborn chicks. Infant Behav Dev 2019; 57:101329. DOI: 10.1016/j.infbeh.2019.101329.
31. Wong-Kee-You AMB, Tsotsos JK, Adler SA. Development of spatial suppression surrounding the focus of visual attention. J Vis 2019; 19:9. DOI: 10.1167/19.7.9.
Affiliation(s)
- John K. Tsotsos: Centre for Vision Research, York University, Toronto, ON, Canada; Department of Electrical Engineering and Computer Science, York University, Toronto, ON, Canada. https://jtl.lassonde.yorku.ca/
- Scott A. Adler: Department of Psychology, York University, Toronto, Canada; Centre for Vision Research, York University, Toronto, ON, Canada. https://babylab.cvr.yorku.ca
32. Raz HK, Abney DH, Crandall D, Yu C, Smith LB. How do infants start learning object names in a sea of clutter? Proceedings of the Annual Conference of the Cognitive Science Society 2019:521-526. PMID: 33634271; PMCID: PMC7903936.
Abstract
Infants are powerful learners. A large corpus of experimental paradigms demonstrates that infants readily learn distributional cues of name-object co-occurrences. But infants' natural learning environment is cluttered: every heard word has multiple competing referents in view. Here we ask how infants start learning name-object co-occurrences in naturalistic learning environments that are cluttered and where there is much visual ambiguity. The framework presented in this paper integrates a naturalistic behavioral study and an application of a machine learning model. Our behavioral findings suggest that, in order to start learning object names, infants and their parents consistently select a small set of objects to play with during a given period of time. What emerges is a frequency distribution of a few toys that approximates a Zipfian frequency distribution of objects for learning. We find that a machine learning model trained with a Zipf-like distribution of these object images outperformed the model trained with a uniform distribution. Overall, these findings suggest that, to overcome referential ambiguity in clutter, infants may be selecting just a few toys, allowing them to learn many distributional cues about a few name-object pairs.
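Editor's note: the Zipf-like object frequency distribution that emerges from parent-infant play can be simulated directly when building a training set for a model. A minimal sketch (24 objects, exponent 1.0, and 5,000 samples are arbitrary choices, not the study's values):

```python
import numpy as np

def zipf_counts(n_objects=24, n_samples=5000, s=1.0, seed=0):
    """Sample a training set whose object frequencies follow a Zipf-like
    law: a few objects are seen very often, most are seen rarely."""
    ranks = np.arange(1, n_objects + 1)
    probs = ranks ** -float(s)
    probs /= probs.sum()
    rng = np.random.default_rng(seed)
    draws = rng.choice(n_objects, size=n_samples, p=probs)
    return np.bincount(draws, minlength=n_objects)

counts = zipf_counts()
print("most frequent object:", counts.max(), "samples")
print("least frequent object:", counts.min(), "samples")
# A uniform baseline would give n_samples / n_objects ≈ 208 samples each.
```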
Affiliation(s)
- Hadar Karmazyn Raz: Department of Psychological and Brain Sciences, Indiana University, Bloomington, IN 47405, USA
- Drew H Abney: Department of Psychological and Brain Sciences, Indiana University, Bloomington, IN 47405, USA
- David Crandall: Department of Psychological and Brain Sciences, Indiana University, Bloomington, IN 47405, USA
- Chen Yu: Department of Psychological and Brain Sciences, Indiana University, Bloomington, IN 47405, USA
- Linda B Smith: Department of Psychological and Brain Sciences, Indiana University, Bloomington, IN 47405, USA
33. Frankenhuis WE, Bijlstra G. Does Exposure to Hostile Environments Predict Enhanced Emotion Detection? Collabra: Psychology 2018. DOI: 10.1525/collabra.127.
Abstract
We used a Face-in-the-Crowd task to examine whether hostile environments predict enhanced detection of anger, and whether such enhanced cognition also occurs for a different negative emotion, sadness. We conducted a well-powered, preregistered study in 100 college students and 100 individuals from a community sample with greater exposure to hostile environments. At the group level, the community sample was less accurate at detecting both angry and sad faces than students, and only students discriminated anger more accurately than sadness. At the individual level, having experienced more violence did not predict enhanced anger detection accuracy. In general, participants had a lower threshold (i.e., a more liberal criterion) for detecting emotion in response to anger than sadness, and students had a higher threshold (i.e., a more conservative criterion) than the community sample for detecting emotion in response to both anger and sadness. Overall, these findings contradict our hypothesis that exposure to hostile environments predicts enhanced danger detection. Rather, our community sample was more prone to over-perceiving emotions, consistent with previous studies showing bias in threat-exposed populations. Future work is needed to tease apart the conditions in which people exposed to social danger show enhanced accuracy or bias in their perception of emotions.
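Editor's note: the threshold (criterion) and accuracy (sensitivity) measures reported here come from signal detection theory. A minimal sketch of computing d' and criterion c from one participant's hypothetical Face-in-the-Crowd counts (the log-linear correction is one common choice, not necessarily the authors'):

```python
from statistics import NormalDist

def dprime_criterion(hits, misses, false_alarms, correct_rejections):
    """Sensitivity (d') and criterion (c) from detection counts.

    A lower (more negative) c is a more liberal threshold for reporting
    an emotion; a higher d' is better discrimination. A log-linear
    correction keeps extreme rates from producing infinite z-scores.
    """
    z = NormalDist().inv_cdf
    hr = (hits + 0.5) / (hits + misses + 1)
    far = (false_alarms + 0.5) / (false_alarms + correct_rejections + 1)
    return z(hr) - z(far), -0.5 * (z(hr) + z(far))

# Hypothetical counts for one participant's anger-detection trials.
d, c = dprime_criterion(hits=42, misses=8, false_alarms=12, correct_rejections=38)
print(f"d' = {d:.2f}, criterion c = {c:.2f}")
```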