1. Arthur T, Vine S, Wilson M, Harris D. The role of prediction and visual tracking strategies during manual interception: An exploration of individual differences. J Vis 2024;24(6):4. [PMID: 38842836; PMCID: PMC11160954; DOI: 10.1167/jov.24.6.4]
Abstract
The interception (or avoidance) of moving objects is a common component of various daily living tasks; however, it remains unclear whether precise alignment of foveal vision with a target is important for motor performance. Furthermore, there has been little examination of individual differences in visual tracking strategy and the use of anticipatory gaze adjustments. We examined the importance of in-flight tracking and predictive visual behaviors using a virtual reality environment that required participants (n = 41) to intercept tennis balls projected from one of two possible locations. Here, we explored whether different tracking strategies spontaneously arose during the task, and which were most effective. Although indices of closer in-flight tracking (pursuit gain, tracking coherence, tracking lag, and saccades) were predictive of better interception performance, these relationships were rather weak. Anticipatory gaze shifts toward the correct release location of the ball provided no benefit for subsequent interception. Nonetheless, two interceptive strategies were evident: 1) early anticipation of the ball's onset location followed by attempts to closely track the ball in flight (i.e., a predictive strategy); or 2) positioning gaze between possible onset locations and then using peripheral vision to locate the moving ball (i.e., a visual pivot strategy). Despite showing much poorer in-flight foveal tracking of the ball, participants adopting a visual pivot strategy performed slightly better in the task. Overall, these results indicate that precise alignment of the fovea with the target may not be critical for interception tasks, but that observers can adopt quite varied visual guidance approaches.
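As a rough illustration of one of the tracking indices named above, pursuit gain can be computed as the ratio of mean eye speed to mean target speed over a tracking window. This is a minimal sketch under our own assumptions (function names, a simple finite-difference velocity estimate, and 1-D angular positions), not the authors' analysis pipeline:

```python
def velocities(positions, dt):
    """Finite-difference speeds (deg/s) from a sequence of angular positions (deg)."""
    return [(b - a) / dt for a, b in zip(positions, positions[1:])]

def pursuit_gain(gaze, target, dt=0.01):
    """Ratio of mean eye speed to mean target speed.
    A gain of 1.0 means gaze keeps pace with the target; < 1.0 means gaze lags."""
    eye = velocities(gaze, dt)
    tgt = velocities(target, dt)
    mean_eye = sum(abs(v) for v in eye) / len(eye)
    mean_tgt = sum(abs(v) for v in tgt) / len(tgt)
    return mean_eye / mean_tgt

# Example: gaze covers 80% of the target's displacement over the same interval.
target = [float(i) for i in range(10)]   # 100 deg/s at dt = 0.01
gaze = [i * 0.8 for i in range(10)]      # 80 deg/s
print(round(pursuit_gain(gaze, target), 2))  # 0.8
```

In practice such an index would be computed per trial on calibrated gaze and ball trajectories, then correlated with interception success.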
Affiliation(s)
- Tom Arthur, Samuel Vine, Mark Wilson, David Harris: School of Public Health and Sport Sciences, Medical School, University of Exeter, Exeter, EX1 2LU, UK
2. Zhang Z, Xu F. An Overview of the Free Energy Principle and Related Research. Neural Comput 2024;36:963-1021. [PMID: 38457757; DOI: 10.1162/neco_a_01642]
Abstract
The free energy principle (FEP) and its corollary, the active inference framework, serve as theoretical foundations in the domain of neuroscience, explaining the genesis of intelligent behavior. The principle states that perception, learning, and decision making within an agent are all driven by the objective of minimizing free energy, evincing the following behaviors: learning and employing a generative model of the environment to interpret observations, thereby achieving perception; and selecting actions to maintain a stable preferred state and minimize uncertainty about the environment, thereby achieving decision making. This fundamental principle can be used to explain how the brain processes perceptual information, learns about the environment, and selects actions. Two pivotal tenets are that the agent employs a generative model for perception and planning, and that interaction with the world (and other agents) enhances the performance of the generative model and augments perception. With the evolution of control theory and deep learning tools, agents based on the FEP have been instantiated in various ways across different domains, guiding the design of a multitude of generative models and decision-making algorithms. This letter first introduces the basic concepts of the FEP, followed by its historical development and connections with other theories of intelligence, and then delves into the specific application of the FEP to perception and decision making, encompassing both low-dimensional simple situations and high-dimensional complex situations. It compares the FEP with model-based reinforcement learning to show that the FEP provides a better objective function; we illustrate this with numerical studies of Dreamer3 in which expected information gain is added to the standard objective function. In a complementary fashion, existing reinforcement learning and deep learning algorithms can also help implement FEP-based agents. Finally, we discuss the various capabilities that agents need to possess in complex environments and argue that the FEP can aid agents in acquiring them.
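The "minimizing free energy" objective can be made concrete for a discrete generative model: with prior p(s), likelihood p(o|s), and approximate posterior q(s), the variational free energy is F = E_q[ln q(s) - ln p(o, s)], which upper-bounds the surprise -ln p(o). A minimal numerical sketch (the two-state distributions are toy assumptions for illustration):

```python
import math

def free_energy(q, prior, lik_o):
    """Variational free energy F = sum_s q(s) * (ln q(s) - ln p(o, s)),
    with p(o, s) = p(o|s) * p(s). F >= -ln p(o), with equality when q
    equals the true posterior p(s|o)."""
    return sum(qs * (math.log(qs) - math.log(lo * ps))
               for qs, ps, lo in zip(q, prior, lik_o) if qs > 0)

prior = [0.5, 0.5]   # p(s) over two hidden states
lik_o = [0.9, 0.2]   # p(o | s) for the observed outcome o

evid = sum(l * p for l, p in zip(lik_o, prior))          # p(o)
surprise = -math.log(evid)                               # -ln p(o)
posterior = [l * p / evid for l, p in zip(lik_o, prior)] # p(s | o)

# The exact posterior drives F down to the surprise; any other q exceeds it.
print(abs(free_energy(posterior, prior, lik_o) - surprise) < 1e-9)  # True
print(free_energy([0.5, 0.5], prior, lik_o) > surprise)             # True
```

Perception, in this framing, is the minimization of F over q; action selection extends the same objective with expected free energy over future outcomes.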
Affiliation(s)
- Zhengquan Zhang, Feng Xu: Key Laboratory of Information Science of Electromagnetic Waves, Fudan University, Shanghai, China
3. Gruel A, Hareb D, Grimaldi A, Martinet J, Perrinet L, Linares-Barranco B, Serrano-Gotarredona T. Stakes of neuromorphic foveation: a promising future for embedded event cameras. Biol Cybern 2023;117:389-406. [PMID: 37733033; DOI: 10.1007/s00422-023-00974-9]
Abstract
Foveation can be defined as the organic action of directing the gaze towards a visual region of interest to acquire relevant information selectively. With the recent advent of event cameras, we believe that exploiting this visual neuroscience mechanism would greatly improve the efficiency of event data processing. Indeed, applying foveation to event data would make it possible to comprehend the visual scene while significantly reducing the amount of raw data to handle. In this respect, we demonstrate the stakes of neuromorphic foveation theoretically and empirically across several computer vision tasks, namely semantic segmentation and classification. We show that foveated event data offer a significantly better trade-off between the quantity and quality of the information conveyed than high- or low-resolution event data, and that this compromise extends even to fragmented datasets. Our code is publicly available online at https://github.com/amygruel/FoveationStakes_DVS.
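A simple software analogue of this idea is to keep every event near the current gaze point and subsample events in the periphery, shrinking the raw stream while preserving detail where it matters. The radius, subsampling rate, and (x, y, t, polarity) event format below are illustrative assumptions, not the paper's actual pipeline:

```python
def foveate_events(events, gaze, radius=10.0, keep_every=4):
    """Keep all events within `radius` pixels of the gaze point; keep only
    every `keep_every`-th event outside it. Events are (x, y, t, polarity)."""
    gx, gy = gaze
    out, peripheral_seen = [], 0
    for ev in events:
        x, y = ev[0], ev[1]
        if (x - gx) ** 2 + (y - gy) ** 2 <= radius ** 2:
            out.append(ev)                        # foveal region: full rate
        else:
            if peripheral_seen % keep_every == 0: # periphery: coarse subsampling
                out.append(ev)
            peripheral_seen += 1
    return out

# 8 foveal events plus 8 peripheral events -> 8 + 2 survive.
events = [(0, 0, t, 1) for t in range(8)] + [(50, 50, t, 1) for t in range(8)]
print(len(foveate_events(events, gaze=(0, 0))))  # 10
```

A neuromorphic implementation would do this spatial selection in hardware, but the data-reduction trade-off is the same.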
Affiliation(s)
- Amélie Gruel, Dalia Hareb, Jean Martinet: SPARKS, Université Côte d'Azur, CNRS, I3S, 2000 Rte des Lucioles, 06900 Sophia-Antipolis, France
- Antoine Grimaldi, Laurent Perrinet: NeOpTo, Université Aix Marseille, CNRS, INT, 27 Bd Jean Moulin, 13005 Marseille, France
- Bernabé Linares-Barranco, Teresa Serrano-Gotarredona: Neuromorphic Group, Instituto de Microelectrónica de Sevilla IMSE-CNM, 28 Parque Científico y Tecnológico Cartuja, 41092 Sevilla, Spain
4. Jérémie JN, Perrinet LU. Ultrafast Image Categorization in Biology and Neural Models. Vision (Basel) 2023;7(2):29. [PMID: 37092462; PMCID: PMC10123664; DOI: 10.3390/vision7020029]
Abstract
Humans are able to categorize images very efficiently, in particular to detect the presence of an animal very quickly. Recently, deep learning algorithms based on convolutional neural networks (CNNs) have achieved higher-than-human accuracy for a wide range of visual categorization tasks. However, the tasks on which these artificial networks are typically trained and evaluated tend to be highly specialized and do not generalize well, e.g., accuracy drops after image rotation. In this respect, biological visual systems are more flexible and efficient than artificial systems for more general tasks, such as recognizing an animal. To further the comparison between biological and artificial neural networks, we re-trained the standard VGG-16 CNN on two independent tasks that are ecologically relevant to humans: detecting the presence of an animal or an artifact. We show that re-training the network achieves a human-like level of performance, comparable to that reported in psychophysical tasks. In addition, we show that categorization is better when the outputs of the two models are combined. Indeed, animals (e.g., lions) tend to be less present in photographs that contain artifacts (e.g., buildings). Furthermore, these re-trained models were able to reproduce some unexpected behavioral observations from human psychophysics, such as robustness to rotation (e.g., an upside-down or tilted image) or to a grayscale transformation. Finally, we quantified the number of CNN layers required to achieve such performance and showed that good accuracy for ultrafast image categorization can be achieved with only a few layers, challenging the belief that image recognition requires deep sequential analysis of visual objects. We hope to extend this framework to biomimetic deep neural architectures designed for ecological tasks, but also to guide future model-based psychophysical experiments that would deepen our understanding of biological vision.
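The reported gain from combining the two task outputs can be illustrated with a toy fusion rule: since animals and artifacts tend not to co-occur, "artifact present" evidence can be treated as negative evidence for "animal present". The subtractive-logit weighting below is invented for illustration; the paper's actual combination may differ:

```python
import math

def sigmoid(z):
    """Logistic squashing of a logit into a probability."""
    return 1.0 / (1.0 + math.exp(-z))

def combined_animal_score(animal_logit, artifact_logit, w=0.5):
    """Fuse two detectors: subtract a fraction of the artifact evidence from
    the animal logit before squashing, reflecting their negative correlation."""
    return sigmoid(animal_logit - w * artifact_logit)

# A weak 'animal' response in a scene full of buildings is pushed down...
print(combined_animal_score(0.3, 2.0) < sigmoid(0.3))   # True
# ...while the same response with no artifact evidence is left unchanged.
print(combined_animal_score(0.3, 0.0) == sigmoid(0.3))  # True
```

With w = 0, the fusion reduces to the single-task animal detector, making the combination easy to ablate.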
5. Lukanov H, König P, Pipa G. Biologically Inspired Deep Learning Model for Efficient Foveal-Peripheral Vision. Front Comput Neurosci 2021;15:746204. [PMID: 34880741; PMCID: PMC8645638; DOI: 10.3389/fncom.2021.746204]
Abstract
While abundant in biology, foveated vision is nearly absent from computational models and especially deep learning architectures. Despite considerable hardware improvements, training deep neural networks still presents a challenge and constrains model complexity. Here we propose an end-to-end neural model for foveal-peripheral vision, inspired by retino-cortical mapping in primates and humans. Our model has an efficient sampling technique for compressing the visual signal such that a small portion of the scene is perceived in high resolution while a large field of view is maintained in low resolution. An attention mechanism for performing "eye movements" assists the agent in collecting detailed information incrementally from the observed scene. Our model achieves comparable results to a similar neural architecture trained on full-resolution data for image classification and outperforms it at video classification tasks. At the same time, because of the smaller size of its input, it can reduce computational effort tenfold and uses several times less memory. Moreover, we present an easy-to-implement bottom-up and top-down attention mechanism which relies on task-relevant features and is therefore a convenient byproduct of the main architecture. Apart from its computational efficiency, the presented work provides a means of exploring active vision for agent training in simulated environments and anthropomorphic robotics.
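The sampling idea (high resolution at fixation, low resolution everywhere else) can be sketched as a two-stream compression: a small high-resolution crop around the gaze point plus a heavily downsampled copy of the full frame. The crop size and stride here are arbitrary choices, and the paper uses a retino-cortical mapping rather than this two-stream simplification:

```python
def foveal_peripheral(image, cx, cy, fovea=4, stride=4):
    """Compress a 2D image into (foveal crop, downsampled periphery).
    The crop is `fovea` x `fovea` pixels around (cx, cy); the periphery
    keeps every `stride`-th pixel of the whole frame."""
    half = fovea // 2
    crop = [row[max(0, cx - half):cx + half]
            for row in image[max(0, cy - half):cy + half]]
    periphery = [row[::stride] for row in image[::stride]]
    return crop, periphery

# A 16x16 frame (256 pixels) compresses to 4x4 + 4x4 = 32 pixels, an 8x saving.
image = [[x + 16 * y for x in range(16)] for y in range(16)]
crop, periphery = foveal_peripheral(image, cx=8, cy=8)
print(len(crop) * len(crop[0]) + len(periphery) * len(periphery[0]))  # 32
```

Feeding both streams to a network is what lets the input shrink severalfold while an attention policy decides where the next crop should land.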
Affiliation(s)
- Hristofor Lukanov, Gordon Pipa: Department of Neuroinformatics, Institute of Cognitive Science, Osnabrück University, Osnabrück, Germany
- Peter König: Department of Neurobiopsychology, Institute of Cognitive Science, Osnabrück University, Osnabrück, Germany; Department of Neurophysiology and Pathophysiology, Center of Experimental Medicine, University Medical Center Hamburg-Eppendorf, Hamburg, Germany
6.
Abstract
The properties of the human retina, including space-variant resolution and gaze characteristics, provide many advantages for applications that simultaneously require a large field of view, high resolution, and real-time performance. Therefore, retina-like mechanisms and sensors have received considerable attention in recent years. This paper reviews state-of-the-art retina-like imaging techniques and applications. First, we introduce the principle and the implementing methods, including software and hardware, and compare them. Then, we present typical applications combining retina-like imaging, including three-dimensional acquisition and reconstruction, target tracking, deep learning, and ghost imaging. Finally, challenges and future directions toward practical use are discussed. The results are beneficial for a better understanding of retina-like imaging.
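The space-variant resolution these sensors emulate is commonly modeled with a log-polar mapping: sampling density falls off logarithmically with eccentricity, so a uniform grid in (log r, theta) covers a wide field with few samples. A minimal coordinate-mapping sketch (the coordinate convention is an illustrative assumption):

```python
import math

def to_logpolar(x, y):
    """Map Cartesian coordinates (relative to the fixation point) to
    (log-radius, angle): a retina-like coordinate system in which equal
    steps in log-radius cover exponentially larger peripheral regions."""
    r = math.hypot(x, y)
    return math.log(r), math.atan2(y, x)

# Doubling the eccentricity shifts log-radius by only a constant (ln 2),
# which is why peripheral resolution is so cheap in this representation.
u1, _ = to_logpolar(10.0, 0.0)
u2, _ = to_logpolar(20.0, 0.0)
print(round(u2 - u1, 6))  # 0.693147
```

A retina-like sensor or resampler effectively places its pixels on a uniform grid in these (log r, theta) coordinates rather than in (x, y).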