1
Ceccarini F, Colpizzi I, Caudek C. Age-dependent changes in the anger superiority effect: Evidence from a visual search task. Psychon Bull Rev 2024; 31:1704-1713. PMID: 38238561; PMCID: PMC11358229; DOI: 10.3758/s13423-023-02401-3.
Abstract
The perception of threatening facial expressions is a critical skill for detecting the emotional states of others and responding appropriately. The anger superiority effect hypothesis suggests that individuals are better at processing and identifying angry faces than nonthreatening facial expressions. In adults, the anger superiority effect is present even after controlling for bottom-up visual saliency, and when ecologically valid stimuli are used. However, it is as yet unclear whether this effect is present in children. To fill this gap, we tested the anger superiority effect in children aged 6-14 years in a visual search task, using dynamic emotional stimuli and equating the visual salience of targets and distractors. The results suggest that in childhood, the anger superiority effect consists of improved accuracy in detecting angry faces, while in adolescence the ability to discriminate angry faces undergoes further development, enabling faster and more accurate threat detection.
Affiliation(s)
- Ilaria Colpizzi
- Health Sciences Department, Università degli Studi di Firenze, Florence, Italy
- Corrado Caudek
- NEUROFARBA Department, Università degli Studi di Firenze, Florence, Italy
2
Nanami T, Yamada D, Someya M, Hige T, Kazama H, Kohno T. A lightweight data-driven spiking neuronal network model of Drosophila olfactory nervous system with dedicated hardware support. Front Neurosci 2024; 18:1384336. PMID: 38994271; PMCID: PMC11238178; DOI: 10.3389/fnins.2024.1384336.
Abstract
Data-driven spiking neuronal network (SNN) models enable in silico analysis of the nervous system at the cellular and synaptic level and are therefore a key tool for elucidating the information processing principles of the brain. While extensive research has focused on developing data-driven SNN models for mammalian brains, their complexity poses challenges in achieving precision. Network topology often relies on statistical inference, and the functions of specific brain regions and supporting neuronal activities remain unclear. Additionally, these models demand huge computing facilities, and their simulation speed is considerably slower than real time. Here, we propose a lightweight data-driven SNN model that strikes a balance between simplicity and reproducibility. The model is built using a qualitative modeling approach that can reproduce key dynamics of neuronal activity. We target the Drosophila olfactory nervous system, extracting its network topology from connectome data. The model was successfully implemented on a small entry-level field-programmable gate array and simulated the activity of the network in real time. In addition, the model reproduced olfactory associative learning, the primary function of the olfactory system, and the characteristic spiking activities of different neuron types. In sum, this paper proposes a method for building data-driven SNN models from biological data. Our approach reproduces the function and neuronal activities of the nervous system, is lightweight, and can be accelerated with dedicated hardware, making it scalable to large-scale networks. It is therefore expected to play an important role in elucidating the brain's information processing at the cellular and synaptic level through an analysis-by-construction approach, and it may also be applicable to edge artificial intelligence systems in the future.
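The kind of per-step neuron update that maps naturally onto lightweight hardware can be illustrated with the simplest spiking unit. The sketch below is not the authors' model (their qualitative model reproduces richer dynamics and is designed for FPGA arithmetic); it is a minimal leaky integrate-and-fire neuron with illustrative parameter values.

```python
import numpy as np

def simulate_lif(i_ext, dt=1.0, tau=20.0, v_rest=0.0, v_th=1.0, v_reset=0.0):
    """Minimal leaky integrate-and-fire neuron.

    Integrates an input current trace; emits a spike and resets whenever the
    membrane potential crosses threshold. Returns (membrane trace, spike times).
    """
    v = v_rest
    trace, spikes = [], []
    for t, i in enumerate(i_ext):
        v += (dt / tau) * (-(v - v_rest) + i)  # leak toward rest plus input drive
        if v >= v_th:                          # threshold crossing -> spike
            spikes.append(t)
            v = v_reset                        # hard reset after the spike
        trace.append(v)
    return np.array(trace), spikes

# A constant supra-threshold input produces regular spiking.
trace, spikes = simulate_lif(np.full(200, 1.5))
```

With these assumed constants the unit fires regularly, while a sub-threshold input (for example 0.5) saturates below threshold and never spikes; the whole update is a handful of multiply-adds per step, which is what makes such models hardware-friendly.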
Affiliation(s)
- Takuya Nanami
- Institute of Industrial Science, The University of Tokyo, Meguro-ku, Tokyo, Japan
- Daichi Yamada
- Department of Biology, University of North Carolina at Chapel Hill, Chapel Hill, NC, United States
- Makoto Someya
- RIKEN Center for Brain Science, Wako, Saitama, Japan
- Toshihide Hige
- Department of Biology, University of North Carolina at Chapel Hill, Chapel Hill, NC, United States
- Department of Cell Biology and Physiology, University of North Carolina at Chapel Hill, Chapel Hill, NC, United States
- Integrative Program for Biological and Genome Sciences, University of North Carolina at Chapel Hill, Chapel Hill, NC, United States
- Hokto Kazama
- RIKEN Center for Brain Science, Wako, Saitama, Japan
- Takashi Kohno
- Institute of Industrial Science, The University of Tokyo, Meguro-ku, Tokyo, Japan
3
Leung TS, Zeng G, Maylott SE, Martinez SN, Jakobsen KV, Simpson EA. Infection detection in faces: Children's development of pathogen avoidance. Child Dev 2024; 95:e35-e46. PMID: 37589080; DOI: 10.1111/cdev.13983.
Abstract
This study examined the development of children's avoidance and recognition of sickness using face photos from people with natural, acute, contagious illness. In a U.S. sample of fifty-seven 4- to 5-year-olds (46% male, 70% White), fifty-two 8- to 9-year-olds (26% male, 62% White), and 51 adults (59% male, 61% White), children and adults avoided and recognized sick faces (ds ranged from 0.38 to 2.26). Both avoidance and recognition improved with age. Interestingly, 4- to 5-year-olds' avoidance of sick faces positively correlated with their recognition, suggesting stable individual differences in these emerging skills. Together, these findings are consistent with a hypothesized immature but functioning and flexible behavioral immune system emerging early in development. Characterizing children's sickness perception may help design interventions to improve health.
Affiliation(s)
- Tiffany S Leung
- Department of Psychology, University of Miami, Coral Gables, Florida, USA
- Guangyu Zeng
- Department of Psychology, University of Miami, Coral Gables, Florida, USA
- Division of Applied Psychology, The Chinese University of Hong Kong, Shenzhen, China
- Sarah E Maylott
- Department of Psychiatry & Behavioral Sciences, Duke University, Durham, North Carolina, USA
4
Entzmann L, Guyader N, Kauffmann L, Peyrin C, Mermillod M. Detection of emotional faces: The role of spatial frequencies and local features. Vision Res 2023; 211:108281. PMID: 37421829; DOI: 10.1016/j.visres.2023.108281.
Abstract
Models of emotion processing suggest that threat-related stimuli such as fearful faces can be detected based on the rapid extraction of low spatial frequencies. However, this remains debated, as other models argue that the decoding of facial expressions occurs with a more flexible use of spatial frequencies. The purpose of this study was to clarify the role of spatial frequencies, and of differences in luminance contrast between spatial frequencies, in the detection of facial emotions. We used a saccadic choice task in which emotional-neutral face pairs were presented and participants were asked to make a saccade toward the neutral or the emotional (happy or fearful) face. Faces were displayed in either low, high, or broad spatial frequencies. Results showed that participants were better at saccading toward the emotional face. Performance was also better for high and broad than for low spatial frequencies, and accuracy was higher with a happy target. An analysis of the eye and mouth saliency of our stimuli revealed that the mouth saliency of the target correlates with participants' performance. Overall, this study underlines the importance of local over global information, and of the saliency of the mouth region, in the detection of emotional and neutral faces.
Affiliation(s)
- Léa Entzmann
- Univ. Grenoble Alpes, Univ. Savoie Mont Blanc, CNRS, LPNC, 38000 Grenoble, France
- Univ. Grenoble Alpes, CNRS, Grenoble INP, GIPSA-lab, 38000 Grenoble, France
- Icelandic Vision Lab, School of Health Sciences, University of Iceland, Reykjavík, Iceland
- Nathalie Guyader
- Univ. Grenoble Alpes, CNRS, Grenoble INP, GIPSA-lab, 38000 Grenoble, France
- Louise Kauffmann
- Univ. Grenoble Alpes, Univ. Savoie Mont Blanc, CNRS, LPNC, 38000 Grenoble, France
- Carole Peyrin
- Univ. Grenoble Alpes, Univ. Savoie Mont Blanc, CNRS, LPNC, 38000 Grenoble, France
- Martial Mermillod
- Univ. Grenoble Alpes, Univ. Savoie Mont Blanc, CNRS, LPNC, 38000 Grenoble, France
5
Roth N, Rolfs M, Hellwich O, Obermayer K. Objects guide human gaze behavior in dynamic real-world scenes. PLoS Comput Biol 2023; 19:e1011512. PMID: 37883331; PMCID: PMC10602265; DOI: 10.1371/journal.pcbi.1011512.
Abstract
The complexity of natural scenes makes it challenging to experimentally study the mechanisms behind human gaze behavior when viewing dynamic environments. Historically, eye movements were believed to be driven primarily by space-based attention towards locations with salient features. Increasing evidence suggests, however, that visual attention does not select locations with high saliency but operates on attentional units given by the objects in the scene. We present a new computational framework to investigate the importance of objects for attentional guidance. This framework is designed to simulate realistic scanpaths for dynamic real-world scenes, including saccade timing and smooth pursuit behavior. Individual model components are based on psychophysically uncovered mechanisms of visual attention and saccadic decision-making. All mechanisms are implemented in a modular fashion with a small number of well-interpretable parameters. To systematically analyze the importance of objects in guiding gaze behavior, we implemented five different models within this framework: two purely spatial models, where one is based on low-level saliency and one on high-level saliency; two object-based models, with one incorporating low-level saliency for each object and the other not using any saliency information; and a mixed model with object-based attention and selection but space-based inhibition of return. We optimized each model's parameters to reproduce the saccade amplitude and fixation duration distributions of human scanpaths using evolutionary algorithms. We compared model performance with respect to spatial and temporal fixation behavior, including the proportion of fixations exploring the background, as well as detecting, inspecting, and returning to objects. A model with object-based attention and inhibition, which uses saliency information to prioritize between objects for saccadic selection, leads to scanpath statistics with the highest similarity to the human data. This demonstrates that scanpath models benefit from object-based attention and selection, suggesting that object-level attentional units play an important role in guiding attentional processing.
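The winning scheme, saliency-prioritized selection between objects combined with object-level inhibition, can be caricatured in a few lines. The toy below is not the authors' framework (which models saccade timing and smooth pursuit on dynamic scenes); the function name and all constants are invented for this sketch.

```python
import numpy as np

def toy_object_scanpath(object_saliency, n_fixations=6, suppression=0.1, recovery=0.3):
    """Pick the most salient object, inhibit it, let inhibition decay.

    `object_saliency` holds one value per object; returns the sequence of
    visited object indices (a toy object-level inhibition-of-return scanpath).
    """
    base = np.asarray(object_saliency, dtype=float)
    s = base.copy()
    path = []
    for _ in range(n_fixations):
        target = int(np.argmax(s))              # saliency prioritizes the next object
        path.append(target)
        s[target] = base[target] * suppression  # object-level inhibition of return
        s += recovery * (base - s)              # inhibited objects recover over time
    return path

path = toy_object_scanpath([0.9, 0.5, 0.7])
```

The first saccade lands on the most salient object; inhibition then forces exploration of the remaining objects, with returns once inhibition has decayed, mirroring the detect/inspect/return behavior the paper quantifies.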
Affiliation(s)
- Nicolas Roth
- Cluster of Excellence Science of Intelligence, Technische Universität Berlin, Germany
- Institute of Software Engineering and Theoretical Computer Science, Technische Universität Berlin, Germany
- Martin Rolfs
- Cluster of Excellence Science of Intelligence, Technische Universität Berlin, Germany
- Department of Psychology, Humboldt-Universität zu Berlin, Germany
- Bernstein Center for Computational Neuroscience Berlin, Germany
- Olaf Hellwich
- Cluster of Excellence Science of Intelligence, Technische Universität Berlin, Germany
- Institute of Computer Engineering and Microelectronics, Technische Universität Berlin, Germany
- Klaus Obermayer
- Cluster of Excellence Science of Intelligence, Technische Universität Berlin, Germany
- Institute of Software Engineering and Theoretical Computer Science, Technische Universität Berlin, Germany
- Bernstein Center for Computational Neuroscience Berlin, Germany
6
Segen V, Avraamides MN, Slattery T, Wiener JM. Biases in object location estimation: The role of rotations and translation. Atten Percept Psychophys 2023; 85:2307-2320. PMID: 37258895; PMCID: PMC10584736; DOI: 10.3758/s13414-023-02716-2.
Abstract
Spatial memory studies often employ static images depicting a scene, an array of objects, or environmental features from one perspective and then, following a perspective shift, prompt memory of either the scene or the objects within it. The current study investigated a previously reported systematic bias in spatial memory whereby, following a perspective shift from encoding to recall, participants indicate the location of an object farther in the direction of the shift. In Experiment 1, we aimed to replicate this bias by asking participants to encode the location of an object in a virtual room and then indicate it from memory following a perspective shift induced by camera translation and rotation. In Experiment 2, we decoupled the influence of camera translations and rotations and examined whether adding additional objects to the virtual room would reduce the bias. Overall, our results indicate that camera translations result in greater systematic bias than camera rotations. We propose that the accurate representation of camera translations requires more demanding mental computations than camera rotations, leading to greater uncertainty regarding the location of an object in memory. This uncertainty causes people to rely on an egocentric anchor, thereby giving rise to the systematic bias in the direction of camera translation.
Affiliation(s)
- Vladislava Segen
- Aging and Dementia Research Centre, Bournemouth University, Poole, UK
- Department of Psychology, Bournemouth University, Poole, UK
- German Centre for Neurodegenerative Disease, Magdeburg, Germany
- Marios N Avraamides
- Department of Psychology, University of Cyprus, Nicosia, Cyprus
- CYENS Centre of Excellence, Nicosia, Cyprus
- Jan M Wiener
- Aging and Dementia Research Centre, Bournemouth University, Poole, UK
- Department of Psychology, Bournemouth University, Poole, UK
7
Cavanagh P, Caplovitz GP, Lytchenko TK, Maechler MR, Tse PU, Sheinberg DL. The Architecture of Object-Based Attention. Psychon Bull Rev 2023; 30:1643-1667. PMID: 37081283; DOI: 10.3758/s13423-023-02281-7.
Abstract
The allocation of attention to objects raises several intriguing questions: What are objects, how does attention access them, and what anatomical regions are involved? Here, we review recent progress in the field to determine the mechanisms underlying object-based attention. First, findings from unconscious priming and cueing suggest that the preattentive targets of object-based attention can be fully developed object representations that have reached the level of identity. Next, the control of object-based attention appears to come from ventral visual areas specialized in object analysis that project downward to early visual areas. How feedback from object areas can accurately target the object's specific locations and features is unknown, but recent work in autoencoding has made this plausible. Finally, we suggest that the three classic modes of attention may not be as independent as is commonly considered, and instead could all rely on object-based attention. Specifically, studies show that attention can be allocated to the separated members of a group, without affecting the space between them, matching the defining property of feature-based attention. At the same time, object-based attention directed to a single small item has the properties of space-based attention. We outline the architecture of object-based attention and the novel predictions it brings, and discuss how it works in parallel with other attention pathways.
Affiliation(s)
- Patrick Cavanagh
- Department of Psychology, Glendon College, 2275 Bayview Avenue, North York, ON, M4N 3M6, Canada
- CVR, York University, Toronto, ON, Canada
- David L Sheinberg
- Department of Neuroscience, Brown University, Providence, RI, USA
- Carney Institute for Brain Science, Brown University, Providence, RI, USA
8
Akamatsu K, Nishino T, Miyawaki Y. Spatiotemporal bias of the human gaze toward hierarchical visual features during natural scene viewing. Sci Rep 2023; 13:8104. PMID: 37202449; DOI: 10.1038/s41598-023-34829-x.
Abstract
The human gaze is directed at various locations from moment to moment in acquiring information necessary to recognize the external environment at the fine resolution of foveal vision. Previous studies showed that the human gaze is attracted to particular locations in the visual field at a particular time, but it remains unclear what visual features produce such spatiotemporal bias. In this study, we used a deep convolutional neural network model to extract hierarchical visual features from natural scene images and evaluated how much the human gaze is attracted to the visual features in space and time. Eye movement measurement and visual feature analysis using the deep convolutional neural network model showed that the gaze was more strongly attracted to spatial locations containing higher-order visual features than to locations containing lower-order visual features or to locations predicted by conventional saliency. Analysis of the time course of gaze attraction revealed that the bias to higher-order visual features was prominent within a short period after the beginning of observation of the natural scene images. These results demonstrate that higher-order visual features are a strong gaze attractor in both space and time, suggesting that the human visual system uses foveal vision resources to extract information from higher-order visual features with higher spatiotemporal priority.
Affiliation(s)
- Kazuaki Akamatsu
- Graduate School of Informatics and Engineering, The University of Electro-Communications, 1-5-1 Chofugaoka, Chofu, Tokyo, 182-8585, Japan
- Tomohiro Nishino
- Faculty of Informatics and Engineering, The University of Electro-Communications, 1-5-1 Chofugaoka, Chofu, Tokyo, 182-8585, Japan
- Yoichi Miyawaki
- Graduate School of Informatics and Engineering, The University of Electro-Communications, 1-5-1 Chofugaoka, Chofu, Tokyo, 182-8585, Japan
- Center for Neuroscience and Biomedical Engineering (CNBE), The University of Electro-Communications, 1-5-1 Chofugaoka, Chofu, Tokyo, 182-8585, Japan
9
Behavioral and physiological sensitivity to natural sick faces. Brain Behav Immun 2023; 110:195-211. PMID: 36893923; DOI: 10.1016/j.bbi.2023.03.007.
Abstract
The capacity to rapidly detect and avoid sick people may be adaptive. Given that faces are reliably available, as well as rapidly detected and processed, they may provide health information that influences social interaction. Prior studies used faces that were manipulated to appear sick (e.g., editing photos, inducing inflammatory response); however, responses to naturally sick faces remain largely unexplored. We tested whether adults detected subtle cues of genuine, acute, potentially contagious illness in face photos compared to the same individuals when healthy. We tracked illness symptoms and severity with the Sickness Questionnaire and Common Cold Questionnaire. We also checked that sick and healthy photos were matched on low-level features. We found that participants (N = 109) rated sick faces, compared to healthy faces, as sicker, more dangerous, and eliciting more unpleasant feelings. Participants (N = 90) rated sick faces as more likely to be avoided, more tired, and more negative in expression than healthy faces. In a passive-viewing eye-tracking task, participants (N = 50) looked longer at healthy than sick faces, especially the eye region, suggesting people may be more drawn to healthy conspecifics. When making approach-avoidance decisions, participants (N = 112) had greater pupil dilation to sick than healthy faces, and more pupil dilation was associated with greater avoidance, suggesting elevated arousal to threat. Across all experiments, participants' behaviors correlated with the degree of sickness, as reported by the face donors, suggesting a nuanced, fine-tuned sensitivity. Together, these findings suggest that humans may detect subtle threats of contagion from sick faces, which may facilitate illness avoidance. By better understanding how humans naturally avoid illness in conspecifics, we may identify what information is used and ultimately improve public health.
10
Novin S, Fallah A, Rashidi S, Daliri MR. An improved saliency model of visual attention dependent on image content. Front Hum Neurosci 2023; 16:862588. PMID: 36926377; PMCID: PMC10011177; DOI: 10.3389/fnhum.2022.862588.
Abstract
Many visual attention models have been presented to obtain the saliency of a scene, i.e., the visually significant parts of a scene. However, some mechanisms are still not taken into account in these models, and the models do not fit the human data accurately. These mechanisms include which visual features are informative enough to be incorporated into the model, how the conspicuity of different features and scales of an image may integrate to obtain the saliency map of the image, and how the structure of an image affects the strategy of our attention system. We integrate such mechanisms in the presented model more efficiently compared to previous models. First, besides low-level features commonly employed in state-of-the-art models, we also apply medium-level features as the combination of orientations and colors based on the visual system behavior. Second, we use a variable number of center-surround difference maps instead of the fixed number used in the other models, suggesting that human visual attention operates differently for diverse images with different structures. Third, we integrate the information of different scales and different features based on their weighted sum, defining the weights according to each component's contribution, and presenting both the local and global saliency of the image. To test the model's performance in fitting human data, we compared it to other models using the CAT2000 dataset and the Area Under Curve (AUC) metric. Our results show that the model has high performance compared to the other models (AUC = 0.79 and sAUC = 0.58) and suggest that the proposed mechanisms can be applied to the existing models to improve them.
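The core ingredients described here, center-surround differences at multiple scales merged by weights reflecting each map's contribution, can be sketched compactly. The code below is a generic Itti-Koch-style toy, not the authors' implementation; the peak-to-mean weighting and all constants are illustrative assumptions standing in for their contribution-based weights.

```python
import numpy as np

def blur(img, sigma):
    """Separable Gaussian blur built from explicit 1-D kernels (numpy only)."""
    r = max(1, int(3 * sigma))
    x = np.arange(-r, r + 1)
    k = np.exp(-x**2 / (2.0 * sigma**2))
    k /= k.sum()
    out = np.apply_along_axis(lambda m: np.convolve(m, k, mode="same"), 0, img)
    return np.apply_along_axis(lambda m: np.convolve(m, k, mode="same"), 1, out)

def center_surround_saliency(img, scale_pairs=((1, 4), (2, 8))):
    """Center-surround maps at several scale pairs, merged by a weighted sum.

    Each map is weighted by its peak-to-mean ratio, a crude stand-in for the
    'contribution' of that feature/scale, before normalizing the result.
    """
    maps = [np.abs(blur(img, c) - blur(img, s)) for c, s in scale_pairs]
    weights = [m.max() / (m.mean() + 1e-9) for m in maps]
    sal = sum(w * m for w, m in zip(weights, maps))
    return sal / (sal.max() + 1e-9)

img = np.zeros((64, 64))
img[32, 32] = 1.0                      # a single isolated bright spot
sal = center_surround_saliency(img)
```

An isolated spot yields a sharply peaked map at its location: the center scale responds strongly there while the surround scale spreads the energy, so their difference is largest at the spot, which is the intuition behind center-surround conspicuity.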
Affiliation(s)
- Shabnam Novin
- Faculty of Biomedical Engineering, Amirkabir University of Technology (AUT), Tehran, Iran
- Ali Fallah
- Faculty of Biomedical Engineering, Amirkabir University of Technology (AUT), Tehran, Iran
- Saeid Rashidi
- Faculty of Medical Sciences and Technologies, Science and Research Branch, Islamic Azad University, Tehran, Iran
- Mohammad Reza Daliri
- Neuroscience and Neuroengineering Research Laboratory, Biomedical Engineering Department, School of Electrical Engineering, Iran University of Science and Technology, Tehran, Iran
- School of Cognitive Sciences (SCS), Institute for Research in Fundamental Sciences (IPM), Tehran, Iran
11
Merck C, Noël A, Jamet E, Robert M, Salmon A, Belliard S, Kalénine S. Nonspecific Effects of Normal Aging on Taxonomic and Thematic Semantic Processing. Exp Aging Res 2023; 49:18-40. PMID: 35234091; DOI: 10.1080/0361073X.2022.2046948.
Abstract
OBJECTIVE: This study aimed to assess the effect of normal aging on the processing of taxonomic and thematic semantic relations. METHOD: We used the visual world paradigm coupled with eye-movement recording, comparing the performance of healthy younger and older adults on a word-to-picture matching task in which participants had to identify each target among semantically related (taxonomic or thematic) and unrelated distractors. RESULTS: Younger and older participants exhibited similar patterns of gaze fixations in the two semantic conditions. The effect of aging took the form of an overall reduction in sensitivity to semantic competitors, with no difference between the taxonomic and thematic conditions. Moreover, comparison of the proportions of fixations between the younger and older participants indicated that targets were identified equally quickly in both age groups; this was not the case when mouse-click reaction times were analyzed. CONCLUSIONS: The findings argue in favor of nonspecific effects of normal aging on semantic processing that affect taxonomic and thematic processing similarly. This has important clinical implications, as pathological aging has repeatedly been shown to selectively affect either taxonomic or thematic relations. Measuring eye movements in a semantic task is also an interesting approach in the elderly, as eye movements seem to be less affected by aging than other motor responses.
Affiliation(s)
- Catherine Merck
- Service de Neurologie, CMRR Haute Bretagne, CHU Pontchaillou, Rennes, France
- Normandie Univ, UNICAEN, PSL Research University, EPHE, INSERM, U1077, CHU de Caen, Neuropsychologie et Imagerie de la Mémoire Humaine, Caen, France
- Audrey Noël
- Univ Rennes, LP3C (Psychology of Cognition, Behavior & Communication Laboratory), EA 1285, Rennes, France
- Eric Jamet
- Univ Rennes, LP3C (Psychology of Cognition, Behavior & Communication Laboratory), EA 1285, Rennes, France
- Maxime Robert
- Univ Rennes, LP3C (Psychology of Cognition, Behavior & Communication Laboratory), EA 1285, Rennes, France
- Anne Salmon
- Service de Neurologie, CMRR Haute Bretagne, CHU Pontchaillou, Rennes, France
- Serge Belliard
- Service de Neurologie, CMRR Haute Bretagne, CHU Pontchaillou, Rennes, France
- Normandie Univ, UNICAEN, PSL Research University, EPHE, INSERM, U1077, CHU de Caen, Neuropsychologie et Imagerie de la Mémoire Humaine, Caen, France
- Solène Kalénine
- Univ. Lille, CNRS, CHU Lille, UMR 9193, SCALab, Sciences Cognitives et Sciences Affectives, Lille, France
12
Hayes TR, Henderson JM. Scene inversion reveals distinct patterns of attention to semantically interpreted and uninterpreted features. Cognition 2022; 229:105231. DOI: 10.1016/j.cognition.2022.105231.
14
Pandey S, Harit G. Handwritten Annotation Spotting in Printed Documents Using Top-Down Visual Saliency Models. ACM Trans Asian Low-Resour Lang Inf Process 2022. DOI: 10.1145/3485468.
Abstract
In this article, we address the problem of localizing textual and symbolic annotations on the scanned image of a printed document. Previous approaches have treated annotation extraction as binary classification into printed and handwritten text. In this work, we further subcategorize annotations as underlines, encirclements, inline text, and marginal text. We have collected a new dataset of 300 documents containing all classes of annotations marked around or in between printed text. Using the dataset as a benchmark, we report the results of two saliency formulations, CRF Saliency and Discriminant Saliency, for predicting salient patches that can correspond to different types of annotations. We also compare our work with recent semantic segmentation techniques using deep models. Our analysis shows that Discriminant Saliency can be considered the preferred approach for fast localization of patches containing different types of annotations. The saliency models were learned on a small dataset but still give performance comparable to the deep networks for pixel-level semantic segmentation. We show that saliency-based methods give better outcomes with limited annotated data than more sophisticated segmentation techniques, which require a large training set to learn the model.
Affiliation(s)
- Shilpa Pandey
- Adani Institute of Infrastructure Engineering, Ahmedabad, Gujarat, India
- Gaurav Harit
- Indian Institute of Technology Jodhpur, Jodhpur, Rajasthan, India
15
Ghosh S, D'Angelo G, Glover A, Iacono M, Niebur E, Bartolozzi C. Event-driven proto-object based saliency in 3D space to attract a robot's attention. Sci Rep 2022; 12:7645. PMID: 35538154; PMCID: PMC9090933; DOI: 10.1038/s41598-022-11723-6.
Abstract
To interact with its environment, a robot working in 3D space needs to organise its visual input in terms of objects or their perceptual precursors, proto-objects. Among other visual cues, depth is a submodality used to direct attention to visual features and objects. Current depth-based proto-object attention models have been implemented for standard RGB-D cameras that produce synchronous frames. In contrast, event cameras are neuromorphic sensors that loosely mimic the function of the human retina by asynchronously encoding per-pixel brightness changes at very high temporal resolution, thereby providing advantages such as high dynamic range, efficiency (thanks to their high degree of signal compression), and low latency. We propose a bio-inspired bottom-up attention model that exploits event-driven sensing to generate depth-based saliency maps that allow a robot to interact with complex visual input. We use event cameras mounted in the eyes of the iCub humanoid robot to directly extract edge, disparity and motion information. Real-world experiments demonstrate that our system robustly selects salient objects near the robot in the presence of clutter and dynamic scene changes, for the benefit of downstream applications such as object segmentation, tracking and robot interaction with external objects.
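Event cameras emit a sparse stream of (timestamp, x, y, polarity) tuples rather than frames, so a common first step is to fold events into a time-decaying activity surface. The toy below shows only that accumulation step with a trivial "recent activity" readout; it is a far simpler stand-in for the edge/disparity/motion-based proto-object model described above, and the decay constant and synthetic event stream are illustrative assumptions.

```python
import numpy as np

def event_surface(events, shape=(64, 64), decay=0.98):
    """Fold an asynchronous event stream into a decaying activity surface.

    Each event bumps its pixel; all pixels fade between events, so the surface
    emphasizes recent, dense activity (a crude bottom-up saliency signal).
    """
    surface = np.zeros(shape)
    for t, x, y, polarity in events:   # polarity is ignored in this toy readout
        surface *= decay               # older events fade
        surface[y, x] += 1.0
    return surface / (surface.max() + 1e-9)

# A cluster of events at one location against sparse background noise.
events = ([(t, 20, 31, 1) for t in range(30)]
          + [(30 + t, 5 * t % 64, 7 * t % 64, -1) for t in range(10)])
surface = event_surface(events)
```

The cluster dominates the surface because its bumps reinforce each other faster than the decay removes them, whereas isolated background events fade away; proto-object models then group such activity into coherent candidate objects.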
Affiliation(s)
- Suman Ghosh
- Event Driven Perception for Robotics, Istituto Italiano di Tecnologia, 16163, Genoa, Italy
- Electrical Engineering and Computer Science, Technische Universität Berlin, 10623, Berlin, Germany
- Giulia D'Angelo
- Event Driven Perception for Robotics, Istituto Italiano di Tecnologia, 16163, Genoa, Italy
- Department of Computer Science, The University of Manchester, Manchester, M13 9PL, UK
- Arren Glover
- Event Driven Perception for Robotics, Istituto Italiano di Tecnologia, 16163, Genoa, Italy
- Massimiliano Iacono
- Event Driven Perception for Robotics, Istituto Italiano di Tecnologia, 16163, Genoa, Italy
- Ernst Niebur
- Mind/Brain Institute, Johns Hopkins University, Baltimore, MD 21218, USA
- Chiara Bartolozzi
- Event Driven Perception for Robotics, Istituto Italiano di Tecnologia, 16163, Genoa, Italy.
16
Zemliak V, MacInnes WJ. The Spatial Leaky Competing Accumulator Model. Front Comput Sci 2022. [DOI: 10.3389/fcomp.2022.866029]
Abstract
The Leaky Competing Accumulator model (LCA) of Usher and McClelland is able to simulate the time course of perceptual decision making between an arbitrary number of stimuli. Reaction times, such as saccadic latencies, produce a typical distribution that is skewed toward longer latencies, and accumulator models have shown excellent fit to these distributions. We propose a new implementation called the Spatial Leaky Competing Accumulator (SLCA), which can be used to predict the timing of subsequent fixation durations during a visual task. SLCA uses a pre-existing saliency map as input and represents accumulation neurons as a two-dimensional grid to generate predictions in visual space. The SLCA builds on several biologically motivated parameters: leakage, recurrent self-excitation, randomness and non-linearity, and we also test two implementations of lateral inhibition. Global lateral inhibition, as implemented in the original model of Usher and McClelland, is applied to all competing neurons, while a local implementation allows only inhibition of immediate neighbors. We trained versions of the SLCA with both global and local lateral inhibition using a genetic algorithm, and compared their performance in simulating the human fixation latency distribution in a foraging task. Although both implementations were able to produce a positively skewed latency distribution, only the local SLCA was able to match the human data distribution from the foraging task. We discuss our model's potential in models of salience and priority, and its benefits compared to other models such as the leaky integrate-and-fire network.
17
Visual Landing Based on the Human Depth Perception in Limited Visibility and Failure of Avionic Systems. Comput Intell Neurosci 2022; 2022:4320101. [PMID: 35498171] [PMCID: PMC9054408] [DOI: 10.1155/2022/4320101]
Abstract
This paper introduces a novel visual landing system for the accurate landing of commercial aircraft utilizing human depth perception algorithms, called the 3D Model Landing System (3DMLS). The 3DMLS uses a simulation environment for visual landing during failure of navigation aids/avionics, adverse weather conditions, and limited visibility. To simulate the approach path and surrounding area, the 3DMLS implements both the inertial measurement unit (IMU) and the digital elevation model (DEM). While the aircraft is in the instrument landing system (ILS) range, the 3DMLS simulates more details of the environment in addition to implementing the DOF depth perception algorithm to provide a clear visual landing path. This path is displayed on a multifunction display in the cockpit for pilots. As the pilot's eye concentrates mostly on the runway location and touch-down point, “the runway” becomes the center of focus in the environment simulation. To display and evaluate the performance of the 3DMLS and depth perception, a landing auto test is also designed and implemented to guide the aircraft along the runway. The flight path is derived simultaneously by comparison of the current aircraft and runway positions. Unity and MATLAB are used to model the 3DMLS. The accuracy and quality of the simulated environment in terms of resolution, field of view, frames per second, and latency are confirmed against FSTD visual requirements. Finally, the saliency map toolbox shows that the depth of field (DOF) implementation increases the pilot's concentration, resulting in safe landing guidance.
18
Martin D, Serrano A, Bergman AW, Wetzstein G, Masia B. ScanGAN360: A Generative Model of Realistic Scanpaths for 360° Images. IEEE Trans Vis Comput Graph 2022; 28:2003-2013. [PMID: 35167469] [DOI: 10.1109/tvcg.2022.3150502]
Abstract
Understanding and modeling the dynamics of human gaze behavior in 360° environments is crucial for creating, improving, and developing emerging virtual reality applications. However, recruiting human observers and acquiring enough data to analyze their behavior when exploring virtual environments requires complex hardware and software setups, and can be time-consuming. Being able to generate virtual observers can help overcome this limitation, and thus stands as an open problem in this medium. Particularly, generative adversarial approaches could alleviate this challenge by generating a large number of scanpaths that reproduce human behavior when observing new scenes, essentially mimicking virtual observers. However, existing methods for scanpath generation do not adequately predict realistic scanpaths for 360° images. We present ScanGAN360, a new generative adversarial approach to address this problem. We propose a novel loss function based on dynamic time warping and tailor our network to the specifics of 360° images. The quality of our generated scanpaths outperforms competing approaches by a large margin, and is almost on par with the human baseline. ScanGAN360 allows fast simulation of large numbers of virtual observers, whose behavior mimics real users, enabling a better understanding of gaze behavior, facilitating experimentation, and aiding novel applications in virtual reality and beyond.
19
Cimminella F, D'Innocenzo G, Sala SD, Iavarone A, Musella C, Coco MI. Preserved Extra-Foveal Processing of Object Semantics in Alzheimer's Disease. J Geriatr Psychiatry Neurol 2022; 35:418-433. [PMID: 34044661] [DOI: 10.1177/08919887211016056]
Abstract
Alzheimer's disease (AD) patients underperform on a range of tasks requiring semantic processing, but it is unclear whether this impairment is due to a generalised loss of semantic knowledge or to issues in accessing and selecting such information from memory. The objective of this eye-tracking visual search study was to determine whether semantic expectancy mechanisms known to support object recognition in healthy adults are preserved in AD patients. Furthermore, as AD patients are often reported to be impaired in accessing information in extra-foveal vision, we investigated whether that was also the case in our study. Twenty AD patients and 20 age-matched controls searched for a target object among an array of distractors presented extra-foveally. The distractors were either semantically related or unrelated to the target (e.g., a car in an array with other vehicles or kitchen items). Results showed that semantically related objects were detected with more difficulty than semantically unrelated objects by both groups, but more markedly by the AD group. Participants looked earlier and for longer at the critical objects when these were semantically unrelated to the distractors. Our findings show that AD patients can process the semantics of objects and access it in extra-foveal vision. This suggests that their impairments in semantic processing may reflect difficulties in accessing semantic information rather than a generalised loss of semantic memory.
Affiliation(s)
- Francesco Cimminella
- Human Cognitive Neuroscience, Psychology, University of Edinburgh, Edinburgh, United Kingdom
- Laboratory of Experimental Psychology, Suor Orsola Benincasa University, Naples, Italy
- Sergio Della Sala
- Human Cognitive Neuroscience, Psychology, University of Edinburgh, Edinburgh, United Kingdom
- Caterina Musella
- Associazione Italiana Malattia d'Alzheimer (AIMA sezione Campania), Naples, Italy
- Moreno I Coco
- Faculdade de Psicologia, Universidade de Lisboa, Lisbon, Portugal
- School of Psychology, The University of East London, London, United Kingdom
20
Qianchen L, Gallagher RM, Tsuchiya N. How much can we differentiate at a brief glance: revealing the truer limit in conscious contents through the massive report paradigm (MRP). R Soc Open Sci 2022; 9:210394. [PMID: 35619998] [PMCID: PMC9128849] [DOI: 10.1098/rsos.210394]
Abstract
Upon a brief glance, how well can we differentiate what we see from what we do not? Previous studies answered this question as 'poorly'. This is in stark contrast with our everyday experience. Here, we consider the possibility that previous restrictions in stimulus variability and response alternatives reduced what participants could express from what they consciously experienced. We introduce a novel massive report paradigm that probes the ability to differentiate what we see from what we do not. In each trial, participants viewed a natural scene image and judged whether a small image patch was a part of the original image. To examine the limit of discriminability, we also included subtler changes in the image as modifications of objects. Neither the images nor patches were repeated per participant. Our results showed that participants were highly accurate (greater than 80%) at distinguishing patches taken from the viewed images from patches that were not present. Additionally, the differentiation between original and modified objects was influenced by object sizes and/or the congruence between objects and the scene gists. Our massive report paradigm opens the door to quantitatively measuring the immense informativeness of a moment of consciousness.
Affiliation(s)
- Liang Qianchen
- School of Psychological Sciences, Faculty of Medicine, Nursing and Health Sciences, Monash University, Clayton, Victoria, Australia
- Turner Institute for Brain and Mental Health, Monash University, Melbourne, Victoria, Australia
- Regan M. Gallagher
- School of Psychological Sciences, Faculty of Medicine, Nursing and Health Sciences, Monash University, Clayton, Victoria, Australia
- Turner Institute for Brain and Mental Health, Monash University, Melbourne, Victoria, Australia
- Naotsugu Tsuchiya
- School of Psychological Sciences, Faculty of Medicine, Nursing and Health Sciences, Monash University, Clayton, Victoria, Australia
- Turner Institute for Brain and Mental Health, Monash University, Melbourne, Victoria, Australia
- Center for Information and Neural Networks (CiNet), Osaka, Japan
- Advanced Telecommunications Research Computational Neuroscience Laboratories, Kyoto, Japan
21
Kümmerer M, Bethge M, Wallis TSA. DeepGaze III: Modeling free-viewing human scanpaths with deep learning. J Vis 2022; 22:7. [PMID: 35472130] [PMCID: PMC9055565] [DOI: 10.1167/jov.22.5.7]
Abstract
Humans typically move their eyes in “scanpaths” of fixations linked by saccades. Here we present DeepGaze III, a new model that predicts the spatial location of consecutive fixations in a free-viewing scanpath over static images. DeepGaze III is a deep learning–based model that combines image information with information about the previous fixation history to predict where a participant might fixate next. As a high-capacity and flexible model, DeepGaze III captures many relevant patterns in the human scanpath data, setting a new state of the art in the MIT300 dataset and thereby providing insight into how much information in scanpaths across observers exists in the first place. We use this insight to assess the importance of mechanisms implemented in simpler, interpretable models for fixation selection. Due to its architecture, DeepGaze III allows us to disentangle several factors that play an important role in fixation selection, such as the interplay of scene content and scanpath history. The modular nature of DeepGaze III allows us to conduct ablation studies, which show that scene content has a stronger effect on fixation selection than previous scanpath history in our main dataset. In addition, we can use the model to identify scenes for which the relative importance of these sources of information differs most. These data-driven insights would be difficult to accomplish with simpler models that do not have the computational capacity to capture such patterns, demonstrating an example of how deep learning advances can be used to contribute to scientific understanding.
Affiliation(s)
- Thomas S A Wallis
- Technical University of Darmstadt, Institute of Psychology and Centre for Cognitive Science, Darmstadt, Germany
22
Miuccio MT, Zelinsky GJ, Schmidt J. Are all real-world objects created equal? Estimating the "set-size" of the search target in visual working memory. Psychophysiology 2022; 59:e13998. [PMID: 35001411] [PMCID: PMC8957527] [DOI: 10.1111/psyp.13998]
Abstract
Are all real-world objects created equal? Visual search difficulty increases with the number of targets and as target-related visual working memory (VWM) load increases. Our goal was to investigate the load imposed by individual real-world objects held in VWM in the context of search. Measures of visual clutter attempt to quantify real-world set-size in the context of scenes. We applied one of these measures, the number of proto-objects, to individual real-world objects and used contralateral delay activity (CDA) to measure the resulting VWM load. The current study presented a real-world object as a target cue, followed by a delay where CDA was measured. This was followed by a four-object search array. We compared CDA and later search performance from target cues containing a high or low number of proto-objects. High proto-object target cues resulted in greater CDA, longer search RTs, target dwell times, and reduced search guidance, relative to low proto-object targets. These findings demonstrate that targets with more proto-objects result in a higher VWM load and reduced search performance. This shows that the number of proto-objects contained within individual objects produce set-size like effects in VWM and suggests proto-objects may be a viable unit of measure of real-world VWM load. Importantly, this demonstrates that not all real-world objects are created equal.
23
Lv W, Xu H, Han X, Zhang H, Ma J, Rahmim A, Lu L. Context-Aware Saliency Guided Radiomics: Application to Prediction of Outcome and HPV-Status from Multi-Center PET/CT Images of Head and Neck Cancer. Cancers (Basel) 2022; 14:1674. [PMID: 35406449] [PMCID: PMC8996849] [DOI: 10.3390/cancers14071674]
Abstract
Simple Summary: This study investigated the ability of context-aware saliency-guided PET/CT radiomics to predict outcome and HPV status in head and neck cancer. In total, 806 HNC patients (training vs. validation vs. external testing: 500 vs. 97 vs. 209) from 9 centers were collected from The Cancer Imaging Archive (TCIA). Saliency-guided radiomics showed enhanced performance for both outcome and HPV-status predictions relative to conventional radiomics. The radiomics-predicted HPV status also showed complementary prognostic value. This multi-center study highlights the feasibility of saliency-guided PET/CT radiomics in outcome predictions of head and neck cancer, confirming that certain regions are more relevant to tumor aggressiveness and prognosis.
Abstract: Purpose: This multi-center study aims to investigate the prognostic value of context-aware saliency-guided radiomics in 18F-FDG PET/CT images of head and neck cancer (HNC). Methods: 806 HNC patients (training vs. validation vs. external testing: 500 vs. 97 vs. 209) from 9 centers were collected from The Cancer Imaging Archive (TCIA). There were 100/384 and 60/123 oropharyngeal carcinoma (OPC) patients with human papillomavirus (HPV) status in the training and testing cohorts, respectively. Six types of images were used for radiomics feature extraction and model construction: (i) the original image (Origin); (ii) a context-aware saliency map (SalMap); (iii, iv) high- or low-saliency regions in the original image (highSal or lowSal); (v) a saliency-weighted image (SalxImg); and (vi) a fused PET-CT image (FusedImg). Four outcomes were evaluated: recurrence-free survival (RFS), metastasis-free survival (MFS), overall survival (OS), and disease-free survival (DFS). Multivariate Cox analysis and logistic regression were adopted to construct radiomics scores for the prediction of outcome (Rad_Ocm) and HPV status (Rad_HPV), respectively. In addition, the prognostic value of their integration (Rad_Ocm_HPV) was investigated. Results: In the external testing cohort, compared with the Origin model, SalMap and SalxImg achieved the highest C-indices for RFS (0.621 vs. 0.559) and MFS (0.785 vs. 0.739) predictions, respectively, while FusedImg performed best for both OS (0.685 vs. 0.659) and DFS (0.641 vs. 0.582) predictions. In the OPC HPV testing cohort, FusedImg showed a higher AUC for HPV-status prediction than the Origin model (0.653 vs. 0.484). In the OPC testing cohort, compared with Rad_Ocm or Rad_HPV alone, Rad_Ocm_HPV performed best for OS and DFS predictions, with C-indices of 0.702 (p = 0.002) and 0.684 (p = 0.006), respectively. Conclusion: Saliency-guided radiomics showed enhanced performance for both outcome and HPV-status predictions relative to conventional radiomics. The radiomics-predicted HPV status also showed complementary prognostic value.
Affiliation(s)
- Wenbing Lv
- School of Biomedical Engineering, Southern Medical University, 1023 Shatai Road, Guangzhou 510515, China; (W.L.); (H.X.); (X.H.); (J.M.)
- Guangdong Provincial Key Laboratory of Medical Image Processing, Southern Medical University, 1023 Shatai Road, Guangzhou 510515, China
- Guangdong Province Engineering Laboratory for Medical Imaging and Diagnostic Technology, Southern Medical University, 1023 Shatai Road, Guangzhou 510515, China
- Pazhou Lab, Guangzhou 510330, China
- Hui Xu
- School of Biomedical Engineering, Southern Medical University, 1023 Shatai Road, Guangzhou 510515, China; (W.L.); (H.X.); (X.H.); (J.M.)
- Guangdong Provincial Key Laboratory of Medical Image Processing, Southern Medical University, 1023 Shatai Road, Guangzhou 510515, China
- Guangdong Province Engineering Laboratory for Medical Imaging and Diagnostic Technology, Southern Medical University, 1023 Shatai Road, Guangzhou 510515, China
- Pazhou Lab, Guangzhou 510330, China
- Xu Han
- School of Biomedical Engineering, Southern Medical University, 1023 Shatai Road, Guangzhou 510515, China; (W.L.); (H.X.); (X.H.); (J.M.)
- Guangdong Provincial Key Laboratory of Medical Image Processing, Southern Medical University, 1023 Shatai Road, Guangzhou 510515, China
- Guangdong Province Engineering Laboratory for Medical Imaging and Diagnostic Technology, Southern Medical University, 1023 Shatai Road, Guangzhou 510515, China
- Pazhou Lab, Guangzhou 510330, China
- Hao Zhang
- Department of Medical Imaging, Nanfang Hospital, Southern Medical University, 1023 Shatai Road, Guangzhou 510515, China;
- Jianhua Ma
- School of Biomedical Engineering, Southern Medical University, 1023 Shatai Road, Guangzhou 510515, China; (W.L.); (H.X.); (X.H.); (J.M.)
- Guangdong Provincial Key Laboratory of Medical Image Processing, Southern Medical University, 1023 Shatai Road, Guangzhou 510515, China
- Guangdong Province Engineering Laboratory for Medical Imaging and Diagnostic Technology, Southern Medical University, 1023 Shatai Road, Guangzhou 510515, China
- Pazhou Lab, Guangzhou 510330, China
- Arman Rahmim
- Department of Integrative Oncology, BC Cancer Research Institute, 675 West 10th Avenue, Vancouver, BC V5Z 1L3, Canada;
- Department of Radiology, University of British Columbia, 2211 Wesbrook Mall, Vancouver, BC V6T 1Z1, Canada
- Department of Physics, University of British Columbia, 6224 Agricultural Road, Vancouver, BC V6T 1Z1, Canada
- Lijun Lu
- School of Biomedical Engineering, Southern Medical University, 1023 Shatai Road, Guangzhou 510515, China; (W.L.); (H.X.); (X.H.); (J.M.)
- Guangdong Provincial Key Laboratory of Medical Image Processing, Southern Medical University, 1023 Shatai Road, Guangzhou 510515, China
- Guangdong Province Engineering Laboratory for Medical Imaging and Diagnostic Technology, Southern Medical University, 1023 Shatai Road, Guangzhou 510515, China
- Pazhou Lab, Guangzhou 510330, China
- Correspondence: ; Tel.: +86-020-62789116
24
Chakraborty S, Samaras D, Zelinsky GJ. Weighting the factors affecting attention guidance during free viewing and visual search: The unexpected role of object recognition uncertainty. J Vis 2022; 22:13. [PMID: 35323870] [PMCID: PMC8963662] [DOI: 10.1167/jov.22.4.13]
Abstract
The factors determining how attention is allocated during visual tasks have been studied for decades, but few studies have attempted to model the weighting of several of these factors within and across tasks to better understand their relative contributions. Here we consider the roles of saliency, center bias, target features, and object recognition uncertainty in predicting the first nine changes in fixation made during free viewing and visual search tasks in the OSIE and COCO-Search18 datasets, respectively. We focus on the latter-most and least familiar of these factors by proposing a new method of quantifying uncertainty in an image, one based on object recognition. We hypothesize that the greater the number of object categories competing for an object proposal, the greater the uncertainty of how that object should be recognized and, hence, the greater the need for attention to resolve this uncertainty. As expected, we found that target features best predicted target-present search, with their dominance obscuring the use of other features. Unexpectedly, we found that target features were only weakly used during target-absent search. We also found that object recognition uncertainty outperformed an unsupervised saliency model in predicting free-viewing fixations, although saliency was slightly more predictive of search. We conclude that uncertainty in object recognition, a measure that is image computable and highly interpretable, is better than bottom-up saliency in predicting attention during free viewing.
Affiliation(s)
- Dimitris Samaras
- Department of Computer Science, Stony Brook University, Stony Brook, NY, USA
- Gregory J Zelinsky
- Department of Psychology, Stony Brook University, Stony Brook, NY, USA
- Department of Computer Science, Stony Brook University, Stony Brook, NY, USA
25
A self-learning cognitive architecture exploiting causality from rewards. Neural Netw 2022; 150:274-292. [DOI: 10.1016/j.neunet.2022.02.029]
26
Delmas M, Caroux L, Lemercier C. Searching in clutter: Visual behavior and performance of expert action video game players. Appl Ergon 2022; 99:103628. [PMID: 34717071] [DOI: 10.1016/j.apergo.2021.103628]
Abstract
Searching for targets among distractors in visual scenes can be made more difficult by the presence of clutter. However, studies in various domains have shown differentiated effects according to the expertise of the searcher. The present study extended these findings to the domain of action video game expertise. Fifty-eight participants, split into two groups (action video game players and non-action video game players), searched for targets in visual scenes under two clutter conditions (uncluttered and high clutter). Reaction times and accuracy served as measures of performance, and visual behavior was assessed using the number and duration of eye fixations. Our findings suggest that visual clutter has a negative influence on performance and alters visual behavior during visual search in action video game scenes. Our results also suggest that expert action video game players might use different visual strategies to cope with clutter, though with no performance benefit.
Affiliation(s)
- Maxime Delmas
- Cognition, Languages, Language and Ergonomics (CLLE) Laboratory, University of Toulouse - Jean Jaurès & CNRS, Toulouse, France.
- Loïc Caroux
- Cognition, Languages, Language and Ergonomics (CLLE) Laboratory, University of Toulouse - Jean Jaurès & CNRS, Toulouse, France.
- Céline Lemercier
- Cognition, Languages, Language and Ergonomics (CLLE) Laboratory, University of Toulouse - Jean Jaurès & CNRS, Toulouse, France.
27
Helo A, Guerra E, Coloma CJ, Aravena-Bravo P, Rämä P. Do Children With Developmental Language Disorder Activate Scene Knowledge to Guide Visual Attention? Effect of Object-Scene Inconsistencies on Gaze Allocation. Front Psychol 2022; 12:796459. [PMID: 35069387] [PMCID: PMC8776641] [DOI: 10.3389/fpsyg.2021.796459]
Abstract
Our visual environment is highly predictable in terms of where and in which locations objects can be found. Based on visual experience, children extract rules about visual scene configurations, allowing them to generate scene knowledge. Similarly, children extract the linguistic rules from relatively predictable linguistic contexts. It has been proposed that the capacity of extracting rules from both domains might share some underlying cognitive mechanisms. In the present study, we investigated the link between language and scene knowledge development. To do so, we assessed whether preschool children (age range = 5;4–6;6) with Developmental Language Disorder (DLD), who present several difficulties in the linguistic domain, are equally attracted to object-scene inconsistencies in a visual free-viewing task in comparison with age-matched children with Typical Language Development (TLD). All children explored visual scenes containing semantic (e.g., soap on a breakfast table), syntactic (e.g., bread on the chair back), or both inconsistencies (e.g., soap on the chair back). Since scene knowledge interacts with image properties (i.e., saliency) to guide gaze allocation during visual exploration from the early stages of development, we also included the objects’ saliency rank in the analysis. The results showed that children with DLD were less attracted to semantic and syntactic inconsistencies than children with TLD. In addition, saliency modulated syntactic effect only in the group of children with TLD. Our findings indicate that children with DLD do not activate scene knowledge to guide visual attention as efficiently as children with TLD, especially at the syntactic level, suggesting a link between scene knowledge and language development.
Affiliation(s)
- Andrea Helo
- Departamento de Fonoaudiología, Facultad de Medicina, Universidad de Chile, Santiago, Chile
- Departamento de Neurociencias, Facultad de Medicina, Universidad de Chile, Santiago, Chile
- Centro de Investigación Avanzada en Educación, Instituto de Educación-IE, Universidad de Chile, Santiago, Chile
- Ernesto Guerra
- Centro de Investigación Avanzada en Educación, Instituto de Educación-IE, Universidad de Chile, Santiago, Chile
- Carmen Julia Coloma
- Departamento de Fonoaudiología, Facultad de Medicina, Universidad de Chile, Santiago, Chile
- Centro de Investigación Avanzada en Educación, Instituto de Educación-IE, Universidad de Chile, Santiago, Chile
- Paulina Aravena-Bravo
- Departamento de Fonoaudiología, Facultad de Medicina, Universidad de Chile, Santiago, Chile
- Escuela de Psicología, Pontificia Universidad Católica de Chile, Santiago, Chile
- Pia Rämä
- Integrative Neuroscience and Cognition Center (UMR 8002), CNRS, Université Paris Descartes, Paris, France
28
Conte S, Baccolo E, Bulf H, Proietti V, Macchi Cassia V. Infants' visual exploration strategies for adult and child faces. Infancy 2022; 27:492-514. [PMID: 35075767] [DOI: 10.1111/infa.12458]
Abstract
By the end of the first year of life, infants' discrimination abilities tune to frequently experienced face groups. Little is known about the exploration strategies adopted to efficiently discriminate frequent, familiar face types. The present eye-tracking study examined the distribution of visual fixations produced by 10-month-old and 4-month-old singletons while learning adult (i.e., familiar) and child (i.e., unfamiliar) White faces. Infants were tested in an infant-controlled visual habituation task, in which post-habituation preference measured successful discrimination. Results confirmed earlier evidence that, without sibling experience, 10-month-olds discriminate only among adult faces. Analyses of gaze movements during habituation showed that infants' fixations were centered in the upper part of the stimuli. The mouth was sampled longer in adult faces than in child faces, while the child eyes were sampled longer and more frequently than the adult eyes. At 10 months, but not at 4 months, global measures of scanning behavior on the whole face also varied according to face age, as the spatiotemporal distribution of scan paths showed larger within- and between-participants similarity for adult faces than for child faces. Results are discussed with reference to the perceptual narrowing literature, and the influence of age-appropriate developmental tasks on infants' face processing abilities.
Affiliation(s)
- Stefania Conte
- Department of Psychology, University of South Carolina, Columbia, South Carolina, USA
- Elisa Baccolo
- Department of Psychology, University of Milano-Bicocca, Milano, Italy
- Hermann Bulf
- Department of Psychology, University of Milano-Bicocca, Milano, Italy
- Valentina Proietti
- Department of Psychology, Trinity Western University, Langley, British Columbia, Canada
29
King J, Markant J. Selective attention to lesson-relevant contextual information promotes 3- to 5-year-old children's learning. Dev Sci 2022; 25:e13237. [DOI: 10.1111/desc.13237]
Affiliation(s)
- Jill King
- Neuroscience Program, Tulane University, New Orleans, Louisiana 70118
- Tulane Brain Institute, Tulane University, New Orleans, Louisiana 70118
- Julie Markant
- Department of Psychology, Tulane University, New Orleans, Louisiana 70118
- Tulane Brain Institute, Tulane University, New Orleans, Louisiana 70118
30
Martinez-Cedillo AP, Dent K, Foulsham T. Do cognitive load and ADHD traits affect the tendency to prioritise social information in scenes? Q J Exp Psychol (Hove) 2022; 75:1904-1918. [PMID: 34844477 PMCID: PMC9424720 DOI: 10.1177/17470218211066475]
Abstract
We report two experiments investigating the effect of working memory (WM) load on selective attention. Experiment 1 was a modified version of Lavie et al. and confirmed that increasing memory load disrupted performance in the classic flanker task. Experiment 2 used the same manipulation of WM load to probe attention during the viewing of complex scenes while also investigating individual differences in attention deficit hyperactivity disorder (ADHD) traits. In the image-viewing task, we measured the degree to which fixations targeted each of two crucial objects: (1) a social object (a person in the scene) and (2) a non-social object of higher or lower physical salience. We compared the extent to which increasing WM load would change the pattern of viewing of the physically salient and socially salient objects. If attending to the social item requires voluntary top-down resources by default, then the viewing of social objects should show stronger modulation by WM load compared with viewing of physically salient objects. The results showed that the social object was fixated to a greater degree than the other object (regardless of physical salience). Increased salience drew fixations away from the background, leading to slightly increased fixations on the non-social object, without changing fixations on the social object. Increased levels of ADHD-like traits were associated with fewer fixations on the social object, but only in the high-salient, low-load condition. Importantly, WM load did not affect the number of fixations on the social object. These findings suggest, rather surprisingly, that attending to a social area in complex stimuli is not dependent on the availability of voluntary top-down resources.
Affiliation(s)
- Kevin Dent
- Department of Psychology, University of Essex, Colchester, UK
- Tom Foulsham
- Department of Psychology, University of Essex, Colchester, UK
31
Malladi SPK, Mukherjee J, Larabi MC, Chaudhury S. EG-SNIK: A Free Viewing Egocentric Gaze Dataset and Its Applications. IEEE Access 2022; 10:129626-129641. [DOI: 10.1109/access.2022.3228484]
Affiliation(s)
- Sai Phani Kumar Malladi
- Advanced Technology Development Centre, Indian Institute of Technology Kharagpur, Kharagpur, India
- Jayanta Mukherjee
- Department of Computer Science and Engineering, IIT Kharagpur, Kharagpur, India
- Santanu Chaudhury
- Department of Computer Science and Engineering, IIT Jodhpur, Jodhpur, India
32
Kano F, Furuichi T, Hashimoto C, Krupenye C, Leinwand JG, Hopper LM, Martin CF, Otsuka R, Tajima T. What is unique about the human eye? Comparative image analysis on the external eye morphology of human and nonhuman great apes. Evol Hum Behav 2021. [DOI: 10.1016/j.evolhumbehav.2021.12.004]
33
Song H, Park BY, Park H, Shim WM. Cognitive and Neural State Dynamics of Narrative Comprehension. J Neurosci 2021; 41:8972-8990. [PMID: 34531284 PMCID: PMC8549535 DOI: 10.1523/jneurosci.0037-21.2021]
Abstract
Narrative comprehension involves a constant interplay of the accumulation of incoming events and their integration into a coherent structure. This study characterizes cognitive states during narrative comprehension and the network-level reconfiguration occurring dynamically in the functional brain. We presented movie clips of temporally scrambled sequences to human participants (male and female), eliciting fluctuations in the subjective feeling of comprehension. Comprehension occurred when processing events that were highly causally related to the previous events, suggesting that comprehension entails the integration of narratives into a causally coherent structure. The functional neuroimaging results demonstrated that the integrated and efficient brain state emerged during the moments of narrative integration with the increased level of activation and across-modular connections in the default mode network. Underlying brain states were synchronized across individuals when comprehending novel narratives, with increased occurrences of the default mode network state, integrated with the sensory processing network, during narrative integration. A model based on time-resolved functional brain connectivity predicted changing cognitive states related to comprehension that are general across narratives. Together, these results support adaptive reconfiguration and interaction of the functional brain networks on causal integration of the narratives.
Significance Statement: The human brain can integrate temporally disconnected pieces of information into coherent narratives. However, the underlying cognitive and neural mechanisms of how the brain builds a narrative representation remain largely unknown. We showed that comprehension occurs as the causally related events are integrated to form a coherent situational model. Using fMRI, we revealed that the large-scale brain states and interaction between brain regions dynamically reconfigure as comprehension evolves, with the default mode network playing a central role during moments of narrative integration. Overall, the study demonstrates that narrative comprehension occurs through a dynamic process of information accumulation and causal integration, supported by the time-varying reconfiguration and brain network interaction.
Affiliation(s)
- Hayoung Song
- Center for Neuroscience Imaging Research, IBS, Suwon, Korea, 16419
- Department of Biomedical Engineering, Sungkyunkwan University, Suwon, Korea, 16419
- Department of Psychology, University of Chicago, Chicago, Illinois, 60637
- Bo-Yong Park
- Center for Neuroscience Imaging Research, IBS, Suwon, Korea, 16419
- Department of Electronic, Electrical and Computer Engineering, Sungkyunkwan University, Suwon, Korea, 16419
- McConnell Brain Imaging Centre, Montreal Neurological Institute and Hospital, McGill University, Montreal, Quebec, Canada H3A 2B4
- Department of Data Science, Inha University, Incheon, Korea, 22201
- Hyunjin Park
- Center for Neuroscience Imaging Research, IBS, Suwon, Korea, 16419
- School of Electronics and Electrical Engineering, Sungkyunkwan University, Suwon, Korea, 16419
- Won Mok Shim
- Center for Neuroscience Imaging Research, IBS, Suwon, Korea, 16419
- Department of Biomedical Engineering, Sungkyunkwan University, Suwon, Korea, 16419
- Department of Intelligent Precision Healthcare Convergence, Sungkyunkwan University, Suwon, Korea, 16419
34
Li W, Guan J, Shi W. Increasing the load on executive working memory reduces the search performance in the natural scenes: Evidence from eye movements. Curr Psychol 2021. [DOI: 10.1007/s12144-021-02270-w]
35
Ogawa A. Time-varying measures of cerebral network centrality correlate with visual saliency during movie watching. Brain Behav 2021; 11:e2334. [PMID: 34435748 PMCID: PMC8442596 DOI: 10.1002/brb3.2334]
Abstract
The extensive development of graph-theoretic analysis for functional connectivity has revealed the multifaceted characteristics of brain networks. Network centralities identify the principal functional regions, individual differences, and hub structure in brain networks. Neuroimaging studies using movie-watching have investigated brain function under naturalistic stimuli. Visual saliency is one of the promising measures for revealing cognition and emotions driven by naturalistic stimuli. This study investigated whether the visual saliency in movies was associated with network centrality. The study examined eigenvector centrality (EC), a measure of a region's influence in the brain network; the participation coefficient (PC), which reflects the hub structure of the brain, was used for comparison. Static and time-varying EC and PC were analyzed by a parcel-based technique. While EC was correlated with brain activity in parcels in the visual and auditory areas during movie-watching, it was only correlated with parcels in the visual areas in the retinotopy task. In addition, high PC was consistently observed in parcels in the putative hub both during the tasks and the resting-state condition. Time-varying EC in the parietal parcels and time-varying PC in the primary sensory parcels significantly correlated with visual saliency in the movies. These results suggest that time-varying centralities in brain networks are distinctively associated with perceptual processing and subsequent higher processing of visual saliency.
Affiliation(s)
- Akitoshi Ogawa
- Faculty of Medicine, Juntendo University, Bunkyo-ku, Tokyo, Japan
- Brain Science Institute, Tamagawa University, Machida, Tokyo, Japan
36
Human scanpath estimation based on semantic segmentation guided by common eye fixation behaviors. Neurocomputing 2021. [DOI: 10.1016/j.neucom.2020.07.121]
37
The roles of symbolic and numerical representations in asymmetric visual search. Acta Psychol (Amst) 2021; 219:103397. [PMID: 34392013 DOI: 10.1016/j.actpsy.2021.103397]
Abstract
This study aims to explore the impact of symbolic and numerical representations in asymmetric visual search. Heterogeneous and homogeneous mathematical units, letters, symbols, and numbers were used in the first and second experiments, respectively. The target was searched for among 6 or 12 stimuli. Analysis showed that search efficiency was greatest in LaS (large among small), but performance decreased significantly in SaS (small among small). Moreover, LaS was faster than SaL (small among large). However, SaL was faster than LaL (large among large), and LaL was faster than SaS. The findings of this study showed that search efficiency was inversely proportional to the number of stimuli. This implies that processing of visual stimuli of different sizes and appearances was asymmetric, each requiring a different amount of attention. In summary, these findings strengthen integration theory, similarity theory, and the guided search model.
38
Peters B, Kriegeskorte N. Capturing the objects of vision with neural networks. Nat Hum Behav 2021; 5:1127-1144. [PMID: 34545237 DOI: 10.1038/s41562-021-01194-6]
Abstract
Human visual perception carves a scene at its physical joints, decomposing the world into objects, which are selectively attended, tracked and predicted as we engage our surroundings. Object representations emancipate perception from the sensory input, enabling us to keep in mind that which is out of sight and to use perceptual content as a basis for action and symbolic cognition. Human behavioural studies have documented how object representations emerge through grouping, amodal completion, proto-objects and object files. By contrast, deep neural network models of visual object recognition remain largely tethered to sensory input, despite achieving human-level performance at labelling objects. Here, we review related work in both fields and examine how these fields can help each other. The cognitive literature provides a starting point for the development of new experimental tasks that reveal mechanisms of human object perception and serve as benchmarks driving the development of deep neural network models that will put the object into object recognition.
Affiliation(s)
- Benjamin Peters
- Mortimer B. Zuckerman Mind Brain Behavior Institute, Columbia University, New York, NY, USA.
- Nikolaus Kriegeskorte
- Mortimer B. Zuckerman Mind Brain Behavior Institute, Columbia University, New York, NY, USA
- Department of Psychology, Columbia University, New York, NY, USA
- Department of Neuroscience, Columbia University, New York, NY, USA
- Department of Electrical Engineering, Columbia University, New York, NY, USA
39
Wolf C, Lappe M. Salient objects dominate the central fixation bias when orienting toward images. J Vis 2021; 21:23. [PMID: 34431965 PMCID: PMC8399466 DOI: 10.1167/jov.21.8.23]
Abstract
Short-latency saccades are often biased toward salient objects or toward the center of images, for example, when inspecting photographs of natural scenes. Here, we measured the contribution of salient objects and central fixation bias to visual selection over time. Participants made saccades to images containing one salient object on a structured background and were instructed to look at (i) the image center, (ii) the salient object, or (iii) a cued position halfway in between the two. Results revealed, first, an early involuntary bias toward the image center irrespective of strategic behavior or the location of objects in the image. Second, the salient object bias was stronger than the center bias and prevailed over the latter when they directly competed for visual selection. In a second experiment, we tested whether the center bias depends on how well the image can be segregated from the monitor background. We asked participants to explore images that either did or did not contain a salient object while we manipulated the contrast between image background and monitor background to make the image borders more or less visible. The initial orienting toward the image was not affected by the image-monitor contrast, but only by the presence of objects, with a strong bias toward the center of images containing no object. Yet, a low image-monitor contrast reduced this center bias during the subsequent image exploration.
Affiliation(s)
- Christian Wolf
- Institute for Psychology, University of Muenster, Münster, Germany
- Markus Lappe
- Institute for Psychology, University of Muenster, Münster, Germany
40
Wynn JS, Liu ZX, Ryan JD. Neural Correlates of Subsequent Memory-Related Gaze Reinstatement. J Cogn Neurosci 2021; 34:1547-1562. [PMID: 34272959 DOI: 10.1162/jocn_a_01761]
Abstract
Mounting evidence linking gaze reinstatement (the recapitulation of encoding-related gaze patterns during retrieval) to behavioral measures of memory suggests that eye movements play an important role in mnemonic processing. Yet, the nature of the gaze scanpath, including its informational content and neural correlates, has remained in question. In this study, we examined eye movement and neural data from a recognition memory task to further elucidate the behavioral and neural bases of functional gaze reinstatement. Consistent with previous work, gaze reinstatement during retrieval of freely viewed scene images was greater than chance and predictive of recognition memory performance. Gaze reinstatement was also associated with viewing of informationally salient image regions at encoding, suggesting that scanpaths may encode and contain high-level scene content. At the brain level, gaze reinstatement was predicted by encoding-related activity in the occipital pole and basal ganglia (BG), neural regions associated with visual processing and oculomotor control. Finally, cross-voxel brain pattern similarity analysis revealed overlapping subsequent memory and subsequent gaze reinstatement modulation effects in the parahippocampal place area and hippocampus, in addition to the occipital pole and BG. Together, these findings suggest that encoding-related activity in brain regions associated with scene processing, oculomotor control, and memory supports the formation, and subsequent recapitulation, of functional scanpaths. More broadly, these findings lend support to scanpath theory's assertion that eye movements both encode, and are themselves embedded in, mnemonic representations.
Affiliation(s)
- Jennifer D Ryan
- Rotman Research Institute at Baycrest Health Sciences
- University of Toronto
41
Du R, Varshney A, Potel M. Saliency Computation for Virtual Cinematography in 360° Videos. IEEE Comput Graph Appl 2021; 41:99-106. [PMID: 34264820 DOI: 10.1109/mcg.2021.3080320]
Abstract
Recent advances in virtual reality cameras have contributed to a phenomenal growth of 360° videos. Estimating regions likely to attract user attention is critical for efficiently streaming and rendering 360° videos. In this article, we present a simple, novel, GPU-driven pipeline for saliency computation and virtual cinematography in 360° videos using spherical harmonics (SH). We efficiently compute the 360° video saliency through the spectral residual of the SH coefficients between multiple bands at over 60 FPS for 4K resolution videos. Further, our interactive computation of spherical saliency can be used for saliency-guided virtual cinematography in 360° videos.
42
Golosio B, De Luca C, Capone C, Pastorelli E, Stegel G, Tiddia G, De Bonis G, Paolucci PS. Thalamo-cortical spiking model of incremental learning combining perception, context and NREM-sleep. PLoS Comput Biol 2021; 17:e1009045. [PMID: 34181642 PMCID: PMC8270441 DOI: 10.1371/journal.pcbi.1009045]
Abstract
The brain exhibits capabilities of fast incremental learning from few noisy examples, as well as the ability to associate similar memories in autonomously created categories and to combine contextual hints with sensory perceptions. Together with sleep, these mechanisms are thought to be key components of many high-level cognitive functions. Yet, little is known about the underlying processes and the specific roles of different brain states. In this work, we exploited the combination of context and perception in a thalamo-cortical model based on a soft winner-take-all circuit of excitatory and inhibitory spiking neurons. After calibrating this model to express awake and deep-sleep states with features comparable with biological measures, we demonstrate the model's capability of fast incremental learning from few examples, its resilience when presented with noisy perceptions and contextual signals, and an improvement in visual classification after sleep due to induced synaptic homeostasis and association of similar memories. We created a thalamo-cortical spiking model (ThaCo) with the purpose of demonstrating a link between two phenomena that we believe to be essential for the brain's capability of efficient incremental learning from few examples in noisy environments. Grounded in two experimental observations, the first about the effects of deep sleep on pre- and post-sleep firing rate distributions, the second about the combination of perceptual and contextual information in pyramidal neurons, our model joins these two ingredients. ThaCo alternates phases of incremental learning, classification, and deep sleep. Memories of handwritten digit examples are learned through thalamo-cortical and cortico-cortical plastic synapses. In the absence of noise, the combination of contextual information with perception enables fast incremental learning. Deep sleep becomes crucial when noisy inputs are considered. We observed in ThaCo both homeostatic and associative processes: deep sleep fights noise in perceptual and internal knowledge, and it supports the categorical association of examples belonging to the same digit class through reinforcement of class-specific cortico-cortical synapses. The distributions of pre-sleep and post-sleep firing rates during classification change in a manner similar to those of experimental observations. These changes promote energetic efficiency during recall of memories, better representation of individual memories and categories, and higher classification performance.
Affiliation(s)
- Bruno Golosio
- Dipartimento di Fisica, Università di Cagliari, Cagliari, Italy
- Istituto Nazionale di Fisica Nucleare (INFN), Sezione di Cagliari, Cagliari, Italy
- Chiara De Luca
- Ph.D. Program in Behavioural Neuroscience, “Sapienza” Università di Roma, Rome, Italy
- Istituto Nazionale di Fisica Nucleare (INFN), Sezione di Roma, Rome, Italy
- Cristiano Capone
- Istituto Nazionale di Fisica Nucleare (INFN), Sezione di Roma, Rome, Italy
- Elena Pastorelli
- Ph.D. Program in Behavioural Neuroscience, “Sapienza” Università di Roma, Rome, Italy
- Istituto Nazionale di Fisica Nucleare (INFN), Sezione di Roma, Rome, Italy
- Giovanni Stegel
- Dipartimento di Chimica e Farmacia, Università di Sassari, Sassari, Italy
- Gianmarco Tiddia
- Dipartimento di Fisica, Università di Cagliari, Cagliari, Italy
- Istituto Nazionale di Fisica Nucleare (INFN), Sezione di Cagliari, Cagliari, Italy
- Giulia De Bonis
- Istituto Nazionale di Fisica Nucleare (INFN), Sezione di Roma, Rome, Italy
43
Does task-irrelevant music affect gaze allocation during real-world scene viewing? Psychon Bull Rev 2021; 28:1944-1960. [PMID: 34159530 DOI: 10.3758/s13423-021-01947-4]
Abstract
Gaze control manifests from a dynamic integration of visual and auditory information, with sound providing important cues for how a viewer should behave. Some past research suggests that music, even if entirely irrelevant to the current task demands, may also sway the timing and frequency of fixations. The current work sought to further assess this idea as well as investigate whether task-irrelevant music could also impact how gaze is spatially allocated. In preparation for a later memory test, participants studied pictures of urban scenes in silence or while simultaneously listening to one of two types of music. Eye tracking was recorded, and nine gaze behaviors were measured to characterize the temporal and spatial aspects of gaze control. Findings showed that while these gaze behaviors changed over the course of viewing, music had no impact. Participants in the music conditions, however, did show better memory performance than those who studied in silence. These findings are discussed within theories of multimodal gaze control.
44
Attention capture by trains and faces in children with and without autism spectrum disorder. PLoS One 2021; 16:e0250763. [PMID: 34143788 PMCID: PMC8213190 DOI: 10.1371/journal.pone.0250763]
Abstract
This study examined involuntary capture of attention, overt attention, and stimulus valence and arousal ratings, all factors that can contribute to potential attentional biases to face and train objects in children with and without autism spectrum disorder (ASD). In the visual domain, faces are particularly captivating and are thought to have a 'special status' in the attentional system. Research suggests that similar attentional biases may exist for other objects of expertise (e.g., birds for bird experts), providing support for the role of exposure in attention prioritization. Autistic individuals often have circumscribed interests around certain classes of objects, such as trains, that are related to vehicles and mechanical systems. This research aimed to determine whether this propensity in autistic individuals leads to stronger attention capture by trains, and perhaps weaker attention capture by faces, than what would be expected in non-autistic children. In Experiment 1, autistic children (6-14 years old) and age- and IQ-matched non-autistic children performed a visual search task in which they manually indicated whether a target butterfly appeared amongst an array of face, train, and neutral distractors while their eye movements were tracked. Autistic children were no less susceptible to attention capture by faces than non-autistic children. Overall, for both groups, trains captured attention more strongly than face stimuli, and trains had a larger effect on overt attention to the target stimuli relative to face distractors. In Experiment 2, a new group of children (autistic and non-autistic) rated train stimuli as more interesting and exciting than the face stimuli, with no differences between groups. These results suggest that: (1) other objects (trains) can capture attention in a similar manner as faces, in both autistic and non-autistic children, and (2) attention capture is driven partly by voluntary attentional processes related to personal interest or affective responses to the stimuli.
45
Saarimäki H. Naturalistic Stimuli in Affective Neuroimaging: A Review. Front Hum Neurosci 2021; 15:675068. [PMID: 34220474 PMCID: PMC8245682 DOI: 10.3389/fnhum.2021.675068]
Abstract
Naturalistic stimuli such as movies, music, and spoken and written stories elicit strong emotions and allow brain imaging of emotions in close-to-real-life conditions. Emotions are multi-component phenomena: relevant stimuli lead to automatic changes in multiple functional components including perception, physiology, behavior, and conscious experiences. Brain activity during naturalistic stimuli reflects all these changes, suggesting that parsing emotion-related processing during such complex stimulation is not a straightforward task. Here, I review affective neuroimaging studies that have employed naturalistic stimuli to study emotional processing, focusing especially on experienced emotions. I argue that to investigate emotions with naturalistic stimuli, we need to define and extract emotion features from both the stimulus and the observer.
Affiliation(s)
- Heini Saarimäki
- Human Information Processing Laboratory, Faculty of Social Sciences, Tampere University, Tampere, Finland
46
Sun W, Chen Z, Wu F. Visual Scanpath Prediction Using IOR-ROI Recurrent Mixture Density Network. IEEE Trans Pattern Anal Mach Intell 2021; 43:2101-2118. [PMID: 31796389 DOI: 10.1109/tpami.2019.2956930]
Abstract
A visual scanpath represents the human eye movements made when scanning the visual field to acquire visual information. Predicting visual scanpaths when a certain stimulus is presented plays an important role in modeling overt human visual attention and search behavior. In this paper, we present an 'Inhibition of Return - Region of Interest' (IOR-ROI) recurrent mixture density network framework that learns to produce human-like visual scanpaths under task-free viewing conditions. The proposed model simultaneously predicts a sequence of ordered fixation positions and their corresponding fixation durations. Our model integrates bottom-up features and semantic features extracted by convolutional neural networks. The integrated feature maps are then fed into the IOR-ROI Long Short-Term Memory (LSTM), which is the core component of the proposed model. The IOR-ROI LSTM is a dual LSTM unit, i.e., the IOR-LSTM and the ROI-LSTM, capturing IOR dynamics and gaze shift behavior simultaneously. IOR-LSTM simulates visual working memory to adaptively maintain and update visual information regarding previously fixated regions. ROI-LSTM is responsible for predicting the next possible ROIs given the spatially inhibited image feature maps on a feature-wise basis. Fixation duration is predicted by a regression neural network given the viewing history and the image feature maps corresponding to the currently fixated ROI. Considering eye movement pattern variations among subjects, a mixture density network is adopted to model the next fixation distribution as Gaussian mixtures, and fixation duration is also modeled using a Gaussian distribution. Our model is evaluated on the OSIE and MIT low-resolution eye-tracking datasets, and experimental results indicate that the proposed method achieves superior performance in predicting visual scanpaths. The code will be publicly available at https://github.com/sunwj/scanpath.
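At sampling time, the mixture density output described in this abstract reduces to drawing the next fixation from a weighted set of 2D Gaussians. Below is a minimal, hypothetical sketch of that sampling step (not the authors' released code; the component weights, means, and spreads are invented for illustration):

```python
import numpy as np

def sample_next_fixation(weights, means, stds, rng):
    """Sample an (x, y) fixation from a mixture of axis-aligned 2D Gaussians,
    as a mixture-density output head would parameterize it."""
    weights = np.asarray(weights) / np.sum(weights)  # normalize mixing coefficients
    k = rng.choice(len(weights), p=weights)          # pick one mixture component
    return rng.normal(means[k], stds[k])             # sample from that Gaussian

rng = np.random.default_rng(0)
# Two hypothetical components: one ROI near (100, 120), one near (400, 300)
weights = [0.7, 0.3]
means = np.array([[100.0, 120.0], [400.0, 300.0]])
stds = np.array([[15.0, 15.0], [20.0, 20.0]])
fixations = np.array([sample_next_fixation(weights, means, stds, rng)
                      for _ in range(2000)])
```

Roughly 70% of the sampled fixations land near the first ROI, matching its mixing coefficient; a learned MDN would emit these weights, means, and standard deviations per time step instead of fixing them.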
Collapse
|
47
|
Molin J, Thakur C, Niebur E, Etienne-Cummings R. A Neuromorphic Proto-Object Based Dynamic Visual Saliency Model With a Hybrid FPGA Implementation. IEEE TRANSACTIONS ON BIOMEDICAL CIRCUITS AND SYSTEMS 2021; 15:580-594. [PMID: 34133287 PMCID: PMC8407057 DOI: 10.1109/tbcas.2021.3089622] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/07/2023]
Abstract
Computing and attending to salient regions of a visual scene is an innate and necessary preprocessing step for both biological and engineered systems performing high-level visual tasks such as object detection, tracking, and classification. Preferentially devoting computational resources to salient regions of the visual field improves both computational bandwidth and speed. The human brain computes saliency effortlessly, but modeling this task in engineered systems is challenging. First, we present a neuromorphic dynamic saliency model that is bottom-up and feed-forward, is based on the notion of proto-objects, uses neurophysiologically inspired spatio-temporal features, and requires no training. Our neuromorphic model outperforms state-of-the-art dynamic visual saliency models in predicting human eye fixations (i.e., ground-truth saliency). Second, we present a hybrid FPGA implementation of the model for real-time applications, capable of processing 112×84 resolution frames at 18.71 Hz at a 100 MHz clock rate, a 23.77× speedup over the software implementation. Additionally, the fixed-point model underlying the FPGA implementation yields results comparable to the software implementation.
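The general idea of a bottom-up, feed-forward dynamic saliency map can be sketched with a center-surround filter plus a frame-difference motion channel. This is a generic simplification under assumed parameters, not the paper's proto-object grouping stages or its FPGA pipeline:

```python
import numpy as np

def box_blur(img, radius):
    """Mean filter over a (2*radius+1)^2 window with zero padding."""
    h, w = img.shape
    padded = np.pad(img.astype(float), radius)
    out = np.zeros((h, w))
    k = 2 * radius + 1
    for dy in range(k):
        for dx in range(k):
            out += padded[dy:dy + h, dx:dx + w]
    return out / (k * k)

def center_surround_saliency(frame, center_radius=1, surround_radius=4):
    """Static bottom-up saliency: |center blur - surround blur|, scaled to [0, 1]."""
    center = box_blur(frame, center_radius)
    surround = box_blur(frame, surround_radius)
    sal = np.abs(center - surround)
    return sal / (sal.max() + 1e-12)

def dynamic_saliency(prev_frame, curr_frame, motion_weight=0.5):
    """Blend static center-surround saliency with a frame-difference
    motion channel to obtain a simple spatio-temporal saliency map."""
    static = center_surround_saliency(curr_frame)
    motion = np.abs(curr_frame.astype(float) - prev_frame.astype(float))
    motion = motion / (motion.max() + 1e-12)
    return (1 - motion_weight) * static + motion_weight * motion
```

The feed-forward, training-free character of this sketch is what makes such models attractive for fixed-point FPGA realization.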
Collapse
|
48
|
Iigaya K, Yi S, Wahle IA, Tanwisuth K, O'Doherty JP. Aesthetic preference for art can be predicted from a mixture of low- and high-level visual features. Nat Hum Behav 2021; 5:743-755. [PMID: 34017097 DOI: 10.1038/s41562-021-01124-6] [Citation(s) in RCA: 26] [Impact Index Per Article: 8.7] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/25/2020] [Accepted: 04/21/2021] [Indexed: 01/02/2023]
Abstract
It is an open question whether preferences for visual art can be lawfully predicted from the basic constituent elements of a visual image. Here, we developed and tested a computational framework to investigate how aesthetic values are formed. We show that it is possible to explain human preferences for a visual art piece based on a mixture of low- and high-level features of the image. Subjective value ratings could be predicted not only within but also across individuals, using a regression model with a common set of interpretable features. We also show that the features predicting aesthetic preference can emerge hierarchically within a deep convolutional neural network trained only for object recognition. Our findings suggest that human preferences for art can be explained at least in part as a systematic integration over the underlying visual features of an image.
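The core claim, that subjective ratings are explained by a regression model over a common set of interpretable low- and high-level features, can be sketched with ordinary least squares. The function names and the synthetic feature columns are assumptions for illustration, not the authors' feature set:

```python
import numpy as np

def fit_preference_model(features, ratings):
    """Fit ratings ~ features @ w + b by ordinary least squares.

    features : (n_images, n_features) matrix, e.g. columns for hue,
               edge density, and deep-network activations
    ratings  : (n_images,) subjective value ratings
    """
    X = np.hstack([features, np.ones((len(features), 1))])  # bias column
    w, *_ = np.linalg.lstsq(X, ratings, rcond=None)
    return w

def predict_ratings(features, w):
    """Apply a fitted weight vector (last entry is the bias term)."""
    X = np.hstack([features, np.ones((len(features), 1))])
    return X @ w
```

Cross-individual prediction then corresponds to fitting `w` on one group's ratings and evaluating `predict_ratings` on another group's feature matrix.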
Collapse
Affiliation(s)
- Kiyohito Iigaya
- Division of Humanities and Social Sciences, California Institute of Technology, Pasadena, CA, USA.
| | - Sanghyun Yi
- Division of Humanities and Social Sciences, California Institute of Technology, Pasadena, CA, USA
| | - Iman A Wahle
- Division of Humanities and Social Sciences, California Institute of Technology, Pasadena, CA, USA
| | - Koranis Tanwisuth
- Division of Humanities and Social Sciences, California Institute of Technology, Pasadena, CA, USA.,Department of Psychology, University of California, Berkeley, Berkeley, CA, USA
| | - John P O'Doherty
- Division of Humanities and Social Sciences, California Institute of Technology, Pasadena, CA, USA
| |
Collapse
|
49
|
Türkan BN, İyilikci O, Amado S. Ways of processing semantic information during different change detection tasks. VISUAL COGNITION 2021. [DOI: 10.1080/13506285.2021.1927276] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 10/21/2022]
Affiliation(s)
| | - Osman İyilikci
- Department of Psychology, Manisa Celal Bayar University, Manisa, Turkey
| | - Sonia Amado
- Department of Psychology, Ege University, Izmir, Turkey
| |
Collapse
|
50
|
Unsupervised foveal vision neural architecture with top-down attention. Neural Netw 2021; 141:145-159. [PMID: 33901879 DOI: 10.1016/j.neunet.2021.03.003] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/08/2020] [Revised: 02/23/2021] [Accepted: 03/02/2021] [Indexed: 11/20/2022]
Abstract
Deep learning architectures are an extremely powerful tool for recognizing and classifying images. However, they require supervised learning, typically operate on input vectors as large as the number of image pixels, and produce the best results only when trained on millions of object images. To help mitigate these issues, we propose an end-to-end architecture that fuses bottom-up saliency and top-down attention with an object recognition module to focus on relevant data and learn important features that can later be fine-tuned for a specific task, using only unsupervised learning. In addition, a virtual fovea that focuses on relevant portions of the data greatly improves training speed. We test the performance of the proposed Gamma saliency technique on the Toronto and CAT 2000 databases, and the foveated vision on the large Street View House Numbers (SVHN) database. The results with foveated vision show that Gamma saliency performs at the level of the best alternative algorithms while being computationally faster. The results on SVHN show that our unsupervised cognitive architecture is comparable to fully supervised methods and that saliency can also improve CNN performance if desired. Finally, we develop and test a top-down attention mechanism based on Gamma saliency, applied to the top layer of CNNs, to facilitate scene understanding in cluttered multi-object images. We show that the extra information from top-down saliency speeds up the extraction of digits in the cluttered multi-digit MNIST dataset, corroborating the important role of top-down attention.
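The virtual-fovea idea, preserving detail only around an attended location so downstream processing concentrates there, can be sketched by blending a sharp and a pre-blurred copy of the image under a Gaussian acuity mask. The Gaussian falloff is an assumed stand-in; the paper's actual Gamma saliency is built on Gamma-kernel filtering, which is not reproduced here:

```python
import numpy as np

def fovea_mask(shape, center, sigma):
    """Gaussian acuity falloff: ~1 at the fovea center (y, x),
    decaying toward 0 in the periphery."""
    yy, xx = np.mgrid[0:shape[0], 0:shape[1]]
    d2 = (yy - center[0]) ** 2 + (xx - center[1]) ** 2
    return np.exp(-d2 / (2.0 * sigma ** 2))

def foveate(image, blurred, center, sigma=4.0):
    """Blend a sharp and a pre-blurred copy of the image by the acuity
    mask, so full detail survives only around the attended location."""
    m = fovea_mask(image.shape, center, sigma)
    return m * image + (1.0 - m) * blurred
```

In a saliency-driven loop, `center` would be the peak of the current saliency map, moving the fovea from one candidate object to the next.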
Collapse
|