1
Almadori E, Mastroberardino S, Botta F, Brunetti R, Lupiáñez J, Spence C, Santangelo V. Crossmodal Semantic Congruence Interacts with Object Contextual Consistency in Complex Visual Scenes to Enhance Short-Term Memory Performance. Brain Sci 2021; 11:1206. PMID: 34573227; PMCID: PMC8467083; DOI: 10.3390/brainsci11091206.
Abstract
Object sounds can enhance the attentional selection and perceptual processing of semantically related visual stimuli. However, it is currently unknown whether crossmodal semantic congruence also affects post-perceptual stages of information processing, such as short-term memory (STM), and whether this effect is modulated by the object's consistency with the background visual scene. In two experiments, participants viewed everyday visual scenes for 500 ms while listening to an object sound, which could either be semantically related to the object that served as the STM target at retrieval or not. This defined crossmodal semantically cued vs. uncued targets. The target was either in- or out-of-context with respect to the background visual scene. After a maintenance period of 2000 ms, the target was presented in isolation against a neutral background, in either the same or a different spatial position than in the original scene. Participants judged whether the object's position was the same or different and then provided a confidence judgment concerning the certainty of their response. The results revealed greater accuracy when judging the spatial position of targets paired with a semantically congruent object sound at encoding. This crossmodal facilitatory effect was modulated by whether the target object was in- or out-of-context with respect to the background scene, with out-of-context targets reducing the facilitatory effect of object sounds. Overall, these findings suggest that the presence of the object sound at encoding facilitated the selection and processing of semantically related visual stimuli, but that this effect depends on the semantic configuration of the visual scene.
Affiliation(s)
- Erika Almadori
- Neuroimaging Laboratory, IRCCS Santa Lucia Foundation, Via Ardeatina 306, 00179 Rome, Italy
- Serena Mastroberardino
- Department of Psychology, School of Medicine & Psychology, Sapienza University of Rome, Via dei Marsi 78, 00185 Rome, Italy
- Fabiano Botta
- Department of Experimental Psychology and Mind, Brain, and Behavior Research Center (CIMCYC), University of Granada, 18071 Granada, Spain
- Riccardo Brunetti
- Cognitive and Clinical Psychology Laboratory, Department of Human Sciences, Università Europea di Roma, 00163 Roma, Italy
- Juan Lupiáñez
- Department of Experimental Psychology and Mind, Brain, and Behavior Research Center (CIMCYC), University of Granada, 18071 Granada, Spain
- Charles Spence
- Department of Experimental Psychology, Oxford University, Oxford OX2 6GG, UK
- Valerio Santangelo
- Neuroimaging Laboratory, IRCCS Santa Lucia Foundation, Via Ardeatina 306, 00179 Rome, Italy
- Department of Philosophy, Social Sciences & Education, University of Perugia, Piazza G. Ermini, 1, 06123 Perugia, Italy
2
The interplay between gaze and consistency in scene viewing: Evidence from visual search by young and older adults. Atten Percept Psychophys 2021; 83:1954-1970. PMID: 33748905; PMCID: PMC8213592; DOI: 10.3758/s13414-021-02242-z.
Abstract
Searching for an object in a complex scene is influenced by high-level factors such as how strongly the item would be expected in that setting (semantic consistency). There is also evidence that a person gazing at an object directs our attention towards it. However, little previous research has examined how we integrate top-down cues such as semantic consistency and gaze to direct attention when searching for an object. In addition, separate lines of evidence suggest that older adults may be more influenced by semantic factors and less by gaze cues than their younger counterparts, but this has not previously been investigated in an integrated task. In the current study, we analysed the eye movements of 34 younger and 30 older adults as they searched for a target object in complex visual scenes. Younger adults' attention to objects was influenced by semantic consistency, but gaze cues exerted the stronger influence. In contrast, older adults were guided more by semantic consistency in directing their attention and showed less influence from gaze cues. These age differences in the use of high-level cues were apparent early in processing (time to first fixation and probability of immediate fixation) but not in later processing (total time looking at objects and time to make a response). Overall, this pattern of findings indicates that people are influenced by both social cues and prior expectations when processing a complex scene, and that the relative importance of these factors depends on age.
3
Behe BK, Huddleston PT, Childs KL, Chen J, Muraro IS. Seeing through the forest: The gaze path to purchase. PLoS One 2020; 15:e0240179. PMID: 33036020; PMCID: PMC7546910; DOI: 10.1371/journal.pone.0240179.
Abstract
Eye-tracking studies have analyzed the relationship between visual attention to point-of-purchase marketing elements (price, signage, etc.) and purchase intention. Our study is the first to investigate the relationship between the gaze sequence in which consumers view a display (including gaze aversion away from products) and the influence of top-down consumer characteristics on product choice. We conducted an in-lab 3 (display size: large, moderate, small) × 2 (price: sale, non-sale) within-subject experiment with 92 participants. After viewing the displays, subjects completed an online survey to provide demographic data, self-reported and actual product knowledge, and past purchase information. We employed a random forest machine learning approach in R to analyze all possible three-unit subsequences of gaze fixations. Models were compared on the multiclass F1-macro and F1-micro scores of their product-choice predictions. Gaze sequence models that included gaze aversion more accurately predicted product choice in a lab setting for more complex displays. Including consumer characteristics generally improved F1-macro and F1-micro scores for less complex displays with fewer plant sizes. The consumer attributes that helped improve model prediction performance were product expertise, ethnicity, and previous plant purchases.
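The core of the modelling approach described in this abstract — turning each gaze path into counts of all contiguous three-fixation subsequences — can be sketched as follows. This is an illustrative Python sketch (the paper used R); the AOI labels, trial data, and choice labels are invented for the example:

```python
from collections import Counter

# Hypothetical gaze data, not the authors' dataset: each trial pairs the
# sequence of areas of interest (AOIs) a shopper fixated with the product
# finally chosen. "S" = small plant, "L" = large plant, "A" = gaze aversion.
trials = [
    (["S", "L", "A", "L", "L"], "large"),
    (["A", "S", "S", "L", "S"], "small"),
    (["L", "L", "A", "S", "L"], "large"),
    (["S", "A", "S", "S", "L"], "small"),
]

def trigram_counts(seq):
    """Count every contiguous three-fixation subsequence in one gaze path."""
    return Counter(tuple(seq[i:i + 3]) for i in range(len(seq) - 2))

# One count vector per trial over a shared trigram vocabulary; these feature
# vectors (optionally augmented with consumer characteristics) would then be
# fed to a random forest classifier to predict product choice.
vocab = sorted({g for seq, _ in trials for g in trigram_counts(seq)})
X = [[trigram_counts(seq)[g] for g in vocab] for seq, _ in trials]
y = [choice for _, choice in trials]
print(len(vocab), len(X), len(y))
```

Including aversion fixations ("A") as ordinary sequence elements is what lets the model exploit gaze aversion as a predictive signal.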
Affiliation(s)
- Bridget K. Behe
- Department of Horticulture, Michigan State University, East Lansing, Michigan, United States of America
- Patricia T. Huddleston
- Department of Advertising & Public Relations, College of Communication Arts & Sciences, Michigan State University, East Lansing, Michigan, United States of America
- Kevin L. Childs
- Department of Plant Biology, Michigan State University, East Lansing, Michigan, United States of America
- Jiaoping Chen
- Eli Broad College of Business, Michigan State University, East Lansing, Michigan, United States of America
- Iago S. Muraro
- Department of Advertising & Public Relations, College of Communication Arts & Sciences, Michigan State University, East Lansing, Michigan, United States of America
4
Cimminella F, Della Sala S, Coco MI. Extra-foveal Processing of Object Semantics Guides Early Overt Attention During Visual Search. Atten Percept Psychophys 2020; 82:655-670. PMID: 31792893; PMCID: PMC7246246; DOI: 10.3758/s13414-019-01906-1.
Abstract
Eye-tracking studies using arrays of objects have demonstrated that some high-level processing of object semantics can occur in extra-foveal vision, but its role in the allocation of early overt attention is still unclear. This eye-tracking visual search study contributes novel findings by examining the role of object-to-object semantic relatedness and visual saliency on search responses and eye-movement behaviour across arrays of increasing size (3, 5, 7). Our data show that a critical object was looked at earlier and for longer when it was semantically unrelated, rather than related, to the other objects in the display, both when it was the search target (target-present trials) and when it was a target's semantically related competitor (target-absent trials). Semantic relatedness effects manifested already during the very first fixation after array onset, were consistently found across set sizes, and were independent of low-level visual saliency, which played no role. We conclude that object semantics can be extracted early in extra-foveal vision and capture overt attention from the very first fixation. These findings pose a challenge to models of visual attention which assume that overt attention is guided by the visual appearance of stimuli rather than by their semantics.
Affiliation(s)
- Francesco Cimminella
- Human Cognitive Neuroscience, Psychology, University of Edinburgh, Edinburgh, UK.
- Laboratory of Experimental Psychology, Suor Orsola Benincasa University, Naples, Italy.
| | - Sergio Della Sala
- Human Cognitive Neuroscience, Psychology, University of Edinburgh, Edinburgh, UK
| | - Moreno I Coco
- Human Cognitive Neuroscience, Psychology, University of Edinburgh, Edinburgh, UK.
- School of Psychology, The University of East London, London, UK.
- Faculdade de Psicologia, Universidade de Lisboa, Lisbon, Portugal.
| |
5
Borges MT, Fernandes EG, Coco MI. Age-related differences during visual search: the role of contextual expectations and cognitive control mechanisms. Aging Neuropsychol Cogn 2019; 27:489-516. DOI: 10.1080/13825585.2019.1632256.
Affiliation(s)
- Miguel T. Borges
- Max Planck Institute for Psycholinguistics, Nijmegen, The Netherlands
- Moreno I. Coco
- School of Psychology, University of East London, London, UK
6
Qiao K, Chen J, Wang L, Zhang C, Zeng L, Tong L, Yan B. Category Decoding of Visual Stimuli From Human Brain Activity Using a Bidirectional Recurrent Neural Network to Simulate Bidirectional Information Flows in Human Visual Cortices. Front Neurosci 2019; 13:692. PMID: 31354409; PMCID: PMC6630063; DOI: 10.3389/fnins.2019.00692.
Abstract
Visual encoding and decoding based on functional magnetic resonance imaging (fMRI) have recently advanced rapidly alongside developments in deep network computation. In the human visual system, visual information flows from primary to high-level visual cortices and back again, reflecting bottom-up and top-down processing, respectively. Inspired by these bidirectional information flows, we proposed a bidirectional recurrent neural network (BRNN)-based method to decode stimulus categories from fMRI data. The forward and backward directions of the BRNN module characterized the bottom-up and top-down flows, respectively. The proposed method treated the selected voxels in each visual area (V1, V2, V3, V4, and LO) as one node of a space sequence fed into the BRNN module, then combined the BRNN outputs in a subsequent fully connected softmax layer to decode the category. This method can exploit the hierarchical information representations and bidirectional information flows in the human visual cortices more efficiently. Experiments demonstrated that our method could improve the accuracy of three-level category decoding. Comparative analysis revealed that, owing to the bidirectional information flows, the visual cortices carry correlative representations of categories, in addition to the hierarchical, distributed, and complementary representations reported in previous studies.
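The bidirectional pass over visual areas described here can be sketched minimally: each area contributes one step of a "space sequence", a forward recurrence models bottom-up flow (V1 → LO) and a backward recurrence models top-down flow (LO → V1). The scalar features and fixed weights below are invented for illustration; the actual model used learned weight matrices over voxel patterns:

```python
import math

AREAS = ["V1", "V2", "V3", "V4", "LO"]

def rnn_pass(inputs, w_in=0.5, w_rec=0.3):
    """Single-unit recurrence: h_t = tanh(w_in * x_t + w_rec * h_(t-1))."""
    h, states = 0.0, []
    for x in inputs:
        h = math.tanh(w_in * x + w_rec * h)
        states.append(h)
    return states

voxel_summary = [0.2, -0.1, 0.4, 0.8, -0.3]     # one toy feature per area
forward = rnn_pass(voxel_summary)               # bottom-up: V1 -> LO
backward = rnn_pass(voxel_summary[::-1])[::-1]  # top-down: LO -> V1

# A BRNN concatenates both directions per area; a fully connected softmax
# layer over these combined features would then decode the stimulus category.
features = dict(zip(AREAS, zip(forward, backward)))
print(features["V1"])
```

The point of the sketch is the data flow: every area's representation depends on both earlier (bottom-up) and later (top-down) areas, which a unidirectional model cannot capture.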
Affiliation(s)
- Kai Qiao
- PLA Strategic Support Force Information Engineering University, Zhengzhou, China
- Jian Chen
- PLA Strategic Support Force Information Engineering University, Zhengzhou, China
- Linyuan Wang
- PLA Strategic Support Force Information Engineering University, Zhengzhou, China
- Chi Zhang
- PLA Strategic Support Force Information Engineering University, Zhengzhou, China
- Lei Zeng
- PLA Strategic Support Force Information Engineering University, Zhengzhou, China
- Li Tong
- PLA Strategic Support Force Information Engineering University, Zhengzhou, China
- Bin Yan
- PLA Strategic Support Force Information Engineering University, Zhengzhou, China
7
AlignTool: The automatic temporal alignment of spoken utterances in German, Dutch, and British English for psycholinguistic purposes. Behav Res Methods 2018; 50:466-489. PMID: 29380301; DOI: 10.3758/s13428-017-1002-7.
Abstract
In language production research, the latency with which speakers produce a spoken response to a stimulus and the onset and offset times of words in longer utterances are key dependent variables. Measuring these variables automatically often yields partially incorrect results, but exact measurement through visual inspection of the recordings is extremely time-consuming. We present AlignTool, an open-source alignment tool that preliminarily establishes the onset and offset times of words and phonemes in spoken utterances using Praat, and subsequently performs a forced alignment of the spoken utterances and their orthographic transcriptions in the automatic speech recognition system MAUS. AlignTool creates a Praat TextGrid file for inspection and manual correction by the user, if necessary. We evaluated AlignTool's performance with recordings of single-word and four-word utterances as well as semi-spontaneous speech. AlignTool performs well with audio signals with an excellent signal-to-noise ratio, requiring virtually no corrections. For audio signals of lesser quality, AlignTool is still highly functional, but its results may require more frequent manual corrections. We also found that audio recordings including long silent intervals tended to pose greater difficulties for AlignTool than recordings filled with speech, which AlignTool analyzed well overall. We expect that, by semi-automating the temporal analysis of complex utterances, AlignTool will open new avenues in language production research.
8
Coco MI, Dale R, Keller F. Performance in a Collaborative Search Task: The Role of Feedback and Alignment. Top Cogn Sci 2017; 10:55-79. DOI: 10.1111/tops.12300.
Affiliation(s)
- Moreno I. Coco
- School of Philosophy, Psychology and Language Sciences, University of Edinburgh
- Rick Dale
- Cognitive and Information Sciences, University of California, Merced
9
Influence of semantic consistency and perceptual features on visual attention during scene viewing in toddlers. Infant Behav Dev 2017; 49:248-266. DOI: 10.1016/j.infbeh.2017.09.008.
10
Coco MI, Araujo S, Petersson KM. Disentangling stimulus plausibility and contextual congruency: Electro-physiological evidence for differential cognitive dynamics. Neuropsychologia 2017; 96:150-163. DOI: 10.1016/j.neuropsychologia.2016.12.008.
11
Making Sense of Real-World Scenes. Trends Cogn Sci 2016; 20:843-856. PMID: 27769727; DOI: 10.1016/j.tics.2016.09.003.
Abstract
To interact with the world, we have to make sense of the continuous sensory input conveying information about our environment. A recent surge of studies has investigated the processes enabling scene understanding, using increasingly complex stimuli and sophisticated analyses to highlight the visual features and brain regions involved. However, there are two major challenges to producing a comprehensive framework for scene understanding. First, scene perception is highly dynamic, subserving multiple behavioral goals. Second, a multitude of different visual properties co-occur across scenes and may be correlated or independent. We synthesize the recent literature and argue that for a complete view of scene understanding, it is necessary to account for both differing observer goals and the contribution of diverse scene properties.
12
Zarcone A, van Schijndel M, Vogels J, Demberg V. Salience and Attention in Surprisal-Based Accounts of Language Processing. Front Psychol 2016; 7:844. PMID: 27375525; PMCID: PMC4894064; DOI: 10.3389/fpsyg.2016.00844.
Abstract
The notion of salience has been singled out as the explanatory factor for a diverse range of linguistic phenomena. In particular, perceptual salience (e.g., visual salience of objects in the world, acoustic prominence of linguistic sounds) and semantic-pragmatic salience (e.g., prominence of recently mentioned or topical referents) have been shown to influence language comprehension and production. A different line of research has sought to account for behavioral correlates of cognitive load during comprehension as well as for certain patterns in language usage using information-theoretic notions, such as surprisal. Surprisal and salience both affect language processing at different levels, but the relationship between the two has not been adequately elucidated, and the question of whether salience can be reduced to surprisal/predictability is still open. Our review identifies two main challenges in addressing this question: terminological inconsistency and lack of integration between high and low levels of representations in salience-based accounts and surprisal-based accounts. We capitalize upon work in visual cognition in order to orient ourselves in surveying the different facets of the notion of salience in linguistics and their relation with models of surprisal. We find that work on salience highlights aspects of linguistic communication that models of surprisal tend to overlook, namely the role of attention and relevance to current goals, and we argue that the Predictive Coding framework provides a unified view which can account for the role played by attention and predictability at different levels of processing and which can clarify the interplay between low and high levels of processes and between predictability-driven expectation and attention-driven focus.
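Surprisal, the information-theoretic notion this abstract contrasts with salience, is standardly defined as the negative log-probability of a word given its context. A minimal sketch, assuming a hypothetical bigram model with invented probabilities:

```python
import math

# Toy bigram model P(word | previous word); the probabilities are invented
# purely for illustration, not drawn from any corpus.
bigram_p = {
    ("the", "dog"): 0.20,       # predictable continuation
    ("the", "ukulele"): 0.001,  # surprising continuation
}

def surprisal_bits(prev, word):
    """Surprisal in bits: -log2 P(word | prev); higher = less predictable."""
    return -math.log2(bigram_p[(prev, word)])

print(surprisal_bits("the", "dog"))      # low surprisal
print(surprisal_bits("the", "ukulele"))  # high surprisal
```

On this definition, the open question the review raises is whether the attention-grabbing quality of a salient referent is fully captured by such a predictability score or requires a separate mechanism.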
Affiliation(s)
- Alessandra Zarcone
- Computational Linguistics and Phonetics, Universität des Saarlandes, Saarbrücken, Germany
- Jorrig Vogels
- Computational Linguistics and Phonetics, Universität des Saarlandes, Saarbrücken, Germany
- Vera Demberg
- Computational Linguistics and Phonetics, Universität des Saarlandes, Saarbrücken, Germany
13
When expectancies collide: Action dynamics reveal the interaction between stimulus plausibility and congruency. Psychon Bull Rev 2016; 23:1920-1931. PMID: 27197650; PMCID: PMC5133277; DOI: 10.3758/s13423-016-1033-6.
Abstract
The cognitive architecture routinely relies on expectancy mechanisms to process the plausibility of stimuli and establish their sequential congruency. In two computer mouse-tracking experiments, we use a cross-modal verification task to uncover the interaction between plausibility and congruency by examining their temporal signatures of activation competition as expressed in computer-mouse movement responses. In this task, participants verified the content congruency of sentence and scene pairs that varied in plausibility. The order of presentation (sentence-scene, scene-sentence) was varied between participants to uncover any differential processing. Our results show that implausible but congruent stimuli triggered less accurate and slower responses than implausible and incongruent stimuli, and were associated with more complex angular mouse trajectories, independent of the order of presentation. This study provides novel evidence of a dissociation between the temporal signatures of plausibility and congruency detection in decision responses.
14
Santangelo V. Forced to remember: when memory is biased by salient information. Behav Brain Res 2015; 283:1-10. PMID: 25595422; DOI: 10.1016/j.bbr.2015.01.013.
Abstract
The last decades have seen rapid growth in attempts to understand the key factors involved in the internal memory representation of the external world. Visual salience has been found to provide a major contribution in predicting the probability that an item/object embedded in a complex setting (i.e., a natural scene) will be encoded and then remembered later on. Here I review the existing literature highlighting the impact of perceptual-related salience (based on low-level sensory features) and semantics-related salience (based on high-level knowledge) on short-term memory representation, along with the neural mechanisms underpinning the interplay between these factors. The available evidence reveals that both perceptual- and semantics-related factors affect attention selection mechanisms during the encoding of natural scenes. By biasing internal memory representation, both perceptual and semantic factors increase the probability of remembering high-saliency items to the detriment of low-saliency ones. The available evidence also highlights an interplay between these factors, with a reduced impact of perceptual-related salience in biasing memory representation as a function of the increasing availability of semantics-related salient information. The neural mechanisms underpinning this interplay involve the activation of different portions of the frontoparietal attention control network: ventral regions support the assignment of selection/encoding priorities based on high-level semantics, while the involvement of dorsal regions reflects priority assignment based on low-level sensory features.
Affiliation(s)
- Valerio Santangelo
- Department of Philosophy, Social, Human & Educational Sciences, University of Perugia, Perugia, Italy
- Cognitive Neuroscience Group, Neuroimaging Laboratory, Santa Lucia Foundation, Rome, Italy
15
Coco MI, Keller F. The interaction of visual and linguistic saliency during syntactic ambiguity resolution. Q J Exp Psychol (Hove) 2014; 68:46-74. PMID: 25176109; DOI: 10.1080/17470218.2014.936475.
Abstract
Psycholinguistic research using the visual world paradigm has shown that the processing of sentences is constrained by the visual context in which they occur. Recently, there has been growing interest in the interactions observed when both language and vision provide relevant information during sentence processing. In three visual world experiments on syntactic ambiguity resolution, we investigate how visual and linguistic information influence the interpretation of ambiguous sentences. We hypothesize that (1) visual and linguistic information both constrain which interpretation is pursued by the sentence processor, and (2) the two types of information act upon the interpretation of the sentence at different points during processing. In Experiment 1, we show that visual saliency is utilized to anticipate the upcoming arguments of a verb. In Experiment 2, we operationalize linguistic saliency using intonational breaks and demonstrate that these give prominence to linguistic referents. These results confirm prediction (1). In Experiment 3, we manipulate visual and linguistic saliency together and find that both types of information are used, but at different points in the sentence, to incrementally update its current interpretation. This finding is consistent with prediction (2). Overall, our results suggest an adaptive processing architecture in which different types of information are used when they become available, optimizing different aspects of situated language processing.
Affiliation(s)
- Moreno I Coco
- Institute for Language, Cognition and Computation, School of Informatics, University of Edinburgh, UK