1. Brich IR, Papenmeier F, Huff M, Merkt M. Construction or updating? Event model processes during visual narrative comprehension. Psychon Bull Rev 2024. [PMID: 38361105] [DOI: 10.3758/s13423-023-02424-w]
Abstract
The plot of a narrative is represented in the form of event models in working memory. Because only parts of the plot are actually presented and information is continually changing, comprehenders have to infer a good portion of a narrative and keep their mental representation updated. Research has identified two related processes (e.g., Gernsbacher, 1997): During model construction (shifting, laying a foundation) at large coherence breaks, an event model is built completely anew. During model updating (mapping) at smaller omissions, however, the current event model is preserved, and only the changed parts are updated through inference processes. Thus far, reliably distinguishing these two processes in visual narratives like comics has been difficult. We report a study (N = 80) that aimed to map the differences between constructing and updating event models in visual narratives by combining measures from narrative comprehension and event cognition research and by manipulating event structure. Participants watched short visual narratives designed to (not) contain event boundaries at larger coherence breaks and to elicit inferences through small omissions, while we collected viewing time measures as well as event segmentation and comprehensibility data. Viewing time, segmentation, and comprehensibility data were in line with the assumption of two distinct comprehension processes. We thus found converging evidence across multiple measures for distinct model construction and updating processes in visual narratives.
Affiliation(s)
- Irina R Brich: Leibniz-Institut für Wissensmedien, Schleichstr. 6, D-72076 Tübingen, Germany
- Frank Papenmeier: Department of Psychology, Eberhard Karls Universität Tübingen, Tübingen, Germany
- Markus Huff: Leibniz-Institut für Wissensmedien, Schleichstr. 6, D-72076 Tübingen, Germany; Department of Psychology, Eberhard Karls Universität Tübingen, Tübingen, Germany
- Martin Merkt: Leibniz-Institut für Wissensmedien, Schleichstr. 6, D-72076 Tübingen, Germany; German Institute for Adult Education - Leibniz Centre for Lifelong Learning, Bonn, Germany
2. Ikuta H, Wöhler L, Aizawa K. Statistical characteristics of comic panel viewing times. Sci Rep 2023; 13:20291. [PMID: 37985682] [PMCID: PMC10661992] [DOI: 10.1038/s41598-023-47120-w]
Abstract
Comics are a bimodal form of art mixing text and images. Because comprehending comics requires a combination of cognitive processes, analyzing human comic reading behavior sheds light on how humans process such bimodal media. In this paper, we focus on the viewing time of each comic panel as a quantitative measure of attention and analyze the statistical characteristics of the distributions of panel viewing times. We created a user interface that presents comics panel by panel and measured the viewing time of each panel in a user study. We collected data from 18 participants reading 7 comic book volumes, yielding over 99,000 viewing time data points, which will be released publicly. The results show that average viewing times are proportional to the text length in a panel's speech bubbles, with a rate of proportion that differs for each reader, despite the bimodal setting. Additionally, we find that viewing times for all users follow a common heavy-tailed distribution.
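The two statistical claims in this abstract (a per-reader rate of proportion between text length and viewing time, and a heavy-tailed viewing-time distribution) can be illustrated with a short analysis sketch. The data below are synthetic and all variable names are hypothetical; this is not the authors' released dataset or analysis code.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

# Synthetic per-panel data (illustrative only): text length in
# characters and viewing time in seconds.
text_len = rng.integers(0, 120, size=500)
viewing_time = 0.05 * text_len + stats.lognorm.rvs(
    s=0.6, scale=1.0, size=500, random_state=rng)

# "Rate of proportion": the reader-specific slope of viewing time
# against text length, estimated by simple linear regression.
fit = stats.linregress(text_len, viewing_time)
print(f"slope: {fit.slope:.3f} s/char (p = {fit.pvalue:.2g})")

# Heavy-tail check: compare log-likelihoods of a lognormal fit
# (heavy-tailed) against a normal fit to the viewing times.
shape, loc, scale = stats.lognorm.fit(viewing_time, floc=0)
ll_lognormal = stats.lognorm.logpdf(viewing_time, shape, loc, scale).sum()
mu, sigma = stats.norm.fit(viewing_time)
ll_normal = stats.norm.logpdf(viewing_time, mu, sigma).sum()
print("lognormal log-likelihood higher:", ll_lognormal > ll_normal)
```

In a per-reader analysis, the regression would be run separately for each participant, giving one slope per reader.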
Affiliation(s)
- Hikaru Ikuta: Department of Information and Communication Engineering, The University of Tokyo, Tokyo, 113-8656, Japan
- Leslie Wöhler: JSPS International Research Fellow, The University of Tokyo, Tokyo, 113-8656, Japan
- Kiyoharu Aizawa: Department of Information and Communication Engineering, The University of Tokyo, Tokyo, 113-8656, Japan
3. Pedziwiatr MA, Heer S, Coutrot A, Bex PJ, Mareschal I. Influence of prior knowledge on eye movements to scenes as revealed by hidden Markov models. J Vis 2023; 23(10):10. [PMID: 37721772] [PMCID: PMC10511023] [DOI: 10.1167/jov.23.10.10]
Abstract
Human visual experience usually provides ample opportunity to accumulate knowledge about events unfolding in the environment. In typical scene perception experiments, however, participants view images that are unrelated to each other and, therefore, they cannot accumulate knowledge relevant to the upcoming visual input. Consequently, the influence of such knowledge on how this input is processed remains underexplored. Here, we investigated this influence in the context of gaze control. We used sequences of static film frames arranged in a way that allowed us to compare eye movements to identical frames between two groups: a group that accumulated prior knowledge relevant to the situations depicted in these frames and a group that did not. We used a machine learning approach based on hidden Markov models fitted to individual scanpaths to demonstrate that the gaze patterns from the two groups differed systematically and, thereby, showed that recently accumulated prior knowledge contributes to gaze control. Next, we leveraged the interpretability of hidden Markov models to characterize these differences. Additionally, we report two unexpected and interesting caveats of our approach. Overall, our results highlight the importance of recently acquired prior knowledge for oculomotor control and the potential of hidden Markov models as a tool for investigating it.
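The modeling idea at the core of this study can be sketched with the forward algorithm for a Gaussian-emission hidden Markov model, where hidden states correspond to regions of interest and emissions are (x, y) fixation coordinates. The parameters below (ROI centres, transition matrix, covariance) are hypothetical illustrations, not the models the authors fitted to individual scanpaths.

```python
import numpy as np
from scipy import stats

# Hypothetical two-state HMM over fixation locations; each state is a
# region of interest (ROI) with a Gaussian emission over (x, y) pixels.
startprob = np.array([0.6, 0.4])
transmat = np.array([[0.8, 0.2],
                     [0.3, 0.7]])
means = np.array([[200.0, 150.0],    # ROI 1 centre (pixels)
                  [600.0, 300.0]])   # ROI 2 centre (pixels)
cov = np.eye(2) * 50.0**2            # shared isotropic covariance

def scanpath_loglik(fixations):
    """Log-likelihood of an (x, y) fixation sequence under the HMM,
    computed with the scaled forward algorithm."""
    emis = np.column_stack([
        stats.multivariate_normal.pdf(fixations, mean=m, cov=cov)
        for m in means])             # (T, n_states) emission densities
    alpha = startprob * emis[0]
    loglik = np.log(alpha.sum())
    alpha /= alpha.sum()             # rescale to avoid underflow
    for t in range(1, len(fixations)):
        alpha = (alpha @ transmat) * emis[t]
        loglik += np.log(alpha.sum())
        alpha /= alpha.sum()
    return loglik

# A scanpath visiting the ROIs scores far higher than one that stays
# away from both; group comparisons build on such likelihoods.
on_rois = np.array([[190., 160.], [210., 140.], [590., 310.], [610., 295.]])
elsewhere = np.array([[1000., 900.], [1050., 950.], [980., 920.], [1020., 940.]])
print(scanpath_loglik(on_rois) > scanpath_loglik(elsewhere))
```

In practice the parameters are learned per participant (e.g., by EM), and group differences are assessed by comparing scanpath likelihoods under each group's fitted models.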
Affiliation(s)
- Marek A Pedziwiatr: School of Biological and Behavioural Sciences, Queen Mary University of London, London, UK
- Sophie Heer: School of Biological and Behavioural Sciences, Queen Mary University of London, London, UK
- Antoine Coutrot: Univ Lyon, CNRS, INSA Lyon, UCBL, LIRIS, UMR5205, F-69621 Lyon, France
- Peter J Bex: Department of Psychology, Northeastern University, Boston, MA, USA
- Isabelle Mareschal: School of Biological and Behavioural Sciences, Queen Mary University of London, London, UK
4. Glaser M, Knoos M, Schwan S. Localizing, describing, interpreting: effects of different audio text structures on attributing meaning to digital pictures. Instr Sci 2022; 50:729-748. [PMID: 35971387] [PMCID: PMC9366788] [DOI: 10.1007/s11251-022-09593-6]
Abstract
Based on previous research on multimedia learning and text comprehension, an eye-tracking study was conducted to examine the influence of audio text coherence on visual attention and memory in a multimedia learning situation, with a focus on picture comprehension. Audio text coherence was manipulated by the type of LDI structure, that is, whether localization, description, and interpretation followed in immediate succession for each pictorial detail or whether localization and description of details were separated from their interpretation. Results show that with an LDI-integrated structure, compared to an LDI-separated structure, the referred-to picture elements were fixated longer during interpretation parts, and linkages between descriptions and interpretations were better recalled and recognized. The effects on recall and recognition of linkages were fully mediated by fixation times. This pattern of results can be explained by an interplay between audio text coherence and dual coding processes. It points to the importance of local coherence, the provision of localization information in audio explanations, and visual attention in enabling dual coding processes that can be used to better attribute meaning to picture details. Practical implications for the design of educational videos, audio texts on websites, and audio guides are discussed.
Affiliation(s)
- Manuela Glaser: Leibniz-Institut für Wissensmedien, Schleichstr. 6, 72076 Tübingen, Germany; Im Bruckenschlegel 7, 70186 Stuttgart, Germany
- Manuel Knoos: Leibniz-Institut für Wissensmedien, Schleichstr. 6, 72076 Tübingen, Germany
- Stephan Schwan: Leibniz-Institut für Wissensmedien, Schleichstr. 6, 72076 Tübingen, Germany
5. Klomberg B, Hacımusaoğlu I, Cohn N. Running through the who, where, and when: a cross-cultural analysis of situational changes in comics. Discourse Process 2022. [DOI: 10.1080/0163853x.2022.2106402]
Affiliation(s)
- Bien Klomberg: Department of Communication and Cognition, Tilburg University, Tilburg School of Humanities and Digital Sciences
- Irmak Hacımusaoğlu: Department of Communication and Cognition, Tilburg University, Tilburg School of Humanities and Digital Sciences
- Neil Cohn: Department of Communication and Cognition, Tilburg University, Tilburg School of Humanities and Digital Sciences
6. Hutson JP, Chandran P, Magliano JP, Smith TJ, Loschky LC. Narrative comprehension guides eye movements in the absence of motion. Cogn Sci 2022; 46:e13131. [PMID: 35579883] [DOI: 10.1111/cogs.13131]
Abstract
Viewers' attentional selection while looking at scenes is affected by both top-down and bottom-up factors. However, when watching film, viewers typically attend to the movie similarly irrespective of top-down factors, a phenomenon we call the tyranny of film. A key difference between still pictures and film is that film contains motion, which is a strong attractor of attention and highly predictive of gaze during film viewing. The goal of the present study was to test whether the tyranny of film is driven by motion. To do this, we created a slideshow presentation of the opening scene of Touch of Evil. Context-condition participants watched the full slideshow. No-context-condition participants did not see the opening portion of the scene, which showed someone placing a time bomb in the trunk of a car. In prior research, we showed that despite producing very different understandings of the clip, this manipulation did not affect viewers' attention (i.e., the tyranny of film), as both context and no-context participants were equally likely to fixate on the car with the bomb when the scene was presented as a film. The current study found that when the scene was shown as a slideshow, the context manipulation produced differences in attentional selection (i.e., it attenuated attentional synchrony). We discuss these results in the context of the Scene Perception and Event Comprehension Theory, which specifies the relationship between event comprehension and attentional selection in the context of visual narratives.
Affiliation(s)
- John P Hutson: Department of Learning Sciences, Georgia State University
- Tim J Smith: Department of Psychological Sciences, Birkbeck, University of London
7. Smith ME, Loschky LC, Bailey HR. Knowledge guides attention to goal-relevant information in older adults. Cogn Res Princ Implic 2021; 6:56. [PMID: 34406505] [PMCID: PMC8374018] [DOI: 10.1186/s41235-021-00321-1]
Abstract
How does viewers' knowledge guide their attention while they watch everyday events, how does it affect their memory, and does it change with age? Older adults have diminished episodic memory for everyday events, but intact semantic knowledge. Indeed, research suggests that older adults may rely on their semantic memory to offset impairments in episodic memory, and when relevant knowledge is lacking, older adults' memory can suffer. Yet, the mechanism by which prior knowledge guides attentional selection when watching dynamic activity is unclear. To address this, we studied the influence of knowledge on attention and memory for everyday events in young and older adults by tracking their eyes while they watched videos. The videos depicted activities that older adults perform more frequently than young adults (balancing a checkbook, planting flowers) or activities that young adults perform more frequently than older adults (installing a printer, setting up a video game). Participants completed free recall, recognition, and order memory tests after each video. We found age-related memory deficits when older adults had little knowledge of the activities, but memory did not differ between age groups when older adults had relevant knowledge and experience with the activities. Critically, results showed that knowledge influenced where viewers fixated when watching the videos. Older adults fixated goal-relevant information less than young adults when watching young adult activities, but they fixated goal-relevant information similarly to young adults when watching older adult activities. Finally, results showed that fixating goal-relevant information predicted free recall of the everyday activities for both age groups. Thus, older adults may use relevant knowledge to more effectively infer the goals of actors, which guides their attention to goal-relevant actions and thus improves their episodic memory for everyday activities.
Affiliation(s)
- Maverick E Smith: Department of Psychological Sciences, Kansas State University, 471 Bluemont Hall, 1100 Mid-campus Dr., Manhattan, KS, 66506, USA
- Lester C Loschky: Department of Psychological Sciences, Kansas State University, 471 Bluemont Hall, 1100 Mid-campus Dr., Manhattan, KS, 66506, USA
- Heather R Bailey: Department of Psychological Sciences, Kansas State University, 471 Bluemont Hall, 1100 Mid-campus Dr., Manhattan, KS, 66506, USA
8. Levin DT, Salas JA, Wright AM, Seiffert AE, Carter KE, Little JW. The incomplete tyranny of dynamic stimuli: gaze similarity predicts response similarity in screen-captured instructional videos. Cogn Sci 2021; 45:e12984. [PMID: 34170026] [DOI: 10.1111/cogs.12984]
Abstract
Although eye tracking has been used extensively to assess cognitions for static stimuli, recent research suggests that the link between gaze and cognition may be more tenuous for dynamic stimuli such as videos. Part of the difficulty in convincingly linking gaze with cognition is that in dynamic stimuli, gaze position is strongly influenced by exogenous cues such as object motion. However, tests of the gaze-cognition link in dynamic stimuli have been done on only a limited range of stimuli, often characterized by highly organized motion. Also, analyses of cognitive contrasts between participants have mostly been limited to categorical contrasts among small numbers of participants, which may have limited the power to observe more subtle influences. We, therefore, tested for cognitive influences on gaze for screen-captured instructional videos, the contents of which participants were tested on. Between-participant scanpath similarity predicted between-participant similarity in responses on test questions, but with imperfect consistency across videos. We also observed that basic gaze parameters and measures of attention to centers of interest only inconsistently predicted learning, and that correlations between gaze and centers of interest defined by other-participant gaze and cursor movement did not predict learning. It, therefore, appears that the search for eye movement indices of cognition during dynamic naturalistic stimuli may be fruitful, but we also agree that the tyranny of dynamic stimuli is real, and that links between eye movements and cognition are highly dependent on task and stimulus properties.
Affiliation(s)
- Daniel T Levin: Department of Psychology and Human Development, Vanderbilt University
- Jorge A Salas: Department of Psychology and Human Development, Vanderbilt University
- Anna M Wright: Department of Psychology and Human Development, Vanderbilt University
- Kelly E Carter: Department of Psychology and Human Development, Vanderbilt University
- Joshua W Little: Department of Psychology and Human Development, Vanderbilt University
9. Cohn N. A starring role for inference in the neurocognition of visual narratives. Cogn Res Princ Implic 2021; 6:8. [PMID: 33587244] [PMCID: PMC7884514] [DOI: 10.1186/s41235-021-00270-9]
Abstract
Research in verbal and visual narratives has often emphasized backward-looking inferences, where absent information is subsequently inferred. However, comics use conventions like star-shaped “action stars” where a reader knows events are undepicted at that moment, rather than omitted entirely. We contrasted the event-related brain potentials (ERPs) to visual narratives depicting an explicit event, an action star, or a “noise” panel of scrambled lines. Both action stars and noise panels evoked large N400s compared to explicit-events (300–500 ms), but action stars and noise panels then differed in their later effects (500–900 ms). Action stars elicited sustained negativities and P600s, which could indicate further interpretive processes and integration of meaning into a mental model, while noise panels evoked late frontal positivities possibly indexing that they were improbable narrative units. Nevertheless, panels following action stars and noise panels both evoked late sustained negativities, implying further inferential processing. Inference in visual narratives thus uses cascading mechanisms resembling those in language processing that differ based on the inferential techniques.
Affiliation(s)
- Neil Cohn: Department of Communication and Cognition, Tilburg School of Humanities and Digital Sciences, Tilburg University, P.O. Box 90153, 5000 LE Tilburg, The Netherlands
10. Cohn N, Foulsham T. Zooming in on the cognitive neuroscience of visual narrative. Brain Cogn 2020; 146:105634. [DOI: 10.1016/j.bandc.2020.105634]
11.

Abstract
The comprehension of visual narratives requires paying attention to certain elements and integrating them across a sequence of images. To study this process, we developed a new approach that modified comic strips according to where observers looked while viewing each sequence. Across three self-paced experiments, we presented sequences of six panels that were sometimes automatically "zoomed-in" or re-framed in order to highlight parts of the image that had been fixated by another group of observers. Fixation zoom panels were rated as easier to understand and produced viewing times more similar to the original comic than panels modified to contain non-fixated or incongruous regions. When a single panel depicting the start of an action was cropped to show only the most fixated region, viewing times were similar to the original narrative despite the reduced information. Modifying such panels also had an impact on the viewing time on subsequent panels, both when zoomed in and when regions were highlighted through an "inset" panel. These findings demonstrate that fixations in a visual narrative are guided to informative elements, and that these elements influence both the current panel and the processing of the sequence.
12. Cohn N, Magliano JP. Editors' introduction and review: visual narrative research: an emerging field in cognitive science. Top Cogn Sci 2019; 12:197-223. [PMID: 31865641] [PMCID: PMC9328199] [DOI: 10.1111/tops.12473]
Abstract
Drawn sequences of images are among our oldest records of human intelligence, appearing on cave paintings, wall carvings, and ancient pottery, and they pervade across cultures from instruction manuals to comics. They also appear prevalently as stimuli across Cognitive Science, in studies of temporal cognition, event structure, social cognition, discourse, and basic intelligence. Yet, despite this fundamental place in human expression and research on cognition, the study of visual narratives themselves has only recently gained traction in Cognitive Science. This work has suggested that visual narrative comprehension requires cultural exposure across a developmental trajectory and engages domain-general processing mechanisms shared by visual perception, attention, event cognition, and language, among others. Here, we review the relevance of such research for the broader Cognitive Science community and make the case for why researchers should join the scholarship of this ubiquitous but understudied aspect of human expression.
Affiliation(s)
- Neil Cohn: Department of Communication and Cognition, Tilburg School of Humanities and Digital Sciences, Tilburg Center for Cognition and Communication, Tilburg University
- Joseph P. Magliano: Department of Learning Sciences, College of Education & Human Development, Georgia State University
13. Laubrock J, Dunst A. Computational approaches to comics analysis. Top Cogn Sci 2019; 12:274-310. [PMID: 31705626] [DOI: 10.1111/tops.12476]
Abstract
Comics are complex documents whose reception engages cognitive processes such as scene perception, language processing, and narrative understanding. Possibly because of their complexity, they have rarely been studied in cognitive science. Modeling the stimulus ideally requires a formal description, which can be provided by feature descriptors from computer vision and computational linguistics. With a focus on document analysis, here we review work on the computational modeling of comics. We argue that the development of modern feature descriptors based on deep learning techniques has made sufficient progress to allow the investigation of complex material such as comics for reception studies, including experimentation and computational modeling of cognitive processes.
Affiliation(s)
- Alexander Dunst: Department of English and American Studies, University of Paderborn
14. Kendeou P, McMaster KL, Butterfuss R, Kim J, Bresina B, Wagner K. The Inferential Language Comprehension (iLC) framework: supporting children's comprehension of visual narratives. Top Cogn Sci 2019; 12:256-273. [PMID: 31549797] [DOI: 10.1111/tops.12457]
Abstract
We present an integrated theoretical framework guiding the use of visual narratives in educational settings. We focus specifically on the use of static and dynamic visual narratives to teach and assess inference skills in young children and discuss evidence to support the efficacy of this approach. In doing so, first we review the basis of the integrated framework, which builds on major findings of cognitive, developmental, and language research highlighting that (a) inference skills can be developed in non-reading contexts using different media, (b) inference skills can transfer across different media, and (c) inference skills can be improved using questioning that includes scaffolding and specific feedback. Second, we review instructional and assessment approaches that align with the proposed framework; these approaches are designed to teach or assess inference making skills using visual narratives and interactive questioning. In this context, we discuss how these approaches leverage the unique affordances of static and dynamic visual narratives with respect to unit of meaning (by increasing opportunities to generate inferences), multimodality (by providing opportunities to generate inferences of higher complexity than text), and vocabulary/knowledge demands (by providing vocabulary/knowledge support), while also reviewing evidence for their usability, feasibility, and efficacy to improve educational outcomes. We conclude with important theoretical and practical questions about future work in this area.
Affiliation(s)
- Jasmine Kim: Department of Educational Psychology, University of Minnesota
- Britta Bresina: Department of Educational Psychology, University of Minnesota
- Kyle Wagner: Department of Educational Psychology, University of Minnesota
15. Loschky LC, Larson AM, Smith TJ, Magliano JP. The Scene Perception & Event Comprehension Theory (SPECT) applied to visual narratives. Top Cogn Sci 2019; 12:311-351. [PMID: 31486277] [PMCID: PMC9328418] [DOI: 10.1111/tops.12455]
Abstract
Understanding how people comprehend visual narratives (including picture stories, comics, and film) requires combining traditionally separate theories that span the initial sensory and perceptual processing of complex visual scenes, the perception of events over time, and the comprehension of narratives. Existing piecemeal approaches fail to capture the interplay between these levels of processing. Here, we propose the Scene Perception & Event Comprehension Theory (SPECT), as applied to visual narratives, which distinguishes between front-end and back-end cognitive processes. Front-end processes occur during single eye fixations and comprise attentional selection and information extraction. Back-end processes occur across multiple fixations and support the construction of event models, which reflect understanding of what is happening now in a narrative (stored in working memory) and over the course of the entire narrative (stored in long-term episodic memory). We describe relationships between front-end and back-end processes, as well as medium-specific differences that likely produce variation in these processes across media (e.g., picture stories vs. film). We describe several novel research questions derived from SPECT that we have explored. By addressing these questions, we provide greater insight into how attention, information extraction, and event model processes are dynamically coordinated to perceive and understand complex naturalistic visual events in narratives and the real world.
Affiliation(s)
- Tim J Smith: Department of Psychological Sciences, Birkbeck, University of London
16. Kopatich RD, Feller DP, Kurby CA, Magliano JP. The role of character goals and changes in body position in the processing of events in visual narratives. Cogn Res Princ Implic 2019; 4:22. [PMID: 31286278] [PMCID: PMC6614232] [DOI: 10.1186/s41235-019-0176-1]
Abstract
Background: A growing body of research is beginning to understand how people comprehend sequential visual narratives. However, previous work has used materials that rely primarily on visual information (i.e., they contain minimal language information). The current work addresses how visual and linguistic information streams are coordinated in sequential image comprehension. In Experiment 1, participants viewed picture stories and engaged in an event segmentation task. The extent to which critical points in the narrative depicted situational continuity in character goals and in bodily position was manipulated. The likelihood of perceiving an event boundary and viewing latencies at critical locations were measured. Experiment 1 was replicated in the second experiment without the segmentation task; that is, participants read the picture stories without deciding where the event boundaries occurred.

Results: Experiment 1 indicated that changes in character goals were associated with an increased likelihood of segmenting at the critical point, but changes in bodily position were not. A follow-up analysis, however, revealed that over the course of the entire story, changes in body position were a significant predictor of event segmentation. Viewing time was affected by both goal and body position shifts. Experiment 2 corroborated the finding that viewing time was affected by changes in goals and body positions.

Conclusion: The current study shows that changes in body position influence a viewer's perception of event structure and event processing. This fits into a growing body of research on how consumers of multimodal media coordinate multiple information streams, and it underscores the need for systematic study of the visual, perceptual, and comprehension processes that occur during visual narrative understanding.
17. Cohn N. Your brain on comics: a cognitive model of visual narrative comprehension. Top Cogn Sci 2019; 12:352-386. [PMID: 30963724] [PMCID: PMC9328425] [DOI: 10.1111/tops.12421]
Abstract
The past decade has seen a rapid growth of cognitive and brain research focused on visual narratives like comics and picture stories. This paper summarizes and integrates this emerging literature into the Parallel Interfacing Narrative-Semantics Model (PINS Model), a theory of sequential image processing characterized by an interaction between two representational levels: semantics and narrative structure. Ongoing semantic processes build meaning into an evolving mental model of a visual discourse. Updating of spatial, referential, and event information then incurs costs when that information is discontinuous with the growing context. In parallel, a narrative structure organizes semantic information into coherent sequences by assigning images to categorical roles, which are then embedded within a hierarchic constituent structure. Narrative constructional schemas allow for specific predictions of structural sequencing, independent of semantics. Together, these interacting levels of representation engage in an iterative process of retrieval of semantic and narrative information, prediction of upcoming information based on those assessments, and subsequent updating based on discontinuity. These core mechanisms are argued to be domain-general, spanning across expressive systems, as suggested by similar electrophysiological brain responses (N400, P600, anterior negativities) generated in response to manipulation of sequential images, music, and language. Such similarities between visual narratives and other domains thus pose fundamental questions for the linguistic and cognitive sciences.
Affiliation(s)
- Neil Cohn: Department of Communication and Cognition, Tilburg University
18. Cohn N. Visual narratives and the mind: comprehension, cognition, and learning. Psychol Learn Motiv 2019. [DOI: 10.1016/bs.plm.2019.02.002]