1. Yang W, Li S, Guo A, Li Z, Yang X, Ren Y, Yang J, Wu J, Zhang Z. Auditory attentional load modulates the temporal dynamics of audiovisual integration in older adults: An ERPs study. Front Aging Neurosci 2022; 14:1007954. PMID: 36325188; PMCID: PMC9618958; DOI: 10.3389/fnagi.2022.1007954.
Abstract
As older adults experience declines in perceptual ability, audiovisual integration becomes an important means of supporting perception. Attending to one or more auditory stimuli while performing other tasks is a common challenge for older adults in everyday life, so it is necessary to probe the effects of auditory attentional load on audiovisual integration in older adults. The present study used event-related potentials (ERPs) and a dual-task paradigm [Go/No-go task + rapid serial auditory presentation (RSAP) task] to investigate the temporal dynamics of audiovisual integration. Behavioral results showed that both older and younger adults responded faster and with higher accuracy to audiovisual stimuli than to either visual or auditory stimuli alone. ERPs revealed weaker audiovisual integration under the no-auditory-attentional-load condition at the earlier processing stages and, conversely, stronger integration in the late stages. Moreover, audiovisual integration was greater in older adults than in younger adults in the following time intervals: 60–90, 140–210, and 430–530 ms. Notably, only under the low-load condition, in the 140–210 ms interval, was the audiovisual integration of older adults significantly greater than that of younger adults. These results delineate the temporal dynamics of the interaction between auditory attentional load and audiovisual integration in aging, suggesting that modulating auditory attentional load affects audiovisual integration and can enhance it in older adults.
Affiliation(s)
- Weiping Yang
- Department of Psychology, Faculty of Education, Hubei University, Wuhan, China
- Brain and Cognition Research Center (BCRC), Faculty of Education, Hubei University, Wuhan, China
- Shengnan Li
- Graduate School of Interdisciplinary Science and Engineering in Health Systems, Okayama University, Okayama, Japan
- Ao Guo
- Graduate School of Interdisciplinary Science and Engineering in Health Systems, Okayama University, Okayama, Japan
- Zimo Li
- Department of Psychology, Faculty of Education, Hubei University, Wuhan, China
- Xiangfu Yang
- Department of Psychology, Faculty of Education, Hubei University, Wuhan, China
- Yanna Ren
- Department of Psychology, College of Humanities and Management, Guizhou University of Traditional Chinese Medicine, Guiyang, China
- *Correspondence: Yanna Ren
- Jiajia Yang
- Applied Brain Science Lab, Faculty of Interdisciplinary Science and Engineering in Health Systems, Okayama University, Okayama, Japan
- Jinglong Wu
- Graduate School of Interdisciplinary Science and Engineering in Health Systems, Okayama University, Okayama, Japan
- Research Center for Medical Artificial Intelligence, Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Shenzhen, China
- Zhilin Zhang
- Research Center for Medical Artificial Intelligence, Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Shenzhen, China
2. Sánchez-García C, Kandel S, Savariaux C, Soto-Faraco S. The Time Course of Audio-Visual Phoneme Identification: a High Temporal Resolution Study. Multisens Res 2018; 31:57-78. DOI: 10.1163/22134808-00002560.
Abstract
Speech unfolds in time and, as a consequence, its perception requires temporal integration. Yet, studies addressing audio-visual speech processing have often overlooked this temporal aspect. Here, we address the temporal course of audio-visual speech processing in a phoneme identification task using a Gating paradigm. We created disyllabic Spanish word-like utterances (e.g., /pafa/, /paθa/, …) from high-speed camera recordings. The stimuli differed only in the middle consonant (/f/, /θ/, /s/, /r/, /g/), which varied in visual and auditory saliency. As in classical Gating tasks, the utterances were presented in fragments of increasing length (gates), here in 10 ms steps, for identification and confidence ratings. We measured correct identification as a function of time (at each gate) for each critical consonant in audio, visual and audio-visual conditions, and computed the Identification Point and Recognition Point scores. The results revealed that audio-visual identification is a time-varying process that depends on the relative strength of each modality (i.e., saliency). In some cases, audio-visual identification followed the pattern of one dominant modality (either A or V), when that modality was very salient. In other cases, both modalities contributed to identification, hence resulting in audio-visual advantage or interference with respect to unimodal conditions. Both unimodal dominance and audio-visual interaction patterns may arise within the course of identification of the same utterance, at different times. The outcome of this study suggests that audio-visual speech integration models should take into account the time-varying nature of visual and auditory saliency.
Affiliation(s)
- Carolina Sánchez-García
- Departament de Tecnologies de la Informació i les Comunicacions, Universitat Pompeu Fabra, Barcelona, Spain
- Sonia Kandel
- Université Grenoble Alpes, GIPSA-lab (CNRS UMR 5216), Grenoble, France
- Salvador Soto-Faraco
- Departament de Tecnologies de la Informació i les Comunicacions, Universitat Pompeu Fabra, Barcelona, Spain
- Institució Catalana de Recerca i Estudis Avançats (ICREA), Barcelona, Spain
3. Yankouskaya A, Stolte M, Moradi Z, Rotshtein P, Humphreys G. Integration of identity and emotion information in faces: fMRI evidence. Brain Cogn 2017; 116:29-39. DOI: 10.1016/j.bandc.2017.05.004.
4. Hawkins RX, Houpt JW, Eidels A, Townsend JT. Can two dots form a Gestalt? Measuring emergent features with the capacity coefficient. Vision Res 2016; 126:19-33. DOI: 10.1016/j.visres.2015.04.019.
5. Altieri N, Hudock D. Normative data on audiovisual speech integration using sentence recognition and capacity measures. Int J Audiol 2016; 55:206-14. PMID: 26853446; DOI: 10.3109/14992027.2015.1120895.
Abstract
OBJECTIVE: The ability to use visual speech cues and integrate them with auditory information is important, especially in noisy environments and for hearing-impaired (HI) listeners. Providing data on measures of integration skills that encompass both accuracy and processing speed will benefit researchers and clinicians. DESIGN: The study consisted of two experiments: accuracy scores were obtained using City University of New York (CUNY) sentences, and capacity measures assessing reaction-time distributions were obtained from a monosyllabic word recognition task. STUDY SAMPLE: We report data on two measures of integration obtained from a sample of 86 young and middle-aged adult listeners. RESULTS: Capacity showed a positive correlation with accuracy measures of audiovisual benefit obtained from sentence recognition. More relevantly, factor analysis indicated that a single-factor model captured audiovisual speech integration better than models containing more factors. Capacity exhibited strong loadings on this factor, while the accuracy-based measures from sentence recognition exhibited weaker loadings. CONCLUSIONS: Results suggest that a listener's integration skills may be assessed optimally using a measure that incorporates both processing speed and accuracy.
Affiliation(s)
- Nicholas Altieri
- Department of Communication Sciences and Disorders, Idaho State University, Pocatello, USA
- Daniel Hudock
- Department of Communication Sciences and Disorders, Idaho State University, Pocatello, USA
6. Altieri N. Multimodal theories of recognition and their relation to Molyneux's question. Front Psychol 2015; 5:1547. PMID: 25620944; PMCID: PMC4288054; DOI: 10.3389/fpsyg.2014.01547.
Affiliation(s)
- Nicholas Altieri
- ISU Multimodal Language Processing Lab, Department of Communication Sciences and Disorders, Idaho State University, Pocatello, ID, USA
7. Yu JC, Chang TY, Yang CT. Individual differences in working memory capacity and workload capacity. Front Psychol 2014; 5:1465. PMID: 25566143; PMCID: PMC4270186; DOI: 10.3389/fpsyg.2014.01465.
Abstract
We investigated the relationship between working memory capacity (WMC) and workload capacity (WLC). Each participant performed an operation span (OSPAN) task to measure WMC and three redundant-target detection tasks to measure WLC. WLC was computed non-parametrically (Experiments 1 and 2) and parametrically (Experiment 2). Both levels of analysis showed that participants high in WMC had larger WLC than those low in WMC only when redundant information came from the visual and auditory modalities, suggesting that high-WMC participants had superior processing capacity for redundant visual and auditory information. This difference was eliminated when the redundant targets engaged only a single working memory subsystem, as in a color-shape detection task and a double-dot detection task. These results highlight the role of executive control in integrating and binding information from the two working memory subsystems for perceptual decision making.
Affiliation(s)
- Ju-Chi Yu
- Department of Psychology, National Cheng Kung University, Tainan, Taiwan
- Ting-Yun Chang
- Department of Psychology, National Cheng Kung University, Tainan, Taiwan
- Cheng-Ta Yang
- Department of Psychology, National Cheng Kung University, Tainan, Taiwan
8. Connolly K. Multisensory perception as an associative learning process. Front Psychol 2014; 5:1095. PMID: 25309498; PMCID: PMC4176039; DOI: 10.3389/fpsyg.2014.01095.
Abstract
Suppose that you are at a live jazz show. The drummer begins a solo. You see the cymbal jolt and you hear the clang. But in addition to seeing the cymbal jolt and hearing the clang, you are also aware that the jolt and the clang are part of the same event. Casey O’Callaghan (forthcoming) calls this awareness “intermodal feature binding awareness.” Psychologists have long assumed that multimodal perceptions such as this one are the result of an automatic feature binding mechanism (see Pourtois et al., 2000; Vatakis and Spence, 2007; Navarra et al., 2012). I present new evidence against this. I argue that there is no automatic feature binding mechanism that couples features like the jolt and the clang together. Instead, when you experience the jolt and the clang as part of the same event, this is the result of an associative learning process. The cymbal’s jolt and the clang are best understood as a single learned perceptual unit, rather than as automatically bound. I outline the specific learning process in perception called “unitization,” whereby we come to “chunk” the world into multimodal units. Unitization has never before been applied to multimodal cases. Yet I argue that this learning process can do the same work that intermodal binding would do, and that this issue has important philosophical implications. Specifically, whether we take multimodal cases to involve a binding mechanism or an associative process bears on philosophical issues ranging from Molyneux’s question to the question of how active or passive we consider perception to be.
Affiliation(s)
- Kevin Connolly
- Philosophy and Institute for Research in Cognitive Science, University of Pennsylvania, Philadelphia, PA, USA
9. Altieri N, Hudock D. Hearing impairment and audiovisual speech integration ability: a case study report. Front Psychol 2014; 5:678. PMID: 25071649; PMCID: PMC4076931; DOI: 10.3389/fpsyg.2014.00678.
Abstract
Research in audiovisual speech perception has demonstrated that sensory factors such as auditory and visual acuity are associated with a listener's ability to extract and combine auditory and visual speech cues. This case study report examined audiovisual integration using a newly developed measure of capacity in a sample of hearing-impaired listeners. Capacity assessments are unique because they examine the contribution of reaction time (RT) as well as accuracy to determine the extent to which a listener efficiently combines auditory and visual speech cues relative to independent race model predictions. Multisensory speech integration ability was examined in two experiments: an open-set sentence recognition study and a closed-set speeded-word recognition study that measured capacity. Most germane to our approach, capacity revealed speed-accuracy tradeoffs that may be predicted by audiometric configuration. Results showed that some listeners benefit from increased accuracy but fail to benefit in terms of speed on audiovisual relative to unisensory trials. Conversely, other listeners may not benefit in the accuracy domain but instead show an audiovisual processing-time benefit.
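The "independent race model predictions" this entry refers to are conventionally tested with Miller's race-model inequality, which bounds the audiovisual RT distribution by the sum of the unisensory distributions. The sketch below is an illustration of that test, not code from the study; the RT samples (`rt_a`, `rt_v`, `rt_av`), their distributions, and the time grid are invented for demonstration.

```python
import numpy as np

def ecdf(rts, t):
    """Empirical cumulative RT distribution F(t) = P(RT <= t)."""
    rts = np.asarray(rts)
    return np.mean(rts[:, None] <= t, axis=0)

def race_model_violation(rt_av, rt_a, rt_v, t_grid):
    """Miller's race-model inequality: F_AV(t) <= F_A(t) + F_V(t).
    Positive return values mark violations, i.e., audiovisual RT
    benefits larger than statistical facilitation from two
    independent unisensory races can produce."""
    f_av = ecdf(rt_av, t_grid)
    bound = np.clip(ecdf(rt_a, t_grid) + ecdf(rt_v, t_grid), 0.0, 1.0)
    return f_av - bound  # > 0 where the inequality is violated

# Hypothetical RTs in ms; real analyses use per-participant data.
rng = np.random.default_rng(0)
rt_a = rng.normal(520, 60, 200)    # auditory-only trials
rt_v = rng.normal(540, 60, 200)    # visual-only trials
rt_av = rng.normal(450, 50, 200)   # redundant audiovisual trials
t = np.linspace(300, 700, 81)
viol = race_model_violation(rt_av, rt_a, rt_v, t)
print("max violation:", viol.max())
```

With the fast audiovisual RTs simulated here, the early portion of the distribution violates the bound, which is the signature pattern that capacity-style analyses then quantify over time.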
Affiliation(s)
- Nicholas Altieri
- Department of Communication Sciences and Disorders, Idaho State University, Pocatello, ID, USA
- Daniel Hudock
- Department of Communication Sciences and Disorders, Idaho State University, Pocatello, ID, USA
10. Altieri N, Hudock D. Assessing variability in audiovisual speech integration skills using capacity and accuracy measures. Int J Audiol 2014; 53:710-8. DOI: 10.3109/14992027.2014.909053.
11. Identifying and quantifying multisensory integration: a tutorial review. Brain Topogr 2014; 27:707-30. PMID: 24722880; DOI: 10.1007/s10548-014-0365-7.
Abstract
We process information from the world through multiple senses, and the brain must decide what information belongs together and what information should be segregated. One challenge in studying such multisensory integration is how to quantify the multisensory interactions, a challenge that is amplified by the host of methods now used to measure neural, behavioral, and perceptual responses. Many of the measures developed to quantify multisensory integration (often derived from single-unit analyses) have been applied across these different methods without much consideration for the nature of the process being studied. Here, we provide a review focused on the means by which experimenters quantify multisensory processes and integration across a range of commonly used experimental methodologies. We emphasize the most commonly employed measures, including single- and multiunit responses, local field potentials, functional magnetic resonance imaging, and electroencephalography, along with behavioral measures of detection, accuracy, and response times. In each section, we discuss the different metrics commonly used to quantify multisensory interactions, including the rationale for their use, their advantages, and the drawbacks and caveats associated with them. Also discussed are possible alternatives to the most commonly used metrics.
12. Altieri N. Multisensory integration, learning, and the predictive coding hypothesis. Front Psychol 2014; 5:257. PMID: 24715884; PMCID: PMC3970030; DOI: 10.3389/fpsyg.2014.00257.
Affiliation(s)
- Nicholas Altieri
- ISU Multimodal Language Processing Lab, Department of Communication Sciences and Disorders, Idaho State University, Pocatello, Idaho, USA
13. Learning to associate auditory and visual stimuli: behavioral and neural mechanisms. Brain Topogr 2013; 28:479-93. PMID: 24276220; DOI: 10.1007/s10548-013-0333-7.
Abstract
The ability to effectively combine sensory inputs across modalities is vital for acquiring a unified percept of events. For example, watching a hammer hit a nail while simultaneously identifying the sound as originating from the event requires the ability to identify spatio-temporal congruencies and statistical regularities. In this study, we applied a reaction-time and hazard-function measure known as capacity (e.g., Townsend and Ashby, Cognitive Theory, pp. 200-239, 1978) to quantify the extent to which observers learn paired associations between simple auditory and visual patterns in a model-theoretic manner. As expected, results showed that learning was associated with an increase in accuracy and, more significantly, an increase in capacity. The aim of this study was to associate capacity measures of multisensory learning with a neural measure, namely mean global field power (GFP). We observed a co-variation between an increase in capacity and a decrease in GFP amplitude as learning occurred. This suggests that capacity constitutes a reliable behavioral index of efficient energy expenditure in the neural domain.
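The capacity measure referenced in this entry (Townsend and Ashby, 1978; operationalized for redundant targets by Townsend and Nozawa, 1995) compares the integrated hazard of redundant-target RTs against the summed integrated hazards of the single-target conditions, C(t) = H_AV(t) / (H_A(t) + H_V(t)). A minimal sketch of that computation follows; it is illustrative only, not the study's code, and the RT samples and time grid are invented.

```python
import numpy as np

def integrated_hazard(rts, t):
    """H(t) = -log S(t), from the empirical survivor function S(t)."""
    rts = np.asarray(rts)
    surv = np.mean(rts[:, None] > t, axis=0)
    surv = np.clip(surv, 1e-9, 1.0)  # avoid log(0) in the right tail
    return -np.log(surv)

def capacity_coefficient(rt_av, rt_a, rt_v, t_grid):
    """Townsend & Nozawa's C(t) = H_AV(t) / (H_A(t) + H_V(t)).
    C(t) > 1 indicates supercapacity (more efficient than an
    unlimited-capacity independent parallel race), C(t) = 1 the
    race baseline, and C(t) < 1 limited capacity."""
    denom = integrated_hazard(rt_a, t_grid) + integrated_hazard(rt_v, t_grid)
    return integrated_hazard(rt_av, t_grid) / np.where(denom > 0, denom, np.nan)

# Hypothetical RTs in ms, for illustration only.
rng = np.random.default_rng(1)
rt_a = rng.normal(520, 60, 300)
rt_v = rng.normal(540, 60, 300)
rt_av = rng.normal(440, 50, 300)
t = np.linspace(400, 600, 41)
c = capacity_coefficient(rt_av, rt_a, rt_v, t)
print("C(t) range:", np.nanmin(c), np.nanmax(c))
```

Because the simulated audiovisual RTs are much faster than the unisensory ones, C(t) exceeds 1 over most of the grid, the pattern this entry associates with learned audiovisual pairings.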
14. Altieri N, Wenger MJ. Neural dynamics of audiovisual speech integration under variable listening conditions: an individual participant analysis. Front Psychol 2013; 4:615. PMID: 24058358; PMCID: PMC3767908; DOI: 10.3389/fpsyg.2013.00615.
Abstract
Speech perception engages both auditory and visual modalities. Limitations of traditional accuracy-only approaches to investigating audiovisual speech perception have motivated the use of new methodologies. In an audiovisual speech identification task, we utilized capacity (Townsend and Nozawa, 1995), a dynamic measure of efficiency, to quantify audiovisual integration. Capacity was used to compare RT distributions from audiovisual trials to RT distributions from auditory-only and visual-only trials across three listening conditions: clear auditory signal, S/N ratio of −12 dB, and S/N ratio of −18 dB. The purpose was to obtain EEG recordings in conjunction with capacity to investigate how a late ERP co-varies with integration efficiency. Results showed efficient audiovisual integration at the low auditory S/N ratios, but inefficient audiovisual integration when the auditory signal was clear. The ERP analyses showed greater audiovisual amplitude relative to the unisensory signals at the lower auditory S/N ratios (higher capacity, efficient integration) than at the high S/N ratio (low capacity, inefficient integration). The data are consistent with an interactive framework of integration, in which auditory recognition is influenced by speech-reading as a function of signal clarity.
Affiliation(s)
- Nicholas Altieri
- Department of Communication Sciences and Disorders, Idaho State University, Pocatello, ID, USA
16. van Wassenhove V. Speech through ears and eyes: interfacing the senses with the supramodal brain. Front Psychol 2013; 4:388. PMID: 23874309; PMCID: PMC3709159; DOI: 10.3389/fpsyg.2013.00388.
Abstract
The comprehension of auditory-visual (AV) speech integration has greatly benefited from recent advances in neurosciences and multisensory research. AV speech integration raises numerous questions relevant to the computational rules needed for binding information (within and across sensory modalities), the representational format in which speech information is encoded in the brain (e.g., auditory vs. articulatory), and how AV speech ultimately interfaces with the linguistic system. The following non-exhaustive review provides a set of empirical findings and theoretical questions that have fed the original proposal for predictive coding in AV speech processing. More recently, predictive coding has pervaded many fields of inquiry and positively reinforced the need to refine the notion of internal models in the brain, together with their implications for the interpretation of neural activity recorded with various neuroimaging techniques. However, it is argued here that the strength of predictive coding frameworks resides in the specificity of the generative internal models, not in their generality; specifically, internal models come with a set of rules applied to particular representational formats, which themselves depend on the levels and the network structure at which predictive operations occur. As such, predictive coding accounts of AV speech need to specify the level(s) and the kinds of internal predictions that are necessary to account for the perceptual benefits or illusions observed in the field. Among those specifications, the actual content of a prediction comes first and foremost, followed by the representational granularity of that prediction in time. This review specifically presents a focused discussion of these issues.
Affiliation(s)
- Virginie van Wassenhove
- Cognitive Neuroimaging Unit, Brain Dynamics, INSERM U992, Gif/Yvette, France; NeuroSpin Center, CEA, DSV/I2BM, Gif/Yvette, France; Cognitive Neuroimaging Unit, University Paris-Sud, Gif/Yvette, France