1. Lee J, Park S. Multi-modal Representation of the Size of Space in the Human Brain. J Cogn Neurosci 2024; 36:340-361. [PMID: 38010320] [DOI: 10.1162/jocn_a_02092]
Abstract
To estimate the size of an indoor space, we must analyze the visual boundaries that limit the spatial extent and acoustic cues from reflected interior surfaces. We used fMRI to examine how the brain processes the geometric size of indoor scenes when various types of sensory cues are presented individually or together. Specifically, we asked whether the size of space is represented in a modality-specific way or in an integrative way that combines multimodal cues. In a block-design study, images or sounds that depict small- and large-sized indoor spaces were presented. Visual stimuli were real-world pictures of empty spaces that were small or large. Auditory stimuli were sounds convolved with different reverberations. By using a multivoxel pattern classifier, we asked whether the two sizes of space can be classified in visual, auditory, and visual-auditory combined conditions. We identified both sensory-specific and multimodal representations of the size of space. To further investigate the nature of the multimodal region, we specifically examined whether it contained multimodal information in a coexistent or integrated form. We found that angular gyrus and the right medial frontal gyrus had modality-integrated representation, displaying sensitivity to the match in the spatial size information conveyed through image and sound. Background functional connectivity analysis further demonstrated that the connection between sensory-specific regions and modality-integrated regions increases in the multimodal condition compared with single modality conditions. Our results suggest that spatial size perception relies on both sensory-specific and multimodal representations, as well as their interplay during multimodal perception.
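The two-way multivoxel pattern classification described in this abstract can be illustrated with a minimal decoding sketch. Everything below is synthetic: the voxel patterns, the 0.3 mean shift standing in for a size-related signal, and the leave-one-trial-out nearest-centroid decoder, which is a simple stand-in for the authors' classifier rather than their pipeline.

```python
import numpy as np

rng = np.random.default_rng(0)
n_per_class, n_voxels = 40, 200

# Hypothetical voxel patterns: "large space" trials get a small mean shift.
small = rng.normal(size=(n_per_class, n_voxels))
large = rng.normal(size=(n_per_class, n_voxels)) + 0.3
X = np.vstack([small, large])
y = np.repeat([0, 1], n_per_class)              # 0 = small, 1 = large

# Leave-one-trial-out cross-validated nearest-centroid decoding.
correct = 0
for i in range(len(y)):
    train = np.delete(np.arange(len(y)), i)
    c0 = X[train][y[train] == 0].mean(axis=0)   # centroid of "small" trials
    c1 = X[train][y[train] == 1].mean(axis=0)   # centroid of "large" trials
    pred = 0 if np.linalg.norm(X[i] - c0) < np.linalg.norm(X[i] - c1) else 1
    correct += pred == y[i]

accuracy = correct / len(y)
print(f"decoding accuracy: {accuracy:.2f}")     # chance = 0.50
```

Because the simulated size signal is distributed weakly across many voxels, single voxels are uninformative while the pattern as a whole decodes well above chance, which is the core logic of MVPA.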
2. Lowe MX, Mohsenzadeh Y, Lahner B, Charest I, Oliva A, Teng S. Cochlea to categories: The spatiotemporal dynamics of semantic auditory representations. Cogn Neuropsychol 2021; 38:468-489. [PMID: 35729704] [PMCID: PMC10589059] [DOI: 10.1080/02643294.2022.2085085]
Abstract
How does the auditory system categorize natural sounds? Here we apply multimodal neuroimaging to illustrate the progression from acoustic to semantically dominated representations. Combining magnetoencephalographic (MEG) and functional magnetic resonance imaging (fMRI) scans of observers listening to naturalistic sounds, we found superior temporal responses beginning ∼55 ms post-stimulus onset, spreading to extratemporal cortices by ∼100 ms. Early regions were distinguished less by onset/peak latency than by functional properties and overall temporal response profiles. Early acoustically-dominated representations trended systematically toward category dominance over time (after ∼200 ms) and space (beyond primary cortex). Semantic category representation was spatially specific: Vocalizations were preferentially distinguished in frontotemporal voice-selective regions and the fusiform; scenes and objects were distinguished in parahippocampal and medial place areas. Our results are consistent with real-world events coded via an extended auditory processing hierarchy, in which acoustic representations rapidly enter multiple streams specialized by category, including areas typically considered visual cortex.
Affiliation(s)
- Matthew X. Lowe: Computer Science and Artificial Intelligence Lab (CSAIL), MIT, Cambridge, MA; Unlimited Sciences, Colorado Springs, CO
- Yalda Mohsenzadeh: Computer Science and Artificial Intelligence Lab (CSAIL), MIT, Cambridge, MA; The Brain and Mind Institute, The University of Western Ontario, London, ON, Canada; Department of Computer Science, The University of Western Ontario, London, ON, Canada
- Benjamin Lahner: Computer Science and Artificial Intelligence Lab (CSAIL), MIT, Cambridge, MA
- Ian Charest: Département de Psychologie, Université de Montréal, Montréal, Québec, Canada; Center for Human Brain Health, University of Birmingham, UK
- Aude Oliva: Computer Science and Artificial Intelligence Lab (CSAIL), MIT, Cambridge, MA
- Santani Teng: Computer Science and Artificial Intelligence Lab (CSAIL), MIT, Cambridge, MA; Smith-Kettlewell Eye Research Institute (SKERI), San Francisco, CA
3. Valenzuela J, Díaz-Andreu M, Escera C. Psychology Meets Archaeology: Psychoarchaeoacoustics for Understanding Ancient Minds and Their Relationship to the Sacred. Front Psychol 2020; 11:550794. [PMID: 33391069] [PMCID: PMC7775382] [DOI: 10.3389/fpsyg.2020.550794]
Abstract
How strongly do spatial acoustics influence the mental processes underlying sound perception and cognition? A large body of research spanning architecture, musicology, and psychology analyzes human responses, both subjective and objective, to different soundscapes. But what if we want to understand how acoustic environments shaped the human experience of sound in the sacred ritual practices of premodern societies? Archaeoacoustics is the research field that investigates sound in the past. One of its branches examines how sound was used in specific landscapes and at sites with rock art, and why past societies attached special significance to places with particular acoustical properties. Taking advantage of advances in sound recording and reproduction technologies, researchers are now exploring how ancient social and sacred ceremonies and practices related to the acoustic properties of their sound environment. Here, we advocate for the emergence of a new and innovative discipline, experimental psychoarchaeoacoustics. We also review the underlying methodological approaches and discuss the limitations, challenges, and future directions of this new field.
Affiliation(s)
- Jose Valenzuela: Brainlab - Cognitive Neuroscience Research Group, Department of Clinical Psychology and Psychobiology, Faculty of Psychology, University of Barcelona, Barcelona, Spain; Institute of Neurosciences, University of Barcelona, Barcelona, Spain
- Margarita Díaz-Andreu: Catalan Institution for Research and Advanced Studies (ICREA), Barcelona, Spain; Department of History and Geography, University of Barcelona, Barcelona, Spain
- Carles Escera: Brainlab - Cognitive Neuroscience Research Group, Department of Clinical Psychology and Psychobiology, Faculty of Psychology, University of Barcelona, Barcelona, Spain; Institute of Neurosciences, University of Barcelona, Barcelona, Spain; Catalan Institution for Research and Advanced Studies (ICREA), Barcelona, Spain; Sant Joan de Déu Research Institute (IRSJD), Esplugues de Llobregat, Spain
4. de Kerangal M, Vickers D, Chait M. The effect of healthy aging on change detection and sensitivity to predictable structure in crowded acoustic scenes. Hear Res 2020; 399:108074. [PMID: 33041093] [DOI: 10.1016/j.heares.2020.108074]
Abstract
The auditory system plays a critical role in supporting our ability to detect abrupt changes in our surroundings. Here we study how this capacity is affected in the course of healthy aging. Artificial acoustic 'scenes', populated by multiple concurrent streams of pure tones ('sources'), were used to capture the challenges of listening in complex acoustic environments. Two scene conditions were included: REG scenes consisted of sources characterized by a regular temporal structure; matched RAND scenes contained sources that were temporally random. Changes, manifested as the abrupt disappearance of one of the sources, were introduced on a subset of the trials, and participants ('young' group N = 41, age 20-38 years; 'older' group N = 41, age 60-82 years) were instructed to monitor the scenes for these events. Previous work demonstrated that young listeners exhibit better change detection performance in REG scenes, reflecting sensitivity to temporal structure. Here we sought to determine: (1) whether 'baseline' change detection ability (i.e., in RAND scenes) is affected by age; (2) whether aging affects listeners' sensitivity to temporal regularity; and (3) how change detection capacity relates to listeners' hearing and cognitive profile (a battery of tests capturing hearing and cognitive abilities hypothesized to be affected by aging). The results demonstrated that healthy aging is associated with reduced sensitivity to abrupt scene changes in RAND scenes, but that performance does not correlate with age or with standard audiological measures such as pure tone audiometry or speech-in-noise performance. Remarkably, older listeners' change detection performance improved substantially (up to the level exhibited by young listeners) in REG relative to RAND scenes. This suggests that the ability to extract and track the regularity associated with scene sources, even in crowded acoustic environments, is relatively preserved in older listeners.
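The REG/RAND scene logic can be sketched abstractly as per-source tone-pip onset times. The source count, pip rates, scene duration, and change time below are illustrative placeholders, not the paper's stimulus parameters.

```python
import numpy as np

rng = np.random.default_rng(2)

def make_scene(n_sources=8, duration=4.0, regular=True):
    """Return a list of per-source pip onset times for a REG or RAND scene."""
    scene = []
    for _ in range(n_sources):
        rate = rng.uniform(2.0, 10.0)          # pips per second for this source
        n_pips = int(duration * rate)
        if regular:                            # REG: fixed inter-onset interval
            onsets = np.arange(n_pips) / rate + rng.uniform(0, 1 / rate)
        else:                                  # RAND: random onsets, same density
            onsets = np.sort(rng.uniform(0, duration, n_pips))
        scene.append(onsets)
    return scene

reg = make_scene(regular=True)
rand = make_scene(regular=False)

# A "change" trial: one source disappears abruptly at t = 2 s.
t_change = 2.0
changed = [o[o < t_change] if i == 0 else o for i, o in enumerate(reg)]
print(len(reg), len(rand), len(changed))
```

In REG scenes the disappearance of a source violates a predictable inter-onset pattern, whereas in RAND scenes only the overall density changes, which is the manipulation behind the regularity benefit reported here.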
Affiliation(s)
- Mathilde de Kerangal: Ear Institute, University College London, 332 Gray's Inn Road, London WC1X 8EE, UK
- Deborah Vickers: Ear Institute, University College London, 332 Gray's Inn Road, London WC1X 8EE, UK; Cambridge Hearing Group, Clinical Neurosciences Department, University of Cambridge, UK
- Maria Chait: Ear Institute, University College London, 332 Gray's Inn Road, London WC1X 8EE, UK
5. Ogg M, Carlson TA, Slevc LR. The Rapid Emergence of Auditory Object Representations in Cortex Reflect Central Acoustic Attributes. J Cogn Neurosci 2019; 32:111-123. [PMID: 31560265] [DOI: 10.1162/jocn_a_01472]
Abstract
Human listeners are bombarded by acoustic information that the brain rapidly organizes into coherent percepts of objects and events in the environment, which aids speech and music perception. The efficiency of auditory object recognition belies the critical constraint that acoustic stimuli necessarily require time to unfold. Using magnetoencephalography, we studied the time course of the neural processes that transform dynamic acoustic information into auditory object representations. Participants listened to a diverse set of 36 tokens comprising everyday sounds from a typical human environment. Multivariate pattern analysis was used to decode the sound tokens from the magnetoencephalographic recordings. We show that sound tokens can be decoded from brain activity beginning 90 msec after stimulus onset with peak decoding performance occurring at 155 msec poststimulus onset. Decoding performance was primarily driven by differences between category representations (e.g., environmental vs. instrument sounds), although within-category decoding was better than chance. Representational similarity analysis revealed that these emerging neural representations were related to harmonic and spectrotemporal differences among the stimuli, which correspond to canonical acoustic features processed by the auditory pathway. Our findings begin to link the processing of physical sound properties with the perception of auditory objects and events in cortex.
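The representational similarity analysis mentioned in this abstract can be sketched in a few lines. The "neural" and feature data below are simulated, the RDMs use Euclidean distance, and a Pearson correlation stands in for the rank correlation more commonly used in RSA; nothing here reproduces the paper's 36 sound tokens or MEG recordings.

```python
import numpy as np

def condensed_rdm(patterns):
    # Condensed vector of pairwise Euclidean distances between conditions.
    n = len(patterns)
    return np.array([np.linalg.norm(patterns[i] - patterns[j])
                     for i in range(n) for j in range(i + 1, n)])

rng = np.random.default_rng(1)
n_stimuli, n_features, n_sensors = 36, 10, 50

# Hypothetical acoustic descriptors (e.g., spectrotemporal features).
features = rng.normal(size=(n_stimuli, n_features))

# Hypothetical sensor patterns partly driven by those descriptors.
mixing = rng.normal(size=(n_features, n_sensors))
neural = features @ mixing + 0.5 * rng.normal(size=(n_stimuli, n_sensors))

# RSA: compare the two representational geometries.
r = np.corrcoef(condensed_rdm(features), condensed_rdm(neural))[0, 1]
print(f"feature-to-neural RDM correlation: r = {r:.2f}")
```

A high correlation between the two RDMs indicates that stimuli which are similar in the feature space also evoke similar neural patterns, which is how the study links harmonic and spectrotemporal structure to the emerging representations.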
6. Modality-Independent Coding of Scene Categories in Prefrontal Cortex. J Neurosci 2018; 38:5969-5981. [PMID: 29858483] [DOI: 10.1523/jneurosci.0272-18.2018]
Abstract
Natural environments convey information through multiple sensory modalities, all of which contribute to people's percepts. Although it has been shown that visual or auditory content of scene categories can be decoded from brain activity, it remains unclear how humans represent scene information beyond a specific sensory modality domain. To address this question, we investigated how categories of scene images and sounds are represented in several brain regions. A group of healthy human subjects (both sexes) participated in the present study, where their brain activity was measured with fMRI while viewing images or listening to sounds of different real-world environments. We found that both visual and auditory scene categories can be decoded not only from modality-specific areas, but also from several brain regions in the temporal, parietal, and prefrontal cortex (PFC). Intriguingly, only in the PFC, but not in any other regions, categories of scene images and sounds appear to be represented in similar activation patterns, suggesting that scene representations in PFC are modality-independent. Furthermore, the error patterns of neural decoders indicate that category-specific neural activity patterns in the middle and superior frontal gyri are tightly linked to categorization behavior. Our findings demonstrate that complex scene information is represented at an abstract level in the PFC, regardless of the sensory modality of the stimulus.
SIGNIFICANCE STATEMENT: Our experience in daily life includes multiple sensory inputs, such as images, sounds, or scents from the surroundings, which all contribute to our understanding of the environment. Here, for the first time, we investigated where and how in the brain information about the natural environment from multiple senses is merged to form modality-independent representations of scene categories. We show direct decoding of scene categories across sensory modalities from patterns of neural activity in the prefrontal cortex (PFC). We also conclusively tie these neural representations to human categorization behavior by comparing patterns of errors between a neural decoder and behavior. Our findings suggest that PFC is a central hub for integrating sensory information and computing modality-independent representations of scene categories.
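The cross-modal decoding logic (train a decoder on one modality, test it on the other) can be sketched as follows. The shared category "signatures", the noise levels, and the nearest-centroid decoder are all simulated assumptions, not the study's fMRI data or classifier; above-chance transfer is the marker of a modality-independent code.

```python
import numpy as np

rng = np.random.default_rng(3)
n_per_cat, n_voxels, n_categories = 30, 100, 4

# A common category signature shared across modalities, plus
# modality-specific noise on top of it.
signatures = rng.normal(size=(n_categories, n_voxels))

def simulate(modality_noise=1.0):
    X = np.vstack([sig + modality_noise * rng.normal(size=(n_per_cat, n_voxels))
                   for sig in signatures])
    y = np.repeat(np.arange(n_categories), n_per_cat)
    return X, y

X_img, y_img = simulate()   # "image" runs
X_snd, y_snd = simulate()   # "sound" runs

# Nearest-centroid decoder trained on images, tested on sounds.
centroids = np.stack([X_img[y_img == c].mean(axis=0) for c in range(n_categories)])
dists = np.linalg.norm(X_snd[:, None, :] - centroids[None, :, :], axis=2)
accuracy = (dists.argmin(axis=1) == y_snd).mean()
print(f"cross-modal decoding accuracy: {accuracy:.2f}")  # chance = 0.25
```

If the signatures were not shared across modalities, training on images would transfer to sounds only at chance, which is the contrast that distinguishes modality-independent from merely co-located representations.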
7. Salminen NH, Jones SJ, Christianson GB, Marquardt T, McAlpine D. A common periodic representation of interaural time differences in mammalian cortex. Neuroimage 2018; 167:95-103. [PMID: 29122721] [PMCID: PMC5854251] [DOI: 10.1016/j.neuroimage.2017.11.012]
Abstract
Binaural hearing, the ability to detect small differences in the timing and level of sounds at the two ears, underpins the ability to localize sound sources along the horizontal plane, and is important for decoding complex spatial listening environments into separate objects – a critical factor in ‘cocktail-party listening’. For human listeners, the most important spatial cue is the interaural time difference (ITD). Despite many decades of neurophysiological investigations of ITD sensitivity in small mammals, and computational models aimed at accounting for human perception, a lack of concordance between these studies has hampered our understanding of how the human brain represents and processes ITDs. Further, neural coding of spatial cues might depend on factors such as head-size or hearing range, which differ considerably between humans and commonly used experimental animals. Here, using magnetoencephalography (MEG) in human listeners, and electro-corticography (ECoG) recordings in guinea pig—a small mammal representative of a range of animals in which ITD coding has been assessed at the level of single-neuron recordings—we tested whether processing of ITDs in human auditory cortex accords with a frequency-dependent periodic code of ITD reported in small mammals, or whether alternative or additional processing stages implemented in psychoacoustic models of human binaural hearing must be assumed. Our data were well accounted for by a model consisting of periodically tuned ITD-detectors, and were highly consistent across the two species. The results suggest that the representation of ITD in human auditory cortex is similar to that found in other mammalian species, a representation in which neural responses to ITD are determined by phase differences relative to sound frequency rather than, for instance, the range of ITDs permitted by head size or the absolute magnitude or direction of ITD. 
Highlights: ITD tuning is studied in human MEG and guinea pig ECoG with identical stimuli; auditory cortical tuning to ITD is highly consistent across species; results are consistent with a periodic, frequency-dependent code.
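The periodic, frequency-dependent code described here can be sketched with a phase-tuned detector model: if the response depends on interaural phase rather than on ITD itself, the tuning curve over ITD repeats with period 1/frequency. The cosine detector and its best interaural phase below are modeling assumptions, not fitted values from the paper.

```python
import numpy as np

def itd_response(itd_s, freq_hz, best_ipd_rad=np.pi / 4):
    """Normalized firing of a phase-tuned ITD detector, in [0, 1]."""
    ipd = 2 * np.pi * freq_hz * np.asarray(itd_s)   # ITD expressed as interaural phase
    return 0.5 * (1 + np.cos(ipd - best_ipd_rad))

itds = np.linspace(-2e-3, 2e-3, 401)                # ITDs spanning +/- 2 ms
r500 = itd_response(itds, 500.0)
r1000 = itd_response(itds, 1000.0)

# Periodicity: shifting the ITD by one full stimulus cycle (1/f) leaves
# the response unchanged, so the tuning curve repeats twice as fast at
# 1000 Hz as at 500 Hz.
print(np.allclose(r500, itd_response(itds + 1 / 500.0, 500.0)))      # True
print(np.allclose(r1000, itd_response(itds + 1 / 1000.0, 1000.0)))   # True
```

Under this model the preferred ITD scales with 1/frequency while the preferred phase stays fixed, which is the signature that distinguishes a phase code from a code organized by absolute ITD or head size.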
Affiliation(s)
- Nelli H Salminen: Brain and Mind Laboratory, Dept. of Neuroscience and Biomedical Engineering, MEG Core, Aalto NeuroImaging, Aalto University School of Science, Espoo, Finland
- Simon J Jones: UCL Ear Institute, 332 Gray's Inn Road, London, WC1X 8EE, UK
- David McAlpine: UCL Ear Institute, 332 Gray's Inn Road, London, WC1X 8EE, UK; Dept of Linguistics, Australian Hearing Hub, Macquarie University, Sydney, NSW 2109, Australia
8. Cichy RM, Teng S. Resolving the neural dynamics of visual and auditory scene processing in the human brain: a methodological approach. Philos Trans R Soc Lond B Biol Sci 2017; 372:20160108. [PMID: 28044019] [PMCID: PMC5206276] [DOI: 10.1098/rstb.2016.0108]
Abstract
In natural environments, visual and auditory stimulation elicit responses across a large set of brain regions in a fraction of a second, yielding representations of the multimodal scene and its properties. The rapid and complex neural dynamics underlying visual and auditory information processing pose major challenges to human cognitive neuroscience. Brain signals measured non-invasively are inherently noisy, the format of neural representations is unknown, and transformations between representations are complex and often nonlinear. Further, no single non-invasive brain measurement technique provides a spatio-temporally integrated view. In this opinion piece, we argue that progress can be made by a concerted effort based on three pillars of recent methodological development: (i) sensitive analysis techniques such as decoding and cross-classification, (ii) complex computational modelling using models such as deep neural networks, and (iii) integration across imaging methods (magnetoencephalography/electroencephalography, functional magnetic resonance imaging) and models, e.g. using representational similarity analysis. We showcase two recent efforts that have been undertaken in this spirit and provide novel results about visual and auditory scene analysis. Finally, we discuss the limits of this perspective and sketch a concrete roadmap for future research. This article is part of the themed issue ‘Auditory and visual scene analysis’.
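The third pillar above, integrating imaging methods via representational similarity analysis, can be sketched as MEG-fMRI fusion: correlate a time-resolved MEG RDM with a static fMRI region RDM to obtain a time course of when that region's representational geometry emerges. All data below are simulated, and the geometry "switching on" at time point 25 is an illustrative assumption.

```python
import numpy as np

def condensed_rdm(patterns):
    # Condensed vector of pairwise Euclidean distances between conditions.
    n = len(patterns)
    return np.array([np.linalg.norm(patterns[i] - patterns[j])
                     for i in range(n) for j in range(i + 1, n)])

rng = np.random.default_rng(4)
n_cond, n_sensors, n_times = 20, 30, 50        # conditions x sensors x time points

# Latent geometry shared between the "fMRI" region and later MEG responses.
latent = rng.normal(size=(n_cond, n_sensors))
fmri_rdm = condensed_rdm(latent + 0.2 * rng.normal(size=latent.shape))

fusion = np.empty(n_times)
for t in range(n_times):
    signal = latent if t >= 25 else 0 * latent  # geometry present from t = 25 on
    meg_t = signal + 0.5 * rng.normal(size=latent.shape)
    fusion[t] = np.corrcoef(condensed_rdm(meg_t), fmri_rdm)[0, 1]

print(fusion[:25].mean(), fusion[25:].mean())   # low before, high after onset
```

The fusion time course localizes in time what the fMRI RDM localizes in space, which is how RSA lets MEG/EEG and fMRI compensate for each other's blind spots.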
Affiliation(s)
- Santani Teng: Computer Science and Artificial Intelligence Laboratory, Massachusetts Institute of Technology, Cambridge, MA, USA