1. Obama N, Sato Y, Kodama N, Kodani Y, Nakamura K, Yokozeki A, Nagami S. Exploring sex differences in auditory saliency: the role of acoustic characteristics in bottom-up attention. BMC Neurosci 2024;25:54. PMID: 39448936. PMCID: PMC11515512. DOI: 10.1186/s12868-024-00909-5.
Abstract
BACKGROUND Several cognitive functions are related to sex. However, the relationship between auditory attention and sex remains unclear. The present study aimed to explore sex differences in auditory saliency judgments, with a particular focus on bottom-up auditory attention. METHODS Forty-five typical adults (mean age: 21.5 ± 0.64 years) with no known hearing deficits, intelligence abnormalities, or attention deficits were enrolled in this study. They were tasked with annotating attention-capturing sounds in five audio clips played in a soundproof room. Each stimulus contained ten salient sounds randomly placed within a 1-min natural soundscape. We conducted a generalized linear mixed model (GLMM) analysis with the number of responses to salient sounds as the dependent variable; sex as the between-subjects factor; the duration, maximum loudness, and maximum spectrum of each sound as within-subjects factors; and sound event and participant as random effects. RESULTS No significant differences were found between the male and female groups in age, hearing threshold, intellectual function, or attention function (all p > 0.05). The analysis confirmed 77 distinct sound events, with individual response rates of 4.0-100%. In the GLMM analysis, the main effect of sex was not statistically significant (p = 0.458). Duration and spectrum had significant effects on response rate (p = 0.006 and p < 0.001, respectively), whereas the effect of loudness was not statistically significant (p = 0.13). CONCLUSIONS The results suggest that male and female listeners do not differ significantly in their auditory saliency judgments based on the acoustic characteristics studied. This finding challenges the notion of inherent sex differences in bottom-up auditory attention and highlights the need for further research to explore other potential factors or conditions under which such differences might emerge.
Affiliation(s)
- Naoya Obama
- Department of Speech and Hearing Sciences, Faculty of Rehabilitation, Kawasaki University of Medical Welfare, Kurashiki, Okayama, Japan
- Yoshiki Sato
- Department of Rehabilitation, Kurashiki Central Hospital, Kurashiki, Okayama, Japan
- Narihiro Kodama
- Department of Speech and Hearing Sciences, Faculty of Rehabilitation, Kawasaki University of Medical Welfare, Kurashiki, Okayama, Japan
- Yuhei Kodani
- Department of Speech and Hearing Sciences, Faculty of Rehabilitation, Kawasaki University of Medical Welfare, Kurashiki, Okayama, Japan
- Graduate School of Health and Welfare Sciences, Okayama Prefectural University, Soja, Okayama, Japan
- Katsuya Nakamura
- Department of Speech and Hearing Sciences, Faculty of Rehabilitation, Kawasaki University of Medical Welfare, Kurashiki, Okayama, Japan
- Graduate School of Comprehensive Scientific Research, Prefectural University of Hiroshima, Shobara, Hiroshima, Japan
- Ayaka Yokozeki
- Department of Neurosurgery, Takamatsu Red Cross Hospital, Takamatsu, Kagawa, Japan
- Shinsuke Nagami
- Department of Communication Disorders, School of Rehabilitation Sciences, Health Sciences University of Hokkaido, 1757 Kanazawa, Tobetsu-cho, Ishikari-gun, Hokkaido 061-0293, Japan
2. Kothinti SR, Elhilali M. Are acoustics enough? Semantic effects on auditory salience in natural scenes. Front Psychol 2023;14:1276237. PMID: 38098516. PMCID: PMC10720592. DOI: 10.3389/fpsyg.2023.1276237.
Abstract
Auditory salience is a fundamental property of a sound that allows it to grab a listener's attention regardless of their attentional state or behavioral goals. While previous research has shed light on acoustic factors influencing auditory salience, the semantic dimensions of this phenomenon have remained relatively unexplored, owing both to the complexity of measuring salience in audition and to the limited focus on complex natural scenes. In this study, we examine the relationship between acoustic, contextual, and semantic attributes and their impact on the auditory salience of natural audio scenes using a dichotic listening paradigm. The experiments present acoustic scenes in forward and backward directions; the latter diminishes semantic effects, providing a counterpoint to the effects observed in forward scenes. The behavioral data, collected from a crowd-sourced platform, reveal a striking convergence in temporal salience maps for certain sound events, while marked disparities emerge in others. Our main hypothesis posits that differences in the perceptual salience of events are predominantly driven by semantic and contextual cues, particularly evident in cases displaying substantial disparities between forward and backward presentations. Conversely, events exhibiting a high degree of alignment can largely be attributed to low-level acoustic attributes. To evaluate this hypothesis, we employ analytical techniques that combine rich low-level mappings from acoustic profiles with high-level embeddings extracted from a deep neural network. This integrated approach captures both the acoustic and the semantic attributes of acoustic scenes along with their temporal trajectories. The results demonstrate that perceptual salience arises from a careful interplay between low-level and high-level attributes that shapes which moments stand out in a natural soundscape. Furthermore, our findings underscore the important role of longer-term context as a critical component of auditory salience, enabling listeners to discern and adapt to temporal regularities within an acoustic scene. The experimental and model-based validation of semantic factors of salience paves the way for a more complete understanding of auditory salience. Ultimately, the empirical and computational analyses have implications for developing large-scale models of auditory salience and audio analytics.
Affiliation(s)
- Mounya Elhilali
- Department of Electrical and Computer Engineering, Center for Language and Speech Processing, The Johns Hopkins University, Baltimore, MD, United States
3. Lorenzi C, Apoux F, Grinfeder E, Krause B, Miller-Viacava N, Sueur J. Human auditory ecology: extending hearing research to the perception of natural soundscapes by humans in rapidly changing environments. Trends Hear 2023;27:23312165231212032. PMID: 37981813. PMCID: PMC10658775. DOI: 10.1177/23312165231212032.
Abstract
Research in hearing sciences has provided extensive knowledge about how the human auditory system processes speech and assists communication. In contrast, little is known about how this system processes "natural soundscapes," that is, the complex arrangements of biological and geophysical sounds shaped by sound propagation through non-anthropogenic habitats [Grinfeder et al. (2022). Frontiers in Ecology and Evolution. 10: 894232]. This is surprising given that, for many species, the capacity to process natural soundscapes determines survival and reproduction through the ability to represent and monitor the immediate environment. Here we propose a framework to encourage research programmes in the field of "human auditory ecology," focusing on the study of human auditory perception of the ecological processes at work in natural habitats. Based on large acoustic databases with high ecological validity, these programmes should investigate the extent to which this presumably ancestral monitoring function of the human auditory system is adapted to the specific information conveyed by natural soundscapes, whether it operates throughout the life span, or whether it emerges through individual learning or cultural transmission. Beyond fundamental knowledge of human hearing, these programmes should yield a better understanding of how normal-hearing and hearing-impaired listeners monitor rural and urban green and blue spaces and benefit from them, and whether rehabilitation devices (hearing aids and cochlear implants) restore natural soundscape perception and emotional responses to normal. Importantly, they should also reveal whether and how humans hear the rapid changes in the environment brought about by human activity.
Affiliation(s)
- Christian Lorenzi
- Laboratoire des Systèmes Perceptifs, UMR CNRS 8248, Département d’Etudes Cognitives, Ecole Normale Supérieure, Université Paris Sciences et Lettres (PSL), Paris, France
- Frédéric Apoux
- Laboratoire des Systèmes Perceptifs, UMR CNRS 8248, Département d’Etudes Cognitives, Ecole Normale Supérieure, Université Paris Sciences et Lettres (PSL), Paris, France
- Elie Grinfeder
- Laboratoire des Systèmes Perceptifs, UMR CNRS 8248, Département d’Etudes Cognitives, Ecole Normale Supérieure, Université Paris Sciences et Lettres (PSL), Paris, France
- Institut de Systématique, Évolution, Biodiversité (ISYEB), Muséum national d’Histoire naturelle, CNRS, Sorbonne Université, EPHE, Université des Antilles, Paris, France
- Nicole Miller-Viacava
- Laboratoire des Systèmes Perceptifs, UMR CNRS 8248, Département d’Etudes Cognitives, Ecole Normale Supérieure, Université Paris Sciences et Lettres (PSL), Paris, France
- Jérôme Sueur
- Institut de Systématique, Évolution, Biodiversité (ISYEB), Muséum national d’Histoire naturelle, CNRS, Sorbonne Université, EPHE, Université des Antilles, Paris, France
4. Kothinti SR, Huang N, Elhilali M. Auditory salience using natural scenes: an online study. J Acoust Soc Am 2021;150:2952. PMID: 34717500. PMCID: PMC8528551. DOI: 10.1121/10.0006750.
Abstract
Salience is the quality of a sensory signal that attracts involuntary attention in humans. While it primarily reflects conspicuous physical attributes of a scene, our understanding of the processes underlying what makes a certain object or event salient remains limited. In the vision literature, experimental results, theoretical accounts, and large amounts of eye-tracking data using rich stimuli have shed light on some of the underpinnings of visual salience in the brain. In contrast, studies of auditory salience have lagged behind due to limitations in both the experimental designs and the stimulus datasets used to probe the question of salience in complex everyday soundscapes. In this work, we deploy an online platform to study salience using a dichotic listening paradigm with natural auditory stimuli. The study validates crowd-sourcing as a reliable platform for collecting behavioral responses to auditory salience by comparing experimental outcomes to findings acquired in a controlled laboratory setting. A model-based analysis demonstrates the benefits of extending behavioral measures of salience to a broader selection of auditory scenes and larger pools of subjects. Overall, this effort extends our current knowledge of auditory salience in everyday soundscapes and highlights the limitations of low-level acoustic attributes in capturing the richness of natural soundscapes.
Affiliation(s)
- Sandeep Reddy Kothinti
- Department of Electrical and Computer Engineering, Center for Language and Speech Processing, The Johns Hopkins University, Baltimore, Maryland 21218, USA
- Nicholas Huang
- Department of Biomedical Engineering, The Johns Hopkins University, Baltimore, Maryland 21218, USA
- Mounya Elhilali
- Department of Electrical and Computer Engineering, Center for Language and Speech Processing, The Johns Hopkins University, Baltimore, Maryland 21218, USA
5. de Kerangal M, Vickers D, Chait M. The effect of healthy aging on change detection and sensitivity to predictable structure in crowded acoustic scenes. Hear Res 2020;399:108074. PMID: 33041093. DOI: 10.1016/j.heares.2020.108074.
Abstract
The auditory system plays a critical role in supporting our ability to detect abrupt changes in our surroundings. Here we study how this capacity is affected in the course of healthy aging. Artificial acoustic 'scenes', populated by multiple concurrent streams of pure tones ('sources'), were used to capture the challenges of listening in complex acoustic environments. Two scene conditions were included: REG scenes consisted of sources characterized by a regular temporal structure, whereas matched RAND scenes contained sources that were temporally random. Changes, manifested as the abrupt disappearance of one of the sources, were introduced in a subset of the trials, and participants ('young' group, N = 41, age 20-38 years; 'older' group, N = 41, age 60-82 years) were instructed to monitor the scenes for these events. Previous work demonstrated that young listeners exhibit better change detection performance in REG scenes, reflecting sensitivity to temporal structure. Here we sought to determine: (1) whether 'baseline' change detection ability (i.e. in RAND scenes) is affected by age; (2) whether aging affects listeners' sensitivity to temporal regularity; and (3) how change detection capacity relates to listeners' hearing and cognitive profile (a battery of tests that capture hearing and cognitive abilities hypothesized to be affected by aging). The results demonstrated that healthy aging is associated with reduced sensitivity to abrupt scene changes in RAND scenes, but that performance does not correlate with age or with standard audiological measures such as pure tone audiometry or speech-in-noise performance. Remarkably, older listeners' change detection performance improved substantially (up to the level exhibited by young listeners) in REG relative to RAND scenes. This suggests that the ability to extract and track the regularity associated with scene sources, even in crowded acoustic environments, is relatively preserved in older listeners.
Affiliation(s)
- Mathilde de Kerangal
- Ear Institute, University College London, 332 Gray's Inn Road, London WC1X 8EE, UK
- Deborah Vickers
- Ear Institute, University College London, 332 Gray's Inn Road, London WC1X 8EE, UK; Cambridge Hearing Group, Clinical Neurosciences Department, University of Cambridge, UK
- Maria Chait
- Ear Institute, University College London, 332 Gray's Inn Road, London WC1X 8EE, UK
6. Sohoglu E, Kumar S, Chait M, Griffiths TD. Multivoxel codes for representing and integrating acoustic features in human cortex. Neuroimage 2020;217:116661. PMID: 32081785. PMCID: PMC7339141. DOI: 10.1016/j.neuroimage.2020.116661.
Abstract
Using fMRI and multivariate pattern analysis, we determined whether spectral and temporal acoustic features are represented by independent or integrated multivoxel codes in human cortex. Listeners heard band-pass noise varying in frequency (spectral) and amplitude-modulation (AM) rate (temporal) features. In the superior temporal plane, changes in multivoxel activity due to frequency were largely invariant with respect to AM rate (and vice versa), consistent with an independent representation. In contrast, in posterior parietal cortex, multivoxel representation was exclusively integrated and tuned to specific conjunctions of frequency and AM features (albeit weakly). Direct between-region comparisons show that whereas independent coding of frequency weakened with increasing levels of the hierarchy, such a progression for AM and integrated coding was less fine-grained and only evident in the higher hierarchical levels from non-core to parietal cortex (with AM coding weakening and integrated coding strengthening). Our findings support the notion that primary auditory cortex can represent spectral and temporal acoustic features in an independent fashion and suggest a role for parietal cortex in feature integration and the structuring of sensory input.
Affiliation(s)
- Ediz Sohoglu
- School of Psychology, University of Sussex, Brighton, BN1 9QH, United Kingdom.
- Sukhbinder Kumar
- Institute of Neurobiology, Medical School, Newcastle University, Newcastle Upon Tyne, NE2 4HH, United Kingdom; Wellcome Trust Centre for Human Neuroimaging, University College London, London, WC1N 3BG, United Kingdom
- Maria Chait
- Ear Institute, University College London, London, United Kingdom
- Timothy D Griffiths
- Institute of Neurobiology, Medical School, Newcastle University, Newcastle Upon Tyne, NE2 4HH, United Kingdom; Wellcome Trust Centre for Human Neuroimaging, University College London, London, WC1N 3BG, United Kingdom
7. Liao HI, Yoneya M, Kashino M, Furukawa S. Pupillary dilation response reflects surprising moments in music. J Eye Mov Res 2018;11. PMID: 33828696. PMCID: PMC7899049. DOI: 10.16910/jemr.11.2.13.
Abstract
There are indications that the pupillary dilation response (PDR) reflects surprising moments in an auditory sequence, such as the appearance of a deviant noise against repetitively presented pure tones (4), and salient, loud sounds that participants evaluate subjectively (12). In the current study, we further examined whether the PDR's reflection of auditory surprise accumulates and is revealed in complex yet structured auditory stimuli, i.e., music, and when the surprise is defined subjectively. Participants listened to 15 excerpts of music while their pupillary responses were recorded. In the surprise-rating session, participants rated how surprising an instance in the excerpt was, i.e., rich in variation versus monotonous, while they listened to it. In the passive-listening session, they listened to the same 15 excerpts again but were not given any task. The pupil diameter data obtained from both sessions were time-aligned to the rating data obtained from the surprise-rating session. Results showed that in both sessions, mean pupil diameter was larger at moments rated surprising than at moments rated unsurprising. These results suggest that the PDR reflects surprise in music automatically.
Affiliation(s)
- Hsin-I Liao
- NTT Communication Science Laboratories, NTT Corporation, Japan
- Makoto Yoneya
- NTT Communication Science Laboratories, NTT Corporation, Japan
- Makio Kashino
- NTT Communication Science Laboratories, NTT Corporation, Japan
8. Huang N, Slaney M, Elhilali M. Connecting deep neural networks to physical, perceptual, and electrophysiological auditory signals. Front Neurosci 2018;12:532. PMID: 30154688. PMCID: PMC6102345. DOI: 10.3389/fnins.2018.00532.
Abstract
Deep neural networks have recently been shown to capture the intricate transformation of signals from sensory profiles to semantic representations that facilitate recognition or discrimination of complex stimuli. In this vein, convolutional neural networks (CNNs) have been used very successfully in image and audio classification. Designed to imitate the hierarchical structure of the nervous system, CNNs exhibit activations of increasing complexity that transform the incoming signal into object-level representations. In this work, we employ a CNN trained for large-scale audio object classification to gain insights into the contribution of various audio representations that guide sound perception. The analysis contrasts activations of different layers of the CNN with acoustic features extracted directly from the scenes, perceptual salience obtained from behavioral responses of human listeners, and neural oscillations recorded by electroencephalography (EEG) in response to the same natural scenes. All three measures are tightly linked quantities believed to guide percepts of salience and object formation when listening to complex scenes. The results paint a picture of an intricate interplay between low-level and object-level representations in guiding auditory salience that is very much dependent on context and sound category.
Affiliation(s)
- Nicholas Huang
- Laboratory for Computational Audio Perception, Department of Electrical and Computer Engineering, Johns Hopkins University, Baltimore, MD, United States
- Malcolm Slaney
- Machine Hearing, Google AI, Google, Mountain View, CA, United States
- Mounya Elhilali
- Laboratory for Computational Audio Perception, Department of Electrical and Computer Engineering, Johns Hopkins University, Baltimore, MD, United States
9. Murphy S, Spence C, Dalton P. Auditory perceptual load: a review. Hear Res 2017;352:40-48. DOI: 10.1016/j.heares.2017.02.005.