1. van der Willigen RF, Versnel H, van Opstal AJ. Spectral-temporal processing of naturalistic sounds in monkeys and humans. J Neurophysiol 2024;131:38-63. PMID: 37965933. DOI: 10.1152/jn.00129.2023.
Abstract
Human speech and vocalizations in animals are rich in joint spectrotemporal (S-T) modulations, wherein acoustic changes in frequency and time are functionally related. In principle, the primate auditory system could process these complex dynamic sounds based on either an inseparable representation of S-T features or, alternatively, a separable representation. The separability hypothesis implies independent processing of spectral and temporal modulations. We collected comparative data on S-T hearing sensitivity in humans and macaque monkeys for a wide range of broadband dynamic spectrotemporal ripple stimuli, employing a yes-no signal-detection task. Ripples were systematically varied in density (spectral modulation frequency), velocity (temporal modulation frequency), and modulation depth, to cover a listener's full S-T modulation sensitivity, derived from a total of 87 psychometric ripple-detection curves. Audiograms were measured to control for normal hearing. We determined hearing thresholds, reaction-time distributions, and S-T modulation transfer functions (MTFs), both at the ripple-detection thresholds and at suprathreshold modulation depths. Our psychophysically derived MTFs are consistent with the hypothesis that monkeys and humans employ analogous perceptual strategies: S-T acoustic information is primarily processed separably. Singular value decomposition (SVD), however, revealed a small but consistent inseparable spectral-temporal interaction. Finally, SVD analysis of the known visual spatiotemporal contrast sensitivity function (CSF) highlights that human vision is space-time inseparable to a much larger extent than is the case for S-T sensitivity in hearing. Thus, the specificity with which the primate brain encodes natural sounds appears to be less strict than is required to adequately deal with natural images.

NEW & NOTEWORTHY: We provide comparative data on primate audition of naturalistic sounds, comprising hearing thresholds, reaction-time distributions, and spectral-temporal modulation transfer functions. Our psychophysical experiments demonstrate that auditory information is primarily processed in a spectral-temporal-independent manner by both monkeys and humans. Singular value decomposition of the known visual spatiotemporal contrast sensitivity, compared with our auditory spectral-temporal sensitivity, revealed a striking contrast in how the brain encodes natural sounds as opposed to natural images: vision appears to be largely space-time inseparable.
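The SVD-based separability test described above can be sketched numerically. This is a toy illustration, not the authors' analysis code — the sampling grids, sensitivity profiles, and interaction term below are all hypothetical. An S-T modulation transfer function sampled on a density-by-velocity grid is fully separable exactly when it is rank-1 (an outer product of a spectral and a temporal profile), so the fraction of the total squared singular values carried by the first singular value serves as a separability index.

```python
import numpy as np

# Hypothetical sampling grids: spectral density (cycles/oct) and
# temporal velocity (Hz) of the ripple stimuli.
densities = np.linspace(0.25, 8, 12)
velocities = np.linspace(1, 64, 16)

# A perfectly separable MTF is the outer product of a spectral and a
# temporal sensitivity profile (both band-pass here, peaking near
# 1 cycle/oct and 32 Hz).
spectral = np.exp(-np.log2(densities) ** 2)
temporal = np.exp(-(np.log2(velocities) - 5) ** 2)
mtf = np.outer(spectral, temporal)

# Add a small inseparable spectral-temporal interaction.
mtf += 0.005 * np.outer(np.log2(densities + 1), np.log2(velocities))

# Separability index: share of the total squared singular values
# captured by the first singular value; values near 1 indicate a
# (nearly) separable MTF.
s = np.linalg.svd(mtf, compute_uv=False)
alpha = s[0] ** 2 / np.sum(s ** 2)
print(f"separability index: {alpha:.3f}")
```

With the interaction term removed, alpha equals 1 up to floating-point error; the small added term pushes it just below 1, mirroring the "small but consistent" inseparable component reported above.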
Affiliation(s)
- Robert F van der Willigen
- Section Neurophysics, Donders Institute for Brain, Cognition and Behaviour, Radboud University, Nijmegen, The Netherlands
- School of Communication, Media and Information Technology, Rotterdam University of Applied Sciences, Rotterdam, The Netherlands
- Research Center Creating 010, Rotterdam University of Applied Sciences, Rotterdam, The Netherlands
- Huib Versnel
- Section Neurophysics, Donders Institute for Brain, Cognition and Behaviour, Radboud University, Nijmegen, The Netherlands
- Department of Otorhinolaryngology and Head & Neck Surgery, UMC Utrecht Brain Center, University Medical Center Utrecht, Utrecht University, Utrecht, The Netherlands
- A John van Opstal
- Section Neurophysics, Donders Institute for Brain, Cognition and Behaviour, Radboud University, Nijmegen, The Netherlands
2. Tao DD, Shi B, Galvin JJ, Liu JS, Fu QJ. Frequency detection, frequency discrimination, and spectro-temporal pattern perception in older and younger typically hearing adults. Heliyon 2023;9:e18922. PMID: 37583764. PMCID: PMC10424075. DOI: 10.1016/j.heliyon.2023.e18922.
Abstract
Elderly adults often experience difficulties in speech understanding, possibly due to age-related deficits in frequency perception. It is unclear whether age-related deficits in frequency perception differ between the apical or basal regions of the cochlea. It is also unclear how aging might differently affect frequency discrimination or detection of a change in frequency within a stimulus. In the present study, pure-tone frequency thresholds were measured in 19 older (61-74 years) and 20 younger (22-28 years) typically hearing adults. Participants were asked to discriminate between reference and probe frequencies or to detect changes in frequency within a probe stimulus. Broadband spectro-temporal pattern perception was also measured using the spectro-temporal modulated ripple test (SMRT). Frequency thresholds were significantly poorer in the basal than in the apical region of the cochlea; the deficit in the basal region was 2 times larger for the older than for the younger group. Frequency thresholds were significantly poorer in the older group, especially in the basal region where frequency detection thresholds were 3.9 times poorer for the older than for the younger group. SMRT thresholds were 1.5 times better for the younger than for the older group. Significant age effects were observed for SMRT thresholds and for frequency thresholds only in the basal region. SMRT thresholds were significantly correlated with frequency thresholds only in the older group. The poorer frequency and spectro-temporal pattern perception may contribute to age-related deficits in speech perception, even when audiometric thresholds are nearly normal.
Affiliation(s)
- Duo-Duo Tao
- Department of Ear, Nose, and Throat, The First Affiliated Hospital of Soochow University, Suzhou, 215006, China
- Bin Shi
- Department of Ear, Nose, and Throat, The First Affiliated Hospital of Soochow University, Suzhou, 215006, China
- John J. Galvin
- House Institute Foundation, Los Angeles, CA, 90057, USA
- University Hospital Center of Tours, Tours, 37000, France
- Ji-Sheng Liu
- Department of Ear, Nose, and Throat, The First Affiliated Hospital of Soochow University, Suzhou, 215006, China
- Qian-Jie Fu
- Department of Head and Neck Surgery, David Geffen School of Medicine, University of California, Los Angeles, CA, 90095, USA
3. Veugen LCE, van Opstal AJ, van Wanrooij MM. Reaction Time Sensitivity to Spectrotemporal Modulations of Sound. Trends Hear 2022;26:23312165221127589. PMID: 36172759. PMCID: PMC9523861. DOI: 10.1177/23312165221127589.
Abstract
We tested whether sensitivity to acoustic spectrotemporal modulations can be observed from reaction times for normal-hearing and impaired-hearing conditions. In a manual reaction-time task, normal-hearing listeners had to detect the onset of a ripple (with density between 0 and 8 cycles/octave and a fixed modulation depth of 50%) that moved up or down the log-frequency axis at constant velocity (between 0 and 64 Hz) in otherwise unmodulated broadband white noise. Spectral and temporal modulations elicited band-pass sensitivity characteristics, with fastest detection rates around 1 cycle/oct and 32 Hz under normal-hearing conditions. These results closely resemble data from other studies that typically used the modulation-depth threshold as a sensitivity criterion. To simulate hearing impairment, stimuli were processed with a 6-channel cochlear-implant vocoder and with a hearing-aid simulation that introduced separate spectral smearing and low-pass filtering. Reaction times were always much slower than for normal hearing, especially at the highest spectral densities. Binaural performance was predicted well by the benchmark race model of binaural independence, which models statistical facilitation of independent monaural channels. For the impaired-hearing simulations this implied a "best-of-both-worlds" principle, in which listeners relied on the hearing-aid ear to detect spectral modulations and on the cochlear-implant ear for temporal-modulation detection. Although singular-value decomposition indicated that the joint spectrotemporal sensitivity matrix could be largely reconstructed from independent temporal and spectral sensitivity functions, in line with time-spectrum separability, a substantial inseparable spectral-temporal interaction was present in all hearing conditions. These results suggest that the reaction-time task yields a valid and effective objective measure of acoustic spectrotemporal-modulation sensitivity.
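Under channel independence, the benchmark race model mentioned above has a compact closed form: the binaural detection CDF is F_bin(t) = F_L(t) + F_R(t) - F_L(t)*F_R(t), because a response is triggered by whichever monaural channel finishes first. A minimal simulation with hypothetical gamma-distributed monaural reaction times (not the study's data):

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical monaural reaction-time samples (seconds), e.g. one ear
# with a hearing-aid simulation and one with a cochlear-implant vocoder.
rt_left = rng.gamma(shape=8.0, scale=0.04, size=20000)
rt_right = rng.gamma(shape=8.0, scale=0.05, size=20000)

def ecdf(samples, t):
    """Empirical cumulative distribution P(RT <= t) on a time grid."""
    return np.mean(samples[:, None] <= t[None, :], axis=0)

t = np.linspace(0.05, 1.0, 200)
F_left, F_right = ecdf(rt_left, t), ecdf(rt_right, t)

# Race model of independent monaural channels: the binaural response
# occurs as soon as either channel responds.
F_race = F_left + F_right - F_left * F_right

# Direct simulation of the race: binaural RT = min of the two ears.
F_sim = ecdf(np.minimum(rt_left, rt_right), t)

# Statistical facilitation: the race CDF dominates both monaural CDFs.
assert np.all(F_race >= np.maximum(F_left, F_right) - 1e-12)
```

Here the direct simulation matches the closed form because the two channels are drawn independently; an observed binaural CDF exceeding F_race would indicate facilitation beyond what independent channels can produce.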
Affiliation(s)
- Lidwien C E Veugen
- Department of Biophysics, Donders Institute for Brain, Cognition and Behavior, Radboud University, Nijmegen, Netherlands
- A John van Opstal
- Department of Biophysics, Donders Institute for Brain, Cognition and Behavior, Radboud University, Nijmegen, Netherlands
- Marc M van Wanrooij
- Department of Biophysics, Donders Institute for Brain, Cognition and Behavior, Radboud University, Nijmegen, Netherlands
4. Nechaev DI, Milekhina ON, Tomozova MS, Supin AY. High Ripple-Density Resolution for Discriminating Between Rippled and Nonrippled Signals: Effect of Temporal Processing or Combination Products? Trends Hear 2021;25:23312165211010163. PMID: 33926309. PMCID: PMC8111533. DOI: 10.1177/23312165211010163.
Abstract
The goal of the study was to investigate whether combination products explain why ripple-density resolution estimates are higher when a spectrally rippled signal is discriminated from a nonrippled noise signal than when two rippled signals are discriminated from each other. To this end, a noise band was used to mask the frequency band of the expected low-frequency combination products. A three-alternative forced-choice procedure with adaptive ripple-density variation was used. The mean background (unmasked) ripple-density resolution was 9.8 ripples/oct for rippled reference signals and 21.8 ripples/oct for nonrippled reference signals. Low-frequency maskers reduced the ripple-density resolution. For masker levels from -10 to 10 dB re. signal, the ripple-density resolution for nonrippled reference signals was approximately twice as high as that for rippled reference signals. At a masker level as high as 20 dB re. signal, the ripple-density resolution decreased in both discrimination tasks. This result leads to the conclusion that low-frequency combination products are not responsible for the task-dependent difference in ripple-density resolution estimates.
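The rippled signals at issue can be sketched as noise whose amplitude spectrum is modulated sinusoidally along a log2-frequency axis, with the density parameter fixing the number of ripples per octave. A minimal generator follows; the band limits, density, and depth are illustrative values, not the study's parameters.

```python
import numpy as np

fs = 44100              # sample rate (Hz)
n = 2 ** 16             # signal length (samples)
density = 10.0          # ripple density (ripples/oct), hypothetical
depth = 1.0             # full modulation depth

# Positive-frequency axis and the noise carrier's pass band.
f = np.fft.rfftfreq(n, d=1.0 / fs)
band = (f >= 125) & (f <= 8000)

# Sinusoidal amplitude envelope with `density` periods per octave,
# applied on a log2-frequency axis so ripples are equally spaced
# in octaves rather than in hertz.
amp = np.zeros_like(f)
amp[band] = 1.0 + depth * np.sin(2 * np.pi * density * np.log2(f[band] / 125))

# Random-phase noise shaped by the rippled envelope.
rng = np.random.default_rng(1)
phase = rng.uniform(0, 2 * np.pi, size=f.size)
signal = np.fft.irfft(amp * np.exp(1j * phase), n)
```

Setting depth to 0 yields the nonrippled reference signal, so the same generator covers both discrimination tasks.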
Affiliation(s)
- Dmitry I. Nechaev
- Institute of Ecology and Evolution, Russian Academy of Sciences, Moscow, Russian Federation
- Olga N. Milekhina
- Institute of Ecology and Evolution, Russian Academy of Sciences, Moscow, Russian Federation
- Marina S. Tomozova
- Institute of Ecology and Evolution, Russian Academy of Sciences, Moscow, Russian Federation
- Alexander Y. Supin
- Institute of Ecology and Evolution, Russian Academy of Sciences, Moscow, Russian Federation
5. Ponsot E, Varnet L, Wallaert N, Daoud E, Shamma SA, Lorenzi C, Neri P. Mechanisms of Spectrotemporal Modulation Detection for Normal- and Hearing-Impaired Listeners. Trends Hear 2021;25:2331216520978029. PMID: 33620023. PMCID: PMC7905488. DOI: 10.1177/2331216520978029.
Abstract
Spectrotemporal modulations (STM) are essential features of speech signals that make them intelligible. While their encoding has been widely investigated in neurophysiology, we still lack a full understanding of how STMs are processed at the behavioral level and how cochlear hearing loss impacts this processing. Here, we introduce a novel methodological framework based on psychophysical reverse correlation deployed in the modulation space to characterize the mechanisms underlying STM detection in noise. We derive perceptual filters for young normal-hearing and older hearing-impaired individuals performing a detection task of an elementary target STM (a given product of temporal and spectral modulations) embedded in other masking STMs. Analyzed with computational tools, our data show that both groups rely on a comparable linear (band-pass)-nonlinear processing cascade, which can be well accounted for by a temporal modulation filter bank model combined with cross-correlation against the target representation. Our results also suggest that the modulation mistuning observed for the hearing-impaired group results primarily from broader cochlear filters. Yet, we find idiosyncratic behaviors that cannot be captured by cochlear tuning alone, highlighting the need to consider variability originating from additional mechanisms. Overall, this integrated experimental-computational approach offers a principled way to assess suprathreshold processing distortions in each individual and could thus be used to further investigate interindividual differences in speech intelligibility.
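The reverse-correlation logic described above can be illustrated with a simulated observer; everything below (the template, noise statistics, and decision rule) is a hypothetical toy, not the authors' framework. The classification image — the mean masker on "yes" trials minus the mean masker on "no" trials — recovers the observer's internal filter in the modulation space.

```python
import numpy as np

rng = np.random.default_rng(2)

n_trials = 5000
n_channels = 32                   # points sampling the modulation space

# Internal template: the observer weights a few modulation channels
# around the target STM (channels 10-13, hypothetical).
template = np.zeros(n_channels)
template[10:14] = 1.0

# Masking STM energy on each trial, plus a target on half of them.
noise = rng.normal(0.0, 1.0, size=(n_trials, n_channels))
target_present = rng.random(n_trials) < 0.5
decision_var = noise @ template + 2.0 * target_present
responses = decision_var + rng.normal(0.0, 1.0, n_trials) > 1.0

# Classification image: difference of masker means across responses.
kernel = noise[responses].mean(axis=0) - noise[~responses].mean(axis=0)

# The recovered perceptual filter peaks where the template does.
assert np.argmax(kernel) in range(10, 14)
```

In the study's framework the estimated kernel would then be compared against a temporal modulation filter-bank model; here it simply demonstrates that trial-by-trial maskers plus binary responses suffice to estimate the filter.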
Affiliation(s)
- Emmanuel Ponsot
- Laboratoire des systèmes perceptifs, Département d'études cognitives, École normale supérieure, Université PSL, CNRS, Paris, France
- Hearing Technology @ WAVES, Department of Information Technology, Ghent University, Ghent, Belgium
- Léo Varnet
- Laboratoire des systèmes perceptifs, Département d'études cognitives, École normale supérieure, Université PSL, CNRS, Paris, France
- Nicolas Wallaert
- Laboratoire des systèmes perceptifs, Département d'études cognitives, École normale supérieure, Université PSL, CNRS, Paris, France
- Elza Daoud
- Aix-Marseille Université, UMR CNRS 7260, Laboratoire Neurosciences Intégratives et Adaptatives, Centre Saint-Charles, Marseille, France
- Shihab A. Shamma
- Laboratoire des systèmes perceptifs, Département d'études cognitives, École normale supérieure, Université PSL, CNRS, Paris, France
- Christian Lorenzi
- Laboratoire des systèmes perceptifs, Département d'études cognitives, École normale supérieure, Université PSL, CNRS, Paris, France
- Peter Neri
- Laboratoire des systèmes perceptifs, Département d'études cognitives, École normale supérieure, Université PSL, CNRS, Paris, France