1. Borjigin A, Bharadwaj HM. Individual Differences Elucidate the Perceptual Benefits Associated with Robust Temporal Fine-Structure Processing. bioRxiv 2024:2023.09.20.558670. PMID: 37790457; PMCID: PMC10542537; DOI: 10.1101/2023.09.20.558670.
Abstract
The auditory system is unique among sensory systems in its ability to phase lock to and precisely follow very fast cycle-by-cycle fluctuations in the phase of sound-driven cochlear vibrations. Yet, the perceptual role of this temporal fine structure (TFS) code is debated. This fundamental gap is attributable to our inability to experimentally manipulate TFS cues without altering other perceptually relevant cues. Here, we circumvented this limitation by leveraging individual differences across 200 participants to systematically relate variation in TFS sensitivity to performance in a range of speech perception tasks. TFS sensitivity was assessed through detection of interaural time/phase differences, while speech perception was evaluated by word identification under noise interference. Results suggest that greater TFS sensitivity is not associated with greater masking release from fundamental-frequency or spatial cues, but appears to contribute to resilience against the effects of reverberation. We also found that greater TFS sensitivity is associated with faster response times, indicating reduced listening effort. These findings highlight the perceptual significance of TFS coding for everyday hearing.
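As a rough illustration of the kind of interaural-timing stimulus used to probe TFS sensitivity, the sketch below imposes an interaural time difference (ITD) on a pure tone by delaying the carrier at one ear. This is a minimal sketch, not the authors' exact paradigm; all parameter values are illustrative assumptions.

```python
import numpy as np

def pure_tone_with_itd(freq_hz=500.0, itd_us=100.0, dur_s=0.4, fs=48000):
    """Stereo pure tone carrying an ongoing ITD in its fine structure.

    The delay is applied to the carrier of one ear only, while identical
    onset/offset ramps keep the envelope diotic, so detection must rely
    on the fine-structure ITD rather than on envelope cues.
    """
    t = np.arange(int(dur_s * fs)) / fs
    left = np.sin(2 * np.pi * freq_hz * t)
    right = np.sin(2 * np.pi * freq_hz * (t - itd_us * 1e-6))  # right ear lags
    ramp_n = int(0.02 * fs)  # 20-ms raised-cosine ramps
    ramp = 0.5 * (1 - np.cos(np.pi * np.arange(ramp_n) / ramp_n))
    for ch in (left, right):
        ch[:ramp_n] *= ramp
        ch[-ramp_n:] *= ramp[::-1]
    return np.stack([left, right], axis=1)

stimulus = pure_tone_with_itd(itd_us=50.0)  # e.g., a 50-microsecond ITD target
```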
Affiliation(s)
- Agudemu Borjigin
- Weldon School of Biomedical Engineering, Purdue University, West Lafayette, IN 47907, USA
- Waisman Center, University of Wisconsin - Madison, Madison, WI 53705, USA
- Hari M. Bharadwaj
- Weldon School of Biomedical Engineering, Purdue University, West Lafayette, IN 47907, USA
- Department of Speech, Language, and Hearing Sciences, Purdue University, West Lafayette, IN 47907, USA
- Department of Communication Science and Disorders, University of Pittsburgh, Pittsburgh, PA 15213, USA
2. Viswanathan V, Heinz MG, Shinn-Cunningham BG. Impact of Reduced Spectral Resolution on Temporal-Coherence-Based Source Segregation. bioRxiv 2024:2024.03.11.584489. PMID: 38586037; PMCID: PMC10998286; DOI: 10.1101/2024.03.11.584489.
Abstract
Hearing-impaired listeners struggle to understand speech in noise, even when using cochlear implants (CIs) or hearing aids. Successful listening in noisy environments depends on the brain's ability to organize a mixture of sound sources into distinct perceptual streams (i.e., source segregation). In normal-hearing listeners, temporal coherence of sound fluctuations across frequency channels supports this process by promoting grouping of elements belonging to a single acoustic source. We hypothesized that reduced spectral resolution, a hallmark of both electric/CI hearing (from current spread) and acoustic hearing with sensorineural hearing loss (from broadened cochlear tuning), degrades segregation based on temporal coherence. This is because reduced frequency resolution decreases the likelihood that a single sound source dominates the activity driving any specific channel; concomitantly, it increases the correlation in activity across channels. Consistent with our hypothesis, predictions from a physiologically plausible model of temporal-coherence-based segregation suggest that CI current spread reduces comodulation masking release (CMR; a correlate of temporal-coherence processing) and speech intelligibility in noise. These predictions are consistent with our behavioral data from simulated CI listening. Our model also predicts smaller CMR with increasing levels of outer-hair-cell damage. These results suggest that reduced spectral resolution relative to normal hearing impairs temporal-coherence-based segregation and speech-in-noise outcomes.
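The mechanistic claim, that broader peripheral filters inflate the correlation in activity across channels, can be illustrated with a toy filterbank simulation. This is a sketch under simplifying assumptions, not the authors' physiologically plausible model; the filter type, bandwidths, and center frequencies are illustrative.

```python
import numpy as np
from scipy.signal import butter, sosfiltfilt, hilbert

def channel_envelopes(x, fs, centers_hz, bw_oct):
    """Band-pass x around each center frequency; return Hilbert envelopes."""
    envs = []
    for fc in centers_hz:
        lo, hi = fc * 2 ** (-bw_oct / 2), fc * 2 ** (bw_oct / 2)
        sos = butter(4, [lo, hi], btype="bandpass", fs=fs, output="sos")
        envs.append(np.abs(hilbert(sosfiltfilt(sos, x))))
    return np.array(envs)

fs = 16000
x = np.random.default_rng(0).standard_normal(2 * fs)  # broadband-noise "scene"
centers = [500.0, 1000.0, 2000.0]

for bw_oct in (1 / 3, 1.0):  # sharp vs. broadened tuning
    e = channel_envelopes(x, fs, centers, bw_oct)
    r = np.corrcoef(e)
    print(f"{bw_oct:.2f}-octave filters: adjacent-channel envelope correlation "
          f"~ {np.mean([r[0, 1], r[1, 2]]):.2f}")
```

With broader filters, adjacent passbands overlap more, so the same source energy drives multiple channels and their envelopes become more correlated, which is the degradation the abstract attributes to current spread and broadened tuning.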
Affiliation(s)
- Vibha Viswanathan
- Neuroscience Institute, Carnegie Mellon University, Pittsburgh, PA 15213
- Michael G. Heinz
- Department of Speech, Language, and Hearing Sciences, Purdue University, West Lafayette, IN 47907
3. Mok BA, Viswanathan V, Borjigin A, Singh R, Kafi H, Bharadwaj HM. Web-based psychoacoustics: Hearing screening, infrastructure, and validation. Behav Res Methods 2024; 56:1433-1448. PMID: 37326771; PMCID: PMC10704001; DOI: 10.3758/s13428-023-02101-9.
Abstract
Anonymous web-based experiments are increasingly used in many domains of behavioral research. However, online studies of auditory perception, especially of psychoacoustic phenomena pertaining to low-level sensory processing, are challenging because of limited control over the acoustics and the inability to perform audiometry to confirm the normal-hearing status of participants. Here, we outline our approach to mitigating these challenges and validate our procedures by comparing web-based measurements to lab-based data on a range of classic psychoacoustic tasks. Individual tasks were created using jsPsych, an open-source JavaScript front-end library. Dynamic sequences of psychoacoustic tasks were implemented using Django, an open-source library for web applications, and combined with consent pages, questionnaires, and debriefing pages. Subjects were recruited via Prolific, a recruitment platform for web-based studies. Guided by a meta-analysis of lab-based data, we developed and validated a screening procedure to select participants of (putatively) normal-hearing status based on their responses in a suprathreshold task and a survey. Headphone use was standardized by supplementing procedures from the prior literature with a binaural hearing task. Individuals meeting all criteria were re-invited to complete a range of classic psychoacoustic tasks. For the re-invited participants, thresholds were in excellent agreement with lab-based data for fundamental-frequency discrimination, gap detection, and sensitivity to interaural time delay and level difference. Furthermore, word-identification scores, consonant-confusion patterns, and comodulation masking release effects also matched lab-based studies. Our results suggest that web-based psychoacoustics is a viable complement to lab-based research. Source code for our infrastructure is provided.
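For context on binaural headphone checks of this kind: one established approach is a dichotic (Huggins-pitch-style) stimulus that evokes a faint pitch only when interaural phase is preserved, i.e., over headphones but not loudspeakers. Whether this matches the authors' exact task is an assumption; the parameters below are illustrative.

```python
import numpy as np
from scipy.fft import rfft, irfft

def huggins_noise(f0=600.0, rel_bw=0.12, dur_s=1.0, fs=44100, seed=1):
    """White noise, identical at the two ears except that a narrow band
    around f0 is phase-inverted in one ear. Over headphones this evokes
    a faint pitch near f0; over loudspeakers it sounds like plain noise,
    so pitch detection verifies headphone use."""
    rng = np.random.default_rng(seed)
    n = int(dur_s * fs)
    left = rng.standard_normal(n)
    spec = rfft(left)
    freqs = np.arange(spec.size) * fs / n
    spec[np.abs(freqs - f0) < rel_bw * f0 / 2] *= -1  # pi shift, one ear only
    right = irfft(spec, n)
    return np.stack([left, right], axis=1)

target = huggins_noise()  # "pitch present"; a control interval would be diotic noise
```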
Affiliation(s)
- Brittany A Mok
- Department of Speech, Language, and Hearing Sciences, Purdue University, West Lafayette, IN, USA
- Vibha Viswanathan
- Weldon School of Biomedical Engineering, Purdue University, West Lafayette, IN, USA
- Agudemu Borjigin
- Weldon School of Biomedical Engineering, Purdue University, West Lafayette, IN, USA
- Ravinderjit Singh
- Weldon School of Biomedical Engineering, Purdue University, West Lafayette, IN, USA
- Homeira Kafi
- Weldon School of Biomedical Engineering, Purdue University, West Lafayette, IN, USA
- Hari M Bharadwaj
- Department of Speech, Language, and Hearing Sciences, Purdue University, West Lafayette, IN, USA
- Department of Communication Science and Disorders, University of Pittsburgh, Pittsburgh, PA, USA
4. Viswanathan V, Bharadwaj HM, Heinz MG, Shinn-Cunningham BG. Induced alpha and beta electroencephalographic rhythms covary with single-trial speech intelligibility in competition. Sci Rep 2023; 13:10216. PMID: 37353552; PMCID: PMC10290148; DOI: 10.1038/s41598-023-37173-2.
Abstract
Neurophysiological studies suggest that intrinsic brain oscillations influence sensory processing, especially of rhythmic stimuli like speech. Prior work suggests that brain rhythms may mediate perceptual grouping and selective attention to speech amidst competing sound, as well as more linguistic aspects of speech processing like predictive coding. However, we know of no prior studies that have directly tested, at the single-trial level, whether brain oscillations relate to speech-in-noise outcomes. Here, we recorded electroencephalography while simultaneously measuring the intelligibility of spoken sentences amidst two different interfering sounds: multi-talker babble or speech-shaped noise. We find that induced parieto-occipital alpha (7-15 Hz; thought to modulate attentional focus) and frontal beta (13-30 Hz; associated with maintenance of the current sensorimotor state and predictive coding) oscillations covary with trial-wise percent-correct scores; importantly, alpha and beta power provide significant independent contributions to predicting single-trial behavioral outcomes. These results can inform models of speech processing and guide noninvasive measures to index the different neural processes that together support complex listening.
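A minimal sketch of the analysis idea: extract single-trial alpha- and beta-band power and relate both, with separate coefficients, to trial outcome. The data below are synthetic placeholders, and the authors' actual pipeline (induced-power extraction, channel selection, statistics) is more involved.

```python
import numpy as np
from scipy.signal import welch
from sklearn.linear_model import LogisticRegression

def band_power(epochs, fs, lo, hi):
    """Mean power in [lo, hi] Hz per trial; epochs: (n_trials, n_samples)."""
    f, psd = welch(epochs, fs=fs, nperseg=fs, axis=-1)
    return psd[:, (f >= lo) & (f <= hi)].mean(axis=1)

fs = 256
rng = np.random.default_rng(0)
epochs = rng.standard_normal((200, 4 * fs))  # placeholder: 200 trials x 4 s EEG
correct = rng.integers(0, 2, 200)            # placeholder: per-trial accuracy

X = np.log(np.column_stack([band_power(epochs, fs, 7, 15),     # alpha band
                            band_power(epochs, fs, 13, 30)]))  # beta band
model = LogisticRegression().fit(X, correct)
# Separate fitted weights for the two bands index their (at least partly)
# independent contributions to predicting single-trial outcomes.
print(dict(zip(("alpha", "beta"), model.coef_[0])))
```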
Affiliation(s)
- Vibha Viswanathan
- Neuroscience Institute, Carnegie Mellon University, Pittsburgh, PA, 15213, USA.
- Hari M Bharadwaj
- Department of Communication Science and Disorders, University of Pittsburgh, Pittsburgh, PA, 15260, USA
- Michael G Heinz
- Department of Speech, Language, and Hearing Sciences, Purdue University, West Lafayette, IN, 47907, USA
5. Viswanathan V, Bharadwaj HM, Heinz MG, Shinn-Cunningham BG. Induced alpha and beta electroencephalographic rhythms covary with single-trial speech intelligibility in competition. bioRxiv 2023:2022.12.31.522365. PMID: 36712081; PMCID: PMC9884507; DOI: 10.1101/2022.12.31.522365.
Abstract
Neurophysiological studies suggest that intrinsic brain oscillations influence sensory processing, especially of rhythmic stimuli like speech. Prior work suggests that brain rhythms may mediate perceptual grouping and selective attention to speech amidst competing sound, as well as more linguistic aspects of speech processing like predictive coding. However, we know of no prior studies that have directly tested, at the single-trial level, whether brain oscillations relate to speech-in-noise outcomes. Here, we recorded electroencephalography while simultaneously measuring the intelligibility of spoken sentences amidst two different interfering sounds: multi-talker babble or speech-shaped noise. We find that induced parieto-occipital alpha (7-15 Hz; thought to modulate attentional focus) and frontal beta (13-30 Hz; associated with maintenance of the current sensorimotor state and predictive coding) oscillations covary with trial-wise percent-correct scores; importantly, alpha and beta power provide significant independent contributions to predicting single-trial behavioral outcomes. These results can inform models of speech processing and guide noninvasive measures to index the different neural processes that together support complex listening.
Affiliation(s)
- Vibha Viswanathan
- Neuroscience Institute, Carnegie Mellon University, Pittsburgh, PA 15213
- Hari M. Bharadwaj
- Department of Communication Science and Disorders, University of Pittsburgh, Pittsburgh, PA 15260
- Michael G. Heinz
- Department of Speech, Language, and Hearing Sciences, Purdue University, West Lafayette, IN 47907
6. Individualized Assays of Temporal Coding in the Ascending Human Auditory System. eNeuro 2022; 9:ENEURO.0378-21.2022. PMID: 35193890; PMCID: PMC8925652; DOI: 10.1523/eneuro.0378-21.2022.
Abstract
Neural phase-locking to temporal fluctuations is a fundamental and unique mechanism by which acoustic information is encoded by the auditory system. The perceptual role of this metabolically expensive mechanism, neural phase-locking to the temporal fine structure (TFS) in particular, is debated. Deficits in TFS coding have been hypothesized to underlie auditory perceptual deficits in certain clinical populations, but this link has not been established. Efforts to uncover the role of TFS have been impeded by the fact that there are no established assays for quantifying the fidelity of TFS coding at the individual level. While many candidates have been proposed, for an assay to be useful, it should not only intrinsically depend on TFS coding, but should also have the property that individual differences in the assay reflect TFS coding per se, over and above other sources of variance. Here, we evaluate a range of behavioral and electroencephalogram (EEG)-based measures as candidate individualized measures of TFS sensitivity. Our comparisons of behavioral and EEG-based metrics suggest that extraneous variables dominate both behavioral scores and EEG amplitude metrics, rendering them ineffective. After adjusting behavioral scores using lapse rates, and extracting latency or percent-growth metrics from the EEG, interaural-timing sensitivity measures exhibit robust behavior-EEG correlations. Together with the fact that unambiguous theoretical links can be made between binaural measures and phase-locking to TFS, our results suggest that these "adjusted" binaural assays may be well suited for quantifying individual TFS processing.
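One standard lapse-rate adjustment of behavioral scores is sketched below; whether the authors used exactly this correction is an assumption. Lapsed trials are modeled as random guesses, with the lapse rate estimated from performance on the easiest (catch) condition.

```python
import numpy as np

def lapse_adjusted_pc(p_obs, p_easy, chance=0.5):
    """Adjust observed proportion correct for attentional lapses.

    Assumes lapsed trials are random guesses and that the easiest
    condition would be answered perfectly but for lapses:
        p_obs = (1 - lam) * p_true + lam * chance
    """
    lam = np.clip((1.0 - p_easy) / (1.0 - chance), 0.0, 0.99)
    p_true = (np.asarray(p_obs, dtype=float) - lam * chance) / (1.0 - lam)
    return np.clip(p_true, chance, 1.0), lam

# Illustrative values: 83% correct at threshold, 95% on easy catch trials.
adjusted, lam = lapse_adjusted_pc(p_obs=0.83, p_easy=0.95)
```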
7. Viswanathan V, Shinn-Cunningham BG, Heinz MG. Speech Categorization Reveals the Role of Early-Stage Temporal-Coherence Processing in Auditory Scene Analysis. J Neurosci 2022; 42:240-254. PMID: 34764159; PMCID: PMC8802934; DOI: 10.1523/jneurosci.1610-21.2021.
Abstract
Temporal coherence of sound fluctuations across spectral channels is thought to aid auditory grouping and scene segregation. Although prior studies on the neural bases of temporal-coherence processing focused mostly on cortical contributions, neurophysiological evidence suggests that temporal-coherence-based scene analysis may start as early as the cochlear nucleus (i.e., the first auditory region supporting cross-channel processing over a wide frequency range). Accordingly, we hypothesized that aspects of temporal-coherence processing that could be realized in early auditory areas may shape speech understanding in noise. We then explored whether physiologically plausible computational models could account for results from a behavioral experiment that measured consonant categorization in different masking conditions. We tested whether within-channel masking of target-speech modulations predicted consonant confusions across the different conditions and whether predictions were improved by adding across-channel temporal-coherence processing mirroring the computations known to exist in the cochlear nucleus. Consonant confusions provide a rich characterization of error patterns in speech categorization, and are thus crucial for rigorously testing models of speech perception; however, to the best of our knowledge, they have not been used in prior studies of scene analysis. We find that within-channel modulation masking can reasonably account for category confusions, but that it fails when temporal fine structure cues are unavailable. However, the addition of across-channel temporal-coherence processing significantly improves confusion predictions across all tested conditions. Our results suggest that temporal-coherence processing strongly shapes speech understanding in noise and that physiological computations that exist early along the auditory pathway may contribute to this process.

SIGNIFICANCE STATEMENT Temporal coherence of sound fluctuations across distinct frequency channels is thought to be important for auditory scene analysis. Prior studies on the neural bases of temporal-coherence processing focused mostly on cortical contributions, and it was unknown whether speech understanding in noise may be shaped by across-channel processing that exists in earlier auditory areas. Using physiologically plausible computational modeling to predict consonant confusions across different listening conditions, we find that across-channel temporal coherence contributes significantly to scene analysis and speech perception and that such processing may arise in the auditory pathway as early as the brainstem. By virtue of providing a richer characterization of error patterns not obtainable with just intelligibility scores, consonant confusions yield unique insight into scene analysis mechanisms.
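One simple way to score how well a model's predicted confusions match behavior is sketched below; the authors' actual evaluation metric is not specified in this abstract, so this is a hedged illustration. Off-diagonal cells are correlated so that overall accuracy does not dominate the error-pattern comparison.

```python
import numpy as np

def confusion_fit(pred_counts, obs_counts):
    """Correlate predicted and observed consonant-confusion matrices.

    Rows index the consonant presented and columns the consonant reported;
    only off-diagonal (confusion) cells enter the correlation.
    """
    pred = np.asarray(pred_counts, dtype=float)
    obs = np.asarray(obs_counts, dtype=float)
    pred /= pred.sum(axis=1, keepdims=True)  # response probabilities per row
    obs /= obs.sum(axis=1, keepdims=True)
    off = ~np.eye(pred.shape[0], dtype=bool)
    return np.corrcoef(pred[off], obs[off])[0, 1]
```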
Affiliation(s)
- Vibha Viswanathan
- Weldon School of Biomedical Engineering, Purdue University, West Lafayette, Indiana 47907
- Michael G Heinz
- Weldon School of Biomedical Engineering, Purdue University, West Lafayette, Indiana 47907
- Department of Speech, Language, and Hearing Sciences, Purdue University, West Lafayette, Indiana 47907
8. Viswanathan V, Shinn-Cunningham BG, Heinz MG. Temporal fine structure influences voicing confusions for consonant identification in multi-talker babble. J Acoust Soc Am 2021; 150:2664. PMID: 34717498; PMCID: PMC8514254; DOI: 10.1121/10.0006527.
Abstract
To understand the mechanisms of speech perception in everyday listening environments, it is important to elucidate the relative contributions of different acoustic cues in transmitting phonetic content. Previous studies suggest that the envelope of speech in different frequency bands conveys most speech content, while the temporal fine structure (TFS) can aid in segregating target speech from background noise. However, the role of TFS in conveying phonetic content beyond what envelopes convey for intact speech in complex acoustic scenes is poorly understood. The present study addressed this question using online psychophysical experiments that measured the identification of consonants in multi-talker babble for intelligibility-matched intact and 64-channel envelope-vocoded stimuli. Consonant confusion patterns revealed that listeners were more biased in the vocoded (versus intact) condition toward reporting that they heard an unvoiced consonant, despite envelope and place cues being largely preserved. This result was replicated when babble instances were varied across independent experiments, suggesting that TFS conveys voicing information beyond what envelopes convey for intact speech in babble. Given that multi-talker babble is a masker that is ubiquitous in everyday environments, this finding has implications for the design of assistive listening devices such as cochlear implants.
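A generic envelope vocoder of the kind described is sketched below under simplifying assumptions; the filter order, band spacing, and carriers are illustrative and may differ from the authors' implementation. Each band's Hilbert envelope is retained while its TFS is replaced by a band-limited noise carrier.

```python
import numpy as np
from scipy.signal import butter, sosfiltfilt, hilbert

def envelope_vocode(x, fs, n_channels=64, f_lo=80.0, f_hi=8000.0, seed=0):
    """Envelope vocoder: keep each band's Hilbert envelope, discard its
    temporal fine structure by re-imposing the envelope on band-limited
    noise. Assumes fs > 2 * f_hi."""
    edges = np.geomspace(f_lo, f_hi, n_channels + 1)  # log-spaced band edges
    rng = np.random.default_rng(seed)
    out = np.zeros(len(x))
    for lo, hi in zip(edges[:-1], edges[1:]):
        sos = butter(4, [lo, hi], btype="bandpass", fs=fs, output="sos")
        env = np.abs(hilbert(sosfiltfilt(sos, x)))               # band envelope
        carrier = sosfiltfilt(sos, rng.standard_normal(len(x)))  # band noise
        carrier /= np.sqrt(np.mean(carrier ** 2)) + 1e-12        # unit RMS
        out += env * carrier
    return out
```

With many narrow channels, envelope and coarse place cues are largely preserved while the original fine structure is discarded, which is the manipulation the abstract contrasts with intact speech.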
Affiliation(s)
- Vibha Viswanathan
- Weldon School of Biomedical Engineering, Purdue University, West Lafayette, Indiana 47907, USA
- Michael G. Heinz
- Department of Speech, Language, and Hearing Sciences, Purdue University, West Lafayette, Indiana 47907, USA