1
Skoe E, Kraus N. Neural Delays in Processing Speech in Background Noise Minimized after Short-Term Auditory Training. Biology 2024; 13:509. PMID: 39056702; PMCID: PMC11273880; DOI: 10.3390/biology13070509.
Abstract
Background noise disrupts the neural processing of sound, resulting in delayed and diminished far-field auditory-evoked responses. In young adults, we previously provided evidence that cognitively based short-term auditory training can ameliorate the impact of background noise on the frequency-following response (FFR), leading to greater neural synchrony to the speech fundamental frequency (F0) in noisy listening conditions. In this same dataset (55 healthy young adults), we now examine whether training-related changes extend to the latency of the FFR, with the prediction of faster neural timing after training. FFRs were measured on two days separated by ~8 weeks. FFRs were elicited by the syllable "da" presented at a +10 dB signal-to-noise ratio (SNR) relative to a background of multi-talker noise. Half of the participants completed 20 sessions of computerized training (Listening and Communication Enhancement Program, LACE) between test sessions, while the other half served as controls. In both groups, half of the participants were non-native speakers of English. In the control group, response latencies were unchanged at retest, but in the training group, response latencies were earlier. Findings suggest that auditory training can improve how the adult nervous system responds in noisy listening conditions, as demonstrated by decreased response latencies.
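The +10 dB SNR condition described above can be simulated by scaling a noise track against a fixed speech signal. A minimal sketch, in which the sinusoid and white noise are illustrative stand-ins for the /da/ token and multi-talker babble (not the study's stimuli):

```python
import numpy as np

def mix_at_snr(signal, noise, snr_db):
    """Scale `noise` so the signal-to-noise power ratio equals `snr_db`."""
    p_signal = np.mean(signal ** 2)
    p_noise = np.mean(noise ** 2)
    # Target noise power: p_signal / p_noise_target = 10^(snr_db / 10)
    p_noise_target = p_signal / (10.0 ** (snr_db / 10.0))
    noise_scaled = noise * np.sqrt(p_noise_target / p_noise)
    return signal + noise_scaled, noise_scaled

rng = np.random.default_rng(0)
fs = 16_000                             # sample rate (Hz)
t = np.arange(fs) / fs                  # 1 s of signal
speech = np.sin(2 * np.pi * 100.0 * t)  # stand-in "speech" with a 100 Hz F0
babble = rng.standard_normal(fs)        # stand-in for multi-talker noise
mixed, babble_scaled = mix_at_snr(speech, babble, snr_db=10.0)

# Verify the achieved SNR of the mixture
achieved_snr_db = 10.0 * np.log10(np.mean(speech ** 2) / np.mean(babble_scaled ** 2))
```

The same scaling logic applies regardless of signal content, since only the power ratio is constrained.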
Affiliation(s)
- Erika Skoe
- Department of Speech, Language, and Hearing Sciences, University of Connecticut, Storrs, CT 06269, USA
- Nina Kraus
- Department of Communication Sciences, Northwestern University, Evanston, IL 60208, USA
- Cognitive Sciences, Institute for Neuroscience, Northwestern University, Evanston, IL 60208, USA
- Department of Neurobiology and Physiology, Northwestern University, Evanston, IL 60208, USA
- Department of Linguistics, Northwestern University, Evanston, IL 60208, USA
- Department of Otolaryngology, Northwestern University, Evanston, IL 60208, USA
2
Bidelman GM, Sisson A, Rizzi R, MacLean J, Baer K. Myogenic artifacts masquerade as neuroplasticity in the auditory frequency-following response. Front Neurosci 2024; 18:1422903. PMID: 39040631; PMCID: PMC11260751; DOI: 10.3389/fnins.2024.1422903.
Abstract
The frequency-following response (FFR) is an evoked potential that provides a neural index of complex sound encoding in the brain. FFRs have been widely used to characterize speech and music processing, experience-dependent neuroplasticity (e.g., learning and musicianship), and biomarkers for hearing and language-based disorders that distort receptive communication abilities. It is widely assumed that FFRs stem from a mixture of phase-locked neurogenic activity from the brainstem and cortical structures along the hearing neuraxis. In this study, we challenge this prevailing view by demonstrating that upwards of ~50% of the FFR can originate from an unexpected myogenic source: contamination from the postauricular muscle (PAM) vestigial startle reflex. We measured PAM, transient auditory brainstem response (ABR), and sustained FFR potentials reflecting myogenic (PAM) and neurogenic (ABR/FFR) responses in young, normal-hearing listeners with varying degrees of musical training. We first establish that PAM artifact is present in all ears, varies with electrode proximity to the muscle, and can be experimentally manipulated by directing listeners' eye gaze toward the ear of sound stimulation. We then show this muscular noise easily confounds auditory FFRs, spuriously amplifying responses 3-4-fold with tandem PAM contraction and even explaining putative FFR enhancements observed in highly skilled musicians. Our findings expose a new and unrecognized myogenic source of the FFR that drives its large inter-subject variability and cast doubt on whether changes in the response typically attributed to neuroplasticity/pathology are solely of brain origin.
Affiliation(s)
- Gavin M. Bidelman
- Department of Speech, Language and Hearing Sciences, Indiana University, Bloomington, IN, United States
- Program in Neuroscience, Indiana University, Bloomington, IN, United States
- Cognitive Science Program, Indiana University, Bloomington, IN, United States
- Alexandria Sisson
- Department of Speech, Language and Hearing Sciences, Indiana University, Bloomington, IN, United States
- Rose Rizzi
- Department of Speech, Language and Hearing Sciences, Indiana University, Bloomington, IN, United States
- Program in Neuroscience, Indiana University, Bloomington, IN, United States
- Jessica MacLean
- Department of Speech, Language and Hearing Sciences, Indiana University, Bloomington, IN, United States
- Program in Neuroscience, Indiana University, Bloomington, IN, United States
- Kaitlin Baer
- School of Communication Sciences and Disorders, University of Memphis, Memphis, TN, United States
- Veterans Affairs Medical Center, Memphis, TN, United States
3
Bidelman G, Sisson A, Rizzi R, MacLean J, Baer K. Myogenic artifacts masquerade as neuroplasticity in the auditory frequency-following response (FFR). bioRxiv 2024: 2023.10.27.564446. PMID: 37961324; PMCID: PMC10634913; DOI: 10.1101/2023.10.27.564446.
Abstract
The frequency-following response (FFR) is an evoked potential that provides a "neural fingerprint" of complex sound encoding in the brain. FFRs have been widely used to characterize speech and music processing, experience-dependent neuroplasticity (e.g., learning, musicianship), and biomarkers for hearing and language-based disorders that distort receptive communication abilities. It is widely assumed FFRs stem from a mixture of phase-locked neurogenic activity from brainstem and cortical structures along the hearing neuraxis. Here, we challenge this prevailing view by demonstrating upwards of ~50% of the FFR can originate from a non-neural source: contamination from the postauricular muscle (PAM) vestigial startle reflex. We first establish PAM artifact is present in all ears, varies with electrode proximity to the muscle, and can be experimentally manipulated by directing listeners' eye gaze toward the ear of sound stimulation. We then show this muscular noise easily confounds auditory FFRs, spuriously amplifying responses 3-4-fold with tandem PAM contraction and even explaining putative FFR enhancements observed in highly skilled musicians. Our findings expose a new and unrecognized myogenic source of the FFR that drives its large inter-subject variability and cast doubt on whether changes in the response typically attributed to neuroplasticity/pathology are solely of brain origin.
4
MacLean J, Stirn J, Sisson A, Bidelman GM. Short- and long-term neuroplasticity interact during the perceptual learning of concurrent speech. Cereb Cortex 2024; 34:bhad543. PMID: 38212291; PMCID: PMC10839853; DOI: 10.1093/cercor/bhad543.
Abstract
Plasticity from auditory experience shapes the brain's encoding and perception of sound. However, whether such long-term plasticity alters the trajectory of short-term plasticity during speech processing has yet to be investigated. Here, we explored the neural mechanisms and interplay between short- and long-term neuroplasticity for rapid auditory perceptual learning of concurrent speech sounds in young, normal-hearing musicians and nonmusicians. Participants learned to identify double-vowel mixtures during ~45-min training sessions recorded simultaneously with high-density electroencephalography (EEG). We analyzed frequency-following responses (FFRs) and event-related potentials (ERPs) to investigate neural correlates of learning at subcortical and cortical levels, respectively. Although both groups showed rapid perceptual learning, musicians showed faster behavioral decisions than nonmusicians overall. Learning-related changes were not apparent in brainstem FFRs. However, plasticity was highly evident in cortex, where ERPs revealed unique hemispheric asymmetries between groups suggestive of different neural strategies (musicians: right hemisphere bias; nonmusicians: left hemisphere). Source reconstruction and the early (150-200 ms) time course of these effects localized learning-induced cortical plasticity to auditory-sensory brain areas. Our findings reinforce the domain-general benefits of musicianship but reveal that successful speech sound learning is driven by a critical interplay between long- and short-term mechanisms of auditory plasticity, which first emerge at a cortical level.
Affiliation(s)
- Jessica MacLean
- Department of Speech, Language and Hearing Sciences, Indiana University, Bloomington, IN, USA
- Program in Neuroscience, Indiana University, Bloomington, IN, USA
- Jack Stirn
- Department of Speech, Language and Hearing Sciences, Indiana University, Bloomington, IN, USA
- Alexandria Sisson
- Department of Speech, Language and Hearing Sciences, Indiana University, Bloomington, IN, USA
- Gavin M Bidelman
- Department of Speech, Language and Hearing Sciences, Indiana University, Bloomington, IN, USA
- Program in Neuroscience, Indiana University, Bloomington, IN, USA
- Cognitive Science Program, Indiana University, Bloomington, IN, USA
5
MacLean J, Stirn J, Sisson A, Bidelman GM. Short- and long-term experience-dependent neuroplasticity interact during the perceptual learning of concurrent speech. bioRxiv 2023: 2023.09.26.559640. PMID: 37808665; PMCID: PMC10557636; DOI: 10.1101/2023.09.26.559640.
Abstract
Plasticity from auditory experiences shapes brain encoding and perception of sound. However, whether such long-term plasticity alters the trajectory of short-term plasticity during speech processing has yet to be investigated. Here, we explored the neural mechanisms and interplay between short- and long-term neuroplasticity for rapid auditory perceptual learning of concurrent speech sounds in young, normal-hearing musicians and nonmusicians. Participants learned to identify double-vowel mixtures during ~45-min training sessions recorded simultaneously with high-density EEG. We analyzed frequency-following responses (FFRs) and event-related potentials (ERPs) to investigate neural correlates of learning at subcortical and cortical levels, respectively. While both groups showed rapid perceptual learning, musicians showed faster behavioral decisions than nonmusicians overall. Learning-related changes were not apparent in brainstem FFRs. However, plasticity was highly evident in cortex, where ERPs revealed unique hemispheric asymmetries between groups suggestive of different neural strategies (musicians: right hemisphere bias; nonmusicians: left hemisphere). Source reconstruction and the early (150-200 ms) time course of these effects localized learning-induced cortical plasticity to auditory-sensory brain areas. Our findings confirm domain-general benefits for musicianship but reveal that successful speech sound learning is driven by a critical interplay between long- and short-term mechanisms of auditory plasticity that first emerge at a cortical level.
6
Xu C, Cheng FY, Medina S, Eng E, Gifford R, Smith S. Objective discrimination of bimodal speech using frequency following responses. Hear Res 2023; 437:108853. PMID: 37441879; DOI: 10.1016/j.heares.2023.108853.
Abstract
Bimodal hearing, in which a contralateral hearing aid is combined with a cochlear implant (CI), provides greater speech recognition benefits than using a CI alone. Factors predicting individual bimodal patient success are not fully understood. Previous studies have shown that bimodal benefits may be driven by a patient's ability to extract fundamental frequency (f0) and/or temporal fine structure cues (e.g., F1). Both of these features may be represented in frequency following responses (FFRs) to bimodal speech. Thus, the goals of this study were to: 1) parametrically examine neural encoding of f0 and F1 in simulated bimodal speech conditions; 2) examine objective discrimination of FFRs to bimodal speech conditions using machine learning; and 3) explore whether FFRs are predictive of perceptual bimodal benefit. Three vowels (/ε/, /i/, and /ʊ/) with identical f0 were manipulated by a vocoder (right ear) and low-pass filters (left ear) to create five bimodal simulations for evoking FFRs: Vocoder-only, Vocoder +125 Hz, Vocoder +250 Hz, Vocoder +500 Hz, and Vocoder +750 Hz. Perceptual performance on the BKB-SIN test was also measured using the same five configurations. Results suggested that neural representation of f0 and F1 FFR components was enhanced with increasing acoustic bandwidth in the simulated "non-implanted" ear. As spectral differences between vowels emerged in the FFRs with increased acoustic bandwidth, FFRs were more accurately classified and discriminated using a machine learning algorithm. Enhancement of f0 and F1 neural encoding with increasing bandwidth was collectively predictive of perceptual bimodal benefit on a speech-in-noise task. Given these results, the FFR may be a useful tool to objectively assess individual variability in bimodal hearing.
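The discrimination pipeline in goal 2, extracting f0/F1 spectral features from FFRs and classifying them, can be sketched on synthetic data. This is an illustrative nearest-centroid stand-in, not the study's algorithm or stimuli; the vowel F1 values, the 300-700 Hz F1 band, and the noise level are assumptions:

```python
import numpy as np

rng = np.random.default_rng(1)
fs, n = 8_000, 8_000                 # 1-s epochs sampled at 8 kHz
t = np.arange(n) / fs
f0 = 100.0                           # shared fundamental across tokens
f1_by_vowel = {"eh": 550.0, "ih": 350.0, "uh": 450.0}   # hypothetical F1 values

def make_ffr(f1, noise_sd=0.5):
    """Synthetic FFR-like epoch: components at f0 and F1 plus EEG-like noise."""
    return (np.sin(2 * np.pi * f0 * t)
            + 0.6 * np.sin(2 * np.pi * f1 * t)
            + noise_sd * rng.standard_normal(n))

def spectral_features(x):
    """Magnitude at f0 plus the 300-700 Hz band covering the F1 region."""
    mags = np.abs(np.fft.rfft(x)) / n
    freqs = np.fft.rfftfreq(n, 1.0 / fs)
    f1_band = (freqs >= 300.0) & (freqs <= 700.0)
    return np.concatenate(([mags[np.argmin(np.abs(freqs - f0))]], mags[f1_band]))

# Train per-vowel spectral templates, then classify new epochs by nearest centroid.
vowels = list(f1_by_vowel)
templates = {v: np.mean([spectral_features(make_ffr(f1_by_vowel[v]))
                         for _ in range(20)], axis=0) for v in vowels}

def classify(epoch):
    feats = spectral_features(epoch)
    return min(vowels, key=lambda v: np.linalg.norm(feats - templates[v]))

accuracy = np.mean([classify(make_ffr(f1_by_vowel[v])) == v
                    for v in vowels for _ in range(10)])
```

As in the study, classification succeeds only to the extent that spectral differences between vowels are actually represented in the responses; shrinking the F1 component or raising the noise floor degrades the accuracy.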
Affiliation(s)
- Can Xu
- Department of Speech, Language, and Hearing Sciences, University of Texas at Austin, 2504A Whitis Ave. (A1100), Austin 78712-0114, TX, USA
- Fan-Yin Cheng
- Department of Speech, Language, and Hearing Sciences, University of Texas at Austin, 2504A Whitis Ave. (A1100), Austin 78712-0114, TX, USA
- Sarah Medina
- Department of Speech, Language, and Hearing Sciences, University of Texas at Austin, 2504A Whitis Ave. (A1100), Austin 78712-0114, TX, USA
- Erica Eng
- Department of Speech, Language, and Hearing Sciences, University of Texas at Austin, 2504A Whitis Ave. (A1100), Austin 78712-0114, TX, USA
- René Gifford
- Department of Speech, Language, and Hearing Sciences, Vanderbilt University Medical Center, Nashville, TN, USA
- Spencer Smith
- Department of Speech, Language, and Hearing Sciences, University of Texas at Austin, 2504A Whitis Ave. (A1100), Austin 78712-0114, TX, USA
7
Rizzi R, Bidelman GM. Duplex perception reveals brainstem auditory representations are modulated by listeners' ongoing percept for speech. Cereb Cortex 2023; 33:10076-10086. PMID: 37522248; PMCID: PMC10502779; DOI: 10.1093/cercor/bhad266.
Abstract
So-called duplex speech stimuli, with perceptually ambiguous spectral cues presented to one ear and an isolated low- versus high-frequency third-formant "chirp" presented to the opposite ear, yield a coherent percept supporting their phonetic categorization. Critically, such dichotic sounds are only perceived categorically upon binaural integration. Here, we used frequency-following responses (FFRs), scalp-recorded potentials reflecting phase-locked subcortical activity, to investigate brainstem responses to fused speech percepts and to determine whether FFRs reflect binaurally integrated category-level representations. We recorded FFRs to diotic and dichotic stop-consonants (/da/, /ga/) that either did or did not require binaural fusion to properly label, along with perceptually ambiguous sounds without clear phonetic identity. Behaviorally, listeners showed clear categorization of dichotic speech tokens, confirming they were heard with a fused, phonetic percept. Neurally, we found FFRs were stronger for categorically perceived speech relative to category-ambiguous tokens but also differentiated phonetic categories for both diotically and dichotically presented speech sounds. Correlations between neural and behavioral data further showed FFR latency predicted the degree to which listeners labeled tokens as "da" versus "ga." The presence of binaurally integrated, category-level information in FFRs suggests human brainstem processing reflects a surprisingly abstract level of the speech code typically circumscribed to much later cortical processing.
Affiliation(s)
- Rose Rizzi
- Department of Speech, Language, and Hearing Sciences, Indiana University, Bloomington, IN, United States
- Program in Neuroscience, Indiana University, Bloomington, IN, United States
- School of Communication Sciences and Disorders, University of Memphis, Memphis, TN, United States
- Gavin M Bidelman
- Department of Speech, Language, and Hearing Sciences, Indiana University, Bloomington, IN, United States
- Program in Neuroscience, Indiana University, Bloomington, IN, United States
- Cognitive Science Program, Indiana University, Bloomington, IN, United States
8
Rizzi R, Bidelman GM. Duplex perception reveals brainstem auditory representations are modulated by listeners' ongoing percept for speech. bioRxiv 2023: 2023.05.09.540018. PMID: 37214801; PMCID: PMC10197666; DOI: 10.1101/2023.05.09.540018.
Abstract
So-called duplex speech stimuli, with perceptually ambiguous spectral cues presented to one ear and an isolated low- vs. high-frequency third-formant "chirp" presented to the opposite ear, yield a coherent percept supporting their phonetic categorization. Critically, such dichotic sounds are only perceived categorically upon binaural integration. Here, we used frequency-following responses (FFRs), scalp-recorded potentials reflecting phase-locked subcortical activity, to investigate brainstem responses to fused speech percepts and to determine whether FFRs reflect binaurally integrated category-level representations. We recorded FFRs to diotic and dichotic stop-consonants (/da/, /ga/) that either did or did not require binaural fusion to properly label, along with perceptually ambiguous sounds without clear phonetic identity. Behaviorally, listeners showed clear categorization of dichotic speech tokens, confirming they were heard with a fused, phonetic percept. Neurally, we found FFRs were stronger for categorically perceived speech relative to category-ambiguous tokens but also differentiated phonetic categories for both diotically and dichotically presented speech sounds. Correlations between neural and behavioral data further showed FFR latency predicted the degree to which listeners labeled tokens as "da" vs. "ga". The presence of binaurally integrated, category-level information in FFRs suggests human brainstem processing reflects a surprisingly abstract level of the speech code typically circumscribed to much later cortical processing.
Affiliation(s)
- Rose Rizzi
- Department of Speech, Language, and Hearing Sciences, Indiana University, Bloomington, IN, USA
- Program in Neuroscience, Indiana University, Bloomington, IN, USA
- School of Communication Sciences and Disorders, University of Memphis, Memphis, TN, USA
- Gavin M. Bidelman
- Department of Speech, Language, and Hearing Sciences, Indiana University, Bloomington, IN, USA
- Program in Neuroscience, Indiana University, Bloomington, IN, USA
- Cognitive Science Program, Indiana University, Bloomington, IN, USA
- School of Communication Sciences and Disorders, University of Memphis, Memphis, TN, USA
9
Carter JA, Bidelman GM. Perceptual warping exposes categorical representations for speech in human brainstem responses. Neuroimage 2023; 269:119899. PMID: 36720437; PMCID: PMC9992300; DOI: 10.1016/j.neuroimage.2023.119899.
Abstract
The brain transforms continuous acoustic events into discrete category representations to downsample the speech signal for our perceptual-cognitive systems. Such phonetic categories are highly malleable, and their percepts can change depending on surrounding stimulus context. Previous work suggests this acoustic-phonetic mapping and perceptual warping of speech emerge in the brain no earlier than auditory cortex. Here, we examined whether these auditory-category phenomena inherent to speech perception occur even earlier in the human brain, at the level of the auditory brainstem. We recorded speech-evoked frequency following responses (FFRs) during a task designed to induce more/less warping of listeners' perceptual categories depending on the stimulus presentation order of a speech continuum (random, forward, backward directions). We used a novel clustered stimulus paradigm to rapidly record the high trial counts needed for FFRs concurrent with active behavioral tasks. We found serial stimulus order caused perceptual shifts (hysteresis) near listeners' category boundary, confirming identical speech tokens are perceived differentially depending on stimulus context. Critically, we further show neural FFRs during active (but not passive) listening are enhanced for prototypical vs. category-ambiguous tokens and are biased in the direction of listeners' phonetic label even for acoustically identical speech stimuli. These findings were observed neither in the stimulus acoustics nor in model FFRs generated via a computational model of cochlear and auditory nerve transduction, confirming a central origin to the effects. Our data reveal FFRs carry category-level information and suggest top-down processing actively shapes the neural encoding and categorization of speech at subcortical levels.
These findings suggest the acoustic-phonetic mapping and perceptual warping in speech perception occur surprisingly early along the auditory neuraxis, which might aid understanding by reducing the ambiguity inherent to the speech signal.
Affiliation(s)
- Jared A Carter
- Institute for Intelligent Systems, University of Memphis, Memphis, TN, USA
- School of Communication Sciences and Disorders, University of Memphis, Memphis, TN, USA
- Division of Clinical Neuroscience, School of Medicine, Hearing Sciences - Scottish Section, University of Nottingham, Glasgow, Scotland, UK
- Gavin M Bidelman
- Department of Speech, Language and Hearing Sciences, Indiana University, Bloomington, IN, USA
- Program in Neuroscience, Indiana University, Bloomington, IN, USA
10
Suresh CH, Krishnan A. Frequency-Following Response to Steady-State Vowel in Quiet and Background Noise Among Marching Band Participants With Normal Hearing. Am J Audiol 2022; 31:719-736. PMID: 35944059; DOI: 10.1044/2022_aja-21-00226.
Abstract
OBJECTIVE: Human studies enrolling individuals at high risk for cochlear synaptopathy (CS) have reported difficulties in speech perception in adverse listening conditions. The aim of this study was to determine whether these individuals show degraded neural encoding of speech in quiet and in the presence of background noise, as reflected in neural phase-locking to both envelope periodicity and temporal fine structure (TFS). To our knowledge, there are no published reports that have specifically examined the neural encoding of both envelope periodicity and TFS of speech stimuli (in quiet and in adverse listening conditions) in a sample with a history of loud-sound exposure who are at risk for CS. METHOD: Using the scalp-recorded frequency-following response (FFR), the authors evaluated the neural encoding of envelope periodicity (FFRENV) and TFS (FFRTFS) for a steady-state vowel (English back vowel /u/) in quiet and in the presence of speech-shaped noise presented at +5 and 0 dB SNR. Participants were young, normal-hearing individuals who had either participated in a marching band for at least 5 years (high-risk group) or had a low-noise-exposure history (low-risk group). RESULTS: There were no group differences in the neural encoding of either the FFRENV or the first formant (F1) in the FFRTFS, in quiet or in noise. Paradoxically, the high-risk group demonstrated enhanced representation of F2 harmonics across all stimulus conditions. CONCLUSIONS: These results appear to be in line with a music experience-dependent enhancement of F2 harmonics. However, given the sound overexposure in the high-risk group, the role of homeostatic central compensation cannot be ruled out. A larger-scale data set with varied noise exposure backgrounds, along with longitudinal measurements using an array of behavioral and electrophysiological tests, is needed to disentangle the complex interaction between central compensatory gain and experience-dependent enhancement.
Affiliation(s)
- Chandan H Suresh
- Department of Communication Disorders, California State University, Los Angeles
11
Easwar V, Purcell D, Van Eeckhoutte M, Aiken SJ. The Influence of Male- and Female-Spoken Vowel Acoustics on Envelope-Following Responses. Semin Hear 2022; 43:223-239. PMID: 36313043; PMCID: PMC9605803; DOI: 10.1055/s-0042-1756165.
Abstract
The influence of male and female vowel characteristics on envelope-following responses (EFRs) is not well understood. This study explored the role of vowel characteristics on the EFR at the fundamental frequency (f0) in response to the vowel /ε/ (as in "head"). Vowel tokens were spoken by five males and five females, and EFRs were measured in 25 young adults (21 females). An auditory model was used to estimate changes in auditory processing that might account for talker effects on EFR amplitude. There were several differences between male and female vowels in relation to the EFR. For male talkers, EFR amplitudes were correlated with the bandwidth and harmonic count of the first formant, and with the amplitude of the trough below the second formant. For female talkers, EFR amplitudes were correlated with the range of f0 frequencies and the amplitude of the trough above the second formant. The model suggested that the f0 EFR reflects a wide distribution of energy in speech, with primary contributions from high-frequency harmonics originating from cochlear regions basal to the peaks of the first and second formants, not from low-frequency harmonics with energy near f0. Vowels produced by female talkers tend to elicit lower-amplitude EFRs, likely because the responses depend on higher-frequency harmonics, where speech sound levels tend to be lower. This work advances auditory electrophysiology by showing how the EFR evoked by speech relates to the acoustics of speech, for both male and female voices.
Affiliation(s)
- Vijayalakshmi Easwar
- Department of Communication Sciences and Disorders & Waisman Center, University of Wisconsin, Madison
- Department of Communication Sciences, National Acoustic Laboratories, Sydney, Australia
- David Purcell
- National Center for Audiology, School of Communication Sciences and Disorders, Western University, London, Canada
- Maaike Van Eeckhoutte
- Division of Hearing Systems, Department of Health Technology, Technical University of Denmark, Kongens Lyngby, Denmark
- Copenhagen Hearing and Balance Centre - Ear, Nose, Throat and Audiology Clinic, Rigshospitalet, Copenhagen University Hospital, Copenhagen, Denmark
- National Center for Audiology, Western University, London, Canada
- Steven J. Aiken
- School of Communication Sciences and Disorders, Departments of Surgery and Psychology and Neuroscience, Dalhousie University, Halifax, Canada
12
Easwar V, Chung L. The influence of phoneme contexts on adaptation in vowel-evoked envelope following responses. Eur J Neurosci 2022; 56:4572-4582. PMID: 35804282; PMCID: PMC9543495; DOI: 10.1111/ejn.15768.
Abstract
Repeated stimulus presentation leads to neural adaptation and a consequent amplitude reduction in vowel-evoked envelope following responses (EFRs), a response that reflects neural activity phase-locked to envelope periodicity. EFRs are elicited by vowels presented in isolation or in the context of other phonemes, such as in syllables. While context phonemes could exert some forward influence on vowel-evoked EFRs, they may reduce the degree of adaptation. Here, we evaluated whether the properties of context phonemes between consecutive vowel stimuli influence adaptation. EFRs were elicited by the low-frequency first formant (resolved harmonics) and mid-to-high-frequency second and higher formants (unresolved harmonics) of a male-spoken /i/ when the presence, number, and predictability of context phonemes (/s/, /a/, /ʃ/, /u/) between vowel repetitions varied. Monitored over four iterations of /i/, adaptation was evident only for EFRs elicited by the unresolved harmonics. EFRs elicited by the unresolved harmonics decreased in amplitude by ~16-20 nV (10-17%) after the first presentation of /i/ and remained stable thereafter. EFR adaptation was reduced by the presence of a context phoneme, but the reduction did not change with their number or predictability. The presence of a context phoneme, however, attenuated EFRs by a degree similar to that caused by adaptation (~21-23 nV). Such a trade-off in the short- and long-term influence of context phonemes suggests that the benefit of interleaving EFR-eliciting vowels with other context phonemes depends on whether the use of consonant-vowel syllables is critical to improve the validity of EFR applications.
Affiliation(s)
- Vijayalakshmi Easwar
- Department of Communication Sciences & Disorders, University of Wisconsin-Madison, Madison, USA; Waisman Center, University of Wisconsin-Madison, Madison, USA
- Lauren Chung
- Department of Communication Sciences & Disorders, University of Wisconsin-Madison, Madison, USA; Waisman Center, University of Wisconsin-Madison, Madison, USA
13
Chauvette L, Fournier P, Sharp A. The frequency-following response to assess the neural representation of spectral speech cues in older adults. Hear Res 2022; 418:108486. [DOI: 10.1016/j.heares.2022.108486] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Received: 10/01/2021] [Revised: 03/12/2022] [Accepted: 03/15/2022] [Indexed: 11/04/2022]
14
Bachmann FL, MacDonald EN, Hjortkjær J. Neural Measures of Pitch Processing in EEG Responses to Running Speech. Front Neurosci 2022; 15:738408. [PMID: 35002597 PMCID: PMC8729880 DOI: 10.3389/fnins.2021.738408] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/08/2021] [Accepted: 11/01/2021] [Indexed: 11/13/2022] Open
Abstract
Linearized encoding models are increasingly employed to model cortical responses to running speech. Recent extensions to subcortical responses suggest clinical potential, complementing auditory brainstem responses (ABRs) and frequency-following responses (FFRs), the current clinical standards. However, while the auditory brainstem is known to respond both to transient amplitude variations and to the stimulus periodicity that gives rise to pitch, these features co-vary in running speech. Here, we discuss challenges in disentangling the features that drive the subcortical response to running speech. Cortical and subcortical electroencephalographic (EEG) responses to running speech from 19 normal-hearing listeners (12 female) were analyzed. Using forward regression models, we confirm that responses to the rectified broadband speech signal yield temporal response functions consistent with wave V of the ABR, as shown in previous work. Peak latency and amplitude of the speech-evoked brainstem response were correlated with standard click-evoked ABRs recorded at the vertex electrode (Cz). Similar responses could be obtained using the fundamental frequency (F0) of the speech signal as the model predictor. However, simulations indicated that dissociating responses to temporal fine structure at the F0 from responses to broadband amplitude variations is not possible, given the high covariance of the features and the poor signal-to-noise ratio (SNR) of subcortical EEG responses. In cortex, both simulations and data replicated previous findings that envelope tracking on frontal electrodes can be dissociated from responses to slow variations in F0 (relative pitch). Yet no association between subcortical F0 tracking and cortical responses to relative pitch could be detected. These results indicate that while subcortical speech responses are comparable to click-evoked ABRs, dissociating pitch-related processing in the auditory brainstem may be challenging with natural speech stimuli.
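The forward-regression (temporal response function) approach described in this abstract can be sketched in a few lines. This is an illustrative toy, not the authors' pipeline: a known kernel is convolved with a random "stimulus" feature and then recovered by time-lagged ridge regression, with lag count, ridge strength, and noise level chosen arbitrarily for the demonstration.

```python
import numpy as np

def lagged_matrix(x, n_lags):
    """Design matrix whose columns are time-lagged copies of stimulus x."""
    X = np.zeros((len(x), n_lags))
    for lag in range(n_lags):
        X[lag:, lag] = x[:len(x) - lag]
    return X

def fit_trf(x, y, n_lags, ridge=1.0):
    """Estimate a temporal response function by ridge regression."""
    X = lagged_matrix(x, n_lags)
    return np.linalg.solve(X.T @ X + ridge * np.eye(n_lags), X.T @ y)

rng = np.random.default_rng(0)
x = rng.standard_normal(5000)           # broadband "stimulus" feature
kernel = np.exp(-np.arange(20) / 5.0)   # true (hypothetical) response function
y = np.convolve(x, kernel)[:len(x)] + 0.1 * rng.standard_normal(5000)
trf = fit_trf(x, y, n_lags=20)          # recovered kernel approximates the truth
```

With enough data relative to the noise, the estimated TRF closely matches the generating kernel; the same machinery applies whether the predictor is the rectified broadband signal or an F0 feature.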
Affiliation(s)
- Florine L Bachmann
- Hearing Systems Section, Department of Health Technology, Technical University of Denmark, Lyngby, Denmark
- Ewen N MacDonald
- Department of Systems Design Engineering, University of Waterloo, Waterloo, ON, Canada
- Jens Hjortkjær
- Hearing Systems Section, Department of Health Technology, Technical University of Denmark, Lyngby, Denmark; Danish Research Centre for Magnetic Resonance, Centre for Functional and Diagnostic Imaging and Research, Copenhagen University Hospital - Amager and Hvidovre, Copenhagen, Denmark
15
Cheng FY, Xu C, Gold L, Smith S. Rapid Enhancement of Subcortical Neural Responses to Sine-Wave Speech. Front Neurosci 2022; 15:747303. [PMID: 34987356 PMCID: PMC8721138 DOI: 10.3389/fnins.2021.747303] [Citation(s) in RCA: 10] [Impact Index Per Article: 5.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/26/2021] [Accepted: 12/02/2021] [Indexed: 01/15/2023] Open
Abstract
The efferent auditory nervous system may be a potent force in shaping how the brain responds to behaviorally significant sounds. Previous human experiments using the frequency following response (FFR) have shown efferent-induced modulation of subcortical auditory function online and over short- and long-term time scales; however, a contemporary understanding of FFR generation raises new questions about whether previous effects were constrained solely to the auditory subcortex. The present experiment used sine-wave speech (SWS), an acoustically sparse stimulus in which dynamic pure tones represent speech formant contours, to evoke FFRSWS. Because of the higher stimulus frequencies used in SWS, this approach biased neural responses toward brainstem generators and allowed three stimuli (/bɔ/, /bu/, and /bo/) to be used to evoke FFRSWS before and after listeners in a training group were made aware that they were hearing a degraded speech stimulus. All SWS stimuli were rapidly perceived as speech when presented with an SWS carrier phrase, and average token identification reached ceiling performance during a perceptual training phase. Compared to a control group that remained naïve throughout the experiment, training group FFRSWS amplitudes were enhanced post-training for each stimulus. Further, linear support vector machine classification of training group FFRSWS improved significantly post-training compared to the control group, indicating that training-induced neural enhancements were sufficient to bolster machine-learning classification accuracy. These results suggest that the efferent auditory system may rapidly modulate auditory brainstem representation of sounds depending on their context and their perception as speech or non-speech.
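The waveform-classification idea used in this study can be illustrated with simulated data. The sketch below substitutes a nearest-template (correlation) classifier for the paper's linear SVM, and the "FFR" trials are simply noisy sinusoids at invented tone frequencies; every number here is a hypothetical stand-in for the real recordings.

```python
import numpy as np

rng = np.random.default_rng(1)
fs, dur = 2000, 0.2
t = np.arange(int(fs * dur)) / fs
tones = [300.0, 450.0, 600.0]   # hypothetical SWS "formant" frequencies

def simulate_ffr(freq, n_trials, noise=1.0):
    """Noisy phase-locked responses to a tonal stimulus."""
    return np.sin(2 * np.pi * freq * t) + noise * rng.standard_normal((n_trials, len(t)))

X_train = np.vstack([simulate_ffr(f, 50) for f in tones])
y_train = np.repeat(np.arange(3), 50)
X_test = np.vstack([simulate_ffr(f, 20) for f in tones])
y_test = np.repeat(np.arange(3), 20)

# Class templates = mean training waveform per stimulus; classify test
# trials by which template they project onto most strongly.
templates = np.vstack([X_train[y_train == k].mean(axis=0) for k in range(3)])
pred = np.argmax(X_test @ templates.T, axis=1)
accuracy = (pred == y_test).mean()
```

Stronger phase locking (less trial-to-trial noise) raises classification accuracy, which is the logic behind using classifier performance as an index of training-related neural enhancement.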
Affiliation(s)
- Fan-Yin Cheng
- Department of Speech, Language, and Hearing Sciences, University of Texas at Austin, Austin, TX, United States
- Can Xu
- Department of Speech, Language, and Hearing Sciences, University of Texas at Austin, Austin, TX, United States
- Lisa Gold
- Department of Speech, Language, and Hearing Sciences, University of Texas at Austin, Austin, TX, United States
- Spencer Smith
- Department of Speech, Language, and Hearing Sciences, University of Texas at Austin, Austin, TX, United States
16
Abstract
OBJECTIVES To evaluate sensation level (SL)-dependent characteristics of envelope following responses (EFRs) elicited by band-limited speech dominant in low, mid, and high frequencies. DESIGN In 21 young normal-hearing adults, EFRs were elicited by eight male-spoken speech stimuli: the first formant, and the second and higher formants, of /u/, /a/, and /i/, plus the modulated fricatives /∫/ and /s/. Stimulus SL was computed from behaviorally measured thresholds. RESULTS At 30 dB SL, the amplitude and phase coherence of fricative-elicited EFRs were ~1.5 to 2 times higher than those of all vowel-elicited EFRs, whereas fewer and smaller differences were found among vowel-elicited EFRs. For all stimuli, EFR amplitude and phase coherence increased by roughly 50% for every 10 dB increase in SL between ~0 and 50 dB. CONCLUSIONS Stimulus and frequency dependency in EFRs exist despite accounting for differences in the audibility of speech sounds. The growth rate of EFR characteristics with SL is independent of the stimulus and its frequency.
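Phase coherence, one of the two EFR metrics reported here, is typically computed as the magnitude of the mean unit phasor at the stimulus frequency across trials. A minimal sketch with simulated epochs (sampling rate, frequency, trial count, and noise level are all invented for illustration):

```python
import numpy as np

def phase_coherence(trials, fs, f0):
    """Inter-trial phase coherence at f0: magnitude of the mean unit
    phasor across trials (0 = random phase, 1 = perfect phase locking)."""
    spectra = np.fft.rfft(trials, axis=1)
    k = int(round(f0 * trials.shape[1] / fs))   # FFT bin of f0
    phasors = spectra[:, k] / np.abs(spectra[:, k])
    return float(np.abs(phasors.mean()))

rng = np.random.default_rng(2)
fs, f0, n_trials = 1000, 100.0, 200
t = np.arange(fs) / fs                                  # 1-s epochs
locked = np.sin(2 * np.pi * f0 * t) + 0.5 * rng.standard_normal((n_trials, fs))
unlocked = 0.5 * rng.standard_normal((n_trials, fs))    # noise only
pc_locked = phase_coherence(locked, fs, f0)
pc_unlocked = phase_coherence(unlocked, fs, f0)
```

Unlike raw amplitude, this measure is bounded between 0 and 1, which is one reason amplitude and phase coherence can show parallel but not identical growth with sensation level.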
17
Shukla B, Bidelman GM. Enhanced brainstem phase-locking in low-level noise reveals stochastic resonance in the frequency-following response (FFR). Brain Res 2021; 1771:147643. [PMID: 34473999 PMCID: PMC8490316 DOI: 10.1016/j.brainres.2021.147643] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/14/2021] [Revised: 08/23/2021] [Accepted: 08/28/2021] [Indexed: 11/29/2022]
Abstract
In nonlinear systems, the inclusion of low-level noise can paradoxically improve signal detection, a phenomenon known as stochastic resonance (SR). SR has been observed in human hearing, whereby sensory thresholds (e.g., signal detection and discrimination) are enhanced in the presence of noise. Here, we asked whether subcortical auditory processing (neural phase locking) shows evidence of SR. We recorded brainstem frequency-following responses (FFRs) in young, normal-hearing listeners to near-electrophysiological-threshold (40 dB SPL) complex tones composed of 10 iso-amplitude harmonics of a 150 Hz fundamental frequency (F0), presented concurrently with low-level noise (+20 to -20 dB SNR). Though effects were variable and weak across ears, some listeners showed improvement in auditory detection thresholds with subthreshold noise, confirming SR psychophysically. At the neural level, low-level FFRs were initially eradicated by noise (the expected masking effect) but were surprisingly reinvigorated at select masker levels (local maximum near ~35 dB SPL). These data suggest that brainstem phase locking to near-threshold periodic stimuli is enhanced at optimal levels of noise, the hallmark of SR. Our findings provide novel evidence for stochastic resonance in the human auditory brainstem and suggest that, under some circumstances, noise can actually benefit both the behavioral and neural encoding of complex sounds.
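The SR signature, a non-monotonic benefit that peaks at an intermediate noise level, is easy to reproduce in the classic threshold-detector toy model. This sketch is a generic SR demonstration, not the authors' neural model; the subthreshold signal, threshold, and noise levels are arbitrary choices.

```python
import numpy as np

rng = np.random.default_rng(3)
t = np.linspace(0, 1, 5000)
signal = 0.5 * np.sin(2 * np.pi * 10 * t)   # peak 0.5: subthreshold
threshold = 1.0                              # detector never fires on signal alone

def transmission(noise_sd, n_runs=20):
    """Mean correlation between the signal and the binary threshold-crossing
    output, averaged across independent noise realizations."""
    corrs = []
    for _ in range(n_runs):
        out = (signal + noise_sd * rng.standard_normal(t.size) > threshold).astype(float)
        # No crossings at all -> no information transmitted.
        corrs.append(np.corrcoef(out, signal)[0, 1] if out.any() else 0.0)
    return float(np.mean(corrs))

low, mid, high = (transmission(s) for s in (0.1, 0.4, 3.0))  # low < mid > high
```

Too little noise and the detector never fires; too much and firing is indiscriminate; an intermediate level lets noise lift the subthreshold signal over threshold preferentially at its peaks, which is the inverted-U the FFR data echo.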
Affiliation(s)
- Bhanu Shukla
- School of Communication Sciences & Disorders, University of Memphis, Memphis, TN, USA; Institute for Intelligent Systems, University of Memphis, Memphis, TN, USA
- Gavin M Bidelman
- School of Communication Sciences & Disorders, University of Memphis, Memphis, TN, USA; Institute for Intelligent Systems, University of Memphis, Memphis, TN, USA; University of Tennessee Health Sciences Center, Department of Anatomy and Neurobiology, Memphis, TN, USA.
18
Easwar V, Boothalingam S, Flaherty R. Fundamental frequency-dependent changes in vowel-evoked envelope following responses. Hear Res 2021; 408:108297. [PMID: 34229221 DOI: 10.1016/j.heares.2021.108297] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Received: 01/23/2021] [Revised: 06/03/2021] [Accepted: 06/09/2021] [Indexed: 10/21/2022]
Abstract
Scalp-recorded envelope following responses (EFRs) provide a non-invasive method to assess the encoding of the fundamental frequency (f0) of voice that is important for speech understanding. It is well-known that EFRs are influenced by voice f0. However, this effect of f0 has not been examined independent of concomitant changes in spectra or neural generators. We evaluated the effect of voice f0 on EFRs while controlling for vowel formant characteristics and potentially avoiding significant changes in dominant neural generators using a small f0 range. EFRs were elicited by a male-spoken vowel /u/ (average f0 = 100.4 Hz) and its lowered f0 version (average f0 = 91.9 Hz) with closely matched formant characteristics. Vowels were presented to each ear of 17 young adults with normal hearing. EFRs were simultaneously recorded between the vertex and the nape, and the vertex and the ipsilateral mastoid-the two most common electrode montages used for EFRs. Our results indicate that when vowel formant characteristics are matched, an increase in f0 by 8.5 Hz reduces EFR amplitude by 25 nV, phase coherence by 0.05 and signal-to-noise ratio by 3.5 dB, on average. The reduction in EFR characteristics was similar across ears of stimulation and the two montages used. These findings will help parse the influence of f0 or stimulus spectra on EFRs when both co-vary.
Affiliation(s)
- Vijayalakshmi Easwar
- Department of Communication Sciences and Disorders, University of Wisconsin-Madison, United States; Waisman Center, University of Wisconsin-Madison, United States
- Sriram Boothalingam
- Department of Communication Sciences and Disorders, University of Wisconsin-Madison, United States; Waisman Center, University of Wisconsin-Madison, United States
- Regan Flaherty
- Department of Communication Sciences and Disorders, University of Wisconsin-Madison, United States; Waisman Center, University of Wisconsin-Madison, United States
19
Van Canneyt J, Wouters J, Francart T. Neural tracking of the fundamental frequency of the voice: The effect of voice characteristics. Eur J Neurosci 2021; 53:3640-3653. [DOI: 10.1111/ejn.15229] [Citation(s) in RCA: 10] [Impact Index Per Article: 3.3] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/18/2021] [Revised: 03/24/2021] [Accepted: 04/08/2021] [Indexed: 11/26/2022]
Affiliation(s)
- Jan Wouters
- ExpORL, Department of Neurosciences, KU Leuven, Leuven, Belgium
- Tom Francart
- ExpORL, Department of Neurosciences, KU Leuven, Leuven, Belgium
20
Defining the Role of Attention in Hierarchical Auditory Processing. Audiol Res 2021; 11:112-128. [PMID: 33805600 PMCID: PMC8006147 DOI: 10.3390/audiolres11010012] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/05/2021] [Revised: 03/07/2021] [Accepted: 03/10/2021] [Indexed: 01/09/2023] Open
Abstract
Communication in noise is a complex process requiring efficient neural encoding throughout the entire auditory pathway as well as contributions from higher-order cognitive processes (i.e., attention) to extract speech cues for perception. Thus, identifying effective clinical interventions for individuals with speech-in-noise deficits relies on the disentanglement of bottom-up (sensory) and top-down (cognitive) factors to appropriately determine the area of deficit; yet, how attention may interact with early encoding of sensory inputs remains unclear. For decades, attentional theorists have attempted to address this question with cleverly designed behavioral studies, but the neural processes and interactions underlying attention's role in speech perception remain unresolved. While anatomical and electrophysiological studies have investigated the neurological structures contributing to attentional processes and revealed relevant brain-behavior relationships, recent electrophysiological techniques (i.e., simultaneous recording of brainstem and cortical responses) may provide novel insight regarding the relationship between early sensory processing and top-down attentional influences. In this article, we review relevant theories that guide our present understanding of attentional processes, discuss current electrophysiological evidence of attentional involvement in auditory processing across subcortical and cortical levels, and propose areas for future study that will inform the development of more targeted and effective clinical interventions for individuals with speech-in-noise deficits.
21
The Influence of Vowel Identity, Vowel Production Variability, and Consonant Environment on Envelope Following Responses. Ear Hear 2021; 42:662-672. [PMID: 33577218 DOI: 10.1097/aud.0000000000000966] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Reference Citation Analysis] [Abstract] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/25/2022]
Abstract
OBJECTIVES The vowel-evoked envelope following response (EFR) is a useful tool for studying brainstem processing of speech in natural consonant-vowel productions. Previous work, however, demonstrates that the amplitude of EFRs is highly variable across vowels. To clarify factors contributing to this variability, the objectives of the present study were to evaluate: (1) the influence of vowel identity and the consonant context surrounding each vowel on EFR amplitude and (2) the effect of variations in repeated productions of a vowel on EFR amplitude while controlling for the consonant context. DESIGN In Experiment 1, EFRs were recorded in response to seven English vowels (/ij/, /ɪ/, /ej/, /ε/, /æ/, /u/, and /[IPA symbol]/) embedded in each of four consonant contexts (/hVd/, /sVt/, /zVf/, and /[IPA symbol]Vv/). In Experiment 2, EFRs were recorded in response to four different variants of one of four possible vowels (/ij/, /ε/, /æ/, or /[IPA symbol]/), embedded in the same consonant-vowel-consonant environments used in Experiment 1. All vowels were edited to minimize formant transitions before embedding in a consonant context. Different talkers were used for the two experiments. Data from a total of 30 and 64 (16 listeners/vowel) young adults with normal hearing were included in Experiments 1 and 2, respectively. EFRs were recorded using a single-channel electrode montage between the vertex and the nape of the neck while stimuli were presented monaurally. RESULTS In Experiment 1, vowel identity had a significant effect on EFR amplitude, with the vowel /æ/ eliciting the highest-amplitude EFRs (170 nV, on average) and the vowel /ej/ eliciting the lowest-amplitude EFRs (106 nV, on average). The consonant context surrounding each vowel stimulus had no statistically significant effect on EFR amplitude. Similarly, in Experiment 2, consonant context did not influence the amplitude of EFRs elicited by the vowel variants. Vowel identity significantly altered EFR amplitude, with /ε/ eliciting the highest-amplitude EFRs (104 nV, on average). Significant, albeit small, differences (<21 nV, on average) in EFR amplitude were evident between some variants of /ε/ and /u/. CONCLUSION Based on a comprehensive set of naturally produced vowel samples in carefully controlled consonant contexts, the present study provides additional evidence for the sensitivity of EFRs to vowel identity and variations in vowel production. The surrounding consonant context (after removal of formant transitions) has no measurable effect on EFRs, irrespective of vowel identity and variant. The sensitivity of EFRs to nuances in vowel acoustics emphasizes the need for adequate control and evaluation of stimuli proposed for clinical and research purposes.
22
Speech frequency-following response in human auditory cortex is more than a simple tracking. Neuroimage 2020; 226:117545. [PMID: 33186711 DOI: 10.1016/j.neuroimage.2020.117545] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/27/2020] [Revised: 10/29/2020] [Accepted: 11/02/2020] [Indexed: 11/20/2022] Open
Abstract
The human auditory cortex has recently been found to contribute to the frequency following response (FFR), and the cortical component has been shown to be more relevant to speech perception. However, it is not clear how the cortical FFR may contribute to the processing of speech fundamental frequency (F0) and dynamic pitch. Using intracranial EEG recordings, we observed a significant FFR at the fundamental frequency (F0) for both speech and speech-like harmonic complex stimuli in the human auditory cortex, even in the missing-fundamental condition. Both the spectral amplitude and phase coherence of the cortical FFR showed a significant harmonic preference, and both attenuated from the primary auditory cortex to the surrounding associative auditory cortex. The phase coherence of the speech FFR was significantly higher than that of the harmonic complex stimuli, especially in the left hemisphere, showing a high timing fidelity of the cortical FFR in tracking dynamic F0 in speech. Spectrally, the frequency band of the cortical FFR largely overlapped with the range of the human vocal pitch. Taken together, our study parses the intrinsic properties of the cortical FFR and reveals a preference for speech-like sounds, supporting its potential role in processing speech intonation and lexical tones.
23
Brainstem correlates of cochlear nonlinearity measured via the scalp-recorded frequency-following response. Neuroreport 2020; 31:702-707. [PMID: 32453027 DOI: 10.1097/wnr.0000000000001452] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/26/2022]
Abstract
The frequency-following response (FFR) is an EEG-based potential used to characterize the brainstem encoding of complex sounds. Adopting techniques from auditory signal processing, we assessed the degree to which FFRs encode important properties of cochlear processing (e.g., nonlinearities) and their relation to speech-in-noise (SIN) listening skills. Based on the premise that normal cochlear transduction is characterized by rectification and compression, we reasoned that these nonlinearities would create measurable harmonic distortion in FFRs even in response to pure-tone input. We recorded FFRs to nonspeech stimuli (pure and amplitude-modulated tones) in normal-hearing individuals. We then compared conventional indices of cochlear nonlinearity, via distortion product otoacoustic emission (DPOAE) I/O functions, to the total harmonic distortion measured from neural FFRs (FFRTHD). Analysis of DPOAE growth and the FFRTHD revealed that listeners with higher cochlear compression thresholds had lower FFRTHD (i.e., more linear FFRs), linking cochlear and brainstem correlates of auditory nonlinearity. Importantly, FFRTHD was also negatively correlated with SIN perception, whereby listeners with higher FFRTHD (i.e., more nonlinear responses) showed better performance on the QuickSIN. We infer that individual differences in SIN perception and FFR nonlinearity, even in normal-hearing individuals, may reflect subtle differences in auditory health and suprathreshold hearing skills not captured by a normal audiometric evaluation. Future studies in hearing-impaired individuals and animal models are necessary to confirm the diagnostic utility of FFRTHD and its relation to cochlear hearing loss or peripheral neurodegeneration in humans.
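Total harmonic distortion of the kind measured here can be computed directly from a magnitude spectrum: power at harmonics of the stimulus frequency relative to power at the fundamental. The sketch below applies this to a pure tone and to a tanh-compressed version of it as a stand-in for a rectifying/compressive nonlinearity; the frequencies and the tanh gain are arbitrary illustration choices, not the study's parameters.

```python
import numpy as np

def thd(x, fs, f0, n_harmonics=5):
    """Total harmonic distortion: RMS magnitude at harmonics 2..N
    relative to the magnitude at the fundamental f0."""
    spec = np.abs(np.fft.rfft(x))
    def mag(f):
        return spec[int(round(f * len(x) / fs))]
    fund = mag(f0)
    harm = np.sqrt(sum(mag(k * f0) ** 2 for k in range(2, n_harmonics + 1)))
    return harm / fund

fs = 8000
t = np.arange(fs) / fs                          # 1-s signal, integer-bin f0
pure = np.sin(2 * np.pi * 500 * t)
compressed = np.tanh(2 * pure)                  # compressive nonlinearity adds odd harmonics
```

A linear system passes the tone with essentially zero THD, while the compressive stage injects measurable energy at the harmonics; in the FFR the same index serves as a scalp-level signature of cochlear (and neural) nonlinearity.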
24
Van Canneyt J, Wouters J, Francart T. From modulated noise to natural speech: The effect of stimulus parameters on the envelope following response. Hear Res 2020; 393:107993. [PMID: 32535277 DOI: 10.1016/j.heares.2020.107993] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Received: 11/25/2019] [Revised: 04/28/2020] [Accepted: 05/04/2020] [Indexed: 11/28/2022]
Abstract
Envelope following responses (EFRs) can be evoked by a wide range of auditory stimuli, but for many stimulus parameters the effect on EFR strength is not fully understood. This complicates the comparison of earlier studies and the design of new studies, and the optimal stimulus parameters are unknown. To help resolve this issue, we investigated the effects of four important stimulus parameters and their interactions on the EFR. Responses were measured in 16 normal-hearing subjects, evoked by stimuli with four levels of stimulus complexity (amplitude-modulated noise, artificial vowels, natural vowels, and vowel-consonant-vowel combinations), three fundamental frequencies (105 Hz, 185 Hz, and 245 Hz), three fundamental frequency contours (upward sweeping, downward sweeping, and flat), and three vowel identities (Flemish /a:/, /u:/, and /i:/). We found that EFRs evoked by artificial vowels were on average 4-6 dB SNR larger than responses evoked by the other stimulus complexities, probably because of their (unnaturally) strong higher harmonics. Moreover, response amplitude decreased with fundamental frequency, but response SNR remained largely unaffected. Thirdly, fundamental frequency variation within the stimulus did not affect EFR strength, provided the rate of change remained low (not the case for sweeping natural vowels). Finally, the vowel /i:/ appeared to evoke larger response amplitudes than /a:/ and /u:/, but statistical power was too low to confirm this. Vowel-dependent differences in response strength have been suggested to stem from destructive interference between response components. We show how a model of the auditory periphery can simulate these interference patterns and predict response strength. Altogether, the results of this study can guide stimulus choice for future EFR research and practical applications.
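The destructive-interference account mentioned at the end of this abstract can be illustrated with a two-phasor toy model: if two response components at the same f0 arrive with different latencies, the amplitude of their sum depends on the latency difference relative to the f0 period. The component amplitudes and latencies below are invented for illustration and are not the paper's peripheral model.

```python
import numpy as np

def summed_amplitude(f0, amp1, amp2, latency_diff):
    """Amplitude of the sum of two f0-following components whose
    onset latencies differ by latency_diff seconds."""
    phasor1 = amp1                                        # reference component
    phasor2 = amp2 * np.exp(-2j * np.pi * f0 * latency_diff)
    return abs(phasor1 + phasor2)

f0 = 100.0                                            # period = 10 ms
in_phase = summed_amplitude(f0, 1.0, 0.8, 0.0)        # latencies aligned: amplitudes add
anti_phase = summed_amplitude(f0, 1.0, 0.8, 0.005)    # half a period apart: they cancel
```

Because the relevant latency differences depend on where along the cochlea each vowel's energy falls, such cancellation can make one vowel a systematically weaker EFR elicitor than another even at matched levels.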
Affiliation(s)
- Jana Van Canneyt
- ExpORL, Dept. of Neurosciences, KU Leuven, Herestraat 49 Bus 721, 3000, Leuven, Belgium.
- Jan Wouters
- ExpORL, Dept. of Neurosciences, KU Leuven, Herestraat 49 Bus 721, 3000, Leuven, Belgium.
- Tom Francart
- ExpORL, Dept. of Neurosciences, KU Leuven, Herestraat 49 Bus 721, 3000, Leuven, Belgium.
25
Su Y, Delgutte B. Pitch of harmonic complex tones: rate and temporal coding of envelope repetition rate in inferior colliculus of unanesthetized rabbits. J Neurophysiol 2019; 122:2468-2485. [PMID: 31664871 DOI: 10.1152/jn.00512.2019] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.4] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/22/2022] Open
Abstract
Harmonic complex tones (HCTs) found in speech, music, and animal vocalizations evoke strong pitch percepts at their fundamental frequencies. The strongest pitches are produced by HCTs that contain harmonics resolved by cochlear frequency analysis, but HCTs containing solely unresolved harmonics also evoke a weaker pitch at their envelope repetition rate (ERR). In the auditory periphery, neurons phase lock to the stimulus envelope, but this temporal representation of ERR degrades and gives way to rate codes along the ascending auditory pathway. To assess the role of the inferior colliculus (IC) in such transformations, we recorded IC neuron responses to HCT and sinusoidally modulated broadband noise (SAMN) with varying ERR from unanesthetized rabbits. Different interharmonic phase relationships of HCT were used to manipulate the temporal envelope without changing the power spectrum. Many IC neurons demonstrated band-pass rate tuning to ERR between 60 and 1,600 Hz for HCT and between 40 and 500 Hz for SAMN. The tuning was not related to the pure-tone best frequency of neurons but was dependent on the shape of the stimulus envelope, indicating a temporal rather than spectral origin. A phenomenological model suggests that the tuning may arise from peripheral temporal response patterns via synaptic inhibition. We also characterized temporal coding of ERR. Some IC neurons could phase lock to the stimulus envelope up to 900 Hz for either HCT or SAMN, but phase locking was weaker with SAMN. Together, the rate code and the temporal code represent a wide range of ERR, providing strong cues for the pitch of unresolved harmonics. NEW & NOTEWORTHY: Envelope repetition rate (ERR) provides crucial cues for pitch perception of frequency components that are not individually resolved by the cochlea, but the neural representation of ERR for stimuli containing many harmonics is poorly characterized.
Here we show that the pitch of stimuli with unresolved harmonics is represented by both a rate code and a temporal code for ERR in auditory midbrain neurons and propose possible underlying neural mechanisms with a computational model.
Affiliation(s)
- Yaqing Su
- Eaton-Peabody Labs, Massachusetts Eye and Ear, Boston, Massachusetts; Department of Biomedical Engineering, Boston University, Boston, Massachusetts
- Bertrand Delgutte
- Eaton-Peabody Labs, Massachusetts Eye and Ear, Boston, Massachusetts; Department of Otolaryngology, Harvard Medical School, Boston, Massachusetts
26
Yellamsetty A, Bidelman GM. Brainstem correlates of concurrent speech identification in adverse listening conditions. Brain Res 2019; 1714:182-192. [PMID: 30796895 DOI: 10.1016/j.brainres.2019.02.025] [Citation(s) in RCA: 15] [Impact Index Per Article: 3.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/15/2018] [Revised: 01/07/2019] [Accepted: 02/19/2019] [Indexed: 01/20/2023]
Abstract
When two voices compete, listeners can segregate and identify concurrent speech sounds using pitch (fundamental frequency, F0) and timbre (harmonic) cues. Speech perception is also hindered by the signal-to-noise ratio (SNR). How clear and degraded concurrent speech sounds are represented at early, pre-attentive stages of the auditory system is not well understood. To this end, we measured scalp-recorded frequency-following responses (FFR) from the EEG while human listeners heard two concurrently presented, steady-state (time-invariant) vowels whose F0 differed by zero or four semitones (ST) presented diotically in either clean (no noise) or noise-degraded (+5 dB SNR) conditions. Listeners also performed a speeded double-vowel identification task in which they were required to identify both vowels correctly. Behavioral results showed that speech identification accuracy increased with F0 differences between vowels, and this perceptual F0 benefit was larger for clean compared to noise-degraded (+5 dB SNR) stimuli. Neurophysiological data demonstrated more robust FFR F0 amplitudes for single compared to double vowels and considerably weaker responses in noise. F0 amplitudes showed speech-on-speech masking effects, along with a non-linear constructive interference at 0 ST, and suppression effects at 4 ST. Correlations showed that FFR F0 amplitudes failed to predict listeners' identification accuracy. In contrast, FFR F1 amplitudes were associated with faster reaction times, although this correlation was limited to noise conditions. The limited number of brain-behavior associations suggests subcortical activity mainly reflects exogenous processing rather than perceptual correlates of concurrent speech perception. Collectively, our results demonstrate that FFRs reflect pre-attentive coding of concurrent auditory stimuli that only weakly predicts the success of identifying concurrent speech.
Affiliation(s)
- Anusha Yellamsetty
- School of Communication Sciences & Disorders, University of Memphis, Memphis, TN, USA; Department of Communication Sciences & Disorders, University of South Florida, USA.
- Gavin M Bidelman
- School of Communication Sciences & Disorders, University of Memphis, Memphis, TN, USA; Institute for Intelligent Systems, University of Memphis, Memphis, TN, USA; University of Tennessee Health Sciences Center, Department of Anatomy and Neurobiology, Memphis, TN, USA.
27
Campbell TA, Marsh JE. On corticopetal-corticofugal loops of the new early filter: from cell assemblies to the rostral brainstem. Neuroreport 2019; 30:202-206. [PMID: 30702551 DOI: 10.1097/wnr.0000000000001184] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/25/2022]
Affiliation(s)
- Tom A Campbell
- Department of Psychology and Logopedics, Faculty of Medicine, University of Helsinki, Helsinki, Finland
- John E Marsh
- Department of Building, Energy and Environmental Engineering, University of Gävle, Gävle, Sweden; School of Psychology, University of Central Lancashire, Preston, UK