1
Rocha MFB, Advíncula KP, Simões CDESX, Britto DBLDA, Menezes PDL. Benefit of Modulated Masking in hearing according to age. Braz J Otorhinolaryngol 2024; 90:101487. PMID: 39205366; PMCID: PMC11393591; DOI: 10.1016/j.bjorl.2024.101487.
Abstract
OBJECTIVE To analyze the Benefit of Modulated Masking (BMM) on hearing in young, adult, and elderly normal-hearing individuals. METHODS The sample included 60 normal-hearing individuals aged 18-75 years who underwent behavioral assessment (sentence recognition test in the presence of steady and modulated noise) and electrophysiological assessment (cortical Auditory Evoked Potential) to investigate BMM. The results were analyzed comparatively using the paired t-test and repeated-measures ANOVA, followed by the Bonferroni post-hoc test (p < 0.05). RESULTS A decrease in latencies and an increase in amplitudes of the cortical components (P1-N1-P2) were observed with noise modulation in all age groups. Modulated noise produced better auditory threshold responses (electrophysiological and behavioral) than steady noise. The elderly presented higher thresholds in both hearing domains than the other participants, as well as a lower BMM magnitude. CONCLUSION Modulated noise generated less interference in the magnitude of the neural response (larger amplitudes) and in neural processing time (shorter latencies) for the speech stimulus in all participants. The higher auditory thresholds (electrophysiological and behavioral) and the lower BMM magnitude observed in the elderly group, even with noise modulation, suggest lower temporal auditory performance in this population and may indicate a deficit in temporal resolution capacity associated with aging. LEVEL OF EVIDENCE: 3
Affiliation(s)
- Karina Paes Advíncula
- Universidade Federal de Pernambuco, Departamento de Fonoaudiologia, Recife, PE, Brazil
- Pedro de Lemos Menezes
- Universidade Federal de Alagoas, Programa de Pós-Graduação em Biotecnologia, Maceió, AL, Brazil; Universidade Estadual de Ciências da Saúde de Alagoas, Departamento de Fonoaudiologia, Maceió, AL, Brazil
2
Liu W, Wang T, Huang X. The influences of forward context on stop-consonant perception: The combined effects of contrast and acoustic cue activation? J Acoust Soc Am 2023; 154:1903-1920. PMID: 37756574; DOI: 10.1121/10.0021077.
Abstract
The perception of the /da/-/ga/ series, distinguished primarily by the third formant (F3) transition, is affected by many nonspeech and speech sounds. Previous studies mainly investigated the influences of context stimuli with frequency bands located in the F3 region and proposed an account based on spectral contrast effects. This study examined the effects of context stimuli with bands not in the F3 region. The results revealed that these non-F3-region stimuli (whether with bands higher or lower than the F3 region) mainly facilitated the identification of /ga/; for example, stimuli (including frequency-modulated glides, sine-wave tones, filtered sentences, and natural vowels) in the low-frequency band (500-1500 Hz) led to more /ga/ responses than those in the low-F3 region (1500-2500 Hz). It is suggested that in the F3 region, context stimuli may act through spectral contrast effects, while in non-F3 regions, context stimuli might activate the acoustic cues of /g/ and thereby facilitate the identification of /ga/. The combination of contrast and acoustic cue effects can explain a wider range of findings concerning forward context influences on the perception of the /da/-/ga/ series, including the effects of non-F3-region stimuli and the imbalanced influences of context stimuli on /da/ and /ga/ perception.
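The band manipulations described above are straightforward to prototype. The sketch below synthesizes two of the low-frequency-band context stimuli mentioned in the abstract (a frequency-modulated glide and band-limited noise in the 500-1500 Hz region); the sampling rate, duration, and filter order are illustrative assumptions, not parameters from the study.

```python
import numpy as np
from scipy.signal import butter, sosfiltfilt

FS = 44100   # sampling rate in Hz (assumption)
DUR = 1.0    # context duration in seconds (assumption)
t = np.arange(int(FS * DUR)) / FS

# Frequency-modulated glide sweeping linearly through the 500-1500 Hz band.
f0, f1 = 500.0, 1500.0
phase = 2 * np.pi * (f0 * t + (f1 - f0) * t**2 / (2 * DUR))
glide = np.sin(phase)

# Noise confined to the same band: 4th-order Butterworth bandpass,
# applied forward and backward for zero phase distortion.
sos = butter(4, [f0, f1], btype="bandpass", fs=FS, output="sos")
band_noise = sosfiltfilt(sos, np.random.randn(t.size))
band_noise /= np.max(np.abs(band_noise))  # peak-normalize
```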
Affiliation(s)
- Wenli Liu
- Department of Social Psychology, Zhou Enlai School of Government, Nankai University, 38 Tongshuo Road, Tianjin 300350, China
- Tianyu Wang
- Department of Social Psychology, Zhou Enlai School of Government, Nankai University, 38 Tongshuo Road, Tianjin 300350, China
- Xianjun Huang
- School of Psychology, Capital Normal University, 105 North West 3rd Ring Road, Beijing 100048, China
3
Shorey AE, Stilp CE. Short-term, not long-term, average spectra of preceding sentences bias consonant categorization. J Acoust Soc Am 2023; 153:2426. PMID: 37092945; PMCID: PMC10119874; DOI: 10.1121/10.0017862.
Abstract
Speech sound perception is influenced by the spectral properties of surrounding sounds. For example, listeners perceive /g/ (lower F3 onset) more often after sounds with prominent high-F3 frequencies and perceive /d/ (higher F3 onset) more often after sounds with prominent low-F3 frequencies. These biases are known as spectral contrast effects (SCEs). Much of this work examined differences between long-term average spectra (LTAS) of preceding sounds and target speech sounds. Post hoc analyses by Stilp and Assgari [(2021) Atten. Percept. Psychophys. 83(6) 2694-2708] revealed that spectra of the last 475 ms of precursor sentences, not the entire LTAS, best predicted biases in consonant categorization. Here, the influences of proximal (last 500 ms) versus distal (before the last 500 ms) portions of precursor sentences on subsequent consonant categorization were compared. Sentences emphasized different frequency regions in each temporal window (e.g., distal low-F3 emphasis, proximal high-F3 emphasis, and vice versa) naturally or via filtering. In both cases, shifts in consonant categorization were produced in accordance with spectral properties of the proximal window. This was replicated when the distal window did not emphasize either frequency region, but the proximal window did. Results endorse closer consideration of patterns of spectral energy over time in preceding sounds, not just their LTAS.
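A minimal way to quantify the proximal/distal distinction used here is to compute average power spectra over the two temporal windows of a precursor sentence and compare band levels. The sketch below assumes a Welch-spectrum analysis and a 500-ms proximal window; the authors' exact analysis parameters are not specified in the abstract.

```python
import numpy as np
from scipy.signal import welch

def window_spectra(x, fs, proximal_s=0.5):
    """Average power spectra of the distal (all but the last 500 ms) and
    proximal (last 500 ms) portions of a precursor sentence."""
    split = len(x) - int(proximal_s * fs)
    f, distal = welch(x[:split], fs=fs, nperseg=1024)
    _, proximal = welch(x[split:], fs=fs, nperseg=1024)
    return f, distal, proximal

def band_level_db(f, pxx, lo, hi):
    """Mean level (dB) in a band, e.g., a low-F3 or high-F3 region."""
    band = (f >= lo) & (f <= hi)
    return 10.0 * np.log10(pxx[band].mean())
```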
Affiliation(s)
- Anya E Shorey
- Department of Psychological and Brain Sciences, University of Louisville, Louisville, Kentucky 40292, USA
- Christian E Stilp
- Department of Psychological and Brain Sciences, University of Louisville, Louisville, Kentucky 40292, USA
4
Rocha MFB, Menezes DC, Duarte DSB, Griz SMS, Frizzo ACF, Menezes PDL, Teixeira CF, Advíncula KP. Masking release in cortical auditory evoked potentials with speech stimulus. Codas 2023. DOI: 10.1590/2317-1782/20212020334en.
Abstract
Purpose To analyze the effect of masking on the Cortical Auditory Evoked Potential with speech stimulus in young adults. Methods Fourteen individuals of both sexes, aged between 19 and 28 years and with no hearing loss, participated in the study. The Cortical Auditory Evoked Potential examination was performed with the synthetic speech stimulus /ba/ presented simultaneously with Speech Shaped Noise under three conditions: steady noise at 30 dB SPLep (weak steady noise), steady noise at 65 dB SPLep (strong steady noise), and noise modulated between 30 dB SPLep and 65 dB SPLep at 25 Hz, i.e., with a modulation period of 40 ms. Results Higher latencies of the cortical components, except P2, were observed in the strong steady noise condition, and larger amplitudes of the cortical components P1, N1, and P2 in the modulated noise condition, with a statistically significant difference from the strong steady noise condition. Wave morphology was worse in the strong steady noise condition than in the other recordings. The average electrophysiological thresholds for the strong steady noise and modulated noise conditions were 60 dB SPLep and 49 dB SPLep, respectively, an 11.7 dB mean difference. Conclusion We could infer a lower masking effect of modulated noise compared with strong steady noise, both in the amplitude measurements of the cortical components and in the average difference of 11.7 dB between the electrophysiological thresholds (interpreted as the measure of the Masking Release).
5
Stilp CE, Shorey AE, King CJ. Nonspeech sounds are not all equally good at being nonspeech. J Acoust Soc Am 2022; 152:1842. PMID: 36182316; DOI: 10.1121/10.0014174.
Abstract
Perception of speech sounds has a long history of being compared to perception of nonspeech sounds, with rich and enduring debates regarding how closely they share similar underlying processes. In many instances, perception of nonspeech sounds is directly compared to that of speech sounds without a clear explanation of how related these sounds are to the speech they are selected to mirror (or not mirror). While the extreme acoustic variability of speech sounds is well documented, this variability is bounded by the common source of a human vocal tract. Nonspeech sounds do not share a common source, and as such, exhibit even greater acoustic variability than that observed for speech. This increased variability raises important questions about how well perception of a given nonspeech sound might resemble or model perception of speech sounds. Here, we offer a brief review of extremely diverse nonspeech stimuli that have been used in the efforts to better understand perception of speech sounds. The review is organized according to increasing spectrotemporal complexity: random noise, pure tones, multitone complexes, environmental sounds, music, speech excerpts that are not recognized as speech, and sinewave speech. Considerations are offered for stimulus selection in nonspeech perception experiments moving forward.
Affiliation(s)
- Christian E Stilp
- Department of Psychological and Brain Sciences, University of Louisville, Louisville, Kentucky 40292, USA
- Anya E Shorey
- Department of Psychological and Brain Sciences, University of Louisville, Louisville, Kentucky 40292, USA
- Caleb J King
- Department of Psychological and Brain Sciences, University of Louisville, Louisville, Kentucky 40292, USA
6
Ozernov-Palchik O, Beach SD, Brown M, Centanni TM, Gaab N, Kuperberg G, Perrachione TK, Gabrieli JDE. Speech-specific perceptual adaptation deficits in children and adults with dyslexia. J Exp Psychol Gen 2022; 151:1556-1572. PMID: 34843363; PMCID: PMC9148384; DOI: 10.1037/xge0001145.
Abstract
According to several influential theoretical frameworks, phonological deficits in dyslexia result from reduced sensitivity to acoustic cues that are essential for the development of robust phonemic representations. Some accounts suggest that these deficits arise from impairments in rapid auditory adaptation processes that are either speech-specific or domain-general. Here, we examined the specificity of auditory adaptation deficits in dyslexia using a nonlinguistic tone anchoring (adaptation) task and a linguistic selective adaptation task in children and adults with and without dyslexia. Children and adults with dyslexia had elevated tone-frequency discrimination thresholds, but both groups benefited from anchoring to repeated stimuli to the same extent as typical readers. Additionally, although both dyslexia groups had overall reduced accuracy for speech sound identification, only the child group had reduced categorical perception for speech. Across both age groups, individuals with dyslexia had reduced perceptual adaptation to speech. These results indicate broad auditory perceptual deficits across development in individuals with dyslexia in both linguistic and nonlinguistic domains, alongside adaptation deficits that were specific to speech. Finally, mediation models in children and adults revealed that the causal pathways from basic perception and adaptation to phonological awareness through speech categorization were not significant. Thus, rather than having causal effects, perceptual deficits may co-occur with the phonological deficits in dyslexia across development.
Affiliation(s)
- Ola Ozernov-Palchik
- McGovern Institute for Brain Research, Massachusetts Institute of Technology, Cambridge, Massachusetts, USA
- Harvard Graduate School of Education, Harvard University, Cambridge, Massachusetts, USA
- Sara D. Beach
- McGovern Institute for Brain Research, Massachusetts Institute of Technology, Cambridge, Massachusetts, USA
- Program in Speech and Hearing Bioscience and Technology, Harvard Medical School, Boston, MA, USA
- Meredith Brown
- Department of Psychology, Tufts University, Medford, Massachusetts, USA
- Tracy M. Centanni
- Department of Psychology, Texas Christian University, Fort Worth, Texas, USA
- Nadine Gaab
- Harvard Graduate School of Education, Harvard University, Cambridge, Massachusetts, USA
- Gina Kuperberg
- Department of Psychology, Tufts University, Medford, Massachusetts, USA
- Tyler K. Perrachione
- Department of Speech, Language, and Hearing Sciences, Boston University, Boston, MA, USA
- John D. E. Gabrieli
- McGovern Institute for Brain Research, Massachusetts Institute of Technology, Cambridge, Massachusetts, USA
- Program in Speech and Hearing Bioscience and Technology, Harvard Medical School, Boston, MA, USA
7
Beach SD, Ozernov-Palchik O, May SC, Centanni TM, Perrachione TK, Pantazis D, Gabrieli JDE. The Neural Representation of a Repeated Standard Stimulus in Dyslexia. Front Hum Neurosci 2022; 16:823627. PMID: 35634200; PMCID: PMC9133793; DOI: 10.3389/fnhum.2022.823627.
Abstract
The neural representation of a repeated stimulus is the standard against which a deviant stimulus is measured in the brain, giving rise to the well-known mismatch response. It has been suggested that individuals with dyslexia have poor implicit memory for recently repeated stimuli, such as the train of standards in an oddball paradigm. Here, we examined how the neural representation of a standard emerges over repetitions, asking whether there is less sensitivity to repetition and/or less accrual of "standardness" over successive repetitions in dyslexia. We recorded magnetoencephalography (MEG) as adults with and without dyslexia were passively exposed to speech syllables in a roving-oddball design. We performed time-resolved multivariate decoding of the MEG sensor data to identify the neural signature of standard vs. deviant trials, independent of stimulus differences. This "multivariate mismatch" was equally robust and had a similar time course in the two groups. In both groups, standards generated by as few as two repetitions were distinct from deviants, indicating normal sensitivity to repetition in dyslexia. However, only in the control group did standards become increasingly different from deviants with repetition. These results suggest that many of the mechanisms that give rise to neural adaptation as well as mismatch responses are intact in dyslexia, with the possible exception of a putatively predictive mechanism that successively integrates recent sensory information into feedforward processing.
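Time-resolved multivariate decoding of this kind can be sketched with a generic linear classifier applied independently at each time point. The following is an illustration under assumed data shapes (trials x sensors x time), not the authors' pipeline.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

def timewise_decoding(X, y, cv=5):
    """Cross-validated standard-vs-deviant decoding accuracy per time point.
    X: array of shape (n_trials, n_sensors, n_times); y: 0 = standard, 1 = deviant."""
    n_trials, n_sensors, n_times = X.shape
    clf = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
    acc = np.empty(n_times)
    for t in range(n_times):
        # Fit and score the classifier on the sensor pattern at this time point.
        acc[t] = cross_val_score(clf, X[:, :, t], y, cv=cv).mean()
    return acc
```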
Affiliation(s)
- Sara D. Beach
- McGovern Institute for Brain Research, Massachusetts Institute of Technology, Cambridge, MA, United States
- Program in Speech and Hearing Bioscience and Technology, Harvard University, Cambridge, MA, United States
- Ola Ozernov-Palchik
- McGovern Institute for Brain Research, Massachusetts Institute of Technology, Cambridge, MA, United States
- Sidney C. May
- McGovern Institute for Brain Research, Massachusetts Institute of Technology, Cambridge, MA, United States
- Tracy M. Centanni
- McGovern Institute for Brain Research, Massachusetts Institute of Technology, Cambridge, MA, United States
- Tyler K. Perrachione
- Program in Speech and Hearing Bioscience and Technology, Harvard University, Cambridge, MA, United States
- Department of Speech, Language and Hearing Sciences, Boston University, Boston, MA, United States
- Dimitrios Pantazis
- McGovern Institute for Brain Research, Massachusetts Institute of Technology, Cambridge, MA, United States
- John D. E. Gabrieli
- McGovern Institute for Brain Research, Massachusetts Institute of Technology, Cambridge, MA, United States
- Program in Speech and Hearing Bioscience and Technology, Harvard University, Cambridge, MA, United States
8
Centanni TM, Beach SD, Ozernov-Palchik O, May S, Pantazis D, Gabrieli JDE. Categorical perception and influence of attention on neural consistency in response to speech sounds in adults with dyslexia. Ann Dyslexia 2022; 72:56-78. PMID: 34495457; PMCID: PMC8901776; DOI: 10.1007/s11881-021-00241-1.
Abstract
Developmental dyslexia is a common neurodevelopmental disorder that is associated with alterations in the behavioral and neural processing of speech sounds, but the scope and nature of that association is uncertain. It has been proposed that more variable auditory processing could underlie some of the core deficits in this disorder. In the current study, magnetoencephalography (MEG) data were acquired from adults with and without dyslexia while they passively listened to or actively categorized tokens from a /ba/-/da/ consonant continuum. We observed no significant group difference in active categorical perception of this continuum in either of our two behavioral assessments. During passive listening, adults with dyslexia exhibited neural responses that were as consistent as those of typically reading adults in six cortical regions associated with auditory perception, language, and reading. However, they exhibited significantly less consistency in the left supramarginal gyrus, where greater inconsistency correlated significantly with worse decoding skills in the group with dyslexia. The group difference in the left supramarginal gyrus was evident only when neural data were binned with a high temporal resolution and was only significant during the passive condition. Interestingly, consistency significantly improved in both groups during active categorization versus passive listening. These findings suggest that adults with dyslexia exhibit typical levels of neural consistency in response to speech sounds with the exception of the left supramarginal gyrus and that this consistency increases during active versus passive perception of speech sounds similarly in the two groups.
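One simple proxy for the trial-to-trial consistency measure described here is a leave-one-out correlation: each trial's time course is correlated with the average of the remaining trials. This is only an assumed formulation; the paper's actual metric may differ.

```python
import numpy as np

def intertrial_consistency(trials):
    """Mean leave-one-out consistency of single-trial responses.
    trials: array of shape (n_trials, n_times) from one cortical region."""
    n = trials.shape[0]
    total = trials.sum(axis=0)
    rs = []
    for i in range(n):
        others = (total - trials[i]) / (n - 1)  # average of the other trials
        rs.append(np.corrcoef(trials[i], others)[0, 1])
    return float(np.mean(rs))
```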
Affiliation(s)
- T M Centanni
- McGovern Institute for Brain Research and Department of Brain and Cognitive Sciences, Massachusetts Institute of Technology, Cambridge, MA, USA
- Department of Psychology, Texas Christian University, Fort Worth, TX, USA
- S D Beach
- McGovern Institute for Brain Research and Department of Brain and Cognitive Sciences, Massachusetts Institute of Technology, Cambridge, MA, USA
- Program in Speech and Hearing Bioscience and Technology, Harvard University, Cambridge, MA, USA
- O Ozernov-Palchik
- McGovern Institute for Brain Research and Department of Brain and Cognitive Sciences, Massachusetts Institute of Technology, Cambridge, MA, USA
- S May
- McGovern Institute for Brain Research and Department of Brain and Cognitive Sciences, Massachusetts Institute of Technology, Cambridge, MA, USA
- Boston College, Boston, MA, USA
- D Pantazis
- McGovern Institute for Brain Research and Department of Brain and Cognitive Sciences, Massachusetts Institute of Technology, Cambridge, MA, USA
- J D E Gabrieli
- McGovern Institute for Brain Research and Department of Brain and Cognitive Sciences, Massachusetts Institute of Technology, Cambridge, MA, USA
9
Rocha MFB, Menezes DC, Duarte DSB, Griz SMS, Frizzo ACF, Menezes PDL, Teixeira CF, Advíncula KP. Masking release in cortical auditory evoked potentials with speech stimulus. Codas 2022; 35:e20200334. PMID: 36541959; PMCID: PMC10010424; DOI: 10.1590/2317-1782/20212020334pt.
Abstract
PURPOSE To analyze the effect of masking on the Cortical Auditory Evoked Potential with speech stimulus in young adults. METHODS Fourteen individuals of both sexes, aged between 19 and 28 years and with no hearing loss, participated in the study. The Cortical Auditory Evoked Potential examination was performed with the synthetic speech stimulus /ba/ presented simultaneously with Speech Shaped Noise under three conditions: steady noise at 30 dB SPLep (weak steady noise), steady noise at 65 dB SPLep (strong steady noise), and noise modulated between 30 dB SPLep and 65 dB SPLep at 25 Hz, i.e., with a modulation period of 40 ms. RESULTS Higher latencies of the cortical components, except P2, were observed in the strong steady noise condition, and larger amplitudes of the cortical components P1, N1, and P2 in the modulated noise condition, with a statistically significant difference from the strong steady noise condition. Wave morphology was worse in the strong steady noise condition than in the other recordings. The average electrophysiological thresholds for the strong steady noise and modulated noise conditions were 60 dB SPLep and 49 dB SPLep, respectively, an 11.7 dB mean difference. CONCLUSION We could infer a lower masking effect of modulated noise compared with strong steady noise, both in the amplitude measurements of the cortical components and in the average difference of 11.7 dB between the electrophysiological thresholds (interpreted as the measure of the Masking Release).
Affiliation(s)
- Mônyka Ferreira Borges Rocha
- Programa de Pós-graduação em Saúde da Comunicação Humana, Universidade Federal de Pernambuco - UFPE - Recife (PE), Brasil
- Denise Costa Menezes
- Programa de Pós-graduação em Saúde da Comunicação Humana, Departamento de Fonoaudiologia, Universidade Federal de Pernambuco - UFPE - Recife (PE), Brasil
- Silvana Maria Sobral Griz
- Programa de Pós-graduação em Saúde da Comunicação Humana, Departamento de Fonoaudiologia, Universidade Federal de Pernambuco - UFPE - Recife (PE), Brasil
- Ana Claudia Figueiredo Frizzo
- Programa de Pós-graduação em Fonoaudiologia, Universidade Estadual Paulista Julio de Mesquita Filho - (UNESP) - São Paulo (SP), Brasil
- Pedro de Lemos Menezes
- Departamento de Fonoaudiologia, Universidade Estadual de Ciências da Saúde de Alagoas - UNCISAL - Maceió (AL), Brasil
- Karina Paes Advíncula
- Programa de Pós-graduação em Saúde da Comunicação Humana, Departamento de Fonoaudiologia, Universidade Federal de Pernambuco - UFPE - Recife (PE), Brasil
10
Stilp CE, Assgari AA. Contributions of natural signal statistics to spectral context effects in consonant categorization. Atten Percept Psychophys 2021; 83:2694-2708. PMID: 33987821; DOI: 10.3758/s13414-021-02310-4.
Abstract
Speech perception, like all perception, takes place in context. Recognition of a given speech sound is influenced by the acoustic properties of surrounding sounds. When the spectral composition of earlier (context) sounds (e.g., a sentence with more energy at lower third formant [F3] frequencies) differs from that of a later (target) sound (e.g., consonant with intermediate F3 onset frequency), the auditory system magnifies this difference, biasing target categorization (e.g., towards higher-F3-onset /d/). Historically, these studies used filters to force context stimuli to possess certain spectral compositions. Recently, these effects were produced using unfiltered context sounds that already possessed the desired spectral compositions (Stilp & Assgari, 2019, Attention, Perception, & Psychophysics, 81, 2037-2052). Here, this natural signal statistics approach is extended to consonant categorization (/g/-/d/). Context sentences were either unfiltered (already possessing the desired spectral composition) or filtered (to imbue specific spectral characteristics). Long-term spectral characteristics of unfiltered contexts were poor predictors of shifts in consonant categorization, but short-term characteristics (last 475 ms) were excellent predictors. This diverges from vowel data, where long-term and shorter-term intervals (last 1,000 ms) were equally strong predictors. Thus, time scale plays a critical role in how listeners attune to signal statistics in the acoustic environment.
11
Beach SD, Ozernov-Palchik O, May SC, Centanni TM, Gabrieli JDE, Pantazis D. Neural Decoding Reveals Concurrent Phonemic and Subphonemic Representations of Speech Across Tasks. Neurobiol Lang 2021; 2:254-279. PMID: 34396148; PMCID: PMC8360503; DOI: 10.1162/nol_a_00034.
Abstract
Robust and efficient speech perception relies on the interpretation of acoustically variable phoneme realizations, yet prior neuroimaging studies are inconclusive regarding the degree to which subphonemic detail is maintained over time as categorical representations arise. It is also unknown whether this depends on the demands of the listening task. We addressed these questions by using neural decoding to quantify the (dis)similarity of brain response patterns evoked during two different tasks. We recorded magnetoencephalography (MEG) as adult participants heard isolated, randomized tokens from a /ba/-/da/ speech continuum. In the passive task, their attention was diverted. In the active task, they categorized each token as ba or da. We found that linear classifiers successfully decoded ba vs. da perception from the MEG data. Data from the left hemisphere were sufficient to decode the percept early in the trial, while the right hemisphere was necessary but not sufficient for decoding at later time points. We also decoded stimulus representations and found that they were maintained longer in the active task than in the passive task; however, these representations did not pattern more like discrete phonemes when an active categorical response was required. Instead, in both tasks, early phonemic patterns gave way to a representation of stimulus ambiguity that coincided in time with reliable percept decoding. Our results suggest that the categorization process does not require the loss of subphonemic detail, and that the neural representation of isolated speech sounds includes concurrent phonemic and subphonemic information.
Affiliation(s)
- Sara D. Beach
- McGovern Institute for Brain Research, Massachusetts Institute of Technology, Cambridge, MA, USA
- Program in Speech and Hearing Bioscience and Technology, Harvard University, Cambridge, MA, USA
- Ola Ozernov-Palchik
- McGovern Institute for Brain Research, Massachusetts Institute of Technology, Cambridge, MA, USA
- Sidney C. May
- McGovern Institute for Brain Research, Massachusetts Institute of Technology, Cambridge, MA, USA
- Lynch School of Education and Human Development, Boston College, Chestnut Hill, MA, USA
- Tracy M. Centanni
- McGovern Institute for Brain Research, Massachusetts Institute of Technology, Cambridge, MA, USA
- Department of Psychology, Texas Christian University, Fort Worth, TX, USA
- John D. E. Gabrieli
- McGovern Institute for Brain Research, Massachusetts Institute of Technology, Cambridge, MA, USA
- Dimitrios Pantazis
- McGovern Institute for Brain Research, Massachusetts Institute of Technology, Cambridge, MA, USA
12
McFayden TC, Baskin P, Stephens JDW, He S. Cortical Auditory Event-Related Potentials and Categorical Perception of Voice Onset Time in Children With an Auditory Neuropathy Spectrum Disorder. Front Hum Neurosci 2020; 14:184. PMID: 32523521; PMCID: PMC7261872; DOI: 10.3389/fnhum.2020.00184.
Abstract
Objective: This study evaluated cortical encoding of voice onset time (VOT) in quiet and in noise, and potential associations with the behavioral categorical perception of VOT in children with auditory neuropathy spectrum disorder (ANSD). Design: Subjects were 11 children with ANSD ranging in age between 6.4 and 16.2 years. The stimulus was an /aba/-/apa/ vowel-consonant-vowel continuum comprising eight tokens with VOTs ranging from 0 ms (voiced endpoint) to 88 ms (voiceless endpoint). For speech in noise, speech tokens were mixed with the speech-shaped noise from the Hearing In Noise Test at a signal-to-noise ratio (SNR) of +5 dB. Speech-evoked auditory event-related potentials (ERPs) and behavioral categorical perception of VOT were measured in quiet in all subjects, and at an SNR of +5 dB in seven subjects. The stimuli were presented at 35 dB SL (re: pure tone average), or at 115 dB SPL if 35 dB SL exceeded this limit. In addition to the onset response, the auditory change complex (ACC) elicited by VOT was recorded in eight subjects. Results: Speech-evoked ERPs recorded in all subjects consisted of a vertex-positive peak (i.e., P1), followed by a trough occurring approximately 100 ms later (i.e., N2). For results measured in quiet, there was no significant difference between categorical boundaries estimated using ERP measures and behavioral procedures. Categorical boundaries estimated in quiet using both ERP and behavioral measures correlated closely with the most recently measured Phonetically Balanced Kindergarten (PBK) scores. Adding a competing background noise did not affect categorical boundaries estimated using either behavioral or ERP procedures in three subjects. For the other four subjects, categorical boundaries estimated in noise using behavioral measures were prolonged. However, adding background noise increased categorical boundaries measured using ERPs in only three of these four subjects. Conclusions: A VCV continuum can be used to evaluate behavioral identification and the neural encoding of VOT in children with ANSD. In quiet, categorical boundaries of VOT estimated using behavioral measures and ERP recordings are closely associated with speech recognition performance in children with ANSD. Underlying mechanisms for excessive speech perception deficits in noise may vary for individual patients with ANSD.
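Categorical boundaries along a VOT continuum are typically estimated by fitting a sigmoid to identification proportions and taking its midpoint. The sketch below uses a logistic fit with hypothetical response data; apart from the 0- and 88-ms endpoints given in the abstract, all VOT values and proportions are assumptions.

```python
import numpy as np
from scipy.optimize import curve_fit

def logistic(vot, boundary, slope):
    """Proportion of voiceless (/apa/) responses as a function of VOT (ms)."""
    return 1.0 / (1.0 + np.exp(-slope * (vot - boundary)))

# Hypothetical identification data; only the 0- and 88-ms endpoints come
# from the abstract, the intermediate values are made up for illustration.
vot = np.array([0.0, 13.0, 25.0, 38.0, 50.0, 63.0, 75.0, 88.0])
p_voiceless = np.array([0.02, 0.05, 0.12, 0.35, 0.70, 0.90, 0.97, 0.99])

(boundary, slope), _ = curve_fit(logistic, vot, p_voiceless, p0=[40.0, 0.2])
print(f"estimated categorical boundary: {boundary:.1f} ms")
```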
Affiliation(s)
- Tyler C McFayden
- Department of Psychology, Virginia Polytechnic Institute and State University, Blacksburg, VA, United States
- Paola Baskin
- Department of Anesthesiology, School of Medicine, University of California, San Diego, San Diego, CA, United States
- Joseph D W Stephens
- Department of Psychology, North Carolina Agricultural and Technical State University, Greensboro, NC, United States
- Shuman He
- Department of Otolaryngology-Head and Neck Surgery, Wexner Medical Center, The Ohio State University, Columbus, OH, United States
- Department of Audiology, Nationwide Children's Hospital, Columbus, OH, United States
13
Stilp CE. Evaluating peripheral versus central contributions to spectral context effects in speech perception. Hear Res 2020; 392:107983. PMID: 32464456; DOI: 10.1016/j.heares.2020.107983.
Abstract
Perception of a sound is influenced by spectral properties of surrounding sounds. When frequencies are absent in a preceding acoustic context before being introduced in a subsequent target sound, detection of those frequencies is facilitated via an auditory enhancement effect (EE). When spectral composition differs across a preceding context and subsequent target sound, those differences are perceptually magnified and perception shifts via a spectral contrast effect (SCE). Each effect is thought to receive contributions from peripheral and central neural processing, but the relative contributions are unclear. The present experiments manipulated ear of presentation to elucidate the degrees to which peripheral and central processes contributed to each effect in speech perception. In Experiment 1, EE and SCE magnitudes in consonant categorization were substantially diminished through contralateral presentation of contexts and targets compared to ipsilateral or bilateral presentations. In Experiment 2, spectrally complementary contexts were presented dichotically followed by the target in only one ear. This arrangement was predicted to produce context effects peripherally and cancel them centrally, but the competing contralateral context minimally decreased effect magnitudes. Results confirm peripheral and central contributions to EEs and SCEs in speech perception, but both effects appear to be primarily due to peripheral processing.
Affiliation(s)
- Christian E Stilp
- Department of Psychological and Brain Sciences, University of Louisville, Louisville, KY, 40292, USA
14
Masking Release for Speech in Modulated Maskers: Electrophysiological and Behavioral Measures. Ear Hear 2019; 40:1009-1015. PMID: 30557224; DOI: 10.1097/aud.0000000000000683.
Abstract
OBJECTIVES The purpose of this study was to obtain an electrophysiological analog of masking release using speech-evoked cortical potentials in steady and modulated maskers and to relate this masking release to behavioral measures for the same stimuli. The hypothesis was that the evoked potentials can be tracked to a lower stimulus level in a modulated masker than in a steady masker and that the magnitude of this electrophysiological masking release is of the same order as that of the behavioral masking release for the same stimuli. DESIGN Cortical potentials evoked by an 80-ms /ba/ stimulus were measured in two steady maskers (30 and 65 dB SPL), and in a masker that modulated between these two levels at a rate of 25 Hz. In each masker, a level series was undertaken to determine electrophysiological threshold. Behavioral detection thresholds were determined in the same maskers using an adaptive tracking procedure. Masking release was defined as the difference between signal thresholds measured in the steady 65-dB SPL masker and the modulated masker. A total of 23 normal-hearing adults participated. RESULTS Electrophysiological thresholds were uniformly elevated relative to behavioral thresholds by about 6.5 dB. However, the magnitude of masking release was about 13.5 dB for both measurement domains. CONCLUSIONS Electrophysiological measures of masking release using speech-evoked cortical auditory evoked potentials correspond closely to behavioral estimates for the same stimuli. This suggests that objective measures based on electrophysiological techniques can be used to reliably gauge aspects of temporal processing ability.
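The abstract does not specify the adaptive tracking rule, but a common choice for behavioral detection thresholds is a 2-down/1-up staircase, sketched below under that assumption; masking release is then the difference between the thresholds obtained in the steady 65-dB SPL masker and the modulated masker.

```python
def two_down_one_up(respond, start_db=70.0, step_db=2.0, n_reversals=8):
    """Generic 2-down/1-up staircase (converges near the 70.7%-correct point).
    respond(level_db) -> True if the listener detects the signal at that level."""
    level, n_correct, direction = start_db, 0, 0
    reversals = []
    while len(reversals) < n_reversals:
        if respond(level):
            n_correct += 1
            if n_correct == 2:        # two correct in a row: step down
                n_correct = 0
                if direction == +1:   # direction change: record a reversal
                    reversals.append(level)
                direction = -1
                level -= step_db
        else:                         # any miss: step up
            n_correct = 0
            if direction == -1:
                reversals.append(level)
            direction = +1
            level += step_db
    return sum(reversals) / len(reversals)  # threshold = mean reversal level

# Masking release (dB) = threshold in the steady 65-dB masker minus the
# threshold in the modulated masker:
# masking_release_db = two_down_one_up(respond_steady65) - two_down_one_up(respond_modulated)
```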
15
Stilp CE. Auditory enhancement and spectral contrast effects in speech perception. J Acoust Soc Am 2019; 146:1503. PMID: 31472539; DOI: 10.1121/1.5120181.
Abstract
The auditory system is remarkably sensitive to changes in the acoustic environment. This is exemplified by two classic effects of preceding spectral context on perception. In auditory enhancement effects (EEs), the absence and subsequent insertion of a frequency component increases its salience. In spectral contrast effects (SCEs), spectral differences between earlier and later (target) sounds are perceptually magnified, biasing target sound categorization. These effects have been suggested to be related, but have largely been studied separately. Here, EEs and SCEs are demonstrated using the same speech materials. In Experiment 1, listeners categorized vowels (/ɪ/-/ɛ/) or consonants (/d/-/g/) following a sentence processed by a bandpass or bandstop filter (vowel tasks: 100-400 or 550-850 Hz; consonant tasks: 1700-2700 or 2700-3700 Hz). Bandpass filtering produced SCEs and bandstop filtering produced EEs, with effect magnitudes significantly correlated at the individual differences level. In Experiment 2, context sentences were processed by variable-depth notch filters in these frequency regions (-5 to -20 dB). EE magnitudes increased at larger notch depths, growing linearly in consonant categorization. This parallels previous research where SCEs increased linearly for larger spectral peaks in the context sentence. These results link EEs and SCEs, as both shape speech categorization in orderly ways.
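A variable-depth notch filter of the kind used in Experiment 2 can be approximated by attenuating a frequency band in the FFT domain. The sketch below is a simplification (abrupt band edges, no smoothing), with the band limits and notch depth passed as parameters.

```python
import numpy as np

def notch_filter(x, fs, lo, hi, depth_db):
    """Attenuate the [lo, hi] Hz band of signal x by depth_db (e.g., -5 to
    -20 dB) via FFT-domain gain shaping; band edges are abrupt here."""
    X = np.fft.rfft(x)
    freqs = np.fft.rfftfreq(len(x), d=1.0 / fs)
    gain = np.ones_like(freqs)
    gain[(freqs >= lo) & (freqs <= hi)] = 10.0 ** (depth_db / 20.0)
    return np.fft.irfft(X * gain, n=len(x))
```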
Affiliation(s)
- Christian E Stilp
- 317 Life Sciences Building, University of Louisville, Louisville, Kentucky 40292, USA
16
Perceptual sensitivity to spectral properties of earlier sounds during speech categorization. Atten Percept Psychophys 2018; 80:1300-1310. PMID: 29492759; DOI: 10.3758/s13414-018-1488-9.
Abstract
Speech perception is heavily influenced by surrounding sounds. When spectral properties differ between earlier (context) and later (target) sounds, this can produce spectral contrast effects (SCEs) that bias perception of later sounds. For example, when context sounds have more energy in low-F1 frequency regions, listeners report more high-F1 responses to a target vowel, and vice versa. SCEs have been reported using various approaches for a wide range of stimuli, but most often, large spectral peaks were added to the context to bias speech categorization. This obscures the lower limit of perceptual sensitivity to spectral properties of earlier sounds, i.e., when SCEs begin to bias speech categorization. Listeners categorized vowels (/ɪ/-/ɛ/, Experiment 1) or consonants (/d/-/g/, Experiment 2) following a context sentence with little spectral amplification (+1 to +4 dB) in frequency regions known to produce SCEs. In both experiments, +3 and +4 dB amplification in key frequency regions of the context produced SCEs, but lesser amplification was insufficient to bias performance. This establishes a lower limit of perceptual sensitivity where spectral differences across sounds can bias subsequent speech categorization. These results are consistent with proposed adaptation-based mechanisms that potentially underlie SCEs in auditory perception. SIGNIFICANCE STATEMENT Recent sounds can change what speech sounds we hear later. This can occur when the average frequency composition of earlier sounds differs from that of later sounds, biasing how they are perceived. These "spectral contrast effects" are widely observed when sounds' frequency compositions differ substantially. We reveal the lower limit of these effects, as +3 dB amplification of key frequency regions in earlier sounds was enough to bias categorization of the following vowel or consonant sound. Speech categorization being biased by very small spectral differences across sounds suggests that spectral contrast effects occur frequently in everyday speech perception.
17
Stilp CE, Assgari AA. Consonant categorization exhibits a graded influence of surrounding spectral context. J Acoust Soc Am 2017; 141:EL153. PMID: 28253661; DOI: 10.1121/1.4974769.
Abstract
When spectral properties differ across successive sounds, this difference is perceptually magnified, resulting in spectral contrast effects (SCEs). Recently, Stilp, Anderson, and Winn [(2015) J. Acoust. Soc. Am. 137(6), 3466-3476] revealed that SCEs are graded: more prominent spectral peaks in preceding sounds produced larger SCEs (i.e., category boundary shifts) in categorization of subsequent vowels. Here, a similar relationship between spectral context and SCEs was replicated in categorization of voiced stop consonants. By generalizing this relationship across consonants and vowels, different spectral cues, and different frequency regions, acute and graded sensitivity to spectral context appears to be pervasive in speech perception.
Affiliation(s)
- Christian E Stilp
- Department of Psychological and Brain Sciences, University of Louisville, Louisville, Kentucky 40292, USA
- Ashley A Assgari
- Department of Psychological and Brain Sciences, University of Louisville, Louisville, Kentucky 40292, USA
18
El Boghdady N, Kegel A, Lai WK, Dillier N. A neural-based vocoder implementation for evaluating cochlear implant coding strategies. Hear Res 2016; 333:136-149. PMID: 26775182; DOI: 10.1016/j.heares.2016.01.005.
Abstract
Most simulations of cochlear implant (CI) coding strategies rely on standard vocoders that are based on purely signal processing techniques. However, these models neither account for various biophysical phenomena, such as neural stochasticity and refractoriness, nor for effects of electrical stimulation, such as spectral smearing as a function of stimulus intensity. In this paper, a neural model that accounts for stochastic firing, parasitic spread of excitation across neuron populations, and neuronal refractoriness, was developed and augmented as a preprocessing stage for a standard 22-channel noise-band vocoder. This model was used to subjectively and objectively assess consonant discrimination in commercial and experimental coding strategies. Stimuli consisting of consonant-vowel (CV) and vowel-consonant-vowel (VCV) tokens were processed by either the Advanced Combination Encoder (ACE) or the Excitability Controlled Coding (ECC) strategies, and later resynthesized to audio using the aforementioned vocoder model. Baseline performance was measured using unprocessed versions of the speech tokens. Behavioural responses were collected from seven normal hearing (NH) volunteers, while EEG data were recorded from five NH participants. Psychophysical results indicate that while there may be a difference in consonant perception between the two tested coding strategies, mismatch negativity (MMN) waveforms do not show any marked trends in CV or VCV contrast discrimination.
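The back end described here is a conventional noise-band vocoder. A minimal version, without the neural preprocessing stage the paper adds, is sketched below; the 22-channel count follows the abstract, while the band edges, filter order, and omission of envelope smoothing are simplifying assumptions (the sampling rate must be well above twice the highest band edge).

```python
import numpy as np
from scipy.signal import butter, hilbert, sosfiltfilt

def noise_vocoder(x, fs, n_channels=22, f_lo=100.0, f_hi=7000.0):
    """Minimal noise-band vocoder: log-spaced analysis bands, Hilbert
    envelopes, envelope-modulated band-limited noise carriers."""
    edges = np.geomspace(f_lo, f_hi, n_channels + 1)
    noise = np.random.randn(len(x))
    out = np.zeros_like(x)
    for lo, hi in zip(edges[:-1], edges[1:]):
        sos = butter(4, [lo, hi], btype="bandpass", fs=fs, output="sos")
        env = np.abs(hilbert(sosfiltfilt(sos, x)))  # channel envelope
        out += env * sosfiltfilt(sos, noise)        # modulate a noise carrier
    return out / np.max(np.abs(out))
```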
Affiliation(s)
- Nawal El Boghdady
- Institute for Neuroinformatics (INI), Universität Zürich (UZH)/ETH Zürich (ETHZ), Zürich, Switzerland
- Andrea Kegel
- Laboratory of Experimental Audiology, ENT Department, Universitätsspital Zürich (USZ), Zürich, Switzerland
- Wai Kong Lai
- Laboratory of Experimental Audiology, ENT Department, Universitätsspital Zürich (USZ), Zürich, Switzerland
- Norbert Dillier
- Laboratory of Experimental Audiology, ENT Department, Universitätsspital Zürich (USZ), Zürich, Switzerland
19
Stasenko A, Bonn C, Teghipco A, Garcea FE, Sweet C, Dombovy M, McDonough J, Mahon BZ. A causal test of the motor theory of speech perception: a case of impaired speech production and spared speech perception. Cogn Neuropsychol 2015; 32:38-57. PMID: 25951749; DOI: 10.1080/02643294.2015.1035702.
Abstract
The debate about the causal role of the motor system in speech perception has been reignited by demonstrations that motor processes are engaged during the processing of speech sounds. Here, we evaluate which aspects of auditory speech processing are affected, and which are not, in a stroke patient with dysfunction of the speech motor system. We found that the patient showed a normal phonemic categorical boundary when discriminating two non-words that differ by a minimal pair (e.g., ADA-AGA). However, using the same stimuli, the patient was unable to identify or label the non-word stimuli (using a button-press response). A control task showed that he could identify speech sounds by speaker gender, ruling out a general labelling impairment. These data suggest that while the motor system is not causally involved in perception of the speech signal, it may be used when other cues (e.g., meaning, context) are not available.
Affiliation(s)
- Alena Stasenko
- Department of Brain & Cognitive Sciences, University of Rochester, Rochester, NY, USA
20
Humphries C, Sabri M, Lewis K, Liebenthal E. Hierarchical organization of speech perception in human auditory cortex. Front Neurosci 2014; 8:406. PMID: 25565939; PMCID: PMC4263085; DOI: 10.3389/fnins.2014.00406.
Abstract
Human speech consists of a variety of articulated sounds that vary dynamically in spectral composition. We investigated the neural activity associated with the perception of two types of speech segments: (a) the period of rapid spectral transition occurring at the beginning of a stop-consonant vowel (CV) syllable and (b) the subsequent spectral steady-state period occurring during the vowel segment of the syllable. Functional magnetic resonance imaging (fMRI) was recorded while subjects listened to a series of synthesized CV syllables and non-phonemic control sounds. Adaptation to specific sound features was measured by varying either the transition or steady-state periods of the synthesized sounds. Two spatially distinct brain areas in the superior temporal cortex were found that were sensitive to either the type of adaptation or the type of stimulus. In a relatively large section of the bilateral dorsal superior temporal gyrus (STG), activity varied as a function of adaptation type regardless of whether the stimuli were phonemic or non-phonemic. Immediately adjacent to this region, in a more limited area of the ventral STG, increased activity was observed for phonemic trials compared to non-phonemic trials; however, no adaptation effects were found. In addition, a third area in the bilateral medial superior temporal plane showed increased activity to non-phonemic compared to phonemic sounds. The results suggest a multi-stage hierarchical stream for speech sound processing extending ventrolaterally from the superior temporal plane to the superior temporal sulcus. At successive stages in this hierarchy, neurons code for increasingly more complex spectrotemporal features. At the same time, these representations become more abstracted from the original acoustic form of the sound.
Affiliation(s)
- Colin Humphries
- Department of Neurology, Medical College of Wisconsin, Milwaukee, WI, USA
- Merav Sabri
- Department of Neurology, Medical College of Wisconsin, Milwaukee, WI, USA
- Kimberly Lewis
- Department of Neurology, Medical College of Wisconsin, Milwaukee, WI, USA
- Einat Liebenthal
- Department of Neurology, Medical College of Wisconsin, Milwaukee, WI, USA; Department of Psychiatry, Brigham and Women's Hospital, Boston, MA, USA
21
May PJC, Tiitinen H. Temporal binding of sound emerges out of anatomical structure and synaptic dynamics of auditory cortex. Front Comput Neurosci 2013; 7:152. PMID: 24223549; PMCID: PMC3819594; DOI: 10.3389/fncom.2013.00152.
Abstract
The ability to represent and recognize naturally occurring sounds such as speech depends not only on spectral analysis carried out by the subcortical auditory system but also on the ability of the cortex to bind spectral information over time. In primates, these temporal binding processes are mirrored as selective responsiveness of neurons to species-specific vocalizations. Here, we used computational modeling of auditory cortex to investigate how selectivity to spectrally and temporally complex stimuli is achieved. A set of 208 microcolumns were arranged in the serial core-belt-parabelt structure documented in both humans and animals. Stimulus material comprised multiple consonant-vowel (CV) pseudowords. Selectivity to the spectral structure of the sounds was commonly found in all regions of the model (N = 122 columns out of 208), and this selectivity was only weakly affected by manipulating the structure and dynamics of the model. In contrast, temporal binding was rarer (N = 39), found mostly in the belt and parabelt regions. Thus, the serial core-belt-parabelt structure of auditory cortex is necessary for temporal binding. Further, adaptation due to synaptic depression, which renders the cortical network malleable by stimulus history, was crucial for the emergence of neurons sensitive to the temporal structure of the stimuli. Both spectral selectivity and temporal binding required that a sufficient proportion of the columns interacted in an inhibitory manner. The model and its structural modifications had a small-world structure (i.e., columns formed clusters and were within short node-to-node distances from each other). However, simulations showed that a small-world structure is not a necessary condition for spectral selectivity and temporal binding to emerge. In summary, this study suggests that temporal binding arises out of (1) the serial structure typical of the auditory cortex, (2) synaptic adaptation, and (3) inhibitory interactions between microcolumns.
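The synaptic depression mechanism highlighted here can be illustrated with a standard resource-based update (in the style of Tsodyks-Markram dynamics); the paper's exact formulation may differ, and all parameter values below are assumptions.

```python
import numpy as np

def depressing_synapse(spikes, dt=1e-3, U=0.5, tau_rec=0.8):
    """Resource-based short-term depression: each presynaptic spike releases
    a fraction U of the available resources r, which recover with time
    constant tau_rec (s). spikes: binary array sampled at dt (s)."""
    r = 1.0
    out = np.zeros(len(spikes))
    for i, s in enumerate(spikes):
        r += (1.0 - r) * dt / tau_rec  # recovery toward the full resource pool
        if s:
            out[i] = U * r             # transmitted efficacy for this spike
            r -= U * r                 # depletion after release
    return out
```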
Affiliation(s)
- Patrick J. C. May
- Brain and Mind Laboratory, Department of Biomedical Engineering and Computational Science, School of Science, Aalto University, Aalto, Finland