1. de la Cruz-Pavía I, Hegde M, Cabrera L, Nazzi T. Infants' abilities to segment word forms from spectrally degraded speech in the first year of life. Dev Sci 2024; 27:e13533. [PMID: 38853379 DOI: 10.1111/desc.13533]
Abstract
Infants begin to segment word forms from fluent speech-a crucial task in lexical processing-between 4 and 7 months of age. Prior work has established that infants rely on a variety of cues available in the speech signal (i.e., prosodic, statistical, acoustic-segmental, and lexical) to accomplish this task. In two experiments with French-learning 6- and 10-month-olds, we use a psychoacoustic approach to examine if and how degradation of the two fundamental acoustic components extracted from speech by the auditory system, namely, temporal (both frequency and amplitude modulation) and spectral information, impacts word form segmentation. Infants were familiarized with passages containing target words in which frequency modulation (FM) information was replaced with pure tones using a vocoder, while amplitude modulation (AM) was preserved in either 8 or 16 spectral bands. Infants were then tested on their recognition of the target versus novel control words. While the 6-month-olds were unable to segment in either condition, the 10-month-olds succeeded, although only in the 16-spectral-band condition. These findings suggest that 6-month-olds need FM temporal cues for speech segmentation while 10-month-olds do not, although they need the AM cues to be presented in enough spectral bands (i.e., 16). This developmental change observed in infants' sensitivity to spectrotemporal cues likely results from an increase in the range of available segmentation procedures, and/or a shift from a vowel to a consonant bias in lexical processing between the two ages, as vowels are more affected by our acoustic manipulations. RESEARCH HIGHLIGHTS:
- Although segmenting speech into word forms is crucial for lexical acquisition, the acoustic information that infants' auditory system extracts to process continuous speech remains unknown.
- We examined infants' sensitivity to spectrotemporal cues in speech segmentation using vocoded speech, and revealed a developmental change between 6 and 10 months of age.
- We showed that FM information, that is, the fast temporal modulations of speech, is necessary for 6- but not 10-month-old infants to segment word forms.
- Moreover, reducing the number of spectral bands affects 10-month-olds' segmentation: they succeed when 16 bands are preserved but fail with 8 bands.
Affiliation(s)
- Irene de la Cruz-Pavía
- Faculty of Social and Human Sciences, Universidad de Deusto, Bilbao, Spain
- Basque Foundation for Science Ikerbasque, Bilbao, Spain
- Monica Hegde
- INCC UMR 8002, CNRS, F-75006, Université Paris Cité, Paris, France
- Thierry Nazzi
- INCC UMR 8002, CNRS, F-75006, Université Paris Cité, Paris, France
2. Mallikarjun A, Shroads E, Newman RS. Perception of vocoded speech in domestic dogs. Anim Cogn 2024; 27:34. [PMID: 38625429 PMCID: PMC11021312 DOI: 10.1007/s10071-024-01869-3]
Abstract
Humans have an impressive ability to comprehend signal-degraded speech; however, the extent to which comprehension of degraded speech relies on human-specific features of speech perception vs. more general cognitive processes is unknown. Since dogs live alongside humans and regularly hear speech, they can be used as a model to differentiate between these possibilities. One often-studied type of degraded speech is noise-vocoded speech (sometimes thought of as cochlear-implant-simulation speech). Noise-vocoded speech is made by dividing the speech signal into frequency bands (channels), identifying the amplitude envelope of each individual band, and then using these envelopes to modulate bands of noise centered over the same frequency regions - the result is a signal with preserved temporal cues, but vastly reduced frequency information. Here, we tested dogs' recognition of familiar words produced in 16-channel vocoded speech. In the first study, dogs heard their names and unfamiliar dogs' names (foils) in vocoded speech as well as natural speech. In the second study, dogs heard 16-channel vocoded speech only. Dogs listened longer to their vocoded name than vocoded foils in both experiments, showing that they can comprehend a 16-channel vocoded version of their name without prior exposure to vocoded speech, and without immediate exposure to the natural-speech version of their name. Dogs' name recognition in the second study was mediated by the number of phonemes in the dogs' name, suggesting that phonological context plays a role in degraded speech comprehension.
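The vocoding recipe this abstract describes (band-split, envelope extraction, noise modulation) maps directly onto a few lines of signal processing. Below is a minimal sketch of a noise vocoder in Python (NumPy/SciPy); it is not the authors' implementation, and the log-spaced band edges, filter orders, and 50 Hz envelope cutoff are illustrative assumptions.

```python
import numpy as np
from scipy.signal import butter, sosfiltfilt, hilbert

def noise_vocode(x, fs, n_channels=16, f_lo=100.0, f_hi=8000.0, env_cutoff=50.0):
    """Divide x into bands, extract each band's amplitude envelope, and use
    the envelopes to modulate noise filtered into the same bands."""
    edges = np.geomspace(f_lo, f_hi, n_channels + 1)   # log-spaced band edges
    env_sos = butter(2, env_cutoff, btype="low", fs=fs, output="sos")
    noise = np.random.default_rng(0).standard_normal(len(x))
    out = np.zeros(len(x))
    for lo, hi in zip(edges[:-1], edges[1:]):
        band_sos = butter(4, [lo, hi], btype="band", fs=fs, output="sos")
        band = sosfiltfilt(band_sos, x)                      # analysis band
        env = sosfiltfilt(env_sos, np.abs(hilbert(band)))    # smoothed envelope
        out += np.clip(env, 0, None) * sosfiltfilt(band_sos, noise)
    # match the output level to the input
    return out * np.sqrt(np.mean(x**2) / (np.mean(out**2) + 1e-12))
```

The result preserves each band's temporal envelope while discarding fine spectral detail, exactly the trade-off the study exploits.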
Affiliation(s)
- Amritha Mallikarjun
- Penn Vet Working Dog Center, University of Pennsylvania School of Veterinary Medicine, Philadelphia, USA.
- Emily Shroads
- Department of Hearing and Speech Sciences, University of Maryland, College Park, USA
- Rochelle S Newman
- Department of Hearing and Speech Sciences, University of Maryland, College Park, USA
3. Cychosz M, Winn MB, Goupell MJ. How to vocode: Using channel vocoders for cochlear-implant research. J Acoust Soc Am 2024; 155:2407-2437. [PMID: 38568143 PMCID: PMC10994674 DOI: 10.1121/10.0025274]
Abstract
The channel vocoder has become a useful tool to understand the impact of specific forms of auditory degradation-particularly the spectral and temporal degradation that reflect cochlear-implant processing. Vocoders have many parameters that allow researchers to answer questions about cochlear-implant processing in ways that overcome some logistical complications of controlling for factors in individual cochlear implant users. However, there is such a large variety in the implementation of vocoders that the term "vocoder" is not specific enough to describe the signal processing used in these experiments. Misunderstanding vocoder parameters can result in experimental confounds or unexpected stimulus distortions. This paper highlights the signal processing parameters that should be specified when describing vocoder construction. The paper also provides guidance on how to determine vocoder parameters within perception experiments, given the experimenter's goals and research questions, to avoid common signal processing mistakes. Throughout, we will assume that experimenters are interested in vocoders with the specific goal of better understanding cochlear implants.
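As a concrete illustration of the kind of parameter reporting the paper calls for, the sketch below assembles a hypothetical vocoder specification, with band edges spaced via the standard Greenwood (1990) human cochlear map. The specific values are examples, not recommendations from the paper.

```python
import numpy as np

def greenwood_edges(n_channels, f_lo=200.0, f_hi=7000.0, A=165.4, a=2.1, k=0.88):
    """Corner frequencies spaced evenly along the human cochlea, using
    Greenwood (1990): F(x) = A * (10**(a*x) - k) for position x in [0, 1]."""
    pos = lambda f: np.log10(f / A + k) / a            # invert the map
    x = np.linspace(pos(f_lo), pos(f_hi), n_channels + 1)
    return A * (10.0 ** (a * x) - k)

# One example of the parameter set a vocoder description should pin down
# (values are illustrative, not prescriptions from the paper):
vocoder_spec = {
    "n_channels": 8,
    "band_edges_hz": greenwood_edges(8).round(1),      # and how they were spaced
    "analysis_filters": "4th-order Butterworth, applied zero-phase",
    "envelope_extraction": "Hilbert magnitude, 50 Hz low-pass",
    "carrier": "band-limited noise",                   # vs. sine carriers
    "synthesis_slope_db_per_octave": 60,               # steeper = less overlap
}
```

Leaving any of these choices unstated is what makes the bare term "vocoder" ambiguous across studies.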
Affiliation(s)
- Margaret Cychosz
- Department of Linguistics, University of California, Los Angeles, Los Angeles, California 90095, USA
- Matthew B Winn
- Department of Speech-Language-Hearing Sciences, University of Minnesota, Minneapolis, Minnesota 55455, USA
- Matthew J Goupell
- Department of Hearing and Speech Sciences, University of Maryland, College Park, College Park, Maryland 20742, USA
4. Sekine K, Özyürek A. Children benefit from gestures to understand degraded speech but to a lesser extent than adults. Front Psychol 2024; 14:1305562. [PMID: 38303780 PMCID: PMC10832995 DOI: 10.3389/fpsyg.2023.1305562]
Abstract
The present study investigated to what extent children, compared to adults, benefit from gestures to disambiguate degraded speech by manipulating speech signals and manual modality. Dutch-speaking adults (N = 20) and 6- and 7-year-old children (N = 15) were presented with a series of video clips in which an actor produced a Dutch action verb with or without an accompanying iconic gesture. Participants were then asked to repeat what they had heard. The speech signal was either clear or altered into 4- or 8-band noise-vocoded speech. Children had more difficulty than adults in disambiguating degraded speech in the speech-only condition. However, when presented with both speech and gestures, children reached a level of accuracy comparable to that of adults in the degraded-speech-only condition. Furthermore, for adults, the enhancement from gestures was greater in the 4-band condition than in the 8-band condition, whereas children showed the opposite pattern. Gestures help children to disambiguate degraded speech, but children need more phonological information than adults to benefit from the use of gestures. Children's multimodal language integration needs to develop further to adapt flexibly to challenging situations such as degraded speech, as tested in our study, or instances where speech is heard in environmental noise or through a face mask.
Affiliation(s)
- Kazuki Sekine
- Faculty of Human Sciences, Waseda University, Tokorozawa, Japan
- Aslı Özyürek
- Centre for Language Studies, Radboud University, Nijmegen, Netherlands
- Max Planck Institute for Psycholinguistics, Nijmegen, Netherlands
- Donders Institute for Brain, Cognition and Behaviour, Radboud University, Nijmegen, Netherlands
5. Xu N, Zhao B, Luo L, Zhang K, Shao X, Luan G, Wang Q, Hu W, Wang Q. Two stages of speech envelope tracking in human auditory cortex modulated by speech intelligibility. Cereb Cortex 2023; 33:2215-2228. [PMID: 35695785 DOI: 10.1093/cercor/bhac203]
Abstract
The envelope is essential for speech perception. Recent studies have shown that cortical activity can track the acoustic envelope. However, whether the tracking strength reflects the extent of speech intelligibility processing remains controversial. Here, using stereo-electroencephalography, we directly recorded activity in human auditory cortex while subjects listened to either natural or noise-vocoded speech. These 2 stimuli have approximately identical envelopes, but the noise-vocoded speech is unintelligible. According to the tracking lags, we revealed 2 stages of envelope tracking: an early high-γ (60-140 Hz) power stage that preferred the noise-vocoded speech and a late θ (4-8 Hz) phase stage that preferred the natural speech. Furthermore, the decoding performance of high-γ power was better in primary auditory cortex than in nonprimary auditory cortex, consistent with its short tracking delay, while θ phase showed better decoding performance in right auditory cortex. In addition, high-γ responses with sustained temporal profiles in nonprimary auditory cortex were dominant in both envelope tracking and decoding. In sum, we suggest a functional dissociation between high-γ power and θ phase: the former reflects fast and automatic processing of brief acoustic features, while the latter correlates with slow build-up processing facilitated by speech intelligibility.
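To make the notion of a "tracking lag" concrete, here is a hedged sketch of one common analysis style: extract a band-limited neural feature (e.g., a high-γ amplitude, power-like signal) and find the lag at which it best correlates with the speech envelope. This is a generic illustration, not the authors' pipeline; θ-phase analyses, in particular, properly require circular statistics rather than the linear correlation used here.

```python
import numpy as np
from scipy.signal import butter, sosfiltfilt, hilbert

def band_feature(sig, fs, lo, hi):
    """Band-limit a signal and return its analytic amplitude,
    e.g., (60, 140) for a high-gamma power-like feature."""
    sos = butter(4, [lo, hi], btype="band", fs=fs, output="sos")
    return np.abs(hilbert(sosfiltfilt(sos, sig)))

def tracking_lag(envelope, neural, fs, max_lag_s=0.5):
    """Lag (in s) at which the neural feature best correlates with the
    speech envelope; longer lags suggest later-stage tracking."""
    env = (envelope - envelope.mean()) / envelope.std()
    neu = (neural - neural.mean()) / neural.std()
    lags = np.arange(1, int(max_lag_s * fs))
    r = [np.corrcoef(env[:-L], neu[L:])[0, 1] for L in lags]
    return lags[int(np.argmax(r))] / fs

# e.g., lag for a high-gamma amplitude feature (seeg and speech_env assumed):
# tracking_lag(speech_env, band_feature(seeg, fs, 60, 140), fs)
```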
Affiliation(s)
- Na Xu
- Department of Neurology, Beijing Tiantan Hospital, Capital Medical University, No. 119 South Fourth Ring West Road, Fengtai District, Beijing 100070, China
- National Clinical Research Center for Neurological Diseases, No. 119 South Fourth Ring West Road, Fengtai District, Beijing 100070, China
- Baotian Zhao
- Department of Neurosurgery, Beijing Tiantan Hospital, Capital Medical University, No. 119 South Fourth Ring West Road, Fengtai District, Beijing 100070, China
- Lu Luo
- School of Psychology, Beijing Sport University, No. 48 Xinxi Road, Haidian District, Beijing 100084, China
- Kai Zhang
- Department of Neurosurgery, Beijing Tiantan Hospital, Capital Medical University, No. 119 South Fourth Ring West Road, Fengtai District, Beijing 100070, China
- Xiaoqiu Shao
- Department of Neurology, Beijing Tiantan Hospital, Capital Medical University, No. 119 South Fourth Ring West Road, Fengtai District, Beijing 100070, China
- Guoming Luan
- Beijing Key Laboratory of Epilepsy, Epilepsy Center, Sanbo Brain Hospital, Capital Medical University, No. 50 Yikesong Xiangshan Road, Haidian District, Beijing 100093, China
- Beijing Institute of Brain Disorders, Collaborative Innovation Center for Brain Disorders, Capital Medical University, No. 10 Xitoutiao, You An Men, Beijing 100069, China
- Qian Wang
- Beijing Key Laboratory of Epilepsy, Epilepsy Center, Sanbo Brain Hospital, Capital Medical University, No. 50 Yikesong Xiangshan Road, Haidian District, Beijing 100093, China
- School of Psychological and Cognitive Sciences, Beijing Key Laboratory of Behavior and Mental Health, Peking University, No. 5 Yiheyuan Road, Haidian District, Beijing 100871, China
- IDG/McGovern Institute for Brain Research, Peking University, No. 5 Yiheyuan Road, Haidian District, Beijing 100871, China
- Wenhan Hu
- Beijing Neurosurgical Institute, Capital Medical University, No. 119 South Fourth Ring West Road, Fengtai District, Beijing 100070, China
- Qun Wang
- Department of Neurology, Beijing Tiantan Hospital, Capital Medical University, No. 119 South Fourth Ring West Road, Fengtai District, Beijing 100070, China
- National Clinical Research Center for Neurological Diseases, No. 119 South Fourth Ring West Road, Fengtai District, Beijing 100070, China
- Beijing Institute of Brain Disorders, Collaborative Innovation Center for Brain Disorders, Capital Medical University, No. 10 Xitoutiao, You An Men, Beijing 100069, China
6. Noble AR, Resnick J, Broncheau M, Klotz S, Rubinstein JT, Werner LA, Horn DL. Spectrotemporal Modulation Discrimination in Infants With Normal Hearing. Ear Hear 2023; 44:109-117. [PMID: 36218270 PMCID: PMC9780152 DOI: 10.1097/aud.0000000000001277]
Abstract
OBJECTIVES Spectral resolution correlates with speech understanding in post-lingually deafened adults with cochlear implants (CIs) and is proposed as a non-linguistic measure of device efficacy in implanted infants. However, spectral resolution develops gradually through adolescence regardless of hearing status. Spectral resolution relies on two different factors that mature at markedly different rates: resolution of ripple peaks (frequency resolution) matures during infancy, whereas sensitivity to across-spectrum intensity modulation (spectral modulation sensitivity) matures by age 12. Investigation of spectral resolution as a clinical measure for implanted infants requires understanding how each factor develops and constrains speech understanding with a CI. This study addresses three limitations of the present literature. First, the paucity of relevant data requires replication and generalization across measures of spectral resolution. Second, criticism that previously used measures of spectral resolution may reflect non-spectral cues needs to be addressed. Third, rigorous behavioral measurement of spectral resolution in individual infants is limited by attrition. To address these limitations, we measured discrimination of spectrally modulated, or rippled, sounds at two modulation depths in normal-hearing (NH) infants and adults. Non-spectral cues were limited by constructing stimuli with spectral envelopes that change in phase across time. Pilot testing suggested that dynamic spectral envelope stimuli appeared to hold infants' attention and lengthen habituation time relative to previously used static ripple stimuli. A post-hoc condition was added to ensure that the stimulus noise carrier was not obscuring age differences in spectral resolution. The degree of improvement in discrimination at higher ripple depth represents spectral frequency resolution independent of the overall threshold. It was hypothesized that adults would have better thresholds than infants but that both groups would show similar effects of modulation depth. DESIGN Participants were 53 6- to 7-month-old infants and 23 adults with NH with no risk factors for hearing loss who passed bilateral otoacoustic emissions screening. Stimuli were created from complexes with 33 or 100 tones per octave, amplitude-modulated across frequency and time with constant 5 Hz envelope phase-drift and spectral ripple density from 1 to 20 ripples per octave (RPO). An observer-based, single-interval procedure measured the highest RPO (1 to 19) a listener could discriminate from a 20 RPO stimulus. Age group and stimulus pure-tone complex were between-subjects variables, whereas modulation depth (10 or 20 dB) was within-subjects. Linear mixed-model analysis was used to test for the significance of the main effects and interactions. RESULTS All adults and 94% of infants provided ripple density thresholds at both modulation depths. The upper range of threshold approached 17 RPO in the 100-tones/octave, 20 dB depth condition. As expected, mean threshold was significantly better with the 100-tones/octave complex than with the 33-tones/octave complex, better in adults than in infants, and better at 20 dB than at 10 dB modulation depth. None of the interactions reached significance, suggesting that the effect of modulation depth on threshold did not differ between infants and adults. CONCLUSIONS Spectral ripple discrimination can be measured in infants with minimal listener attrition using dynamic ripple stimuli. Results are consistent with previous findings that spectral resolution is immature in infancy due to immature spectral modulation sensitivity rather than frequency resolution.
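Dynamic ripple stimuli of the kind described above can be synthesized along the following lines. This is an illustrative sketch assuming a sinusoidal (in dB, across log-frequency) spectral envelope whose ripple phase drifts at a constant rate; the tone count, frequency range, and level calibration of the study's actual stimuli may differ.

```python
import numpy as np

def dynamic_ripple(dur=0.5, fs=44100, tones_per_oct=100, f0=200.0, n_oct=5,
                   ripples_per_oct=8.0, depth_db=20.0, drift_hz=5.0, seed=0):
    """Tone complex whose spectral envelope is a dB-scale sinusoid across
    log-frequency (ripples_per_oct), with ripple phase drifting at drift_hz."""
    t = np.arange(int(dur * fs)) / fs
    rng = np.random.default_rng(seed)
    sig = np.zeros_like(t)
    for i in range(n_oct * tones_per_oct):
        x = i / tones_per_oct                      # octaves above f0
        f = f0 * 2.0 ** x
        # depth_db peak-to-trough ripple; phase drifts in time at drift_hz
        env_db = 0.5 * depth_db * np.sin(2 * np.pi * (ripples_per_oct * x
                                                      + drift_hz * t))
        sig += 10 ** (env_db / 20) * np.sin(2 * np.pi * f * t
                                            + rng.uniform(0, 2 * np.pi))
    return sig / np.max(np.abs(sig))
```

A discrimination trial then contrasts a low-density stimulus (e.g., 8 RPO) with the 20 RPO reference.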
Affiliation(s)
- Anisha R. Noble
- Department of Otolaryngology – Head and Neck Surgery, University of Washington, Seattle, WA
- Jesse Resnick
- Department of Otolaryngology – Head and Neck Surgery, University of Washington, Seattle, WA
- Mariette Broncheau
- Department of Otolaryngology – Head and Neck Surgery, University of Washington, Seattle, WA
- Stephanie Klotz
- Department of Speech and Hearing Sciences, University of Washington, Seattle, WA
- Jay T. Rubinstein
- Department of Otolaryngology – Head and Neck Surgery, University of Washington, Seattle, WA
- Lynne A. Werner
- Department of Otolaryngology – Head and Neck Surgery, University of Washington, Seattle, WA
- Department of Speech and Hearing Sciences, University of Washington, Seattle, WA
- David L. Horn
- Department of Otolaryngology – Head and Neck Surgery, University of Washington, Seattle, WA
- Department of Speech and Hearing Sciences, University of Washington, Seattle, WA
7. Pourhashemi F, Baart M, van Laarhoven T, Vroomen J. Want to quickly adapt to distorted speech and become a better listener? Read lips, not text. PLoS One 2022; 17:e0278986. [PMID: 36580461 PMCID: PMC9799298 DOI: 10.1371/journal.pone.0278986]
Abstract
When listening to distorted speech, does one become a better listener by looking at the face of the speaker or by reading subtitles that are presented along with the speech signal? We examined this question in two experiments in which we presented participants with spectrally distorted speech (4-channel noise-vocoded speech). During short training sessions, listeners received auditorily distorted words or pseudowords that were partially disambiguated by concurrently presented lipread information or text. After each training session, listeners were tested with new degraded auditory words. Learning effects (based on proportions of correctly identified words) were stronger if listeners had trained with words rather than with pseudowords (a lexical boost), and adding lipread information during training was more effective than adding text (a lipread boost). Moreover, the advantage of lipread speech over text training was also found when participants were tested more than a month later. The current results thus suggest that lipread speech may have surprisingly long-lasting effects on adaptation to distorted speech.
Affiliation(s)
- Faezeh Pourhashemi
- Dept. of Cognitive Neuropsychology, Tilburg University, Tilburg, The Netherlands
- Martijn Baart
- Dept. of Cognitive Neuropsychology, Tilburg University, Tilburg, The Netherlands
- BCBL, Basque Center on Cognition, Brain, and Language, Donostia, Spain
- Thijs van Laarhoven
- Dept. of Cognitive Neuropsychology, Tilburg University, Tilburg, The Netherlands
- Jean Vroomen
- Dept. of Cognitive Neuropsychology, Tilburg University, Tilburg, The Netherlands
8. Lahiff NJ, Slocombe KE, Taglialatela J, Dellwo V, Townsend SW. Degraded and computer-generated speech processing in a bonobo. Anim Cogn 2022; 25:1393-1398. [PMID: 35595881 PMCID: PMC9652166 DOI: 10.1007/s10071-022-01621-9]
Abstract
The human auditory system is capable of processing human speech even in situations when it has been heavily degraded, such as during noise-vocoding, when frequency domain-based cues to phonetic content are strongly reduced. This has contributed to arguments that speech processing is highly specialized and likely a de novo evolved trait in humans. Previous comparative research has demonstrated that a language-competent chimpanzee was also capable of recognizing degraded speech, and therefore that the mechanisms underlying speech processing may not be uniquely human. However, to form a robust reconstruction of the evolutionary origins of speech processing, additional data from other closely related ape species are needed. Specifically, such data can help disentangle whether these capabilities evolved independently in humans and chimpanzees, or if they were inherited from our last common ancestor. Here we provide evidence of processing of highly varied (degraded and computer-generated) speech in a language-competent bonobo, Kanzi. We took advantage of Kanzi's existing proficiency with touchscreens and his ability to report his understanding of human speech through interacting with arbitrary symbols called lexigrams. Specifically, we asked Kanzi to recognise both human (natural) and computer-generated forms of 40 highly familiar words that had been degraded (noise-vocoded and sinusoidal forms) using a match-to-sample paradigm. Results suggest that-apart from noise-vocoded computer-generated speech-Kanzi recognised both natural and computer-generated voices that had been degraded, at rates significantly above chance. Kanzi performed better with all forms of natural voice speech compared to computer-generated speech. This work provides additional support for the hypothesis that the processing apparatus necessary to deal with highly variable speech, including, for the first time in nonhuman animals, computer-generated speech, may be at least as old as the last common ancestor we share with bonobos and chimpanzees.
Affiliation(s)
- Nicole J Lahiff
- Department of Psychology, University of York, York, UK
- Center for the Interdisciplinary Study of Language Evolution, University of Zurich, Zurich, Switzerland
- Department of Comparative Language Science, University of Zurich, Zurich, Switzerland
- Jared Taglialatela
- Department of Ecology, Evolution and Organismal Biology, Kennesaw State University, Kennesaw, USA
- Ape Cognition and Conservation Initiative, Des Moines, USA
- Volker Dellwo
- Department of Computational Linguistics, University of Zurich, Zurich, Switzerland
- Center for the Interdisciplinary Study of Language Evolution, University of Zurich, Zurich, Switzerland
- Simon W Townsend
- Center for the Interdisciplinary Study of Language Evolution, University of Zurich, Zurich, Switzerland
- Department of Comparative Language Science, University of Zurich, Zurich, Switzerland
- Department of Psychology, University of Warwick, Coventry, UK
9. Jahn KN, Arenberg JG, Horn DL. Spectral Resolution Development in Children With Normal Hearing and With Cochlear Implants: A Review of Behavioral Studies. J Speech Lang Hear Res 2022; 65:1646-1658. [PMID: 35201848 PMCID: PMC9499384 DOI: 10.1044/2021_jslhr-21-00307]
Abstract
PURPOSE This review article provides a theoretical overview of the development of spectral resolution in children with normal hearing (cNH) and in those who use cochlear implants (CIs), with an emphasis on methodological considerations. The aim was to identify key directions for future research on spectral resolution development in children with CIs. METHOD A comprehensive literature review was conducted to summarize and synthesize previously published behavioral research on spectral resolution development in normal and impaired auditory systems. CONCLUSIONS In cNH, performance on spectral resolution tasks continues to improve through the teenage years and is likely driven by gradual maturation of across-channel intensity resolution. A small but growing body of evidence from children with CIs suggests a more complex relationship between spectral resolution development, patient demographics, and the quality of the CI electrode-neuron interface. Future research should aim to distinguish between the effects of patient-specific variables and the underlying physiology on spectral resolution abilities in children of all ages who are hard of hearing and use auditory prostheses.
Affiliation(s)
- Kelly N. Jahn
- Department of Speech, Language, and Hearing, School of Behavioral and Brain Sciences, The University of Texas at Dallas, Richardson
- Callier Center for Communication Disorders, The University of Texas at Dallas
- Julie G. Arenberg
- Department of Otolaryngology – Head and Neck Surgery, Harvard Medical School, Boston, MA
- Eaton-Peabody Laboratories, Massachusetts Eye and Ear, Boston
- David L. Horn
- Virginia Merrill Bloedel Hearing Research Center, Department of Otolaryngology – Head and Neck Surgery, University of Washington, Seattle
- Division of Otolaryngology, Seattle Children's Hospital, WA
10. Martin IA, Goupell MJ, Huang YT. Children's syntactic parsing and sentence comprehension with a degraded auditory signal. J Acoust Soc Am 2022; 151:699. [PMID: 35232101 PMCID: PMC8816517 DOI: 10.1121/10.0009271]
Abstract
During sentence comprehension, young children anticipate syntactic structures using early-arriving words and have difficulties revising incorrect predictions using late-arriving words. However, nearly all work to date has focused on syntactic parsing in idealized speech environments, and little is known about how children's strategies for predicting and revising meanings are affected by signal degradation. This study compares comprehension of active and passive sentences in natural and vocoded speech. In a word-interpretation task, 5-year-olds inferred the meanings of novel words in sentences that (1) encouraged agent-first predictions (e.g., The blicket is eating the seal implies The blicket is the agent), (2) required revising predictions (e.g., The blicket is eaten by the seal implies The blicket is the theme), or (3) weakened predictions by placing familiar nouns in sentence-initial position (e.g., The seal is eating/eaten by the blicket). When novel words promoted agent-first predictions, children misinterpreted passives as actives, and errors increased with vocoded compared to natural speech. However, when familiar sentence-initial words weakened agent-first predictions, children accurately interpreted passives, with no signal-degradation effects. This demonstrates that signal quality interacts with interpretive processes during sentence comprehension, and that the impacts of speech degradation are greatest when late-arriving information conflicts with predictions.
Affiliation(s)
- Isabel A Martin
- Department of Hearing and Speech Sciences, University of Maryland, College Park, Maryland 20742, USA
- Matthew J Goupell
- Department of Hearing and Speech Sciences, University of Maryland, College Park, Maryland 20742, USA
- Yi Ting Huang
- Department of Hearing and Speech Sciences, University of Maryland, College Park, Maryland 20742, USA
11. Bsharat-Maalouf D, Karawani H. Learning and bilingualism in challenging listening conditions: How challenging can it be? Cognition 2022; 222:105018. [PMID: 35032867 DOI: 10.1016/j.cognition.2022.105018]
Abstract
When speech is presented in their second language (L2), bilinguals have more difficulty with speech perception in noise than monolinguals do. However, how noise affects speech perception of bilinguals in their first language (L1) is still unclear. In addition, it is not clear whether bilinguals' speech perception in challenging listening conditions is specific to the type of degradation, or whether there is a shared mechanism for bilingual speech processing under complex listening conditions. Therefore, the current study examined the speech perception of 60 Arabic-Hebrew bilinguals and a control group of native Hebrew speakers during degraded (speech in noise, vocoded speech) and quiet listening conditions. Between-participant comparisons (comparing native Hebrew speakers' and bilinguals' perceptual performance in L1) and within-participant comparisons (perceptual performance of bilinguals in L1 and L2) were conducted. The findings showed that bilinguals in L1 had more difficulty in noisy conditions than their control counterparts did, even though they performed like controls under favorable listening conditions. However, bilingualism did not hinder language learning mechanisms. Bilinguals in L1 outperformed native Hebrew speakers in the perception of vocoded speech, demonstrating more extended learning processes. Bilinguals' perceptual performance in L1 versus L2 varied by task complexity. Correlation analyses revealed that bilinguals who coped better with noise degradation were more successful in perceiving the vocoding distortion. Together, these results provide insights into the mechanisms that contribute to speech perceptual performance in challenging listening conditions and suggest that bilinguals' language proficiency and age of language acquisition are not the only factors that affect performance. Rather, duration of exposure to languages, co-activation, and the ability to benefit from exposure to novel stimuli appear to affect the perceptual performance of bilinguals, even when operating in their dominant language. Our findings suggest that bilinguals use a shared mechanism for speech processing under challenging listening conditions.
Affiliation(s)
- Dana Bsharat-Maalouf
- Department of Communication Sciences and Disorders, University of Haifa, Haifa, Israel
- Hanin Karawani
- Department of Communication Sciences and Disorders, University of Haifa, Haifa, Israel
12. Mushtaq F, Wiggins IM, Kitterick PT, Anderson CA, Hartley DEH. Investigating Cortical Responses to Noise-Vocoded Speech in Children with Normal Hearing Using Functional Near-Infrared Spectroscopy (fNIRS). J Assoc Res Otolaryngol 2021; 22:703-717. [PMID: 34581879 PMCID: PMC8599557 DOI: 10.1007/s10162-021-00817-z]
Abstract
Whilst functional neuroimaging has been used to investigate cortical processing of degraded speech in adults, much less is known about how these signals are processed in children. An enhanced understanding of cortical correlates of poor speech perception in children would be highly valuable to oral communication applications, including hearing devices. We utilised vocoded speech stimuli to investigate brain responses to degraded speech in 29 normally hearing children aged 6-12 years. Intelligibility of the speech stimuli was altered in two ways by (i) reducing the number of spectral channels and (ii) reducing the amplitude modulation depth of the signal. A total of five different noise-vocoded conditions (with zero, partial or high intelligibility) were presented in an event-related format whilst participants underwent functional near-infrared spectroscopy (fNIRS) neuroimaging. Participants completed a word recognition task during imaging, as well as a separate behavioural speech perception assessment. fNIRS recordings revealed statistically significant sensitivity to stimulus intelligibility across several brain regions. More intelligible stimuli elicited stronger responses in temporal regions, predominantly within the left hemisphere, while right inferior parietal regions showed an opposite, negative relationship. Although there was some evidence that partially intelligible stimuli elicited the strongest responses in the left inferior frontal cortex, a region previous studies have suggested is associated with effortful listening in adults, this effect did not reach statistical significance. These results further our understanding of cortical mechanisms underlying successful speech perception in children. Furthermore, fNIRS holds promise as a clinical technique to help assess speech intelligibility in paediatric populations.
Affiliation(s)
- Faizah Mushtaq
- National Institute for Health Research Nottingham Biomedical Research Centre, Nottingham, NG1 5DU, UK
- Hearing Sciences, Division of Clinical Neuroscience, School of Medicine, University of Nottingham, Nottingham, NG7 2UH, UK
- Ian M Wiggins
- National Institute for Health Research Nottingham Biomedical Research Centre, Nottingham, NG1 5DU, UK
- Hearing Sciences, Division of Clinical Neuroscience, School of Medicine, University of Nottingham, Nottingham, NG7 2UH, UK
- Pádraig T Kitterick
- National Institute for Health Research Nottingham Biomedical Research Centre, Nottingham, NG1 5DU, UK
- Hearing Sciences, Division of Clinical Neuroscience, School of Medicine, University of Nottingham, Nottingham, NG7 2UH, UK
- Carly A Anderson
- Hearing Sciences, Division of Clinical Neuroscience, School of Medicine, University of Nottingham, Nottingham, NG7 2UH, UK
- Douglas E H Hartley
- National Institute for Health Research Nottingham Biomedical Research Centre, Nottingham, NG1 5DU, UK
- Hearing Sciences, Division of Clinical Neuroscience, School of Medicine, University of Nottingham, Nottingham, NG7 2UH, UK
- Nottingham University Hospitals NHS Trust, Nottingham, NG7 2UH, UK
13. Simeon KM, Grieco-Calub TM. The Impact of Hearing Experience on Children's Use of Phonological and Semantic Information During Lexical Access. J Speech Lang Hear Res 2021; 64:2825-2844. [PMID: 34106737 PMCID: PMC8632499 DOI: 10.1044/2021_jslhr-20-00547]
Abstract
Purpose The purpose of this study was to examine the extent to which phonological competition and semantic priming influence lexical access in school-aged children with cochlear implants (CIs) and children with normal acoustic hearing. Method Participants included children who were 5-10 years of age with either normal hearing (n = 41) or bilateral severe to profound sensorineural hearing loss and used CIs (n = 13). All participants completed a two-alternative forced-choice word recognition task while eye gaze to visual images was recorded and quantified. In this task, the target image was juxtaposed with a competitor image that was either a phonological onset competitor (i.e., shared the same initial consonant-vowel-consonant syllable as the target) or an unrelated distractor. Half of the trials were preceded by an image prime that was semantically related to the target image. Results Children with CIs showed evidence of phonological competition during real-time processing of speech. This effect, however, was smaller and occurred later in the time course of speech processing than what was observed in children with normal hearing. The presence of a semantically related visual prime reduced the effects of phonological competition in both groups of children, but to a greater degree in children with CIs. Conclusions Children with CIs were able to process single words similarly to their counterparts with normal hearing. However, children with CIs appeared to rely more on surrounding semantic information than their normal-hearing counterparts.
Affiliation(s)
- Katherine M. Simeon
- Roxelyn and Richard Pepper Department of Communication Sciences and Disorders, Northwestern University, Evanston, IL
- Tina M. Grieco-Calub
- Roxelyn and Richard Pepper Department of Communication Sciences and Disorders, Northwestern University, Evanston, IL
- Hugh Knowles Hearing Center, Northwestern University, Evanston, IL
14. Blomquist C, Newman RS, Huang YT, Edwards J. Children With Cochlear Implants Use Semantic Prediction to Facilitate Spoken Word Recognition. J Speech Lang Hear Res 2021; 64:1636-1649. [PMID: 33887149 DOI: 10.1044/2021_jslhr-20-00319]
Abstract
Purpose Children with cochlear implants (CIs) are more likely to struggle with spoken language than their age-matched peers with normal hearing (NH), and new language processing literature suggests that these challenges may be linked to delays in spoken word recognition. The purpose of this study was to investigate whether children with CIs use language knowledge via semantic prediction to facilitate recognition of upcoming words and help compensate for uncertainties in the acoustic signal. Method Five- to 10-year-old children with CIs heard sentences with an informative verb (draws) or a neutral verb (gets) preceding a target word (picture). The target referent was presented on a screen, along with a phonologically similar competitor (pickle). Children's eye gaze was recorded to quantify efficiency of access of the target word and suppression of phonological competition. Performance was compared to both an age-matched group and vocabulary-matched group of children with NH. Results Children with CIs, like their peers with NH, demonstrated use of informative verbs to look more quickly to the target word and look less to the phonological competitor. However, children with CIs demonstrated less efficient use of semantic cues relative to their peers with NH, even when matched for vocabulary ability. Conclusions Children with CIs use semantic prediction to facilitate spoken word recognition but do so to a lesser extent than children with NH. Children with CIs experience challenges in predictive spoken language processing above and beyond limitations from delayed vocabulary development. Children with CIs with better vocabulary ability demonstrate more efficient use of lexical-semantic cues. Clinical interventions focusing on building knowledge of words and their associations may support efficiency of spoken language processing for children with CIs. Supplemental Material https://doi.org/10.23641/asha.14417627.
Affiliation(s)
- Christina Blomquist
- Department of Hearing and Speech Sciences, University of Maryland, College Park
- Rochelle S Newman
- Department of Hearing and Speech Sciences, University of Maryland, College Park
- Yi Ting Huang
- Department of Hearing and Speech Sciences, University of Maryland, College Park
- Jan Edwards
- Department of Hearing and Speech Sciences, University of Maryland, College Park
15. Goupell MJ, Draves GT, Litovsky RY. Recognition of vocoded words and sentences in quiet and multi-talker babble with children and adults. PLoS One 2020; 15:e0244632. [PMID: 33373427 PMCID: PMC7771688 DOI: 10.1371/journal.pone.0244632]
Abstract
A vocoder is used to simulate cochlear-implant sound processing in normal-hearing listeners. Typically, there is rapid improvement in vocoded speech recognition, but it is unclear if the improvement rate differs across age groups and speech materials. Children (8–10 years) and young adults (18–26 years) were trained and tested over 2 days (4 hours) on recognition of eight-channel noise-vocoded words and sentences, in quiet and in the presence of multi-talker babble at signal-to-noise ratios of 0, +5, and +10 dB. Children achieved poorer performance than adults in all conditions, for both word and sentence recognition. With training, vocoded speech recognition improvement rates were not significantly different between children and adults, suggesting that learning to process speech cues degraded via vocoding shows no developmental differences across these age groups and types of speech materials. Furthermore, this result confirms that the acutely measured age difference in vocoded speech recognition persists after extended training.
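Mixing vocoded speech with a babble masker at a target signal-to-noise ratio, as in the test conditions here, reduces to a single gain computation. A minimal sketch follows; the RMS-based definition of SNR is an assumption, as studies differ in how levels are equated.

```python
import numpy as np

def mix_at_snr(speech, babble, snr_db):
    """Add multi-talker babble to (vocoded) speech at a target SNR,
    defined here as the ratio of RMS levels, in dB."""
    rms = lambda s: np.sqrt(np.mean(s ** 2))
    babble = babble[:len(speech)]                 # trim masker to target length
    gain = rms(speech) / (rms(babble) * 10 ** (snr_db / 20))
    return speech + gain * babble

# the three babble conditions in the study (arrays assumed):
# for snr in (0, 5, 10): stimulus = mix_at_snr(vocoded, babble, snr)
```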
Affiliation(s)
- Matthew J. Goupell
- Department of Hearing and Speech Sciences, University of Maryland, College Park, MD, United States of America
- Garrison T. Draves
- Waisman Center, University of Wisconsin, Madison, WI, United States of America
- Ruth Y. Litovsky
- Waisman Center, University of Wisconsin, Madison, WI, United States of America
- Department of Communication Sciences and Disorders, University of Wisconsin, Madison, WI, United States of America
16. Newman RS, Morini G, Shroads E, Chatterjee M. Toddlers' fast-mapping from noise-vocoded speech. J Acoust Soc Am 2020; 147:2432. [PMID: 32359241 PMCID: PMC7176458 DOI: 10.1121/10.0001129]
Abstract
The ability to recognize speech that is degraded spectrally is a critical skill for successfully using a cochlear implant (CI). Previous research has shown that toddlers with normal hearing can successfully recognize noise-vocoded words as long as the signal contains at least eight spectral channels [Newman and Chatterjee (2013), J. Acoust. Soc. Am. 133(1), 483-494; Newman, Chatterjee, Morini, and Remez (2015), J. Acoust. Soc. Am. 138(3), EL311-EL317], although they have difficulty with signals that only contain four channels of information. Young children with CIs not only need to match a degraded speech signal to a stored representation (word recognition), but they also need to create new representations (word learning), a task that is likely to be more cognitively demanding. Normal-hearing toddlers aged 34 months were tested on their ability to initially learn (fast-map) new words in noise-vocoded stimuli. While children were successful at fast-mapping new words from 16-channel noise-vocoded stimuli, they failed to do so from 8-channel noise-vocoded speech. The level of degradation imposed by 8-channel vocoding appears sufficient to disrupt fast-mapping in young children. Recent results indicate that only CI patients with high spectral resolution can benefit from more than eight active electrodes. This suggests that for many children with CIs, reduced spectral resolution may limit their acquisition of novel words.
Affiliation(s)
- Rochelle S Newman
- Department of Hearing and Speech Sciences, University of Maryland, 0100 Lefrak Hall, College Park, Maryland 20742, USA
- Giovanna Morini
- Department of Communication Sciences and Disorders, University of Delaware, 100 Discovery Boulevard, Newark, Delaware 19713, USA
- Emily Shroads
- Department of Hearing and Speech Sciences, University of Maryland, 0100 Lefrak Hall, College Park, Maryland 20742, USA
- Monita Chatterjee
- Boys Town National Research Hospital, 555 North 30th Street, Omaha, Nebraska 68131, USA
17. Reducing Simulated Channel Interaction Reveals Differences in Phoneme Identification Between Children and Adults With Normal Hearing. Ear Hear 2019; 40:295-311. [PMID: 29927780 DOI: 10.1097/aud.0000000000000615]
Abstract
OBJECTIVES Channel interaction, the stimulation of overlapping populations of auditory neurons by distinct cochlear implant (CI) channels, likely limits the speech perception performance of CI users. This study examined the role of vocoder-simulated channel interaction in the ability of children with normal hearing (cNH) and adults with normal hearing (aNH) to recognize spectrally degraded speech. The primary aim was to determine the interaction between number of processing channels and degree of simulated channel interaction on phoneme identification performance as a function of age for cNH and to relate those findings to aNH and to CI users. DESIGN Medial vowel and consonant identification of cNH (age 8-17 years) and young aNH were assessed under six (for children) or nine (for adults) different conditions of spectral degradation. Stimuli were processed using a noise-band vocoder with 8, 12, and 15 channels and synthesis filter slopes of 15 (aNH only), 30, and 60 dB/octave (all NH subjects). Steeper filter slopes (larger numbers) simulated less electrical current spread and, therefore, less channel interaction. Spectrally degraded performance of the NH listeners was also compared with the unprocessed phoneme identification of school-aged children and adults with CIs. RESULTS Spectrally degraded phoneme identification improved as a function of age for cNH. For vowel recognition, cNH exhibited an interaction between the number of processing channels and vocoder filter slope, whereas aNH did not. Specifically, for cNH, increasing the number of processing channels only improved vowel identification in the steepest filter slope condition. Additionally, cNH were more sensitive to changes in filter slope. As the filter slopes increased, cNH continued to receive vowel identification benefit beyond where aNH performance plateaued or reached ceiling. For all NH participants, consonant identification improved with increasing filter slopes but was unaffected by the number of processing channels. Although cNH made more phoneme identification errors overall, their phoneme error patterns were similar to aNH. Furthermore, consonant identification of adults with CI was comparable to aNH listening to simulations with shallow filter slopes (15 dB/octave). Vowel identification of earlier-implanted pediatric ears was better than that of later-implanted ears and more comparable to cNH listening in conditions with steep filter slopes (60 dB/octave). CONCLUSIONS Recognition of spectrally degraded phonemes improved when simulated channel interaction was reduced, particularly for children. cNH showed an interaction between number of processing channels and filter slope for vowel identification. The differences observed between cNH and aNH suggest that identification of spectrally degraded phonemes continues to improve through adolescence and that children may benefit from reduced channel interaction beyond where adult performance has plateaued. Comparison to CI users suggests that early implantation may facilitate development of better phoneme discrimination.
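One way to realize synthesis filters with controlled dB/octave slopes, the manipulation used here to simulate current spread, is to shape each channel's noise carrier directly in the frequency domain. The sketch below is an illustrative reading of that idea, not the study's implementation; function names and the flat-passband assumption are my own.

```python
import numpy as np

def sloped_gain(freqs, lo, hi, slope_db_oct):
    """Magnitude response that is flat inside [lo, hi] and falls off at
    slope_db_oct per octave of distance from the nearer band edge."""
    g_db = np.zeros_like(freqs)
    below, above = freqs < lo, freqs > hi
    g_db[below] = -slope_db_oct * np.log2(lo / np.maximum(freqs[below], 1.0))
    g_db[above] = -slope_db_oct * np.log2(freqs[above] / hi)
    return 10 ** (g_db / 20)

def shaped_noise_carrier(n, fs, lo, hi, slope_db_oct, seed=0):
    """Noise for one synthesis channel; shallower slopes leak energy into
    neighboring channels, simulating greater electrical current spread."""
    spec = np.fft.rfft(np.random.default_rng(seed).standard_normal(n))
    freqs = np.fft.rfftfreq(n, d=1.0 / fs)
    return np.fft.irfft(spec * sloped_gain(freqs, lo, hi, slope_db_oct), n)
```

With this construction, a 15 dB/octave carrier overlaps its neighbors far more than a 60 dB/octave carrier, mirroring the study's shallow-to-steep conditions.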
18. Children's Recognition of Emotional Prosody in Spectrally Degraded Speech Is Predicted by Their Age and Cognitive Status. Ear Hear 2018; 39:874-880. [PMID: 29337761 DOI: 10.1097/aud.0000000000000546]
Abstract
OBJECTIVES It is known that school-aged children with cochlear implants show deficits in voice emotion recognition relative to normal-hearing peers. Little, however, is known about normal-hearing children's processing of emotional cues in cochlear implant-simulated, spectrally degraded speech. The objective of this study was to investigate school-aged, normal-hearing children's recognition of voice emotion, and the degree to which their performance could be predicted by their age, vocabulary, and cognitive factors such as nonverbal intelligence and executive function. DESIGN Normal-hearing children (6-19 years old) and young adults were tested on a voice emotion recognition task under three different conditions of spectral degradation using cochlear implant simulations (full-spectrum, 16-channel, and 8-channel noise-vocoded speech). Measures of vocabulary, nonverbal intelligence, and executive function were obtained as well. RESULTS Adults outperformed children on all tasks, and a strong developmental effect was observed. The children's age, the degree of spectral resolution, and nonverbal intelligence were predictors of performance, but vocabulary and executive functions were not, and no interactions were observed between age and spectral resolution. CONCLUSIONS These results indicate that cognitive function and age play important roles in children's ability to process emotional prosody in spectrally degraded speech. The lack of an interaction between the degree of spectral resolution and children's age further suggests that younger and older children may benefit similarly from improvements in spectral resolution. The findings imply that younger and older children with cochlear implants may benefit similarly from technical advances that improve spectral resolution.
19. Huyck JJ, Rosen MJ. Development of perception and perceptual learning for multi-timescale filtered speech. J Acoust Soc Am 2018; 144:667. [PMID: 30180675 DOI: 10.1121/1.5049369]
Abstract
The perception of temporally changing auditory signals has a gradual developmental trajectory. Speech is a time-varying signal, and slow changes in speech (filtered at 0-4 Hz) are preferentially processed by the right hemisphere, while the left extracts faster changes (filtered at 22-40 Hz). This work examined the ability of 8- to 19-year-olds to both perceive and learn to perceive filtered speech presented diotically for each filter type (low vs high) and dichotically for preferred or non-preferred laterality. Across conditions, performance improved with increasing age, indicating that the ability to perceive filtered speech continues to develop into adolescence. Across age, performance was best when both bands were presented dichotically, but with no benefit for presentation to the preferred hemisphere. Listeners thus integrated slow and fast transitions between the two ears, benefitting from more signal information, but not in a hemisphere-specific manner. After accounting for potential ceiling effects, learning was greatest when both bands were presented dichotically. These results do not support the idea that cochlear implants could be improved by providing differentially filtered information to each ear. Listeners who started with poorer performance learned more, a factor which could contribute to the positive cochlear implant outcomes typically seen in younger children.
Affiliation(s)
- Julia Jones Huyck
- Speech Pathology and Audiology Program, Kent State University, 1325 Theatre Drive, Kent, Ohio 44242, USA
- Merri J Rosen
- Department of Anatomy and Neurobiology, Northeast Ohio Medical University, 4209 State Route 44, Rootstown, Ohio 44272, USA
20. Hawthorne K. Prosody-driven syntax learning is robust to impoverished pitch and spectral cues. J Acoust Soc Am 2018; 143:2756. [PMID: 29857717 DOI: 10.1121/1.5031130]
Abstract
Across languages, prosodic boundaries tend to align with syntactic boundaries, and both infant and adult language learners capitalize on these correlations to jump-start syntax acquisition. However, it is unclear which prosodic cues-pauses, final-syllable lengthening, and/or pitch resets across boundaries-are necessary for prosodic bootstrapping to occur. It is also unknown how syntax acquisition is impacted when listeners do not have access to the full range of prosodic or spectral information. These questions were addressed using 14-channel noise-vocoded (spectrally degraded) speech. While pre-boundary lengthening and pauses are well-transmitted through noise-vocoded speech, pitch is not; overall intelligibility is also decreased. In two artificial grammar experiments, adult native English speakers showed a similar ability to use English-like prosody to bootstrap unfamiliar syntactic structures from degraded speech and natural, unmanipulated speech. Contrary to previous findings that listeners may require pitch resets and final lengthening to co-occur if no pause cue is present, participants in the degraded speech conditions were able to detect prosodic boundaries from lengthening alone. Results suggest that pitch is not necessary for adult English speakers to perceive prosodic boundaries associated with syntactic structures, and that prosodic bootstrapping is robust to degraded spectral information.
Affiliation(s)
- Kara Hawthorne
- Department of Communication Sciences and Disorders, University of Mississippi, 304 George Hall, University, Mississippi 38677, USA
| |
Collapse
|
21
|
Huyck JJ. Comprehension of Degraded Speech Matures During Adolescence. J Speech Lang Hear Res 2018; 61:1012-1022. [PMID: 29625427 DOI: 10.1044/2018_jslhr-h-17-0252] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 06/29/2017] [Accepted: 01/12/2018] [Indexed: 06/08/2023]
Abstract
PURPOSE The aim of the study was to compare the comprehension and perceptual learning of spectrally degraded (noise-vocoded [NV]) speech between adolescents and young adults, and to examine the role of phonological processing and executive functions in this perception. METHOD Sixteen younger adolescents (11-13 years), 16 older adolescents (14-16 years), and 16 young adults (18-22 years) listened to 40 NV sentences and repeated back what they heard. They also completed tests assessing phonological processing and a variety of executive functions. RESULTS Word-report scores were generally poorer for younger adolescents than for the older age groups. Phonological processing also predicted initial word-report scores. Learning (i.e., improvement across training times) did not differ with age. Starting performance and processing speed predicted learning, with greater learning for those who started with the lowest scores and for those with faster processing speed. CONCLUSIONS Comprehension of degraded (NV) speech is not mature even by early adolescence; however, like adults, adolescents are able to improve their comprehension of degraded speech with training. Thus, although adolescents may initially have difficulty understanding degraded speech, such as speech presented through hearing aids or cochlear implants, they can improve their perception with experience. Processing speed and phonological processing may play a role in degraded-speech comprehension in these age groups.
22
Some Neurocognitive Correlates of Noise-Vocoded Speech Perception in Children With Normal Hearing: A Replication and Extension of Eisenberg et al. (2002). Ear Hear 2018; 38:344-356. [PMID: 28045787 DOI: 10.1097/aud.0000000000000393] [Citation(s) in RCA: 17] [Impact Index Per Article: 2.8] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/26/2022]
Abstract
OBJECTIVES Noise-vocoded speech is a valuable research tool for testing experimental hypotheses about the effects of spectral degradation on speech recognition in adults with normal hearing (NH). However, very little research has utilized noise-vocoded speech with children with NH. Earlier studies with children with NH focused primarily on the amount of spectral information needed for speech recognition, without assessing the contribution of neurocognitive processes to speech perception and spoken word recognition. In this study, we first replicated the seminal findings reported by Eisenberg et al. (2002), who investigated effects of lexical density and word frequency on noise-vocoded speech perception in a small group of children with NH. We then extended the research to investigate relations between noise-vocoded speech recognition abilities and five neurocognitive measures: auditory attention (AA) and response set, talker discrimination, and verbal and nonverbal short-term working memory. DESIGN Thirty-one children with NH between 5 and 13 years of age were assessed on their ability to perceive lexically controlled words in isolation and in sentences that were noise-vocoded to four spectral channels. Children were also administered vocabulary assessments (Peabody Picture Vocabulary Test-4th Edition and Expressive Vocabulary Test-2nd Edition) and measures of AA (NEPSY AA and response set, and a talker discrimination task) and short-term memory (visual digit and symbol spans). RESULTS Consistent with the findings reported in the original Eisenberg et al. (2002) study, we found that children perceived noise-vocoded lexically easy words better than lexically hard words. Words in sentences were also recognized better than the same words presented in isolation. No significant correlations were observed between noise-vocoded speech recognition scores and the Peabody Picture Vocabulary Test-4th Edition using language quotients to control for age effects. However, children who scored higher on the Expressive Vocabulary Test-2nd Edition recognized lexically easy words better than lexically hard words in sentences. Older children perceived noise-vocoded speech better than younger children. Finally, we found that measures of AA and short-term memory capacity were significantly correlated with a child's ability to perceive noise-vocoded isolated words and sentences. CONCLUSIONS First, we successfully replicated the major findings from the Eisenberg et al. (2002) study. Because familiarity, phonological distinctiveness, and lexical competition affect word recognition, these findings provide additional support for the proposal that several foundational elementary neurocognitive processes underlie the perception of spectrally degraded speech. Second, we found strong and significant correlations between performance on neurocognitive measures and children's ability to recognize words and sentences noise-vocoded to four spectral channels. These findings extend earlier research suggesting that perception of spectrally degraded speech reflects early peripheral auditory processes as well as additional contributions of executive function, specifically selective attention and short-term memory processes, in spoken word recognition. The present findings suggest that AA and short-term memory support robust spoken word recognition in children with NH even under compromised and challenging listening conditions. These results are relevant to research with listeners who have hearing loss, because such listeners are routinely required to encode, process, and understand spectrally degraded acoustic signals.
23
Finke M, Sandmann P, Bönitz H, Kral A, Büchner A. Consequences of Stimulus Type on Higher-Order Processing in Single-Sided Deaf Cochlear Implant Users. Audiol Neurootol 2016; 21:305-315. [DOI: 10.1159/000452123] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.5] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/18/2016] [Accepted: 09/20/2016] [Indexed: 11/19/2022]
Abstract
Single-sided deaf subjects with a cochlear implant (CI) provide a unique opportunity to compare central auditory processing of electrical input (CI ear) and acoustic input (normal-hearing, NH, ear) within the same individual. In these individuals, sensory processing differs between the two ears, while cognitive abilities are the same irrespective of the sensory input. To better understand the perceptual-cognitive factors modulating speech intelligibility with a CI, this electroencephalography study examined central auditory processing of words, cognitive abilities, and speech intelligibility in 10 postlingually single-sided deaf CI users. We found lower hit rates and prolonged response times for word classification during an oddball task with the CI ear compared with the NH ear. Event-related potentials reflecting sensory (N1) and higher-order (N2/N4) processing were also prolonged for word classification (targets versus nontargets) with the CI ear. Our results suggest that speech processing via the CI ear and the NH ear differs at both sensory (N1) and cognitive (N2/N4) processing stages, thereby affecting behavioral performance in speech discrimination. These results provide objective evidence that cognition is a key factor for speech perception under adverse listening conditions, such as the degraded speech signal provided by the CI.
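The ERP comparison rests on standard epoch averaging. The Python sketch below shows only that basic computation on placeholder single-channel data; the study's actual preprocessing (filtering, artifact rejection, electrode selection) is omitted, and all array names are hypothetical.

    import numpy as np

    def erp(eeg, onsets, fs, tmin=-0.1, tmax=0.8):
        # Cut epochs around each stimulus onset, baseline-correct, average.
        pre, post = int(-tmin * fs), int(tmax * fs)
        epochs = np.stack([eeg[i - pre:i + post] for i in onsets])
        baseline = epochs[:, :pre].mean(axis=1, keepdims=True)
        return (epochs - baseline).mean(axis=0)

    fs = 500
    eeg = np.random.randn(fs * 600)                # placeholder continuous EEG
    onsets = np.arange(fs, 590 * fs, fs)           # placeholder word onsets
    is_target = np.random.rand(len(onsets)) < 0.2  # ~20% oddball targets

    erp_target = erp(eeg, onsets[is_target], fs)   # inspect N1/N2/N4 windows
    erp_nontarget = erp(eeg, onsets[~is_target], fs)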
24
Newman RS, Chatterjee M, Morini G, Remez RE. Toddlers' comprehension of degraded signals: Noise-vocoded versus sine-wave analogs. J Acoust Soc Am 2015; 138:EL311-7. [PMID: 26428832 PMCID: PMC4575314 DOI: 10.1121/1.4929731] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 05/15/2023]
Abstract
Recent findings suggest that the ability to comprehend degraded speech changes with development. Preschool children have shown greater difficulty perceiving noise-vocoded speech (a signal that integrates amplitude over broad frequency bands) than sine-wave speech (which preserves the spectral peaks but not the spectral envelope). In contrast, the 27-month-old children in the present study could recognize speech with either type of degradation, and they performed slightly better with eight-channel vocoded speech than with sine-wave speech. This suggests that children's identification performance depends critically on the degree of degradation and that their success in recognizing unfamiliar speech encodings is encouraging overall.
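The two degradations differ in kind: vocoding keeps band envelopes but discards spectral detail, while sine-wave speech replaces each formant with a single time-varying sinusoid. A Python sketch of the latter, with made-up formant contours standing in for tracks that would normally be estimated from a real recording:

    import numpy as np

    def sinewave_synth(freqs, amps, fs):
        # freqs, amps: (n_formants, n_samples) instantaneous Hz and amplitude.
        phase = 2 * np.pi * np.cumsum(freqs, axis=1) / fs  # integrate frequency
        return (amps * np.sin(phase)).sum(axis=0)

    fs = 16000
    t = np.linspace(0, 1.0, fs, endpoint=False)
    f1 = 500 + 200 * np.sin(2 * np.pi * 3 * t)      # placeholder F1 contour
    f2 = 1500 + 400 * np.sin(2 * np.pi * 2 * t)     # placeholder F2 contour
    f3 = np.full_like(t, 2500.0)                    # placeholder F3 contour
    amps = np.stack([np.full_like(t, a) for a in (1.0, 0.6, 0.3)])
    y = sinewave_synth(np.stack([f1, f2, f3]), amps, fs)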
Affiliation(s)
- Rochelle S Newman, Department of Hearing and Speech Sciences, 0100 Lefrak Hall, University of Maryland, College Park, Maryland 20742, USA
- Monita Chatterjee, Boys Town National Research Hospital, 555 North 30th Street, Omaha, Nebraska 68131, USA
- Giovanna Morini, Department of Hearing and Speech Sciences, 0100 Lefrak Hall, University of Maryland, College Park, Maryland 20742, USA
- Robert E Remez, Barnard College, Columbia University, 3009 Broadway, New York, New York 10027, USA

25
Cabrera L, Lorenzi C, Bertoncini J. Infants Discriminate Voicing and Place of Articulation With Reduced Spectral and Temporal Modulation Cues. J Speech Lang Hear Res 2015; 58:1033-1042. [PMID: 25682333 DOI: 10.1044/2015_jslhr-h-14-0121] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.1] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 05/05/2014] [Accepted: 02/02/2015] [Indexed: 06/04/2023]
Abstract
PURPOSE This study assessed the role of spectro-temporal modulation cues in the discrimination of two phonetic contrasts (voicing and place) by young infants. METHOD A visual-habituation procedure was used to assess the ability of French-learning 6-month-old infants with normal hearing to discriminate voiced versus unvoiced (/aba/-/apa/) and labial versus dental (/aba/-/ada/) stop consonants. The stimuli were processed by tone-excited vocoders to degrade frequency-modulation cues while preserving (a) amplitude-modulation (AM) cues within 32 analysis frequency bands, (b) slow AM cues only (<16 Hz) within 32 bands, or (c) AM cues within 8 bands. RESULTS Infants exhibited discrimination responses for both phonetic contrasts in each processing condition. However, when fast AM cues were degraded, infants required longer exposure to the vocoded stimuli to reach the habituation criterion. CONCLUSIONS Altogether, these results indicate that the processing of modulation cues conveying phonetic information about voicing and place is "functional" at 6 months. The data also suggest that the perceptual weight of fast AM speech cues may change during development.
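The three processing conditions map onto two parameters of a tone-excited vocoder: the number of analysis bands and an optional low-pass cutoff on each band's envelope. A hedged Python sketch with a generic log-spaced filter bank (not the study's exact one); x is a float mono signal sampled at fs >= 16 kHz:

    import numpy as np
    from scipy.signal import butter, sosfiltfilt, hilbert

    def tone_vocode(x, fs, n_bands=32, env_cutoff=None, f_lo=80.0, f_hi=8000.0):
        edges = np.geomspace(f_lo, f_hi, n_bands + 1)
        t = np.arange(len(x)) / fs
        out = np.zeros(len(x))
        for lo, hi in zip(edges[:-1], edges[1:]):
            sos = butter(4, [lo, hi], btype="band", fs=fs, output="sos")
            env = np.abs(hilbert(sosfiltfilt(sos, x)))       # band AM envelope
            if env_cutoff is not None:                       # discard fast AM
                lp = butter(4, env_cutoff, btype="low", fs=fs, output="sos")
                env = np.maximum(sosfiltfilt(lp, env), 0)
            fc = np.sqrt(lo * hi)                            # tone at band center
            out += env * np.sin(2 * np.pi * fc * t)          # pure-tone carrier
        return out / np.max(np.abs(out))

    # Condition (a): tone_vocode(x, fs, 32)
    # Condition (b): tone_vocode(x, fs, 32, env_cutoff=16)
    # Condition (c): tone_vocode(x, fs, 8)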
26
Maidment DW, Kang HJ, Stewart HJ, Amitay S. Audiovisual integration in children listening to spectrally degraded speech. J Speech Lang Hear Res 2015; 58:61-68. [PMID: 25203539 DOI: 10.1044/2014_jslhr-s-14-0044] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.6] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Received: 02/12/2014] [Accepted: 09/03/2014] [Indexed: 06/03/2023]
Abstract
PURPOSE The study explored whether visual information improves speech identification in typically developing children with normal hearing when the auditory signal is spectrally degraded. METHOD Children (n=69) and adults (n=15) were presented with noise-vocoded sentences from the Children's Co-ordinate Response Measure (Rosen, 2011) in auditory-only or audiovisual conditions. The number of bands was adaptively varied to modulate degradation of the auditory signal, and the number of bands required for approximately 79% correct identification was taken as the threshold. RESULTS Unlike 6- to 11-year-old children and adults, the youngest children (4- to 5-year-olds) did not benefit from accompanying visual information. Audiovisual gain also increased with age within the child sample. CONCLUSIONS The current data suggest that children younger than 6 years of age do not fully utilize visual speech cues to enhance speech perception when the auditory signal is degraded. This evidence not only has implications for understanding the development of speech perception skills in children with normal hearing but may also inform the development of new treatment and intervention strategies aiming to remediate speech perception difficulties in pediatric cochlear implant users.
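A threshold near 79% correct is what a 3-down/1-up staircase converges to (79.4%; Levitt, 1971), so the adaptive procedure can plausibly be sketched as below. This is an assumption about the study's method, not a description of it; run_trial is a hypothetical callback that presents a sentence vocoded to the given number of bands and returns whether the response was correct.

    def band_threshold(run_trial, start_bands=16, n_reversals=8):
        # 3-down/1-up staircase over the number of vocoder bands.
        bands, streak, direction, reversals = start_bands, 0, 0, []
        while len(reversals) < n_reversals:
            if run_trial(bands):            # correct response
                streak += 1
                if streak < 3:
                    continue
                streak, step = 0, -1        # 3 correct in a row -> fewer bands
            else:
                streak, step = 0, +1        # any error -> more bands
            if direction and step != direction:
                reversals.append(bands)     # record direction changes
            direction = step
            bands = max(1, bands + step)
        return sum(reversals) / len(reversals)  # mean of reversal points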
27
Başkent D, van Rij J, Ng ZY, Free R, Hendriks P. Perception of spectrally degraded reflexives and pronouns by children. J Acoust Soc Am 2013; 134:3844-3852. [PMID: 24180793 DOI: 10.1121/1.4824341] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/02/2023]
Abstract
Speech perception skills in cochlear-implant users are often measured with simple speech materials. In children, however, it is crucial to fully characterize linguistic development, and this requires linguistically more meaningful materials. The authors propose using the comprehension of reflexives and pronouns, as these specific skills are acquired at different ages. According to the literature, normal-hearing children show adult-like comprehension of reflexives at age 5, while their comprehension of pronouns reaches adult-like levels only around age 10. To provide normative data, groups of younger children (5 to 8 years old), older children (10 and 11 years old), and adults were tested with and without spectral degradation that simulated cochlear-implant speech transmission with four and eight channels. The results without degradation confirmed the different ages of acquisition of reflexives and pronouns. Adding spectral degradation reduced overall performance; however, it did not change the general pattern observed with non-degraded speech. This finding confirms that these linguistic milestones can also be measured in children with cochlear implants, despite the reduced quality of sound transmission. The results thus have implications for clinical practice, as they could help set realistic expectations and therapeutic goals for children who receive a cochlear implant.
Affiliation(s)
- Deniz Başkent, University of Groningen, University Medical Center Groningen, Department of Otorhinolaryngology/Head and Neck Surgery, P.O. Box 30.001, 9700 RB Groningen, The Netherlands