1. Hegde M, Nazzi T, Cabrera L. An auditory perspective on phonological development in infancy. Front Psychol 2024; 14:1321311. PMID: 38327506; PMCID: PMC10848800; DOI: 10.3389/fpsyg.2023.1321311.
Abstract
Introduction: The auditory system encodes the phonetic features of languages by processing spectro-temporal modulations in speech, which can be described at two time scales: relatively slow amplitude variations over time (AM, further distinguished into the slowest components, <8-16 Hz, and faster components, 16-500 Hz), and frequency modulations (FM, oscillating at higher rates of about 600 Hz-10 kHz). While adults require only the slowest AM cues to identify and discriminate speech sounds, infants have been shown to also require faster AM cues (>8-16 Hz) for similar tasks. Methods: Using an observer-based psychophysical method, this study measured the ability of typically hearing 6-month-olds, 10-month-olds, and adults to detect a change in the vowel or consonant features of consonant-vowel syllables when temporal modulations were selectively degraded. Two acoustically degraded conditions were designed by replacing FM cues with pure tones in 32 frequency bands and then extracting AM cues in each frequency band with two different low-pass cutoff frequencies: (1) half the bandwidth (Fast AM condition), and (2) <8 Hz (Slow AM condition). Results: In the Fast AM condition, with reduced FM cues, 85% of 6-month-olds, 72.5% of 10-month-olds, and 100% of adults successfully categorized phonemes. Among participants who passed the Fast AM condition, 67% of 6-month-olds, 75% of 10-month-olds, and 95% of adults passed the Slow AM condition. Furthermore, across the three age groups, the proportion of participants able to detect a phonetic category change did not differ between the vowel and consonant conditions. However, age-related differences were observed for vowel categorization: while the 6- and 10-month-old groups did not differ from one another, both differed from adults.
Moreover, for consonant categorization, 10-month-olds were more affected by acoustic temporal degradation than 6-month-olds, showing a greater decline in detection success rates between the Fast AM and Slow AM conditions. Discussion: Degrading FM and faster AM cues (>8 Hz) appears to strongly affect consonant processing at 10 months of age. These findings suggest that between 6 and 10 months, infants follow different developmental trajectories in the perceptual weighting of temporal acoustic cues for vowel and consonant processing, possibly linked to phonological attunement.
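The two degradation conditions described in this abstract can be sketched for a single frequency band (a minimal illustration under assumed parameters: the study used 32 bands, and the filter order and the 250 Hz "half-bandwidth" cutoff below are assumptions, not the study's exact values):

```python
import numpy as np
from scipy.signal import hilbert, butter, sosfiltfilt

def am_on_tone(band_signal, fs, center_hz, cutoff_hz):
    """Extract the AM (Hilbert) envelope of one frequency band, low-pass it
    at `cutoff_hz`, and re-impose it on a pure tone at the band's center
    frequency, discarding the band's FM (fine-structure) cues."""
    envelope = np.abs(hilbert(band_signal))
    sos = butter(2, cutoff_hz, btype="low", fs=fs, output="sos")
    smoothed = np.clip(sosfiltfilt(sos, envelope), 0, None)
    t = np.arange(len(band_signal)) / fs
    return smoothed * np.sin(2 * np.pi * center_hz * t)

fs = 16000
t = np.arange(fs) / fs  # 1 s of signal
# toy "band": a 1 kHz carrier with a 4 Hz (slow) and a 30 Hz (fast) AM component
band = (1 + 0.5 * np.sin(2 * np.pi * 4 * t)
          + 0.3 * np.sin(2 * np.pi * 30 * t)) * np.sin(2 * np.pi * 1000 * t)
slow_am = am_on_tone(band, fs, 1000, 8)    # "Slow AM" condition (<8 Hz)
fast_am = am_on_tone(band, fs, 1000, 250)  # "Fast AM" condition (assumed half-bandwidth cutoff)
```

The Slow AM output retains the 4 Hz modulation but suppresses the 30 Hz modulation that the Fast AM output keeps, mirroring the contrast the study presented to listeners.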
Affiliation(s)
- Monica Hegde
- Integrative Neuroscience and Cognition Center (INCC-UMR 8002), Université Paris Cité-CNRS, Paris, France
2. Ni G, Xu Z, Bai Y, Zheng Q, Zhao R, Wu Y, Ming D. EEG-based assessment of temporal fine structure and envelope effect in Mandarin syllable and tone perception. Cereb Cortex 2023; 33:11287-11299. PMID: 37804238; DOI: 10.1093/cercor/bhad366.
Abstract
In recent years, speech perception research has benefited from tracking low-frequency neural entrainment to the speech envelope. However, the respective roles of the speech envelope and the temporal fine structure in speech perception remain controversial, especially for Mandarin. This study examined how Mandarin syllable and tone perception depend on the speech envelope and the temporal fine structure. We recorded the electroencephalogram (EEG) of subjects under three acoustic conditions constructed with auditory chimera synthesis: (i) the original speech; (ii) the speech envelope imposed on a sinusoidal carrier; and (iii) the speech temporal fine structure combined with the envelope of a non-speech (white noise) sound. We found that syllable perception depended mainly on the speech envelope, whereas tone perception depended on the temporal fine structure. The delta band was prominent, and the parietal and prefrontal lobes were the main activated brain areas, for both syllable and tone perception. Finally, we decoded the spatiotemporal features of Mandarin perception from the microstate sequence. The spatiotemporal feature sequence of the EEG evoked by each speech material was specific, suggesting a new perspective for future auditory brain-computer interfaces. These results also suggest a new coding strategy for hearing aids designed for native Mandarin speakers.
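The envelope/fine-structure swap underlying such chimeric stimuli (in the style of Smith, Delgutte, and Oxenham's auditory chimeras) can be sketched for a single band; the study's actual multi-band processing and speech materials are not reproduced here:

```python
import numpy as np
from scipy.signal import hilbert

def chimera_band(env_source, tfs_source):
    """Single-band auditory chimera: the Hilbert envelope of one signal
    carried on the temporal fine structure (unit-amplitude cosine of the
    Hilbert phase) of another. A full stimulus applies this per band."""
    envelope = np.abs(hilbert(env_source))
    fine_structure = np.cos(np.angle(hilbert(tfs_source)))
    return envelope * fine_structure

rng = np.random.default_rng(0)
fs = 8000
t = np.arange(fs) / fs
# speech-like band: a 500 Hz carrier with a 3 Hz envelope
speech_like = (1 + np.sin(2 * np.pi * 3 * t)) * np.sin(2 * np.pi * 500 * t)
noise = rng.standard_normal(fs)
env_chimera = chimera_band(speech_like, noise)  # speech envelope on noise fine structure
tfs_chimera = chimera_band(noise, speech_like)  # noise envelope on speech fine structure, condition (iii)-style
```

Because the fine-structure factor has unit amplitude, the chimera's magnitude never exceeds the donor envelope, which is what makes the envelope and fine-structure contributions separable.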
Affiliation(s)
- Guangjian Ni
- Academy of Medical Engineering and Translational Medicine, Tianjin University, Tianjin 300072, China
- Tianjin Key Laboratory of Brain Science and Neuroengineering, Tianjin 300072, China
- Haihe Laboratory of Brain-Computer Interaction and Human-Machine Integration, Tianjin 300392, China
- Zihao Xu
- Academy of Medical Engineering and Translational Medicine, Tianjin University, Tianjin 300072, China
- Tianjin Key Laboratory of Brain Science and Neuroengineering, Tianjin 300072, China
- Yanru Bai
- Academy of Medical Engineering and Translational Medicine, Tianjin University, Tianjin 300072, China
- Tianjin Key Laboratory of Brain Science and Neuroengineering, Tianjin 300072, China
- Qi Zheng
- Academy of Medical Engineering and Translational Medicine, Tianjin University, Tianjin 300072, China
- Ran Zhao
- Academy of Medical Engineering and Translational Medicine, Tianjin University, Tianjin 300072, China
- Tianjin Key Laboratory of Brain Science and Neuroengineering, Tianjin 300072, China
- Yubo Wu
- Academy of Medical Engineering and Translational Medicine, Tianjin University, Tianjin 300072, China
- Dong Ming
- Academy of Medical Engineering and Translational Medicine, Tianjin University, Tianjin 300072, China
- Tianjin Key Laboratory of Brain Science and Neuroengineering, Tianjin 300072, China
- Haihe Laboratory of Brain-Computer Interaction and Human-Machine Integration, Tianjin 300392, China
3. Benoit C, Carlson RJ, King MC, Horn DL, Rubinstein JT. Behavioral characterization of the cochlear amplifier lesion due to loss of function of stereocilin (STRC) in human subjects. Hear Res 2023; 439:108898. PMID: 37890241; PMCID: PMC10756798; DOI: 10.1016/j.heares.2023.108898.
Abstract
Loss of function of stereocilin (STRC) is the second most common cause of inherited hearing loss. Loss of the stereocilin protein, encoded by the STRC gene, disrupts the connection between the outer hair cells (OHCs) and the tectorial membrane. This affects only OHC function, producing deficits in active cochlear frequency selectivity and amplification despite preservation of normal inner hair cells. A better understanding of the cochlear features associated with STRC mutation will improve our knowledge of normal cochlear function and the pathophysiology of hearing impairment, and could enhance hearing aid and cochlear implant signal processing. Nine subjects aged 7-24 years with homozygous or compound heterozygous loss-of-function mutations in STRC were included. Temporal and spectral modulation perception were measured and characterized by spectral and temporal modulation transfer functions. Speech-in-noise perception was studied with spondee identification in adaptive steady-state noise and with AzBio sentences in multitalker babble at 0 and -5 dB SNR. Results were compared with normal-hearing (NH) and cochlear implant (CI) listeners to place the STRC-/- listeners' hearing capacity in context. Spectral ripple discrimination thresholds in the STRC-/- subjects were poorer than in NH listeners (p < 0.0001) but remained better than in CI listeners (p < 0.0001). Frequency resolution appeared impaired in the STRC-/- group compared to NH listeners, but the difference did not reach statistical significance (p = 0.06). Amplitude modulation detection thresholds in the STRC-/- group did not differ significantly from NH listeners (p = 0.06) but were better than in CI subjects (p < 0.0001). Temporal resolution in STRC-/- subjects was similar to NH listeners (p = 0.98) but better than in CI listeners (p = 0.04). The spondee reception threshold in the STRC-/- group was worse than in NH listeners (p = 0.0008) but better than in CI listeners (p = 0.0001).
For AzBio sentences, performance at 0 dB SNR was similar between the STRC-/- group and the NH group (88% and 97%, respectively). At -5 dB SNR, STRC-/- performance was significantly poorer than NH (40% vs. 85%), yet much better than that of CI listeners, who scored 54% at +5 dB SNR in children and 53% at +10 dB SNR in adults. To our knowledge, this is the first study of the psychoacoustic performance of human subjects lacking cochlear amplification but with normal inner hair cell function. Our data demonstrate preserved temporal resolution and a trend toward impaired frequency resolution that did not reach statistical significance. Speech-in-noise perception was also impaired relative to NH listeners. All measures were better than those of CI listeners. It remains to be seen whether hearing aid modifications customized for the spectral deficits of STRC-/- listeners can improve speech understanding in noise. Since cochlear implants are also limited by deficient spectral selectivity, STRC-/- hearing may provide an upper bound on what could be achieved with better temporal coding in electrical stimulation.
Affiliation(s)
- Charlotte Benoit
- Virginia Merrill Bloedel Hearing Research Center, Department of Otolaryngology-Head and Neck Surgery, University of Washington, Seattle, WA, USA
- Ryan J Carlson
- Departments of Genome Sciences and Medicine, University of Washington, Seattle, WA, USA
- Mary-Claire King
- Departments of Genome Sciences and Medicine, University of Washington, Seattle, WA, USA
- David L Horn
- Virginia Merrill Bloedel Hearing Research Center, Department of Otolaryngology-Head and Neck Surgery, University of Washington, Seattle, WA, USA; Department of Speech and Hearing Sciences, University of Washington, Seattle, WA, USA; Division of Pediatric Otolaryngology, Department of Surgery, Seattle Children's Hospital, Seattle, WA, USA
- Jay T Rubinstein
- Virginia Merrill Bloedel Hearing Research Center, Department of Otolaryngology-Head and Neck Surgery, University of Washington, Seattle, WA, USA; Department of Bioengineering, University of Washington, Seattle, WA, USA
4. Zhang M, Zhang H, Tang E, Ding H, Zhang Y. Evaluating the Relative Perceptual Salience of Linguistic and Emotional Prosody in Quiet and Noisy Contexts. Behav Sci (Basel) 2023; 13:800. PMID: 37887450; PMCID: PMC10603920; DOI: 10.3390/bs13100800.
Abstract
How people recognize linguistic and emotional prosody in different listening conditions is essential for understanding the complex interplay between social context, cognition, and communication. The perception of both lexical tones and emotional prosody depends on prosodic features including pitch, intensity, duration, and voice quality. However, it is unclear which aspect of prosody is perceptually more salient and resistant to noise. This study investigated the relative perceptual salience of emotional prosody and lexical tone recognition in quiet and in multi-talker babble noise. Forty young adults, randomly sampled from a pool of native Mandarin Chinese speakers with normal hearing, listened to monosyllables with or without background babble noise and completed two identification tasks, one for emotion recognition and one for lexical tone recognition. Accuracy and response speed were recorded and analyzed using generalized linear mixed-effects models. Compared with emotional prosody, lexical tones were more perceptually salient in multi-talker babble noise: native Mandarin Chinese participants identified lexical tones more accurately and more quickly than vocal emotions at the same signal-to-noise ratio. Acoustic and cognitive dissimilarities between linguistic and emotional prosody may underlie this phenomenon, which calls for further exploration of the underlying psychobiological and neurophysiological mechanisms.
Affiliation(s)
- Minyue Zhang, Hui Zhang, Enze Tang, Hongwei Ding
- Speech-Language-Hearing Center, School of Foreign Languages, Shanghai Jiao Tong University, Shanghai 200240, China
- Yang Zhang
- Department of Speech-Language-Hearing Sciences and Masonic Institute for the Developing Brain, University of Minnesota, Minneapolis, MN 55455, USA
5. Chan MPY, Kuang J. The effect of tone language background on cue integration in pitch perception. J Acoust Soc Am 2023; 154:819-830. PMID: 37563829; DOI: 10.1121/10.0020565.
Abstract
This study explores the effect of native language and musicality on voice quality cue integration in pitch perception. Previous work by Cui and Kang [(2019). J. Acoust. Soc. Am. 146(6), 4086-4096] found no differences in pitch perception strategies between English and Mandarin speakers. The present study asks whether Cantonese listeners perform differently, as Cantonese has multiple level tones. Participants completed two experiments: (i) a forced-choice pitch classification experiment involving four spectral slope permutations varying in f0 across an 11-step continuum, and (ii) the MBEMA test, which quantifies listeners' musicality. Results show that Cantonese speakers do not differ from English and Mandarin speakers in overall categoricity and perceptual shift, that Cantonese speakers have no advantage in musicality, and that musicality is a significant predictor of participants' pitch perception strategies: listeners with higher musicality scores tend to rely more on f0 cues than on voice quality cues. These findings support the notion that voice quality integration in pitch perception is not language specific and may be a universal psychoacoustic phenomenon at a non-lexical level.
Affiliation(s)
- May Pik Yu Chan, Jianjing Kuang
- Department of Linguistics, University of Pennsylvania, Philadelphia, Pennsylvania 19104-6228, USA
6. Patience M, Steele J. Relative Difficulty in the Acquisition of the Phonetic Parameters of Obstruent Coda Voicing: Evidence from Mandarin-Speaking Learners of French. Lang Speech 2022:238309221114143. PMID: 36062625; PMCID: PMC10394971; DOI: 10.1177/00238309221114143.
Abstract
A recurring finding of research on the L2 acquisition of coda obstruent voicing is that, among the phonetic parameters that realize the voicing contrast, learners are overwhelmingly more accurate with duration than with the voicing of the obstruent itself. The current work expands our understanding of this asymmetry in two ways. First, as previous studies have focused almost exclusively on learners of English, we investigate whether L2 learners' superior production of duration also holds for other target languages, via a study of Mandarin-speaking learners' production of French stop and fricative codas. Results from 18 Mandarin-speaking learners of French, primarily of beginner and intermediate proficiency, who completed a sentence-reading task, parallel those of previous studies, with greater accuracy for vowel duration than for the laryngeal voicing of the obstruent. Second, we explore potential sources of this asymmetry, in particular the roles of L1 experience and of universal factors, namely the relative perceptual salience of duration versus voicing and the articulatory difficulty of voicing obstruents.
7. Xie D, Luo J, Chao X, Li J, Liu X, Fan Z, Wang H, Xu L. Relationship Between the Ability to Detect Frequency Changes or Temporal Gaps and Speech Perception Performance in Post-lingual Cochlear Implant Users. Front Neurosci 2022; 16:904724. PMID: 35757528; PMCID: PMC9213807; DOI: 10.3389/fnins.2022.904724.
Abstract
Previous studies using modulation stimuli to examine the relative effects of frequency resolution and temporal resolution on CI users' speech perception have failed to reach a consistent conclusion. In this study, frequency change detection and temporal gap detection were used to assess the frequency resolution and temporal resolution of CI users, respectively. Psychophysical and neurophysiological methods were used together to investigate the effects of frequency and temporal resolution on speech perception in post-lingual cochlear implant (CI) users. We examined the effects of psychophysical measures [frequency change detection threshold (FCDT) and gap detection threshold (GDT)] and acoustic change complex (ACC) responses (evoked threshold, latency, and amplitude of the ACC elicited by a frequency change or a temporal gap) on speech perception [recognition of monosyllabic words, disyllabic words, and sentences in quiet, and the sentence recognition threshold (SRT) in noise]. Thirty-one adult post-lingual CI users of Mandarin Chinese were enrolled. The stimuli used to elicit ACCs to frequency changes were 800-ms pure tones (base frequency 1,000 Hz); the frequency change occurred at the midpoint of the tone, with six change magnitudes (0, 2, 5, 10, 20, and 50%). Silent gaps of different durations (0, 5, 10, 20, 50, and 100 ms) were inserted in the middle of 800-ms white noise to elicit ACCs to temporal gaps. The FCDT and GDT were obtained with two 2-alternative forced-choice procedures. The results showed no significant correlation between the CI hearing threshold and speech perception in the study participants.
In multiple regression analyses of the joint influence of the psychophysical measures and ACC responses on speech perception, GDT significantly predicted every speech perception index, and the ACC amplitude evoked by the temporal gap significantly predicted recognition of disyllabic words in quiet and the SRT in noise. We conclude that when the abilities to detect frequency changes and temporal gaps are considered simultaneously, frequency change detection may have no significant effect on speech perception, whereas temporal gap detection significantly predicts it.
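Forced-choice thresholds like the FCDT and GDT are typically estimated with an adaptive staircase. A minimal sketch with a simulated gap-detection listener (the abstract does not specify the adaptive rule, so the 2-down/1-up procedure and its parameters below are assumptions):

```python
import random

def two_down_one_up(respond, start, step, n_reversals=8, floor=0.0):
    """Minimal 2-down/1-up adaptive staircase, converging near the 70.7%
    correct point. `respond(level)` returns True for a correct trial.
    The threshold is the mean of the later reversal levels."""
    level, streak, direction, reversals = start, 0, -1, []
    while len(reversals) < n_reversals:
        if respond(level):
            streak += 1
            if streak == 2:              # two correct in a row -> make it harder
                streak = 0
                if direction == +1:      # turning downward after ascending: a reversal
                    reversals.append(level)
                direction = -1
                level = max(floor, level - step)
        else:                            # one wrong -> make it easier
            streak = 0
            if direction == -1:
                reversals.append(level)
            direction = +1
            level += step
    return sum(reversals[2:]) / len(reversals[2:])  # discard early reversals

# simulated listener: reliably detects gaps longer than 10 ms, guesses otherwise
rng = random.Random(0)
threshold = two_down_one_up(lambda gap_ms: gap_ms > 10 or rng.random() < 0.3,
                            start=50, step=2)
```

The staircase hovers around the level where the listener's performance crosses the target percent correct, so the averaged reversals track that level.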
Affiliation(s)
- Dianzhao Xie, Jianfen Luo, Xiuhua Chao, Jinming Li, Xianqi Liu, Zhaomin Fan, Haibo Wang, Lei Xu
- Department of Otolaryngology-Head and Neck Surgery, Shandong Provincial ENT Hospital, Cheeloo College of Medicine, Shandong University, Jinan, China
8. Differential weighting of temporal envelope cues from the low-frequency region for Mandarin sentence recognition in noise. BMC Neurosci 2022; 23:35. PMID: 35698039; PMCID: PMC9190152; DOI: 10.1186/s12868-022-00721-z.
Abstract
BACKGROUND: Temporal envelope cues are conveyed by cochlear implants (CIs) to restore hearing in patients with hearing loss. Although CIs enable users to communicate in quiet listening environments, noisy environments still pose a problem. To improve the speech-processing strategies used in Chinese CIs, we explored the relative contributions of the temporal envelope in various frequency regions to Mandarin sentence recognition in noise. METHODS: Original speech material from the Mandarin version of the Hearing in Noise Test (MHINT) was mixed with speech-shaped noise (SSN), sinusoidally amplitude-modulated speech-shaped noise (SAM SSN), and sinusoidally amplitude-modulated (SAM, 4 Hz) white noise at a +5 dB signal-to-noise ratio. Envelope information of the noise-corrupted speech was extracted from 30 contiguous bands allocated to five frequency regions. The intelligibility of the noise-corrupted speech (with temporal cues from one or two regions removed) was measured to estimate the relative weights of temporal envelope cues from the five frequency regions. RESULTS: In SSN, the mean weights of Regions 1-5 were 0.34, 0.19, 0.20, 0.16, and 0.11; in SAM SSN, 0.34, 0.17, 0.24, 0.14, and 0.11; and in SAM white noise, 0.46, 0.24, 0.22, 0.06, and 0.02, respectively. CONCLUSIONS: The results suggest that for all three types of noise, the temporal envelope in the low-frequency region transmits the greatest amount of information for Mandarin sentence recognition, which differs from the perception strategy employed in quiet listening environments.
9. Sugiyama Y. Identification of Minimal Pairs of Japanese Pitch Accent in Noise-Vocoded Speech. Front Psychol 2022; 13:887761. PMID: 35712147; PMCID: PMC9197461; DOI: 10.3389/fpsyg.2022.887761.
Abstract
The perception of lexical pitch accent in Japanese was assessed using noise-excited vocoder speech, which contained no fundamental frequency (f0) or its harmonics. While prosodic information such as lexical stress in English and lexical tone in Mandarin Chinese is known to be encoded in multiple acoustic dimensions, this multidimensionality is less well understood for lexical pitch accent in Japanese. In the present study, listeners were tested under four conditions to investigate the contribution of non-f0 properties to the perception of Japanese pitch accent: noise-vocoded speech stimuli consisting of 10 3-ERBN-wide bands or 15 2-ERBN-wide bands, created from a male and a female speaker. Listeners were able to identify minimal pairs of final-accented and unaccented words at better than chance in all conditions, indicating the presence of secondary cues to Japanese pitch accent. Subsequent analyses investigated whether listeners' ability to distinguish minimal pairs correlated with duration, intensity, or formant information. No strong or consistent correlation was found, suggesting that listeners used different cues depending on the information available in the stimuli. Furthermore, comparison of the current results with equivalent studies in English and Mandarin Chinese suggests that, although lexical prosodic information exists in multiple acoustic dimensions in Japanese, the primary cue is more salient than in those other languages.
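A noise-excited vocoder of the kind used here can be sketched as follows (a minimal illustration: the band edges and filter orders below are assumptions, not the ERB_N-spaced analysis bands of the study):

```python
import numpy as np
from scipy.signal import hilbert, butter, sosfiltfilt

def noise_vocoder(x, fs, edges, rng):
    """Minimal noise-excited vocoder: in each analysis band, the original
    carrier is replaced by same-band noise modulated by that band's Hilbert
    envelope, removing f0 and its harmonics while keeping envelope cues."""
    out = np.zeros_like(x, dtype=float)
    for lo, hi in zip(edges[:-1], edges[1:]):
        sos = butter(4, [lo, hi], btype="bandpass", fs=fs, output="sos")
        band = sosfiltfilt(sos, x)
        envelope = np.abs(hilbert(band))
        carrier = sosfiltfilt(sos, rng.standard_normal(len(x)))
        out += envelope * carrier
    return out

rng = np.random.default_rng(1)
fs = 16000
t = np.arange(fs) / fs
# toy input: a 220 Hz harmonic complex standing in for voiced speech
harmonic_tone = sum(np.sin(2 * np.pi * 220 * k * t) for k in (1, 2, 3))
vocoded = noise_vocoder(harmonic_tone, fs, [100, 400, 1600, 6400], rng)
```

The output preserves each band's slow amplitude contour but carries it on noise, which is why any residual pitch-accent identification must rest on secondary cues.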
10. Huang W, Wong LLN, Chen F. Just-Noticeable Differences of Fundamental Frequency Change in Mandarin-Speaking Children with Cochlear Implants. Brain Sci 2022; 12:443. PMID: 35447975; PMCID: PMC9031813; DOI: 10.3390/brainsci12040443.
Abstract
Fundamental frequency (F0) provides the primary acoustic cue for lexical tone perception in tonal languages but remains poorly represented in cochlear implant (CI) systems. Sensitivity to F0 change in CI users who speak tonal languages is still poorly understood. In the present study, just-noticeable differences (JNDs) for F0 contour and F0 level changes in Mandarin-speaking children with CIs were measured and compared with those of their age-matched normal-hearing (NH) peers. Children with CIs showed significantly larger JNDs for F0 contour (JND-C) and F0 level (JND-L) changes than NH children. Within-group comparison further revealed that the JND-C was significantly smaller than the JND-L among children with CIs, whereas the opposite pattern held among NH children. No significant correlations were found between JND-C or JND-L and age at implantation or duration of CI use. The contrast between children with CIs and NH children in sensitivity to F0 contour and F0 level changes suggests different mechanisms of F0 processing in the two groups as a result of different hearing experiences.
Affiliation(s)
- Wanting Huang
- Department of Electrical and Electronic Engineering, Southern University of Science and Technology, Shenzhen 518055, China
- Unit of Human Communication, Development, and Information Sciences, Faculty of Education, The University of Hong Kong, Hong Kong 999077, China
- Lena L. N. Wong
- Unit of Human Communication, Development, and Information Sciences, Faculty of Education, The University of Hong Kong, Hong Kong 999077, China
- Fei Chen
- Department of Electrical and Electronic Engineering, Southern University of Science and Technology, Shenzhen 518055, China
11. Eguchi H, Ueda K, Remijn GB, Nakajima Y, Takeichi H. The common limitations in auditory temporal processing for Mandarin Chinese and Japanese. Sci Rep 2022; 12:3002. PMID: 35194098; PMCID: PMC8863933; DOI: 10.1038/s41598-022-06925-x.
Abstract
The present investigation focused on how temporal degradation affects intelligibility in two types of languages: a tonal language (Mandarin Chinese) and a non-tonal language (Japanese). The temporal resolution of common daily-life sentences spoken by native speakers was systematically degraded by mosaicking (mosaicising), in which the power of the original speech in each regularly spaced time-frequency unit was averaged and the temporal fine structure removed. The two languages showed very similar patterns of variation in intelligibility over a wide range of temporal resolutions, implying that temporal degradation crucially affected speech cues other than tonal cues in degraded speech lacking temporal fine structure. Specifically, intelligibility in both languages remained at ceiling up to about a 40-ms segment duration, then gradually declined with increasing segment duration, reaching floor at about a 150-ms segment duration or longer. The same 40-ms limit on ceiling performance appeared for another method of degradation, local time-reversal, implying that a common temporal processing mechanism underlies these limitations. The general tendency is consistent with a dual time-window model of speech processing, in which a short (~20-30 ms) and a long (~200 ms) time window run in parallel.
Affiliation(s)
- Hikaru Eguchi
- Human Science Course, Graduate School of Design, Kyushu University, 4-9-1 Shiobaru, Minami-ku, Fukuoka 815-8540, Japan
- Kazuo Ueda
- Department of Human Science, Faculty of Design/Research Center for Applied Perceptual Science/Research and Development Center for Five-Sense Devices, Kyushu University, 4-9-1 Shiobaru, Minami-ku, Fukuoka 815-8540, Japan
- Gerard B Remijn
- Department of Human Science, Faculty of Design/Research Center for Applied Perceptual Science, Kyushu University, 4-9-1 Shiobaru, Minami-ku, Fukuoka 815-8540, Japan
- Yoshitaka Nakajima
- Department of Human Science, Faculty of Design/Research Center for Applied Perceptual Science/Research and Development Center for Five-Sense Devices, Kyushu University, 4-9-1 Shiobaru, Minami-ku, Fukuoka 815-8540, Japan
- Sound Corporation, 4-10-30-103 Tonoharu, Higashi-ku, Fukuoka 813-0001, Japan
- Hiroshige Takeichi
- Open Systems Information Science Team, Advanced Data Science Project (ADSP), RIKEN Information R&D and Strategy Headquarters (R-IH), RIKEN, 1-7-22 Suehiro-cho, Tsurumi-ku, Yokohama, Kanagawa 230-0045, Japan
12. Zhang H, Wiener S, Holt LL. Adjustment of cue weighting in speech by speakers and listeners: Evidence from amplitude and duration modifications of Mandarin Chinese tone. J Acoust Soc Am 2022; 151:992. PMID: 35232077; PMCID: PMC8846952; DOI: 10.1121/10.0009378.
Abstract
Speech contrasts are signaled by multiple acoustic dimensions, but these dimensions are not equally diagnostic. Moreover, the relative diagnosticity, or weight, of acoustic dimensions in speech can shift in different communicative contexts for both speech perception and speech production. However, the literature remains unclear on whether, and if so how, talkers adjust speech to emphasize different acoustic dimensions in the context of changing communicative demands. Here, we examine the interplay of flexible cue weights in speech production and perception for amplitude and duration, two secondary, non-spectral acoustic dimensions of phonated Mandarin Chinese lexical tone, across natural speech and whispering, which eliminates the fundamental frequency contour, the primary acoustic dimension. Phonated and whispered Mandarin productions from native talkers revealed enhancement of both duration and amplitude cues in whispered compared to phonated speech. When nonspeech amplitude-modulated noises modeled these patterns of enhancement, identification of the noises as Mandarin lexical tone categories was more accurate than identification of noises modeling phonated speech amplitude and duration cues. Thus, speakers exaggerate secondary cues in whispered speech, and listeners make use of this information. Yet, enhancement is not symmetric among the four Mandarin lexical tones, indicating possible constraints on the realization of this enhancement.
Affiliation(s)
- Hui Zhang
- Speech-Language-Hearing Center, School of Foreign Languages, Shanghai Jiao Tong University, 800 Dongchuan Road, Shanghai 200240, China
- Seth Wiener
- Department of Modern Languages, Carnegie Mellon University, 5000 Forbes Avenue, Pittsburgh, Pennsylvania 15213, USA
- Lori L Holt
- Department of Psychology and Neuroscience Institute, Carnegie Mellon University, 5000 Forbes Avenue, Pittsburgh, Pennsylvania 15213, USA
13
Zheng Z, Li K, Feng G, Guo Y, Li Y, Xiao L, Liu C, He S, Zhang Z, Qian D, Feng Y. Relative Weights of Temporal Envelope Cues in Different Frequency Regions for Mandarin Vowel, Consonant, and Lexical Tone Recognition. Front Neurosci 2021; 15:744959. [PMID: 34924928 PMCID: PMC8678109 DOI: 10.3389/fnins.2021.744959] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/22/2021] [Accepted: 11/15/2021] [Indexed: 12/04/2022] Open
Abstract
Objectives: Mandarin-speaking users of cochlear implants (CI) perform more poorly than their English-speaking counterparts. This may be because present CI speech coding schemes are largely based on English. This study aims to evaluate the relative contributions of temporal envelope (E) cues to Mandarin phoneme (vowel and consonant) and lexical tone recognition to provide information for speech coding schemes specific to Mandarin. Design: Eleven normal-hearing subjects were studied using acoustic temporal E cues that were extracted from 30 continuous frequency bands between 80 and 7,562 Hz using the Hilbert transform and divided into five frequency regions. Percent-correct recognition scores were obtained with acoustic E cues presented in three, four, and five frequency regions, and their relative weights were calculated using the least-square approach. Results: For stimuli with three, four, and five frequency regions, percent-correct scores for vowel recognition using E cues were 50.43–84.82%, 76.27–95.24%, and 96.58%, respectively; for consonant recognition, 35.49–63.77%, 67.75–78.87%, and 87.87%; and for lexical tone recognition, 60.80–97.15%, 73.16–96.87%, and 96.73%. From frequency region 1 to frequency region 5, the mean weights in vowel recognition were 0.17, 0.31, 0.22, 0.18, and 0.12, respectively; in consonant recognition, 0.10, 0.16, 0.18, 0.23, and 0.33; and in lexical tone recognition, 0.38, 0.18, 0.14, 0.16, and 0.14. Conclusion: The region that contributed most to vowel recognition was Region 2 (502–1,022 Hz), which contains first-formant (F1) information; Region 5 (3,856–7,562 Hz) contributed most to consonant recognition; and Region 1 (80–502 Hz), which contains fundamental frequency (F0) information, contributed most to lexical tone recognition.
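The least-square weighting analysis mentioned in the abstract can be illustrated with a toy version: regress block scores on indicators of which frequency regions carried envelope cues, then normalize the fitted coefficients so they sum to one. Everything below (the `relative_weights` helper, the indicator design) is an assumed sketch, not the study's actual analysis code.

```python
def solve(A, b):
    """Gaussian elimination with partial pivoting for a small square system."""
    n = len(A)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][c] * x[c] for c in range(r + 1, n))) / M[r][r]
    return x

def relative_weights(presence, scores):
    """Least-squares weights of each frequency region, normalized to sum to 1.

    presence[i][j] = 1 if region j carried envelope cues on trial block i.
    scores[i] = percent-correct for that block.
    """
    n = len(presence[0])
    # Normal equations: (X^T X) w = X^T y
    XtX = [[sum(p[a] * p[b] for p in presence) for b in range(n)]
           for a in range(n)]
    Xty = [sum(p[a] * s for p, s in zip(presence, scores)) for a in range(n)]
    w = solve(XtX, Xty)
    total = sum(w)
    return [wi / total for wi in w]
```

With real data, the normalized weights would play the role of the per-region means reported in the Results section.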
Affiliation(s)
- Zhong Zheng
- Department of Otolaryngology-Head and Neck Surgery, Shanghai Jiao Tong University Affiliated Sixth People's Hospital, Shanghai, China; Shanghai Key Laboratory of Sleep Disordered Breathing, Shanghai, China
- Keyi Li
- Sydney Institute of Language and Commerce, Shanghai University, Shanghai, China
- Gang Feng
- Department of Graduate, The First Affiliated Hospital of Jinzhou Medical University, Jinzhou, China
- Yang Guo
- Ear, Nose, and Throat Institute and Otorhinolaryngology Department, Eye and ENT Hospital of Fudan University, Shanghai, China
- Yinan Li
- Department of Otolaryngology-Head and Neck Surgery, Shanghai Jiao Tong University Affiliated Sixth People's Hospital, Shanghai, China; Shanghai Key Laboratory of Sleep Disordered Breathing, Shanghai, China
- Lili Xiao
- Department of Otolaryngology-Head and Neck Surgery, Shanghai Jiao Tong University Affiliated Sixth People's Hospital, Shanghai, China; Shanghai Key Laboratory of Sleep Disordered Breathing, Shanghai, China
- Chengqi Liu
- Department of Otolaryngology-Head and Neck Surgery, Shanghai Jiao Tong University Affiliated Sixth People's Hospital, Shanghai, China; Shanghai Key Laboratory of Sleep Disordered Breathing, Shanghai, China
- Shouhuan He
- Department of Otolaryngology, Qingpu Branch of Zhongshan Hospital Affiliated to Fudan University, Shanghai, China
- Zhen Zhang
- Department of Otolaryngology-Head and Neck Surgery, Shanghai Jiao Tong University Affiliated Sixth People's Hospital, Shanghai, China; Shanghai Key Laboratory of Sleep Disordered Breathing, Shanghai, China
- Di Qian
- Department of Otolaryngology, Shenzhen Longhua District People's Hospital, Shenzhen, China
- Yanmei Feng
- Department of Otolaryngology-Head and Neck Surgery, Shanghai Jiao Tong University Affiliated Sixth People's Hospital, Shanghai, China; Shanghai Key Laboratory of Sleep Disordered Breathing, Shanghai, China
14
Abstract
OBJECTIVES Whispered speech offers a unique set of challenges to speech perception and word recognition. The goals of the present study were twofold: first, to determine how listeners recognize whispered speech; second, to inform major theories of spoken word recognition by considering how recognition changes when major cues to phoneme identity are reduced or largely absent compared with normal voiced speech. DESIGN Using eye tracking in the Visual World Paradigm, we examined how listeners recognize whispered speech. After hearing a target word (normal or whispered), participants selected the corresponding image from a display of four: a target (e.g., money), a word that shares sounds with the target at the beginning (cohort competitor, e.g., mother), a word that shares sounds with the target at the end (rhyme competitor, e.g., honey), and a phonologically unrelated word (e.g., whistle). Eye movements to each object were monitored to measure (1) how fast listeners process whispered speech, and (2) how strongly they consider lexical competitors (cohorts and rhymes) as the speech signal unfolds. RESULTS Listeners were slower to recognize whispered words. Compared with normal speech, listeners displayed slower reaction times to click the target image, were slower to fixate the target, and fixated the target less overall. Further, we found clear evidence that the dynamics of lexical competition are altered during whispered speech recognition. Relative to normal speech, words that overlapped with the target at the beginning (cohorts) showed slower, reduced, and delayed activation, whereas words that overlapped with the target at the end (rhymes) exhibited faster, more robust, and longer-lasting activation. CONCLUSION When listeners are confronted with whispered speech, they engage in a "wait-and-see" approach.
Listeners delay lexical access, and by the time they begin to consider what word they are hearing, the beginning of the word has largely come and gone, and activation for cohorts is reduced. However, delays in lexical access actually increase consideration of rhyme competitors; the delay pushes lexical activation to a point later in processing, and the recognition system puts more weight on the word-final overlap between the target and the rhyme.
15
The Role of Lexical Tone Information in the Recognition of Mandarin Sentences in Listeners With Hearing Aids. Ear Hear 2021; 41:532-538. [PMID: 31369470 DOI: 10.1097/aud.0000000000000774] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/26/2022]
Abstract
OBJECTIVES Lexical tone information provides redundant cues for the recognition of Mandarin sentences by listeners with normal hearing in quiet conditions. The contribution of lexical tones to Mandarin sentence recognition in listeners with hearing aids (HAs) is unclear. This study aimed to remove lexical tone information and examine the effects on Mandarin sentence intelligibility in HA users. The second objective was to investigate the contribution of cognitive abilities (i.e., general cognitive ability, working memory, and attention) to Mandarin sentence perception when the presented lexical tone information was mismatched. DESIGN A text-to-speech synthesis engine was used to manipulate Mandarin sentences into three test conditions: (1) a Normal Tone condition, where no alterations were made to lexical tones within sentences; (2) a Flat Tone condition, where all lexical tones were changed into tone 1 (i.e., the flat tone); and (3) a Random Tone condition, where each word in the test sentences was randomly assigned one of the four Mandarin lexical tones. The manipulated sentence signals were presented to 32 listeners with HAs in both quiet and noisy environments at an 8 dB signal-to-noise ratio. RESULTS Speech intelligibility was reduced significantly (by approximately 40 percentage points) in the presence of mismatched lexical tone information in both quiet and noise. The difficulty in correctly identifying sentences with mismatched lexical tones among adults with hearing loss was significantly greater than that of adults with normal hearing. Cognitive function was not significantly related to the decline in speech recognition scores. CONCLUSIONS Contextual and other phonemic cues (i.e., consonants and vowels) are inadequate for HA users to perceive sentences with mismatched lexical tone contours in quiet or noise. Also, HA users with better cognitive function could not compensate for the loss of lexical tone information.
These results highlight the importance of accurately representing lexical tone information for Mandarin speakers using HAs.
16
Fan L, Kong L, Li L, Qu T. Sensitivity to a Break in Interaural Correlation in Frequency-Gliding Noises. Front Psychol 2021; 12:692785. [PMID: 34220654 PMCID: PMC8247655 DOI: 10.3389/fpsyg.2021.692785] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/09/2021] [Accepted: 05/25/2021] [Indexed: 11/29/2022] Open
Abstract
This study investigated whether human listeners are able to detect a binaurally uncorrelated arbitrary-noise fragment embedded in binaurally identical arbitrary-noise markers [a break in interaural correlation (BIAC)] in either frequency-constant (frequency-steady) or frequency-varied (unidirectionally frequency-gliding) noise. Ten participants with normal hearing were tested in Experiment 1 for up-gliding, down-gliding, and frequency-steady noises. Twenty-one participants with normal hearing were tested in Experiment 2a for both up-gliding and frequency-steady noises. Another nineteen participants with normal hearing were tested in Experiment 2b for both down-gliding and frequency-steady noises. Listeners were able to detect a BIAC in the frequency-steady noise (center frequency = 400 Hz) and in the two types of frequency-gliding noises (center frequency: between 100 and 1,600 Hz). The duration threshold for detecting the BIAC in frequency-gliding noises was significantly longer than that in the frequency-steady noise (Experiment 1), and the longest interaural delay at which a duration-fixed BIAC (200 ms) in frequency-gliding noises could be detected was significantly shorter than that in the frequency-steady noise (Experiment 2). Although human listeners can detect a BIAC in frequency-gliding noises, their sensitivity to a BIAC in frequency-gliding noises is much lower than in frequency-steady noise.
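Detecting a break in interaural correlation amounts to comparing the left- and right-ear waveforms over time. A toy sketch is given below, computing the Pearson correlation between the two channels in successive windows (the window length and function names are illustrative, not the study's method): binaurally identical stretches give r near 1, while an uncorrelated fragment pulls r toward 0.

```python
def interaural_correlation(left, right, win):
    """Pearson correlation between the two ear signals in successive,
    non-overlapping windows of length win. Assumes each window contains
    some variation (a constant window would give a zero denominator).
    """
    def pearson(x, y):
        n = len(x)
        mx, my = sum(x) / n, sum(y) / n
        cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
        vx = sum((a - mx) ** 2 for a in x)
        vy = sum((b - my) ** 2 for b in y)
        return cov / (vx * vy) ** 0.5
    return [pearson(left[i:i + win], right[i:i + win])
            for i in range(0, len(left) - win + 1, win)]
```

A windowed dip in the returned correlation track would mark the location of the embedded BIAC fragment.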
Affiliation(s)
- Langchen Fan
- Beijing Key Laboratory of Behavior and Mental Health, School of Psychological and Cognitive Sciences, Peking University, Beijing, China; Key Laboratory on Machine Perception (Ministry of Education), Department of Machine Intelligence, Peking University, Beijing, China
- Lingzhi Kong
- Language Pathology and Brain Science MEG Lab, School of Communication Sciences, Beijing Language and Culture University, Beijing, China
- Liang Li
- Beijing Key Laboratory of Behavior and Mental Health, School of Psychological and Cognitive Sciences, Peking University, Beijing, China; Key Laboratory on Machine Perception (Ministry of Education), Department of Machine Intelligence, Peking University, Beijing, China
- Tianshu Qu
- Key Laboratory on Machine Perception (Ministry of Education), Department of Machine Intelligence, Peking University, Beijing, China
17
Morgan SD. Comparing Emotion Recognition and Word Recognition in Background Noise. JOURNAL OF SPEECH, LANGUAGE, AND HEARING RESEARCH : JSLHR 2021; 64:1758-1772. [PMID: 33830784 DOI: 10.1044/2021_jslhr-20-00153] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/12/2023]
Abstract
Purpose Word recognition in quiet and in background noise has been thoroughly investigated in previous research to establish segmental speech recognition performance as a function of stimulus characteristics (e.g., audibility). Similar methods to investigate recognition performance for suprasegmental information (e.g., acoustic cues used to make judgments of talker age, sex, or emotional state) have not been performed. In this work, we directly compared emotion and word recognition performance in different levels of background noise to identify psychoacoustic properties of emotion recognition (globally and for specific emotion categories) relative to word recognition. Method Twenty young adult listeners with normal hearing listened to sentences and either reported a target word in each sentence or selected the emotion of the talker from a list of options (angry, calm, happy, and sad) at four signal-to-noise ratios in a background of white noise. Psychometric functions were fit to the recognition data and used to estimate thresholds (midway points on the function) and slopes for word and emotion recognition. Results Thresholds for emotion recognition were approximately 10 dB better than word recognition thresholds, and slopes for emotion recognition were half of those measured for word recognition. Low-arousal emotions had poorer thresholds and shallower slopes than high-arousal emotions, suggesting greater confusion when distinguishing low-arousal emotional speech content. Conclusions Communication of a talker's emotional state continues to be perceptible to listeners in competitive listening environments, even after words are rendered inaudible. The arousal of emotional speech affects listeners' ability to discriminate between emotion categories.
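The psychometric-function analysis above can be sketched with a standard logistic form: the threshold parameter is the SNR at the function's midway point, and the slope parameter sets its steepness. The parameterization below, including the guess and lapse terms, is an illustrative sketch, not the study's actual fitting code.

```python
import math

def logistic_psychometric(snr, threshold, slope, guess=0.0, lapse=0.0):
    """Proportion correct at a given SNR for a logistic psychometric function.

    threshold: SNR at the midway point of the function (the quantity
    compared between word and emotion recognition in the study).
    slope: steepness of the function around that midpoint.
    guess/lapse: optional floor and ceiling corrections.
    """
    core = 1.0 / (1.0 + math.exp(-slope * (snr - threshold)))
    return guess + (1.0 - guess - lapse) * core
```

Under this parameterization, a 10 dB better emotion-recognition threshold simply shifts the whole curve 10 dB leftward, and a halved slope flattens it.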
Affiliation(s)
- Shae D Morgan
- Department of Otolaryngology - Head and Neck Surgery and Communicative Disorders, University of Louisville, KY
18
Wang X, Xu L. Speech perception in noise: Masking and unmasking. J Otol 2021; 16:109-119. [PMID: 33777124 PMCID: PMC7985001 DOI: 10.1016/j.joto.2020.12.001] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/09/2020] [Revised: 12/03/2020] [Accepted: 12/06/2020] [Indexed: 11/23/2022] Open
Abstract
Speech perception is essential for daily communication. Background noise or concurrent talkers, however, can make it challenging for listeners to track the target speech (i.e., the cocktail party problem). The present study reviews and compares existing findings on speech perception and unmasking in cocktail-party listening environments in English and Mandarin Chinese. The review starts with an introduction, followed by related concepts of auditory masking. The next two sections review factors that release speech perception from masking in English and Mandarin Chinese, respectively. The last section presents an overall summary of the findings, with comparisons between the two languages. Future research directions regarding the differences in the literature on the reviewed topic between the two languages are also discussed.
Affiliation(s)
- Xianhui Wang
- Communication Sciences and Disorders, Ohio University, Athens, OH, 45701, USA
- Li Xu
- Communication Sciences and Disorders, Ohio University, Athens, OH, 45701, USA
19
Rhee N, Chen A, Kuang J. Going beyond F0: The acquisition of Mandarin tones. JOURNAL OF CHILD LANGUAGE 2021; 48:387-398. [PMID: 32393402 DOI: 10.1017/s0305000920000239] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/11/2023]
Abstract
Using a semi-spontaneous speech corpus, we present evidence from computational modelling of tonal productions from Mandarin-speaking children (4 to 11 years old) and adults, showing that children exceed adult-level tonal distinction at the age of 7 to 8 years using F0 cues, but do not reach the high adult-level distinction using spectral cues even at the age of 10 to 11 years. The difference in the developmental curves of F0 and spectral cues suggests that, in Mandarin tone production, secondary cues continue to develop even after the mastery of primary cues.
20
Speech Segregation in Active Middle Ear Stimulation: Masking Release With Changing Fundamental Frequency. Ear Hear 2020; 42:709-717. [PMID: 33369941 DOI: 10.1097/aud.0000000000000973] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/26/2022]
Abstract
OBJECTIVES Temporal fine structure information such as low-frequency sounds including the fundamental frequency (F0) is important to separate different talkers in noisy environments. Speech perception in noise is negatively affected by reduced temporal fine structure resolution in cochlear hearing loss. It has been shown that normal-hearing (NH) people as well as cochlear implant patients with preserved acoustic low-frequency hearing benefit from different F0 between concurrent talkers. Though patients with an active middle ear implant (AMEI) report better sound quality compared with hearing aids, they often struggle when listening in noise. The primary objective was to evaluate whether or not patients with a Vibrant Soundbridge AMEI were able to benefit from F0 differences in a concurrent talker situation and if the effect was comparable to NH individuals. DESIGN A total of 13 AMEI listeners and 13 NH individuals were included. A modified variant of the Oldenburg sentence test was used to emulate a concurrent talker scenario. One sentence from the test corpus served as the masker and the remaining sentences as target speech. The F0 of the masker sentence was shifted upward by 4, 8, and 12 semitones. The target and masker sentences were presented simultaneously to the study subjects and the speech reception threshold was assessed by adaptively varying the masker level. To evaluate any impact of the occlusion effect on speech perception, AMEI listeners were tested in two configurations: with a plugged ear-canal contralateral to the implant side, indicated as AMEIcontra, or with both ears plugged, indicated as AMEIboth. RESULTS In both study groups, speech perception improved when the F0 difference between target and masker increased. This was significant when the difference was at least 8 semitones; the F0-based release from masking was 3.0 dB in AMEIcontra (p = 0.009) and 2.9 dB in AMEIboth (p = 0.015), compared with 5.6 dB in NH listeners (p < 0.001). 
A difference of 12 semitones revealed a F0-based release from masking of 3.5 dB in the AMEIcontra (p = 0.002) and 3.4 dB in the AMEIboth (p = 0.003) condition, compared with 5.0 dB in NH individuals (p < 0.001). CONCLUSIONS Though AMEI users deal with problems resulting from cochlear damage, hearing amplification with the implant enables a masking release based on F0 differences when F0 between a target and masker sentence was at least 8 semitones. Additional occlusion of the ear canal on the implant side did not affect speech performance. The current results complement the knowledge about the benefit of F0 within the acoustic low-frequency hearing.
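The masker F0 shifts used above (4, 8, and 12 semitones) follow the standard equal-tempered relation f = f0 · 2^(n/12). A one-line sketch, with an illustrative function name:

```python
def shift_semitones(f0_hz, semitones):
    # One semitone corresponds to a frequency ratio of 2**(1/12),
    # so an n-semitone upward shift multiplies F0 by 2**(n/12).
    return f0_hz * 2.0 ** (semitones / 12.0)
```

For example, a 12-semitone shift doubles the F0 (one octave), while the 8-semitone shift at which masking release became significant corresponds to a ratio of about 1.59.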
21
Huang W, Wong LLN, Chen F, Liu H, Liang W. Effects of Fundamental Frequency Contours on Sentence Recognition in Mandarin-Speaking Children With Cochlear Implants. JOURNAL OF SPEECH, LANGUAGE, AND HEARING RESEARCH : JSLHR 2020; 63:3855-3864. [PMID: 33022190 DOI: 10.1044/2020_jslhr-20-00033] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/11/2023]
Abstract
Purpose Fundamental frequency (F0) is the primary acoustic cue for lexical tone perception in tonal languages but is processed in a limited way in cochlear implant (CI) systems. The aim of this study was to evaluate the importance of F0 contours in sentence recognition in Mandarin-speaking children with CIs and find out whether it is similar to/different from that in age-matched normal-hearing (NH) peers. Method Age-appropriate sentences, with F0 contours manipulated to be either natural or flattened, were randomly presented to preschool children with CIs and their age-matched peers with NH under three test conditions: in quiet, in white noise, and with competing sentences at 0 dB signal-to-noise ratio. Results The neutralization of F0 contours resulted in a significant reduction in sentence recognition. While this was seen only in noise conditions among NH children, it was observed throughout all test conditions among children with CIs. Moreover, the F0 contour-induced accuracy reduction ratios (i.e., the reduction in sentence recognition resulting from the neutralization of F0 contours compared to the normal F0 condition) were significantly greater in children with CIs than in NH children in all test conditions. Conclusions F0 contours play a major role in sentence recognition in both quiet and noise among pediatric implantees, and the contribution of the F0 contour is even more salient than that in age-matched NH children. These results also suggest that there may be differences between children with CIs and NH children in how F0 contours are processed.
Affiliation(s)
- Wanting Huang
- Unit of Human Communication, Development, and Information Sciences, Faculty of Education, The University of Hong Kong, China
- Lena L N Wong
- Unit of Human Communication, Development, and Information Sciences, Faculty of Education, The University of Hong Kong, China
- Fei Chen
- Department of Electrical and Electronic Engineering, Southern University of Science and Technology, Shenzhen, China
- Haihong Liu
- Beijing Key Laboratory of Pediatric Diseases of Otolaryngology, Head and Neck Surgery, Beijing Children's Hospital, China
- Wei Liang
- China Rehabilitation Research Center for Hearing and Speech Impairment, Beijing, China
22
Zhou Q, Bi J, Song H, Gu X, Liu B. Mandarin lexical tone recognition in bimodal cochlear implant users. Int J Audiol 2020; 59:548-555. [PMID: 32302240 DOI: 10.1080/14992027.2020.1719437] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 10/24/2022]
Abstract
Objective: To assess the recognition of lexical tones in Mandarin-speaking bimodal cochlear implant (CI) subjects. Design: Lexical tone recognition in quiet and in noise (SNR = +5 dB) was measured with electric stimulation (CI alone) or bimodal stimulation (CI + hearing aid (HA)). The recognition and confusion rates of the four tones (T1, T2, T3, and T4) were analysed. Spearman correlation analysis was performed to examine the relationship between hearing levels in the contralateral ear and bimodal benefits. Study sample: Twenty native Mandarin-speaking bimodal CI users, aged 16 to 49 years. Results: Relative to CI alone, mean tone recognition with CI + HA improved significantly, from 84.1% to 92.1% correct in quiet (+8 percentage points) and from 57.9% to 73.1% correct in noise (+15.2 percentage points). Tone confusions between T2 and T3 were the most prominent in all test conditions, and T4 tended to be labelled as T3 in noise. There was no significant correlation between the bimodal benefits for tone recognition and the unaided or HA-aided pure-tone thresholds at 0.25 kHz. Conclusion: Listeners with CI + HA exhibited significantly better tone recognition than with CI alone. The bimodal advantage for tone recognition was greater in noise than in quiet, perhaps due to a ceiling effect in quiet.
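The tone confusion analysis in this entry (e.g., T2/T3 confusions, T4 labelled as T3 in noise) reduces to building a row-normalized confusion matrix over (presented, perceived) response pairs. A minimal, hypothetical sketch:

```python
def confusion_matrix(responses, labels=("T1", "T2", "T3", "T4")):
    """Tone confusion rates from (presented, perceived) pairs.

    Returns row-normalized proportions: matrix[i][j] is the proportion of
    presentations of labels[i] that were perceived as labels[j]. A toy
    version of the confusion analysis described in the abstract.
    """
    idx = {t: i for i, t in enumerate(labels)}
    counts = [[0] * len(labels) for _ in labels]
    for presented, perceived in responses:
        counts[idx[presented]][idx[perceived]] += 1
    return [[c / sum(row) if sum(row) else 0.0 for c in row]
            for row in counts]
```

Off-diagonal mass in a given row (say, the T2 row's T3 column) quantifies the kind of confusion the study reports.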
Affiliation(s)
- Qian Zhou
- Beijing Tongren Hospital, Beijing Institute of Otolaryngology, Capital Medical University, Beijing, China; Key Laboratory of Otolaryngology Head and Neck Surgery, Capital Medical University, Ministry of Education, Beijing, China
- Jintao Bi
- Beijing Tongren Hospital, Beijing Institute of Otolaryngology, Capital Medical University, Beijing, China; Key Laboratory of Otolaryngology Head and Neck Surgery, Capital Medical University, Ministry of Education, Beijing, China
- Haoheng Song
- Beijing Tongren Hospital, Beijing Institute of Otolaryngology, Capital Medical University, Beijing, China; Key Laboratory of Otolaryngology Head and Neck Surgery, Capital Medical University, Ministry of Education, Beijing, China
- Xin Gu
- Beijing Tongren Hospital, Beijing Institute of Otolaryngology, Capital Medical University, Beijing, China; Key Laboratory of Otolaryngology Head and Neck Surgery, Capital Medical University, Ministry of Education, Beijing, China
- Bo Liu
- Beijing Tongren Hospital, Beijing Institute of Otolaryngology, Capital Medical University, Beijing, China; Key Laboratory of Otolaryngology Head and Neck Surgery, Capital Medical University, Ministry of Education, Beijing, China
23
Tupper P, Leung K, Wang Y, Jongman A, Sereno JA. Characterizing the distinctive acoustic cues of Mandarin tones. THE JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA 2020; 147:2570. [PMID: 32359306 DOI: 10.1121/10.0001024] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.8] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 10/21/2019] [Accepted: 03/18/2020] [Indexed: 06/11/2023]
Abstract
This study aims to characterize distinctive acoustic features of Mandarin tones based on a corpus of 1025 monosyllabic words produced by 21 native Mandarin speakers. For each tone, 22 acoustic cues were extracted. Besides standard F0, duration, and intensity measures, further cues were determined by fitting two mathematical functions to the pitch contours. The first function is a parabola, which gives three parameters: a mean F0, an F0 slope, and an F0 second derivative. The second is a broken-line function, which models the contour as a continuous curve consisting of two lines with a single breakpoint. Cohen's d, sparse Principal Component Analysis, and other statistical measures are used to identify which of the cues, and which combinations of the cues, are important for distinguishing each tone from each other among all the speakers. Although the specific cues that best characterize the tone contours depend on the particular tone and the statistical measure used, this paper shows that the three cues obtained by fitting a parabola to the tone contour are broadly effective. This research suggests using these three cues as a canonical choice for defining tone characteristics.
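The first fitting procedure in this abstract (a parabola over the F0 contour yielding a mean F0, an F0 slope, and an F0 second derivative) can be sketched as a small least-squares fit. This is a hypothetical reconstruction; the exact parameterization used in the paper may differ.

```python
def fit_parabola(times, f0s):
    """Least-squares parabola f0(t) ~ c0 + c1*t + c2*t**2 over a pitch
    contour, returning three cues analogous to those in the study: the
    mean F0 of the fit, its slope at the contour midpoint, and its
    (constant) second derivative.
    """
    n = len(times)
    def s(p):               # sum of t**p
        return sum(t ** p for t in times)
    def sy(p):              # sum of f0 * t**p
        return sum(f * t ** p for t, f in zip(times, f0s))
    # Normal equations for the three polynomial coefficients.
    A = [[s(0), s(1), s(2)],
         [s(1), s(2), s(3)],
         [s(2), s(3), s(4)]]
    b = [sy(0), sy(1), sy(2)]
    # Solve the 3x3 system by Gaussian elimination with partial pivoting.
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(3):
        piv = max(range(col, 3), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, 3):
            f = M[r][col] / M[col][col]
            for c in range(col, 4):
                M[r][c] -= f * M[col][c]
    coef = [0.0, 0.0, 0.0]
    for r in (2, 1, 0):
        coef[r] = (M[r][3] - sum(M[r][c] * coef[c]
                                 for c in range(r + 1, 3))) / M[r][r]
    c0, c1, c2 = coef
    t_mid = (times[0] + times[-1]) / 2
    mean_f0 = sum(c0 + c1 * t + c2 * t * t for t in times) / n
    return mean_f0, c1 + 2 * c2 * t_mid, 2 * c2
```

The broken-line fit described in the abstract would add a breakpoint search on top of two such linear fits.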
Affiliation(s)
- Paul Tupper
- Department of Mathematics, Simon Fraser University, Burnaby, British Columbia V5A 1S6, Canada
- Keith Leung
- Department of Linguistics, Simon Fraser University, Burnaby, British Columbia V5A 1S6, Canada
- Yue Wang
- Department of Linguistics, Simon Fraser University, Burnaby, British Columbia V5A 1S6, Canada
- Allard Jongman
- Department of Linguistics, University of Kansas, Lawrence, Kansas 66045, USA
- Joan A Sereno
- Department of Linguistics, University of Kansas, Lawrence, Kansas 66045, USA
24
Wang X, Xu L. Mandarin tone perception in multiple-talker babbles and speech-shaped noise. THE JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA 2020; 147:EL307. [PMID: 32359323 PMCID: PMC7127911 DOI: 10.1121/10.0001002] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Received: 01/15/2020] [Revised: 03/11/2020] [Accepted: 03/12/2020] [Indexed: 06/11/2023]
Abstract
Lexical tone recognition in multiple-talker babble (N = 1, 2, 4, 8, 10, or 12 talkers) and in speech-shaped noise at different signal-to-noise ratios (SNRs = -18 to -6 dB) was tested in 30 normal-hearing native Mandarin-speaking listeners. Results showed that tone perception was robust to noise. The performance curve as a function of N was non-monotonic. The breakpoint at which performance plateaued was N = 8 for all SNRs tested, with a slight improvement at N > 8 at -6 and -9 dB SNR.
Affiliation(s)
- Xianhui Wang
- Communication Sciences and Disorders, Ohio University, Athens, Ohio 45701, USA
- Li Xu
- Communication Sciences and Disorders, Ohio University, Athens, Ohio 45701, USA
25
Cui A, Kuang J. The effects of musicality and language background on cue integration in pitch perception. THE JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA 2019; 146:4086. [PMID: 31893734 DOI: 10.1121/1.5134442] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 06/07/2019] [Accepted: 10/28/2019] [Indexed: 06/10/2023]
Abstract
Pitch perception involves the processing of multidimensional acoustic cues, and listeners can exhibit different cue integration strategies in interpreting pitch. This study aims to examine whether musicality and language experience have effects on listeners' pitch perception strategies. Both Mandarin and English listeners were recruited to participate in two experiments: (1) a pitch classification experiment that tested their relative reliance on f0 and spectral cues, and (2) the Montreal Battery of Evaluation of Musical Abilities that objectively quantified their musical aptitude as continuous musicality scores. Overall, the results show a strong musicality effect: Listeners with higher musicality scores relied more on f0 in pitch perception, while listeners with lower musicality scores were more likely to attend to spectral cues. However, there were no effects of language experience on musicality scores or cue integration strategies in pitch perception. These results suggest that less musical or even amusic subjects may not suffer impairment in linguistic pitch processing due to the multidimensional nature of pitch cues.
Affiliation(s)
- Aletheia Cui
- Department of Linguistics, University of Pennsylvania, 3401-C Walnut Street, Suite 300, Philadelphia, Pennsylvania 19104, USA
- Jianjing Kuang
- Department of Linguistics, University of Pennsylvania, 3401-C Walnut Street, Suite 300, Philadelphia, Pennsylvania 19104, USA

26
Jain S, Nataraja NP. The Relationship between Temporal Integration and Temporal Envelope Perception in Noise by Males with Mild Sensorineural Hearing Loss. J Int Adv Otol 2019; 15:257-262. [PMID: 31418715 DOI: 10.5152/iao.2019.6555] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/22/2022] Open
Abstract
OBJECTIVES A large body of literature indicates that temporal integration and temporal envelope perception contribute substantially to the perception of speech. A review of the literature showed that speech perception involving temporal integration and temporal envelope perception in noise may be affected by sensorineural hearing loss, but to a varying degree. Because temporal integration and the temporal envelope share similar physiological processing at the cochlear level, the present study aimed to identify the relationship between temporal integration and temporal envelope perception in noise in individuals with mild sensorineural hearing loss. MATERIALS AND METHODS Thirty adult males with mild sensorineural hearing loss and thirty age- and gender-matched normal-hearing individuals volunteered as participants. Temporal integration was measured using synthetic consonant-vowel-consonant syllables, varied for onset, offset, and onset-offset of the second and third formant frequencies of the vowel following and preceding the consonants in six equal steps, thus forming six-step onset, offset, and onset-offset continua. The duration of the transition was kept short (40 ms) in one set of continua and long (80 ms) in another. Temporal integration scores were calculated as the differences in the identification of the categorical boundary between the short- and long-transition continua. Temporal envelope perception was measured using sentences processed in quiet, 0 dB, and -5 dB signal-to-noise ratios at 4, 8, 16, and 32 frequency channels, with the temporal envelope extracted for each sentence using the Hilbert transform. RESULTS A significant effect of hearing loss was observed on temporal integration, but not on temporal envelope perception. However, when temporal integration abilities were controlled, a variable effect of hearing loss on temporal envelope perception was noted.
CONCLUSION Temporal integration must be measured to accurately account for envelope perception in individuals with normal hearing and those with hearing loss.
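The Hilbert-based envelope extraction this abstract describes can be illustrated with a short sketch; the filter order, cutoff, and test signal below are illustrative assumptions, not parameters from the paper:

```python
import numpy as np
from scipy.signal import butter, hilbert, sosfiltfilt

def temporal_envelope(x, fs, cutoff_hz):
    """Temporal envelope of a signal: magnitude of the Hilbert analytic
    signal, smoothed by a zero-phase low-pass filter at cutoff_hz."""
    env = np.abs(hilbert(x))
    sos = butter(4, cutoff_hz / (fs / 2), btype="low", output="sos")
    return sosfiltfilt(sos, env)

# Hypothetical test signal: a 1 kHz carrier with a 4 Hz amplitude modulation
fs = 16000
t = np.arange(fs) / fs
x = (1 + 0.8 * np.sin(2 * np.pi * 4 * t)) * np.sin(2 * np.pi * 1000 * t)
env = temporal_envelope(x, fs, cutoff_hz=64.0)  # recovers the 4 Hz modulator
```

Per-channel envelope extraction, as in the study's 4- to 32-channel conditions, would apply the same operation to each band-pass filtered channel.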
Affiliation(s)
- Saransh Jain
- Department of Audiology, JSS Institute of Speech and Hearing, JSS Research Foundation, Mysuru, India

27
Nie K, Hannaford S, Director HM, Nishigaki MA, Drennan WR, Rubinstein JT. Mandarin tone recognition in English speakers with normal hearing and with cochlear implants. Int J Audiol 2019; 58:913-922. [DOI: 10.1080/14992027.2019.1632498] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 10/26/2022]
Affiliation(s)
- Kaibao Nie
- University of Washington-Seattle, Seattle, WA, USA
- University of Washington-Bothell, Bothell, WA, USA
28
Walker BA, Gerhards CM, Werner LA, Horn DL. Amplitude modulation detection and temporal modulation cutoff frequency in normal hearing infants. J Acoust Soc Am 2019; 145:3667. [PMID: 31255105 PMCID: PMC7112713 DOI: 10.1121/1.5111757] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.2] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Received: 10/29/2018] [Revised: 05/22/2019] [Accepted: 05/28/2019] [Indexed: 05/30/2023]
Abstract
The goal of this study was to determine if temporal modulation cutoff frequency was mature in three-month-old infants. Normal-hearing infants and young adults were tested in a single-interval forced-choice observer-based psychoacoustic procedure. Two parameters of the temporal modulation transfer function (TMTF) were estimated to separate temporal resolution from amplitude modulation sensitivity. The modulation detection threshold (MDT) of a broadband noise amplitude modulated at 10 Hz estimated the y-intercept of the TMTF. The cutoff frequency of the TMTF, measured at a modulation depth 4 dB greater than the MDT, provided an estimate of temporal resolution. MDT was obtained in 27 of 33 infants, while both MDT and cutoff frequency were obtained in 15 infants and in 16 of 16 adults. Mean MDT was approximately 10 dB poorer in infants compared to adults. In contrast, mean temporal modulation cutoff frequency did not differ significantly between age groups. These results suggest that temporal resolution is mature, on average, by three months of age in normal hearing children despite immature sensitivity to amplitude modulation. The temporal modulation cutoff frequency approach used here may be a feasible way to examine development of temporal resolution in young listeners with markedly immature sensitivity to amplitude modulation.
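A stimulus of the kind used for the MDT estimate (broadband noise sinusoidally amplitude modulated at 10 Hz, with depth expressed in dB as 20·log10 m) might be generated like this; duration, sampling rate, and the example depth are illustrative assumptions:

```python
import numpy as np

def am_noise(dur_s, fs, fm_hz, depth_db, rng):
    """Broadband noise carrier, sinusoidally amplitude modulated at fm_hz.
    depth_db = 20*log10(m): 0 dB is full modulation (m = 1) and more
    negative values give shallower modulation."""
    m = 10 ** (depth_db / 20)
    t = np.arange(int(dur_s * fs)) / fs
    carrier = rng.standard_normal(t.size)
    return (1 + m * np.sin(2 * np.pi * fm_hz * t)) * carrier

rng = np.random.default_rng(1)
x = am_noise(1.0, 16000, fm_hz=10.0, depth_db=-6.0, rng=rng)  # m ≈ 0.5
```

A depth "4 dB greater than the MDT", as in the cutoff-frequency measurement, corresponds to adding 4 to `depth_db`.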
Affiliation(s)
- Brian A Walker
- University of Washington School of Medicine, Seattle, Washington 98195, USA
- Caitlin M Gerhards
- Department of Speech and Hearing Sciences, University of Washington, Seattle, Washington 98195, USA
- Lynne A Werner
- Department of Speech and Hearing Sciences, University of Washington, Seattle, Washington 98195, USA
- David L Horn
- Department of Otolaryngology-Head and Neck Surgery, Virginia Merrill Bloedel Hearing Research Center, University of Washington, Box 357923, Seattle, Washington 98195, USA

29
Wang HLS, Wang NYH, Chen IC, Tsao Y. Auditory identification of frequency-modulated sweeps and reading difficulties in Chinese. Res Dev Disabil 2019; 86:53-61. [PMID: 30660853 DOI: 10.1016/j.ridd.2019.01.006] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.8] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 09/24/2017] [Revised: 12/31/2018] [Accepted: 01/12/2019] [Indexed: 06/09/2023]
Abstract
BACKGROUND In Mandarin Chinese, lexical tones play an important role in providing contrasts in word meaning. They are pitch patterns expressed by frequency-modulated (FM) signals. Yet few studies have examined the relationship between low-level auditory processing of frequency signals and Chinese reading skills. AIMS The study aims to identify the role of auditory frequency processing in Chinese lexical tone awareness, as well as in character recognition, in Chinese-speaking children. METHODS Children with (N = 28) and without (N = 27) developmental dyslexia (DD) were recruited. All participants completed two linguistic tasks, Chinese character recognition and lexical tone awareness, and two auditory frequency processing tasks, frequency discrimination and FM sweep direction identification. RESULTS The results revealed that Chinese-speaking children with DD performed significantly more poorly on all tasks. In particular, Chinese character recognition was significantly related to FM sweep identification, and lexical tone awareness was significantly associated with both auditory frequency processing tasks. Regression analyses suggested that the influence of FM sweep identification on Chinese character recognition was mediated by lexical tone awareness. CONCLUSIONS AND IMPLICATION This study suggests that poor auditory frequency processing may be associated with Chinese developmental dyslexia involving phonological deficits. In support of the phonological deficit hypothesis, the phonological deficit likely has an auditory basis. A potential clinical implication is to reinforce auditory perception and sensitivity through intervention for phonological processing.
Affiliation(s)
- Natalie Yu-Hsien Wang
- Research Center for Information Technology Innovation (CITI), Academia Sinica, Taipei, Taiwan.
- I-Chen Chen
- Department of Special Education, University of Taipei, Taipei, Taiwan.
- Yu Tsao
- Research Center for Information Technology Innovation (CITI), Academia Sinica, Taipei, Taiwan.

30
Peng F, McKay CM, Mao D, Hou W, Innes-Brown H. Auditory Brainstem Representation of the Voice Pitch Contours in the Resolved and Unresolved Components of Mandarin Tones. Front Neurosci 2018; 12:820. [PMID: 30505262 PMCID: PMC6250765 DOI: 10.3389/fnins.2018.00820] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.7] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/17/2018] [Accepted: 10/22/2018] [Indexed: 11/24/2022] Open
Abstract
Accurate perception of voice pitch plays a vital role in speech understanding, especially for tonal languages such as Mandarin. Lexical tones are primarily distinguished by the fundamental frequency (F0) contour of the acoustic waveform. It has been shown that the auditory system can extract the F0 from both resolved and unresolved harmonics, and that tone identification is better for resolved than for unresolved harmonics. To evaluate the neural response to the resolved and unresolved components of Mandarin tones in quiet and in speech-shaped noise, we recorded the frequency-following response (FFR). Four types of stimuli were used: speech with either only resolved or only unresolved harmonics, presented in quiet and in speech-shaped noise. FFRs were recorded to alternating-polarity stimuli and were added or subtracted to enhance the neural response to the envelope (FFRENV) or fine structure (FFRTFS), respectively. The neural representation of F0 strength reflected by the FFRENV was evaluated by the peak autocorrelation value in the temporal domain and the peak phase-locking value (PLV) at F0 in the spectral domain. Both evaluation methods showed that FFRENV F0 strength in quiet was significantly stronger than in noise for speech including unresolved harmonics, but not for speech including resolved harmonics. The neural representation of the temporal fine structure reflected by the FFRTFS was assessed by the PLV at the harmonic nearest F1 (the 4th harmonic of F0); this PLV was significantly larger for resolved than for unresolved harmonics. Spearman's correlation showed that FFRENV F0 strength for unresolved harmonics was correlated with tone identification performance in noise (0 dB SNR). These results show that FFRENV F0 strength for speech sounds with resolved harmonics was not affected by noise, whereas the response to speech sounds with unresolved harmonics was significantly smaller in noise than in quiet. Our results suggest that coding of resolved harmonics is more important than envelope coding for tone identification in noise.
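The add/subtract logic for alternating-polarity FFRs and a spectral phase-locking value can be sketched on synthetic data. This is a schematic illustration only: the idealized inverted response and all parameters are assumptions, not the paper's recordings or analysis settings:

```python
import numpy as np

def env_tfs(resp_pos, resp_neg):
    """Adding responses to opposite-polarity stimuli emphasizes the envelope
    (FFR_ENV); subtracting emphasizes the fine structure (FFR_TFS)."""
    return (resp_pos + resp_neg) / 2, (resp_pos - resp_neg) / 2

def plv(trials, fs, freq):
    """Phase-locking value at `freq`: length of the mean unit phasor of the
    FFT phase at the nearest bin, taken across single-trial responses."""
    n = trials.shape[1]
    k = int(round(freq * n / fs))  # FFT bin nearest `freq`
    phases = np.angle(np.fft.rfft(trials, axis=1)[:, k])
    return np.abs(np.mean(np.exp(1j * phases)))

fs, n = 2000, 2000
t = np.arange(n) / fs
rng = np.random.default_rng(2)
# 50 synthetic trials phase-locked to a 100 Hz component, plus noise
trials = np.sin(2 * np.pi * 100 * t) + 0.3 * rng.standard_normal((50, n))
plv_locked = plv(trials, fs, 100.0)                      # near 1
plv_rand = plv(rng.standard_normal((50, n)), fs, 100.0)  # near 0

# Idealized polarity demo: a purely fine-structure response flips sign with
# stimulus polarity, so FFR_ENV cancels while FFR_TFS keeps the waveform
resp_pos = trials.mean(axis=0)
resp_neg = -resp_pos
ffr_env, ffr_tfs = env_tfs(resp_pos, resp_neg)
```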
Affiliation(s)
- Fei Peng
- Key Laboratory of Biorheological Science and Technology, Chongqing University, Ministry of Education, Chongqing, China; The Bionics Institute of Australia, East Melbourne, VIC, Australia; Medical Bionics Department, University of Melbourne, Melbourne, VIC, Australia; Collaborative Innovation Center for Brain Science, Chongqing University, Chongqing, China
- Colette M McKay
- The Bionics Institute of Australia, East Melbourne, VIC, Australia; Medical Bionics Department, University of Melbourne, Melbourne, VIC, Australia
- Darren Mao
- The Bionics Institute of Australia, East Melbourne, VIC, Australia; Department of Biomedical Engineering, University of Melbourne, Melbourne, VIC, Australia
- Wensheng Hou
- Key Laboratory of Biorheological Science and Technology, Chongqing University, Ministry of Education, Chongqing, China; Collaborative Innovation Center for Brain Science, Chongqing University, Chongqing, China; Chongqing Engineering Research Center of Medical Electronics Technology, Chongqing University, Chongqing, China
- Hamish Innes-Brown
- The Bionics Institute of Australia, East Melbourne, VIC, Australia; Medical Bionics Department, University of Melbourne, Melbourne, VIC, Australia

31
Wong P, Cheng ST, Chen F. Cantonese Tone Identification in Three Temporal Cues in Quiet, Speech-Shaped Noise and Two-Talker Babble. Front Psychol 2018; 9:1604. [PMID: 30356874 PMCID: PMC6190861 DOI: 10.3389/fpsyg.2018.01604] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/08/2018] [Accepted: 08/13/2018] [Indexed: 11/13/2022] Open
Abstract
Purpose: Cochlear implant processors deliver mostly temporal envelope information and limited fundamental frequency (F0) information to users, which makes pitch and lexical tone perception challenging for cochlear implantees. Different factors have been found to affect Mandarin tone perception in temporal cues, but the most effective temporal cues for lexical tone identification across different backgrounds remain unclear, because no study has comprehensively examined the effects and interactions of these factors, particularly in languages that use both pitch heights and pitch shapes to differentiate lexical meanings. The present study compared identification of Cantonese tones in naturally produced stimuli and in three temporal cues, namely the amplitude contour cue (TE50), the periodicity cue (TE500), and the temporal fine structure cue (TFS), in three different numbers of frequency bands (B04, B08, B16), in quiet and in two types of noise (two-talker male babble and speech-shaped noise). Method: Naturally produced Cantonese tones and synthetic tones that combined different acoustic cues and different numbers of frequency bands were presented to 18 young native Cantonese speakers for tone identification in quiet and noise. Results: Among the three temporal cues, TFS was the most effective for Cantonese tone identification in quiet and noise, except for T4 (LF) identification. Its effect was even stronger when the tones were presented in 4 or 8 bands rather than 16 bands. Neither TE500 nor TE50 was effective for Cantonese tone identification in quiet or noise. In noise, most tones in TE500 and TE50 were misheard as T4 (LF), demonstrating errors in both tone shapes and tone heights. Noise type had a limited effect on tone identification. Conclusions: Findings on Mandarin tone perception in temporal cues may not be applicable to other tone languages with more complex tonal systems.
TFS presented in four bands was the most effective temporal cue for Cantonese tone identification in quiet and noise. Temporal envelope cues were not effective for tone, tone shape, or tone height identification in Cantonese. These findings have implications for the future design of cochlear implants for tone-language speakers who use pitch heights or a combination of pitch heights and pitch shapes to differentiate meanings.
Affiliation(s)
- Puisan Wong
- Division of Speech and Hearing Sciences, Faculty of Education, University of Hong Kong, Hong Kong, Hong Kong
- Sheung Ting Cheng
- Division of Speech and Hearing Sciences, Faculty of Education, University of Hong Kong, Hong Kong, Hong Kong
- Fei Chen
- Department of Electrical and Electronic Engineering, Southern University of Science and Technology, Shenzhen, China

32
Frequency specificity of amplitude envelope patterns in noise-vocoded speech. Hear Res 2018; 367:169-181. [DOI: 10.1016/j.heares.2018.06.005] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.8] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Received: 12/12/2017] [Revised: 06/03/2018] [Accepted: 06/08/2018] [Indexed: 11/22/2022]
33
Abstract
Hearing loss (HL) is a common sensory impairment in humans, with significant economic and social impacts. Home to nearly 20% of the world's population, China has focused on economic development and health awareness to improve care for its hearing-impaired population. Recently, the Chinese government has initiated national programs, such as those of the China Disabled Persons' Federation, to fund prevention, treatment, and rehabilitation of hearing impairment. Newborn hearing screening and auditory rehabilitation programs in China have expanded exponentially with government support. While facing many challenges and overcoming obstacles, cochlear implantation (CI) programs in China have also experienced considerable growth. This review discusses the implementation of CI programs for HL in China and presents current HL data, including epidemiology, newborn hearing screening, and determination of genetic etiologies. Sharing the experience of Chinese auditory rehabilitation and CI programs will shed light on the developmental pathway of healthcare infrastructure to meet the emerging needs of the hearing-impaired population in other developing countries.
34
Abstract
OBJECTIVES Adults can use slow temporal envelope cues, or amplitude modulation (AM), to identify speech sounds in quiet. Faster AM cues and the temporal fine structure, or frequency modulation (FM), play a more important role in noise. This study assessed whether fast and slow temporal modulation cues play a similar role in infants' speech perception by comparing the ability of normal-hearing 3-month-olds and adults to use slow temporal envelope cues in discriminating consonant contrasts. DESIGN English consonant-vowel syllables differing in voicing or place of articulation were processed by two tone-excited vocoders to replace the original FM cues with pure tones in 32 frequency bands. AM cues were extracted in each frequency band with two different cutoff frequencies, 256 or 8 Hz. Discrimination was assessed for infants and adults using an observer-based testing method, in quiet or in speech-shaped noise. RESULTS For infants, the effect of eliminating fast AM cues was the same in quiet and in noise: a high proportion of infants discriminated when both fast and slow AM cues were available, but fewer than half of the infants also discriminated when only slow AM cues were preserved. For adults, the effect of eliminating fast AM cues was greater in noise than in quiet: all adults discriminated in quiet whether or not fast AM cues were available, but in noise eliminating fast AM cues reduced the percentage of adults reaching criterion from 71% to 21%. CONCLUSIONS In quiet, infants seem to depend on fast AM cues more than adults do. In noise, adults seem to depend on FM cues to a greater extent than infants do. However, infants and adults are similarly affected by the loss of fast AM cues in noise. Experience with the native language seems to change the relative importance of different acoustic cues for speech perception.
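The tone-excited vocoder described in the design can be sketched as below. This is a simplified illustration, not the study's processing chain: it uses 8 bands rather than 32, assumed band edges, and low-order Butterworth filters; only the idea (a per-band envelope re-imposed on a pure tone at the band centre) is taken from the abstract:

```python
import numpy as np
from scipy.signal import butter, hilbert, sosfiltfilt

def tone_vocode(x, fs, edges, env_cutoff_hz):
    """Band-pass analysis, Hilbert-envelope extraction low-passed at
    env_cutoff_hz, then each band's envelope modulates a pure tone at the
    band's geometric centre frequency; the modulated tones are summed."""
    sos_lp = butter(2, env_cutoff_hz / (fs / 2), btype="low", output="sos")
    t = np.arange(x.size) / fs
    out = np.zeros_like(x)
    for lo, hi in zip(edges[:-1], edges[1:]):
        sos_bp = butter(2, [lo / (fs / 2), hi / (fs / 2)],
                        btype="band", output="sos")
        band = sosfiltfilt(sos_bp, x)
        env = np.clip(sosfiltfilt(sos_lp, np.abs(hilbert(band))), 0, None)
        fc = np.sqrt(lo * hi)  # geometric centre frequency of the band
        out += env * np.sin(2 * np.pi * fc * t)
    return out

fs = 16000
t = np.arange(fs) / fs
x = np.sin(2 * np.pi * 440 * t)      # placeholder for a syllable recording
edges = np.geomspace(80, 7000, 9)    # 8 assumed bands (the study used 32)
y = tone_vocode(x, fs, edges, env_cutoff_hz=8.0)  # "slow AM" condition
```

Raising `env_cutoff_hz` to 256 Hz gives the fast-AM counterpart of the same condition.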
35
Huang J, Chang J, Zeng FG. Electro-tactile stimulation (ETS) enhances cochlear-implant Mandarin tone recognition. World J Otorhinolaryngol Head Neck Surg 2018; 3:219-223. [PMID: 29780966 PMCID: PMC5956137 DOI: 10.1016/j.wjorl.2017.12.002] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/17/2017] [Accepted: 10/24/2017] [Indexed: 11/29/2022] Open
Abstract
Objective Electro-acoustic stimulation (EAS) is an effective method to enhance cochlear-implant performance in individuals who have residual low-frequency acoustic hearing. To help the majority of cochlear implant users who do not have any functional residual acoustic hearing, electro-tactile stimulation (ETS) may be used, because tactile sensation has a frequency range and perceptual capabilities similar to those produced by acoustic stimulation in EAS users. Methods Following up the first ETS study, which showed enhanced English sentence recognition in noise, the present study evaluated the effect of ETS on Mandarin tone recognition in noise in two groups of adult Mandarin-speaking individuals. The first group included 11 normal-hearing individuals who listened to a 4-channel, noise-vocoded, cochlear-implant simulation. The second group included 1 unilateral cochlear-implant user and 2 bilateral users, with each of their devices tested independently. Both groups participated in a 4-alternative forced-choice task, in which they had to identify a tone presented in noise at a 0-dB signal-to-noise ratio via electric stimulation (actual or simulated cochlear implants), tactile stimulation, or the combined ETS. Results While electric or tactile stimulation alone produced similar tone recognition (∼40% correct), ETS enhanced cochlear-implant tone recognition by 17–18 percentage points. The size of the present ETS enhancement effect was similar to that of the previously reported EAS effect on Mandarin tone recognition. Psychophysical analysis of tactile sensation showed an important role of frequency discrimination in the ETS enhancement. Conclusion Tactile stimulation can potentially enhance Mandarin tone recognition in cochlear-implant users who do not have usable residual acoustic hearing. To optimize this potential, high fundamental frequencies need to be transposed to a 100–200 Hz range.
Affiliation(s)
- Juan Huang
- Mind and Brain Institute, Department of Biomedical Engineering, Johns Hopkins University, Baltimore, MD, 21218, USA
- Janice Chang
- Department of Otorhinolaryngology Head and Neck Surgery, University of California, Los Angeles, CA, 90095, USA
- Fan-Gang Zeng
- Department of Anatomy and Neurobiology, Center for Hearing Research, 110 Medical Science E, University of California, Irvine, CA, 92697-5320, USA; Biomedical Engineering, Center for Hearing Research, 110 Medical Science E, University of California, Irvine, CA, 92697-5320, USA; Cognitive Sciences, Center for Hearing Research, 110 Medical Science E, University of California, Irvine, CA, 92697-5320, USA; Otorhinolaryngology Head and Neck Surgery, Center for Hearing Research, 110 Medical Science E, University of California, Irvine, CA, 92697-5320, USA

36
Liu H, Peng X, Zhao Y, Ni X. The effectiveness of sound-processing strategies on tonal language cochlear implant users: A systematic review. Pediatr Investig 2017; 1:32-39. [PMID: 32851216 PMCID: PMC7331426 DOI: 10.1002/ped4.12011] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/27/2017] [Accepted: 10/12/2017] [Indexed: 11/23/2022] Open
Abstract
IMPORTANCE Contemporary cochlear implants (CIs) are well established as a technology for people with severe-to-profound sensorineural hearing loss, and their effectiveness has been widely reported. However, for tonal-language CI recipients, speech perception remains a challenge: conventional signal-processing strategies may provide insufficient information to encode tonal cues, and CI recipients have exhibited considerable deficits in tone perception. Thus, some tonal language-oriented sound-processing strategies have been introduced. The effects of available tonal language-oriented strategies on tone perception are reviewed and evaluated in this study. The results may aid in designing and improving tonal language-appropriate sound-processing strategies for CI recipients. OBJECTIVE The objective of this systematic review was to investigate the effects of tonal language-oriented signal-processing strategies on tone perception, music perception, and word and sentence recognition. METHODS To evaluate the effects of tonal language-oriented strategies on tone perception, we conducted a systematic review. We searched for relevant reports dated from January 1979 to July 2017 using PubMed, Cochrane Library, EBSCO, Web of Science, EMBASE, and 4 Chinese periodical databases (CBMdisc, CNKI, VIP, and Wanfang Data). RESULTS According to our search strategy, 672 potentially eligible studies were retrieved from the databases, with 12 of these studies included in the final review after a 4-stage selection process. The majority of sound-processing strategies designed for tonal languages were HiResolution® with Fidelity 120 (HiRes 120), fine structure processing, temporal fine structure (TFS), and C-tone. Generally, acute or short-term comparisons between the tonal language-oriented strategies and the conventional strategy did not reveal statistically significant differences in speech perception, or showed only a small improvement.
However, a tendency toward improved tone perception and subjectively reported overall preferred sound quality was observed with the tonal language-oriented strategies. INTERPRETATION Conventional signal-processing strategies typically provide very limited F0 information via the temporal envelopes delivered to the stimulating electrodes. In contrast, tonal language-oriented coding strategies attempt to present more of the spectral information and TFS cues required for tone perception. Thus, a tendency toward improved tonal-language perception in CI users was shown.
Affiliation(s)
- Haihong Liu
- Beijing Key Laboratory for Pediatric Diseases of Otorhinolaryngology, Head and Neck Surgery; Ministry of Education (MOE) Key Laboratory of Major Diseases in Children; Beijing Pediatric Research Institute; Beijing Children's Hospital, Capital Medical University; National Center for Children's Health, Beijing, China
- Department of Otorhinolaryngology, Head and Neck Surgery, Beijing Children's Hospital, Capital Medical University, National Center for Children's Health, Beijing, China
- Xiaoxia Peng
- Center for Clinical Epidemiology and Evidence-Based Medicine, Beijing Children's Hospital, Capital Medical University, National Center for Children's Health, Beijing, China
- Yawen Zhao
- Beijing Key Laboratory for Pediatric Diseases of Otorhinolaryngology, Head and Neck Surgery; Ministry of Education (MOE) Key Laboratory of Major Diseases in Children; Beijing Pediatric Research Institute; Beijing Children's Hospital, Capital Medical University; National Center for Children's Health, Beijing, China
- Xin Ni
- Beijing Key Laboratory for Pediatric Diseases of Otorhinolaryngology, Head and Neck Surgery; Ministry of Education (MOE) Key Laboratory of Major Diseases in Children; Beijing Pediatric Research Institute; Beijing Children's Hospital, Capital Medical University; National Center for Children's Health, Beijing, China
- Department of Otorhinolaryngology, Head and Neck Surgery, Beijing Children's Hospital, Capital Medical University, National Center for Children's Health, Beijing, China

37
Cognitive basis of individual differences in speech perception, production and representations: The role of domain general attentional switching. Atten Percept Psychophys 2017; 79:945-963. [PMID: 28144832 DOI: 10.3758/s13414-017-1283-z] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/08/2022]
Abstract
This study investigated whether individual differences in cognitive functions, attentional abilities in particular, were associated with individual differences in the quality of phonological representations, resulting in variability in speech perception and production. To do so, we took advantage of a tone-merging phenomenon in Cantonese and identified three groups of typically developed speakers who could differentiate the two rising tones (high and low rising) in both perception and production [+Per+Pro], only in perception [+Per-Pro], or in neither modality [-Per-Pro]. Perception and production were reflected, respectively, by discrimination sensitivity d' and by acoustic measures of pitch offset and rise time differences. Components of the event-related potential (ERP), namely the mismatch negativity (MMN) and the ERPs to amplitude rise time, were taken to reflect the representations of the acoustic cues of tones. Components of attention and working memory in the auditory and visual modalities were assessed with published test batteries. The results show that individual differences in both perception and production are linked to how listeners encode and represent the acoustic cues (pitch contour and rise time), as reflected by ERPs. The present study advances previous work by integrating measures of perception, production, attention, and quality of representation to offer a comprehensive account of the cognitive factors underlying individual differences in speech processing. In particular, it is proposed that domain-general attentional switching affects the quality of perceptual representations of the acoustic cues, giving rise to individual differences in perception and production.
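The discrimination sensitivity d' used above is the standard signal-detection measure, z(hit rate) - z(false-alarm rate). A stdlib sketch follows; the 1/(2N) correction and the example counts are illustrative conventions, not values from the study:

```python
from statistics import NormalDist

def d_prime(hits, misses, false_alarms, correct_rejections):
    """d' = z(hit rate) - z(false-alarm rate), with rates of 0 or 1 nudged
    by 1/(2N) so the inverse normal CDF stays finite."""
    z = NormalDist().inv_cdf
    n_sig = hits + misses
    n_noise = false_alarms + correct_rejections
    h = min(max(hits / n_sig, 1 / (2 * n_sig)), 1 - 1 / (2 * n_sig))
    f = min(max(false_alarms / n_noise, 1 / (2 * n_noise)),
            1 - 1 / (2 * n_noise))
    return z(h) - z(f)

# Hypothetical listener: 45/50 hits and 10/50 false alarms
d = d_prime(45, 5, 10, 40)
```

Equal hit and false-alarm rates give d' = 0 (chance performance), and larger d' indicates better tone discrimination.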
38
Cai T, McPherson B, Li C, Yang F. Tone perception in Mandarin-speaking school age children with otitis media with effusion. PLoS One 2017; 12:e0183394. [PMID: 28829840 PMCID: PMC5568745 DOI: 10.1371/journal.pone.0183394] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/07/2017] [Accepted: 08/03/2017] [Indexed: 11/19/2022] Open
Abstract
OBJECTIVES The present study explored tone perception ability in school age Mandarin-speaking children with otitis media with effusion (OME) in noisy listening environments. The study investigated the interaction effects of noise, tone type, age, and hearing status on monaural tone perception, and assessed the application of a hierarchical clustering algorithm for profiling hearing impairment in children with OME. METHODS Forty-one children with normal hearing and normal middle ear status and 84 children with OME with or without hearing loss participated in this study. The children with OME were further divided into two subgroups based on their severity and pattern of hearing loss using a hierarchical clustering algorithm. Monaural tone recognition was measured using a picture-identification test format incorporating six sets of monosyllabic words conveying four lexical tones under speech spectrum noise, with the signal-to-noise ratio (SNR) conditions ranging from -9 to -21 dB. RESULTS Linear correlation indicated tone recognition thresholds of children with OME were significantly correlated with age and pure tone hearing thresholds at every frequency tested. Children with hearing thresholds less affected by OME performed similarly to their peers with normal hearing. Tone recognition thresholds of children with auditory status more affected by OME were significantly inferior to those of children with normal hearing or with minor hearing loss. Younger children demonstrated poorer tone recognition performance than older children with OME. A mixed design repeated-measure ANCOVA showed significant main effects of listening condition, hearing status, and tone type on tone recognition. Contrast comparisons revealed that tone recognition scores were significantly better under -12 dB SNR than under -15 dB SNR conditions and tone recognition scores were significantly worse under -18 dB SNR than those obtained under -15 dB SNR conditions. 
Tone 1 was the easiest tone to identify and Tone 3 was the most difficult tone to identify for all participants, when considering -12, -15, and -18 dB SNR as within-subject variables. The interaction effect between hearing status and tone type indicated that children with greater levels of OME-related hearing loss had more impaired tone perception of Tone 1 and Tone 2 compared to their peers with lesser levels of OME-related hearing loss. However, tone perception of Tone 3 and Tone 4 remained similar among all three groups. Tone 2 and Tone 3 were the most perceptually difficult tones for children with or without OME-related hearing loss in all listening conditions. CONCLUSIONS The hierarchical clustering algorithm demonstrated usefulness in risk stratification for tone perception deficiency in children with OME-related hearing loss. There was marked impairment in tone perception in noise for children with greater levels of OME-related hearing loss. Monaural lexical tone perception in younger children was more vulnerable to noise and OME-related hearing loss than that in older children.
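The subgrouping step in the study above relies on agglomerative hierarchical clustering of audiometric profiles. A minimal sketch of that idea, using made-up threshold data and assumed parameters (Ward linkage, two clusters), not the study's actual variables or cluster count:

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster

# Hypothetical pure-tone thresholds (dB HL) at four frequencies for 12 ears;
# the real study clustered children by severity and pattern of OME-related loss.
rng = np.random.default_rng(0)
near_normal = rng.normal(10, 3, size=(6, 4))   # thresholds little affected by OME
elevated = rng.normal(35, 3, size=(6, 4))      # thresholds elevated by OME
audiograms = np.vstack([near_normal, elevated])

# Agglomerative clustering with Ward linkage, cut into two severity profiles
Z = linkage(audiograms, method="ward")
labels = fcluster(Z, t=2, criterion="maxclust")
```

With well-separated threshold profiles like these, the two severity groups fall cleanly into distinct clusters, which is what makes the approach usable for risk stratification.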
Affiliation(s)
- Ting Cai
- Division of Speech and Hearing Sciences, Faculty of Education, The University of Hong Kong, Hong Kong, China
- Bradley McPherson
- Division of Speech and Hearing Sciences, Faculty of Education, The University of Hong Kong, Hong Kong, China
- Caiwei Li
- Department of Otorhinolaryngology, Shenzhen Children's Hospital, Shenzhen, China
- Feng Yang
- Department of Speech Therapy, Shenzhen Children's Hospital, Shenzhen, China
39
Masking release with changing fundamental frequency: Electric acoustic stimulation resembles normal hearing subjects. Hear Res 2017; 350:226-234. [DOI: 10.1016/j.heares.2017.05.004]
40
Qi B, Mao Y, Liu J, Liu B, Xu L. Relative contributions of acoustic temporal fine structure and envelope cues for lexical tone perception in noise. J Acoust Soc Am 2017; 141:3022. [PMID: 28599529] [PMCID: PMC5415402] [DOI: 10.1121/1.4982247]
Abstract
Previous studies have shown that lexical tone perception in quiet relies on the acoustic temporal fine structure (TFS) but not on the envelope (E) cues. The contributions of TFS to speech recognition in noise are under debate. In the present study, Mandarin tone tokens were mixed with speech-shaped noise (SSN) or two-talker babble (TTB) at five signal-to-noise ratios (SNRs; -18 to +6 dB). The TFS and E were then extracted from each of the 30 bands using the Hilbert transform. Twenty-five combinations of TFS and E from the sound mixtures of the same tone tokens at various SNRs were created. Twenty normal-hearing, native-Mandarin-speaking listeners participated in the tone-recognition test. Results showed that tone-recognition performance improved as the SNRs in either TFS or E increased. The masking effects on tone perception for the TTB were weaker than those for the SSN. For both types of masker, the perceptual weights of TFS and E in tone perception in noise were nearly equivalent, with E playing a slightly greater role than TFS. Thus, the relative contributions of TFS and E cues to lexical tone perception in noise or in competing-talker maskers differ from those in quiet and from those to speech perception of non-tonal languages.
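The envelope/fine-structure decomposition used in studies like this one is conventionally obtained from the Hilbert analytic signal. A minimal sketch for a single frequency band (a full 30-band analysis would first filter the signal into bands):

```python
import numpy as np
from scipy.signal import hilbert

def envelope_and_tfs(band_signal):
    """Split a band-limited signal into temporal envelope (E) and
    temporal fine structure (TFS) via the Hilbert analytic signal."""
    analytic = hilbert(band_signal)
    envelope = np.abs(analytic)        # E: slowly varying amplitude contour
    tfs = np.cos(np.angle(analytic))   # TFS: unit-amplitude carrier
    return envelope, tfs

# Example: a 150 Hz carrier amplitude-modulated at 4 Hz
fs = 16000
t = np.arange(0, 0.5, 1 / fs)
modulator = 1.0 + 0.8 * np.sin(2 * np.pi * 4 * t)
env, tfs = envelope_and_tfs(modulator * np.cos(2 * np.pi * 150 * t))
```

Recombining the E of one sound with the TFS of another (as in "auditory chimera" paradigms) is how such studies probe the relative weight of the two cues.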
Affiliation(s)
- Beier Qi
- Department of Otolaryngology-Head and Neck Surgery, Beijing Tongren Hospital, Capital Medical University, Beijing, China
- Yitao Mao
- Department of Radiology, Xiangya Hospital, Central South University, Changsha, Hunan, China
- Jiaxing Liu
- Department of Otolaryngology-Head and Neck Surgery, Beijing Tongren Hospital, Capital Medical University, Beijing, China
- Bo Liu
- Department of Otolaryngology-Head and Neck Surgery, Beijing Tongren Hospital, Capital Medical University, Beijing, China
- Li Xu
- Communication Sciences and Disorders, Ohio University, Athens, Ohio 45701, USA
41
The Relative Weight of Temporal Envelope Cues in Different Frequency Regions for Mandarin Sentence Recognition. Neural Plast 2017; 2017:7416727. [PMID: 28203463] [PMCID: PMC5288535] [DOI: 10.1155/2017/7416727]
Abstract
Acoustic temporal envelope (E) cues containing speech information are distributed across the frequency spectrum. To investigate the relative weight of E cues in different frequency regions for Mandarin sentence recognition, E information was extracted from 30 contiguous bands across the range of 80–7,562 Hz using Hilbert decomposition and then allocated to five frequency regions. Recognition scores were obtained from 40 normal-hearing listeners presented with acoustic E cues from one or two randomly selected regions. While the recognition scores ranged from 8.2% to 16.3% when E information from only one region was available, the scores ranged from 57.9% to 87.7% when E information from two frequency regions was presented, suggesting a synergistic effect among the temporal E cues in different frequency regions. Next, the relative contributions of the E information from the five frequency regions to sentence perception were computed using a least-squares approach. The results demonstrated that, for Mandarin Chinese, a tonal language, the temporal E cues of Frequency Region 1 (80–502 Hz) and Region 3 (1,022–1,913 Hz) contributed more to the intelligibility of sentence recognition than the other regions, particularly the region of 80–502 Hz, which contained fundamental frequency (F0) information.
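The least-squares weighting step described above can be sketched as a linear regression of scores on region-presence indicators. Illustrative only: the condition matrix matches the one-or-two-region design, but the scores and weights below are invented, not the study's data:

```python
import numpy as np
from itertools import combinations

n_regions = 5
# One row per condition: which regions carried envelope information
rows = [tuple(int(i in cond) for i in range(n_regions))
        for k in (1, 2) for cond in combinations(range(n_regions), k)]
X = np.array(rows, dtype=float)        # 15 conditions x 5 regions

# Invented scores generated from assumed weights, purely to exercise the fit
w_true = np.array([0.30, 0.15, 0.25, 0.18, 0.12])
y = X @ w_true

w_fit, *_ = np.linalg.lstsq(X, y, rcond=None)
relative = w_fit / w_fit.sum()         # normalized relative contributions
```

With 15 conditions and 5 unknowns the system is overdetermined, so the least-squares fit pools information across all single- and two-region conditions.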
42
Abstract
OBJECTIVE To determine whether exaggerating the variations in Mandarin-based fundamental frequency (F0) contours could improve tone identification by cochlear implant (CI) users. METHODS Twelve normal-hearing (NH) listeners and 11 CI users were tested for their ability to recognize F0 contours modeled after Mandarin tones, in 4- or 5-alternative forced-choice paradigms. Two types of stimuli were used: computer-generated complex tones and voice recordings. Four contours were tested with voice recordings: flat, rise, fall, and dip. A fifth contour, peak, was added for complex tones. The F0 range of each contour was varied in an adaptive manner. A maximum-likelihood technique was used to fit a psychometric function to the performance data and extract the threshold at 70% accuracy. RESULTS As F0 range increased, performance in tone identification improved but did not reach 100% for some CI users, suggesting that confusions between contours could occur even with extremely exaggerated contours. Compared with NH participants, CI users required substantially larger F0 ranges to identify tones, on the order of 9.3 versus 0.4 semitones. CI users achieved better performance for complex tones than for voice recordings, whereas the reverse was true for NH participants. Confusion matrices showed that the "flat" tone was often a default option when the F0 range of the presented tone contour was too narrow for participants to respond correctly. CONCLUSION These results demonstrate a markedly impaired ability of CI users to identify tonal contours, but suggest that the use of exaggerated pitch contours may be helpful for tonal language perception.
43
Preliminary Measurements of Binaural Masking-Level Difference When Using Hearing Aids for Sensorineural Hearing Loss. J Med Biol Eng 2016. [DOI: 10.1007/s40846-016-0174-4]
44
Tan J, Dowell R, Vogel A. Mandarin Lexical Tone Acquisition in Cochlear Implant Users With Prelingual Deafness: A Review. Am J Audiol 2016; 25:246-56. [PMID: 27387047] [DOI: 10.1044/2016_aja-15-0069]
Abstract
PURPOSE The purpose of this review article is to synthesize evidence from the fields of developmental linguistics and cochlear implant technology relevant to the production and perception of Mandarin lexical tone in cochlear implant users with prelingual deafness. The aim of this review was to identify potential factors that determine outcomes for tonal-language speaking cochlear implant users and possible directions for further research. METHOD A computerized database search of MEDLINE, CINAHL, Academic Search Premier, Web of Science, and Google Scholar was undertaken in June and July 2014. Search terms used were lexical tone AND tonal language, speech development AND/OR speech production AND/OR speech perception AND cochlear implants, and pitch perception AND cochlear implants, anywhere in the title or abstract. CONCLUSION Despite the demonstrated limitations of pitch perception in cochlear implant users, there is some evidence that typical production and perception of lexical tone are possible for cochlear implant users with prelingual deafness. Further studies are required to determine the factors that contribute to better outcomes, to inform rehabilitation processes for cochlear implant users in tonal-language environments.
Affiliation(s)
- Johanna Tan
- The University of Melbourne, Victoria, Australia
- Adam Vogel
- Center for Neuroscience of Speech, The University of Melbourne, Victoria, Australia
- Hertie Institute for Clinical Brain Research, Eberhard Karls Universität Tübingen, Germany
- Murdoch Childrens Research Institute, The Bruce Lefroy Centre for Genetic Health Research, Melbourne, Victoria, Australia
45
Mao Y, Xu L. Lexical tone recognition in noise in normal-hearing children and prelingually deafened children with cochlear implants. Int J Audiol 2016; 56:S23-S30. [PMID: 27564095] [PMCID: PMC5326701] [DOI: 10.1080/14992027.2016.1219073]
Abstract
OBJECTIVE The purpose of the present study was to investigate Mandarin tone recognition in background noise in children with cochlear implants (CIs), and to examine the potential factors contributing to their performance. DESIGN Tone recognition was tested using a two-alternative forced-choice paradigm in various signal-to-noise ratio (SNR) conditions (i.e. quiet, +12, +6, 0, and -6 dB). Linear correlation analysis was performed to examine possible relationships between the tone-recognition performance of the CI children and the demographic factors. STUDY SAMPLE Sixty-six prelingually deafened children with CIs and 52 normal-hearing (NH) children as controls participated in the study. RESULTS Children with CIs showed an overall poorer tone-recognition performance and were more susceptible to noise than their NH peers. Tone confusions between Mandarin tone 2 and tone 3 were most prominent in both CI and NH children except for in the poorest SNR conditions. Age at implantation was significantly correlated with tone-recognition performance of the CI children in noise. CONCLUSIONS There is a marked deficit in tone recognition in prelingually deafened children with CIs, particularly in noise listening conditions. While factors that contribute to the large individual differences are still elusive, early implantation could be beneficial to tone development in pediatric CI users.
Affiliation(s)
- Yitao Mao
- Department of Radiology, Xiangya Hospital, Central South University, Changsha, Hunan, China
- Communication Sciences and Disorders, Ohio University, Athens, Ohio, USA
- Li Xu
- Communication Sciences and Disorders, Ohio University, Athens, Ohio, USA
46
Meng Q, Zheng N, Li X. Loudness Contour Can Influence Mandarin Tone Recognition: Vocoder Simulation and Cochlear Implants. IEEE Trans Neural Syst Rehabil Eng 2016; 25:641-649. [PMID: 27448366] [DOI: 10.1109/tnsre.2016.2593489]
Abstract
Lexical tone recognition with current cochlear implants (CIs) remains unsatisfactory due to significantly degraded pitch-related acoustic cues, which dominate tone recognition by normal-hearing (NH) listeners. Several secondary cues (e.g., amplitude contour, duration, and spectral envelope) that influence tone recognition in NH listeners and CI users have been studied. This work proposes a loudness contour manipulation algorithm, Loudness-Tone (L-Tone), to investigate the effects of loudness contour on Mandarin tone recognition and the effectiveness of using loudness cues to enhance tone recognition for CI users. With L-Tone, the intensity of sound samples is multiplied by gain values determined by instantaneous fundamental frequencies (F0s) and pre-defined gain-F0 mapping functions. Perceptual experiments were conducted with a four-channel noise-band vocoder simulation in NH listeners and with CI users. The results suggested that (1) loudness contour is a useful secondary cue for Mandarin tone recognition, especially when pitch cues are significantly degraded; and (2) L-Tone can improve Mandarin tone recognition in both simulated and actual CI hearing without a significant negative effect on vowel and consonant recognition. L-Tone is a promising algorithm for incorporation into real-time CI processing and off-line CI rehabilitation training software.
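The core of the L-Tone idea, scaling intensity by a gain tied to instantaneous F0, can be sketched as below. The mapping function here (a fixed dB-per-octave slope around a reference F0) is an assumption for illustration; the published algorithm defines its own gain-F0 mapping functions:

```python
import numpy as np

def apply_f0_gain(signal, f0_track, f0_ref=200.0, db_per_octave=3.0):
    """Multiply the signal by a gain that follows instantaneous F0, so
    that pitch movement is mirrored as a loudness contour (toy mapping)."""
    octaves = np.log2(np.maximum(f0_track, 1e-6) / f0_ref)
    gain_db = db_per_octave * octaves          # rises with F0, 0 dB at f0_ref
    return signal * 10.0 ** (gain_db / 20.0)

# A rising tone (Tone-2-like F0 glide from 150 to 300 Hz)
fs = 16000
t = np.arange(0, 0.3, 1 / fs)
f0 = np.linspace(150.0, 300.0, t.size)
tone = np.sin(2 * np.pi * np.cumsum(f0) / fs)  # phase from the F0 track
shaped = apply_f0_gain(tone, f0)               # grows louder as pitch rises
```

The design intent is that listeners with degraded pitch cues can fall back on the co-varying loudness contour to identify the tone.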
47
Ou J, Law SP. Individual differences in processing pitch contour and rise time in adults: A behavioral and electrophysiological study of Cantonese tone merging. J Acoust Soc Am 2016; 139:3226. [PMID: 27369146] [DOI: 10.1121/1.4954252]
Abstract
One way to understand the relationship between speech perception and production is to examine cases where the two dissociate. This study investigates the hypothesis that perceptual acuity to the rise time of the sound amplitude envelope (reflected in early event-related potentials, ERPs) and to pitch contour (reflected in the mismatch negativity, MMN) may be associated with individual differences in production among speakers with otherwise comparable perceptual abilities. To test this hypothesis, advantage was taken of an ongoing sound change, tone merging in Cantonese, and ERPs were compared between two groups of typically developed native speakers who could discriminate the high rising and low rising tones with equivalent accuracy but differed in the distinctiveness of their production of these tones. Using a passive oddball paradigm, early positive-going EEG components to rise time and the MMN to pitch contour were elicited during perception of the two tones. Significant group differences were found in neural responses to rise time rather than to pitch contour. More importantly, individual differences in the efficiency of tone discrimination (response latency) and in the magnitude of neural responses to rise time were correlated with acoustic measures of F0 offset and rise time differences in productions of the two rising tones.
Affiliation(s)
- Jinghua Ou
- Division of Speech and Hearing Science, the University of Hong Kong, Hong Kong Special Administrative Region
- Sam-Po Law
- Division of Speech and Hearing Science, the University of Hong Kong, Hong Kong Special Administrative Region
48
Antoniou M, Wong PCM. Varying irrelevant phonetic features hinders learning of the feature being trained. J Acoust Soc Am 2016; 139:271-8. [PMID: 26827023] [PMCID: PMC4714982] [DOI: 10.1121/1.4939736]
Abstract
Learning to distinguish nonnative words that differ in a critical phonetic feature can be difficult. Speech training studies typically employ methods that explicitly direct the learner's attention to the relevant nonnative feature to be learned. However, studies on vision have demonstrated that perceptual learning may occur implicitly, by exposing learners to stimulus features, even if they are irrelevant to the task, and it has recently been suggested that this task-irrelevant perceptual learning framework also applies to speech. In this study, subjects took part in a seven-day training regimen to learn to distinguish one of two nonnative features, namely, voice onset time or lexical tone, using explicit training methods consistent with most speech training studies. Critically, half of the subjects were exposed to stimuli that varied not only in the relevant feature, but in the irrelevant feature as well. The results showed that subjects who were trained with stimuli that varied in the relevant feature and held the irrelevant feature constant achieved the best learning outcomes. Varying both features hindered learning and generalization to new stimuli.
Affiliation(s)
- Mark Antoniou
- MARCS Institute, Western Sydney University, Locked Bag 1797, Penrith, New South Wales 2751, Australia
- Patrick C M Wong
- Department of Linguistics and Modern Languages and Brain and Mind Institute, The Chinese University of Hong Kong, Shatin, N.T., Hong Kong Special Administrative Region, People's Republic of China
49
Meng Q, Zheng N, Li X. Mandarin speech-in-noise and tone recognition using vocoder simulations of the temporal limits encoder for cochlear implants. J Acoust Soc Am 2016; 139:301-310. [PMID: 26827026] [DOI: 10.1121/1.4939707]
Abstract
Temporal envelope-based signal processing strategies are widely used in cochlear-implant (CI) systems. It is well recognized that the inability to convey temporal fine structure (TFS) in the stimuli limits CI users' performance, but it is still unclear how to deliver the TFS effectively. A strategy known as the temporal limits encoder (TLE), which derives an amplitude modulator for generating stimuli coded in an interleaved-sampling strategy, has recently been proposed. The TLE modulator contains information related to the original temporal envelope and a slowly varying TFS from the band signal. In this paper, theoretical analyses are presented to demonstrate the superiority of TLE over two existing strategies, the clinically available continuous-interleaved-sampling (CIS) strategy and the experimental harmonic-single-sideband-encoder strategy. Perceptual experiments with vocoder simulations in normal-hearing listeners were conducted to compare the performance of TLE and CIS on two tasks (Mandarin speech reception in babble noise and tone recognition in quiet). The performance of the TLE modulator was mostly better than (for most tone-band vocoders) or comparable to (for noise-band vocoders) that of the CIS modulator on both tasks. This work implies that there is some potential for improving the representation of TFS with CIs by using a TLE strategy.
Affiliation(s)
- Qinglin Meng
- Shenzhen Key Laboratory of Modern Communication and Information Processing, College of Information Engineering, Shenzhen University, Shenzhen 518060, China
- Nengheng Zheng
- Shenzhen Key Laboratory of Modern Communication and Information Processing, College of Information Engineering, Shenzhen University, Shenzhen 518060, China
- Xia Li
- Shenzhen Key Laboratory of Modern Communication and Information Processing, College of Information Engineering, Shenzhen University, Shenzhen 518060, China
50
Chang CB, Bowles AR. Context effects on second-language learning of tonal contrasts. J Acoust Soc Am 2015; 138:3703-3716. [PMID: 26723326] [DOI: 10.1121/1.4937612]
Abstract
Studies of lexical tone learning generally focus on monosyllabic contexts, while reports of phonetic learning benefits associated with input variability are based largely on experienced learners. This study trained inexperienced learners on Mandarin tonal contrasts to test two hypotheses regarding the influence of context and variability on tone learning. The first hypothesis was that increased phonetic variability of tones in disyllabic contexts makes initial tone learning more challenging in disyllabic than monosyllabic words. The second hypothesis was that the learnability of a given tone varies across contexts due to differences in tonal variability. Results of a word learning experiment supported both hypotheses: tones were acquired less successfully in disyllables than in monosyllables, and the relative difficulty of disyllables was closely related to contextual tonal variability. These results indicate limited relevance of monosyllable-based data on Mandarin learning for the disyllabic majority of the Mandarin lexicon. Furthermore, in the short term, variability can diminish learning; its effects are not necessarily beneficial but dependent on acquisition stage and other learner characteristics. These findings thus highlight the importance of considering contextual variability and the interaction between variability and type of learner in the design, interpretation, and application of research on phonetic learning.
Affiliation(s)
- Charles B Chang
- Linguistics Program, Boston University, 621 Commonwealth Avenue, Boston, Massachusetts 02215, USA
- Anita R Bowles
- Rosetta Stone, 135 West Market Street, Harrisonburg, Virginia 22801, USA