1
Georgiou GP, Dimitriou D. Perception of Dutch vowels by Cypriot Greek listeners: To what extent can listeners' patterns be predicted by acoustic and perceptual similarity? Atten Percept Psychophys 2023; 85:2459-2474. [PMID: 37740154 PMCID: PMC10584718 DOI: 10.3758/s13414-023-02781-7] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Accepted: 08/30/2023] [Indexed: 09/24/2023]
Abstract
Numerous studies have investigated the perception of non-native sounds by listeners with different first language (L1) backgrounds. However, research needs to expand to under-researched languages and incorporate predictions made under the assumptions of new speech models. This study aimed to investigate the perception of Dutch vowels by Cypriot Greek adult listeners and test the predictions of cross-linguistic acoustic and perceptual similarity. The predictions of acoustic similarity were formed using a machine-learning algorithm. Listeners completed a classification test, which served as the baseline for developing the predictions of perceptual similarity within the framework of the Universal Perceptual Model (UPM), and an AXB discrimination test; the latter allowed the evaluation of both acoustic and perceptual predictions. The findings indicated that listeners classified each non-native vowel as one or more L1 vowels, while the discrimination accuracy over the non-native contrasts was moderate. In addition, cross-linguistic acoustic similarity predicted to a large extent the classification of non-native sounds in terms of L1 categories, and both acoustic and perceptual similarity predicted the discrimination accuracy of all contrasts. In line with prior work, these findings demonstrate that acoustic and perceptual cues are reliable predictors of non-native contrast discrimination and that the UPM can make accurate estimations for the discrimination patterns of non-native listeners.
Affiliation(s)
- Georgios P. Georgiou
- Department of Languages and Literature, University of Nicosia, Nicosia, Cyprus
- University of Nicosia Phonetic Lab, Nicosia, Cyprus
2
Xie X, Jaeger TF, Kurumada C. What we do (not) know about the mechanisms underlying adaptive speech perception: A computational framework and review. Cortex 2023; 166:377-424. [PMID: 37506665 DOI: 10.1016/j.cortex.2023.05.003] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 12/15/2021] [Revised: 12/23/2022] [Accepted: 05/05/2023] [Indexed: 07/30/2023]
Abstract
Speech from unfamiliar talkers can be difficult to comprehend initially. These difficulties tend to dissipate with exposure, sometimes within minutes or less. Adaptivity in response to unfamiliar input is now considered a fundamental property of speech perception, and research over the past two decades has made substantial progress in identifying its characteristics. The mechanisms underlying adaptive speech perception, however, remain unknown. Past work has attributed facilitatory effects of exposure to any one of three qualitatively different hypothesized mechanisms: (1) low-level, pre-linguistic, signal normalization, (2) changes in/selection of linguistic representations, or (3) changes in post-perceptual decision-making. Direct comparisons of these hypotheses, or combinations thereof, have been lacking. We describe a general computational framework for adaptive speech perception (ASP) that, for the first time, implements all three mechanisms. We demonstrate how the framework can be used to derive predictions for experiments on perception from the acoustic properties of the stimuli. Using this approach, we find that, at the level of data analysis presently employed by most studies in the field, the signature results of influential experimental paradigms do not distinguish between the three mechanisms. This highlights the need for a change in research practices, so that future experiments provide more informative results. We recommend specific changes to experimental paradigms and data analysis. All data and code for this study are shared via OSF, including the R markdown document that this article is generated from, and an R library that implements the models we present.
Affiliation(s)
- Xin Xie
- Language Science, University of California, Irvine, USA
- T Florian Jaeger
- Brain and Cognitive Sciences, University of Rochester, Rochester, NY, USA; Computer Science, University of Rochester, Rochester, NY, USA
- Chigusa Kurumada
- Brain and Cognitive Sciences, University of Rochester, Rochester, NY, USA
3
Liu W, Wang T, Huang X. The influences of forward context on stop-consonant perception: The combined effects of contrast and acoustic cue activation? The Journal of the Acoustical Society of America 2023; 154:1903-1920. [PMID: 37756574 DOI: 10.1121/10.0021077] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/27/2023] [Accepted: 09/06/2023] [Indexed: 09/29/2023]
Abstract
The perception of the /da/-/ga/ series, distinguished primarily by the third formant (F3) transition, is affected by many nonspeech and speech sounds. Previous studies mainly investigated the influences of context stimuli with frequency bands located in the F3 region and proposed the account of spectral contrast effects. This study examined the effects of context stimuli with bands not in the F3 region. The results revealed that these non-F3-region stimuli (whether with bands higher or lower than the F3 region) mainly facilitated the identification of /ga/; for example, the stimuli (including frequency-modulated glides, sine-wave tones, filtered sentences, and natural vowels) in the low-frequency band (500-1500 Hz) led to more /ga/ responses than those in the low-F3 region (1500-2500 Hz). It is suggested that in the F3 region, context stimuli may act through spectral contrast effects, while in non-F3 regions, context stimuli might activate the acoustic cues of /g/ and further facilitate the identification of /ga/. The combination of contrast and acoustic cue effects can explain more results concerning the forward context influences on the perception of the /da/-/ga/ series, including the effects of non-F3-region stimuli and the imbalanced influences of context stimuli on /da/ and /ga/ perception.
Affiliation(s)
- Wenli Liu
- Department of Social Psychology, Zhou Enlai School of Government, Nankai University, 38 Tongshuo Road, Tianjin 300350, China
- Tianyu Wang
- Department of Social Psychology, Zhou Enlai School of Government, Nankai University, 38 Tongshuo Road, Tianjin 300350, China
- Xianjun Huang
- School of Psychology, Capital Normal University, 105 North West 3rd Ring Road, Beijing 100048, China
4
Hamza Y, Farhadi A, Schwarz DM, McDonough JM, Carney LH. Representations of fricatives in subcortical model responses: Comparisons with human consonant perception. The Journal of the Acoustical Society of America 2023; 154:602-618. [PMID: 37535429 PMCID: PMC10550336 DOI: 10.1121/10.0020536] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 09/30/2022] [Revised: 07/11/2023] [Accepted: 07/13/2023] [Indexed: 08/05/2023]
Abstract
Fricatives are obstruent sound contrasts made by airflow constrictions in the vocal tract that produce turbulence across the constriction or at a site downstream from the constriction. Fricatives exhibit significant intra/intersubject and contextual variability. Yet, fricatives are perceived with high accuracy. The current study investigated modeled neural responses to fricatives in the auditory nerve (AN) and inferior colliculus (IC) with the hypothesis that response profiles across populations of neurons provide robust correlates to consonant perception. Stimuli were 270 intervocalic fricatives (10 speakers × 9 fricatives × 3 utterances). Computational model response profiles had characteristic frequencies that were log-spaced from 125 Hz to 8 or 20 kHz to explore the impact of high-frequency responses. Confusion matrices generated by k-nearest-neighbor subspace classifiers were based on the profiles of average rates across characteristic frequencies as feature vectors. Model confusion matrices were compared with published behavioral data. The modeled AN and IC neural responses provided better predictions of behavioral accuracy than the stimulus spectra, and IC showed better accuracy than AN. Behavioral fricative accuracy was explained by modeled neural response profiles, whereas confusions were only partially explained. Extended frequencies improved accuracy based on the model IC, corroborating the importance of extended high frequencies in speech perception.
Affiliation(s)
- Yasmeen Hamza
- Department of Biomedical Engineering, University of Rochester, Rochester, New York 14627, USA
- Afagh Farhadi
- Department of Electrical and Computer Engineering, University of Rochester, Rochester, New York 14627, USA
- Douglas M Schwarz
- Depts. of Neuroscience and Biomedical Engineering, University of Rochester, Rochester, New York 14627, USA
- Joyce M McDonough
- Department of Linguistics, University of Rochester, Rochester, New York 14627, USA
- Laurel H Carney
- Depts. of Biomedical Engineering, Neuroscience, and Electrical and Computer Engineering, University of Rochester, Rochester, New York 14627, USA
5
Azaiez N, Loberg O, Hämäläinen JA, Leppänen PHT. Auditory P3a response to native and foreign speech in children with or without attentional deficit. Neuropsychologia 2023; 183:108506. [PMID: 36773807 DOI: 10.1016/j.neuropsychologia.2023.108506] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/10/2022] [Revised: 01/24/2023] [Accepted: 02/06/2023] [Indexed: 02/11/2023]
Abstract
The aim of this study was to investigate the attentional mechanism in the processing of native and foreign speech in children with and without attentional deficit. For this purpose, the P3a component, a cognitive neuromarker of attentional processes, was investigated in a two-sequence two-deviant oddball paradigm with Finnish and English speech items, using the event-related potential (ERP) technique. The difference waves reflected the temporal brain dynamics of the P3a response in native and foreign language contexts. Cluster-based permutation tests evaluated the group differences over the P3a time window. A correlation analysis was conducted between the P3a response and the attention score (ATTEX) to evaluate whether the behavioral assessment reflected the neural activity. The source reconstruction method (CLARA) was used to investigate the neural origins of the attentional differences between groups and conditions. The ERP results showed a larger P3a response in the group of children with attentional problems (AP) compared to controls (CTR). The P3a response differed statistically between the two groups in native language processing, but not in foreign language processing. The ATTEX score correlated with the P3a amplitude in the native language contrasts. The correlation analyses hint at some hemispheric difference in brain activity in the frontal area. The group-level CLARA reconstruction showed activation in the speech perception and attention networks over the frontal, parietal, and temporal areas. Differences in activations of these networks were found between the groups and conditions, with the AP group showing higher activity at the source level, which is the origin of the ERP enhancement observed at the scalp level.
Affiliation(s)
- Najla Azaiez
- Department of Psychology, Faculty of Education and Psychology, University of Jyväskylä, Finland
- Otto Loberg
- Department of Psychology, Faculty of Science and Technology, Bournemouth University, United Kingdom
- Jarmo A Hämäläinen
- Department of Psychology, Faculty of Education and Psychology, University of Jyväskylä, Finland; Jyväskylä Center for Interdisciplinary Brain Research, Department of Psychology, University of Jyväskylä, Finland
- Paavo H T Leppänen
- Department of Psychology, Faculty of Education and Psychology, University of Jyväskylä, Finland; Jyväskylä Center for Interdisciplinary Brain Research, Department of Psychology, University of Jyväskylä, Finland
6
Discriminatory Brain Processes of Native and Foreign Language in Children with and without Reading Difficulties. Brain Sci 2022; 13:brainsci13010076. [PMID: 36672057 PMCID: PMC9856413 DOI: 10.3390/brainsci13010076] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 11/11/2022] [Revised: 12/21/2022] [Accepted: 12/28/2022] [Indexed: 01/03/2023] Open
Abstract
The association between impaired speech perception and reading difficulty has been well established in native language processing, as can be observed from brain activity. However, few studies have investigated whether this association extends to brain activity during foreign language processing. The relationship between reading skills and the neuronal representation of foreign-language speech remains unclear. In the present study, we used event-related potentials (ERPs) with high-density EEG to investigate this question. Children aged 11 to 13 years, either typically developing (CTR) or with reading difficulties (RD), were tested via a passive auditory oddball paradigm containing native (Finnish) and foreign (English) speech items. The change-detection-related ERP responses, the mismatch response (MMR), and the late discriminative negativity (LDN) were studied. Cluster-based permutation tests were performed within and between groups. The results showed an apparent language effect. In the CTR group, we found an atypical MMR in foreign language processing and a larger LDN response for speech items containing a diphthong in both languages. In the RD group, we found an unstable MMR with lower amplitude and a nonsignificant LDN response. A deficit in the LDN response in both languages was found within the RD group analysis. Moreover, we observed larger brain responses in the RD group and a hemispheric polarity reversal compared to the CTR group responses. Our results provide new evidence that language processing differed between the CTR and RD groups in early and late discriminatory responses and that language processing is linked to reading skills in both native and foreign language contexts.
7
Chen W, van de Weijer J. The role of L1-L2 dissimilarity in L2 segment learning - Implications from the acquisition of English post-alveolar fricatives by Mandarin and Mandarin/Wu speakers. Front Psychol 2022; 13:1017724. [PMID: 36582315 PMCID: PMC9793852 DOI: 10.3389/fpsyg.2022.1017724] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/12/2022] [Accepted: 11/02/2022] [Indexed: 12/15/2022] Open
Abstract
This study examines how the concept of L1-L2 dissimilarity should be addressed from a two-way perspective in L2 segment learning, and how it relates to the learning outcomes. We achieved this by investigating the productions of the post-alveolar fricatives /ʃ, ʒ/ by Mandarin and Mandarin/Wu speakers, which were subsequently assessed by native English listeners. In the first experiment, we analyzed the spectral moments of /ʃ, ʒ/ produced by Mandarin monolingual and Mandarin/Wu bilingual speakers to find out how the two groups of speakers pronounced the target segments. In the second experiment, native English listeners were tasked with rating the accentedness of the Mandarin- and Mandarin/Wu-accented /ʃ, ʒ/. Results showed native English listeners scored Mandarin/Wu-accented /ʃ/ as having no accent and Mandarin-accented /ʒ/ as having a heavy accent, indicating that English natives perceived the 'native vs. nonnative' segment dissimilarity differently from Chinese learners of English, and that the L1-L2 dissimilarity perceived from both sides may work together in defining the L2 segment learning outcomes.
Affiliation(s)
- Wenjun Chen
- School of Foreign Languages, Ningbo University of Technology, Ningbo, Zhejiang, China. Correspondence: Wenjun Chen
8
Winn MB, Moore AN. Perceptual weighting of acoustic cues for accommodating gender-related talker differences heard by listeners with normal hearing and with cochlear implants. The Journal of the Acoustical Society of America 2020; 148:496. [PMID: 32873011 PMCID: PMC7402726 DOI: 10.1121/10.0001672] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Received: 02/27/2020] [Revised: 05/31/2020] [Accepted: 07/14/2020] [Indexed: 06/11/2023]
Abstract
Listeners must accommodate acoustic differences between vocal tracts and speaking styles of conversation partners, a process called normalization or accommodation. This study explores what acoustic cues are used to make this perceptual adjustment by listeners with normal hearing or with cochlear implants (CIs), when the acoustic variability is related to the talker's gender. A continuum between /ʃ/ and /s/ was paired with naturally spoken vocalic contexts that were parametrically manipulated to vary by numerous cues for talker gender including fundamental frequency (F0), vocal tract length (formant spacing), and direct spectral contrast with the fricative. The goal was to examine relative contributions of these cues toward the tendency to have a lower-frequency acoustic boundary for fricatives spoken by men (found in numerous previous studies). Normal hearing listeners relied primarily on formant spacing and much less on F0. The CI listeners were individually variable, with the F0 cue emerging as the strongest cue on average.
Affiliation(s)
- Matthew B Winn
- Speech-Language-Hearing Sciences, University of Minnesota, Minneapolis, Minnesota 55455, USA
- Ashley N Moore
- Department of Speech & Hearing Sciences, University of Washington, Seattle, Washington 98105, USA