1
Brodbeck C, Kandylaki KD, Scharenborg O. Neural Representations of Non-native Speech Reflect Proficiency and Interference from Native Language Knowledge. J Neurosci 2024; 44:e0666232023. [PMID: 37963763 PMCID: PMC10851685 DOI: 10.1523/jneurosci.0666-23.2023] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/14/2023] [Revised: 06/23/2023] [Accepted: 08/01/2023] [Indexed: 11/16/2023]
Abstract
Learning to process speech in a foreign language involves learning new representations for mapping the auditory signal to linguistic structure. Behavioral experiments suggest that even listeners who are highly proficient in a non-native language experience interference from representations of their native language. However, much of the evidence for such interference comes from tasks that may inadvertently increase the salience of native language competitors. Here we tested for neural evidence of proficiency and native language interference in a naturalistic story listening task. We studied electroencephalography responses of 39 native speakers of Dutch (14 male) to an English short story, spoken by a native speaker of either American English or Dutch. We modeled brain responses with multivariate temporal response functions, using acoustic and language models. We found evidence for activation of Dutch language statistics when listening to English, but only when it was spoken with a Dutch accent. This suggests that a naturalistic, monolingual setting decreases the interference from native language representations, whereas an accent in the listener's own native language may increase native language interference, by increasing the salience of the native language and activating native language phonetic and lexical representations. Brain responses suggest that such interference stems from words from the native language competing with the foreign language in a single word recognition system, rather than being activated in a parallel lexicon. We further found that secondary acoustic representations of speech (after 200 ms latency) decreased with increasing proficiency. This may reflect improved acoustic-phonetic models in more proficient listeners.
Significance Statement
Behavioral experiments suggest that native language knowledge interferes with foreign language listening, but such effects may be sensitive to task manipulations, as tasks that increase metalinguistic awareness may also increase native language interference. This highlights the need for studying non-native speech processing using naturalistic tasks. We measured neural responses unobtrusively while participants listened for comprehension and characterized the influence of proficiency at multiple levels of representation. We found that salience of the native language, as manipulated through speaker accent, affected activation of native language representations: significant evidence for activation of native language (Dutch) categories was only obtained when the speaker had a Dutch accent, whereas no significant interference was found for a speaker with a native (American) accent.
Affiliation(s)
- Christian Brodbeck
  - Department of Psychological Sciences, University of Connecticut, Storrs, Connecticut 06269
- Katerina Danae Kandylaki
  - Department of Neuropsychology and Psychopharmacology, Maastricht University, 6200 MD, Maastricht, The Netherlands
- Odette Scharenborg
  - Multimedia Computing Group, Delft University of Technology, 2628 XE, Delft, The Netherlands
2
Yang X, Silamu W, Xu M, Li Y. Display-Semantic Transformer for Scene Text Recognition. Sensors (Basel) 2023; 23:8159. [PMID: 37836989 PMCID: PMC10574938 DOI: 10.3390/s23198159] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/10/2023] [Revised: 09/26/2023] [Accepted: 09/27/2023] [Indexed: 10/15/2023]
Abstract
Linguistic knowledge aids scene text recognition by providing semantic information to refine the character sequence. A purely visual model focuses only on the visual texture of characters without actively learning linguistic information, which leads to poor recognition rates on noisy (e.g., distorted or blurry) images. To address this issue, this study builds on recent findings with the Vision Transformer: our approach, called Display-Semantic Transformer (DST), constructs a masked language model and a semantic visual interaction module. The model can mine deep semantic information from images to assist scene text recognition and improve the robustness of the model. The semantic visual interaction module can better realize the interaction between semantic information and visual features. In this way, the visual features are enhanced by the semantic information so that the model achieves better recognition. The experimental results show that our model improves the average recognition accuracy on six benchmark test sets by nearly 2% compared to the baseline. Our model retains the benefits of a small number of parameters and fast inference, and it attains a better balance between accuracy and speed.
Affiliation(s)
- Xinqi Yang
  - College of Computer Science and Technology, Xinjiang University, No. 777 Huarui Street, Urumqi 830017, China
  - Xinjiang Laboratory of Multi-Language Information Technology, Xinjiang University, No. 777 Huarui Street, Urumqi 830017, China
  - Xinjiang Multilingual Information Technology Research Center, Xinjiang University, No. 777 Huarui Street, Urumqi 830017, China
- Wushour Silamu
  - College of Computer Science and Technology, Xinjiang University, No. 777 Huarui Street, Urumqi 830017, China
  - Xinjiang Laboratory of Multi-Language Information Technology, Xinjiang University, No. 777 Huarui Street, Urumqi 830017, China
  - Xinjiang Multilingual Information Technology Research Center, Xinjiang University, No. 777 Huarui Street, Urumqi 830017, China
- Miaomiao Xu
  - College of Computer Science and Technology, Xinjiang University, No. 777 Huarui Street, Urumqi 830017, China
  - Xinjiang Laboratory of Multi-Language Information Technology, Xinjiang University, No. 777 Huarui Street, Urumqi 830017, China
  - Xinjiang Multilingual Information Technology Research Center, Xinjiang University, No. 777 Huarui Street, Urumqi 830017, China
- Yanbing Li
  - College of Computer Science and Technology, Xinjiang University, No. 777 Huarui Street, Urumqi 830017, China
  - Xinjiang Laboratory of Multi-Language Information Technology, Xinjiang University, No. 777 Huarui Street, Urumqi 830017, China
  - Xinjiang Multilingual Information Technology Research Center, Xinjiang University, No. 777 Huarui Street, Urumqi 830017, China
3
Tomaschek F, Ramscar M. Understanding the Phonetic Characteristics of Speech Under Uncertainty-Implications of the Representation of Linguistic Knowledge in Learning and Processing. Front Psychol 2022; 13:754395. [PMID: 35548492 PMCID: PMC9083257 DOI: 10.3389/fpsyg.2022.754395] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 08/06/2021] [Accepted: 03/24/2022] [Indexed: 11/13/2022]
Abstract
The uncertainty associated with paradigmatic families has been shown to correlate with their phonetic characteristics in speech, suggesting that representations of complex sublexical relations between words are part of speaker knowledge. To better understand this, recent studies have used two-layer neural network models to examine the way paradigmatic uncertainty emerges in learning. However, to date this work has largely ignored the way choices about the representation of inflectional and grammatical functions (IFS) in models strongly influence what they subsequently learn. To explore the consequences of this, we investigate how representations of IFS in the input-output structures of learning models affect the capacity of uncertainty estimates derived from them to account for phonetic variability in speech. Specifically, we examine whether IFS are best represented as outputs to neural networks (as in previous studies) or as inputs by building models that embody both choices and examining their capacity to account for uncertainty effects in the formant trajectories of word-final [ɐ], which in German discriminates around sixty different IFS. Overall, we find that formants are enhanced as the uncertainty associated with IFS decreases. This result dovetails with a growing number of studies of morphological and inflectional families that have shown that enhancement is associated with lower uncertainty in context. Importantly, we also find that in models where IFS serve as inputs (as our theoretical analysis suggests they ought to), the uncertainty measures provide better fits to the empirical variance observed in [ɐ] formants than in models where IFS serve as outputs. This supports our suggestion that IFS serve as cognitive cues during speech production, and should be treated as such in modeling. It is also consistent with the idea that when IFS serve as inputs to a learning network, this maintains the distinction between those parts of the network that represent message and those that represent signal. We conclude by describing how maintaining a "signal-message-uncertainty distinction" can allow us to reconcile a range of apparently contradictory findings about the relationship between articulation and uncertainty in context.
Affiliation(s)
- Fabian Tomaschek
  - Quantitative Linguistics Lab, Department of General Linguistics, University of Tübingen, Tübingen, Germany
4
León-Pinilla R, Soto-Rubio A, Prado-Gascó V. Support and Emotional Well-Being of Asylum Seekers and Refugees in Spain. Int J Environ Res Public Health 2020; 17:E8365. [PMID: 33198150 DOI: 10.3390/ijerph17228365] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 09/11/2020] [Revised: 10/28/2020] [Accepted: 11/10/2020] [Indexed: 12/16/2022]
Abstract
Although the world’s forcibly displaced population reached 79.5 million in 2019, the difficult situations of displaced people and the issues they struggle with remain practically invisible in Spanish society. It therefore seems necessary to provide greater insight into this invisible reality in order to improve the refugees’ situation. The present cross-sectional study aims to draw a general profile of the main characteristics of refugees and asylum seekers in Spain and of their well-being. A total of 186 refugees living in Spain participated. An ad-hoc questionnaire was administered to obtain data regarding sociodemographic profile, language skills, and social and institutional support. A standardized instrument, SPANE, was used to measure well-being. Healthcare, followed by legal aid, was the easiest to access. On the other hand, finding a job, having money, and finding housing were the most difficult. In general, the refugees present more positive feelings than negative ones, which implies a positive emotional balance, although the average score obtained for emotional balance is quite far from the highest possible score. We consider this to be a pivotal first step which can provide useful information for the further design of aid strategies to improve this vulnerable group’s situation.