1
Abdusalomov AB, Safarov F, Rakhimov M, Turaev B, Whangbo TK. Improved Feature Parameter Extraction from Speech Signals Using Machine Learning Algorithm. Sensors (Basel, Switzerland) 2022; 22:8122. [PMID: 36365819 PMCID: PMC9654697 DOI: 10.3390/s22218122] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/27/2022] [Revised: 10/14/2022] [Accepted: 10/20/2022] [Indexed: 06/16/2023]
Abstract
Speech recognition refers to the capability of software or hardware to receive a speech signal, identify the speaker's features in the speech signal, and recognize the speaker thereafter. In general, the speech recognition process involves three main steps: acoustic processing, feature extraction, and classification/recognition. The purpose of feature extraction is to represent a speech signal using a predetermined number of signal components, because the full information in the acoustic signal is too cumbersome to handle and some of it is irrelevant to the identification task. This study proposes a machine learning-based approach that performs feature parameter extraction from speech signals to improve the performance of speech recognition applications in real-time smart city environments. Moreover, the principle of mapping a block of main memory to the cache is used efficiently to reduce computing time. The block size of cache memory is a parameter that strongly affects the cache performance. In particular, the implementation of such processes in real-time systems requires a high computation speed. Processing speed plays an important role in speech recognition in real-time systems. It requires the use of modern technologies and fast algorithms that accelerate the extraction of feature parameters from speech signals. Problems with overclocking during the digital processing of speech signals have yet to be completely resolved. The experimental results demonstrate that the proposed method successfully extracts the signal features and achieves seamless classification performance compared to other conventional speech recognition algorithms.
Affiliation(s)
- Furkat Safarov
- Department of Computer Engineering, Gachon University, Sujeong-gu, Seongnam-si 461-701, Gyeonggi-do, Korea
- Mekhriddin Rakhimov
- Department of Artificial Intelligence, Tashkent University of Information Technologies Named after Muhammad Al-Khwarizmi, Tashkent 100200, Uzbekistan
- Boburkhon Turaev
- Department of Artificial Intelligence, Tashkent University of Information Technologies Named after Muhammad Al-Khwarizmi, Tashkent 100200, Uzbekistan
- Taeg Keun Whangbo
- Department of Computer Engineering, Gachon University, Sujeong-gu, Seongnam-si 461-701, Gyeonggi-do, Korea
2
Scott M. Sensory attenuation from action observation. Exp Brain Res 2022; 240:2923-2937. [PMID: 36123539 DOI: 10.1007/s00221-022-06460-1] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 11/23/2021] [Accepted: 08/27/2022] [Indexed: 11/28/2022]
Abstract
A central claim of many embodied approaches to cognition is that understanding others' actions is achieved by covertly simulating the observed actions and their consequences in one's own motor system. If such a simulation occurs, it may be accomplished through forward models, a component of the motor system already known to perform simulations of actions and their consequences in order to support sensory-monitoring of one's own actions. Forward-model simulations cause an attenuation of sensory intensity, so if the simulations hypothesized by embodied cognition are indeed provided by forward models, then action observation should trigger this sensory attenuation. To test this hypothesis, the experiments reported here measured the perceived intensity of a touch sensation on the finger when participants observed an active touch (a finger reaching to touch a ball) vs. a passive touch (a ball rolling to touch an unmoving finger). The touch sensation was perceived as less intense during observation of active touch in comparison with observation of passive touch, providing evidence that forward models are indeed engaged during action observation. The strength of this sensory attenuation is compared and contrasted with a well-established sensory-amplification effect caused by visual attention. This sensory-amplification effect has not generally been considered in studies related to sensory attenuation in action observation, which may explain conflicting results reported in the field.
Affiliation(s)
- Mark Scott
- Department of Psychology, Memorial University of Newfoundland, St. John's, Canada.
3
Romanovska L, Janssen R, Bonte M. Longitudinal changes in cortical responses to letter-speech sound stimuli in 8-11 year-old children. NPJ Science of Learning 2022; 7:2. [PMID: 35079026 PMCID: PMC8789908 DOI: 10.1038/s41539-021-00118-3] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 04/22/2021] [Accepted: 12/16/2021] [Indexed: 05/29/2023]
Abstract
While children are able to name letters fairly quickly, the automatisation of letter-speech sound mappings continues over the first years of reading development. In the current longitudinal fMRI study, we explored developmental changes in cortical responses to letters and speech sounds across 3 yearly measurements in a sample of 18 children aged 8-11 years. We employed a text-based recalibration paradigm in which combined exposure to text and ambiguous speech sounds shifts participants' later perception of the ambiguous sounds towards the text. Our results showed that activity of the left superior temporal and lateral inferior precentral gyri followed a non-linear developmental pattern across the measurement sessions. This pattern is reminiscent of previously reported inverted-U-shaped developmental trajectories in children's visual cortical responses to text. Our findings suggest that the processing of letters and speech sounds involves non-linear changes in the brain's spoken language network, possibly related to progressive automatisation of reading skills.
Affiliation(s)
- Linda Romanovska
- Maastricht Brain Imaging Center, Department Cognitive Neuroscience, Faculty of Psychology and Neuroscience, Maastricht University, Maastricht, The Netherlands.
- Roef Janssen
- Maastricht Brain Imaging Center, Department Cognitive Neuroscience, Faculty of Psychology and Neuroscience, Maastricht University, Maastricht, The Netherlands
- Milene Bonte
- Maastricht Brain Imaging Center, Department Cognitive Neuroscience, Faculty of Psychology and Neuroscience, Maastricht University, Maastricht, The Netherlands
4
Romanovska L, Bonte M. How Learning to Read Changes the Listening Brain. Front Psychol 2021; 12:726882. [PMID: 34987442 PMCID: PMC8721231 DOI: 10.3389/fpsyg.2021.726882] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 06/17/2021] [Accepted: 11/23/2021] [Indexed: 01/18/2023]
Abstract
Reading acquisition reorganizes existing brain networks for speech and visual processing to form novel audio-visual language representations. This requires substantial cortical plasticity that is reflected in changes in brain activation and functional as well as structural connectivity between brain areas. The extent to which a child's brain can accommodate these changes may underlie the high variability in reading outcome in both typical and dyslexic readers. In this review, we focus on reading-induced functional changes of the dorsal speech network in particular and discuss how its reciprocal interactions with the ventral reading network contribute to reading outcome. We discuss how the dynamic and intertwined development of both reading networks may be best captured by approaching reading from a skill learning perspective, using audio-visual learning paradigms and longitudinal designs to follow neuro-behavioral changes while children's reading skills unfold.
Affiliation(s)
- Milene Bonte
- *Correspondence: Linda Romanovska; Milene Bonte
5
Romanovska L, Janssen R, Bonte M. Cortical responses to letters and ambiguous speech vary with reading skills in dyslexic and typically reading children. NeuroImage: Clinical 2021; 30:102588. [PMID: 33618236 PMCID: PMC7907898 DOI: 10.1016/j.nicl.2021.102588] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Received: 10/21/2020] [Revised: 01/26/2021] [Accepted: 02/02/2021] [Indexed: 11/25/2022]
Abstract
Text recalibrates ambiguous speech perception in children with and without dyslexia. Dyslexia and poorer reading skills are linked to reduced left fusiform activation. Poorer letter-speech sound matching is linked to higher superior temporal activation.
One of the proposed issues underlying reading difficulties in dyslexia is insufficiently automatized letter-speech sound associations. In the current fMRI experiment, we employ text-based recalibration to investigate letter-speech sound mappings in 8–10 year-old children with and without dyslexia. Here an ambiguous speech sound /a?a/ midway between /aba/ and /ada/ is combined with disambiguating “aba” or “ada” text causing a perceptual shift of the ambiguous /a?a/ sound towards the text (recalibration). This perceptual shift has been found to be reduced in adults but not in children with dyslexia compared to typical readers. Our fMRI results show significantly reduced activation in the left fusiform in dyslexic compared to typical readers, despite comparable behavioural performance. Furthermore, enhanced audio-visual activation within this region was linked to better reading and phonological skills. In contrast, higher activation in bilateral superior temporal cortex was associated with lower letter-speech sound identification fluency. These findings reflect individual differences during the early stages of reading development with reduced recruitment of the left fusiform in dyslexic readers together with an increased involvement of the superior temporal cortex in children with less automatized letter-speech sound associations.
Affiliation(s)
- Linda Romanovska
- Maastricht Brain Imaging Center, Department Cognitive Neuroscience, Faculty of Psychology and Neuroscience, Maastricht University, Maastricht, Netherlands.
- Roef Janssen
- Maastricht Brain Imaging Center, Department Cognitive Neuroscience, Faculty of Psychology and Neuroscience, Maastricht University, Maastricht, Netherlands
- Milene Bonte
- Maastricht Brain Imaging Center, Department Cognitive Neuroscience, Faculty of Psychology and Neuroscience, Maastricht University, Maastricht, Netherlands
6
Scott M. Interaural recalibration of phonetic categories. The Journal of the Acoustical Society of America 2020; 147:EL164. [PMID: 32113262 DOI: 10.1121/10.0000735] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/05/2019] [Accepted: 01/27/2020] [Indexed: 06/10/2023]
Abstract
Recalibration is a learning process in which perceptual boundaries between speech sounds adjust through exposure to a supplementary source of information. Using a dichotic-listening methodology, the experiments reported here establish interaural recalibration, in which an ambiguous speech sound in one ear is recalibrated on the basis of a clear sound presented to the other ear. This demonstrates a previously unknown form of recalibration and shows that location-specific recalibration occurs even when people are unaware of location differences between the sounds involved.
Affiliation(s)
- Mark Scott
- Department of English Literature and Linguistics, Qatar University, Doha
7
Romanovska L, Janssen R, Bonte M. Reading-Induced Shifts in Speech Perception in Dyslexic and Typically Reading Children. Front Psychol 2019; 10:221. [PMID: 30792685 PMCID: PMC6374624 DOI: 10.3389/fpsyg.2019.00221] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.8] [Received: 07/10/2018] [Accepted: 01/22/2019] [Indexed: 11/13/2022]
Abstract
One of the proposed mechanisms underlying reading difficulties observed in developmental dyslexia is impaired mapping of visual to auditory speech representations. We investigate these mappings in 20 typically reading children and 20 children with dyslexia, all aged 8–10 years, using text-based recalibration. In this paradigm, the pairing of visual text and ambiguous speech sounds shifts (recalibrates) the participant’s perception of the ambiguous speech in subsequent auditory-only post-test trials. Recent research in adults demonstrated this text-induced perceptual shift in typical, but not in dyslexic readers. Our current results instead show significant text-induced recalibration in both typically reading children and children with dyslexia. The strength of this effect was significantly linked to the strength of perceptual adaptation effects in children with dyslexia but not in typically reading children. Furthermore, additional analyses in a sample of typically reading children of various reading levels revealed a significant link between recalibration and phoneme categorization. Taken together, our study highlights the importance of considering dynamic developmental changes in reading, letter-speech sound coupling and speech perception when investigating group differences between typical and dyslexic readers.
Affiliation(s)
- Linda Romanovska
- Maastricht Brain Imaging Center, Department Cognitive Neuroscience, Faculty of Psychology and Neuroscience, Maastricht University, Maastricht, Netherlands
- Roef Janssen
- Maastricht Brain Imaging Center, Department Cognitive Neuroscience, Faculty of Psychology and Neuroscience, Maastricht University, Maastricht, Netherlands
- Milene Bonte
- Maastricht Brain Imaging Center, Department Cognitive Neuroscience, Faculty of Psychology and Neuroscience, Maastricht University, Maastricht, Netherlands
8
Reading-induced shifts of perceptual speech representations in auditory cortex. Sci Rep 2017; 7:5143. [PMID: 28698606 PMCID: PMC5506038 DOI: 10.1038/s41598-017-05356-3] [Citation(s) in RCA: 26] [Impact Index Per Article: 3.7] [Received: 02/23/2017] [Accepted: 05/30/2017] [Indexed: 11/08/2022]
Abstract
Learning to read requires the formation of efficient neural associations between written and spoken language. Whether these associations influence the auditory cortical representation of speech remains unknown. Here we address this question by combining multivariate functional MRI analysis and a newly-developed ‘text-based recalibration’ paradigm. In this paradigm, the pairing of visual text and ambiguous speech sounds shifts (i.e. recalibrates) the perceptual interpretation of the ambiguous sounds in subsequent auditory-only trials. We show that it is possible to retrieve the text-induced perceptual interpretation from fMRI activity patterns in the posterior superior temporal cortex. Furthermore, this auditory cortical region showed significant functional connectivity with the inferior parietal lobe (IPL) during the pairing of text with ambiguous speech. Our findings indicate that reading-related audiovisual mappings can adjust the auditory cortical representation of speech in typically reading adults. Additionally, they suggest the involvement of the IPL in audiovisual and/or higher-order perceptual processes leading to this adjustment. When applied in typical and dyslexic readers of different ages, our text-based recalibration paradigm may reveal relevant aspects of perceptual learning and plasticity during successful and failing reading development.
9
Rosenblum LD, Dorsi J, Dias JW. The Impact and Status of Carol Fowler's Supramodal Theory of Multisensory Speech Perception. Ecological Psychology 2016. [DOI: 10.1080/10407413.2016.1230373] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.9] [Indexed: 12/31/2022]