1
Svantesson M, Olausson H, Eklund A, Thordstein M. Get a New Perspective on EEG: Convolutional Neural Network Encoders for Parametric t-SNE. Brain Sci 2023; 13:brainsci13030453. [PMID: 36979263 PMCID: PMC10046040 DOI: 10.3390/brainsci13030453] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/10/2023] [Revised: 03/03/2023] [Accepted: 03/04/2023] [Indexed: 03/09/2023] Open
Abstract
t-distributed stochastic neighbor embedding (t-SNE) is a method for reducing high-dimensional data to a low-dimensional representation, and is mostly used for visualizing data. In parametric t-SNE, a neural network learns to reproduce this mapping. When used for EEG analysis, the data are usually first transformed into a set of features, but it is not known which features are optimal. The principle of t-SNE was used to train convolutional neural network (CNN) encoders to learn to produce both a high- and a low-dimensional representation, eliminating the need for feature engineering. To evaluate the method, the Temple University EEG Corpus was used to create three datasets with distinct EEG characters: (1) wakefulness and sleep; (2) interictal epileptiform discharges; and (3) seizure activity. The CNN encoders produced low-dimensional representations of the datasets with a structure that conformed well to the EEG characters and generalized to new data. Compared to parametric t-SNE for either a short-time Fourier transform or wavelet representation of the datasets, the developed CNN encoders performed equally well in separating categories, as assessed by support vector machines. The CNN encoders generally produced a higher degree of clustering, both visually and in the number of clusters detected by k-means clustering. The developed principle is promising and could be further developed to create general tools for exploring relations in EEG data.
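The t-SNE principle mentioned in this abstract can be illustrated with a minimal sketch of the standard t-SNE objective: low-dimensional similarities come from a Student-t kernel, and training minimizes the Kullback-Leibler divergence between the high- and low-dimensional similarity distributions. This is not the authors' CNN encoder; the function names and the toy points are assumptions chosen for illustration. In a parametric setting, a network would produce the embedded points.

```python
import math


def student_t_similarities(points):
    """Pairwise similarities q_ij from a Student-t kernel with one degree of freedom."""
    n = len(points)
    weights = [[0.0] * n for _ in range(n)]
    total = 0.0
    for i in range(n):
        for j in range(n):
            if i != j:
                d2 = sum((a - b) ** 2 for a, b in zip(points[i], points[j]))
                weights[i][j] = 1.0 / (1.0 + d2)
                total += weights[i][j]
    # Normalize over all ordered pairs so the matrix sums to 1.
    return [[w / total for w in row] for row in weights]


def kl_divergence(p, q):
    """KL(P || Q) over all pairs with p_ij > 0; this is the t-SNE cost."""
    return sum(
        pij * math.log(pij / qij)
        for prow, qrow in zip(p, q)
        for pij, qij in zip(prow, qrow)
        if pij > 0.0
    )


# Hypothetical 2-D embedding of three EEG segments.
embedding = [(0.0, 0.0), (1.0, 0.0), (5.0, 0.0)]
q = student_t_similarities(embedding)

# A uniform target distribution over the six ordered pairs, for comparison.
p_uniform = [[0.0 if i == j else 1.0 / 6.0 for j in range(3)] for i in range(3)]
cost = kl_divergence(p_uniform, q)
```

The cost is zero only when the two distributions match, so gradient descent on it pulls the embedding toward the neighborhood structure of the target distribution.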
Affiliation(s)
- Mats Svantesson
  - Department of Clinical Neurophysiology, University Hospital of Linköping, 58185 Linköping, Sweden
  - Center for Social and Affective Neuroscience, Linköping University, 58183 Linköping, Sweden
  - Center for Medical Image Science and Visualization, Linköping University, 58183 Linköping, Sweden
  - Department of Biomedical and Clinical Sciences, Linköping University, 58183 Linköping, Sweden
  - Correspondence:
- Håkan Olausson
  - Department of Clinical Neurophysiology, University Hospital of Linköping, 58185 Linköping, Sweden
  - Center for Social and Affective Neuroscience, Linköping University, 58183 Linköping, Sweden
  - Department of Biomedical and Clinical Sciences, Linköping University, 58183 Linköping, Sweden
- Anders Eklund
  - Center for Medical Image Science and Visualization, Linköping University, 58183 Linköping, Sweden
  - Department of Biomedical Engineering, Linköping University, 58183 Linköping, Sweden
  - Division of Statistics & Machine Learning, Department of Computer and Information Science, Linköping University, 58183 Linköping, Sweden
- Magnus Thordstein
  - Department of Clinical Neurophysiology, University Hospital of Linköping, 58185 Linköping, Sweden
  - Center for Medical Image Science and Visualization, Linköping University, 58183 Linköping, Sweden
  - Department of Biomedical and Clinical Sciences, Linköping University, 58183 Linköping, Sweden

2
Moinuddin KA, Havugimana F, Al-Fahad R, Bidelman GM, Yeasin M. Unraveling Spatial-Spectral Dynamics of Speech Categorization Speed Using Convolutional Neural Networks. Brain Sci 2022; 13:brainsci13010075. [PMID: 36672055 PMCID: PMC9856675 DOI: 10.3390/brainsci13010075] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/22/2022] [Revised: 12/22/2022] [Accepted: 12/24/2022] [Indexed: 12/31/2022] Open
Abstract
The process of categorizing sounds into distinct phonetic categories is known as categorical perception (CP). Response times (RTs) provide a measure of perceptual difficulty during labeling decisions (i.e., categorization). The RT is quasi-stochastic in nature due to individuality and variations in perceptual tasks. To identify the source of RT variation in CP, we have built models to decode the brain regions and frequency bands driving fast, medium and slow response decision speeds. In particular, we implemented a parameter-optimized convolutional neural network (CNN) to classify listeners' behavioral RTs from their neural EEG data. We adopted visual interpretation of model response using Guided-GradCAM to identify spatial-spectral correlates of RT. Our framework includes (but is not limited to): (i) a data augmentation technique designed to reduce noise and control the overall variance of the EEG dataset; (ii) bandpower topomaps to learn the spatial-spectral representation using a CNN; (iii) large-scale Bayesian hyper-parameter optimization to find the best-performing CNN model; (iv) ANOVA and posthoc analysis on Guided-GradCAM activation values to measure the effect of neural regions and frequency bands on behavioral responses. Using this framework, we observe that α-β (10-20 Hz) activity over left frontal, right prefrontal/frontal, and right cerebellar regions is correlated with RT variation. Our results indicate that attention, template matching, temporal prediction of acoustics, motor control, and decision uncertainty are the most probable factors in RT variation.
Affiliation(s)
- Felix Havugimana
  - Department of EECE, University of Memphis, Memphis, TN 38152, USA
- Rakib Al-Fahad
  - Department of EECE, University of Memphis, Memphis, TN 38152, USA
- Gavin M. Bidelman
  - Department of Speech, Language and Hearing Sciences, Indiana University, Bloomington, IN 47408, USA
- Mohammed Yeasin
  - Department of EECE, University of Memphis, Memphis, TN 38152, USA

3
Bakas S, Adamos DA, Laskaris N. On the estimate of music appraisal from surface EEG: a dynamic-network approach based on cross-sensor PAC measurements. J Neural Eng 2021; 18. [PMID: 33975291 DOI: 10.1088/1741-2552/abffe6] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 06/26/2020] [Accepted: 05/11/2021] [Indexed: 11/11/2022]
Abstract
Objective. The aesthetic evaluation of music is strongly dependent on the listener and reflects manifold brain processes that go well beyond the perception of incident sound. Being a high-level cognitive reaction, it is difficult to predict merely from the acoustic features of the audio signal and this poses serious challenges to contemporary music recommendation systems. We attempted to decode music appraisal from brain activity, recorded via wearable EEG, during music listening. Approach. To comply with the dynamic nature of music stimuli, cross-frequency coupling measurements were employed in a time-evolving manner to capture the evolving interactions between distinct brain-rhythms during music listening. Brain response to music was first represented as a continuous flow of functional couplings referring to both regional and inter-regional brain dynamics and then modelled as an ensemble of time-varying (sub)networks. Dynamic graph centrality measures were derived, next, as the final feature-engineering step and, lastly, a support-vector machine was trained to decode the subjective music appraisal. A carefully designed experimental paradigm provided the labeled brain signals. Main results. Using data from 20 subjects, dynamic programming to tailor the decoder to each subject individually and cross-validation, we demonstrated highly satisfactory performance (MAE = 0.948, R² = 0.63) that can be attributed, mostly, to interactions of left frontal gamma rhythm. In addition, our music-appraisal decoder was also employed in a part of the DEAP dataset with similar success. Finally, even a generic version of the decoder (common for all subjects) was found to perform sufficiently. Significance. A novel brain signal decoding scheme was introduced and validated empirically on suitable experimental data. It requires simple operations and leaves room for real-time implementation. Both the code and the experimental data are publicly available.
Affiliation(s)
- Stylianos Bakas
  - Department of Informatics, Aristotle University of Thessaloniki, 54124 Thessaloniki, Greece
  - Neuroinformatics GRoup, Aristotle University of Thessaloniki, Thessaloniki, Greece
- Dimitrios A Adamos
  - School of Music Studies, Aristotle University of Thessaloniki, 54124 Thessaloniki, Greece
  - Department of Computing, Imperial College London, SW7 2AZ London, United Kingdom
  - Neuroinformatics GRoup, Aristotle University of Thessaloniki, Thessaloniki, Greece
- Nikolaos Laskaris
  - Department of Informatics, Aristotle University of Thessaloniki, 54124 Thessaloniki, Greece
  - Neuroinformatics GRoup, Aristotle University of Thessaloniki, Thessaloniki, Greece

4
Price CN, Bidelman GM. Attention reinforces human corticofugal system to aid speech perception in noise. Neuroimage 2021; 235:118014. [PMID: 33794356 PMCID: PMC8274701 DOI: 10.1016/j.neuroimage.2021.118014] [Citation(s) in RCA: 23] [Impact Index Per Article: 7.7] [Received: 12/24/2020] [Revised: 03/09/2021] [Accepted: 03/25/2021] [Indexed: 12/13/2022] Open
Abstract
Perceiving speech-in-noise (SIN) demands precise neural coding between brainstem and cortical levels of the hearing system. Attentional processes can then select and prioritize task-relevant cues over competing background noise for successful speech perception. In animal models, brainstem-cortical interplay is achieved via descending corticofugal projections from cortex that shape midbrain responses to behaviorally-relevant sounds. Attentional engagement of corticofugal feedback may assist SIN understanding but has never been confirmed and remains highly controversial in humans. To resolve these issues, we recorded source-level, anatomically constrained brainstem frequency-following responses (FFRs) and cortical event-related potentials (ERPs) to speech via high-density EEG while listeners performed rapid SIN identification tasks. We varied attention with active vs. passive listening scenarios whereas task difficulty was manipulated with additive noise interference. Active listening (but not arousal-control tasks) exaggerated both ERPs and FFRs, confirming attentional gain extends to lower subcortical levels of speech processing. We used functional connectivity to measure the directed strength of coupling between levels and characterize "bottom-up" vs. "top-down" (corticofugal) signaling within the auditory brainstem-cortical pathway. While attention strengthened connectivity bidirectionally, corticofugal transmission disengaged under passive (but not active) SIN listening. Our findings (i) show attention enhances the brain's transcription of speech even prior to cortex and (ii) establish a direct role of the human corticofugal feedback system as an aid to cocktail party speech perception.
Affiliation(s)
- Caitlin N Price
  - Institute for Intelligent Systems, University of Memphis, Memphis, TN, USA
  - School of Communication Sciences and Disorders, University of Memphis, 4055 North Park Loop, Memphis, TN 38152, USA
- Gavin M Bidelman
  - Institute for Intelligent Systems, University of Memphis, Memphis, TN, USA
  - School of Communication Sciences and Disorders, University of Memphis, 4055 North Park Loop, Memphis, TN 38152, USA
  - Department of Anatomy and Neurobiology, University of Tennessee Health Sciences Center, Memphis, TN, USA

5
Mahmud MS, Yeasin M, Bidelman GM. Data-driven machine learning models for decoding speech categorization from evoked brain responses. J Neural Eng 2021; 18. [PMID: 33690177 DOI: 10.1101/2020.08.03.234997] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 08/03/2020] [Accepted: 03/09/2021] [Indexed: 05/24/2023]
Abstract
Objective. Categorical perception (CP) of audio is critical to understand how the human brain perceives speech sounds despite widespread variability in acoustic properties. Here, we investigated the spatiotemporal characteristics of auditory neural activity that reflects CP for speech (i.e. differentiates phonetic prototypes from ambiguous speech sounds). Approach. We recorded 64-channel electroencephalograms as listeners rapidly classified vowel sounds along an acoustic-phonetic continuum. We used support vector machine classifiers and stability selection to determine when and where in the brain CP was best decoded across space and time via source-level analysis of the event-related potentials. Main results. We found that early (120 ms) whole-brain data decoded speech categories (i.e. prototypical vs. ambiguous tokens) with 95.16% accuracy (area under the curve 95.14%; F1-score 95.00%). Separate analyses on left hemisphere (LH) and right hemisphere (RH) responses showed that LH decoding was more accurate and earlier than RH (89.03% vs. 86.45% accuracy; 140 ms vs. 200 ms). Stability (feature) selection identified 13 regions of interest (ROIs) out of 68 brain regions [including auditory cortex, supramarginal gyrus, and inferior frontal gyrus (IFG)] that showed categorical representation during stimulus encoding (0-260 ms). In contrast, 15 ROIs (including fronto-parietal regions, IFG, motor cortex) were necessary to describe later decision stages (later 300-800 ms) of categorization but these areas were highly associated with the strength of listeners' categorical hearing (i.e. slope of behavioral identification functions). Significance. Our data-driven multivariate models demonstrate that abstract categories emerge surprisingly early (∼120 ms) in the time course of speech processing and are dominated by engagement of a relatively compact fronto-temporal-parietal brain network.
Affiliation(s)
- Md Sultan Mahmud
  - Department of Electrical and Computer Engineering, University of Memphis, 3815 Central Avenue, Memphis, TN 38152, United States of America
  - Institute for Intelligent Systems, University of Memphis, Memphis, TN, United States of America
- Mohammed Yeasin
  - Department of Electrical and Computer Engineering, University of Memphis, 3815 Central Avenue, Memphis, TN 38152, United States of America
  - Institute for Intelligent Systems, University of Memphis, Memphis, TN, United States of America
- Gavin M Bidelman
  - Institute for Intelligent Systems, University of Memphis, Memphis, TN, United States of America
  - School of Communication Sciences and Disorders, University of Memphis, Memphis, TN, United States of America
  - University of Tennessee Health Sciences Center, Department of Anatomy and Neurobiology, Memphis, TN, United States of America

6
Mahmud MS, Yeasin M, Bidelman GM. Data-driven machine learning models for decoding speech categorization from evoked brain responses. J Neural Eng 2021; 18:10.1088/1741-2552/abecf0. [PMID: 33690177 PMCID: PMC8738965 DOI: 10.1088/1741-2552/abecf0] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 08/03/2020] [Accepted: 03/09/2021] [Indexed: 11/12/2022]
Abstract
Objective. Categorical perception (CP) of audio is critical to understand how the human brain perceives speech sounds despite widespread variability in acoustic properties. Here, we investigated the spatiotemporal characteristics of auditory neural activity that reflects CP for speech (i.e. differentiates phonetic prototypes from ambiguous speech sounds). Approach. We recorded 64-channel electroencephalograms as listeners rapidly classified vowel sounds along an acoustic-phonetic continuum. We used support vector machine classifiers and stability selection to determine when and where in the brain CP was best decoded across space and time via source-level analysis of the event-related potentials. Main results. We found that early (120 ms) whole-brain data decoded speech categories (i.e. prototypical vs. ambiguous tokens) with 95.16% accuracy (area under the curve 95.14%; F1-score 95.00%). Separate analyses on left hemisphere (LH) and right hemisphere (RH) responses showed that LH decoding was more accurate and earlier than RH (89.03% vs. 86.45% accuracy; 140 ms vs. 200 ms). Stability (feature) selection identified 13 regions of interest (ROIs) out of 68 brain regions [including auditory cortex, supramarginal gyrus, and inferior frontal gyrus (IFG)] that showed categorical representation during stimulus encoding (0-260 ms). In contrast, 15 ROIs (including fronto-parietal regions, IFG, motor cortex) were necessary to describe later decision stages (later 300-800 ms) of categorization but these areas were highly associated with the strength of listeners' categorical hearing (i.e. slope of behavioral identification functions). Significance. Our data-driven multivariate models demonstrate that abstract categories emerge surprisingly early (∼120 ms) in the time course of speech processing and are dominated by engagement of a relatively compact fronto-temporal-parietal brain network.
Affiliation(s)
- Md Sultan Mahmud
  - Department of Electrical and Computer Engineering, University of Memphis, 3815 Central Avenue, Memphis, TN 38152, United States of America
  - Institute for Intelligent Systems, University of Memphis, Memphis, TN, United States of America
- Mohammed Yeasin
  - Department of Electrical and Computer Engineering, University of Memphis, 3815 Central Avenue, Memphis, TN 38152, United States of America
  - Institute for Intelligent Systems, University of Memphis, Memphis, TN, United States of America
- Gavin M Bidelman
  - Institute for Intelligent Systems, University of Memphis, Memphis, TN, United States of America
  - School of Communication Sciences and Disorders, University of Memphis, Memphis, TN, United States of America
  - University of Tennessee Health Sciences Center, Department of Anatomy and Neurobiology, Memphis, TN, United States of America

7
Mahmud MS, Yeasin M, Bidelman GM. Speech categorization is better described by induced rather than evoked neural activity. J Acoust Soc Am 2021; 149:1644. [PMID: 33765780 PMCID: PMC8267855 DOI: 10.1121/10.0003572] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Indexed: 05/08/2023]
Abstract
Categorical perception (CP) describes how the human brain categorizes speech despite inherent acoustic variability. We examined neural correlates of CP in both evoked and induced electroencephalogram (EEG) activity to evaluate which mode best describes the process of speech categorization. Listeners labeled sounds from a vowel gradient while we recorded their EEGs. Using a source reconstructed EEG, we used band-specific evoked and induced neural activity to build parameter optimized support vector machine models to assess how well listeners' speech categorization could be decoded via whole-brain and hemisphere-specific responses. We found whole-brain evoked β-band activity decoded prototypical from ambiguous speech sounds with ∼70% accuracy. However, induced γ-band oscillations showed better decoding of speech categories with ∼95% accuracy compared to evoked β-band activity (∼70% accuracy). Induced high frequency (γ-band) oscillations dominated CP decoding in the left hemisphere, whereas lower frequencies (θ-band) dominated the decoding in the right hemisphere. Moreover, feature selection identified 14 brain regions carrying induced activity and 22 regions of evoked activity that were most salient in describing category-level speech representations. Among the areas and neural regimes explored, induced γ-band modulations were most strongly associated with listeners' behavioral CP. The data suggest that the category-level organization of speech is dominated by relatively high frequency induced brain rhythms.
Affiliation(s)
- Md Sultan Mahmud
  - Department of Electrical and Computer Engineering, University of Memphis, 3815 Central Avenue, Memphis, Tennessee 38152, USA
- Mohammed Yeasin
  - Department of Electrical and Computer Engineering, University of Memphis, 3815 Central Avenue, Memphis, Tennessee 38152, USA
- Gavin M Bidelman
  - School of Communication Sciences and Disorders, University of Memphis, 4055 North Park Loop, Memphis, Tennessee 38152, USA

8
Syrjälä J, Basti A, Guidotti R, Marzetti L, Pizzella V. Decoding working memory task condition using magnetoencephalography source level long-range phase coupling patterns. J Neural Eng 2021; 18:016027. [PMID: 33624612 DOI: 10.1088/1741-2552/abcefe] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Indexed: 11/11/2022]
Abstract
OBJECTIVE The objective of the study is to identify phase coupling patterns that are shared across subjects via a machine learning approach that utilises source space magnetoencephalography (MEG) phase coupling data from a working memory (WM) task. Indeed, phase coupling of neural oscillations is putatively a key factor for communication between distant brain areas and is therefore crucial in performing cognitive tasks, including WM. Previous studies investigating phase coupling during cognitive tasks have often focused on a few a priori selected brain areas or a specific frequency band, and the need for data-driven approaches has been recognised. Machine learning techniques have emerged as valuable tools for the analysis of neuroimaging data since they catch fine-grained differences in the multivariate signal distribution. Here, we expect that these techniques applied to MEG phase couplings can reveal WM-related processes that are shared across individuals. APPROACH We analysed WM data collected as part of the Human Connectome Project. The MEG data were collected while subjects (n = 83) performed N-back WM tasks in two different conditions, namely 2-back (WM condition) and 0-back (control condition). We estimated phase coupling patterns (multivariate phase slope index) for both conditions and for theta, alpha, beta, and gamma bands. The obtained phase coupling data were then used to train a linear support vector machine in order to classify which task condition the subject was performing with an across-subject cross-validation approach. The classification was performed separately based on the data from individual frequency bands and with all bands combined (multiband). Finally, we evaluated the relative importance of the different features (phase couplings) for classification by the means of feature selection probability. 
MAIN RESULTS The WM condition and control condition were successfully classified based on the phase coupling patterns in the theta (62% accuracy) and alpha bands (60% accuracy) separately. Importantly, the multiband classification showed that phase coupling patterns not only in the theta and alpha but also in the gamma bands are related to WM processing, as testified by improvement in classification performance (71%). SIGNIFICANCE Our study successfully decoded WM tasks using MEG source space functional connectivity. Our approach, combining across-subject classification and a multidimensional metric recently developed by our group, is able to detect patterns of connectivity that are shared across individuals. In other words, the results are generalisable to new individuals and allow meaningful interpretation of task-relevant phase coupling patterns.
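The across-subject cross-validation mentioned in this abstract can be sketched as follows. The abstract does not state the exact fold scheme, so leave-one-subject-out is an assumption here, and the function name and subject labels are hypothetical. The point of the sketch is only that all samples from a held-out subject stay out of the training set, which is what lets the accuracy speak to new individuals.

```python
def leave_one_subject_out(subject_ids):
    """Yield (held_out_subject, train_indices, test_indices) for each subject.

    Assumed scheme: every sample of the held-out subject goes to the test
    set, so no subject contributes to both training and testing in a fold.
    """
    for held_out in sorted(set(subject_ids)):
        train = [i for i, s in enumerate(subject_ids) if s != held_out]
        test = [i for i, s in enumerate(subject_ids) if s == held_out]
        yield held_out, train, test


# Hypothetical layout: each subject contributes one 2-back and one 0-back sample.
subject_ids = ["s1", "s1", "s2", "s2", "s3", "s3"]
folds = list(leave_one_subject_out(subject_ids))

for held_out, train, test in folds:
    print(held_out, "train:", train, "test:", test)
```

A classifier would be fitted on the `train` indices and scored on the `test` indices of each fold, with the reported accuracy taken as the average across folds.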
Affiliation(s)
- Jaakko Syrjälä
  - Department of Neuroscience, Imaging and Clinical Sciences, 'Gabriele d'Annunzio' University of Chieti-Pescara, Chieti 66013, Italy

9
Carter JA, Bidelman GM. Auditory cortex is susceptible to lexical influence as revealed by informational vs. energetic masking of speech categorization. Brain Res 2021; 1759:147385. [PMID: 33631210 DOI: 10.1016/j.brainres.2021.147385] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 10/20/2020] [Revised: 02/15/2021] [Accepted: 02/16/2021] [Indexed: 02/02/2023]
Abstract
Speech perception requires the grouping of acoustic information into meaningful phonetic units via the process of categorical perception (CP). Environmental masking influences speech perception and CP. However, it remains unclear at which stage of processing (encoding, decision, or both) masking affects listeners' categorization of speech signals. The purpose of this study was to determine whether linguistic interference influences the early acoustic-phonetic conversion process inherent to CP. To this end, we measured source level, event related brain potentials (ERPs) from auditory cortex (AC) and inferior frontal gyrus (IFG) as listeners rapidly categorized speech sounds along a /da/ to /ga/ continuum presented in three listening conditions: quiet, and in the presence of forward (informational masker) and time-reversed (energetic masker) 2-talker babble noise. Maskers were matched in overall SNR and spectral content and thus varied only in their degree of linguistic interference (i.e., informational masking). We hypothesized a differential effect of informational versus energetic masking on behavioral and neural categorization responses, where we predicted increased activation of frontal regions when disambiguating speech from noise, especially during lexical-informational maskers. We found (1) informational masking weakens behavioral speech phoneme identification above and beyond energetic masking; (2) low-level AC activity not only codes speech categories but is susceptible to higher-order lexical interference; (3) identifying speech amidst noise recruits a cross hemispheric circuit (ACleft → IFGright) whose engagement varies according to task difficulty. These findings provide corroborating evidence for top-down influences on the early acoustic-phonetic analysis of speech through a coordinated interplay between frontotemporal brain areas.
Affiliation(s)
- Jared A Carter
  - Institute for Intelligent Systems, University of Memphis, Memphis, TN, USA
  - School of Communication Sciences & Disorders, University of Memphis, Memphis, TN, USA
- Gavin M Bidelman
  - Institute for Intelligent Systems, University of Memphis, Memphis, TN, USA
  - School of Communication Sciences & Disorders, University of Memphis, Memphis, TN, USA
  - University of Tennessee Health Sciences Center, Department of Anatomy and Neurobiology, Memphis, TN, USA

10
Sadat-Nejad Y, Beheshti S. Efficient high resolution sLORETA in brain source localization. J Neural Eng 2021; 18. [DOI: 10.1088/1741-2552/abcc48] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 07/22/2020] [Accepted: 11/19/2020] [Indexed: 11/12/2022]
Abstract
Objective. Estimation of the source location within the brain from electroencephalography (EEG) and magnetoencephalography measures is a challenging task. Among the existing techniques in the field, which are known as brain imaging methods, standardized low-resolution brain electromagnetic tomography (sLORETA) is the most popular method due to its simplicity and high accuracy. However, in this work we illustrate that sLORETA is still noisy and the additive noise is causing the blurry image. The existing pre-fixed/manual thresholding process after sLORETA can partially take care of denoising. However, this ad-hoc thresholding can either remove too much of the desired data or leave much of the noise in the process. Manual correction to avoid such extreme cases can be time-consuming. The objective of this paper is to automate the denoising process in the form of adaptive thresholding. Approach. The proposed method, denoted by efficient high-resolution sLORETA (EHR-sLORETA), is based on minimizing the error between the desired denoised source and the source estimates. Main results. The approach is evaluated using synthetic EEG and real EEG data. Spatial dispersion (SD) and mean square error (MSE) are used as metrics to provide the quantitative performance of the method. In addition, qualitative analysis of the method is provided for real EEG data. The proposed model demonstrates advantages over the existing methods in the sense of accuracy and robustness in the SD and MSE comparison. Significance. EHR-sLORETA could have a significant impact on clinical studies with a source estimation task, as it improves the accuracy of source estimation and eliminates the need for manual thresholding.

11
Bidelman GM, Pearson C, Harrison A. Lexical Influences on Categorical Speech Perception Are Driven by a Temporoparietal Circuit. J Cogn Neurosci 2021; 33:840-852. [PMID: 33464162 DOI: 10.1162/jocn_a_01678] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Indexed: 11/04/2022]
Abstract
Categorical judgments of otherwise identical phonemes are biased toward hearing words (i.e., "Ganong effect") suggesting lexical context influences perception of even basic speech primitives. Lexical biasing could manifest via late stage postperceptual mechanisms related to decision or, alternatively, top-down linguistic inference that acts on early perceptual coding. Here, we exploited the temporal sensitivity of EEG to resolve the spatiotemporal dynamics of these context-related influences on speech categorization. Listeners rapidly classified sounds from a /gɪ/-/kɪ/ gradient presented in opposing word-nonword contexts (GIFT-kift vs. giss-KISS), designed to bias perception toward lexical items. Phonetic perception shifted toward the direction of words, establishing a robust Ganong effect behaviorally. ERPs revealed a neural analog of lexical biasing emerging within ~200 msec. Source analyses uncovered a distributed neural network supporting the Ganong including middle temporal gyrus, inferior parietal lobe, and middle frontal cortex. Yet, among Ganong-sensitive regions, only left middle temporal gyrus and inferior parietal lobe predicted behavioral susceptibility to lexical influence. Our findings confirm lexical status rapidly constrains sublexical categorical representations for speech within several hundred milliseconds but likely does so outside the purview of canonical auditory-sensory brain areas.
Affiliation(s)
- Gavin M Bidelman
  - University of Memphis, TN
  - University of Tennessee Health Sciences Center, Memphis, TN

12
Mahmud MS, Ahmed F, Al-Fahad R, Moinuddin KA, Yeasin M, Alain C, Bidelman GM. Decoding Hearing-Related Changes in Older Adults' Spatiotemporal Neural Processing of Speech Using Machine Learning. Front Neurosci 2020; 14:748. [PMID: 32765215 PMCID: PMC7378401 DOI: 10.3389/fnins.2020.00748] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.0] [Received: 12/18/2019] [Accepted: 06/25/2020] [Indexed: 12/25/2022] Open
Abstract
Speech perception in noisy environments depends on complex interactions between sensory and cognitive systems. In older adults, such interactions may be affected, especially in those individuals who have more severe age-related hearing loss. Using a data-driven approach, we assessed the temporal (when in time) and spatial (where in the brain) characteristics of cortical speech-evoked responses that distinguish older adults with or without mild hearing loss. We performed source analyses to estimate cortical surface signals from the EEG recordings during a phoneme discrimination task conducted under clear and noise-degraded conditions. We computed source-level ERPs (i.e., mean activation within each ROI) from each of the 68 ROIs of the Desikan-Killiany (DK) atlas, averaged over 100 trials randomly chosen without replacement to form feature vectors. We adopted a multivariate feature selection method called stability selection and control to choose features that are consistent over a range of model parameters. We used a parameter-optimized support vector machine (SVM) as the classifier to investigate the time course and brain regions that segregate groups and speech clarity. For clear speech perception, whole-brain data revealed a classification accuracy of 81.50% [area under the curve (AUC) 80.73%; F1-score 82.00%], distinguishing groups within ∼60 ms after speech onset (i.e., as early as the P1 wave). We observed lower accuracy of 78.12% [AUC 77.64%; F1-score 78.00%] and delayed classification performance when speech was embedded in noise, with group segregation at 80 ms. Separate analysis using left (LH) and right hemisphere (RH) regions showed that LH speech activity was better at distinguishing hearing groups than activity measured in the RH.
Moreover, stability selection analysis identified 12 brain regions (among 1428 total spatiotemporal features from 68 regions) where source activity segregated groups with >80% accuracy (clear speech); whereas 16 regions were critical for noise-degraded speech to achieve a comparable level of group segregation (78.7% accuracy). Our results identify critical time-courses and brain regions that distinguish mild hearing loss from normal hearing in older adults and confirm a larger number of active areas, particularly in RH, when processing noise-degraded speech information.
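The feature-vector step described in this abstract (source-level ERPs averaged over 100 randomly chosen trials without replacement) can be illustrated with a minimal sketch. This is not the authors' code: the function name `subaverage_trials`, the choice of disjoint groups, and the toy data are assumptions made for illustration. The toy shape uses 68 ROIs × 21 time samples only because 68 × 21 = 1428 matches the reported feature count; that factorization is an inference, not something the abstract states.

```python
import random


def subaverage_trials(trials, n_avg=100, seed=0):
    """Average disjoint random groups of n_avg trials into feature vectors.

    trials: list of single-trial vectors (one flat list of floats per trial).
    Assumption: "without replacement" is read as each trial being used at most once.
    """
    rng = random.Random(seed)
    order = list(range(len(trials)))
    rng.shuffle(order)  # random order, so each chunk is a random draw without replacement
    n_features = len(trials[0])
    vectors = []
    # Leftover trials that do not fill a complete group are discarded.
    for start in range(0, len(order) - n_avg + 1, n_avg):
        group = order[start:start + n_avg]
        mean = [sum(trials[i][j] for i in group) / n_avg for j in range(n_features)]
        vectors.append(mean)
    return vectors


# Hypothetical toy data: 250 noise-only trials, 68 ROIs x 21 time samples each.
toy_rng = random.Random(1)
n_rois, n_samples = 68, 21
toy_trials = [
    [toy_rng.gauss(0.0, 1.0) for _ in range(n_rois * n_samples)]
    for _ in range(250)
]
features = subaverage_trials(toy_trials, n_avg=100)
print(len(features), len(features[0]))
```

Averaging across trials before classification trades the number of training examples for signal-to-noise ratio; the group size of 100 is taken from the abstract, while everything else here is a placeholder.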
Affiliation(s)
- Md Sultan Mahmud
- Department of Electrical and Computer Engineering, The University of Memphis, Memphis, TN, United States
- Faruk Ahmed
- Department of Electrical and Computer Engineering, The University of Memphis, Memphis, TN, United States
- Rakib Al-Fahad
- Department of Electrical and Computer Engineering, The University of Memphis, Memphis, TN, United States
- Kazi Ashraf Moinuddin
- Department of Electrical and Computer Engineering, The University of Memphis, Memphis, TN, United States
- Mohammed Yeasin
- Department of Electrical and Computer Engineering, The University of Memphis, Memphis, TN, United States
- Claude Alain
- Rotman Research Institute-Baycrest Centre for Geriatric Care, Toronto, ON, Canada; Department of Psychology, University of Toronto, Toronto, ON, Canada; Institute of Medical Sciences, University of Toronto, Toronto, ON, Canada
- Gavin M Bidelman
- Institute for Intelligent Systems, University of Memphis, Memphis, TN, United States; School of Communication Sciences and Disorders, University of Memphis, Memphis, TN, United States; Department of Anatomy and Neurobiology, University of Tennessee Health Science Center, Memphis, TN, United States
13
Al-Fahad R, Yeasin M, Glass JO, Conklin HM, Jacola LM, Reddick WE. Early Imaging Based Predictive Modeling of Cognitive Performance Following Therapy for Childhood ALL. IEEE Access: Practical Innovations, Open Solutions 2019; 7:146662-146674. [PMID: 32547892 PMCID: PMC7297193 DOI: 10.1109/access.2019.2946240] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.6] [Indexed: 06/11/2023]
Abstract
In the United States, Acute Lymphoblastic Leukemia (ALL), the most common child and adolescent malignancy, accounts for roughly 25% of childhood cancers diagnosed annually, with a 5-year survival rate as high as 94% [1]. This improved survival rate comes with an increased risk for delayed neurocognitive effects in attention, working memory, and processing speed [2]. Predictive modeling and characterization of neurocognitive effects are critical to inform the family and also to identify patients for targeted interventions. Current state-of-the-art methods mainly use hypothesis-driven statistical testing to characterize and model such cognitive events. While these techniques have proven useful in understanding cognitive abilities, they are inadequate in explaining causal relationships, as well as individuality and variations. In this study, we developed multivariate data-driven models to measure the late neurocognitive effects in ALL patients using behavioral phenotypes, Diffusion Tensor Magnetic Resonance Imaging (DTI)-based tractography data, morphometry statistics, tractography measures, and behavioral and demographic variables. Alongside conventional machine learning and graph mining, we adopted "Stability Selection" to select the most relevant features and choose models that are consistent over a range of parameters. The proposed approach demonstrated substantially improved accuracy (13%–26%) over existing models and also yielded relevant features that were verified by domain experts.
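The general idea behind stability selection mentioned in this abstract is to rerun a feature selector on many random subsamples and keep only the features chosen in a large fraction of runs. The sketch below is a simplified illustration under stated assumptions, not the authors' pipeline: the base selector is a plain top-k ranking by absolute Pearson correlation (swapped in here for the penalized models usually paired with stability selection), and the names `stability_selection`, `top_k`, `n_rounds`, and `threshold`, as well as the toy data, are hypothetical.

```python
import random


def abs_corr(x, y):
    """Absolute Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return abs(sxy) / (sxx * syy) ** 0.5


def stability_selection(X, y, top_k=2, n_rounds=100, threshold=0.8, seed=0):
    """Return (stable feature indices, per-feature selection frequencies)."""
    rng = random.Random(seed)
    n, p = len(X), len(X[0])
    counts = [0] * p
    for _ in range(n_rounds):
        idx = rng.sample(range(n), n // 2)  # half-size subsample, no replacement
        ys = [y[i] for i in idx]
        scores = [abs_corr([X[i][j] for i in idx], ys) for j in range(p)]
        # Base selector (assumption): keep the top_k highest-scoring features.
        chosen = sorted(range(p), key=lambda j: scores[j], reverse=True)[:top_k]
        for j in chosen:
            counts[j] += 1
    freqs = [c / n_rounds for c in counts]
    stable = [j for j, f in enumerate(freqs) if f >= threshold]
    return stable, freqs


# Hypothetical toy data: 80 samples, 6 features; only features 0 and 3 carry signal.
data_rng = random.Random(42)
y = [1 if i % 2 == 0 else -1 for i in range(80)]
X = [[data_rng.gauss(0.0, 1.0) for _ in range(6)] for _ in range(80)]
for i in range(80):
    X[i][0] = y[i] + data_rng.gauss(0.0, 0.5)
    X[i][3] = -y[i] + data_rng.gauss(0.0, 0.5)

selected, freqs = stability_selection(X, y)
print(selected)
```

Thresholding on selection frequency rather than on a single fit is what makes the chosen features consistent across resampled data, which is the property the abstract highlights.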
Affiliation(s)
- John O. Glass
- St. Jude Children’s Research Hospital, Memphis, Tennessee, USA
- Lisa M. Jacola
- St. Jude Children’s Research Hospital, Memphis, Tennessee, USA