1
Sorensen E, Oleson J, Kutlu E, McMurray B. A Bayesian hierarchical model for the analysis of visual analogue scaling tasks. Stat Methods Med Res 2024; 33:953-965. PMID: 38573790. DOI: 10.1177/09622802241242319. Citations in RCA: 0.
Abstract
In psychophysics and psychometrics, a method integral to the discipline is charting how a person's response pattern changes across a continuum of stimuli. For instance, in hearing science, Visual Analog Scaling tasks are experiments in which listeners hear sounds across a speech continuum and give a numeric rating between 0 and 100 conveying whether the sound they heard was more like word "a" or more like word "b" (i.e., each participant gives a continuous categorization response). By taking all the continuous categorization responses across the speech continuum, a parametric curve model can be fit to the data and used to analyze any individual's response pattern on each speech continuum. Standard statistical modeling techniques cannot accommodate all of the specific requirements needed to analyze these data. Thus, Bayesian hierarchical modeling techniques are employed to accommodate group-level non-linear curves, individual-specific non-linear curves, continuum-level random effects, and a subject-specific variance that is predicted by other model parameters. In this paper, a Bayesian hierarchical model is constructed to model the data from a Visual Analog Scaling task study of monolingual and bilingual participants. Any nonlinear curve function could be used; the technique is demonstrated here using the four-parameter logistic function. Overall, the model was found to fit the study data particularly well, and results suggested that the magnitude of the slope was what most defined the differences in response patterns between continua.
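The four-parameter logistic function named in this abstract is easy to sketch outside the full Bayesian hierarchy. Below is a minimal, hypothetical illustration (not the authors' model): a single listener's simulated VAS ratings across a 7-step continuum are fit by ordinary least squares with `scipy.optimize.curve_fit`; all data and parameter values are invented for the example.

```python
import numpy as np
from scipy.optimize import curve_fit

def four_pl(x, lower, upper, x50, slope):
    """Four-parameter logistic: ratings rise from `lower` to `upper`,
    crossing the midpoint at continuum step `x50` with steepness `slope`."""
    return lower + (upper - lower) / (1.0 + np.exp(-slope * (x - x50)))

# Simulated VAS ratings (0-100) for one listener: 20 trials per step
# of a 7-step speech continuum (hypothetical true parameters below).
rng = np.random.default_rng(0)
steps = np.repeat(np.arange(1, 8), 20).astype(float)
truth = four_pl(steps, 5.0, 95.0, 4.0, 1.5)
ratings = np.clip(truth + rng.normal(0.0, 8.0, steps.size), 0.0, 100.0)

# Least-squares fit of the four parameters from the noisy ratings.
params, _ = curve_fit(four_pl, steps, ratings, p0=[0.0, 100.0, 4.0, 1.0])
lower, upper, x50, slope = params
```

In the paper's hierarchical setting, each of the four parameters would instead receive group- and subject-level priors rather than a single point estimate per listener.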
Affiliation(s)
- Eldon Sorensen
- Department of Biostatistics, University of Iowa, Iowa City, IA, USA
- Jacob Oleson
- Department of Biostatistics, University of Iowa, Iowa City, IA, USA
- Ethan Kutlu
- Department of Psychological and Brain Sciences, University of Iowa, Iowa City, IA, USA
- Department of Linguistics, University of Iowa, Iowa City, IA, USA
- Bob McMurray
- Department of Psychological and Brain Sciences, University of Iowa, Iowa City, IA, USA
- Department of Linguistics, University of Iowa, Iowa City, IA, USA
2
Rizzi R, Bidelman GM. Functional benefits of continuous vs. categorical listening strategies on the neural encoding and perception of noise-degraded speech. bioRxiv [Preprint] 2024:2024.05.15.594387. PMID: 38798410. PMCID: PMC11118460. DOI: 10.1101/2024.05.15.594387. Citations in RCA: 0.
Abstract
Acoustic information in speech changes continuously, yet listeners form discrete perceptual categories to ease the demands of perception. Being a more continuous/gradient as opposed to a discrete/categorical listener may be further advantageous for understanding speech in noise by increasing perceptual flexibility and resolving ambiguity. The degree to which a listener's responses to a continuum of speech sounds are categorical versus continuous can be quantified using visual analog scaling (VAS) during speech labeling tasks. Here, we recorded event-related brain potentials (ERPs) to vowels along an acoustic-phonetic continuum (/u/ to /a/) while listeners categorized phonemes in both clean and noise conditions. Behavior was assessed using standard two-alternative forced choice (2AFC) and VAS paradigms to evaluate categorization under task structures that promote discrete (2AFC) vs. continuous (VAS) hearing, respectively. Behaviorally, identification curves were steeper under 2AFC vs. VAS categorization but were relatively immune to noise, suggesting robust access to abstract phonetic categories even under signal degradation. Behavioral slopes were positively correlated with listeners' QuickSIN scores, suggesting a behavioral advantage for speech-in-noise comprehension conferred by a gradient listening strategy. At the neural level, electrode-level data revealed that P2 peak amplitudes of the ERPs were modulated by task and noise; responses were larger under VAS vs. 2AFC categorization and showed a larger noise-related latency delay in the VAS vs. 2AFC condition. More gradient responders also had smaller shifts in ERP latency with noise, suggesting their neural encoding of speech was more resilient to degradation. Interestingly, source-resolved ERPs showed that more gradient listening was also correlated with stronger neural responses in the left superior temporal gyrus. Our results demonstrate that listening strategy (i.e., being a discrete vs. continuous listener) modulates the categorical organization of speech and behavioral success, with continuous/gradient listening being more advantageous to speech-in-noise perception.
3
Elmer S, Kurthen I, Meyer M, Giroud N. A multidimensional characterization of the neurocognitive architecture underlying age-related temporal speech processing. Neuroimage 2023; 278:120285. PMID: 37481009. DOI: 10.1016/j.neuroimage.2023.120285. Citations in RCA: 0.
Abstract
Healthy aging is often associated with speech comprehension difficulties in everyday life situations despite pure-tone hearing thresholds in the normative range. Drawing on this background, we used a multidimensional approach to assess the functional and structural neural correlates underlying age-related temporal speech processing while controlling for pure-tone hearing acuity. Accordingly, we combined structural magnetic resonance imaging and electroencephalography, and collected behavioral data while younger and older adults completed a phonetic categorization and discrimination task with consonant-vowel syllables varying along a voice-onset time continuum. The behavioral results confirmed age-related temporal speech processing singularities, which were reflected in a shift of the boundary of the psychometric categorization function, with older adults perceiving more syllables characterized by a short voice-onset time as /ta/ compared to younger adults. Furthermore, despite the absence of any between-group differences in phonetic discrimination abilities, older adults demonstrated longer N100/P200 latencies as well as increased P200 amplitudes while processing the consonant-vowel syllables varying in voice-onset time. Finally, older adults also exhibited a divergent anatomical gray matter infrastructure in bilateral auditory-related and frontal brain regions, as manifested in reduced cortical thickness and surface area. Notably, in the younger adults but not in the older adult cohort, cortical surface area in these two gross anatomical clusters correlated with the categorization of consonant-vowel syllables characterized by a short voice-onset time, suggesting the existence of a critical gray matter threshold that is crucial for consistent mapping of phonetic categories varying along the temporal dimension. Taken together, our results highlight the multifaceted dimensions of age-related temporal speech processing characteristics, and pave the way toward a better understanding of the relationships between hearing, speech, and the brain in older age.
Affiliation(s)
- Stefan Elmer
- Department of Computational Linguistics, Computational Neuroscience of Speech & Hearing, University of Zurich, Zurich, Switzerland; Competence center Language & Medicine, University of Zurich, Switzerland.
- Ira Kurthen
- Department of Computational Linguistics, Computational Neuroscience of Speech & Hearing, University of Zurich, Zurich, Switzerland
- Martin Meyer
- Department of Comparative Language Science, University of Zurich, Zurich, Switzerland; Center for Neuroscience Zurich, University and ETH of Zurich, Zurich, Switzerland; Center for the Interdisciplinary Study of Language Evolution (ISLE), University of Zurich, Zurich, Switzerland; Cognitive Psychology Unit, Alpen-Adria University, Klagenfurt, Austria
- Nathalie Giroud
- Department of Computational Linguistics, Computational Neuroscience of Speech & Hearing, University of Zurich, Zurich, Switzerland; Center for Neuroscience Zurich, University and ETH of Zurich, Zurich, Switzerland; Competence center Language & Medicine, University of Zurich, Switzerland
4
Apfelbaum KS, Kutlu E, McMurray B, Kapnoula EC. Don't force it! Gradient speech categorization calls for continuous categorization tasks. J Acoust Soc Am 2022; 152:3728. PMID: 36586841. PMCID: PMC9894657. DOI: 10.1121/10.0015201. Citations in RCA: 1.
Abstract
Research on speech categorization and phoneme recognition has relied heavily on tasks in which participants listen to stimuli from a speech continuum and are asked to either classify each stimulus (identification) or discriminate between them (discrimination). Such tasks rest on assumptions about how perception maps onto discrete responses that have not been thoroughly investigated. Here, we identify critical challenges in the link between these tasks and theories of speech categorization. In particular, we show that patterns that have traditionally been linked to categorical perception could arise despite continuous underlying perception and that patterns that run counter to categorical perception could arise despite underlying categorical perception. We describe an alternative measure of speech perception using a visual analog scale that better differentiates between processes at play in speech categorization, and we review some recent findings that show how this task can be used to better inform our theories.
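As a toy illustration of why VAS responses carry information that forced-choice responses discard: a 0-100 rating scale can distinguish a listener who clusters at the endpoints from one who uses the middle of the scale, whereas a binary response collapses both patterns. The index below is hypothetical (not a measure from the paper); it simply scales the mean distance of each rating from its nearer endpoint.

```python
import numpy as np

def gradiency_index(ratings):
    """Toy gradiency index (hypothetical, for illustration only): mean
    distance of each 0-100 VAS rating from its nearer endpoint, scaled
    to [0, 1]. Values near 0 indicate endpoint-only (categorical-like)
    responding; larger values indicate more intermediate (gradient)
    responding."""
    r = np.asarray(ratings, dtype=float)
    return float(np.mean(np.minimum(r, 100.0 - r)) / 50.0)

# Two invented response patterns that a 2AFC task would treat identically:
categorical = [2, 97, 1, 99, 3, 98]   # ratings cluster at the endpoints
gradient = [20, 45, 60, 35, 70, 50]   # ratings spread across the scale
```

A binary classifier applied to both lists (e.g., rating > 50) would yield similar identification curves, while the VAS-based index separates the two response styles.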
Affiliation(s)
- Keith S Apfelbaum
- Department of Psychological and Brain Sciences, G60 Psychological and Brain Sciences Building, University of Iowa, Iowa City, Iowa 52242-1407, USA
- Ethan Kutlu
- Department of Psychological and Brain Sciences, G60 Psychological and Brain Sciences Building, University of Iowa, Iowa City, Iowa 52242-1407, USA
- Bob McMurray
- Department of Psychological and Brain Sciences, G60 Psychological and Brain Sciences Building, University of Iowa, Iowa City, Iowa 52242-1407, USA
- Efthymia C Kapnoula
- BCBL, Basque Center on Cognition, Brain and Language, Mikeletegi 69, 20009 Donostia, Spain
5
Mankel K, Shrestha U, Tipirneni-Sajja A, Bidelman GM. Functional Plasticity Coupled With Structural Predispositions in Auditory Cortex Shape Successful Music Category Learning. Front Neurosci 2022; 16:897239. PMID: 35837119. PMCID: PMC9274125. DOI: 10.3389/fnins.2022.897239. Citations in RCA: 3.
Abstract
Categorizing sounds into meaningful groups helps listeners more efficiently process the auditory scene and is a foundational skill for speech perception and language development. Yet, how auditory categories develop in the brain through learning, particularly for non-speech sounds (e.g., music), is not well understood. Here, we asked musically naïve listeners to complete a brief (∼20 min) training session where they learned to identify sounds from a musical interval continuum (minor-major 3rds). We used multichannel EEG to track behaviorally relevant neuroplastic changes in the auditory event-related potentials (ERPs) pre- to post-training. To rule out mere exposure-induced changes, neural effects were evaluated against a control group of 14 non-musicians who did not undergo training. We also compared individual categorization performance with structural volumetrics of bilateral Heschl's gyrus (HG) from MRI to evaluate neuroanatomical substrates of learning. Behavioral performance revealed steeper (i.e., more categorical) identification functions in the posttest that correlated with better training accuracy. At the neural level, improvement in learners' behavioral identification was characterized by smaller P2 amplitudes at posttest, particularly over the right hemisphere. Critically, learning-related changes in the ERPs were not observed in control listeners, ruling out mere exposure effects. Learners also showed smaller and thinner HG bilaterally, indicating superior categorization was associated with structural differences in primary auditory brain regions. Collectively, our data suggest successful auditory categorical learning of music sounds is characterized by short-term functional changes (i.e., greater post-training efficiency) in sensory coding processes superimposed on preexisting structural differences in bilateral auditory cortex.
Affiliation(s)
- Kelsey Mankel
- School of Communication Sciences and Disorders, University of Memphis, Memphis, TN, United States
- Institute for Intelligent Systems, University of Memphis, Memphis, TN, United States
- Center for Mind and Brain, University of California, Davis, Davis, CA, United States
- Utsav Shrestha
- Department of Biomedical Engineering, University of Memphis, Memphis, TN, United States
- Gavin M. Bidelman
- School of Communication Sciences and Disorders, University of Memphis, Memphis, TN, United States
- Institute for Intelligent Systems, University of Memphis, Memphis, TN, United States
- Department of Speech, Language and Hearing Sciences, Indiana University, Bloomington, IN, United States
6
Carter JA, Buder EH, Bidelman GM. Nonlinear dynamics in auditory cortical activity reveal the neural basis of perceptual warping in speech categorization. JASA Express Lett 2022; 2:045201. PMID: 35434716. PMCID: PMC8984957. DOI: 10.1121/10.0009896. Citations in RCA: 2.
Abstract
Surrounding context influences speech listening, resulting in dynamic shifts to category percepts. To examine its neural basis, event-related potentials (ERPs) were recorded during vowel identification with continua presented in random, forward, and backward orders to induce perceptual warping. Behaviorally, sequential order shifted individual listeners' categorical boundary relative to random delivery, revealing perceptual warping (biasing) of the heard phonetic category dependent on recent stimulus history. ERPs revealed later (∼300 ms) activity localized to superior temporal and middle/inferior frontal gyri that predicted listeners' hysteresis/enhanced contrast magnitudes. Findings demonstrate that interactions between frontotemporal brain regions govern top-down, stimulus history effects on speech categorization.
Affiliation(s)
- Jared A Carter
- Institute for Intelligent Systems, University of Memphis, Memphis, Tennessee 38152, USA
- Eugene H Buder
- School of Communication Sciences and Disorders, University of Memphis, Memphis, Tennessee 38152, USA
- Gavin M Bidelman
- Department of Speech, Language and Hearing Sciences, Indiana University, Bloomington, Indiana 47408, USA
7
Kapnoula EC, McMurray B. Idiosyncratic use of bottom-up and top-down information leads to differences in speech perception flexibility: Converging evidence from ERPs and eye-tracking. Brain Lang 2021; 223:105031. PMID: 34628259. PMCID: PMC11251822. DOI: 10.1016/j.bandl.2021.105031. Citations in RCA: 4.
Abstract
Listeners generally categorize speech sounds in a gradient manner. However, recent work, using a visual analogue scaling (VAS) task, suggests that some listeners show more categorical performance, leading to less flexible cue integration and poorer recovery from misperceptions (Kapnoula et al., 2017, 2021). We asked how individual differences in speech gradiency can be reconciled with the well-established gradiency in the modal listener, showing how VAS performance relates to both Visual World Paradigm and EEG measures of gradiency. We also investigated three potential sources of these individual differences: inhibitory control; lexical inhibition; and early cue encoding. We used the N1 ERP component to track pre-categorical encoding of Voice Onset Time (VOT). The N1 linearly tracked VOT, reflecting a fundamentally gradient speech perception; however, for less gradient listeners, this linearity was disrupted near the boundary. Thus, while all listeners are gradient, they may show idiosyncratic encoding of specific cues, affecting downstream processing.
Affiliation(s)
- Efthymia C Kapnoula
- Dept. of Psychological and Brain Sciences, University of Iowa, United States; DeLTA Center, University of Iowa, United States; Basque Center on Cognition, Brain and Language, Spain.
- Bob McMurray
- Dept. of Psychological and Brain Sciences, University of Iowa, United States; DeLTA Center, University of Iowa, United States; Dept. of Communication Sciences and Disorders, DeLTA Center, University of Iowa, United States; Dept. of Linguistics, DeLTA Center, University of Iowa, United States
8
Qin Z, Gong M, Zhang C. Neural responses in novice learners' perceptual learning and generalization of lexical tones: The effect of training variability. Brain Lang 2021; 223:105029. PMID: 34624686. DOI: 10.1016/j.bandl.2021.105029. Citations in RCA: 4.
Abstract
The acoustics of lexical tones are highly variable across talkers, requiring second-language (L2) learners to flexibly accommodate talker-specific tonal variations for successful learning. This study investigated how tone training with high vs. low talker variability modulated novice learners' neural responses to non-native tones. A passive oddball paradigm tested Mandarin-speaking participants' neural responses to Cantonese low-high and low-mid tonal contrasts in the pretest and posttest. Participants were trained using a tone identification task with feedback, with either high or low talker variability. The mismatch negativity (MMN) results showed no group difference in the pretest, whereas the high-variability group demonstrated greater neural sensitivity to the low-high tonal contrast produced by a novel talker and a trained talker in the posttest. The finding provides tentative novel evidence that training variability may benefit perceptual learning of the relatively easy tone pair and facilitate the formation of talker-independent representations of non-native tones by novice learners.
Affiliation(s)
- Zhen Qin
- Division of Humanities, The Hong Kong University of Science and Technology, Hong Kong.
- Minzhi Gong
- Department of Linguistics and Modern Languages, The Chinese University of Hong Kong, Hong Kong.
- Caicai Zhang
- Research Centre for Language, Cognition and Neuroscience, Department of Chinese and Bilingual Studies, The Hong Kong Polytechnic University, Hong Kong.