1
Kapatsinski V, Bramlett AA, Idemaru K. What do you learn from a single cue? Dimensional reweighting and cue reassociation from experience with a newly unreliable phonetic cue. Cognition 2024; 249:105818. [PMID: 38772253] [DOI: 10.1016/j.cognition.2024.105818]
Abstract
In language comprehension, we use perceptual cues to infer meanings. Some of these cues reside on perceptual dimensions. For example, the difference between bear and pear is cued by a difference in voice onset time (VOT), which is a continuous perceptual dimension. The present paper asks whether, and when, experience with a single value on a dimension behaving unexpectedly is used by the learner to reweight the whole dimension. We show that learners reweight the whole VOT dimension when exposed to a single VOT value (e.g., 45 ms) and provided with feedback indicating that the speaker intended to produce a /b/ 50% of the time and a /p/ the other 50% of the time. Importantly, dimensional reweighting occurs only if 1) the 50/50 feedback is unexpected for the VOT value, and 2) there is another dimension that is predictive of feedback. When no predictive dimension is available, listeners reassociate the experienced VOT value with the more surprising outcome but do not downweight the entire VOT dimension. These results provide support for perceptual representations of speech sounds that combine cues and dimensions, for viewing perceptual learning in speech as a combination of error-driven cue reassociation and dimensional reweighting, and for considering dimensional reweighting to be reallocation of attention that occurs only when there is evidence that reallocating attention would improve prediction accuracy (Harmon, Z., Idemaru, K., & Kapatsinski, V. 2019. Learning mechanisms in cue reweighting. Cognition, 189, 76-88.).
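The error-driven cue reassociation described here is in the delta-rule (Rescorla-Wagner) family of learning models that Harmon, Idemaru, and Kapatsinski (2019) build on. A minimal sketch for illustration only (not the authors' model code; the 50/50 feedback scenario is simulated schematically):

```python
import numpy as np

def delta_rule(trials, lr=0.1, n_cues=1):
    """Delta-rule (Rescorla-Wagner) updating of cue-outcome associations.
    Each trial is (cue_vector, outcome), with outcome coded 1 = /p/, 0 = /b/."""
    w = np.zeros(n_cues)
    for cues, outcome in trials:
        cues = np.asarray(cues, dtype=float)
        pred = float(w @ cues)             # current expectation of /p/
        w += lr * (outcome - pred) * cues  # move weights toward the prediction error
    return w

# A single experienced VOT value paired with /p/ on 50% of trials:
# its association drifts toward ~0.5 rather than a categorical 0 or 1.
trials = [((1.0,), outcome) for outcome in [1, 0] * 200]
w = delta_rule(trials)
print(round(float(w[0]), 2))
```

Under alternating /b/ and /p/ feedback the experienced cue's association settles near 0.5, illustrating reassociation of a single value, as distinct from downweighting the whole dimension.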
Affiliation(s)
- Vsevolod Kapatsinski
- University of Oregon, Department of Linguistics, 161 Straub Hall, Eugene, OR 97403-1290, United States of America.
- Adam A Bramlett
- Carnegie Mellon University, Department of Modern Languages, 341 Posner Hall, 5000 Forbes Avenue, Pittsburgh, PA 15213, United States of America.
- Kaori Idemaru
- University of Oregon, Department of East Asian Languages and Literatures, 114 Friendly Hall, Eugene, OR 97403-1248, United States of America.
2
Kachlicka M, Patel AD, Liu F, Tierney A. Weighting of cues to categorization of song versus speech in tone-language and non-tone-language speakers. Cognition 2024; 246:105757. [PMID: 38442588] [DOI: 10.1016/j.cognition.2024.105757]
Abstract
One of the most important auditory categorization tasks a listener faces is determining a sound's domain, a process which is a prerequisite for successful within-domain categorization tasks such as recognizing different speech sounds or musical tones. Speech and song are universal in human cultures: how do listeners categorize a sequence of words as belonging to one or the other of these domains? There is growing interest in the acoustic cues that distinguish speech and song, but it remains unclear whether there are cross-cultural differences in the evidence upon which listeners rely when making this fundamental perceptual categorization. Here we use the speech-to-song illusion, in which some spoken phrases perceptually transform into song when repeated, to investigate cues to this domain-level categorization in native speakers of tone languages (Mandarin and Cantonese speakers residing in the United Kingdom and China) and in native speakers of a non-tone language (English). We find that native tone-language and non-tone-language listeners largely agree on which spoken phrases sound like song after repetition, and we also find that the strength of this transformation is not significantly different across language backgrounds or countries of residence. Furthermore, we find a striking similarity in the cues upon which listeners rely when perceiving word sequences as singing versus speech, including small pitch intervals, flat within-syllable pitch contours, and steady beats. These findings support the view that there are certain widespread cross-cultural similarities in the mechanisms by which listeners judge if a word sequence is spoken or sung.
Affiliation(s)
- Magdalena Kachlicka
- Department of Psychological Sciences, Birkbeck, University of London, Malet Street, London, United Kingdom
- Aniruddh D Patel
- Department of Psychology, Tufts University, 419 Boston Ave, Medford, USA; Program in Brain, Mind, and Consciousness, Canadian Institute for Advanced Research, 661 University Avenue, Toronto, Canada
- Fang Liu
- School of Psychology and Clinical Language Sciences, University of Reading, Whiteknights, Reading, United Kingdom
- Adam Tierney
- Department of Psychological Sciences, Birkbeck, University of London, Malet Street, London, United Kingdom.
3
Symons AE, Holt LL, Tierney AT. Informational masking influences segmental and suprasegmental speech categorization. Psychon Bull Rev 2024; 31:686-696. [PMID: 37658222] [PMCID: PMC11061029] [DOI: 10.3758/s13423-023-02364-5]
Abstract
Auditory categorization requires listeners to integrate acoustic information from multiple dimensions. Attentional theories suggest that acoustic dimensions that are informative attract attention and therefore receive greater perceptual weight during categorization. However, the acoustic environment is often noisy, with multiple sound sources competing for listeners' attention. Amid these adverse conditions, attentional theories predict that listeners will distribute attention more evenly across multiple dimensions. Here we test this prediction using an informational masking paradigm. In two experiments, listeners completed suprasegmental (focus) and segmental (voicing) speech categorization tasks in quiet or in the presence of competing speech. In both experiments, the target speech consisted of short words or phrases that varied in the extent to which fundamental frequency (F0) and durational information signalled category identity. To isolate effects of informational masking, target and competing speech were presented in opposite ears. Across both experiments, there was substantial individual variability in the relative weighting of the two dimensions. These individual differences were consistent across listening conditions, suggesting that they reflect stable perceptual strategies. Consistent with attentional theories of auditory categorization, listeners who relied on a single primary dimension in quiet shifted towards integrating across multiple dimensions in the presence of competing speech. These findings demonstrate that listeners make greater use of the redundancy present in speech when attentional resources are limited.
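The "relative weighting of the two dimensions" in tasks like this is commonly estimated from the coefficients of a logistic regression of binary category responses on the stimulus dimensions. A minimal sketch under that assumption (simulated data and a hand-rolled fit, not the authors' analysis code):

```python
import numpy as np

def cue_weights(f0, dur, responses, lr=0.5, steps=2000):
    """Fit a two-cue logistic regression by plain gradient ascent and return
    the normalized perceptual weight on F0 (0 = all duration, 1 = all F0)."""
    X = np.column_stack([np.ones_like(f0), f0, dur])
    y = np.asarray(responses, dtype=float)
    b = np.zeros(3)
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-(X @ b)))   # predicted P(category A)
        b += lr * (X.T @ (y - p)) / len(y)   # log-likelihood gradient step
    w_f0, w_dur = abs(b[1]), abs(b[2])
    return w_f0 / (w_f0 + w_dur)

# Simulated listener who relies almost entirely on F0 and ignores duration.
rng = np.random.default_rng(1)
f0 = rng.uniform(-1, 1, 500)
dur = rng.uniform(-1, 1, 500)
responses = (f0 + 0.1 * rng.normal(size=500) > 0).astype(int)
w = cue_weights(f0, dur, responses)
print(round(w, 2))  # well above 0.5: an F0-dominant strategy
```

The shift the abstract describes, from reliance on a single primary dimension toward integration under masking, would appear as this normalized weight moving from near 1 toward 0.5.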
Affiliation(s)
- A E Symons
- Department of Psychological Sciences, Birkbeck, University of London, London, UK
- L L Holt
- Department of Psychology and Neuroscience Institute, Carnegie Mellon University, 5000 Forbes Avenue, Pittsburgh, PA, USA.
- A T Tierney
- Department of Psychological Sciences, Birkbeck, University of London, London, UK
4
Caprini F, Zhao S, Chait M, Agus T, Pomper U, Tierney A, Dick F. Generalization of auditory expertise in audio engineers and instrumental musicians. Cognition 2024; 244:105696. [PMID: 38160651] [DOI: 10.1016/j.cognition.2023.105696]
Abstract
From auditory perception to general cognition, the ability to play a musical instrument has been associated with skills both related and unrelated to music. However, it is unclear if these effects are bound to the specific characteristics of musical instrument training, as little attention has been paid to other populations such as audio engineers and designers whose auditory expertise may match or surpass that of musicians in specific auditory tasks or more naturalistic acoustic scenarios. We explored this possibility by comparing students of audio engineering (n = 20) to matched conservatory-trained instrumentalists (n = 24) and to naive controls (n = 20) on measures of auditory discrimination, auditory scene analysis, and speech-in-noise perception. We found that audio engineers and performing musicians had generally lower psychophysical thresholds than controls, with pitch perception showing the largest effect size. Compared to controls, audio engineers could better memorise and recall auditory scenes composed of non-musical sounds, whereas instrumental musicians performed best in a sustained selective attention task with two competing streams of tones. Finally, in a diotic speech-in-babble task, musicians showed lower signal-to-noise ratio thresholds than both controls and engineers; however, a follow-up online study did not replicate this musician advantage. We also observed differences in personality that might account for group-based self-selection biases. Overall, we showed that investigating a wider range of forms of auditory expertise can help us corroborate (or challenge) the specificity of the advantages previously associated with musical instrument training.
Affiliation(s)
- Francesco Caprini
- Department of Psychological Sciences, Birkbeck, University of London, UK.
- Sijia Zhao
- Department of Experimental Psychology, University of Oxford, UK
- Maria Chait
- University College London (UCL) Ear Institute, UK
- Trevor Agus
- School of Arts, English and Languages, Queen's University Belfast, UK
- Ulrich Pomper
- Department of Cognition, Emotion, and Methods in Psychology, Universität Wien, Austria
- Adam Tierney
- Department of Psychological Sciences, Birkbeck, University of London, UK
- Fred Dick
- Department of Experimental Psychology, University College London (UCL), UK
5
Guerra G, Tierney A, Tijms J, Vaessen A, Bonte M, Dick F. Attentional modulation of neural sound tracking in children with and without dyslexia. Dev Sci 2024; 27:e13420. [PMID: 37350014] [DOI: 10.1111/desc.13420]
Abstract
Auditory selective attention forms an important foundation of children's learning by enabling the prioritisation and encoding of relevant stimuli. It may also influence reading development, which relies on metalinguistic skills including the awareness of the sound structure of spoken language. Reports of attentional impairments and speech perception difficulties in noisy environments in dyslexic readers are also suggestive of the putative contribution of auditory attention to reading development. To date, it is unclear whether non-speech selective attention and its underlying neural mechanisms are impaired in children with dyslexia and to what extent these deficits relate to individual reading and speech perception abilities in suboptimal listening conditions. In this EEG study, we assessed non-speech sustained auditory selective attention in 106 7-to-12-year-old children with and without dyslexia. Children attended to one of two tone streams, detecting occasional sequence repeats in the attended stream, and performed a speech-in-speech perception task. Results show that when children directed their attention to one stream, inter-trial phase coherence at the attended rate increased in fronto-central sites; this, in turn, was associated with better target detection. Behavioural and neural indices of attention did not systematically differ as a function of dyslexia diagnosis. However, behavioural indices of attention did explain individual differences in reading fluency and speech-in-speech perception abilities: both these skills were impaired in dyslexic readers. Taken together, our results show that children with dyslexia do not show group-level auditory attention deficits but these deficits may represent a risk for developing reading impairments and problems with speech perception in complex acoustic environments.
RESEARCH HIGHLIGHTS:
- Non-speech sustained auditory selective attention modulates EEG phase coherence in children with/without dyslexia.
- Children with dyslexia show difficulties in speech-in-speech perception.
- Attention relates to dyslexic readers' speech-in-speech perception and reading skills.
- Dyslexia diagnosis is not linked to behavioural/EEG indices of auditory attention.
Affiliation(s)
- Giada Guerra
- Centre for Brain and Cognitive Development, Birkbeck College, University of London, London, UK
- Maastricht Brain Imaging Center and Department of Cognitive Neuroscience, Faculty of Psychology and Neuroscience, Maastricht University, Maastricht, Netherlands
- Adam Tierney
- Centre for Brain and Cognitive Development, Birkbeck College, University of London, London, UK
- Jurgen Tijms
- RID, Amsterdam, Netherlands
- Rudolf Berlin Center, Department of Psychology, University of Amsterdam, Amsterdam, Netherlands
- Milene Bonte
- Maastricht Brain Imaging Center and Department of Cognitive Neuroscience, Faculty of Psychology and Neuroscience, Maastricht University, Maastricht, Netherlands
- Frederic Dick
- Division of Psychology & Language Sciences, UCL, London, UK
6
Petrova K, Jasmin K, Saito K, Tierney AT. Extensive residence in a second language environment modifies perceptual strategies for suprasegmental categorization. J Exp Psychol Learn Mem Cogn 2023; 49:1943-1955. [PMID: 38127498] [PMCID: PMC10734206] [DOI: 10.1037/xlm0001246]
Abstract
Languages differ in the importance of acoustic dimensions for speech categorization. This poses a potential challenge for second language (L2) learners, and the extent to which adult L2 learners can acquire new perceptual strategies for speech categorization remains unclear. This study investigated the effects of extensive English L2 immersion on speech perception strategies and dimension-selective attention ability in native Mandarin speakers. Experienced first language (L1) Mandarin speakers (length of U.K. residence > 3 years) demonstrated more native-like weighting of cues to L2 suprasegmental categorization relative to inexperienced Mandarin speakers (length of residence < 1 year), weighting duration more highly. However, both the experienced and the inexperienced Mandarin speakers continued to weight duration less highly and pitch more highly during musical beat categorization and struggled to ignore pitch and selectively attend to amplitude in speech, relative to native English speakers. These results suggest that adult L2 experience can lead to retuning of perceptual strategies in specific contexts, but global acoustic salience is more resistant to change.
Affiliation(s)
- Katya Petrova
- Department of Culture, Communication & Media, Institute of Education, University College London
- Kyle Jasmin
- Department of Psychology, Royal Holloway, University of London
- Kazuya Saito
- Department of Culture, Communication & Media, Institute of Education, University College London
- Adam T Tierney
- Department of Psychological Sciences, Birkbeck, University of London
7
Obasih CO, Luthra S, Dick F, Holt LL. Auditory category learning is robust across training regimes. Cognition 2023; 237:105467. [PMID: 37148640] [PMCID: PMC11415078] [DOI: 10.1016/j.cognition.2023.105467]
Abstract
Multiple lines of research have developed training approaches that foster category learning, with important translational implications for education. Increasing exemplar variability, blocking or interleaving by category-relevant dimension, and providing explicit instructions about diagnostic dimensions each have been shown to facilitate category learning and/or generalization. However, laboratory research often must distill the character of natural input regularities that define real-world categories. As a result, much of what we know about category learning has come from studies with simplifying assumptions. We challenge the implicit expectation that these studies reflect the process of category learning of real-world input by creating an auditory category learning paradigm that intentionally violates some common simplifying assumptions of category learning tasks. Across five experiments and nearly 300 adult participants, we used training regimes previously shown to facilitate category learning, but here drew from a more complex and multidimensional category space with tens of thousands of unique exemplars. Learning was equivalently robust across training regimes that changed exemplar variability, altered the blocking of category exemplars, or provided explicit instructions of the category-diagnostic dimension. Each drove essentially equivalent accuracy measures of learning generalization following 40 min of training. These findings suggest that auditory category learning across complex input is not as susceptible to training regime manipulation as previously thought.
Affiliation(s)
- Chisom O Obasih
- Department of Psychology, Carnegie Mellon University, United States of America; Neuroscience Institute, Carnegie Mellon University, United States of America; Center for the Neural Basis of Cognition, Carnegie Mellon University, United States of America.
- Sahil Luthra
- Department of Psychology, Carnegie Mellon University, United States of America; Neuroscience Institute, Carnegie Mellon University, United States of America; Center for the Neural Basis of Cognition, Carnegie Mellon University, United States of America
- Frederic Dick
- Experimental Psychology, University College London, United Kingdom; Birkbeck/UCL Centre for NeuroImaging, United Kingdom
- Lori L Holt
- Department of Psychology, Carnegie Mellon University, United States of America; Neuroscience Institute, Carnegie Mellon University, United States of America; Center for the Neural Basis of Cognition, Carnegie Mellon University, United States of America
8
Jasmin K, Tierney A, Obasih C, Holt L. Short-term perceptual reweighting in suprasegmental categorization. Psychon Bull Rev 2023; 30:373-382. [PMID: 35915382] [PMCID: PMC9971089] [DOI: 10.3758/s13423-022-02146-5]
Abstract
Segmental speech units such as phonemes are described as multidimensional categories whose perception involves contributions from multiple acoustic input dimensions, and the relative perceptual weights of these dimensions respond dynamically to context. For example, when speech is altered to create an "accent" in which two acoustic dimensions are correlated in a manner opposite that of long-term experience, the dimension that carries less perceptual weight is down-weighted to contribute less in category decisions. It remains unclear, however, whether this short-term reweighting extends to perception of suprasegmental features that span multiple phonemes, syllables, or words, in part because it has remained debatable whether suprasegmental features are perceived categorically. Here, we investigated the relative contribution of two acoustic dimensions to word emphasis. Participants categorized instances of a two-word phrase pronounced with typical covariation of fundamental frequency (F0) and duration, and in the context of an artificial "accent" in which F0 and duration (established in prior research on English speech as "primary" and "secondary" dimensions, respectively) covaried atypically. When categorizing "accented" speech, listeners rapidly down-weighted the secondary dimension (duration). This result indicates that listeners continually track short-term regularities across speech input and dynamically adjust the weight of acoustic evidence for suprasegmental decisions. Thus, dimension-based statistical learning appears to be a widespread phenomenon in speech perception extending to both segmental and suprasegmental categorization.
Affiliation(s)
- Kyle Jasmin
- Department of Psychology, Wolfson Building, Royal Holloway, University of London, Egham, Surrey, TW20 0EX, UK.
- Lori Holt
- Carnegie Mellon University, Pittsburgh, PA, USA
9
Wu YC, Holt LL. Phonetic category activation predicts the direction and magnitude of perceptual adaptation to accented speech. J Exp Psychol Hum Percept Perform 2022; 48:913-925. [PMID: 35849375] [PMCID: PMC10236200] [DOI: 10.1037/xhp0001037]
Abstract
Unfamiliar accents can systematically shift speech acoustics away from community norms and reduce comprehension. Yet, limited exposure improves comprehension. This perceptual adaptation indicates that the mapping from acoustics to speech representations is dynamic, rather than fixed. But what drives adjustments is debated. Supervised learning accounts posit that activation of an internal speech representation via disambiguating information generates predictions about patterns of speech input typically associated with the representation. When actual input mismatches predictions, the mapping is adjusted. We tested two hypotheses of this account across consonants and vowels as listeners categorized speech conveying an English-like acoustic regularity or an artificial accent. Across conditions, signal manipulations impacted which of two acoustic dimensions best conveyed category identity, and predicted which dimension would exhibit the effects of perceptual adaptation. Moreover, the strength of phonetic category activation, as estimated by categorization responses reliant on the dominant acoustic dimension, predicted the magnitude of adaptation observed across listeners. The results align with predictions of supervised learning accounts, suggesting that perceptual adaptation arises from speech category activation, corresponding predictions about the patterns of acoustic input that align with the category, and adjustments in subsequent speech perception when input mismatches these expectations.
10
Kachlicka M, Laffere A, Dick F, Tierney A. Slow phase-locked modulations support selective attention to sound. Neuroimage 2022; 252:119024. [PMID: 35231629] [PMCID: PMC9133470] [DOI: 10.1016/j.neuroimage.2022.119024]
Abstract
To make sense of complex soundscapes, listeners must select and attend to task-relevant streams while ignoring uninformative sounds. One possible neural mechanism underlying this process is alignment of endogenous oscillations with the temporal structure of the target sound stream. Such a mechanism has been suggested to mediate attentional modulation of neural phase-locking to the rhythms of attended sounds. However, such modulations are compatible with an alternate framework, where attention acts as a filter that enhances exogenously-driven neural auditory responses. Here we attempted to test several predictions arising from the oscillatory account by playing two tone streams varying across conditions in tone duration and presentation rate; participants attended to one stream or listened passively. Attentional modulation of the evoked waveform was roughly sinusoidal and scaled with rate, while the passive response did not. However, there was only limited evidence for continuation of modulations through the silence between sequences. These results suggest that attentionally-driven changes in phase alignment reflect synchronization of slow endogenous activity with the temporal structure of attended stimuli.
Affiliation(s)
- Magdalena Kachlicka
- Department of Psychological Sciences, Birkbeck, University of London, Malet Street, London WC1E 7HX, England
- Aeron Laffere
- Department of Psychological Sciences, Birkbeck, University of London, Malet Street, London WC1E 7HX, England
- Fred Dick
- Department of Psychological Sciences, Birkbeck, University of London, Malet Street, London WC1E 7HX, England; Division of Psychology & Language Sciences, UCL, Gower Street, London WC1E 6BT, England
- Adam Tierney
- Department of Psychological Sciences, Birkbeck, University of London, Malet Street, London WC1E 7HX, England.
11
Symons AE, Dick F, Tierney AT. Dimension-selective attention and dimensional salience modulate cortical tracking of acoustic dimensions. Neuroimage 2021; 244:118544. [PMID: 34492294] [DOI: 10.1016/j.neuroimage.2021.118544]
Abstract
Some theories of auditory categorization suggest that auditory dimensions that are strongly diagnostic for particular categories - for instance voice onset time or fundamental frequency in the case of some spoken consonants - attract attention. However, prior cognitive neuroscience research on auditory selective attention has largely focused on attention to simple auditory objects or streams, and so little is known about the neural mechanisms that underpin dimension-selective attention, or how the relative salience of variations along these dimensions might modulate neural signatures of attention. Here we investigate whether dimensional salience and dimension-selective attention modulate the cortical tracking of acoustic dimensions. In two experiments, participants listened to tone sequences varying in pitch and spectral peak frequency; these two dimensions changed at different rates. Inter-trial phase coherence (ITPC) and amplitude of the EEG signal at the frequencies tagged to pitch and spectral changes provided a measure of cortical tracking of these dimensions. In Experiment 1, tone sequences varied in the size of the pitch intervals, while the size of spectral peak intervals remained constant. Cortical tracking of pitch changes was greater for sequences with larger compared to smaller pitch intervals, with no difference in cortical tracking of spectral peak changes. In Experiment 2, participants selectively attended to either pitch or spectral peak. Cortical tracking was stronger in response to the attended compared to unattended dimension for both pitch and spectral peak. These findings suggest that attention can enhance the cortical tracking of specific acoustic dimensions rather than simply enhancing tracking of the auditory object as a whole.
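Inter-trial phase coherence as used here has a standard definition: the magnitude of the mean unit phase vector across trials at the frequency of interest. A minimal sketch with simulated phases (illustrative only, not the authors' EEG pipeline):

```python
import numpy as np

def itpc(phases):
    """Inter-trial phase coherence: magnitude of the trial-averaged unit
    phase vector (1 = identical phase on every trial, ~0 = random phase)."""
    return float(np.abs(np.mean(np.exp(1j * np.asarray(phases)))))

# Phase-locked responses: the same phase on every trial.
aligned = np.zeros(100)
# No locking: phases scattered uniformly around the circle.
rng = np.random.default_rng(0)
scattered = rng.uniform(0.0, 2.0 * np.pi, 100_000)

print(round(itpc(aligned), 3))    # 1.0
print(round(itpc(scattered), 2))
```

Greater ITPC at the frequency tagged to a dimension's rate of change is what indexes stronger cortical tracking of that dimension.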
Affiliation(s)
- Ashley E Symons
- Department of Psychological Sciences, Birkbeck College, University of London, UK.
- Fred Dick
- Department of Psychological Sciences, Birkbeck College, University of London, UK; Division of Psychology & Language Sciences, University College London, UK
- Adam T Tierney
- Department of Psychological Sciences, Birkbeck College, University of London, UK
12
Viswanathan V, Shinn-Cunningham BG, Heinz MG. Temporal fine structure influences voicing confusions for consonant identification in multi-talker babble. J Acoust Soc Am 2021; 150:2664. [PMID: 34717498] [PMCID: PMC8514254] [DOI: 10.1121/10.0006527]
Abstract
To understand the mechanisms of speech perception in everyday listening environments, it is important to elucidate the relative contributions of different acoustic cues in transmitting phonetic content. Previous studies suggest that the envelope of speech in different frequency bands conveys most speech content, while the temporal fine structure (TFS) can aid in segregating target speech from background noise. However, the role of TFS in conveying phonetic content beyond what envelopes convey for intact speech in complex acoustic scenes is poorly understood. The present study addressed this question using online psychophysical experiments to measure the identification of consonants in multi-talker babble for intelligibility-matched intact and 64-channel envelope-vocoded stimuli. Consonant confusion patterns revealed that listeners had a greater tendency in the vocoded (versus intact) condition to be biased toward reporting that they heard an unvoiced consonant, despite envelope and place cues being largely preserved. This result was replicated when babble instances were varied across independent experiments, suggesting that TFS conveys voicing information beyond what is conveyed by envelopes for intact speech in babble. Given that multi-talker babble is a masker that is ubiquitous in everyday environments, this finding has implications for the design of assistive listening devices such as cochlear implants.
Affiliation(s)
- Vibha Viswanathan
- Weldon School of Biomedical Engineering, Purdue University, West Lafayette, Indiana 47907, USA
- Michael G. Heinz
- Department of Speech, Language, and Hearing Sciences, Purdue University, West Lafayette, Indiana 47907, USA
13
Jasmin K, Sun H, Tierney AT. Effects of language experience on domain-general perceptual strategies. Cognition 2020; 206:104481. [PMID: 33075568] [DOI: 10.1016/j.cognition.2020.104481]
Abstract
Speech and music are highly redundant communication systems, with multiple acoustic cues signaling the existence of perceptual categories. This redundancy makes these systems robust to the influence of noise, but necessitates the development of perceptual strategies: listeners need to decide how much importance to place on each source of information. Prior empirical work and modeling have suggested that cue weights primarily reflect within-task statistical learning, as listeners assess the reliability with which different acoustic dimensions signal a category and modify their weights accordingly. Here we present evidence that perceptual experience can lead to changes in cue weighting that extend across tasks and across domains, suggesting that perceptual strategies reflect both global biases and local (i.e., task-specific) learning. In two experiments, native speakers of Mandarin (N = 45)-where pitch is a crucial cue to word identity-placed more importance on pitch and less importance on other dimensions compared to native speakers of the non-tonal languages English (N = 45) and Spanish (N = 27), during the perception of both English speech and musical beats. In a third experiment, we further show that Mandarin speakers are better able to attend to pitch and ignore irrelevant variation in other dimensions in speech compared to English and Spanish speakers, and even struggle to ignore pitch when asked to attend to other dimensions. Thus, an individual's idiosyncratic auditory perceptual strategy reflects a complex mixture of congenital predispositions, task-specific learning, and biases instilled by extensive experience in making use of important dimensions in their native language.
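Relative cue weights of the kind compared across language groups here are commonly estimated from categorization responses, for example as normalized coefficients of a logistic regression. The sketch below is a generic illustration under that assumption, not the authors' analysis; the simulated "listener" and all parameters are hypothetical.

```python
import numpy as np

def cue_weights(pitch, duration, responses, lr=0.5, n_iter=2000):
    """Fit a logistic regression by gradient descent to binary
    categorization responses and return the absolute coefficients of
    the two cues, normalized to sum to 1. Cues are z-scored first so
    the coefficients are on a comparable scale."""
    X = np.column_stack([pitch, duration])
    X = (X - X.mean(axis=0)) / X.std(axis=0)
    X = np.column_stack([np.ones(len(X)), X])        # add intercept column
    w = np.zeros(X.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-X @ w))             # predicted P(category A)
        w += lr * X.T @ (responses - p) / len(X)     # gradient ascent on log-likelihood
    a = np.abs(w[1:])
    return a / a.sum()                               # [pitch weight, duration weight]

# Hypothetical listener whose categorizations are driven mainly by pitch
rng = np.random.default_rng(0)
pitch = rng.normal(size=500)
duration = rng.normal(size=500)
responses = (3 * pitch + 0.5 * duration + rng.normal(size=500) > 0).astype(float)
w = cue_weights(pitch, duration, responses)          # pitch weight >> duration weight
```

Comparing such weight vectors between groups (e.g., Mandarin vs. English speakers) is one way the cross-group differences described above can be quantified.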
Affiliation(s)
- Kyle Jasmin
- Department of Psychological Sciences, Birkbeck College, University of London, UK
- Hui Sun
- Department of Psychological Sciences, Birkbeck College, University of London, UK
- Adam T Tierney
- Department of Psychological Sciences, Birkbeck College, University of London, UK
14
Laffere A, Dick F, Holt LL, Tierney A. Attentional modulation of neural entrainment to sound streams in children with and without ADHD. Neuroimage 2020; 224:117396. [PMID: 32979522 DOI: 10.1016/j.neuroimage.2020.117396] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.5] [Received: 05/18/2020] [Revised: 08/25/2020] [Accepted: 09/14/2020] [Indexed: 01/06/2023]
Abstract
To extract meaningful information from complex auditory scenes like a noisy playground, rock concert, or classroom, children can direct attention to different sound streams. One means of accomplishing this might be to align neural activity with the temporal structure of a target stream, such as a specific talker or melody. However, this may be more difficult for children with ADHD, who can struggle with accurately perceiving and producing temporal intervals. In this EEG study, we found that school-aged children's attention to one of two temporally-interleaved isochronous tone 'melodies' was linked to an increase in phase-locking at the melody's rate, and a shift in neural phase that aligned the neural responses with the attended tone stream. Children's attention task performance and neural phase alignment with the attended melody were linked to performance on temporal production tasks, suggesting that children with more robust control over motor timing were better able to direct attention to the time points associated with the target melody. Finally, we found that although children with ADHD performed less accurately on the tonal attention task than typically developing children, they showed the same degree of attentional modulation of phase locking and neural phase shifts, suggesting that children with ADHD may have difficulty with attentional engagement rather than attentional selection.
Affiliation(s)
- Aeron Laffere
- Department of Psychological Sciences, Birkbeck, University of London, Malet Street, London, WC1E 7HX, United Kingdom
- Fred Dick
- Department of Psychological Sciences, Birkbeck, University of London, Malet Street, London, WC1E 7HX, United Kingdom; Division of Psychology & Language Sciences, UCL, Gower Street, London, WC1E 6BT, United Kingdom
- Lori L Holt
- Department of Psychology, Carnegie Mellon University, 5000 Forbes Avenue, Pittsburgh, PA 15213, United States
- Adam Tierney
- Department of Psychological Sciences, Birkbeck, University of London, Malet Street, London, WC1E 7HX, United Kingdom
15
Jasmin K, Dick F, Holt LL, Tierney A. Tailored perception: Individuals' speech and music perception strategies fit their perceptual abilities. J Exp Psychol Gen 2020; 149:914-934. [PMID: 31589067 PMCID: PMC7133494 DOI: 10.1037/xge0000688] [Citation(s) in RCA: 13] [Impact Index Per Article: 3.3] [Received: 11/09/2018] [Revised: 08/09/2019] [Accepted: 08/12/2019] [Indexed: 01/09/2023]
Abstract
Perception involves integration of multiple dimensions that often serve overlapping, redundant functions, for example, pitch, duration, and amplitude in speech. Individuals tend to prioritize these dimensions differently (stable, individualized perceptual strategies), but the reason for this has remained unclear. Here we show that perceptual strategies relate to perceptual abilities. In a speech cue weighting experiment (trial N = 990), we first demonstrate that individuals with a severe deficit for pitch perception (congenital amusics; N = 11) categorize linguistic stimuli similarly to controls (N = 11) when the main distinguishing cue is duration, which they perceive normally. In contrast, in a prosodic task where pitch cues are the main distinguishing factor, we show that amusics place less importance on pitch and instead rely more on duration cues-even when pitch differences in the stimuli are large enough for amusics to discern. In a second experiment testing musical and prosodic phrase interpretation (N = 16 amusics; 15 controls), we found that relying on duration allowed amusics to overcome their pitch deficits to perceive speech and music successfully. We conclude that auditory signals, because of their redundant nature, are robust to impairments for specific dimensions, and that optimal speech and music perception strategies depend not only on invariant acoustic dimensions (the physical signal), but on perceptual dimensions whose precision varies across individuals. Computational models of speech perception (indeed, all types of perception involving redundant cues, e.g., vision and touch) should therefore aim to account for the precision of perceptual dimensions and characterize individuals as well as groups.
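The conclusion that optimal strategies should track the precision of an individual's perceptual dimensions matches the textbook result for reliability-weighted cue combination, in which each cue is weighted by its inverse variance. This is a standard model offered for illustration, not one fitted in the paper; the noise levels below are hypothetical.

```python
import numpy as np

def optimal_weights(sigmas):
    """Reliability-weighted integration: with independent Gaussian noise
    on each cue, the optimal combination weights each cue by its
    precision (inverse variance), normalized to sum to 1."""
    precision = 1.0 / np.asarray(sigmas, dtype=float) ** 2
    return precision / precision.sum()

# Hypothetical listeners; sigmas are [pitch noise, duration noise]
control = optimal_weights([1.0, 1.0])  # equal precision: weights [0.5, 0.5]
amusic = optimal_weights([4.0, 1.0])   # pitch 4x noisier: duration dominates
```

On this view, an amusic's reliance on duration is not a quirk but the rational weighting given a noisy pitch dimension.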
Affiliation(s)
- Fred Dick
- Department of Psychological Sciences
16
Abstract
An accumulating body of evidence highlights the contribution of general cognitive processes, such as attention, to language-related skills.
Objective: The purpose of the present study was to explore how interference control (a subcomponent of selective attention) is affected in developmental dyslexia (DD) by means of control over simple stimulus-response mappings. Furthermore, we aimed to examine interference control in adults with DD across sensory modalities.
Methods: The performance of 14 dyslexic adults and 14 matched controls was compared on visual/auditory Simon tasks, in which conflict was presented in terms of an incongruent mapping between the location of a visual/auditory stimulus and the appropriate motor response.
Results: In the auditory task, dyslexic participants exhibited larger Simon effect costs; namely, they showed disproportionately larger reaction time (RT)/error costs when the auditory stimulus and response were incongruent, relative to the RT/error costs of non-impaired readers. In the visual Simon task, both groups showed Simon effect costs to the same extent.
Conclusion: These results indicate that control over auditory selective attention is exercised less effectively in those with DD than control over visual processing. The implications of this impaired process for the language-related skills of individuals with DD are discussed.
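The Simon effect cost reported in the Results is simply the congruency difference in mean reaction time (incongruent minus congruent). A minimal sketch with hypothetical RTs, not the study's data:

```python
import numpy as np

def simon_cost(rt_congruent, rt_incongruent):
    """Simon effect cost: mean RT on incongruent trials (stimulus
    location conflicts with the response side) minus mean RT on
    congruent trials, in the same units as the inputs."""
    return float(np.mean(rt_incongruent) - np.mean(rt_congruent))

# Hypothetical auditory-task RTs in ms; the larger cost for the second
# group illustrates the dyslexic pattern described above
control_cost = simon_cost([420, 435, 410], [450, 465, 455])    # ~35 ms
dyslexic_cost = simon_cost([430, 445, 425], [510, 530, 505])   # ~82 ms
```

The same computation on error rates gives the error-cost version of the effect.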
17
Laffere A, Dick F, Tierney A. Effects of auditory selective attention on neural phase: individual differences and short-term training. Neuroimage 2020; 213:116717. [PMID: 32165265 DOI: 10.1016/j.neuroimage.2020.116717] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.0] [Received: 11/18/2019] [Revised: 03/02/2020] [Accepted: 03/04/2020] [Indexed: 02/06/2023]
Abstract
How does the brain follow a sound that is mixed with others in a noisy environment? One possible strategy is to allocate attention to task-relevant time intervals. Prior work has linked auditory selective attention to alignment of neural modulations with stimulus temporal structure. However, since this prior research used relatively easy tasks and focused on analysis of main effects of attention across participants, relatively little is known about the neural foundations of individual differences in auditory selective attention. Here we investigated individual differences in auditory selective attention by asking participants to perform a 1-back task on a target auditory stream while ignoring a distractor auditory stream presented 180° out of phase. Neural entrainment to the attended auditory stream was strongly linked to individual differences in task performance. Some variability in performance was accounted for by degree of musical training, suggesting a link between long-term auditory experience and auditory selective attention. To investigate whether short-term improvements in auditory selective attention are possible, we gave participants 2 h of auditory selective attention training and found improvements in both task performance and enhancements of the effects of attention on neural phase angle. Our results suggest that although there exist large individual differences in auditory selective attention and attentional modulation of neural phase angle, this skill improves after a small amount of targeted training.
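Entrainment to an attended stream in studies like this one is often quantified as inter-trial phase coherence: the consistency of the Fourier phase at the stimulus rate across trials. The sketch below is a generic implementation of that measure, not the authors' analysis pipeline; sampling rate and frequency are illustrative.

```python
import numpy as np

def phase_locking(trials, fs, freq):
    """Inter-trial phase coherence at a target frequency: extract each
    trial's Fourier phase at `freq` and return the length of the mean
    resultant vector across trials (1 = perfectly consistent phase,
    ~0 = random phase)."""
    trials = np.asarray(trials, dtype=float)
    n = trials.shape[1]
    k = int(round(freq * n / fs))              # FFT bin nearest the target frequency
    coeffs = np.fft.rfft(trials, axis=1)[:, k]
    return float(np.abs(np.exp(1j * np.angle(coeffs)).mean()))
```

When two streams share a rate but are presented 180° out of phase, as here, the angle of the mean complex phase (rather than its length) indicates which stream the neural response is aligned with.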
Affiliation(s)
- Aeron Laffere
- Department of Psychological Sciences, Birkbeck, University of London, Malet Street, London, WC1E 7HX, UK
- Fred Dick
- Department of Psychological Sciences, Birkbeck, University of London, Malet Street, London, WC1E 7HX, UK; Division of Psychology & Language Sciences, UCL, Gower Street, London, WC1E 6BT, UK
- Adam Tierney
- Department of Psychological Sciences, Birkbeck, University of London, Malet Street, London, WC1E 7HX, UK
18
Harmon Z, Idemaru K, Kapatsinski V. Learning mechanisms in cue reweighting. Cognition 2019; 189:76-88. [PMID: 30928780 DOI: 10.1016/j.cognition.2019.03.011] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.6] [Received: 03/03/2018] [Revised: 03/16/2019] [Accepted: 03/20/2019] [Indexed: 10/27/2022]
Abstract
Feedback has been shown to be effective in shifting attention across perceptual cues to a phonological contrast in speech perception (Francis, Baldwin & Nusbaum, 2000). However, the learning mechanisms behind this process remain obscure. We compare the predictions of supervised error-driven learning (Rescorla & Wagner, 1972) and reinforcement learning (Sutton & Barto, 1998) using computational simulations. Supervised learning predicts downweighting of an informative cue when the learner receives evidence that it is no longer informative. In contrast, reinforcement learning suggests that a reduction in cue weight requires positive evidence for the informativeness of an alternative cue. Experimental evidence supports the latter prediction, implicating reinforcement learning as the mechanism behind the effect of feedback on cue weighting in speech perception. Native English listeners were exposed to either bimodal or unimodal VOT distributions spanning the unaspirated/aspirated boundary (bear/pear). VOT is the primary cue to initial stop voicing in English. However, lexical feedback in training indicated that VOT was no longer predictive of voicing. Reduction in the weight of VOT was observed only when participants could use an alternative cue, F0, to predict voicing. Frequency distributions had no effect on learning. Overall, the results suggest that attention shifting in learning the phonetic cues to phonological categories is accomplished using simple reinforcement learning principles that also guide the choice of actions in other domains.
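The supervised error-driven model the authors simulate can be sketched with the Rescorla-Wagner update rule: every cue present on a trial is adjusted in proportion to the shared prediction error. Under this rule, 50/50 feedback on a formerly reliable VOT cue drives its association back toward chance, the downweighting prediction described above. The learning rate and trial counts below are illustrative assumptions, not the paper's simulation settings.

```python
import numpy as np

def rescorla_wagner(trials, n_cues, alpha=0.1):
    """Supervised error-driven learning (Rescorla & Wagner, 1972): the
    association V of every cue present on a trial moves in proportion
    to the prediction error (outcome minus the summed prediction of
    all present cues)."""
    V = np.zeros(n_cues)
    for x, outcome in trials:
        x = np.asarray(x, dtype=float)
        V += alpha * x * (outcome - x @ V)
    return V

rng = np.random.default_rng(0)
vot = [1.0, 0.0]                     # cue vector: [long VOT present, F0 cue absent]
phase1 = [(vot, 1.0)] * 100          # long VOT reliably signals /p/ (outcome = 1)
phase2 = [(vot, float(rng.random() < 0.5)) for _ in range(200)]  # 50/50 feedback
V_before = rescorla_wagner(phase1, n_cues=2)          # VOT association near 1
V_after = rescorla_wagner(phase1 + phase2, n_cues=2)  # pulled back toward 0.5
```

Reinforcement learning, by contrast, would leave the VOT weight high until an alternative cue such as F0 started paying off, which is the pattern the experiments support.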
Affiliation(s)
- Zara Harmon
- Department of Linguistics, University of Oregon, United States
- Kaori Idemaru
- Department of East Asian Languages and Literatures, University of Oregon, United States