1
Kachlicka M, Tierney A. Voice actors show enhanced neural tracking of pitch, prosody perception, and music perception. Cortex 2024; 178:213-222. PMID: 39024939. DOI: 10.1016/j.cortex.2024.06.016.
Abstract
Experiences with sound that make strong demands on the precision of perception, such as musical training and experience speaking a tone language, can enhance auditory neural encoding. Are high demands on the precision of perception necessary for training to drive auditory neural plasticity? Voice actors are an ideal subject population for answering this question. Voice acting requires exaggerating prosodic cues to convey emotion, character, and linguistic structure, drawing upon attention to sound, memory for sound features, and accurate sound production, but not fine perceptual precision. Here we assessed neural encoding of pitch using the frequency-following response (FFR), as well as prosody, music, and sound perception, in voice actors and a matched group of non-actors. We find that the consistency of neural sound encoding, prosody perception, and musical phrase perception are all enhanced in voice actors, suggesting that a range of neural and behavioural auditory processing enhancements can result from training which lacks fine perceptual precision. However, fine discrimination was not enhanced in voice actors but was linked to degree of musical experience, suggesting that low-level auditory processing can only be enhanced by demanding perceptual training. These findings suggest that training which taxes attention, memory, and production but is not perceptually taxing may be a way to boost neural encoding of sound and auditory pattern detection in individuals with poor auditory skills.
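The "consistency of neural sound encoding" reported here is, in FFR work generally, often quantified as the correlation between averaged responses to two random halves of the trials. A minimal illustrative sketch with simulated data (the signals and parameters below are hypothetical, not the study's recordings or pipeline):

```python
import numpy as np

def split_half_consistency(trials, rng):
    """Pearson correlation between the averages of two random halves
    of the single-trial responses. trials: (n_trials, n_samples)."""
    order = rng.permutation(len(trials))
    half = len(trials) // 2
    a = trials[order[:half]].mean(axis=0)
    b = trials[order[half:]].mean(axis=0)
    return float(np.corrcoef(a, b)[0, 1])

# Simulated FFR-like data: a shared 100 Hz response plus trial-wise noise
rng = np.random.default_rng(2)
t = np.arange(500) / 5000.0                      # 100 ms at 5 kHz
signal = np.sin(2 * np.pi * 100 * t)
low_noise = signal + 0.5 * rng.normal(size=(200, 500))
high_noise = signal + 10.0 * rng.normal(size=(200, 500))
print(split_half_consistency(low_noise, rng))    # high r: consistent encoding
print(split_half_consistency(high_noise, rng))   # much lower r
```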
Affiliation(s)
- Magdalena Kachlicka
- School of Psychological Sciences, Birkbeck, University of London, London, UK
- Adam Tierney
- School of Psychological Sciences, Birkbeck, University of London, London, UK
2
Symons AE, Holt LL, Tierney AT. Informational masking influences segmental and suprasegmental speech categorization. Psychon Bull Rev 2024; 31:686-696. PMID: 37658222. PMCID: PMC11061029. DOI: 10.3758/s13423-023-02364-5.
Abstract
Auditory categorization requires listeners to integrate acoustic information from multiple dimensions. Attentional theories suggest that acoustic dimensions that are informative attract attention and therefore receive greater perceptual weight during categorization. However, the acoustic environment is often noisy, with multiple sound sources competing for listeners' attention. Amid these adverse conditions, attentional theories predict that listeners will distribute attention more evenly across multiple dimensions. Here we test this prediction using an informational masking paradigm. In two experiments, listeners completed suprasegmental (focus) and segmental (voicing) speech categorization tasks in quiet or in the presence of competing speech. In both experiments, the target speech consisted of short words or phrases that varied in the extent to which fundamental frequency (F0) and durational information signalled category identity. To isolate effects of informational masking, target and competing speech were presented in opposite ears. Across both experiments, there was substantial individual variability in the relative weighting of the two dimensions. These individual differences were consistent across listening conditions, suggesting that they reflect stable perceptual strategies. Consistent with attentional theories of auditory categorization, listeners who relied on a single primary dimension in quiet shifted towards integrating across multiple dimensions in the presence of competing speech. These findings demonstrate that listeners make greater use of the redundancy present in speech when attentional resources are limited.
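Relative perceptual weights of the kind described (F0 vs. duration) are typically estimated by regressing listeners' binary category responses on the standardized stimulus dimensions and normalizing the coefficient magnitudes. A minimal sketch using a simulated listener (the data and parameters are hypothetical, not the study's analysis):

```python
import numpy as np

def cue_weights(f0, dur, responses):
    """Estimate the normalized perceptual weight of F0 relative to duration
    from binary category responses, via logistic regression fit with plain
    gradient ascent (no external dependencies). Returns a value in [0, 1]."""
    X = np.column_stack([np.ones_like(f0),
                         (f0 - f0.mean()) / f0.std(),
                         (dur - dur.mean()) / dur.std()])
    w = np.zeros(3)
    for _ in range(5000):                       # batch gradient ascent
        p = 1.0 / (1.0 + np.exp(-X @ w))        # predicted P(category A)
        w += 0.5 * X.T @ (responses - p) / len(f0)
    b_f0, b_dur = abs(w[1]), abs(w[2])
    return b_f0 / (b_f0 + b_dur)

# Simulated listener who relies mostly on F0 and only weakly on duration
rng = np.random.default_rng(0)
f0 = rng.normal(size=500)
dur = rng.normal(size=500)
logit = 3.0 * f0 + 0.5 * dur
resp = (rng.random(500) < 1 / (1 + np.exp(-logit))).astype(float)
print(round(cue_weights(f0, dur, resp), 2))     # near 3/(3+0.5) ≈ 0.86
```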
Affiliation(s)
- A E Symons
- Department of Psychological Sciences, Birkbeck, University of London, London, UK
- L L Holt
- Department of Psychology and Neuroscience Institute, Carnegie Mellon University, 500 Forbes Avenue, Pittsburgh, PA, USA
- A T Tierney
- Department of Psychological Sciences, Birkbeck, University of London, London, UK
3
Petrova K, Jasmin K, Saito K, Tierney AT. Extensive residence in a second language environment modifies perceptual strategies for suprasegmental categorization. J Exp Psychol Learn Mem Cogn 2023; 49:1943-1955. PMID: 38127498. PMCID: PMC10734206. DOI: 10.1037/xlm0001246.
Abstract
Languages differ in the importance of acoustic dimensions for speech categorization. This poses a potential challenge for second language (L2) learners, and the extent to which adult L2 learners can acquire new perceptual strategies for speech categorization remains unclear. This study investigated the effects of extensive English L2 immersion on speech perception strategies and dimension-selective attention ability in native Mandarin speakers. Experienced first language (L1) Mandarin speakers (length of U.K. residence > 3 years) demonstrated more native-like weighting of cues to L2 suprasegmental categorization relative to inexperienced Mandarin speakers (length of residence < 1 year), weighting duration more highly. However, both the experienced and the inexperienced Mandarin speakers continued to weight duration less highly and pitch more highly during musical beat categorization and struggled to ignore pitch and selectively attend to amplitude in speech, relative to native English speakers. These results suggest that adult L2 experience can lead to retuning of perceptual strategies in specific contexts, but global acoustic salience is more resistant to change.
Affiliation(s)
- Katya Petrova
- Department of Culture, Communication & Media, Institute of Education, University College London
- Kyle Jasmin
- Department of Psychology, Royal Holloway, University of London
- Kazuya Saito
- Department of Culture, Communication & Media, Institute of Education, University College London
- Adam T Tierney
- Department of Psychological Sciences, Birkbeck, University of London
4
Liu J, Hilton CB, Bergelson E, Mehr SA. Language experience predicts music processing in a half-million speakers of fifty-four languages. Curr Biol 2023; 33:1916-1925.e4. PMID: 37105166. PMCID: PMC10306420. DOI: 10.1016/j.cub.2023.03.067.
Abstract
Tonal languages differ from other languages in their use of pitch (tones) to distinguish words. Lifelong experience speaking and hearing tonal languages has been argued to shape auditory processing in ways that generalize beyond the perception of linguistic pitch to the perception of pitch in other domains like music. We conducted a meta-analysis of prior studies testing this idea, finding moderate evidence supporting it. But prior studies were limited by mostly small sample sizes representing a small number of languages and countries, making it challenging to disentangle the effects of linguistic experience from variability in music training, cultural differences, and other potential confounds. To address these issues, we used web-based citizen science to assess music perception skill on a global scale in 34,034 native speakers of 19 tonal languages (e.g., Mandarin, Yoruba). We compared their performance to 459,066 native speakers of other languages, including 6 pitch-accented (e.g., Japanese) and 29 non-tonal languages (e.g., Hungarian). Whether or not participants had taken music lessons, native speakers of all 19 tonal languages had an improved ability to discriminate musical melodies on average, relative to speakers of non-tonal languages. But this improvement came with a trade-off: tonal language speakers were also worse at processing the musical beat. The results, which held across native speakers of many diverse languages and were robust to geographic and demographic variation, demonstrate that linguistic experience shapes music perception, with implications for relations between music, language, and culture in the human mind.
Affiliation(s)
- Jingxuan Liu
- Columbia Business School, Columbia University, 665 W 130th Street, New York, NY 10027, USA; Department of Psychology & Neuroscience, Duke University, 417 Chapel Drive, Durham, NC 27708, USA
- Courtney B Hilton
- Yale Child Study Center, Yale University, 300 George Street #900, New Haven, CT 06511, USA; School of Psychology, University of Auckland, 23 Symonds Street, Auckland 1010, New Zealand
- Elika Bergelson
- Department of Psychology & Neuroscience, Duke University, 417 Chapel Drive, Durham, NC 27708, USA
- Samuel A Mehr
- Yale Child Study Center, Yale University, 300 George Street #900, New Haven, CT 06511, USA; School of Psychology, University of Auckland, 23 Symonds Street, Auckland 1010, New Zealand
5
Jasmin K, Tierney A, Obasih C, Holt L. Short-term perceptual reweighting in suprasegmental categorization. Psychon Bull Rev 2023; 30:373-382. PMID: 35915382. PMCID: PMC9971089. DOI: 10.3758/s13423-022-02146-5.
Abstract
Segmental speech units such as phonemes are described as multidimensional categories whose perception involves contributions from multiple acoustic input dimensions, and the relative perceptual weights of these dimensions respond dynamically to context. For example, when speech is altered to create an "accent" in which two acoustic dimensions are correlated in a manner opposite that of long-term experience, the dimension that carries less perceptual weight is down-weighted to contribute less in category decisions. It remains unclear, however, whether this short-term reweighting extends to perception of suprasegmental features that span multiple phonemes, syllables, or words, in part because it has remained debatable whether suprasegmental features are perceived categorically. Here, we investigated the relative contribution of two acoustic dimensions to word emphasis. Participants categorized instances of a two-word phrase pronounced with typical covariation of fundamental frequency (F0) and duration, and in the context of an artificial "accent" in which F0 and duration (established in prior research on English speech as "primary" and "secondary" dimensions, respectively) covaried atypically. When categorizing "accented" speech, listeners rapidly down-weighted the secondary dimension (duration). This result indicates that listeners continually track short-term regularities across speech input and dynamically adjust the weight of acoustic evidence for suprasegmental decisions. Thus, dimension-based statistical learning appears to be a widespread phenomenon in speech perception extending to both segmental and suprasegmental categorization.
Affiliation(s)
- Kyle Jasmin
- Department of Psychology, Wolfson Building, Royal Holloway, University of London, Egham, Surrey, TW20 0EX, UK
- Lori Holt
- Carnegie Mellon University, Pittsburgh, PA, USA
6
Saito K, Kachlicka M, Suzukida Y, Petrova K, Lee BJ, Tierney A. Auditory precision hypothesis-L2: Dimension-specific relationships between auditory processing and second language segmental learning. Cognition 2022; 229:105236. PMID: 36027789. DOI: 10.1016/j.cognition.2022.105236.
Abstract
Growing evidence suggests a broad relationship between individual differences in auditory processing ability and the rate and ultimate attainment of language acquisition throughout the lifespan, including post-pubertal second language (L2) speech learning. However, little is known about how the precision of processing of specific auditory dimensions relates to the acquisition of specific L2 segmental contrasts. In the context of 100 late Japanese-English bilinguals with diverse profiles of classroom and immersion experience, the current study set out to investigate the link between the perception of several auditory dimensions (F3 frequency, F2 frequency, and duration) in non-verbal sounds and English [r]-[l] perception and production proficiency. Whereas participants' biographical factors (the presence/absence of immersion) accounted for a large amount of variance in the success of learning this contrast, the outcomes were also tied to their acuity to the most reliable, new auditory cues (F3 variation) and the less reliable but already-familiar cues (F2 variation). This finding suggests that individuals can vary in terms of how they perceive, utilize, and make the most of information conveyed by specific acoustic dimensions. When perceiving more naturalistic spoken input, where speech contrasts can be distinguished via a combination of numerous cues, some can attain a high level of L2 speech proficiency by using nativelike and/or non-nativelike strategies in a complementary fashion.
7
Symons AE, Dick F, Tierney AT. Dimension-selective attention and dimensional salience modulate cortical tracking of acoustic dimensions. Neuroimage 2021; 244:118544. PMID: 34492294. DOI: 10.1016/j.neuroimage.2021.118544.
Abstract
Some theories of auditory categorization suggest that auditory dimensions that are strongly diagnostic for particular categories - for instance voice onset time or fundamental frequency in the case of some spoken consonants - attract attention. However, prior cognitive neuroscience research on auditory selective attention has largely focused on attention to simple auditory objects or streams, and so little is known about the neural mechanisms that underpin dimension-selective attention, or how the relative salience of variations along these dimensions might modulate neural signatures of attention. Here we investigate whether dimensional salience and dimension-selective attention modulate the cortical tracking of acoustic dimensions. In two experiments, participants listened to tone sequences varying in pitch and spectral peak frequency; these two dimensions changed at different rates. Inter-trial phase coherence (ITPC) and amplitude of the EEG signal at the frequencies tagged to pitch and spectral changes provided a measure of cortical tracking of these dimensions. In Experiment 1, tone sequences varied in the size of the pitch intervals, while the size of spectral peak intervals remained constant. Cortical tracking of pitch changes was greater for sequences with larger compared to smaller pitch intervals, with no difference in cortical tracking of spectral peak changes. In Experiment 2, participants selectively attended to either pitch or spectral peak. Cortical tracking was stronger in response to the attended compared to unattended dimension for both pitch and spectral peak. These findings suggest that attention can enhance the cortical tracking of specific acoustic dimensions rather than simply enhancing tracking of the auditory object as a whole.
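Inter-trial phase coherence at a tagged frequency is the length of the mean unit-phase vector across trials: 1 when the response is perfectly phase-locked, near 0 when phase is random. A minimal sketch with simulated epochs (the sampling rate, tag frequencies, and trial counts are hypothetical, not those of the experiments):

```python
import numpy as np

def itpc(epochs, sfreq, freq):
    """Inter-trial phase coherence at one frequency.
    epochs: (n_trials, n_samples) array; returns a value in [0, 1]."""
    spectrum = np.fft.rfft(epochs, axis=1)
    bin_idx = int(round(freq * epochs.shape[1] / sfreq))
    phases = np.angle(spectrum[:, bin_idx])
    return float(np.abs(np.mean(np.exp(1j * phases))))

# Simulated EEG: a phase-locked 2 Hz "tag" plus random-phase 7 Hz activity
rng = np.random.default_rng(1)
sfreq, n_samp, n_trials = 100, 400, 60          # 4-second epochs
t = np.arange(n_samp) / sfreq
epochs = np.array([np.sin(2 * np.pi * 2 * t)    # same phase on every trial
                   + np.sin(2 * np.pi * 7 * t + rng.uniform(0, 2 * np.pi))
                   for _ in range(n_trials)])
print(itpc(epochs, sfreq, 2.0))   # near 1: phase-locked across trials
print(itpc(epochs, sfreq, 7.0))   # near 0: random phase across trials
```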
Affiliation(s)
- Ashley E Symons
- Department of Psychological Sciences, Birkbeck College, University of London, UK
- Fred Dick
- Department of Psychological Sciences, Birkbeck College, University of London, UK; Division of Psychology & Language Sciences, University College London, UK
- Adam T Tierney
- Department of Psychological Sciences, Birkbeck College, University of London, UK
8
Jasmin K, Dick F, Tierney AT. The Multidimensional Battery of Prosody Perception (MBOPP). Wellcome Open Res 2021; 5:4. PMID: 35282675. PMCID: PMC8881696. DOI: 10.12688/wellcomeopenres.15607.2.
Abstract
Prosody can be defined as the rhythm and intonation patterns spanning words, phrases and sentences. Accurate perception of prosody is an important component of many aspects of language processing, such as parsing grammatical structures, recognizing words, and determining where emphasis may be placed. Prosody perception is important for language acquisition and can be impaired in language-related developmental disorders. However, existing assessments of prosodic perception suffer from some shortcomings. These include being unsuitable for use with typically developing adults due to ceiling effects and failing to allow the investigator to distinguish the unique contributions of individual acoustic features such as pitch and temporal cues. Here we present the Multi-Dimensional Battery of Prosody Perception (MBOPP), a novel tool for the assessment of prosody perception. It consists of two subtests: Linguistic Focus, which measures the ability to hear emphasis or sentential stress, and Phrase Boundaries, which measures the ability to hear where in a compound sentence one phrase ends, and another begins. Perception of individual acoustic dimensions (Pitch and Duration) can be examined separately, and test difficulty can be precisely calibrated by the experimenter because stimuli were created using a continuous voice morph space. We present validation analyses from a sample of 59 individuals and discuss how the battery might be deployed to examine perception of prosody in various populations.
Affiliation(s)
- Kyle Jasmin
- Department of Psychology, Royal Holloway, University of London, Egham, TW20 0EX, UK
- Frederic Dick
- Psychological Sciences, Birkbeck, University of London, London, WC1E 7HX, UK
9
Jasmin K, Dick F, Tierney AT. The Multidimensional Battery of Prosody Perception (MBOPP). Wellcome Open Res 2021; 5:4. PMID: 35282675. PMCID: PMC8881696. DOI: 10.12688/wellcomeopenres.15607.1. (Version 1 of the preceding record.)
10
Beccacece L, Abondio P, Cilli E, Restani D, Luiselli D. Human Genomics and the Biocultural Origin of Music. Int J Mol Sci 2021; 22:5397. PMID: 34065521. PMCID: PMC8160972. DOI: 10.3390/ijms22105397.
Abstract
Music is an exclusive feature of humankind. It can be considered a form of universal communication, only partly comparable to the vocalizations of songbirds. Much research in this field addresses the origins of music, as well as the genetic bases of musicality. On one hand, several hypotheses have been proposed about the evolution of music and its role, but the matter remains debated, and comparative studies suggest a gradual evolution of some abilities underlying musicality in primates. On the other hand, genome-wide studies highlight several genes associated with musical aptitude, confirming a genetic basis for the different musical skills humans show. Moreover, some genes associated with musicality are also involved in singing and song learning in songbirds, suggesting a likely evolutionary convergence between humans and songbirds. This comprehensive review presents the concept of music as a sociocultural manifestation within the current debate about its biocultural origin and evolutionary function, in the context of the most recent discoveries related to the cross-species genetics of musical production and perception.
Affiliation(s)
- Livia Beccacece
- Laboratory of Molecular Anthropology, Department of Biological, Geological and Environmental Sciences, University of Bologna, 40126 Bologna, Italy
- Paolo Abondio
- Laboratory of Molecular Anthropology, Department of Biological, Geological and Environmental Sciences, University of Bologna, 40126 Bologna, Italy
- Elisabetta Cilli
- Department of Cultural Heritage, University of Bologna—Ravenna Campus, 48121 Ravenna, Italy
- Donatella Restani
- Department of Cultural Heritage, University of Bologna—Ravenna Campus, 48121 Ravenna, Italy
- Donata Luiselli
- Department of Cultural Heritage, University of Bologna—Ravenna Campus, 48121 Ravenna, Italy
11
Jasmin K, Sun H, Tierney AT. Effects of language experience on domain-general perceptual strategies. Cognition 2020; 206:104481. PMID: 33075568. DOI: 10.1016/j.cognition.2020.104481.
Abstract
Speech and music are highly redundant communication systems, with multiple acoustic cues signaling the existence of perceptual categories. This redundancy makes these systems robust to the influence of noise, but necessitates the development of perceptual strategies: listeners need to decide how much importance to place on each source of information. Prior empirical work and modeling has suggested that cue weights primarily reflect within-task statistical learning, as listeners assess the reliability with which different acoustic dimensions signal a category and modify their weights accordingly. Here we present evidence that perceptual experience can lead to changes in cue weighting that extend across tasks and across domains, suggesting that perceptual strategies reflect both global biases and local (i.e. task-specific) learning. In two experiments, native speakers of Mandarin (N = 45)-where pitch is a crucial cue to word identity-placed more importance on pitch and less importance on other dimensions compared to native speakers of non-tonal languages English (N = 45) and Spanish (N = 27), during the perception of both English speech and musical beats. In a third experiment, we further show that Mandarin speakers are better able to attend to pitch and ignore irrelevant variation in other dimensions in speech compared to English and Spanish speakers, and even struggle to ignore pitch when asked to attend to other dimensions. Thus, an individual's idiosyncratic auditory perceptual strategy reflects a complex mixture of congenital predispositions, task-specific learning, and biases instilled by extensive experience in making use of important dimensions in their native language.
Affiliation(s)
- Kyle Jasmin
- Department of Psychological Sciences, Birkbeck College, University of London, UK
- Hui Sun
- Department of Psychological Sciences, Birkbeck College, University of London, UK
- Adam T Tierney
- Department of Psychological Sciences, Birkbeck College, University of London, UK
12
Jasmin K, Dick F, Stewart L, Tierney AT. Altered functional connectivity during speech perception in congenital amusia. eLife 2020; 9:e53539. PMID: 32762842. PMCID: PMC7449693. DOI: 10.7554/elife.53539.
Abstract
Individuals with congenital amusia have a lifelong history of unreliable pitch processing. Accordingly, they downweight pitch cues during speech perception and instead rely on other dimensions such as duration. We investigated the neural basis for this strategy. During fMRI, individuals with amusia (N = 15) and controls (N = 15) read sentences where a comma indicated a grammatical phrase boundary. They then heard two spoken sentences that differed only in pitch and/or duration cues and selected the best match for the written sentence. Prominent reductions in functional connectivity were detected in the amusia group between left prefrontal language-related regions and right hemisphere pitch-related regions, reductions that mirrored the between-group differences in cue weights in the same listeners. Connectivity differences between these regions were not present during a control task. Our results indicate that the reliability of perceptual dimensions is linked with functional connectivity between frontal and perceptual regions and suggest a compensatory mechanism.
Affiliation(s)
- Kyle Jasmin
- Department of Psychological Sciences, Birkbeck, University of London, London, United Kingdom
- UCL Institute of Cognitive Neuroscience, University College London, London, United Kingdom
- Frederic Dick
- Department of Psychological Sciences, Birkbeck, University of London, London, United Kingdom
- Department of Experimental Psychology, University College London, London, United Kingdom
- Lauren Stewart
- Department of Psychology, Goldsmiths, University of London, London, United Kingdom
- Adam Taylor Tierney
- Department of Psychological Sciences, Birkbeck, University of London, London, United Kingdom
13
Greenlaw KM, Puschmann S, Coffey EBJ. Decoding of Envelope vs. Fundamental Frequency During Complex Auditory Stream Segregation. Neurobiol Lang 2020; 1:268-287. PMID: 37215227. PMCID: PMC10158587. DOI: 10.1162/nol_a_00013.
Abstract
Hearing-in-noise perception is a challenging task that is critical to human function, but how the brain accomplishes it is not well understood. A candidate mechanism proposes that the neural representation of an attended auditory stream is enhanced relative to background sound via a combination of bottom-up and top-down mechanisms. To date, few studies have compared neural representation and its task-related enhancement across frequency bands that carry different auditory information, such as a sound's amplitude envelope (i.e., syllabic rate or rhythm; 1-9 Hz), and the fundamental frequency of periodic stimuli (i.e., pitch; >40 Hz). Furthermore, hearing-in-noise in the real world is frequently both messier and richer than the majority of tasks used in its study. In the present study, we use continuous sound excerpts that simultaneously offer predictive, visual, and spatial cues to help listeners separate the target from four acoustically similar simultaneously presented sound streams. We show that while both lower and higher frequency information about the entire sound stream is represented in the brain's response, the to-be-attended sound stream is strongly enhanced only in the slower, lower frequency sound representations. These results are consistent with the hypothesis that attended sound representations are strengthened progressively at higher level, later processing stages, and that the interaction of multiple brain systems can aid in this process. Our findings contribute to our understanding of auditory stream separation in difficult, naturalistic listening conditions and demonstrate that pitch and envelope information can be decoded from single-channel EEG data.
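The two frequency bands compared here (envelope-rate, 1-9 Hz, vs. fundamental frequency, >40 Hz) can be illustrated with a simple spectral band-power split. A toy sketch (the frequencies and amplitudes are hypothetical, and this is only band power, not the paper's stream-decoding analysis):

```python
import numpy as np

def band_power(signal, sfreq, lo, hi):
    """Fraction of total power falling in [lo, hi] Hz, via the FFT spectrum."""
    spec = np.abs(np.fft.rfft(signal)) ** 2
    freqs = np.fft.rfftfreq(len(signal), 1 / sfreq)
    band = spec[(freqs >= lo) & (freqs <= hi)].sum()
    return float(band / spec.sum())

# Simulated response: a slow 4 Hz "envelope" component plus a 100 Hz
# F0-following component (amplitudes chosen for illustration)
sfreq = 1000
t = np.arange(5000) / sfreq                       # 5 s of signal
sig = 2.0 * np.sin(2 * np.pi * 4 * t) + 1.0 * np.sin(2 * np.pi * 100 * t)
print(band_power(sig, sfreq, 1, 9))     # envelope band: 4/(4+1) = 0.8
print(band_power(sig, sfreq, 40, 500))  # F0 band: 1/(4+1) = 0.2
```

Power scales with amplitude squared, so the 2:1 amplitude ratio yields the 0.8/0.2 split.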
Affiliation(s)
- Keelin M. Greenlaw
- Department of Psychology, Concordia University, Montreal, QC, Canada
- International Laboratory for Brain, Music and Sound Research (BRAMS)
- The Centre for Research on Brain, Language and Music (CRBLM)
14
Parthasarathy A, Hancock KE, Bennett K, DeGruttola V, Polley DB. Bottom-up and top-down neural signatures of disordered multi-talker speech perception in adults with normal hearing. eLife 2020; 9:e51419. PMID: 31961322. PMCID: PMC6974362. DOI: 10.7554/elife.51419.
Abstract
In social settings, speech waveforms from nearby speakers mix together in our ear canals. Normally, the brain unmixes the attended speech stream from the chorus of background speakers using a combination of fast temporal processing and cognitive active listening mechanisms. Of >100,000 patient records, ~10% of adults visited our clinic because of reduced hearing, only to learn that their hearing was clinically normal and should not cause communication difficulties. We found that multi-talker speech intelligibility thresholds varied widely in normal hearing adults, but could be predicted from neural phase-locking to frequency modulation (FM) cues measured with ear canal EEG recordings. Combining neural temporal fine structure processing, pupil-indexed listening effort, and behavioral FM thresholds accounted for 78% of the variability in multi-talker speech intelligibility. The disordered bottom-up and top-down markers of poor multi-talker speech perception identified here could inform the design of next-generation clinical tests for hidden hearing disorders.
Affiliation(s)
- Aravindakshan Parthasarathy
- Eaton-Peabody Laboratories, Massachusetts Eye and Ear Infirmary, Boston, United States
- Department of Otolaryngology – Head and Neck Surgery, Harvard Medical School, Boston, United States
- Kenneth E Hancock
- Eaton-Peabody Laboratories, Massachusetts Eye and Ear Infirmary, Boston, United States
- Department of Otolaryngology – Head and Neck Surgery, Harvard Medical School, Boston, United States
- Kara Bennett
- Bennett Statistical Consulting Inc, Ballston, United States
- Victor DeGruttola
- Department of Biostatistics, Harvard TH Chan School of Public Health, Boston, United States
- Daniel B Polley
- Eaton-Peabody Laboratories, Massachusetts Eye and Ear Infirmary, Boston, United States
- Department of Otolaryngology – Head and Neck Surgery, Harvard Medical School, Boston, United States