1
Park JJ, Baek SC, Suh MW, Choi J, Kim SJ, Lim Y. The effect of topic familiarity and volatility of auditory scene on selective auditory attention. Hear Res 2023; 433:108770. [PMID: 37104990 DOI: 10.1016/j.heares.2023.108770] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/15/2022] [Revised: 04/06/2023] [Accepted: 04/15/2023] [Indexed: 04/29/2023]
Abstract
Selective auditory attention has been shown to modulate the cortical representation of speech. This effect has been well documented in acoustically more challenging environments. However, the influence of top-down factors, in particular topic familiarity, on this process remains unclear, despite evidence that semantic information can promote speech-in-noise perception. Apart from individual features forming a static listening condition, dynamic and irregular changes of auditory scenes (volatile listening environments) have been less studied. To address these gaps, we explored the influence of topic familiarity and volatile listening on the selective auditory attention process during dichotic listening using electroencephalography. When stories with unfamiliar topics were presented, participants' comprehension was severely degraded. However, their cortical activity selectively tracked the speech of the target story well. This implies that topic familiarity hardly influences the speech tracking neural index, possibly when the bottom-up information is sufficient. However, when the listening environment was volatile and the listeners had to re-engage in new speech whenever auditory scenes altered, the neural correlates of the attended speech were degraded. In particular, the cortical response to the attended speech and the spatial asymmetry of the response to the left and right attention were significantly attenuated around 100-200 ms after the speech onset. These findings suggest that volatile listening environments could adversely affect the modulation effect of selective attention, possibly by hampering proper attention due to increased perceptual load.
Affiliation(s)
- Jonghwa Jeonglok Park
- Center for Intelligent & Interactive Robotics, Artificial Intelligence and Robot Institute, Korea Institute of Science and Technology, Seoul 02792, South Korea; Department of Electrical and Computer Engineering, College of Engineering, Seoul National University, Seoul 08826, South Korea
- Seung-Cheol Baek
- Center for Intelligent & Interactive Robotics, Artificial Intelligence and Robot Institute, Korea Institute of Science and Technology, Seoul 02792, South Korea; Research Group Neurocognition of Music and Language, Max Planck Institute for Empirical Aesthetics, Grüneburgweg 14, Frankfurt am Main 60322, Germany
- Myung-Whan Suh
- Department of Otorhinolaryngology-Head and Neck Surgery, Seoul National University Hospital, Seoul 03080, South Korea
- Jongsuk Choi
- Center for Intelligent & Interactive Robotics, Artificial Intelligence and Robot Institute, Korea Institute of Science and Technology, Seoul 02792, South Korea; Department of AI Robotics, KIST School, Korea University of Science and Technology, Seoul 02792, South Korea
- Sung June Kim
- Department of Electrical and Computer Engineering, College of Engineering, Seoul National University, Seoul 08826, South Korea
- Yoonseob Lim
- Center for Intelligent & Interactive Robotics, Artificial Intelligence and Robot Institute, Korea Institute of Science and Technology, Seoul 02792, South Korea; Department of HY-KIST Bio-convergence, Hanyang University, Seoul 04763, South Korea

2
Simon JZ, Commuri V, Kulasingham JP. Time-locked auditory cortical responses in the high-gamma band: A window into primary auditory cortex. Front Neurosci 2022; 16:1075369. [PMID: 36570848 PMCID: PMC9773383 DOI: 10.3389/fnins.2022.1075369] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 10/20/2022] [Accepted: 11/24/2022] [Indexed: 12/13/2022]
Abstract
Primary auditory cortex is a critical stage in the human auditory pathway, a gateway between subcortical and higher-level cortical areas. Receiving the output of all subcortical processing, it sends its output on to higher-level cortex. Non-invasive physiological recordings of primary auditory cortex using electroencephalography (EEG) and magnetoencephalography (MEG), however, may not have sufficient specificity to separate responses generated in primary auditory cortex from those generated in underlying subcortical areas or neighboring cortical areas. This limitation is important for investigations of effects of top-down processing (e.g., selective-attention-based) on primary auditory cortex: higher-level areas are known to be strongly influenced by top-down processes, but subcortical areas are often assumed to perform strictly bottom-up processing. Fortunately, recent advances have made it easier to isolate the neural activity of primary auditory cortex from other areas. In this perspective, we focus on time-locked responses to stimulus features in the high gamma band (70-150 Hz) and with early cortical latency (∼40 ms), intermediate between subcortical and higher-level areas. We review recent findings from physiological studies employing either repeated simple sounds or continuous speech, obtaining either a frequency following response (FFR) or temporal response function (TRF). The potential roles of top-down processing are underscored, and comparisons with invasive intracranial EEG (iEEG) and animal model recordings are made. We argue that MEG studies employing continuous speech stimuli may offer particular benefits, in that only a few minutes of speech generates robust high gamma responses from bilateral primary auditory cortex, and without measurable interference from subcortical or higher-level areas.
Affiliation(s)
- Jonathan Z. Simon
- Department of Electrical and Computer Engineering, University of Maryland, College Park, College Park, MD, United States; Department of Biology, University of Maryland, College Park, College Park, MD, United States; Institute for Systems Research, University of Maryland, College Park, College Park, MD, United States; correspondence: Jonathan Z. Simon
- Vrishab Commuri
- Department of Electrical and Computer Engineering, University of Maryland, College Park, College Park, MD, United States

3
Aller M, Økland HS, MacGregor LJ, Blank H, Davis MH. Differential Auditory and Visual Phase-Locking Are Observed during Audio-Visual Benefit and Silent Lip-Reading for Speech Perception. J Neurosci 2022; 42:6108-6120. [PMID: 35760528 PMCID: PMC9351641 DOI: 10.1523/jneurosci.2476-21.2022] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 12/16/2021] [Revised: 04/04/2022] [Accepted: 04/12/2022] [Indexed: 11/21/2022]
Abstract
Speech perception in noisy environments is enhanced by seeing facial movements of communication partners. However, the neural mechanisms by which audio and visual speech are combined are not fully understood. We explore MEG phase-locking to auditory and visual signals in MEG recordings from 14 human participants (6 females, 8 males) that reported words from single spoken sentences. We manipulated the acoustic clarity and visual speech signals such that critical speech information is present in auditory, visual, or both modalities. MEG coherence analysis revealed that both auditory and visual speech envelopes (auditory amplitude modulations and lip aperture changes) were phase-locked to 2-6 Hz brain responses in auditory and visual cortex, consistent with entrainment to syllable-rate components. Partial coherence analysis was used to separate neural responses to correlated audio-visual signals and showed non-zero phase-locking to auditory envelope in occipital cortex during audio-visual (AV) speech. Furthermore, phase-locking to auditory signals in visual cortex was enhanced for AV speech compared with audio-only speech that was matched for intelligibility. Conversely, auditory regions of the superior temporal gyrus did not show above-chance partial coherence with visual speech signals during AV conditions but did show partial coherence in visual-only conditions. Hence, visual speech enabled stronger phase-locking to auditory signals in visual areas, whereas phase-locking of visual speech in auditory regions only occurred during silent lip-reading. Differences in these cross-modal interactions between auditory and visual speech signals are interpreted in line with cross-modal predictive mechanisms during speech perception.
SIGNIFICANCE STATEMENT: Verbal communication in noisy environments is challenging, especially for hearing-impaired individuals. Seeing facial movements of communication partners improves speech perception when auditory signals are degraded or absent. The neural mechanisms supporting lip-reading or audio-visual benefit are not fully understood. Using MEG recordings and partial coherence analysis, we show that speech information is used differently in brain regions that respond to auditory and visual speech. While visual areas use visual speech to improve phase-locking to auditory speech signals, auditory areas do not show phase-locking to visual speech unless auditory speech is absent and visual speech is used to substitute for missing auditory signals. These findings highlight brain processes that combine visual and auditory signals to support speech understanding.
Affiliation(s)
- Máté Aller
- MRC Cognition and Brain Sciences Unit, University of Cambridge, Cambridge, CB2 7EF, United Kingdom
- Heidi Solberg Økland
- MRC Cognition and Brain Sciences Unit, University of Cambridge, Cambridge, CB2 7EF, United Kingdom
- Lucy J MacGregor
- MRC Cognition and Brain Sciences Unit, University of Cambridge, Cambridge, CB2 7EF, United Kingdom
- Helen Blank
- University Medical Center Hamburg-Eppendorf, Hamburg, 20246, Germany
- Matthew H Davis
- MRC Cognition and Brain Sciences Unit, University of Cambridge, Cambridge, CB2 7EF, United Kingdom

4
Corcoran AW, Perera R, Koroma M, Kouider S, Hohwy J, Andrillon T. Expectations boost the reconstruction of auditory features from electrophysiological responses to noisy speech. Cereb Cortex 2022; 33:691-708. [PMID: 35253871 PMCID: PMC9890472 DOI: 10.1093/cercor/bhac094] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 11/17/2021] [Revised: 02/11/2022] [Accepted: 02/12/2022] [Indexed: 02/04/2023]
Abstract
Online speech processing imposes significant computational demands on the listening brain, the underlying mechanisms of which remain poorly understood. Here, we exploit the perceptual "pop-out" phenomenon (i.e. the dramatic improvement of speech intelligibility after receiving information about speech content) to investigate the neurophysiological effects of prior expectations on degraded speech comprehension. We recorded electroencephalography (EEG) and pupillometry from 21 adults while they rated the clarity of noise-vocoded and sine-wave synthesized sentences. Pop-out was reliably elicited following visual presentation of the corresponding written sentence, but not following incongruent or neutral text. Pop-out was associated with improved reconstruction of the acoustic stimulus envelope from low-frequency EEG activity, implying that improvements in perceptual clarity were mediated via top-down signals that enhanced the quality of cortical speech representations. Spectral analysis further revealed that pop-out was accompanied by a reduction in theta-band power, consistent with predictive coding accounts of acoustic filling-in and incremental sentence processing. Moreover, delta-band power, alpha-band power, and pupil diameter were all increased following the provision of any written sentence information, irrespective of content. Together, these findings reveal distinctive profiles of neurophysiological activity that differentiate the content-specific processes associated with degraded speech comprehension from the context-specific processes invoked under adverse listening conditions.
Affiliation(s)
- Andrew W Corcoran
- Corresponding author: Room E672, 20 Chancellors Walk, Clayton, VIC 3800, Australia.
- Ricardo Perera
- Cognition & Philosophy Laboratory, School of Philosophical, Historical, and International Studies, Monash University, Melbourne, VIC 3800 Australia
- Matthieu Koroma
- Brain and Consciousness Group (ENS, EHESS, CNRS), Département d’Études Cognitives, École Normale Supérieure-PSL Research University, Paris 75005, France
- Sid Kouider
- Brain and Consciousness Group (ENS, EHESS, CNRS), Département d’Études Cognitives, École Normale Supérieure-PSL Research University, Paris 75005, France
- Jakob Hohwy
- Cognition & Philosophy Laboratory, School of Philosophical, Historical, and International Studies, Monash University, Melbourne, VIC 3800 Australia; Monash Centre for Consciousness & Contemplative Studies, Monash University, Melbourne, VIC 3800 Australia
- Thomas Andrillon
- Monash Centre for Consciousness & Contemplative Studies, Monash University, Melbourne, VIC 3800 Australia; Paris Brain Institute, Sorbonne Université, Inserm-CNRS, Paris 75013, France

5
Regev M, Halpern AR, Owen AM, Patel AD, Zatorre RJ. Mapping Specific Mental Content during Musical Imagery. Cereb Cortex 2021; 31:3622-3640. [PMID: 33749742 DOI: 10.1093/cercor/bhab036] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 11/22/2020] [Revised: 02/02/2021] [Accepted: 02/05/2021] [Indexed: 11/12/2022]
Abstract
Humans can mentally represent auditory information without an external stimulus, but the specificity of these internal representations remains unclear. Here, we asked how similar the temporally unfolding neural representations of imagined music are compared to those during the original perceived experience. We also tested whether rhythmic motion can influence the neural representation of music during imagery as during perception. Participants first memorized six 1-min-long instrumental musical pieces with high accuracy. Functional MRI data were collected during: 1) silent imagery of melodies to the beat of a visual metronome; 2) same but while tapping to the beat; and 3) passive listening. During imagery, inter-subject correlation analysis showed that melody-specific temporal response patterns were reinstated in right associative auditory cortices. When tapping accompanied imagery, the melody-specific neural patterns were reinstated in more extensive temporal-lobe regions bilaterally. These results indicate that the specific contents of conscious experience are encoded similarly during imagery and perception in the dynamic activity of auditory cortices. Furthermore, rhythmic motion can enhance the reinstatement of neural patterns associated with the experience of complex sounds, in keeping with models of motor to sensory influences in auditory processing.
Affiliation(s)
- Mor Regev
- Montreal Neurological Institute, McGill University, Montreal, QC H3A 2B4, Canada; International Laboratory for Brain, Music and Sound Research, Montreal, QC H2V 2J2, Canada; Centre for Research in Language, Brain, and Music, Montreal, QC H3A 1E3, Canada
- Andrea R Halpern
- Department of Psychology, Bucknell University, Lewisburg, PA 17837, USA
- Adrian M Owen
- Brain and Mind Institute, Department of Psychology and Department of Physiology and Pharmacology, Western University, London, ON N6A 5B7, Canada; Canadian Institute for Advanced Research, Brain, Mind, and Consciousness program
- Aniruddh D Patel
- Canadian Institute for Advanced Research, Brain, Mind, and Consciousness program; Department of Psychology, Tufts University, Medford, MA 02155, USA
- Robert J Zatorre
- Montreal Neurological Institute, McGill University, Montreal, QC H3A 2B4, Canada; International Laboratory for Brain, Music and Sound Research, Montreal, QC H2V 2J2, Canada; Centre for Research in Language, Brain, and Music, Montreal, QC H3A 1E3, Canada; Canadian Institute for Advanced Research, Brain, Mind, and Consciousness program

6
Tremblay P, Basirat A, Pinto S, Sato M. Visual prediction cues can facilitate behavioural and neural speech processing in young and older adults. Neuropsychologia 2021; 159:107949. [PMID: 34228997 DOI: 10.1016/j.neuropsychologia.2021.107949] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 10/24/2020] [Revised: 06/16/2021] [Accepted: 07/01/2021] [Indexed: 02/06/2023]
Abstract
The ability to process speech evolves over the course of the lifespan. Understanding speech at low acoustic intensity and in the presence of background noise becomes harder, and the ability for older adults to benefit from audiovisual speech also appears to decline. These difficulties can have important consequences on quality of life. Yet, a consensus on the cause of these difficulties is still lacking. The objective of this study was to examine the processing of speech in young and older adults under different modalities (i.e., auditory [A], visual [V], audiovisual [AV]) and in the presence of different visual prediction cues (i.e., no predictive cue (control), temporal predictive cue, phonetic predictive cue, and combined temporal and phonetic predictive cues). We focused on recognition accuracy and four auditory evoked potential (AEP) components: P1-N1-P2 and N2. Thirty-four right-handed French-speaking adults were recruited, including 17 younger adults (28 ± 2 years; 20-42 years) and 17 older adults (67 ± 3.77 years; 60-73 years). Participants completed a forced-choice speech identification task. The main findings of the study are: (1) the facilitatory effect of visual information was reduced, but present, in older compared to younger adults; (2) visual predictive cues facilitated speech recognition in younger and older adults alike; (3) age differences in AEPs were localized to later components (P2 and N2), suggesting that aging predominantly affects higher-order cortical processes related to speech processing rather than lower-level auditory processes; (4) specifically, AV facilitation on P2 amplitude was lower in older adults, there was a reduced effect of the temporal predictive cue on N2 amplitude for older compared to younger adults, and P2 and N2 latencies were longer for older adults; and (5) behavioural performance was associated with P2 amplitude in older adults. Our results indicate that aging affects speech processing at multiple levels, including audiovisual integration (P2) and auditory attentional processes (N2). These findings have important implications for understanding barriers to communication in older ages, as well as for the development of compensation strategies for those with speech processing difficulties.
Affiliation(s)
- Pascale Tremblay
- Département de Réadaptation, Faculté de Médecine, Université Laval, Quebec City, Canada; Cervo Brain Research Centre, Quebec City, Canada.
- Anahita Basirat
- Univ. Lille, CNRS, UMR 9193 - SCALab - Sciences Cognitives et Sciences Affectives, Lille, France
- Serge Pinto
- France Aix Marseille Univ, CNRS, LPL, Aix-en-Provence, France
- Marc Sato
- France Aix Marseille Univ, CNRS, LPL, Aix-en-Provence, France

7
Abstract
Speech processing in the human brain is grounded in non-specific auditory processing in the general mammalian brain, but relies on human-specific adaptations for processing speech and language. For this reason, many recent neurophysiological investigations of speech processing have turned to the human brain, with an emphasis on continuous speech. Substantial progress has been made using the phenomenon of "neural speech tracking", in which neurophysiological responses time-lock to the rhythm of auditory (and other) features in continuous speech. One broad category of investigations concerns the extent to which speech tracking measures are related to speech intelligibility, which has clinical applications in addition to its scientific importance. Recent investigations have also focused on disentangling different neural processes that contribute to speech tracking. The two lines of research are closely related, since processing stages throughout auditory cortex contribute to speech comprehension, in addition to subcortical processing and higher order and attentional processes.
Affiliation(s)
- Christian Brodbeck
- Institute for Systems Research, University of Maryland, College Park, Maryland 20742, U.S.A
- Jonathan Z. Simon
- Institute for Systems Research, University of Maryland, College Park, Maryland 20742, U.S.A
- Department of Electrical and Computer Engineering, University of Maryland, College Park, Maryland 20742, U.S.A
- Department of Biology, University of Maryland, College Park, Maryland 20742, U.S.A

8
Brodbeck C, Jiao A, Hong LE, Simon JZ. Neural speech restoration at the cocktail party: Auditory cortex recovers masked speech of both attended and ignored speakers. PLoS Biol 2020; 18:e3000883. [PMID: 33091003 PMCID: PMC7644085 DOI: 10.1371/journal.pbio.3000883] [Citation(s) in RCA: 38] [Impact Index Per Article: 9.5] [Received: 01/22/2020] [Revised: 11/05/2020] [Accepted: 09/14/2020] [Indexed: 01/09/2023]
Abstract
Humans are remarkably skilled at listening to one speaker out of an acoustic mixture of several speech sources. Two speakers are easily segregated, even without binaural cues, but the neural mechanisms underlying this ability are not well understood. One possibility is that early cortical processing performs a spectrotemporal decomposition of the acoustic mixture, allowing the attended speech to be reconstructed via optimally weighted recombinations that discount spectrotemporal regions where sources heavily overlap. Using human magnetoencephalography (MEG) responses to a 2-talker mixture, we show evidence for an alternative possibility, in which early, active segregation occurs even for strongly spectrotemporally overlapping regions. Early (approximately 70-millisecond) responses to nonoverlapping spectrotemporal features are seen for both talkers. When competing talkers’ spectrotemporal features mask each other, the individual representations persist, but they occur with an approximately 20-millisecond delay. This suggests that the auditory cortex recovers acoustic features that are masked in the mixture, even if they occurred in the ignored speech. The existence of such noise-robust cortical representations, of features present in attended as well as ignored speech, suggests an active cortical stream segregation process, which could explain a range of behavioral effects of ignored background speech. How do humans focus on one speaker when several are talking? MEG responses to a continuous two-talker mixture suggest that, even though listeners attend only to one of the talkers, their auditory cortex tracks acoustic features from both speakers. This occurs even when those features are locally masked by the other speaker.
Affiliation(s)
- Christian Brodbeck
- Institute for Systems Research, University of Maryland, College Park, Maryland, United States of America
- Alex Jiao
- Department of Electrical and Computer Engineering, University of Maryland, College Park, Maryland, United States of America
- L. Elliot Hong
- Maryland Psychiatric Research Center, Department of Psychiatry, University of Maryland School of Medicine, Baltimore, Maryland, United States of America
- Jonathan Z. Simon
- Institute for Systems Research, University of Maryland, College Park, Maryland, United States of America
- Department of Electrical and Computer Engineering, University of Maryland, College Park, Maryland, United States of America
- Department of Biology, University of Maryland, College Park, Maryland, United States of America

9
Decruy L, Lesenfants D, Vanthornhout J, Francart T. Top-down modulation of neural envelope tracking: The interplay with behavioral, self-report and neural measures of listening effort. Eur J Neurosci 2020; 52:3375-3393. [PMID: 32306466 DOI: 10.1111/ejn.14753] [Citation(s) in RCA: 16] [Impact Index Per Article: 4.0] [Received: 10/22/2019] [Revised: 04/09/2020] [Accepted: 04/11/2020] [Indexed: 11/27/2022]
Abstract
When listening to natural speech, our brain activity tracks the slow amplitude modulations of speech, also called the speech envelope. Moreover, recent research has demonstrated that this neural envelope tracking can be affected by top-down processes. The present study was designed to examine if neural envelope tracking is modulated by the effort that a person expends during listening. Five measures were included to quantify listening effort: two behavioral measures based on a novel dual-task paradigm, a self-report effort measure and two neural measures related to phase synchronization and alpha power. Electroencephalography responses to sentences, presented at a wide range of subject-specific signal-to-noise ratios, were recorded in thirteen young, normal-hearing adults. A comparison of the five measures revealed different effects of listening effort as a function of speech understanding. Reaction times on the primary task and self-reported effort decreased with increasing speech understanding. In contrast, reaction times on the secondary task and alpha power showed a peak-shaped behavior with highest effort at intermediate speech understanding levels. With regard to neural envelope tracking, we found that the reaction times on the secondary task and self-reported effort explained a small part of the variability in theta-band envelope tracking. Speech understanding was found to strongly modulate neural envelope tracking. More specifically, our results demonstrated a robust increase in envelope tracking with increasing speech understanding. The present study provides new insights in the relations among different effort measures and highlights the potential of neural envelope tracking to objectively measure speech understanding in young, normal-hearing adults.
Affiliation(s)
- Lien Decruy
- Department of Neurosciences Research, Group Experimental Oto-rhino-laryngology (ExpORL), KU Leuven, Leuven, Belgium
- Damien Lesenfants
- Department of Neurosciences Research, Group Experimental Oto-rhino-laryngology (ExpORL), KU Leuven, Leuven, Belgium
- Jonas Vanthornhout
- Department of Neurosciences Research, Group Experimental Oto-rhino-laryngology (ExpORL), KU Leuven, Leuven, Belgium
- Tom Francart
- Department of Neurosciences Research, Group Experimental Oto-rhino-laryngology (ExpORL), KU Leuven, Leuven, Belgium

10
Teng X, Ma M, Yang J, Blohm S, Cai Q, Tian X. Constrained Structure of Ancient Chinese Poetry Facilitates Speech Content Grouping. Curr Biol 2020; 30:1299-1305.e7. [PMID: 32142700 DOI: 10.1016/j.cub.2020.01.059] [Citation(s) in RCA: 20] [Impact Index Per Article: 5.0] [Received: 07/10/2019] [Revised: 11/08/2019] [Accepted: 01/17/2020] [Indexed: 11/19/2022]
Abstract
Ancient Chinese poetry is constituted by structured language that deviates from ordinary language usage [1, 2]; its poetic genres impose unique combinatory constraints on linguistic elements [3]. How does the constrained poetic structure facilitate speech segmentation when common linguistic [4-8] and statistical cues [5, 9] are unreliable to listeners in poems? We generated artificial Jueju, which arguably has the most constrained structure in ancient Chinese poetry, and presented each poem twice as an isochronous sequence of syllables to native Mandarin speakers while conducting magnetoencephalography (MEG) recording. We found that listeners deployed their prior knowledge of Jueju to build the line structure and to establish the conceptual flow of Jueju. Unprecedentedly, we found a phase precession phenomenon indicating predictive processes of speech segmentation: the neural phase advanced faster after listeners acquired knowledge of incoming speech. The statistical co-occurrence of monosyllabic words in Jueju negatively correlated with speech segmentation, which provides an alternative perspective on how statistical cues facilitate speech segmentation. Our findings suggest that constrained poetic structures serve as a temporal map for listeners to group speech contents and to predict incoming speech signals. Listeners can parse speech streams by using not only grammatical and statistical cues but also their prior knowledge of the form of language.
Affiliation(s)
- Xiangbin Teng
- Department of Neuroscience, Max Planck Institute for Empirical Aesthetics, Frankfurt 60322, Germany
- Min Ma
- Google Inc., 111 8th Avenue, New York, NY 10010, United States
- Jinbiao Yang
- Division of Arts and Sciences, New York University Shanghai, Shanghai 200122, China; NYU-ECNU Institute of Brain and Cognitive Science at NYU Shanghai, Shanghai 200062, China; Max Planck Institute for Psycholinguistics, Wundtlaan 1, Nijmegen 6525 XD, the Netherlands; Centre for Language Studies, Radboud University, Erasmusplein 1, Nijmegen 6525 HT, the Netherlands
- Stefan Blohm
- Department of Neuroscience, Max Planck Institute for Empirical Aesthetics, Frankfurt 60322, Germany
- Qing Cai
- NYU-ECNU Institute of Brain and Cognitive Science at NYU Shanghai, Shanghai 200062, China; Key Laboratory of Brain Functional Genomics (MOE & STCSM), Shanghai Changning-ECNU Mental Health Center, Institute of Cognitive Neuroscience, School of Psychology and Cognitive Science, East China Normal University, Shanghai 200062, China
- Xing Tian
- Division of Arts and Sciences, New York University Shanghai, Shanghai 200122, China; NYU-ECNU Institute of Brain and Cognitive Science at NYU Shanghai, Shanghai 200062, China; Key Laboratory of Brain Functional Genomics (MOE & STCSM), Shanghai Changning-ECNU Mental Health Center, Institute of Cognitive Neuroscience, School of Psychology and Cognitive Science, East China Normal University, Shanghai 200062, China