1. Etani T, Miura A, Kawase S, Fujii S, Keller PE, Vuust P, Kudo K. A review of psychological and neuroscientific research on musical groove. Neurosci Biobehav Rev 2024; 158:105522. [PMID: 38141692] [DOI: 10.1016/j.neubiorev.2023.105522]
Abstract
When listening to music, we naturally move our bodies rhythmically to the beat, which can be pleasurable and difficult to resist. This pleasurable sensation of wanting to move the body to music has been called "groove." Following pioneering humanities research, psychological and neuroscientific studies have provided insights into the associated musical features, behavioral responses, phenomenological aspects, and brain structural and functional correlates of the groove experience. Groove research has advanced the field of music science and, more generally, informed our understanding of bidirectional links between perception and action and of the role of the motor system in prediction. Activity in motor and reward-related brain networks during music listening is associated with the groove experience, and this neural activity is linked to temporal prediction and learning. This article reviews research on groove as a psychological phenomenon with neurophysiological correlates that link musical rhythm perception, sensorimotor prediction, and reward processing. Promising future research directions range from elucidating specific neural mechanisms to exploring clinical applications and socio-cultural implications of groove.
Affiliation(s)
- Takahide Etani
- School of Medicine, College of Medical, Pharmaceutical and Health Sciences, Kanazawa University, Kanazawa, Japan; Graduate School of Media and Governance, Keio University, Fujisawa, Japan; Advanced Research Center for Human Sciences, Waseda University, Tokorozawa, Japan
- Akito Miura
- Faculty of Human Sciences, Waseda University, Tokorozawa, Japan
- Satoshi Kawase
- The Faculty of Psychology, Kobe Gakuin University, Kobe, Japan
- Shinya Fujii
- Faculty of Environment and Information Studies, Keio University, Fujisawa, Japan
- Peter E Keller
- Center for Music in the Brain, Aarhus University and The Royal Academy of Music Aarhus/Aalborg, Aarhus, Denmark; The MARCS Institute for Brain, Behaviour and Development, Western Sydney University, Penrith, Australia
- Peter Vuust
- Center for Music in the Brain, Aarhus University and The Royal Academy of Music Aarhus/Aalborg, Aarhus, Denmark
- Kazutoshi Kudo
- Graduate School of Arts and Sciences, The University of Tokyo, Tokyo, Japan
2. Su Y, MacGregor LJ, Olasagasti I, Giraud AL. A deep hierarchy of predictions enables online meaning extraction in a computational model of human speech comprehension. PLoS Biol 2023; 21:e3002046. [PMID: 36947552] [PMCID: PMC10079236] [DOI: 10.1371/journal.pbio.3002046]
Abstract
Understanding speech requires mapping fleeting and often ambiguous soundwaves to meaning. While humans are known to exploit their capacity to contextualize in order to facilitate this process, how internal knowledge is deployed online remains an open question. Here, we present a model that extracts multiple levels of information from continuous speech online. The model applies linguistic and nonlinguistic knowledge to speech processing by periodically generating top-down predictions and incorporating bottom-up incoming evidence in a nested temporal hierarchy. We show that a nonlinguistic context level provides semantic predictions informed by sensory inputs, which are crucial for disambiguating among multiple meanings of the same word. The explicit knowledge hierarchy of the model enables a more holistic account of the neurophysiological responses to speech than do the lexical predictions generated by a neural network language model (GPT-2). We also show that hierarchical predictions reduce peripheral processing by minimizing uncertainty and prediction error. With this proof-of-concept model, we demonstrate that the deployment of hierarchical predictions is a possible strategy for the brain to dynamically utilize structured knowledge and make sense of the speech input.
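The disambiguation mechanism described above, a nonlinguistic context level whose state is updated by sensory evidence and which in turn supplies top-down semantic predictions, can be illustrated with a minimal Bayesian sketch. The contexts, word meanings, and probability tables below are hypothetical placeholders chosen for illustration; this is not the authors' model code.

```python
# Minimal sketch: a context level disambiguates a lexically ambiguous word.
# All states and probabilities are hypothetical.
import numpy as np

contexts = ["finance", "river"]                  # hypothetical context states
meanings = ["bank (institution)", "bank (shore)"]

p_context = np.array([0.5, 0.5])                 # prior belief over contexts
# p(meaning | context): rows index contexts, columns index meanings
p_meaning_given_context = np.array([[0.9, 0.1],
                                    [0.2, 0.8]])

def update_context(p_ctx, likelihood):
    """Bottom-up step: weigh the context belief by the likelihood of the
    observed evidence under each context, then renormalize."""
    posterior = p_ctx * likelihood
    return posterior / posterior.sum()

# Earlier sensory input (e.g., the words "loan", "deposit") favors finance.
p_context = update_context(p_context, np.array([0.8, 0.2]))

# Top-down step: the context level predicts the meaning of the upcoming
# ambiguous word by marginalizing over contexts.
p_meaning = p_context @ p_meaning_given_context
print(dict(zip(meanings, p_meaning.round(3))))   # institution reading wins
```

The same two steps, incorporating bottom-up evidence and emitting top-down predictions, would run at every level of the model's nested temporal hierarchy.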
Affiliation(s)
- Yaqing Su
- Department of Fundamental Neuroscience, Faculty of Medicine, University of Geneva, Geneva, Switzerland
- Swiss National Centre of Competence in Research "Evolving Language" (NCCR EvolvingLanguage), Geneva, Switzerland
- Lucy J MacGregor
- Medical Research Council Cognition and Brain Sciences Unit, University of Cambridge, Cambridge, United Kingdom
- Itsaso Olasagasti
- Department of Fundamental Neuroscience, Faculty of Medicine, University of Geneva, Geneva, Switzerland
- Swiss National Centre of Competence in Research "Evolving Language" (NCCR EvolvingLanguage), Geneva, Switzerland
- Anne-Lise Giraud
- Department of Fundamental Neuroscience, Faculty of Medicine, University of Geneva, Geneva, Switzerland
- Swiss National Centre of Competence in Research "Evolving Language" (NCCR EvolvingLanguage), Geneva, Switzerland
- Institut Pasteur, Université Paris Cité, Inserm, Institut de l'Audition, Paris, France
3. Jiang Y, Komatsu M, Chen Y, Xie R, Zhang K, Xia Y, Gui P, Liang Z, Wang L. Constructing the hierarchy of predictive auditory sequences in the marmoset brain. eLife 2022; 11:e74653. [PMID: 35174784] [PMCID: PMC8893719] [DOI: 10.7554/elife.74653]
Abstract
Our brains constantly generate predictions of sensory input, compare them with actual inputs, propagate the prediction errors through a hierarchy of brain regions, and subsequently update the internal predictions of the world. However, an essential feature of predictive coding, the notion of hierarchical depth, and its neural mechanisms remain largely unexplored. Here, we investigated the hierarchical depth of predictive auditory processing by combining functional magnetic resonance imaging (fMRI) and high-density whole-brain electrocorticography (ECoG) in marmoset monkeys during an auditory local-global paradigm in which the temporal regularities of the stimuli were designed at two hierarchical levels. The prediction errors and prediction updates were examined as neural responses to auditory mismatches and omissions. Using fMRI, we identified a hierarchical gradient along the auditory pathway: the midbrain and sensory regions represented local, shorter-timescale predictive processing, followed by associative auditory regions, whereas anterior temporal and prefrontal areas represented global, longer-timescale sequence processing. The complementary ECoG recordings confirmed the activations at cortical surface areas and further differentiated the signals of prediction error and update, which were transmitted via putative bottom-up γ and top-down β oscillations, respectively. Furthermore, omission responses, caused by the absence of input and reflecting solely the two levels of prediction signals that are unique to the hierarchical predictive coding framework, demonstrated the hierarchical top-down process of predictions in the auditory, temporal, and prefrontal areas. Thus, our findings support the hierarchical predictive coding framework and outline how neural networks and spatiotemporal dynamics are used to represent and arrange a hierarchical structure of auditory sequences in the marmoset brain.
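For readers unfamiliar with it, the local-global paradigm separates the two timescales of regularity by making the locally deviant sequence the globally frequent one. A short stimulus-generation sketch makes this concrete; the tone labels, block size, and deviant probability below are illustrative placeholders, not the study's actual parameters.

```python
# Sketch of local-global stimulus construction (illustrative parameters).
import random

def make_trial(local_deviant: bool):
    """Five-tone sequence: local standard 'AAAAA' or local deviant 'AAAAB'."""
    return ["A"] * 4 + (["B"] if local_deviant else ["A"])

def make_block(frequent_is_deviant: bool, n_trials=100, p_rare=0.2):
    """The frequent sequence establishes the global rule; rare sequences
    violate it. Local and global deviance are thereby dissociated."""
    trials = []
    for _ in range(n_trials):
        rare = random.random() < p_rare
        local_dev = frequent_is_deviant ^ rare   # rare trials flip the rule
        trials.append({
            "tones": make_trial(local_dev),
            "local_deviant": local_dev,          # short-timescale violation
            "global_deviant": rare,              # long-timescale violation
        })
    return trials

# Block where 'AAAAB' is the global standard: a rare 'AAAAA' trial is
# locally standard yet globally deviant.
block = make_block(frequent_is_deviant=True)
```

Responses to local deviants index the shorter-timescale predictions attributed above to the midbrain and sensory regions, while responses to global deviants index the longer-timescale predictions attributed to anterior temporal and prefrontal areas.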
Affiliation(s)
- Yuwei Jiang
- Institute of Neuroscience, Chinese Academy of Sciences, Shanghai, China
- Misako Komatsu
- Laboratory for Molecular Analysis of Higher Brain Function, Center for Brain Science, RIKEN, Saitama, Japan
- Yuyan Chen
- Institute of Neuroscience, Chinese Academy of Sciences, Shanghai, China
- Ruoying Xie
- Institute of Neuroscience, Chinese Academy of Sciences, Shanghai, China
- Kaiwei Zhang
- Institute of Neuroscience, Chinese Academy of Sciences, Shanghai, China
- Ying Xia
- Institute of Neuroscience, Chinese Academy of Sciences, Shanghai, China
- Peng Gui
- Institute of Neuroscience, Chinese Academy of Sciences, Shanghai, China
- Zhifeng Liang
- Institute of Neuroscience, Chinese Academy of Sciences, Shanghai, China
- Liping Wang
- Institute of Neuroscience, Chinese Academy of Sciences, Shanghai, China
4. Coffey EBJ, Arseneau-Bruneau I, Zhang X, Baillet S, Zatorre RJ. Oscillatory entrainment of the frequency-following response in auditory cortical and subcortical structures. J Neurosci 2021; 41:4073-4087. [PMID: 33731448] [PMCID: PMC8176755] [DOI: 10.1523/jneurosci.2313-20.2021]
Abstract
There is much debate about the existence and function of neural oscillatory mechanisms in the auditory system. The frequency-following response (FFR) is an index of neural periodicity encoding that can provide a vehicle to study entrainment in frequency ranges relevant to speech and music processing. Criteria for entrainment include the presence of poststimulus oscillations and phase alignment between stimulus and endogenous activity. To test the hypothesis of entrainment, in experiment 1 we collected FFR data for a repeated syllable using magnetoencephalography (MEG) and electroencephalography in 20 male and female human adults. We observed significant oscillatory activity after stimulus offset in auditory cortex and subcortical auditory nuclei, consistent with entrainment. In these structures, the FFR fundamental frequency converged from a lower value over 100 ms to the stimulus frequency, consistent with phase alignment, and diverged to a lower value after offset, consistent with relaxation to a preferred frequency. In experiment 2, we tested how transitions between stimulus frequencies affected the MEG FFR to a train of tone pairs in 30 people. We found that the FFR was affected by the frequency of the preceding tone for up to 40 ms at subcortical levels, and for even longer durations at cortical levels. Our results suggest that oscillatory entrainment may be an integral part of periodic sound representation throughout the auditory neuraxis. The functional role of this mechanism is unknown, but it could serve as a fine-scale temporal predictor for frequency information, enhancing stability and reducing susceptibility to degradation, which could be useful in real-life noisy environments.

SIGNIFICANCE STATEMENT: Neural oscillations are proposed to be a ubiquitous aspect of neural function, but their contribution to auditory encoding is not clear, particularly at the higher frequencies associated with pitch encoding. In a magnetoencephalography experiment, we found converging evidence that the frequency-following response has an oscillatory component according to established criteria: poststimulus resonance, progressive entrainment of the neural frequency to the stimulus frequency, and relaxation toward the original state on stimulus offset. In a second experiment, we found that the frequency and amplitude of the frequency-following response to tones are affected by preceding stimuli. These findings support the contribution of intrinsic oscillations to the encoding of sound and raise new questions about their functional roles, possibly including stabilization and low-level predictive coding.
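The entrainment signature reported here, convergence of the FFR fundamental toward the stimulus frequency during stimulation and relaxation toward a preferred frequency after offset, can be caricatured with a first-order frequency-adaptation model. All values below (preferred and stimulus frequencies, time constant, durations) are hypothetical; this is a conceptual sketch, not the authors' analysis pipeline.

```python
# Sketch: an oscillator whose instantaneous frequency is pulled toward the
# stimulus frequency while the stimulus is on, and relaxes back afterward.
import numpy as np

fs = 1000.0                    # sampling rate, Hz (assumed)
t = np.arange(0, 0.4, 1 / fs)  # 400 ms of simulated time
f_preferred = 90.0             # endogenous preferred frequency, Hz (assumed)
f_stimulus = 100.0             # stimulus frequency, Hz (assumed)
tau = 0.03                     # relaxation time constant, s (assumed)
stim_on = t < 0.2              # stimulus present for the first 200 ms

f_inst = np.empty_like(t)
f_inst[0] = f_preferred
for i in range(1, len(t)):
    target = f_stimulus if stim_on[i] else f_preferred
    # first-order relaxation of the instantaneous frequency toward the target
    f_inst[i] = f_inst[i - 1] + (target - f_inst[i - 1]) / (tau * fs)

phase = 2 * np.pi * np.cumsum(f_inst) / fs
signal = np.cos(phase)         # FFR-like waveform that keeps oscillating
                               # after offset while the frequency drifts back
```

Plotting `f_inst` reproduces the qualitative pattern described above: convergence from a lower value toward the stimulus frequency during stimulation, and divergence back toward the preferred frequency after offset.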
Affiliation(s)
- Emily B J Coffey
- Department of Psychology, Concordia University, Montreal, Quebec H4B 1R6, Canada
- Montreal Neurological Institute, McGill University, Montreal, Quebec H3A 2B4, Canada
- Laboratory for Brain, Music and Sound Research (BRAMS), Montreal, Quebec H3C 3J7, Canada
- Centre for Research on Brain, Language and Music (CRBLM), Montreal, Quebec H3G 2A8, Canada
- Isabelle Arseneau-Bruneau
- Montreal Neurological Institute, McGill University, Montreal, Quebec H3A 2B4, Canada
- Laboratory for Brain, Music and Sound Research (BRAMS), Montreal, Quebec H3C 3J7, Canada
- Centre for Research on Brain, Language and Music (CRBLM), Montreal, Quebec H3G 2A8, Canada
- Centre for Interdisciplinary Research in Music Media and Technology (CIRMMT), McGill University, Montreal, Quebec H3A 1E3, Canada
- Xiaochen Zhang
- Montreal Neurological Institute, McGill University, Montreal, Quebec H3A 2B4, Canada
- Shanghai Mental Health Center, Shanghai Jiao Tong University School of Medicine, Shanghai 200030, People's Republic of China
- Sylvain Baillet
- Montreal Neurological Institute, McGill University, Montreal, Quebec H3A 2B4, Canada
- Centre for Research on Brain, Language and Music (CRBLM), Montreal, Quebec H3G 2A8, Canada
- Robert J Zatorre
- Montreal Neurological Institute, McGill University, Montreal, Quebec H3A 2B4, Canada
- Laboratory for Brain, Music and Sound Research (BRAMS), Montreal, Quebec H3C 3J7, Canada
- Centre for Research on Brain, Language and Music (CRBLM), Montreal, Quebec H3G 2A8, Canada
- Centre for Interdisciplinary Research in Music Media and Technology (CIRMMT), McGill University, Montreal, Quebec H3A 1E3, Canada
5. Combining predictive coding and neural oscillations enables online syllable recognition in natural speech. Nat Commun 2020; 11:3117. [PMID: 32561726] [PMCID: PMC7305192] [DOI: 10.1038/s41467-020-16956-5]
Abstract
On-line comprehension of natural speech requires segmenting the acoustic stream into discrete linguistic elements. This process is argued to rely on theta-gamma oscillation coupling, which can parse syllables and encode them in decipherable neural activity. Speech comprehension also strongly depends on contextual cues that help predict speech structure and content. To explore the effects of theta-gamma coupling on bottom-up/top-down dynamics during on-line syllable identification, we designed a computational model (Precoss: predictive coding and oscillations for speech) that can recognise syllable sequences in continuous speech. The model uses predictions from internal spectro-temporal representations of syllables, together with theta oscillations, to signal syllable onsets and durations. Syllable recognition is best when theta-gamma coupling is used to temporally align spectro-temporal predictions with the acoustic input. This neurocomputational modelling work demonstrates that the notions of predictive coding and neural oscillations can be brought together to account for on-line dynamic sensory processing. In summary, the authors present a model that combines predictive coding and neural oscillations to parse and recognise syllables on-line in natural speech, and use simulations from different versions of the model to establish the importance of both theta-gamma coupling and the reset of accumulated evidence in continuous speech processing.
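The core idea, theta oscillations signalling syllable onsets so that accumulated evidence can be reset at each syllable boundary, can be sketched schematically. The sketch below assumes a precomputed amplitude envelope and a dictionary of stored syllable templates, both hypothetical; it is a caricature of the concept, not the published Precoss model.

```python
# Schematic: theta-rate segmentation gates evidence reset and per-chunk
# template matching (hypothetical inputs, not the Precoss implementation).
import numpy as np

def theta_onsets(envelope, fs, theta_hz=5.0):
    """Mark candidate syllable onsets at envelope troughs separated by at
    least one nominal theta period (a crude stand-in for theta phase)."""
    min_gap = int(fs / theta_hz)
    onsets, last = [0], 0
    for i in range(1, len(envelope) - 1):
        if (envelope[i] < envelope[i - 1] and envelope[i] <= envelope[i + 1]
                and i - last >= min_gap):
            onsets.append(i)
            last = i
    return onsets

def recognise(envelope, fs, templates):
    """Between consecutive onsets, match the chunk against stored syllable
    templates; the evidence buffer is implicitly reset at every boundary."""
    onsets = theta_onsets(envelope, fs)
    labels = []
    for a, b in zip(onsets, onsets[1:] + [len(envelope)]):
        chunk = np.asarray(envelope[a:b], dtype=float)
        scores = {}
        for name, tpl in templates.items():
            # resample the chunk to the template length, then score by
            # negative squared error (higher is better)
            resampled = np.interp(np.linspace(0, 1, len(tpl)),
                                  np.linspace(0, 1, len(chunk)), chunk)
            scores[name] = -np.sum((resampled - np.asarray(tpl)) ** 2)
        labels.append(max(scores, key=scores.get))
    return labels
```

In the published model the segmentation signal comes from a genuine theta oscillator coupled to gamma-rate encoding units, and recognition uses generative spectro-temporal predictions rather than template matching, but the gating-and-reset logic sketched here follows the same principle.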
6. Large-scale networks for auditory sensory gating in the awake mouse. eNeuro 2019; 6:ENEURO.0207-19.2019. [PMID: 31444224] [PMCID: PMC6734044] [DOI: 10.1523/eneuro.0207-19.2019]
Abstract
The amplitude of the brain response to a repeated auditory stimulus is diminished compared with the response to the first tone (T1) for interstimulus intervals (ISIs) lasting up to hundreds of milliseconds. This adaptation process, called auditory sensory gating (ASG), is altered in various psychiatric diseases including schizophrenia and is classically studied by focusing on early evoked cortical responses to the second tone (T2) at a 500-ms ISI. However, the mechanisms underlying ASG are still not well understood. We investigated ASG in awake mice from the brainstem to the cortex at variable ISIs (125–2000 ms) using high-density EEG and intracerebral recordings. While ASG decreases at longer ISIs, it is still present at durations (500–2000 ms) far beyond the time during which brain responses to T1 can still be detected. T1 induces a sequence of specific stable scalp EEG topographies that correspond to the successive activation of distinct neural networks lasting about 350 ms. These brain states remain unaltered if T2 is presented during this period, even though T2 is processed by the brain, suggesting that ongoing networks of brain activity remain active for longer than early evoked potentials and are not overwritten by an upcoming new stimulus. Intracerebral recordings demonstrate that ASG is already present at the level of the ventral cochlear nucleus (vCN) and inferior colliculus and is amplified across the hierarchy in the bottom-up direction. This study uncovers the extended stability of sensory-evoked brain states and the long duration of ASG, and sheds light on the generators of ASG and possible interactions between bottom-up and top-down mechanisms.
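ASG is conventionally quantified as the ratio of the evoked-response amplitude to T2 over that to T1, computed per ISI. A minimal sketch of that computation follows; the epoch layout, sampling rate, and peak window are assumed placeholders rather than the study's actual pipeline.

```python
# Sketch: T2/T1 gating ratio from trial-averaged evoked responses.
# Epoch format and window values are illustrative assumptions.
import numpy as np

def evoked_amplitude(epochs, fs, window=(0.02, 0.08)):
    """Peak absolute amplitude of the trial-averaged response within a
    post-tone window (seconds). `epochs` is (n_trials, n_samples)."""
    avg = epochs.mean(axis=0)
    i0, i1 = int(window[0] * fs), int(window[1] * fs)
    return np.abs(avg[i0:i1]).max()

def gating_ratio(t1_epochs, t2_epochs, fs):
    """T2/T1 amplitude ratio; values well below 1 indicate gating, and the
    ratio is expected to climb toward 1 as the ISI grows."""
    return evoked_amplitude(t2_epochs, fs) / evoked_amplitude(t1_epochs, fs)

# Example usage across the ISI range tested in the study (125-2000 ms):
# ratios = {isi: gating_ratio(t1[isi], t2[isi], fs=1000.0)
#           for isi in (125, 250, 500, 1000, 2000)}
```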