1
|
Herff SA, Bonetti L, Cecchetti G, Vuust P, Kringelbach ML, Rohrmeier MA. Hierarchical syntax model of music predicts theta power during music listening. Neuropsychologia 2024; 199:108905. [PMID: 38740179 DOI: 10.1016/j.neuropsychologia.2024.108905] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/17/2023] [Revised: 03/07/2024] [Accepted: 05/06/2024] [Indexed: 05/16/2024]
Abstract
Linguistic research showed that the depth of syntactic embedding is reflected in brain theta power. Here, we test whether this also extends to non-linguistic stimuli, specifically music. We used a hierarchical model of musical syntax to continuously quantify two types of expert-annotated harmonic dependencies throughout a piece of Western classical music: prolongation and preparation. Prolongations can roughly be understood as a musical analogue to linguistic coordination between constituents that share the same function (e.g., 'pizza' and 'pasta' in 'I ate pizza and pasta'). Preparation refers to the dependency between two harmonies whereby the first implies a resolution towards the second (e.g., dominant towards tonic; similar to how the adjective implies the presence of a noun in 'I like spicy … '). Source reconstructed MEG data of sixty-five participants listening to the musical piece was then analysed. We used Bayesian Mixed Effects models to predict theta envelope in the brain, using the number of open prolongation and preparation dependencies as predictors whilst controlling for audio envelope. We observed that prolongation and preparation both carry independent and distinguishable predictive value for theta band fluctuation in key linguistic areas such as the Angular, Superior Temporal, and Heschl's Gyri, or their right-lateralised homologues, with preparation showing additional predictive value for areas associated with the reward system and prediction. Musical expertise further mediated these effects in language-related brain areas. Results show that predictions of precisely formalised music-theoretical models are reflected in the brain activity of listeners which furthers our understanding of the perception and cognition of musical structure.
Collapse
Affiliation(s)
- Steffen A Herff
- Sydney Conservatorium of Music, University of Sydney, Sydney, Australia; The MARCS Institute for Brain, Behaviour and Development, Western Sydney University, Sydney, Australia; Digital and Cognitive Musicology Lab, College of Humanities, École Polytechnique Fédérale de Lausanne, Lausanne, Switzerland.
| | - Leonardo Bonetti
- Center for Music in the Brain, Department of Clinical Medicine, Aarhus University & The Royal Academy of Music, Aarhus/Aalborg, Denmark; Centre for Eudaimonia and Human Flourishing, Linacre College, University of Oxford, Oxford, United Kingdom; Department of Psychiatry, University of Oxford, Oxford, United Kingdom
| | - Gabriele Cecchetti
- The MARCS Institute for Brain, Behaviour and Development, Western Sydney University, Sydney, Australia; Digital and Cognitive Musicology Lab, College of Humanities, École Polytechnique Fédérale de Lausanne, Lausanne, Switzerland
| | - Peter Vuust
- Center for Music in the Brain, Department of Clinical Medicine, Aarhus University & The Royal Academy of Music, Aarhus/Aalborg, Denmark
| | - Morten L Kringelbach
- Center for Music in the Brain, Department of Clinical Medicine, Aarhus University & The Royal Academy of Music, Aarhus/Aalborg, Denmark; Centre for Eudaimonia and Human Flourishing, Linacre College, University of Oxford, Oxford, United Kingdom; Department of Psychiatry, University of Oxford, Oxford, United Kingdom
| | - Martin A Rohrmeier
- Digital and Cognitive Musicology Lab, College of Humanities, École Polytechnique Fédérale de Lausanne, Lausanne, Switzerland
| |
Collapse
|
2
|
Sankaran N, Leonard MK, Theunissen F, Chang EF. Encoding of melody in the human auditory cortex. SCIENCE ADVANCES 2024; 10:eadk0010. [PMID: 38363839 PMCID: PMC10871532 DOI: 10.1126/sciadv.adk0010] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Received: 07/27/2023] [Accepted: 01/17/2024] [Indexed: 02/18/2024]
Abstract
Melody is a core component of music in which discrete pitches are serially arranged to convey emotion and meaning. Perception varies along several pitch-based dimensions: (i) the absolute pitch of notes, (ii) the difference in pitch between successive notes, and (iii) the statistical expectation of each note given prior context. How the brain represents these dimensions and whether their encoding is specialized for music remains unknown. We recorded high-density neurophysiological activity directly from the human auditory cortex while participants listened to Western musical phrases. Pitch, pitch-change, and expectation were selectively encoded at different cortical sites, indicating a spatial map for representing distinct melodic dimensions. The same participants listened to spoken English, and we compared responses to music and speech. Cortical sites selective for music encoded expectation, while sites that encoded pitch and pitch-change in music used the same neural code to represent equivalent properties of speech. Findings reveal how the perception of melody recruits both music-specific and general-purpose sound representations.
Collapse
Affiliation(s)
- Narayan Sankaran
- Department of Neurological Surgery, University of California, San Francisco, 675 Nelson Rising Lane, San Francisco, CA 94158, USA
| | - Matthew K. Leonard
- Department of Neurological Surgery, University of California, San Francisco, 675 Nelson Rising Lane, San Francisco, CA 94158, USA
| | - Frederic Theunissen
- Department of Psychology, University of California, Berkeley, 2121 Berkeley Way, Berkeley, CA 94720, USA
| | - Edward F. Chang
- Department of Neurological Surgery, University of California, San Francisco, 675 Nelson Rising Lane, San Francisco, CA 94158, USA
| |
Collapse
|
3
|
Shan T, Cappelloni MS, Maddox RK. Subcortical responses to music and speech are alike while cortical responses diverge. Sci Rep 2024; 14:789. [PMID: 38191488 PMCID: PMC10774448 DOI: 10.1038/s41598-023-50438-0] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/11/2023] [Accepted: 12/20/2023] [Indexed: 01/10/2024] Open
Abstract
Music and speech are encountered daily and are unique to human beings. Both are transformed by the auditory pathway from an initial acoustical encoding to higher level cognition. Studies of cortex have revealed distinct brain responses to music and speech, but differences may emerge in the cortex or may be inherited from different subcortical encoding. In the first part of this study, we derived the human auditory brainstem response (ABR), a measure of subcortical encoding, to recorded music and speech using two analysis methods. The first method, described previously and acoustically based, yielded very different ABRs between the two sound classes. The second method, however, developed here and based on a physiological model of the auditory periphery, gave highly correlated responses to music and speech. We determined the superiority of the second method through several metrics, suggesting there is no appreciable impact of stimulus class (i.e., music vs speech) on the way stimulus acoustics are encoded subcortically. In this study's second part, we considered the cortex. Our new analysis method resulted in cortical music and speech responses becoming more similar but with remaining differences. The subcortical and cortical results taken together suggest that there is evidence for stimulus-class dependent processing of music and speech at the cortical but not subcortical level.
Collapse
Affiliation(s)
- Tong Shan
- Department of Biomedical Engineering, University of Rochester, Rochester, NY, USA
- Del Monte Institute for Neuroscience, University of Rochester, Rochester, NY, USA
- Center for Visual Science, University of Rochester, Rochester, NY, USA
| | - Madeline S Cappelloni
- Department of Biomedical Engineering, University of Rochester, Rochester, NY, USA
- Del Monte Institute for Neuroscience, University of Rochester, Rochester, NY, USA
- Center for Visual Science, University of Rochester, Rochester, NY, USA
| | - Ross K Maddox
- Department of Biomedical Engineering, University of Rochester, Rochester, NY, USA.
- Del Monte Institute for Neuroscience, University of Rochester, Rochester, NY, USA.
- Center for Visual Science, University of Rochester, Rochester, NY, USA.
- Department of Neuroscience, University of Rochester, Rochester, NY, USA.
| |
Collapse
|
4
|
Sankaran N, Leonard MK, Theunissen F, Chang EF. Encoding of melody in the human auditory cortex. BIORXIV : THE PREPRINT SERVER FOR BIOLOGY 2023:2023.10.17.562771. [PMID: 37905047 PMCID: PMC10614915 DOI: 10.1101/2023.10.17.562771] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 11/02/2023]
Abstract
Melody is a core component of music in which discrete pitches are serially arranged to convey emotion and meaning. Perception of melody varies along several pitch-based dimensions: (1) the absolute pitch of notes, (2) the difference in pitch between successive notes, and (3) the higher-order statistical expectation of each note conditioned on its prior context. While humans readily perceive melody, how these dimensions are collectively represented in the brain and whether their encoding is specialized for music remains unknown. Here, we recorded high-density neurophysiological activity directly from the surface of human auditory cortex while Western participants listened to Western musical phrases. Pitch, pitch-change, and expectation were selectively encoded at different cortical sites, indicating a spatial code for representing distinct dimensions of melody. The same participants listened to spoken English, and we compared evoked responses to music and speech. Cortical sites selective for music were systematically driven by the encoding of expectation. In contrast, sites that encoded pitch and pitch-change used the same neural code to represent equivalent properties of speech. These findings reveal the multidimensional nature of melody encoding, consisting of both music-specific and domain-general sound representations in auditory cortex. Teaser The human brain contains both general-purpose and music-specific neural populations for processing distinct attributes of melody.
Collapse
|
5
|
Theta Band (4-8 Hz) Oscillations Reflect Online Processing of Rhythm in Speech Production. Brain Sci 2022; 12:brainsci12121593. [PMID: 36552053 PMCID: PMC9775388 DOI: 10.3390/brainsci12121593] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/18/2022] [Revised: 11/08/2022] [Accepted: 11/14/2022] [Indexed: 11/24/2022] Open
Abstract
How speech prosody is processed in the brain during language production remains an unsolved issue. The present work used the phrase-recall paradigm to analyze brain oscillation underpinning rhythmic processing in speech production. Participants were told to recall target speeches aloud consisting of verb-noun pairings with a common (e.g., [2+2], the numbers in brackets represent the number of syllables) or uncommon (e.g., [1+3]) rhythmic pattern. Target speeches were preceded by rhythmic musical patterns, either congruent or incongruent, created by using pure tones at various temporal intervals. Electroencephalogram signals were recorded throughout the experiment. Behavioral results in 2+2 target speeches showed a rhythmic priming effect when comparing congruent and incongruent conditions. Cerebral-acoustic coherence analysis showed that neural activities synchronized with the rhythmic patterns of primes. Furthermore, target phrases that had congruent rhythmic patterns with a prime rhythm were associated with increased theta-band (4-8 Hz) activity in the time window of 400-800 ms in both the 2+2 and 1+3 target conditions. These findings suggest that rhythmic patterns can be processed online. Neural activities synchronize with the rhythmic input and speakers create an abstract rhythmic pattern before and during articulation in speech production.
Collapse
|
6
|
Kim SG. On the encoding of natural music in computational models and human brains. Front Neurosci 2022; 16:928841. [PMID: 36203808 PMCID: PMC9531138 DOI: 10.3389/fnins.2022.928841] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/26/2022] [Accepted: 08/15/2022] [Indexed: 11/13/2022] Open
Abstract
This article discusses recent developments and advances in the neuroscience of music to understand the nature of musical emotion. In particular, it highlights how system identification techniques and computational models of music have advanced our understanding of how the human brain processes the textures and structures of music and how the processed information evokes emotions. Musical models relate physical properties of stimuli to internal representations called features, and predictive models relate features to neural or behavioral responses and test their predictions against independent unseen data. The new frameworks do not require orthogonalized stimuli in controlled experiments to establish reproducible knowledge, which has opened up a new wave of naturalistic neuroscience. The current review focuses on how this trend has transformed the domain of the neuroscience of music.
Collapse
|
7
|
Wang Y, Tang Z, Zhang X, Yang L. Auditory and cross-modal attentional bias toward positive natural sounds: Behavioral and ERP evidence. Front Hum Neurosci 2022; 16:949655. [PMID: 35967006 PMCID: PMC9372282 DOI: 10.3389/fnhum.2022.949655] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/21/2022] [Accepted: 06/28/2022] [Indexed: 11/13/2022] Open
Abstract
Recently, researchers have expanded the investigation into attentional biases toward positive stimuli; however, few studies have examined attentional biases toward positive auditory information. In three experiments, the present study employed an emotional spatial cueing task using emotional sounds as cues and auditory stimuli (Experiment 1) or visual stimuli (Experiment 2 and Experiment 3) as targets to explore whether auditory or visual spatial attention could be modulated by positive auditory cues. Experiment 3 also examined the temporal dynamics of cross-modal auditory bias toward positive natural sounds using event-related potentials (ERPs). The behavioral results of the three experiments consistently demonstrated that response times to targets were faster after positive auditory cues than they were after neutral auditory cues in the valid condition, indicating that healthy participants showed a selective auditory attentional bias (Experiment 1) and cross-modal attentional bias (Experiment 2 and Experiment 3) toward positive natural sounds. The results of Experiment 3 showed that N1 amplitudes were more negative after positive sounds than they were after neutral sounds, which further provided electrophysiological evidence that positive auditory information enhances attention at early stages in healthy adults. The results of the experiments performed in the present study suggest that humans exhibit an attentional bias toward positive natural sounds.
Collapse
Affiliation(s)
- Yanmei Wang
- Shanghai Key Laboratory of Mental Health and Psychological Crisis Intervention, School of Psychology and Cognitive Science, East China Normal University, Shanghai, China
- Shanghai Changning Mental Health Center, Shanghai, China
| | - Zhenwei Tang
- Shanghai Key Laboratory of Mental Health and Psychological Crisis Intervention, School of Psychology and Cognitive Science, East China Normal University, Shanghai, China
| | - Xiaoxuan Zhang
- Shanghai Key Laboratory of Mental Health and Psychological Crisis Intervention, School of Psychology and Cognitive Science, East China Normal University, Shanghai, China
- Shanghai Changning Mental Health Center, Shanghai, China
| | - Libing Yang
- Shanghai Key Laboratory of Mental Health and Psychological Crisis Intervention, School of Psychology and Cognitive Science, East China Normal University, Shanghai, China
| |
Collapse
|
8
|
Norman-Haignere SV, Long LK, Devinsky O, Doyle W, Irobunda I, Merricks EM, Feldstein NA, McKhann GM, Schevon CA, Flinker A, Mesgarani N. Multiscale temporal integration organizes hierarchical computation in human auditory cortex. Nat Hum Behav 2022; 6:455-469. [PMID: 35145280 PMCID: PMC8957490 DOI: 10.1038/s41562-021-01261-y] [Citation(s) in RCA: 24] [Impact Index Per Article: 12.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/30/2020] [Accepted: 11/18/2021] [Indexed: 01/11/2023]
Abstract
To derive meaning from sound, the brain must integrate information across many timescales. What computations underlie multiscale integration in human auditory cortex? Evidence suggests that auditory cortex analyses sound using both generic acoustic representations (for example, spectrotemporal modulation tuning) and category-specific computations, but the timescales over which these putatively distinct computations integrate remain unclear. To answer this question, we developed a general method to estimate sensory integration windows-the time window when stimuli alter the neural response-and applied our method to intracranial recordings from neurosurgical patients. We show that human auditory cortex integrates hierarchically across diverse timescales spanning from ~50 to 400 ms. Moreover, we find that neural populations with short and long integration windows exhibit distinct functional properties: short-integration electrodes (less than ~200 ms) show prominent spectrotemporal modulation selectivity, while long-integration electrodes (greater than ~200 ms) show prominent category selectivity. These findings reveal how multiscale integration organizes auditory computation in the human brain.
Collapse
Affiliation(s)
- Sam V Norman-Haignere
- Zuckerman Mind, Brain, Behavior Institute, Columbia University,HHMI Postdoctoral Fellow of the Life Sciences Research Foundation
| | - Laura K. Long
- Zuckerman Mind, Brain, Behavior Institute, Columbia University,Doctoral Program in Neurobiology and Behavior, Columbia University
| | - Orrin Devinsky
- Department of Neurology, NYU Langone Medical Center,Comprehensive Epilepsy Center, NYU Langone Medical Center
| | - Werner Doyle
- Comprehensive Epilepsy Center, NYU Langone Medical Center,Department of Neurosurgery, NYU Langone Medical Center
| | - Ifeoma Irobunda
- Department of Neurology, Columbia University Irving Medical Center
| | | | - Neil A. Feldstein
- Department of Neurological Surgery, Columbia University Irving Medical Center
| | - Guy M. McKhann
- Department of Neurological Surgery, Columbia University Irving Medical Center
| | | | - Adeen Flinker
- Department of Neurology, NYU Langone Medical Center,Comprehensive Epilepsy Center, NYU Langone Medical Center,Department of Biomedical Engineering, NYU Tandon School of Engineering
| | - Nima Mesgarani
- Zuckerman Mind, Brain, Behavior Institute, Columbia University,Doctoral Program in Neurobiology and Behavior, Columbia University,Department of Electrical Engineering, Columbia University
| |
Collapse
|
9
|
Hölle D, Blum S, Kissner S, Debener S, Bleichner MG. Real-Time Audio Processing of Real-Life Soundscapes for EEG Analysis: ERPs Based on Natural Sound Onsets. FRONTIERS IN NEUROERGONOMICS 2022; 3:793061. [PMID: 38235458 PMCID: PMC10790832 DOI: 10.3389/fnrgo.2022.793061] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Figures] [Subscribe] [Scholar Register] [Received: 10/11/2021] [Accepted: 01/03/2021] [Indexed: 01/19/2024]
Abstract
With smartphone-based mobile electroencephalography (EEG), we can investigate sound perception beyond the lab. To understand sound perception in the real world, we need to relate naturally occurring sounds to EEG data. For this, EEG and audio information need to be synchronized precisely, only then it is possible to capture fast and transient evoked neural responses and relate them to individual sounds. We have developed Android applications (AFEx and Record-a) that allow for the concurrent acquisition of EEG data and audio features, i.e., sound onsets, average signal power (RMS), and power spectral density (PSD) on smartphone. In this paper, we evaluate these apps by computing event-related potentials (ERPs) evoked by everyday sounds. One participant listened to piano notes (played live by a pianist) and to a home-office soundscape. Timing tests showed a stable lag and a small jitter (< 3 ms) indicating a high temporal precision of the system. We calculated ERPs to sound onsets and observed the typical P1-N1-P2 complex of auditory processing. Furthermore, we show how to relate information on loudness (RMS) and spectra (PSD) to brain activity. In future studies, we can use this system to study sound processing in everyday life.
Collapse
Affiliation(s)
- Daniel Hölle
- Neurophysiology of Everyday Life Group, Department of Psychology, University of Oldenburg, Oldenburg, Germany
| | - Sarah Blum
- Neuropsychology Lab, Department of Psychology, University of Oldenburg, Oldenburg, Germany
- Cluster of Excellence Hearing4all, Oldenburg, Germany
| | - Sven Kissner
- Institute for Hearing Technology and Audiology, Jade University of Applied Sciences, Oldenburg, Germany
| | - Stefan Debener
- Neuropsychology Lab, Department of Psychology, University of Oldenburg, Oldenburg, Germany
| | - Martin G. Bleichner
- Neurophysiology of Everyday Life Group, Department of Psychology, University of Oldenburg, Oldenburg, Germany
| |
Collapse
|
10
|
Zuk NJ, Murphy JW, Reilly RB, Lalor EC. Envelope reconstruction of speech and music highlights stronger tracking of speech at low frequencies. PLoS Comput Biol 2021; 17:e1009358. [PMID: 34534211 PMCID: PMC8480853 DOI: 10.1371/journal.pcbi.1009358] [Citation(s) in RCA: 16] [Impact Index Per Article: 5.3] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/03/2021] [Revised: 09/29/2021] [Accepted: 08/18/2021] [Indexed: 11/19/2022] Open
Abstract
The human brain tracks amplitude fluctuations of both speech and music, which reflects acoustic processing in addition to the encoding of higher-order features and one's cognitive state. Comparing neural tracking of speech and music envelopes can elucidate stimulus-general mechanisms, but direct comparisons are confounded by differences in their envelope spectra. Here, we use a novel method of frequency-constrained reconstruction of stimulus envelopes using EEG recorded during passive listening. We expected to see music reconstruction match speech in a narrow range of frequencies, but instead we found that speech was reconstructed better than music for all frequencies we examined. Additionally, models trained on all stimulus types performed as well or better than the stimulus-specific models at higher modulation frequencies, suggesting a common neural mechanism for tracking speech and music. However, speech envelope tracking at low frequencies, below 1 Hz, was associated with increased weighting over parietal channels, which was not present for the other stimuli. Our results highlight the importance of low-frequency speech tracking and suggest an origin from speech-specific processing in the brain.
Collapse
Affiliation(s)
- Nathaniel J. Zuk
- Department of Electronic & Electrical Engineering, Trinity College, The University of Dublin, Dublin, Ireland
- Department of Mechanical, Manufacturing & Biomedical Engineering, Trinity College, The University of Dublin, Dublin, Ireland
- Trinity College Institute of Neuroscience, Trinity College, The University of Dublin, Dublin, Ireland
- Department of Biomedical Engineering, University of Rochester, Rochester, New York, United States of America
- Del Monte Institute of Neuroscience, University of Rochester Medical Center, Rochester, New York, United States of America
| | - Jeremy W. Murphy
- Department of Electronic & Electrical Engineering, Trinity College, The University of Dublin, Dublin, Ireland
| | - Richard B. Reilly
- Department of Mechanical, Manufacturing & Biomedical Engineering, Trinity College, The University of Dublin, Dublin, Ireland
- Trinity College Institute of Neuroscience, Trinity College, The University of Dublin, Dublin, Ireland
- Trinity Centre for Biomedical Engineering, Trinity College, The University of Dublin, Dublin, Ireland
| | - Edmund C. Lalor
- Department of Electronic & Electrical Engineering, Trinity College, The University of Dublin, Dublin, Ireland
- Department of Biomedical Engineering, University of Rochester, Rochester, New York, United States of America
- Del Monte Institute of Neuroscience, University of Rochester Medical Center, Rochester, New York, United States of America
| |
Collapse
|
11
|
Boebinger D, Norman-Haignere SV, McDermott JH, Kanwisher N. Music-selective neural populations arise without musical training. J Neurophysiol 2021; 125:2237-2263. [PMID: 33596723 PMCID: PMC8285655 DOI: 10.1152/jn.00588.2020] [Citation(s) in RCA: 19] [Impact Index Per Article: 6.3] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/12/2020] [Revised: 02/12/2021] [Accepted: 02/12/2021] [Indexed: 11/22/2022] Open
Abstract
Recent work has shown that human auditory cortex contains neural populations anterior and posterior to primary auditory cortex that respond selectively to music. However, it is unknown how this selectivity for music arises. To test whether musical training is necessary, we measured fMRI responses to 192 natural sounds in 10 people with almost no musical training. When voxel responses were decomposed into underlying components, this group exhibited a music-selective component that was very similar in response profile and anatomical distribution to that previously seen in individuals with moderate musical training. We also found that musical genres that were less familiar to our participants (e.g., Balinese gamelan) produced strong responses within the music component, as did drum clips with rhythm but little melody, suggesting that these neural populations are broadly responsive to music as a whole. Our findings demonstrate that the signature properties of neural music selectivity do not require musical training to develop, showing that the music-selective neural populations are a fundamental and widespread property of the human brain.NEW & NOTEWORTHY We show that music-selective neural populations are clearly present in people without musical training, demonstrating that they are a fundamental and widespread property of the human brain. Additionally, we show music-selective neural populations respond strongly to music from unfamiliar genres as well as music with rhythm but little pitch information, suggesting that they are broadly responsive to music as a whole.
Collapse
Affiliation(s)
- Dana Boebinger
- Speech and Hearing Bioscience and Technology, Harvard University, Cambridge, Massachusetts
- Brain and Cognitive Sciences, Massachusetts Institute of Technology, Cambridge, Massachusetts
- McGovern Institute for Brain Research, Massachusetts Institute of Technology, Cambridge, Massachusetts
| | - Sam V Norman-Haignere
- Laboratoire des Sytèmes Perceptifs, Département d'Études Cognitives, École Normale Supérieure, PSL Research University, CNRS, Paris France
- Zuckerman Institute for Brain Research, Columbia University, New York, New York
| | - Josh H McDermott
- Speech and Hearing Bioscience and Technology, Harvard University, Cambridge, Massachusetts
- Brain and Cognitive Sciences, Massachusetts Institute of Technology, Cambridge, Massachusetts
- McGovern Institute for Brain Research, Massachusetts Institute of Technology, Cambridge, Massachusetts
- Center for Brains, Minds, and Machines, Massachusetts Institute of Technology, Cambridge, Massachusetts
| | - Nancy Kanwisher
- Brain and Cognitive Sciences, Massachusetts Institute of Technology, Cambridge, Massachusetts
- McGovern Institute for Brain Research, Massachusetts Institute of Technology, Cambridge, Massachusetts
- Center for Brains, Minds, and Machines, Massachusetts Institute of Technology, Cambridge, Massachusetts
| |
Collapse
|
12
|
de Cheveigné A, Slaney M, Fuglsang SA, Hjortkjaer J. Auditory stimulus-response modeling with a match-mismatch task. J Neural Eng 2021; 18. [PMID: 33849003 DOI: 10.1088/1741-2552/abf771] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/05/2020] [Accepted: 04/13/2021] [Indexed: 11/12/2022]
Abstract
Objective.An auditory stimulus can be related to the brain response that it evokes by a stimulus-response model fit to the data. This offers insight into perceptual processes within the brain and is also of potential use for devices such as brain computer interfaces (BCIs). The quality of the model can be quantified by measuring the fit with a regression problem, or by applying it to a classification task and measuring its performance.Approach.Here we focus on amatch-mismatch(MM) task that entails deciding whether a segment of brain signal matches, via a model, the auditory stimulus that evoked it.Main results. Using these metrics, we describe a range of models of increasing complexity that we compare to methods in the literature, showing state-of-the-art performance. We document in detail one particular implementation, calibrated on a publicly-available database, that can serve as a robust reference to evaluate future developments.Significance.The MM task allows stimulus-response models to be evaluated in the limit of very high model accuracy, making it an attractive alternative to the more commonly used task of auditory attention detection. The MM task does not require class labels, so it is immune to mislabeling, and it is applicable to data recorded in listening scenarios with only one sound source, thus it is cheap to obtain large quantities of training and testing data. Performance metrics from this task, associated with regression accuracy, provide complementary insights into the relation between stimulus and response, as well as information about discriminatory power directly applicable to BCI applications.
Collapse
Affiliation(s)
- Alain de Cheveigné
- Laboratoire des Systèmes Perceptifs, Paris, CNRS UMR 8248, France.,Département d'Etudes Cognitives, Ecole Normale Supérieure, Paris, PSL, France.,UCL Ear Institute, London, United Kingdom.,Audition, DEC, ENS, 29 rue d'Ulm, 75230 Paris, France
| | - Malcolm Slaney
- Google Research, Machine Hearing Group, Mountain View, CA, United States of America
| | - Søren A Fuglsang
- Danish Research Centre for Magnetic Resonance, Centre for Functional and Diagnostic Imaging and Research, Copenhagen University Hospital Hvidovre, Copenhagen, Denmark
| | - Jens Hjortkjaer
- Hearing Systems Section, Department of Health Technology, Technical University of Denmark, Kgs. Lyngby, Denmark.,Danish Research Centre for Magnetic Resonance, Centre for Functional and Diagnostic Imaging and Research, Copenhagen University Hospital Hvidovre, Copenhagen, Denmark
| |
Collapse
|
13
|
Losorelli S, Kaneshiro B, Musacchia GA, Blevins NH, Fitzgerald MB. Factors influencing classification of frequency following responses to speech and music stimuli. Hear Res 2020; 398:108101. [PMID: 33142106 DOI: 10.1016/j.heares.2020.108101] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Received: 06/10/2019] [Revised: 09/25/2020] [Accepted: 10/19/2020] [Indexed: 01/08/2023]
Abstract
Successful mapping of meaningful labels to sound input requires accurate representation of that sound's acoustic variances in time and spectrum. For some individuals, such as children or those with hearing loss, having an objective measure of the integrity of this representation could be useful. Classification is a promising machine learning approach which can be used to objectively predict a stimulus label from the brain response. This approach has been previously used with auditory evoked potentials (AEP) such as the frequency following response (FFR), but a number of key issues remain unresolved before classification can be translated into clinical practice. Specifically, past efforts at FFR classification have used data from a given subject for both training and testing the classifier. It is also unclear which components of the FFR elicit optimal classification accuracy. To address these issues, we recorded FFRs from 13 adults with normal hearing in response to speech and music stimuli. We compared labeling accuracy of two cross-validation classification approaches using FFR data: (1) a more traditional method combining subject data in both the training and testing set, and (2) a "leave-one-out" approach, in which subject data is classified based on a model built exclusively from the data of other individuals. We also examined classification accuracy on decomposed and time-segmented FFRs. Our results indicate that the accuracy of leave-one-subject-out cross validation approaches that obtained in the more conventional cross-validation classifications while allowing a subject's results to be analysed with respect to normative data pooled from a separate population. In addition, we demonstrate that classification accuracy is highest when the entire FFR is used to train the classifier. Taken together, these efforts contribute key steps toward translation of classification-based machine learning approaches into clinical practice.
Collapse
Affiliation(s)
- Steven Losorelli
- Department of Otolaryngology Head and Neck Surgery, Stanford University School of Medicine, Palo Alto, CA, USA.
| | - Blair Kaneshiro
- Department of Otolaryngology Head and Neck Surgery, Stanford University School of Medicine, Palo Alto, CA, USA.
| | - Gabriella A Musacchia
- Department of Otolaryngology Head and Neck Surgery, Stanford University School of Medicine, Palo Alto, CA, USA; Department of Audiology, University of the Pacific, San Francisco, CA, USA.
| | - Nikolas H Blevins
- Department of Otolaryngology Head and Neck Surgery, Stanford University School of Medicine, Palo Alto, CA, USA.
| | - Matthew B Fitzgerald
- Department of Otolaryngology Head and Neck Surgery, Stanford University School of Medicine, Palo Alto, CA, USA.
| |
Collapse
|