1
Sankaran N, Leonard MK, Theunissen F, Chang EF. Encoding of melody in the human auditory cortex. Sci Adv 2024;10:eadk0010. PMID: 38363839; PMCID: PMC10871532; DOI: 10.1126/sciadv.adk0010.
Abstract
Melody is a core component of music in which discrete pitches are serially arranged to convey emotion and meaning. Perception varies along several pitch-based dimensions: (i) the absolute pitch of notes, (ii) the difference in pitch between successive notes, and (iii) the statistical expectation of each note given prior context. How the brain represents these dimensions and whether their encoding is specialized for music remain unknown. We recorded high-density neurophysiological activity directly from the human auditory cortex while participants listened to Western musical phrases. Pitch, pitch-change, and expectation were selectively encoded at different cortical sites, indicating a spatial map for representing distinct melodic dimensions. The same participants listened to spoken English, and we compared responses to music and speech. Cortical sites selective for music encoded expectation, while sites that encoded pitch and pitch-change in music used the same neural code to represent equivalent properties of speech. These findings reveal how the perception of melody recruits both music-specific and general-purpose sound representations.
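A minimal sketch of the kind of site-wise encoding analysis this abstract describes, assuming entirely hypothetical feature and response arrays (none of the names or shapes below come from the authors' code): a cross-validated ridge regression scores how well each melodic dimension predicts the response at each cortical site.

```python
import numpy as np
from sklearn.linear_model import RidgeCV
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
n_notes, n_sites = 500, 64
# Hypothetical note-level predictors for the three melodic dimensions.
features = {
    "pitch": rng.normal(size=(n_notes, 1)),
    "pitch_change": rng.normal(size=(n_notes, 1)),
    "expectation": rng.normal(size=(n_notes, 1)),
}
response = rng.normal(size=(n_notes, n_sites))    # e.g., high-gamma per note x site

for name, X in features.items():
    # Cross-validated R^2 of each dimension at each cortical site.
    r2 = np.array([
        cross_val_score(RidgeCV(alphas=np.logspace(-2, 4, 13)),
                        X, response[:, s], cv=5, scoring="r2").mean()
        for s in range(n_sites)
    ])
    print(f"{name}: best-site R^2 = {r2.max():.3f}")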
Affiliation(s)
- Narayan Sankaran, Department of Neurological Surgery, University of California, San Francisco, 675 Nelson Rising Lane, San Francisco, CA 94158, USA
- Matthew K. Leonard, Department of Neurological Surgery, University of California, San Francisco, 675 Nelson Rising Lane, San Francisco, CA 94158, USA
- Frederic Theunissen, Department of Psychology, University of California, Berkeley, 2121 Berkeley Way, Berkeley, CA 94720, USA
- Edward F. Chang, Department of Neurological Surgery, University of California, San Francisco, 675 Nelson Rising Lane, San Francisco, CA 94158, USA
2
Harris I, Niven EC, Griffin A, Scott SK. Is song processing distinct and special in the auditory cortex? Nat Rev Neurosci 2023;24:711-722. PMID: 37783820; DOI: 10.1038/s41583-023-00743-4.
Abstract
Is the singing voice processed distinctively in the human brain? In this Perspective, we discuss what might distinguish song processing from speech processing in light of recent work suggesting that some cortical neuronal populations respond selectively to song, and we outline the implications for our understanding of auditory processing. We review the literature on the neural and physiological mechanisms of song production and perception and show that it provides evidence for key differences between song and speech processing. We conclude by discussing how the notion that song processing is special might contribute to theories of the neurobiological origins of vocal communication and to our understanding of the neural circuitry underlying sound processing in the human cortex.
Affiliation(s)
- Ilana Harris, Institute of Cognitive Neuroscience, University College London, London, UK
- Efe C Niven, Institute of Cognitive Neuroscience, University College London, London, UK
- Alex Griffin, Department of Psychology, University of Cambridge, Cambridge, UK
- Sophie K Scott, Institute of Cognitive Neuroscience, University College London, London, UK
3
Sankaran N, Leonard MK, Theunissen F, Chang EF. Encoding of melody in the human auditory cortex. bioRxiv [preprint] 2023:2023.10.17.562771. PMID: 37905047; PMCID: PMC10614915; DOI: 10.1101/2023.10.17.562771.
Abstract
Melody is a core component of music in which discrete pitches are serially arranged to convey emotion and meaning. Perception of melody varies along several pitch-based dimensions: (1) the absolute pitch of notes, (2) the difference in pitch between successive notes, and (3) the higher-order statistical expectation of each note conditioned on its prior context. While humans readily perceive melody, how these dimensions are collectively represented in the brain and whether their encoding is specialized for music remain unknown. Here, we recorded high-density neurophysiological activity directly from the surface of human auditory cortex while Western participants listened to Western musical phrases. Pitch, pitch-change, and expectation were selectively encoded at different cortical sites, indicating a spatial code for representing distinct dimensions of melody. The same participants listened to spoken English, and we compared evoked responses to music and speech. Cortical sites selective for music were systematically driven by the encoding of expectation. In contrast, sites that encoded pitch and pitch-change used the same neural code to represent equivalent properties of speech. These findings reveal the multidimensional nature of melody encoding, consisting of both music-specific and domain-general sound representations in auditory cortex.
Teaser: The human brain contains both general-purpose and music-specific neural populations for processing distinct attributes of melody.
4
May L, Halpern AR, Paulsen SD, Casey MA. Imagined Musical Scale Relationships Decoded from Auditory Cortex. J Cogn Neurosci 2022;34:1326-1339. PMID: 35554552; DOI: 10.1162/jocn_a_01858.
Abstract
Notes in a musical scale convey different levels of stability or incompleteness, forming what is known as a tonal hierarchy. The levels of stability conveyed by these scale degrees are partly responsible for generating expectations as a melody proceeds, for the emotions deriving from fulfillment (or not) of those expectations, and for judgments of overall melodic well-formedness. These functions can be extracted even during imagined music. We investigated whether patterns of neural activity in fMRI could be used to identify heard and imagined notes, and whether patterns associated with heard notes could identify notes that were merely imagined. We presented trained musicians with the beginning of a scale (key and timbre were varied). The next note in the scale was either heard or imagined. A probe tone task assessed sensitivity to the tonal hierarchy, and state and trait measures of imagery were included as predictors. Multivoxel classification yielded above-chance results in primary auditory cortex (Heschl's gyrus) for heard scale-degree decoding. Imagined scale-degree decoding was successful in multiple cortical regions spanning bilateral superior temporal, inferior parietal, precentral, and inferior frontal areas. The right superior temporal gyrus yielded successful cross-decoding from heard to imagined scale degrees, indicating a shared pathway between tonal-hierarchy perception and imagery. Decoding in right and left superior temporal gyrus and right inferior frontal gyrus was more successful in people with more differentiated tonal hierarchies, and decoding in left inferior frontal gyrus was more successful among people with higher self-reported auditory imagery vividness, providing a link between behavioral traits and the success of neural decoding. These results point to the neural specificity of imagined auditory experiences, even of such functional knowledge, but also document informative individual differences in the precision of that neural response.
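The heard-to-imagined cross-decoding logic described above can be sketched as follows, with hypothetical voxel-pattern arrays standing in for the fMRI data: a classifier trained on patterns evoked by heard scale degrees is tested on patterns from imagined ones.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(1)
n_trials, n_voxels = 140, 300
X_heard = rng.normal(size=(n_trials, n_voxels))   # patterns evoked by heard notes
y_heard = np.tile(np.arange(7), 20)               # scale-degree labels (0-6)
X_imag = rng.normal(size=(n_trials, n_voxels))    # patterns from imagined notes
y_imag = np.tile(np.arange(7), 20)

clf = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
clf.fit(X_heard, y_heard)                         # train on perception...
acc = clf.score(X_imag, y_imag)                   # ...test on imagery
print(f"heard-to-imagined accuracy: {acc:.3f} (chance ~ {1/7:.3f})")
```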
5
Di Liberto GM, Marion G, Shamma SA. Accurate Decoding of Imagined and Heard Melodies. Front Neurosci 2021;15:673401. PMID: 34421512; PMCID: PMC8375770; DOI: 10.3389/fnins.2021.673401.
Abstract
Music perception requires the human brain to process a variety of acoustic and music-related properties. Recent research has used encoding models to tease apart and study the various cortical contributors to music perception. To do so, such approaches study temporal response functions that summarise the neural activity over several minutes of data. Here we tested the possibility of assessing the neural processing of individual musical units (bars) with electroencephalography (EEG). We devised a decoding methodology based on a maximum correlation metric across EEG segments (maxCorr) and used it to decode melodies from EEG, based on an experiment in which professional musicians listened to and imagined four Bach melodies multiple times. We demonstrate that accurate decoding of melodies in single subjects and at the level of individual musical units is possible, both from EEG signals recorded during listening and during imagination. Furthermore, we find that greater decoding accuracies are measured for the maxCorr method than for an envelope-reconstruction approach based on backward temporal response functions (bTRFenv). These results indicate that low-frequency neural signals encode information beyond note timing; in particular, cortical signals below 1 Hz are shown to encode pitch-related information. Along with the theoretical implications of these results, we discuss the potential applications of this decoding methodology in the context of novel brain-computer interface solutions.
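A minimal sketch of a maxCorr-style decoder, assuming hypothetical single-channel template and segment arrays (the paper's actual pipeline operates on multichannel EEG): each test segment is assigned the label of the candidate melody template with which it correlates most strongly.

```python
import numpy as np

def maxcorr_decode(segment, templates):
    """Return the index of the template with the highest Pearson correlation."""
    seg = (segment - segment.mean()) / segment.std()
    scores = []
    for t in templates:
        tt = (t - t.mean()) / t.std()
        scores.append(np.mean(seg * tt))          # Pearson r for z-scored signals
    return int(np.argmax(scores))

rng = np.random.default_rng(2)
templates = [rng.normal(size=256) for _ in range(4)]  # one template per melody bar
true_label = 2
segment = templates[true_label] + 0.8 * rng.normal(size=256)  # noisy observation
print("decoded bar:", maxcorr_decode(segment, templates))
```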
Affiliation(s)
- Giovanni M Di Liberto, Laboratoire des Systèmes Perceptifs, CNRS, Paris, France; Ecole Normale Supérieure, PSL University, Paris, France; Department of Mechanical, Manufacturing and Biomedical Engineering, Trinity Centre for Biomedical Engineering, Trinity Institute of Neuroscience, Trinity College, The University of Dublin, Dublin, Ireland; Centre for Biomedical Engineering, School of Electrical and Electronic Engineering, University College Dublin, Dublin, Ireland
- Guilhem Marion, Laboratoire des Systèmes Perceptifs, CNRS, Paris, France
- Shihab A Shamma, Laboratoire des Systèmes Perceptifs, CNRS, Paris, France; Institute for Systems Research, Electrical and Computer Engineering, University of Maryland, College Park, MD, United States
6
Schreiner T, Petzka M, Staudigl T, Staresina BP. Endogenous memory reactivation during sleep in humans is clocked by slow oscillation-spindle complexes. Nat Commun 2021;12:3112. PMID: 34035303; PMCID: PMC8149676; DOI: 10.1038/s41467-021-23520-2.
Abstract
Sleep is thought to support memory consolidation via reactivation of prior experiences, with particular electrophysiological sleep signatures (slow oscillations (SOs) and sleep spindles) gating the information flow between relevant brain areas. However, empirical evidence for a role of endogenous memory reactivation (i.e., without experimentally delivered memory cues) for consolidation in humans is lacking. Here, we devised a paradigm in which participants acquired associative memories before taking a nap. Multivariate decoding was then used to capture endogenous memory reactivation during non-rapid eye movement (NREM) sleep in surface EEG recordings. Our results reveal reactivation of learning material during SO-spindle complexes, with the precision of SO-spindle coupling predicting reactivation strength. Critically, reactivation strength (i.e. classifier evidence in favor of the previously studied stimulus category) in turn predicts the level of consolidation across participants. These results elucidate the memory function of sleep in humans and emphasize the importance of SOs and spindles in clocking endogenous consolidation processes.
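The SO-spindle coupling measure at the heart of this design can be sketched roughly as follows, on a hypothetical single-channel trace (a simplification of the published multichannel pipeline): band-pass both rhythms, read off the slow-oscillation phase at each spindle-envelope peak, and summarize coupling precision as the circular resultant length.

```python
import numpy as np
from scipy.signal import butter, sosfiltfilt, hilbert

def bandpass(x, lo, hi, fs, order=3):
    sos = butter(order, [lo, hi], btype="band", fs=fs, output="sos")
    return sosfiltfilt(sos, x)

fs = 200
rng = np.random.default_rng(3)
eeg = rng.normal(size=fs * 60)                       # 60 s of fake NREM EEG

so_phase = np.angle(hilbert(bandpass(eeg, 0.5, 1.25, fs)))   # slow-oscillation phase
sp_amp = np.abs(hilbert(bandpass(eeg, 12.0, 16.0, fs)))      # spindle envelope

# Treat local maxima of the spindle envelope as (hypothetical) spindle events.
peaks = np.where((sp_amp[1:-1] > sp_amp[:-2]) & (sp_amp[1:-1] > sp_amp[2:]))[0] + 1
precision = np.abs(np.mean(np.exp(1j * so_phase[peaks])))    # circular resultant length
print(f"SO-spindle coupling precision (0-1): {precision:.3f}")
```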
Affiliation(s)
- Thomas Schreiner, Department of Psychology, Ludwig-Maximilians-University, Munich, Germany
- Marit Petzka, School of Psychology and Centre for Human Brain Health, University of Birmingham, Birmingham, UK
- Tobias Staudigl, Department of Psychology, Ludwig-Maximilians-University, Munich, Germany
- Bernhard P Staresina, School of Psychology and Centre for Human Brain Health, University of Birmingham, Birmingham, UK
7
Sauvé SA, Cho A, Zendel BR. Mapping Tonal Hierarchy in the Brain. Neuroscience 2021;465:187-202. PMID: 33774126; DOI: 10.1016/j.neuroscience.2021.03.019.
Abstract
In Western tonal music, pitches are organized hierarchically based on their perceived fit in a specific tonal context. This hierarchy forms the scales that are commonly used in Western tonal music. The hierarchical nature of tonal structure is well established behaviourally; however, its neural underpinnings are largely unknown. In this study, EEG data and goodness-of-fit ratings were collected from 34 participants who listened to an arpeggio followed by a probe tone, where the probe tone could be any chromatic scale degree and the context any of the major keys. Goodness-of-fit ratings corresponded to the classic tonal hierarchy. The N1, P2, and early right anterior negativity (ERAN) were significantly modulated by scale degree. Furthermore, neural marker amplitudes and latencies were significantly correlated, with similar magnitude, with both pitch height and goodness-of-fit ratings. This differs from the clearer divide reported by Sankaran et al. (2020) and Quiroga-Martinez et al. (2019), in which pitch height correlated with early neural markers (100-200 ms) and tonal hierarchy with late neural markers (200-1000 ms). Finally, individual differences were greater than any main effects detected when pooling participants, and brain-behavior correlations varied widely (r = -0.8 to 0.8).
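The brain-behavior correlations in this design reduce to correlating per-scale-degree neural markers with two predictors. A minimal sketch, using the Krumhansl & Kessler (1982) major-key profile and fake N1 amplitudes as placeholders:

```python
import numpy as np
from scipy.stats import pearsonr

# Krumhansl & Kessler (1982) major-key goodness-of-fit profile, chromatic order.
kk_major = np.array([6.35, 2.23, 3.48, 2.33, 4.38, 4.09,
                     2.52, 5.19, 2.39, 3.66, 2.29, 2.88])
pitch_height = np.arange(12.0)                   # semitones above the tonic

rng = np.random.default_rng(4)
n1_amp = 0.5 * kk_major + rng.normal(scale=0.5, size=12)  # fake N1 amplitudes

for name, predictor in [("tonal hierarchy", kk_major),
                        ("pitch height", pitch_height)]:
    r, p = pearsonr(predictor, n1_amp)
    print(f"N1 amplitude vs {name}: r = {r:.2f}, p = {p:.3f}")
```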
Affiliation(s)
- Sarah A Sauvé, Division of Community Health and Humanities, Faculty of Medicine, Memorial University of Newfoundland, St. John's, Newfoundland and Labrador A1C 5S7, Canada
- Alex Cho, Division of Community Health and Humanities, Faculty of Medicine, Memorial University of Newfoundland, St. John's, Newfoundland and Labrador A1C 5S7, Canada
- Benjamin Rich Zendel, Division of Community Health and Humanities, Faculty of Medicine, Memorial University of Newfoundland, St. John's, Newfoundland and Labrador A1C 5S7, Canada; Aging Research Centre - Newfoundland and Labrador, Grenfell Campus, Memorial University of Newfoundland, Canada
8
Losorelli S, Kaneshiro B, Musacchia GA, Blevins NH, Fitzgerald MB. Factors influencing classification of frequency following responses to speech and music stimuli. Hear Res 2020;398:108101. PMID: 33142106; DOI: 10.1016/j.heares.2020.108101.
Abstract
Successful mapping of meaningful labels to sound input requires accurate representation of that sound's acoustic variances in time and spectrum. For some individuals, such as children or those with hearing loss, an objective measure of the integrity of this representation could be useful. Classification is a promising machine-learning approach that can be used to objectively predict a stimulus label from the brain response. This approach has been used previously with auditory evoked potentials (AEPs) such as the frequency following response (FFR), but a number of key issues remain unresolved before classification can be translated into clinical practice. Specifically, past efforts at FFR classification have used data from a given subject for both training and testing the classifier. It is also unclear which components of the FFR elicit optimal classification accuracy. To address these issues, we recorded FFRs from 13 adults with normal hearing in response to speech and music stimuli. We compared the labeling accuracy of two cross-validation classification approaches using FFR data: (1) a more traditional method combining subject data in both the training and testing set, and (2) a "leave-one-out" approach, in which a subject's data are classified based on a model built exclusively from the data of other individuals. We also examined classification accuracy on decomposed and time-segmented FFRs. Our results indicate that leave-one-subject-out cross-validation achieves accuracy comparable to that of the more conventional cross-validation classifications, while allowing a subject's results to be analysed with respect to normative data pooled from a separate population. In addition, we demonstrate that classification accuracy is highest when the entire FFR is used to train the classifier. Taken together, these efforts contribute key steps toward the translation of classification-based machine-learning approaches into clinical practice.
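The leave-one-subject-out scheme contrasted here maps directly onto scikit-learn's LeaveOneGroupOut. A minimal sketch with hypothetical FFR feature vectors (the real study additionally examined decomposed and time-segmented FFRs):

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import LeaveOneGroupOut, cross_val_score

rng = np.random.default_rng(5)
n_subjects, trials_per_subject, n_features = 13, 40, 100
X = rng.normal(size=(n_subjects * trials_per_subject, n_features))
y = rng.integers(0, 2, n_subjects * trials_per_subject)  # e.g., speech vs music
groups = np.repeat(np.arange(n_subjects), trials_per_subject)

# Each fold trains on 12 subjects and tests on the one left out.
scores = cross_val_score(LogisticRegression(max_iter=1000), X, y,
                         groups=groups, cv=LeaveOneGroupOut())
print("per-subject accuracies:", np.round(scores, 2))
```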
Affiliation(s)
- Steven Losorelli, Department of Otolaryngology Head and Neck Surgery, Stanford University School of Medicine, Palo Alto, CA, USA
- Blair Kaneshiro, Department of Otolaryngology Head and Neck Surgery, Stanford University School of Medicine, Palo Alto, CA, USA
- Gabriella A Musacchia, Department of Otolaryngology Head and Neck Surgery, Stanford University School of Medicine, Palo Alto, CA, USA; Department of Audiology, University of the Pacific, San Francisco, CA, USA
- Nikolas H Blevins, Department of Otolaryngology Head and Neck Surgery, Stanford University School of Medicine, Palo Alto, CA, USA
- Matthew B Fitzgerald, Department of Otolaryngology Head and Neck Surgery, Stanford University School of Medicine, Palo Alto, CA, USA
9
Sankaran N, et al. The Rapid Emergence of Musical Pitch Structure in Human Cortex. J Neurosci 2020;40:2108-2118. PMID: 32001611; DOI: 10.1523/jneurosci.1399-19.2020.
Abstract
In tonal music, continuous acoustic waveforms are mapped onto discrete, hierarchically arranged, internal representations of pitch. To examine the neural dynamics underlying this transformation, we presented male and female human listeners with tones embedded within a Western tonal context while recording their cortical activity using magnetoencephalography. Machine learning classifiers were then trained to decode different tones from their underlying neural activation patterns at each peristimulus time sample, providing a dynamic measure of their dissimilarity in cortex. Comparing the time-varying dissimilarity between tones with the predictions of acoustic and perceptual models, we observed a temporal evolution in the brain's representational structure. Whereas initial dissimilarities mirrored their fundamental-frequency separation, dissimilarities beyond 200 ms reflected the perceptual status of each tone within the tonal hierarchy of Western music. These effects occurred regardless of stimulus regularities within the context or whether listeners were engaged in a task requiring explicit pitch analysis. Lastly, patterns of cortical activity that discriminated between tones became increasingly stable in time as the information coded by those patterns transitioned from low-to-high level properties. Current results reveal the dynamics with which the complex perceptual structure of Western tonal music emerges in cortex at the timescale of an individual tone.
SIGNIFICANCE STATEMENT: Little is understood about how the brain transforms an acoustic waveform into the complex perceptual structure of musical pitch. Applying neural decoding techniques to the cortical activity of human subjects engaged in music listening, we measured the dynamics of information processing in the brain on a moment-to-moment basis as subjects heard each tone. In the first 200 ms after onset, transient patterns of neural activity coded the fundamental frequency of tones. Subsequently, a period emerged during which more temporally stable activation patterns coded the perceptual status of each tone within the "tonal hierarchy" of Western music. Our results provide a crucial link between the complex perceptual structure of tonal music and the underlying neural dynamics from which it emerges.
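The time-resolved decoding described here trains an independent classifier at every peristimulus sample. A minimal sketch with a hypothetical MEG array (all shapes and labels below are placeholders):

```python
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(6)
n_sensors, n_times = 50, 120
tones = np.repeat(np.arange(12), 20)               # 12 tones x 20 trials each
meg = rng.normal(size=(tones.size, n_sensors, n_times))  # trials x sensors x samples

# One cross-validated classifier per time sample yields a decoding time course.
accuracy = np.array([
    cross_val_score(LinearDiscriminantAnalysis(), meg[:, :, t], tones, cv=5).mean()
    for t in range(n_times)
])
print(f"peak decoding at sample {accuracy.argmax()}: "
      f"{accuracy.max():.3f} (chance {1/12:.3f})")
```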
10
Ogg M, Carlson TA, Slevc LR. The Rapid Emergence of Auditory Object Representations in Cortex Reflect Central Acoustic Attributes. J Cogn Neurosci 2019;32:111-123. PMID: 31560265; DOI: 10.1162/jocn_a_01472.
Abstract
Human listeners are bombarded by acoustic information that the brain rapidly organizes into coherent percepts of objects and events in the environment, which aids speech and music perception. The efficiency of auditory object recognition belies the critical constraint that acoustic stimuli necessarily require time to unfold. Using magnetoencephalography, we studied the time course of the neural processes that transform dynamic acoustic information into auditory object representations. Participants listened to a diverse set of 36 tokens comprising everyday sounds from a typical human environment. Multivariate pattern analysis was used to decode the sound tokens from the magnetoencephalographic recordings. We show that sound tokens can be decoded from brain activity beginning 90 msec after stimulus onset with peak decoding performance occurring at 155 msec poststimulus onset. Decoding performance was primarily driven by differences between category representations (e.g., environmental vs. instrument sounds), although within-category decoding was better than chance. Representational similarity analysis revealed that these emerging neural representations were related to harmonic and spectrotemporal differences among the stimuli, which correspond to canonical acoustic features processed by the auditory pathway. Our findings begin to link the processing of physical sound properties with the perception of auditory objects and events in cortex.
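The representational similarity analysis step can be sketched as follows, with hypothetical patterns and acoustic features in place of the real stimuli: rank-correlate the neural representational dissimilarity matrix (RDM) with model RDMs built from acoustic features.

```python
import numpy as np
from scipy.spatial.distance import pdist
from scipy.stats import spearmanr

rng = np.random.default_rng(7)
n_tokens = 36                                       # one everyday-sound token each
neural_patterns = rng.normal(size=(n_tokens, 50))   # hypothetical sensor patterns
models = {
    "harmonicity": rng.normal(size=(n_tokens, 1)),          # placeholder features
    "spectrotemporal": rng.normal(size=(n_tokens, 10)),
}

neural_rdm = pdist(neural_patterns, metric="correlation")   # neural dissimilarities
for name, feat in models.items():
    rho, _ = spearmanr(neural_rdm, pdist(feat, metric="euclidean"))
    print(f"neural RDM vs {name} model RDM: Spearman rho = {rho:.2f}")
```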