1
Salakka I, Pitkäniemi A, Pentikäinen E, Saari P, Toiviainen P, Särkämö T. Emotional and musical factors combined with song-specific age predict the subjective autobiographical saliency of music in older adults. Psychol Music 2024;52:305-321. PMID: 38708378; PMCID: PMC11068497; DOI: 10.1177/03057356231186961.
Abstract
Music that evokes strong emotional responses is often experienced as autobiographically salient. Through emotional experience, the musical features of songs could also contribute to their subjective autobiographical saliency. Songs that were popular during adolescence or young adulthood (ages 10-30) tend to evoke stronger memories, a phenomenon known as the reminiscence bump. In the present study, we sought to determine how song-specific age, emotional responsiveness to music, musical features, and subjective memory functioning contribute to the subjective autobiographical saliency of music in older adults. In a music listening study, 112 participants rated excerpts of popular songs from the 1950s to the 1980s for autobiographical saliency. Additionally, they filled out questionnaires about emotional responsiveness to music and subjective memory functioning. The song excerpts' musical features were extracted computationally using MIRtoolbox. Results showed that autobiographical saliency was best predicted by song-specific age, emotional responsiveness to music, and musical features. Newer songs that were more similar in rhythm to older songs were also rated higher in autobiographical saliency. Overall, this study contributes to autobiographical memory research by uncovering a set of factors affecting the subjective autobiographical saliency of music.
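Song-specific age here means the listener's age at the time a song was popular. As a rough, hedged illustration of this kind of pipeline (not the authors' code), the sketch below extracts a few musical descriptors from excerpts and regresses saliency ratings on them together with song-specific age; MIRtoolbox is a MATLAB library, so librosa stands in as a Python analogue, and all file names and values are hypothetical.

```python
# Minimal sketch, assuming librosa as a stand-in for MIRtoolbox; file names,
# ratings, and song-age values are invented for illustration.
import librosa
import numpy as np
from sklearn.linear_model import LinearRegression

def describe_excerpt(path):
    y, sr = librosa.load(path, sr=22050, mono=True)
    tempo, _ = librosa.beat.beat_track(y=y, sr=sr)                    # rhythm
    centroid = librosa.feature.spectral_centroid(y=y, sr=sr).mean()   # timbre
    onsets = librosa.onset.onset_strength(y=y, sr=sr).mean()          # eventfulness
    return [float(tempo), float(centroid), float(onsets)]

paths = ["song01.wav", "song02.wav", "song03.wav", "song04.wav"]  # replace with real excerpts
song_specific_age = np.array([14.0, 22.0, 28.0, 35.0])  # age when each song was a hit
saliency = np.array([6.1, 5.4, 4.8, 3.2])                # mean saliency rating per song

features = np.array([describe_excerpt(p) for p in paths])
X = np.column_stack([features, song_specific_age])
model = LinearRegression().fit(X, saliency)
print(dict(zip(["tempo", "centroid", "onsets", "song_age"], model.coef_)))
```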
Affiliation(s)
- Ilja Salakka
- Music, Ageing and Rehabilitation Team, Cognitive Brain Research Unit, Department of Psychology and Logopedics, Faculty of Medicine, University of Helsinki, Helsinki, Finland
- Centre of Excellence in Music, Mind, Body and Brain, University of Jyväskylä and University of Helsinki, Helsinki, Finland
- Rehabilitation Foundation, Helsinki, Finland
- Anni Pitkäniemi
- Music, Ageing and Rehabilitation Team, Cognitive Brain Research Unit, Department of Psychology and Logopedics, Faculty of Medicine, University of Helsinki, Helsinki, Finland
- Centre of Excellence in Music, Mind, Body and Brain, University of Jyväskylä and University of Helsinki, Helsinki, Finland
- Emmi Pentikäinen
- Music, Ageing and Rehabilitation Team, Cognitive Brain Research Unit, Department of Psychology and Logopedics, Faculty of Medicine, University of Helsinki, Helsinki, Finland
- Centre of Excellence in Music, Mind, Body and Brain, University of Jyväskylä and University of Helsinki, Helsinki, Finland
- Pasi Saari
- Department of Music, Art and Culture Studies, University of Jyväskylä, Jyväskylä, Finland
- Petri Toiviainen
- Centre of Excellence in Music, Mind, Body and Brain, University of Jyväskylä and University of Helsinki, Helsinki, Finland
- Department of Music, Art and Culture Studies, University of Jyväskylä, Jyväskylä, Finland
- Teppo Särkämö
- Music, Ageing and Rehabilitation Team, Cognitive Brain Research Unit, Department of Psychology and Logopedics, Faculty of Medicine, University of Helsinki, Helsinki, Finland
- Centre of Excellence in Music, Mind, Body and Brain, University of Jyväskylä and University of Helsinki, Helsinki, Finland
2
Nussbaum C, Schirmer A, Schweinberger SR. Musicality - Tuned to the melody of vocal emotions. Br J Psychol 2024;115:206-225. PMID: 37851369; DOI: 10.1111/bjop.12684.
Abstract
Musicians outperform non-musicians in vocal emotion perception, likely because of increased sensitivity to acoustic cues, such as fundamental frequency (F0) and timbre. Yet, how musicians make use of these acoustic cues to perceive emotions, and how they might differ from non-musicians, is unclear. To address these points, we created vocal stimuli that conveyed happiness, fear, pleasure or sadness, either in all acoustic cues, or selectively in either F0 or timbre only. We then compared vocal emotion perception performance between professional/semi-professional musicians (N = 39) and non-musicians (N = 38), all socialized in Western music culture. Compared to non-musicians, musicians classified vocal emotions more accurately. This advantage was seen in the full and F0-modulated conditions but was absent in the timbre-modulated condition, indicating that musicians excel at perceiving the melody (F0), but not the timbre, of vocal emotions. Further, F0 seemed more important than timbre for the recognition of all emotional categories. Additional exploratory analyses revealed a link between time-varying F0 perception in music and voices that was independent of musical training. Together, these findings suggest that musicians are particularly tuned to the melody of vocal emotions, presumably due to a natural predisposition to exploit melodic patterns.
Affiliation(s)
- Christine Nussbaum
- Department for General Psychology and Cognitive Neuroscience, Friedrich Schiller University, Jena, Germany
- Voice Research Unit, Friedrich Schiller University, Jena, Germany
- Annett Schirmer
- Department for General Psychology and Cognitive Neuroscience, Friedrich Schiller University, Jena, Germany
- Institute of Psychology, University of Innsbruck, Innsbruck, Austria
- Stefan R Schweinberger
- Department for General Psychology and Cognitive Neuroscience, Friedrich Schiller University, Jena, Germany
- Voice Research Unit, Friedrich Schiller University, Jena, Germany
- Swiss Center for Affective Sciences, University of Geneva, Geneva, Switzerland
3
Nussbaum C, Schirmer A, Schweinberger SR. Electrophysiological Correlates of Vocal Emotional Processing in Musicians and Non-Musicians. Brain Sci 2023;13:1563. PMID: 38002523; PMCID: PMC10670383; DOI: 10.3390/brainsci13111563.
Abstract
Musicians outperform non-musicians in vocal emotion recognition, but the underlying mechanisms are still debated. Behavioral measures highlight the importance of auditory sensitivity towards emotional voice cues. However, it remains unclear whether and how this group difference is reflected at the brain level. Here, we compared event-related potentials (ERPs) to acoustically manipulated voices between musicians (n = 39) and non-musicians (n = 39). We used parameter-specific voice morphing to create and present vocal stimuli that conveyed happiness, fear, pleasure, or sadness, either in all acoustic cues or selectively in either pitch contour (F0) or timbre. Although the fronto-central P200 (150-250 ms) and N400 (300-500 ms) components were modulated by pitch and timbre, differences between musicians and non-musicians appeared only for a centro-parietal late positive potential (500-1000 ms). Thus, this study does not support an early auditory specialization in musicians but suggests instead that musicality affects the manner in which listeners use acoustic voice cues during later, controlled aspects of emotion evaluation.
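The ERP measures named here (P200, N400, late positive potential) are typically quantified as mean amplitudes in fixed time windows. A toy sketch of that step on a simulated trials x channels x samples array follows; the sampling rate, epoch limits, and windows are illustrative, not the study's recording parameters.

```python
# Toy sketch of component mean-amplitude extraction; data are simulated.
import numpy as np

sfreq = 500.0                      # sampling rate (Hz), assumed
tmin = -0.2                        # epoch start relative to stimulus onset (s)
rng = np.random.default_rng(0)
epochs = rng.normal(size=(40, 64, 600))   # trials x channels x samples (1.2 s)

def mean_amplitude(epochs, t_lo, t_hi, sfreq=sfreq, tmin=tmin):
    """Average voltage in [t_lo, t_hi] seconds over trials and time."""
    i_lo = int(round((t_lo - tmin) * sfreq))
    i_hi = int(round((t_hi - tmin) * sfreq))
    return epochs[:, :, i_lo:i_hi].mean(axis=(0, 2))  # one value per channel

p200 = mean_amplitude(epochs, 0.150, 0.250)  # fronto-central P200 window
lpp = mean_amplitude(epochs, 0.500, 1.000)   # late positive potential window
print(p200.shape, lpp.shape)
```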
Affiliation(s)
- Christine Nussbaum
- Department for General Psychology and Cognitive Neuroscience, Friedrich Schiller University, 07743 Jena, Germany
- Voice Research Unit, Friedrich Schiller University, 07743 Jena, Germany
- Annett Schirmer
- Department for General Psychology and Cognitive Neuroscience, Friedrich Schiller University, 07743 Jena, Germany
- Institute of Psychology, University of Innsbruck, 6020 Innsbruck, Austria
- Stefan R. Schweinberger
- Department for General Psychology and Cognitive Neuroscience, Friedrich Schiller University, 07743 Jena, Germany
- Voice Research Unit, Friedrich Schiller University, 07743 Jena, Germany
- Swiss Center for Affective Sciences, University of Geneva, 1202 Geneva, Switzerland
4
Verma T, Aker SC, Marozeau J. Effect of Vibrotactile Stimulation on Auditory Timbre Perception for Normal-Hearing Listeners and Cochlear-Implant Users. Trends Hear 2023;27:23312165221138390. PMID: 36789758; PMCID: PMC9932763; DOI: 10.1177/23312165221138390.
Abstract
The study tests the hypothesis that vibrotactile stimulation can affect timbre perception. A multidimensional scaling experiment was conducted. Twenty listeners with normal hearing and nine cochlear implant users were asked to judge the dissimilarity of a set of synthetic sounds that varied in attack time and amplitude modulation depth. The listeners were simultaneously presented with vibrotactile stimuli, which also varied in attack time and amplitude modulation depth. The results showed that alterations to the temporal waveform of the tactile stimuli affected the listeners' dissimilarity judgments of the audio. A three-dimensional analysis revealed evidence of crossmodal processing, in which the audio and tactile cues jointly accounted for the dissimilarity judgments. For the normal-hearing listeners, 86% of the first dimension was explained by audio impulsiveness and 14% by tactile impulsiveness; 75% of the second dimension was explained by the audio roughness or fast amplitude modulation, while its tactile counterpart explained 25%. Interestingly, the third dimension revealed a combination of 43% audio impulsiveness and 57% tactile amplitude modulation. For the CI listeners, the first dimension was mostly accounted for by the tactile roughness and the second by the audio impulsiveness. This experiment shows that the perception of timbre can be affected by tactile input and could guide the development of new audio-tactile devices for people with hearing impairment.
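The multidimensional scaling step can be sketched as follows; this is a generic sklearn-based illustration on a random stand-in dissimilarity matrix, not the study's data or exact algorithm.

```python
# Sketch: embed pairwise dissimilarity judgments in 3 dimensions.
import numpy as np
from sklearn.manifold import MDS

rng = np.random.default_rng(1)
n = 8                                   # number of audio-tactile stimuli (assumed)
d = rng.uniform(1, 9, size=(n, n))      # raw pairwise dissimilarity ratings
D = (d + d.T) / 2                       # symmetrize averaged judgments
np.fill_diagonal(D, 0.0)                # a stimulus is identical to itself

mds = MDS(n_components=3, dissimilarity="precomputed", random_state=0)
coords = mds.fit_transform(D)           # one 3-D point per stimulus
print(coords.shape, round(mds.stress_, 2))
```

Each resulting dimension can then be regressed against candidate acoustic or tactile predictors (e.g., impulsiveness) to interpret the space, which is how percentages like those above are obtained.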
Affiliation(s)
- Tushar Verma
- Music and Cochlear Implant Lab, Department of Health Technology, Technical University of Denmark, Kongens Lyngby, Denmark
- Scott C. Aker
- Music and Cochlear Implant Lab, Department of Health Technology, Technical University of Denmark, Kongens Lyngby, Denmark
- Oticon Medical, Smørum, Denmark
- Jeremy Marozeau
- Music and Cochlear Implant Lab, Department of Health Technology, Technical University of Denmark, Kongens Lyngby, Denmark
5
Nussbaum C, Schirmer A, Schweinberger SR. Contributions of fundamental frequency and timbre to vocal emotion perception and their electrophysiological correlates. Soc Cogn Affect Neurosci 2022;17:1145-1154. PMID: 35522247; PMCID: PMC9714422; DOI: 10.1093/scan/nsac033.
Abstract
Our ability to infer a speaker's emotional state depends on the processing of acoustic parameters such as fundamental frequency (F0) and timbre. Yet, how these parameters are processed and integrated to inform emotion perception remains largely unknown. Here we pursued this issue using a novel parameter-specific voice morphing technique to create stimuli with emotion modulations in only F0 or only timbre. We used these stimuli together with fully modulated vocal stimuli in an event-related potential (ERP) study in which participants listened to and identified stimulus emotion. ERPs (P200 and N400) and behavioral data converged in showing that both F0 and timbre support emotion processing but do so differently for different emotions: Whereas F0 was most relevant for responses to happy, fearful and sad voices, timbre was most relevant for responses to voices expressing pleasure. Together, these findings offer original insights into the relative significance of different acoustic parameters for early neuronal representations of speaker emotion and show that such representations are predictive of subsequent evaluative judgments.
Affiliation(s)
- Christine Nussbaum
- Department for General Psychology and Cognitive Neuroscience, Friedrich Schiller University Jena, Jena 07743, Germany
- Annett Schirmer
- Department of Psychology, The Chinese University of Hong Kong, Shatin 999077, Hong Kong SAR
- Brain and Mind Institute, The Chinese University of Hong Kong, Shatin 999077, Hong Kong SAR
- Center for Cognition and Brain Studies, The Chinese University of Hong Kong, Shatin 999077, Hong Kong SAR
- Stefan R Schweinberger
- Department for General Psychology and Cognitive Neuroscience, Friedrich Schiller University, Jena 07743, Germany
- Voice Research Unit, Friedrich Schiller University, Jena 07743, Germany
- Swiss Center for Affective Sciences, University of Geneva, Geneva 1202, Switzerland
6
Reymore L, Beauvais-Lacasse E, Smith BK, McAdams S. Modeling Noise-Related Timbre Semantic Categories of Orchestral Instrument Sounds With Audio Features, Pitch Register, and Instrument Family. Front Psychol 2022;13:796422. PMID: 35432090; PMCID: PMC9010607; DOI: 10.3389/fpsyg.2022.796422.
Abstract
Audio features such as inharmonicity, noisiness, and spectral roll-off have been identified as correlates of "noisy" sounds. However, such features are likely involved in the experience of multiple semantic timbre categories of varied meaning and valence. This paper examines the relationships of stimulus properties and audio features with the semantic timbre categories raspy/grainy/rough, harsh/noisy, and airy/breathy. Participants (n = 153) rated a random subset of 52 stimuli from a set of 156 approximately 2-s orchestral instrument sounds representing varied instrument families (woodwinds, brass, strings, percussion), registers (octaves 2 through 6, where middle C is in octave 4), and both traditional and extended playing techniques (e.g., flutter-tonguing, bowing at the bridge). Stimuli were rated on the three semantic categories of interest, as well as on perceived playing exertion and emotional valence. Correlational analyses demonstrated a strong negative relationship between positive valence and perceived physical exertion. Exploratory linear mixed models revealed significant effects of extended technique and pitch register on valence, the perception of physical exertion, raspy/grainy/rough, and harsh/noisy. Instrument family was significantly related to ratings of airy/breathy. With an updated version of the Timbre Toolbox (R-2021 A), we used 44 summary audio features, extracted from the stimuli using spectral and harmonic representations, as input for various models built to predict mean semantic ratings for each sound on the three semantic categories, on perceived exertion, and on valence. Random Forest models predicting semantic ratings from audio features outperformed Partial Least-Squares Regression models, consistent with previous results suggesting that non-linear methods are advantageous in timbre semantic predictions using audio features. Relative Variable Importance measures from the models among the three semantic categories demonstrate that although these related semantic categories are associated in part with overlapping features, they can be differentiated through individual patterns of audio feature relationships.
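A hedged sketch of the model comparison described above follows, using simulated data in place of the 156 sounds and 44 Timbre Toolbox features; the nonlinear term injected into the target is there only so that the Random Forest has structure the linear PLS model cannot capture, mirroring the reported advantage of non-linear methods.

```python
# Sketch: Random Forest vs. Partial Least-Squares Regression on stand-in data.
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.cross_decomposition import PLSRegression
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
X = rng.normal(size=(156, 44))                 # 156 sounds x 44 audio features
y = X[:, 0] - 0.5 * X[:, 3] ** 2 + rng.normal(scale=0.3, size=156)

rf = RandomForestRegressor(n_estimators=300, random_state=0)
pls = PLSRegression(n_components=5)

print("RF  R2:", cross_val_score(rf, X, y, cv=5, scoring="r2").mean())
print("PLS R2:", cross_val_score(pls, X, y, cv=5, scoring="r2").mean())

# Relative variable importance, as used to compare semantic categories:
rf.fit(X, y)
print("top features:", np.argsort(rf.feature_importances_)[::-1][:5])
```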
Affiliation(s)
- Lindsey Reymore
- Department of Music Research, Schulich School of Music, McGill University, Montreal, QC, Canada
7
Kazazis S, Depalle P, McAdams S. Interval and Ratio Scaling of Spectral Audio Descriptors. Front Psychol 2022;13:835401. PMID: 35432077; PMCID: PMC9007158; DOI: 10.3389/fpsyg.2022.835401.
Abstract
Two experiments were conducted for the derivation of psychophysical scales of the following audio descriptors: spectral centroid, spectral spread, spectral skewness, odd-to-even harmonic ratio, spectral deviation, and spectral slope. The stimulus sets of each audio descriptor were synthesized and (wherever possible) independently controlled through appropriate synthesis techniques. Partition scaling methods were used in both experiments, and the scales were constructed by fitting well-behaving functions to the listeners' ratings. In the first experiment, the listeners' task was the estimation of the relative differences between successive levels of a particular audio descriptor. The median values of listeners' ratings increased with increasing feature values, which confirmed listeners' abilities to estimate intervals. However, there was a large variability in the reliability of the derived interval scales depending on the stimulus spacing in each trial. In the second experiment, listeners had control over the stimulus values and were asked to divide the presented range of values into perceptually equal intervals, which provides a ratio scale. For every descriptor, the reliability of the derived ratio scales was excellent. The unit of a particular ratio scale was assigned empirically so as to facilitate qualitative comparisons between the scales of all audio descriptors. The construction of psychophysical scales based on univariate stimuli allowed for the establishment of cause-and-effect relations between audio descriptors and perceptual dimensions, contrary to past research that has relied on multivariate stimuli and has only examined the correlations between the two. Most importantly, this study provides an understanding of the ways in which the sensation magnitudes of several audio descriptors are apprehended.
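The scale-construction step (fitting a well-behaved function to partition-scaling ratings) might look like the sketch below; the power function is one plausible choice, and all data values are invented, since the paper's fitted forms are not reproduced here.

```python
# Sketch: fit a power function to median ratings of a spectral descriptor.
import numpy as np
from scipy.optimize import curve_fit

descriptor = np.array([500., 1000., 2000., 4000., 8000.])  # e.g. spectral centroid (Hz)
rating = np.array([1.0, 2.1, 3.9, 6.5, 9.0])               # median listener ratings

def power_law(x, a, b):
    return a * x ** b

(a, b), _ = curve_fit(power_law, descriptor, rating, p0=(0.01, 0.8))
print(f"psychophysical scale: S = {a:.4g} * centroid^{b:.2f}")
```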
Affiliation(s)
- Savvas Kazazis
- Schulich School of Music, McGill University, Montreal, QC, Canada
8
Wang X, Wei Y, Heng L, McAdams S. A Cross-Cultural Analysis of the Influence of Timbre on Affect Perception in Western Classical Music and Chinese Music Traditions. Front Psychol 2021;12:732865. PMID: 34659045; PMCID: PMC8511703; DOI: 10.3389/fpsyg.2021.732865.
Abstract
Timbre is one of the psychophysical cues that has a great impact on affect perception, although it has not been the subject of much cross-cultural research. Our aim is to investigate the influence of timbre on the perception of affect conveyed by Western and Chinese classical music using a cross-cultural approach. Four listener groups (Western musicians, Western nonmusicians, Chinese musicians, and Chinese nonmusicians; 40 per group) were presented with 48 musical excerpts, comprising two excerpts (one piece of Chinese and one piece of Western classical music) per affect quadrant of the valence-arousal space, representing angry, happy, peaceful, and sad emotions and played with six different instruments (erhu, dizi, pipa, violin, flute, and guitar). Participants reported ratings of valence, tension arousal, energy arousal, preference, and familiarity on continuous scales ranging from 1 to 9. ANOVA reveals that participants' cultural backgrounds have a greater impact on affect perception than their musical backgrounds, and that musicians more clearly distinguish between a perceived measure (valence) and a felt measure (preference) than do nonmusicians. We applied linear partial least squares regression to explore the relation between affect perception and acoustic features. The results show that the important acoustic features for valence and energy arousal are similar, relating mostly to spectral variation, the shape of the temporal envelope, and the dynamic range. The important acoustic features for tension arousal describe the shape of the spectral envelope, noisiness, and the shape of the temporal envelope. The similarity of perceived affect ratings between instruments is explained by shared acoustic features arising from the physical characteristics of specific instruments and performing techniques.
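For the group comparison, a two-way ANOVA of the kind reported above can be sketched as follows; the data frame, effect size, and column names are simulated assumptions, not the study's data.

```python
# Sketch: two-way ANOVA on valence ratings (culture x musicianship).
import numpy as np
import pandas as pd
import statsmodels.api as sm
from statsmodels.formula.api import ols

rng = np.random.default_rng(0)
df = pd.DataFrame({
    "culture": np.repeat(["Western", "Chinese"], 80),
    "musician": np.tile(np.repeat(["musician", "nonmusician"], 40), 2),
    "valence": rng.normal(5, 1, 160),
})
df.loc[df.culture == "Chinese", "valence"] += 0.6   # inject a culture effect

model = ols("valence ~ C(culture) * C(musician)", data=df).fit()
print(sm.stats.anova_lm(model, typ=2))              # F tests for both factors
```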
Affiliation(s)
- Xin Wang
- School of Music and Recording Art, Communication University of China, Beijing, China
- Yujia Wei
- School of Music and Recording Art, Communication University of China, Beijing, China
- Lena Heng
- Schulich School of Music, McGill University, Montreal, QC, Canada
- Stephen McAdams
- Schulich School of Music, McGill University, Montreal, QC, Canada
9
Durojaye C, Fink L, Roeske T, Wald-Fuhrmann M, Larrouy-Maestri P. Perception of Nigerian Dùndún Talking Drum Performances as Speech-Like vs. Music-Like: The Role of Familiarity and Acoustic Cues. Front Psychol 2021;12:652673. PMID: 34093341; PMCID: PMC8173200; DOI: 10.3389/fpsyg.2021.652673.
Abstract
It seems trivial to identify sound sequences as music or speech, particularly when the sequences come from different sound sources, such as an orchestra and a human voice. Can we also easily distinguish these categories when the sequence comes from the same sound source? On the basis of which acoustic features? We investigated these questions by examining listeners' classification of sound sequences performed by an instrument intertwining both speech and music: the dùndún talking drum. The dùndún is commonly used in south-west Nigeria as a musical instrument but is also perfectly fit for linguistic usage in what has been described as speech surrogates in Africa. One hundred seven participants from diverse geographical locations (15 different mother tongues represented) took part in an online experiment. Fifty-one participants reported being familiar with the dùndún talking drum, 55% of those being speakers of Yorùbá. During the experiment, participants listened to 30 dùndún samples, each about 7 s long, performed either as music or as Yorùbá speech surrogate (n = 15 each) by a professional musician, and were asked to classify each sample as music or speech-like. The classification task revealed the ability of the listeners to identify the samples as intended by the performer, particularly when they were familiar with the dùndún, though even unfamiliar participants performed above chance. A logistic regression predicting participants' classification of the samples from several acoustic features confirmed the perceptual relevance of intensity, pitch, timbre, and timing measures and their interaction with listener familiarity. In all, this study provides empirical evidence supporting the discriminating role of acoustic features and the modulatory role of familiarity in teasing apart speech and music.
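The classification analysis can be illustrated with a logistic regression including a feature-by-familiarity interaction, as sketched below on simulated data; the predictor name and effect sizes are assumptions for illustration only.

```python
# Sketch: logistic regression of music/speech judgments with an
# acoustic-feature x familiarity interaction; data are simulated.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(0)
n = 600
df = pd.DataFrame({
    "pitch_var": rng.normal(size=n),        # z-scored acoustic measure (assumed)
    "familiar": rng.integers(0, 2, n),      # knows the dùndún (1) or not (0)
})
logit_true = 0.4 * df.pitch_var + 0.9 * df.pitch_var * df.familiar
df["said_music"] = (rng.random(n) < 1 / (1 + np.exp(-logit_true))).astype(int)

fit = smf.logit("said_music ~ pitch_var * familiar", data=df).fit(disp=0)
print(fit.params)   # the interaction term captures familiarity's modulation
```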
Affiliation(s)
- Cecilia Durojaye
- Department of Music, Max Planck Institute for Empirical Aesthetics, Frankfurt am Main, Germany
- Department of Psychology, Arizona State University, Tempe, AZ, United States
- Lauren Fink
- Department of Music, Max Planck Institute for Empirical Aesthetics, Frankfurt am Main, Germany
- Max Planck-NYU Center for Language, Music, and Emotion, Frankfurt am Main, Germany
- Tina Roeske
- Department of Music, Max Planck Institute for Empirical Aesthetics, Frankfurt am Main, Germany
- Melanie Wald-Fuhrmann
- Department of Music, Max Planck Institute for Empirical Aesthetics, Frankfurt am Main, Germany
- Max Planck-NYU Center for Language, Music, and Emotion, Frankfurt am Main, Germany
- Pauline Larrouy-Maestri
- Max Planck-NYU Center for Language, Music, and Emotion, Frankfurt am Main, Germany
- Neuroscience Department, Max Planck Institute for Empirical Aesthetics, Frankfurt am Main, Germany
10
Siedenburg K, Goldmann K, van de Par S. Tracking Musical Voices in Bach's The Art of the Fugue: Timbral Heterogeneity Differentially Affects Younger Normal-Hearing Listeners and Older Hearing-Aid Users. Front Psychol 2021;12:608684. PMID: 33935864; PMCID: PMC8079728; DOI: 10.3389/fpsyg.2021.608684.
Abstract
Auditory scene analysis is an elementary aspect of music perception, yet little research has scrutinized auditory scene analysis under realistic musical conditions with diverse samples of listeners. This study probed the ability of younger normal-hearing listeners and older hearing-aid users to track individual musical voices or lines in J. S. Bach's The Art of the Fugue. Five-second excerpts with homogeneous or heterogeneous instrumentation of 2-4 musical voices were presented from spatially separated loudspeakers and preceded by a short cue signaling the target voice. Listeners tracked the cued voice and detected whether an amplitude modulation was imposed on the cued voice or a distractor voice. Results indicated superior performance of young normal-hearing listeners compared to older hearing-aid users. Performance was generally better in conditions with fewer voices. For young normal-hearing listeners, there was an interaction between the number of voices and the instrumentation: performance degraded less drastically with an increase in the number of voices for timbrally heterogeneous mixtures compared to homogeneous mixtures. Older hearing-aid users generally showed smaller effects of the number of voices and instrumentation, and no interaction between the two factors. Moreover, tracking performance of older hearing-aid users did not differ whether or not these participants wore their hearing aids. These results shed light on the role of timbral differentiation in musical scene analysis and suggest reduced musical scene analysis abilities of older hearing-impaired listeners in a realistic musical scenario.
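The probe manipulation (imposing an amplitude modulation on the cued voice) reduces to a pointwise multiplication, as in this toy numpy sketch; the depth and modulation rate are invented values, not the study's parameters.

```python
# Toy sketch: impose sinusoidal amplitude modulation on one musical voice.
import numpy as np

sr = 44100
t = np.arange(0, 5.0, 1 / sr)                 # 5-s excerpt
voice = np.sin(2 * np.pi * 220 * t)           # stand-in for one musical voice

def impose_am(x, t, depth=0.5, f_mod=4.0):
    """Multiply the waveform by a (1 + depth * sin) modulator."""
    return x * (1.0 + depth * np.sin(2 * np.pi * f_mod * t))

probe = impose_am(voice, t)                   # target voice with detectable AM
```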
Affiliation(s)
- Kai Siedenburg
- Department of Medical Physics and Acoustics and Cluster of Excellence Hearing4all, Carl von Ossietzky University of Oldenburg, Oldenburg, Germany
- Kirsten Goldmann
- Department of Medical Physics and Acoustics and Cluster of Excellence Hearing4all, Carl von Ossietzky University of Oldenburg, Oldenburg, Germany
- Steven van de Par
- Department of Medical Physics and Acoustics and Cluster of Excellence Hearing4all, Carl von Ossietzky University of Oldenburg, Oldenburg, Germany
11
Abstract
INTRODUCTION: Cochlear implants (CIs) are biomedical devices that restore sound perception for people with severe-to-profound sensorineural hearing loss. Most postlingually deafened CI users are able to achieve excellent speech recognition in quiet environments. However, current CI sound processors remain limited in their ability to deliver fine spectrotemporal information, making it difficult for CI users to perceive complex sounds. Limited access to complex acoustic cues such as music, environmental sounds, lexical tones, and voice emotion may have significant ramifications for quality of life, social development, and community interactions.
AREAS COVERED: The purpose of this review article is to summarize the literature on CIs and music perception, with an emphasis on music training in pediatric CI recipients. The findings have implications for our understanding of noninvasive, accessible methods for improving auditory processing and may help advance our ability to improve sound quality and performance for implantees.
EXPERT OPINION: Music training, particularly in the pediatric population, may be able to continue to enhance auditory processing even after performance plateaus. The effects of these training programs appear generalizable to non-trained musical tasks, speech prosody, and emotion perception. Future studies should employ rigorous control groups involving a non-musical acoustic intervention, standardized auditory stimuli, and the provision of feedback.
Affiliation(s)
- Nicole T Jiam
- Department of Otolaryngology-Head and Neck Surgery, University of California San Francisco School of Medicine, San Francisco, CA, USA
- Charles Limb
- Department of Otolaryngology-Head and Neck Surgery, University of California San Francisco School of Medicine, San Francisco, CA, USA
12
Abstract
While absolute pitch (AP)—the ability to name musical pitches globally and without reference—is rare in expert musicians, anecdotal evidence suggests that some musicians may better identify pitches played on their primary instrument than pitches played on other instruments. We call this phenomenon “instrument-specific absolute pitch” (ISAP). In this paper we present a theory of ISAP. Specifically, we offer the hypothesis that some expert musicians without global AP may be able to more accurately identify pitches played on their primary instrument(s), and we propose timbral cues and articulatory motor imagery as two underlying mechanisms. Depending on whether informative timbral cues arise from performer- or instrument-specific idiosyncrasies or from timbre-facilitated tonotopic representations, we predict that performance may be enhanced for notes played by oneself, notes played on one’s own personal instrument, and/or notes played on any exemplar of one’s own instrument type. Sounds of one’s primary instrument may moreover activate kinesthetic memory and motor imagery, aiding pitch identification. In order to demonstrate how our theory can be tested, we report the methodology and analysis of two exemplary experiments conducted on two case-study participants who are professional oboists. The aim of the first experiment was to determine whether the oboists demonstrated ISAP ability, while the purpose of the second experiment was to provide a preliminary investigation of the underlying mechanisms. The results of the first experiment revealed that only one of the two oboists showed an advantage for identifying oboe tones over piano tones. For this oboist demonstrating ISAP, the second experiment demonstrated that pitch-naming accuracy decreased and variance around the correct pitch value increased as an effect of transposition and motor interference, but not of instrument or performer. These preliminary data suggest that some musicians possess ISAP while others do not. Timbral cues and motor imagery may both play roles in the acquisition of this ability. Based on our case study findings, we provide methodological considerations and recommendations for future empirical testing of our theory of ISAP.
Affiliation(s)
- Lindsey Reymore
- School of Music, The Ohio State University, Columbus, OH, United States
- Schulich School of Music, McGill University, Montréal, QC, Canada
- Niels Chr Hansen
- Aarhus Institute of Advanced Studies, Aarhus University, Aarhus, Denmark
- Center for Music in the Brain, Aarhus University, Royal Academy of Music Aarhus/Aalborg, Aarhus, Denmark
13
Siedenburg K, Röttges S, Wagener KC, Hohmann V. Can You Hear Out the Melody? Testing Musical Scene Perception in Young Normal-Hearing and Older Hearing-Impaired Listeners. Trends Hear 2020;24:2331216520945826. PMID: 32895034; PMCID: PMC7502688; DOI: 10.1177/2331216520945826.
Abstract
It is well known that hearing loss compromises auditory scene analysis abilities, as is usually manifested in difficulties understanding speech in noise. Remarkably little is known about auditory scene analysis of hearing-impaired (HI) listeners when it comes to musical sounds. Specifically, it is unclear to which extent HI listeners are able to hear out a melody or an instrument from a musical mixture. Here, we tested a group of younger normal-hearing (yNH) and older HI (oHI) listeners with moderate hearing loss in their ability to match short melodies and instruments presented as part of mixtures. Four-tone sequences were used in conjunction with a simple musical accompaniment that acted as a masker (cello/piano dyads or spectrally matched noise). In each trial, a signal-masker mixture was presented, followed by two different versions of the signal alone. Listeners indicated which signal version was part of the mixture. Signal versions differed either in terms of the sequential order of the pitch sequence or in terms of timbre (flute vs. trumpet). Signal-to-masker thresholds were measured by varying the signal presentation level in an adaptive two-down/one-up procedure. We observed that thresholds of oHI listeners were elevated by on average 10 dB compared with those of yNH listeners. In contrast to yNH listeners, oHI listeners did not show evidence of listening in the dips of the masker. Musical training of participants was associated with a lowering of thresholds. These results may indicate detrimental effects of hearing loss on central aspects of musical scene perception.
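The adaptive two-down/one-up track converges on the level yielding about 70.7% correct. A generic sketch with a simulated listener is shown below; the step size, reversal count, and psychometric slope are illustrative assumptions, not the study's settings.

```python
# Sketch: two-down/one-up adaptive staircase with a simulated listener.
import numpy as np

rng = np.random.default_rng(0)

def simulated_listener(level_db, true_threshold=-10.0, slope=1.0):
    p_correct = 1 / (1 + np.exp(-(level_db - true_threshold) * slope))
    return rng.random() < p_correct

level, step = 0.0, 4.0
n_correct, reversals, last_dir = 0, [], None
while len(reversals) < 8:
    if simulated_listener(level):
        n_correct += 1
        if n_correct == 2:              # two consecutive correct -> harder
            n_correct = 0
            if last_dir == "up":
                reversals.append(level)
            level -= step
            last_dir = "down"
    else:                               # any error -> easier
        n_correct = 0
        if last_dir == "down":
            reversals.append(level)
        level += step
        last_dir = "up"

print("threshold estimate (dB):", np.mean(reversals[-6:]))
```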
Affiliation(s)
- Kai Siedenburg
- Department of Medical Physics and Acoustics and Cluster of Excellence Hearing4all, Carl von Ossietzky University of Oldenburg, Oldenburg, Germany
- Saskia Röttges
- Department of Medical Physics and Acoustics and Cluster of Excellence Hearing4all, Carl von Ossietzky University of Oldenburg, Oldenburg, Germany
- Volker Hohmann
- Department of Medical Physics and Acoustics and Cluster of Excellence Hearing4all, Carl von Ossietzky University of Oldenburg, Oldenburg, Germany
- Hörzentrum Oldenburg GmbH & Hörtech gGmbH, Oldenburg, Germany
14
Chen X, Huang S, Hei X, Zeng H. Felt Emotion Elicited by Music: Are Sensitivities to Various Musical Features Different for Young Children and Young Adults? Span J Psychol 2020;23:e8. PMID: 32434622; DOI: 10.1017/SJP.2020.8.
Abstract
In the present study, we extended the issue of how people access emotion through nonverbal information by testing the effects of simple (tempo) and complex (timbre) acoustic features of music on felt emotion. Three- to six-year-old children (n = 100; 48% female) and university students (n = 64; 37.5% female) took part in three experiments in which acoustic features of music were manipulated to determine whether there are links between perceived emotion and felt emotion in processing musical segments. After exposure to segments of music, participants completed a felt-emotion judgment task. The chi-square test showed significant tempo effects, ps < .001 (Exp. 1), and strong combined effects of mode and tempo on felt emotion. In addition, the strength of these effects changed across age. However, these combined effects were significantly stronger under the tempo-and-mode-consistent condition (Exp. 2), ps < .001, than under the inconsistent condition (Exp. 3). In other words, the combined effects of simple and complex acoustic features on felt emotion were stronger when the features were consistent, and sensitivity to these features, especially complex features, changed across age. These findings suggest that felt emotion evoked by acoustic features of a given piece of music might be affected both by innate abilities and by the strength of mappings between acoustic features and emotion.
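The chi-square test reported for the tempo effect operates on a contingency table of felt-emotion judgments; a minimal sketch with invented counts:

```python
# Sketch: chi-square test on felt-emotion judgments by tempo condition.
from scipy.stats import chi2_contingency

#            judged happy  judged sad
table = [[72, 28],    # fast-tempo segments
         [31, 69]]    # slow-tempo segments
chi2, p, dof, expected = chi2_contingency(table)
print(f"chi2({dof}) = {chi2:.2f}, p = {p:.2g}")
```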
15
Erickson ML, Faulkner K, Johnstone PM, Hedrick MS, Stone T. Multidimensional Timbre Spaces of Cochlear Implant Vocoded and Non-vocoded Synthetic Female Singing Voices. Front Neurosci 2020;14:307. PMID: 32372904; PMCID: PMC7179674; DOI: 10.3389/fnins.2020.00307.
Abstract
Many post-lingually deafened cochlear implant (CI) users report that they no longer enjoy listening to music, which may contribute to a perceived reduction in quality of life. One aspect of music perception, vocal timbre perception, may be difficult for CI users because they may not be able to use the same timbral cues available to normal-hearing listeners. Vocal tract resonance frequencies have been shown to provide perceptual cues to voice categories such as baritone, tenor, mezzo-soprano, and soprano, while changes in glottal source spectral slope are believed to be related to perception of vocal quality dimensions such as fluty vs. brassy. As a first step toward understanding vocal timbre perception in CI users, we employed an 8-channel noise-band vocoder to test how vocoding can alter the timbral perception of female synthetic sung vowels across pitches. Non-vocoded and vocoded stimuli were synthesized with vibrato using 3 excitation source spectral slopes and 3 vocal tract transfer functions (mezzo-soprano, intermediate, soprano) at the pitches C4, B4, and F5. Six multidimensional scaling experiments were conducted: C4 not vocoded, C4 vocoded, B4 not vocoded, B4 vocoded, F5 not vocoded, and F5 vocoded. At the pitch C4, for both non-vocoded and vocoded conditions, dimension 1 grouped stimuli according to voice category and was most strongly predicted by spectral centroid from 0 to 2 kHz. While dimension 2 grouped stimuli according to excitation source spectral slope, it was organized slightly differently and predicted by different acoustic parameters in the non-vocoded and vocoded conditions. For pitches B4 and F5, spectral centroid from 0 to 2 kHz most strongly predicted dimension 1. However, while dimension 1 separated all 3 voice categories in the vocoded condition, it only separated the soprano stimuli from the intermediate and mezzo-soprano stimuli in the non-vocoded condition. Although it is unclear how these results predict timbre perception in CI listeners in general, they suggest that some aspects of vocal timbre perception may be preserved under vocoding.
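An 8-channel noise-band vocoder of the general type used here can be sketched in a few lines; the filter orders, band edges, and envelope cutoff below are common textbook choices, not necessarily the study's settings.

```python
# Sketch: 8-channel noise-band vocoder (log-spaced Butterworth bands,
# 30-Hz envelope smoothing); parameters are assumptions, not the study's.
import numpy as np
from scipy.signal import butter, sosfiltfilt

def noise_vocoder(x, sr, n_ch=8, f_lo=100.0, f_hi=8000.0):
    edges = np.geomspace(f_lo, f_hi, n_ch + 1)    # log-spaced band edges
    env_sos = butter(2, 30.0, "low", fs=sr, output="sos")
    rng = np.random.default_rng(0)
    out = np.zeros_like(x)
    for lo, hi in zip(edges[:-1], edges[1:]):
        band_sos = butter(4, [lo, hi], "bandpass", fs=sr, output="sos")
        band = sosfiltfilt(band_sos, x)
        env = sosfiltfilt(env_sos, np.abs(band))   # rectify + low-pass envelope
        carrier = sosfiltfilt(band_sos, rng.normal(size=x.size))
        out += np.clip(env, 0, None) * carrier     # envelope-modulated noise band
    return out

sr = 22050
t = np.arange(0, 1.0, 1 / sr)
vowel = np.sin(2 * np.pi * 262 * t) * (1 + 0.1 * np.sin(2 * np.pi * 5.5 * t))
vocoded = noise_vocoder(vowel, sr)
```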
Affiliation(s)
- Molly L. Erickson
- Department of Audiology and Speech Pathology, University of Tennessee Health Science Center, Knoxville, TN, United States
16
Lega C, Cattaneo Z, Ancona N, Vecchi T, Rinaldi L. Instrumental expertise and musical timbre modulate the spatial representation of pitch. Q J Exp Psychol (Hove) 2020;73:1162-1172. PMID: 31965917; DOI: 10.1177/1747021819897779.
Abstract
Humans show a tendency to represent pitch in a spatial format. A classical finding supporting this spatial representation is the Spatial-Musical Association of Response Codes (SMARC) effect, reflecting faster responses to low tones when pressing a left/bottom-side key and to high tones when pressing a right/top-side key. Despite available evidence suggesting that the horizontal and vertical SMARC effect may be differently modulated by instrumental expertise and musical timbre, no study has so far directly explored this hypothesis in a unified framework. Here, we investigated this possibility by comparing the performance of professional pianists, professional clarinettists and non-musicians in an implicit timbre judgement task, in both horizontal and vertical response settings. Results showed that instrumental expertise significantly modulates the SMARC effect: whereas in the vertical plane a comparable SMARC effect was observed in all groups, in the horizontal plane the SMARC effect was significantly modulated by the specific instrumental expertise, with pianists showing a stronger pitch-space association compared to clarinettists and non-musicians. Moreover, the influence of pitch along the horizontal dimension was stronger in those pianists who started the instrumental training at a younger age. Results also showed an influence of musical timbre in driving the horizontal, but not the vertical, SMARC effect, with only piano notes inducing a pitch-space association. Taken together, these findings suggest that sensorimotor experience due to instrumental training and musical timbre affect the mental representation of pitch on the horizontal space, whereas the one on the vertical space would be mainly independent from musical practice.
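The SMARC effect itself is a compatibility contrast on reaction times; a minimal pandas sketch on simulated trials (trial counts and the injected 15-ms benefit are invented):

```python
# Sketch: quantify a SMARC effect as the mean RT advantage for compatible
# (low-left/high-right) over incompatible pitch-response mappings.
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
n = 400
df = pd.DataFrame({
    "pitch": rng.choice(["low", "high"], n),
    "response_side": rng.choice(["left", "right"], n),
    "rt": rng.normal(550, 60, n),
})
compatible = (((df.pitch == "low") & (df.response_side == "left"))
              | ((df.pitch == "high") & (df.response_side == "right")))
df.loc[compatible, "rt"] -= 15          # inject a 15-ms compatibility benefit

smarc = df.loc[~compatible, "rt"].mean() - df.loc[compatible, "rt"].mean()
print(f"SMARC effect: {smarc:.1f} ms")
```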
Affiliation(s)
- Carlotta Lega
- Department of Psychology, University of Milano-Bicocca, Milano, Italy
- Zaira Cattaneo
- Department of Psychology, University of Milano-Bicocca, Milano, Italy
- IRCCS Mondino Foundation, Pavia, Italy
- Noemi Ancona
- Department of Brain and Behavioural Sciences, University of Pavia, Pavia, Italy
- Tomaso Vecchi
- IRCCS Mondino Foundation, Pavia, Italy
- Department of Brain and Behavioural Sciences, University of Pavia, Pavia, Italy
- Luca Rinaldi
- Department of Brain and Behavioural Sciences, University of Pavia, Pavia, Italy
17
Abstract
People tend to associate stimuli from different sensory modalities, a phenomenon known as crossmodal correspondence. We conducted two experiments to investigate how Chinese participants associated musical notes produced by four types of Chinese instruments (bowed strings, plucked strings, winds, and percussion instruments) with different colors, taste terms, and fabric textures. Specifically, the participants were asked to select a sound to match each color patch or taste term in Experiment 1 and to match the experience of touching each fabric in Experiment 2. The results demonstrated some associations between pitch and color, taste terms, and the smoothness of fabrics. Moreover, certain types of Chinese instruments were preferentially chosen to match some of the colors, taste terms, and the textures of certain fabrics. These findings therefore provide insights into the perception of Chinese music and shed light on how to apply the multisensory features of sounds to enhance the composition, performance, and appreciation of music.
Affiliation(s)
- Yuxuan Qi
- Department of Psychology, Tsinghua University, Beijing, China
- Fuxing Huang
- Department of Psychology, Tsinghua University, Beijing, China
- Zeyan Li
- School of Material Science and Engineering, Tsinghua University, Beijing, China
- Xiaoang Wan
- Department of Psychology, Tsinghua University, Beijing, China
18
Ogg M, Slevc LR. Acoustic Correlates of Auditory Object and Event Perception: Speakers, Musical Timbres, and Environmental Sounds. Front Psychol 2019;10:1594. PMID: 31379658; PMCID: PMC6650748; DOI: 10.3389/fpsyg.2019.01594.
Abstract
Human listeners must identify and orient themselves to auditory objects and events in their environment. What acoustic features support a listener's ability to differentiate the great variety of natural sounds they might encounter? Studies of auditory object perception typically examine identification (and confusion) responses or dissimilarity ratings between pairs of objects and events. However, the majority of this prior work has been conducted within single categories of sound. This separation has precluded a broader understanding of the general acoustic attributes that govern auditory object and event perception within and across different behaviorally relevant sound classes. The present experiments take a broader approach by examining multiple categories of sound relative to one another. This approach bridges critical gaps in the literature and allows us to identify (and assess the relative importance of) features that are useful for distinguishing sounds within, between, and across behaviorally relevant sound categories. To do this, we conducted behavioral sound identification (Experiment 1) and dissimilarity rating (Experiment 2) studies using a broad set of stimuli that leveraged the acoustic variability within and between different sound categories via a diverse set of 36 sound tokens (12 utterances from different speakers, 12 instrument timbres, and 12 everyday objects from a typical human environment). Multidimensional scaling solutions as well as analyses of item-pair-level responses as a function of different acoustic qualities were used to understand what acoustic features informed participants' responses. In addition to the spectral and temporal envelope qualities noted in previous work, listeners' dissimilarity ratings were associated with spectrotemporal variability and aperiodicity. Subsets of these features (along with fundamental frequency variability) were also useful for making specific within- or between-category judgments. Dissimilarity ratings largely paralleled sound identification performance; however, the results of the two tasks did not completely mirror one another. In addition, musical training was related to improved sound identification performance.
Affiliation(s)
- Mattson Ogg
- Neuroscience and Cognitive Science Program, University of Maryland, College Park, College Park, MD, United States
- Department of Psychology, University of Maryland, College Park, College Park, MD, United States
- L. Robert Slevc
- Neuroscience and Cognitive Science Program, University of Maryland, College Park, College Park, MD, United States
- Department of Psychology, University of Maryland, College Park, College Park, MD, United States
19
Allen EJ, Burton PC, Mesik J, Olman CA, Oxenham AJ. Cortical Correlates of Attention to Auditory Features. J Neurosci 2019;39:3292-3300. PMID: 30804086; DOI: 10.1523/JNEUROSCI.0588-18.2019.
Abstract
Pitch and timbre are two primary features of auditory perception that are generally considered independent. However, an increase in pitch (produced by a change in fundamental frequency) can be confused with an increase in brightness (an attribute of timbre related to spectral centroid) and vice versa. Previous work indicates that pitch and timbre are processed in overlapping regions of the auditory cortex, but are separable to some extent via multivoxel pattern analysis. Here, we tested whether attention to one or other feature increases the spatial separation of their cortical representations and if attention can enhance the cortical representation of these features in the absence of any physical change in the stimulus. Ten human subjects (four female, six male) listened to pairs of tone triplets varying in pitch, timbre, or both and judged which tone triplet had the higher pitch or brighter timbre. Variations in each feature engaged common auditory regions with no clear distinctions at a univariate level. Attending to one did not improve the separability of the neural representations of pitch and timbre at the univariate level. At the multivariate level, the classifier performed above chance in distinguishing between conditions in which pitch or timbre was discriminated. The results confirm that the computations underlying pitch and timbre perception are subserved by strongly overlapping cortical regions, but reveal that attention to one or other feature leads to distinguishable activation patterns even in the absence of physical differences in the stimuli.
SIGNIFICANCE STATEMENT: Although pitch and timbre are generally thought of as independent auditory features of a sound, pitch height and timbral brightness can be confused for one another. This study shows that pitch and timbre variations are represented in overlapping regions of auditory cortex, but that they produce distinguishable patterns of activation. Most importantly, the patterns of activation can be distinguished based on whether subjects attended to pitch or timbre even when the stimuli remained physically identical. The results therefore show that variations in pitch and timbre are represented by overlapping neural networks, but that attention to different features of the same sound can lead to distinguishable patterns of activation.
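The multivoxel pattern analysis referred to above is, at its core, a cross-validated classifier over voxel patterns. The sketch below uses simulated activations and a logistic-regression decoder as a stand-in for the study's classifier; the trial counts, voxel counts, and signal strength are invented.

```python
# Sketch: cross-validated decoding of attention condition from voxel patterns.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
n_trials, n_voxels = 120, 300
X = rng.normal(size=(n_trials, n_voxels))      # trial x voxel activation
y = np.repeat([0, 1], n_trials // 2)           # 0 = attend pitch, 1 = attend timbre
X[y == 1, :20] += 0.4                          # weak attention-related signal

clf = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
acc = cross_val_score(clf, X, y, cv=5, scoring="accuracy")
print("decoding accuracy:", acc.mean())        # chance = 0.5
```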
20
Luo X, Soslowsky S, Pulling KR. Interaction Between Pitch and Timbre Perception in Normal-Hearing Listeners and Cochlear Implant Users. J Assoc Res Otolaryngol 2019;20:57-72. PMID: 30377852; DOI: 10.1007/s10162-018-00701-3.
Abstract
Despite their mutually exclusive definitions, pitch and timbre perception interact with each other in normal-hearing (NH) listeners. Cochlear implant (CI) users have worse than normal pitch and timbre perception. However, the pitch-timbre interaction with CIs is not well understood. This study tested the interaction between pitch and sharpness (an aspect of timbre) perception related to the fundamental frequency (F0) and spectral slope of harmonic complex tones, respectively, in both NH listeners and CI users. In experiment 1, the F0 (and spectral slope) difference limens (DLs) were measured with a fixed spectral slope (and F0) and 20-dB amplitude roving. Then, the F0 and spectral slope were varied congruently or incongruently by the same multiple of individual DLs to assess the pitch and sharpness ranking sensitivity. Both NH and CI subjects had significantly higher pitch and sharpness ranking sensitivity with congruent than with incongruent F0 and spectral slope variations, and showed a similar symmetric interaction between pitch and timbre perception. In experiment 2, CI users' melodic contour identification (MCI) was tested in three spectral slope (no, congruent, and incongruent spectral slope variations by the same multiple of individual DLs as the F0 variations) and two amplitude conditions (0- and 20-dB amplitude roving). When there was no amplitude roving, the MCI scores were significantly higher with congruent than with no, and in turn than with incongruent spectral slope variations. The 20-dB amplitude roving significantly reduced the overall MCI scores and the effect of spectral slope variations. These results reflected a confusion between higher (or lower) pitch and sharper (or duller) timbre and offered important implications for understanding and enhancing pitch and timbre perception with CIs.
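The stimulus space described above has two axes: F0, which sets pitch, and spectral slope, which sets sharpness. A sketch of such a harmonic complex generator follows, with illustrative parameter values rather than the study's.

```python
# Sketch: harmonic complex tone with independent F0 and spectral slope.
import numpy as np

def harmonic_complex(f0, slope_db_oct, dur=0.5, sr=44100, n_harm=20):
    t = np.arange(int(dur * sr)) / sr
    x = np.zeros_like(t)
    for k in range(1, n_harm + 1):
        octaves_up = np.log2(k)                 # distance above F0 in octaves
        amp = 10 ** (slope_db_oct * octaves_up / 20)
        x += amp * np.sin(2 * np.pi * k * f0 * t)
    return x / np.max(np.abs(x))                # peak-normalize

duller = harmonic_complex(f0=220.0, slope_db_oct=-9.0)   # steeper = duller
sharper = harmonic_complex(f0=220.0, slope_db_oct=-3.0)  # shallower = sharper
```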
21
Osman AF, Lee CM, Escabí MA, Read HL. A Hierarchy of Time Scales for Discriminating and Classifying the Temporal Shape of Sound in Three Auditory Cortical Fields. J Neurosci 2018;38:6967-6982. PMID: 29954851; DOI: 10.1523/JNEUROSCI.2871-17.2018.
Abstract
Auditory cortex is essential for mammals, including rodents, to detect temporal "shape" cues in the sound envelope, but it remains unclear how different cortical fields may contribute to this ability (Lomber and Malhotra, 2008; Threlkeld et al., 2008). Previously, we found that precise spiking patterns provide a potential neural code for temporal shape cues in the sound envelope in the primary auditory field (A1), ventral auditory field (VAF), and caudal suprarhinal auditory field (cSRAF) of the rat (Lee et al., 2016). Here, we extend these findings and characterize the time course of the temporally precise output of auditory cortical neurons in male rats. A pairwise sound discrimination index and a Naive Bayesian classifier are used to determine how these spiking patterns could provide brain signals for behavioral discrimination and classification of sounds. We find that response durations and optimal time constants for discriminating sound envelope shape increase in rank order with A1 < VAF < cSRAF. Accordingly, sustained spiking is more prominent and results in more robust sound discrimination in non-primary cortex versus A1. Spike-timing patterns classify 10 different sound envelope shape sequences, and there is a twofold increase in maximal performance when pooling output across the neuron population, indicating a robust distributed neural code in all three cortical fields. Together, these results support the idea that temporally precise spiking patterns from primary and non-primary auditory cortical fields provide the necessary signals for animals to discriminate and classify a large range of temporal shapes in the sound envelope.
SIGNIFICANCE STATEMENT: Functional hierarchies in the visual cortices support the concept that classification of visual objects requires successive cortical stages of processing, including a progressive increase in classical receptive field size. The present study is significant as it supports the idea that a similar progression exists in auditory cortices in the time domain. We demonstrate for the first time that three cortices provide temporal spiking patterns for robust temporal envelope shape discrimination, but only the ventral non-primary cortices do so on long time scales. This study raises the possibility that primary and non-primary cortices provide unique temporal spiking patterns and time scales for perception of sound envelope shape.
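A Naive Bayesian classifier over binned spike counts, with the bin width acting as the analysis time scale, can be sketched as follows on simulated spike trains; the rates, bin counts, and class structure are invented, not the recorded data.

```python
# Sketch: Naive Bayes decoding of envelope shape from binned spike counts.
import numpy as np
from sklearn.naive_bayes import GaussianNB
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
n_shapes, trials_per_shape, n_bins = 10, 30, 40   # e.g. 40 bins of 5 ms
rates = rng.gamma(2.0, 1.0, size=(n_shapes, n_bins))    # shape-specific PSTHs
X = np.vstack([rng.poisson(r, size=(trials_per_shape, n_bins))
               for r in rates])                   # binned spike counts per trial
y = np.repeat(np.arange(n_shapes), trials_per_shape)

acc = cross_val_score(GaussianNB(), X, y, cv=5).mean()
print(f"classification accuracy: {acc:.2f} (chance = {1 / n_shapes:.2f})")
```

Rerunning this with coarser or finer bins is one way to probe the optimal time scale per cortical field, in the spirit of the analysis above.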
22
Abstract
Amati and Stradivari violins are highly appreciated by musicians and collectors, but the objective understanding of their acoustic qualities is still lacking. By applying speech analysis techniques, we found early Italian violins to emulate the vocal tract resonances of male singers, comparable to basses or baritones. Stradivari pushed these resonance peaks higher to resemble the shorter vocal tract lengths of tenors or altos. Stradivari violins also exhibit vowel qualities that correspond to lower tongue height and backness. These properties may explain the characteristic brilliance of Stradivari violins. The ideal for violin tone in the Baroque era was to imitate the human voice, and we found that Cremonese violins are capable of producing the formant features of human singers.
The shape and design of the modern violin are largely influenced by two makers from Cremona, Italy: The instrument was invented by Andrea Amati and then improved by Antonio Stradivari. Although the construction methods of Amati and Stradivari have been carefully examined, the underlying acoustic qualities which contribute to their popularity are little understood. According to Geminiani, a Baroque violinist, the ideal violin tone should "rival the most perfect human voice." To investigate whether Amati and Stradivari violins produce voice-like features, we recorded the scales of 15 antique Italian violins as well as male and female singers. The frequency response curves are similar between the Andrea Amati violin and human singers, up to ∼4.2 kHz. By linear predictive coding analyses, the first two formants of the Amati exhibit vowel-like qualities (F1/F2 = 503/1,583 Hz), mapping to the central region on the vowel diagram. Its third and fourth formants (F3/F4 = 2,602/3,731 Hz) resemble those produced by male singers. Using F1 to F4 values to estimate the corresponding vocal tract length, we observed that antique Italian violins generally resemble basses/baritones, but Stradivari violins are closer to tenors/altos. Furthermore, the vowel qualities of Stradivari violins show reduced backness and height. The unique formant properties displayed by Stradivari violins may represent the acoustic correlate of their distinctive brilliance perceived by musicians. Our data demonstrate that the pioneering designs of Cremonese violins exhibit voice-like qualities in their acoustic output.
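Linear predictive coding, the formant-estimation technique named above, can be sketched as follows; the input is a synthetic harmonic series rather than a violin recording, and the model order of 12 is a typical choice for this sampling rate, not necessarily the paper's.

```python
# Sketch: LPC-based formant candidates via librosa.lpc and root solving.
import numpy as np
import librosa

sr = 16000
t = np.arange(0, 0.5, 1 / sr)
# Stand-in periodic signal: a 150-Hz harmonic series with 1/k rolloff.
x = sum(np.sin(2 * np.pi * k * 150 * t) / k for k in range(1, 40))

a = librosa.lpc(x.astype(float), order=12)        # LPC coefficients
roots = np.roots(a)
roots = roots[np.imag(roots) > 0]                 # one of each conjugate pair
freqs = np.sort(np.angle(roots) * sr / (2 * np.pi))
print("candidate formants (Hz):", np.round(freqs[:4]))
```

Applied to a recorded violin scale or sung vowel, the same root-angle computation yields F1-F4 estimates like those quoted above.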
23
Abstract
This study examined music and speech perception in normal-hearing children with some or no musical training. Thirty children (mean age = 11.3 years), 15 with and 15 without formal music training, participated in the study. Music perception was measured using a melodic contour identification (MCI) task; stimuli were a piano sample or sung speech with a fixed timbre (same word for each note) or a mixed timbre (different words for each note). Speech perception was measured in quiet and in steady noise using a matrix-styled sentence recognition task; stimuli were naturally intonated speech or sung speech with a fixed pitch (same note for each word) or a mixed pitch (different notes for each word). Significant musician advantages were observed for MCI and for speech in noise, but not for speech in quiet. MCI performance was significantly poorer with the mixed-timbre stimuli. Speech performance in noise was significantly poorer with the fixed- or mixed-pitch stimuli than with spoken speech. Across all subjects, age at testing and MCI performance were significantly correlated with speech performance in noise. MCI and speech performance in quiet were significantly poorer for children than for adults from a related study using the same stimuli and tasks; speech performance in noise was significantly poorer for younger than for older children. Long-term music training appeared to benefit melodic pitch perception and speech understanding in noise in these pediatric listeners.
Affiliation(s)
- Yingjiu Nie
- Department of Communication Sciences and Disorders, James Madison University, Harrisonburg, VA, USA
- Michael Morikawa
- Department of Communication Sciences and Disorders, James Madison University, Harrisonburg, VA, USA
- Victoria André
- Department of Communication Sciences and Disorders, James Madison University, Harrisonburg, VA, USA
- Harley Wheeler
- Department of Communication Sciences and Disorders, James Madison University, Harrisonburg, VA, USA
- Qian-Jie Fu
- Department of Head and Neck Surgery, University of California-Los Angeles, CA, USA
24
Disbergen NR, Valente G, Formisano E, Zatorre RJ. Assessing Top-Down and Bottom-Up Contributions to Auditory Stream Segregation and Integration With Polyphonic Music. Front Neurosci 2018; 12:121. [PMID: 29563861 PMCID: PMC5845899 DOI: 10.3389/fnins.2018.00121] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.2] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/22/2017] [Accepted: 02/15/2018] [Indexed: 11/24/2022] Open
Abstract
Polyphonic music listening well exemplifies processes typically involved in everyday auditory scene analysis, relying on an interactive interplay between bottom-up and top-down processes. Most studies investigating scene analysis have used elementary auditory scenes; real-world scene analysis, however, is far more complex. In particular, music, contrary to most other natural auditory scenes, can be perceived by either integrating or, under attentive control, segregating sound streams, often carried by different instruments. One prominent bottom-up cue contributing to multi-instrument music perception is the timbre difference between instruments. In this work, we introduce and validate a novel paradigm designed to investigate, within naturalistic musical auditory scenes, attentive modulation as well as its interaction with bottom-up processes. Two psychophysical experiments are described, employing custom-composed two-voice polyphonic music pieces within a framework that implements a behavioral performance metric to validate listener instructions requiring either integration or segregation of scene elements. In Experiment 1, the listeners' locus of attention was switched between individual instruments or the aggregate (i.e., both instruments together) via a task requiring the detection of temporal modulations (i.e., triplets) incorporated within or across instruments. Subjects indicated post-stimulus whether triplets were present in the to-be-attended instrument(s). Experiment 2 introduced the bottom-up manipulation by adding a three-level morphing of instrument timbre distance to the attentional framework. The task was designed to be used within neuroimaging paradigms; Experiment 2 was additionally validated behaviorally in the functional Magnetic Resonance Imaging (fMRI) environment. Experiment 1 subjects (N = 29, non-musicians) completed the task at high levels of accuracy, with no differences between any experimental conditions. Nineteen listeners also participated in Experiment 2, showing a main effect of instrument timbre distance, even though within-attention-condition timbre-distance contrasts did not demonstrate any timbre effect. Correlation of overall scores with morph-distance effects, computed by subtracting the largest from the smallest timbre-distance scores, showed an influence of general task difficulty on the timbre-distance effect. Comparison of laboratory and fMRI data showed that scanner noise had no adverse effect on task performance. These experimental paradigms enable the study of both bottom-up and top-down contributions to auditory stream segregation and integration within psychophysical and neuroimaging experiments.
Affiliation(s)
- Niels R Disbergen
- Department of Cognitive Neuroscience, Maastricht University, Maastricht, Netherlands; Maastricht Brain Imaging Center (MBIC), Maastricht, Netherlands
- Giancarlo Valente
- Department of Cognitive Neuroscience, Maastricht University, Maastricht, Netherlands; Maastricht Brain Imaging Center (MBIC), Maastricht, Netherlands
- Elia Formisano
- Department of Cognitive Neuroscience, Maastricht University, Maastricht, Netherlands; Maastricht Brain Imaging Center (MBIC), Maastricht, Netherlands
- Robert J Zatorre
- Cognitive Neuroscience Unit, Montreal Neurological Institute, McGill University, Montreal, QC, Canada; International Laboratory for Brain, Music and Sound Research (BRAMS), Montreal, QC, Canada
25
Chi CF, Dewi RS, Surbakti YY, Hsieh DY. The perceived quality of in-vehicle auditory signals: a structural equation modelling approach. Ergonomics 2017; 60:1471-1484. [PMID: 28441909 DOI: 10.1080/00140139.2017.1323121] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 08/24/2016] [Accepted: 04/16/2017] [Indexed: 06/07/2023]
Abstract
The current study applied structural equation modelling to analyse the relationships among pitch, loudness, tempo, and timbre and their relationship with perceived sound quality. Twenty-eight auditory signals of horn, indicator, door-open warning, and parking sensor were collected from 11 car brands. Twenty-one experienced drivers were recruited to evaluate all sound signals on 11 semantic differential scales. The results indicate that for the continuous sounds, pitch, loudness, and timbre each had a direct impact on perceived quality; in addition to these direct effects, pitch also had an impact on loudness perception. For the intermittent sounds, tempo and timbre each had a direct impact on perceived quality. These results can help to identify the psychoacoustic attributes affecting consumers' quality perception and to design preferable sounds for vehicles. Finally, a design guideline for the development of auditory signals is proposed that incorporates the current findings as well as those of other relevant research. Practitioner Summary: This study applied structural equation modelling to analyse the relationships among pitch, loudness, tempo, and timbre and their relationship with perceived sound quality. The results can help to identify the psychoacoustic attributes affecting consumers' quality perception and to design preferable sounds for vehicles.
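The structure reported above for the continuous sounds (three direct paths to quality plus a pitch-to-loudness path) can be caricatured with a simplified path analysis: standardized regressions estimate each path weight. This is a sketch on fabricated data, not the study's SEM, which would normally be fitted with dedicated SEM software; all column names and coefficients are assumptions.

    import numpy as np
    import pandas as pd
    from sklearn.linear_model import LinearRegression

    rng = np.random.default_rng(1)
    n = 588  # hypothetical: 21 drivers x 28 signals
    df = pd.DataFrame(rng.normal(size=(n, 3)), columns=["pitch", "loudness", "timbre"])
    df["loudness"] += 0.4 * df["pitch"]                     # pitch -> loudness path
    df["quality"] = (0.3 * df["pitch"] + 0.5 * df["loudness"]
                     + 0.4 * df["timbre"] + rng.normal(size=n))

    z = (df - df.mean()) / df.std()  # standardize so coefficients act as path weights

    # quality ~ pitch + loudness + timbre (direct paths)
    m1 = LinearRegression().fit(z[["pitch", "loudness", "timbre"]], z["quality"])
    # loudness ~ pitch (indirect path)
    m2 = LinearRegression().fit(z[["pitch"]], z["loudness"])
    print(dict(zip(["pitch", "loudness", "timbre"], m1.coef_.round(2))))
    print({"pitch->loudness": round(m2.coef_[0], 2)})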
Affiliation(s)
- Chia-Fen Chi
- Department of Industrial Management, National Taiwan University of Science and Technology, Taipei, Taiwan
- Ratna Sari Dewi
- Department of Industrial Engineering, Sepuluh Nopember Institute of Technology, Surabaya, Indonesia
- Yopie Yutama Surbakti
- Department of Industrial Management, National Taiwan University of Science and Technology, Taipei, Taiwan
- Dong-Yu Hsieh
- Department of Industrial Management, National Taiwan University of Science and Technology, Taipei, Taiwan
26
Piazza EA, Iordan MC, Lew-Williams C. Mothers Consistently Alter Their Unique Vocal Fingerprints When Communicating with Infants. Curr Biol 2017; 27:3162-3167.e3. [PMID: 29033333 PMCID: PMC5656453 DOI: 10.1016/j.cub.2017.08.074] [Citation(s) in RCA: 30] [Impact Index Per Article: 4.3] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/11/2017] [Revised: 07/25/2017] [Accepted: 08/30/2017] [Indexed: 11/25/2022]
Abstract
The voice is the most direct link we have to others' minds, allowing us to communicate using a rich variety of speech cues [1, 2]. This link is particularly critical early in life as parents draw infants into the structure of their environment using infant-directed speech (IDS), a communicative code with unique pitch and rhythmic characteristics relative to adult-directed speech (ADS) [3, 4]. To begin breaking into language, infants must discern subtle statistical differences about people and voices in order to direct their attention toward the most relevant signals. Here, we uncover a new defining feature of IDS: mothers significantly alter statistical properties of vocal timbre when speaking to their infants. Timbre, the tone color or unique quality of a sound, is a spectral fingerprint that helps us instantly identify and classify sound sources, such as individual people and musical instruments [5-7]. We recorded 24 mothers' naturalistic speech while they interacted with their infants and with adult experimenters in their native language. Half of the participants were English speakers, and half were not. Using a support vector machine classifier, we found that mothers consistently shifted their timbre between ADS and IDS. Importantly, this shift was similar across languages, suggesting that such alterations of timbre may be universal. These findings have theoretical implications for understanding how infants tune in to their local communicative environments. Moreover, our classification algorithm for identifying infant-directed timbre has direct translational implications for speech recognition technology.
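The classification step can be illustrated with a hedged sketch: a linear support vector machine is trained on hypothetical per-recording timbre features (e.g., MFCC summary statistics, an assumption here; the paper's exact feature set is not reproduced) and evaluated with leave-one-mother-out cross-validation, matching the idea of a timbre shift that generalizes across speakers.

    import numpy as np
    from sklearn.svm import SVC
    from sklearn.pipeline import make_pipeline
    from sklearn.preprocessing import StandardScaler
    from sklearn.model_selection import LeaveOneGroupOut, cross_val_score

    rng = np.random.default_rng(2)

    # Hypothetical timbre features (e.g., mean MFCCs) for 24 mothers x 2 registers
    n_mothers, n_feat = 24, 12
    groups = np.repeat(np.arange(n_mothers), 2)
    y = np.tile([0, 1], n_mothers)                  # 0 = ADS, 1 = IDS
    shift = np.where(y[:, None] == 1, 0.8, 0.0)     # a consistent IDS timbre shift
    X = rng.normal(size=(2 * n_mothers, n_feat)) + shift

    # Leave-one-mother-out cross-validation tests generalization across speakers
    clf = make_pipeline(StandardScaler(), SVC(kernel="linear"))
    acc = cross_val_score(clf, X, y, groups=groups, cv=LeaveOneGroupOut()).mean()
    print(f"held-out mother accuracy: {acc:.2f}")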
Affiliation(s)
- Elise A Piazza
- Princeton Neuroscience Institute, Princeton University, Princeton, NJ 08544, USA; Department of Psychology, Princeton University, Princeton, NJ 08544, USA
- Marius Cătălin Iordan
- Princeton Neuroscience Institute, Princeton University, Princeton, NJ 08544, USA; Department of Psychology, Princeton University, Princeton, NJ 08544, USA
- Casey Lew-Williams
- Department of Psychology, Princeton University, Princeton, NJ 08544, USA
27
Abstract
Whether music and language evolved independently of each other or whether both evolved from a common precursor remains a hotly debated topic. We here emphasize the role of vowels in the language-music relationship, arguing for a shared heritage of music and speech. Vowels play a decisive role in generating the sound or sonority of syllables, the main vehicles for transporting prosodic information in speech and singing. Timbre is, beyond question, the primary parameter that allows us to discriminate between different vowels, but vowels also have intrinsic pitch, intensity, and duration. There are striking correspondences between the number of vowels and the number of pitches in musical scales across cultures: an upper limit of roughly 12 elements, a lower limit of 2, and a frequency peak at 5-7 elements. Moreover, there is evidence for correspondences between vowels and scales even in specific cultures; for example, cultures with three vowels tend to have tritonic scales. We report a match between vowel pitch and musical pitch in meaningless syllables of Alpine yodelers, and highlight the relevance of vocal timbre in the music of many non-Western cultures, in which vocal timbre/vowel timbre and musical melody are often intertwined. Studies showing the pivotal role of vowels and their musical qualities in the ontogeny of language and in infant-directed speech are used as further arguments supporting the hypothesis that music and speech evolved from a common prosodic precursor in which the vowels exhibited both pitch and timbre variations.
28
Mathevon N, Casey C, Reichmuth C, Charrier I. Northern Elephant Seals Memorize the Rhythm and Timbre of Their Rivals' Voices. Curr Biol 2017; 27:2352-2356.e2. [PMID: 28736171 DOI: 10.1016/j.cub.2017.06.035] [Citation(s) in RCA: 30] [Impact Index Per Article: 4.3] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/06/2017] [Revised: 05/18/2017] [Accepted: 06/13/2017] [Indexed: 12/31/2022]
Abstract
The evolutionary origin of rhythm perception, a cognitive ability essential to musicality, remains unresolved [1-5]. The ability to perceive and memorize rhythmic sounds is widely shared among humans [6] but seems rare among other mammals [7, 8]. Although the perception of temporal metrical patterns has been found in a few species, this ability has only been demonstrated through behavioral training [9] (but see [10] for an example of spontaneous tempo coordination in a bonobo), and there is no experimental evidence to indicate its biological function. Furthermore, there is no example of a non-human mammal able to remember and recognize auditory rhythmic patterns among a wide range of tempi. In the northern elephant seal Mirounga angustirostris, the calls of mature males comprise a rhythmic series of pulses, with the call of each individual characterized by its tempo and timbre; these individual vocal signatures are stable over years and across contexts [11]. Here, we report that northern elephant seal males routinely memorize and recognize the unique tempo and timbre of their rivals' voices and use this rhythmic information to individually identify competitors, which facilitates navigation within the social network of the rookery. By performing playbacks with natural and modified vocalizations, we show that males are sensitive to call rhythm disruption independently of modification of spectral features and that they use both temporal and spectral cues to identify familiar rivals. While spectral features of calls typically encode individual identity in mammalian vocalizations [12], this is the first example of this phenomenon involving sound rhythm.
Affiliation(s)
- Nicolas Mathevon
- Equipe Neuro-Ethologie Sensorielle, ENES/Neuro-PSI, CNRS UMR 9197, Université de Lyon/Saint-Etienne, 23 rue Michelon, 42023 Saint-Etienne Cedex 2, France
- Caroline Casey
- Department of Ecology and Evolutionary Biology, University of California Santa Cruz, Santa Cruz, CA 95060, USA
- Colleen Reichmuth
- Institute of Marine Sciences, Long Marine Laboratory, University of California Santa Cruz, Santa Cruz, CA 95060, USA
- Isabelle Charrier
- Université Paris-Saclay, Université Paris-Sud, CNRS, UMR 9197, Institut des Neurosciences Paris-Saclay, 91405 Orsay, France
29
Abstract
There is evidence from a number of recent studies that most listeners are able to extract information related to song identity, emotion, or genre from music excerpts with durations in the range of tenths of seconds. Because of these very short durations, timbre, as a multifaceted auditory attribute, appears a plausible candidate for the type of feature that listeners make use of when processing short music excerpts. However, the importance of timbre in listening tasks that involve short excerpts has not yet been demonstrated empirically. Hence, the goal of this study was to develop a method for exploring to what degree similarity judgments of short music clips can be modeled with low-level acoustic features related to timbre. We utilized similarity data from two large samples of participants: Sample I was obtained via an online survey, used 16 clips of 400 ms length, and contained responses of 137,339 participants. Sample II was collected in a lab environment, used 16 clips of 800 ms length, and contained responses from 648 participants. Our model used two sets of audio features, comprising commonly used timbre descriptors and the well-known Mel-frequency cepstral coefficients as well as their temporal derivatives. In order to predict pairwise similarities, the resulting distances between clips in terms of their audio features were used as predictor variables in partial least-squares regression. We found that a sparse selection of three to seven features from both descriptor sets, mainly encoding the coarse shape of the spectrum as well as spectrotemporal variability, best predicted similarities across the two sets of sounds. Notably, the inclusion of non-acoustic predictors of musical genre and record release date allowed much better generalization performance and explained up to 50% of the shared variance (R2) between observations and model predictions. Overall, the results of this study empirically demonstrate that both acoustic features related to timbre and higher-level categorical features such as musical genre play a major role in the perception of short music clips.
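The modeling approach described above can be sketched as follows: pairwise feature distances between clips serve as predictors of similarity in a partial least-squares regression. The feature values, the number of components, and the rating model below are assumptions of the example, not the study's data.

    import numpy as np
    from sklearn.cross_decomposition import PLSRegression
    from sklearn.metrics import r2_score

    rng = np.random.default_rng(3)

    # Hypothetical descriptors for 16 clips (e.g., MFCC means, spectral centroid)
    n_clips, n_feat = 16, 20
    feats = rng.normal(size=(n_clips, n_feat))

    # Predictors: per-feature absolute distances for each of the 120 clip pairs
    i, j = np.triu_indices(n_clips, k=1)
    X = np.abs(feats[i] - feats[j])

    # Hypothetical dissimilarity ratings to be modeled
    y = X[:, :5].sum(axis=1) + rng.normal(scale=0.5, size=len(i))

    pls = PLSRegression(n_components=3).fit(X, y)
    print(f"R2 = {r2_score(y, pls.predict(X).ravel()):.2f}")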
Affiliation(s)
- Kai Siedenburg
- Department of Medical Physics and Acoustics, Carl von Ossietzky University of Oldenburg, Oldenburg, Germany
30
Gorzelańczyk EJ, Podlipniak P, Walecki P, Karpiński M, Tarnowska E. Pitch Syntax Violations Are Linked to Greater Skin Conductance Changes, Relative to Timbral Violations - The Predictive Role of the Reward System in Perspective of Cortico-subcortical Loops. Front Psychol 2017; 8:586. [PMID: 28458648 PMCID: PMC5394172 DOI: 10.3389/fpsyg.2017.00586] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.1] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/30/2016] [Accepted: 03/29/2017] [Indexed: 12/03/2022] Open
Abstract
According to contemporary opinion, emotional reactions to syntactic violations are due to surprise resulting from the general mechanism of prediction. The classic view is that the processing of musical syntax can be explained by activity of the cerebral cortex. However, some recent studies have indicated that subcortical brain structures, including those related to the processing of emotions, are also important during the processing of syntax. In order to check whether emotional reactions play a role in the processing of pitch syntax or are only the result of the general mechanism of prediction, skin conductance levels in response to three types of melodies were recorded and compared. In this study, 28 subjects listened to three types of short melodies prepared as Musical Instrument Digital Interface (MIDI) files: tonally correct, tonally violated (with one out-of-key note, i.e., one of high information content), and tonally correct but with one note played in a different timbre. The BioSemi ActiveTwo system with two passive Nihon Kohden electrodes was used. Skin conductance levels were positively correlated with the presented stimuli (timbral changes and tonal violations). Although changes in skin conductance levels were also observed in response to the change in timbre, the reactions to tonal violations were significantly stronger. Therefore, despite the fact that a timbral change is at least as unexpected as an out-of-key note, the processing of pitch syntax mainly generates increased activation of the sympathetic part of the autonomic nervous system. These results suggest that cortico-subcortical loops (especially the anterior cingulate-limbic loop) may play an important role in the processing of musical syntax.
Affiliation(s)
- Edward J Gorzelańczyk
- Department of Theoretical Basis of Bio-Medical Sciences and Medical Informatics, Nicolaus Copernicus University Collegium Medicum, Bydgoszcz, Poland; Non-Public Health Care Center Sue Ryder Home, Bydgoszcz, Poland; Medseven-Outpatient Addiction Treatment, Bydgoszcz, Poland; Institute of Philosophy, Kazimierz Wielki University, Bydgoszcz, Poland
- Piotr Podlipniak
- Institute of Musicology, Adam Mickiewicz University in Poznań, Poznań, Poland
- Piotr Walecki
- Department of Bioinformatics and Telemedicine, Jagiellonian University Collegium Medicum, Krakow, Poland
- Maciej Karpiński
- Institute of Linguistics, Adam Mickiewicz University in Poznań, Poznań, Poland
- Emilia Tarnowska
- Institute of Acoustics, Adam Mickiewicz University in Poznań, Poznań, Poland
31
Allen EJ, Burton PC, Olman CA, Oxenham AJ. Representations of Pitch and Timbre Variation in Human Auditory Cortex. J Neurosci 2017; 37:1284-1293. [PMID: 28025255 PMCID: PMC5296797 DOI: 10.1523/jneurosci.2336-16.2016] [Citation(s) in RCA: 55] [Impact Index Per Article: 7.9] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/23/2016] [Revised: 09/30/2016] [Accepted: 12/10/2016] [Indexed: 11/21/2022] Open
Abstract
Pitch and timbre are two primary dimensions of auditory perception, but how they are represented in the human brain remains a matter of contention. Some animal studies of auditory cortical processing have suggested modular processing, with different brain regions preferentially coding for pitch or timbre, whereas other studies have suggested a distributed code for different attributes across the same population of neurons. This study tested whether variations in pitch and timbre elicit activity in distinct regions of the human temporal lobes. Listeners were presented with sequences of sounds that varied in either fundamental frequency (eliciting changes in pitch) or spectral centroid (eliciting changes in brightness, an important attribute of timbre), with the degree of pitch or timbre variation in each sequence parametrically manipulated. The BOLD responses from auditory cortex increased with increasing sequence variance along each perceptual dimension. The spatial extent, region, and laterality of the cortical regions most responsive to variations in pitch or timbre at the univariate level of analysis were largely overlapping. However, patterns of activation in response to pitch or timbre variations were discriminable in most subjects at an individual level using multivoxel pattern analysis, suggesting a distributed coding of the two dimensions bilaterally in human auditory cortex.

SIGNIFICANCE STATEMENT Pitch and timbre are two crucial aspects of auditory perception. Pitch governs our perception of musical melodies and harmonies, and conveys both prosodic and (in tone languages) lexical information in speech. Brightness, an aspect of timbre or sound quality, allows us to distinguish different musical instruments and speech sounds. Frequency-mapping studies have revealed tonotopic organization in primary auditory cortex, but the use of pure tones or noise bands has precluded the possibility of dissociating pitch from brightness. Our results suggest a distributed code, with no clear anatomical distinctions between auditory cortical regions responsive to changes in either pitch or timbre, but they also reveal a population code that can differentiate between changes in either dimension within the same cortical regions.
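The multivoxel pattern analysis can be illustrated with a minimal sketch: a linear classifier is cross-validated on simulated voxel patterns in which pitch and timbre conditions differ only in a weak distributed pattern, mimicking the situation where univariate maps overlap but decoding still succeeds. All sizes and effect magnitudes are assumptions.

    import numpy as np
    from sklearn.svm import LinearSVC
    from sklearn.model_selection import cross_val_score

    rng = np.random.default_rng(4)

    # Hypothetical single-subject data: BOLD patterns over 200 auditory-cortex
    # voxels for 60 pitch-varying and 60 timbre-varying sequence blocks
    n_per_class, n_vox = 60, 200
    pattern = rng.normal(scale=0.3, size=n_vox)       # weak distributed difference
    X = np.vstack([rng.normal(size=(n_per_class, n_vox)),
                   rng.normal(size=(n_per_class, n_vox)) + pattern])
    y = np.repeat([0, 1], n_per_class)                # 0 = pitch run, 1 = timbre run

    # Above-chance cross-validated accuracy indicates a discriminable
    # distributed code even where mean (univariate) responses overlap
    acc = cross_val_score(LinearSVC(), X, y, cv=5).mean()
    print(f"pitch-vs-timbre decoding: {acc:.2f} (chance = 0.50)")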
Affiliation(s)
- Emily J Allen
- Department of Psychology, University of Minnesota, Minneapolis, Minnesota 55455
- Philip C Burton
- Department of Psychology, University of Minnesota, Minneapolis, Minnesota 55455
- Cheryl A Olman
- Department of Psychology, University of Minnesota, Minneapolis, Minnesota 55455
- Andrew J Oxenham
- Department of Psychology, University of Minnesota, Minneapolis, Minnesota 55455
32
Abstract
We study short-term recognition of timbre using familiar recorded tones from acoustic instruments and unfamiliar transformed tones that do not readily evoke sound-source categories. Participants indicated whether the timbre of a probe sound matched one of three previously presented sounds (item recognition). In Experiment 1, musicians recognised familiar acoustic sounds better than unfamiliar synthetic sounds, and this advantage was particularly large at the medial serial position. There was a strong correlation between the correct rejection rate and the mean perceptual dissimilarity of the probe to the tones from the sequence. Experiment 2 compared musicians' and non-musicians' performance under concurrent articulatory suppression, under visual interference, and in a silent control condition. Both suppression tasks disrupted performance by a similar margin, regardless of the participants' musical training or the type of sounds. Our results suggest that familiarity with sound-source categories and attention play important roles in short-term memory for timbre, which rules out accounts based solely on sensory persistence.
Affiliation(s)
- Kai Siedenburg
- Schulich School of Music, McGill University, Montreal, QC, Canada; Department of Medical Physics and Acoustics, University of Oldenburg, Oldenburg, Germany
- Stephen McAdams
- Schulich School of Music, McGill University, Montreal, QC, Canada
33
Grube M, Bruffaerts R, Schaeverbeke J, Neyens V, De Weer AS, Seghers A, Bergmans B, Dries E, Griffiths TD, Vandenberghe R. Core auditory processing deficits in primary progressive aphasia. Brain 2016; 139:1817-29. [PMID: 27060523 PMCID: PMC4892752 DOI: 10.1093/brain/aww067] [Citation(s) in RCA: 47] [Impact Index Per Article: 5.9] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/06/2015] [Accepted: 02/12/2016] [Indexed: 12/14/2022] Open
Abstract
The extent to which non-linguistic auditory processing deficits may contribute to the phenomenology of primary progressive aphasia is not established. Using non-linguistic stimuli devoid of meaning, we assessed three key domains of auditory processing (pitch, timing, and timbre) in a consecutive series of 18 patients with primary progressive aphasia (eight with the semantic variant, six with the non-fluent/agrammatic variant, and four with the logopenic variant), as well as 28 age-matched healthy controls. We further examined whether performance on the psychoacoustic tasks in the three domains related to the patients' speech and language and neuropsychological profiles. At the group level, patients were significantly impaired in all three domains. Patients had the most marked deficits within the rhythm domain, for the processing of short sequences of up to seven tones. Patients with the non-fluent variant showed the most pronounced deficits at the group and the individual level. A subset of patients with the semantic variant were also impaired, though less severely. The patients with the logopenic variant did not show any significant impairment. Significant deficits in the non-fluent and the semantic variant remained after partialling out effects of executive dysfunction. Performance on a subset of the psychoacoustic tests correlated with conventional verbal repetition tests. In sum, a core central auditory impairment exists in primary progressive aphasia for non-linguistic stimuli. While the non-fluent variant is clinically characterized by a motor speech deficit (an output problem), perceptual processing of tone sequences is clearly deficient. This may indicate the co-occurrence in the non-fluent variant of a deficit in working memory for auditory objects. Parsimoniously, we propose that auditory timing pathways are altered, which are used in common for processing acoustic sequence structure in both speech output and acoustic input.
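The "partialling out" step mentioned above can be made concrete: one standard recipe regresses the covariate out of both variables of interest and correlates the residuals. The sketch below is a generic illustration with fabricated numbers, not the paper's analysis, which controlled for executive measures when comparing group deficits.

    import numpy as np
    from scipy import stats

    def partial_corr(x, y, z):
        """Correlation between x and y after removing the variance
        each shares with covariate z (residualize, then correlate)."""
        rx = x - np.polyval(np.polyfit(z, x, 1), z)
        ry = y - np.polyval(np.polyfit(z, y, 1), z)
        return stats.pearsonr(rx, ry)

    rng = np.random.default_rng(5)
    exec_score = rng.normal(size=18)                   # executive-function measure
    timing = -0.6 * exec_score + rng.normal(size=18)   # hypothetical timing score
    severity = -0.5 * exec_score + 0.4 * timing + rng.normal(size=18)
    print(partial_corr(timing, severity, exec_score))  # (r, p) with z removed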
Affiliation(s)
- Manon Grube
- Institute of Neuroscience, Medical School, Newcastle University, Newcastle-upon-Tyne, UK; Machine Learning Group, Department of Computer Science, Berlin Institute of Technology, Berlin, Germany
- Rose Bruffaerts
- Laboratory for Cognitive Neurology, KU Leuven Department of Neurosciences, Belgium; Neurology Department, University Hospitals Leuven, Leuven, Belgium
- Jolien Schaeverbeke
- Laboratory for Cognitive Neurology, KU Leuven Department of Neurosciences, Belgium; Neurology Department, University Hospitals Leuven, Leuven, Belgium
- Veerle Neyens
- Laboratory for Cognitive Neurology, KU Leuven Department of Neurosciences, Belgium
- An-Sofie De Weer
- Laboratory for Cognitive Neurology, KU Leuven Department of Neurosciences, Belgium
- Alexandra Seghers
- Laboratory for Cognitive Neurology, KU Leuven Department of Neurosciences, Belgium; Neurology Department, University Hospitals Leuven, Leuven, Belgium
- Bruno Bergmans
- Neurology Department, University Hospitals Leuven, Leuven, Belgium
- Eva Dries
- Laboratory for Cognitive Neurology, KU Leuven Department of Neurosciences, Belgium; Neurology Department, University Hospitals Leuven, Leuven, Belgium
- Timothy D Griffiths
- Institute of Neuroscience, Medical School, Newcastle University, Newcastle-upon-Tyne, UK; Wellcome Centre for Neuroimaging, University College London, UK
- Rik Vandenberghe
- Laboratory for Cognitive Neurology, KU Leuven Department of Neurosciences, Belgium; Neurology Department, University Hospitals Leuven, Leuven, Belgium; Alzheimer Research Centre KU Leuven, Leuven Research Institute for Neuroscience and Disease, University of Leuven, Belgium
Collapse
|
34
|
Erickson ML. Acoustic Properties of the Voice Source and the Vocal Tract: Are They Perceptually Independent? J Voice 2016; 30:772.e9-772.e22. [PMID: 26822389 DOI: 10.1016/j.jvoice.2015.11.010] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/20/2015] [Accepted: 11/13/2015] [Indexed: 10/22/2022]
Abstract
OBJECTIVE/HYPOTHESIS: This study sought to determine whether the properties of the voice source and vocal tract are perceptually independent.
STUDY DESIGN: Within-subjects design.
METHODS: This study employed a paired-comparison paradigm in which listeners heard synthetic voices and rated them as same or different using a visual analog scale. Stimuli were synthesized using three different source slopes and two different formant patterns (mezzo-soprano and soprano) on the vowel /a/ at four pitches: A3, C4, B4, and F5.
RESULTS: Whereas formant pattern was the strongest effect, difference in source slope also affected the perceived quality difference. Source slope and formant pattern were not independently perceived.
CONCLUSION: These results suggest that when judging laryngeal adduction using perceptual information, judgments may not be accurate when the stimuli have differing formant patterns.
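To make the stimulus construction concrete, here is a minimal source-filter sketch: a harmonic source whose spectrum falls off at a chosen slope is passed through a cascade of formant resonators. The slope value, formant frequencies, and Q factors below are illustrative assumptions, not the synthesis parameters used in the study.

    import numpy as np
    from scipy.signal import iirpeak, lfilter

    fs = 44100
    f0 = 220.0            # A3, one of the pitches used above
    slope_db = 12.0       # source slope: dB of attenuation per octave (assumption)
    formants = [(800, 8), (1150, 10), (2900, 12)]  # hypothetical (Hz, Q) pattern

    # Harmonic voice source with a fixed spectral slope
    t = np.arange(int(fs * 1.0)) / fs
    n = np.arange(1, int((fs / 2) / f0))            # harmonics below Nyquist
    amps = 10 ** (-slope_db * np.log2(n) / 20)      # amplitude per harmonic
    source = (amps[:, None] * np.sin(2 * np.pi * f0 * n[:, None] * t)).sum(axis=0)

    # Cascade of resonators standing in for the vocal tract (formant pattern)
    signal = source
    for freq, q in formants:
        b, a = iirpeak(freq, q, fs=fs)
        signal = lfilter(b, a, signal)
    signal /= np.abs(signal).max()                  # normalize for playback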
Affiliation(s)
- Molly L Erickson
- Department of Audiology and Speech Pathology, University of Tennessee Health Science Center, Knoxville, Tennessee
35
Abstract
This article investigates cross-modal correspondences between musical timbre and shapes. Previous studies of audio-visual correspondences have mostly used features such as pitch, loudness, light intensity, visual size, and color characteristics, and most have utilized simple stimuli, e.g., simple tones. In this experiment, 23 musical sounds varying in fundamental frequency and timbre but fixed in loudness were used. Each sound was presented once against colored shapes and once against grayscale shapes. Subjects had to select the visual equivalent of a given sound, i.e., its shape, color (or grayscale), and vertical position. This scenario permitted studying the associations between normalized timbre and visual shapes, as well as testing some of the previous findings with more complex stimuli. One hundred and nineteen subjects (31 females and 88 males) participated in the online experiment, including 36 self-reported professional musicians, 47 self-reported amateur musicians, and 36 self-reported non-musicians. Thirty-one subjects also reported synesthesia-like experiences. A strong association between the timbre of envelope-normalized sounds and visual shapes was observed. Subjects strongly associated soft timbres with blue, green, or light gray rounded shapes; harsh timbres with red, yellow, or dark gray sharp angular shapes; and timbres combining elements of softness and harshness with a mixture of the two previous shape types. Color or grayscale had no effect on timbre-shape associations. Fundamental frequency was not associated with height, grayscale, or color. The significant correspondence between timbre and shape revealed by the present work could inform the design of substitution systems that might help the blind perceive shapes through timbre.
Affiliation(s)
- Mohammad Adeli
- Neurocomputational and Intelligent Signal Processing Research Group (NECOTIS), Département de Génie Électrique et de Génie Informatique, Faculté de Génie, Université de Sherbrooke, Sherbrooke, QC, Canada
- Jean Rouat
- Neurocomputational and Intelligent Signal Processing Research Group (NECOTIS), Département de Génie Électrique et de Génie Informatique, Faculté de Génie, Université de Sherbrooke, Sherbrooke, QC, Canada; Neuroscience Lab, Département de Sciences Biologiques, Université de Montréal, Montréal, QC, Canada
- Stéphane Molotchnikoff
- Neurocomputational and Intelligent Signal Processing Research Group (NECOTIS), Département de Génie Électrique et de Génie Informatique, Faculté de Génie, Université de Sherbrooke, Sherbrooke, QC, Canada; Neuroscience Lab, Département de Sciences Biologiques, Université de Montréal, Montréal, QC, Canada
36
Bernays M, Traube C. Investigating pianists' individuality in the performance of five timbral nuances through patterns of articulation, touch, dynamics, and pedaling. Front Psychol 2014; 5:157. [PMID: 24624099 PMCID: PMC3941302 DOI: 10.3389/fpsyg.2014.00157] [Citation(s) in RCA: 24] [Impact Index Per Article: 2.4] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/31/2013] [Accepted: 02/08/2014] [Indexed: 11/13/2022] Open
Abstract
Timbre is an essential expressive feature in piano performance. Concert pianists use a vast palette of timbral nuances to color their performances at the microstructural level. Although timbre is generally envisioned in the pianistic community as an abstract concept carried through an imaged vocabulary, performers may share some common strategies of timbral expression in piano performance. Yet there may remain further leeway for idiosyncratic processes in the production of piano timbre nuances. In this study, we examined the patterns of timbral expression in performances by four expert pianists. Each pianist performed four short pieces, each with five different timbral intentions (bright, dark, dry, round, and velvety). The performances were recorded with the high-accuracy Bösendorfer CEUS system. Fine-grained performance features of dynamics, touch, articulation and pedaling were extracted. Reduced PCA performance spaces and descriptive performance portraits confirmed that pianists exhibited unique, specific profiles for different timbral intentions, derived from underlying traits of general individuality, while sharing some broad commonalities of dynamics and articulation for each timbral intention. These results confirm that pianists' abstract notions of timbre correspond to reliable patterns of performance technique. Furthermore, these effects suggest that pianists can express individual styles while complying with specific timbral intentions.
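The "reduced PCA performance spaces" mentioned above can be sketched in a few lines: standardize the extracted performance features and project them onto the first principal components, then inspect how performances cluster by timbral intention or pianist. The feature table below is fabricated for illustration; the study's actual features came from the Bösendorfer CEUS recordings.

    import numpy as np
    from sklearn.decomposition import PCA
    from sklearn.preprocessing import StandardScaler

    rng = np.random.default_rng(6)

    # Hypothetical feature table: 4 pianists x 4 pieces x 5 timbral intentions,
    # each performance described by dynamics/touch/articulation/pedaling features
    n_perf, n_feat = 4 * 4 * 5, 10
    X = rng.normal(size=(n_perf, n_feat))

    # Reduced performance space: first two principal components of the
    # standardized features; points can then be colored by intention or pianist
    pca = PCA(n_components=2)
    space = pca.fit_transform(StandardScaler().fit_transform(X))
    print(space.shape, pca.explained_variance_ratio_.round(2))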
Affiliation(s)
- Michel Bernays
- IDMIL/SPCL, Schulich School of Music, McGill University, Montreal, QC, Canada
- Caroline Traube
- LRGM, OICRM, Faculté de musique, Université de Montréal, Montreal, QC, Canada
37
Vurma A. Timbre-induced pitch shift from the perspective of Signal Detection Theory: the impact of musical expertise, silence interval, and pitch region. Front Psychol 2014; 5:44. [PMID: 24550867 PMCID: PMC3907698 DOI: 10.3389/fpsyg.2014.00044] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/16/2013] [Accepted: 01/14/2014] [Indexed: 11/29/2022] Open
Abstract
The paradigm of Signal Detection Theory (SDT) was used to analyze the ability of professional pianists (N = 16) and string players (N = 15) to discriminate small F0 differences between consecutive musical tones, presented in pairs with identical and with different (bright and dull) timbres. The sensitivity (d′) and response bias (c) depended heavily on the timbral arrangement of the pairs of tones (the "comparable tones"), which can be interpreted as the influence of timbre-induced pitch shift on F0 discrimination. The participants were somewhat biased to "miss" signals when the comparable tones had identical timbres and to make "false alarms" when the tones had different timbres. In stimulus pairs containing tones with different timbres, d′ was lowest when the tone with the lower F0 had the brighter timbre; d′ was highest when both tones had a bright timbre. On average, the string players had a somewhat higher d′, and their perception was slightly less influenced by timbre-induced pitch shift than that of the pianists. Nevertheless, the dependence of d′ and c on the timbral arrangement of the tones was registered for all participants at all the investigated pitch regions, around D#3, D4, and C#5. Furthermore, the presence of a 3.5-s silence interval between the tones to be compared had an impact on both the d′- and c-values, as well as on the degree of vulnerability to timbre-induced pitch shift.
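The two SDT indices reported above have standard closed forms: d′ = z(H) − z(F) and c = −(z(H) + z(F))/2, where z is the inverse of the standard normal CDF, H the hit rate, and F the false-alarm rate. The sketch below computes both from hypothetical response counts, with a common log-linear correction (an assumption of this example, not necessarily the study's) to keep z finite when rates hit 0 or 1.

    from scipy.stats import norm

    def sdt_indices(hits, misses, false_alarms, correct_rejections):
        """Sensitivity d' = z(H) - z(F); criterion c = -(z(H) + z(F)) / 2,
        with a log-linear correction to avoid infinite z-scores at 0 or 1."""
        h = (hits + 0.5) / (hits + misses + 1)
        f = (false_alarms + 0.5) / (false_alarms + correct_rejections + 1)
        d_prime = norm.ppf(h) - norm.ppf(f)
        c = -(norm.ppf(h) + norm.ppf(f)) / 2
        return d_prime, c

    # Hypothetical counts from one listener in one timbral arrangement
    print(sdt_indices(hits=70, misses=30, false_alarms=20, correct_rejections=80))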
Affiliation(s)
- Allan Vurma
- Department of Musicology, Estonian Academy of Music and Theatre, Tallinn, Estonia
38
Cousineau M, Carcagno S, Demany L, Pressnitzer D. What is a melody? On the relationship between pitch and brightness of timbre. Front Syst Neurosci 2014; 7:127. [PMID: 24478638 PMCID: PMC3894522 DOI: 10.3389/fnsys.2013.00127] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/30/2013] [Accepted: 12/25/2013] [Indexed: 11/13/2022] Open
Abstract
Previous studies showed that the perceptual processing of sound sequences is more efficient when the sounds vary in pitch than when they vary in loudness. We show here that sequences of sounds varying in brightness of timbre are processed with the same efficiency as pitch sequences. The sounds used consisted of two simultaneous pure tones one octave apart, and the listeners’ task was to make same/different judgments on pairs of sequences varying in length (one, two, or four sounds). In one condition, brightness of timbre was varied within the sequences by changing the relative level of the two pure tones. In other conditions, pitch was varied by changing fundamental frequency, or loudness was varied by changing the overall level. In all conditions, only two possible sounds could be used in a given sequence, and these two sounds were equally discriminable. When sequence length increased from one to four, discrimination performance decreased substantially for loudness sequences, but to a smaller extent for brightness sequences and pitch sequences. In the latter two conditions, sequence length had a similar effect on performance. These results suggest that the processes dedicated to pitch and brightness analysis, when probed with a sequence-discrimination task, share unexpected similarities.
Affiliation(s)
- Marion Cousineau
- International Laboratory for Brain, Music and Sound Research (BRAMS), Department of Psychology, University of Montreal, Montreal, QC, Canada
- Daniel Pressnitzer
- Laboratoire des Systèmes Perceptifs, CNRS UMR 8248, Paris, France; Département d'études cognitives, École normale supérieure, Paris, France
39
Stevens CJ, Tardieu J, Dunbar-Hall P, Best CT, Tillmann B. Expectations in culturally unfamiliar music: influences of proximal and distal cues and timbral characteristics. Front Psychol 2013; 4:789. [PMID: 24223562 PMCID: PMC3819523 DOI: 10.3389/fpsyg.2013.00789] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.4] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/12/2013] [Accepted: 10/07/2013] [Indexed: 11/13/2022] Open
Abstract
Listeners' musical perception is influenced by cues that can be stored in short-term memory (e.g., within the same musical piece) or long-term memory (e.g., based on one's own musical culture). The present study tested how these cues (referred to as, respectively, proximal and distal cues) influence the perception of music from an unfamiliar culture. Western listeners who were naïve to Gamelan music judged completeness and coherence for newly constructed melodies in the Balinese gamelan tradition. In these melodies, we manipulated the final tone with three possibilities: the original gong tone, an in-scale tone replacement or an out-of-scale tone replacement. We also manipulated the musical timbre employed in Gamelan pieces. We hypothesized that novice listeners are sensitive to out-of-scale changes, but not in-scale changes, and that this might be influenced by the more unfamiliar timbre created by Gamelan "sister" instruments whose harmonics beat with the harmonics of the other instrument, creating a timbrally "shimmering" sound. The results showed: (1) out-of-scale endings were judged less complete than original gong and in-scale endings; (2) for melodies played with "sister" instruments, in-scale endings were judged as less complete than original endings. Furthermore, melodies using the original scale tones were judged more coherent than melodies containing few or multiple tone replacements; melodies played on single instruments were judged more coherent than the same melodies played on sister instruments. Additionally, there is some indication of within-session statistical learning, with expectations for the initially-novel materials developing during the course of the experiment. The data suggest the influence of both distal cues (e.g., previously unfamiliar timbres) and proximal cues (within the same sequence and over the experimental session) on the perception of melodies from other cultural systems based on unfamiliar tunings and scale systems.
40
Marozeau J, Innes-Brown H, Blamey PJ. The acoustic and perceptual cues affecting melody segregation for listeners with a cochlear implant. Front Psychol 2013; 4:790. [PMID: 24223563 PMCID: PMC3818467 DOI: 10.3389/fpsyg.2013.00790] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.8] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/06/2013] [Accepted: 10/07/2013] [Indexed: 11/13/2022] Open
Abstract
Our ability to listen selectively to single sound sources in complex auditory environments is termed "auditory stream segregation." This ability is affected by peripheral disorders such as hearing loss, as well as by plasticity in central processing such as occurs with musical training. Brain plasticity induced by musical training can enhance the ability to segregate sounds, leading to improvements in a variety of auditory abilities. The melody segregation ability of 12 cochlear-implant (CI) recipients was tested using a new method to determine the perceptual distance needed to segregate a simple 4-note melody from a background of interleaved random-pitch distractor notes. In Experiment 1, participants rated the difficulty of segregating the melody from the distractor notes while four physical properties of the distractor notes were varied. In Experiment 2, listeners were asked to rate the dissimilarity between melody patterns whose notes differed on the four physical properties simultaneously. Multidimensional scaling analysis transformed the dissimilarity ratings into perceptual distances. Regression between physical and perceptual cues then derived the minimal perceptual distance needed to segregate the melody. The most efficient streaming cue for CI users was loudness. Compared with normal-hearing listeners without musical backgrounds, CI users needed a greater difference along the perceptual dimension correlated with the temporal envelope for stream segregation. No differences in streaming efficiency were found between the perceptual dimensions linked to the F0 and the spectral envelope. Combined with our previous results in normally-hearing musicians and non-musicians, the results show that differences in training, as well as differences in peripheral auditory processing (hearing impairment and the use of a hearing device), influence the way listeners use different acoustic cues for segregating interleaved musical streams.
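The analysis chain described above (dissimilarity ratings, multidimensional scaling, then regression against physical cues) can be sketched as follows; the matrix values, number of patterns, and dimensionality are illustrative assumptions.

    import numpy as np
    from sklearn.manifold import MDS

    rng = np.random.default_rng(7)

    # Hypothetical mean dissimilarity ratings between 8 melody patterns,
    # arranged as a symmetric matrix with a zero diagonal
    n = 8
    d = rng.uniform(1, 10, size=(n, n))
    diss = np.triu(d, 1) + np.triu(d, 1).T

    # Transform ratings into a low-dimensional perceptual space; distances in
    # this space can then be regressed against the physical cue differences
    mds = MDS(n_components=2, dissimilarity="precomputed", random_state=0)
    coords = mds.fit_transform(diss)
    print(coords.shape, round(mds.stress_, 1))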
Affiliation(s)
- Jeremy Marozeau
- Department of Medical Bionics, University of Melbourne, Melbourne, VIC, Australia; Bionics Institute, Melbourne, VIC, Australia
41
Spreckelmeyer KN, Altenmüller E, Colonius H, Münte TF. Preattentive processing of emotional musical tones: a multidimensional scaling and ERP study. Front Psychol 2013; 4:656. [PMID: 24065950 PMCID: PMC3779798 DOI: 10.3389/fpsyg.2013.00656] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/25/2013] [Accepted: 09/03/2013] [Indexed: 11/15/2022] Open
Abstract
Musical emotion can be conveyed by subtle variations in timbre. Here, we investigated whether the brain is capable of discriminating tones differing in emotional expression by recording event-related potentials (ERPs) in an oddball paradigm under preattentive listening conditions. First, using multidimensional Fechnerian scaling, pairs of violin tones played with a happy or sad intonation were rated same or different by a group of non-musicians. Three happy and three sad tones were selected for the ERP experiment. The Fechnerian distances between tones within an emotion were in the same range as the distances between tones of different emotions. In two conditions, either three happy and one sad tone or three sad and one happy tone were presented in pseudo-random order. A mismatch negativity for the emotional deviant was observed, indicating that, in spite of considerable perceptual differences between the three equiprobable tones of the standard emotion, a template was formed based on timbral cues against which the emotional deviant was compared. Based on Juslin's assumption of redundant code usage, we propose that tones were grouped together because they were identified as belonging to one emotional category on the basis of different emotion-specific cues. These results indicate that the brain forms an emotional memory trace at a preattentive level and thus extend previous investigations in which emotional deviance was confounded with physical dissimilarity. Differences between sad and happy tones were observed, which might be due to the fact that the happy emotion is mostly communicated by suprasegmental features.
42
Abstract
Pitch, the perceptual correlate of fundamental frequency (F0), plays an important role in speech, music, and animal vocalizations. Changes in F0 over time help define musical melodies and speech prosody, while comparisons of simultaneous F0 are important for musical harmony, and for segregating competing sound sources. This study compared listeners' ability to detect differences in F0 between pairs of sequential or simultaneous tones that were filtered into separate, nonoverlapping spectral regions. The timbre differences induced by filtering led to poor F0 discrimination in the sequential, but not the simultaneous, conditions. Temporal overlap of the two tones was not sufficient to produce good performance; instead performance appeared to depend on the two tones being integrated into the same perceptual object. The results confirm the difficulty of comparing the pitches of sequential sounds with different timbres and suggest that, for simultaneous sounds, pitch differences may be detected through a decrease in perceptual fusion rather than an explicit coding and comparison of the underlying F0s.
Affiliation(s)
- Elizabeth M O Borchert
- Auditory Perception and Cognition Laboratory, Department of Psychology, University of Minnesota, Twin Cities, MN, USA
43
Abstract
This paper reviews the basic aspects of auditory processing that play a role in the perception of speech. The frequency selectivity of the auditory system, as measured using masking experiments, is described and used to derive the internal representation of the spectrum (the excitation pattern) of speech sounds. The perception of timbre and distinctions in quality between vowels are related to both static and dynamic aspects of the spectra of sounds. The perception of pitch and its role in speech perception are described. Measures of the temporal resolution of the auditory system are described and a model of temporal resolution based on a sliding temporal integrator is outlined. The combined effects of frequency and temporal resolution can be modelled by calculation of the spectro-temporal excitation pattern, which gives good insight into the internal representation of speech sounds. For speech presented in quiet, the resolution of the auditory system in frequency and time usually markedly exceeds the resolution necessary for the identification or discrimination of speech sounds, which partly accounts for the robust nature of speech perception. However, for people with impaired hearing, speech perception is often much less robust.
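The excitation-pattern computation mentioned above rests on auditory filters whose bandwidths vary with centre frequency; a widely used summary is the equivalent rectangular bandwidth (ERB) formula of Glasberg and Moore (1990), sketched below. That this exact formula is the one intended in the review is an assumption, though it comes from the same laboratory's work.

    def erb_hz(f_hz):
        """Equivalent rectangular bandwidth of the auditory filter at centre
        frequency f_hz (Glasberg & Moore, 1990): ERB = 24.7 * (4.37 * F/1000 + 1)."""
        return 24.7 * (4.37 * f_hz / 1000.0 + 1.0)

    # Filter bandwidths across the frequency range occupied by vowel formants
    for f in (500, 1000, 2000, 4000):
        print(f, "Hz ->", round(erb_hz(f), 1), "Hz wide")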
Affiliation(s)
- Brian C J Moore
- Department of Experimental Psychology, University of Cambridge, Downing Street, Cambridge CB2 3EB, UK