1
Zhuang Y, Yang S. Exploring quantitative indices to characterize piano timbre with precision validated using measurement system analysis. Front Psychol 2024; 15:1363329. PMID: 38933586; PMCID: PMC11204798; DOI: 10.3389/fpsyg.2024.1363329.
Abstract
Aim: Timbre in piano performance plays a critical role in enhancing musical expression. However, timbre control in current piano performance education relies mostly on descriptive characterization, which admits large variations of interpretation. The current study aimed to mitigate these limitations by identifying quantitative indices that characterize piano timbre with adequate precision. Methods: A total of 24 sounds of G6 were recorded from 3 grand pianos, by 2 performers, and with 4 repetitions. The sounds were processed and analyzed with audio software to extract the frequencies and volumes of the harmonic series in the spectrum curves. Ten quantitative timbre indices were calculated. Precision validation with statistical gage R&R analysis was conducted to gauge the repeatability (between repetitions) and reproducibility (between performers) of the indices. The resultant percentage study variation (%SV) of an index had to be ≤10% to be considered acceptable for characterizing piano timbre with sufficient precision. Results: Of the 10 indices, 4 had acceptable precision in characterizing piano timbre with %SV ≤10%: the square sum of relative volume (4.40%), the frequency-weighted arithmetic mean of relative volume (4.29%), the sum of relative volume (3.11%), and the frequency-weighted sum of relative volume (2.09%). The novel indices identified here provide valuable tools to advance the measurement and communication of timbre and to improve music performance education.
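As a minimal sketch of how indices of this kind can be computed from a note's harmonic series, the Python snippet below derives the four indices named above from assumed harmonic frequencies and amplitudes. The exact formulas are not given in the abstract, so the definitions used here (relative volume as the ratio to the fundamental's amplitude, frequency weights as harmonic numbers) are illustrative assumptions only.

```python
import numpy as np

def timbre_indices(harmonic_freqs_hz, harmonic_amplitudes):
    """Candidate timbre indices computed from one note's harmonic series.

    harmonic_freqs_hz   : frequencies of the fundamental and its overtones
    harmonic_amplitudes : linear amplitude (volume) of each harmonic
    """
    f = np.asarray(harmonic_freqs_hz, dtype=float)
    a = np.asarray(harmonic_amplitudes, dtype=float)

    rel_v = a / a[0]   # relative volume: ratio to the fundamental (assumption)
    w = f / f[0]       # frequency weight: harmonic number (assumption)

    return {
        "sum_rel_volume": rel_v.sum(),
        "square_sum_rel_volume": (rel_v ** 2).sum(),
        "freq_weighted_sum_rel_volume": (w * rel_v).sum(),
        "freq_weighted_mean_rel_volume": (w * rel_v).sum() / w.sum(),
    }

# Example: a G6 tone (~1568 Hz) with four measured harmonics.
print(timbre_indices([1568, 3136, 4704, 6272], [1.00, 0.42, 0.18, 0.07]))
```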
Affiliation(s)
- Yuan Zhuang
- Department of Arts and Media, Tongji University, Shanghai, China
- Shuo Yang
- Department of Biomedical Engineering, Illinois Institute of Technology, Chicago, IL, United States
2
Thoret E, Andrillon T, Gauriau C, Léger D, Pressnitzer D. Sleep deprivation detected by voice analysis. PLoS Comput Biol 2024; 20:e1011849. PMID: 38315733; PMCID: PMC10890756; DOI: 10.1371/journal.pcbi.1011849.
Abstract
Sleep deprivation has an ever-increasing impact on individuals and societies. Yet, to date, there is no quick and objective test for sleep deprivation. Here, we used automated acoustic analyses of the voice to detect sleep deprivation. Building on current machine-learning approaches, we focused on interpretability by introducing two novel ideas: the use of a fully generic auditory representation as input feature space, combined with an interpretation technique based on reverse correlation. The auditory representation consisted of a spectro-temporal modulation analysis derived from neurophysiology. The interpretation method aimed to reveal the regions of the auditory representation that supported the classifiers' decisions. Results showed that generic auditory features could be used to detect sleep deprivation successfully, with an accuracy comparable to state-of-the-art speech features. Furthermore, the interpretation revealed two distinct effects of sleep deprivation on the voice: changes in slow temporal modulations related to prosody and changes in spectral features related to voice quality. Importantly, the relative balance of the two effects varied widely across individuals, even though the amount of sleep deprivation was controlled, thus confirming the need to characterize sleep deprivation at the individual level. Moreover, while the prosody factor correlated with subjective sleepiness reports, the voice quality factor did not, consistent with the presence of both explicit and implicit consequences of sleep deprivation. Overall, the findings show that individual effects of sleep deprivation may be observed in vocal biomarkers. Future investigations correlating such markers with objective physiological measures of sleep deprivation could enable "sleep stethoscopes" for the cost-effective diagnosis of the individual effects of sleep deprivation.
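For readers who want a concrete starting point, the sketch below illustrates the generic pipeline implied by the abstract: a spectro-temporal modulation representation obtained as the 2D Fourier transform of a log-spectrogram, pooled to a fixed grid and fed to a linear classifier. It is not the authors' code; the auditory model, analysis settings, and classifier shown here are assumptions, and the reverse-correlation interpretation step is not reproduced.

```python
import numpy as np
from scipy.signal import spectrogram
from sklearn.svm import LinearSVC

def pool2d(M, n=32):
    """Average-pool a 2D array onto an n x n grid so all recordings match in size."""
    rows = np.array_split(M, n, axis=0)
    return np.array([[blk.mean() for blk in np.array_split(r, n, axis=1)] for r in rows])

def stm_features(audio, sr):
    """Generic spectro-temporal modulation features: |2D FFT| of a log-spectrogram."""
    f, t, S = spectrogram(audio, fs=sr, nperseg=512, noverlap=384)
    log_S = np.log(S + 1e-10)
    mps = np.abs(np.fft.fftshift(np.fft.fft2(log_S)))   # modulation power spectrum
    return pool2d(mps).ravel()

# X: one STM feature vector per voice recording; y: 1 = sleep-deprived, 0 = rested.
# clf = LinearSVC().fit(X, y)
# Reshaping clf.coef_ back to the 32 x 32 grid gives a crude map of which modulation
# regions drive the decision (the paper instead uses a reverse-correlation analysis).
```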
Affiliation(s)
- Etienne Thoret
- Laboratoire des systèmes perceptifs, Département d’études cognitives, École normale supérieure, PSL University, CNRS, Paris, France
- Aix-Marseille University, CNRS, Institut de Neurosciences de la Timone (INT) UMR7289, Perception Representation Image Sound Music (PRISM) UMR7061, Laboratoire d’Informatique et Systèmes (LIS) UMR7020, Marseille, France
- Institute of Language Communication and the Brain, Aix-Marseille University, Marseille, France
- Thomas Andrillon
- Sorbonne Université, Institut du Cerveau - Paris Brain Institute - ICM, Mov’it team, Inserm, CNRS, Paris, France
- Université Paris Cité, VIFASOM, ERC 7330, Vigilance Fatigue Sommeil et santé publique, Paris, France
- APHP, Hôtel-Dieu, Centre du Sommeil et de la Vigilance, Paris, France
- Caroline Gauriau
- Université Paris Cité, VIFASOM, ERC 7330, Vigilance Fatigue Sommeil et santé publique, Paris, France
- APHP, Hôtel-Dieu, Centre du Sommeil et de la Vigilance, Paris, France
- Damien Léger
- Université Paris Cité, VIFASOM, ERC 7330, Vigilance Fatigue Sommeil et santé publique, Paris, France
- APHP, Hôtel-Dieu, Centre du Sommeil et de la Vigilance, Paris, France
- Daniel Pressnitzer
- Laboratoire des systèmes perceptifs, Département d’études cognitives, École normale supérieure, PSL University, CNRS, Paris, France
3
Thoret E, Ystad S, Kronland-Martinet R. Hearing as adaptive cascaded envelope interpolation. Commun Biol 2023; 6:671. PMID: 37355702; PMCID: PMC10290642; DOI: 10.1038/s42003-023-05040-5.
Abstract
The human auditory system is designed to capture and encode sounds from our surroundings and conspecifics. However, the precise mechanisms by which it adaptively extracts the most important spectro-temporal information from sounds are still not fully understood. Previous auditory models have explained sound encoding at the cochlear level using static filter banks, but this view is incompatible with the nonlinear and adaptive properties of the auditory system. Here we propose an approach that treats the cochlear processes as envelope interpolations inspired by cochlear physiology. It unifies linear and nonlinear adaptive behaviors into a single comprehensive framework that provides a data-driven understanding of auditory coding. It allows simulating a broad range of psychophysical phenomena, from virtual pitches and combination tones to consonance and dissonance of harmonic sounds. It further predicts properties of the cochlear filters such as frequency selectivity. We also propose a possible link between the parameters of the model and the density of hair cells on the basilar membrane. Cascaded Envelope Interpolation may lead to improvements in sound processing for hearing aids by providing a nonlinear, data-driven way of preprocessing acoustic signals that is consistent with peripheral processes.
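One plausible, much-simplified reading of "envelope interpolation" is sketched below: an envelope obtained by interpolating between successive local maxima of a signal, applied several times in cascade. This is only an illustration of the general idea; the adaptive, physiologically parameterised model proposed by the authors is richer and is not reproduced here.

```python
import numpy as np
from scipy.signal import find_peaks

def envelope_interpolation(x):
    """One stage: interpolate linearly between successive local maxima of the signal."""
    peaks, _ = find_peaks(x)
    if len(peaks) < 2:        # nothing left to interpolate over
        return x
    t = np.arange(len(x))
    return np.interp(t, peaks, x[peaks])

def cascaded_envelope(x, n_stages=3):
    """Apply the envelope-interpolation operator several times in cascade."""
    env = x
    for _ in range(n_stages):
        env = envelope_interpolation(env)
    return env

# Example: a two-tone complex; each successive stage removes finer temporal detail.
sr = 16000
t = np.arange(0, 0.2, 1 / sr)
x = np.sin(2 * np.pi * 440 * t) + 0.5 * np.sin(2 * np.pi * 660 * t)
env = cascaded_envelope(x, n_stages=2)
```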
Affiliation(s)
- Etienne Thoret
- Aix Marseille Univ, CNRS, UMR7061 PRISM, UMR7020 LIS, Marseille, France.
- Institute of Language, Communication, and the Brain (ILCB), Marseille, France.
- Sølvi Ystad
- CNRS, Aix Marseille Univ, UMR 7061 PRISM, Marseille, France
4
Rosi V, Arias Sarah P, Houix O, Misdariis N, Susini P. Shared mental representations underlie metaphorical sound concepts. Sci Rep 2023; 13:5180. PMID: 36997613; PMCID: PMC10063581; DOI: 10.1038/s41598-023-32214-2.
Abstract
Communication between sound and music experts is based on the shared understanding of a metaphorical vocabulary derived from other sensory modalities. Yet, the impact of sound expertise on the mental representation of these sound concepts remains unclear. To address this issue, we investigated the acoustic portraits of four metaphorical sound concepts (brightness, warmth, roundness, and roughness) in three groups of participants (sound engineers, conductors, and non-experts). Participants (N = 24) rated a corpus of orchestral instrument sounds (N = 520) using Best-Worst Scaling. With this data-driven method, we sorted the sound corpus for each concept and population. We compared the population ratings and ran machine learning algorithms to unveil the acoustic portraits of each concept. Overall, the results revealed that sound engineers were the most consistent. We found that roughness is widely shared while brightness is expertise dependent. The frequent use of brightness by expert populations suggests that its meaning becomes more specific with sound expertise. As for roundness and warmth, the importance of pitch and noise in their acoustic definition appears to be the key to distinguishing them. These results provide crucial information on the mental representations of a metaphorical vocabulary of sound and on whether it is shared or refined by sound expertise.
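Best-Worst Scaling ratings are commonly turned into scores by simple counting: for each item, the number of times it was chosen as "best" minus the number of times it was chosen as "worst", divided by the number of times it was shown. The sketch below implements that standard counting score; the authors may have used a more elaborate BWS model, so treat this only as an illustration, and the sound names are hypothetical.

```python
from collections import defaultdict

def best_worst_scores(trials):
    """trials: iterable of (shown_items, best_item, worst_item) tuples."""
    shown = defaultdict(int)
    net = defaultdict(int)
    for items, best, worst in trials:
        for item in items:
            shown[item] += 1
        net[best] += 1
        net[worst] -= 1
    return {item: net[item] / shown[item] for item in shown}

# Hypothetical trial: four sounds shown, one judged "brightest", one "least bright".
trials = [(("oboe_C5", "tuba_C3", "violin_G4", "flute_A5"), "oboe_C5", "tuba_C3")]
print(best_worst_scores(trials))
```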
Affiliation(s)
- Victor Rosi
- Sound Perception and Design Group, STMS, Ircam - Sorbonne Université - CNRS - Ministère de la Culture, 1 Place Igor Stravinsky, 75004, Paris, France.
- Pablo Arias Sarah
- School of Psychology and Neuroscience, University of Glasgow, 62 Hillhead Street, Glasgow, G12 8QB, UK
- Lund University Cognitive Science, Lund University, Box 192, 221 00, Lund, Sweden
- Olivier Houix
- Sound Perception and Design Group, STMS, Ircam - Sorbonne Université - CNRS - Ministère de la Culture, 1 Place Igor Stravinsky, 75004, Paris, France
- Nicolas Misdariis
- Sound Perception and Design Group, STMS, Ircam - Sorbonne Université - CNRS - Ministère de la Culture, 1 Place Igor Stravinsky, 75004, Paris, France
- Patrick Susini
- Sound Perception and Design Group, STMS, Ircam - Sorbonne Université - CNRS - Ministère de la Culture, 1 Place Igor Stravinsky, 75004, Paris, France
5
Loriette A, Liu W, Bevilacqua F, Caramiaux B. Describing movement learning using metric learning. PLoS One 2023; 18:e0272509. PMID: 36735670; PMCID: PMC9897515; DOI: 10.1371/journal.pone.0272509.
Abstract
Analysing movement learning can rely on human evaluation, e.g. annotating video recordings, or on computational means, i.e. applying metrics to behavioural data. However, it remains challenging to relate human perception of movement similarity to computational measures that aim at modelling such similarity. In this paper, we propose a metric learning method bridging the gap between human ratings of movement similarity in a motor learning task and computational metric evaluation on the same task. It applies metric learning to a Dynamic Time Warping algorithm to derive an optimal set of movement features that best explain human ratings. We evaluated this method on an existing movement dataset, which comprises videos of participants practising a complex gesture sequence toward a target template, as well as the collected data that describes the movements. We show that it is possible to establish a linear relationship between human ratings and our learned computational metric. This learned metric can be used to describe the most salient temporal moments implicitly used by annotators, as well as movement parameters that correlate with motor improvements in the dataset. We conclude with possibilities for generalising this method to design computational tools for movement annotation and for the evaluation of skill learning.
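The core idea, as described, is a Dynamic Time Warping distance whose per-feature weights are learned so that computed distances align with human similarity ratings. The sketch below shows a weighted DTW and a crude random-search fit of the weights against ratings; the authors' actual optimisation procedure is not reproduced here, and the settings are assumptions.

```python
import numpy as np

def weighted_dtw(a, b, w):
    """DTW distance between sequences a, b of feature vectors, with feature weights w."""
    n, m = len(a), len(b)
    D = np.full((n + 1, m + 1), np.inf)
    D[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = np.sqrt(np.sum(w * (a[i - 1] - b[j - 1]) ** 2))
            D[i, j] = cost + min(D[i - 1, j], D[i, j - 1], D[i - 1, j - 1])
    return D[n, m]

def fit_weights(pairs, ratings, n_feats, n_iter=200, seed=0):
    """Random search for weights whose DTW distances best correlate with human ratings."""
    rng = np.random.default_rng(seed)
    best_w, best_r = np.ones(n_feats), -np.inf
    for _ in range(n_iter):
        w = rng.random(n_feats)
        d = np.array([weighted_dtw(a, b, w) for a, b in pairs])
        r = abs(np.corrcoef(d, ratings)[0, 1])
        if r > best_r:
            best_w, best_r = w, r
    return best_w, best_r
```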
Affiliation(s)
- Wanyu Liu
- STMS IRCAM-CNRS-Sorbonne Université, Paris, France
6
McAdams S, Thoret E, Wang G, Montrey M. Timbral cues for learning to generalize musical instrument identity across pitch register. J Acoust Soc Am 2023; 153:797. PMID: 36859162; DOI: 10.1121/10.0017100.
Abstract
Timbre provides an important cue for identifying musical instruments. Many timbral attributes covary with other parameters such as pitch. This study explores listeners' ability to construct categories of instrumental sound sources from sounds that vary in pitch. Nonmusicians identified 11 instruments from the woodwind, brass, percussion, and plucked and bowed string families. In experiment 1, they were trained to identify instruments playing a pitch of C4, and in experiments 2 and 3, they were trained with a five-tone sequence (F#3-F#4), exposing them to the way timbre varies with pitch. Participants were required to reach a threshold of 75% correct identification in training. In the testing phase, successful listeners heard single tones (experiments 1 and 2) or three-tone sequences (A3-D#4) (experiment 3) across each instrument's full pitch range to test their ability to generalize identification from the learned sound(s). Identification generalization over pitch varies a great deal across instruments. No significant differences were found between single-pitch and multi-pitch training or testing conditions. Identification rates can be predicted moderately well by spectrograms or modulation spectra. These results suggest that listeners use the most relevant acoustical invariance to identify musical instrument sounds, drawing also on previous experience with the tested instruments.
Affiliation(s)
- Stephen McAdams
- Schulich School of Music, McGill University, Montreal, Québec H3A 1E3, Canada
- Etienne Thoret
- Aix-Marseille University, Centre National de la Recherche Scientifique, Perception Representations Image Sound Music Laboratory, Unité Mixte de Recherche 7061, Laboratoire d'Informatique et Systèmes, Unité Mixte de Recherche 7020, 13009 Marseille, France
- Grace Wang
- Cognitive Science Program, McGill University, Montreal, Québec H3A 3R1, Canada
- Marcel Montrey
- Department of Psychology, McGill University, Montreal, Québec H3A 1G1, Canada
7
Donhauser PW, Klein D. Audio-Tokens: A toolbox for rating, sorting and comparing audio samples in the browser. Behav Res Methods 2023; 55:508-515. PMID: 35297013; PMCID: PMC10027774; DOI: 10.3758/s13428-022-01803-w.
Abstract
Here we describe a JavaScript toolbox to perform online rating studies with auditory material. The main feature of the toolbox is that audio samples are associated with visual tokens on the screen that control audio playback and can be manipulated depending on the type of rating. This allows the collection of single- and multidimensional feature ratings, as well as categorical and similarity ratings. The toolbox (github.com/pwdonh/audio_tokens) can be used via a plugin for the widely used jsPsych, as well as using plain JavaScript for custom applications. We expect the toolbox to be useful in psychological research on speech and music perception, as well as for the curation and annotation of datasets in machine learning.
Affiliation(s)
- Peter W Donhauser
- Cognitive Neuroscience Unit, Montreal Neurological Institute, McGill University, Montreal, QC, H3A 2B4, Canada.
- Ernst Strüngmann Institute for Neuroscience in Cooperation with Max Planck Society, 60528, Frankfurt am Main, Germany.
- Denise Klein
- Cognitive Neuroscience Unit, Montreal Neurological Institute, McGill University, Montreal, QC, H3A 2B4, Canada.
- Centre for Research on Brain, Language and Music, McGill University, Montreal, QC, H3G 2A8, Canada.
8
Marczyk A, O'Brien B, Tremblay P, Woisard V, Ghio A. Correlates of vowel clarity in the spectrotemporal modulation domain: Application to speech impairment evaluation. J Acoust Soc Am 2022; 152:2675. PMID: 36456260; DOI: 10.1121/10.0015024.
Abstract
This article reports on vowel clarity metrics based on spectrotemporal modulations of speech signals. Motivated by previous findings on the relevance of modulation-based metrics for speech intelligibility assessment and pathology classification, the current study used factor analysis to identify regions within a bi-dimensional modulation space, the magnitude power spectrum, as in Elliott and Theunissen [(2009). PLoS Comput. Biol. 5(3), e1000302] by relating them to a set of conventional acoustic metrics of vowel space area and vowel distinctiveness. Two indices based on the energy ratio between high and low modulation rates across temporal and spectral dimensions of the modulation space emerged from the analyses. These indices served as input for measurements of central tendency and classification analyses that aimed to identify vowel-related speech impairments in French native speakers with head and neck cancer (HNC) and Parkinson dysarthria (PD). Following the analysis, vowel-related speech impairment was identified in HNC speakers, but not in PD. These results were consistent with findings based on subjective evaluations of speech intelligibility. The findings reported are consistent with previous studies indicating that impaired speech is associated with attenuation in energy in higher spectrotemporal modulation bands.
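The two indices are described as energy ratios between high and low modulation rates along the temporal and spectral axes of the modulation power spectrum. The sketch below computes such ratios from a 2D Fourier transform of a log-spectrogram; the cutoff values and analysis settings are illustrative assumptions, not those used in the study.

```python
import numpy as np
from scipy.signal import spectrogram

def modulation_energy_ratios(x, sr, temp_cut_hz=4.0, spec_cut_cyc_per_khz=1.0):
    """High/low modulation-energy ratios along the temporal and spectral axes."""
    f, t, S = spectrogram(x, fs=sr, nperseg=512, noverlap=384)
    log_S = np.log(S + 1e-10)
    M = np.abs(np.fft.fftshift(np.fft.fft2(log_S))) ** 2   # modulation power spectrum
    # Modulation axes: spectral modulations in cycles/kHz, temporal modulations in Hz.
    spec_mod = np.fft.fftshift(np.fft.fftfreq(log_S.shape[0], d=(f[1] - f[0]) / 1000.0))
    temp_mod = np.fft.fftshift(np.fft.fftfreq(log_S.shape[1], d=t[1] - t[0]))
    hi_t = np.abs(temp_mod) >= temp_cut_hz
    hi_s = np.abs(spec_mod) >= spec_cut_cyc_per_khz
    temporal_ratio = M[:, hi_t].sum() / M[:, ~hi_t].sum()
    spectral_ratio = M[hi_s, :].sum() / M[~hi_s, :].sum()
    return temporal_ratio, spectral_ratio
```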
Affiliation(s)
- Anna Marczyk
- Aix-Marseille Université, CNRS, LPL, UMR 7309, Aix-en-Provence, France
- Benjamin O'Brien
- Aix-Marseille Université, CNRS, LPL, UMR 7309, Aix-en-Provence, France
- Pascale Tremblay
- Université Laval, Faculté de Médecine, Département de Réadaptation, Quebec City, Québec G1V 0A6, Canada
- Alain Ghio
- Aix-Marseille Université, CNRS, LPL, UMR 7309, Aix-en-Provence, France
9
Yang L. Analysis of Erhu Performance Effect in Public Health Music Works Based on Artificial Intelligence Technology. J Environ Public Health 2022; 2022:9251793. PMID: 36089953; PMCID: PMC9458413; DOI: 10.1155/2022/9251793.
Abstract
With the rise of Erhu teaching in recent years, many people have taken up learning to play the Erhu. However, because instruction is costly and typically delivered in a one-to-one mode between teacher and student, Erhu education resources are scarce, and learning Erhu performance has become a luxury. With the rise of artificial intelligence, computer music is developing rapidly. Music has two important aspects, composition and performance: different instruments convey different styles, and players inject different rhythms and dynamics into their performance, producing rich expressiveness. Automatic evaluation of music performance, analogous to image style transfer, is an important problem in many fields of artificial intelligence. For an Erhu piece, many factors affect the quality of a performance, and many indices can be used to evaluate it, such as sense of rhythm, expressiveness, musicality, and style. Using a computer to simulate the evaluation process requires finding the mathematical relationship between the factors that affect a performance and the evaluation indices. A neural network is a mathematical model inspired by the way the human brain processes information; it places few requirements on data distribution, handles nonlinear data, and is robust and dynamic, making it well suited as the mathematical model of an evaluation system. Neural networks also have a strong theoretical basis, and their applications across industries are mature. This paper introduces a deep neural network model into the evaluation of Erhu performance, and the experimental results demonstrate the reliability and practicality of the method, providing a methodological basis and theoretical reference for evaluating Erhu performance.
Affiliation(s)
- Li Yang
- Music Department, Normal College, Changshu Institute of Technology, Changshu 215500, China
10
Vuust P, Heggli OA, Friston KJ, Kringelbach ML. Music in the brain. Nat Rev Neurosci 2022; 23:287-305. PMID: 35352057; DOI: 10.1038/s41583-022-00578-5.
Abstract
Music is ubiquitous across human cultures - as a source of affective and pleasurable experience, moving us both physically and emotionally - and learning to play music shapes both brain structure and brain function. Music processing in the brain - namely, the perception of melody, harmony and rhythm - has traditionally been studied as an auditory phenomenon using passive listening paradigms. However, when listening to music, we actively generate predictions about what is likely to happen next. This enactive aspect has led to a more comprehensive understanding of music processing involving brain structures implicated in action, emotion and learning. Here we review the cognitive neuroscience literature of music perception. We show that music perception, action, emotion and learning all rest on the human brain's fundamental capacity for prediction - as formulated by the predictive coding of music model. This Review elucidates how this formulation of music perception and expertise in individuals can be extended to account for the dynamics and underlying brain mechanisms of collective music making. This in turn has important implications for human creativity as evinced by music improvisation. These recent advances shed new light on what makes music meaningful from a neuroscientific perspective.
Affiliation(s)
- Peter Vuust
- Center for Music in the Brain, Aarhus University and The Royal Academy of Music (Det Jyske Musikkonservatorium), Aarhus, Denmark.
- Ole A Heggli
- Center for Music in the Brain, Aarhus University and The Royal Academy of Music (Det Jyske Musikkonservatorium), Aarhus, Denmark
- Karl J Friston
- Wellcome Centre for Human Neuroimaging, University College London, London, UK
- Morten L Kringelbach
- Center for Music in the Brain, Aarhus University and The Royal Academy of Music (Det Jyske Musikkonservatorium), Aarhus, Denmark
- Department of Psychiatry, University of Oxford, Oxford, UK
- Centre for Eudaimonia and Human Flourishing, Linacre College, University of Oxford, Oxford, UK
11
Abstract
White light can be decomposed into different colors, and a complex sound wave can be decomposed into its partials. While the physics behind transverse and longitudinal waves is quite different, and several theories have been developed to investigate the complexity of colors and timbres, we can try to model their structural similarities through the language of categories. We then consider color mixing and color transition in painting, comparing them with timbre superposition and timbre morphing in orchestration and computer music in light of bicategories and bigroupoids. Colors and timbres can serve as a probe to investigate relevant aspects of visual and auditory perception jointly with their connections. Thus, the categorical framework proposed here aims to investigate color/timbre perception and to inform computer science developments in this area.
12
Rozzi CA, Voltini A, Antonacci F, Nucci M, Grassi M. A listening experiment comparing the timbre of two Stradivari with other violins. J Acoust Soc Am 2022; 151:443. PMID: 35105053; DOI: 10.1121/10.0009320.
Abstract
The violins of Stradivari are recognized worldwide as an excellence in craftsmanship, a model for instrument makers, and an unattainable desire for collectors and players. However, despite the myth surrounding these instruments, blindfolded players tend to prefer playing modern violins. Here, we present a double-blind listening experiment aimed at analyzing and comparatively rating the sound timbre of violins. The mythic instruments were presented among other violins of both high and lesser repute. Seventy listeners (violin makers of the Cremona area) rated the timbre difference between simple musical scales played on a test violin and a reference violin, and their preference converged on one particular Stradivari. The acoustical measurements revealed some similarities between the subjective ratings and the physical characteristics of the violins. We speculate that the myth of Stradivari could have been boosted, among other factors, by the specimens of superior tonal quality, which favourably biased judgments of his instruments and spread to the whole of the maker's production. These results contribute to the understanding of violin timbre and suggest which characteristics relate to the pleasantness of the timbre.
Affiliation(s)
- Alessandro Voltini
- Cremona International Violin Making School "A. Stradivari," Cremona, Italy
- Fabio Antonacci
- Dipartimento di Elettronica, Informazione e Bioingegneria, Politecnico di Milano, Milano, Italy
- Massimo Nucci
- Dipartimento di Psicologia Generale, Università di Padova, Padova, Italy
- Massimo Grassi
- Dipartimento di Psicologia Generale, Università di Padova, Padova, Italy
13
Abstract
Perception adapts to the properties of prior stimulation, as illustrated by phenomena such as visual color constancy or speech context effects. In the auditory domain, little is known about adaptive processes when it comes to the attribute of auditory brightness. Here, we report an experiment that tests whether listeners adapt to spectral colorations imposed on naturalistic music and speech excerpts. Our results indicate consistent contrastive adaptation of auditory brightness judgments on a trial-by-trial basis. The pattern of results suggests that these effects tend to grow with an increase in the duration of the adaptor context but level off after around 8 trials of 2 s duration. A simple model of the response criterion yields a correlation of r = .97 with the measured data and corroborates the notion that brightness perception adapts on timescales that fall in the range of auditory short-term memory. Effects turn out to be similar for spectral filtering based on linear spectral filter slopes and filtering based on a measured transfer function from a commercially available hearing device. Overall, our findings demonstrate the adaptivity of auditory brightness perception under realistic acoustical conditions.
Affiliation(s)
- Kai Siedenburg
- Department of Medical Physics and Acoustics, Carl von Ossietzky University of Oldenburg, Oldenburg, Germany.
- Feline Malin Barg
- Department of Medical Physics and Acoustics, Carl von Ossietzky University of Oldenburg, Oldenburg, Germany
- Henning Schepker
- Department of Medical Physics and Acoustics, Carl von Ossietzky University of Oldenburg, Oldenburg, Germany
- Starkey Hearing, Eden Prairie, MN, USA
14
Siedenburg K, Jacobsen S, Reuter C. Spectral envelope position and shape in sustained musical instrument sounds. J Acoust Soc Am 2021; 149:3715. PMID: 34241486; DOI: 10.1121/10.0005088.
Abstract
It has been argued that the relative position of spectral envelopes along the frequency axis serves as a cue for musical instrument size (e.g., violin vs viola) and that the shape of the spectral envelope encodes family identity (violin vs flute). It is further known that fundamental frequency (F0), F0-register for specific instruments, and dynamic level strongly affect spectral properties of acoustical instrument sounds. However, the associations between these factors have not been rigorously quantified for a representative set of musical instruments. Here, we analyzed 5640 sounds from 50 sustained orchestral instruments sampled across their entire range of F0s at three dynamic levels. Regression of spectral centroid (SC) values that index envelope position indicated that smaller instruments possessed higher SC values for a majority of instrument classes (families), but SC also correlated with F0 and was strongly and consistently affected by the dynamic level. Instrument classification using relatively low-dimensional cepstral audio descriptors allowed for discrimination between instrument classes with accuracies beyond 80%. Envelope shape became much less indicative of instrument class whenever the classification problem involved generalization to different dynamic levels or F0-registers. These analyses confirm that spectral envelopes encode information about instrument size and family identity and highlight their dependence on F0(-register) and dynamic level.
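The spectral centroid used here to index envelope position is conventionally defined as the amplitude-weighted mean frequency of the magnitude spectrum. The sketch below computes it for a short sound; frame length and windowing are assumptions, and the study's cepstral descriptors are not reproduced.

```python
import numpy as np

def spectral_centroid(x, sr, n_fft=4096):
    """Spectral centroid in Hz of a (mono) signal frame or short sound."""
    window = np.hanning(min(len(x), n_fft))
    spectrum = np.abs(np.fft.rfft(x[:len(window)] * window, n=n_fft))
    freqs = np.fft.rfftfreq(n_fft, d=1.0 / sr)
    return np.sum(freqs * spectrum) / np.sum(spectrum)

# Example: a 440 Hz tone plus a weaker octave yields a centroid between the partials.
sr = 44100
t = np.arange(0, 0.5, 1 / sr)
tone = np.sin(2 * np.pi * 440 * t) + 0.3 * np.sin(2 * np.pi * 880 * t)
print(round(spectral_centroid(tone, sr), 1))
```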
Affiliation(s)
- Kai Siedenburg
- Department of Medical Physics and Acoustics, Carl von Ossietzky University of Oldenburg, 26129 Oldenburg, Germany
- Simon Jacobsen
- Department of Medical Physics and Acoustics, Carl von Ossietzky University of Oldenburg, 26129 Oldenburg, Germany
- Christoph Reuter
- Department of Musicology, University of Vienna, 1090 Vienna, Austria