1.
Reybrouck M, Podlipniak P, Welch D. Music Listening as Coping Behavior: From Reactive Response to Sense-Making. Behav Sci (Basel) 2020; 10:E119. PMID: 32698450; PMCID: PMC7407588; DOI: 10.3390/bs10070119.
Abstract
Coping is a survival mechanism of living organisms. It is not merely reactive, but also involves making sense of the environment by rendering sensory information into percepts that have meaning in the context of an organism's cognitions. Music listening, on the other hand, is a complex task that embraces sensory, physiological, behavioral, and cognitive levels of processing. Being both a dispositional process that relies on our evolutionary toolkit for coping with the world and a more elaborate skill for sense-making, it goes beyond primitive action-reaction couplings by the introduction of higher-order intermediary variables between sensory input and effector reactions. Consideration of music listening from the perspective of coping treats music as a sound environment and listening as a process that involves exploration of this environment as well as interactions with the sounds. Several issues are considered in this regard, such as the conception of music as a possible stressor, the role of adaptive listening, the relation between coping and reward, the importance of self-regulation strategies in the selection of music, and the instrumental meaning of music in the sense that it can be used to modify the internal and external environment of the listener.
Affiliation(s)
- Mark Reybrouck
- Musicology Research Group, Faculty of Arts, KU Leuven-University of Leuven, 3000 Leuven, Belgium
- IPEM, Department of Art History, Musicology and Theatre Studies, 9000 Ghent, Belgium
- Piotr Podlipniak
- Institute of Musicology, Adam Mickiewicz University in Poznań, 61–712 Poznań, Poland
- David Welch
- Institute Audiology Section, School of Population Health, University of Auckland, 2011 Auckland, New Zealand
2.
Bordonné T, Kronland-Martinet R, Ystad S, Derrien O, Aramaki M. Exploring sound perception through vocal imitations. J Acoust Soc Am 2020; 147:3306. PMID: 32486800; DOI: 10.1121/10.0001224.
Abstract
Understanding how sounds are perceived and interpreted is an important challenge for researchers dealing with auditory perception. The ecological approach to perception suggests that the salient perceptual information that enables a listener to recognize events through sounds is contained in specific structures called invariants. Identifying such invariants is of interest from a fundamental point of view, to better understand auditory perception, and is also useful for including perceptual considerations in the modeling and control of sounds. Among the different approaches used to identify perceptually relevant sound structures, vocal imitations are believed to bring a fresh perspective to the field. The main goal of this paper is to better understand how invariants are transmitted through vocal imitations. A sound corpus containing different types of known invariants obtained from an existing synthesizer was established. Participants took part in a test where they were asked to imitate the sound corpus. A continuous and sparse model adapted to the specificities of the vocal imitations was then developed and used to analyze the imitations. Results show that participants were able to highlight salient elements of the sounds that partially correspond to the invariants used in the sound corpus. This study also confirms that vocal imitations reveal how these invariants are transmitted through perception, and offers promising perspectives for auditory investigations.
Affiliation(s)
- Thomas Bordonné
- Aix Marseille Univ., CNRS, PRISM (Perception, Representations, Image, Sound, Music), 31 Chemin J. Aiguier, CS 70071, 13402 Marseille Cedex 20, France
- Richard Kronland-Martinet
- Aix Marseille Univ., CNRS, PRISM (Perception, Representations, Image, Sound, Music), 31 Chemin J. Aiguier, CS 70071, 13402 Marseille Cedex 20, France
- Sølvi Ystad
- Aix Marseille Univ., CNRS, PRISM (Perception, Representations, Image, Sound, Music), 31 Chemin J. Aiguier, CS 70071, 13402 Marseille Cedex 20, France
- Olivier Derrien
- Aix Marseille Univ., CNRS, PRISM (Perception, Representations, Image, Sound, Music), 31 Chemin J. Aiguier, CS 70071, 13402 Marseille Cedex 20, France
- Mitsuko Aramaki
- Aix Marseille Univ., CNRS, PRISM (Perception, Representations, Image, Sound, Music), 31 Chemin J. Aiguier, CS 70071, 13402 Marseille Cedex 20, France
3.
Mehrabi A, Dixon S, Sandler M. Vocal imitation of percussion sounds: On the perceptual similarity between imitations and imitated sounds. PLoS One 2019; 14:e0219955. PMID: 31344080; PMCID: PMC6657857; DOI: 10.1371/journal.pone.0219955.
Abstract
Recent studies have demonstrated the effectiveness of the voice for communicating sonic ideas, and the accuracy with which it can be used to imitate acoustic instruments, synthesised sounds and environmental sounds. However, there has been little research on vocal imitation of percussion sounds, particularly concerning the perceptual similarity between imitations and the sounds being imitated. In the present study we address this by investigating how accurately musicians can vocally imitate percussion sounds, in terms of whether listeners consider the imitations 'more similar' to the imitated sounds than to other same-category sounds. In a vocal production task, 14 musicians imitated 30 drum sounds from five categories (cymbals, hats, kicks, snares, toms). Listeners were then asked to rate the similarity between the imitations and same-category drum sounds via a web-based listening test. We found that imitated sounds received the highest similarity ratings for 16 of the 30 sounds. The similarity between a given drum sound and its imitation was generally rated higher than for imitations of another same-category sound; however, for some drum categories (snares and toms) certain sounds were consistently considered most similar to the imitations, irrespective of the sound being imitated. Finally, we applied an existing auditory-image-based measure of perceptual similarity between same-category drum sounds to model the similarity ratings using linear mixed-effects regression. The results indicate that this measure is a good predictor of perceptual similarity between imitations and imitated sounds, when compared to acoustic features containing only temporal or spectral information.
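The final modelling step in the abstract above, predicting similarity ratings from an acoustic distance measure, can be sketched in much-simplified form. The data below are synthetic, and the single least-squares fit stands in for the paper's linear mixed-effects regression, which additionally models per-listener and per-sound random effects:

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in data: one acoustic-distance value per (imitation, drum) pair,
# and a similarity rating that decreases as acoustic distance grows.
distance = rng.uniform(0.0, 1.0, 200)
rating = 5.0 - 3.0 * distance + rng.normal(0.0, 0.3, 200)

# Ordinary least-squares line: a fixed-effects-only simplification of the
# mixed-effects model used in the study.
slope, intercept = np.polyfit(distance, rating, 1)
print(slope, intercept)  # slope is clearly negative: larger distance, lower rating
```

A negative, well-separated slope is what "good predictor" means operationally here; the mixed-effects version would additionally absorb systematic differences between listeners before estimating it.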
Affiliation(s)
- Adib Mehrabi
- Department of Linguistics, Queen Mary University of London, London, England
- School of Electronic Engineering and Computer Science, Queen Mary University of London, London, England
- Simon Dixon
- Department of Linguistics, Queen Mary University of London, London, England
- Mark Sandler
- Department of Linguistics, Queen Mary University of London, London, England
4.
Abstract
People have long pondered the evolution of language and the origin of words. Here, we investigate how conventional spoken words might emerge from imitations of environmental sounds. Does the repeated imitation of an environmental sound gradually give rise to more word-like forms? In what ways do these forms resemble the original sounds that motivated them (i.e. exhibit iconicity)? Participants played a version of the children's game 'Telephone'. The first generation of participants imitated recognizable environmental sounds (e.g. glass breaking, water splashing). Subsequent generations imitated the previous generation of imitations for a maximum of eight generations. The results showed that the imitations became more stable and word-like, and later imitations were easier to learn as category labels. At the same time, even after eight generations, both spoken imitations and their written transcriptions could be matched above chance to the category of environmental sound that motivated them. These results show how repeated imitation can create progressively more word-like forms while continuing to retain a resemblance to the original sound that motivated them, and speak to the possible role of human vocal imitation in explaining the origins of at least some spoken words.
Affiliation(s)
- Pierce Edmiston
- Department of Psychology, University of Wisconsin-Madison, 1202 West Johnson Street, Madison, WI 53703, USA
- Marcus Perlman
- Department of English Language and Applied Linguistics, University of Birmingham, Birmingham, UK
- Gary Lupyan
- Department of Psychology, University of Wisconsin-Madison, 1202 West Johnson Street, Madison, WI 53703, USA
5.
Perlman M, Lupyan G. People Can Create Iconic Vocalizations to Communicate Various Meanings to Naïve Listeners. Sci Rep 2018; 8:2634. PMID: 29422530; PMCID: PMC5805706; DOI: 10.1038/s41598-018-20961-6.
Abstract
The innovation of iconic gestures is essential to establishing the vocabularies of signed languages, but might iconicity also play a role in the origin of spoken words? Can people create novel vocalizations that are comprehensible to naïve listeners without prior convention? We launched a contest in which participants submitted non-linguistic vocalizations for 30 meanings spanning actions, humans, animals, inanimate objects, properties, quantifiers and demonstratives. The winner was determined by the ability of naïve listeners to infer the meanings of the vocalizations. We report a series of experiments and analyses that evaluated the vocalizations for: (1) comprehensibility to naïve listeners; (2) the degree to which they were iconic; (3) agreement between producers and listeners in iconicity; and (4) whether iconicity helps listeners learn the vocalizations as category labels. The results show contestants were able to create successful iconic vocalizations for most of the meanings, which were largely comprehensible to naïve listeners, and easier to learn as category labels. These findings demonstrate how iconic vocalizations can enable interlocutors to establish understanding in the absence of conventions. They suggest that, prior to the advent of full-blown spoken languages, people could have used iconic vocalizations to ground a spoken vocabulary with considerable semantic breadth.
Affiliation(s)
- Marcus Perlman
- Department of English Language and Applied Linguistics, University of Birmingham, Birmingham, UK.
- Gary Lupyan
- Department of Psychology, University of Wisconsin-Madison, Madison, USA
6.
Rising tones and rustling noises: Metaphors in gestural depictions of sounds. PLoS One 2017; 12:e0181786. PMID: 28750071; PMCID: PMC5547699; DOI: 10.1371/journal.pone.0181786.
Abstract
Communicating an auditory experience with words is a difficult task and, in consequence, people often rely on imitative non-verbal vocalizations and gestures. This work explored the combination of such vocalizations and gestures to communicate auditory sensations and representations elicited by non-vocal everyday sounds. Whereas our previous studies have analyzed vocal imitations, the present research focused on gestural depictions of sounds. To this end, two studies investigated the combination of gestures and non-verbal vocalizations. A first, observational study examined a set of vocal and gestural imitations of recordings of sounds representative of a typical everyday environment (ecological sounds) with manual annotations. A second, experimental study used non-ecological sounds whose parameters had been specifically designed to elicit the behaviors highlighted in the observational study, and used quantitative measures and inferential statistics. The results showed that these depicting gestures are based on systematic analogies between a referent sound, as interpreted by a receiver, and the visual aspects of the gestures: auditory-visual metaphors. The results also suggested a different role for vocalizations and gestures. Whereas the vocalizations reproduce all features of the referent sounds as faithfully as vocally possible, the gestures focus on one salient feature with metaphors based on auditory-visual correspondences. Both studies highlighted two metaphors consistently shared across participants: the spatial metaphor of pitch (mapping different pitches to different positions on the vertical dimension), and the rustling metaphor of random fluctuations (rapid shaking of the hands and fingers). We interpret these metaphors as the result of two kinds of representations elicited by sounds: auditory sensations (pitch and loudness) mapped to spatial position, and causal representations of the sound sources (e.g. rain drops, rustling leaves) pantomimed and embodied by the participants' gestures.
7.
Mehrabi A, Dixon S, Sandler MB. Vocal imitation of synthesised sounds varying in pitch, loudness and spectral centroid. J Acoust Soc Am 2017; 141:783. PMID: 28253682; DOI: 10.1121/1.4974825.
Abstract
Vocal imitations are often used to convey sonic ideas [Lemaitre, Dessein, Susini, and Aura. (2011). Ecol. Psych. 23(4), 267-307]. For computer-based systems to interpret these vocalisations, it is advantageous to apply knowledge of what happens when people vocalise sounds whose acoustic features have different temporal envelopes. In the present study, 19 experienced musicians and music producers were asked to imitate 44 sounds with one or two feature envelopes applied. The study addresses two main questions: (1) How accurately can people imitate ramp and modulation envelopes for pitch, loudness, and spectral centroid? (2) What happens to this accuracy when people are asked to imitate two feature envelopes simultaneously? The results show that experienced musicians can imitate pitch, loudness, and spectral centroid accurately, and that imitation accuracy is generally preserved when the imitated stimuli combine two, not necessarily congruent, features. This demonstrates the viability of using the voice as a natural means of expressing time series of two features simultaneously.
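Of the three imitated features, spectral centroid is the least self-explanatory: it is the magnitude-weighted mean frequency of the spectrum and correlates with perceived brightness or sharpness. A minimal sketch of the computation (the function name and the 1 kHz test tone are illustrative, not taken from the study):

```python
import numpy as np

def spectral_centroid(signal, sr):
    """Magnitude-weighted mean frequency (Hz) of the signal's spectrum."""
    mag = np.abs(np.fft.rfft(signal))
    freqs = np.fft.rfftfreq(len(signal), d=1.0 / sr)
    return float(np.sum(freqs * mag) / np.sum(mag))

sr = 16000
t = np.arange(sr) / sr                 # one second of audio
tone = np.sin(2 * np.pi * 1000.0 * t)  # 1 kHz pure tone
print(spectral_centroid(tone, sr))     # a pure tone's centroid sits at its frequency, ~1000 Hz
```

A ramp or modulation envelope on this feature, as used in the stimuli above, amounts to making the centroid a function of time, typically measured frame by frame over short windows.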
Affiliation(s)
- Adib Mehrabi
- Centre for Digital Music, School of Electronic Engineering and Computer Science, Queen Mary University of London, London, United Kingdom
- Simon Dixon
- Centre for Digital Music, School of Electronic Engineering and Computer Science, Queen Mary University of London, London, United Kingdom
- Mark B Sandler
- Centre for Digital Music, School of Electronic Engineering and Computer Science, Queen Mary University of London, London, United Kingdom
8.
Lemaitre G, Houix O, Voisin F, Misdariis N, Susini P. Vocal Imitations of Non-Vocal Sounds. PLoS One 2016; 11:e0168167. PMID: 27992480; PMCID: PMC5161510; DOI: 10.1371/journal.pone.0168167.
Abstract
Imitative behaviors are widespread in humans, in particular whenever two persons communicate and interact. Several tokens of spoken languages (onomatopoeias, ideophones, and phonesthemes) also display different degrees of iconicity between the sound of a word and what it refers to. Thus, it probably comes as no surprise that human speakers use a lot of imitative vocalizations and gestures when they communicate about sounds, as sounds are notably difficult to describe. What is more surprising is that vocal imitations of non-vocal everyday sounds (e.g. the sound of a car passing by) are in practice very effective: listeners identify sounds better with vocal imitations than with verbal descriptions, despite the fact that vocal imitations are inaccurate reproductions of a sound created by a particular mechanical system (e.g. a car driving by) through a different system (the voice apparatus). The present study investigated the semantic representations evoked by vocal imitations of sounds by experimentally quantifying how well listeners could match sounds to category labels. The experiment used three different types of sounds: recordings of easily identifiable sounds (sounds of human actions and manufactured products), human vocal imitations, and computational "auditory sketches" (created by algorithmic computations). The results show that performance with the best vocal imitations was similar to the best auditory sketches for most categories of sounds, and even to the referent sounds themselves in some cases. More detailed analyses showed that the acoustic distance between a vocal imitation and a referent sound is not sufficient to account for such performance. Analyses suggested that instead of trying to reproduce the referent sound as accurately as vocally possible, vocal imitations focus on a few important features, which depend on each particular sound category. These results offer perspectives for understanding how human listeners store and access long-term sound representations, and set the stage for the development of human-computer interfaces based on vocalizations.
Affiliation(s)
- Guillaume Lemaitre
- Equipe Perception et Design Sonores, STMS-IRCAM-CNRS-UPMC, Institut de Recherche et de Coordination Acoustique Musique, Paris, France
- Olivier Houix
- Equipe Perception et Design Sonores, STMS-IRCAM-CNRS-UPMC, Institut de Recherche et de Coordination Acoustique Musique, Paris, France
- Frédéric Voisin
- Equipe Perception et Design Sonores, STMS-IRCAM-CNRS-UPMC, Institut de Recherche et de Coordination Acoustique Musique, Paris, France
- Nicolas Misdariis
- Equipe Perception et Design Sonores, STMS-IRCAM-CNRS-UPMC, Institut de Recherche et de Coordination Acoustique Musique, Paris, France
- Patrick Susini
- Equipe Perception et Design Sonores, STMS-IRCAM-CNRS-UPMC, Institut de Recherche et de Coordination Acoustique Musique, Paris, France
9.
Lemaitre G, Jabbari A, Misdariis N, Houix O, Susini P. Vocal imitations of basic auditory features. J Acoust Soc Am 2016; 139:290-300. PMID: 26827025; DOI: 10.1121/1.4939738.
Abstract
Describing complex sounds with words is a difficult task. In fact, previous studies have shown that vocal imitations of sounds are more effective than verbal descriptions [Lemaitre and Rocchesso (2014). J. Acoust. Soc. Am. 135, 862-873]. The current study investigated how vocal imitations of sounds enable their recognition by studying how two expert and two lay participants reproduced four basic auditory features: pitch, tempo, sharpness, and onset. It used four sets of 16 referent sounds (modulated narrowband noises and pure tones), each based on one feature or crossing two of the four features. Dissimilarity rating experiments and multidimensional scaling analyses confirmed that listeners could accurately perceive the four features composing the four sets of referent sounds. The four participants recorded vocal imitations of the four sets of sounds. Analyses identified three strategies: (1) vocal imitations of pitch and tempo reproduced the absolute value of the feature faithfully; (2) vocal imitations of sharpness transposed the feature into the participants' registers; (3) vocal imitations of onsets categorized the continuum of onset values into two discrete morphological profiles. Overall, these results highlight that vocal imitations do not simply mimic the referent sounds, but seek to emphasize the characteristic features of the referent sounds within the constraints of human vocal production.
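The multidimensional scaling analyses mentioned above recover a spatial configuration from pairwise dissimilarity judgments. As an illustration only (the four "sounds" and their dissimilarities below are hypothetical, and classical Torgerson MDS is one standard variant, not necessarily the one used in the study):

```python
import numpy as np

def classical_mds(D, k=1):
    """Classical (Torgerson) MDS: embed an n x n dissimilarity matrix in k dimensions."""
    n = D.shape[0]
    J = np.eye(n) - np.ones((n, n)) / n   # centring matrix
    B = -0.5 * J @ (D ** 2) @ J           # double-centred Gram matrix
    vals, vecs = np.linalg.eigh(B)
    order = np.argsort(vals)[::-1][:k]    # keep the k largest eigenvalues
    return vecs[:, order] * np.sqrt(np.maximum(vals[order], 0.0))

# Hypothetical dissimilarities for four sounds lying on a single pitch continuum.
pitch = np.array([100.0, 200.0, 300.0, 400.0])
D = np.abs(pitch[:, None] - pitch[None, :])
coords = classical_mds(D, k=1)[:, 0]
print(coords)  # evenly spaced 1-D positions recovering the pitch ordering (sign is arbitrary)
```

When the recovered configuration is low-dimensional and ordered like an underlying feature, as here, that is the kind of evidence the abstract describes: listeners' dissimilarity judgments accurately reflect the feature.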
Affiliation(s)
- Guillaume Lemaitre
- STMS-IRCAM-CNRS-UPMC, Equipe Perception et Design Sonores, Paris, France
- Ali Jabbari
- STMS-IRCAM-CNRS-UPMC, Equipe Perception et Design Sonores, Paris, France
- Nicolas Misdariis
- STMS-IRCAM-CNRS-UPMC, Equipe Perception et Design Sonores, Paris, France
- Olivier Houix
- STMS-IRCAM-CNRS-UPMC, Equipe Perception et Design Sonores, Paris, France
- Patrick Susini
- STMS-IRCAM-CNRS-UPMC, Equipe Perception et Design Sonores, Paris, France