1
Hansen NC, Reymore L. Timbral cues underlie instrument-specific absolute pitch in expert oboists. PLoS One 2024;19:e0306974. PMID: 39361623; PMCID: PMC11449301; DOI: 10.1371/journal.pone.0306974.
Abstract
While absolute pitch (AP)-the ability to identify musical pitches without external reference-is rare even in professional musicians, anecdotal evidence and case-report data suggest that some musicians without traditional AP can nonetheless better name notes played on their musical instrument of expertise than notes played on instruments less familiar to them. We have called this gain in AP ability "instrument-specific absolute pitch" (ISAP). Here, we report the results of the first two experiments designed to investigate ISAP in professional oboists. In Experiment 1 (n = 40), superiority for identifying the pitch of oboe over piano tones varied along a continuum, with 37.5% of oboists demonstrating significant ISAP. Variance in accuracy across pitches was higher among ISAP-possessors than ISAP-non-possessors, suggestive of internalized timbral idiosyncrasies, and the use of timbral cues was the second-most commonly reported task strategy. For both timbres, both groups performed more accurately for pitches associated with white than black piano keys. In Experiment 2 (n = 12), oboists with ISAP were less accurate in pitch identification when oboe tones were artificially pitch-shifted. The use of timbral idiosyncrasies thus may constitute a widespread mechanism of ISAP. Motor interference, conversely, did not significantly reduce accuracy. This study offers the first evidence that ISAP occurs among highly trained musicians and that reliance on subtle timbral (or intonational) idiosyncrasies may constitute an underlying mechanism of this ability in expert oboists. This provides a path forward for future studies extending the scientific understanding of ISAP to other instrument types, expertise levels, and musical contexts. More generally, this may deepen knowledge of specialized expertise, representing a range of implicit abilities that are not addressed directly in training, but which may develop through practice of a related skill set.
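Experiment 2's artificial pitch-shifting rests on the standard relation between a shift in cents and a frequency ratio (100 cents = 1 semitone, so ratio = 2^(cents/1200)). A minimal Python sketch, purely illustrative and not the authors' stimulus pipeline, shifts pitch by naive resampling, which transposes timbral idiosyncrasies along with the pitch:

```python
import numpy as np

def cents_to_ratio(cents: float) -> float:
    """Convert a pitch shift in cents to a frequency ratio (100 cents = 1 semitone)."""
    return 2.0 ** (cents / 1200.0)

def pitch_shift(signal: np.ndarray, cents: float) -> np.ndarray:
    """Naive pitch shift by resampling: raises (or lowers) the pitch while
    shortening (or stretching) the duration, transposing all spectral detail."""
    ratio = cents_to_ratio(cents)
    n_out = int(round(len(signal) / ratio))
    # Read the input at a faster (ratio > 1) or slower (ratio < 1) rate.
    positions = np.arange(n_out) * ratio
    return np.interp(positions, np.arange(len(signal)), signal)
```

Note that resampling also changes duration and shifts formant-like spectral features together; dedicated pitch-shift algorithms (e.g., phase vocoders) decouple these, but the simple version suffices to illustrate the cents-to-ratio arithmetic.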
Affiliation(s)
- Niels Chr. Hansen
- Aarhus Institute of Advanced Studies, Aarhus University, Aarhus, Denmark
- Centre of Excellence in Music, Mind, Body, & Brain, Department of Music, Art and Culture Studies, University of Jyväskylä, Jyväskylä, Finland
- Interacting Minds Centre, School of Culture and Society, Aarhus University, Aarhus, Denmark
- Royal Academy of Music, Aarhus/Aalborg, Denmark
- Lindsey Reymore
- Schulich School of Music, McGill University, Montreal, Canada
- School of Music, Dance, and Theatre, Arizona State University, Tempe, AZ, United States of America
2
Saitis C, Wallmark Z. Timbral brightness perception investigated through multimodal interference. Atten Percept Psychophys 2024;86:1835-1845. PMID: 39090510; PMCID: PMC11410849; DOI: 10.3758/s13414-024-02934-2.
Abstract
Brightness is among the most studied aspects of timbre perception. Psychoacoustically, sounds described as "bright" versus "dark" typically exhibit a high versus low frequency emphasis in the spectrum. However, relatively little is known about the neurocognitive mechanisms that facilitate these metaphors we listen with. Do they originate in universal magnitude representations common to more than one sensory modality? Triangulating three different interaction paradigms, we used speeded classification to investigate whether intramodal, crossmodal, and amodal interference occurs when timbral brightness (modeled by the centroid of the spectral envelope) and pitch height, visual brightness, or numerical value are semantically congruent or incongruent. In four online experiments varying in priming strategy, onset timing, and response deadline, 189 total participants were presented with a baseline stimulus (a pitch, gray square, or numeral), then asked to quickly identify a target stimulus that is higher/lower, brighter/darker, or greater/less than the baseline after being primed with a bright or dark synthetic harmonic tone. Results suggest that timbral brightness modulates the perception of pitch and possibly visual brightness, but not numerical value. Semantically incongruent pitch height-timbral brightness shifts produced significantly slower reaction times (RTs) and higher error rates than congruent pairs. In the visual task, incongruent pairings of gray squares and tones elicited slower RTs than congruent pairings (in two experiments). No interference was observed in the number comparison task. These findings shed light on the embodied and multimodal nature of experiencing timbre.
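Timbral brightness is modeled here by the spectral centroid, the amplitude-weighted mean frequency of the spectrum. A minimal sketch (illustrative only; actual stimulus-analysis pipelines typically window the signal and may weight by power rather than magnitude):

```python
import numpy as np

def spectral_centroid(signal: np.ndarray, sample_rate: float) -> float:
    """Amplitude-weighted mean frequency of the magnitude spectrum, in Hz.
    Higher values correspond to perceptually 'brighter' timbres."""
    magnitudes = np.abs(np.fft.rfft(signal))
    freqs = np.fft.rfftfreq(len(signal), d=1.0 / sample_rate)
    return float(np.sum(freqs * magnitudes) / np.sum(magnitudes))
```

For a pure tone the centroid sits at the tone's frequency; adding energy at higher harmonics raises it, which is the sense in which the sound becomes "brighter".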
Affiliation(s)
- Zachary Wallmark
- School of Music and Dance and Center for Translational Neuroscience, University of Oregon, Eugene, OR, USA
3
Shorey AE, King CJ, Whiteford KL, Stilp CE. Musical training is not associated with spectral context effects in instrument sound categorization. Atten Percept Psychophys 2024;86:991-1007. PMID: 38216848; DOI: 10.3758/s13414-023-02839-6.
Abstract
Musicians display a variety of auditory perceptual benefits relative to people with little or no musical training; these benefits are collectively referred to as the "musician advantage." Importantly, musicians consistently outperform nonmusicians for tasks relating to pitch, but there are mixed reports as to musicians outperforming nonmusicians for timbre-related tasks. Due to their experience manipulating the timbre of their instrument or voice in performance, we hypothesized that musicians would be more sensitive to acoustic context effects stemming from the spectral changes in timbre across a musical context passage (played by a string quintet then filtered) and a target instrument sound (French horn or tenor saxophone; Experiment 1). Additionally, we investigated the role of a musician's primary instrument of instruction by recruiting French horn and tenor saxophone players to also complete this task (Experiment 2). Consistent with the musician advantage literature, musicians exhibited superior pitch discrimination to nonmusicians. Contrary to our main hypothesis, there was no difference between musicians and nonmusicians in how spectral context effects shaped instrument sound categorization. Thus, musicians may only outperform nonmusicians for some auditory skills relevant to music (e.g., pitch perception) but not others (e.g., timbre perception via spectral differences).
Affiliation(s)
- Anya E Shorey
- Department of Psychological and Brain Sciences, University of Louisville, Louisville, KY, 40292, USA
- Caleb J King
- Department of Psychological and Brain Sciences, University of Louisville, Louisville, KY, 40292, USA
- Kelly L Whiteford
- Department of Psychology, University of Minnesota, Minneapolis, MN, 55455, USA
- Christian E Stilp
- Department of Psychological and Brain Sciences, University of Louisville, Louisville, KY, 40292, USA
4
Carney LH. Neural Fluctuation Contrast as a Code for Complex Sounds: The Role and Control of Peripheral Nonlinearities. Hear Res 2024;443:108966. PMID: 38310710; PMCID: PMC10923127; DOI: 10.1016/j.heares.2024.108966.
Abstract
The nonlinearities of the inner ear are often considered to be obstacles that the central nervous system has to overcome to decode neural responses to sounds. This review describes how peripheral nonlinearities, such as saturation of the inner-hair-cell response and of the IHC-auditory-nerve synapse, are instead beneficial to the neural encoding of complex sounds such as speech. These nonlinearities set up contrast in the depth of neural fluctuations in auditory-nerve responses along the tonotopic axis, referred to here as neural fluctuation contrast (NFC). Physiological support for the NFC coding hypothesis is reviewed, and predictions of several psychophysical phenomena, including masked detection and speech intelligibility, are presented. Lastly, a framework based on the NFC code for understanding how the medial olivocochlear (MOC) efferent system contributes to the coding of complex sounds is presented. By modulating cochlear gain control in response to both sound energy and fluctuations in neural responses, the MOC system is hypothesized to function not as a simple feedback gain-control device, but rather as a mechanism for enhancing NFC along the tonotopic axis, enabling robust encoding of complex sounds across a wide range of sound levels and in the presence of background noise. Effects of sensorineural hearing loss on the NFC code and on the MOC feedback system are presented and discussed.
Affiliation(s)
- Laurel H Carney
- Depts. of Biomedical Engineering, Neuroscience, and Electrical & Computer Engineering, University of Rochester, Rochester, NY, USA
5
Sankaran N, Leonard MK, Theunissen F, Chang EF. Encoding of melody in the human auditory cortex. Sci Adv 2024;10:eadk0010. PMID: 38363839; PMCID: PMC10871532; DOI: 10.1126/sciadv.adk0010.
Abstract
Melody is a core component of music in which discrete pitches are serially arranged to convey emotion and meaning. Perception varies along several pitch-based dimensions: (i) the absolute pitch of notes, (ii) the difference in pitch between successive notes, and (iii) the statistical expectation of each note given prior context. How the brain represents these dimensions and whether their encoding is specialized for music remains unknown. We recorded high-density neurophysiological activity directly from the human auditory cortex while participants listened to Western musical phrases. Pitch, pitch-change, and expectation were selectively encoded at different cortical sites, indicating a spatial map for representing distinct melodic dimensions. The same participants listened to spoken English, and we compared responses to music and speech. Cortical sites selective for music encoded expectation, while sites that encoded pitch and pitch-change in music used the same neural code to represent equivalent properties of speech. Findings reveal how the perception of melody recruits both music-specific and general-purpose sound representations.
Affiliation(s)
- Narayan Sankaran
- Department of Neurological Surgery, University of California, San Francisco, 675 Nelson Rising Lane, San Francisco, CA 94158, USA
- Matthew K. Leonard
- Department of Neurological Surgery, University of California, San Francisco, 675 Nelson Rising Lane, San Francisco, CA 94158, USA
- Frederic Theunissen
- Department of Psychology, University of California, Berkeley, 2121 Berkeley Way, Berkeley, CA 94720, USA
- Edward F. Chang
- Department of Neurological Surgery, University of California, San Francisco, 675 Nelson Rising Lane, San Francisco, CA 94158, USA
6
Sankaran N, Leonard MK, Theunissen F, Chang EF. Encoding of melody in the human auditory cortex. bioRxiv [Preprint] 2023:2023.10.17.562771. PMID: 37905047; PMCID: PMC10614915; DOI: 10.1101/2023.10.17.562771.
Abstract
Melody is a core component of music in which discrete pitches are serially arranged to convey emotion and meaning. Perception of melody varies along several pitch-based dimensions: (1) the absolute pitch of notes, (2) the difference in pitch between successive notes, and (3) the higher-order statistical expectation of each note conditioned on its prior context. While humans readily perceive melody, how these dimensions are collectively represented in the brain and whether their encoding is specialized for music remains unknown. Here, we recorded high-density neurophysiological activity directly from the surface of human auditory cortex while Western participants listened to Western musical phrases. Pitch, pitch-change, and expectation were selectively encoded at different cortical sites, indicating a spatial code for representing distinct dimensions of melody. The same participants listened to spoken English, and we compared evoked responses to music and speech. Cortical sites selective for music were systematically driven by the encoding of expectation. In contrast, sites that encoded pitch and pitch-change used the same neural code to represent equivalent properties of speech. These findings reveal the multidimensional nature of melody encoding, consisting of both music-specific and domain-general sound representations in auditory cortex.
Teaser: The human brain contains both general-purpose and music-specific neural populations for processing distinct attributes of melody.
7
Tillmann B, Graves JE, Talamini F, Lévêque Y, Fornoni L, Hoarau C, Pralus A, Ginzburg J, Albouy P, Caclin A. Auditory cortex and beyond: Deficits in congenital amusia. Hear Res 2023;437:108855. PMID: 37572645; DOI: 10.1016/j.heares.2023.108855.
Abstract
Congenital amusia is a neuro-developmental disorder of music perception and production, with the observed deficits contrasting with the sophisticated music processing reported for the general population. Musical deficits within amusia have been hypothesized to arise from altered pitch processing, with impairments in pitch discrimination and, notably, short-term memory. Here we review research investigating the behavioral and neural correlates of these deficits, in particular impairments in the encoding, retention, and recollection of pitch information, as well as how these impairments extend to the processing of pitch cues in speech and emotion. The impairments have been related to altered brain responses in a distributed fronto-temporal network, which can be observed also at rest. Neuroimaging studies revealed changes in connectivity patterns within this network and beyond, shedding light on the brain dynamics underlying auditory cognition. Interestingly, some studies revealed spared implicit pitch processing in congenital amusia, showing the power of implicit cognition in the music domain. Building on these findings, together with audiovisual integration and other beneficial mechanisms, we outline perspectives for training and rehabilitation and the future directions of this research domain.
Affiliation(s)
- Barbara Tillmann
- CNRS, INSERM, Centre de Recherche en Neurosciences de Lyon CRNL, Université Claude Bernard Lyon 1, UMR5292, U1028, F-69500, Bron, France; LEAD-CNRS UMR5022, Université Bourgogne Franche-Comté, Pôle AAFE, 11 Esplanade Erasme, 21000 Dijon, France
- Jackson E Graves
- Laboratoire des systèmes perceptifs, Département d'études cognitives, École normale supérieure, PSL University, Paris 75005, France
- Yohana Lévêque
- CNRS, INSERM, Centre de Recherche en Neurosciences de Lyon CRNL, Université Claude Bernard Lyon 1, UMR5292, U1028, F-69500, Bron, France
- Lesly Fornoni
- CNRS, INSERM, Centre de Recherche en Neurosciences de Lyon CRNL, Université Claude Bernard Lyon 1, UMR5292, U1028, F-69500, Bron, France
- Caliani Hoarau
- CNRS, INSERM, Centre de Recherche en Neurosciences de Lyon CRNL, Université Claude Bernard Lyon 1, UMR5292, U1028, F-69500, Bron, France
- Agathe Pralus
- CNRS, INSERM, Centre de Recherche en Neurosciences de Lyon CRNL, Université Claude Bernard Lyon 1, UMR5292, U1028, F-69500, Bron, France
- Jérémie Ginzburg
- CNRS, INSERM, Centre de Recherche en Neurosciences de Lyon CRNL, Université Claude Bernard Lyon 1, UMR5292, U1028, F-69500, Bron, France
- Philippe Albouy
- CERVO Brain Research Center, School of Psychology, Laval University, Québec, G1J 2G3, Canada; International Laboratory for Brain, Music and Sound Research (BRAMS), CRBLM, Montreal, QC, H2V 2J2, Canada
- Anne Caclin
- CNRS, INSERM, Centre de Recherche en Neurosciences de Lyon CRNL, Université Claude Bernard Lyon 1, UMR5292, U1028, F-69500, Bron, France
8
Kuo CY, Liu JW, Wang CH, Juan CH, Hsieh IH. The role of carrier spectral composition in the perception of musical pitch. Atten Percept Psychophys 2023;85:2083-2099. PMID: 37479873; DOI: 10.3758/s13414-023-02761-x.
Abstract
Temporal envelope fluctuations of natural sounds convey critical information for speech and music processing. In particular, musical pitch perception is assumed to be primarily underpinned by temporal envelope encoding. While increasing evidence demonstrates the importance of carrier fine structure to complex pitch perception, how carrier spectral information affects musical pitch perception is less clear. Here, transposed tones designed to convey identical envelope information across different carriers were used to assess the effects of carrier spectral composition on pitch discrimination and on musical-interval and melody identification. Results showed that pitch discrimination thresholds became lower (better) with increasing carrier frequency from 1 to 10 kHz, with performance comparable to that for pure sinusoids. Musical intervals and melodies defined by the periodicity of sine or harmonic-complex envelopes were identified with greater than 85% accuracy across carriers, even on a 10-kHz carrier. Moreover, interval and melody identification improved with increasing carrier frequency up to 6 kHz. Findings suggest a perceptual enhancement of temporal envelope information with increasing carrier spectral region in musical pitch processing, at least for frequencies up to 6 kHz. For carriers in the extended high-frequency region (8-20 kHz), the use of temporal envelope information for musical pitch processing may vary depending on task requirements. Collectively, these results suggest that the fidelity of temporal envelope information for musical pitch perception is greater than previously considered, with ecological implications.
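Transposed tones of the kind described can be constructed by half-wave rectifying a low-frequency sinusoid, low-pass filtering it, and multiplying the result onto a high-frequency carrier, so the envelope periodicity is preserved across carriers. A hedged sketch (filter order and cutoff are illustrative choices, not the authors' exact synthesis parameters):

```python
import numpy as np
from scipy.signal import butter, filtfilt

def transposed_tone(env_freq, carrier_freq, sample_rate, duration):
    """Transposed tone: impose the temporal envelope of a low-frequency sinusoid
    onto a high-frequency carrier (half-wave rectify, low-pass, then multiply)."""
    t = np.arange(int(sample_rate * duration)) / sample_rate
    envelope = np.maximum(np.sin(2 * np.pi * env_freq * t), 0.0)  # half-wave rectify
    # Low-pass the rectified modulator to remove components near the carrier.
    b, a = butter(4, 0.2 * carrier_freq / (sample_rate / 2))
    envelope = filtfilt(b, a, envelope)
    return envelope * np.sin(2 * np.pi * carrier_freq * t)
```

The resulting spectrum is centered on the carrier with sidebands at multiples of the envelope frequency, which is why pitch must be conveyed by the envelope rather than by resolved low-frequency components.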
Affiliation(s)
- Chao-Yin Kuo
- Institute of Cognitive Neuroscience, National Central University, No. 300, Zhongda Rd., Zhongli District, Taoyuan City, 320317, Taiwan
- Department of Otolaryngology-Head and Neck Surgery, Tri-Service General Hospital, National Defense Medical Center, Taipei City, Taiwan
- Jia-Wei Liu
- Institute of Cognitive Neuroscience, National Central University, No. 300, Zhongda Rd., Zhongli District, Taoyuan City, 320317, Taiwan
- Chih-Hung Wang
- Department of Otolaryngology-Head and Neck Surgery, Tri-Service General Hospital, National Defense Medical Center, Taipei City, Taiwan
- Chi-Hung Juan
- Institute of Cognitive Neuroscience, National Central University, No. 300, Zhongda Rd., Zhongli District, Taoyuan City, 320317, Taiwan
- Cognitive Intelligence and Precision Healthcare Center, National Central University, No. 300, Zhongda Rd., Zhongli District, Taoyuan City, 320317, Taiwan
- I-Hui Hsieh
- Institute of Cognitive Neuroscience, National Central University, No. 300, Zhongda Rd., Zhongli District, Taoyuan City, 320317, Taiwan
- Cognitive Intelligence and Precision Healthcare Center, National Central University, No. 300, Zhongda Rd., Zhongli District, Taoyuan City, 320317, Taiwan
9
Choi W, Lai VKW. Does musicianship influence the perceptual integrality of tones and segmental information? J Acoust Soc Am 2023;154:852-862. PMID: 37566718; DOI: 10.1121/10.0020579.
Abstract
This study investigated the effect of musicianship on the perceptual integrality of tones and segmental information in non-native speech perception. We tested 112 Cantonese musicians, Cantonese non-musicians, English musicians, and English non-musicians with a modified Thai tone AX discrimination task. In the tone discrimination task, the control block only contained tonal variations, whereas the orthogonal block contained both tonal and task-irrelevant segmental variations. Relative to their own performance in the control block, the Cantonese listeners showed decreased sensitivity index (d') and increased response time in the orthogonal block, reflecting integral perception of tones and segmental information. By contrast, the English listeners performed similarly across the two blocks, indicating independent perception. Bayesian analysis revealed that the Cantonese musicians and the Cantonese non-musicians perceived Thai tones and segmental information equally integrally. Moreover, the English musicians and the English non-musicians showed similar degrees of independent perception. Based on the above results, musicianship does not seem to influence tone-segmental perceptual integrality. While musicianship apparently enhances tone sensitivity, not all musical advantages are transferable to the language domain.
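The sensitivity index d' used in such AX discrimination tasks is the difference between the z-transformed hit and false-alarm rates. A minimal sketch using a log-linear correction for extreme rates (the correction is an assumption; the abstract does not specify how the authors handled rates of 0 or 1):

```python
from statistics import NormalDist

def d_prime(hits, misses, false_alarms, correct_rejections):
    """Sensitivity index d' = z(hit rate) - z(false-alarm rate).
    The log-linear correction (+0.5 / +1) avoids infinite z-scores at 0 or 1."""
    hit_rate = (hits + 0.5) / (hits + misses + 1)
    fa_rate = (false_alarms + 0.5) / (false_alarms + correct_rejections + 1)
    z = NormalDist().inv_cdf
    return z(hit_rate) - z(fa_rate)
```

Chance performance (hit rate equal to false-alarm rate) gives d' = 0; larger values index better discrimination regardless of response bias.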
Collapse
Affiliation(s)
- William Choi
- Academic Unit of Human Communication, Development, and Information Sciences, The University of Hong Kong, Pokfulam, Hong Kong Special Administrative Region
- Veronica Ka Wai Lai
- Department of Pediatrics, Faculty of Medicine, University of Manitoba, Winnipeg, Manitoba, Canada
10
Chan MPY, Kuang J. The effect of tone language background on cue integration in pitch perception. J Acoust Soc Am 2023;154:819-830. PMID: 37563829; DOI: 10.1121/10.0020565.
Abstract
This study explores the effect of native language and musicality on voice quality cue integration in pitch perception. Previous work by Cui and Kang [(2019). J. Acoust. Soc. Am. 146(6), 4086-4096] found no differences in pitch perception strategies between English and Mandarin speakers. The present study asks whether Cantonese listeners may perform differently, as Cantonese consists of multiple level tones. Participants completed two experiments: (i) a forced-choice pitch classification experiment involving four spectral slope permutations that vary in f0 across an 11-step continuum, and (ii) the MBEMA test, which quantifies listeners' musicality. Results show that Cantonese speakers do not differ from English and Mandarin speakers in terms of overall categoricity and perceptual shift, that Cantonese speakers do not have advantages in musicality, and that musicality is a significant predictor for participants' pitch perception strategies. Listeners with higher musicality scores tend to rely more on f0 cues than voice quality cues compared to listeners with lower musicality. These findings support the notion that voice quality integration in pitch perception is not language specific, and may be a universal psychoacoustic phenomenon at a non-lexical level.
Affiliation(s)
- May Pik Yu Chan
- Department of Linguistics, University of Pennsylvania, Philadelphia, Pennsylvania 19104-6228, USA
- Jianjing Kuang
- Department of Linguistics, University of Pennsylvania, Philadelphia, Pennsylvania 19104-6228, USA
11
Chen C, de Hoz L. The perceptual categorization of multidimensional stimuli is hierarchically organized. iScience 2023;26:106941. PMID: 37378341; PMCID: PMC10291468; DOI: 10.1016/j.isci.2023.106941.
Abstract
As we interact with our surroundings, we encounter the same or similar objects from different perspectives and are compelled to generalize. For example, despite their variety, we recognize dog barks as a distinct sound class. While we have some understanding of generalization along a single stimulus dimension (frequency, color), natural stimuli are identifiable by a combination of dimensions. Measuring their interaction is essential to understand perception. Using a two-dimensional discrimination task in mice with frequency- or amplitude-modulated sounds, we tested untrained generalization across pairs of auditory dimensions in an automated behavioral paradigm. We uncovered a perceptual hierarchy over the tested dimensions that was dominated by the sound's spectral composition. Stimuli are thus not perceived as a whole but as a combination of their features, each of which weighs differently in the identification of the stimulus according to an established hierarchy, possibly paralleling their differential shaping of neuronal tuning.
Affiliation(s)
- Chi Chen
- Department of Neurogenetics, Max Planck Institute for Experimental Medicine, Göttingen, Germany
- International Max Planck Research School for Neurosciences, Göttingen, Germany
- Göttingen Graduate School of Neurosciences and Molecular Biosciences, Göttingen, Germany
- Neuroscience Research Center, Charité Medical University, Berlin, Germany
- Livia de Hoz
- Department of Neurogenetics, Max Planck Institute for Experimental Medicine, Göttingen, Germany
- Neuroscience Research Center, Charité Medical University, Berlin, Germany
- Bernstein Center for Computational Neuroscience, Berlin, Germany
12
Bouvier B, Susini P, Marquis-Favre C, Misdariis N. Revealing the stimulus-driven component of attention through modulations of auditory salience by timbre attributes. Sci Rep 2023;13:6842. PMID: 37100849; PMCID: PMC10133446; DOI: 10.1038/s41598-023-33496-2.
Abstract
Attention allows the listener to select relevant information from their environment, and disregard what is irrelevant. However, irrelevant stimuli sometimes manage to capture it and stand out from a scene because of bottom-up processes driven by salient stimuli. This attentional capture effect was observed using an implicit approach based on the additional singleton paradigm. In the auditory domain, it was shown that sound attributes such as intensity and frequency tend to capture attention during auditory search (cost to performance) for targets defined on a different dimension such as duration. In the present study, the authors examined whether a similar phenomenon occurs for attributes of timbre such as brightness (related to the spectral centroid) and roughness (related to the amplitude modulation depth). More specifically, we revealed the relationship between the variations of these attributes and the magnitude of the attentional capture effect. In experiment 1, the occurrence of a brighter sound (higher spectral centroid) embedded in sequences of successive tones produced significant search costs. In experiments 2 and 3, different values of brightness and roughness confirmed that attention capture is monotonically driven by the sound features. In experiment 4, the effect was found to be symmetrical: positive or negative, the same difference in brightness had the same negative effect on performance. Experiment 5 suggested that the effect produced by the variations of the two attributes is additive. This work provides a methodology for quantifying the bottom-up component of attention and brings new insights into attention capture and auditory salience.
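Roughness is tied here to amplitude modulation depth, which can be estimated from a sound's Hilbert envelope as m = (Emax - Emin)/(Emax + Emin). A rough sketch (the edge-trimming fraction is an arbitrary choice, and this is not the authors' stimulus code):

```python
import numpy as np
from scipy.signal import hilbert

def modulation_depth(signal, trim=0.1):
    """Estimate AM depth m = (Emax - Emin) / (Emax + Emin) from the Hilbert
    envelope. Edges are trimmed to avoid end effects of the Hilbert transform."""
    envelope = np.abs(hilbert(signal))
    n = len(envelope)
    core = envelope[int(trim * n): int((1 - trim) * n)]
    return (core.max() - core.min()) / (core.max() + core.min())
```

Deeper modulation (larger m) at modulation rates in the tens of hertz is what listeners typically report as rougher.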
Affiliation(s)
- Baptiste Bouvier
- STMS IRCAM, Sorbonne Université, CNRS, Ministère de La Culture, 75004, Paris, France.
- Univ Lyon, ENTPE, École Centrale de Lyon, CNRS, LTDS, UMR5513, 69518, Vaulx-en-Velin, France.
- Patrick Susini
- STMS IRCAM, Sorbonne Université, CNRS, Ministère de La Culture, 75004, Paris, France
- Catherine Marquis-Favre
- Univ Lyon, ENTPE, École Centrale de Lyon, CNRS, LTDS, UMR5513, 69518, Vaulx-en-Velin, France
- Nicolas Misdariis
- STMS IRCAM, Sorbonne Université, CNRS, Ministère de La Culture, 75004, Paris, France
13
Rosi V, Arias Sarah P, Houix O, Misdariis N, Susini P. Shared mental representations underlie metaphorical sound concepts. Sci Rep 2023;13:5180. PMID: 36997613; PMCID: PMC10063581; DOI: 10.1038/s41598-023-32214-2.
Abstract
Communication between sound and music experts is based on the shared understanding of a metaphorical vocabulary derived from other sensory modalities. Yet, the impact of sound expertise on the mental representation of these sound concepts remains unclear. To address this issue, we investigated the acoustic portraits of four metaphorical sound concepts (brightness, warmth, roundness, and roughness) in three groups of participants (sound engineers, conductors, and non-experts). Participants (N = 24) rated a corpus of orchestral instrument sounds (N = 520) using Best-Worst Scaling. With this data-driven method, we sorted the sound corpus for each concept and population. We compared the population ratings and ran machine learning algorithms to unveil the acoustic portraits of each concept. Overall, the results revealed that sound engineers were the most consistent. We found that roughness is widely shared while brightness is expertise dependent. The frequent use of brightness by expert populations suggests that its meaning has been refined through sound expertise. As for roundness and warmth, it seems that the importance of pitch and noise in their acoustic definition is the key to distinguishing them. These results provide crucial information on the mental representations of a metaphorical vocabulary of sound and whether it is shared or refined by sound expertise.
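Best-Worst Scaling data of the kind described are commonly scored by counting, for each sound, how often it was chosen as best versus worst. A minimal counting-based sketch (the authors may have used a model-based scoring instead; this illustrates the simplest variant):

```python
from collections import defaultdict

def best_worst_scores(trials):
    """Simple Best-Worst Scaling count: score = (#best - #worst) / #appearances.
    Each trial is a tuple (items_shown, best_item, worst_item)."""
    best = defaultdict(int)
    worst = defaultdict(int)
    shown = defaultdict(int)
    for items, b, w in trials:
        for item in items:
            shown[item] += 1
        best[b] += 1
        worst[w] += 1
    return {item: (best[item] - worst[item]) / shown[item] for item in shown}
```

Scores near +1 mark items almost always picked as best for the concept (e.g., "brightest"), scores near -1 items almost always picked as worst, yielding the sorted corpus per concept.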
Affiliation(s)
- Victor Rosi
- Sound Perception and Design Group, STMS, Ircam - Sorbonne Université - CNRS - Ministère de la Culture, 1 Place Igor Stravinsky, 75004, Paris, France.
- Pablo Arias Sarah
- School of Psychology and Neuroscience, University of Glasgow, 62 Hillhead Street, Glasgow, G12 8QB, UK
- Lund University Cognitive Science, Lund University, Box 192, 221 00, Lund, Sweden
- Olivier Houix
- Sound Perception and Design Group, STMS, Ircam - Sorbonne Université - CNRS - Ministère de la Culture, 1 Place Igor Stravinsky, 75004, Paris, France
- Nicolas Misdariis
- Sound Perception and Design Group, STMS, Ircam - Sorbonne Université - CNRS - Ministère de la Culture, 1 Place Igor Stravinsky, 75004, Paris, France
- Patrick Susini
- Sound Perception and Design Group, STMS, Ircam - Sorbonne Université - CNRS - Ministère de la Culture, 1 Place Igor Stravinsky, 75004, Paris, France
14
McPherson MJ, McDermott JH. Relative pitch representations and invariance to timbre. Cognition 2023; 232:105327. [PMID: 36495710 PMCID: PMC10016107 DOI: 10.1016/j.cognition.2022.105327] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Received: 01/13/2022] [Revised: 09/13/2022] [Accepted: 11/10/2022] [Indexed: 12/12/2022]
Abstract
Information in speech and music is often conveyed through changes in fundamental frequency (f0), perceived by humans as "relative pitch". Relative pitch judgments are complicated by two facts. First, sounds can simultaneously vary in timbre due to filtering imposed by a vocal tract or instrument body. Second, relative pitch can be extracted in two ways: by measuring changes in constituent frequency components from one sound to another, or by estimating the f0 of each sound and comparing the estimates. We examined the effects of timbral differences on relative pitch judgments, and whether any invariance to timbre depends on whether judgments are based on constituent frequencies or their f0. Listeners performed up/down and interval discrimination tasks with pairs of spoken vowels, instrument notes, or synthetic tones, synthesized to be either harmonic or inharmonic. Inharmonic sounds lack a well-defined f0, such that relative pitch must be extracted from changes in individual frequencies. Pitch judgments were less accurate when vowels/instruments were different compared to when they were the same, and were biased by the associated timbre differences. However, this bias was similar for harmonic and inharmonic sounds, and was observed even in conditions where judgments of harmonic sounds were based on f0 representations. Relative pitch judgments are thus not invariant to timbre, even when timbral variation is naturalistic, and when such judgments are based on representations of f0.
Affiliation(s)
- Malinda J McPherson
- Department of Brain and Cognitive Sciences, MIT, Cambridge, MA 02139, United States of America; Program in Speech and Hearing Biosciences and Technology, Harvard University, Boston, MA 02115, United States of America; McGovern Institute for Brain Research, MIT, Cambridge, MA 02139, United States of America.
- Josh H McDermott
- Department of Brain and Cognitive Sciences, MIT, Cambridge, MA 02139, United States of America; Program in Speech and Hearing Biosciences and Technology, Harvard University, Boston, MA 02115, United States of America; McGovern Institute for Brain Research, MIT, Cambridge, MA 02139, United States of America; Center for Brains Minds and Machines, MIT, Cambridge, MA 02139, United States of America
15
Siedenburg K, Graves J, Pressnitzer D. A unitary model of auditory frequency change perception. PLoS Comput Biol 2023; 19:e1010307. [PMID: 36634121 PMCID: PMC9876382 DOI: 10.1371/journal.pcbi.1010307] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 06/16/2022] [Revised: 01/25/2023] [Accepted: 01/04/2023] [Indexed: 01/13/2023]
Abstract
Changes in the frequency content of sounds over time are arguably the most basic form of information about the behavior of sound-emitting objects. In perceptual studies, such changes have mostly been investigated separately, as aspects of either pitch or timbre. Here, we propose a unitary account of "up" and "down" subjective judgments of frequency change, based on a model combining auditory correlates of acoustic cues in a sound-specific and listener-specific manner. To do so, we introduce a generalized version of so-called Shepard tones, allowing symmetric manipulations of spectral information on a fine scale, usually associated with pitch (spectral fine structure, SFS), and on a coarse scale, usually associated with timbre (spectral envelope, SE). In a series of behavioral experiments, listeners reported "up" or "down" shifts across pairs of generalized Shepard tones that differed in SFS, in SE, or in both. We observed the classic properties of Shepard tones for either SFS or SE shifts: subjective judgments followed the smallest log-frequency change direction, with cases of ambiguity and circularity. Interestingly, when both SFS and SE changes were applied concurrently (synergistically or antagonistically), we observed a trade-off between cues. Listeners were encouraged to report when they perceived "both" directions of change concurrently, but this rarely happened, suggesting a unitary percept. A computational model could accurately fit the behavioral data by combining different cues reflecting frequency changes after auditory filtering. The model revealed that cue weighting depended on the nature of the sound. When presented with harmonic sounds, listeners put more weight on SFS-related cues, whereas inharmonic sounds led to more weight on SE-related cues. Moreover, these stimulus-based factors were modulated by inter-individual differences, revealing variability across listeners in the detailed recipe for "up" and "down" judgments. We argue that frequency changes are tracked perceptually via the adaptive combination of a diverse set of cues, in a manner that is in fact similar to the derivation of other basic auditory dimensions such as spatial location.
Affiliation(s)
- Kai Siedenburg
- Carl von Ossietzky University of Oldenburg, Dept. of Medical Physics and Acoustics, Oldenburg, Germany
- Jackson Graves
- Laboratoire des systèmes perceptifs, Dépt. d’études cognitives, École normale supérieure, PSL University, CNRS, Paris, France
- Daniel Pressnitzer
- Laboratoire des systèmes perceptifs, Dépt. d’études cognitives, École normale supérieure, PSL University, CNRS, Paris, France
16
Oxenham AJ. Questions and controversies surrounding the perception and neural coding of pitch. Front Neurosci 2023; 16:1074752. [PMID: 36699531 PMCID: PMC9868815 DOI: 10.3389/fnins.2022.1074752] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/19/2022] [Accepted: 12/16/2022] [Indexed: 01/12/2023]
Abstract
Pitch is a fundamental aspect of auditory perception that plays an important role in our ability to understand speech, appreciate music, and attend to one sound while ignoring others. The questions surrounding how pitch is represented in the auditory system, and how our percept relates to the underlying acoustic waveform, have been a topic of inquiry and debate for well over a century. New findings and technological innovations have led to challenges of some long-standing assumptions and have raised new questions. This article reviews some recent developments in the study of pitch coding and perception and focuses on the topic of how pitch information is extracted from peripheral representations based on frequency-to-place mapping (tonotopy), stimulus-driven auditory-nerve spike timing (phase locking), or a combination of both. Although a definitive resolution has proved elusive, the answers to these questions have potentially important implications for mitigating the effects of hearing loss via devices such as cochlear implants.
Affiliation(s)
- Andrew J. Oxenham
- Center for Applied and Translational Sensory Science, University of Minnesota Twin Cities, Minneapolis, MN, United States
- Department of Psychology, University of Minnesota Twin Cities, Minneapolis, MN, United States
17
Abstract
We examined pitch-error detection in well-known songs sung with or without meaningful lyrics. In Experiment 1, adults heard the initial phrase of familiar songs sung with lyrics or repeating syllables (la) and judged whether they heard an out-of-tune note. Half of the renditions had a single pitch error (50 or 100 cents); half were in tune. Listeners were poorer at pitch-error detection in songs with lyrics. In Experiment 2, within-note pitch fluctuations in the same performances were eliminated by auto-tuning. Again, pitch-error detection was worse for renditions with lyrics (50 cents), suggesting adverse effects of semantic processing. In Experiment 3, songs were sung with repeating syllables or scat syllables to ascertain the role of phonetic variability. Performance was poorer for scat than for repeating syllables, indicating adverse effects of phonetic variability, but overall performance exceeded Experiment 1. In Experiment 4, listeners evaluated songs in all styles (repeating syllables, scat, lyrics) within the same session. Performance was best with repeating syllables (50 cents) and did not differ between scat or lyric versions. In short, tracking the pitches of highly familiar songs was impaired by the presence of words, an impairment stemming primarily from phonetic variability rather than interference from semantic processing.
Affiliation(s)
- Michael W Weiss
- International Laboratory for Brain, Music, and Sound Research, University of Montreal, Montreal, QC, Canada
18
Hoeppli ME, Thurston TS, Roy M, Light AR, Amann M, Gracely RH, Schweinhardt P. Development of a computerized 2D rating scale for continuous and simultaneous evaluation of two dimensions of a sensory stimulus. Front Psychol 2023; 14:1127699. [PMID: 36935976 PMCID: PMC10022668 DOI: 10.3389/fpsyg.2023.1127699] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/19/2022] [Accepted: 02/06/2023] [Indexed: 03/06/2023]
Abstract
Introduction One-dimensional rating scales are widely used in research and in the clinic to assess individuals' perceptions of sensory stimuli. Although these scales provide essential knowledge of stimulus perception, their limitation to one dimension hinders our understanding of complex stimuli. Methods To allow improved investigation of complex stimuli, a two-dimensional scale based on the one-dimensional Gracely Box Scale was developed and tested in healthy participants on a visual and an auditory task (rating changes in brightness and size of circles and rating changes in frequency and sound pressure of sounds, which was compared to ratings on one-dimensional scales). Before performing these tasks, participants were familiarized with the intensity descriptors of the two-dimensional scale by completing two tasks. First, participants sorted the descriptors based on their judgment of the intensity of the descriptors. Second, participants evaluated the intensity of the descriptors by pressing a button for the duration they considered matching the intensity of the descriptors or squeezing a hand grip dynamometer as strong as they considered matching the intensity of the descriptors. Results Results from these tasks confirmed the order of the descriptors as displayed on the original rating scale. Results from the visual and auditory tasks showed that participants were able to rate changes in the physical attributes of visual or auditory stimuli on the two-dimensional scale as accurately as on one-dimensional scales. Discussion These results support the use of a two-dimensional scale to simultaneously report multiple dimensions of complex stimuli.
Affiliation(s)
- Marie-Eve Hoeppli
- Alan Edwards Center for Research on Pain, McGill University, Montreal, QC, Canada
- Pediatric Pain Research Center, Cincinnati Children’s Hospital Medical Center, Cincinnati, OH, United States
- Division of Behavioral Medicine and Clinical Psychology, Cincinnati Children’s Hospital Medical Center, Cincinnati, OH, United States
- Correspondence: Marie-Eve Hoeppli
- Taylor S. Thurston
- Department of Nutrition and Integrative Physiology, The University of Utah, Salt Lake City, UT, United States
- Mathieu Roy
- Department of Psychology, McGill University, Montreal, QC, Canada
- Alan R. Light
- Department of Anesthesiology, The University of Utah, Salt Lake City, UT, United States
- Markus Amann
- Department of Anesthesiology, The University of Utah, Salt Lake City, UT, United States
- VAMC, Geriatric Research, Education, and Clinical Center, Salt Lake City, UT, United States
- Richard H. Gracely
- Department of Endodontics, School of Dentistry, Center for Neurosensory Disorders, The University of North Carolina, Chapel Hill, Chapel Hill, NC, United States
- Petra Schweinhardt
- Alan Edwards Center for Research on Pain, McGill University, Montreal, QC, Canada
- Faculty of Dentistry, McGill University, Montreal, QC, Canada
- Faculty of Medicine, McGill University, Montreal, QC, Canada
- Integrative Spinal Research, Department of Chiropractic Medicine, Balgrist University Hospital, Zurich, Switzerland
19
Wheeler HJ, Hatch DR, Moody-Antonio SA, Nie Y. Music and Speech Perception in Prelingually Deafened Young Listeners With Cochlear Implants: A Preliminary Study Using Sung Speech. J Speech Lang Hear Res 2022; 65:3951-3965. [PMID: 36179251 DOI: 10.1044/2022_jslhr-21-00271] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/16/2023]
Abstract
PURPOSE In the context of music and speech perception, this study aimed to assess the effect of variation in one of two auditory attributes-pitch contour and timbre-on the perception of the other in prelingually deafened young cochlear implant (CI) users, and the relationship between pitch contour perception and two cognitive functions of interest. METHOD Nine prelingually deafened CI users, aged 8.75-22.17 years, completed a melodic contour identification (MCI) task using stimuli of piano notes or sung speech with a fixed timbre (same word for each note) or a mixed timbre (different words for each note), a speech perception task identifying matrix-styled sentences naturally intonated or sung with a fixed pitch (same pitch for each word) or a mixed pitch (different pitches for each word), a forward digit span test indexing auditory short-term memory (STM), and the matrices section of the Kaufman Brief Intelligence Test-Second Edition indexing nonverbal IQ. RESULTS MCI was significantly poorer for the mixed timbre condition. Speech perception was significantly poorer for the fixed and mixed pitch conditions than for the naturally intonated condition. Auditory STM positively correlated with MCI at 2- and 3-semitone note spacings. Relative to their normal-hearing peers from a related study using the same stimuli and tasks, the CI participants showed comparable MCI at 2- or 3-semitone note spacing, and a comparable level of significant decrement in speech perception across three pitch contour conditions. CONCLUSION Findings suggest that prelingually deafened CI users show similar trends of normal-hearing peers for the effect of variation in pitch contour or timbre on the perception of the other, and that cognitive functions may underlie these outcomes to some extent, at least for the perception of pitch contour. SUPPLEMENTAL MATERIAL https://doi.org/10.23641/asha.21217937.
Affiliation(s)
- Harley J Wheeler
- Department of Communication Sciences and Disorders, James Madison University, Harrisonburg, VA
- Debora R Hatch
- Department of Otolaryngology, Eastern Virginia Medical School, Norfolk
- Yingjiu Nie
- Department of Communication Sciences and Disorders, James Madison University, Harrisonburg, VA
20
Hansen NC, Højlund A, Møller C, Pearce M, Vuust P. Musicians show more integrated neural processing of contextually relevant acoustic features. Front Neurosci 2022; 16:907540. [PMID: 36312026 PMCID: PMC9612920 DOI: 10.3389/fnins.2022.907540] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/29/2022] [Accepted: 09/08/2022] [Indexed: 12/04/2022]
Abstract
Little is known about expertise-related plasticity of neural mechanisms for auditory feature integration. Here, we contrast two diverging hypotheses that musical expertise is associated with more independent or more integrated predictive processing of acoustic features relevant to melody perception. Mismatch negativity (MMNm) was recorded with magnetoencephalography (MEG) from 25 musicians and 25 non-musicians, exposed to interleaved blocks of a complex, melody-like multi-feature paradigm and a simple, oddball control paradigm. In addition to single deviants differing in frequency (F), intensity (I), or perceived location (L), double and triple deviants were included reflecting all possible feature combinations (FI, IL, LF, FIL). Following previous work, early neural processing overlap was approximated in terms of MMNm additivity by comparing empirical MMNms obtained with double and triple deviants to modeled MMNms corresponding to summed constituent single-deviant MMNms. Significantly greater subadditivity was found in musicians compared to non-musicians, specifically for frequency-related deviants in complex, melody-like stimuli. Despite using identical sounds, expertise effects were absent from the simple oddball paradigm. This novel finding supports the integrated processing hypothesis whereby musicians recruit overlapping neural resources facilitating more integrative representations of contextually relevant stimuli such as frequency (perceived as pitch) during melody perception. More generally, these specialized refinements in predictive processing may enable experts to optimally capitalize upon complex, domain-relevant, acoustic cues.
Affiliation(s)
- Niels Chr. Hansen
- Aarhus Institute of Advanced Studies, Aarhus University, Aarhus, Denmark
- Department of Clinical Medicine, Center for Music in the Brain, Aarhus University, Royal Academy of Music Aarhus/Aalborg, Aarhus, Denmark
- Department of Dramaturgy and Musicology, School of Communication and Culture, Aarhus University, Aarhus, Denmark
- Correspondence: Niels Chr. Hansen
- Andreas Højlund
- Department of Linguistics, Cognitive Science, and Semiotics, School of Communication and Culture, Aarhus University, Aarhus, Denmark
- Department of Clinical Medicine, Faculty of Health, Center of Functionally Integrative Neuroscience, Aarhus University, Aarhus, Denmark
- Cecilie Møller
- Department of Clinical Medicine, Center for Music in the Brain, Aarhus University, Royal Academy of Music Aarhus/Aalborg, Aarhus, Denmark
- Department of Psychology and Behavioural Sciences, Aarhus University, Aarhus, Denmark
- Marcus Pearce
- Department of Clinical Medicine, Center for Music in the Brain, Aarhus University, Royal Academy of Music Aarhus/Aalborg, Aarhus, Denmark
- School of Electronic Engineering and Computer Science, Cognitive Science Research Group and Centre for Digital Music, Queen Mary University of London, London, United Kingdom
- Peter Vuust
- Department of Clinical Medicine, Center for Music in the Brain, Aarhus University, Royal Academy of Music Aarhus/Aalborg, Aarhus, Denmark
21
Holmes E, Kinghorn EE, McGarry LM, Busari E, Griffiths TD, Johnsrude IS. Pitch discrimination is better for synthetic timbre than natural musical instrument timbres despite familiarity. J Acoust Soc Am 2022; 152:31. [PMID: 35931555 PMCID: PMC9800047 DOI: 10.1121/10.0011918] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 09/24/2021] [Revised: 06/08/2022] [Accepted: 06/09/2022] [Indexed: 06/15/2023]
Abstract
Pitch discrimination is better for complex tones than pure tones, but how pitch discrimination differs between natural and artificial sounds is not fully understood. This study compared pitch discrimination thresholds for flat-spectrum harmonic complex tones with those for natural sounds played by musical instruments of three different timbres (violin, trumpet, and flute). To investigate whether natural familiarity with sounds of particular timbres affects pitch discrimination thresholds, this study recruited non-musicians and musicians who were trained on one of the three instruments. We found that flautists and trumpeters could discriminate smaller differences in pitch for artificial flat-spectrum tones, despite their unfamiliar timbre, than for sounds played by musical instruments, which are regularly heard in everyday life (particularly by musicians who play those instruments). Furthermore, thresholds were no better for the instrument a musician was trained to play than for other instruments, suggesting that even extensive experience listening to and producing sounds of particular timbres does not reliably improve pitch discrimination thresholds for those timbres. The results show that timbre familiarity provides minimal improvements to auditory acuity, and physical acoustics (e.g., the presence of equal-amplitude harmonics) determine pitch discrimination thresholds more than does experience with natural sounds and timbre-specific training.
Affiliation(s)
- Emma Holmes
- Department of Speech Hearing and Phonetic Sciences, University College London, London WC1N 1PF, United Kingdom
- Elizabeth E Kinghorn
- Don Wright Faculty of Music, University of Western Ontario, London, Ontario N6A 3K7, Canada
- Lucy M McGarry
- Brain and Mind Institute, University of Western Ontario, London, Ontario N6A 5B7, Canada
- Elizabeth Busari
- UCL Ear Institute, University College London, London WC1E 6BT, United Kingdom
- Timothy D Griffiths
- Biosciences Institute, Faculty of Medical Sciences, Newcastle University, Newcastle upon Tyne NE2 4HH, United Kingdom
- Ingrid S Johnsrude
- Brain and Mind Institute, University of Western Ontario, London, Ontario N6A 5B7, Canada
22
Estimation of the Underlying F0 Range of a Speaker from the Spectral Features of a Brief Speech Input. Appl Sci (Basel) 2022. [DOI: 10.3390/app12136494] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/16/2022]
Abstract
From a very brief speech, human listeners can estimate the pitch range of the speaker and normalize pitch perception. Spectral features which inherently involve both articulatory and phonatory characteristics were speculated to play roles in this process, but few were reported to directly correlate with speaker’s F0 range. To mimic this human auditory capability and validate the speculation, in a preliminary study we proposed an LSTM-based method to estimate speaker’s F0 range from a 300 ms-long speech input, which turned out to outperform the conventional method. By two more experiments, this study further improved the method and verified its validity in estimating the speaker-specific underlying F0 range. After incorporating a novel measurement of F0 range and a multi-task training approach, Experiment 1 showed that the refined model gave more accurate estimates than the initial model. Based on a Japanese-Chinese bilingual parallel speech corpus, Experiment 2 found that the F0 ranges estimated with the model from the Chinese speech and the model from the Japanese speech produced by the same set of speakers had no significant difference, whereas the conventional method showed significant difference. The results indicate that the proposed spectrum-based method captures the speaker-specific underlying F0 range which is independent of the linguistic content.
23
Rosi V, Ravillion A, Houix O, Susini P. Best-worst scaling, an alternative method to assess perceptual sound qualities. JASA Express Lett 2022; 2:064404. [PMID: 36154161 DOI: 10.1121/10.0011752] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Indexed: 06/16/2023]
Abstract
When designing sound evaluation experiments, researchers rely on listening test methods, such as rating scales (RS). This work aims to investigate the suitability of best-worst scaling (BWS) for the perceptual evaluation of sound qualities. To do so, 20 participants rated the "brightness" of a corpus of instrumental sounds (N = 100) with RS and BWS methods. The results show that the BWS procedure is the fastest and that RS and BWS are equivalent in terms of performance. Interestingly, participants preferred BWS over RS. Therefore, BWS is an alternative method that reliably measures perceptual sound qualities and could be used in a many-sounds paradigm.
Affiliation(s)
- Victor Rosi
- Sound Perception and Design group, STMS, IRCAM, Sorbonne Université, CNRS, Ministère de la Culture, 75004 Paris, France
| | - Aliette Ravillion
- Sound Perception and Design group, STMS, IRCAM, Sorbonne Université, CNRS, Ministère de la Culture, 75004 Paris, France
| | - Olivier Houix
- Sound Perception and Design group, STMS, IRCAM, Sorbonne Université, CNRS, Ministère de la Culture, 75004 Paris, France
| | - Patrick Susini
- Sound Perception and Design group, STMS, IRCAM, Sorbonne Université, CNRS, Ministère de la Culture, 75004 Paris, France
24
Oh Y, Zuwala JC, Salvagno CM, Tilbrook GA. The Impact of Pitch and Timbre Cues on Auditory Grouping and Stream Segregation. Front Neurosci 2022; 15:725093. [PMID: 35087369 PMCID: PMC8787191 DOI: 10.3389/fnins.2021.725093] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 06/14/2021] [Accepted: 12/16/2021] [Indexed: 11/16/2022]
Abstract
In multi-talker listening environments, the culmination of different voice streams may lead to the distortion of each source’s individual message, causing deficits in comprehension. Voice characteristics, such as pitch and timbre, are major dimensions of auditory perception and play a vital role in grouping and segregating incoming sounds based on their acoustic properties. The current study investigated how pitch and timbre cues (determined by fundamental frequency, notated as F0, and spectral slope, respectively) can affect perceptual integration and segregation of complex-tone sequences within an auditory streaming paradigm. Twenty normal-hearing listeners participated in a traditional auditory streaming experiment using two alternating sequences of harmonic tone complexes A and B while F0 and spectral slope were manipulated. Grouping ranges, the F0/spectral slope ranges over which auditory grouping occurs, were measured with various F0/spectral slope differences between tones A and B. Results demonstrated that the grouping ranges were maximized in the absence of the F0/spectral slope differences between tones A and B and decreased by 2 times as their differences increased to ±1-semitone F0 and ±1-dB/octave spectral slope. In other words, increased differences in either F0 or spectral slope allowed listeners to more easily distinguish between harmonic stimuli, and thus group them together less. These findings suggest that pitch/timbre difference cues play an important role in how we perceive harmonic sounds in an auditory stream, representing our ability to group or segregate human voices in a multi-talker listening environment.
25
Saddler MR, Gonzalez R, McDermott JH. Deep neural network models reveal interplay of peripheral coding and stimulus statistics in pitch perception. Nat Commun 2021; 12:7278. [PMID: 34907158 PMCID: PMC8671597 DOI: 10.1038/s41467-021-27366-6] [Citation(s) in RCA: 16] [Impact Index Per Article: 5.3] [Received: 11/20/2020] [Accepted: 11/12/2021] [Indexed: 11/15/2022]
Abstract
Perception is thought to be shaped by the environments for which organisms are optimized. These influences are difficult to test in biological organisms but may be revealed by machine perceptual systems optimized under different conditions. We investigated environmental and physiological influences on pitch perception, whose properties are commonly linked to peripheral neural coding limits. We first trained artificial neural networks to estimate fundamental frequency from biologically faithful cochlear representations of natural sounds. The best-performing networks replicated many characteristics of human pitch judgments. To probe the origins of these characteristics, we then optimized networks given altered cochleae or sound statistics. Human-like behavior emerged only when cochleae had high temporal fidelity and when models were optimized for naturalistic sounds. The results suggest pitch perception is critically shaped by the constraints of natural environments in addition to those of the cochlea, illustrating the use of artificial neural networks to reveal underpinnings of behavior.
Affiliation(s)
- Mark R Saddler
- Department of Brain and Cognitive Sciences, MIT, Cambridge, MA, USA.
- McGovern Institute for Brain Research, MIT, Cambridge, MA, USA.
- Center for Brains, Minds and Machines, MIT, Cambridge, MA, USA.
- Ray Gonzalez
- Department of Brain and Cognitive Sciences, MIT, Cambridge, MA, USA
- McGovern Institute for Brain Research, MIT, Cambridge, MA, USA
- Center for Brains, Minds and Machines, MIT, Cambridge, MA, USA
- Josh H McDermott
- Department of Brain and Cognitive Sciences, MIT, Cambridge, MA, USA.
- McGovern Institute for Brain Research, MIT, Cambridge, MA, USA.
- Center for Brains, Minds and Machines, MIT, Cambridge, MA, USA.
- Program in Speech and Hearing Biosciences and Technology, Harvard University, Cambridge, MA, USA.
26
Symons AE, Dick F, Tierney AT. Dimension-selective attention and dimensional salience modulate cortical tracking of acoustic dimensions. Neuroimage 2021; 244:118544. [PMID: 34492294 DOI: 10.1016/j.neuroimage.2021.118544] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/05/2021] [Revised: 08/19/2021] [Accepted: 08/31/2021] [Indexed: 11/17/2022]
Abstract
Some theories of auditory categorization suggest that auditory dimensions that are strongly diagnostic for particular categories - for instance voice onset time or fundamental frequency in the case of some spoken consonants - attract attention. However, prior cognitive neuroscience research on auditory selective attention has largely focused on attention to simple auditory objects or streams, and so little is known about the neural mechanisms that underpin dimension-selective attention, or how the relative salience of variations along these dimensions might modulate neural signatures of attention. Here we investigate whether dimensional salience and dimension-selective attention modulate the cortical tracking of acoustic dimensions. In two experiments, participants listened to tone sequences varying in pitch and spectral peak frequency; these two dimensions changed at different rates. Inter-trial phase coherence (ITPC) and amplitude of the EEG signal at the frequencies tagged to pitch and spectral changes provided a measure of cortical tracking of these dimensions. In Experiment 1, tone sequences varied in the size of the pitch intervals, while the size of spectral peak intervals remained constant. Cortical tracking of pitch changes was greater for sequences with larger compared to smaller pitch intervals, with no difference in cortical tracking of spectral peak changes. In Experiment 2, participants selectively attended to either pitch or spectral peak. Cortical tracking was stronger in response to the attended compared to unattended dimension for both pitch and spectral peak. These findings suggest that attention can enhance the cortical tracking of specific acoustic dimensions rather than simply enhancing tracking of the auditory object as a whole.
Affiliation(s)
- Ashley E Symons
- Department of Psychological Sciences, Birkbeck College, University of London, UK
- Fred Dick
- Department of Psychological Sciences, Birkbeck College, University of London, UK; Division of Psychology & Language Sciences, University College London, UK
- Adam T Tierney
- Department of Psychological Sciences, Birkbeck College, University of London, UK

27
Abstract
Perception adapts to the properties of prior stimulation, as illustrated by phenomena such as visual color constancy or speech context effects. In the auditory domain, little is known about adaptive processes when it comes to the attribute of auditory brightness. Here, we report an experiment that tests whether listeners adapt to spectral colorations imposed on naturalistic music and speech excerpts. Our results indicate consistent contrastive adaptation of auditory brightness judgments on a trial-by-trial basis. The pattern of results suggests that these effects tend to grow with the duration of the adaptor context but level off after around 8 trials of 2 s duration. A simple model of the response criterion yields a correlation of r = .97 with the measured data and corroborates the notion that brightness perception adapts on timescales that fall within the range of auditory short-term memory. Effects were similar for spectral filtering based on linear spectral filter slopes and for filtering based on a measured transfer function from a commercially available hearing device. Overall, our findings demonstrate the adaptivity of auditory brightness perception under realistic acoustical conditions.
Affiliation(s)
- Kai Siedenburg
- Department of Medical Physics and Acoustics, Carl von Ossietzky University of Oldenburg, Oldenburg, Germany
- Feline Malin Barg
- Department of Medical Physics and Acoustics, Carl von Ossietzky University of Oldenburg, Oldenburg, Germany
- Henning Schepker
- Department of Medical Physics and Acoustics, Carl von Ossietzky University of Oldenburg, Oldenburg, Germany
- Starkey Hearing, Eden Prairie, MN, USA

28
Kothinti SR, Huang N, Elhilali M. Auditory salience using natural scenes: An online study. J Acoust Soc Am 2021; 150:2952. [PMID: 34717500] [PMCID: PMC8528551] [DOI: 10.1121/10.0006750]
Abstract
Salience is the quality of a sensory signal that attracts involuntary attention in humans. While it primarily reflects conspicuous physical attributes of a scene, our understanding of processes underlying what makes a certain object or event salient remains limited. In the vision literature, experimental results, theoretical accounts, and large amounts of eye-tracking data using rich stimuli have shed light on some of the underpinnings of visual salience in the brain. In contrast, studies of auditory salience have lagged behind due to limitations in both experimental designs and stimulus datasets used to probe the question of salience in complex everyday soundscapes. In this work, we deploy an online platform to study salience using a dichotic listening paradigm with natural auditory stimuli. The study validates crowd-sourcing as a reliable platform to collect behavioral responses to auditory salience by comparing experimental outcomes to findings acquired in a controlled laboratory setting. A model-based analysis demonstrates the benefits of extending behavioral measures of salience to a broader selection of auditory scenes and larger pools of subjects. Overall, this effort extends our current knowledge of auditory salience in everyday soundscapes and highlights the limitations of low-level acoustic attributes in capturing the richness of natural soundscapes.
Affiliation(s)
- Sandeep Reddy Kothinti
- Department of Electrical and Computer Engineering, Center for Language and Speech Processing, The Johns Hopkins University, Baltimore, Maryland 21218, USA
- Nicholas Huang
- Department of Biomedical Engineering, The Johns Hopkins University, Baltimore, Maryland 21218, USA
- Mounya Elhilali
- Department of Electrical and Computer Engineering, Center for Language and Speech Processing, The Johns Hopkins University, Baltimore, Maryland 21218, USA

29
Lau BK, Oxenham AJ, Werner LA. Infant Pitch and Timbre Discrimination in the Presence of Variation in the Other Dimension. J Assoc Res Otolaryngol 2021; 22:693-702. [PMID: 34519951] [DOI: 10.1007/s10162-021-00807-1]
Abstract
Adult listeners perceive pitch with fine precision, with many adults capable of discriminating less than a 1% change in fundamental frequency (F0). Although there is variability across individuals, this precise pitch perception is an ability ascribed to cortical functions that are also important for speech and music perception. Infants display neural immaturity in the auditory cortex, suggesting that pitch discrimination may improve throughout infancy. In two experiments, we tested the limits of F0 (pitch) and spectral centroid (timbre) perception in 66 infants and 31 adults. Contrary to expectations, we found that infants at both 3 and 7 months were able to reliably detect small changes in F0 in the presence of random variations in spectral content, and vice versa, to the extent that their performance matched that of adults with musical training and exceeded that of adults without musical training. The results indicate high fidelity of F0 and spectral-envelope coding in infants, implying that fully mature cortical processing is not necessary for accurate discrimination of these features. The surprising difference in performance between infants and musically untrained adults may reflect a developmental trajectory for learning natural statistical covariations between pitch and timbre that improves coding efficiency but results in degraded performance in adults without musical training when expectations for such covariations are violated.
Affiliation(s)
- Bonnie K Lau
- Institute for Language and Brain Sciences, University of Washington, 1715 NE Columbia Rd, Box 357988, Seattle, WA, 98195, USA
- Department of Otolaryngology - Head and Neck Surgery, University of Washington, 1701 NE Columbia Rd, Box 357923, Seattle, WA, 98195, USA
- Andrew J Oxenham
- Department of Psychology, University of Minnesota, 75 East River Parkway, Minneapolis, MN, 55455, USA
- Lynne A Werner
- Department of Speech and Hearing Sciences, University of Washington, 1417 NE 42nd Street, Box 354875, Seattle, WA, 98105, USA

30
Yoon YS, Mills I, Toliver B, Park C, Whitaker G, Drew C. Comparisons in Frequency Difference Limens Between Sequential and Simultaneous Listening Conditions in Normal-Hearing Listeners. Am J Audiol 2021; 30:266-274. [PMID: 33769845] [DOI: 10.1044/2021_aja-20-00134]
Abstract
Purpose We compared frequency difference limens (FDLs) in normal-hearing listeners under two listening conditions: sequential and simultaneous. Method Eighteen adult listeners participated in three experiments. FDL was measured with a method of limits applied to the comparison frequency. In the sequential listening condition, the tones were presented with a half-second interval between them; in the simultaneous listening condition, the tones were presented at the same time. In the first experiment, one of four reference tones (125, 250, 500, or 750 Hz), presented to the left ear, was paired with one of four starting comparison tones (250, 500, 750, or 1000 Hz), presented to the right ear. The second and third experiments used the same testing conditions as the first, except that the comparison stimuli were two- and three-tone complexes. The subjects were asked whether the tones sounded the same or different. When a subject chose "different," the comparison frequency decreased by 10% of the frequency difference between the reference and comparison tones. FDLs were determined when the subjects chose "same" three times in a row. Results FDLs were significantly broader (worse) with simultaneous listening than with sequential listening for the two- and three-tone complex conditions but not for the single-tone condition. The FDLs were narrowest (best) with the three-tone complex under both listening conditions. FDLs broadened as the testing frequencies increased for the single tone and the two-tone complex, but did not broaden at frequencies > 250 Hz for the three-tone complex. Conclusion The results suggest that sequential and simultaneous frequency discrimination are mediated by different processes at different stages in the auditory pathway for complex tones, but not for pure tones.
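The descending procedure this abstract describes (shrink the gap by 10% of the current reference-comparison difference after each "different" response; stop after three consecutive "same" responses) can be simulated. The 5% Weber-fraction listener below is a hypothetical stand-in, not a model from the study:

```python
def fdl_method_of_limits(reference, comparison, hears_different,
                         step=0.10, same_needed=3):
    """Descending method of limits for a frequency difference limen (FDL).

    After each 'different' response the comparison moves 10% of the current
    difference toward the reference; the FDL is the difference remaining once
    'same' has been reported `same_needed` times in a row.
    """
    same_run = 0
    while same_run < same_needed:
        if hears_different(reference, comparison):
            comparison -= step * (comparison - reference)  # shrink gap by 10%
            same_run = 0
        else:
            same_run += 1
    return comparison - reference

# Hypothetical listener who hears a difference only above a 5% Weber fraction:
listener = lambda ref, comp: (comp - ref) / ref > 0.05
fdl = fdl_method_of_limits(250.0, 500.0, listener)
```

With these toy parameters the procedure converges just below the listener's 5% threshold (under 12.5 Hz for the 250 Hz reference).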
Affiliation(s)
- Yang-Soo Yoon
- Department of Communication Sciences and Disorders, Baylor University, Waco, TX
- Ivy Mills
- Department of Communication Sciences and Disorders, Baylor University, Waco, TX
- BaileyAnn Toliver
- Department of Communication Sciences and Disorders, Baylor University, Waco, TX
- Christine Park
- Department of Communication Sciences and Disorders, Baylor University, Waco, TX
- George Whitaker
- Division of Otolaryngology, Baylor Scott & White Medical Center, Temple, TX
- Carrie Drew
- Department of Communication Sciences and Disorders, Baylor University, Waco, TX

31
Siedenburg K, Jacobsen S, Reuter C. Spectral envelope position and shape in sustained musical instrument sounds. J Acoust Soc Am 2021; 149:3715. [PMID: 34241486] [DOI: 10.1121/10.0005088]
Abstract
It has been argued that the relative position of spectral envelopes along the frequency axis serves as a cue for musical instrument size (e.g., violin vs viola) and that the shape of the spectral envelope encodes family identity (violin vs flute). It is further known that fundamental frequency (F0), F0-register for specific instruments, and dynamic level strongly affect spectral properties of acoustical instrument sounds. However, the associations between these factors have not been rigorously quantified for a representative set of musical instruments. Here, we analyzed 5640 sounds from 50 sustained orchestral instruments sampled across their entire range of F0s at three dynamic levels. Regression of spectral centroid (SC) values that index envelope position indicated that smaller instruments possessed higher SC values for a majority of instrument classes (families), but SC also correlated with F0 and was strongly and consistently affected by the dynamic level. Instrument classification using relatively low-dimensional cepstral audio descriptors allowed for discrimination between instrument classes with accuracies beyond 80%. Envelope shape became much less indicative of instrument class whenever the classification problem involved generalization to different dynamic levels or F0-registers. These analyses confirm that spectral envelopes encode information about instrument size and family identity and highlight their dependence on F0(-register) and dynamic level.
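The envelope-position index used in this analysis, the spectral centroid (SC), is the amplitude-weighted mean frequency of the magnitude spectrum. A minimal sketch (not the authors' descriptor pipeline; the two synthetic tones are invented for the demo):

```python
import numpy as np

def spectral_centroid(signal, fs):
    """Amplitude-weighted mean frequency of the magnitude spectrum, in Hz."""
    mag = np.abs(np.fft.rfft(signal))
    freqs = np.fft.rfftfreq(len(signal), d=1.0 / fs)
    return np.sum(freqs * mag) / np.sum(mag)

# Two harmonic tones with the same F0 (220 Hz) but different spectral slopes:
fs = 16000
t = np.arange(fs) / fs  # one second
harmonics = range(1, 11)
dull = sum(np.sin(2 * np.pi * 220 * k * t) / k**2 for k in harmonics)    # steep slope
bright = sum(np.sin(2 * np.pi * 220 * k * t) / k for k in harmonics)     # shallow slope
# The shallower (brighter) spectral slope yields the higher centroid.
```

This also illustrates why SC covaries with dynamic level: playing louder typically flattens the spectral slope, which raises the centroid without changing F0.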
Affiliation(s)
- Kai Siedenburg
- Department of Medical Physics and Acoustics, Carl von Ossietzky University of Oldenburg, 26129 Oldenburg, Germany
- Simon Jacobsen
- Department of Medical Physics and Acoustics, Carl von Ossietzky University of Oldenburg, 26129 Oldenburg, Germany
- Christoph Reuter
- Department of Musicology, University of Vienna, 1090 Vienna, Austria

32
Reymore L, Hansen NC. A Theory of Instrument-Specific Absolute Pitch. Front Psychol 2020; 11:560877. [PMID: 33192828] [PMCID: PMC7642881] [DOI: 10.3389/fpsyg.2020.560877]
Abstract
While absolute pitch (AP)—the ability to name musical pitches globally and without reference—is rare in expert musicians, anecdotal evidence suggests that some musicians may better identify pitches played on their primary instrument than pitches played on other instruments. We call this phenomenon “instrument-specific absolute pitch” (ISAP). In this paper we present a theory of ISAP. Specifically, we offer the hypothesis that some expert musicians without global AP may be able to more accurately identify pitches played on their primary instrument(s), and we propose timbral cues and articulatory motor imagery as two underlying mechanisms. Depending on whether informative timbral cues arise from performer- or instrument-specific idiosyncrasies or from timbre-facilitated tonotopic representations, we predict that performance may be enhanced for notes played by oneself, notes played on one’s own personal instrument, and/or notes played on any exemplar of one’s own instrument type. Sounds of one’s primary instrument may moreover activate kinesthetic memory and motor imagery, aiding pitch identification. In order to demonstrate how our theory can be tested, we report the methodology and analysis of two exemplary experiments conducted on two case-study participants who are professional oboists. The aim of the first experiment was to determine whether the oboists demonstrated ISAP ability, while the purpose of the second experiment was to provide a preliminary investigation of the underlying mechanisms. The results of the first experiment revealed that only one of the two oboists showed an advantage for identifying oboe tones over piano tones. For this oboist demonstrating ISAP, the second experiment demonstrated that pitch-naming accuracy decreased and variance around the correct pitch value increased as an effect of transposition and motor interference, but not of instrument or performer. 
These preliminary data suggest that some musicians possess ISAP while others do not. Timbral cues and motor imagery may both play roles in the acquisition of this ability. Based on our case study findings, we provide methodological considerations and recommendations for future empirical testing of our theory of ISAP.
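The transposition manipulation in the second experiment amounts to scaling frequency by powers of the twelfth root of two (equal-tempered semitones). A small helper, purely illustrative and not from the study's materials:

```python
def shift_semitones(freq_hz, semitones):
    """Frequency after shifting by a (possibly fractional) number of
    equal-tempered semitones: f * 2**(n/12)."""
    return freq_hz * 2.0 ** (semitones / 12.0)

shift_semitones(440.0, 12)   # -> 880.0 (one octave up)
shift_semitones(440.0, 0.5)  # a quarter-tone shift, subtler than a semitone
```

Shifts smaller than a semitone are the interesting case for ISAP, since they preserve the nominal note category while perturbing the timbral cues tied to absolute frequency.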
Affiliation(s)
- Lindsey Reymore
- School of Music, The Ohio State University, Columbus, OH, United States; Schulich School of Music, McGill University, Montréal, QC, Canada
- Niels Chr Hansen
- Aarhus Institute of Advanced Studies, Aarhus University, Aarhus, Denmark; Center for Music in the Brain, Aarhus University, Royal Academy of Music Aarhus/Aalborg, Aarhus, Denmark

33
Adaptation to pitch-altered feedback is independent of one's own voice pitch sensitivity. Sci Rep 2020; 10:16860. [PMID: 33033324] [PMCID: PMC7544828] [DOI: 10.1038/s41598-020-73932-1]
Abstract
Monitoring voice pitch is a fine-tuned process in daily conversations, as accurately conveying the linguistic and affective cues in a given utterance depends on precise control of phonation and intonation. This monitoring is thought to depend on whether the error is treated as self-generated or externally generated, resulting in either correction or inflation of errors. The present study reports on two separate paradigms of adaptation to altered feedback, exploring whether participants behave more cohesively once the error is perceptually comparable in size. The vocal behavior of normal-hearing, fluent speakers was recorded in response to a personalized pitch-shift size versus a non-specific size of one semitone. The personalized shift size was determined from the just-noticeable difference in fundamental frequency (F0) of each participant's voice. Here we show that both tasks successfully demonstrated opposing responses to a constant and predictable F0 perturbation (present from production onset), but these effects barely carried over once the feedback returned to normal, a pattern that bears some resemblance to compensatory responses. Experiencing an F0 shift that is perceived as self-generated (because it was precisely just-noticeable) is not enough to make speakers behave more consistently and more homogeneously in an opposing manner. On the contrary, our results suggest that neither the type nor the magnitude of the response depends in any trivial way on participants' sensitivity to their own voice pitch. Based on this finding, we speculate that error correction could occur even with a bionic ear, typically even when F0 cues are too subtle for cochlear implant users to detect accurately.
34
Bordonné T, Kronland-Martinet R, Ystad S, Derrien O, Aramaki M. Exploring sound perception through vocal imitations. J Acoust Soc Am 2020; 147:3306. [PMID: 32486800] [DOI: 10.1121/10.0001224]
Abstract
Understanding how sounds are perceived and interpreted is an important challenge for researchers dealing with auditory perception. The ecological approach to perception suggests that the salient perceptual information that enables a listener to recognize events through sounds is contained in specific structures called invariants. Identifying such invariants is of interest from a fundamental point of view, to better understand auditory perception, and is also useful for incorporating perceptual considerations into the modeling and control of sounds. Among the different approaches used to identify perceptually relevant sound structures, vocal imitations are believed to bring a fresh perspective to the field. The main goal of this paper is to better understand how invariants are transmitted through vocal imitations. A sound corpus containing different types of known invariants obtained from an existing synthesizer was established. Participants took part in a test where they were asked to imitate the sound corpus. A continuous and sparse model adapted to the specificities of the vocal imitations was then developed and used to analyze the imitations. Results show that participants were able to highlight salient elements of the sounds that partially correspond to the invariants used in the sound corpus. This study also confirms that vocal imitations reveal how these invariants are transmitted through perception and offers promising perspectives on auditory investigations.
Affiliation(s)
- Thomas Bordonné
- Aix Marseille Univ., CNRS, PRISM (Perception, Representations, Image, Sound, Music), 31 Chemin J. Aiguier, CS 70071, 13402 Marseille Cedex 20, France
- Richard Kronland-Martinet
- Aix Marseille Univ., CNRS, PRISM (Perception, Representations, Image, Sound, Music), 31 Chemin J. Aiguier, CS 70071, 13402 Marseille Cedex 20, France
- Sølvi Ystad
- Aix Marseille Univ., CNRS, PRISM (Perception, Representations, Image, Sound, Music), 31 Chemin J. Aiguier, CS 70071, 13402 Marseille Cedex 20, France
- Olivier Derrien
- Aix Marseille Univ., CNRS, PRISM (Perception, Representations, Image, Sound, Music), 31 Chemin J. Aiguier, CS 70071, 13402 Marseille Cedex 20, France
- Mitsuko Aramaki
- Aix Marseille Univ., CNRS, PRISM (Perception, Representations, Image, Sound, Music), 31 Chemin J. Aiguier, CS 70071, 13402 Marseille Cedex 20, France

35
Mehta AH, Oxenham AJ. Effect of lowest harmonic rank on fundamental-frequency difference limens varies with fundamental frequency. J Acoust Soc Am 2020; 147:2314. [PMID: 32359332] [PMCID: PMC7166120] [DOI: 10.1121/10.0001092]
Abstract
This study investigated the relationship between fundamental frequency difference limens (F0DLs) and the lowest harmonic number present over a wide range of F0s (30-2000 Hz) for 12-component harmonic complex tones that were presented in either sine or random phase. For fundamental frequencies (F0s) between 100 and 400 Hz, a transition from low (∼1%) to high (∼5%) F0DLs occurred as the lowest harmonic number increased from about seven to ten, in line with earlier studies. At lower and higher F0s, the transition between low and high F0DLs occurred at lower harmonic numbers. The worsening performance at low F0s was reasonably well predicted by the expected decrease in spectral resolution below about 500 Hz. At higher F0s, the degradation in performance at lower harmonic numbers could not be predicted by changes in spectral resolution but remained relatively good (<2%-3%) in some conditions, even when all harmonics were above 8 kHz, confirming that F0 can be extracted from harmonics even when temporal envelope or fine-structure cues are weak or absent.
Affiliation(s)
- Anahita H Mehta
- Department of Psychology, University of Minnesota, 75 East River Parkway, Minneapolis, Minnesota 55455, USA
- Andrew J Oxenham
- Department of Psychology, University of Minnesota, 75 East River Parkway, Minneapolis, Minnesota 55455, USA

36
Nikolsky A, Alekseyev E, Alekseev I, Dyakonova V. The Overlooked Tradition of "Personal Music" and Its Place in the Evolution of Music. Front Psychol 2020; 10:3051. [PMID: 32132941] [PMCID: PMC7040865] [DOI: 10.3389/fpsyg.2019.03051]
Abstract
This is an attempt to describe and explain so-called timbre-based music as a special system of musicking, communication, and psychological and social usage, which along with its corresponding beliefs constitutes a viable alternative to “frequency-based” music. Unfortunately, the current scientific research into music has been skewed almost entirely in favor of the frequency-based music prevalent in the West. Subsequently, whenever samples of timbre-based music attract the attention of Western researchers, these are usually interpreted as “defective” implementations of frequency-based music. The presence of discrete pitch is often regarded as the structural criterion that distinguishes music from non-music. We would like to present evidence to the contrary—in support of the existence of indigenous music systems based on the discretization and patterning of aspects of timbre, rather than pitch. This evidence comes mainly from extensive ethnographic research systematically conducted in eastern European and Asian parts of Russia from the 1890s. It involved the efforts of thousands of specialists and was coordinated by dozens of research institutions, and it has included not just ethnomusicology but linguistics, philology, organology, archaeology, anthropology, geography, and religious, and social studies. Much of the data has not been translated into Western languages. Although some Soviet-era publications were tainted by Marxist ideology, many researchers strove to provide accurate information (despite at times having been prosecuted for their work), and post-1990 research undertook a substantial revision of ideologically compromised concepts. Timbre-based tonal organization (TO) differs from that based on frequency in its personal orientation: musicking here occurs primarily for oneself and/or for close relatives/friends. Collective music-making is rare and exceptional. 
The foundation of timbre-based music seems to have vocal roots and rests on “personal song”—a system of personal identification through individualized patterns of rhythm, timbre, and pitch contour, utilized like a “human voice”—whose sound enables the recognition of a particular individual. The instrumental counterpart of the personalized singing tradition is the jaw harp tradition. The jaw harp is the principal musical instrument for at least 21 ethnicities in Russia, who occupy over half the territory of the country. The evolution of its TO forms the backbone for the development of timbre-based music art. Here, we provide the acoustic, socio-cultural, geographic, and chronological overview of timbre-based music.
Affiliation(s)
- Eduard Alekseyev
- Independent Researcher, Boston, MA, United States; The State Institute for Art Studies of the Ministry of Culture of the Russian Federation, Moscow, Russia
- Ivan Alekseev
- Experimental Laboratory of the North-Eastern Federal University, Yakutsk, Russia; International Jaw Harp Music Center, Yakutsk, Russia
- Varvara Dyakonova
- Department of Art Studies, Arctic State Institute of Arts and Culture, Yakutsk, Russia

37
Mehta AH, Lu H, Oxenham AJ. The Perception of Multiple Simultaneous Pitches as a Function of Number of Spectral Channels and Spectral Spread in a Noise-Excited Envelope Vocoder. J Assoc Res Otolaryngol 2020; 21:61-72. [PMID: 32048077] [DOI: 10.1007/s10162-019-00738-y]
Abstract
Cochlear implant (CI) listeners typically perform poorly on tasks involving the pitch of complex tones. This limitation in performance is thought to be mainly due to the restricted number of active channels and the broad current spread that leads to channel interactions and subsequent loss of precise spectral information, with temporal information limited primarily to temporal-envelope cues. Little is known about the degree of spectral resolution required to perceive combinations of multiple pitches, or a single pitch in the presence of other interfering tones in the same spectral region. This study used noise-excited envelope vocoders that simulate the limited resolution of CIs to explore the perception of multiple pitches presented simultaneously. The results show that the resolution required for perceiving multiple complex pitches is comparable to that found in a previous study using single complex tones. Although relatively high performance can be achieved with 48 channels, performance remained near chance when even limited spectral spread (with filter slopes as steep as 144 dB/octave) was introduced to the simulations. Overall, these tight constraints suggest that current CI technology will not be able to convey the pitches of combinations of spectrally overlapping complex tones.
Affiliation(s)
- Anahita H Mehta
- Department of Psychology, University of Minnesota, N218 Elliott Hall, 75 East River Parkway, Minneapolis, MN, 55455, USA
- Hao Lu
- Department of Psychology, University of Minnesota, N218 Elliott Hall, 75 East River Parkway, Minneapolis, MN, 55455, USA
- Andrew J Oxenham
- Department of Psychology, University of Minnesota, N218 Elliott Hall, 75 East River Parkway, Minneapolis, MN, 55455, USA

38
Cui A, Kuang J. The effects of musicality and language background on cue integration in pitch perception. J Acoust Soc Am 2019; 146:4086. [PMID: 31893734] [DOI: 10.1121/1.5134442]
Abstract
Pitch perception involves the processing of multidimensional acoustic cues, and listeners can exhibit different cue integration strategies in interpreting pitch. This study aims to examine whether musicality and language experience have effects on listeners' pitch perception strategies. Both Mandarin and English listeners were recruited to participate in two experiments: (1) a pitch classification experiment that tested their relative reliance on f0 and spectral cues, and (2) the Montreal Battery of Evaluation of Musical Abilities that objectively quantified their musical aptitude as continuous musicality scores. Overall, the results show a strong musicality effect: Listeners with higher musicality scores relied more on f0 in pitch perception, while listeners with lower musicality scores were more likely to attend to spectral cues. However, there were no effects of language experience on musicality scores or cue integration strategies in pitch perception. These results suggest that less musical or even amusic subjects may not suffer impairment in linguistic pitch processing due to the multidimensional nature of pitch cues.
Affiliation(s)
- Aletheia Cui
- Department of Linguistics, University of Pennsylvania, 3401-C Walnut Street, Suite 300, Philadelphia, Pennsylvania 19104, USA
- Jianjing Kuang
- Department of Linguistics, University of Pennsylvania, 3401-C Walnut Street, Suite 300, Philadelphia, Pennsylvania 19104, USA

39
Graves JE, Pralus A, Fornoni L, Oxenham AJ, Caclin A, Tillmann B. Short- and long-term memory for pitch and non-pitch contours: Insights from congenital amusia. Brain Cogn 2019; 136:103614. [PMID: 31546175] [PMCID: PMC6953621] [DOI: 10.1016/j.bandc.2019.103614]
Abstract
Congenital amusia is a neurodevelopmental disorder characterized by deficits in music perception, including discriminating and remembering melodies and melodic contours. As non-amusic listeners can perceive contours in dimensions other than pitch, such as loudness and brightness, our present study investigated whether amusics' pitch contour deficits also extend to these other auditory dimensions. Amusic and control participants performed an identification task for ten familiar melodies and a short-term memory task requiring the discrimination of changes in the contour of novel four-tone melodies. For both tasks, melodic contour was defined by pitch, brightness, or loudness. Amusic participants showed some ability to extract contours in all three dimensions. For familiar melodies, amusic participants showed impairment in all conditions, perhaps reflecting the fact that the long-term memory representations of the familiar melodies were defined in pitch. In the contour discrimination task with novel melodies, amusic participants exhibited less impairment for loudness-based melodies than for pitch- or brightness-based melodies, suggesting some specificity of the deficit for spectral changes, if not for pitch alone. The results suggest pitch and brightness may not be processed by the same mechanisms as loudness, and that short-term memory for loudness contours may be spared to some degree in congenital amusia.
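A melodic contour in the sense used here is just the direction of change between successive tones, and the same pattern can be carried by pitch, brightness, or loudness values. A minimal illustrative sketch (not the study's stimulus code; the example values are invented):

```python
def contour(values):
    """Direction of change between successive values: +1 up, -1 down, 0 same."""
    return [(b > a) - (b < a) for a, b in zip(values, values[1:])]

# The same contour abstraction works across dimensions:
contour([60, 62, 61, 61])         # MIDI pitches       -> [1, -1, 0]
contour([1200.0, 1500.0, 900.0])  # spectral centroids -> [1, -1]
```

Defining contour dimension-neutrally like this is what lets the task compare pitch-, brightness-, and loudness-based melodies on equal footing.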
Affiliation(s)
- Jackson E Graves
- Lyon Neuroscience Research Center (CRNL), CNRS, UMR 5292, Inserm U1028, Université Lyon 1, Lyon, France; Department of Psychology, University of Minnesota, Minneapolis, MN, USA; Laboratoire des systèmes perceptifs, Département d'études cognitives, École normale supérieure, PSL University, CNRS, 75005 Paris, France.
- Agathe Pralus
- Lyon Neuroscience Research Center (CRNL), CNRS, UMR 5292, Inserm U1028, Université Lyon 1, Lyon, France
- Lesly Fornoni
- Lyon Neuroscience Research Center (CRNL), CNRS, UMR 5292, Inserm U1028, Université Lyon 1, Lyon, France
- Andrew J Oxenham
- Department of Psychology, University of Minnesota, Minneapolis, MN, USA
- Anne Caclin
- Lyon Neuroscience Research Center (CRNL), CNRS, UMR 5292, Inserm U1028, Université Lyon 1, Lyon, France
- Barbara Tillmann
- Lyon Neuroscience Research Center (CRNL), CNRS, UMR 5292, Inserm U1028, Université Lyon 1, Lyon, France
40
Cortical Correlates of Attention to Auditory Features. J Neurosci 2019; 39:3292-3300. [PMID: 30804086] [DOI: 10.1523/jneurosci.0588-18.2019]
Abstract
Pitch and timbre are two primary features of auditory perception that are generally considered independent. However, an increase in pitch (produced by a change in fundamental frequency) can be confused with an increase in brightness (an attribute of timbre related to spectral centroid) and vice versa. Previous work indicates that pitch and timbre are processed in overlapping regions of the auditory cortex, but are separable to some extent via multivoxel pattern analysis. Here, we tested whether attention to one or other feature increases the spatial separation of their cortical representations and whether attention can enhance the cortical representation of these features in the absence of any physical change in the stimulus. Ten human subjects (four female, six male) listened to pairs of tone triplets varying in pitch, timbre, or both and judged which tone triplet had the higher pitch or brighter timbre. Variations in each feature engaged common auditory regions with no clear distinctions at a univariate level. Attending to one feature did not improve the separability of the neural representations of pitch and timbre at the univariate level. At the multivariate level, the classifier performed above chance in distinguishing between conditions in which pitch or timbre was discriminated. The results confirm that the computations underlying pitch and timbre perception are subserved by strongly overlapping cortical regions, but reveal that attention to one or other feature leads to distinguishable activation patterns even in the absence of physical differences in the stimuli.
SIGNIFICANCE STATEMENT: Although pitch and timbre are generally thought of as independent auditory features of a sound, pitch height and timbral brightness can be confused for one another. This study shows that pitch and timbre variations are represented in overlapping regions of auditory cortex, but that they produce distinguishable patterns of activation. Most importantly, the patterns of activation can be distinguished based on whether subjects attended to pitch or timbre even when the stimuli remained physically identical. The results therefore show that variations in pitch and timbre are represented by overlapping neural networks, but that attention to different features of the same sound can lead to distinguishable patterns of activation.
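The spectral centroid invoked above as the acoustic correlate of brightness has a simple definition: the amplitude-weighted mean frequency of the magnitude spectrum. A minimal sketch of the measure (an illustration only, not the authors' stimulus or analysis code):

```python
import numpy as np

def spectral_centroid(signal, sample_rate):
    """Brightness proxy: amplitude-weighted mean frequency of the magnitude spectrum."""
    spectrum = np.abs(np.fft.rfft(signal))
    freqs = np.fft.rfftfreq(len(signal), d=1.0 / sample_rate)
    return np.sum(freqs * spectrum) / np.sum(spectrum)

# A tone with more high-frequency energy has a higher centroid and is
# typically heard as "brighter", even at the same fundamental frequency.
sr = 16000
t = np.arange(sr) / sr                                # 1 s of samples
dull = np.sin(2 * np.pi * 220 * t)                    # energy only at 220 Hz
bright = dull + 0.8 * np.sin(2 * np.pi * 1760 * t)    # extra energy at 1760 Hz
```

Raising the fundamental frequency and raising the centroid are physically distinct operations, which is what lets studies like this one vary pitch and brightness independently.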
41
Kuang J, Liberman M. Integrating Voice Quality Cues in the Pitch Perception of Speech and Non-speech Utterances. Front Psychol 2018; 9:2147. [PMID: 30555365] [PMCID: PMC6281971] [DOI: 10.3389/fpsyg.2018.02147]
Abstract
Pitch perception plays a crucial role in speech processing. Since F0 is highly ambiguous and variable in the speech signal, effective pitch-range perception is important for perceiving the intended linguistic pitch targets. This study argues that effective pitch-range perception can be achieved by taking advantage of other signal-internal information that co-varies with F0, such as voice quality cues. This study provides direct perceptual evidence that voice quality cues, as an indicator of pitch range, can affect pitch-height perception. A series of forced-choice pitch classification experiments with four spectral conditions was conducted to investigate the degree to which manipulating spectral slope affects pitch-height perception. Both non-speech and speech stimuli were investigated. The results suggest that the pitch classification function is significantly shifted under different spectral conditions. Listeners are likely to perceive a higher pitch when the spectrum has more high-frequency energy (i.e., tenser phonation). The direction of the shift is consistent with the correlation between voice quality and pitch range. Moreover, cue integration is affected by the speech mode: listeners are more sensitive to relative differences within an utterance when hearing speech stimuli. This study generally supports the hypothesis that voice quality is an important enhancement cue for pitch range.
Affiliation(s)
- Jianjing Kuang
- Department of Linguistics, University of Pennsylvania, Philadelphia, PA, United States
42
Speech Perception with Spectrally Non-overlapping Maskers as Measure of Spectral Resolution in Cochlear Implant Users. J Assoc Res Otolaryngol 2018; 20:151-167. [PMID: 30456730] [DOI: 10.1007/s10162-018-00702-2]
Abstract
Poor spectral resolution contributes to the difficulties experienced by cochlear implant (CI) users when listening to speech in noise. However, correlations between measures of spectral resolution and speech perception in noise have not always been found to be robust. It may be that the relationship between spectral resolution and speech perception in noise becomes clearer in conditions where the speech and noise are not spectrally matched, so that improved spectral resolution can assist in separating the speech from the masker. To test this prediction, speech intelligibility was measured with noise or tone maskers that were presented either in the same spectral channels as the speech or in interleaved spectral channels. Spectral resolution was estimated via a spectral ripple discrimination task. Results from vocoder simulations in normal-hearing listeners showed increasing differences in speech intelligibility between spectrally overlapped and interleaved maskers as well as improved spectral ripple discrimination with increasing spectral resolution. However, no clear differences were observed in CI users between performance with spectrally interleaved and overlapped maskers, or between tone and noise maskers. The results suggest that spectral resolution in current CIs is too poor to take advantage of the spectral separation produced by spectrally interleaved speech and maskers. Overall, the spectrally interleaved and tonal maskers produce a much larger difference in performance between normal-hearing listeners and CI users than do traditional speech-in-noise measures, and thus provide a more sensitive test of speech perception abilities for current and future implantable devices.
43
Interaction Between Pitch and Timbre Perception in Normal-Hearing Listeners and Cochlear Implant Users. J Assoc Res Otolaryngol 2018; 20:57-72. [PMID: 30377852] [DOI: 10.1007/s10162-018-00701-3]
Abstract
Despite their mutually exclusive definitions, pitch and timbre perception interact with each other in normal-hearing (NH) listeners. Cochlear implant (CI) users have worse than normal pitch and timbre perception. However, the pitch-timbre interaction with CIs is not well understood. This study tested the interaction between pitch and sharpness (an aspect of timbre) perception related to the fundamental frequency (F0) and spectral slope of harmonic complex tones, respectively, in both NH listeners and CI users. In experiment 1, the F0 (and spectral slope) difference limens (DLs) were measured with a fixed spectral slope (and F0) and 20-dB amplitude roving. Then, the F0 and spectral slope were varied congruently or incongruently by the same multiple of individual DLs to assess the pitch and sharpness ranking sensitivity. Both NH and CI subjects had significantly higher pitch and sharpness ranking sensitivity with congruent than with incongruent F0 and spectral slope variations, and showed a similar symmetric interaction between pitch and timbre perception. In experiment 2, CI users' melodic contour identification (MCI) was tested in three spectral slope (no, congruent, and incongruent spectral slope variations by the same multiple of individual DLs as the F0 variations) and two amplitude conditions (0- and 20-dB amplitude roving). When there was no amplitude roving, the MCI scores were significantly higher with congruent than with no, and in turn than with incongruent spectral slope variations. The 20-dB amplitude roving significantly reduced the overall MCI scores and the effect of spectral slope variations. These results reflected a confusion between higher (or lower) pitch and sharper (or duller) timbre and offered important implications for understanding and enhancing pitch and timbre perception with CIs.
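The stimuli described above — harmonic complexes whose pitch is set by F0 and whose sharpness is set by spectral slope — are straightforward to synthesize. A hedged sketch (parameter values are illustrative, not those used in the study):

```python
import numpy as np

def harmonic_complex(f0, slope_db_per_oct, dur=0.5, sr=16000, n_harm=20):
    """Harmonic complex tone: pitch tracks f0; 'sharpness' tracks spectral slope.
    A shallower (less negative) slope leaves more energy in the high
    harmonics, which is heard as a sharper timbre."""
    t = np.arange(int(dur * sr)) / sr
    tone = np.zeros_like(t)
    for k in range(1, n_harm + 1):
        octaves_above_f0 = np.log2(k)
        amp = 10.0 ** (slope_db_per_oct * octaves_above_f0 / 20.0)
        tone += amp * np.sin(2 * np.pi * k * f0 * t)
    return tone / np.max(np.abs(tone))

# A "congruent" pair in the sense used above: the higher-pitched tone is
# also the sharper one (higher F0 and shallower slope).
low_dull = harmonic_complex(f0=200.0, slope_db_per_oct=-12.0)
high_sharp = harmonic_complex(f0=220.0, slope_db_per_oct=-6.0)
```

An "incongruent" pair would simply swap one of the two manipulations, e.g., higher F0 paired with a steeper slope.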
44
Cognitive Load Changes during Music Listening and its Implication in Earcon Design in Public Environments: An fNIRS Study. Int J Environ Res Public Health 2018; 15:2075. [PMID: 30248908] [PMCID: PMC6210363] [DOI: 10.3390/ijerph15102075]
Abstract
A key consideration in earcon design for public environments is to incorporate an individual's perceived level of cognitive load for better communication. This study aimed to examine the cognitive load changes required to perform a melodic contour identification task (CIT). While healthy college students (N = 16) were presented with five CITs, behavioral (reaction time and accuracy) and cerebral hemodynamic responses were measured using functional near-infrared spectroscopy. Our behavioral findings showed a gradual increase in cognitive load from CIT1 to CIT3, followed by an abrupt increase between CIT4 (i.e., listening to two concurrent melodic contours in an alternating manner and identifying the direction of the target contour, p < 0.001) and CIT5 (i.e., listening to two concurrent melodic contours in a divided manner and identifying the directions of both contours, p < 0.001). Cerebral hemodynamic responses showed a trend congruent with the behavioral findings. Specific to the frontopolar area (Brodmann's area 10), oxygenated hemoglobin increased significantly between CIT4 and CIT5 (p < 0.05) while the level of deoxygenated hemoglobin decreased. Altogether, the findings identify a cognitive threshold for young adults (CIT5) and suggest that appropriate tuning of the relationship between timbre and pitch contour can lower perceived cognitive load and thus be an effective design strategy for earcons in public environments.
45
Piazza EA, Theunissen FE, Wessel D, Whitney D. Rapid Adaptation to the Timbre of Natural Sounds. Sci Rep 2018; 8:13826. [PMID: 30218053] [PMCID: PMC6138731] [DOI: 10.1038/s41598-018-32018-9]
Abstract
Timbre, the unique quality of a sound that points to its source, allows us to quickly identify a loved one's voice in a crowd and distinguish a buzzy, bright trumpet from a warm cello. Despite its importance for perceiving the richness of auditory objects, timbre is a relatively poorly understood feature of sounds. Here we demonstrate for the first time that listeners adapt to the timbre of a wide variety of natural sounds. For each of several sound classes, participants were repeatedly exposed to two sounds (e.g., clarinet and oboe, male and female voice) that formed the endpoints of a morphed continuum. Adaptation to timbre resulted in consistent perceptual aftereffects, such that hearing sound A significantly altered perception of a neutral morph between A and B, making it sound more like B. Furthermore, these aftereffects were robust to moderate pitch changes, suggesting that adaptation to timbral features used for object identification drives these effects, analogous to face adaptation in vision.
Affiliation(s)
- Elise A Piazza
- Princeton Neuroscience Institute, Princeton University, Princeton, NJ, 08544, USA; Helen Wills Neuroscience Institute, University of California, Berkeley, Berkeley, CA, 94720, USA; Vision Science Graduate Group, University of California, Berkeley, Berkeley, CA, 94720, USA
- Frédéric E Theunissen
- Helen Wills Neuroscience Institute, University of California, Berkeley, Berkeley, CA, 94720, USA; Department of Psychology, University of California, Berkeley, Berkeley, CA, 94720, USA
- David Wessel
- Department of Music, University of California, Berkeley, Berkeley, CA, 94720, USA; Center for New Music and Audio Technologies, University of California, Berkeley, Berkeley, CA, 94720, USA
- David Whitney
- Helen Wills Neuroscience Institute, University of California, Berkeley, Berkeley, CA, 94720, USA; Vision Science Graduate Group, University of California, Berkeley, Berkeley, CA, 94720, USA; Department of Psychology, University of California, Berkeley, Berkeley, CA, 94720, USA
46
Nie Y, Galvin JJ, Morikawa M, André V, Wheeler H, Fu QJ. Music and Speech Perception in Children Using Sung Speech. Trends Hear 2018; 22:2331216518766810. [PMID: 29609496] [PMCID: PMC5888806] [DOI: 10.1177/2331216518766810]
Abstract
This study examined music and speech perception in normal-hearing children with some or no musical training. Thirty children (mean age = 11.3 years), 15 with and 15 without formal music training, participated in the study. Music perception was measured using a melodic contour identification (MCI) task; stimuli were a piano sample or sung speech with a fixed timbre (same word for each note) or a mixed timbre (different words for each note). Speech perception was measured in quiet and in steady noise using a matrix-styled sentence recognition task; stimuli were naturally intonated speech or sung speech with a fixed pitch (same note for each word) or a mixed pitch (different notes for each word). Significant musician advantages were observed for MCI and speech in noise but not for speech in quiet. MCI performance was significantly poorer with the mixed timbre stimuli. Speech performance in noise was significantly poorer with the fixed or mixed pitch stimuli than with spoken speech. Across all subjects, age at testing and MCI performance were significantly correlated with speech performance in noise. MCI and speech performance in quiet were significantly poorer for children than for adults from a related study using the same stimuli and tasks; speech performance in noise was significantly poorer for young than for older children. Long-term music training appeared to benefit melodic pitch perception and speech understanding in noise in these pediatric listeners.
Affiliation(s)
- Yingjiu Nie
- Department of Communication Sciences and Disorders, James Madison University, Harrisonburg, VA, USA
- Michael Morikawa
- Department of Communication Sciences and Disorders, James Madison University, Harrisonburg, VA, USA
- Victoria André
- Department of Communication Sciences and Disorders, James Madison University, Harrisonburg, VA, USA
- Harley Wheeler
- Department of Communication Sciences and Disorders, James Madison University, Harrisonburg, VA, USA
- Qian-Jie Fu
- Department of Head and Neck Surgery, University of California-Los Angeles, CA, USA
47
Incongruent pitch cues are associated with increased activation and functional connectivity in the frontal areas. Sci Rep 2018; 8:5206. [PMID: 29581445] [PMCID: PMC5980092] [DOI: 10.1038/s41598-018-23287-5]
Abstract
Pitch plays a crucial role in music and speech perception. Pitch perception is characterized by multiple perceptual dimensions, such as pitch height and chroma. Information provided by auditory signals that are related to these perceptual dimensions can be either congruent or incongruent. To create conflicting cues for pitch perception, we modified Shepard tones by varying the pitch height and pitch chroma dimensions in either the same or opposite directions. Our behavioral data showed that most listeners judged pitch changes based on pitch chroma, instead of pitch height, when incongruent information was provided. The reliance on pitch chroma resulted in a stable percept of upward or downward pitch shift, rather than alternating between two different percepts. Across the incongruent and congruent conditions, consistent activation was found in the bilateral superior temporal and inferior frontal areas. In addition, significantly stronger activation was observed in the inferior frontal areas during the incongruent compared to congruent conditions. Enhanced functional connectivity was found between the left temporal and bilateral frontal areas in the incongruent than congruent conditions. Increased intra-hemispheric and inter-hemispheric connectivity was also observed in the frontal areas. Our results suggest the involvement of the frontal lobe in top-down and bottom-up processes to generate a stable percept of pitch change with conflicting perceptual cues.
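The modified Shepard tones described above are built from octave-spaced partials under a fixed spectral envelope, which is what allows pitch chroma and pitch height to be manipulated separately. A minimal synthesis sketch (the envelope parameters are illustrative, not the study's values):

```python
import numpy as np

def shepard_tone(base_freq, dur=0.5, sr=16000, center=960.0, sigma_oct=1.5):
    """Sum of octave-spaced partials under a fixed log-frequency Gaussian
    envelope. Shifting base_freq changes pitch chroma; the envelope, which
    governs pitch height, stays fixed."""
    t = np.arange(int(dur * sr)) / sr
    tone = np.zeros_like(t)
    f = base_freq
    while f < sr / 2:
        # weight each partial by its distance (in octaves) from the envelope center
        w = np.exp(-0.5 * (np.log2(f / center) / sigma_oct) ** 2)
        tone += w * np.sin(2 * np.pi * f * t)
        f *= 2.0
    return tone / np.max(np.abs(tone))

# One semitone up in chroma: the partials shift, the spectral envelope does not.
tone_c = shepard_tone(32.7)                        # chroma C
tone_c_sharp = shepard_tone(32.7 * 2 ** (1 / 12))  # chroma C#
```

Moving `center` in the opposite direction from `base_freq` would produce the incongruent height-versus-chroma cues this study exploits.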
48
Siedenburg K. Timbral Shepard-illusion reveals ambiguity and context sensitivity of brightness perception. J Acoust Soc Am 2018; 143:EL93. [PMID: 29495721] [DOI: 10.1121/1.5022983]
Abstract
Recent research has described strong effects of prior context on the perception of ambiguous pitch shifts of Shepard tones [Chambers, Akram, Adam, Pelofi, Sahani, Shamma, and Pressnitzer (2017). Nat. Commun. 8, 15027]. Here, similar effects are demonstrated for brightness shift judgments of harmonic complexes with cyclic spectral envelope components and fixed fundamental frequency. It is shown that frequency shifts of the envelopes are perceived as systematic shifts of brightness. Analogous to the work of Chambers et al., the perceptual ambiguity of half-octave shifts resolves with the presentation of prior context tones. These results constitute a context effect for the perceptual processing of spectral envelope shifts and indicate so-far unknown commonalities between pitch and timbre perception.
Affiliation(s)
- Kai Siedenburg
- Department of Medical Physics and Acoustics and Cluster of Excellence Hearing4all, Carl von Ossietzky University, Oldenburg, Germany
49
Perceptual changes with monopolar and phantom electrode stimulation. Hear Res 2017; 359:64-75. [PMID: 29325874] [DOI: 10.1016/j.heares.2017.12.019]
Abstract
Phantom electrode (PE) stimulation is achieved by simultaneously stimulating out of phase from two adjacent intra-cochlear electrodes with different amplitudes. If the basal electrode stimulates with a smaller amplitude than the apical electrode of the pair, the resulting electrical field is pushed away from the basal electrode, producing a lower pitch. There is great interest in using PE stimulation in a processing strategy, as it can provide stimulation to regions of the cochlea located more apically than the most apical contact on the electrode array. The result is that even lower pitch sensations can be provided without the additional risk of a deeper insertion. However, it is unknown whether there are perceptual differences between monopolar (MP) and PE stimulation other than a shift in place pitch, or whether the effect and magnitude of changing from MP to PE stimulation depend on electrode location. This study investigates the perceptual differences (including pitch and other sound quality differences) between MP and PE stimulation at multiple electrode positions, using both a multidimensional scaling (MDS) procedure and a traditional scaling procedure. Ten Advanced Bionics users reported the perceptual distances between five single-electrode (typically 1, 3, 5, 7, and 9) stimuli in either MP or PE (σ = 0.5) mode. Subjects were asked to report how perceptually different each pair of stimuli was, using any perceived differences except loudness. Subsequently, each stimulus was presented in isolation and subjects scaled how "high" or how "clean" each sounded. Results from the MDS task suggest that perceptual differences between MP and PE stimulation can be explained by a single dimension, and the traditional scaling suggests that this dimension is place pitch. PE stimulation elicits lower pitch percepts in all cochlear regions. Analysis of cone beam computed tomography (CBCT) data suggests that PE stimulation may be more effective at the apical part of the cochlea. PE stimulation can thus be used in new sound coding strategies to extend the pitch range for cochlear implant (CI) users without perceptual side effects.
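The σ parameter above sets how much opposite-phase current the basal contact of the pair carries relative to the apical contact. A toy sketch of that amplitude bookkeeping (illustrative values only, not device code):

```python
def phantom_pair(apical_amp, sigma):
    """Amplitudes for a phantom-electrode pair: the basal contact carries a
    fraction sigma of the apical current in opposite phase, which pushes the
    effective electrical field apically (a lower place pitch). sigma = 0
    reduces to ordinary monopolar stimulation on the apical contact."""
    return apical_amp, -sigma * apical_amp

# sigma = 0.5 as in this study; the amplitude unit itself is arbitrary here.
apical, basal = phantom_pair(apical_amp=1.0, sigma=0.5)
```

Larger σ shifts the field (and the perceived place pitch) further apical, at the cost of requiring more total current for the same loudness.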
50
Bianchi F, Hjortkjær J, Santurette S, Zatorre RJ, Siebner HR, Dau T. Subcortical and cortical correlates of pitch discrimination: Evidence for two levels of neuroplasticity in musicians. Neuroimage 2017; 163:398-412. [DOI: 10.1016/j.neuroimage.2017.07.057]