1
|
Erickson ML, Faulkner K, Johnstone PM, Hedrick MS, Stone T. Multidimensional Timbre Spaces of Cochlear Implant Vocoded and Non-vocoded Synthetic Female Singing Voices. Front Neurosci 2020; 14:307. [PMID: 32372904 PMCID: PMC7179674 DOI: 10.3389/fnins.2020.00307] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/06/2019] [Accepted: 03/16/2020] [Indexed: 12/04/2022] Open
Abstract
Many post-lingually deafened cochlear implant (CI) users report that they no longer enjoy listening to music, which could possibly contribute to a perceived reduction in quality of life. One aspect of music perception, vocal timbre perception, may be difficult for CI users because they may not be able to use the same timbral cues available to normal hearing listeners. Vocal tract resonance frequencies have been shown to provide perceptual cues to voice categories such as baritone, tenor, mezzo-soprano, and soprano, while changes in glottal source spectral slope are believed to be related to perception of vocal quality dimensions such as fluty vs. brassy. As a first step toward understanding vocal timbre perception in CI users, we employed an 8-channel noise-band vocoder to test how vocoding can alter the timbral perception of female synthetic sung vowels across pitches. Non-vocoded and vocoded stimuli were synthesized with vibrato using 3 excitation source spectral slopes and 3 vocal tract transfer functions (mezzo-soprano, intermediate, soprano) at the pitches C4, B4, and F5. Six multi-dimensional scaling experiments were conducted: C4 not vocoded, C4 vocoded, B4 not vocoded, B4 vocoded, F5 not vocoded, and F5 vocoded. At the pitch C4, for both non-vocoded and vocoded conditions, dimension 1 grouped stimuli according to voice category and was most strongly predicted by spectral centroid from 0 to 2 kHz. While dimension 2 grouped stimuli according to excitation source spectral slope, it was organized slightly differently and predicted by different acoustic parameters in the non-vocoded and vocoded conditions. For pitches B4 and F5 spectral centroid from 0 to 2 kHz most strongly predicted dimension 1. However, while dimension 1 separated all 3 voice categories in the vocoded condition, dimension 1 only separated the soprano stimuli from the intermediate and mezzo-soprano stimuli in the non-vocoded condition. 
While it is unclear how these results predict timbre perception in CI listeners, in general, these results suggest that perhaps some aspects of vocal timbre may remain.
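The abstract names spectral centroid from 0 to 2 kHz as the strongest predictor of dimension 1. The following is a minimal sketch of a band-limited centroid, assuming an amplitude-weighted mean over the partials that fall inside the band. The abstract does not state the weighting or analysis settings, so the function name, the weighting, and the example values are hypothetical.

```python
def band_spectral_centroid(freqs_hz, amplitudes, lo_hz=0.0, hi_hz=2000.0):
    """Amplitude-weighted mean frequency of the partials inside [lo_hz, hi_hz].

    Assumption: linear amplitude weighting; a power or dB weighting would
    give a different value.
    """
    in_band = [(f, a) for f, a in zip(freqs_hz, amplitudes) if lo_hz <= f <= hi_hz]
    total = sum(a for _, a in in_band)
    if total <= 0:
        raise ValueError("no energy inside the requested band")
    return sum(f * a for f, a in in_band) / total


# Illustrative partials only (not data from the study): the strong 2.5 kHz
# partial is ignored because it lies outside the 0 to 2 kHz band.
example = band_spectral_centroid([500.0, 1000.0, 1500.0, 2500.0],
                                 [1.0, 1.0, 1.0, 10.0])
```

Restricting the band keeps high-frequency energy from dominating the measure, which may be why a 0 to 2 kHz window tracks voice category here.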
Affiliation(s)
- Molly L. Erickson
- Department of Audiology and Speech Pathology, University of Tennessee Health Science Center, Knoxville, TN, United States
|
2
|
Spectrographic and Electroglottographic Findings of Religious Vocal Performers in Düzce Province of Turkey. J Voice 2018; 32:127.e25-127.e35. [DOI: 10.1016/j.jvoice.2017.03.007] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Received: 08/31/2016] [Revised: 03/13/2017] [Accepted: 03/13/2017] [Indexed: 11/19/2022]
|
3
|
Echternach M, Burk F, Rose F, Herbst CT, Burdumy M, Döllinger M, Richter B. [Impact of functional mass lesions in professional female singers: Biomechanics of vocal fold oscillation in the register transition regions]. HNO 2017; 66:308-320. [PMID: 29247438 DOI: 10.1007/s00106-017-0447-4] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Indexed: 10/18/2022]
Abstract
BACKGROUND The influence of functional mass lesions on vocal fold oscillation patterns in vocally challenging tasks is not yet understood in detail. MATERIALS AND METHODS Glissandi on the vowel [a:] from 220 to 440 Hz and 440 to 880 Hz were analyzed in three groups of four professional female singers: without a mass lesion or dysphonia (group A), with a functional mass lesion (swellings without a great impact on oscillation patterns during stroboscopy; group B), and with organic dysphonia (group C). High-speed digital imaging (HSDI; 20,000 fps), and acoustic and electroglottographic (EGG) signals were used for analysis. Based on the EGG sample entropy, time windows for analysis of register transition phenomena were constructed. The voice signals (glottal area waveform, GAW; acoustic and EGG signals) were perceptually rated in terms of the noticeability of registration events. RESULTS The absolute sample entropy revealed maxima in fundamental frequency regions where register transitions typically occur. Groups A and B could be distinguished neither by perceptual rating nor based on sample entropy values. In comparison to the other two groups, the absolute sample entropy values of group C were greater in the lower glissando. However, the larger vocal fold oscillatory irregularities were observable for the upper glissando in this group. CONCLUSION Functional mass lesions do not influence biomechanics adversely in vocally challenging tasks such as register transitions. The use of sample entropy as a criterion for detection of register transitions is promising, but needs further validation.
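Sample entropy is the irregularity measure used above to locate register transitions. The following is a minimal sketch of the standard SampEn(m, r) definition, not the authors' pipeline: the embedding length `m`, the absolute tolerance `r`, and the function name are assumptions, and the abstract does not report the settings applied to the EGG signal.

```python
import math


def sample_entropy(x, m=2, r=0.2):
    """SampEn(m, r) = -ln(A / B), with self-matches excluded.

    B counts pairs of length-m templates whose Chebyshev distance is <= r;
    A counts the same for length m + 1. Both use the first len(x) - m
    starting positions so the two counts are comparable.
    """
    n = len(x)

    def count_matches(length):
        templates = [x[i:i + length] for i in range(n - m)]
        matches = 0
        for i in range(len(templates)):
            for j in range(i + 1, len(templates)):
                dist = max(abs(a - b) for a, b in zip(templates[i], templates[j]))
                if dist <= r:
                    matches += 1
        return matches

    b = count_matches(m)
    a = count_matches(m + 1)
    if a == 0 or b == 0:
        return float("inf")  # undefined when no matches are found
    return -math.log(a / b)


# A strictly periodic toy signal is fully predictable (entropy 0);
# the irregular toy signal below scores higher.
periodic = [0, 1] * 10
irregular = [0, 1, 0, 1, 0, 0, 1, 1, 0, 1]
```

In an analysis like the one described, such a function would be evaluated over short sliding windows of the signal, with maxima marking candidate transition regions; the window length is a further parameter the abstract does not give.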
Affiliation(s)
- M Echternach
- Freiburger Institut für Musikermedizin, Medizinische Fakultät, Albert-Ludwigs-Universität und Universitätsklinikum Freiburg, Breisacher Str. 60, 79106, Freiburg i.Br., Deutschland.
- F Burk
- Freiburger Institut für Musikermedizin, Medizinische Fakultät, Albert-Ludwigs-Universität und Universitätsklinikum Freiburg, Breisacher Str. 60, 79106, Freiburg i.Br., Deutschland
- F Rose
- Freiburger Institut für Musikermedizin, Medizinische Fakultät, Albert-Ludwigs-Universität und Universitätsklinikum Freiburg, Breisacher Str. 60, 79106, Freiburg i.Br., Deutschland
- C T Herbst
- Department für Musikwissenschaft, Universität Mozarteum Salzburg, Salzburg, Österreich
- M Burdumy
- Medizin Physik, Medizinische Fakultät, Albert-Ludwigs-Universität und Universitätsklinikum Freiburg, Breisacher Str. 60a, 79106, Freiburg, Deutschland
- M Döllinger
- Abteilung für Phoniatrie und Pädaudiologie an der HNO Klinik Erlangen, Universitätsklinikum Erlangen, Bohlenplatz 21, 91054, Erlangen, Deutschland
- B Richter
- Freiburger Institut für Musikermedizin, Medizinische Fakultät, Albert-Ludwigs-Universität und Universitätsklinikum Freiburg, Breisacher Str. 60, 79106, Freiburg i.Br., Deutschland
|
4
|
Echternach M, Burk F, Köberlein M, Selamtzis A, Döllinger M, Burdumy M, Richter B, Herbst CT. Laryngeal evidence for the first and second passaggio in professionally trained sopranos. PLoS One 2017; 12:e0175865. [PMID: 28467509 PMCID: PMC5414960 DOI: 10.1371/journal.pone.0175865] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.0] [Received: 11/26/2016] [Accepted: 03/31/2017] [Indexed: 11/18/2022] Open
Abstract
Introduction Due to a lack of empirical data, the current understanding of the laryngeal mechanics in the passaggio regions (i.e., the fundamental frequency ranges where vocal registration events usually occur) of the female singing voice is still limited. Material and methods In this study the first and second passaggio regions of 10 professionally trained female classical soprano singers were analyzed. The sopranos performed pitch glides from A3 (ƒo = 220 Hz) to A4 (ƒo = 440 Hz) and from A4 (ƒo = 440 Hz) to A5 (ƒo = 880 Hz) on the vowel [iː]. Vocal fold vibration was assessed with trans-nasal high speed videoendoscopy at 20,000 fps, complemented by simultaneous electroglottographic (EGG) and acoustic recordings. Register breaks were perceptually rated by 12 voice experts. Voice stability was documented with the EGG-based sample entropy. Glottal opening and closing patterns during the passaggi were analyzed, supplemented with open quotient data extracted from the glottal area waveform. Results In both the first and the second passaggio, variations of vocal fold vibration patterns were found. Four distinct patterns emerged: smooth transitions with either increasing or decreasing durations of glottal closure, abrupt register transitions, and intermediate loss of vocal fold contact. Audible register transitions (in both the first and second passaggi) generally coincided with higher sample entropy values and higher open quotient variance through the respective passaggi. Conclusions Noteworthy vocal fold oscillatory registration events occur in both the first and the second passaggio even in professional sopranos. The respective transitions are hypothesized to be caused by either (a) a change of laryngeal biomechanical properties; or by (b) vocal tract resonance effects, constituting level 2 source-filter interactions.
Affiliation(s)
- Matthias Echternach
- Institute of Musicians’ Medicine, University of Freiburg Medical Center, Faculty of Medicine, University of Freiburg, Freiburg, Germany
- Fabian Burk
- Institute of Musicians’ Medicine, University of Freiburg Medical Center, Faculty of Medicine, University of Freiburg, Freiburg, Germany
- Marie Köberlein
- Institute of Musicians’ Medicine, University of Freiburg Medical Center, Faculty of Medicine, University of Freiburg, Freiburg, Germany
- Andreas Selamtzis
- Royal Technical University, Music Acoustics. Lindstedtsvägen 24, Stockholm, Sweden
- Michael Döllinger
- Division of Phoniatrics and Pediatric Audiology at the Department of Otorhinolaryngology Head & Neck Surgery, University Hospital Erlangen, Medical School, Waldstrasse 1, Erlangen, Germany
- Michael Burdumy
- Department of Medical Physics, University of Freiburg Medical Center, Faculty of Medicine, University of Freiburg, Freiburg, Germany
- Bernhard Richter
- Institute of Musicians’ Medicine, University of Freiburg Medical Center, Faculty of Medicine, University of Freiburg, Freiburg, Germany
- Christian Thomas Herbst
- Laboratory of Bio-Acoustics, Department of Cognitive Biology, University of Vienna, Althanstraße 14, Vienna, Austria
|
5
|
Echternach M, Sundberg J, Baumann T, Markl M, Richter B. Vocal tract area functions and formant frequencies in opera tenors' modal and falsetto registers. The Journal of the Acoustical Society of America 2011; 129:3955-63. [PMID: 21682417 DOI: 10.1121/1.3589249] [Citation(s) in RCA: 22] [Impact Index Per Article: 1.7] [Indexed: 05/24/2023]
Abstract
According to recent model investigations, vocal tract resonance is relevant to vocal registers. However, no experimental corroboration of this claim has been published so far. In the present investigation, ten professional tenors' vocal tract configurations were analyzed using MRI volumetry. All subjects produced a sustained tone on the pitch F4 (349 Hz) on the vowel /a/ (1) in modal and (2) in falsetto register. The area functions were estimated from the MRI data and their associated formant frequencies were calculated. In a second condition the same subjects repeated the same tasks in a sound treated room and their formant frequencies were estimated by means of inverse filtering. In both recordings similar formant frequencies were observed. Vocal tract shapes differed between modal and falsetto register. In modal as compared to falsetto the lip opening and the oral cavity were wider and the first formant frequency was higher. In this sense the presented results are in agreement with the claim that the formant frequencies differ between registers.
Affiliation(s)
- Matthias Echternach
- Institute of Musicians' Medicine, Freiburg University Medical Center, Breisacher Strasse 60, 79106 Freiburg, Germany
|
6
|
Salomão GL, Sundberg J. What do male singers mean by modal and falsetto register? An investigation of the glottal voice source. Logoped Phoniatr Vocol 2009; 34:73-83. [PMID: 19363740 DOI: 10.1080/14015430902879918] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.7] [Indexed: 10/20/2022]
Abstract
The voice source differs between modal and falsetto registers, but singers often try to reduce the associated timbral differences, some even doubting that there are any. A total of 54 vowel sounds sung in falsetto and modal register by 13 male more or less experienced choir singers were analyzed by inverse filtering and electroglottography. Closed quotient, maximum flow declination rate, peak-to-peak airflow amplitude, normalized amplitude quotient, and level difference between the two lowest source spectrum partials were determined, and systematic differences were found in all singers, regardless of experience of singing. The observations seem compatible with previous observations of thicker vocal folds in modal register.
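One of the measures listed above, the level difference between the two lowest source spectrum partials, is a simple decibel ratio. The following is a minimal sketch, assuming linear amplitudes of the first and second partial have already been read from the inverse-filtered source spectrum; the function name and the example values are hypothetical.

```python
import math


def partial_level_difference_db(h1_amplitude, h2_amplitude):
    """Level of the first partial relative to the second, in dB.

    Assumption: inputs are linear amplitudes (not power), hence 20 * log10.
    """
    if h1_amplitude <= 0 or h2_amplitude <= 0:
        raise ValueError("amplitudes must be positive")
    return 20.0 * math.log10(h1_amplitude / h2_amplitude)


# Illustrative values only: a fundamental twice as strong as the second
# partial gives a difference of about 6 dB.
difference = partial_level_difference_db(2.0, 1.0)
```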
Affiliation(s)
- Gláucia Laís Salomão
- Pontifical Catholic University, Applied Linguistic and Language Studies Program, Sao Paulo, SP, Brazil.
|
7
|
Lehto L, Airas M, Björkner E, Sundberg J, Alku P. Comparison of Two Inverse Filtering Methods in Parameterization of the Glottal Closing Phase Characteristics in Different Phonation Types. J Voice 2007; 21:138-50. [PMID: 16478660 DOI: 10.1016/j.jvoice.2005.10.007] [Citation(s) in RCA: 13] [Impact Index Per Article: 0.8] [Accepted: 10/01/2005] [Indexed: 11/27/2022]
Abstract
SUMMARY Inverse filtering (IF) is a common method used to estimate the source of voiced speech, the glottal flow. This investigation aims to compare two IF methods: one manual and the other semiautomatic. Glottal flows were estimated from speech pressure waveforms of six female and seven male subjects producing sustained vowel /a/ in breathy, normal, and pressed phonation. The closing phase characteristics of the glottal pulse were parameterized using two time-based parameters: the closing quotient (ClQ) and the normalized amplitude quotient (NAQ). The information given by these two parameters indicates a strong correlation between the two IF methods. The results are encouraging in showing that the parameterization of the voice source in different speech sounds can be performed independently of the technique used for inverse filtering.
Affiliation(s)
- Laura Lehto
- Laboratory of Acoustics and Audio Signal Processing, Helsinki University of Technology, Helsinki, Finland.
|
8
|
Björkner E, Sundberg J, Cleveland T, Stone E. Voice Source Differences Between Registers in Female Musical Theater Singers. J Voice 2006; 20:187-97. [PMID: 16051463 DOI: 10.1016/j.jvoice.2005.01.008] [Citation(s) in RCA: 20] [Impact Index Per Article: 1.1] [Accepted: 01/25/2005] [Indexed: 10/25/2022]
Abstract
Musical theater singing typically requires women to use two vocal registers. Our investigation considered voice source and subglottal pressure P(s) characteristics of the speech pressure signal recorded for a sequence of /pae/ syllables sung at constant pitch and decreasing vocal loudness in each register by seven female musical theater singers. Ten equally spaced P(s) values were selected, and the relationships between P(s) and several parameters were examined: closed quotient (Q(closed)), peak-to-peak pulse amplitude (U(p-t-p)), amplitude of the negative peak of the differentiated flow glottogram, i.e., the maximum flow declination rate (MFDR), and the normalized amplitude quotient (NAQ) [U(p-t-p)/(T0*MFDR)], where T0 is the fundamental period. P(s) was typically slightly higher in chest than in head register. As P(s) influences the measured glottogram parameters, these were also compared at an approximately identical P(s) of 11 cm H2O. Results showed that for typical tokens, MFDR and Q(closed) were significantly greater, whereas U(p-t-p) and therefore NAQ were significantly lower in chest than in head.
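The NAQ definition given above, U(p-t-p)/(T0*MFDR), can be sketched directly. This is an illustration under stated assumptions, not the authors' procedure: it assumes a single period of an already inverse-filtered flow signal, estimates MFDR with a first difference, and uses hypothetical function names and toy values.

```python
def naq(u_ptp, mfdr, t0):
    """Normalized amplitude quotient: U_ptp / (T0 * MFDR).

    u_ptp: peak-to-peak flow amplitude; mfdr: maximum flow declination rate
    (a positive magnitude, flow units per second); t0: fundamental period (s).
    """
    if mfdr <= 0 or t0 <= 0:
        raise ValueError("mfdr and t0 must be positive")
    return u_ptp / (t0 * mfdr)


def naq_from_flow(flow, fs, f0):
    """Estimate NAQ from one period of a glottal flow waveform.

    Assumption: `flow` is sampled at `fs` Hz and the derivative is
    approximated by a first difference, which is cruder than what a
    real analysis would use.
    """
    u_ptp = max(flow) - min(flow)
    derivative = [(b - a) * fs for a, b in zip(flow, flow[1:])]
    mfdr = -min(derivative)  # magnitude of the steepest negative slope
    return naq(u_ptp, mfdr, 1.0 / f0)


# Toy pulse (not study data): slow opening, abrupt closing, closed phase.
toy_flow = [0.0, 0.1, 0.2, 0.3, 0.0, 0.0, 0.0, 0.0]
toy_naq = naq_from_flow(toy_flow, fs=1000.0, f0=125.0)
```

Because T0 and MFDR sit in the denominator together, NAQ is dimensionless, and a sharper closing phase (larger MFDR) lowers it. That is consistent with the reported result that NAQ was lower in chest than in head register.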
Affiliation(s)
- Eva Björkner
- Department of Speech Music Hearing, KTH, Stockholm, Sweden.
|
9
|
Thalén M, Sundberg J. Describing different styles of singing: a comparison of a female singer's voice source in "Classical", "Pop", "Jazz" and "Blues". Logoped Phoniatr Vocol 2002; 26:82-93. [PMID: 11769346 DOI: 10.1080/140154301753207458] [Citation(s) in RCA: 33] [Impact Index Per Article: 1.5] [Indexed: 10/17/2022]
Abstract
The voice is apparently used in quite different manners in different styles of singing. Some of these differences concern the voice source, which varies considerably with loudness, pitch, and mode of phonation. We attempt to describe voice source differences between Classical, Pop, Jazz and Blues styles of singing as produced in a triad melody pattern by a professional female singer in soft, middle and loud phonation. An expert panel was asked to identify these triads as examples of either Classical, Pop, Jazz or Blues. The voice source was analysed by inverse filtering. Subglottal pressure Ps, closed quotient QClosed, glottal compliance (ratio between the air volume contained in a voice pulse and Ps), and the level difference between the two lowest source spectrum partials were analysed in the styles and in four modes of phonation: breathy, flow, neutral, and pressed. The same expert panel rated the degree of pressedness in the entire material. Averages across pitch were calculated for each mode and style and related to their total range of variation in the subject. The glottogram data showed a high correlation with the ratings of pressedness. Based on these correlations a pressedness factor was computed from the glottogram data. A phonation map was constructed with the axes representing mean adduction factor and mean Ps, respectively. In this map Classical was similar to flow phonation, Pop and Jazz to neutral and flow phonation, and Blues to pressed phonation.
Affiliation(s)
- M Thalén
- SMI (University College of Music Education in Stockholm), Sweden
|