Reference Citation Analysis

For: Pabon P, Ternström S. Feature Maps of the Acoustic Spectrum of the Voice. J Voice 2020;34:161.e1-161.e26. [PMID: 30269894 DOI: 10.1016/j.jvoice.2018.08.014] [Citation(s) in RCA: 12] [Impact Index Per Article: 2.0] [Received: 05/08/2018] [Revised: 08/21/2018] [Accepted: 08/22/2018] [Indexed: 11/20/2022]


Cited by Other Article(s)

Iob NA, He L, Ternström S, Cai H, Brockmann-Bauser M. Effects of Speech Characteristics on Electroglottographic and Instrumental Acoustic Voice Analysis Metrics in Women With Structural Dysphonia Before and After Treatment. J Speech Lang Hear Res 2024;67:1660-1681. [PMID: 38758676 DOI: 10.1044/2024_jslhr-23-00253] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/19/2024]

Abstract

PURPOSE

Literature suggests a dependency of the acoustic metrics, smoothed cepstral peak prominence (CPPS) and harmonics-to-noise ratio (HNR), on human voice loudness and fundamental frequency (F0). Even though this has been explained with different oscillatory patterns of the vocal folds, so far, it has not been specifically investigated. In the present work, the influence of three elicitation levels, calibrated sound pressure level (SPL), F0 and vowel on the electroglottographic (EGG) and time-differentiated EGG (dEGG) metrics hybrid open quotient (OQ), dEGG OQ and peak dEGG, as well as on the acoustic metrics CPPS and HNR, was examined, and their suitability for voice assessment was evaluated.

METHOD

In a retrospective study, 29 women with a mean age of 25 years (± 8.9, range: 18-53) diagnosed with structural vocal fold pathologies were examined before and after voice therapy or phonosurgery. Both acoustic and EGG signals were recorded simultaneously during the phonation of the sustained vowels /ɑ/, /i/, and /u/ at three elicited levels of loudness (soft/comfortable/loud) and unconstrained F0 conditions.

RESULTS

A linear mixed-model analysis showed a significant effect of elicitation effort levels on peak dEGG, HNR, and CPPS (all p < .01). Calibrated SPL significantly influenced HNR and CPPS (both p < .01). Furthermore, F0 had a significant effect on peak dEGG and CPPS (p < .0001). All metrics showed significant changes with regard to vowel (all p < .05). However, the treatment had no effect on the examined metrics, regardless of the treatment type (surgery vs. voice therapy).

CONCLUSIONS

The value of the investigated metrics for voice assessment purposes when sampled without sufficient control of SPL and F0 is limited, in that they are significantly influenced by the phonatory context, be it speech or elicited sustained vowels. Future studies should explore the diagnostic value of new data collation approaches such as voice mapping, which take SPL and F0 effects into account.


Cai H, Ternström S, Chaffanjon P, Henrich Bernardoni N. Effects on Voice Quality of Thyroidectomy: A Qualitative and Quantitative Study Using Voice Maps. J Voice 2024:S0892-1997(24)00082-1. [PMID: 38714436 DOI: 10.1016/j.jvoice.2024.03.012] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/29/2023] [Revised: 03/11/2024] [Accepted: 03/12/2024] [Indexed: 05/09/2024]

Luo J, Wu Y, Liu M, Li Z, Wang Z, Zheng Y, Feng L, Lu J, He F. Differentiation between depression and bipolar disorder in child and adolescents by voice features. Child Adolesc Psychiatry Ment Health 2024;18:19. [PMID: 38287442 PMCID: PMC10826007 DOI: 10.1186/s13034-024-00708-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/30/2023] [Accepted: 01/11/2024] [Indexed: 01/31/2024]

Abstract

OBJECTIVE

Major depressive disorder (MDD) and bipolar disorder (BD) are serious chronic disabling mental and emotional disorders, with symptoms that often manifest atypically in children and adolescents, making diagnosis difficult without objective physiological indicators. Therefore, we aimed to objectively identify MDD and BD in children and adolescents by exploring their voiceprint features.

METHODS

This study included a total of 150 participants aged between 6 and 16 years: 50 MDD patients, 50 BD patients, and 50 healthy controls. After collecting voiceprint data, a chi-square test was used to screen and extract voiceprint features specific to emotional disorders in children and adolescents. The selected characteristic voiceprint features were then used to establish training and testing datasets at a ratio of 7:3. The performances of various machine learning and deep learning algorithms were compared using the training dataset, and the optimal algorithm was selected to classify the testing dataset and calculate the sensitivity, specificity, accuracy, and ROC curve.

RESULTS

The three groups showed differences in clustering centers for various voice features such as root mean square energy, power spectral slope, low-frequency percentile energy level, high-frequency spectral slope, spectral harmonic gain, and audio signal energy level. The model of linear SVM showed the best performance in the training dataset, achieving a total accuracy of 95.6% in classifying the three groups in the testing dataset, with sensitivity of 93.3% for MDD, 100% for BD, specificity of 93.3%, AUC of 1 for BD, and AUC of 0.967 for MDD.
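As a rough plausibility check on these figures, the sketch below works out which whole-number counts would reproduce the reported percentages. It assumes the 30% testing split holds 45 recordings spread evenly over the three groups (15 each); the abstract does not state the per-class counts, so the numerators 14 and 43 are illustrative values chosen only because they round to the reported percentages.

```python
# Consistency check only: per-class test counts are NOT given in the abstract.
n_total = 150                      # participants reported in the abstract
n_test = round(n_total * 3 / 10)   # 7:3 split -> 45 test samples
per_class = n_test // 3            # ASSUMPTION: balanced test set, 15 per group


def pct(numerator, denominator):
    """Percentage rounded to one decimal place, as reported in the abstract."""
    return round(100 * numerator / denominator, 1)


# Hypothetical counts that would be consistent with the reported values
mdd_sensitivity = pct(14, per_class)         # 14 of 15 MDD cases detected
bd_sensitivity = pct(per_class, per_class)   # all 15 BD cases detected
overall_accuracy = pct(43, n_test)           # 43 of 45 classified correctly

print(mdd_sensitivity, bd_sensitivity, overall_accuracy)
```

Under these assumptions, a single misclassified MDD sample and two errors overall would match the reported 93.3% and 95.6%, which illustrates how few test samples sit behind each percentage point.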

CONCLUSION

By exploring the characteristics of voice features in children and adolescents, machine learning can effectively differentiate between MDD and BD in a population, and voice features hold promise as an objective physiological indicator for the auxiliary diagnosis of mood disorder in clinical practice.


Affiliation(s)

Jie Luo, Yuanzhen Wu, Mengqi Liu, Yi Zheng, Fan He: National Clinical Research Center for Mental Disorders, Beijing Key Laboratory of Mental Disorders, Beijing Anding Hospital, Beijing Institute for Brain Disorders Capital Medical University, De Sheng Men Wai An Kang Hu Tong 5 Hao, Xi Cheng Qu, Beijing, 100088, People's Republic of China
Zhaojun Li, Zhuo Wang, Jihua Lu: Beijing Institute of Technology, School of Integrated Circuits and Electronics, Zhongguancun South Street 5 Hao, Hai Dian Qu, Beijing, 100081, China
Lihui Feng: Beijing Institute of Technology, School of Optics and Photonics, Zhongguancun South Street 5 Hao, Hai Dian Qu, Beijing, 100081, China


Herbst CT, Story BH, Meyer D. Acoustical Theory of Vowel Modification Strategies in Belting. J Voice 2023:S0892-1997(23)00004-8. [PMID: 37080890 DOI: 10.1016/j.jvoice.2023.01.004] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/08/2022] [Revised: 01/03/2023] [Accepted: 01/04/2023] [Indexed: 04/22/2023]

Abstract

Various authors have argued that belting is to be produced by "speech-like" sounds, with the first and second supraglottic vocal tract resonances (f_R1 and f_R2) at frequencies of the vowels determined by the lyrics to be sung. Acoustically, the hallmark of belting has been identified as a dominant second harmonic, possibly enhanced by first resonance tuning (f_R1≈2f_o). It is not clear how both these concepts - (a) phonating with "speech-like," unmodified vowels; and (b) producing a belting sound with a dominant second harmonic, typically enhanced by f_R1 - can be upheld when singing across a singer's entire musical pitch range. For instance, anecdotal reports from pedagogues suggest that vowels with a low f_R1, such as [i] or [u], might have to be modified considerably (by raising f_R1) in order to phonate at higher pitches. These issues were systematically addressed in silico with respect to treble singing, using a linear source-filter voice production model. The dominant harmonic of the radiated spectrum was assessed in 12987 simulations, covering a parameter space of 37 fundamental frequencies (f_o) across the musical pitch range from C3 to C6; 27 voice source spectral slope settings from -4 to -30 dB/octave; computed for 13 different IPA vowels. The results suggest that, for most unmodified vowels, the stereotypical belting sound characteristics with a dominant second harmonic can only be produced over a pitch range of about a musical fifth, centered at f_o≈0.5f_R1. In the [ɔ] and [ɑ] vowels, that range is extended to an octave, supported by a low second resonance. Data aggregation - considering the relative prevalence of vowels in American English - suggests that, historically, belting with f_R1≈2f_o was derived from speech, and that songs with an extended musical pitch range likely demand considerable vowel modification. We thus argue that - on acoustical grounds - the pedagogical commandment for belting with unmodified, "speech-like" vowels cannot always be fulfilled.
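The simulation count quoted in the abstract follows directly from the size of the parameter grid (37 × 27 × 13 = 12987). The sketch below enumerates such a grid under stated assumptions: semitone steps in equal temperament with A4 = 440 Hz for the pitches, 1 dB/octave steps for the spectral slope, and placeholder vowel labels, since the abstract names neither the step sizes nor the 13 vowels.

```python
from itertools import product

# ASSUMPTION: equal-tempered semitone steps, A4 = 440 Hz, MIDI 48 = C3, 84 = C6
midi_notes = range(48, 85)
f_o = [440.0 * 2 ** ((m - 69) / 12) for m in midi_notes]   # 37 pitches

# ASSUMPTION: slope settings in 1 dB/octave steps from -4 down to -30
slopes = list(range(-4, -31, -1))                           # 27 settings

# Placeholder labels; the abstract does not list the 13 IPA vowels
vowels = [f"vowel_{i}" for i in range(13)]

# Every combination of pitch, source slope, and vowel is one simulation
grid = list(product(f_o, slopes, vowels))
print(len(f_o), len(slopes), len(vowels), len(grid))
```

With these step sizes the grid size agrees with the 12987 simulations reported, which suggests (but does not confirm) that the study sampled pitch in semitones and slope in 1 dB/octave increments.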


Kankare E, Rantala L, Laukkanen AM. Vocal Fatigue Index in Finnish-Speaking Population. J Voice 2023:S0892-1997(23)00092-9. [PMID: 37003862 DOI: 10.1016/j.jvoice.2023.02.034] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/02/2023] [Revised: 02/24/2023] [Accepted: 02/24/2023] [Indexed: 04/03/2023]

Abstract

BACKGROUND AND OBJECTIVE

Vocal fatigue is an important complaint that may indicate a voice disorder or a risk thereof. There is a need for a reliable tool to detect and quantify vocal fatigue and distinguish dysphonic and vocally healthy speakers. The Vocal Fatigue Index (VFI) questionnaire has been found valid and reliable among speakers of different languages. This study aims to validate it for speakers of Finnish.

STUDY DESIGN

Experimental comparative study.

METHODS

The VFI questionnaire was translated from English to Finnish according to the WHO recommendations. Next, it was subjected to the validation procedure. In total, 160 Finnish speakers volunteered to participate in the study. One hundred and eight were voice patients (83 males, 25 females) and 52 were vocally healthy controls (37 females, 15 males). As a comparison, the Voice Handicap Index (VHI) questionnaire was completed and voice samples were recorded to enable Acoustic Voice Quality Index (AVQI03.01_FIN) analysis.

RESULTS

Results from the first and second completions of the VFI(F) questionnaire correlated strongly (Spearman's rho 0.901, P = 0.01). Answers to the individual questions of the VFI(F) also correlated strongly, showing high internal consistency. Factor 1 (Tiredness of voice and avoidance of voice use) of the VFI correlated strongly with the VHI, and the two other factors (Physical discomfort associated with voicing and Improvement of symptoms) correlated moderately with the VHI. Factor 1 of the VFI(F) correlated moderately with AVQI03.01_FIN and its sub-parameters, CPPS, HNR, and shimmer. The VFI(F) showed good construct validity, differentiating voice patients and controls at cut-off 13.5, with sensitivity of 0.963 and specificity of 0.885. Discriminatory power was strong for all factors: F1 A_ROC = 0.985, F2 A_ROC = 0.864, and F3 A_ROC = 0.821.

CONCLUSION

The VFI(F) correlates with the VHI and with AVQI03.01_FIN, and it is a valid and reliable tool for detecting vocal fatigue in Finnish speakers.


Echternach M, Nusseck M, Strasding M, Richter B. Differences of Electroglottographical Contact Quotients between Connected Speech and Sustained Phonation in Clinical Measurement of Voice. J Voice 2023:S0892-1997(23)00077-2. [PMID: 36941166 DOI: 10.1016/j.jvoice.2023.02.020] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/11/2023] [Revised: 02/15/2023] [Accepted: 02/15/2023] [Indexed: 03/23/2023]

Barsties V Latoszek B, Mathmann P, Neumann K. The cepstral spectral index of dysphonia, the acoustic voice quality index and the acoustic breathiness index as novel multiparametric indices for acoustic assessment of voice quality. Curr Opin Otolaryngol Head Neck Surg 2021;29:451-457. [PMID: 34334615 DOI: 10.1097/moo.0000000000000743] [Citation(s) in RCA: 11] [Impact Index Per Article: 3.7] [Indexed: 11/25/2022]

Patel RR, Ternström S. Quantitative and Qualitative Electroglottographic Wave Shape Differences in Children and Adults Using Voice Map-Based Analysis. J Speech Lang Hear Res 2021;64:2977-2995. [PMID: 34319772 DOI: 10.1044/2021_jslhr-20-00717] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Indexed: 05/27/2023]

Abstract

PURPOSE

The purpose of this study is to identify the extent to which various measurements of contacting parameters differ between children and adults during habitual range and overlap vocal frequency/intensity, using voice map-based assessment of noninvasive electroglottography (EGG).

METHOD

EGG voice maps were analyzed from 26 adults (22-45 years) and 22 children (4-8 years) during connected speech and vowel /a/ over the habitual range and the overlap vocal frequency/intensity from the voice range profile task on the vowel /a/. Mean and standard deviations of contact quotient by integration, normalized contacting speed, quotient of speed by integration, and cycle-rate sample entropy were obtained. Group differences were evaluated using the linear mixed model analysis for the habitual range connected speech and the vowel, whereas analysis of covariance was conducted for the overlap vocal frequency/intensity from the voice range profile task. Presence of a "knee" on the EGG wave shape was determined by visual inspection of the presence of convexity along the decontacting slope of the EGG pulse and the presence of the second derivative zero-crossing.

RESULTS

The contact quotient by integration, normalized contacting speed, quotient of speed by integration, and cycle-rate sample entropy were significantly different in children compared to (a) adult males for habitual range and (b) adult males and adult females for the overlap vocal frequency/intensity. None of the children had a "knee" on the decontacting slope of the EGG pulse.

CONCLUSION

EGG parameters of contact quotient by integration, normalized contacting speed, quotient of speed by integration, cycle-rate sample entropy, and absence of a "knee" on the decontacting slope characterize the wave shape differences between children and adults, whereas the normalized contacting speed, quotient of speed by integration, cycle-rate sample entropy, and presence of a "knee" on the downward pulse slope characterize the wave shape differences between adult males and adult females.

Supplemental Material: https://doi.org/10.23641/asha.15057345


Titze IR, Palaparthi A, Cox K, Stark A, Maxfield L, Manternach B. Vocalization with semi-occluded airways is favorable for optimizing sound production. PLoS Comput Biol 2021;17:e1008744. [PMID: 33780433 PMCID: PMC8031921 DOI: 10.1371/journal.pcbi.1008744] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Received: 11/02/2020] [Revised: 04/08/2021] [Accepted: 01/26/2021] [Indexed: 01/25/2023]

Titze IR, Palaparthi A. Vocal Loudness Variation With Spectral Slope. J Speech Lang Hear Res 2020;63:74-82. [PMID: 31940253 PMCID: PMC7213475 DOI: 10.1044/2019_jslhr-19-00018] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Received: 06/19/2019] [Revised: 09/18/2019] [Accepted: 10/08/2019] [Indexed: 06/10/2023]

Glottal Source Contribution to Higher Order Modes in the Finite Element Synthesis of Vowels. Appl Sci (Basel) 2019. [DOI: 10.3390/app9214535] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.6] [Indexed: 11/17/2022]

Ternström S, D'Amario S, Selamtzis A. Effects of the Lung Volume on the Electroglottographic Waveform in Trained Female Singers. J Voice 2018;34:485.e1-485.e21. [PMID: 30337119 DOI: 10.1016/j.jvoice.2018.09.006] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.5] [Received: 04/26/2018] [Revised: 09/04/2018] [Accepted: 09/06/2018] [Indexed: 11/25/2022]

Abstract

OBJECTIVES

To determine if in singing there is an effect of lung volume on the electroglottographic waveform, and if so, how it varies over the voice range.

STUDY DESIGN

Eight trained female singers sang the tune "Frère Jacques" in 18 conditions: three phonetic contexts, three dynamic levels, and high or low lung volume. Conditions were randomized and replicated.

METHODS

The audio and EGG signals were recorded in synchrony with signals tracking respiration and vertical larynx position. The first 10 Fourier descriptors of every EGG cycle were computed. These spectral data were clustered statistically, and the clusters were mapped by color into a voice range profile display, thus visualizing the EGG waveform changes under the influence of f_o and SPL. The rank correlations and effect sizes of the relationships between relative lung volume and several adduction-related EGG wave shape metrics were similarly rendered on a color scale, in voice range profile-style 'voice maps.'
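To illustrate what computing Fourier descriptors of a single cycle can look like, here is a minimal sketch using a plain discrete Fourier transform over one period. It is not the authors' implementation: the function name `fourier_descriptors`, the choice to return raw magnitude and phase per harmonic, and the synthetic test cycle are all assumptions, and the abstract does not say how the descriptors were normalized before clustering.

```python
import cmath
import math


def fourier_descriptors(cycle, n_harmonics=10):
    """Return (magnitude, phase) for harmonics 1..n_harmonics of one cycle.

    `cycle` is a list of samples spanning exactly one period (hypothetical
    input; real EGG data would first need cycle segmentation).
    """
    n = len(cycle)
    descriptors = []
    for k in range(1, n_harmonics + 1):
        # DFT coefficient of the k-th harmonic, scaled by the cycle length
        coeff = sum(
            x * cmath.exp(-2j * math.pi * k * i / n) for i, x in enumerate(cycle)
        ) / n
        descriptors.append((abs(coeff), cmath.phase(coeff)))
    return descriptors


# Synthetic stand-in for one cycle: a fundamental plus a weaker second harmonic
n_samples = 64
cycle = [
    math.sin(2 * math.pi * i / n_samples) + 0.5 * math.sin(4 * math.pi * i / n_samples)
    for i in range(n_samples)
]
descriptors = fourier_descriptors(cycle)
magnitudes = [m for m, _ in descriptors]
print([round(m, 3) for m in magnitudes])
```

For this synthetic cycle only the first two harmonics carry energy; a measured EGG cycle would spread energy across more harmonics, and it is that per-cycle pattern that the study clusters and maps over f_o and SPL.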

RESULTS

In most subjects, EGG waveforms varied considerably over the voice range. Within subjects, reproducibility was high, not only across the replications, but also across the phonetic contexts. The EGG waveforms were quite individual, as was the nature of the EGG shape variation across the range. EGG metrics were significantly correlated to changes in lung volume, in parts of the range of the song, and in most subjects. However, the effect sizes of the relative lung volume were generally much smaller than the effects of f_o and SPL, and the relationships always varied, even changing polarity from one part of the range to another.

CONCLUSIONS

Most subjects exhibited small, reproducible effects of the relative lung volume on the EGG waveform. Some hypothesized influences of tracheal pull were seen, mostly at the lowest SPLs. The effects were however highly variable, both across the moderately wide f_o-SPL range and across subjects. Different singers may be applying different techniques and compensatory behaviors with changing lung volume. The outcomes emphasize the importance of making observations over a substantial part of the voice range, and not only of phonations sustained at a few fundamental frequencies and sound levels.
