1
Han JY, Hsiao CJ, Zheng WZ, Weng KC, Ho GM, Chang CY, Wang CT, Fang SH, Lai YH. Enhancing the Performance of Pathological Voice Quality Assessment System Through the Attention-Mechanism Based Neural Network. J Voice 2023:S0892-1997(22)00426-X. [PMID: 36732109 DOI: 10.1016/j.jvoice.2022.12.026] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/31/2022] [Revised: 12/28/2022] [Accepted: 12/28/2022] [Indexed: 02/04/2023]
Abstract
OBJECTIVE Clinicians currently rely primarily on auditory-perceptual evaluation, such as the grade, roughness, breathiness, asthenia, and strain (GRBAS) scale, to evaluate voice quality and determine treatment. However, the ratings given by individual physicians often differ because of subjective perception and the diagnosis time interval, particularly when the patient's symptoms are hard to judge. An accurate computerized pathological voice quality assessment system would therefore improve the quality of assessment. METHOD This study proposes a self-attention-based deep learning system, named self-attention-based bidirectional long short-term memory (SA BiLSTM). Different pitches (low, normal, high) and vowels (/a/, /i/, /u/) were fed into the proposed model so that it could learn, in a high-dimensional view, how professional doctors rate the GRBAS scale. RESULTS The experimental results showed that the proposed system outperformed the baseline systems. Specifically, the macro-averaged F1 score, expressed as a decimal, was used to compare classification accuracy. The scores of the proposed system for G, R, and B were 0.768±0.011, 0.820±0.009, and 0.815±0.009, respectively, higher than those of the baseline systems: deep neural network (0.395±0.010, 0.312±0.019, 0.321±0.014) and convolutional neural network (0.421±0.052, 0.306±0.043, 0.325±0.032). CONCLUSIONS The proposed system, with SA BiLSTM, pitches, and vowels, provides a more accurate way to evaluate the voice. This will be helpful for clinical voice evaluations and will improve patients' benefits from voice therapy.
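The macro-averaged F1 score used as the comparison metric above can be illustrated with a short sketch. This is not the authors' evaluation code: the function name `macro_f1` and the toy labels are assumptions made for illustration only. The per-class F1 is computed as 2TP / (2TP + FP + FN), and the macro average is the unweighted mean over classes.

```python
def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 scores (hypothetical helper)."""
    labels = sorted(set(y_true) | set(y_pred))
    f1_scores = []
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        denom = 2 * tp + fp + fn
        # A class that is never present and never predicted contributes 0.
        f1_scores.append(2 * tp / denom if denom else 0.0)
    return sum(f1_scores) / len(f1_scores)


if __name__ == "__main__":
    # Toy severity grades (0-2) for six recordings; invented, not study data.
    true_grades = [0, 0, 1, 1, 2, 2]
    predicted_grades = [0, 1, 1, 1, 2, 0]
    print(macro_f1(true_grades, predicted_grades))
```

Because every class carries the same weight, the macro average penalizes a model that performs poorly on rare severity grades, which is one common reason for preferring it over plain accuracy on imbalanced rating data.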
Affiliation(s)
- Ji-Yan Han
- National Yang Ming Chiao Tung University, Department of Biomedical Engineering, Taipei, Taiwan
- Ching-Ju Hsiao
- National Yang Ming Chiao Tung University, Department of Biomedical Engineering, Taipei, Taiwan
- Wei-Zhong Zheng
- National Yang Ming Chiao Tung University, Department of Biomedical Engineering, Taipei, Taiwan
- Chi-Te Wang
- Far Eastern Memorial Hospital, Department of Otolaryngology Head and Neck Surgery, Taipei, Taiwan
- Shih-Hau Fang
- Yuan Ze University, Department of Electric Engineering, Taoyuan, Taiwan
- Ying-Hui Lai
- National Yang Ming Chiao Tung University, Department of Biomedical Engineering, Taipei, Taiwan; Medical Device Innovation & Translation Center, National Yang Ming Chiao Tung University, Taipei, Taiwan
2
Englert M, Barsties V Latoszek B, Maryn Y, Behlau M. Validation of the acoustic breathiness index to the Brazilian Portuguese language. LOGOP PHONIATR VOCO 2021; 47:56-62. [PMID: 33404289 DOI: 10.1080/14015439.2020.1864467] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Indexed: 10/22/2022]
Abstract
OBJECTIVE To assess the concurrent validity and the diagnostic accuracy of the Acoustic Breathiness Index (ABI) in Brazilian Portuguese. METHODS The counting numbers 1-20 and the vowel /a/ of 150 subjects were recorded (37 vocally healthy and 113 with dysphonia). The analyzed samples were the counting numbers 1-11 and 3 s of the sustained vowel. Nine voice specialists performed the perceptual judgment of the degree of breathiness. The Spearman correlation and the receiver operating characteristic (ROC) curve were used to assess the ABI's concurrent validity and diagnostic accuracy. RESULTS Results from five listeners were chosen for the study analyses because of their moderate to substantial intra-rater reliability (Cohen's kappa = 0.520-0.772) and moderate inter-rater reliability (Fleiss' kappa = 0.353). The ABI presented high concurrent validity (r = 0.746); 55.6% of the breathiness vocal deviation can be explained by the acoustic analysis (r² = 0.556). The ROC curve presented good diagnostic accuracy (85.2%). At a threshold of 2.94, the sensitivity was 75.3% and the specificity was 93.4%. CONCLUSION The ABI is a valid tool for screening and patient follow-up regarding breathy vocal qualities in the Brazilian Portuguese language.
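The two diagnostic quantities reported above, sensitivity and specificity at a fixed cut-off, can be sketched as follows. The scores and labels are invented for illustration and are not the study data; the helper name `sensitivity_specificity` and the convention that a score at or above the threshold counts as a positive (breathy) classification are assumptions. The last lines show that the explained variance is simply the square of the reported correlation coefficient.

```python
def sensitivity_specificity(scores, has_condition, threshold):
    """Return (sensitivity, specificity) for a 'score >= threshold' rule.

    Assumption: higher scores indicate more deviation, so a score at or
    above the threshold is classified as positive.
    """
    pairs = list(zip(scores, has_condition))
    tp = sum(1 for s, c in pairs if c and s >= threshold)
    fn = sum(1 for s, c in pairs if c and s < threshold)
    tn = sum(1 for s, c in pairs if not c and s < threshold)
    fp = sum(1 for s, c in pairs if not c and s >= threshold)
    return tp / (tp + fn), tn / (tn + fp)


if __name__ == "__main__":
    # Hypothetical index scores and perceptual labels (True = breathy).
    scores = [1.2, 2.0, 3.1, 2.5, 3.5, 4.0, 2.95, 1.8]
    breathy = [False, False, False, True, True, True, True, False]
    print(sensitivity_specificity(scores, breathy, threshold=2.94))

    # Explained variance from the reported correlation coefficient.
    r = 0.746
    print(r ** 2)  # about 0.556, up to rounding of r
```

Sweeping `threshold` over all observed scores and plotting sensitivity against 1 - specificity yields the ROC curve; the abstract reports only the single operating point at 2.94.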
Affiliation(s)
- Marina Englert
- Unifesp - "Universidade Federal de São Paulo" and CEV - "Centro de Estudos da Voz", São Paulo, Brazil
- Ben Barsties V Latoszek
- Speech-Language Pathology, SRH University of Applied Health Sciences, Düsseldorf, Germany; Department of Phoniatrics and Pediatric Audiology, University Hospital Münster, Westphalian Wilhelm University, Münster, Germany
- Youri Maryn
- ENT Department, Sint-Augustinus GZA, Wilrijk, Belgium
- Mara Behlau
- Unifesp - "Universidade Federal de São Paulo" and CEV - "Centro de Estudos da Voz", São Paulo, Brazil
3
Englert M, Barsties v. Latoszek B, Maryn Y, Behlau M. Validation of the Acoustic Voice Quality Index, Version 03.01, to the Brazilian Portuguese Language. J Voice 2021; 35:160.e15-160.e21. [DOI: 10.1016/j.jvoice.2019.07.024] [Citation(s) in RCA: 9] [Impact Index Per Article: 3.0] [Received: 06/03/2019] [Revised: 07/23/2019] [Accepted: 07/26/2019] [Indexed: 10/26/2022]
4
Englert M, Lima L, Behlau M. Acoustic Voice Quality Index and Acoustic Breathiness Index: Analysis With Different Speech Material in the Brazilian Portuguese. J Voice 2020; 34:810.e11-810.e17. [DOI: 10.1016/j.jvoice.2019.03.015] [Citation(s) in RCA: 9] [Impact Index Per Article: 2.3] [Received: 01/30/2019] [Revised: 03/23/2019] [Accepted: 03/25/2019] [Indexed: 11/27/2022]
5
Englert M, Lopes L, Vieira V, Behlau M. Accuracy of Acoustic Voice Quality Index and Its Isolated Acoustic Measures to Discriminate the Severity of Voice Disorders. J Voice 2020; 36:582.e1-582.e10. [PMID: 32873433 DOI: 10.1016/j.jvoice.2020.08.010] [Citation(s) in RCA: 9] [Impact Index Per Article: 2.3] [Received: 06/18/2020] [Revised: 08/05/2020] [Accepted: 08/06/2020] [Indexed: 11/26/2022]
Abstract
OBJECTIVE To evaluate the accuracy of the Acoustic Voice Quality Index (AVQI) and its isolated acoustic measures in discriminating voices with different degrees of deviation. METHODS Two hundred and fifty-eight voice samples (160 dysphonic; 98 vocally healthy) were analyzed. Information regarding acoustic analysis and overall degree of deviation (G) was considered. The acoustic analysis consisted of the AVQI total score and its isolated acoustic measures: smoothed cepstral peak prominence (CPPs); harmonic-to-noise ratio (HNR); shimmer local and dB (Shim, ShdB); the general slope of the spectrum (Slope); and tilt of the regression line through the spectrum (Tilt). The auditory-perceptual judgment was the median G score of five voice specialists (Cohen's kappa = 0.605-0.773; Fleiss' kappa = 0.370). Quadratic discriminant analysis and the accuracy, sensitivity, and specificity performance measures were used to investigate the discriminatory power of these measures. RESULTS AVQI presented acceptable accuracy in differentiating voices with no vocal deviation from voices with vocal deviation (73.9%) and among the degrees of deviation (mild vs. moderate = 70.49%; mild vs. moderate = 71.39%; moderate vs. severe = 87.5%). No isolated acoustic measurement consistently differentiated voice quality among all degrees of deviation. A combination of five acoustic measures (CPPs, HNR, ShdB, Slope, Tilt) had the highest accuracy in differentiating between healthy and deviated voices (75.55%). Shimmer was more accurate in discriminating between voices with mild, moderate, and severe deviation; almost all isolated acoustic measurements were accurate in discriminating voices with moderate and severe deviation. The combination of acoustic measures presented higher accuracy (mild vs. moderate = 70.21%-74.29%; mild vs. moderate = 71.53%-76.11%; moderate vs. severe = 86%-95.50%). CONCLUSION AVQI is an accepted tool for discriminating among different degrees of vocal deviation and is more accurate between voices with moderate and severe deviations. Isolated acoustic measures perform better when discriminating voices with a higher degree of deviation. A combination of acoustic parameters, with the same weight, is more accurate in discriminating different degrees of deviation, but not consistently so.
Affiliation(s)
- Marina Englert
- Human Communication Disorders, Universidade Federal de São Paulo - UNIFESP, São Paulo, São Paulo, Brazil; Centro de Estudos da Voz - CEV, São Paulo, São Paulo, Brazil
- Leonardo Lopes
- Speech, Language and Hearing Sciences Department, Universidade Federal da Paraíba - UFPB, João Pessoa, Paraíba, Brazil
- Vinícius Vieira
- Speech, Language and Hearing Sciences Department, Universidade Federal da Paraíba - UFPB, João Pessoa, Paraíba, Brazil
- Mara Behlau
- Human Communication Disorders, Universidade Federal de São Paulo - UNIFESP, São Paulo, São Paulo, Brazil; Centro de Estudos da Voz - CEV, São Paulo, São Paulo, Brazil
6
Behlau M, Rocha B, Englert M, Madazio G. Validation of the Brazilian Portuguese CAPE-V Instrument-Br CAPE-V for Auditory-Perceptual Analysis. J Voice 2020; 36:586.e15-586.e20. [PMID: 32811691 DOI: 10.1016/j.jvoice.2020.07.007] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.8] [Received: 05/21/2020] [Revised: 07/14/2020] [Accepted: 07/16/2020] [Indexed: 10/23/2022]
Abstract
INTRODUCTION The Consensus Auditory Perceptual Evaluation of Voice (CAPE-V) scale is a modern, clinical-scientific approach to voice analysis. It has been translated and culturally adapted to Brazilian Portuguese, but it still lacks validation. OBJECTIVE To validate the Brazilian Portuguese version of the CAPE-V scale using the previously translated and culturally adapted version. METHOD Forty voice samples were selected (30 dysphonic, 10 nondysphonic), and the degree of vocal deviation was evaluated by a committee of three voice specialists. Nine voice specialists judged the 40 voice samples plus 20% repetition (total of 48 samples) using the CAPE-V. To ensure construct validity of the CAPE-V, its analysis was compared to the Grade-Roughness-Breathiness-Asthenia-Strain (GRBAS) scale that was performed 48-72 hours later. Finally, the intra- and inter-rater reliability values were verified and the correlation between the nine judges and the previously defined evaluation was analyzed. RESULTS The Brazilian CAPE-V presented significant intra (0.860-0.997) and inter-rater reliability values (0.707-0.964) for the overall degree and strong correlation with GRBAS (above 0.828). Deviant voice quality had greater consensus among raters than normal voices. A strong correlation was observed between the analysis of the nine raters and that of the committee. CONCLUSION CAPE-V is an important diagnostic instrument that contributes to the standardization of vocal quality evaluation in several languages, including Brazilian Portuguese. Thus, its usefulness is neither related to a single language nor to a single set of raters.
Affiliation(s)
- Mara Behlau
- Department of Communication Disorders, Unifesp Universidade Federal de São Paulo, São Paulo 04023-062, Brazil; CEV, Centro de Estudos da Voz, São Paulo, Brazil
- Bruna Rocha
- Department of Communication Disorders, Unifesp Universidade Federal de São Paulo, São Paulo 04023-062, Brazil; CEV, Centro de Estudos da Voz, São Paulo, Brazil
- Marina Englert
- Department of Communication Disorders, Unifesp Universidade Federal de São Paulo, São Paulo 04023-062, Brazil; CEV, Centro de Estudos da Voz, São Paulo, Brazil
7
Englert M, Lima L, Latoszek BBV, Behlau M. Influence of the Voice Sample Length in Perceptual and Acoustic Voice Quality Analysis. J Voice 2020; 36:582.e23-582.e32. [PMID: 32792161 DOI: 10.1016/j.jvoice.2020.07.010] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Received: 06/23/2020] [Revised: 07/15/2020] [Accepted: 07/16/2020] [Indexed: 10/23/2022]
Abstract
OBJECTIVE To analyze the effect of different voice sample lengths (VSL) on the perceived degree of voice quality deviation and on the accuracy of the Acoustic Voice Quality Index (AVQI). METHODS Voices of 71 subjects (53 dysphonic; 18 vocally healthy) were recorded: numbers 1-20 (42 syllables) + vowel /a/. Three different VSLs were edited: VSL_long, 1-20 + 3 seconds of vowel /a/; VSL_cust, a customized length in which the voiced segments of the continuous speech had the same length as the vowel (mean = 18.73 syllables, corresponding to 3 seconds of voiced-only segments) + 3 seconds of vowel /a/; VSL_short, 1-10 (15 syllables) + 3 seconds of vowel /a/. Three voice specialists perceptually judged the overall voice quality (G); 3 sessions were performed to evaluate each VSL variant. AVQI's precision and Spearman correlation were assessed. RESULTS The intra-rater reliability was "almost perfect" (kappa >0.826) for all evaluators in VSL_short; "substantial" (0.684) and "almost perfect" (0.897) in VSL_cust; and "fair" (0.447) to "almost perfect" (1.000) in VSL_long. The inter-rater reliability was "moderate" (0.554) for VSL_long and "substantial" (0.622 and 0.618) for VSL_cust and VSL_short. The Gmean and AVQI_mean were perceived as more severe for longer samples and less severe for shorter samples. Considering the AVQI, VSL_short (r = 0.665) presented the highest correlation. VSL_cust presented the best area under the ROC curve (0.821). The specificity of VSL_long and VSL_cust was 100% and that of VSL_short was 75%; the highest sensitivity was observed for VSL_short (74%). CONCLUSION Voice quality outcomes change with different VSLs. Longer VSLs seem to be perceived as more deviated; shorter VSLs seem to be more reliable and to correlate better with the acoustic analysis. The best AVQI accuracy was found at a customized length. Thus, to increase the reliability of voice analysis, a standardized procedure must be followed, including precise control of speech material, allowing comparison among clinics and voice centers.
Affiliation(s)
- Marina Englert
- Department of Communication Disorders, Unifesp Universidade Federal de São Paulo, São Paulo, Brazil; CEV, Centro de Estudos da Voz, São Paulo, Brazil
- Livia Lima
- CEV, Centro de Estudos da Voz, São Paulo, Brazil
- Ben Barsties V Latoszek
- Speech-Language Pathology, SRH University of Applied Health Sciences, Düsseldorf, Germany; Department of Phoniatrics and Pediatric Audiology, University Hospital Münster, Westphalian Wilhelm University, Münster, Germany
- Mara Behlau
- Department of Communication Disorders, Unifesp Universidade Federal de São Paulo, São Paulo, Brazil; CEV, Centro de Estudos da Voz, São Paulo, Brazil
8
Fujimura S, Kojima T, Okanoue Y, Shoji K, Inoue M, Omori K, Hori R. Classification of Voice Disorders Using a One-Dimensional Convolutional Neural Network. J Voice 2020; 36:15-20. [DOI: 10.1016/j.jvoice.2020.02.009] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Received: 11/10/2019] [Revised: 02/10/2020] [Accepted: 02/10/2020] [Indexed: 10/24/2022]
9
Englert M, Mendoza V, Behlau M, De Bodt M. GALP Qualifier Scale: Initial Considerations to Classify a Voice Problem. Folia Phoniatr Logop 2019; 72:402-410. [PMID: 31574520 PMCID: PMC7592637 DOI: 10.1159/000502772] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Received: 04/15/2019] [Revised: 08/14/2019] [Accepted: 08/14/2019] [Indexed: 11/19/2022] Open
Abstract
OBJECTIVE To propose a single qualifier scale for voice problems based on the International Classification of Functioning, Disability, and Health (ICF) that classifies a voice problem considering its multidimensionality. METHOD A multicultural database was analyzed (280 subjects). The analyzed information was: the perceptual judgment of the overall voice quality (G); the acoustic analysis (A) with the Acoustic Voice Quality Index; the laryngeal diagnosis (L); and the patient self-assessment (P) using the Voice Handicap Index. The variables were categorized. A 2-step cluster analysis was performed to define groups with common characteristics. RESULTS A 7-point qualifier scale, the GALP, was defined to generally classify levels of voice problems considering 4 dimensions of the voice evaluation. Each level of voice problem, that is, no problem, mild, moderate, severe, or complete voice problem, has its own possible outcome for G, A, L, and P that may or may not change the overall level of voice problem. The extremes of the scale represent "no problem" at all when all parameters are normal, and "complete problem" when all parameters are altered. The 3 levels in between were defined by the cluster analysis (mild, moderate, and severe problem) and change according to the outcome of each evaluation (G, A, L, and P). Thus, changes in one parameter alone may or may not contribute to a change in the level of voice problem. There are also 2 categories for cases that do not fit the classification (not specified) and for which some of the variables are missing (not applicable). CONCLUSION The GALP scale was proposed to classify the level of voice problem. This approach considers important dimensions of voice evaluation according to the ICF. It is a potential tool to be used by different professionals, with different assessment procedures, and among different populations, clinicians, and study centers.
Affiliation(s)
- Marina Englert
- Human Communication Disorders, Universidade Federal de São Paulo UNIFESP, São Paulo, Brazil
- Centro de Estudos da Voz CEV, São Paulo, Brazil
- Viviana Mendoza
- Department of Otorhinolaryngology, Head and Neck Surgery and Communication Disorders, University Hospital of Antwerp, Antwerp, Belgium
- Mara Behlau
- Human Communication Disorders, Universidade Federal de São Paulo UNIFESP, São Paulo, Brazil
- Centro de Estudos da Voz CEV, São Paulo, Brazil
- Marc De Bodt
- Department of Otorhinolaryngology, Head and Neck Surgery and Communication Disorders, University Hospital of Antwerp, Antwerp, Belgium
- Faculty of Medicine and Health Sciences, Antwerp University, Antwerp, Belgium
- Faculty of Medicine and Social Health Sciences, University of Ghent, Ghent, Belgium
10
Englert M, Lima L, Constantini AC, Latoszek BBV, Maryn Y, Behlau M. Acoustic Voice Quality Index - AVQI para o português brasileiro: análise de diferentes materiais de fala. Codas 2019; 31:e20180082. [DOI: 10.1590/2317-1782/20182018082] [Citation(s) in RCA: 16] [Impact Index Per Article: 3.2] [Received: 04/12/2018] [Accepted: 08/15/2018] [Indexed: 11/22/2022] Open
Abstract
ABSTRACT Purpose To determine the best speech sample for validating the AVQI for Brazilian Portuguese; to identify the speech context with the best perceptual-acoustic correlation and the highest diagnostic accuracy with the AVQI. Methods Recordings of 50 subjects (dysphonic and vocally healthy), including: the vowel /a/; months of the year; numbers (1 to 20); and repetition of the CAPE-V sentences. The speech samples were edited to contain three different durations plus the vowel: D1, complete speech; D2, speech with 3 s of voiced segments; D3, speech with a predetermined cutoff point. Three raters performed the auditory-perceptual analysis (APA) of the samples combined in 3 contexts followed by the vowel and gave a single vocal deviation score (G: 0 to 3). The speech stimulus with the best perceptual-acoustic correlation was determined considering the mean G; the stimulus with the best diagnostic accuracy was analyzed taking G<0.5 and G<0.68 as the criteria for presence or absence. Results The perceptual-acoustic correlation ranged from r = 0.482 to r = 0.634 (Spearman correlation); numbers showed the highest values. The AVQI was highly specific and not very sensitive. Considering G<0.5, the best sensitivity and ROC curve value were for sentences in D3 (0.578; 0.72). Considering G<0.68, there was good diagnostic accuracy for numbers 1 to 10 and higher sensitivity for numbers 1 to 20. Conclusion The best perceptual-acoustic correlation was for numbers 1 to 10. The CAPE-V sentences produced the best diagnostic accuracy with G<0.5; numbers showed high diagnostic accuracy with G<0.68. Counting numbers is quite common in Brazilian clinical practice; its use is therefore suggested for AVQI validation and analyses.
Affiliation(s)
- Marina Englert
- Universidade Federal de São Paulo, Brazil; Centro de Estudos da Voz, Brazil
- Mara Behlau
- Universidade Federal de São Paulo, Brazil; Centro de Estudos da Voz, Brazil
11
Englert M, Madazio G, Gielow I, Lucero J, Behlau M. Influência do fator de aprendizagem na análise perceptivo-auditiva. Codas 2018; 30:e20170107. [DOI: 10.1590/2317-1782/20182017107] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.0] [Received: 05/18/2017] [Accepted: 10/31/2017] [Indexed: 11/21/2022] Open
Abstract
ABSTRACT Purpose To investigate the learning factor during an auditory-perceptual task for three different groups in an unusual task. Methods 269 listeners, divided into three groups: 73 in the group of speech-language pathologists specialized in voice (GE), 84 in the group of speech-language pathologists not specialized in voice (GNE), and 112 in the lay group (GL) of non-speech-language pathologists. All took part in a listening session that included 18 human voices and 18 synthesized voices with different types and degrees of deviation, plus 50% repetition to assess intra-rater consistency. The task was to classify the voices as human or synthesized. The learning factor was analyzed by comparing the percentage of errors at the beginning (first 18 voices) and at the end (last 18 voices) of the listening session. Results The GE was subject to the learning factor, making fewer errors at the end of the task (25.5%) than at the beginning (28.6%), a statistically significant difference (p = 0.024). The GNE and GL showed no difference in the percentage of errors between the beginning and the end of the task (GNE beginning = 36.5%; GNE end = 35.3%; GL beginning = 38.3%; GL end = 37.7%). Conclusion The GE was the only group that showed clear signs of the learning factor. Professional experience appears to influence auditory-perceptual analysis positively, reinforcing the impact of training to become a voice specialist. In addition, the voice specialist seems to be better prepared and more inclined to use learning strategies to improve performance during an auditory-perceptual task, even an unusual one.
Affiliation(s)
- Mara Behlau
- Universidade Federal de São Paulo, Brazil; Centro de Estudos da Voz, Brazil
12
Silva MFBDL, Madureira S, Rusilo LC, Camargo Z. Vocal quality assessment: methodological approach for a perceptive data analysis. REVISTA CEFAC 2017. [DOI: 10.1590/1982-021620171961417] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Indexed: 11/22/2022] Open
Abstract
ABSTRACT Purpose: to present a methodological approach for interpreting perceptual judgments of vocal quality by a group of evaluators using the script Vocal Profile Analysis Scheme. Methods: a cross-sectional study based on 90 speech samples from 25 female teachers with voice disorders and/or laryngeal changes. Prior to the perceptual judgment, three perceptual tasks were performed to select samples to be presented to five evaluators using the Experiment script MFC 3.2 (software PRAAT). Next, a sequence of tests was applied, based on successive approaches of inter- and intra-evaluators’ behavior. Data were treated by statistical analysis (Cochran and Selenor tests). Results: with respect to the analysis of the evaluators' performance, it was possible to define those that presented the best results, in terms of reliability and proximity of analyses, as compared to the most experienced evaluator, excluding one. The results of the cluster analysis also allowed designing a voice quality profile of the group of speakers studied. Conclusions: the proposal of a methodological approach allowed defining evaluators whose judgments were based on phonetic knowledge, and drawing a vocal quality profile of the group of samples analyzed.
13
Perceptual Error Analysis of Human and Synthesized Voices. J Voice 2017; 31:516.e5-516.e18. [DOI: 10.1016/j.jvoice.2016.12.015] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.9] [Received: 09/28/2016] [Accepted: 12/21/2016] [Indexed: 11/19/2022]
14
San Segundo E, Mompean JA. A Simplified Vocal Profile Analysis Protocol for the Assessment of Voice Quality and Speaker Similarity. J Voice 2017; 31:644.e11-644.e27. [PMID: 28215407 DOI: 10.1016/j.jvoice.2017.01.005] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.6] [Received: 10/04/2016] [Revised: 01/09/2017] [Accepted: 01/11/2017] [Indexed: 12/01/2022]
Abstract
OBJECTIVES A simplified perceptual protocol for the assessment of voice quality (VQ) is attempted based on the Vocal Profile Analysis (VPA) scheme, with the aim of alleviating typical issues associated with the multidimensionality of VQ and enabling an easy quantification of speaker similarity. STUDY DESIGN Twenty-four non-pathological male speakers (12 monozygotic twin pairs) of Standard Peninsular Spanish were perceptually evaluated by two trained phoneticians using the simplified VPA (SVPA). Based on their perceptual ratings, intra- and inter-rater agreement was measured, and an index of speaker similarity was calculated not only between twin pairs but also between non-twin pairs. For that purpose, one member of each twin pair was compared with a member of a different twin pair. METHODS Intra- and inter-rater agreement measures were tested with unweighted and linear weighted kappa. Speaker similarity was measured with simple matching coefficients (SMC). RESULTS The results show that analysts' internal consistency was very high, whereas inter-rater agreement was found to be strongly setting-dependent. SMCs between speakers indicate that twin pairs are, on average, more similar than non-twin pairs. CONCLUSIONS Agreement results suggest that the proposed SVPA is a reliable protocol for the perceptual characterization of VQ, and SMC results confirm that it can also be a useful tool for the assessment of speaker (dis)similarity. The extraction of a voice quality similarity index shows potential in fields like forensic phonetics, but could also be of interest in related areas of voice research and professional practice.
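The agreement and similarity measures named above (unweighted and linear weighted kappa, simple matching coefficient) can be sketched as follows. This is a generic textbook formulation, not the authors' analysis script; the function names, the integer coding of rating categories as 0..k-1, and the toy ratings are assumptions for illustration.

```python
def cohen_kappa(rater_a, rater_b, n_categories, linear_weights=False):
    """Cohen's kappa for two raters using ordinal categories 0..k-1.

    With linear_weights=True, a disagreement between categories i and j
    costs |i - j| / (k - 1); otherwise every disagreement costs 1.
    """
    k, n = n_categories, len(rater_a)
    observed = [[0.0] * k for _ in range(k)]
    for a, b in zip(rater_a, rater_b):
        observed[a][b] += 1.0 / n
    row = [sum(observed[i]) for i in range(k)]
    col = [sum(observed[i][j] for i in range(k)) for j in range(k)]

    def cost(i, j):
        if linear_weights:
            return abs(i - j) / (k - 1)
        return 0.0 if i == j else 1.0

    obs = sum(cost(i, j) * observed[i][j] for i in range(k) for j in range(k))
    exp = sum(cost(i, j) * row[i] * col[j] for i in range(k) for j in range(k))
    if exp == 0:
        raise ValueError("kappa is undefined when chance disagreement is zero")
    return 1.0 - obs / exp


def simple_matching_coefficient(profile_a, profile_b):
    """Share of attributes on which two speaker profiles have the same value."""
    matches = sum(1 for a, b in zip(profile_a, profile_b) if a == b)
    return matches / len(profile_a)


if __name__ == "__main__":
    # Invented ratings on a 3-point scale for six settings.
    first = [0, 1, 2, 2, 1, 0]
    second = [0, 1, 2, 1, 1, 0]
    print(cohen_kappa(first, second, 3))
    print(cohen_kappa(first, second, 3, linear_weights=True))
    print(simple_matching_coefficient([1, 0, 1, 1], [1, 1, 1, 0]))
```

Linear weighting gives partial credit to near misses on an ordinal scale, so the weighted value is at least as high as the unweighted one whenever the disagreements are between adjacent categories, as in the toy data here.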
Affiliation(s)
- Eugenia San Segundo
- Department of Language and Linguistic Science, University of York, York, UK.
15
Cantor Cutiva LC, Fajardo A, Burdorf A. Associations between self-perceived voice disorders in teachers, perceptual assessment by speech-language pathologists, and instrumental analysis. INTERNATIONAL JOURNAL OF SPEECH-LANGUAGE PATHOLOGY 2016; 18:550-559. [PMID: 27063687 DOI: 10.3109/17549507.2016.1143969] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.8] [Indexed: 06/05/2023]
Abstract
PURPOSE The three aims of this study were to assess agreement between self-perceived voice disorders, perceptual and instrumental assessment; to determine factors associated with perceptual voice assessment; and to determine which associated factors would serve as an initial screening tool for ascertainment of the presence or absence of voice disorders among teachers. METHOD A cross-sectional study was conducted among 574 Colombian teachers. Participants filled in a questionnaire and recorded a voice sample. The voice samples were perceptually evaluated by a speech-language pathologist with the Grade, Roughness, Breathiness, Asthenia, and Strain (GRBAS) scale and objectively with an automated voice analysis for fundamental frequency, jitter, shimmer and maximum phonation time. Agreements between GRBAS scale, self-reported voice disorders and instrumental analysis were determined by unweighted Cohen's Kappa coefficients and receiver operating characteristic curves. Multivariate logistic regression analysis was used to identify variables associated with the perceptual assessment. Diagnostic performance of these variables was assessed by the area under the curve. RESULT There was no agreement between self-reported voice disorders and GRBAS assessments. Maximum phonation time showed a slight agreement with perceptual assessment of voice disorders. CONCLUSION Since these three methods offer different information, it is advisable to include all methods in ascertainment of voice disorders among teachers at work.
Affiliation(s)
- Adriana Fajardo
- Programa de Fonoaudiología, Universidad del Rosario, Bogotá D.C., Colombia
- Alex Burdorf
- Department of Public Health, Erasmus MC, University Medical Center, Rotterdam, the Netherlands
16
Jesus LMT, Martinez J, Hall A, Ferreira A. Acoustic Correlates of Compensatory Adjustments to the Glottic and Supraglottic Structures in Patients with Unilateral Vocal Fold Paralysis. BIOMED RESEARCH INTERNATIONAL 2015; 2015:704121. [PMID: 26557690 PMCID: PMC4628731 DOI: 10.1155/2015/704121] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.7] [Received: 01/22/2015] [Accepted: 04/24/2015] [Indexed: 11/17/2022]
Abstract
The goal of this study was to analyse perceptually and acoustically the voices of patients with Unilateral Vocal Fold Paralysis (UVFP) and compare them to the voices of normal subjects. These voices were analysed perceptually with the GRBAS scale and acoustically using the following parameters: mean fundamental frequency (F0), standard-deviation of F0, jitter (ppq5), shimmer (apq11), mean harmonics-to-noise ratio (HNR), mean first (F1) and second (F2) formants frequency, and standard-deviation of F1 and F2 frequencies. Statistically significant differences were found in all of the perceptual parameters. Also the jitter, shimmer, HNR, standard-deviation of F0, and standard-deviation of the frequency of F2 were statistically different between groups, for both genders. In the male data differences were also found in F1 and F2 frequencies values and in the standard-deviation of the frequency of F1. This study allowed the documentation of the alterations resulting from UVFP and addressed the exploration of parameters with limited information for this pathology.
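As an illustration of one of the acoustic parameters listed above, the sketch below computes a five-point period perturbation quotient (ppq5) from a sequence of glottal period durations. It follows the commonly cited definition (mean absolute difference between each period and the average of it and its four nearest neighbours, divided by the mean period); whether this matches the exact settings used in the study is an assumption, and the function name and sample values are hypothetical.

```python
def jitter_ppq5(periods):
    """Five-point period perturbation quotient as a dimensionless ratio.

    `periods` is a sequence of consecutive glottal period durations in any
    consistent unit (for example seconds). At least five periods are needed.
    """
    n = len(periods)
    if n < 5:
        raise ValueError("ppq5 needs at least five consecutive periods")
    total_deviation = 0.0
    for i in range(2, n - 2):
        # Average of the period and its two neighbours on each side.
        local_mean = sum(periods[i - 2:i + 3]) / 5.0
        total_deviation += abs(periods[i] - local_mean)
    mean_deviation = total_deviation / (n - 4)
    mean_period = sum(periods) / n
    return mean_deviation / mean_period


if __name__ == "__main__":
    # Invented period durations in milliseconds, not measured data.
    sample_periods = [10.0, 10.1, 9.9, 10.2, 9.8, 10.0, 10.1]
    print(jitter_ppq5(sample_periods))
```

Averaging over five periods makes the measure insensitive to slow pitch drift (a perfectly linear change in period length yields zero), so it responds mainly to cycle-to-cycle irregularity.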
Affiliation(s)
- Luis M. T. Jesus
- Institute of Electronics and Informatics Engineering of Aveiro (IEETA), University of Aveiro, 3810-193 Aveiro, Portugal
- School of Health Sciences (ESSUA), University of Aveiro, 3810-193 Aveiro, Portugal
- Joana Martinez
- Institute of Electronics and Informatics Engineering of Aveiro (IEETA), University of Aveiro, 3810-193 Aveiro, Portugal
- Andreia Hall
- Institute of Electronics and Informatics Engineering of Aveiro (IEETA), University of Aveiro, 3810-193 Aveiro, Portugal
- Department of Mathematics (DMat), University of Aveiro, 3810-193 Aveiro, Portugal
- Aníbal Ferreira
- Department of Electrical and Computer Engineering, University of Porto, 4200-465 Porto, Portugal
|
17
|
Petrovic-Lazic M, Jovanovic N, Kulic M, Babac S, Jurisic V. Acoustic and Perceptual Characteristics of the Voice in Patients With Vocal Polyps After Surgery and Voice Therapy. J Voice 2015; 29:241-6. [DOI: 10.1016/j.jvoice.2014.07.009] [Citation(s) in RCA: 37] [Impact Index Per Article: 4.1] [Received: 04/01/2014] [Accepted: 07/17/2014] [Indexed: 11/30/2022]
|
18
|
Silva RSA, Simões-Zenari M, Nemr NK. Impacto de treinamento auditivo na avaliação perceptivo-auditiva da voz realizada por estudantes de Fonoaudiologia. ACTA ACUST UNITED AC 2012; 24:19-25. [DOI: 10.1590/s2179-64912012000100005] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.3] [Received: 04/13/2011] [Accepted: 08/04/2011] [Indexed: 11/22/2022]
Abstract
OBJECTIVE: To analyze the impact of auditory training on the auditory-perceptual evaluation of voice performed by Speech-Language Pathology and Audiology students. METHODS: Over two semesters, 17 students enrolled in theoretical courses on phonation (Phonation/Phonation Disorders) analyzed samples of altered and unaltered voices (selected for this study) using the GRBAS scale. All received auditory training over a total of nine weekly meetings of about 15 minutes each. In each meeting one parameter was presented, using voices different from the evaluated sample in which the trained aspect predominated. The samples were rated with the scale before and after training and at four other moments over the course of the meetings. The students' ratings were compared with a judges' assessment made previously by three speech-language pathologists who were voice specialists. The Friedman test and the Kappa concordance index were used to verify the effectiveness of the training. RESULTS: The students' rate of correct answers before training was considered fair to good. The number of correct answers was maintained across the assessments for most parameters of the scale. After training, there was improvement in the analysis of asthenia, a parameter emphasized because of the difficulties the students had shown. Correct answers decreased for the roughness parameter after it was worked on in segmented form, as hoarseness and harshness, and associated with different diagnoses and acoustic parameters. CONCLUSION: Auditory training enhances the students' initial abilities, refining them for carrying out the evaluation, and also guides adjustments to the dynamics of the courses.
|
19
|
Gama ACC, Santos LLM, Sanches NA, Côrtes MG, Bassi IB. Estudo do efeito do apoio visual do traçado espectrográfico na confiabilidade da análise perceptivo-auditiva. Revista CEFAC 2010. [DOI: 10.1590/s1516-18462010005000123] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.6] [Indexed: 11/21/2022]
Abstract
OBJECTIVE: To evaluate intra- and inter-subject agreement in auditory-perceptual evaluation performed alone and simultaneously with presentation of the corresponding spectrographic trace, in order to verify whether simultaneous presentation of vocal and spectrographic stimuli increases agreement in the auditory-perceptual evaluation of voice. METHODS: This is a longitudinal study in which six speech-language pathologists evaluated, at two distinct times, 105 dysphonic and non-dysphonic voices auditory-perceptually: first without and then with presentation of the corresponding spectrograms. Twenty percent of the voices were randomly repeated at both times in order to analyze intra-rater agreement. The GRBASI scale was used for the vocal evaluation. Inter-rater agreement was analyzed with the Fleiss Kappa statistic, and intra-rater agreement was calculated with the Spearman correlation coefficient. RESULTS: There was no statistically significant difference between the inter-subject auditory-perceptual evaluations with and without the spectrographic reading, although agreement among raters increased for variables G, R, B, and S. There was no statistically significant difference between the intra-subject auditory-perceptual evaluations without and with the visual support of the spectrogram; however, intra-rater agreement increased after presentation of the visual stimulus for variables G, B, and I. CONCLUSION: The visual support of the spectrogram does not significantly increase the reliability of the auditory-perceptual evaluation of voice, but it aids it, since it promotes an increase in inter- and intra-rater agreement.
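Fleiss' kappa, the inter-rater statistic named in this abstract, extends chance-corrected agreement to more than two raters. The sketch below is one standard formulation, offered as an illustration and not as the authors' procedure; the function name `fleiss_kappa`, the table layout, and the example counts are hypothetical.

```python
def fleiss_kappa(counts):
    """Fleiss' kappa.

    counts[i][j] is the number of raters who assigned subject i to category j.
    Every subject must be rated by the same number of raters (at least two).
    """
    n_subjects = len(counts)
    if n_subjects == 0:
        raise ValueError("counts must contain at least one subject")
    n_raters = sum(counts[0])
    if n_raters < 2 or any(sum(row) != n_raters for row in counts):
        raise ValueError("each subject needs the same number (>= 2) of ratings")
    n_categories = len(counts[0])
    total = n_subjects * n_raters
    # Share of all ratings that fall in each category.
    p_cat = [sum(row[j] for row in counts) / total for j in range(n_categories)]
    # Per-subject agreement: fraction of rater pairs that agree on that subject.
    p_subj = [
        (sum(c * c for c in row) - n_raters) / (n_raters * (n_raters - 1))
        for row in counts
    ]
    p_bar = sum(p_subj) / n_subjects      # mean observed agreement
    p_exp = sum(p * p for p in p_cat)     # agreement expected by chance
    if p_exp == 1.0:
        return float("nan")               # only one category was ever used
    return (p_bar - p_exp) / (1.0 - p_exp)


if __name__ == "__main__":
    # Hypothetical table: 4 voices, 6 raters, grades 0 to 3 on one GRBASI item.
    table = [
        [5, 1, 0, 0],
        [0, 4, 2, 0],
        [0, 1, 3, 2],
        [0, 0, 1, 5],
    ]
    print(round(fleiss_kappa(table), 3))
```

Each row is one voice and each column one grade, so with six raters every row sums to 6.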
|