1
Lima-Filho LMA, Lopes LW, Filho TDMES. Integrated Vocal Deviation Index (IVDI): A Machine Learning Model to Classifier of the General Grade of Vocal Deviation. J Voice 2024:S0892-1997(24)00384-9. [PMID: 39592352 DOI: 10.1016/j.jvoice.2024.11.002] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/16/2024] [Revised: 10/31/2024] [Accepted: 11/01/2024] [Indexed: 11/28/2024]
Abstract
OBJECTIVE To develop a multiparametric index based on machine learning (ML) to predict and classify the overall degree of vocal deviation (GG). METHOD The sample consisted of 300 dysphonic and nondysphonic participants of both sexes. The two speech tasks were a sustained vowel [a] and connected speech (counting from 1 to 10). Five speech-language pathologists performed the auditory-perceptual judgment (APJ) of the GG and the degrees of roughness (GR), breathiness (GB), instability (GI), and strain (GS). We extracted 47 acoustic measurements from these tasks. The APJ result and the acoustic measurements were used to develop the multiparametric index. We used mean absolute error, root mean square error, and coefficient of determination (R²) to select the best ML model for predicting GG, and feature importance to select the best set of variables for the index. After the GG was classified as nondysphonic, mild, moderate, or severe, the final model was validated using accuracy, sensitivity, specificity, predictive values, likelihood ratios, F1-Score, and weighted kappa. RESULTS The gradient boost model showed the best performance among the ML models. Eight features were selected in the model, including four acoustic measures (jitterLoc, smoothed cepstral peak prominence, mean harmonic-to-noise ratio (HNRmean), and correlation) and four APJ measures (GR, GB, GS, and GI). The final model correctly classified 93.75% of participants and obtained a weighted kappa index of 0.9374, demonstrating the model's excellent performance. CONCLUSION The Integrated Vocal Deviation Index includes four acoustic measures and four auditory-perceptual measures and showed excellent performance in classifying voices according to GG.
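The three model-selection metrics named in this abstract (mean absolute error, root mean square error, and R²) can be illustrated with a minimal Python sketch. The function name `regression_metrics` and the sample ratings are hypothetical and are not taken from the study; the sketch only shows the standard definitions of the metrics, not the authors' pipeline.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return (MAE, RMSE, R^2) for paired observed and predicted values."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]
    mae = sum(abs(e) for e in errors) / n
    ss_res = sum(e * e for e in errors)            # residual sum of squares
    rmse = math.sqrt(ss_res / n)
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)  # total sum of squares
    r2 = 1.0 - ss_res / ss_tot  # undefined when all observed values are equal
    return mae, rmse, r2


# Hypothetical perceptual ratings and model predictions (illustration only).
observed = [10.0, 35.0, 50.0, 80.0]
predicted = [12.0, 33.0, 55.0, 76.0]
mae, rmse, r2 = regression_metrics(observed, predicted)
print(f"MAE={mae:.2f} RMSE={rmse:.2f} R2={r2:.3f}")
```

Lower MAE and RMSE and a higher R² would favor one candidate model over another under this kind of comparison.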
Affiliation(s)
- Luiz Medeiros Araujo Lima-Filho
- Department of Statistics, Member of the Graduate Program in Decision Models and Health, Universidade Federal da Paraíba - UFPB, João Pessoa, Paraíba, Brazil
- Leonardo Wanderley Lopes
- Department of Communication Science Disorders, Member of the Graduate Program in Decision Models and Health, Universidade Federal da Paraíba - UFPB, João Pessoa, Paraíba, Brazil.
2
Yeşilli-Puzella G, Maryn Y, Tunçer AM, Akbulut S, Ünsal EM, Tadıhan Özkan E. Validation of the Acoustic Voice Quality Index Version 03.01 in Turkish. J Voice 2024:S0892-1997(24)00284-4. [PMID: 39393953 DOI: 10.1016/j.jvoice.2024.08.030] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/22/2024] [Revised: 08/24/2024] [Accepted: 08/26/2024] [Indexed: 10/13/2024]
Abstract
OBJECTIVES The aim of this study was to validate the Acoustic Voice Quality Index (AVQI) version 3.01 in the Turkish-speaking population. MATERIALS AND METHODS Concatenated voice samples of the sustained vowel [a:] and continuous speech were collected from 127 dysphonic and 128 normophonic participants. The auditory-perceptual evaluation was performed by five experienced raters using the Grade parameter of the Grade, Roughness, Breathiness, Asthenia, Strain scale. Rater reliability, concurrent validity, diagnostic accuracy, and differences between normophonic and dysphonic groups were analyzed for the AVQI version 3.01. RESULTS The number of syllables for the standardized reading text with the concatenation of the voiced parts lasting around 3 seconds (mean = 3.84 seconds) was 36. The intraclass correlation coefficient (ICC) values of intra-rater reliability of G scores of five raters were excellent (mean ICC = 0.934), and of inter-rater reliability, they varied between moderate and excellent (mean ICC = 0.786). AVQIv3 demonstrated a high diagnostic accuracy with area under receiver-operating characteristic curve = 0.906 in identifying disrupted versus normal voice quality. With sensitivity of 80% and specificity of 94%, AVQIv3 = 2.345 was the cutoff point that differentiated most accurately between normophonic and dysphonic voices in Turkish. CONCLUSION AVQIv3 is an ecologically valid tool for objective differentiation between dysphonic and normal voices in the Turkish language.
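How a single cutoff yields a sensitivity and specificity pair can be sketched in Python as follows. The scores and labels are invented for illustration, and the rule that a score at or above the cutoff counts as dysphonic is an assumption, since the abstract does not state the direction of the comparison.

```python
def sensitivity_specificity(scores, is_dysphonic, cutoff):
    """Classify score >= cutoff as dysphonic (assumed rule) and return
    (sensitivity, specificity) against the reference labels."""
    tp = fn = tn = fp = 0
    for score, truth in zip(scores, is_dysphonic):
        predicted = score >= cutoff
        if truth and predicted:
            tp += 1
        elif truth and not predicted:
            fn += 1
        elif not truth and predicted:
            fp += 1
        else:
            tn += 1
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical index scores; True marks a perceptually dysphonic voice.
scores = [3.1, 4.0, 1.8, 2.4, 5.2, 1.2, 2.0, 2.5, 0.9]
labels = [True, True, True, True, True, False, False, False, False]
sens, spec = sensitivity_specificity(scores, labels, cutoff=2.345)
print(f"sensitivity={sens:.2f} specificity={spec:.2f}")
```

Repeating this calculation over a range of candidate cutoffs traces the receiver-operating characteristic curve from which an optimal threshold is usually chosen.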
Affiliation(s)
- Gamze Yeşilli-Puzella
- Speech and Language Therapy Department, School of Health Sciences, Cappadocia University, Ürgüp, Nevşehir, Turkey; Otolaryngology Division, Azienda Ospedaliera Universitaria, Sassari, Italy.
- Youri Maryn
- Department of Otorhinolaryngology and Head & Neck Surgery, European Institute for ORL-HNS, GZA Sint-Augustinus, Wilrijk, Belgium; Department of Rehabilitation Sciences, Faculty of Medicine and Health Sciences, Ghent University, Ghent, Belgium; Faculty of Medicine and Health Sciences, University of Antwerp, Antwerp, Belgium; Department of Health, University College Ghent, Ghent, Belgium; Phonanium, Lokeren, Belgium
- Aylin Müge Tunçer
- Speech and Language Therapy Department, Faculty of Health Sciences, Mugla Sitki Kocman University, Muğla, Turkey
- Elif Meryem Ünsal
- Department of Speech and Language Therapy, Faculty of Health Sciences, İzmir Bakırçay University, Menemen, İzmir, Turkey
- Elçin Tadıhan Özkan
- Speech and Language Therapy Department, Faculty of Health Sciences, Anadolu University, Eskisehir, Turkey
3
Calaf N, Garcia-Quintana D. Development and Validation of the Bilingual Catalan/Spanish Cross-Cultural Adaptation of the Consensus Auditory-Perceptual Evaluation of Voice. JOURNAL OF SPEECH, LANGUAGE, AND HEARING RESEARCH : JSLHR 2024; 67:1072-1089. [PMID: 38527275 DOI: 10.1044/2024_jslhr-23-00536] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/27/2024]
Abstract
PURPOSE This study aimed to develop a valid and reliable bilingual version of the Consensus Auditory-Perceptual Evaluation of Voice (CAPE-V) for the auditory-perceptual evaluation of voice in Catalan and Spanish speakers. METHOD The development of this CAPE-V adaptation included Delphi methodology with 20 voice and speech experts reaching consensus on the optimal adapted terminology of the perceptual vocal attributes, considering also input from the original instrument authors. The adaptation and validation of vocal tasks followed a sequential validation procedure, with input from phoneticians and speech-language pathologists. Following pilot testing with a large sample of speech-language pathology students, a refined adapted version was empirically tested for validity and reliability. Concurrent validity was assessed by comparing the adapted CAPE-V with the reference Grade, Roughness, Breathiness, Asthenia, Strain scale. Construct validity was assessed through convergent and discriminant validity analysis. Intrarater and interrater reliability were assessed via intraclass correlation coefficient calculations. User experience was evaluated through a questionnaire. Scale properties were validated using a confusion matrix, and cutoff values were calculated to achieve the optimal balance between sensitivity and specificity. RESULTS Through a formalized consensus process, optimal Catalan/Spanish terminology was determined for the perceptual attributes of voice present in the CAPE-V. An adapted protocol of tasks was obtained that preserves the objectives of the original instrument and the relevance of the phonetic criteria in the target languages. The results demonstrated concurrent validity, construct validity, and intrarater reliability. Interrater reliability was found to depend on the extent to which evaluators shared their internal standards. The raters identified CAPE-V as an effective and preferred instrument. 
CONCLUSION An adapted, validated version of the CAPE-V is made available to clinical professionals for the evaluation of voice in Catalan and Spanish speakers.
Affiliation(s)
- Neus Calaf
- Department of Basic, Developmental and Educational Psychology, Autonomous University of Barcelona, Bellaterra, Spain
- Voice Analysis Lab, Biophysics Unit, School of Medicine, Autonomous University of Barcelona, Bellaterra, Spain
- David Garcia-Quintana
- Voice Analysis Lab, Biophysics Unit, School of Medicine, Autonomous University of Barcelona, Bellaterra, Spain
4
Martinho DHDC, Constantini AC. Auditory-Perceptual Assessment and Acoustic Analysis of Gender Expression in the Voice. J Voice 2024:S0892-1997(23)00417-4. [PMID: 38336566 DOI: 10.1016/j.jvoice.2023.12.024] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/15/2023] [Revised: 12/29/2023] [Accepted: 12/29/2023] [Indexed: 02/12/2024]
Abstract
OBJECTIVE Determine if acoustic measurements exist that are predictive of Auditory-Perceptual Assessment (APA) of gender expression in the voice of transgender, nonbinary, and cisgender Brazilian speakers by transgender, nonbinary, and cisgender judges, as well as speech-language pathologists in the area of voice studies. METHODS Cross-sectional study. Clips of speech (automatic speech and expressive reading of poetry) and sustained vowel emission of people of different genders were recorded and underwent APA for gender expression in the voice using a visual analog scale across 100 points, ranging from very masculine to very feminine. Sixteen acoustic measurements were extracted (noise, perturbation, spectral, and cepstral measurements). A descriptive and inferential analysis was performed using interclass coefficients of correlation and stepwise multiple linear regression, considering P < 0.05 for statistical significance. RESULTS Forty-seven people of different genders had their voices recorded. The perceived gender of these voices was judged by 236 people (65 speech-language pathologists, 101 cisgender people, and 70 transgender and nonbinary people). The perceptions and measurements that were predictive of gender perception in the voice differed according to the task (vowel or speech) and the group of judges. The predictive acoustic measurements that were common in all groups were: speech-median F0, harmonic-to-noise ratio (HNR), F0 standard deviation (F0sd), average width between F0 peaks, and spectral emphasis (Emph); vowels-median F0, HNR, F0sd, and average width between F0 peaks. Divergent measurements between groups were: speech-coefficient of variation of intensity, speech rate (Sr), minimum and maximum F0, jitter, and shimmer; vowels-coefficient of variation of intensity, Emph, Sr, and minimum and maximum F0. 
CONCLUSION There are acoustic measures that may predict APA; however, each group of judges considers different measures to evaluate gender, revealing an important influence of context on the evaluator in gender assessment through the voice.
5
Benoy JJ, Jayakumar T. Effect of Anchor Voices and Listener Expertise on Auditory-Perceptual Judgments of Voice Quality Using the GRBAS Scale. J Voice 2024:S0892-1997(23)00397-1. [PMID: 38199908 DOI: 10.1016/j.jvoice.2023.12.011] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/16/2023] [Revised: 12/10/2023] [Accepted: 12/11/2023] [Indexed: 01/12/2024]
Abstract
OBJECTIVES This study aimed to determine the effect of anchor voices and listener expertise on auditory-perceptual judgment of voice quality using the GRBAS scale. METHODS This study utilized a modified crossover design with counterbalancing. Anchor voices for each parameter of the GRBAS scale were chosen based on expert consensus. A total of 28 participants were divided into three groups based on their expertise. The first and second groups consisted of nine undergraduate (UG) and nine postgraduate (PG) students of speech-language pathology. The third group consisted of 10 practicing speech-language pathologists (SLPs). These participants carried out auditory-perceptual judgment of 60 dysphonic voice samples under two counterbalanced experimental conditions (with and without anchor voices). Each of the three groups was randomly divided into two subgroups to balance the experimental conditions. Interrater reliability for each subgroup was calculated using Krippendorff's α and 95% confidence intervals. RESULTS For all the groups involved in the study, interrater reliability was higher when anchor voices aided perceptual judgment for most parameters of the GRBAS scale. For the different parameters of GRBAS, interrater reliability for the UG group varied from fair (20 < α ≤ 40) to moderate (40 < α ≤ 60). In contrast, it was fair (20 < α ≤ 40) to substantial (60 < α ≤ 80) for the PG group and moderate (40 < α ≤ 60) to substantial (60 < α ≤ 80) for the SLP group. Variations in reliability were the least for the SLP group compared to the UG and PG groups. However, there were overlaps in interrater reliability between the groups, as revealed by the 95% confidence intervals. CONCLUSIONS Anchor voices help improve the auditory-perceptual judgment of voice quality, especially interrater reliability. Listener expertise is also shown to influence the interrater reliability of auditory-perceptual judgment of voice quality.
Affiliation(s)
- Jesnu Jose Benoy
- Department of Speech-Language Sciences, All India Institute of Speech and Hearing, Mysuru, Karnataka, India
- Thirunavukkarasu Jayakumar
- Department of Speech-Language Sciences, All India Institute of Speech and Hearing, Mysuru, Karnataka, India.
6
Castillo-Allendes A, Codino J, Cantor-Cutiva LC, Nudelman CJ, Rubin AD, Barsties v. Latoszek B, Hunter EJ. Clinical Utility and Validation of the Acoustic Voice Quality and Acoustic Breathiness Indexes for Voice Disorder Assessment in English Speakers. J Clin Med 2023; 12:7679. [PMID: 38137748 PMCID: PMC10743486 DOI: 10.3390/jcm12247679] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/26/2023] [Revised: 12/12/2023] [Accepted: 12/12/2023] [Indexed: 12/24/2023] Open
Abstract
BACKGROUND While several acoustic voice metrics are available for clinical voice assessment, there remains a significant need for reliable and ecologically valid tools. The Acoustic Voice Quality Index version 03.01 (AVQI-3) and Acoustic Breathiness Index (ABI) hold potential due to their comprehensive assessment approach, incorporating diverse voice aspects. However, these tools still need to be validated in English-speaking populations. METHODS This study assessed the discriminatory accuracy and validity of AVQI-3 and ABI in 197 participants, including 148 with voice disorders. Voice samples were collected, followed by AVQI-3 and ABI calculations. Additionally, auditory-perceptual assessments were conducted by a panel of speech-language pathologists. RESULTS AVQI-3 and ABI effectively identified disordered voice quality, evidenced by high accuracy (AUCs: 0.84, 0.89), sensitivity, and specificity (thresholds: AVQI-3 = 1.17, ABI = 2.35). Strong positive correlations were observed with subjective voice quality assessments (rs = 0.72, rs = 0.77, p < 0.001). CONCLUSIONS The study highlights AVQI-3 and ABI as promising instruments for clinically assessing voice disorders in U.S. English speakers, underscoring their utility in clinical practice and voice research.
Affiliation(s)
- Adrián Castillo-Allendes
- Department of Communicative Sciences and Disorders, Michigan State University, East Lansing, MI 48824, USA
- Department of Communication Sciences and Disorders, University of Iowa, Iowa City, IA 52242, USA
- Juliana Codino
- Lakeshore Professional Voice Center, Lakeshore Ear, Nose & Throat Center, St. Clair Shores, MI 48081, USA
- Lady Catherine Cantor-Cutiva
- Department of Communicative Sciences and Disorders, Michigan State University, East Lansing, MI 48824, USA
- Department of Communication Sciences and Disorders, University of Iowa, Iowa City, IA 52242, USA
- Charles J. Nudelman
- Department of Speech and Hearing Science, University of Illinois Urbana-Champaign, Champaign, IL 61820, USA
- Adam D. Rubin
- Lakeshore Professional Voice Center, Lakeshore Ear, Nose & Throat Center, St. Clair Shores, MI 48081, USA
- Eric J. Hunter
- Department of Communicative Sciences and Disorders, Michigan State University, East Lansing, MI 48824, USA
- Department of Communication Sciences and Disorders, University of Iowa, Iowa City, IA 52242, USA
7
Uloza V, Ulozaitė-Stanienė N, Petrauskas T, Pribuišis K, Ulozienė I, Blažauskas T, Damaševičius R, Maskeliūnas R. Smartphone-Based Voice Wellness Index Application for Dysphonia Screening and Assessment: Development and Reliability. J Voice 2023:S0892-1997(23)00330-2. [PMID: 37980209 DOI: 10.1016/j.jvoice.2023.10.021] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 07/17/2023] [Revised: 10/12/2023] [Accepted: 10/12/2023] [Indexed: 11/20/2023]
Abstract
OBJECTIVE This study aimed to develop a Voice Wellness Index (VWI) application combining the acoustic voice quality index (AVQI) and glottal function index (GFI) data and to evaluate its reliability in quantitative voice assessment and normal versus pathological voice differentiation. STUDY DESIGN Cross-sectional study. METHODS A total of 135 adult participants (86 patients with voice disorders and 49 patients with normal voices) were included in this study. Five iOS and Android smartphones with the "Voice Wellness Index" app installed were used to estimate VWI. The VWI data obtained using smartphones were compared with VWI measurements computed from voice recordings collected from a reference studio microphone. The diagnostic efficacy of VWI in differentiating between normal and disordered voices was assessed using receiver operating characteristics (ROC). RESULTS With a Cronbach's alpha of 0.972 and an ICC of 0.972 (0.964-0.979), the VWI scores of the individual smartphones demonstrated remarkable inter-smartphone agreement and reliability. The VWI data obtained from different smartphones and a studio microphone showed nearly perfect direct linear correlations (r = 0.993-0.998). Depending on the individual smartphone device used, the cutoff scores of VWI for differentiating between normal and pathological voice groups were calculated as 5.6-6.0, with the best balance between sensitivity (94.10-95.15%) and specificity (93.68-95.72%). The diagnostic accuracy was excellent in all cases, with an area under the curve (AUC) of 0.970-0.974. CONCLUSION The "Voice Wellness Index" application is an accurate and reliable tool for voice quality measurement and normal versus pathological voice screening and has considerable potential to be used by healthcare professionals and patients for voice assessment.
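As an illustration of the inter-device reliability statistic reported here, the following Python sketch computes Cronbach's alpha with each device treated as an "item" and each participant as a row. The function name and the scores are hypothetical, and the sketch uses the textbook formula rather than whatever software the authors used.

```python
from statistics import variance


def cronbach_alpha(rows):
    """Cronbach's alpha for a table with one row per participant and one
    column per device: k/(k-1) * (1 - sum(item variances) / total variance)."""
    k = len(rows[0])
    if k < 2 or len(rows) < 2:
        raise ValueError("need at least two devices and two participants")
    columns = list(zip(*rows))                      # one tuple per device
    item_var = sum(variance(col) for col in columns)
    total_var = variance([sum(row) for row in rows])  # variance of row sums
    return k / (k - 1) * (1 - item_var / total_var)


# Hypothetical index scores from three devices for four participants.
scores = [
    [2.0, 2.1, 1.9],
    [5.5, 5.6, 5.4],
    [8.0, 7.8, 8.1],
    [3.0, 3.2, 3.1],
]
alpha = cronbach_alpha(scores)
print(f"Cronbach's alpha = {alpha:.4f}")
```

Values close to 1 indicate that the devices rank and scale participants almost identically, which is the sense in which the abstract describes the agreement as remarkable.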
Affiliation(s)
- Virgilijus Uloza
- Department of Otorhinolaryngology, Lithuanian University of Health Sciences, Kaunas, Lithuania
- Nora Ulozaitė-Stanienė
- Department of Otorhinolaryngology, Lithuanian University of Health Sciences, Kaunas, Lithuania
- Tadas Petrauskas
- Department of Otorhinolaryngology, Lithuanian University of Health Sciences, Kaunas, Lithuania
- Kipras Pribuišis
- Department of Otorhinolaryngology, Lithuanian University of Health Sciences, Kaunas, Lithuania.
- Ingrida Ulozienė
- Department of Otorhinolaryngology, Lithuanian University of Health Sciences, Kaunas, Lithuania
- Tomas Blažauskas
- Faculty of Informatics, Kaunas University of Technology, Kaunas, Lithuania
- Rytis Maskeliūnas
- Faculty of Informatics, Kaunas University of Technology, Kaunas, Lithuania
8
Sol J, Aaen M, Sadolin C, Ten Bosch L. Towards Automated Vocal Mode Classification in Healthy Singing Voice-An XGBoost Decision Tree-Based Machine Learning Classifier. J Voice 2023:S0892-1997(23)00281-3. [PMID: 37953088 DOI: 10.1016/j.jvoice.2023.09.006] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/25/2023] [Accepted: 09/07/2023] [Indexed: 11/14/2023]
Abstract
Auditory-perceptual assessment is widely used in clinical and pedagogical practice for speech and singing voice, yet several studies have shown poor intra- and inter-rater reliability in both clinical and singing voice contexts. Recent advances in artificial intelligence and machine learning offer models for automated classification and have demonstrated discriminatory power in both pathological and healthy voice. This study develops and tests an XGBoost decision tree based machine learning classifier to develop automated vocal mode classification in healthy singing voice. Classification models trained on mel-frequency cepstrum coefficients, MFCC-Zero-Time Windowing, glottal features, voice quality features, and α-ratios demonstrated 92% average F1-score accuracy in distinguishing metallic and non-metallic singing for male singers and 87% average F1-score for female singers. The model distinguished vocal modes with 70% and 69% average F1-score for male and female samples, respectively. Model performance was compared to human auditory-perceptual assessments of 64 corresponding samples performed by 41 professional singers. The model performed with approximating or subpar performance to human assessors on task-matched problems. The XGBoost gains observed across tested features reveal that the most important attributes for the tested classification problems were MFCCs and α-ratios between high and low frequency energy, with models trained on only these features achieving performance not statistically significantly different from the best tested models. The best automated models in this study do not yet match human auditory-perceptual discrimination but improve on previously reported F1-average accuracies in automated classification in singing voice.
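The "average F1-score" reported in this abstract can be illustrated with a short Python sketch. It assumes an unweighted (macro) average over classes, which the abstract does not specify; the labels and predictions are invented for illustration.

```python
def f1_per_class(y_true, y_pred):
    """Return a dict mapping each class label to its F1 score."""
    scores = {}
    for label in sorted(set(y_true) | set(y_pred)):
        tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
        fp = sum(t != label and p == label for t, p in zip(y_true, y_pred))
        fn = sum(t == label and p != label for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class never occurs
        scores[label] = 2 * tp / denom if denom else 0.0
    return scores


def macro_f1(y_true, y_pred):
    """Unweighted mean of the per-class F1 scores (assumed averaging rule)."""
    per_class = f1_per_class(y_true, y_pred)
    return sum(per_class.values()) / len(per_class)


# Hypothetical reference labels and classifier outputs.
truth = ["metallic", "metallic", "metallic",
         "non-metallic", "non-metallic", "non-metallic"]
preds = ["metallic", "metallic", "non-metallic",
         "non-metallic", "non-metallic", "metallic"]
average_f1 = macro_f1(truth, preds)
print(f"macro F1 = {average_f1:.3f}")
```

A macro average weights each class equally regardless of how many samples it contains, which matters when classes are imbalanced.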
Affiliation(s)
- Jeroen Sol
- Institute for Computing and Information Sciences, Radboud University, Nijmegen, the Netherlands
- Mathias Aaen
- Research & Development, Complete Vocal Institute, Copenhagen K, Denmark; Nottingham University Hospitals, NHS Trust, Queen's Medical, ENT Department, Nottingham, United Kingdom.
- Cathrine Sadolin
- Research & Development, Complete Vocal Institute, Copenhagen K, Denmark
- Louis Ten Bosch
- Department of Language and Communication, Centre for Language Studies, Radboud University, Nijmegen, the Netherlands
9
Van der Straeten C, Verbeke J, Alighieri C, Bettens K, Van Beveren E, Bruneel L, Van Lierde K. Treatment Outcomes of Interdisciplinary Care on Speech and Health-Related Quality of Life Outcomes in Adults With Cleft Palate. AMERICAN JOURNAL OF SPEECH-LANGUAGE PATHOLOGY 2023; 32:2654-2675. [PMID: 37844623 DOI: 10.1044/2023_ajslp-23-00024] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/18/2023]
Abstract
PURPOSE Individuals born with a cleft palate with or without a cleft lip (CP ± L) often experience functional, aesthetic, and psychosocial consequences well into adulthood. This study aimed to investigate outcomes of speech and health-related quality of life (HRQoL) in adults with a CP ± L who received interdisciplinary cleft care at the Ghent University Hospital using valid, reliable, and condition-specific instruments. METHOD Thirteen Belgian Dutch-speaking participants with a CP ± L with a mean age of 25.4 years (SD = 5.1, range: 20-33 years) and an age- and gender-matched control group of 13 participants without a CP ± L with a mean age of 25.2 years (SD = 4.8, range: 20-32 years) were included in this study. Speech characteristics were evaluated perceptually and instrumentally. HRQoL was assessed through standardized patient-reported outcome measures. Outcomes were compared with those of the control group and to normative data where available. RESULTS Participants with a CP ± L in this sample demonstrated significantly lower speech acceptability (p < .001) and higher rates of hypernasality (p = .015) and nasal turbulence (p = .005) than the control group. They showed significantly higher satisfaction with appearance of the cleft scar compared with norms of adults with a CP ± L (p = .047). No other differences in speech characteristics, sociodemographics, or HRQoL were found between participants with and without a CP ± L. CONCLUSIONS The reduced speech acceptability and the presence of resonance and nasal airflow disorders may indicate the need for standardized long-term outcome measurement and interdisciplinary follow-up for speech characteristics and velopharyngeal insufficiency in young and middle adulthood in future clinical practice. Additional research is necessary to further substantiate these findings and to determine predictors for these continuing complications in adults with a CP ± L. SUPPLEMENTAL MATERIAL https://doi.org/10.23641/asha.24243901.
Affiliation(s)
- Charis Van der Straeten
- Department of Rehabilitation Sciences, Centre for Speech and Language Sciences (CESLAS), Ghent University, Belgium
- Jolien Verbeke
- Department of Rehabilitation Sciences, Centre for Speech and Language Sciences (CESLAS), Ghent University, Belgium
- Cassandra Alighieri
- Department of Rehabilitation Sciences, Centre for Speech and Language Sciences (CESLAS), Ghent University, Belgium
- Kim Bettens
- Department of Rehabilitation Sciences, Centre for Speech and Language Sciences (CESLAS), Ghent University, Belgium
- Ellen Van Beveren
- Department of Rehabilitation Sciences, Centre for Speech and Language Sciences (CESLAS), Ghent University, Belgium
- Laura Bruneel
- Department of Rehabilitation Sciences, Centre for Speech and Language Sciences (CESLAS), Ghent University, Belgium
- Kristiane Van Lierde
- Department of Rehabilitation Sciences, Centre for Speech and Language Sciences (CESLAS), Ghent University, Belgium
- Department of Speech-Language Pathology and Audiology, University of Pretoria, South Africa
10
da Silva ACF, de Araújo Lima-Filho LM, Almeida AA, Coêlho HFC, Ribeiro VV, Lopes LW. Spectrographic Voice Analysis Protocol (SAP): Convergent, Concurrent, and Accuracy Validity. J Voice 2023:S0892-1997(23)00283-7. [PMID: 37863674 DOI: 10.1016/j.jvoice.2023.09.009] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/08/2023] [Revised: 09/10/2023] [Accepted: 09/11/2023] [Indexed: 10/22/2023]
Abstract
OBJECTIVE To verify the convergent and concurrent validity of the Spectrographic Voice Analysis Protocol (SAP) and its accuracy in discriminating dysphonic from nondysphonic patients. METHOD The study used 82 vowel /Ɛ/ samples and their respective narrowband spectrograms, analyzed with SAP. Cepstral peak prominence (CPP) and cepstral peak prominence smoothed (CPPS) verified the convergent validity of the SAP total score, while the general grade of vocal deviation (GG) verified the concurrent validity of the SAP total score. The receiver operating characteristic (ROC) curve and its accuracy, sensitivity, and specificity values, positive predictive value (PPV) and negative predictive value (NPV), and positive likelihood ratio (LR+) and negative likelihood ratio (LR-) verified the accuracy of the SAP score in discriminating dysphonic from nondysphonic individuals. RESULTS Dysphonic and nondysphonic individuals had different SAP total scores. In the convergent validity, the SAP score had a weak and moderate negative correlation, respectively, with CPP and CPPS, as well as a moderate positive correlation with GG. SAP performed well in discriminating dysphonic from nondysphonic individuals (area under the curve = 82.0%; sensitivity = 91.7%; specificity = 51.7%; PPV = 93.7%; NPV = 44.0%; LR+ = 6.21; LR- = 0.53) based on the 8-point cutoff score. CONCLUSION SAP has convergent validity with CPP and CPPS and concurrent validity with GG. The SAP total score performed well in discriminating dysphonic from nondysphonic individuals. However, the specificity, NPV, and LR- values justify cautiously using SAP, always in combination with other information in clinical voice assessment.
Affiliation(s)
- Anna Alice Almeida
- Universidade Federal da Paraíba (UFPB), Decision Models and Health Program, João Pessoa, Paraíba, Brazil
- Vanessa Veis Ribeiro
- Universidade de Brasília (UNB), Speech-Language and Hearing Department, Brasília, Federal District, Brazil
- Leonardo Wanderley Lopes
- Universidade Federal da Paraíba (UFPB), Decision Models and Health Program, João Pessoa, Paraíba, Brazil.
11
Hofman EC, Dassie-Leite AP, Martins PDN, Pereira EC. Acoustic measurements of CPPS and AVQI pre and post speech therapy. Codas 2023; 35:e20220136. [PMID: 37672413 PMCID: PMC10547137 DOI: 10.1590/2317-1782/20232022136pt] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/24/2022] [Accepted: 10/14/2022] [Indexed: 09/08/2023] Open
Abstract
PURPOSE To compare the acoustic measurements of Cepstral Peak Prominence-Smoothed (CPPS) and Acoustic Voice Quality Index (AVQI) before and after voice therapy. METHODS This is a before-and-after intervention study with retrospective data collection. Twenty-two subjects with a mean age of 49.9 years participated in the study. The vocal therapy took place between 2016 and 2019 in a teaching clinic, and the subjects had vocal samples collected before and after the therapeutic processes. CPPS and AVQI data extractions were performed pre- and post-therapy. In order to characterize the sample, auditory-perceptual evaluation (APE) regarding the overall degree of vocal deviation at pre- and post-therapy moments was performed. The data were analyzed statistically. RESULTS The APE data indicated a decrease in the median values of overall vocal deviation degree at the post-therapy stage for both the vowel (p=0.00) and number (p=0.00) samples. The average CPPS for the vowel was 14.53 pre-therapy and 16.37 post-therapy (p=0.01); for the number emission, it was 8.22 pre-therapy and 9.06 post-therapy (p=0.02). The CPPS differed for both the vowel and the numbers, indicating vocal improvement post-therapy. The average AVQI was 2.27 pre-therapy and 1.54 post-therapy (p=0.05). There was an improvement in the AVQI results, with a borderline p-value. CONCLUSION Vocal therapy produced changes in the general degree of vocal deviation, as well as in CPPS and AVQI measurements, and the results at the post-therapy moment are similar to those of vocally healthy individuals.
Affiliation(s)
- Eduarda Cristina Hofman
- Departamento de Fonoaudiologia, Universidade Estadual do Centro-Oeste - UNICENTRO - Irati (PR), Brasil.
- Ana Paula Dassie-Leite
- Departamento de Fonoaudiologia, Universidade Estadual do Centro-Oeste - UNICENTRO - Irati (PR), Brasil.
- Eliane Cristina Pereira
- Departamento de Fonoaudiologia, Universidade Estadual do Centro-Oeste - UNICENTRO - Irati (PR), Brasil.
12
|
Hofman EC, Dassie-Leite AP, Martins PDN, Pereira EC. Acoustic measurements of CPPS and AVQI pre and post speech therapy. Codas 2023; 35:e20220136. [PMID: 37672413 PMCID: PMC10547137 DOI: 10.1590/2317-1782/20232022136en] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 05/24/2022] [Accepted: 10/14/2022] [Indexed: 11/20/2023] Open
Abstract
PURPOSE To compare the acoustic measurements of Cepstral Peak Prominence-Smoothed (CPPS) and Acoustic Voice Quality Index (AVQI) before and after voice therapy. METHODS This is a before-and-after intervention study with retrospective data collection. Twenty-two subjects with a mean age of 49.9 years participated in the study. Voice therapy took place between 2016 and 2019 in a teaching clinic, and the subjects had vocal samples collected before and after the therapeutic process. CPPS and AVQI values were extracted from the pre- and post-therapy samples. To characterize the sample, auditory-perceptual evaluation (APE) of the overall degree of vocal deviation was performed at the pre- and post-therapy moments. The data were analyzed statistically. RESULTS The APE data indicated a decrease in the median values of overall vocal deviation degree at the post-therapy stage for both the vowel (p=0.00) and number (p=0.00) samples. The average CPPS for the vowel was 14.53 pre-therapy and 16.37 post-therapy (p=0.01); for the number emission, it was 8.22 pre-therapy and 9.06 post-therapy (p=0.02). The CPPS differences for both the vowel and the numbers indicate vocal improvement at post-therapy. The average AVQI was 2.27 pre-therapy and 1.54 post-therapy (p=0.05), an improvement with a borderline p-value. CONCLUSION Voice therapy produced changes in the overall degree of vocal deviation, as well as in CPPS and AVQI measurements, and the post-therapy results are similar to those of vocally healthy individuals.
Affiliation(s)
- Eduarda Cristina Hofman
- Departamento de Fonoaudiologia, Universidade Estadual do Centro-Oeste - UNICENTRO - Irati (PR), Brasil.
- Ana Paula Dassie-Leite
- Departamento de Fonoaudiologia, Universidade Estadual do Centro-Oeste - UNICENTRO - Irati (PR), Brasil.
- Eliane Cristina Pereira
- Departamento de Fonoaudiologia, Universidade Estadual do Centro-Oeste - UNICENTRO - Irati (PR), Brasil.
|
13
|
Barsties V Latoszek B, Mayer J, Watts CR, Lehnert B. Advances in Clinical Voice Quality Analysis with VOXplot. J Clin Med 2023; 12:4644. [PMID: 37510759 PMCID: PMC10380658 DOI: 10.3390/jcm12144644] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Received: 06/12/2023] [Revised: 07/04/2023] [Accepted: 07/08/2023] [Indexed: 07/30/2023] Open
Abstract
BACKGROUND In standard clinical practice, voice quality is assessed perceptually, and acoustic evaluation of digital voice recordings is used to validate and further interpret the perceptual judgments. The goal of the present study was to determine the strongest acoustic voice quality parameters for perceived hoarseness and breathiness when analyzing the sustained vowel [a:] using a new clinical acoustic tool, the VOXplot software. METHODS A total of 218 voice samples of individuals with and without voice disorders were subjected to perceptual and acoustic analyses. Overall, 13 single acoustic parameters were included to determine validity aspects in relation to perceptions of hoarseness and breathiness. RESULTS Four single acoustic measures could be clearly associated with perceptions of hoarseness or breathiness. For hoarseness, the harmonics-to-noise ratio (HNR) and the pitch perturbation quotient with a smoothing factor of five periods (PPQ5), and, for breathiness, the smoothed cepstral peak prominence (CPPS) and the glottal-to-noise excitation ratio (GNE) were shown to be highly valid, with a significant difference being demonstrated for each of the other perceptual voice quality aspects. CONCLUSIONS Two acoustic measures, the HNR and the PPQ5, were both strongly associated with perceptions of hoarseness and were able to discriminate hoarseness from breathiness with good confidence. Two other acoustic measures, the CPPS and the GNE, were both strongly associated with perceptions of breathiness and were able to discriminate breathiness from hoarseness with good confidence.
Affiliation(s)
- Ben Barsties V Latoszek
- Speech-Language Pathology, SRH University of Applied Health Sciences, 40210 Düsseldorf, Germany
- Jörg Mayer
- Institute for Natural Language Processing, University of Stuttgart, 70049 Stuttgart, Germany
- Christopher R Watts
- Harris College of Nursing & Health Sciences, Texas Christian University, Fort Worth, TX 76109, USA
- Bernhard Lehnert
- Department of Oto-Rhino-Laryngology, Phoniatrics and Pedaudiology Division, University Medicine Greifswald, 17475 Greifswald, Germany
|
14
|
Uloza V, Ulozaitė-Stanienė N, Petrauskas T, Pribuišis K, Blažauskas T, Damaševičius R, Maskeliūnas R. Reliability of Universal-Platform-Based Voice Screen Application in AVQI Measurements Captured with Different Smartphones. J Clin Med 2023; 12:4119. [PMID: 37373811 DOI: 10.3390/jcm12124119] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/05/2023] [Revised: 06/15/2023] [Accepted: 06/16/2023] [Indexed: 06/29/2023] Open
Abstract
The aim of the study was to develop a universal-platform-based (UPB) application suitable for different smartphones for estimation of the Acoustic Voice Quality Index (AVQI) and to evaluate its reliability in AVQI measurements and in differentiating normal from pathological voices. Our study group consisted of 135 adult individuals, including 49 with normal voices and 86 patients with pathological voices. The developed UPB "Voice Screen" application installed on five iOS and Android smartphones was used for AVQI estimation. The AVQI measures calculated from voice recordings obtained from a reference studio microphone were compared with AVQI results obtained using smartphones. The diagnostic accuracy of differentiating normal and pathological voices was evaluated by applying receiver-operating characteristics. One-way ANOVA did not detect statistically significant differences between mean AVQI scores obtained using a studio microphone and different smartphones (F = 0.759; p = 0.58). Almost perfect direct linear correlations (r = 0.991-0.987) were observed between the AVQI results obtained with a studio microphone and different smartphones. The AVQI showed an acceptable level of precision in discriminating between normal and pathological voices, with areas under the curve (AUC) of 0.834-0.862. There were no statistically significant differences between the AUCs (p > 0.05) obtained from studio and smartphones' microphones. The largest difference between the AUCs was only 0.028. The UPB "Voice Screen" application represented an accurate and robust tool for voice quality measurements and normal vs. pathological voice screening purposes, demonstrating the potential to be used by patients and clinicians for voice assessment, employing both iOS and Android smartphones.
Affiliation(s)
- Virgilijus Uloza
- Department of Otorhinolaryngology, Lithuanian University of Health Sciences, 50061 Kaunas, Lithuania
- Nora Ulozaitė-Stanienė
- Department of Otorhinolaryngology, Lithuanian University of Health Sciences, 50061 Kaunas, Lithuania
- Tadas Petrauskas
- Department of Otorhinolaryngology, Lithuanian University of Health Sciences, 50061 Kaunas, Lithuania
- Kipras Pribuišis
- Department of Otorhinolaryngology, Lithuanian University of Health Sciences, 50061 Kaunas, Lithuania
- Tomas Blažauskas
- Faculty of Informatics, Kaunas University of Technology, 51368 Kaunas, Lithuania
- Rytis Maskeliūnas
- Faculty of Informatics, Kaunas University of Technology, 51368 Kaunas, Lithuania
|
15
|
Kankare E, Rantala L, Laukkanen AM. Vocal Fatigue Index in Finnish-Speaking Population. J Voice 2023:S0892-1997(23)00092-9. [PMID: 37003862 DOI: 10.1016/j.jvoice.2023.02.034] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/02/2023] [Revised: 02/24/2023] [Accepted: 02/24/2023] [Indexed: 04/03/2023]
Abstract
BACKGROUND AND OBJECTIVE Vocal fatigue is an important complaint that may indicate a voice disorder or a risk thereof. There is a need for a reliable tool to detect and quantify vocal fatigue and to distinguish dysphonic from vocally healthy speakers. The Vocal Fatigue Index (VFI) questionnaire has been found valid and reliable among speakers of different languages. This study aims to validate it for speakers of Finnish. STUDY DESIGN Experimental comparative study. METHODS The VFI questionnaire was translated from English to Finnish according to the WHO recommendations. Next, it was subjected to the validation procedure. In total, 160 Finnish speakers volunteered to participate in the study. One hundred and eight were voice patients (83 males, 25 females) and 52 were vocally healthy controls (37 females, 15 males). As a comparison, the Voice Handicap Index (VHI) questionnaire was completed and voice samples were recorded to enable Acoustic Voice Quality Index (AVQI03.01FIN) analysis. RESULTS Results from the first and second completions of the VFI(F) questionnaire correlated strongly (Spearman's rho 0.901, P = 0.01). Answers to the individual questions of the VFI(F) also correlated strongly, showing high internal consistency. Factor 1 (Tiredness of voice and avoidance of voice use) of the VFI correlated strongly with the VHI, and the two other factors (Physical discomfort associated with voicing and Improvement of symptoms) correlated moderately with the VHI. Factor 1 of the VFI(F) correlated moderately with AVQI03.01FIN and its sub-parameters, CPPS, HNR, and shimmer. The VFI(F) showed good construct validity, differentiating voice patients and controls at cut-off 13.5, with sensitivity of 0.963 and specificity of 0.885. Discriminatory power was strong for all factors: F1 AROC = 0.985, F2 AROC = 0.864, and F3 AROC = 0.821. CONCLUSION The VFI(F) correlates with the VHI and with AVQI03.01FIN, and it is a valid and reliable tool for detecting vocal fatigue in Finnish speakers.
Affiliation(s)
- Eliina Kankare
- Department of Rehabilitation and Psychosocial Support, Logopedics, Phoniatrics, Tampere University Hospital, Tampere, Finland; Speech and Voice Research Laboratory, Tampere University, Tampere, Finland.
- Leena Rantala
- Speech and Voice Research Laboratory, Tampere University, Tampere, Finland
|
16
|
Puig-Herreros C, Sanz JL, Rosell-Clari V, Barona L, Melo M. What Are the Contemporary Trends on Euphonic Voice Research? A Scientometric Analysis. Healthcare (Basel) 2022; 10:2137. [PMID: 36360478 PMCID: PMC9690488 DOI: 10.3390/healthcare10112137] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/14/2022] [Revised: 10/24/2022] [Accepted: 10/26/2022] [Indexed: 11/30/2022] Open
Abstract
(1) Background: The human euphonic voice has been researched in recent years from different perspectives. Therefore, it is pertinent to assess the current state of the science. The aim of analyzing the characteristics of normal voice-related publications over the last 11 years is to identify research trends, the numerical and temporal evolution of the publications, their type, and the most-used descriptors. (2) Methods: Bibliometric data from 2011 to 2021 were obtained through several databases. Subsequently, a science mapping analysis was made via VOSviewer software. (3) Results: A total of 901 publications were obtained. The analysis of the scientific production in the field of the euphonic voice shows a slight increase over the last 11 years, with an average of 82 publications per year. Co-authorship analysis revealed 6215 authors contributing to the field with 901 articles (headed by Jiang, J.J. with 18 articles). Keyword co-occurrence analysis highlighted the lack of temporal advancement and variety in the terminology used in the field of voice research. (4) Conclusions: This scientometric study sheds light on the need to broaden this field of study and to establish solid research groups that contribute to its advancement.
Affiliation(s)
- Clara Puig-Herreros
- Department of Basic Psychology, Speech Therapy University Clinic, Universitat de València, 46010 València, Spain
- José Luis Sanz
- Department of Stomatology, Dental University Clinic, Universitat de València, 46010 València, Spain
- Vicent Rosell-Clari
- Department of Basic Psychology, Speech Therapy University Clinic, Universitat de València, 46010 València, Spain
- Luz Barona
- Department of Otolaryngology, Barona Clinic, Casa de la Salud Hospital, 46021 València, Spain
- María Melo
- Department of Stomatology, Dental University Clinic, Universitat de València, 46010 València, Spain
|
17
|
Walden PR, Rau S. Individual Voice Dimensions' Prediction of Overall Dysphonia Severity on Two Auditory-Perceptual Scales. Journal of Speech, Language, and Hearing Research: JSLHR 2022; 65:2759-2777. [PMID: 35868295 DOI: 10.1044/2022_jslhr-21-00689] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 06/15/2023]
Abstract
BACKGROUND Auditory-perceptual evaluation of dysphonic voice is an essential clinical activity that characterizes the nature of dysphonia and aids in planning its clinical management. Although there are multidimensional acoustic measures that correlate well with overall severity ratings, they tend to include measures that have only small or moderate correlations with the individual voice characteristics that are frequently rated perceptually (e.g., breathiness or roughness). Given this difference between perceptual and acoustic measures, it is unclear how much individual voice characteristics contribute to a listener's perception of overall severity of dysphonia. PURPOSE The purpose of this study was to explore individual voice characteristics' relative contribution to the rating of overall dysphonia severity and to explore sex-related differences. METHOD Two hundred ninety-six voice samples were accessed from the Perceptual Voice Qualities Database. Roughness, breathiness, asthenia, strain, pitch, and loudness ratings from the Grade, Roughness, Breathiness, Asthenia, Strain and Consensus Auditory-Perceptual Evaluation of Voice scales were used to predict overall voice quality severity in linear regression with bootstrapped coefficients. RESULTS Roughness, breathiness, and strain were the strongest predictors of overall severity. Asthenia and, to a lesser extent, pitch were also significant predictors of overall severity. Loudness was not a significant predictor. Several sex-related differences were noted, as well as differences related to the scale used. CONCLUSIONS Breathiness, roughness, and strain were all important predictors of overall severity for all regressions. Clinicians should be aware of scale-related differences if they are using auditory-perceptual measures to choose voice therapy targets. Analyses accounting for perceptual strategy differences are recommended for future studies.
Affiliation(s)
- Sydney Rau
- Department of Communication Sciences and Disorders, St. John's University, Queens, NY
|