1
Du Z, Xu Y, Yu X, Wang S, Xu L. Estimation of Speech Features Using a Wearable Inertial Sensor. J Voice 2024:S0892-1997(24)00303-5. [PMID: 39393952 DOI: 10.1016/j.jvoice.2024.09.012] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/10/2024] [Revised: 09/07/2024] [Accepted: 09/09/2024] [Indexed: 10/13/2024]
Abstract
Speech features have been investigated as novel digital biomarkers for many psychiatric and neurocognitive diseases. Microphones are the most commonly used devices for speech recording, but they inevitably suffer from several disadvantages, such as privacy leakage and environmental noise, that limit their clinical applications, particularly for long-term ambulatory monitoring. The aim of the present study is therefore to explore the feasibility of extracting speech features from the acceleration recorded on the sternum. Ten healthy subjects volunteered for our study. Two speech tasks, that is, repeating one sentence 20 times and reading 20 different sentences, were performed by each subject, with each task repeated eight times under different speech rates and loudness levels. Voice signals and speech-caused chest vibrations were simultaneously recorded by a microphone and an accelerometer placed on the sternum. Forty-two acoustic features and six time-related prosodic features were extracted from both signals using a standard toolbox, and then compared by a linear fit and correlation analysis. Good agreement between the acceleration features and microphone features is observed in all six time-related prosodic features for both tasks, but only in 19 and 17 acoustic features for tasks 1 and 2, respectively, with most of them loudness- or pitch-related. Our results suggest that the sternum acceleration tracks time-related speech prosody, loudness, and pitch very well, demonstrating the feasibility of deriving digital biomarkers from the acceleration signal for diseases strongly related to time-related prosodic and loudness features.
Affiliation(s)
- Zuyu Du
- School of Information Science and Technology, ShanghaiTech University, Shanghai, China
- Yaodan Xu
- School of Information Science and Technology, ShanghaiTech University, Shanghai, China; Shanghai Advanced Research Institute, Chinese Academy of Science, Shanghai, China
- Xinsheng Yu
- Shanghai Ruiwei Digital Technology, Shanghai, China
- Sen Wang
- Shanghai Ruiwei Digital Technology, Shanghai, China
- Lin Xu
- School of Information Science and Technology, ShanghaiTech University, Shanghai, China; Shanghai Frontiers Science Center of Human-centered Artificial Intelligence, Shanghai, China; MoE Key Lab of Intelligent Perception and Human-Machine Collaboration (ShanghaiTech University), Shanghai, China.
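The comparison described in the abstract above, a linear fit and correlation analysis between microphone-derived and accelerometer-derived values of the same feature, can be illustrated with a minimal sketch. The function names and the sample values are hypothetical and are not taken from the study, and the toolbox the authors used for feature extraction is not reproduced here.

```python
import math

def linear_fit(x, y):
    """Ordinary least-squares fit y = slope * x + intercept."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical values of one feature measured from both sensors
# across four recordings (illustrative numbers only).
mic_feature = [1.0, 2.0, 3.0, 4.0]
acc_feature = [2.1, 3.9, 6.2, 7.8]

slope, intercept = linear_fit(mic_feature, acc_feature)
r = pearson_r(mic_feature, acc_feature)
```

A correlation near 1 together with a stable slope would indicate that the accelerometer feature tracks the microphone feature, which is the sense of "good agreement" used in the abstract; the thresholds the authors applied are not stated here.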
2
Colletti L, Heller Murray E. Voice Onset Time in Children With and Without Vocal Fold Nodules. Journal of Speech, Language, and Hearing Research: JSLHR 2023; 66:1467-1478. [PMID: 36940476 PMCID: PMC10457081 DOI: 10.1044/2023_jslhr-22-00463] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/10/2022] [Revised: 11/21/2022] [Accepted: 01/16/2023] [Indexed: 05/11/2023]
Abstract
PURPOSE Voice onset time (VOT) of voiceless consonants provides information on the coordination of the vocal and articulatory systems. This study examined whether vocal-articulatory coordination is affected by the presence of vocal fold nodules (VFNs) in children. METHOD The voices of children with VFNs (6-12 years) and age- and gender-matched vocally healthy controls were examined. VOT was calculated as the time between the voiceless stop consonant burst and the vocal onset of the vowel. Measures of the average VOT and VOT variability, defined as the coefficient of variation, were calculated. The acoustic measure of dysphonia, cepstral peak prominence (CPP), was also calculated. CPP provides information about the overall periodicity of the signal, with more dysphonic voices having lower CPP values. RESULTS There were no significant differences in either average VOT or VOT variability between the VFN and control groups. VOT variability and average VOT were both significantly predicted by the interaction between Group and CPP. There was a significant negative correlation between CPP and VOT variability in the VFN group, but no significant relationship was found in the control group. CONCLUSIONS Unlike previous studies with adults, there were no group differences in average VOT or VOT variability in this study. However, children with VFNs who were more dysphonic had increased VOT variability, suggestive of a relationship between dysphonia severity and control of vocal onset during speech production.
Affiliation(s)
- Lauren Colletti
- Department of Communication Sciences and Disorders, College of Public Health, Temple University, Philadelphia, PA
- Elizabeth Heller Murray
- Department of Communication Sciences and Disorders, College of Public Health, Temple University, Philadelphia, PA
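The two VOT measures defined in the abstract above (VOT as the time from the stop burst to the vocal onset of the vowel, and VOT variability as the coefficient of variation) can be sketched as follows. The function names and the time stamps are hypothetical, and using the sample standard deviation is an assumption, since the abstract does not say which estimator was used.

```python
from statistics import mean, stdev

def voice_onset_times(burst_times_s, vocal_onset_times_s):
    """VOT per token: vocal onset of the vowel minus the stop burst (seconds)."""
    return [onset - burst for burst, onset in zip(burst_times_s, vocal_onset_times_s)]

def coefficient_of_variation(values):
    """Sample standard deviation divided by the mean (unitless).

    Assumption: sample (n - 1) standard deviation; the abstract does not specify.
    """
    return stdev(values) / mean(values)

# Hypothetical burst and vocal-onset time stamps for three tokens (seconds).
bursts = [0.100, 0.520, 0.910]
onsets = [0.160, 0.590, 0.965]

vots = voice_onset_times(bursts, onsets)
average_vot = mean(vots)
vot_variability = coefficient_of_variation(vots)
```

Because the coefficient of variation divides by the mean, it allows variability to be compared between speakers whose average VOT differs.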
3
Tseng WH, Chang CC, Chiu HL, Hsiao TY, Yang TL. Effects of surgery on the relationship between subglottic pressure and fundamental frequency in vocal fold dynamics in patients with benign laryngeal diseases. Eur Arch Otorhinolaryngol 2023; 280:1283-1290. [PMID: 36136150 DOI: 10.1007/s00405-022-07662-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/09/2022] [Accepted: 09/14/2022] [Indexed: 02/07/2023]
Abstract
PURPOSE Subglottic pressure (Ps) and fundamental frequency (F0) play important roles in governing vocal fold (VF) dynamics. Theoretical description, model simulation, excised larynx and animal models have been used in previous studies, yet clinically applicable measurements are still lacking. This study aimed to evaluate the effects of surgery for benign laryngeal lesions by investigating the relationship between F0 and Ps. METHODS Patients with benign laryngeal lesions who underwent phonosurgery were prospectively recruited. Participants were instructed to sustain voicing the vowel /o/ at three incremental frequencies four semitones apart in the modal register (F01, F02, F03). F0 was estimated by VF vibration on the accelerometer. Ps change was achieved and measured using the airflow interruption method. RESULTS Thirteen patients with a mean age (SD) of 43.5 (12.4) years were included. The change in F0 per unit change of Ps, which is the slope (Hz/kPa) of the regression line of the frequency-pressure data pairs, decreased as the tension of the VF increased. The slopes significantly increased after the operation for F01 and F02 (36.43 ± 14.68 preoperatively, 53.91 ± 30.71 postoperatively, p = 0.011 and 26.02 ± 10.71; 34.85 ± 17.92, p = 0.046, respectively). In addition, there was a significant decrease in phonation threshold pressure and improvements in the grade, roughness, breathiness, asthenia, strain scale, and the voice handicap inventory-10. CONCLUSIONS The relationship between F0 and Ps may serve as an objective assessment of the outcomes in the treatment of benign laryngeal diseases with clinical relevance.
Affiliation(s)
- Wen-Hsuan Tseng
- Department of Otolaryngology, National Taiwan University Hospital and National Taiwan University College of Medicine, #1, Sec. 1, Jen-Ai Road, Taipei, 100, Taiwan; Graduate Institute of Clinical Medicine, National Taiwan University College of Medicine, Taipei, Taiwan
- Chi-Chin Chang
- Department of Speech Language Pathology and Audiology, National Taipei University of Nursing and Health Sciences, Taipei, Taiwan
- Hsiang-Ling Chiu
- Department of Otolaryngology, National Taiwan University Hospital and National Taiwan University College of Medicine, #1, Sec. 1, Jen-Ai Road, Taipei, 100, Taiwan
- Tzu-Yu Hsiao
- Department of Otolaryngology, National Taiwan University Hospital and National Taiwan University College of Medicine, #1, Sec. 1, Jen-Ai Road, Taipei, 100, Taiwan
- Tsung-Lin Yang
- Department of Otolaryngology, National Taiwan University Hospital and National Taiwan University College of Medicine, #1, Sec. 1, Jen-Ai Road, Taipei, 100, Taiwan; Graduate Institute of Clinical Medicine, National Taiwan University College of Medicine, Taipei, Taiwan; Research Center for Developmental Biology and Regenerative Medicine, National Taiwan University, Taipei, Taiwan
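The outcome measure in the abstract above is the slope (Hz/kPa) of the regression line through frequency-pressure data pairs, obtained at three target frequencies spaced four semitones apart. A minimal sketch of both calculations follows. The function names, the 220 Hz base frequency, and the data pairs are hypothetical, and ordinary least squares is assumed because the abstract does not name the fitting method.

```python
def semitone_steps(base_hz, step_semitones, count):
    """Target frequencies spaced a fixed number of equal-tempered semitones apart."""
    return [base_hz * 2 ** (step_semitones * k / 12) for k in range(count)]

def regression_slope(pressure_kpa, f0_hz):
    """Least-squares slope of F0 (Hz) against subglottic pressure (kPa)."""
    n = len(pressure_kpa)
    mean_p = sum(pressure_kpa) / n
    mean_f = sum(f0_hz) / n
    spp = sum((p - mean_p) ** 2 for p in pressure_kpa)
    spf = sum((p - mean_p) * (f - mean_f) for p, f in zip(pressure_kpa, f0_hz))
    return spf / spp

# Three hypothetical target frequencies, four semitones apart.
targets_hz = semitone_steps(220.0, 4, 3)

# Hypothetical frequency-pressure pairs recorded at one target frequency.
ps_kpa = [0.5, 0.6, 0.7, 0.8]
f0_hz = [200.0, 204.0, 208.0, 212.0]
slope_hz_per_kpa = regression_slope(ps_kpa, f0_hz)
```

In this made-up data F0 rises 4 Hz for every 0.1 kPa, so the slope is 40 Hz/kPa, which is on the same scale as the values the abstract reports.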
4
Comparative Study on the Effects of Surface Neuromuscular Electrical Stimulation Between Subjects With Unilateral Vocal Fold Paralysis in the Paramedian and Median Positions. J Voice 2022. [DOI: 10.1016/j.jvoice.2022.09.014] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/06/2022]
5
Groll MD, Vojtech JM, Hablani S, Mehta DD, Buckley DP, Noordzij JP, Stepp CE. Automated Relative Fundamental Frequency Algorithms for Use With Neck-Surface Accelerometer Signals. J Voice 2022; 36:156-169. [PMID: 32653267 PMCID: PMC7790853 DOI: 10.1016/j.jvoice.2020.06.001] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 04/24/2020] [Accepted: 06/04/2020] [Indexed: 10/23/2022]
Abstract
OBJECTIVE Relative fundamental frequency (RFF) has been suggested as a potential acoustic measure of vocal effort. However, current clinical standards for RFF measures require time-consuming manual markings. Previous semi-automated algorithms have been developed to calculate RFF from microphone signals. The current study aimed to develop fully automated algorithms to calculate RFF from neck-surface accelerometer signals for ecological momentary assessment and ambulatory monitoring of voice. METHODS A training set of 2646 /vowel-fricative-vowel/ utterances from 317 unique speakers, with and without voice disorders, was used to develop automated algorithms to calculate RFF values from neck-surface accelerometer signals. The algorithms first rejected utterances with poor vowel-to-noise ratios, then identified fricative locations, then used signal features to determine voicing boundary cycles, and finally calculated corresponding RFF values. These automated RFF values were compared to the clinical gold standard of manual RFF calculated from simultaneously collected microphone signals in a novel test set of 639 utterances from 77 unique speakers. RESULTS Automated accelerometer-based RFF values resulted in an average mean bias error (MBE) across all cycles of 0.027 ST, with an MBE of 0.152 ST and -0.252 ST in the offset and onset cycles closest to the fricative, respectively. CONCLUSION All MBE values were smaller than the expected changes in RFF values following successful voice therapy, suggesting that the current algorithms could be used for ecological momentary assessment and ambulatory monitoring via neck-surface accelerometer signals.
Affiliation(s)
- Matti D. Groll
- Department of Biomedical Engineering, Boston University, Boston, 02215, Massachusetts; Department of Speech, Language and Hearing Sciences, Boston University, Boston, 02215, Massachusetts
- Jennifer M. Vojtech
- Department of Biomedical Engineering, Boston University, Boston, 02215, Massachusetts; Department of Speech, Language and Hearing Sciences, Boston University, Boston, 02215, Massachusetts
- Surbhi Hablani
- Department of Speech, Language and Hearing Sciences, Boston University, Boston, 02215, Massachusetts
- Daryush D. Mehta
- Center for Laryngeal Surgery and Voice Rehabilitation and MGH Institute of Health Professions, Massachusetts General Hospital, Boston, 02114, Massachusetts; Department of Surgery, Harvard Medical School, Boston, 02144, Massachusetts; Program in Rehabilitation Sciences, MGH Institute of Health Professions, Boston, 02129, Massachusetts; Speech and Hearing Bioscience and Technology Program, Division of Medical Sciences, Harvard Medical School, Boston, 02144, Massachusetts
- Daniel P. Buckley
- Department of Speech, Language and Hearing Sciences, Boston University, Boston, 02215, Massachusetts; Department of Otolaryngology – Head and Neck Surgery, Boston University School of Medicine, Boston, 02118, Massachusetts
- J. Pieter Noordzij
- Department of Otolaryngology – Head and Neck Surgery, Boston University School of Medicine, Boston, 02118, Massachusetts
- Cara E. Stepp
- Department of Biomedical Engineering, Boston University, Boston, 02215, Massachusetts; Department of Speech, Language and Hearing Sciences, Boston University, Boston, 02215, Massachusetts; Department of Otolaryngology – Head and Neck Surgery, Boston University School of Medicine, Boston, 02118, Massachusetts
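The abstract above evaluates the automated algorithms with the mean bias error (MBE) between automated and manual RFF values. A minimal sketch of that statistic follows. The function name and the sample values are hypothetical, and the sign convention (automated minus manual) is an assumption, since the abstract does not state it.

```python
def mean_bias_error(estimated, reference):
    """Mean of (estimated - reference); positive values mean overestimation.

    Assumption: bias is taken as automated minus manual.
    """
    if len(estimated) != len(reference):
        raise ValueError("estimated and reference must have the same length")
    return sum(e - r for e, r in zip(estimated, reference)) / len(estimated)

# Hypothetical RFF values in semitones (ST) for three utterances.
automated_rff_st = [-1.0, -2.5, 0.5]
manual_rff_st = [-1.2, -2.4, 0.3]

mbe_st = mean_bias_error(automated_rff_st, manual_rff_st)
```

Because signed errors can cancel, an MBE close to zero shows the absence of systematic bias rather than small per-utterance error.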
6
Heller Murray ES, Segina RK, Woodnorth GH, Stepp CE. Relative Fundamental Frequency in Children With and Without Vocal Fold Nodules. Journal of Speech, Language, and Hearing Research: JSLHR 2020; 63:361-371. [PMID: 32073342 PMCID: PMC7210445 DOI: 10.1044/2019_jslhr-19-00058] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/04/2023]
Abstract
Purpose Relative fundamental frequency (RFF) is an acoustic measure that is sensitive to functional voice differences in adults. The aim of the current study was to evaluate RFF in children, as there are known structural and functional differences between the pediatric and adult vocal mechanisms. Method RFF was analyzed in 28 children with vocal fold nodules (CwVN, M = 9.0 years) and 28 children with typical voices (CwTV, M = 8.9 years). RFF is the instantaneous fundamental frequency (f0) of the 10 vocalic cycles during devoicing (vocal offset) and 10 vocalic cycles during the revoicing (vocal onset) of the vowels that surround a voiceless consonant. Each cycle's f0 was normalized to a steady-state portion of the vowel. RFF values for the cycles closest to the voiceless consonant, that is, Offset Cycle 10 and Onset Cycle 1, were examined. Results Average RFF values for Offset Cycle 10 and Onset Cycle 1 did not differ between CwVN and CwTV; however, within-subject variability of Offset Cycle 10 was decreased in CwVN. Across both groups, male children had lower Offset Cycle 10 RFF values as compared to female children. Additionally, Onset Cycle 1 values were decreased in younger children as compared to those of older children. Conclusions Unlike previous work with adults, CwVN did not have significantly different RFF values than CwTV. Younger children had lower RFF values for Onset Cycle 1 than older children, suggesting that vocal onset f0 may provide information on the maturity of the laryngeal motor system.
Affiliation(s)
- Elizabeth S. Heller Murray
- Department of Speech, Language & Hearing Sciences, Boston University, MA
- Department of Otolaryngology and Communication Enhancement, Boston Children's Hospital, MA
- Roxanne K. Segina
- Department of Speech, Language & Hearing Sciences, Boston University, MA
- Cara E. Stepp
- Department of Speech, Language & Hearing Sciences, Boston University, MA
- Department of Otolaryngology—Head & Neck Surgery, Boston University School of Medicine, MA
- Department of Biomedical Engineering, Boston University, MA
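The RFF definition in the abstract above (instantaneous f0 of each vocalic cycle, normalized to a steady-state portion of the vowel) can be sketched as below. Two things are assumptions rather than statements from the abstract: that normalization is expressed in semitones as 12 times the base-2 logarithm of the frequency ratio, and that the steady-state reference for the offset cycles is the cycle farthest from the consonant. The function names and cycle durations are hypothetical, and only four cycles are shown instead of ten for brevity.

```python
import math

def instantaneous_f0(periods_s):
    """Convert cycle durations in seconds to instantaneous f0 in Hz."""
    return [1.0 / period for period in periods_s]

def rff_semitones(cycle_f0_hz, reference_f0_hz):
    """Express each cycle's f0 in semitones relative to a reference f0."""
    return [12.0 * math.log2(f / reference_f0_hz) for f in cycle_f0_hz]

# Hypothetical offset cycles: the vowel leading into a voiceless consonant.
# Assumption: the first listed cycle (farthest from the consonant) is the
# steady-state reference.
offset_periods_s = [0.0050, 0.0050, 0.0051, 0.0052]
offset_f0_hz = instantaneous_f0(offset_periods_s)
offset_rff_st = rff_semitones(offset_f0_hz, reference_f0_hz=offset_f0_hz[0])
```

With these made-up durations the cycles lengthen toward the consonant, so the last offset RFF value is negative, meaning f0 fell relative to the steady-state portion.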
7
Tseng WH, Chang CC, Yang TL, Hsiao TY. Estimating vocal fold stiffness: Using the relationship between subglottic pressure and fundamental frequency of phonation as an analog. Clin Otolaryngol 2019; 45:40-46. [PMID: 31625675 DOI: 10.1111/coa.13463] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.6] [Received: 06/11/2019] [Accepted: 10/13/2019] [Indexed: 11/29/2022]
Abstract
OBJECTIVE The stiffness of the vocal folds is an important factor in voice production, yet clinically applicable measurements are still lacking. It has been demonstrated in an in vivo canine model that fundamental frequency (F0) increased linearly as subglottic pressure (Ps) increased, but with a lesser slope for higher levels of vocal fold tension. In this study, the relationship between F0 and Ps was investigated using the airflow interruption method in awake patients non-invasively. DESIGN Healthy volunteers enrolled for evaluation. SETTING Single-centre. PARTICIPANTS Thirty-three healthy volunteers aged 20 and older were recruited, with one excluded for a recent asthma attack. MAIN OUTCOME MEASURES The relationships between F0 and Ps, described as the slope (Hz/kPa), were investigated when the participants sustained voicing the vowel /o/ at 3 incremental frequencies 4 semitones apart in the modal register (F1, F2 and F3). RESULTS Thirty-two healthy volunteers (20 females, 12 males) aged 20-47 years were enrolled for final analyses. There was a statistically significant difference in the slopes of the linear regression lines of F0-Ps, depending on the frequency with which the vowel /o/ was produced (P < .001). The slope differed significantly between F2 and F1 (P < .001; P = .015), F3 and F1 (P < .001; P = .002) and F3 and F2 (P < .001; P = .005) for both women and men, respectively. CONCLUSIONS It was demonstrated that the higher the vocal fold tension, the smaller the slope between F0 and Ps. Using the relationship between F0 and Ps as an analog of vocal fold stiffness is potentially practical for clinical application.
Affiliation(s)
- Wen-Hsuan Tseng
- Department of Otolaryngology, National Taiwan University Hospital and National Taiwan University College of Medicine, Taipei, Taiwan; Graduate Institute of Clinical Medicine, National Taiwan University College of Medicine, Taipei, Taiwan
- Chi-Chin Chang
- Department of Speech Language Pathology and Audiology, National Taipei University of Nursing and Health Sciences, Taipei, Taiwan
- Tsung-Lin Yang
- Department of Otolaryngology, National Taiwan University Hospital and National Taiwan University College of Medicine, Taipei, Taiwan; Graduate Institute of Clinical Medicine, National Taiwan University College of Medicine, Taipei, Taiwan; Research Center for Developmental Biology and Regenerative Medicine, National Taiwan University, Taipei, Taiwan
- Tzu-Yu Hsiao
- Department of Otolaryngology, National Taiwan University Hospital and National Taiwan University College of Medicine, Taipei, Taiwan
8
McKenna VS, Stepp CE. The relationship between acoustical and perceptual measures of vocal effort. The Journal of the Acoustical Society of America 2018; 144:1643. [PMID: 30424674 PMCID: PMC6167228 DOI: 10.1121/1.5055234] [Citation(s) in RCA: 39] [Impact Index Per Article: 6.5] [Received: 05/17/2018] [Revised: 08/15/2018] [Accepted: 09/06/2018] [Indexed: 05/15/2023]
Abstract
Excessive vocal effort is a common clinical voice symptom, yet the acoustical manifestation of vocal effort and how that is perceived by speakers and listeners has not been fully elucidated. Here, 26 vocally healthy adults increased vocal effort during the production of the utterance /ifi/, followed by self-ratings of effort on a 100 mm visual analog scale. Twenty inexperienced listeners assessed the speakers' vocal effort using the visual sort-and-rate method. Previously proposed acoustical correlates of vocal effort were calculated, including: mean sound pressure level (SPL), mean fundamental frequency (fo), relative fundamental frequency (RFF) offset cycle 10 and onset cycle 1, harmonics-to-noise ratio (HNR), cepstral peak prominence and its standard deviation (SD), and low-to-high (L/H) spectral ratio and its SD. Two separate mixed-effects regression models yielded mean SPL, L/H ratio, and HNR as significant predictors of both speaker and listener ratings of vocal effort. RFF offset cycle 10 and mean fo were significant predictors of listener ratings only. Therefore, speakers and listeners attended to similar acoustical cues when making judgments of vocal effort, but listeners also used additional time-based information. Further work is needed to determine how vocal effort manifests in the speech signal in speakers with voice disorders.
Affiliation(s)
- Victoria S McKenna
- Department of Speech, Language, and Hearing Sciences, Boston University, 677 Beacon Street, Boston, Massachusetts 02215, USA
- Cara E Stepp
- Department of Speech, Language, and Hearing Sciences, Boston University, 677 Beacon Street, Boston, Massachusetts 02215, USA
9
Kagan LS, Heaton JT. The Effectiveness of Low-Level Light Therapy in Attenuating Vocal Fatigue. J Voice 2017; 31:384.e15-384.e23. [DOI: 10.1016/j.jvoice.2016.09.004] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.0] [Received: 06/16/2016] [Revised: 09/07/2016] [Accepted: 09/08/2016] [Indexed: 11/29/2022]
10
McKenna VS, Heller Murray ES, Lien YAS, Stepp CE. The Relationship Between Relative Fundamental Frequency and a Kinematic Estimate of Laryngeal Stiffness in Healthy Adults. Journal of Speech, Language, and Hearing Research: JSLHR 2016; 59:1283-1294. [PMID: 27936279 PMCID: PMC5399757 DOI: 10.1044/2016_jslhr-s-15-0406] [Citation(s) in RCA: 20] [Impact Index Per Article: 2.5] [Received: 11/24/2015] [Revised: 02/21/2016] [Accepted: 05/02/2016] [Indexed: 05/19/2023]
Abstract
PURPOSE This study examined the relationship between the acoustic measure relative fundamental frequency (RFF) and a kinematic estimate of laryngeal stiffness. METHOD Twelve healthy adults (mean age = 22.7 years, SD = 4.4; 10 women, 2 men) produced repetitions of /ifi/ while varying their vocal effort during simultaneous acoustic and video nasendoscopic recordings. RFF was determined from the last 10 voicing cycles before the voiceless obstruent (RFF offset) and the first 10 cycles of revoicing (RFF onset). A kinematic stiffness ratio was calculated for the vocal fold adductory gesture during revoicing by normalizing the maximum angular velocity by the maximum glottic angle during the voiceless obstruent. RESULTS A linear mixed effect model indicated that RFF offset and onset were significant predictors of the kinematic stiffness ratios. The model accounted for 52% of the variance in the kinematic data. Individual relationships between RFF and kinematic stiffness ratios varied across participants, with at least moderate negative correlations in 83% of participants for RFF offset but only 40% of participants for RFF onset. CONCLUSIONS RFF significantly predicted kinematic estimates of laryngeal stiffness in healthy speakers and has the potential to be a useful clinical indicator of laryngeal tension. Further research is needed in individuals with voice disorders.
Affiliation(s)
- Yu-An S. Lien
- Department of Biomedical Engineering, Boston University, MA
- Cara E. Stepp
- Department of Speech, Language, and Hearing Sciences, Boston University, MA
- Department of Biomedical Engineering, Boston University, MA
- Department of Otolaryngology—Head and Neck Surgery, Boston University School of Medicine, MA
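The kinematic stiffness ratio in the abstract above is the maximum angular velocity of the adductory gesture normalized by the maximum glottic angle during the voiceless obstruent. The sketch below illustrates that ratio on a hypothetical glottic-angle trace. Estimating velocity by frame-to-frame differences is an assumption made here for illustration; the abstract does not describe how velocity was derived from the video, and the function name, frame rate, and angle values are made up.

```python
def kinematic_stiffness_ratio(glottic_angles_deg, frame_rate_hz):
    """Maximum angular speed (deg/s) divided by maximum glottic angle (deg).

    Assumption: angular velocity is approximated by differences between
    consecutive video frames. The result has units of 1/s.
    """
    speeds = [
        abs(later - earlier) * frame_rate_hz
        for earlier, later in zip(glottic_angles_deg, glottic_angles_deg[1:])
    ]
    return max(speeds) / max(glottic_angles_deg)

# Hypothetical glottic angles (degrees) during adduction, sampled at 30 frames/s.
angles_deg = [40.0, 38.0, 30.0, 15.0, 5.0, 0.0]
stiffness_ratio = kinematic_stiffness_ratio(angles_deg, frame_rate_hz=30.0)
```

In this example the fastest closing step is 15 degrees per frame, or 450 deg/s, and the widest opening is 40 degrees, so the ratio is 11.25 per second. Normalizing by the maximum angle keeps a wide gesture from looking stiffer only because it travels farther.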
11
Mehta DD, Van Stan JH, Hillman RE. Relationships between vocal function measures derived from an acoustic microphone and a subglottal neck-surface accelerometer. IEEE/ACM Transactions on Audio, Speech, and Language Processing 2016; 24:659-668. [PMID: 27066520 PMCID: PMC4826073 DOI: 10.1109/taslp.2016.2516647] [Citation(s) in RCA: 19] [Impact Index Per Article: 2.4] [Indexed: 05/10/2023]
Abstract
Monitoring subglottal neck-surface acceleration has received renewed attention due to the ability of low-profile accelerometers to confidentially and noninvasively track properties related to normal and disordered voice characteristics and behavior. This study investigated the ability of subglottal neck-surface acceleration to yield vocal function measures traditionally derived from the acoustic voice signal and help guide the development of clinically functional accelerometer-based measures from a physiological perspective. Results are reported for 82 adult speakers with voice disorders and 52 adult speakers with normal voices who produced the sustained vowels /a/, /i/, and /u/ at a comfortable pitch and loudness during the simultaneous recording of radiated acoustic pressure and subglottal neck-surface acceleration. As expected, timing-related measures of jitter exhibited the strongest correlation between acoustic and neck-surface acceleration waveforms (r ≤ 0.99), whereas amplitude-based measures of shimmer correlated less strongly (r ≤ 0.74). Additionally, weaker correlations were exhibited by spectral measures of harmonics-to-noise ratio (r ≤ 0.69) and tilt (r ≤ 0.57), whereas the cepstral peak prominence correlated more strongly (r ≤ 0.90). These empirical relationships provide evidence to support the use of accelerometers as effective complements to acoustic recordings in the assessment and monitoring of vocal function in the laboratory, clinic, and during an individual's daily activities.
Affiliation(s)
- Daryush D Mehta
- Center for Laryngeal Surgery & Voice Rehabilitation, Massachusetts General Hospital, Boston MA 02114 USA, Department of Surgery, Harvard Medical School, Boston, MA 02115 USA, and the Institute of Health Professions, Massachusetts General Hospital, Boston, Massachusetts 02129 USA
- Jarrad H Van Stan
- Center for Laryngeal Surgery & Voice Rehabilitation, Massachusetts General Hospital, Boston MA 02114 USA and the Institute of Health Professions, Massachusetts General Hospital, Boston, Massachusetts 02129 USA
- Robert E Hillman
- Center for Laryngeal Surgery & Voice Rehabilitation and Institute of Health Professions, Massachusetts General Hospital, Boston MA 02114 USA and Surgery and Health Sciences & Technology, Harvard Medical School, Boston, MA 02115
12
Lien YAS, Calabrese CR, Michener CM, Murray EH, Van Stan JH, Mehta DD, Hillman RE, Noordzij JP, Stepp CE. Voice Relative Fundamental Frequency Via Neck-Skin Acceleration in Individuals With Voice Disorders. Journal of Speech, Language, and Hearing Research: JSLHR 2015; 58:1482-7. [PMID: 26134171 PMCID: PMC4686308 DOI: 10.1044/2015_jslhr-s-15-0126] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.8] [Received: 04/04/2015] [Accepted: 06/25/2015] [Indexed: 05/20/2023]
Abstract
PURPOSE This study investigated the use of neck-skin acceleration for relative fundamental frequency (RFF) analysis. METHOD Forty individuals with voice disorders associated with vocal hyperfunction and 20 age- and sex-matched control participants were recorded with a subglottal neck-surface accelerometer and a microphone while producing speech stimuli appropriate for RFF. Rater reliabilities, RFF means, and RFF standard deviations derived from the accelerometer were compared with those derived from the microphone. RESULTS RFF estimated from the accelerometer had slightly higher intrarater reliability and identical interrater reliability compared with values estimated with the microphone. Although sensor type and the Vocal Cycle × Sensor and Vocal Cycle × Sensor × Group interactions showed significant effects on RFF means, the typical RFF pattern could be derived from either sensor. For both sensors, the RFF of individuals with vocal hyperfunction was lower than that of the controls. Sensor type and its interactions did not have significant effects on RFF standard deviations. CONCLUSIONS RFF can be reliably estimated using an accelerometer, but these values cannot be compared with those collected via microphone. Future studies are needed to determine the physiological basis of RFF and examine the effect of sensors on RFF in practical voice assessment and monitoring settings.
Affiliation(s)
- Jarrad H. Van Stan
- MGH Institute of Health Professions, Boston, MA
- Center for Laryngeal Surgery & Voice Rehabilitation, Massachusetts General Hospital, Boston
- Daryush D. Mehta
- MGH Institute of Health Professions, Boston, MA
- Center for Laryngeal Surgery & Voice Rehabilitation, Massachusetts General Hospital, Boston
- Harvard Medical School, Cambridge, MA
- Robert E. Hillman
- MGH Institute of Health Professions, Boston, MA
- Center for Laryngeal Surgery & Voice Rehabilitation, Massachusetts General Hospital, Boston
- Harvard Medical School, Cambridge, MA
- J. Pieter Noordzij
- Boston University, Boston, MA
- Boston University School of Medicine, Boston, MA
- Cara E. Stepp
- Boston University, Boston, MA
- Boston University School of Medicine, Boston, MA
13
Van Stan JH, Mehta DD, Zeitels SM, Burns JA, Barbu AM, Hillman RE. Average Ambulatory Measures of Sound Pressure Level, Fundamental Frequency, and Vocal Dose Do Not Differ Between Adult Females With Phonotraumatic Lesions and Matched Control Subjects. Ann Otol Rhinol Laryngol 2015; 124:864-74. [PMID: 26024911 DOI: 10.1177/0003489415589363] [Citation(s) in RCA: 47] [Impact Index Per Article: 5.2] [Indexed: 11/16/2022]
Abstract
OBJECTIVES Clinical management of phonotraumatic vocal fold lesions (nodules, polyps) is based largely on assumptions that abnormalities in habitual levels of sound pressure level (SPL), fundamental frequency (f0), and/or amount of voice use play a major role in lesion development and chronic persistence. This study used ambulatory voice monitoring to evaluate if significant differences in voice use exist between patients with phonotraumatic lesions and normal matched controls. METHODS Subjects were 70 adult females: 35 with vocal fold nodules or polyps and 35 age-, sex-, and occupation-matched normal individuals. Weeklong summary statistics of voice use were computed from anterior neck surface acceleration recorded using a smartphone-based ambulatory voice monitor. RESULTS Paired t tests and Kolmogorov-Smirnov tests resulted in no statistically significant differences between patients and matched controls regarding average measures of SPL, f0, vocal dose measures, and voicing/voice rest periods. Paired t tests comparing f0 variability between the groups resulted in statistically significant differences with moderate effect sizes. CONCLUSIONS Individuals with phonotraumatic lesions did not exhibit differences in average ambulatory measures of vocal behavior when compared with matched controls. More refined characterizations of underlying phonatory mechanisms and other potentially contributing causes are warranted to better understand risk factors associated with phonotraumatic lesions.
Affiliation(s)
- Jarrad H Van Stan
- Center for Laryngeal Surgery and Voice Rehabilitation, Massachusetts General Hospital, Boston, Massachusetts, USA; MGH Institute of Health Professions, Boston, Massachusetts, USA
- Daryush D Mehta
- Center for Laryngeal Surgery and Voice Rehabilitation, Massachusetts General Hospital, Boston, Massachusetts, USA; MGH Institute of Health Professions, Boston, Massachusetts, USA; Department of Surgery, Harvard Medical School, Boston, Massachusetts, USA
- Steven M Zeitels
- Center for Laryngeal Surgery and Voice Rehabilitation, Massachusetts General Hospital, Boston, Massachusetts, USA; Department of Surgery, Harvard Medical School, Boston, Massachusetts, USA
- James A Burns
- Center for Laryngeal Surgery and Voice Rehabilitation, Massachusetts General Hospital, Boston, Massachusetts, USA; Department of Surgery, Harvard Medical School, Boston, Massachusetts, USA
- Anca M Barbu
- Center for Laryngeal Surgery and Voice Rehabilitation, Massachusetts General Hospital, Boston, Massachusetts, USA; Department of Surgery, Harvard Medical School, Boston, Massachusetts, USA
- Robert E Hillman
- Center for Laryngeal Surgery and Voice Rehabilitation, Massachusetts General Hospital, Boston, Massachusetts, USA; MGH Institute of Health Professions, Boston, Massachusetts, USA; Department of Surgery, Harvard Medical School, Boston, Massachusetts, USA
14
Effects of Adventitious Acute Vocal Trauma: Relative Fundamental Frequency and Listener Perception. J Voice 2015; 30:177-85. [PMID: 26028369 DOI: 10.1016/j.jvoice.2015.04.005] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.2] [Received: 12/26/2014] [Accepted: 04/08/2015] [Indexed: 11/23/2022]
Abstract
OBJECTIVE High voice users (individuals who demonstrate excessive or loud vocal use) are at risk for developing voice disorders. The objective of this study was to examine, both acoustically and perceptually, vocal changes in healthy speakers after an acute period of high voice use. METHODS Members of a university women's volleyball team (n = 12) were recorded a week before (pre) and a week after (post) the 10-week spring season; n = 6 control speakers were recorded over the same time period for comparison. Speakers read four sentences, which were analyzed for relative fundamental frequency (RFF). Eight naïve listeners participated in an auditory-perceptual visual sort and rate (VSR) task, in which they rated each voice sample's overall severity and strain. RESULTS No significant differences were found as a function of time point in the VSR ratings for the volleyball group. Onset cycle 1 RFF values were significantly lower (P = 0.04) in the postrecordings of the volleyball participants compared with prerecordings, but there was no significant difference (P = 0.20) in offset cycle 10 RFF values. Receiver operating characteristic analyses indicated moderate sensitivity and specificity of onset cycle 1 RFF for discrimination between the volleyball and control participants. Changes were not apparent in the control group as a function of time for onset cycle 1 RFF, offset cycle 10 RFF, or either vocal attribute. CONCLUSIONS Onset cycle 1 RFF may be an effective marker for detecting vocal changes over an acute period of high voice use before perceptual changes are noted.