1
Calvache C, Castillo-Triana N, Aguirre FD, Leguízamo P, Rojas S, Valenzuela P, Piedrahita MM, Ardila MDPR, Pérez DVB. Integration of Dysphagia Therapy Techniques into Voice Rehabilitation: Design and Content Validation of a Cross-Therapy Protocol. J Voice 2024:S0892-1997(24)00235-2. [PMID: 39244386 DOI: 10.1016/j.jvoice.2024.07.024] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/13/2024] [Revised: 06/06/2024] [Accepted: 07/22/2024] [Indexed: 09/09/2024]
Abstract
BACKGROUND The intricate relationship between swallowing and phonation, which share anatomical and physiological substrates, underscores a clinical demand for integrated therapeutic approaches. Existing interventions often address these functions in isolation, overlooking their interconnected dynamics. OBJECTIVE To design and validate a cross-therapy protocol incorporating dysphagia therapy techniques (maneuvers/exercises) into voice rehabilitation. This protocol aims to exploit the shared biomechanical components of swallowing and phonation to improve both functions simultaneously in patients with underlying hypofunctional laryngeal pathology. METHODS A descriptive research design was employed, consisting of three phases: a comprehensive literature review and expert discussions in a German seminar format to conceptualize the protocol; detailed analysis and categorization of swallowing maneuvers/exercises; and content validation by a panel of seven experts through a structured evaluation instrument. The process integrated motor learning and exercise physiology principles to ensure the protocol's clinical applicability and theoretical coherence. RESULTS The developed cross-therapy protocol incorporates four core swallowing therapy techniques into voice therapy procedures. The selected swallowing therapy techniques target laryngeal excursion and vocal fold closure because these are critical components of both swallowing and phonation. Expert validation yielded a Content Validity Coefficient exceeding 0.90 for most items, indicating high consensus on the protocol's relevance, clarity, and applicability. Adjustments were made based on feedback, enhancing the protocol's precision and user-friendliness. CONCLUSION We present a novel, evidence-based therapy protocol for voice and swallowing difficulties resulting from hypofunctional laryngeal pathology. Its development marks a significant step toward bridging the gap between swallowing and voice therapy. Future empirical studies are needed to assess its effectiveness in clinical settings.
Affiliation(s)
- Carlos Calvache
  - Corporación Universitaria Iberoamericana, Department Communication Sciences and Disorders, Bogotá, Colombia; Vocology Research, Vocology Center, Bogotá, Colombia
- Nicolás Castillo-Triana
  - Corporación Universitaria Iberoamericana, Department Communication Sciences and Disorders, Bogotá, Colombia
- Fernando Delprado Aguirre
  - Vocology Research, Vocology Center, Bogotá, Colombia; Fundación Universitaria María Cano, Speech Therapy Program, Medellín, Colombia
- Paola Leguízamo
  - Escuela Colombiana de Rehabilitación, Speech Therapy Program, Bogotá, Colombia
- Sandra Rojas
  - Escuela de Fonoaudiología, Facultad de Odontología y Ciencias de la Rehabilitación, Universidad San Sebastián, Santiago, Chile
2
Yamauchi A, Imagawa H, Yokonishi H, Sakakibara KI, Tayama N. Multivariate Analysis of Vocal Fold Vibrations in Normal Speakers Using High-Speed Digital Imaging. J Voice 2024; 38:10-17. [PMID: 34470706 DOI: 10.1016/j.jvoice.2021.08.002] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Received: 06/06/2021] [Revised: 07/30/2021] [Accepted: 08/02/2021] [Indexed: 11/18/2022]
Abstract
INTRODUCTION Little is known about the normal variations in vocal fold vibrations. We conducted a prospective study on normal subjects using high-speed digital imaging (HSDI) to elucidate key parameters regarding age/gender-related normal variations. METHODS Forty-six healthy adult volunteers were divided into young (aged ≤35 years) male, young female, elderly (aged ≥65 years) male, and elderly female subgroups. HSDI data of sustained phonation of /i/ at a comfortable pitch and loudness were obtained, and vibratory parameters were calculated using the visual-perceptual rating, laryngotopography, digital kymography, and glottal area waveform. Multivariate analysis was then performed on these parameters to clarify the subgroup-specific key parameters. RESULTS Four key parameters were identified from a total of 83: one from visual perceptual rating and three from laryngotopography. Subgroup analyses showed that posterior-to-anterior longitudinal phase difference (PD) and high fundamental frequency (F0) were specific to young female participants. A low F0 was specific to young male participants. Large anterior-to-posterior longitudinal PD and its left-right difference were specific to elderly male participants. There were no key parameters for elderly female participants. CONCLUSIONS Methods that can assess F0 and longitudinal PD, such as visual-perceptual rating and laryngotopography, were effective in the evaluation of normal vocal fold vibrations and their variations.
Affiliation(s)
- Akihito Yamauchi
  - Department of Otolaryngology, The University of Tokyo Hospital, Bunkyo-Ku, Tokyo, Japan
- Hiroshi Imagawa
  - Department of Otolaryngology, The University of Tokyo Hospital, Bunkyo-Ku, Tokyo, Japan
- Hisayuki Yokonishi
  - Department of Otolaryngology, Tokyo Metropolitan Bokutoh Hospital, Sumida-Ku, Tokyo, Japan
- Ken-Ichi Sakakibara
  - Department of Communication Disorders, Health Sciences University of Hokkaido, Ishikari-Gun, Hokkaido, Japan
- Niro Tayama
  - Department of Otolaryngology and Tracheo-esophagology, National Center for Global Health and Medicine, Shinjuku-Ku, Tokyo, Japan
3
Sarin V, Sarin BC, Chatterjee A, Juneja A. Muscle Tension Dysphonia: A Sequeale of Chemoradiotherapy in Patients of Head and Neck Cancer. Indian J Otolaryngol Head Neck Surg 2023; 75:1405-1413. [PMID: 37636687 PMCID: PMC10447776 DOI: 10.1007/s12070-023-03577-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/11/2023] [Accepted: 02/09/2023] [Indexed: 02/23/2023] Open
Abstract
It is important to distinguish voice, the production of sound by the larynx, from speech, the articulation of that sound by the tongue, soft palate, and lips. Mucositis, dysphagia, and changes in speech and voice are common sequelae of radiotherapy (RT) alone or in combination with chemotherapy (CRT), which is commonly used in the treatment of head and neck cancer (HNC). The aim of this study was to investigate patient-reported voice impairment among non-laryngeal head and neck cancer survivors who were treated with curative RT/CRT with or without surgery. This assessor-blinded study at a tertiary institution comprised a cohort of 128 patients who, after completion of treatment for HNC, reported to the laryngology clinic with voice complaints and throat discomfort. The assessment included laryngeal endoscopic and stroboscopic imaging, acoustic assessment, and the Voice Handicap Index (VHI). The cohort consisted of 89.8% males and 11.2% females. There was hyperadduction and strain of the ventricular bands in almost all cases, and hyperactivity and compression of both true and false cords in 80.5% of cases. The Dysphonia Severity Index (DSI) impairment level showed a significant association with gender, VHI, GRBAS score, and RT/CRT, but not with smoking or surgery; VHI showed a significant association with DSI and RT/CRT, but not with gender, smoking, or surgery. Muscle tension is a very common effect of RT/CRT, and dysphonia can readily be associated with it. Future research needs to focus on specific voice treatment regimens for HNC patients treated with RT/CRT to improve their quality of life.
Affiliation(s)
- Vanita Sarin
  - Department of Otorhinolaryngology, Sri Guru Ram Das Institute of Medical Sciences and Research, Amritsar, India
- B. C. Sarin
  - Department of TB and Respiratory Diseases, Sri Guru Ram Das Institute of Medical Sciences and Research, Amritsar, India
- Arpita Chatterjee
  - Department of Otorhinolaryngology, Sri Guru Ram Das Institute of Medical Sciences and Research, Amritsar, India
- Ateev Juneja
  - Sri Guru Ram Das Institute of Medical Sciences and Research, Amritsar, India
4
Differences Among Mixed, Chest, and Falsetto Registers: A Multiparametric Study. J Voice 2023; 37:298.e11-298.e29. [PMID: 33518476 DOI: 10.1016/j.jvoice.2020.12.028] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/02/2020] [Revised: 12/23/2020] [Accepted: 12/28/2020] [Indexed: 11/23/2022]
Abstract
INTRODUCTION Typical singing registers are the chest and falsetto; however, trained singers have an additional register, namely, the mixed register. The mixed register, which is also called "mixed voice" or "mix," is an important technique for singers, as it can help bridge from the chest voice to falsetto without noticeable voice breaks. OBJECTIVE The present study aims to reveal the nature of the voice-production mechanism of the different registers (chest, mix, and falsetto) using high-speed digital imaging (HSDI), electroglottography (EGG), and acoustic and aerodynamic measurements. STUDY DESIGN Cross-sectional study. METHODS Aerodynamic measurements were acquired for twelve healthy singers (six men and six women) during the phonation of a variety of pitches using three registers. HSDI and EGG devices were used simultaneously on three healthy singers (two men and one woman), from which the open quotient (OQ) and speed quotient (SQ) were derived. Audio signals were recorded for five sustained vowels, and a spectral analysis was conducted to determine the amplitude of each harmonic component. Furthermore, the absolute (not relative) value of the glottal volume flow was estimated by integrating data obtained from the HSDI and aerodynamic studies. RESULTS For all singers, the subglottal pressure (PSub) was highest for the chest register among the three registers, and the mean flow rate (MFR) was highest for the falsetto. Conversely, the PSub of the mix was as low as that of the falsetto, and the MFR of the mix was as low as that of the chest. The HSDI analysis showed that the OQ differed significantly among the registers, even when the fundamental frequency was the same; the OQ of the mix was higher than that of the chest but lower than that of the falsetto. The acoustic analysis showed that, for the mix, the harmonic structure was intermediate between the chest and falsetto. The results of the glottal volume-flow analysis revealed that the maximum volume velocity was lowest for the mix register at every fundamental frequency. The difference between the first and second harmonics (H1-H2) of the voice source spectrum was greatest for the falsetto, followed by the mix, and finally the chest. CONCLUSIONS We found differences among the registers in terms of the aeromechanical mechanisms and vibration patterns of the vocal folds. The mixed register proved to have a distinct voice-production mechanism, which can be differentiated from those of the chest and falsetto registers.
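As a rough illustration of the two HSDI-derived quantities named in this abstract, the sketch below computes an open quotient and a speed quotient from one cycle of a sampled glottal area signal. The function names, the closed-phase threshold, the convention of counting the peak sample as part of the opening phase, and the sample values are all assumptions made for illustration; the abstract does not describe how the authors computed these measures.

```python
from typing import Sequence


def open_quotient(cycle_area: Sequence[float], closed_threshold: float = 0.0) -> float:
    """Fraction of one vibratory cycle during which the glottis is open."""
    if not cycle_area:
        raise ValueError("cycle_area must contain at least one sample")
    open_samples = sum(1 for area in cycle_area if area > closed_threshold)
    return open_samples / len(cycle_area)


def speed_quotient(cycle_area: Sequence[float], closed_threshold: float = 0.0) -> float:
    """Ratio of opening-phase duration to closing-phase duration, in samples."""
    if not cycle_area:
        raise ValueError("cycle_area must contain at least one sample")
    # Index of the first maximum of the glottal area within the cycle.
    peak = max(range(len(cycle_area)), key=cycle_area.__getitem__)
    # Assumed convention: the peak sample belongs to the opening phase.
    opening = sum(1 for area in cycle_area[:peak + 1] if area > closed_threshold)
    closing = sum(1 for area in cycle_area[peak + 1:] if area > closed_threshold)
    if closing == 0:
        raise ValueError("no closing phase found in this cycle")
    return opening / closing


# Hypothetical glottal area samples (arbitrary units) for a single cycle.
cycle = [0, 0, 1, 2, 4, 6, 3, 1, 0, 0]
print(open_quotient(cycle), speed_quotient(cycle))
```

A higher open quotient means the glottis stays open for a larger share of the cycle, which is the direction the abstract reports when moving from chest to mix to falsetto.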
5
Döllinger M, Schraut T, Henrich LA, Chhetri D, Echternach M, Johnson AM, Kunduk M, Maryn Y, Patel RR, Samlan R, Semmler M, Schützenberger A. Re-Training of Convolutional Neural Networks for Glottis Segmentation in Endoscopic High-Speed Videos. Applied Sciences (Basel, Switzerland) 2022; 12:9791. [PMID: 37583544 PMCID: PMC10427138 DOI: 10.3390/app12199791] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 08/17/2023]
Abstract
Endoscopic high-speed video (HSV) systems for visualization and assessment of vocal fold dynamics in the larynx are diverse and technically advancing. To account for the resulting "concept shifts" in neural network (NN)-based image processing, NNs that are already trained and in use must be re-trained to allow sufficiently accurate image processing for new recording modalities. We propose and discuss several re-training approaches for convolutional neural networks (CNNs) used for HSV image segmentation. Our baseline CNN was trained on the BAGLS data set (58,750 images). The new BAGLS-RT data set consists of an additional 21,050 images from previously unused HSV systems, light sources, and different spatial resolutions. Results showed that increasing data diversity by means of preprocessing already improves the segmentation accuracy (mean intersection over union, mIoU, +6.35%). Subsequent re-training further increases segmentation performance (mIoU +2.81%). For re-training, finetuning with dynamic knowledge distillation showed the most promising results. Data variety for training and additional re-training is a helpful tool to boost HSV image segmentation quality. However, when performing re-training, the phenomenon of catastrophic forgetting should be kept in mind, i.e., adaptation to new data while forgetting already learned knowledge.
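The mIoU figures quoted in this abstract refer to mean intersection over union. The following is a minimal sketch of that metric for binary segmentation masks, assuming flattened 0/1 masks and a per-image average over the foreground class; the abstract does not state the exact averaging convention used, and the function names and toy masks are hypothetical.

```python
from typing import Iterable, Sequence, Tuple


def iou(pred: Sequence[int], truth: Sequence[int]) -> float:
    """Intersection over union of two flattened binary masks of equal length."""
    if len(pred) != len(truth):
        raise ValueError("masks must have the same number of pixels")
    intersection = sum(1 for p, t in zip(pred, truth) if p and t)
    union = sum(1 for p, t in zip(pred, truth) if p or t)
    # Assumed convention: two empty masks count as perfect agreement.
    return 1.0 if union == 0 else intersection / union


def mean_iou(pairs: Iterable[Tuple[Sequence[int], Sequence[int]]]) -> float:
    """Average the per-image IoU over all (prediction, ground truth) pairs."""
    scores = [iou(pred, truth) for pred, truth in pairs]
    if not scores:
        raise ValueError("at least one mask pair is required")
    return sum(scores) / len(scores)


# Hypothetical flattened masks: 1 marks glottis pixels, 0 marks background.
mask_pairs = [
    ([1, 1, 0, 0], [1, 0, 0, 0]),
    ([0, 1, 1, 1], [0, 1, 1, 0]),
]
print(round(mean_iou(mask_pairs), 4))
```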
Affiliation(s)
- Michael Döllinger
  - Division of Phoniatrics and Pediatric Audiology, Department of Otorhino-laryngology Head & Neck Surgery, University Hospital Erlangen, Friedrich-Alexander-University Erlangen-Nürnberg, 91054 Erlangen, Germany
- Tobias Schraut
  - Division of Phoniatrics and Pediatric Audiology, Department of Otorhino-laryngology Head & Neck Surgery, University Hospital Erlangen, Friedrich-Alexander-University Erlangen-Nürnberg, 91054 Erlangen, Germany
- Lea A. Henrich
  - Division of Phoniatrics and Pediatric Audiology, Department of Otorhino-laryngology Head & Neck Surgery, University Hospital Erlangen, Friedrich-Alexander-University Erlangen-Nürnberg, 91054 Erlangen, Germany
- Dinesh Chhetri
  - Department of Head and Neck Surgery, David Geffen School of Medicine at the University of California, Los Angeles, Los Angeles, CA 90095, USA
- Matthias Echternach
  - Division of Phoniatrics and Pediatric Audiology, Department of Otorhinolaryngology, Munich University Hospital (LMU), 80331 Munich, Germany
- Aaron M. Johnson
  - NYU Voice Center, Department of Otolaryngology–Head and Neck Surgery, New York University, Grossman School of Medicine, New York, NY 10001, USA
- Melda Kunduk
  - Department of Communication Sciences and Disorders, Louisiana State University, Baton Rouge, LA 70801, USA
- Youri Maryn
  - Department of Speech, Language and Hearing Sciences, University of Ghent, 9000 Ghent, Belgium
- Rita R. Patel
  - Department of Speech, Language and Hearing Sciences, Indiana University, Bloomington, IN 47401, USA
- Robin Samlan
  - Department of Speech, Language, & Hearing Sciences, University of Arizona, Tucson, AZ 85641, USA
- Marion Semmler
  - Division of Phoniatrics and Pediatric Audiology, Department of Otorhino-laryngology Head & Neck Surgery, University Hospital Erlangen, Friedrich-Alexander-University Erlangen-Nürnberg, 91054 Erlangen, Germany
- Anne Schützenberger
  - Division of Phoniatrics and Pediatric Audiology, Department of Otorhino-laryngology Head & Neck Surgery, University Hospital Erlangen, Friedrich-Alexander-University Erlangen-Nürnberg, 91054 Erlangen, Germany
6
Nogueira do Nascimento U, Santos MAR, Gama ACC. Analysis of the Immediate Effects of the LaxVox Technique on Digital Videokymography Parameters in Adults With Voice Complaints. J Voice 2022:S0892-1997(22)00026-1. [PMID: 35256223 DOI: 10.1016/j.jvoice.2022.01.026] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/15/2021] [Revised: 01/24/2022] [Accepted: 01/26/2022] [Indexed: 11/30/2022]
Abstract
OBJECTIVES Digital videokymography based on high-speed videoendoscopy enables the evaluation of therapeutic techniques and voice training, such as the LaxVox technique, on vocal fold vibrations. This study investigated the immediate effects of the LaxVox technique on digital videokymographic parameters obtained through high-speed videolaryngoscopy in adults with voice complaints. STUDY DESIGN An experimental intrasubject comparative study of adults with voice complaints was conducted. METHODS Image processing software was used to analyze the videos and obtain digital videokymography parameters. The intraclass correlation coefficient was used to determine the intra-rater reliability of the analyzed parameters. The paired t test and Wilcoxon signed-rank test were used to compare digital videokymography parameters before and after the LaxVox technique, in sex-specific analyses. The significance level was set at 5%. RESULTS In total, 25 laryngeal images from 15 women and 10 men were analyzed. On digital videokymography analysis, the mean vocal fold opening in the posterior glottal region was decreased immediately after using the LaxVox technique in women. In contrast, no significant changes were found in other parameters compared to pre LaxVox technique values in both men and women with voice complaints. CONCLUSION Digital videokymography analysis revealed that the LaxVox technique reduces the mean vocal fold opening in the posterior glottal region of women with voice complaints.
Affiliation(s)
- Ualisson Nogueira do Nascimento
  - Federal University of Minas Gerais (UFMG), School of Medicine, Graduate Program in Speech Therapy Sciences, Belo Horizonte, MG, Brazil
- Ana Cristina Côrtes Gama
  - Federal University of Minas Gerais (UFMG), School of Medicine, Graduate Program in Speech Therapy Sciences, Belo Horizonte, MG, Brazil
7
Yamauchi A, Imagawa H, Yokonishi H, Sakakibara KI, Tayama N. Multivariate Analysis of Vocal Fold Vibrations in Normal Speakers Using High-Speed Digital Imaging. J Voice 2021:S0892-1997(21)00252-6. [PMID: 34470706 DOI: 10.1016/j.jvoice.2021.08.002] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 06/06/2021] [Revised: 07/30/2021] [Accepted: 08/02/2021] [Indexed: 05/29/2023]
8
Schlegel P, Kist AM, Kunduk M, Dürr S, Döllinger M, Schützenberger A. Interdependencies between acoustic and high-speed videoendoscopy parameters. PLoS One 2021; 16:e0246136. [PMID: 33529244 PMCID: PMC7853476 DOI: 10.1371/journal.pone.0246136] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 08/08/2020] [Accepted: 01/13/2021] [Indexed: 02/06/2023] Open
Abstract
In voice research, uncovering relations between the oscillating vocal folds, the sound source of phonation, and the resulting perceived acoustic signal is of great interest. This is especially the case in the context of voice disorders, such as functional dysphonia (FD). We investigated 250 high-speed videoendoscopy (HSV) recordings with simultaneously recorded acoustic signals (124 healthy females, 60 FD females, 44 healthy males, 22 FD males). For each recording, 35 glottal area waveform (GAW) parameters and 14 acoustic parameters were calculated. Linear and non-linear relations between GAW and acoustic parameters were investigated using Pearson correlation coefficients (PCC) and distance correlation coefficients (DCC). Further, norm values for parameters obtained from 250 ms of sustained phonation data (vowel /i/) were provided. In females, 26 PCCs (5.3%) and in males, 8 PCCs (1.6%) were found to be statistically significant (|corr.| ≥ 0.3). Only minor differences were found between PCCs and DCCs, indicating that only weak non-linear dependencies exist between the parameters. Fundamental frequency was involved in the majority of all relevant PCCs between GAW and acoustic parameters (19 in females and 7 in males). The most distinct difference between correlations in females and males was found for the parameter Period Variability Index. The study shows only weak relations between the investigated acoustic and GAW parameters. This indicates that the reduction of the complex 3D glottal dynamics to the 1D GAW may erase laryngeal dynamic characteristics that are reflected within the acoustic signal. Hence, other GAW parameters, 2D and 3D laryngeal dynamics, and vocal tract parameters should be further investigated for potential correlations with the acoustic signal.
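To make the linear part of this analysis concrete, here is a minimal sketch of a Pearson correlation coefficient together with the |corr.| ≥ 0.3 magnitude cut-off quoted in the abstract. The function names and the sample values are hypothetical, and the sketch covers only the magnitude check, not any significance testing or the distance correlation the authors also used.

```python
import math
from typing import Sequence


def pearson_correlation(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient between two equal-length samples."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two samples of equal length with at least 2 values")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    covariance = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if spread_x == 0 or spread_y == 0:
        raise ValueError("correlation is undefined for a constant sample")
    return covariance / (spread_x * spread_y)


def exceeds_threshold(r: float, threshold: float = 0.3) -> bool:
    """Apply the |corr.| >= 0.3 magnitude cut-off mentioned in the abstract."""
    return abs(r) >= threshold


# Hypothetical values: one GAW parameter and one acoustic parameter per recording.
gaw_parameter = [0.52, 0.61, 0.58, 0.70, 0.66]
acoustic_parameter = [1.9, 2.4, 2.1, 2.9, 2.6]
r = pearson_correlation(gaw_parameter, acoustic_parameter)
print(f"r = {r:.3f}, exceeds threshold: {exceeds_threshold(r)}")
```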
Affiliation(s)
- Patrick Schlegel
  - Department of Head & Neck Surgery, David Geffen School of Medicine, University of California Los Angeles (UCLA), Los Angeles, California, United States of America
  - Dep. of Otorhinolaryngology, Div. of Phoniatrics and Pediatric Audiology, University Hospital Erlangen, Friedrich-Alexander-University Erlangen-Nürnberg, Erlangen, Germany
- Andreas M. Kist
  - Dep. of Otorhinolaryngology, Div. of Phoniatrics and Pediatric Audiology, University Hospital Erlangen, Friedrich-Alexander-University Erlangen-Nürnberg, Erlangen, Germany
- Melda Kunduk
  - Dep. of Communication Sciences and Disorders, Louisiana State University, Baton Rouge, Louisiana, United States of America
- Stephan Dürr
  - Dep. of Otorhinolaryngology, Div. of Phoniatrics and Pediatric Audiology, University Hospital Erlangen, Friedrich-Alexander-University Erlangen-Nürnberg, Erlangen, Germany
- Michael Döllinger
  - Dep. of Otorhinolaryngology, Div. of Phoniatrics and Pediatric Audiology, University Hospital Erlangen, Friedrich-Alexander-University Erlangen-Nürnberg, Erlangen, Germany
- Anne Schützenberger
  - Dep. of Otorhinolaryngology, Div. of Phoniatrics and Pediatric Audiology, University Hospital Erlangen, Friedrich-Alexander-University Erlangen-Nürnberg, Erlangen, Germany
9
Abstract
This review provides a comprehensive compilation, from a digital image processing point of view, of the most important techniques currently developed to characterize and quantify the vibration behaviour of the vocal folds, along with a detailed description of the laryngeal imaging modalities currently used in the clinic. The review presents an overview of the most significant glottal-gap segmentation and facilitative playback techniques used in the literature for this purpose, and shows the drawbacks and challenges that remain unsolved in developing robust tools for vocal fold vibration analysis based on digital image processing.
10
Fehling MK, Grosch F, Schuster ME, Schick B, Lohscheller J. Fully automatic segmentation of glottis and vocal folds in endoscopic laryngeal high-speed videos using a deep Convolutional LSTM Network. PLoS One 2020; 15:e0227791. [PMID: 32040514 PMCID: PMC7010264 DOI: 10.1371/journal.pone.0227791] [Citation(s) in RCA: 43] [Impact Index Per Article: 10.8] [Received: 02/21/2019] [Accepted: 12/25/2019] [Indexed: 01/22/2023] Open
Abstract
The objective investigation of the dynamic properties of vocal fold vibrations demands the recording and further quantitative analysis of laryngeal high-speed video (HSV). Quantification of the vocal fold vibration patterns requires, as a first step, the segmentation of the glottal area within each video frame, from which the vibrating edges of the vocal folds are usually derived. Consequently, the outcome of any further vibration analysis depends on the quality of this initial segmentation process. In this work we propose, for the first time, a procedure to fully automatically segment not only the time-varying glottal area but also the vocal fold tissue directly from laryngeal HSV using a deep Convolutional Neural Network (CNN) approach. Eighteen different CNN configurations were trained and evaluated on a total of 13,000 HSV frames obtained from 56 healthy and 74 pathologic subjects. The segmentation quality of the best-performing CNN model, which uses Long Short-Term Memory (LSTM) cells to also take the temporal context into account, was investigated in depth on 15 test video sequences comprising 100 consecutive images each. As performance measures, the Dice Coefficient (DC) and the precision of four anatomical landmark positions were used. Over all test data, a mean DC of 0.85 was obtained for the glottis, and 0.91 and 0.90 for the right and left vocal fold (VF), respectively. The grand average precision of the identified landmarks amounts to 2.2 pixels and is in the same range as comparable manual expert segmentations, which can be regarded as the gold standard. The method proposed here requires no user interaction and overcomes the limitations of current semiautomatic or computationally expensive approaches. Thus, it also allows for the analysis of long HSV sequences and holds the promise of facilitating the objective analysis of vocal fold vibrations in clinical routine. The dataset used here, including the ground truth, will be provided freely to all scientific groups to allow quantitative benchmarking of segmentation approaches in the future.
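For readers unfamiliar with the Dice Coefficient used as the performance measure in this abstract, the following is a minimal sketch of how it is commonly computed for a pair of binary masks. The function name `dice_coefficient` and the toy masks are illustrative assumptions and are not taken from the authors' code or dataset.

```python
from typing import Sequence


def dice_coefficient(pred: Sequence[Sequence[int]],
                     truth: Sequence[Sequence[int]]) -> float:
    """Dice = 2 * overlap / (size of pred + size of truth) for binary masks."""
    intersection = 0
    pred_sum = 0
    truth_sum = 0
    for pred_row, truth_row in zip(pred, truth):
        for p, t in zip(pred_row, truth_row):
            intersection += 1 if (p and t) else 0
            pred_sum += 1 if p else 0
            truth_sum += 1 if t else 0
    if pred_sum + truth_sum == 0:
        # Assumed convention: two empty masks count as perfect agreement.
        return 1.0
    return 2.0 * intersection / (pred_sum + truth_sum)


# Hypothetical 3x4 masks: 1 marks glottis pixels, 0 marks background.
predicted = [[0, 1, 1, 0],
             [0, 1, 1, 0],
             [0, 0, 1, 0]]
reference = [[0, 1, 1, 0],
             [0, 1, 0, 0],
             [0, 1, 1, 0]]
print(round(dice_coefficient(predicted, reference), 3))
```

A value of 1.0 means the predicted and reference masks coincide exactly, and 0.0 means they share no foreground pixels.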
Affiliation(s)
- Mona Kirstin Fehling
  - Department of Computer Science, Trier University of Applied Sciences, Schneidershof, Trier, Germany
- Fabian Grosch
  - Department of Computer Science, Trier University of Applied Sciences, Schneidershof, Trier, Germany
- Maria Elke Schuster
  - Department of Otorhinolaryngology and Head and Neck Surgery, University of Munich, Campus Grosshadern, München, Germany
- Bernhard Schick
  - Department of Otorhinolaryngology, Saarland University Hospital, Homburg/Saar, Germany
- Jörg Lohscheller
  - Department of Computer Science, Trier University of Applied Sciences, Schneidershof, Trier, Germany
11
Herbst CT, Dunn JC. Fundamental Frequency Estimation of Low-quality Electroglottographic Signals. J Voice 2019; 33:401-411. [DOI: 10.1016/j.jvoice.2018.01.003] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.0] [Received: 11/23/2017] [Accepted: 01/04/2018] [Indexed: 11/16/2022]
12
Sauder C, Nevdahl M, Kapsner-Smith M, Merati A, Eadie T. Does the accuracy of case history affect interpretation of videolaryngostroboscopic exams? Laryngoscope 2019; 130:718-725. [PMID: 31124157 DOI: 10.1002/lary.28081] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Received: 10/29/2018] [Revised: 04/02/2019] [Accepted: 05/06/2019] [Indexed: 11/06/2022]
Abstract
OBJECTIVE To determine the effect of initial diagnostic hypotheses on clinicians' 1) detection and perceived severity of abnormalities, and 2) clinical impressions and treatment recommendations for individuals with and without voice disorders following interpretation of videolaryngostroboscopy (VLS). METHODS Thirty-two experienced speech-language pathologists and otolaryngologists specializing in voice disorders read case histories prior to interpreting exams. Case histories suggested specific accurate or inaccurate laryngeal diagnoses, or a control scenario that suggested a normal larynx. The effects of the accuracy of case histories on perceived severity of associated visual-perceptual parameters, clinical impressions, and treatment recommendations were examined. RESULTS Significant increases in perceived severity of posterior laryngeal appearance (P < 0.05) and mucosal wave (P < 0.02) were observed when these abnormalities were suggested by case histories. Overall agreement with clinical impressions improved from 49% to 72% when the case history was consistent with the examination. Case histories (accurate and inaccurate) indicating voice symptoms predicted recommendations for treatment above and beyond that of VLS presentation alone, P < 0.001. CONCLUSION Case histories suggesting specific abnormalities significantly affected severity ratings for two of three associated visual-perceptual parameters selected as primary outcome measures. Accurate case histories suggesting specific abnormalities increased the probability of detection and perceived severity. Inaccurate case histories led to false-positive findings and failures to detect abnormalities or to interpret them as less severe. Case histories affected visual-perceptual judgments and contributed to decisions about clinical impressions and treatment. LEVEL OF EVIDENCE 2b Laryngoscope, 130:718-725, 2020.
Affiliation(s)
- Cara Sauder
  - Department of Speech and Hearing Sciences, University of Washington, Seattle, Washington, U.S.A
- Martin Nevdahl
  - Department of Speech and Hearing Sciences, University of Washington, Seattle, Washington, U.S.A
- Mara Kapsner-Smith
  - Department of Speech and Hearing Sciences, University of Washington, Seattle, Washington, U.S.A
- Albert Merati
  - Department of Otolaryngology-Head and Neck Surgery, University of Washington, Seattle, Washington, U.S.A
- Tanya Eadie
  - Department of Speech and Hearing Sciences, University of Washington, Seattle, Washington, U.S.A
  - Department of Otolaryngology-Head and Neck Surgery, University of Washington, Seattle, Washington, U.S.A
13
|
Influence of spatial camera resolution in high-speed videoendoscopy on laryngeal parameters. PLoS One 2019; 14:e0215168. [PMID: 31009488 PMCID: PMC6476512 DOI: 10.1371/journal.pone.0215168] [Citation(s) in RCA: 21] [Impact Index Per Article: 4.2] [Received: 11/26/2018] [Accepted: 03/27/2019] [Indexed: 11/19/2022]
Abstract
In laryngeal high-speed videoendoscopy (HSV), the area between the vibrating vocal folds during phonation, referred to as the glottal area waveform (GAW), is of interest. Varying camera resolution may influence parameters computed on the GAW and hence hinder the comparability between examinations. This study investigates the influence of spatial camera resolution on quantitative vocal fold vibratory function parameters obtained from the GAW. In total, 40 HSV recordings during sustained phonation (20 healthy males and 20 healthy females) were investigated. A clinically used Photron Fastcam MC2 camera with a frame rate of 4000 fps and a spatial resolution of 512×256 pixels was applied. This initial resolution was reduced by pixel averaging to (1) a resolution of 256×128 pixels and (2) a resolution of 128×64 pixels, yielding three sets of recordings. The GAW was extracted and in total 50 vocal fold vibratory parameters representing different features of the GAW were computed. Statistical analysis was performed using SPSS Statistics, version 21. Fifteen parameters showing strong mathematical dependencies with other parameters were excluded from the main analysis but are given in the Supporting Information. Data analysis revealed a clear influence of spatial resolution on GAW parameters. Fundamental period measures and period perturbation measures were the least affected. Amplitude perturbation measures and mechanical measures were most strongly influenced. Most glottal dynamic characteristics and symmetry measures deviated significantly. Most energy perturbation measures changed significantly in males but were mostly unaffected in females. In females 18 of 35 remaining parameters (51%) and in males 22 parameters (63%) changed significantly between spatial resolutions. This work represents the first step in studying the impact of video resolution on quantitative HSV parameters. Clear influences of spatial camera resolution on computed parameters were found.
The study results suggest avoiding the use of the most strongly affected parameters. Further, the use of cameras with high resolution is recommended to analyze GAW measures in HSV data.
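The resolution reduction described in this abstract (512×256 to 256×128 to 128×64 pixels by pixel averaging) can be illustrated with a minimal sketch. The abstract does not state the exact averaging scheme, so the non-overlapping 2×2 block average below is an assumption, and the function name `downsample_by_averaging` and the toy frame are hypothetical.

```python
def downsample_by_averaging(frame, factor=2):
    """Average non-overlapping factor x factor pixel blocks (assumed scheme)."""
    rows, cols = len(frame), len(frame[0])
    if rows % factor or cols % factor:
        raise ValueError("frame dimensions must be divisible by factor")
    out = []
    for r in range(0, rows, factor):
        out_row = []
        for c in range(0, cols, factor):
            # Collect the pixels of one block and replace them by their mean.
            block = [frame[r + dr][c + dc]
                     for dr in range(factor) for dc in range(factor)]
            out_row.append(sum(block) / len(block))
        out.append(out_row)
    return out


# Toy 4x4 grayscale frame standing in for one 512x256 HSV frame.
frame = [
    [0, 2, 10, 10],
    [4, 6, 10, 10],
    [1, 1, 0, 8],
    [1, 1, 8, 0],
]
half = downsample_by_averaging(frame)    # analogous to 512x256 -> 256x128
quarter = downsample_by_averaging(half)  # analogous to 256x128 -> 128x64
```

Applying the same step twice mirrors the two reduced recording sets; each step halves both image dimensions.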
|
14
|
Powell ME, Deliyski DD, Zeitels SM, Burns JA, Hillman RE, Gerlach TT, Mehta DD. Efficacy of Videostroboscopy and High-Speed Videoendoscopy to Obtain Functional Outcomes From Perioperative Ratings in Patients With Vocal Fold Mass Lesions. J Voice 2019; 34:769-782. [PMID: 31005449 DOI: 10.1016/j.jvoice.2019.03.012] [Citation(s) in RCA: 16] [Impact Index Per Article: 3.2] [Received: 10/22/2018] [Revised: 03/20/2019] [Accepted: 03/21/2019] [Indexed: 11/30/2022]
Abstract
OBJECTIVES A major limitation of comparing the efficacy of videostroboscopy (VS) and high-speed videoendoscopy (HSV) is the lack of an objective reference by which to compare the functional assessment ratings of the two techniques. For patients with vocal fold mass lesions, intraoperative measures of lesion size and depth may serve as this objective reference. This study compared the relationships between the pre- to postoperative change in VS and HSV visual-perceptual ratings to intraoperative measures of lesion size and depth. DESIGN Prospective visual-perceptual study with intraoperative measures of lesion size and depth. METHODS VS and HSV samples were obtained preoperatively and postoperatively from 28 patients with vocal fold lesions and from 17 vocally healthy controls. Two experienced clinicians rated amplitude, mucosal wave, vertical phase difference, left-right phase asymmetry, and vocal fold edge on a visual-analog scale using both imaging techniques. The change in perioperative ratings from VS and HSV was compared between groups and correlated to intraoperative measures of lesion size and depth. RESULTS HSV was as reliable as VS for ratings of amplitude and edge, and substantially more reliable for ratings of mucosal wave and left-right phase asymmetry. Both VS and HSV had mild-moderate correlations between change in perioperative ratings and intraoperative measures of lesion area. Change in function could be obtained in more patients and for more parameters using HSV than VS. Group differences were noted for postoperative ratings of amplitude and edge; however, these differences were within one level of the visual-perceptual rating scale. The presence of asynchronicity in VS recordings renders vibratory features either uninterpretable or potentially distorted and thus should not be rated. CONCLUSIONS Amplitude and edge are robust vibratory measures for perioperative functional assessment, regardless of imaging modality. 
HSV is indicated for evaluation of subepithelial lesions or if asynchronicity is present in the VS image sequence.
Affiliation(s)
- Maria E Powell
- Department of Otolaryngology, Vanderbilt University Medical Center, Nashville, Tennessee; Communication Sciences Research Center, Cincinnati Children's Hospital Medical Center, Cincinnati, Ohio; Department of Communication Sciences and Disorders, University of Cincinnati, Cincinnati, Ohio.
- Dimitar D Deliyski
- Communication Sciences Research Center, Cincinnati Children's Hospital Medical Center, Cincinnati, Ohio; Department of Communication Sciences and Disorders, University of Cincinnati, Cincinnati, Ohio; Department of Communicative Sciences and Disorders, Michigan State University, East Lansing, Michigan
- Steven M Zeitels
- Center for Laryngeal Surgery and Voice Rehabilitation, Massachusetts General Hospital, Boston, Massachusetts; Department of Surgery, Harvard Medical School, Boston, Massachusetts
- James A Burns
- Center for Laryngeal Surgery and Voice Rehabilitation, Massachusetts General Hospital, Boston, Massachusetts; Department of Surgery, Harvard Medical School, Boston, Massachusetts
- Robert E Hillman
- Center for Laryngeal Surgery and Voice Rehabilitation, Massachusetts General Hospital, Boston, Massachusetts; Department of Surgery, Harvard Medical School, Boston, Massachusetts; Department of Communication Sciences and Disorders, MGH Institute of Health Professions, Boston, Massachusetts
- Terri Treman Gerlach
- Voice and Swallowing Center, Charlotte Eye Ear Nose and Throat Associates, Charlotte, North Carolina
- Daryush D Mehta
- Center for Laryngeal Surgery and Voice Rehabilitation, Massachusetts General Hospital, Boston, Massachusetts; Department of Surgery, Harvard Medical School, Boston, Massachusetts; Department of Communication Sciences and Disorders, MGH Institute of Health Professions, Boston, Massachusetts
|
15
|
Kumar SP, Švec JG. Kinematic model for simulating mucosal wave phenomena on vocal folds. Biomed Signal Process Control 2019. [DOI: 10.1016/j.bspc.2018.12.002] [Citation(s) in RCA: 12] [Impact Index Per Article: 2.4] [Indexed: 01/03/2023]
|
16
|
Influence of Analyzed Sequence Length on Parameters in Laryngeal High-Speed Videoendoscopy. Applied Sciences-Basel 2018. [DOI: 10.3390/app8122666] [Citation(s) in RCA: 13] [Impact Index Per Article: 2.2] [Indexed: 11/17/2022]
Abstract
Laryngeal high-speed videoendoscopy (HSV) allows objective quantification of vocal fold vibratory characteristics. However, it is unknown how the analyzed sequence length affects some of the computed parameters. To examine whether varying sequence lengths influence parameter calculation, 20 HSV recordings of healthy females during sustained phonation were investigated. The clinically prevalent Photron Fastcam MC2 camera with a frame rate of 4000 fps and a spatial resolution of 512 × 256 pixels was used to collect HSV data. The glottal area waveform (GAW), describing the increase and decrease of the area between the vocal folds during phonation, was extracted. Based on the GAW, 16 perturbation parameters were computed for sequences of 5, 10, 20, 50 and 100 consecutive cycles. Statistical analysis was performed using SPSS Statistics, version 21. Only three parameters (18.8%) were statistically significantly influenced by changing sequence lengths. Of these parameters, one changed until 10 cycles were reached, one until 20 cycles were reached and one, namely the Amplitude Variability Index (AVI), changed between almost all groups of different sequence lengths. Moreover, visually observable, but not statistically significant, changes within parameters were observed. These changes were often most prominent between shorter sequence lengths. Hence, we suggest using a minimum sequence length of 20 cycles and discarding the parameter AVI.
|
17
|
Patel RR, Awan SN, Barkmeier-Kraemer J, Courey M, Deliyski D, Eadie T, Paul D, Švec JG, Hillman R. Recommended Protocols for Instrumental Assessment of Voice: American Speech-Language-Hearing Association Expert Panel to Develop a Protocol for Instrumental Assessment of Vocal Function. American Journal of Speech-Language Pathology 2018; 27:887-905. [PMID: 29955816 DOI: 10.1044/2018_ajslp-17-0009] [Citation(s) in RCA: 355] [Impact Index Per Article: 59.2] [Received: 01/20/2017] [Accepted: 02/17/2018] [Indexed: 05/09/2023]
Abstract
PURPOSE The aim of this study was to recommend protocols for instrumental assessment of voice production in the areas of laryngeal endoscopic imaging, acoustic analyses, and aerodynamic procedures, which will (a) improve the evidence for voice assessment measures, (b) enable valid comparisons of assessment results within and across clients and facilities, and (c) facilitate the evaluation of treatment efficacy. METHOD Existing evidence was combined with expert consensus in areas with a lack of evidence. In addition, a survey of clinicians and a peer review of an initial version of the protocol via VoiceServe and the American Speech-Language-Hearing Association's Special Interest Group 3 (Voice and Voice Disorders) Community were used to create the recommendations for the final protocols. RESULTS The protocols include recommendations regarding technical specifications for data acquisition, voice and speech tasks, analysis methods, and reporting of results for instrumental evaluation of voice production in the areas of laryngeal endoscopic imaging, acoustics, and aerodynamics. CONCLUSION The recommended protocols for instrumental assessment of voice using laryngeal endoscopic imaging, acoustic, and aerodynamic methods will enable clinicians and researchers to collect a uniform set of valid and reliable measures that can be compared across assessments, clients, and facilities.
Affiliation(s)
- Rita R Patel
- Department of Speech and Hearing Sciences, Indiana University, Bloomington
- Shaheen N Awan
- Department of Audiology and Speech-Language Pathology, Bloomsburg University of Pennsylvania
- Mark Courey
- Otolaryngology, The Mount Sinai Hospital, New York Eye and Ear Infirmary of Mount Sinai
- Dimitar Deliyski
- Department of Communicative Sciences and Disorders, Michigan State University, East Lansing
- Tanya Eadie
- Department of Speech and Hearing Sciences, University of Washington, Seattle
- Diane Paul
- Director, Clinical Issues in Speech-Language Pathology, American Speech-Language-Hearing Association, Rockville, MD
- Jan G Švec
- Department of Biophysics, Faculty of Science, Palacký University, Olomouc, Czech Republic
- Robert Hillman
- Massachusetts General Hospital, Harvard Medical School, MGH Institute of Health Professions, Boston
|
18
|
Herbst CT, Koda H, Kunieda T, Suzuki J, Garcia M, Fitch WT, Nishimura T. Japanese macaque phonatory physiology. J Exp Biol 2018; 221:jeb.171801. [PMID: 29615529 DOI: 10.1242/jeb.171801] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Received: 10/11/2017] [Accepted: 03/26/2018] [Indexed: 11/20/2022]
Abstract
Although the call repertoire and its communicative function are relatively well explored in Japanese macaques (Macaca fuscata), little empirical data are available on the physics and the physiology of this species' vocal production mechanism. Here, a 6 year old female Japanese macaque was trained to phonate under an operant conditioning paradigm. The resulting 'coo' calls and spontaneously uttered 'growl' and 'chirp' calls were recorded with sound pressure level (SPL) calibrated microphones and electroglottography (EGG), a non-invasive method for assessing the dynamics of phonation. A total of 448 calls were recorded, complemented by ex vivo recordings on an excised Japanese macaque larynx. In this novel multidimensional investigative paradigm, in vivo and ex vivo data were matched via comparable EGG waveforms. Subsequent analysis suggests that the vocal range (range of fundamental frequency and SPL) of the macaque was comparable to that of a 7-10 year old human, with the exception of low intensity chirps, the production of which may be facilitated by the species' vocal membranes. In coo calls, redundant control of fundamental frequency in relation to SPL was also comparable to that in humans. EGG data revealed that growls, coos and chirps were produced by distinct laryngeal vibratory mechanisms. EGG further suggested changes in the degree of vocal fold adduction in vivo, resulting in spectral variation within the emitted coo calls, ranging from 'breathy' (including aerodynamic noise components) to 'non-breathy'. This is again analogous to humans, corroborating the notion that phonation in humans and non-human primates is based on universal physical and physiological principles.
Affiliation(s)
- Christian T Herbst
- Bioacoustics Laboratory, Department of Cognitive Biology, University Vienna, Althanstrasse 14, 1090 Vienna, Austria
- Hiroki Koda
- Primate Research Institute, Kyoto University, Inuyama, Aichi 484-8506, Japan
- Takumi Kunieda
- Primate Research Institute, Kyoto University, Inuyama, Aichi 484-8506, Japan
- Juri Suzuki
- Primate Research Institute, Kyoto University, Inuyama, Aichi 484-8506, Japan
- Maxime Garcia
- Bioacoustics Laboratory, Department of Cognitive Biology, University Vienna, Althanstrasse 14, 1090 Vienna, Austria; ENES Lab, Université Lyon/Saint-Etienne, NEURO-PSI, CNRS UMR 9197, 23 rue Paul Michelon, 42023 Saint-Etienne, France
- W Tecumseh Fitch
- Bioacoustics Laboratory, Department of Cognitive Biology, University Vienna, Althanstrasse 14, 1090 Vienna, Austria
- Takeshi Nishimura
- Primate Research Institute, Kyoto University, Inuyama, Aichi 484-8506, Japan
|
19
|
Garcia M, Herbst CT. Excised larynx experimentation: history, current developments, and prospects for bioacoustic research. Anthropol Sci 2018. [DOI: 10.1537/ase.171216] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.8] [Indexed: 11/01/2022]
Affiliation(s)
- Maxime Garcia
- ENES Lab, Université Lyon/Saint-Etienne, Neuro-PSI, CNRS UMR 9197, Saint-Etienne
|
20
|
Herbst CT, Schutte HK, Bowling DL, Svec JG. Comparing Chalk With Cheese: The EGG Contact Quotient Is Only a Limited Surrogate of the Closed Quotient. J Voice 2017; 31:401-409. [DOI: 10.1016/j.jvoice.2016.11.007] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.7] [Received: 07/14/2016] [Revised: 11/06/2016] [Accepted: 11/08/2016] [Indexed: 10/20/2022]
|
21
|
Zacharias SRC, Brehm SB, Weinrich B, Kelchner L, Tabangin M, de Alarcon A. Feasibility of Clinical Endoscopy and Stroboscopy in Children With Bilateral Vocal Fold Lesions. American Journal of Speech-Language Pathology 2016; 25:598-604. [PMID: 27893084 DOI: 10.1044/2016_ajslp-15-0071] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.6] [Received: 06/01/2015] [Accepted: 04/10/2016] [Indexed: 06/06/2023]
Abstract
PURPOSE The purpose of this study was to examine the utility of flexible and rigid endoscopy and stroboscopy for the identification of anatomical and physiological features in children with bilateral vocal fold lesions. The secondary purpose was to describe the age distribution of patients who could tolerate use of the different types of endoscopes. METHOD This cross-sectional clinic-based study included 38 children (ages 5 to 12 years) diagnosed with bilateral vocal fold lesions via videoendoscopy. Vocal fold vibratory characteristics (e.g., mucosal wave) were rated by 4 clinicians by consensus. RESULTS Bilateral vocal fold lesions could be well described anatomically after visualization with both flexible and rigid endoscopes and were most commonly described as symmetrical and broad based. However, the clinicians' confidence in the accuracy of stroboscopy for rating vocal fold vibratory characteristics was limited for both flexible and rigid stroboscopes. CONCLUSIONS Videoendoscopy was adequate for viewing and characterizing anatomical structures of bilateral vocal fold lesions in pediatric patients; however, vibratory characteristics were often not fully visualized with videostroboscopy. In view of the importance of visualizing vocal fold vibration in the differential diagnosis and treatment of vocal fold lesions, other imaging modalities, such as high-speed videoendoscopy, may provide more accurate descriptions of vocal fold vibratory characteristics in this population.
Affiliation(s)
- Stephanie R C Zacharias
- Center for Pediatric Voice Disorders, Cincinnati Children's Hospital Medical Center, OH; Division of Speech-Language Pathology, Cincinnati Children's Hospital Medical Center, OH; University of Cincinnati, OH; Division of Pediatric Otolaryngology-Head and Neck Surgery, Cincinnati Children's Hospital Medical Center, OH
- Susan Baker Brehm
- Center for Pediatric Voice Disorders, Cincinnati Children's Hospital Medical Center, OH; Division of Speech-Language Pathology, Cincinnati Children's Hospital Medical Center, OH; Miami University, Oxford, OH
- Barbara Weinrich
- Center for Pediatric Voice Disorders, Cincinnati Children's Hospital Medical Center, OH; Division of Speech-Language Pathology, Cincinnati Children's Hospital Medical Center, OH; Miami University, Oxford, OH
- Lisa Kelchner
- Center for Pediatric Voice Disorders, Cincinnati Children's Hospital Medical Center, OH; Division of Speech-Language Pathology, Cincinnati Children's Hospital Medical Center, OH; University of Cincinnati, OH
- Meredith Tabangin
- Biostatistics and Epidemiology, Cincinnati Children's Hospital Medical Center, OH
- Alessandro de Alarcon
- Center for Pediatric Voice Disorders, Cincinnati Children's Hospital Medical Center, OH; University of Cincinnati, OH; Division of Pediatric Otolaryngology-Head and Neck Surgery, Cincinnati Children's Hospital Medical Center, OH
|
22
|
Abstract
Objectives: Kymographic imaging through videokymography has been recognized as a convenient, novel way to display laryngeal behavior, yet little systematic research has been done to map the relevant features displayed in such images. Here we have aimed at specification of these features to enable systematic visual characterization and categorization of vocal fold vibratory patterns in voice disorders. Methods: A cross-sectional, descriptive design was used. We selected 45 subjects and extracted 100 videokymographic images from the archive of more than 7,000 videokymographic examinations of subjects with a wide range of voice disorders. The images showed a large variety of vocal fold vibratory behaviors during sustained phonations. We visually identified the prominent features that distinguished the vibration patterns across the images. Results: We divided the findings into 10 feature categories. They included refined traditional features (eg, mucosal waves), as well as additional features that are obscured in strobolaryngoscopy (eg, different types of irregularities, left-right frequency differences, shapes of lateral and medial peaks, cycle aberrations). Conclusions: The variations in the identified features reveal different behavioral origins of voice disorders. The findings open new possibilities for objective documentation and for monitoring vocal fold behavior in clinical practice through kymographic imaging.
Affiliation(s)
- Jan G Svec
- Groningen Voice Research Laboratory, Dept of Biomedical Engineering, University Medical Center Groningen, University of Groningen, Antonius Deusinglaan 1, NL 9713 AV Groningen, the Netherlands
|
23
|
Patel RR, Unnikrishnan H, Donohue KD. Effects of Vocal Fold Nodules on Glottal Cycle Measurements Derived from High-Speed Videoendoscopy in Children. PLoS One 2016; 11:e0154586. [PMID: 27124157 PMCID: PMC4849744 DOI: 10.1371/journal.pone.0154586] [Citation(s) in RCA: 19] [Impact Index Per Article: 2.4] [Received: 09/02/2015] [Accepted: 04/17/2016] [Indexed: 11/18/2022]
Abstract
The goal of this study is to quantify the effects of vocal fold nodules on vibratory motion in children using high-speed videoendoscopy. Differences in vibratory motion were evaluated in 20 children with vocal fold nodules (5–11 years) and 20 age and gender matched typically developing children (5–11 years) during sustained phonation at typical pitch and loudness. Normalized kinematic features of vocal fold displacements from the mid-membranous vocal fold point were extracted from the steady-state high-speed video. A total of 12 kinematic features representing spatial and temporal characteristics of vibratory motion were calculated. Average values and standard deviations (cycle-to-cycle variability) of the following kinematic features were computed: normalized peak displacement, normalized average opening velocity, normalized average closing velocity, normalized peak closing velocity, speed quotient, and open quotient. Group differences between children with and without vocal fold nodules were statistically investigated. While a moderate effect size was observed for the spatial feature of speed quotient, and the temporal feature of normalized average closing velocity in children with nodules compared to vocally normal children, none of the features were statistically significant between the groups after Bonferroni correction. The kinematic analysis of the mid-membranous vocal fold displacement revealed that children with nodules primarily differ from typically developing children in closing phase kinematics of the glottal cycle, whereas the opening phase kinematics are similar. Higher speed quotients and similar opening phase velocities suggest greater relative forces are acting on vocal fold in the closing phase. These findings suggest that future large-scale studies should focus on spatial and temporal features related to the closing phase of the glottal cycle for differentiating the kinematics of children with and without vocal fold nodules.
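Two of the kinematic features named in this abstract, open quotient and speed quotient, can be illustrated with a minimal sketch. It uses common textbook definitions (open quotient = open-phase duration / cycle duration; speed quotient = opening-phase duration / closing-phase duration) rather than the study's normalized feature extraction, which the abstract does not detail. The function name `cycle_quotients`, the threshold, and the toy cycle are hypothetical.

```python
def cycle_quotients(cycle, threshold=0.0):
    """Return (open_quotient, speed_quotient) for one glottal cycle.

    `cycle` holds equally spaced samples of one full period and is assumed
    to start and end with closed samples (values <= threshold).
    Durations are counted in sample intervals.
    """
    open_idx = [i for i, v in enumerate(cycle) if v > threshold]
    if not open_idx:
        raise ValueError("cycle never opens")
    first_open, last_open = open_idx[0], open_idx[-1]
    if first_open == 0 or last_open == len(cycle) - 1:
        raise ValueError("cycle must start and end with closed samples")
    # Index of maximum opening within the open phase (first one on ties).
    peak = max(range(first_open, last_open + 1), key=lambda i: cycle[i])
    opening = peak - (first_open - 1)   # last closed sample -> peak
    closing = (last_open + 1) - peak    # peak -> first closed sample
    open_quotient = (opening + closing) / len(cycle)
    speed_quotient = opening / closing
    return open_quotient, speed_quotient


# Toy cycle: slow opening, fast closing, then a closed phase.
cycle = [0, 1, 2, 3, 4, 2, 0, 0, 0, 0]
oq, sq = cycle_quotients(cycle)
```

In this toy cycle the closing phase is shorter than the opening phase, so the speed quotient exceeds 1. Averaging such per-cycle values and taking their standard deviation would correspond to the cycle-to-cycle summaries mentioned in the abstract.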
Affiliation(s)
- Rita R. Patel
- Department of Speech & Hearing Sciences, Indiana University, Bloomington, Indiana, United States of America
- Harikrishnan Unnikrishnan
- Department of Electrical and Computer Engineering, University of Kentucky, Lexington, Kentucky, United States of America
- Kevin D. Donohue
- Department of Electrical and Computer Engineering, University of Kentucky, Lexington, Kentucky, United States of America
|
24
|
Li G, Hou Q. The Physiological Basis of Chinese Höömii Generation. J Voice 2016; 31:116.e13-116.e16. [PMID: 27130324 DOI: 10.1016/j.jvoice.2016.03.007] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 01/26/2016] [Revised: 03/07/2016] [Accepted: 03/10/2016] [Indexed: 10/21/2022]
Abstract
OBJECTIVE The study aimed to investigate the physiological basis of vibration mode of sound source of a variety of Mongolian höömii forms of singing in China. METHODS The participant is a Mongolian höömii performing artist who was recommended by the Chinese Medical Association of Art. He used three types of höömii, namely vibration höömii, whistle höömii, and overtone höömii, which were compared with general comfortable pronunciation of /i:/ as control. Phonation was observed during /i:/. A laryngostroboscope (Storz) was used to determine vibration source-mucosal wave in the throat. RESULTS For vibration höömii, bilateral ventricular folds approximated to the midline and made contact at the midline during pronunciation. Ventricular and vocal folds oscillated together as a single unit to form a composite vibration (double oscillator) sound source. For whistle höömii, ventricular folds approximated to the midline to cover part of vocal folds, but did not contact each other. It did not produce mucosal wave. The vocal folds produced mucosal wave to form a single vibration sound source. For overtone höömii, the anterior two-thirds of ventricular folds touched each other during pronunciation. The last one-third produced the mucosal wave. The vocal folds produced mucosal wave at the same time, which was a composite vibration (double oscillator) sound source mode. CONCLUSIONS The Höömii form of singing, including mixed voices and multivoice, was related to the presence of dual vibration sound sources. Its high overtone form of singing (whistle höömii) was related to stenosis at the resonance chambers' initiation site (ventricular folds level).
Affiliation(s)
- Gelin Li
- Department of Otolaryngology, Beijing Friendship Hospital, Capital Medical University, Beijing 100050, China.
- Qian Hou
- Department of Otolaryngology, Beijing Friendship Hospital, Capital Medical University, Beijing 100050, China
|
25
|
Hampala V, Garcia M, Švec JG, Scherer RC, Herbst CT. Relationship Between the Electroglottographic Signal and Vocal Fold Contact Area. J Voice 2016; 30:161-71. [DOI: 10.1016/j.jvoice.2015.03.018] [Citation(s) in RCA: 31] [Impact Index Per Article: 3.9] [Received: 01/24/2015] [Accepted: 03/30/2015] [Indexed: 11/24/2022]
|
26
|
Brockmann-Bauser M, Beyer D, Bohlender JE. Reliable acoustic measurements in children between 5;0 and 9;11 years: Gender, age, height and weight effects on fundamental frequency, jitter and shimmer in phonations without and with controlled voice SPL. Int J Pediatr Otorhinolaryngol 2015; 79:2035-42. [PMID: 26412461 DOI: 10.1016/j.ijporl.2015.09.005] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.9] [Received: 05/06/2015] [Revised: 09/02/2015] [Accepted: 09/03/2015] [Indexed: 10/23/2022]
Abstract
BACKGROUND Current pediatric voice assessment guidelines include instrumental measurements of fundamental frequency (F0) and the perturbation indices jitter and shimmer. In children below 10 years, gender, age, height and weight effects on these parameters have been inconsistently characterized. Recent research in healthy children showed that differences in habitual speaking voice intensity (voice SPL) under the usual assessment procedure significantly affect jitter and shimmer. These effects were reduced in phonations with controlled voice SPL >80dBA. Reliable measurement methods and description of physiologic influencing factors are essential to identify pathologic voices. OBJECTIVE This cross-sectional single cohort study investigates in children between 5;0 and 9;11 years how gender, age, height and weight affect voice F0, jitter and shimmer in phonations at individually "medium" voice intensity (modeling the usual clinical practice) and with controlled voice SPL >80dBA. SUBJECTS AND METHODS 68 vocally healthy children (39 f/29 m) aged 5;0-9;11 years provided 3 prolonged phonations of /a/ at individually "medium" and controlled voice intensity at ">80dBA" (visual feedback, 10cm distance). F0 (Hz), jitter (%), shimmer (%) and voice SPL (dBA) were determined with PRAAT. Gender, age, height and weight effects without and with controlled voice SPL were assessed by descriptive statistics, Analysis of Variance and Linear Mixed Models. RESULTS F0 (Hz), jitter (%), shimmer (%) and voice SPL (dBA) were significantly different in medium voice compared to >80dBA (p<0.01). In medium phonations girls had a higher F0 than boys (girls: 276.7(50.7), boys: 261.5(33.7)), but with >80dBA this difference was only minimal (girls: 328.9(52.2), boys 327.9(51.2)). Mean jitter (0.27(0.10)) and shimmer (4.34(1.68)) were smaller and showed less spread (jitter: 0.5(0.26); shimmer: 9.47(3.47)) with >80dBA.
Gender, age, height and weight had no significant effects on F0, jitter, shimmer and voice SPL in both phonation types (p-range=0.42-0.99). CONCLUSIONS There were no systematic gender, age, height or weight effects on voice F0, jitter and shimmer, either without or with controlled voice SPL. Gender related F0 discrepancies were equalized in phonations with >80dBA. In children below 10 years gender related acoustic voice differences may be mainly linked to behavior, which should be considered in future work regarding physiologic voice development.
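The jitter (%) and shimmer (%) values reported in this abstract were determined with PRAAT. As a rough illustration only, the sketch below uses the widely described "local" perturbation definition: the mean absolute difference between consecutive cycles divided by the mean value, times 100. It is not the PRAAT implementation, and the function name and the sample periods and amplitudes are made up for the example.

```python
def local_perturbation_percent(values):
    """Mean absolute cycle-to-cycle difference relative to the mean, in %."""
    if len(values) < 2:
        raise ValueError("need at least two cycles")
    diffs = [abs(b - a) for a, b in zip(values, values[1:])]
    mean_diff = sum(diffs) / len(diffs)
    mean_value = sum(values) / len(values)
    return 100.0 * mean_diff / mean_value


# Hypothetical per-cycle measurements (not data from the study).
periods_ms = [3.60, 3.62, 3.59, 3.61]   # cycle durations, roughly 277 Hz
amplitudes = [0.50, 0.52, 0.49, 0.51]   # per-cycle peak amplitudes

jitter_percent = local_perturbation_percent(periods_ms)
shimmer_percent = local_perturbation_percent(amplitudes)
```

Jitter applies the measure to cycle durations and shimmer applies it to cycle amplitudes, which is why one helper serves both.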
Affiliation(s)
- Meike Brockmann-Bauser
- Department of Phoniatrics and Speech Pathology, Clinic for Otorhinolaryngology, Head and Neck Surgery, University Hospital Zurich, University of Zurich, Frauenklinikstrasse 24, 8091 Zurich, Switzerland.
- Denis Beyer
- Department of Phoniatrics and Speech Pathology, Clinic for Otorhinolaryngology, Head and Neck Surgery, University Hospital Zurich, University of Zurich, Frauenklinikstrasse 24, 8091 Zurich, Switzerland.
- Jörg Edgar Bohlender
- Department of Phoniatrics and Speech Pathology, Clinic for Otorhinolaryngology, Head and Neck Surgery, University Hospital Zurich, University of Zurich, Frauenklinikstrasse 24, 8091 Zurich, Switzerland.
|
27
|
Unger J, Schuster M, Hecker DJ, Schick B, Lohscheller J. A generalized procedure for analyzing sustained and dynamic vocal fold vibrations from laryngeal high-speed videos using phonovibrograms. Artif Intell Med 2015; 66:15-28. [PMID: 26597002 DOI: 10.1016/j.artmed.2015.10.002] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.3] [Received: 01/16/2015] [Revised: 09/28/2015] [Accepted: 10/20/2015] [Indexed: 12/01/2022]
Abstract
OBJECTIVE This work presents a computer-based approach to analyze the two-dimensional vocal fold dynamics of endoscopic high-speed videos, and constitutes an extension and generalization of a previously proposed wavelet-based procedure. While most approaches aim at analyzing sustained phonation conditions, the proposed method allows for a clinically adequate analysis of both dynamic and sustained phonation paradigms. MATERIALS AND METHODS The analysis procedure is based on a spatio-temporal visualization technique, the phonovibrogram, that facilitates the documentation of the visible laryngeal dynamics. From the phonovibrogram, a low-dimensional set of features is computed using a principal component analysis strategy that quantifies the type of vibration patterns, irregularity, lateral symmetry and synchronicity as a function of time. Two different test bench data sets are used to validate the approach: (I) 150 healthy and pathologic subjects examined during sustained phonation; (II) 20 healthy and pathologic subjects examined twice, during sustained phonation and during a glissando from a low to a higher fundamental frequency. In order to assess the discriminative power of the extracted features, a Support Vector Machine is trained to distinguish between physiologic and pathologic vibrations. The results for sustained phonation sequences are compared to the previous approach. Finally, the classification performance of the stationary analysis procedure is compared to the transient analysis of the glissando maneuver. RESULTS For the first test bench the proposed procedure outperformed the previous approach (proposed feature set: accuracy: 91.3%, sensitivity: 80%, specificity: 97%; previous approach: accuracy: 89.3%, sensitivity: 76%, specificity: 96%).
Comparing the classification performance of the second test bench further corroborates that analyzing transient paradigms provides clear additional diagnostic value (glissando maneuver: accuracy: 90%, sensitivity: 100%, specificity: 80%, sustained phonation: accuracy: 75%, sensitivity: 80%, specificity: 70%). CONCLUSIONS The incorporation of parameters describing the temporal evolvement of vocal fold vibration clearly improves the automatic identification of pathologic vibration patterns. Furthermore, incorporating a dynamic phonation paradigm provides additional valuable information about the underlying laryngeal dynamics that cannot be derived from sustained conditions. The proposed generalized approach provides a better overall classification performance than the previous approach, and hence constitutes a new advantageous tool for an improved clinical diagnosis of voice disorders.
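The relationship between the three reported performance figures can be checked with a short worked example. The confusion-matrix counts below are an assumption: the abstract gives only percentages and a group size of 20 for the second test bench, and an even split of 10 pathologic and 10 healthy subjects is a split that reproduces the reported values. The function name is hypothetical.

```python
def classification_metrics(tp, fn, tn, fp):
    """Accuracy, sensitivity and specificity in percent,
    treating 'pathologic' as the positive class."""
    total = tp + fn + tn + fp
    return {
        "accuracy": 100.0 * (tp + tn) / total,
        "sensitivity": 100.0 * tp / (tp + fn),   # pathologic correctly flagged
        "specificity": 100.0 * tn / (tn + fp),   # healthy correctly cleared
    }

# Assumed counts for test bench II (10 pathologic, 10 healthy).
glissando = classification_metrics(tp=10, fn=0, tn=8, fp=2)
sustained = classification_metrics(tp=8, fn=2, tn=7, fp=3)
```

Under this assumed split, the glissando counts give 90% accuracy, 100% sensitivity and 80% specificity, and the sustained counts give 75%, 80% and 70%, in line with the figures quoted in the abstract.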
Affiliation(s)
- Jakob Unger
- Department of Computer Science, Trier University of Applied Sciences, Schneidershof, 54293 Trier, Germany.
- Maria Schuster
- Department of Otorhinolaryngology and Head and Neck Surgery, University of Munich, Campus Grosshadern, Marchioninistr. 13, 81366 München, Germany
- Dietmar J Hecker
- Department of Otorhinolaryngology, Saarland University Hospital, Kirrbergerstr., 66424 Homburg/Saar, Germany
- Bernhard Schick
- Department of Otorhinolaryngology, Saarland University Hospital, Kirrbergerstr., 66424 Homburg/Saar, Germany
- Jörg Lohscheller
- Department of Computer Science, Trier University of Applied Sciences, Schneidershof, 54293 Trier, Germany
|
28
|
Deliyski DD, Hillman RE, Mehta DD. Laryngeal High-Speed Videoendoscopy: Rationale and Recommendation for Accurate and Consistent Terminology. J Speech Lang Hear Res 2015; 58:1488-92. [PMID: 26375398 PMCID: PMC4686309 DOI: 10.1044/2015_jslhr-s-14-0253] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.9] [Received: 09/10/2014] [Revised: 02/25/2015] [Accepted: 06/09/2015] [Indexed: 05/24/2023]
Abstract
PURPOSE The authors discuss the rationale behind the term laryngeal high-speed videoendoscopy to describe the application of high-speed endoscopic imaging techniques to the visualization of vocal fold vibration. METHOD Commentary on the advantages of using accurate and consistent terminology in the field of voice research is provided. Specific justification is described for each component of the term high-speed videoendoscopy, which is compared and contrasted with alternative terminologies in the literature. RESULTS In addition to the ubiquitous high-speed descriptor, the term endoscopy is necessary to specify the appropriate imaging technology and distinguish among modalities such as ultrasound, magnetic resonance imaging, and nonendoscopic optical imaging. Furthermore, the term video critically indicates the electronic recording of a sequence of optical still images representing scenes in motion, in contrast to strobed images using high-speed photography and non-optical high-speed magnetic resonance imaging. High-speed videoendoscopy thus concisely describes the technology and can be appended by the desired anatomical nomenclature such as laryngeal. CONCLUSIONS Laryngeal high-speed videoendoscopy strikes a balance between conciseness and specificity when referring to the typical high-speed imaging method performed on human participants. Guidance for the creation of future terminology provides clarity and context for current and future experiments and the dissemination of results among researchers.
Affiliation(s)
- Dimitar D. Deliyski
- Communication Sciences Research Center, Cincinnati Children's Hospital Medical Center, OH
- University of Cincinnati, OH
- Robert E. Hillman
- Center for Laryngeal Surgery and Voice Rehabilitation, Massachusetts General Hospital, Boston, MA
- Harvard Medical School, Boston, MA
- Massachusetts General Hospital Institute of Health Professions, Charlestown, MA
- Daryush D. Mehta
- Center for Laryngeal Surgery and Voice Rehabilitation, Massachusetts General Hospital, Boston, MA
- Harvard Medical School, Boston, MA
- Massachusetts General Hospital Institute of Health Professions, Charlestown, MA
|
29
|
Hertegård S, Larsson H. A Portable High-Speed Camera System for Vocal Fold Examinations. J Voice 2014; 28:681-7. [DOI: 10.1016/j.jvoice.2014.04.002] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.8] [Received: 12/19/2013] [Accepted: 04/01/2014] [Indexed: 11/30/2022]
|
30
|
Shinghal T, Low A, Russell L, Propst EJ, Eskander A, Campisi P. High-speed video or video stroboscopy in adolescents: which sheds more light? Otolaryngol Head Neck Surg 2014; 151:1041-5. [PMID: 25257907 DOI: 10.1177/0194599814551548] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.5] [Indexed: 11/16/2022]
Abstract
OBJECTIVE The primary objective of this study was to compare the utility of high-speed video (HSV) to videostroboscopy (VS) in the assessment of adolescents with normal and abnormal larynges. A secondary objective was to evaluate the ease of assessment of adolescents with HSV. STUDY DESIGN Case series with chart review. SETTING Tertiary academic health care center. SUBJECTS AND METHODS This study involved a retrospective review of recordings of 7 adolescents assessed with both HSV and VS. The 14 recordings were randomized and presented to 4 groups of blinded evaluators: 2 fellowship-trained laryngologists, 2 speech language pathologists (SLP) with multiyear experience working in a voice clinic, 2 pediatric otolaryngologists, and 2 otolaryngology residents. Raters were asked to evaluate the videos using a standardized scoring tool. Raters also completed a questionnaire assessing their opinion of the HSV and VS recordings. RESULTS Evaluators required more time to complete their assessment of VS recordings (2.95 min ± 2.41 min) than HSV recordings (2.31 min ± 1.92 min) (P = .004). There was no difference in ease of evaluation (P = .878) or diagnostic accuracy within evaluator groups by recording modality (P = .5). The overall agreement between VS and HSV was moderate (kappa [SE] = 0.446 [0.029]). The debrief questionnaire revealed that 5 of 8 (62.5%) preferred VS to HSV. CONCLUSION This is the first comparative study between HSV and VS in patients under 18 years of age. HSV permitted faster evaluation than VS, but there was no difference in diagnostic accuracy between the 2 modalities. The evaluators preferred VS to HSV.
Affiliation(s)
- Tulika Shinghal
- Department of Otolaryngology-Head and Neck Surgery, University of Toronto, Toronto, Ontario, Canada
- Aaron Low
- The Voice Clinic, Toronto, Ontario, Canada
- Laurie Russell
- Department of Otolaryngology-Head and Neck Surgery, Hospital for Sick Children, Toronto, Ontario, Canada; Centre for Paediatric Voice & Laryngeal Function and Department of Communication Disorders, Hospital for Sick Children, Toronto, Ontario, Canada
- Evan J Propst
- Department of Otolaryngology-Head and Neck Surgery, University of Toronto, Toronto, Ontario, Canada; Department of Otolaryngology-Head and Neck Surgery, Hospital for Sick Children, Toronto, Ontario, Canada
- Antoine Eskander
- Department of Otolaryngology-Head and Neck Surgery, University of Toronto, Toronto, Ontario, Canada
- Paolo Campisi
- Department of Otolaryngology-Head and Neck Surgery, University of Toronto, Toronto, Ontario, Canada; Department of Otolaryngology-Head and Neck Surgery, Hospital for Sick Children, Toronto, Ontario, Canada; Centre for Paediatric Voice & Laryngeal Function and Department of Communication Disorders, Hospital for Sick Children, Toronto, Ontario, Canada
|
31
|
Patel R, Dubrovskiy D, Döllinger M. Characterizing vibratory kinematics in children and adults with high-speed digital imaging. J Speech Lang Hear Res 2014; 57:S674-86. [PMID: 24686982 PMCID: PMC7315516 DOI: 10.1044/2014_jslhr-s-12-0278] [Citation(s) in RCA: 28] [Impact Index Per Article: 2.8] [Indexed: 05/11/2023]
Abstract
PURPOSE The aim of this study is to quantify and identify characteristic vibratory motion in typically developing prepubertal children and young adults using high-speed digital imaging. METHOD The vibrations of the vocal folds were recorded from 27 children (ages 5-9 years) and 35 adults (ages 21-45 years) with high-speed digital imaging at 4,000 frames per second during sustained phonation. Kinematic features of amplitude periodicity, time periodicity, phase asymmetry, spatial symmetry, and glottal gap index were analyzed from the glottal area waveform, using the mean and standard deviation (i.e., intercycle variability) of each measure. RESULTS Children exhibited lower mean amplitude periodicity compared to men and women and lower time periodicity compared to men. Children and women exhibited greater variability in amplitude periodicity, time periodicity, phase asymmetry, and glottal gap index compared to men. Women had lower mean values of amplitude periodicity and time periodicity compared to men. CONCLUSION Children differed in vocal fold motion both spatially and, more markedly, temporally, suggesting the need for the development of children-specific kinematic norms. Results suggest more uncontrolled vibratory motion in children, reflecting changes in the vocal fold layered structure and aero-acoustic source mechanisms.
|
32
|
Herbst CT, Lohscheller J, Švec JG, Henrich N, Weissengruber G, Fitch WT. Glottal opening and closing events investigated by electroglottography and super-high-speed video recordings. J Exp Biol 2014; 217:955-63. [DOI: 10.1242/jeb.093203] [Citation(s) in RCA: 44] [Impact Index Per Article: 4.4] [Indexed: 11/20/2022]
Abstract
Previous research has suggested that the peaks in the first derivative (dEGG) of the electroglottographic (EGG) signal are good approximate indicators of the events of glottal opening and closing. These findings were based on high-speed video (HSV) recordings with frame rates 10 times lower than the sampling frequencies of the corresponding EGG data. The present study attempts to corroborate these previous findings, utilizing super-HSV recordings. The HSV and EGG recordings (sampled at 27 and 44 kHz, respectively) of an excised canine larynx phonation were synchronized by an external TTL signal to within 0.037 ms. Data were analyzed by means of glottovibrograms, digital kymograms, the glottal area waveform and the vocal fold contact length (VFCL), a new parameter representing the time-varying degree of ‘zippering’ closure along the anterior–posterior (A–P) glottal axis. The temporal offsets between glottal events (depicted in the HSV recordings) and dEGG peaks in the opening and closing phase of glottal vibration ranged from 0.02 to 0.61 ms, amounting to 0.24–10.88% of the respective glottal cycle durations. All dEGG double peaks coincided with vibratory A–P phase differences. In two out of the three analyzed video sequences, peaks in the first derivative of the VFCL coincided with dEGG peaks, again co-occurring with A–P phase differences. The findings suggest that dEGG peaks do not always coincide with the events of glottal closure and initial opening. Vocal fold contacting and de-contacting do not occur at infinitesimally small instants of time, but extend over a certain interval, particularly under the influence of A–P phase differences.
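The timing figures in this abstract can be related by simple arithmetic, sketched below. Two things are assumptions: that the quoted 27 kHz video rate is a nominal, possibly rounded value, and the fundamental frequency used in the second call, which is purely hypothetical because the abstract does not report cycle durations. Function names are illustrative.

```python
def sample_period_ms(rate_hz):
    """Time between two consecutive samples (or video frames) in milliseconds."""
    return 1000.0 / rate_hz

def offset_percent_of_cycle(offset_ms, f0_hz):
    """Express a temporal offset as a percentage of one glottal cycle."""
    cycle_ms = 1000.0 / f0_hz
    return 100.0 * offset_ms / cycle_ms

# One frame at a nominal 27 kHz lasts about 0.037 ms, which coincides
# with the synchronization tolerance quoted in the abstract.
hsv_frame_ms = sample_period_ms(27_000)

# Hypothetical example: a 0.61 ms offset at an assumed F0 of 150 Hz.
example_percent = offset_percent_of_cycle(0.61, 150.0)
```

The second function shows why the same absolute offset weighs more heavily at higher fundamental frequencies: the cycle is shorter, so the offset occupies a larger share of it.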
Affiliation(s)
- Christian T. Herbst
- Voice Research Laboratory, Department of Biophysics, Faculty of Science, Palacký University Olomouc, tr. 17. Listopadu 12, 771 46 Olomouc, Czech Republic
- Laboratory of Bio-Acoustics, Department of Cognitive Biology, University of Vienna, Althanstraße 14, 1090 Vienna, Austria
- Jörg Lohscheller
- University of Applied Sciences, Department of Computer Science, Schneidershof, 54293 Trier, Germany
- Jan G. Švec
- Voice Research Laboratory, Department of Biophysics, Faculty of Science, Palacký University Olomouc, tr. 17. Listopadu 12, 771 46 Olomouc, Czech Republic
- Nathalie Henrich
- GIPSA-lab, CNRS, Grenoble INP, Grenoble University, 11 rue des Mathématiques – BP 46, 38402 Saint Martin d'Hères cedex, France
- Gerald Weissengruber
- University of Veterinary Medicine Vienna, Institute for Anatomy, Histology and Embryology, Veterinärplatz 1, 1210 Vienna, Austria
- W. Tecumseh Fitch
- Laboratory of Bio-Acoustics, Department of Cognitive Biology, University of Vienna, Althanstraße 14, 1090 Vienna, Austria
|
33
|
Objective Measures of Laryngeal Imaging: What Have We Learned Since Dr. Paul Moore. J Voice 2014; 28:69-81. [DOI: 10.1016/j.jvoice.2013.02.001] [Citation(s) in RCA: 31] [Impact Index Per Article: 3.1] [Received: 06/05/2012] [Accepted: 02/06/2013] [Indexed: 11/19/2022]
|
34
|
Evaluation of Vocal Fold Vibration With an Assessment Form for High-Speed Digital Imaging: Comparative Study between Healthy Young and Elderly Subjects. J Voice 2012; 26:742-50. [DOI: 10.1016/j.jvoice.2011.12.010] [Citation(s) in RCA: 38] [Impact Index Per Article: 3.2] [Received: 10/30/2011] [Accepted: 12/20/2011] [Indexed: 11/22/2022]
|
35
|
Srivastava S, Vasavada AR, Vasavada VA, Vasavada VA. Real-time intraoperative high-speed imaging during phacoemulsification. J Cataract Refract Surg 2012; 38:1519-25. [PMID: 22841426 DOI: 10.1016/j.jcrs.2012.07.001] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 01/07/2012] [Revised: 03/24/2012] [Accepted: 03/28/2012] [Indexed: 10/28/2022]
Abstract
We describe the use of high-speed imaging during phacoemulsification in a clinical scenario. Images captured during surgery at high frame rates are converted into a slow-motion film to view and analyze various surgical steps. This technique highlights events that are not captured in a normal-speed video recording. It has obvious applications for understanding surgical techniques and technology.
|
36
|
Patel RR, Dixon A, Richmond A, Donohue KD. Pediatric high speed digital imaging of vocal fold vibration: a normative pilot study of glottal closure and phase closure characteristics. Int J Pediatr Otorhinolaryngol 2012; 76:954-9. [PMID: 22445799 PMCID: PMC3372768 DOI: 10.1016/j.ijporl.2012.03.004] [Citation(s) in RCA: 32] [Impact Index Per Article: 2.7] [Received: 10/28/2011] [Revised: 02/29/2012] [Accepted: 03/03/2012] [Indexed: 11/30/2022]
Abstract
OBJECTIVE The aim of the study is to characterize normal vibratory patterns of both glottal closure and phase closure in the pediatric population with the use of high speed digital imaging. METHODS For this prospective study a total of 56 pre-pubertal children, 5-11 years (boys=28, girls=28), and 56 adults, 21-45 years (males=28, females=28), without known voice problems were examined with the use of a new technology of high speed digital imaging. Recordings were captured at 4000 frames per second for a duration of 4.094 s at participants' typical phonation. With semi-automated software, montage analysis of glottal cycles was performed. Three trained, experienced raters rated features of glottal configuration and phase closure from glottal cycle montages. RESULTS Posterior glottal gap was the predominant glottal closure configuration in children (girls=85%, boys=68%) with normal voice. Other glottal configurations observed were: anterior gap (girls=3.6%, boys=0%), complete closure (girls=7%, boys=10%) and hour glass (girls=0%, boys=11%). Adults with normal voice also demonstrated a predominantly higher percentage of posterior glottal gap configuration (females=75%, males=54%) compared to the configurations of anterior gap (females=0%, males=7%), complete closure (females=2%, males=39%) and hour glass (females=3.6%, males=3.6%). A predominantly open phase (51-70% of the glottal cycle) was observed in 86% of girls and 71% of boys. In contrast to children, adult females showed a predominantly balanced phase closure (46%), followed by a predominantly open phase (39%) and a predominantly closed phase (14%). Adult males showed a predominantly closed phase (43%), followed by a predominantly open phase (39%) and a balanced phase (18%). CONCLUSIONS This is the first study investigating characteristics of normal vibratory motion in children with high speed digital imaging. Glottal configuration and phase closure for children with normal voices are distinctly different compared to adults.
The results suggest that posterior glottal gap and a predominantly open phase of the glottal cycle should be considered as normal glottal configuration in children during modal pitch and loudness. This study provides preliminary information on the vibratory characteristics of children with normal voice. The data presented here may provide the basis for differentiating normal from disordered vibratory characteristics in the pediatric population.
Affiliation(s)
- Rita R. Patel
- Department of Rehabilitation Sciences, Division of Communication Sciences & Disorders, University of Kentucky, Lexington, USA
- Angela Dixon
- Department of Rehabilitation Sciences, Division of Communication Sciences & Disorders, University of Kentucky, Lexington, USA
- AnnaMary Richmond
- Department of Rehabilitation Sciences, Division of Communication Sciences & Disorders, University of Kentucky, Lexington, USA
- Kevin D. Donohue
- Department of Electrical & Computer Engineering, University of Kentucky, Lexington, USA
|
37
|
Roark RM, Watson BC, Baken R, Brown DJ, Thomas JM. Measures of Vocal Attack Time for Healthy Young Adults. J Voice 2012; 26:12-7. [DOI: 10.1016/j.jvoice.2010.09.009] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.5] [Received: 07/27/2010] [Accepted: 09/27/2010] [Indexed: 10/18/2022]
|
38
|
Patel RR, Donohue KD, Johnson WC, Archer SM. Laser projection imaging for measurement of pediatric voice. Laryngoscope 2011; 121:2411-7. [PMID: 21993904 DOI: 10.1002/lary.22325] [Citation(s) in RCA: 20] [Impact Index Per Article: 1.5] [Received: 03/30/2011] [Revised: 07/18/2011] [Accepted: 07/22/2011] [Indexed: 11/11/2022]
Abstract
OBJECTIVES/HYPOTHESIS The aim of the study was to present the development of a miniature laser projection endoscope and to quantify vocal fold length and vibratory amplitude of the pediatric glottis using high-speed digital imaging coupled with the laser endoscope. STUDY DESIGN For this prospective study, absolute measurement of entire vocal fold length, membranous length of the vocal fold, and vibratory amplitude during phonation were obtained in one child (9 years old), one adult male (36 years old), and one adult female (20 years old) with the use of high-speed digital imaging, coupled with a custom-developed laser projection endoscope. METHODS The laser projection system consists of a module slip-fit sleeve with two 3-mW 650-nm laser diodes in horizontal orientation separated by a distance of 5 mm. Calibration involved projecting the laser onto grid patterns at depths ranging from 6 to 10 cm and tilt angles of 15 to -5 degrees to obtain pixel-to-millimeter conversion templates. Measurements of vocal fold length and vibratory amplitude were extracted based on methods of image processing. RESULTS The system demonstrated a method for estimating vocal fold length and vibratory amplitude with a single laser point with high measurement precision. First measurements of vocal fold length (6.8 mm) and vibratory amplitude (0.25 mm) during phonation in a pediatric participant are reported. CONCLUSIONS The proposed laser projection system can be used to obtain absolute length and vibratory measurements of the pediatric glottis. The projection system can be used with stroboscopy or high-speed digital imaging systems with a 70-degree rigid endoscope.
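The idea behind a pixel-to-millimeter conversion can be illustrated with a simplified sketch. It uses only the 5 mm laser spacing stated in the abstract and assumes the measured structure lies in the same plane as the two laser dots, perpendicular to the optical axis; it ignores the depth and tilt templates that the described calibration provides. All coordinates and function names are hypothetical.

```python
import math

LASER_SPACING_MM = 5.0  # separation of the two laser diodes stated in the abstract

def mm_per_pixel(dot_a, dot_b, spacing_mm=LASER_SPACING_MM):
    """Scale factor from the pixel distance between the two projected dots."""
    pixel_distance = math.dist(dot_a, dot_b)
    if pixel_distance == 0:
        raise ValueError("laser dots coincide; cannot derive a scale")
    return spacing_mm / pixel_distance

def length_mm(point_a, point_b, scale):
    """Convert the pixel distance between two image points to millimeters."""
    return math.dist(point_a, point_b) * scale

# Hypothetical pixel coordinates (x, y), not measurements from the study.
scale = mm_per_pixel((100, 200), (150, 200))        # 50 px apart -> 0.1 mm/px
fold_length = length_mm((120, 180), (120, 240), scale)  # 60 px -> 6.0 mm
```

A single scale factor is only valid at one working distance and angle, which is presumably why the described system calibrates over a range of depths and tilt angles.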
Affiliation(s)
- Rita R Patel
- Department of Rehabilitation Sciences, Division of Communication Sciences and Disorders, University of Kentucky, Lexington, Kentucky 40536-0200, USA.
|
39
|
Patel RR, Liu L, Galatsanos N, Bless DM. Differential vibratory characteristics of adductor spasmodic dysphonia and muscle tension dysphonia on high-speed digital imaging. Ann Otol Rhinol Laryngol 2011; 120:21-32. [PMID: 21370677 DOI: 10.1177/000348941112000104] [Citation(s) in RCA: 33] [Impact Index Per Article: 2.5] [Indexed: 11/17/2022]
Abstract
OBJECTIVES The purpose of this study was to quantify disorder-specific signature kinematic disturbances of vibratory motion in adductor spasmodic dysphonia (AdSD) and muscle tension dysphonia (MTD), in voice disturbances of a severe nature, with the use of high-speed digital imaging (HSDI). A secondary hypothesis of the study was to investigate the sensitivity and specificity of the signature kinematic features obtained from HSDI, in differentiating between AdSD and MTD. METHODS We used vibratory features from automated extraction of vocal fold motion waveforms and glottal cycle montage analysis from HSDI for differential kinematic profiling of AdSD and MTD. RESULTS Novel features of motion irregularities and micromotions (as small as 27 ms) were greater in number for AdSD, whereas reduced motion irregularities, absence of oscillatory breaks, absence of micromotions, and increased hyperfunction characterized the MTD group. Oscillatory breaks (as small as 8 ms), although present only in the AdSD group, were not statistically significant because of their reduced number of occurrences compared to the other features. Further montage analysis of successive glottal cycles of oscillatory breaks in the AdSD group revealed 3 different kinematic patterns within the AdSD group, indicative of likely AdSD with: 1) possible predominant thyroarytenoid muscle involvement, 2) possible predominant cricothyroid muscle involvement, and 3) possible combined involvements of the thyroarytenoid and lateral cricoarytenoid muscles. Four consistent but unique kinematic patterns were identified within the MTD group: 1) diplophonia, 2) vocal fry, 3) breathy phonation, and 4) pressed phonation. Sensitivity and specificity analysis revealed that only motion irregularity was a significant predictor of the presence of AdSD. CONCLUSIONS Fine kinematic analysis from HSDI can be used to aid detailed clinical profiling of the source characteristics of AdSD and MTD.
Affiliation(s)
- Rita R Patel
- Division of Otolaryngology-Head and Neck Surgery, Department of Surgery, University of Wisconsin-Madison, Madison, Wisconsin, USA
|
40
|
A comparison of sung and spoken phonation onset gestures using high-speed digital imaging. J Voice 2011; 26:226-38. [PMID: 21256709 DOI: 10.1016/j.jvoice.2010.11.005] [Citation(s) in RCA: 22] [Impact Index Per Article: 1.7] [Received: 06/30/2010] [Accepted: 11/15/2010] [Indexed: 11/23/2022]
Abstract
Phonation onset is important in the maintenance of healthy vocal production for speech and singing. The purpose of this preliminary study was to examine differences in vocal fold vibratory behavior between sung and spoken phonation onset gestures. Given the greater degree of precision required for the abrupt onset sung gestures, we hypothesize that differences exist in the timing and coordination of the vocal fold adductory gesture with the onset of vocal fold vibration. Staccato and German (a modified glottal plosive, so named for its occurrence in German classical singing) onset gestures were compared with breathy, normal, and hard onset gestures, using high-speed digital imaging. Samples were obtained from two subjects with no history of voice disorders (a female trained singer and a male nonsinger). Simultaneous capture of acoustical data confirmed the distinction among gestures. Image data were compared for glottal area configurations, degree of adductory positioning, number of small-amplitude prephonatory oscillations (PPOs), and timing of onset gesture events, the latter marked by maximum vocal fold abduction, maximum adduction, beginning of PPOs, and beginning of steady-state oscillation. Results reveal closer adductory positioning of the vocal folds for the staccato and German gestures. The data also suggest a direct relationship between the degree of adductory positioning and the number of PPOs. Results for the timing of onset gesture events suggest a relationship between discrete adductory positioning and more evenly spaced PPOs. By contrast, the overlapping of prephonatory adductory positioning with vibration onset revealed more unevenly spaced PPOs. This may support an existing hypothesis that less well-defined boundaries interfere with normal modes of vibration of the vocal fold tissue.
|
41
|
|
42
|
Advances in laryngeal imaging. Eur Arch Otorhinolaryngol 2009; 266:1509-20. [PMID: 19618198 DOI: 10.1007/s00405-009-1050-4] [Citation(s) in RCA: 29] [Impact Index Per Article: 1.9] [Received: 12/05/2008] [Accepted: 07/07/2009] [Indexed: 10/20/2022]
Abstract
Imaging and image analysis have become an important issue in laryngeal diagnostics. Various techniques, such as videostroboscopy, videokymography, digital kymography, and ultrasonography, are available and are used in research and clinical practice. This paper reviews recent advances in imaging for laryngeal diagnostics.
|
43
|
Abstract
BACKGROUND Stroboscopy is widely used and is quite adequate for the examination of normal voices, but with increasing hoarseness its suitability declines, even when it is supplemented by video recordings and image evaluation. Real-time procedures such as videokymography or high-speed (HS) video imaging are more suitable methods of observing the movements of the vocal folds in such cases. A drawback of any video recording is the later time-consuming offline replay of the films in slow motion and our restricted pattern recognition for motion and other time-dependent processes. METHODS The phonovibrogram (PVG) is an image-processing algorithm that extracts the vocal fold motions of a whole laryngoscopic HS video film and automatically compresses them into a single image. RESULTS Simple patterns that vary from person to person are revealed by PVG; these can be categorized by means of simple geometric forms, which a human observer can more easily recognize and interpret than dynamic motion patterns. The PVG computation is described in detail and an extensive guide to interpretation is given, illustrated by reference to theoretical and real examples. CONCLUSION In clinical conditions, HS laryngoscopic video recording is useful only in association with automatic image processing. The PVG procedure is a promising approach and tests should be performed with a view to further clinical validation.
Affiliation(s)
- U Eysholdt
- Abteilung für Phoniatrie und Pädaudiologie, Universitätsklinikum Erlangen, Erlangen, Germany.
|
44
|
Bonilha HS, Deliyski DD. Period and Glottal Width Irregularities in Vocally Normal Speakers. J Voice 2008; 22:699-708. [DOI: 10.1016/j.jvoice.2007.03.002] [Citation(s) in RCA: 46] [Impact Index Per Article: 2.9] [Received: 12/06/2006] [Accepted: 03/01/2007] [Indexed: 10/22/2022]
|
45
|
Mortensen M, Woo P. High-speed imaging used to detect vocal fold paresis: a case report. Ann Otol Rhinol Laryngol 2008; 117:684-7. [PMID: 18834072 DOI: 10.1177/000348940811700910] [Citation(s) in RCA: 23] [Impact Index Per Article: 1.4] [Indexed: 11/16/2022]
Abstract
High-speed imaging has been used to study vocal fold vibration and has been shown to provide additional information in aid of our understanding of pathologic vocal fold vibration. This is the first case report of vocal fold paresis diagnosed by high-speed imaging. An 18-year-old girl presented with intermittent voice loss that had been present for 4 years. The patient had been seen by other otolaryngologists and had been given proton pump inhibitors without any improvement in her voice. Her voice was diplophonic. The patient was examined by rigid stroboscopy and was found to have a predominantly open phase pattern but a normal vibratory pattern. High-speed photography showed a distinct vibratory frequency for each vocal fold, suggestive of a paresis pattern. Laryngeal electromyography confirmed the diagnosis of vocal fold paresis. A computed tomographic scan of the larynx and chest showed a thymoma. After thymectomy, the patient recovered full voice function. High-speed imaging is useful for the clinical evaluation of pathologic vocal fold vibration and can detect subtle features of paralysis that may not be detected on fiberoptic endoscopy and rigid stroboscopy. The additional information from high-speed imaging helped to make the diagnosis of vocal fold paresis in this patient.
Affiliation(s)
- Melissa Mortensen
- Department of Otolaryngology-Head and Neck Surgery, Mount Sinai Medical Center, New York, New York 10029, USA
|
46
|
Krantz JH. Did I Really See That? The Complex Relationship Between the Visual Stimulus and Visual Perception. J Voice 2008; 22:520-32. [PMID: 17509821 DOI: 10.1016/j.jvoice.2007.02.003] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.4] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/27/2006] [Accepted: 02/14/2007] [Indexed: 11/17/2022]
Abstract
Laryngeal imaging uses optical and electronic means to visualize the larynx. Understanding some of the issues related to how the human visual system operates and how imaging systems interact with the visual system can help clarify some of the artifacts that arise from these technologies. This article describes how the visual system can construct coherent perceptions from limited information, how it adjusts to current situations, and how the perception of any one part of the image depends upon the light levels around each point. In particular, the limited field of view and stroboscopic nature of the images can lead to many distortions from laryngeal imaging. This article also describes the way that imaging systems sample the image, and the lack of stability inherent in an imaging system. The article concludes with some observations and recommendations to improve the ability to use imaging systems in the diagnosis of laryngeal pathology.
Affiliation(s)
- John H Krantz
- Department of Psychology, Hanover College, P.O. Box 890, Hanover, IN 47243, USA.
|
47
|
Patel R, Dailey S, Bless D. Comparison of High-Speed Digital Imaging with Stroboscopy for Laryngeal Imaging of Glottal Disorders. Ann Otol Rhinol Laryngol 2008; 117:413-24. [DOI: 10.1177/000348940811700603] [Citation(s) in RCA: 141] [Impact Index Per Article: 8.8] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/17/2022]
Abstract
Objectives: High-speed digital imaging (HSDI), unlike stroboscopy, is a frequency-independent visualization technique that provides detailed biomechanical assessment of vocal physiology due to increased temporal resolution. The purpose of this study was to investigate the clinical value of HSDI compared to that of stroboscopy across 3 disorder groups classified as epithelial, subepithelial, and neurologic disorders. Methods: Judgments of vibratory features of vocal fold edge, glottal closure, phase closure, vertical level, vibratory amplitude, mucosal wave, phase symmetry, tissue pliability, and glottal cycle periodicity from 252 participants were performed by 3 experienced raters. Results: The results revealed that 63% of the data set was noninterpretable for assessment of vibratory function on stroboscopic analysis because of the severity of the voice disorder (100% of participants with severe voice disorders and 64% of participants with moderate voice disorders), whereas HSDI resulted in analysis of 100% of the data. The neuromuscular group (74%) was the most difficult to analyze with stroboscopy, followed by the epithelial (58%) and subepithelial groups (53%), secondary to the severity of hoarseness. Conclusions: Because it is desirable in clinical examination to observe vocal fold vibrations, which cannot be done in cases of severe dysphonia, HSDI may aid in clinical decision-making when patients exhibit values exceeding 0.87% jitter, 4.4% shimmer, and a signal-to-noise ratio of less than 15.4 dB on acoustic analysis. These measures could serve as minimal indications for use of HSDI. The data suggest that HSDI can be viewed as augmentative to stroboscopy, particularly in cases of moderate to severe aperiodicity, in which HSDI may aid clinical decision-making.
|
48
|
Lohscheller J, Eysholdt U, Toy H, Dollinger M. Phonovibrography: mapping high-speed movies of vocal fold vibrations into 2-D diagrams for visualizing and analyzing the underlying laryngeal dynamics. IEEE Trans Med Imaging 2008; 27:300-9. [PMID: 18334426 DOI: 10.1109/tmi.2007.903690] [Citation(s) in RCA: 69] [Impact Index Per Article: 4.3] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 05/04/2023]
Abstract
Endoscopic high-speed laryngoscopy in combination with image analysis strategies is the most promising approach to investigate the interrelation between vocal fold vibrations and voice disorders. So far, owing to the lack of an objective and standardized analysis procedure, a unique characterization of vocal fold vibrations has not been achieved. We present a visualization and analysis strategy which transforms the segmented edges of vibrating vocal folds into a single 2-D image, denoted Phonovibrogram (PVG). Within a PVG the individual type of vocal fold vibration becomes uniquely characterized by specific geometric patterns. The PVG geometries give intuitive access to the type and degree of the laryngeal asymmetry and can be quantified using an image segmentation approach. The PVG analysis was applied to 14 representative recordings derived from a high-speed database comprising normal and pathological voices. We demonstrate that PVGs are capable of differentiating and quantifying different types of normal and pathological vocal fold vibrations. The objective and precise quantification of the PVG geometry may have the potential to realize a novel classification of vocal fold vibrations.
Affiliation(s)
- Jörg Lohscheller
- Department of Phoniatrics and Pediatric Audiology, University Hospital Erlangen Medical School, 91054 Erlangen, Germany.
|
49
|
Singh A, Kazi R, Venkitaraman R, Kapoor K, Nutting C, Clarke P, Rhys Evans P, Harrington K. Does flexible videostroboscopy compare with rigid videostroboscopy in the assessment of the neoglottis? A preliminary report. Clin Otolaryngol 2008; 33:60-3. [PMID: 18302558 DOI: 10.1111/j.1749-4486.2007.01571.x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/30/2022]
Abstract
OBJECTIVE To evaluate rigid and flexible stroboscopy of the neoglottis. STUDY DESIGN Prospective pilot study set at a tertiary level Head & Neck Unit. PARTICIPANTS Twenty-four patients recruited. All had undergone a total laryngectomy and were voicing using a Blom-Singer valve. All had stroboscopic evaluation of their neoglottis using flexible and rigid endoscopes. MAIN OUTCOME MEASURES A rating form was devised based on six parameters with clear definitions. Secondary measures included ability to tolerate the procedure and completeness of the rating form for each parameter using the two systems. RESULTS There was good reliability between individual raters for the assessment of each system based on Spearman Rho correlation. Importantly, two-thirds of the patients who were unable to tolerate rigid videostroboscopy managed flexible videostroboscopy. Correlation between rigid and flexible videostroboscopy was poor for both raters. Flexible systems picked up more mucosal waves and allowed further analysis of the mucosal wave pattern. CONCLUSIONS To our knowledge, this is the first study to demonstrate that fibreoptic videostroboscopy is as good as rigid videostroboscopy in the assessment of the neoglottis. In fact, flexible videostroboscopy should be routinely used, as it is better tolerated and allows a more detailed analysis of the neoglottis.
|
50
|
Orlikoff RF, Deliyski DD, Baken RJ, Watson BC. Validation of a glottographic measure of vocal attack. J Voice 2007; 23:164-8. [PMID: 18083343 DOI: 10.1016/j.jvoice.2007.08.004] [Citation(s) in RCA: 29] [Impact Index Per Article: 1.7] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/12/2007] [Accepted: 08/15/2007] [Indexed: 12/01/2022]
Abstract
The speed with which the vocal folds adduct to the midline is considered an important variable in the etiology of some voice disorders and may also be a meaningful indicator of central or peripheral neural dysfunction. It is proposed that the time lag between the rise of the sound pressure (SP) and electroglottographic (EGG) signals, measured at the onset of phonation, provides a useful index of vocal attack time. This report describes the experimental validation of this measure, whereby the SP and EGG signals were recorded synchronously with high-speed videoendoscopy, from which a digital kymogram was generated. It is shown that, after appropriate signal processing, the intersignal time delay provides a potentially useful measure that varies with vocal attack characteristics. The proposed method calls for no invasive procedures and relies on signals that are routinely obtained in most clinical settings. Unlike acoustic "rise time" measures of voice onset, the glottographic measure involves no operator intervention, requires no arbitrary decisions about measurement points, and may be accomplished quickly and automatically on any personal computer.
Affiliation(s)
- Robert F Orlikoff
- Department of Speech-Language Pathology, School of Graduate Medical Education, Seton Hall University, South Orange, New Jersey 07079, USA.
|