1. Rohlfing ML, Buckley DP, Piraquive J, Stepp CE, Tracy LF. Hey Siri: How Effective are Common Voice Recognition Systems at Recognizing Dysphonic Voices? Laryngoscope 2021;131:1599-1607. [PMID: 32949415] [DOI: 10.1002/lary.29082]
Abstract
OBJECTIVES/HYPOTHESIS Interaction with voice recognition systems, such as Siri™ and Alexa™, is an increasingly important part of everyday life. Patients with voice disorders may have difficulty with this technology, leading to frustration and reduced quality of life. This study evaluates the ability of common voice recognition systems to transcribe dysphonic voices. STUDY DESIGN Retrospective evaluation of "Rainbow Passage" voice samples from patients with and without voice disorders. METHODS Participants with (n = 30) and without (n = 23) voice disorders were recorded reading the "Rainbow Passage". Recordings were played at standardized intensity and distance to dictation programs on the Apple iPhone 6S™, Apple iPhone 11 Pro™, and Google Voice™. Word recognition scores were calculated as the proportion of correctly transcribed words and were compared to auditory-perceptual and acoustic measures. RESULTS Mean word recognition scores for participants with and without voice disorders were, respectively, 68.6% and 91.9% for the Apple iPhone 6S™ (P < .001), 71.2% and 93.7% for the Apple iPhone 11 Pro™ (P < .001), and 68.7% and 93.8% for Google Voice™ (P < .001). There were strong, approximately linear associations between CAPE-V ratings of overall severity of dysphonia and word recognition score, with coefficients of determination (R²) of 0.609 (iPhone 6S™), 0.670 (iPhone 11 Pro™), and 0.619 (Google Voice™). These relationships persisted when controlling for diagnosis, age, gender, fundamental frequency, and speech rate (P < .001 for all systems). CONCLUSION Common voice recognition systems function well with nondysphonic voices but transcribe dysphonic voices poorly. There was a strong negative correlation between word recognition scores and perceptual voice evaluation. As our society increasingly interfaces with automated voice recognition technology, the needs of patients with voice disorders should be considered.
LEVEL OF EVIDENCE: 4. Laryngoscope, 131:1599-1607, 2021.
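The word recognition score in this study is the proportion of reference-passage words that were correctly transcribed. As a rough illustration only (the study's exact matching rules, e.g. for repeated words or insertions, are not given here), one way to compute such a score:

```python
def word_recognition_score(reference: str, transcript: str) -> float:
    """Proportion of reference words found in the transcript.

    A minimal sketch: case-insensitive bag-of-words matching, with each
    transcribed word allowed to match at most one reference word.
    """
    ref = reference.lower().split()
    hyp = transcript.lower().split()
    hits = 0
    for word in ref:
        if word in hyp:
            hyp.remove(word)  # consume the match so it cannot be reused
            hits += 1
    return hits / len(ref) if ref else 0.0
```

A real evaluation would typically align the two word sequences (e.g. via edit distance) rather than ignore word order, but the proportion-correct idea is the same.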
Affiliation(s)
- Matthew L Rohlfing
- Department of Otolaryngology-Head and Neck Surgery, Boston Medical Center, Boston University School of Medicine, Boston, Massachusetts, U.S.A.
- Daniel P Buckley
- Department of Otolaryngology-Head and Neck Surgery, Boston Medical Center, Boston University School of Medicine, Boston, Massachusetts, U.S.A.
- Department of Speech, Language, and Hearing Sciences, Boston University, Boston, Massachusetts, U.S.A.
- Jacquelyn Piraquive
- Department of Otolaryngology-Head and Neck Surgery, Boston Medical Center, Boston University School of Medicine, Boston, Massachusetts, U.S.A.
- Cara E Stepp
- Department of Speech, Language, and Hearing Sciences, Boston University, Boston, Massachusetts, U.S.A.
- Lauren F Tracy
- Department of Otolaryngology-Head and Neck Surgery, Boston Medical Center, Boston University School of Medicine, Boston, Massachusetts, U.S.A.
2. Ishikawa K, Boyce S, Kelchner L, Powell MG, Schieve H, de Alarcon A, Khosla S. The Effect of Background Noise on Intelligibility of Dysphonic Speech. J Speech Lang Hear Res 2017;60:1919-1929. [PMID: 28679008] [PMCID: PMC6194928] [DOI: 10.1044/2017_jslhr-s-16-0012]
Abstract
PURPOSE The aim of this study was to determine the effect of background noise on the intelligibility of dysphonic speech and to examine the relationship between intelligibility in noise and an acoustic measure of dysphonia: cepstral peak prominence (CPP). METHOD A speech perception study was conducted using speech samples from 6 adult speakers with typical voices and 6 adult speakers with dysphonia. Speech samples were presented to 30 listeners with typical hearing in 3 noise conditions: quiet, +5 dB signal-to-noise ratio (SNR), and 0 dB SNR. Intelligibility scores were obtained via orthographic transcription as the percentage of correctly identified words. Speech samples were acoustically analyzed using CPP, and the correlation between the CPP measurements and intelligibility scores was examined. RESULTS The intelligibility of both typical and dysphonic speech was reduced as the level of background noise increased; the reduction was significantly greater for dysphonic speech. A strong correlation was noted between CPP and intelligibility score at 0 dB SNR. CONCLUSIONS Dysphonic speech is harder to understand in the presence of background noise than typical speech, and CPP may be a useful predictor of this intelligibility deficit. Future work is needed to confirm these findings with a larger number of speakers and speech materials with known predictability.
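The noise conditions here are fixed speech-to-noise power ratios. A minimal sketch of how a noise track can be scaled to hit a target SNR (in dB) before mixing; the study's actual calibration procedure is not described here, so this is illustrative only:

```python
import numpy as np

def mix_at_snr(speech: np.ndarray, noise: np.ndarray, snr_db: float) -> np.ndarray:
    """Scale `noise` so that the speech-to-noise power ratio equals
    `snr_db`, then add it to `speech`. Assumes both are 1-D sample arrays."""
    noise = noise[: len(speech)]
    p_speech = np.mean(speech ** 2)
    p_noise = np.mean(noise ** 2)
    # target noise power so that 10*log10(p_speech / p_target) == snr_db
    p_target = p_speech / (10 ** (snr_db / 10))
    return speech + noise * np.sqrt(p_target / p_noise)
```

At `snr_db=0` the scaled noise carries the same average power as the speech; at `snr_db=5` it carries about a third as much.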
Affiliation(s)
- Keiko Ishikawa
- Department of Communication Sciences and Disorders, University of Cincinnati, OH
- Suzanne Boyce
- Department of Communication Sciences and Disorders, University of Cincinnati, OH
- Lisa Kelchner
- Department of Communication Sciences and Disorders, University of Cincinnati, OH
- Maria Golla Powell
- Department of Communication Sciences and Disorders, University of Cincinnati, OH
- Heidi Schieve
- Department of Communication Sciences and Disorders, University of Cincinnati, OH
- Sid Khosla
- Department of Otolaryngology, University of Cincinnati, OH
3. Monson BB, Hunter EJ, Lotto AJ, Story BH. The perceptual significance of high-frequency energy in the human voice. Front Psychol 2014;5:587. [PMID: 24982643] [PMCID: PMC4059169] [DOI: 10.3389/fpsyg.2014.00587]
Abstract
While human vocalizations generate acoustic energy at frequencies up to (and beyond) 20 kHz, the energy above about 5 kHz has traditionally been neglected in speech perception research. The intent of this paper is to review (1) the historical reasons for this research trend and (2) the work that continues to elucidate the perceptual significance of high-frequency energy (HFE) in speech and singing. The historical and physical factors reveal that, while HFE was believed to be unnecessary and/or impractical for applications of interest, it was never shown to be perceptually insignificant. Rather, the focus on low-frequency energy appears to have had two main causes: the low-frequency portion of the speech spectrum was seen as perceptually sufficient, and HFE research was considered too difficult to be technologically justifiable. The advancement of technology continues to overcome the second concern, and advances in our understanding of the perceptual effects of HFE now cast doubt on the first. Emerging evidence indicates that HFE plays a more significant role than previously believed and should thus be considered in speech and voice perception research, especially research involving children and the hearing impaired.
Affiliation(s)
- Brian B. Monson
- Department of Pediatric Newborn Medicine, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, USA
- National Center for Voice and Speech, University of Utah, Salt Lake City, UT, USA
- Eric J. Hunter
- National Center for Voice and Speech, University of Utah, Salt Lake City, UT, USA
- Department of Communicative Sciences and Disorders, Michigan State University, East Lansing, MI, USA
- Andrew J. Lotto
- Speech, Language, and Hearing Sciences, University of Arizona, Tucson, AZ, USA
- Brad H. Story
- Speech, Language, and Hearing Sciences, University of Arizona, Tucson, AZ, USA
4. Fraj S, Schoentgen J, Grenez F. Development and perceptual assessment of a synthesizer of disordered voices. J Acoust Soc Am 2012;132:2603-2615. [PMID: 23039453] [DOI: 10.1121/1.4751536]
Abstract
A synthesizer of disordered voices is presented, based on a nonlinear wave-shaping model of the glottal area, an algebraic model of the glottal aerodynamics, and concatenated-tube models of the trachea and vocal tract. Voice disorders are simulated by way of models of vocal frequency jitter and tremor, vocal amplitude shimmer and tremor, and pulsatile additive noise. Six experiments were carried out to assess the synthesizer perceptually. Three involve the perceptual categorization of male synthetic and human stimuli, and one the auditory discrimination between synthetic and human tokens. A fifth experiment reports the auditory discrimination between synthetic tokens with different levels of additive and modulation noise, and a sixth the scoring by expert listeners of male synthetic stimuli on equal-appearing interval grade-roughness-breathiness (GRB) scales. A first objective is to demonstrate the ability of the synthesizer to simulate vowel sounds that are valid exemplars of speech sounds produced by humans with voice disorders. A second objective is to learn how expert raters perceptually map vocal frequency, additive and modulation noise, and vowel categories into scores on the GRB scales.
Affiliation(s)
- Samia Fraj
- Laboratory of Signals Images and Telecommunication Devices, CP 165/51, Faculty of Applied Sciences, Université Libre de Bruxelles, 50 Avenue F.-D. Roosevelt, B-1050 Brussels, Belgium
5. Schoentgen J. Spectral models of additive and modulation noise in speech and phonatory excitation signals. J Acoust Soc Am 2003;113:553-562. [PMID: 12558291] [DOI: 10.1121/1.1523384]
Abstract
The article presents spectral models of additive and modulation noise in speech. The purpose is to learn about the causes of noise in the spectra of normal and disordered voices and to gauge whether the spectral properties of the perturbations of the phonatory excitation signal can be inferred from the spectral properties of the speech signal. The approach to modeling consists of deducing the Fourier series of the perturbed speech, assuming that the Fourier series of the noise and of the clean monocycle-periodic excitation are known. The models explain published data, take into account the effects of supraglottal tremor, demonstrate the modulation distortion owing to vocal tract filtering, establish conditions under which noise cues of different speech signals may be compared, and predict the impossibility of inferring the spectral properties of the frequency modulating noise from the spectral properties of the frequency modulation noise (e.g., phonatory jitter and frequency tremor). The general conclusion is that only phonatory frequency modulation noise is spectrally relevant. Other types of noise in speech are either epiphenomenal, or their spectral effects are masked by the spectral effects of frequency modulation noise.
Affiliation(s)
- Jean Schoentgen
- Laboratory of Experimental Phonetics, Université Libre de Bruxelles, 50 Av. F-D. Roosevelt, B-1050 Brussels, Belgium.
6. Schoentgen J. Stochastic models of jitter. J Acoust Soc Am 2001;109:1631-1650. [PMID: 11325133] [DOI: 10.1121/1.1350557]
Abstract
This study presents stochastic models of jitter. Jitter designates small, random, involuntary perturbations of the glottal cycle lengths. Jitter is a base-line phenomenon that may be observed in all voiced speech sounds. Knowledge of its properties is therefore relevant to the acoustic modeling, analysis, and synthesis of voice quality. Also, models of jitter are conceptual frameworks that enable experimenters and clinicians to distinguish jitter in particular from aperiodic cycle length patterns in general. Vocal jitter is modeled by means of the ribbon model of the glottal vibration combined with stochastic models of the disturbances of the instantaneous frequency. The disturbance model comprises correlation-free noise and vocal microtremor. Properties of jitter that are simulated are the stochasticity, stationarity, and normality of the decorrelated cycle length perturbations, the size of decorrelated jitter, the correlation between the perturbations of neighboring glottal cycles, the modulation level and modulation frequency owing to microtremor, the asynchrony between external disturbances and glottal cycles, the dependence of the size of jitter on the average glottal cycle length, and the relation between jitter and laryngeal pathologies. Modeled jitter is discussed in the light of measured jitter, as well as the physiological and statistical plausibility of the model parameters.
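The disturbance model described above — correlation-free noise plus a slow microtremor modulating the cycle lengths — can be caricatured in a few lines. All parameter names and values below are illustrative, not the paper's:

```python
import numpy as np

def jittered_cycles(f0_hz=120.0, n=200, jitter_pct=1.0,
                    tremor_hz=5.0, tremor_pct=2.0, seed=1):
    """Glottal cycle lengths (in seconds) perturbed by correlation-free
    Gaussian noise plus a slow sinusoidal microtremor."""
    rng = np.random.default_rng(seed)
    t0 = 1.0 / f0_hz                       # mean cycle length (s)
    k = np.arange(n)                       # cycle index
    tremor = (tremor_pct / 100) * np.sin(2 * np.pi * tremor_hz * k * t0)
    noise = (jitter_pct / 100) * rng.standard_normal(n)
    return t0 * (1 + tremor + noise)

def jitter_percent(cycles):
    """Mean absolute difference between consecutive cycle lengths,
    as a percentage of the mean cycle length (a common jitter measure)."""
    return 100 * np.mean(np.abs(np.diff(cycles))) / np.mean(cycles)
```

Because consecutive perturbations are independent here, the measured jitter comes out somewhat larger than `jitter_pct`; modeling correlation between neighboring cycles, as the paper does, changes that relationship.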
Affiliation(s)
- J Schoentgen
- Laboratoire de Phonétique Expérimentale, Université Libre de Bruxelles, Brussels, Belgium.
7. Schoentgen J, Bensaid M, Bucella F. Multivariate statistical analysis of flat vowel spectra with a view to characterizing dysphonic voices. J Speech Lang Hear Res 2000;43:1493-1508. [PMID: 11193968] [DOI: 10.1044/jslhr.4306.1493]
Abstract
The aim of this article is to show how dysphonic voices can be characterized by means of a multivariate statistical analysis of flat vowel spectra. The spectral contour was obtained by means of a wavelet transform of the logarithmic magnitude spectrum, which was subsequently flattened to remove interspeaker variability related to the excitation and vocal tract filter functions. The results of the statistical analysis of flat spectra were the following. Firstly, principal components analysis produced markers that separated noisy from clean spectra. Secondly, the heuristic search for harmonic peaks or interharmonic dips could be omitted. Thirdly, conventional spectral markers of noise appeared as special instances of the markers that were derived statistically. Fourthly, the levels of visually assigned hoarseness and the first two principal components were significantly correlated. The assignment of different levels of (visual) hoarseness to different vowel timbres could be explained by the variability associated with the spectral contour.
8. Wolfe VI, Martin DP, Palmer CI. Perception of dysphonic voice quality by naive listeners. J Speech Lang Hear Res 2000;43:697-705. [PMID: 10877439] [DOI: 10.1044/jslhr.4303.697]
Abstract
For clinical assessment as well as student training, there is a need for information pertaining to the perceptual dimensions of dysphonic voice. To this end, 24 naive listeners judged the similarity of 10 female and 10 male vowel samples, selected from within a narrow range of fundamental frequencies. Most of the perceptual variance for both sets of voices was associated with "degree of abnormality" as reflected by perceptual ratings as well as combined acoustic measures, based upon filtered and unfiltered signals. A second perceptual dimension for female voices was associated with high frequency noise as reflected by two acoustic measures: breathiness index (BRI) and a high-frequency power ratio. A second perceptual dimension for male voices was associated with a breathy-overtight continuum as reflected by period deviation (PDdev) and perceptual ratings of breathiness. Results are discussed in terms of perceptual training and the clinical assessment of pathological voices.
Affiliation(s)
- V I Wolfe
- Department of Communication and Dramatic Arts, Auburn University at Montgomery, AL 36117-3596, USA.
9. Murphy PJ. Perturbation-free measurement of the harmonics-to-noise ratio in voice signals using pitch synchronous harmonic analysis. J Acoust Soc Am 1999;105:2866-2881. [PMID: 10335636] [DOI: 10.1121/1.426901]
Abstract
The measurement of the harmonics-to-noise ratio (HNR) in speech signals gives an indication of the aperiodicity of the speech waveform. This may be due to the presence of jitter, shimmer, additive noise, waveshape change, or some unknown combination of these factors. In order to estimate the HNR as a measure of the additive noise component only, the contaminating effects of the other contributory components must first be removed. A pitch synchronous harmonic analysis is proposed to overcome this problem. The procedure takes advantage of the time scale compression-frequency expansion property of the Fourier series in order to eliminate jitter and shimmer. Successive spectra are added by harmonic number as opposed to frequency location, and perturbation is removed due to the fact that the relative heights of the harmonic components remain the same for scaled signals. The technique is examined on synthetically generated voice signals. A discussion of the results is given in terms of human voice signals, characterization of jitter, vocal tract filtering effects, perturbation mechanisms, nonlinear dynamics, and the possibility of developing the method for use with inverse filtering strategies.
Affiliation(s)
- P J Murphy
- Department of Physics, Royal College of Surgeons in Ireland, Dublin, Ireland
10. Fisher KV, Scherer RC, Guo CG, Owen AS. Longitudinal phonatory characteristics after botulinum toxin type A injection. J Speech Hear Res 1996;39:968-980. [PMID: 8898251] [DOI: 10.1044/jshr.3905.968]
Abstract
Following Botulinum Toxin Type A injection, glottal competency of an adductor spasmodic dysphonia patient is thought to vary over a wide range. This study quantifies variability in laryngeal adduction for one such patient over a 10-week period. Analyses of kinematic and aerodynamic measures were used to track the voice weekly. The measures included the electroglottographic waveform width (EGGW50), nondimensional electroglottographic slope quotient (SLQ), glottal flow open quotient (FOQ), dc glottal flow, and nondimensional glottal flow peak quotient (FPQ). The results suggested that change in degree of glottal adduction over time can be observed even when vocal instability is present within each recording session. Perceptual ratings of vocal quality (breathy to pressed) were related to the laryngeal measures. The coefficient of variation for EGGW50 and the percentage of dichrotic phonations reached minima during sessions with predominantly breathy and hypoadducted phonation. The methods used in this study show potential to aid decisions about dose level and sources of perceptual adductor spasmodic dysphonia symptoms for a given patient.
Affiliation(s)
- K V Fisher
- University of Oklahoma Health Sciences Center, Oklahoma City, USA.
11. de Krom G. Some spectral correlates of pathological breathy and rough voice quality for different types of vowel fragments. J Speech Hear Res 1995;38:794-811. [PMID: 7474973] [DOI: 10.1044/jshr.3804.794]
Abstract
This study deals with the relation between listeners' ratings of pathological breathiness and roughness and certain characteristics of the voice spectrum. Two general research questions were addressed: First, which spectral parameters may serve as useful predictors of breathiness and roughness? Second, does the type of speech fragment used for analysis have an effect on the obtained regression model? Listener ratings of breathiness and roughness were obtained for three types of vowel fragments: a vowel onset segment, a mid-vowel (post-onset) segment, and a vowel segment covering the onset and the acoustically more stable post-onset parts. Results indicated that the harmonics-to-noise ratio was the best single predictor of both rated breathiness and roughness, explaining up to 54% of the true rating variance. By combining different predictors, between 75% and 80% of the breathiness variance could be explained for all three types of fragments. For roughness, a strong effect of fragment type was observed, with most variance explained in vowel onset fragments (71%), and least in post-onset fragments (52%). The effect of fragment type was also observed when regression analyses were performed with six predictors based on a factor analysis of the acoustic data.
Affiliation(s)
- G de Krom
- Research Institute for Language and Speech, University of Utrecht, The Netherlands
12. de Krom G. A cepstrum-based technique for determining a harmonics-to-noise ratio in speech signals. J Speech Hear Res 1993;36:254-266. [PMID: 8487518] [DOI: 10.1044/jshr.3602.254]
Abstract
A new method to calculate a spectral harmonics-to-noise ratio (HNR) in speech signals is presented. The method involves discrimination between harmonic and noise energy in the magnitude spectrum by means of a comb-liftering operation in the cepstrum domain. Sensitivity of HNR to (a) additive noise and (b) jitter was tested with synthetic vowel-like signals, generated at 10 fundamental frequencies. All jitter and noise signals were analyzed at three window lengths in order to investigate the effect of the length of the analysis frame on the estimated HNR values. Results of a multiple linear regression analysis with noise or jitter, F0, and window length as predictors for HNR indicate a major effect of both noise and jitter on HNR, in that HNR decreases almost linearly with increasing noise levels or increasing jitter. The influence of F0 and window length on HNR is small for the jittered signals, but HNR increases considerably with increasing F0 or window length for the noise signals. We conclude that the method seems to be a valid technique for determining the amount of spectral noise, because it is almost linearly sensitive to both noise and jitter for a large part of the noise or jitter continuum. The strong negative relation between HNR and jitter illustrates that spectral noise measures cannot simply be taken as indicators of the actual amount of noise in the time signal. Instead, HNR integrates several aspects of the acoustic stability of the signal. As such, HNR may be a useful parameter in the analysis of voice quality, although it cannot be directly interpreted in terms of underlying glottal events or perceptual characteristics.
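The comb-liftering idea can be sketched as follows: take the real cepstrum of the log magnitude spectrum, zero out the rahmonics (cepstral peaks at multiples of the pitch period), and treat the liftered spectrum as an estimate of the noise floor. This is a simplified caricature, not the paper's exact procedure; windowing, notch widths, and the energy summation are assumptions here:

```python
import numpy as np

def cepstral_hnr(x, fs, f0):
    """Toy cepstrum-based harmonics-to-noise ratio (dB) for a voiced
    frame `x` sampled at `fs` Hz with fundamental frequency `f0` Hz."""
    n = len(x)
    logmag = np.log(np.abs(np.fft.rfft(x * np.hanning(n))) + 1e-12)
    ceps = np.fft.irfft(logmag)            # real cepstrum (length n, n even)
    period = int(round(fs / f0))           # pitch period in samples
    lift = ceps.copy()
    w = max(1, period // 4)                # half-width of each lifter notch
    q = period
    while q <= len(lift) // 2:
        lift[q - w : q + w + 1] = 0.0                           # rahmonic
        lift[len(lift) - q - w : len(lift) - q + w + 1] = 0.0   # mirror half
        q += period
    noise_log = np.fft.rfft(lift).real     # liftered spectrum ~ noise floor
    e_total = np.sum(np.exp(2 * logmag))
    e_noise = np.sum(np.exp(2 * noise_log))
    return 10 * np.log10(max(e_total - e_noise, 1e-12) / e_noise)
```

As the abstract notes, such a measure reacts to jitter as well as additive noise, so a higher value for a cleaner signal should be read as "more acoustically stable", not literally "less additive noise".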
Affiliation(s)
- G de Krom
- Research Institute for Language and Speech, University of Utrecht, The Netherlands
13. Green DC, Berke GS, Ward PH. Vocal fold medialization by surgical augmentation versus arytenoid adduction in the in vivo canine model. Ann Otol Rhinol Laryngol 1991;100:280-287. [PMID: 2018285] [DOI: 10.1177/000348949110000404]
Abstract
There are a variety of methods for treating unilateral vocal cord paralysis, but to date there have been few studies that compare these phonosurgical techniques by using objective measures of voice improvement. Vocal efficiency is an objective voice measure that is defined as the ratio of the acoustic power produced by the larynx to the subglottic air power. Vocal efficiency has been found to decrease with glottic disorders such as vocal cord paralysis and carcinoma. This study compared the effects of vocal fold medialization by surgical augmentation to those of arytenoid adduction on the vocal efficiency, videostroboscopy, and acoustics (jitter, shimmer, and signal-to-noise ratio) of a simulated unilateral vocal cord paralysis in an in vivo canine model. Arytenoid adduction was superior to surgical augmentation in vocal efficiency, traveling wave motion, and acoustics.
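Vocal efficiency as defined above is a simple ratio. Taking subglottic air power as mean subglottic pressure times mean airflow, a toy calculation looks like this (units and example values are illustrative only, not from the study):

```python
def subglottic_power(pressure_pa: float, flow_m3_s: float) -> float:
    """Aerodynamic (subglottic) air power in watts:
    mean subglottic pressure (Pa) x mean airflow (m^3/s)."""
    return pressure_pa * flow_m3_s

def vocal_efficiency(acoustic_power_w: float,
                     pressure_pa: float, flow_m3_s: float) -> float:
    """Ratio of radiated acoustic power to subglottic air power
    (dimensionless; often quoted as a percentage or in dB)."""
    return acoustic_power_w / subglottic_power(pressure_pa, flow_m3_s)

# e.g. 1 mW of acoustic power from 800 Pa driving 0.2 L/s of airflow
efficiency = vocal_efficiency(0.001, 800.0, 0.0002)
```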
Affiliation(s)
- D C Green
- Division of Head and Neck Surgery, University of California, Los Angeles, CA 90024
15. Smith ME, Berke GS. The effects of phonosurgery on laryngeal vibration: Part I. Theoretic considerations. Otolaryngol Head Neck Surg 1990;103:380-390. [PMID: 2122367] [DOI: 10.1177/019459989010300308]
Abstract
Surgical manipulation of the laryngeal framework (phonosurgery) is rapidly gaining interest and attention. To date, however, a comparative objective evaluation of the various phonosurgical techniques has not been reported. A theoretic model of the larynx, a four-mass model based on the work of Ishizaka (J Acoust Soc Am 1976;60:1193-8) and Koizumi et al. (J Acoust Soc Am 1987;82:1179-92), was developed and adapted to simulate laryngeal biomechanical behavior, as understood by current research. The model was then applied to a comparative evaluation of phonosurgical techniques. Input parameters that correlate laryngeal function and model simulation were developed. Surgical procedures were categorized according to their effect on these parameters. A model simulation of these techniques allowed comparison and prediction of the results of phonosurgery and a better understanding of the issues involved with surgical alteration of the voice.
Affiliation(s)
- M E Smith
- Division of Head and Neck Surgery, University of California, Los Angeles
16. Habermann G. Functional disorders of the voice and their treatment [article in German]. Arch Otorhinolaryngol 1980;227:171-345. [PMID: 7469925] [DOI: 10.1007/bf00456373]
17. Schultz-Coulon HJ. Diagnosis of dysfunction of the voice [article in German]. Arch Otorhinolaryngol 1980;227:1-169. [PMID: 7469924] [DOI: 10.1007/bf00456372]