1
Shadle CH, Chen WR, Koenig LL, Preston JL. Refining and extending measures for fricative spectra, with special attention to the high-frequency range. The Journal of the Acoustical Society of America 2023; 154:1932-1944. [PMID: 37768114 PMCID: PMC10540850 DOI: 10.1121/10.0021075] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/05/2023] [Revised: 08/04/2023] [Accepted: 09/05/2023] [Indexed: 09/29/2023]
Abstract
Fricatives have noise sources that are filtered by the vocal tract and that typically possess energy over a much broader range of frequencies than observed for vowels and sonorant consonants. This paper introduces and refines fricative measurements that were designed to reflect underlying articulatory and aerodynamic conditions. These show differences in the pattern of high-frequency energy for sibilants vs non-sibilants, voiced vs voiceless fricatives, and non-sibilants differing in place of articulation. The results confirm the utility of a spectral peak measure (FM) and low-mid frequency amplitude difference (AmpD) for sibilants. Using a higher-frequency range for defining FM for female voices for alveolars is justified; a still higher range was considered and rejected. High-frequency maximum amplitude (Fh) and amplitude difference between low- and higher-frequency regions (AmpRange) capture /f-θ/ differences in English and the dynamic amplitude range over the entire spectrum. For this dataset, with spectral information up to 15 kHz, a new measure, HighLevelD, was more effective than previously used LevelD and Slope in showing changes over time within the frication. Finally, isolated words and connected speech differ. This work contributes improved measures of fricative spectra and demonstrates the necessity of including high-frequency energy in those measures.
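As a rough illustration of the kind of measure the abstract names, the sketch below finds the frequency of the strongest spectral component inside a band and the level difference between the peaks of two bands. It is not the authors' definition of FM or AmpD: the function names, the single Hann-windowed FFT, and the band edges are all assumptions chosen for the example.

```python
import numpy as np

def power_spectrum_db(signal, sample_rate):
    """Return (frequencies in Hz, level in dB) from one Hann-windowed FFT."""
    signal = np.asarray(signal, dtype=float)
    windowed = signal * np.hanning(len(signal))
    spectrum = np.fft.rfft(windowed)
    freqs = np.fft.rfftfreq(len(signal), d=1.0 / sample_rate)
    # Small offset avoids log10(0) for empty bins.
    level_db = 10.0 * np.log10(np.abs(spectrum) ** 2 + 1e-20)
    return freqs, level_db

def band_peak(freqs, level_db, lo_hz, hi_hz):
    """Frequency and level of the strongest bin within [lo_hz, hi_hz]."""
    mask = (freqs >= lo_hz) & (freqs <= hi_hz)
    idx = np.argmax(level_db[mask])
    return freqs[mask][idx], level_db[mask][idx]

def band_level_difference(freqs, level_db, low_band, mid_band):
    """Peak level in mid_band minus peak level in low_band, in dB."""
    _, low_level = band_peak(freqs, level_db, *low_band)
    _, mid_level = band_peak(freqs, level_db, *mid_band)
    return mid_level - low_level

# Synthetic test signal (not speech): a strong 6 kHz tone and a weak 1 kHz tone.
sample_rate = 44100
t = np.arange(4410) / sample_rate
demo = np.sin(2 * np.pi * 6000 * t) + 0.1 * np.sin(2 * np.pi * 1000 * t)

freqs, level_db = power_spectrum_db(demo, sample_rate)
# Hypothetical band edges, chosen only for this example.
peak_hz, _ = band_peak(freqs, level_db, 3000, 8000)
diff_db = band_level_difference(freqs, level_db, (500, 2000), (3000, 8000))
```

On real fricative recordings one would normally average several spectra rather than rely on a single FFT frame; the single frame here only keeps the sketch short.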
Affiliation(s)
- Christine H Shadle
  - Yale Child Study Center, School of Medicine, Yale University, New Haven, Connecticut 06519, USA
- Wei-Rong Chen
  - Yale Child Study Center, School of Medicine, Yale University, New Haven, Connecticut 06519, USA
- Laura L Koenig
  - Yale Child Study Center, School of Medicine, Yale University, New Haven, Connecticut 06519, USA
- Jonathan L Preston
  - Department of Communication Sciences and Disorders, Syracuse University, Syracuse, New York 13244, USA
2
Kröger BJ. Computer-Implemented Articulatory Models for Speech Production: A Review. Front Robot AI 2022; 9:796739. [PMID: 35494539 PMCID: PMC9040071 DOI: 10.3389/frobt.2022.796739] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/17/2021] [Accepted: 02/21/2022] [Indexed: 11/24/2022]
Abstract
Modeling speech production and speech articulation is still an evolving research topic. Some current core questions are: What is the underlying (neural) organization for controlling speech articulation? How can speech articulators such as the lips and tongue, and their movements, be modeled in an efficient but also biologically realistic way? How can high-quality articulatory-acoustic models be developed that lead to high-quality articulatory speech synthesis? On the one hand, computer modeling will help us to unfold underlying biological as well as acoustic-articulatory concepts of speech production; on the other hand, further modeling efforts will help us to reach the goal of high-quality articulatory-acoustic speech synthesis based on more detailed knowledge of vocal tract acoustics and speech articulation. Currently, articulatory models are not able to reach the quality level of corpus-based speech synthesis. Moreover, biomechanical and neuromuscular approaches are complex and still not usable for sentence-level speech synthesis. This paper lists many computer-implemented articulatory models and provides criteria for dividing articulatory models into different categories. A recent major research question, i.e., how to control articulatory models in a neurobiologically adequate manner, is discussed in detail. It can be concluded that there is a strong need to further develop articulatory-acoustic models in order to test quantitative neurobiologically based control concepts for speech articulation as well as to uncover the remaining details in human articulatory and acoustic signal generation. Furthermore, these efforts may help us to approach the goal of establishing high-quality articulatory-acoustic as well as neurobiologically grounded speech synthesis.
3
Vortex Formation Times in the Glottal Jet, Measured in a Scaled-Up Model. Fluids 2021; 6. [PMID: 34840965 PMCID: PMC8627194 DOI: 10.3390/fluids6110412] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/04/2022]
Abstract
In this paper, the timing of vortex formation on the glottal jet is studied using previously published velocity measurements of flow through a scaled-up model of the human vocal folds. The relative timing of the pulsatile glottal jet and the instability vortices is acoustically important since it determines the harmonic and broadband content of the voice signal. Glottis exit jet velocity time series were extracted from time-resolved planar DPIV measurements. These measurements were acquired at four glottal flow speeds (u_SS = 16.1–38 cm/s) and four glottis open times (T_o = 5.67–23.7 s), providing a Reynolds number range Re = 4100–9700 and reduced vibration frequency f* = 0.01–0.06. Exit velocity waveforms showed temporal behavior on two time scales, one that correlates to the period of vibration and another characterized by short, sharp velocity peaks (which correlate to the passage of instability vortices through the glottis exit plane). The vortex formation time, estimated by computing the time difference between subsequent peaks, was shown to be not well-correlated from one vibration cycle to the next. The principal finding is that vortex formation time depends not only on cycle phase, but varies strongly with reduced frequency of vibration. In all cases, a strong high-frequency burst of vortex motion occurs near the end of the cycle, consistent with perceptual studies using synthesized speech.
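The peak-to-peak estimate mentioned in the abstract can be sketched as follows. This is only a minimal illustration, not the authors' procedure: the neighbour-comparison peak detector, the `min_height` threshold, and the function names are assumptions, and the demo signal is synthetic.

```python
import numpy as np

def peak_times(time, velocity, min_height):
    """Times of strict local maxima in velocity that exceed min_height."""
    t = np.asarray(time, dtype=float)
    v = np.asarray(velocity, dtype=float)
    interior = v[1:-1]
    # A sample is a peak if it is larger than both neighbours and the threshold.
    is_peak = (interior > v[:-2]) & (interior > v[2:]) & (interior > min_height)
    return t[1:-1][is_peak]

def inter_peak_intervals(time, velocity, min_height):
    """Time differences between successive detected peaks."""
    return np.diff(peak_times(time, velocity, min_height))

# Synthetic exit-velocity trace: three narrow bumps at t = 0.2, 0.5 and 0.9 s.
time = np.arange(0.0, 1.0, 0.001)
velocity = sum(np.exp(-((time - c) / 0.01) ** 2) for c in (0.2, 0.5, 0.9))

intervals = inter_peak_intervals(time, velocity, min_height=0.5)
```

Measured velocity traces are noisy, so in practice the series would usually be smoothed or the peaks required to stand out by some margin before the intervals are taken.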
4
Lodermeyer A, Bagheri E, Kniesburges S, Näger C, Probst J, Döllinger M, Becker S. The mechanisms of harmonic sound generation during phonation: A multi-modal measurement-based approach. The Journal of the Acoustical Society of America 2021; 150:3485. [PMID: 34852620 DOI: 10.1121/10.0006974] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 02/18/2021] [Accepted: 10/11/2021] [Indexed: 06/13/2023]
Abstract
Sound generation during voiced speech remains an open research topic because the underlying process within the human larynx is hardly accessible for direct measurements. In the present study, harmonic sound generation during phonation was investigated with a model that replicates the fully coupled fluid-structure-acoustic interaction (FSAI). The FSAI was captured using a multi-modal approach by measuring the flow and acoustic source fields based on particle image velocimetry, as well as the surface velocity of the vocal folds based on laser vibrometry and high-speed imaging. Strong harmonic sources were localized near the glottis, as well as further downstream, during the presence of the supraglottal jet. The strongest harmonic content of the vocal fold surface motion was verified for the area near the glottis, which directly interacts with the glottal jet flow. Also, the acoustic back-coupling of the formant frequencies onto the harmonic oscillation of the vocal folds was verified. These findings verify that harmonic sound generation is the result of a strong interrelation between the vocal fold motion, modulated flow field, and vocal tract geometry.
Affiliation(s)
- Alexander Lodermeyer
  - Department of Process Machinery and Systems Engineering, Friedrich-Alexander University Erlangen-Nürnberg, Erlangen, 91058, Germany
- Eman Bagheri
  - Department of Process Machinery and Systems Engineering, Friedrich-Alexander University Erlangen-Nürnberg, Erlangen, 91058, Germany
- Stefan Kniesburges
  - Division of Phoniatrics and Pediatric Audiology at the Department of Otorhinolaryngology, Head and Neck Surgery, University Hospital Erlangen, Medical School, Friedrich-Alexander University Erlangen-Nürnberg, Erlangen, 91054, Germany
- Christoph Näger
  - Department of Process Machinery and Systems Engineering, Friedrich-Alexander University Erlangen-Nürnberg, Erlangen, 91058, Germany
- Judith Probst
  - Department of Process Machinery and Systems Engineering, Friedrich-Alexander University Erlangen-Nürnberg, Erlangen, 91058, Germany
- Michael Döllinger
  - Division of Phoniatrics and Pediatric Audiology at the Department of Otorhinolaryngology, Head and Neck Surgery, University Hospital Erlangen, Medical School, Friedrich-Alexander University Erlangen-Nürnberg, Erlangen, 91054, Germany
- Stefan Becker
  - Department of Process Machinery and Systems Engineering, Friedrich-Alexander University Erlangen-Nürnberg, Erlangen, 91058, Germany
5
Ringenberg H, Rogers D, Wei N, Krane M, Wei T. Phase-averaged and cycle-to-cycle analysis of jet dynamics in a scaled up vocal-fold model. Journal of Fluid Mechanics 2021; 918:A44. [PMID: 34737460 PMCID: PMC8562556 DOI: 10.1017/jfm.2021.365] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 06/13/2023]
Abstract
Phase-averaged and cycle-to-cycle analysis of key contributors to sound production in phonation is examined in a scaled-up vocal-fold model. Simultaneous temporally and spatially resolved pressure and velocity measurements permitted examination of each term in the streamwise integral momentum equation. The relative sizes of these terms were used to address the issue of whether transglottal pressure is a surrogate for vocal-fold drag, a quantity directly related to sound production. Further, time traces of transglottal pressure and volume flow rate provided insight into the role of cycle-to-cycle variations in voiced sound production which affect voice quality. Experiments were conducted using a 10× scaled-up model in a free-surface water tunnel. Two-dimensional vocal-fold models with semi-circular ends inside a square duct were driven with constant opening and closing speeds. The time from opening to closed, T_o, was half the oscillation period. Time-resolved digital particle image velocimetry (DPIV) and pressure measurements along the duct centreline were made for 3650 ≤ Re ≤ 8100 and equivalent life frequencies from 52.5 to 97.5 Hz. Results showed that transglottal pressure does serve as a surrogate for the vocal-fold drag. However, smaller but non-negligible momentum flux and inertia terms, caused by the jet and vocal-fold motions, may also contribute to vocal-fold drag. Further, cycle-to-cycle variations including jet switching and modulation are inherent in flows of this type despite their high degrees of symmetry and repeatability. The origins of these variations and their potential role in sound production and voice quality are discussed.
Affiliation(s)
- Hunter Ringenberg
  - Mechanical & Materials Eng’g, University of Nebraska-Lincoln, Lincoln, NE 68588, USA
- Dylan Rogers
  - Mechanical & Materials Eng’g, University of Nebraska-Lincoln, Lincoln, NE 68588, USA
- Nathaniel Wei
  - Mechanical & Materials Eng’g, University of Nebraska-Lincoln, Lincoln, NE 68588, USA
- Michael Krane
  - Applied Research Laboratory, Pennsylvania State University, State College, PA 16802, USA
- Timothy Wei
  - Mechanical & Materials Eng’g, University of Nebraska-Lincoln, Lincoln, NE 68588, USA
6
Titze IR. Regulation of laryngeal resistance and maximum power transfer with semi-occluded airway vocalization. The Journal of the Acoustical Society of America 2021; 149:4106. [PMID: 34241487 PMCID: PMC8205511 DOI: 10.1121/10.0005124] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Received: 10/14/2020] [Revised: 04/22/2021] [Accepted: 05/11/2021] [Indexed: 06/13/2023]
Abstract
Steady airflow resistances in semi-occluded airways as well as acoustic impedances in vocalization are quantified from the lungs to the lips. For clinical and voice training applications, the primary focus is on two airway conditions, an oral semi-occlusion and a semi-occlusion above the vocal folds. Laryngeal airflow resistance is divided into glottal airflow resistance and epilaryngeal airway resistance. Maximum aerodynamic power is transferred to the vocal tract if the glottal airflow resistance is reduced while the epilaryngeal airway resistance is increased. A semi-occlusion at the lips helps to set up this condition. For the acoustic power transfer, the epilaryngeal airway also serves to match the impedance of the source to the impedance of the vocal tract.
Affiliation(s)
- Ingo R Titze
  - Utah Center for Vocology, University of Utah, Salt Lake City, Utah 84112 USA
7
Yoshinaga T, Maekawa K, Iida A. Aeroacoustic differences between the Japanese fricatives [ɕ] and [ç]. The Journal of the Acoustical Society of America 2021; 149:2426. [PMID: 33940863 DOI: 10.1121/10.0003936] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/17/2020] [Accepted: 03/06/2021] [Indexed: 06/12/2023]
Abstract
To elucidate the linguistic similarity between the alveolo-palatal sibilant [ɕ] and palatal non-sibilant [ç] in Japanese, the aeroacoustic differences between the two consonants were explored via experimentation with participants and analysis using simplified vocal tract models. The real-time magnetic resonance imaging (rtMRI) observations of articulatory movements demonstrated that some speakers use a nearly identical place of articulation for /si/ [ɕi] and /hi/ [çi]. Simplified vocal tract models were then constructed based on the data captured by static MRI, and the model-generated synthetic sounds were compared with speaker data producing [ɕ] and [ç]. Speaker data demonstrated that the amplitude of the broadband noise of [ç] was weaker than that of [ɕ]; the characteristic peak amplitude at approximately 4 kHz was greater in [ç] than in [ɕ], although the mid-sagittal vocal tract profiles were nearly identical for three of ten subjects in the rtMRI observation. These acoustic differences were reproduced by the proposed models, with differences in the width of the coronal plane constriction and the flow rate. The results suggest the need to include constriction width and flow rate as parameters for articulatory phonetic descriptions of speech sounds.
Affiliation(s)
- Tsukasa Yoshinaga
  - Toyohashi University of Technology, 1-1 Hibarigaoka, Tempaku, Toyohashi, Aichi 441-8580, Japan
- Kikuo Maekawa
  - National Institute for Japanese Language and Linguistics, 10-2 Midoricho, Tachikawa, Tokyo 190-8561, Japan
- Akiyoshi Iida
  - Toyohashi University of Technology, 1-1 Hibarigaoka, Tempaku, Toyohashi, Aichi 441-8580, Japan
8
Pont A, Guasch O, Arnela M. Finite element generation of sibilants /s/ and /z/ using random distributions of Kirchhoff vortices. International Journal for Numerical Methods in Biomedical Engineering 2020; 36:e3302. [PMID: 31883313 DOI: 10.1002/cnm.3302] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 02/26/2019] [Revised: 09/20/2019] [Accepted: 12/20/2019] [Indexed: 06/10/2023]
Abstract
The numerical simulation of sibilant sounds in three-dimensional realistic vocal tracts constitutes a challenging problem because it involves a wide range of turbulent flow scales. Rotating eddies generate acoustic waves whose wavelengths are inversely proportional to the flow local Mach number. If that is low, very fine meshes are required to capture the flow dynamics. In standard hybrid computational aeroacoustics (CAA), where the incompressible Navier-Stokes equations are first solved to get a source term that is then input into an acoustic wave equation, this implies resorting to supercomputer facilities. As a consequence, only very short time intervals of the sibilant can be produced, which may be enough for its spectral characterization but insufficient to synthesize, for instance, an audio file from it or a syllable sound. In this work, we propose to substitute the aeroacoustic source term obtained from the computational fluid dynamics (CFD) in the first step of hybrid CAA by a random distribution of Kirchhoff's spinning vortices, located in the region between the upper incisors and the lower lip. In this way, one only needs to solve a linear wave equation to generate a sibilant, and therefore avoids the costly large-scale computations. We show that our proposal can recover the outcomes of hybrid CAA simulations on average, and that it can be applied to generate sibilants /s/ and /z/. Modeling and implementation details of the Kirchhoff vortex distribution in a stabilized finite element code are discussed in the paper, as well as the outcomes of the simulations.
Affiliation(s)
- Arnau Pont
  - GTM Grup de Recerca en Tecnologies Mèdia, La Salle-Universitat Ramon Llull, Barcelona, Spain
- Oriol Guasch
  - GTM Grup de Recerca en Tecnologies Mèdia, La Salle-Universitat Ramon Llull, Barcelona, Spain
- Marc Arnela
  - GTM Grup de Recerca en Tecnologies Mèdia, La Salle-Universitat Ramon Llull, Barcelona, Spain
9
Effects of velopharyngeal openings on flow characteristics of nasal emission. Biomech Model Mechanobiol 2020; 19:1447-1459. [PMID: 31925590 DOI: 10.1007/s10237-019-01280-9] [Citation(s) in RCA: 9] [Impact Index Per Article: 2.3] [Received: 06/26/2019] [Accepted: 12/17/2019] [Indexed: 10/25/2022]
Abstract
Nasal emission is a speech disorder where undesired airflow enters the nasal cavity during speech due to inadequate closure of the velopharyngeal valve. Nasal emission is typically inaudible with large velopharyngeal openings and very distorting with small openings. This study aims to understand how flow characteristics in the nasal cavity change as a function of velopharyngeal opening using computational fluid dynamics. The model is based on a subject who was diagnosed with distorting nasal emission and a small velopharyngeal opening. The baseline geometry was delineated from CT scans that were taken while the subject was sustaining a sibilant sound. The model was modified by systematically widening or narrowing the velopharyngeal opening while keeping the geometry constant elsewhere. Results show that if the flow resistance across the velopharyngeal valve is smaller than resistance across the oral constriction, flow characteristics such as velocity and turbulence are inversely proportional to the size of the opening. If flow resistance is higher across the velopharyngeal valve than the oral constriction, turbulence in the nasal cavity will be reduced at a higher rate. These findings can be used to generalize that the area ratio of the velopharyngeal opening to the oral constriction is a factor that determines airflow characteristics and subsequently its sound during production of a sibilant sound. This implies that the highest level of turbulence in the nasal cavity, and subsequently the sound that will likely be perceived as the most severe nasal emission, is produced when the two openings are equal in size.
10
Sundström E, Oren L. Sound production mechanisms of audible nasal emission during the sibilant /s/. The Journal of the Acoustical Society of America 2019; 146:4199. [PMID: 31893718 PMCID: PMC7043896 DOI: 10.1121/1.5135566] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.8] [Received: 05/06/2019] [Revised: 11/05/2019] [Accepted: 11/05/2019] [Indexed: 06/10/2023]
Abstract
Audible nasal emission is a speech disorder that involves undesired sound generated by airflow into the nasal cavity during production of oral sounds. This disorder is associated with small-to-medium sized velopharyngeal openings. These openings induce turbulence in the nasal cavity, which in turn produces sound. The purpose of this study is to examine the aeroacoustic mechanisms that generate turbulent sound during production of a sibilant /s/ with and without a small opening of the velopharyngeal valve. The models are based on two pediatric subjects who were diagnosed with severe audible nasal emission. The geometries were delineated from computed tomography scans taken while the subjects were sustaining a sibilant sound. Large eddy simulation with the Ffowcs Williams and Hawkings analogy was used to predict the flow behavior and its acoustic characterization. It shows that the majority of the acoustic energy is produced by surface loading, which is related to dipole sources that resonate in the nasal cavity. The quadrupole source term that is associated with the unsteady shear layers is seen to be less significant. It also shows that closure of the velopharyngeal valve changes the far-field spectrum significantly because aeroacoustic mechanisms in the nasal cavity are eliminated.
Affiliation(s)
- Elias Sundström
  - Department of Otolaryngology-Head and Neck Surgery, University of Cincinnati, 231 Albert Sabin Way, Cincinnati, Ohio 45267, USA
- Liran Oren
  - Department of Otolaryngology-Head and Neck Surgery, University of Cincinnati, 231 Albert Sabin Way, Cincinnati, Ohio 45267, USA
11
Yoshinaga T, Nozaki K, Wada S. A simplified vocal tract model for articulation of [s]: The effect of tongue tip elevation on [s]. PLoS One 2019; 14:e0223382. [PMID: 31600263 PMCID: PMC6786647 DOI: 10.1371/journal.pone.0223382] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.8] [Received: 07/08/2019] [Accepted: 09/19/2019] [Indexed: 11/18/2022]
Abstract
Fricative consonants are known to be pronounced by controlling turbulent flow inside a vocal tract. In this study, a simplified vocal tract model was proposed to investigate the characteristics of flow and sound during production of the fricative [s] in a word context. By controlling the inlet flow rate and tongue speed, the acoustic characteristics of [s] were reproduced by the model. The measurements with a microphone and a hot-wire anemometer showed that the flow velocity at the teeth gap and far-field sound pressure started oscillating before the tongue reached the /s/ position, and continued during tongue descent. This behaviour was not affected by the changes of the tongue speed. These results indicate that there is a time shift between source generation and tongue movement. This time shift can be a physical constraint in the articulation of words which include /s/. With the proposed model, we could investigate the effects of tongue speed on the flow and sound generation in a parametric way. The proposed methodology is applicable for other phonemes to further explore the aeroacoustics of phonation.
Affiliation(s)
- Kazunori Nozaki
  - Osaka University Dental Hospital, Suita, Osaka, Japan
- Shigeo Wada
  - Graduate School of Engineering Science, Osaka University, Toyonaka, Osaka, Japan
12
McPhail MJ, Campo ET, Krane MH. Aeroacoustic source characterization in a physical model of phonation. The Journal of the Acoustical Society of America 2019; 146:1230. [PMID: 31472595 PMCID: PMC6701979 DOI: 10.1121/1.5122787] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.8] [Indexed: 05/07/2023]
Abstract
This paper presents measurements conducted in a physical model of the adult human airway. The goals of this work are to (1) benchmark the physical model to excised larynx models in the literature and (2) empirically demonstrate the relationship between vocal fold drag and sound production. Results from the airway model are first benchmarked to published time-averaged behavior of excised larynx models. The airway model in this work exhibited higher glottal volume flow, lower glottal resistance, and less fundamental frequency variation than excised larynx models. Next, concurrent measurements of source behavior and radiated sound were compared. Unsteady transglottal pressure (a surrogate measure for vocal fold drag) and radiated sound, measured at the mouth, showed good correlation. In particular, the standard deviation and the ratio of the power of the first and second harmonics of the transglottal and mouth pressures were strongly correlated. This empirical result supports the assertion that vocal fold drag is the principal source of sound in phonation.
Affiliation(s)
- Michael J McPhail
  - Applied Research Laboratory, Pennsylvania State University, State College, Pennsylvania 16803, USA
- Elizabeth T Campo
  - Applied Research Laboratory, Pennsylvania State University, State College, Pennsylvania 16803, USA
- Michael H Krane
  - Applied Research Laboratory, Pennsylvania State University, State College, Pennsylvania 16803, USA
13
Sundström E, Oren L. Pharyngeal flow simulations during sibilant sound in a patient-specific model with velopharyngeal insufficiency. The Journal of the Acoustical Society of America 2019; 145:3137. [PMID: 31153316 PMCID: PMC6542651 DOI: 10.1121/1.5108889] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.2] [Received: 01/24/2019] [Revised: 04/02/2019] [Accepted: 04/30/2019] [Indexed: 06/09/2023]
Abstract
Dysfunction of the velopharyngeal valve in the human airway causes speech disorders because there is no separation between the oral and nasal cavities during normal oral speech. The speech literature hypothesizes that undesired sound is formed by turbulent flow in the nasal cavity in cases of small velopharyngeal openings. The aim is to determine the flow behavior and the sound-generating mechanism in the vocal tract using computational fluid dynamics in two patient-specific models with small and large velopharyngeal openings and contrast it with cases of complete velopharyngeal closure. The geometry for the models was reconstructed from computed tomography scans that were taken while the patients were sustaining a sibilant sound. The results for the turbulence are correlated with the broadband acoustic models of Proudman and Curle. The models show that turbulence in the vocal tract increases downstream of a constriction and that sound may be generated from it. Furthermore, most of the sound due to turbulence in the nasal cavity is governed by a dipole source where turbulence interacts with the nasal cavity walls. The generated sound power by turbulence itself in the nasal cavity (the quadrupole source) is two orders of magnitude less than the dipole source.
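To put the last figure on a more familiar scale, a power ratio can be converted to decibels. The sketch below assumes that "two orders of magnitude" means exactly a factor of 100, which the abstract does not state precisely; the function name is hypothetical.

```python
import math

def power_ratio_to_db(power, reference_power):
    """Level of `power` relative to `reference_power`, in dB (10*log10 for power)."""
    return 10.0 * math.log10(power / reference_power)

# Assumed ratio: quadrupole power is 1/100 of the dipole power.
quadrupole_re_dipole_db = power_ratio_to_db(1.0, 100.0)
```

Under that assumption the quadrupole contribution sits about 20 dB below the dipole contribution.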
Affiliation(s)
- Elias Sundström
  - Department of Otolaryngology-Head and Neck Surgery, University of Cincinnati, 231 Albert Sabin Way, Cincinnati, Ohio 45267, USA
- Liran Oren
  - Department of Otolaryngology-Head and Neck Surgery, University of Cincinnati, 231 Albert Sabin Way, Cincinnati, Ohio 45267, USA
14
Mattheus W, Brücker C. Characteristics of the pulsating jet flow through a dynamic glottal model with a lens-like constriction. Biomed Eng Lett 2019; 8:309-320. [PMID: 30603215 DOI: 10.1007/s13534-018-0075-2] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.6] [Received: 03/20/2018] [Revised: 05/09/2018] [Accepted: 05/14/2018] [Indexed: 11/28/2022]
Abstract
A computational study of the pulsating jet in a square channel with a dynamic glottal-shaped constriction is presented. It follows the model experiments of Triep and Brücker (J Acoust Soc Am 127(2):1537-1547, 2010) with the cam-driven model that replicates the dynamic glottal motion in the process of human phonation. The boundary conditions are mapped from the model experiment onto the computational model, and the three-dimensional time-resolved velocity and pressure fields are numerically calculated. This study aims to provide more details of flow separation and pressure distribution in the glottal gap and in the supraglottal flow field. Within the glottal gap a 'vena contracta' effect is generated in the mid-sagittal plane. The flow separation in the mid-coronal plane is therefore delayed to larger diffuser angles, which leads to an 'axis-switching' effect from the mid-sagittal to the mid-coronal plane. The location of flow separation in the mid-sagittal cross section moves upwards and downwards along the vocal fold surface in the streamwise direction. The generated jet shear layer forms a chain of coherent vortex structures within each glottal cycle. These vortices cause characteristic velocity and pressure fluctuations in the supraglottal region that are in the range of 10-30 times the fundamental frequency.
Affiliation(s)
- Willy Mattheus
  - Division of Phoniatrics and Audiology, Department of Otorhinolaryngology, Faculty of Medicine "Carl Gustav Carus", Technische Universität Dresden, Dresden, Germany
- Christoph Brücker
  - Department of Mechanical Engineering and Aeronautics, City University London, Northampton Square, London, UK
15
Pont A, Guasch O, Baiges J, Codina R, van Hirtum A. Computational aeroacoustics to identify sound sources in the generation of sibilant /s/. International Journal for Numerical Methods in Biomedical Engineering 2019; 35:e3153. [PMID: 30203927 DOI: 10.1002/cnm.3153] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.6] [Received: 03/23/2018] [Revised: 08/31/2018] [Accepted: 09/05/2018] [Indexed: 06/08/2023]
Abstract
A sibilant fricative /s/ is generated when the turbulent jet in the narrow channel between the tongue blade and the hard palate is deflected downwards through the space between the upper and lower incisors and then impinges the space between the lower incisors and the lower lip. The flow eddies in that region become a source of direct aerodynamic sound, which is also diffracted by the speech articulators and radiated outwards. The numerical simulation of these phenomena is complex. The spectrum of an /s/ typically peaks between 4 and 10 kHz, which implies that very fine computational meshes are needed to capture the eddies producing such high frequencies. In this work, a large-scale computation of the aeroacoustics of /s/ has been performed for a realistic vocal tract geometry, resorting to two different acoustic analogies. A stabilized finite element method that acts as a large eddy simulation model has been adopted to solve the flow dynamics. Also, a numerical strategy has been implemented that allows the determination, in a single computational run, of the separate contribution of the sound diffracted by the upper incisors from the overall radiated sound. Results are presented for points located close to the lip opening showing the relative influence of the upper teeth depending on frequency.
Affiliation(s)
- Arnau Pont
- Centre Internacional de Mètodes Numèrics en Enginyeria, Universitat Politècnica de Catalunya, Barcelona, Spain
- GTM-Grup de recerca en Tecnologies Mèdia, La Salle-Universitat Ramon Llull, Barcelona, Spain
- Oriol Guasch
- GTM-Grup de recerca en Tecnologies Mèdia, La Salle-Universitat Ramon Llull, Barcelona, Spain
- Joan Baiges
- Departament d'Enginyeria Civil i Ambiental, Universitat Politècnica de Catalunya, Barcelona, Spain
- Ramon Codina
- Departament d'Enginyeria Civil i Ambiental, Universitat Politècnica de Catalunya, Barcelona, Spain
|
16
|
Taitz A, Shalom DE, Trevisan MA. Vocal effort modulates the motor planning of short speech structures. Phys Rev E 2018; 97:052406. [PMID: 29906900 DOI: 10.1103/physreve.97.052406] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/06/2018] [Indexed: 06/08/2023]
Abstract
Speech requires programming the sequence of vocal gestures that produce the sounds of words. Here we explored the timing of this program by asking our participants to pronounce, as quickly as possible, a sequence of consonant-consonant-vowel (CCV) structures appearing on screen. We measured the delay between visual presentation and voice onset. In the case of plosive consonants, produced by sharp and well defined movements of the vocal tract, we found that delays are positively correlated with the duration of the transition between consonants. We then used a battery of statistical tests and mathematical vocal models to show that delays reflect the motor planning of CCVs and transitions are proxy indicators of the vocal effort needed to produce them. These results support the view that the effort required to produce the sequence of movements of a vocal gesture modulates the onset of the motor plan.
Affiliation(s)
- Alan Taitz
- Physics Institute of Buenos Aires (IFIBA) CONICET, Buenos Aires, Argentina
- Diego E Shalom
- Department of Physics, Universidad de Buenos Aires, Buenos Aires 1428EGA, Argentina
- Marcos A Trevisan
- Physics Institute of Buenos Aires (IFIBA) CONICET, Buenos Aires, Argentina
- Department of Physics, Universidad de Buenos Aires, Buenos Aires 1428EGA, Argentina
|
17
|
Mindlin GB. Nonlinear dynamics in the study of birdsong. CHAOS (WOODBURY, N.Y.) 2017; 27:092101. [PMID: 28964148 PMCID: PMC5605333 DOI: 10.1063/1.4986932] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.9] [Received: 06/07/2017] [Accepted: 08/17/2017] [Indexed: 06/07/2023]
Abstract
Birdsong, a rich and complex behavior, is a stellar model to understand a variety of biological problems, from motor control to learning. It also enables us to study how behavior emerges when a nervous system, a biomechanical device and the environment interact. In this review, I will show that many questions in the field can benefit from the approach of nonlinear dynamics, and how birdsong can inspire new directions for research in dynamics.
Affiliation(s)
- Gabriel B Mindlin
- Departamento de Física, FCEyN, Universidad de Buenos Aires IFIBA, CONICET, Argentina
|
18
|
Nozaki K, Maeda Y, Tamagawa H. The effect of wearing custom-made mouthguards on the aeroacoustic properties of Japanese sibilant /s/. Dent Traumatol 2012; 29:139-44. [DOI: 10.1111/j.1600-9657.2012.01140.x] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.4] [Indexed: 11/29/2022]
|
19
|
Van Hirtum A, Pelorson X, Estienne O, Bailliet H. Experimental validation of flow models for a rigid vocal tract replica. THE JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA 2011; 130:2128-2138. [PMID: 21973367 DOI: 10.1121/1.3631631] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 05/31/2023]
Abstract
Flow through the vocal tract is studied through an in vitro rigid replica for different geometrical configurations and steady flow conditions with bulk Reynolds numbers Re < 15,000. The vocal tract geometry is approximated by two consecutive obstacles, representing "tongue" and "tooth," in a rectangular channel of fixed length. For the upstream tongue obstacle with fixed constriction degree (81%) the streamwise position is varied and for the downstream obstacle the constriction degree is varied from 0% up to 96%. Different upstream pressures are considered for each geometrical configuration. Point pressure measurements at three fixed locations along the channel are experimentally assessed. In addition, the volume airflow rate is measured. The pressure distribution is estimated with a one-dimensional flow model, and the effects of different corrections to a laminar irrotational flow are assessed. The model outcome is validated against experimental data. Depending on the geometrical configuration, the best model accuracy is obtained by accounting for viscosity (needed for constriction degrees at the tooth that are small, i.e., ≤58%, or very large, i.e., ≥96%), a sudden constriction (large gap between both constrictions), or a bending geometry (narrow gap between both constrictions). Best overall model errors vary between 4% and 30% for all assessed geometrical configurations in cases where a tongue obstacle is present.
Affiliation(s)
- Annemie Van Hirtum
- GIPSA-lab, UMR CNRS 5216, Grenoble University, F-38031 Grenoble Cedex 1, France.
|
20
|
Samlan RA, Story BH. Relation of structural and vibratory kinematics of the vocal folds to two acoustic measures of breathy voice based on computational modeling. JOURNAL OF SPEECH, LANGUAGE, AND HEARING RESEARCH : JSLHR 2011; 54:1267-83. [PMID: 21498582 PMCID: PMC3184371 DOI: 10.1044/1092-4388(2011/10-0195)] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.7] [Indexed: 05/05/2023]
Abstract
PURPOSE To relate vocal fold structure and kinematics to 2 acoustic measures: cepstral peak prominence (CPP) and the amplitude of the first harmonic relative to the second (H1-H2). METHOD The authors used a computational, kinematic model of the medial surfaces of the vocal folds to specify features of vocal fold structure and vibration in a manner consistent with breathy voice. Four model parameters were altered: degree of vocal fold adduction, surface bulging, vibratory nodal point, and supraglottal constriction. CPP and H1-H2 were measured from simulated glottal area, glottal flow, and acoustic waveforms and were related to the underlying vocal fold kinematics. RESULTS CPP decreased with increased separation of the vocal processes, whereas the nodal point location had little effect. H1-H2 increased as a function of separation of the vocal processes in the range of 1.0 mm to 1.5 mm and decreased with separation > 1.5 mm. CONCLUSIONS CPP is generally a function of vocal process separation. H1*-H2* (see paragraph 6 of article text for an explanation of the asterisks) will increase or decrease with vocal process separation on the basis of vocal fold shape, pivot point for the rotational mode, and supraglottal vocal tract shape, limiting its utility as an indicator of breathy voice. Future work will relate the perception of breathiness to vocal fold kinematics and acoustic measures.
Affiliation(s)
- Robin A Samlan
- Speech Acoustics Laboratory, University of Arizona, Tucson, USA.
|
21
|
Misun V, Svancara P, Vasek M. Experimental Analysis of the Characteristics of Artificial Vocal Folds. J Voice 2011; 25:308-18. [DOI: 10.1016/j.jvoice.2009.12.002] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.3] [Received: 10/14/2009] [Accepted: 12/08/2009] [Indexed: 11/27/2022]
|
22
|
Erath BD, Plesniak MW. Impact of wall rotation on supraglottal jet stability in voiced speech. THE JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA 2011; 129:EL64-EL70. [PMID: 21428469 DOI: 10.1121/1.3533919] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.4] [Indexed: 05/30/2023]
Abstract
Supraglottal jet variability was investigated in a scaled-up flow facility incorporating driven vocal fold models with and without wall rotation. Principal component analysis was performed on the experimental supraglottal flow fields to ascertain the role of glottal wall motion in the development of the supraglottal jet. It is shown that intraglottal flow asymmetries that develop due to wall rotation are not the primary mechanism for generating large-scale cycle-to-cycle deflection of the supraglottal jet. However, wall rotation does decrease the energy content of the first mode, redistributing it to the higher modes through an increase in unstructured flow variability.
Affiliation(s)
- Byron D Erath
- Department of Mechanical and Aerospace Engineering, The George Washington University, 801 22nd Street Northwest, Washington, DC 20052, USA.
|
23
|
Krane MH, Barry M, Wei T. Dynamics of temporal variations in phonatory flow. THE JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA 2010; 128:372-83. [PMID: 20649231 PMCID: PMC2921435 DOI: 10.1121/1.3365312] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.4] [Received: 01/22/2009] [Revised: 12/29/2009] [Accepted: 02/22/2010] [Indexed: 05/07/2023]
Abstract
This paper addresses the dynamic relevance of time variations of phonatory airflow, commonly neglected under the quasisteady phonatory flow assumption. In contrast to previous efforts, which relied on direct measurement of glottal impedance, this work uses spatially and temporally resolved measurements of the velocity field to estimate the unsteady and convective acceleration terms in the unsteady Bernoulli equation. Theoretical considerations suggest that phonatory flow is inherently unsteady when two related conditions apply: (1) that the unsteady and convective accelerations are commensurate, and (2) that the inertia of the glottal jet is non-negligible. Acceleration waveforms, computed from experimental data, show the unsteady and convective accelerations to be of the same order of magnitude throughout the cycle, and show that the jet flow contributes significantly to the unsteady acceleration. In the middle of the cycle, however, jet inertia is negligible because the convective and unsteady accelerations nearly offset one another in the jet region. These results, consistent with previous findings treating quasisteady phonatory flow, emphasize that unsteady acceleration cannot be neglected during the final stages of the phonation cycle, during which voice sound power and spectral content are largely determined. Furthermore, glottal jet dynamics must be included in any model of phonatory airflow.
Affiliation(s)
- Michael H Krane
- Applied Research Laboratory, Pennsylvania State University, University Park, Pennsylvania 16804, USA.
|
24
|
Abstract
The area function of the vocal tract in all of its spatial detail is not directly computable from the speech signal. But is partial, yet phonetically distinctive, information about articulation recoverable from the acoustic signal that arrives at the listener's ear? The answer to this question is important for phonetics, because various theories of speech perception predict different answers. Some theories assume that recovery of articulatory information must be possible, while others assume that it is impossible. However, neither type of theory provides firm evidence showing that distinctive articulatory information is or is not extractable from the acoustic signal. The present study focuses on vowel gestures and examines whether linguistically significant information, such as the constriction location, constriction degree, and rounding, is contained in the speech signal, and whether such information is recoverable from formant parameters. Perturbation theory and linear prediction were combined, in a manner similar to that in Mokhtari (1998) [Mokhtari, P. (1998). An acoustic-phonetic and articulatory study of speech-speaker dichotomy. Doctoral dissertation, University of New South Wales], to assess the accuracy of recovery of information about vowel constrictions. Distinctive constriction information estimated from the speech signal for ten American English vowels was compared to the constriction information derived from simultaneously collected X-ray microbeam articulatory data for 39 speakers [Westbury (1994). X-ray microbeam speech production database user's handbook. University of Wisconsin, Madison, WI]. The recovery of distinctive articulatory information relies on a novel technique that uses formant frequencies and amplitudes, and does not depend on a principal components analysis of the articulatory data, as do most other inversion techniques. These results provide evidence that distinctive articulatory information for vowels can be recovered from the acoustic signal.
|
25
|
Sprecher A, Olszewski A, Jiang JJ, Zhang Y. Updating signal typing in voice: addition of type 4 signals. THE JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA 2010; 127:3710-3716. [PMID: 20550269 PMCID: PMC2896412 DOI: 10.1121/1.3397477] [Citation(s) in RCA: 60] [Impact Index Per Article: 4.3] [Received: 09/11/2009] [Revised: 03/22/2010] [Accepted: 03/25/2010] [Indexed: 05/29/2023]
Abstract
The addition of a fourth type of voice to Titze's voice classification scheme is proposed. This fourth voice type is characterized by primarily stochastic noise behavior and is therefore unsuitable for both perturbation and correlation dimension analysis. Forty voice samples were classified into the proposed four types using narrowband spectrograms. Acoustic, perceptual, and correlation dimension analyses were completed for all voice samples. Perturbation measures tended to increase with voice type. Based on reliability cutoffs, the type 1 and type 2 voices were considered suitable for perturbation analysis. Measures of unreliability were higher for type 3 and 4 voices. Correlation dimension analyses increased significantly with signal type as indicated by a one-way analysis of variance. Notably, correlation dimension analysis could not quantify the type 4 voices. The proposed fourth voice type represents a subset of voices dominated by noise behavior. Current measures capable of evaluating type 4 voices provide only qualitative data (spectrograms, perceptual analysis, and an infinite correlation dimension). Type 4 voices are highly complex and the development of objective measures capable of analyzing these voices remains a topic of future investigation.
Affiliation(s)
- Alicia Sprecher
- Department of Surgery, Division of Otolaryngology-Head and Neck Surgery, University of Wisconsin School of Medicine and Public Health, Madison, WI 53792, USA
|
26
|
Little MA, Costello DAE, Harries ML. Objective dysphonia quantification in vocal fold paralysis: comparing nonlinear with classical measures. J Voice 2009; 25:21-31. [PMID: 19900790 DOI: 10.1016/j.jvoice.2009.04.004] [Citation(s) in RCA: 39] [Impact Index Per Article: 2.6] [Received: 03/19/2009] [Accepted: 04/20/2009] [Indexed: 11/30/2022]
Abstract
Clinical acoustic voice-recording analysis is usually performed using classical perturbation measures, including jitter, shimmer, and noise-to-harmonic ratios (NHRs). However, restrictive mathematical limitations of these measures prevent analysis for severely dysphonic voices. Previous studies of alternative nonlinear random measures addressed wide varieties of vocal pathologies. Here, we analyze a single vocal pathology cohort, testing the performance of these alternative measures alongside classical measures. We present voice analysis pre- and postoperatively in 17 patients with unilateral vocal fold paralysis (UVFP). The patients underwent standard medialization thyroplasty surgery, and the voices were analyzed using jitter, shimmer, NHR, nonlinear recurrence period density entropy (RPDE), detrended fluctuation analysis (DFA), and correlation dimension. In addition, we similarly analyzed 11 healthy controls. Systematizing the preanalysis editing of the recordings, we found that the novel measures were more stable and, hence, reliable than the classical measures on healthy controls. RPDE and jitter are sensitive to improvements pre- to postoperation. Shimmer, NHR, and DFA showed no significant change (P>0.05). All measures detect statistically significant and clinically important differences between controls and patients, both treated and untreated (P<0.001, area under curve [AUC]>0.7). Pre- to postoperation grade, roughness, breathiness, asthenia, and strain (GRBAS) ratings show statistically significant and clinically important improvement in overall dysphonia grade (G) (AUC=0.946, P<0.001). Recalculating AUCs from other study data, we compare these results in terms of clinical importance. We conclude that, when preanalysis editing is systematized, nonlinear random measures may be useful for monitoring UVFP-treatment effectiveness, and there may be applications to other forms of dysphonia.
Affiliation(s)
- Max A Little
- Systems Analysis, Modeling and Prediction Group, University of Oxford, Oxford, United Kingdom
|
27
|
Fuchs S, Koenig LL. Simultaneous measures of electropalatography and intraoral pressure in selected voiceless lingual consonants and consonant sequences of German. THE JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA 2009; 126:1988-2001. [PMID: 19813810 DOI: 10.1121/1.3180694] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.1] [Indexed: 05/28/2023]
Abstract
This work assessed relationships among intraoral pressure (IOP), electropalatographic (EPG) measures, and consonant sequence duration, in the following obstruents, clusters, and affricates of German: /t/, /sh/, /sht/, and /tsh/. The data showed significant correlations between IOP and percentage of articulatory contact (PC) for all speakers, whereas duration and place of articulation (measured by the EPG center of gravity) contributed less to IOP changes. Speakers differed in the strength of this relationship, possibly reflecting differences in vocal tract morphology or degree of laryngeal abduction. Single-point EPG and IOP measures in fricatives showed consistent correspondences across consonantal contexts, but the relationships for the stops were more complex and reflected positional effects. Temporal compression was observed for both members of the cluster, but only the fricative portion of the affricate. Conversely, coarticulation was observed for both the stop and fricative portion of the affricate, but only for the stop portion of the cluster, possibly reflecting biomechanical constraints. No clear differences were observed in coarticulatory resistance for stops and fricatives. These data contribute to a limited literature on articulatory-aerodynamic relationships in voiceless consonants and consonant sequences, and will provide a baseline for considering longer combinations of obstruents.
Affiliation(s)
- Susanne Fuchs
- Center for General Linguistics (ZAS), Schuetzenstrasse 18, 10117 Berlin, Germany.
|
28
|
Tenhunen M, Rauhala E, Huupponen E, Saastamoinen A, Kulkas A, Himanen SL. High frequency components of tracheal sound are emphasized during prolonged flow limitation. Physiol Meas 2009; 30:467-78. [PMID: 19349649 DOI: 10.1088/0967-3334/30/5/004] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.5] [Indexed: 11/11/2022]
|
29
|
Oren L, Khosla S, Murugappan S, King R, Gutmark E. Role of Subglottal Shape in Turbulence Reduction. Ann Otol Rhinol Laryngol 2009; 118:232-40. [DOI: 10.1177/000348940911800312] [Citation(s) in RCA: 21] [Impact Index Per Article: 1.4] [Indexed: 11/16/2022]
Abstract
Objectives: In previous work, we found that airflow at the superior edge of the vocal folds, in the excised canine larynx, can be laminar even when the tracheal airflow is predominantly turbulent. Turbulent flow directly above the folds may lead to an irregular or “rough” voice. Thus, it is important to determine the mechanism of turbulence reduction. From fluid mechanics, it is known that a smoothly converging duct will reduce turbulence. In this study, we tested the hypothesis that the majority of the turbulence reduction is due to the smooth converging shape of the subglottis. Methods: In 3 excised canine larynges, hot-wire anemometry was used to measure the turbulence intensity (TI) below the cricoid cartilage and 2 to 3 mm above the superior edge of the vocal folds. Laminar flow was seen when the TI was approximately less than 2%. For our measurements, flow into the subglottis had an average TI of more than 20% (high turbulence) in the shear layer and a TI of more than 15% in the center of the jet. The larynges were tested under steady conditions (folds not phonating) with the vocal processes approximated. Results: For the center of the jet, there is moderate turbulence below the cricoid cartilage and laminar flow 2 to 3 mm above the folds. For the shear layer, there is very high turbulence below the cricoid cartilage and low turbulence 2 to 3 mm above the folds. Conclusions: The smooth converging shape of the subglottis can produce a significant reduction in turbulence. These findings may have important voice implications for operations that may change the subglottal shape (such as vocal fold medialization or airway reconstruction).
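As a rough illustration of the quantity reported in this abstract, the sketch below computes a turbulence intensity from a series of velocity samples. It assumes the common definition (RMS of the velocity fluctuations divided by the mean velocity); the abstract does not spell out the exact formula used. The function names and sample values are hypothetical, and the 2% cutoff is taken from the abstract's "approximately less than 2%" criterion.

```python
from statistics import fmean, pstdev


def turbulence_intensity(velocities):
    """Return TI = RMS velocity fluctuation / mean velocity (assumed definition)."""
    mean_u = fmean(velocities)
    if mean_u == 0:
        raise ValueError("mean velocity is zero; TI is undefined")
    # pstdev is the RMS deviation of the samples about their mean.
    return pstdev(velocities) / abs(mean_u)


def looks_laminar(ti, threshold=0.02):
    """Hypothetical helper applying the abstract's approximate 2% criterion."""
    return ti < threshold


# Made-up hot-wire samples in m/s, for illustration only.
samples = [9.0, 11.0, 9.0, 11.0]
ti = turbulence_intensity(samples)
print(f"TI = {ti:.1%}, laminar by the 2% rule: {looks_laminar(ti)}")
```

Because the cutoff is described only as approximate, it is exposed as a parameter rather than hard-coded.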
|
30
|
Krane M, Barry M, Wei T. Unsteady behavior of flow in a scaled-up vocal folds model. THE JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA 2007; 122:3659-70. [PMID: 18247773 PMCID: PMC6624077 DOI: 10.1121/1.2409485] [Citation(s) in RCA: 14] [Impact Index Per Article: 0.8] [Indexed: 05/07/2023]
Abstract
Measurements of the fluid flow through a scaled-up model of the human glottis are presented to determine whether glottal flow may be approximated as unsteady. Time- and space-resolved velocity vector fields from digital particle image velocimetry (DPIV) measurements of the flow through the gap between two moving, rigid walls are presented in four cases, over a range of Strouhal numbers: 0.010, 0.018, 0.035, 0.040, corresponding to life-scale f0 of 30, 58, 109, and 126 Hz, respectively, at a Reynolds number of 8000. It is observed that (1) glottal flow onset is delayed after glottal opening and (2) glottal flow shutoff occurs prior to closure. A comparison between flow through a fully open, nonmoving glottis and that through the moving vocal folds shows a marked difference in spatial structure of the glottal jet. The following features of the flow are seen to exhibit strong dependence on cycle frequency: (a) glottal exit plane velocity, (b) volume flow, (c) vortex shedding rates, and (d) vortex amplitude. Vortex shedding appears to be a factor both in controlling flow resistance and in cycle-to-cycle volume flow variations. All these observations strongly suggest that glottal flow is inherently unsteady.
Affiliation(s)
- Michael Krane
- Center for Advanced Information Processing, Rutgers University, Piscataway, New Jersey 08854, USA
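The Strouhal numbers and life-scale fundamental frequencies quoted in the abstract of this entry can be cross-checked with a short calculation. The sketch assumes the usual definition St = f L / U, with L a characteristic length and U a characteristic velocity. The abstract does not state which scales were used, so only the ratio f / St = U / L is computed. Variable and function names are illustrative.

```python
# (Strouhal number, life-scale f0 in Hz) pairs as quoted in the abstract.
CASES = [(0.010, 30.0), (0.018, 58.0), (0.035, 109.0), (0.040, 126.0)]


def velocity_over_length(strouhal, frequency_hz):
    """With St = f * L / U (assumed definition), f / St equals U / L in 1/s."""
    return frequency_hz / strouhal


ratios = [velocity_over_length(st, f0) for st, f0 in CASES]
for (st, f0), ratio in zip(CASES, ratios):
    print(f"St = {st:.3f}, f0 = {f0:5.1f} Hz -> U/L = {ratio:7.1f} 1/s")

# Relative spread of the implied U/L across the four cases.
spread = (max(ratios) - min(ratios)) / min(ratios)
print(f"relative spread: {spread:.1%}")
```

If all four cases share the same life-scale U / L, the ratios should agree up to the rounding of the published Strouhal numbers.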
|
31
|
Little MA, McSharry PE, Roberts SJ, Costello DAE, Moroz IM. Exploiting nonlinear recurrence and fractal scaling properties for voice disorder detection. Biomed Eng Online 2007; 6:23. [PMID: 17594480 PMCID: PMC1913514 DOI: 10.1186/1475-925x-6-23] [Citation(s) in RCA: 194] [Impact Index Per Article: 11.4] [Received: 03/06/2007] [Accepted: 06/26/2007] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND Voice disorders affect patients profoundly, and acoustic tools can potentially measure voice function objectively. Disordered sustained vowels exhibit wide-ranging phenomena, from nearly periodic to highly complex, aperiodic vibrations, and increased "breathiness". Modelling and surrogate data studies have shown significant nonlinear and non-Gaussian random properties in these sounds. Nonetheless, existing tools are limited to analysing voices displaying near periodicity, and do not account for this inherent biophysical nonlinearity and non-Gaussian randomness, often using linear signal processing methods insensitive to these properties. They do not directly measure the two main biophysical symptoms of disorder: complex nonlinear aperiodicity, and turbulent, aeroacoustic, non-Gaussian randomness. Often these tools cannot be applied to more severe disordered voices, limiting their clinical usefulness. METHODS This paper introduces two new tools to speech analysis: recurrence and fractal scaling, which overcome the range limitations of existing tools by addressing directly these two symptoms of disorder, together reproducing a "hoarseness" diagram. A simple bootstrapped classifier then uses these two features to distinguish normal from disordered voices. RESULTS On a large database of subjects with a wide variety of voice disorders, these new techniques can distinguish normal from disordered cases, using quadratic discriminant analysis, to overall correct classification performance of 91.8 +/- 2.0%. The true positive classification performance is 95.4 +/- 3.2%, and the true negative performance is 91.5 +/- 2.3% (95% confidence). This is shown to outperform all combinations of the most popular classical tools. 
CONCLUSION Given the very large number of arbitrary parameters and computational complexity of existing techniques, these new techniques are far simpler and yet achieve clinically useful classification performance using only a basic classification technique. They do so by exploiting the inherent nonlinearity and turbulent randomness in disordered voice signals. They are widely applicable to the whole range of disordered voice phenomena by design. These new measures could therefore be used for a variety of practical clinical purposes.
Affiliation(s)
- Max A Little
- Systems Analysis, Modelling and Prediction Group, Department of Engineering Science, University of Oxford, Parks Road, Oxford OX1 3PJ, UK
- Pattern Analysis Research Group, Department of Engineering Science, University of Oxford, Parks Road, Oxford OX1 3PJ, UK
- Applied Dynamical Systems Research Group, Oxford Centre for Industrial and Applied Mathematics, Mathematics Institute, University of Oxford, Oxford OX1 3JP, UK
- Patrick E McSharry
- Systems Analysis, Modelling and Prediction Group, Department of Engineering Science, University of Oxford, Parks Road, Oxford OX1 3PJ, UK
- Stephen J Roberts
- Pattern Analysis Research Group, Department of Engineering Science, University of Oxford, Parks Road, Oxford OX1 3PJ, UK
- Declan AE Costello
- Milton Keynes General Hospital, Standing Way, Eaglestone, Milton Keynes, Bucks MK6 5LD, UK
- Irene M Moroz
- Applied Dynamical Systems Research Group, Oxford Centre for Industrial and Applied Mathematics, Mathematics Institute, University of Oxford, Oxford OX1 3JP, UK
|
32
|
Birkholz P, Jackel D, Kroger BJ. Simulation of Losses Due to Turbulence in the Time-Varying Vocal System. IEEE Transactions on Audio, Speech, and Language Processing 2007. [DOI: 10.1109/tasl.2006.889731] [Citation(s) in RCA: 32] [Impact Index Per Article: 1.9] [Indexed: 11/10/2022]
|
33
|
Neubauer J, Zhang Z, Miraghaie R, Berry DA. Coherent structures of the near field flow in a self-oscillating physical model of the vocal folds. THE JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA 2007; 121:1102-18. [PMID: 17348532 DOI: 10.1121/1.2409488] [Citation(s) in RCA: 65] [Impact Index Per Article: 3.8] [Indexed: 05/09/2023]
Abstract
Current theories of voice production depend critically upon knowledge of the near field flow which emanates from the glottis. While most modern theories predict complex, three-dimensional structures in the near field flow, few investigations have attempted to quantify such structures. Using methods of flow visualization and digital particle image velocimetry, this study measured the near field flow structures immediately downstream of a self-oscillating, physical model of the vocal folds, with a vocal tract attached. A spatio-temporal analysis of the structures was performed using the method of empirical orthogonal eigenfunctions. Some of the observed flow structures included vortex generation, vortex convection, and jet flapping. The utility of such data in the future development of more accurate, low-dimensional models of voice production is discussed.
Affiliation(s)
- Jürgen Neubauer
- The Laryngeal Dynamics Laboratory, UCLA School of Medicine, 31-24 Rehabilitation Center, 1000 Veteran Ave., Los Angeles, California 90095, USA.
|
34
|
Erath BD, Plesniak MW. The occurrence of the Coanda effect in pulsatile flow through static models of the human vocal folds. THE JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA 2006; 120:1000-11. [PMID: 16938987 DOI: 10.1121/1.2213522] [Citation(s) in RCA: 38] [Impact Index Per Article: 2.1] [Indexed: 05/09/2023]
Abstract
Pulsatile flow through a one-sided diffuser and static divergent vocal-fold models is investigated to ascertain the relevance of viscous-driven flow asymmetries in the larynx. The models were 7.5 times real size, and the flow was scaled to match Reynolds and Strouhal numbers, as well as the translaryngeal pressure drop. The Reynolds number varied from 0-2000, for flow oscillation frequencies corresponding to 100 and 150 Hz life-size. Of particular interest was the development of glottal flow skewing by attachment to the bounding walls, or Coanda effect, in a pulsatile flow field, and its impact on speech. The vocal folds form a divergent passage during phases of the phonation cycle when viscous effects such as flow separation are important. It was found that for divergence angles of less than 20 degrees, the attachment of the flow to the vocal-fold walls occurred when the acceleration of the forcing function was zero, and the flow had reached maximum velocity. For a divergence angle of 40 degrees, the fully separated central jet never attached to the vocal-fold walls. Inferences are made regarding the impact of the Coanda effect on the sound source contribution in speech.
Affiliation(s)
- Byron D Erath
- School of Mechanical Engineering, Purdue University, West Lafayette, Indiana 47907, USA
|
35
|
Zhang Z, Mongeau LG. Broadband sound generation by confined pulsating jets in a mechanical model of the human larynx. THE JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA 2006; 119:3995-4005. [PMID: 16838542 DOI: 10.1121/1.2195268] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.5] [Indexed: 05/10/2023]
Abstract
Experiments were performed to study the production of broadband sound in confined pulsating jets through orifices with a time-varying area. The goal was to better understand broadband sound generation at the human glottis during voicing. The broadband component was extracted from measured sound signals by the elimination of the periodic component through ensemble averaging. Comparisons were made between the probability density functions of the broadband sound in pulsating jets and of comparable stationary jets. The results indicate that the quasi-steady approximation may be valid for the broadband component when the turbulence is well established and the turbulence kinetic energy is comparatively large. A wavelet analysis of the broadband sound showed that random sound production was modulated at the driving frequency. Two distinct sound production peaks were observed during one cycle, presumably associated firstly with jet formation and secondly with flow deceleration during orifice closing. Most high-frequency sound was produced during the closing phase. Deviations from quasi-steady behavior were observed. As the driving frequency increased, sound production during the opening phase was reduced, possibly due to the shorter time available for turbulence to develop. These results may be useful for better quality voice synthesis.
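The separation step described in this abstract, removing the periodic component by ensemble averaging, can be sketched as follows. This is only an illustration under stated assumptions, not the authors' processing chain: it assumes the signal is sampled with an integer number of samples per driving cycle and starts at a cycle boundary. The function name `split_periodic_broadband` is hypothetical.

```python
import numpy as np


def split_periodic_broadband(signal, samples_per_cycle):
    """Split a cycle-locked signal into periodic and broadband parts.

    Assumptions (not from the abstract): the driving period is exactly
    `samples_per_cycle` samples and the record starts at a cycle boundary.
    """
    signal = np.asarray(signal, dtype=float)
    n_cycles = len(signal) // samples_per_cycle
    # One row per driving cycle; any incomplete trailing cycle is dropped.
    cycles = signal[: n_cycles * samples_per_cycle].reshape(n_cycles, samples_per_cycle)
    # Ensemble average over cycles estimates the periodic component.
    periodic = cycles.mean(axis=0)
    # What remains in each cycle is treated as the broadband component.
    broadband = cycles - periodic
    return periodic, broadband


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n, cycles = 100, 50
    phase = 2 * np.pi * np.arange(n * cycles) / n
    synthetic = np.sin(phase) + 0.1 * rng.standard_normal(n * cycles)
    periodic, broadband = split_periodic_broadband(synthetic, n)
    print(periodic.shape, broadband.shape)
```

By construction the broadband residual has zero ensemble mean at every phase of the cycle, so its statistics (for example, variance as a function of phase) can then be examined separately from the periodic part.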
Affiliation(s)
- Zhaoyan Zhang
- Ray W. Herrick Laboratories, Purdue University, 140 South Intramural Drive, West Lafayette, Indiana 47907-2031, USA.
|
36
|
Peters G. Terrestrial carnivore sounds with repeated rapid alternation of two structurally different components: an indication of complex sound production mechanisms in mammals? BIOACOUSTICS 2006. [DOI: 10.1080/09524622.2006.9753561] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 10/28/2022]
|