1
Contribution of Vocal Tract and Glottal Source Spectral Cues in the Generation of Acted Happy and Aggressive Spanish Vowels. APPLIED SCIENCES-BASEL 2022. [DOI: 10.3390/app12042055] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/10/2022]
Abstract
The source-filter model is one of the main techniques applied to speech analysis and synthesis. Recent advances in voice production by means of three-dimensional (3D) source-filter models have overcome several limitations of classic one-dimensional techniques. Despite preliminary attempts to improve the expressiveness of 3D-generated voices, they are still far from achieving realistic results. Towards this goal, this work analyses the contribution of both the vocal tract (VT) and the glottal source spectral (GSS) cues in the generation of happy and aggressive speech through a GlottDNN-based analysis-by-synthesis methodology. Paired neutral-expressive utterances are parameterised to generate different combinations of expressive vowels, applying the target expressive GSS and/or VT cues on the neutral vowels after transplanting the expressive prosody on these utterances. The objective tests, focused on the Spanish vowels [a], [i] and [u], show that both GSS and VT cues significantly reduce the spectral distance to the expressive target. The results from the perceptual test show that VT cues make a statistically significant contribution in the expression of happy and aggressive emotions for [a] vowels, while the GSS contribution is significant in [i] and [u] vowels.
2
Xia M, Cao S, Zhou R, Wang JY, Xu TY, Zhou ZK, Qian YM, Jiang H. Acoustic features as novel predictors of difficult laryngoscopy in orthognathic surgery: an observational study. ANNALS OF TRANSLATIONAL MEDICINE 2021; 9:1466. [PMID: 34734018 PMCID: PMC8506731 DOI: 10.21037/atm-21-4359] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 08/03/2021] [Accepted: 09/07/2021] [Indexed: 01/19/2023]
Abstract
Background: The evaluation of difficult intubation is an important process before anaesthesia. Unanticipated difficult intubation is associated with morbidity and mortality. This study aimed to determine whether acoustic features are valuable as an alternative method to predict difficult laryngoscopy (DL) in patients scheduled to undergo orthognathic surgery. Methods: This study included 225 adult patients who were undergoing elective orthognathic surgery under general anaesthesia with tracheal intubation. Preoperatively, clinical airway evaluation was performed, and the acoustic data were collected. Twelve phonemes {[a], [o], [e], [i], [u], [ü], [ci], [qi], [chi], [le], [ke], and [en]} were recorded, and their formants (f1-f4) and bandwidths (bw1-bw4) were extracted. Difficult laryngoscopy was defined as direct laryngoscopy with a Cormack-Lehane grade of 3 or 4. Univariate and multivariate logistic regression analyses were used to examine the associations between acoustic features and DL. Results: Difficult laryngoscopy was reported in 59/225 (26.2%) patients. The area under the curve (AUC) of the backward stepwise model including en_f2 [odds ratio (OR), 0.996; 95% confidence interval (CI), 0.994–0.999; P=0.006], ci_bw4 (OR, 0.997; 95% CI, 0.993–1.000; P=0.057), qi_bw4 (OR, 0.996; 95% CI, 0.993–0.999; P=0.017), le_f3 (OR, 0.998; 95% CI, 0.996–1.000; P=0.079), o_bw4 (OR, 1.001; 95% CI, 1.000–1.003; P=0.014), chi_f4 (OR, 1.003; 95% CI, 1.000–1.005; P=0.041), and a_bw4 (OR, 0.999; 95% CI, 0.998–1.000; P=0.078) was 0.761 in the training set and 0.709 in the testing set. The sensitivity and specificity of the model in the testing set were 86.7% and 63.0%, respectively. Conclusions: Acoustic features may be considered as useful predictors of DL during orthognathic surgery.
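The odds ratios in this abstract are reported per unit of each acoustic feature, so values close to 1 can still correspond to sizeable effects over realistic feature ranges. As a rough illustration only, assuming en_f2 is measured in Hz (the abstract does not state the unit) and using a purely hypothetical 100-unit difference, the following sketch rescales the per-unit OR and rechecks the reported DL incidence:

```python
import math


def scale_odds_ratio(or_per_unit: float, delta: float) -> float:
    """Odds ratio for a change of `delta` units, given the per-unit OR.

    In logistic regression the log-odds are linear in the predictor,
    so the OR for `delta` units equals or_per_unit ** delta.
    """
    return math.exp(delta * math.log(or_per_unit))


# Reported in the abstract: OR = 0.996 per unit of en_f2.
# Assumption: the unit is Hz; the 100-unit step is illustrative only.
or_100 = scale_odds_ratio(0.996, 100)
print(f"OR per 100 units of en_f2: {or_100:.2f}")

# Reported incidence of difficult laryngoscopy: 59 of 225 patients.
incidence = 59 / 225
print(f"DL incidence: {incidence:.1%}")
```

Under these assumptions, a per-unit OR of 0.996 corresponds to roughly a one-third reduction in the odds of DL over 100 units, which is why the near-unity ORs are not negligible.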
Affiliation(s)
- Ming Xia
  - Department of Anaesthesiology, Shanghai Ninth People's Hospital, Shanghai Jiao Tong University School of Medicine, Shanghai, China
- Shuang Cao
  - Department of Anaesthesiology, Shanghai Ninth People's Hospital, Shanghai Jiao Tong University School of Medicine, Shanghai, China
- Ren Zhou
  - Department of Anaesthesiology, Shanghai Ninth People's Hospital, Shanghai Jiao Tong University School of Medicine, Shanghai, China
- Jia-Yi Wang
  - Department of Anaesthesiology, Shanghai Ninth People's Hospital, Shanghai Jiao Tong University School of Medicine, Shanghai, China
- Tian-Yi Xu
  - Department of Anaesthesiology, Shanghai Ninth People's Hospital, Shanghai Jiao Tong University School of Medicine, Shanghai, China
- Zhi-Kai Zhou
  - X-LANCE Lab, Department of Computer Science and Engineering, Shanghai Jiao Tong University, Shanghai, China
- Yan-Min Qian
  - X-LANCE Lab, Department of Computer Science and Engineering, Shanghai Jiao Tong University, Shanghai, China
- Hong Jiang
  - Department of Anaesthesiology, Shanghai Ninth People's Hospital, Shanghai Jiao Tong University School of Medicine, Shanghai, China
3
Falk S, Kniesburges S, Schoder S, Jakubaß B, Maurerlehner P, Echternach M, Kaltenbacher M, Döllinger M. 3D-FV-FE Aeroacoustic Larynx Model for Investigation of Functional Based Voice Disorders. Front Physiol 2021; 12:616985. [PMID: 33762964 PMCID: PMC7982522 DOI: 10.3389/fphys.2021.616985] [Citation(s) in RCA: 14] [Impact Index Per Article: 4.7] [Received: 10/13/2020] [Accepted: 02/09/2021] [Indexed: 12/02/2022]
Abstract
For the clinical analysis of underlying mechanisms of voice disorders, we developed a numerical aeroacoustic larynx model, called simVoice, that mimics commonly observed functional laryngeal disorders such as glottal insufficiency and vibrational left-right asymmetries. The model is a combination of the Finite Volume (FV) CFD solver Star-CCM+ and the Finite Element (FE) aeroacoustic solver CFS++. simVoice models turbulence using Large Eddy Simulations (LES) and the acoustic wave propagation with the perturbed convective wave equation (PCWE). Its geometry corresponds to a simplified larynx and a vocal tract model representing the vowel /a/. The oscillations of the vocal folds are externally driven. In total, 10 configurations with different degrees of functional-based disorders were simulated and analyzed. The energy transfer between the glottal airflow and the vocal folds decreases with increasing glottal insufficiency and potentially reflects the higher effort during speech for affected patients. This loss of energy transfer may also have an essential influence on the quality of the sound signal as expressed by decreasing sound pressure level (SPL), Cepstral Peak Prominence (CPP), and Vocal Efficiency (VE). Asymmetry in the vocal fold oscillations also reduces the quality of the sound signal. However, simVoice confirmed previous clinical and experimental observations that a high level of glottal insufficiency worsens the acoustic signal quality more than oscillatory left-right asymmetry. Both symptoms in combination will further reduce the quality of the sound signal. In summary, simVoice allows for detailed analysis of the origins of disordered voice production and hence fosters the further understanding of laryngeal physiology, including occurring dependencies. A current walltime of 10 h/cycle is, with a prospective increase in computing power, auspicious for a future clinical use of simVoice.
Affiliation(s)
- Sebastian Falk
  - Division of Phoniatrics and Pediatric Audiology, Department of Otorhinolaryngology, Head & Neck Surgery, Friedrich-Alexander-University Erlangen-Nürnberg, Erlangen, Germany
- Stefan Kniesburges
  - Division of Phoniatrics and Pediatric Audiology, Department of Otorhinolaryngology, Head & Neck Surgery, Friedrich-Alexander-University Erlangen-Nürnberg, Erlangen, Germany
- Stefan Schoder
  - Institute of Fundamentals and Theory in Electrical Engineering, Division Vibro- and Aeroacoustics, Graz University of Technology, Graz, Austria
- Bernhard Jakubaß
  - Division of Phoniatrics and Pediatric Audiology, Department of Otorhinolaryngology, Head & Neck Surgery, Friedrich-Alexander-University Erlangen-Nürnberg, Erlangen, Germany
- Paul Maurerlehner
  - Institute of Fundamentals and Theory in Electrical Engineering, Division Vibro- and Aeroacoustics, Graz University of Technology, Graz, Austria
- Matthias Echternach
  - Division of Phoniatrics and Pediatric Audiology, Department of Otorhinolaryngology, Munich University Hospital (LMU), Munich, Germany
- Manfred Kaltenbacher
  - Institute of Fundamentals and Theory in Electrical Engineering, Division Vibro- and Aeroacoustics, Graz University of Technology, Graz, Austria
- Michael Döllinger
  - Division of Phoniatrics and Pediatric Audiology, Department of Otorhinolaryngology, Head & Neck Surgery, Friedrich-Alexander-University Erlangen-Nürnberg, Erlangen, Germany
4
Dabbaghchian S, Arnela M, Engwall O, Guasch O. Simulation of vowel-vowel utterances using a 3D biomechanical-acoustic model. INTERNATIONAL JOURNAL FOR NUMERICAL METHODS IN BIOMEDICAL ENGINEERING 2021; 37:e3407. [PMID: 33070445 DOI: 10.1002/cnm.3407] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 04/07/2020] [Revised: 09/17/2020] [Accepted: 09/28/2020] [Indexed: 06/11/2023]
Abstract
A link is established between biomechanical and acoustic 3D models for the numerical simulation of vowel-vowel utterances. The former rely on the activation and contraction of relevant muscles for voice production, which displace and distort speech organs. However, biomechanical models do not provide a closed computational domain of the 3D vocal tract airway in which to simulate sound wave propagation. An algorithm is thus proposed to extract the vocal tract boundary from the surrounding anatomical structures at each time step of the transition between vowels. The resulting 3D geometries are fed into a 3D finite element acoustic model that solves the mixed wave equation for the acoustic pressure and particle velocity. An arbitrary Lagrangian-Eulerian framework is considered to account for the evolving vocal tract. Examples include six static vowels and three dynamic vowel-vowel utterances. Plausible muscle activation patterns are first determined for the static vowel sounds following an inverse method. Dynamic utterances are then generated by linearly interpolating the muscle activation of the static vowels. Results exhibit a nonlinear trajectory of the vocal tract geometry, similar to that observed in electromagnetic midsagittal articulography. Clear differences are observed when comparing the generated sound with that obtained from direct linear interpolation of the vocal tract geometry, that is, interpolation between the starting and ending vocal tract geometries of an utterance without resorting to any biomechanical model.
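The abstract describes dynamic utterances as a linear interpolation of the static vowels' muscle activations. A minimal sketch of that one step, assuming each vowel is represented by a set of activation levels in [0, 1]; the muscle names, numeric values, and function name below are hypothetical placeholders and are not taken from the paper:

```python
from typing import Dict, List

Activation = Dict[str, float]


def interpolate_activations(start: Activation, end: Activation,
                            n_steps: int) -> List[Activation]:
    """Linearly interpolate muscle activations from `start` to `end`.

    Returns n_steps + 1 frames, including both endpoints.
    """
    if n_steps < 1:
        raise ValueError("n_steps must be at least 1")
    if start.keys() != end.keys():
        raise ValueError("both vowels must define the same muscles")
    frames = []
    for k in range(n_steps + 1):
        t = k / n_steps  # normalised time in [0, 1]
        frames.append({m: (1 - t) * start[m] + t * end[m] for m in start})
    return frames


# Hypothetical activation levels for two static vowels.
vowel_a = {"genioglossus_posterior": 0.0, "hyoglossus": 0.5}
vowel_i = {"genioglossus_posterior": 1.0, "hyoglossus": 0.0}

trajectory = interpolate_activations(vowel_a, vowel_i, n_steps=4)
for frame in trajectory:
    print(frame)
```

In the workflow the abstract outlines, each interpolated activation frame would drive the biomechanical model, so the resulting vocal tract geometry need not vary linearly even though the activations do.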
Affiliation(s)
- Saeed Dabbaghchian
  - Department of Speech, Music, and Hearing, KTH Royal Institute of Technology, Stockholm, Sweden
- Marc Arnela
  - GTM Grup de recerca en Tecnologies Mèdia, La Salle-Universitat Ramon Llull, Barcelona, Spain
- Olov Engwall
  - Department of Speech, Music, and Hearing, KTH Royal Institute of Technology, Stockholm, Sweden
- Oriol Guasch
  - GTM Grup de recerca en Tecnologies Mèdia, La Salle-Universitat Ramon Llull, Barcelona, Spain
5
Tabain M, Kochetov A, Beare R. An ultrasound and formant study of manner contrasts at four coronal places of articulation. THE JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA 2020; 148:3195. [PMID: 33261411 DOI: 10.1121/10.0002486] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/30/2020] [Accepted: 10/20/2020] [Indexed: 06/12/2023]
Abstract
This study examines consonant manner of articulation at four coronal places of articulation, using ultrasound and formant analyses of the Australian language Arrernte. Stop, nasal, and lateral articulations are examined at the dental, alveolar, retroflex, and alveo-palatal places of articulation: /t̪ n̪ l̪ / vs /t n l/ vs /ʈɳɭ/ vs /c ɲ ʎ/. Ultrasound data clearly show a more retracted tongue root for the lateral, and a more advanced tongue root for the nasal, as compared to the stop. However, the magnitude of the differences is much greater for the stop∼lateral contrast than for the stop∼nasal contrast. Acoustic results show clear effects on F1 in the adjacent vowels, in particular the preceding vowel, with F1 lower adjacent to nasals and higher adjacent to laterals, as compared to stops. Correlations between the articulatory and acoustic data are particularly strong for this formant. However, the retroflex place of articulation shows effects according to manner for higher formants as well, suggesting that a better understanding of retroflex acoustics for different manners of articulation is required. The study also suggests that articulatory symmetry and gestural economy are affected by the size of the phonemic inventory.
6
Vampola T, Horáček J, Radolf V, Švec JG, Laukkanen AM. Influence of nasal cavities on voice quality: Computer simulations and experiments. THE JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA 2020; 148:3218. [PMID: 33261400 DOI: 10.1121/10.0002487] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 05/31/2020] [Accepted: 10/20/2020] [Indexed: 06/12/2023]
Abstract
Nasal cavities are known to introduce antiresonances (dips) in the sound spectrum, reducing the acoustic power of the voice. In this study, a three-dimensional (3D) finite element (FE) model of the vocal tract (VT) of one female subject was created for vowels [a:] and [i:] without and with a detailed model of nasal cavities based on computed tomography (CT) images. The 3D FE models were then used for analyzing the resonances, antiresonances and the acoustic pressure response spectra of the VT. The computed results were compared with the measurements of a VT model for the vowel [a:], obtained from the FE model by 3D printing. Nasality mainly affects the lowest formant frequency and decreases its peak level. The results confirm the main effect of nasalization, i.e., that it decreases the sound pressure level in the frequency region of the formants F1-F2 and emphasizes the frequency region of the formants F3-F5 around the singer's formant cluster. Additionally, many internal local resonances in the nasal and paranasal cavities were found in the 3D FE model. Their effect on the acoustic output was found to be minimal, but accelerometer measurements on the walls of the 3D-printed model suggested they could contribute to structure vibrations.
Affiliation(s)
- Tomáš Vampola
  - Department of Mechanics, Biomechanics and Mechatronics, Faculty of Mechanical Engineering, Czech Technical University in Prague, Technická 4, 160 00 Prague 6, Czech Republic
- Jaromír Horáček
  - Institute of Thermomechanics, Academy of Science of the Czech Republic, Dolejškova 5, 182 00 Prague 8, Czech Republic
- Vojtěch Radolf
  - Institute of Thermomechanics, Academy of Science of the Czech Republic, Dolejškova 5, 182 00 Prague 8, Czech Republic
- Jan G Švec
  - Voice Research Lab, Department of Biophysics, Faculty of Science, Palacky University Olomouc, Tr. Svobody 26, 771 46 Olomouc, Czech Republic
- Anne-Maria Laukkanen
  - Speech and Voice Research Laboratory, Faculty of Social Sciences, Tampere University, Virta, Åkerlundinkatu 5, 33100 Tampere, Finland
7
Brandner M, Blandin R, Frank M, Sontacchi A. A pilot study on the influence of mouth configuration and torso on singing voice directivity. THE JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA 2020; 148:1169. [PMID: 33003835 DOI: 10.1121/10.0001736] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Indexed: 05/15/2023]
Abstract
Directivity of speech and singing is determined primarily by the morphology of a person, i.e., head size, torso dimensions, posture, and vocal tract. Previous works have suggested from measurements that voice directivity in singing is controlled unintentionally by spectral emphasis in the range of 2-4 kHz. An attempt is made to identify to what extent voice directivity is affected by the mouth configuration and the torso. Therefore, simulations, together with measurements that investigate voice directivity in more detail, are presented. Simulations are presented for a piston in an infinite baffle, a radiating spherical cap, and an extended spherical cap model, taking into account transverse propagation modes. Measurements of a classical singer, an amateur singer, and a head and torso simulator are undertaken simultaneously in the horizontal and vertical planes. In order to assess differences in voice directivity, common metrics, e.g., horizontal and vertical directivity indexes, are discussed and compared to improved alternatives. The measurements and simulations reveal that voice directivity in singing is affected if the mouth opening is changed significantly. The measurements show that the torso generates side lobes due to diffraction and reflections at frequencies related to the torso's dimensions.
Affiliation(s)
- Manuel Brandner
  - Institute of Electronic Music and Acoustics, University of Music and Performing Arts Graz, Graz 8010, Austria
- Remi Blandin
  - Institute of Acoustics and Speech Communication, Technical University of Dresden, Dresden 01062, Germany
- Matthias Frank
  - Institute of Electronic Music and Acoustics, University of Music and Performing Arts Graz, Graz 8010, Austria
- Alois Sontacchi
  - Institute of Electronic Music and Acoustics, University of Music and Performing Arts Graz, Graz 8010, Austria
8
Tabain M, Butcher A, Breen G, Beare R. A formant study of the alveolar versus retroflex contrast in three Central Australian languages: Stop, nasal, and lateral manners of articulation. THE JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA 2020; 147:2745. [PMID: 32359243 DOI: 10.1121/10.0001012] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/19/2018] [Accepted: 06/10/2019] [Indexed: 06/11/2023]
Abstract
This study presents formant transition data from 21 speakers for the apical alveolar∼retroflex contrast in three neighbouring Central Australian languages: Arrernte, Pitjantjatjara, and Warlpiri. The contrast is examined for three manners of articulation, stop, nasal, and lateral (/t ∼ ʈ/, /n ∼ ɳ/, and /l ∼ ɭ/), and three vowel contexts /a i u/. As expected, results show that a lower F3 and F4 in the preceding vowel signal a retroflex consonant; and that the alveolar∼retroflex contrast is most clearly realized in the context of an /a/ vowel, and least clearly realized in the context of an /i/ vowel. Results also show that the contrast is most clearly realized for the stop manner of articulation. These results provide an acoustic basis for the greater typological rarity of retroflex nasals and laterals as compared to stops. It is suggested that possible nasalization of the preceding vowel accounts for the poorer nasal consonant results, and that articulatory constraints on lateral consonant production account for the poorer lateral consonant results. Importantly, differences are noticed between speakers, and it is suggested that literacy plays a major role in maintenance of this marginal phonemic contrast.
Affiliation(s)
- Gavan Breen
  - Institute for Aboriginal Development, Alice Springs, Australia
- Richard Beare
  - Monash University, and Murdoch Children's Research Institute, Melbourne, Australia
9
Pont A, Guasch O, Arnela M. Finite element generation of sibilants /s/ and /z/ using random distributions of Kirchhoff vortices. INTERNATIONAL JOURNAL FOR NUMERICAL METHODS IN BIOMEDICAL ENGINEERING 2020; 36:e3302. [PMID: 31883313 DOI: 10.1002/cnm.3302] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 02/26/2019] [Revised: 09/20/2019] [Accepted: 12/20/2019] [Indexed: 06/10/2023]
Abstract
The numerical simulation of sibilant sounds in three-dimensional realistic vocal tracts constitutes a challenging problem because it involves a wide range of turbulent flow scales. Rotating eddies generate acoustic waves whose wavelengths are inversely proportional to the local flow Mach number. If that is low, very fine meshes are required to capture the flow dynamics. In standard hybrid computational aeroacoustics (CAA), where the incompressible Navier-Stokes equations are first solved to get a source term that is then input into an acoustic wave equation, this implies resorting to supercomputer facilities. As a consequence, only very short time intervals of the sibilant can be produced, which may be enough for its spectral characterization but insufficient to synthesize, for instance, an audio file from it or a syllable sound. In this work, we propose to substitute the aeroacoustic source term obtained from the computational fluid dynamics (CFD) in the first step of hybrid CAA with a random distribution of Kirchhoff's spinning vortices, located in the region between the upper incisors and the lower lip. In this way, one only needs to solve a linear wave equation to generate a sibilant, and therefore avoids the costly large-scale computations. We show that our proposal can recover the outcomes of hybrid CAA simulations on average, and that it can be applied to generate sibilants /s/ and /z/. Modeling and implementation details of the Kirchhoff vortex distribution in a stabilized finite element code are discussed in the paper, as well as the outcomes of the simulations.
Affiliation(s)
- Arnau Pont
  - GTM Grup de Recerca en Tecnologies Mèdia, La Salle-Universitat Ramon Llull, Barcelona, Spain
- Oriol Guasch
  - GTM Grup de Recerca en Tecnologies Mèdia, La Salle-Universitat Ramon Llull, Barcelona, Spain
- Marc Arnela
  - GTM Grup de Recerca en Tecnologies Mèdia, La Salle-Universitat Ramon Llull, Barcelona, Spain
10
Evaluation of the association between voice formants and difficult facemask ventilation. Eur J Anaesthesiol 2019; 36:972-973. [PMID: 31688299 DOI: 10.1097/eja.0000000000001108] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Indexed: 11/26/2022]
11
Schickhofer L, Mihaescu M. Analysis of the aerodynamic sound of speech through static vocal tract models of various glottal shapes. J Biomech 2019; 99:109484. [PMID: 31761432 DOI: 10.1016/j.jbiomech.2019.109484] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.8] [Received: 03/14/2019] [Revised: 10/30/2019] [Accepted: 10/30/2019] [Indexed: 11/25/2022]
Abstract
The acoustic spectrum of our voice can be divided into harmonic and inharmonic sound components. While the harmonic components, generated by the oscillatory motion of the vocal folds, are well described by reduced-order speech models, the accurate computation of the inharmonic components requires high-order flow simulations, which predict the vortex shedding and turbulent structures present in the shear layers of the glottal jet. This study characterizes the dominant frequencies in the unsteady flow of the intra- and supraglottal region. A realistic vocal tract geometry obtained through magnetic resonance imaging (MRI) is applied for the numerical domain, which is locally modified to account for different convergent and divergent glottal angles. Both time-averaged and fluctuating values of the flow variables are computed and their distribution at various glottal shapes is compared. The impact of the registered modes in the unsteady flow on the acoustic far field is computed through direct compressible flow simulations. Furthermore, acoustic analogies are applied to localize the sources of the aerodynamically generated sound.
Affiliation(s)
- Lukas Schickhofer
  - Department of Mechanics, Linné FLOW Centre, KTH Royal Institute of Technology, Stockholm SE-10044, Sweden
- Mihai Mihaescu
  - Department of Mechanics, Linné FLOW Centre, KTH Royal Institute of Technology, Stockholm SE-10044, Sweden
12
Glottal Source Contribution to Higher Order Modes in the Finite Element Synthesis of Vowels. APPLIED SCIENCES-BASEL 2019. [DOI: 10.3390/app9214535] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.6] [Indexed: 11/17/2022]
Abstract
Articulatory speech synthesis has long been based on one-dimensional (1D) approaches. They assume plane wave propagation within the vocal tract and disregard higher order modes that typically appear above 5 kHz. However, such modes may be relevant in obtaining a more natural voice, especially for phonation types with significant high frequency energy (HFE) content. This work studies the contribution of the glottal source at high frequencies in the 3D numerical synthesis of vowels. The spoken vocal range is explored using an LF (Liljencrants–Fant) model enhanced with aspiration noise and controlled by the Rd glottal shape parameter. The vowels [ɑ], [i], and [u] are generated with a finite element method (FEM) using realistic 3D vocal tract geometries obtained from magnetic resonance imaging (MRI), as well as simplified straight vocal tracts of a circular cross-sectional area. The symmetry of the latter prevents the onset of higher order modes. Thus, the comparison between realistic and simplified geometries enables us to analyse the influence of such modes. The simulations indicate that higher order modes may be perceptually relevant, particularly for tense phonations (lower Rd values) and/or high fundamental frequency (F0) values. Conversely, vowels with a lax phonation and/or low F0s may result in inaudible HFE levels, especially if aspiration noise is not considered in the glottal source model.
13
Carvalho CC, Silva DM, de Carvalho Junior AD, Santos Neto JM, Rio BR, Neto CN, Orange FA. Pre-operative voice evaluation as a hypothetical predictor of difficult laryngoscopy. Anaesthesia 2019; 74:1147-1152. [DOI: 10.1111/anae.14732] [Citation(s) in RCA: 13] [Impact Index Per Article: 2.6] [Accepted: 05/11/2019] [Indexed: 12/12/2022]
Affiliation(s)
- C. C. Carvalho
  - Instituto de Medicina Integral Prof. Fernando Figueira (IMIP), Recife, Pernambuco, Brazil
- D. M. Silva
  - Hospital das Clínicas de Pernambuco, Recife, Pernambuco, Brazil
- B. R. Rio
  - Hospital das Clínicas de Pernambuco, Recife, Pernambuco, Brazil
- C. N. Neto
  - Instituto Dante Pazzanese de Cardiologia, São Paulo, Brazil
- F. A. Orange
  - Instituto de Medicina Integral Prof. Fernando Figueira (IMIP), Recife, Pernambuco, Brazil
14
Schickhofer L, Malinen J, Mihaescu M. Compressible flow simulations of voiced speech using rigid vocal tract geometries acquired by MRI. THE JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA 2019; 145:2049. [PMID: 31046346 DOI: 10.1121/1.5095250] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.2] [Received: 07/04/2018] [Accepted: 03/07/2019] [Indexed: 05/27/2023]
Abstract
Voiced speech consists mainly of the source signal that is frequency weighted by the acoustic filtering of the upper airways and vortex-induced sound through perturbation in the flow field. This study investigates the flow instabilities leading to vortex shedding and the importance of coherent structures in the supraglottal region downstream of the vocal folds for the far-field sound signal. Large eddy simulations of the compressible airflow through the glottal constriction are performed in realistic geometries obtained from three-dimensional magnetic resonance imaging data. Intermittent flow separation through the glottis is shown to introduce unsteady surface pressure through impingement of vortices. Additionally, dominant flow instabilities develop in the shear layer associated with the glottal jet. The aerodynamic perturbations in the near field and the acoustic signal in the far field are examined by means of spatial and temporal Fourier analysis. Furthermore, the acoustic sources due to the unsteady supraglottal flow are identified with the aid of surface spectra, and critical regions of amplification of the dominant frequencies of the investigated vowel geometries are identified.
Affiliation(s)
- Lukas Schickhofer
  - Department of Mechanics, Linné FLOW Centre, KTH Royal Institute of Technology, Stockholm, SE-10044, Sweden
- Jarmo Malinen
  - Department of Mathematics and Systems Analysis, Aalto University, Aalto, FI-00076, Finland
- Mihai Mihaescu
  - Department of Mechanics, Linné FLOW Centre, KTH Royal Institute of Technology, Stockholm, SE-10044, Sweden
15
Dabbaghchian S, Arnela M, Engwall O, Guasch O. Reconstruction of vocal tract geometries from biomechanical simulations. INTERNATIONAL JOURNAL FOR NUMERICAL METHODS IN BIOMEDICAL ENGINEERING 2019; 35:e3159. [PMID: 30242981 PMCID: PMC6587943 DOI: 10.1002/cnm.3159] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Received: 02/17/2018] [Revised: 09/10/2018] [Accepted: 09/17/2018] [Indexed: 06/08/2023]
Abstract
Medical imaging techniques are usually utilized to acquire the vocal tract geometry in 3D, which may then be used, e.g., for acoustic/fluid simulation. As an alternative, such a geometry may also be acquired from a biomechanical simulation, which allows the anatomy and/or articulation to be altered to study a variety of configurations. In a biomechanical model, each physical structure is described by its geometry and its properties (such as mass, stiffness, and muscles). In such a model, the vocal tract itself does not have an explicit representation, since it is a cavity rather than a physical structure. Instead, its geometry is defined implicitly by all the structures surrounding the cavity, and such an implicit representation may not be suitable for visualization or for acoustic/fluid simulation. In this work, we propose a method to reconstruct the vocal tract geometry at each time step during the biomechanical simulation. The complexity of the problem, which arises from model alignment artifacts, is addressed by the proposed method. In addition to the main cavity, other small cavities, including the piriform fossa, the sublingual cavity, and the interdental space, can be reconstructed. These cavities may appear or disappear depending on the position of the larynx, the mandible, and the tongue. To illustrate our method, various static and temporal geometries of the vocal tract are reconstructed and visualized. As a proof of concept, the reconstructed geometries of three cardinal vowels are further used in an acoustic simulation, and the corresponding transfer functions are derived.
Affiliation(s)
- Saeed Dabbaghchian
  - Department of Speech, Music, and Hearing, KTH Royal Institute of Technology, Stockholm, Sweden
- Marc Arnela
  - GTM Grup de recerca en Tecnologies Mèdia, La Salle-Universitat Ramon Llull, Barcelona, Spain
- Olov Engwall
  - Department of Speech, Music, and Hearing, KTH Royal Institute of Technology, Stockholm, Sweden
- Oriol Guasch
  - GTM Grup de recerca en Tecnologies Mèdia, La Salle-Universitat Ramon Llull, Barcelona, Spain
16
Pont A, Guasch O, Baiges J, Codina R, van Hirtum A. Computational aeroacoustics to identify sound sources in the generation of sibilant /s/. INTERNATIONAL JOURNAL FOR NUMERICAL METHODS IN BIOMEDICAL ENGINEERING 2019; 35:e3153. [PMID: 30203927 DOI: 10.1002/cnm.3153] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.6] [Received: 03/23/2018] [Revised: 08/31/2018] [Accepted: 09/05/2018] [Indexed: 06/08/2023]
Abstract
A sibilant fricative /s/ is generated when the turbulent jet in the narrow channel between the tongue blade and the hard palate is deflected downwards through the space between the upper and lower incisors and then impinges on the space between the lower incisors and the lower lip. The flow eddies in that region become a source of direct aerodynamic sound, which is also diffracted by the speech articulators and radiated outwards. The numerical simulation of these phenomena is complex. The spectrum of an /s/ typically peaks between 4 and 10 kHz, which implies that very fine computational meshes are needed to capture the eddies producing such high frequencies. In this work, a large-scale computation of the aeroacoustics of /s/ has been performed for a realistic vocal tract geometry, resorting to two different acoustic analogies. A stabilized finite element method that acts as a large eddy simulation model has been adopted to solve the flow dynamics. Also, a numerical strategy has been implemented that allows the determination, in a single computational run, of the separate contribution of the sound diffracted by the upper incisors from the overall radiated sound. Results are presented for points located close to the lip opening, showing the relative influence of the upper teeth depending on frequency.
Affiliation(s)
- Arnau Pont
  - Centre Internacional de Mètodes Numèrics en Enginyeria, Universitat Politècnica de Catalunya, Barcelona, Spain
  - GTM-Grup de recerca en Tecnologies Mèdia, La Salle-Universitat Ramon Llull, Barcelona, Spain
- Oriol Guasch
  - GTM-Grup de recerca en Tecnologies Mèdia, La Salle-Universitat Ramon Llull, Barcelona, Spain
- Joan Baiges
  - Departament d'Enginyeria Civil i Ambiental, Universitat Politècnica de Catalunya, Barcelona, Spain
- Ramon Codina
  - Departament d'Enginyeria Civil i Ambiental, Universitat Politècnica de Catalunya, Barcelona, Spain