1
Shahid MS, French AP, Valstar MF, Yakubov GE. Research in methodologies for modelling the oral cavity. Biomed Phys Eng Express 2024; 10:032001. [PMID: 38350128 DOI: 10.1088/2057-1976/ad28cc] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/15/2023] [Accepted: 02/13/2024] [Indexed: 02/15/2024]
Abstract
The paper aims to explore the current state of understanding surrounding in silico oral modelling. This involves exploring methodologies, technologies and approaches pertaining to the modelling of the whole oral cavity, including both internally and externally visible structures that may be relevant or appropriate to oral actions. Such a model could be referred to as a 'complete model', which includes consideration of a full set of facial features (i.e. not only the mouth) as well as synergistic stimuli such as audio and facial thermal data. 3D modelling technologies capable of accurately and efficiently capturing a complete representation of the mouth for an individual have broad applications in the study of oral actions, due to their cost-effectiveness and time efficiency. This review delves into the field of clinical phonetics to classify oral actions pertaining to both speech and non-speech movements, identifying how the various vocal organs play a role in the articulatory and masticatory process. Vitally, it provides a summation of 12 articulatory recording methods, forming a tool to be used by researchers in identifying which method of recording is appropriate for their work. After addressing the cost and resource-intensive limitations of existing methods, a new system of modelling is proposed that leverages external to internal correlation modelling techniques to create more efficient models of the oral cavity. The vision is that the outcomes will be applicable to a broad spectrum of oral functions related to physiology, health and wellbeing, including speech, oral processing of foods as well as dental health. Applications may range from speech correction to designing foods for the ageing population, whilst in the dental field we would be able to gain information about patients' oral actions that would become part of creating a personalised dental treatment plan.
Affiliation(s)
- Andrew P French
  - School of Computer Science, University of Nottingham, NG8 1BB, United Kingdom
  - School of Biosciences, University of Nottingham, LE12 5RD, United Kingdom
- Michel F Valstar
  - School of Computer Science, University of Nottingham, NG8 1BB, United Kingdom
- Gleb E Yakubov
  - School of Biosciences, University of Nottingham, LE12 5RD, United Kingdom
2
Shadle CH, Fulop SA, Chen WR, Whalen DH. Assessing accuracy of resonances obtained with reassigned spectrograms from the "ground truth" of physical vocal tract models. The Journal of the Acoustical Society of America 2024; 155:1253-1263. [PMID: 38341748 PMCID: PMC10858790 DOI: 10.1121/10.0024548] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/17/2023] [Revised: 01/04/2024] [Accepted: 01/06/2024] [Indexed: 02/13/2024]
Abstract
The reassigned spectrogram (RS) has emerged as the most accurate way to infer vocal tract resonances from the acoustic signal [Shadle, Nam, and Whalen (2016). "Comparing measurement errors for formants in synthetic and natural vowels," J. Acoust. Soc. Am. 139(2), 713-727]. To date, validating its accuracy has depended on formant synthesis for ground truth values of these resonances. Synthesis is easily controlled, but it has many intrinsic assumptions that do not necessarily accurately realize the acoustics in the way that physical resonances would. Here, we show that physical models of the vocal tract with derivable resonance values allow a separate approach to the ground truth, with a different range of limitations. Our three-dimensional printed vocal tract models were excited by white noise, allowing an accurate determination of the resonance frequencies. Then, sources with a range of fundamental frequencies were implemented, allowing a direct assessment of whether RS avoided the systematic bias towards the nearest strong harmonic to which other analysis techniques are prone. RS was indeed accurate at fundamental frequencies up to 300 Hz; above that, accuracy was somewhat reduced. Future directions include testing mechanical models with the dimensions of children's vocal tracts and making RS more broadly useful by automating the detection of resonances.
Affiliation(s)
- Christine H Shadle
  - Yale Child Study Center, School of Medicine, Yale University, New Haven, Connecticut 06511, USA
- Sean A Fulop
  - Department of Linguistics, Fresno State University, Fresno, California 93740, USA
- Wei-Rong Chen
  - Yale Child Study Center, School of Medicine, Yale University, New Haven, Connecticut 06511, USA
- D H Whalen
  - Yale Child Study Center, School of Medicine, Yale University, New Haven, Connecticut 06511, USA
3
Werner R, Fuchs S, Trouvain J, Kürbis S, Möbius B, Birkholz P. Acoustics of Breath Noises in Human Speech: Descriptive and Three-Dimensional Modeling Approaches. Journal of Speech, Language, and Hearing Research: JSLHR 2023:1-15. [PMID: 37971432 DOI: 10.1044/2023_jslhr-23-00112] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/19/2023]
Abstract
PURPOSE Breathing is ubiquitous in speech production, crucial for structuring speech, and a potential diagnostic indicator for respiratory diseases. However, the acoustic characteristics of speech breathing remain underresearched. This work aims to characterize the spectral properties of human inhalation noises in a large speaker sample and explore their potential similarities with speech sounds. Speech sounds are mostly realized with egressive airflow. To account for this, we investigated the effect of airflow direction (inhalation vs. exhalation) on acoustic properties of certain vocal tract (VT) configurations. METHOD To characterize human inhalation, we describe spectra of breath noises produced by human speakers from two data sets comprising 34 female and 100 male participants. To investigate the effect of airflow direction, three-dimensional-printed VT models of a male and a female speaker with static VT configurations of four vowels and four fricatives were used. An airstream was directed through these VT configurations in both directions, and their spectral consequences were analyzed. RESULTS For human inhalations, we found spectra with a decreasing slope and several weak peaks below 3 kHz. These peaks show moderate (female) to strong (male) overlap with resonances found for participants inhaling with a VT configuration of a central vowel. Results for the VT models suggest that airflow direction is crucial for spectral properties of sibilants, /ç/, and /i:/, but not the other sounds we investigated. Inhalation noise is most similar to /ə/ where airflow direction does not play a role. CONCLUSIONS Inhalation is realized on ingressive airflow, and inhalation noises have specific resonance properties that are most similar to /ə/ but occur without phonation. Airflow direction does not play a role in this specific VT configuration, but subglottal resonances may do. 
For future work, we suggest investigating the articulation of speech breathing and linking it to current work on pause postures. SUPPLEMENTAL MATERIAL https://doi.org/10.23641/asha.24520585.
Affiliation(s)
- Raphael Werner
  - Department of Language Science and Technology, Saarland University, Saarbrücken, Germany
- Susanne Fuchs
  - Leibniz-Centre General Linguistics (ZAS), Berlin, Germany
- Jürgen Trouvain
  - Department of Language Science and Technology, Saarland University, Saarbrücken, Germany
- Steffen Kürbis
  - Institute of Acoustics and Speech Communication, Technische Universität Dresden, Germany
- Bernd Möbius
  - Department of Language Science and Technology, Saarland University, Saarbrücken, Germany
- Peter Birkholz
  - Institute of Acoustics and Speech Communication, Technische Universität Dresden, Germany
4
Birkholz P, Blandin R, Kürbis S. Bandwidths of vocal tract resonances in physical models compared to transmission-line simulations. The Journal of the Acoustical Society of America 2023; 153:3281. [PMID: 37307363 DOI: 10.1121/10.0019682] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/28/2023] [Accepted: 05/25/2023] [Indexed: 06/14/2023]
Abstract
This study investigated how the bandwidths of resonances simulated by transmission-line models of the vocal tract compare to bandwidths measured from physical three-dimensional printed vowel resonators. Three types of physical resonators were examined: models with realistic vocal tract shapes based on Magnetic Resonance Imaging (MRI) data, straight axisymmetric tubes with varying cross-sectional areas, and two-tube approximations of the vocal tract with notched lips. All physical models had hard walls and closed glottis so the main loss mechanisms contributing to the bandwidths were sound radiation, viscosity, and heat conduction. These losses were accordingly included in the simulations, in two variants: A coarse approximation of the losses with frequency-independent lumped elements, and a detailed, theoretically more precise loss model. Across the examined frequency range from 0 to 5 kHz, the resonance bandwidths increased systematically from the simulations with the coarse loss model to the simulations with the detailed loss model, to the tube-shaped physical resonators, and to the MRI-based resonators. This indicates that the simulated losses, especially the commonly used approximations, underestimate the real losses in physical resonators. Hence, more realistic acoustic simulations of the vocal tract require improved models for viscous and radiation losses.
Affiliation(s)
- Peter Birkholz
  - Institute of Acoustics and Speech Communication, TU Dresden, Dresden, 01062, Germany
- Rémi Blandin
  - Institute of Acoustics and Speech Communication, TU Dresden, Dresden, 01062, Germany
- Steffen Kürbis
  - Institute of Acoustics and Speech Communication, TU Dresden, Dresden, 01062, Germany
5
Serrurier A, Neuschaefer-Rube C. Morphological and acoustic modeling of the vocal tract. The Journal of the Acoustical Society of America 2023; 153:1867. [PMID: 37002095 DOI: 10.1121/10.0017356] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/15/2022] [Accepted: 02/07/2023] [Indexed: 05/18/2023]
Abstract
In speech production, the anatomical morphology forms the substrate on which the speakers build their articulatory strategy to reach specific articulatory-acoustic goals. The aim of this study is to characterize morphological inter-speaker variability by building a shape model of the full vocal tract including hard and soft structures. Static magnetic resonance imaging data from 41 speakers articulating altogether 1947 phonemes were considered, and the midsagittal articulator contours were manually outlined. A phoneme-independent average-articulation representative of morphology was calculated as the speaker mean articulation. A principal component analysis-driven shape model was derived from average-articulations, leading to five morphological components, which explained 87% of the variance. Almost three-quarters of the variance was related to independent variations of the horizontal oral and vertical pharyngeal lengths, the latter capturing male-female differences. The three additional components captured shape variations related to head tilt and palate shape. Plane wave propagation acoustic simulations were run to characterize morphological components. A lengthening of 1 cm of the vocal tract in the vertical or horizontal directions led to a decrease in formant values of 7%-8%. Further analyses are required to analyze three-dimensional variability and to understand the morphological-acoustic relationships per phoneme. Average-articulations and model code are publicly available (https://github.com/tonioser/VTMorphologicalModel).
Affiliation(s)
- Antoine Serrurier
  - Clinic for Phoniatrics, Pedaudiology, and Communication Disorders, University Hospital and Medical Faculty of the RWTH Aachen University, 52057 Aachen, Germany
- Christiane Neuschaefer-Rube
  - Clinic for Phoniatrics, Pedaudiology, and Communication Disorders, University Hospital and Medical Faculty of the RWTH Aachen University, 52057 Aachen, Germany
6
Fleischer M, Rummel S, Stritt F, Fischer J, Bock M, Echternach M, Richter B, Traser L. Voice efficiency for different voice qualities combining experimentally derived sound signals and numerical modeling of the vocal tract. Front Physiol 2022; 13:1081622. [PMID: 36620215 PMCID: PMC9822708 DOI: 10.3389/fphys.2022.1081622] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/27/2022] [Accepted: 12/01/2022] [Indexed: 12/24/2022] Open
Abstract
Purpose: Concerning voice efficiency considerations of different singing styles, from western classical singing to contemporary commercial music, only limited data is available to date. This single-subject study attempts to quantify the acoustic sound intensity within the human glottis depending on different vocal tract configurations and vocal fold vibration. Methods: Combining Finite-Element-Models derived from 3D-MRI data, audio recordings, and electroglottography (EGG) we analyzed vocal tract transfer functions, particle velocity and acoustic pressure at the glottis, and EGG-related quantities to evaluate voice efficiency at the glottal level and resonance characteristics of different voice qualities according to Estill Voice Training®. Results: Voice qualities Opera and Belting represent highly efficient strategies but apply different vowel strategies and should thus be capable of predominating over orchestral sounds. Twang and Belting use similar vowels, but the twang vocal tract configuration enabled the occurrence of anti-resonances and was associated with reduced vocal fold contact but still partially comparable energy transfer from the glottis to the vocal tract. Speech was associated with highly efficient glottal to vocal tract energy transfer, but the absence of psychoacoustic strategies makes it more susceptible to noise interference. Falsetto and Sobbing are less efficient: Falsetto mainly due to its voice source characteristics, Sobbing due to energy loss in the vocal tract. Thus technical amplification might be appropriate here. Conclusion: Differences exist between voice qualities regarding the sound intensity, caused by different vocal tract morphologies and oscillation characteristics of the vocal folds. The combination of numerical analysis of geometries inside the human body and experimentally determined data outside sheds light on acoustical quantities at the glottal level.
Affiliation(s)
- Mario Fleischer
  - Department of Audiology and Phoniatrics, Charité – Universitätsmedizin Berlin, Corporate Member of Freie Universität Berlin and Humboldt-Universität zu Berlin, Berlin, Germany
- Fiona Stritt
  - Medical Center, Institute of Musicians’ Medicine, University of Freiburg, Freiburg, Germany
  - Faculty of Medicine, University of Freiburg, Freiburg, Germany
- Johannes Fischer
  - Faculty of Medicine, University of Freiburg, Freiburg, Germany
  - Medical Center, Department of Radiology, Medical Physics, University of Freiburg, Freiburg, Germany
- Michael Bock
  - Faculty of Medicine, University of Freiburg, Freiburg, Germany
  - Medical Center, Department of Radiology, Medical Physics, University of Freiburg, Freiburg, Germany
- Matthias Echternach
  - Department of Otorhinolaryngology, Ludwig-Maximilians-Universität München, Division of Phoniatrics and Pediatric Audiology, LMU Klinikum, Munich, Germany
- Bernhard Richter
  - Medical Center, Institute of Musicians’ Medicine, University of Freiburg, Freiburg, Germany
  - Faculty of Medicine, University of Freiburg, Freiburg, Germany
- Louisa Traser
  - Medical Center, Institute of Musicians’ Medicine, University of Freiburg, Freiburg, Germany
  - Faculty of Medicine, University of Freiburg, Freiburg, Germany
7
Meyer D, Rusho RZ, Alam W, Christensen GE, Howard DM, Atha J, Hoffman EA, Story B, Titze IR, Lingala SG. High-Resolution Three-Dimensional Hybrid MRI + Low Dose CT Vocal Tract Modeling: A Cadaveric Pilot Study. J Voice 2022. [DOI: 10.1016/j.jvoice.2022.09.013] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/07/2022]
8
Ueyama M, Takano T. A decade of CO2 flux measured by the eddy covariance method including the COVID-19 pandemic period in an urban center in Sakai, Japan. Environmental Pollution (Barking, Essex: 1987) 2022; 304:119210. [PMID: 35358629 PMCID: PMC8958160 DOI: 10.1016/j.envpol.2022.119210] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/20/2021] [Revised: 03/21/2022] [Accepted: 03/22/2022] [Indexed: 06/14/2023]
Abstract
Cities constitute an important source of greenhouse gases, but few results originating from long-term, direct CO2 emission monitoring efforts have been reported. In this study, CO2 emissions were quasi-continuously measured in an urban center in Sakai, Osaka, Japan by the eddy covariance method from 2010 to 2021. Long-term CO2 emissions reached 22.2 ± 2.0 kg CO2 m⁻² yr⁻¹ from 2010 to 2019 (± denotes the standard deviation) in the western sector from the tower representing the densely built-up area. Throughout the decade, the annual CO2 emissions remained stable. According to an emission inventory, traffic emissions represented the major source of CO2 emissions within the flux footprint. The interannual variations in the annual CO2 flux were positively correlated with the mean annual traffic counts at two highway entrances and exits. The CO2 emissions decreased suddenly, by 32% ± 3.1%, in April and May 2020 during the period in which the first state of emergency associated with COVID-19 was declared. The annual CO2 emissions also decreased by 25% ± 3.1% in 2020. Direct long-term observations of CO2 emissions comprise a useful tool to monitor future emission reductions and sudden disruptions in emissions, such as those beginning in 2020 during the COVID-19 pandemic.
Affiliation(s)
- Masahito Ueyama
  - Graduate School of Life and Environmental Sciences, Osaka Prefecture University, 1-1 Gakuen-cho, Naka-ku, Sakai, Osaka, 599-8531, Japan
- Tsugumi Takano
  - Graduate School of Life and Environmental Sciences, Osaka Prefecture University, 1-1 Gakuen-cho, Naka-ku, Sakai, Osaka, 599-8531, Japan
9
Köberlein M, Birkholz P, Burdumy M, Richter B, Burk F, Traser L, Echternach M. Investigation of resonance strategies of high pitch singing sopranos using dynamic three-dimensional magnetic resonance imaging. The Journal of the Acoustical Society of America 2021; 150:4191. [PMID: 34972262 DOI: 10.1121/10.0008903] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 07/29/2021] [Accepted: 11/10/2021] [Indexed: 06/14/2023]
Abstract
Resonance-strategies with respect to vocal registers, i.e., frequency-ranges of uniform, demarcated voice quality, for the highest part of the female voice are still not completely understood. The first and second vocal tract resonances usually determine vowels. If the fundamental frequency exceeds the vowel-shaping resonance frequencies of speech, vocal tract resonances are tuned to voice source partials. It has not yet been clarified if such tuning is applicable for the entire voice-range, particularly for the top pitches. We investigated professional sopranos who regularly sing pitches above C6 (1047 Hz). Dynamic three-dimensional (3D) magnetic resonance imaging was used to calculate resonances for pitches from C5 (523 Hz) to C7 (2093 Hz) with different vowel configurations ([a:], [i:], [u:]), and different contexts (scales or octave jumps). A spectral analysis and an acoustic analysis of 3D-printed vocal tract models were conducted. The results suggest that there is no exclusive register-defining resonance-strategy. The intersection of fundamental frequency and first vocal tract resonance was not found to necessarily indicate a register shift. The articulators and the vocal tract resonances were either kept without significant adjustments, or the fR1:fo-tuning, wherein the first vocal tract resonance enhances the fundamental frequency, was applied until F6 (1396 Hz). An fR2:fo-tuning was not observed.
Affiliation(s)
- Marie Köberlein
  - Medical Faculty of the Albert-Ludwigs-University Freiburg, Freiburg Institute for Musicians' Medicine, University Medical Center Freiburg, University of Music Freiburg, Elsässer Straße 2m, 79110, Freiburg, Germany
- Peter Birkholz
  - Institute of Acoustics and Speech Communication, Technische Universität Dresden, Germany
- Michael Burdumy
  - Department of Medical Physics, Radiology, Freiburg University Medical Center, Germany
- Bernhard Richter
  - Medical Faculty of the Albert-Ludwigs-University Freiburg, Freiburg Institute for Musicians' Medicine, University Medical Center Freiburg, University of Music Freiburg, Elsässer Straße 2m, 79110, Freiburg, Germany
- Fabian Burk
  - Department of Otorhinolaryngology, Head and Neck Surgery, University Medical Center Hamburg-Eppendorf, Hamburg, Germany
- Louisa Traser
  - Medical Faculty of the Albert-Ludwigs-University Freiburg, Freiburg Institute for Musicians' Medicine, University Medical Center Freiburg, University of Music Freiburg, Elsässer Straße 2m, 79110, Freiburg, Germany
- Matthias Echternach
  - Division of Phoniatrics and Pediatric Audiology, Department of Otorhinolaryngology, University Hospital, LMU Munich, Germany
10
Isaieva K, Laprie Y, Leclère J, Douros IK, Felblinger J, Vuissoz PA. Multimodal dataset of real-time 2D and static 3D MRI of healthy French speakers. Sci Data 2021; 8:258. [PMID: 34599194 PMCID: PMC8486854 DOI: 10.1038/s41597-021-01041-3] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 01/26/2021] [Accepted: 08/25/2021] [Indexed: 12/28/2022] Open
Abstract
The study of articulatory gestures has a wide spectrum of applications, notably in speech production and recognition. Sets of phonemes, as well as their articulation, are language-specific; however, existing MRI databases mostly include English speakers. In our present work, we introduce a dataset acquired with MRI from 10 healthy native French speakers. A corpus consisting of synthetic sentences was used to ensure a good coverage of the French phonetic context. A real-time MRI technology with temporal resolution of 20 ms was used to acquire vocal tract images of the participants speaking. The sound was recorded simultaneously with MRI, denoised and temporally aligned with the images. The speech was transcribed to obtain phoneme-wise segmentation of sound. We also acquired static 3D MR images for a wide list of French phonemes. In addition, we include annotations of spontaneous swallowing.
Measurement(s): Vocal tract images • Speech
Technology Type(s): Magnetic Resonance Imaging • Microphone Device
Sample Characteristic - Organism: Homo sapiens
Machine-accessible metadata file describing the reported data: 10.6084/m9.figshare.16404453
Affiliation(s)
- Karyna Isaieva
  - Université de Lorraine, INSERM, IADI, Nancy, F-54000, France
- Yves Laprie
  - Université de Lorraine, CNRS, Inria, LORIA, Nancy, F-54000, France
- Justine Leclère
  - Université de Lorraine, INSERM, IADI, Nancy, F-54000, France
  - Oral Medicine Department, University Hospital of Reims, 45 rue Cognacq-Jay, 51092, Reims, Cedex, France
- Ioannis K Douros
  - Université de Lorraine, INSERM, IADI, Nancy, F-54000, France
  - Université de Lorraine, CNRS, Inria, LORIA, Nancy, F-54000, France
- Jacques Felblinger
  - Université de Lorraine, INSERM, IADI, Nancy, F-54000, France
  - CIC-IT, INSERM, CHRU de Nancy, Nancy, F-54000, France
11
Yoshinaga T, Maekawa K, Iida A. Aeroacoustic differences between the Japanese fricatives [ɕ] and [ç]. The Journal of the Acoustical Society of America 2021; 149:2426. [PMID: 33940863 DOI: 10.1121/10.0003936] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/17/2020] [Accepted: 03/06/2021] [Indexed: 06/12/2023]
Abstract
To elucidate the linguistic similarity between the alveolo-palatal sibilant [ɕ] and palatal non-sibilant [ç] in Japanese, the aeroacoustic differences between the two consonants were explored via experimentation with participants and analysis using simplified vocal tract models. The real-time magnetic resonance imaging (rtMRI) observations of articulatory movements demonstrated that some speakers use a nearly identical place of articulation for /si/ [ɕi] and /hi/ [çi]. Simplified vocal tract models were then constructed based on the data captured by static MRI, and the model-generated synthetic sounds were compared with speaker data producing [ɕ] and [ç]. Speaker data demonstrated that the amplitude of the broadband noise of [ç] was weaker than that of [ɕ]; the characteristic peak amplitude at approximately 4 kHz was greater in [ç] than in [ɕ], although the mid-sagittal vocal tract profiles were nearly identical for three of ten subjects in the rtMRI observation. These acoustic differences were reproduced by the proposed models, with differences in the width of the coronal plane constriction and the flow rate. The results suggest the need to include constriction width and flow rate as parameters for articulatory phonetic descriptions of speech sounds.
Affiliation(s)
- Tsukasa Yoshinaga
  - Toyohashi University of Technology, 1-1 Hibarigaoka, Tempaku, Toyohashi, Aichi 441-8580, Japan
- Kikuo Maekawa
  - National Institute for Japanese Language and Linguistics, 10-2 Midoricho, Tachikawa, Tokyo 190-8561, Japan
- Akiyoshi Iida
  - Toyohashi University of Technology, 1-1 Hibarigaoka, Tempaku, Toyohashi, Aichi 441-8580, Japan
12
Häsner P, Prescher A, Birkholz P. Effect of wavy trachea walls on the oscillation onset pressure of silicone vocal folds. The Journal of the Acoustical Society of America 2021; 149:466. [PMID: 33514162 DOI: 10.1121/10.0003362] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 09/29/2020] [Accepted: 12/29/2020] [Indexed: 06/12/2023]
Abstract
The influence of non-smooth trachea walls on phonation onset and offset pressures and the fundamental frequency of oscillation were experimentally investigated for three different synthetic vocal fold models. Three models of the trachea were compared: a cylindrical tube (smooth walls) and wavy-walled tubes with ripple depths of 1 and 2 mm. Threshold pressures for the onset and offset of phonation were measured at the lower and upper ends of each trachea tube. All measurements were performed both with and without a supraglottal resonator. While the fundamental frequency was not affected by non-smooth trachea walls, the phonation onset and offset pressures measured right below the glottis decreased with an increasing ripple depth of the trachea walls (up to 20% for 2 mm ripples). This effect was independent from the type of glottis model and the presence of a supraglottal resonator. The pressures at the lower end of the trachea and the average volume velocities showed a tendency to decrease with an increasing ripple depth of the trachea walls but to a much smaller extent. These results indicate that the subglottal geometry and the flow conditions in the trachea can substantially affect the oscillation of synthetic vocal folds.
Affiliation(s)
- Patrick Häsner
  - Institute of Acoustics and Speech Communication, Technische Universität Dresden, Germany
- Andreas Prescher
  - Institute of Molecular and Cellular Anatomy, Aachen University Hospital, Aachen, Germany
- Peter Birkholz
  - Institute of Acoustics and Speech Communication, Technische Universität Dresden, Germany