1
Luchesi LC, Cavalcanti JC, Lucci TK, David VF, Otta E, Monticelli PF. Zygosity Effects on Human Voice: Fundamental Frequency Analysis of Brazilian Twins' Speech. Twin Res Hum Genet 2024:1-8. [PMID: 39355961 DOI: 10.1017/thg.2024.33] [Indexed: 10/03/2024]
Abstract
Voice production can be influenced by interindividual variations related to genetic, physiological, behavioral, and several environmental factors. Here we examined the effect of zygosity on speaking fundamental frequency (F0) statistical descriptors. Our aims were: (1) to determine whether the genetic similarity between monozygotic (MZ) and dizygotic (DZ) twins affects F0 characteristics, and (2) to quantify the contribution of genetic factors to these characteristics. The study involved 79 same-sex twin pairs of Brazilian Portuguese speakers, comprising 65 MZ and 14 DZ pairs, aged 18 to 66 years (31.7 ± 11.6 years), with 21 male and 58 female pairs. Participants were recorded while uttering a greeting phrase and the Brazilian Portuguese version of the 'Happy Birthday to You' song. Speech segments were analyzed using the free software Praat, and F0 measures were automatically extracted in both Hertz and semitone scales. Statistical descriptors, including centrality, dispersion, and extreme values of F0, were examined, and the ACE model (i.e., total genetic effects, A; shared environmental influences, C; and nonshared environmental influences, E) was employed to estimate the additive effects of monozygosity. As anticipated, we observed a zygosity effect on several F0 parameters, with more similarity between MZ twins compared to DZ twins. We discuss the genetic influences on F0 parameters and the absence of a monozygosity effect in two of them. Additionally, we briefly address potential biases associated with the selected measurement scale for statistical modeling. Finally, we explore the influence of genetic factors on F0 patterns, as well as environmental, life history and linguistic factors, particularly concerning F0 variation in speech.
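The two quantitative ideas in this abstract, the semitone scale and the A/C/E decomposition, can be illustrated with a minimal sketch. It is not the authors' pipeline. The 100 Hz reference frequency is an assumption, the function names are hypothetical, and Falconer's formulas are used here as a simple textbook stand-in for the fitted ACE model the abstract mentions. The twin correlations in the usage lines are made-up illustrative values, not results from the study.

```python
import math


def hz_to_semitones(f0_hz, ref_hz=100.0):
    """Convert an F0 value in Hz to semitones relative to ref_hz.

    The 100 Hz reference is an assumption for illustration only.
    """
    return 12.0 * math.log2(f0_hz / ref_hz)


def falconer_ace(r_mz, r_dz):
    """Rough A/C/E estimates from MZ and DZ twin-pair correlations.

    Falconer's approximation (a simplification, not a fitted ACE model):
    A = 2(rMZ - rDZ), C = 2*rDZ - rMZ, E = 1 - rMZ.
    """
    a = 2.0 * (r_mz - r_dz)
    c = 2.0 * r_dz - r_mz
    e = 1.0 - r_mz
    return a, c, e


# One octave above the reference is exactly 12 semitones.
print(hz_to_semitones(200.0, 100.0))  # 12.0

# Hypothetical correlations, chosen only to show the arithmetic.
a, c, e = falconer_ace(r_mz=0.8, r_dz=0.5)
print(round(a, 3), round(c, 3), round(e, 3))
```

Because the semitone scale is logarithmic, the same Hertz difference maps to a smaller semitone difference at higher F0. This is one reason the choice of scale can matter for statistical modeling.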
Affiliation(s)
- Lilian C Luchesi
  - Ethology and Bioacoustic Laboratory, Department of Psychology, Faculdade de Filosofia, Ciências e Letras de Ribeirão Preto, Universidade de São Paulo, Ribeirão Preto, São Paulo, Brazil
  - Psychoethology and Human Ethology Laboratory, Department of Experimental Psychology, Instituto de Psicologia, Universidade de São Paulo, São Paulo
- Julio C Cavalcanti
  - Integrated Acoustic Analysis and Cognition Laboratory, Pontifical Catholic University of São Paulo, Rua Ministro de Godoy, São Paulo, Brazil
  - Institute of Language Studies, Department of Linguistics, University of Campinas, Campinas, São Paulo, Brazil
  - Laboratory of Phonetics, Department of Linguistics, Stockholm University, Stockholm, Sweden
- Tania K Lucci
  - Psychoethology and Human Ethology Laboratory, Department of Experimental Psychology, Instituto de Psicologia, Universidade de São Paulo, São Paulo
- Vinicius F David
  - Psychoethology and Human Ethology Laboratory, Department of Experimental Psychology, Instituto de Psicologia, Universidade de São Paulo, São Paulo
- Emma Otta
  - Psychoethology and Human Ethology Laboratory, Department of Experimental Psychology, Instituto de Psicologia, Universidade de São Paulo, São Paulo
- Patricia F Monticelli
  - Ethology and Bioacoustic Laboratory, Department of Psychology, Faculdade de Filosofia, Ciências e Letras de Ribeirão Preto, Universidade de São Paulo, Ribeirão Preto, São Paulo, Brazil

2
Calvache C, Castillo-Triana N, Aguirre FD, Leguízamo P, Rojas S, Valenzuela P, Piedrahita MM, Ardila MDPR, Pérez DVB. Integration of Dysphagia Therapy Techniques into Voice Rehabilitation: Design and Content Validation of a Cross-Therapy Protocol. J Voice 2024:S0892-1997(24)00235-2. [PMID: 39244386 DOI: 10.1016/j.jvoice.2024.07.024] [Received: 03/13/2024] [Revised: 06/06/2024] [Accepted: 07/22/2024] [Indexed: 09/09/2024]
Abstract
BACKGROUND: The intricate relationship between swallowing and phonation, sharing anatomical and physiological substrates, underscores a clinical demand for integrated therapeutic approaches. Existing interventions often address these functions in isolation, overlooking their interconnected dynamics.
OBJECTIVE: To design and validate a cross-therapy protocol incorporating dysphagia therapy techniques (maneuvers/exercises) into voice rehabilitation. This protocol aims to exploit the shared biomechanical components of swallowing and phonation to improve both functions simultaneously in patients with underlying hypofunctional laryngeal pathology.
METHODS: A descriptive research design was employed, consisting of three phases: a comprehensive literature review and expert discussions in a German seminar format to conceptualize the protocol; detailed analysis and categorization of swallowing maneuvers/exercises; and content validation by a panel of seven experts through a structured evaluation instrument. The process integrated motor learning and exercise physiology principles to ensure the protocol's clinical applicability and theoretical coherence.
RESULTS: The developed cross-therapy protocol incorporates four core swallowing therapy techniques into voice therapy procedures. Selected swallowing therapy techniques target laryngeal excursion and vocal fold closure because they are critical components of swallowing and phonation. Expert validation yielded a Content Validity Coefficient exceeding 0.90 for most items, indicating high consensus on the protocol's relevance, clarity, and applicability. Adjustments were made based on feedback, enhancing the protocol's precision and user-friendliness.
CONCLUSION: We present a novel, evidence-based therapy protocol for voice and swallowing difficulties resulting from hypofunctional laryngeal pathology. Its development marks a significant step toward bridging the gap between swallowing and voice therapy. Future empirical studies are needed to assess its effectiveness in clinical settings.
Affiliation(s)
- Carlos Calvache
  - Corporación Universitaria Iberoamericana, Department Communication Sciences and Disorders, Bogotá, Colombia; Vocology Research, Vocology Center, Bogotá, Colombia
- Nicolás Castillo-Triana
  - Corporación Universitaria Iberoamericana, Department Communication Sciences and Disorders, Bogotá, Colombia
- Fernando Delprado Aguirre
  - Vocology Research, Vocology Center, Bogotá, Colombia; Fundación Universitaria María Cano, Speech Therapy Program, Medellín, Colombia
- Paola Leguízamo
  - Escuela Colombiana de Rehabilitación, Speech Therapy Program, Bogotá, Colombia
- Sandra Rojas
  - Escuela de Fonoaudiología, Facultad de Odontología y Ciencias de la Rehabilitación, Universidad San Sebastián, Santiago, Chile

3
Manda Y, Kodama N, Mori K, Adachi R, Matsugishi M, Minagi S. Basic characteristics of tongue pressure and electromyography generated by articulation of a syllable using the posterior part of the tongue. Sci Rep 2024; 14:20756. [PMID: 39237702 PMCID: PMC11377720 DOI: 10.1038/s41598-024-71909-y] [Received: 10/03/2023] [Accepted: 09/02/2024] [Indexed: 09/07/2024] Open
Abstract
The basic function of the tongue in pronouncing diadochokinesis and other syllables is not fully understood. This study investigates the influence of sound pressure levels and syllables on tongue pressure and muscle activity in 19 healthy adults (mean age: 28.2 years; range: 22-33 years). Tongue pressure and activity of the posterior tongue were measured using electromyography (EMG) when the velar stops /ka/, /ko/, /ga/, and /go/ were pronounced at 70, 60, 50, and 40 dB. Spearman's rank correlation revealed a significant, yet weak, positive association between tongue pressure and EMG activity (ρ = 0.14, p < 0.05). Mixed-effects model analysis showed that tongue pressure and EMG activity significantly increased at 70 dB compared to other sound pressure levels. While syllables did not significantly affect tongue pressure, the syllable /ko/ significantly increased EMG activity (coefficient = 0.048, p = 0.013). Although no significant differences in tongue pressure were observed for the velar stops /ka/, /ko/, /ga/, and /go/, it is suggested that articulation is achieved by altering the activity of both extrinsic and intrinsic tongue muscles. These findings highlight the importance of considering both tongue pressure and muscle activity when examining the physiological factors contributing to sound pressure levels during speech.
Affiliation(s)
- Yousuke Manda
  - Department of Occlusal and Oral Functional Rehabilitation, Faculty of Medicine, Dentistry and Pharmaceutical Sciences, Okayama University, 2-5-1 Shikata-cho, Kita-ku, Okayama, 700-8525, Japan
- Naoki Kodama
  - Department of Occlusal and Oral Functional Rehabilitation, Faculty of Medicine, Dentistry and Pharmaceutical Sciences, Okayama University, 2-5-1 Shikata-cho, Kita-ku, Okayama, 700-8525, Japan
- Keitaro Mori
  - Department of Occlusal and Oral Functional Rehabilitation, Faculty of Medicine, Dentistry and Pharmaceutical Sciences, Okayama University, 2-5-1 Shikata-cho, Kita-ku, Okayama, 700-8525, Japan
- Reimi Adachi
  - Department of Occlusal and Oral Functional Rehabilitation, Faculty of Medicine, Dentistry and Pharmaceutical Sciences, Okayama University, 2-5-1 Shikata-cho, Kita-ku, Okayama, 700-8525, Japan
- Makoto Matsugishi
  - Department of Occlusal and Oral Functional Rehabilitation, Faculty of Medicine, Dentistry and Pharmaceutical Sciences, Okayama University, 2-5-1 Shikata-cho, Kita-ku, Okayama, 700-8525, Japan
- Shogo Minagi
  - Department of Occlusal and Oral Functional Rehabilitation, Faculty of Medicine, Dentistry and Pharmaceutical Sciences, Okayama University, 2-5-1 Shikata-cho, Kita-ku, Okayama, 700-8525, Japan

4
Payten CL, Chiapello G, Weir KA, Madill CJ. Frameworks, Terminology and Definitions Used for the Classification of Voice Disorders: A Scoping Review. J Voice 2024; 38:1070-1087. [PMID: 35317970 DOI: 10.1016/j.jvoice.2022.02.009] [Received: 11/13/2021] [Revised: 02/03/2022] [Accepted: 02/06/2022] [Indexed: 10/18/2022]
Abstract
BACKGROUND: A challenge for clinicians and researchers in laryngology is a lack of international consensus for an agreed framework to classify homogeneous groups of voice disorders. Consistency in terminology and agreement in how conditions are classified will provide greater clarity for clinicians and researchers.
OBJECTIVE: This scoping review aimed to examine the published literature on frameworks, terminology, and criteria for the classification of voice disorders.
DESIGN: Seven online databases (MEDLINE, Embase, CINAHL, PsycInfo, Scopus, Cochrane Collaboration, Web of Science) and grey literature sources were searched. Studies published from 1940 to 2021 were included if they provided descriptive detail of a classification framework structure and described the methodological approaches to determine classification. A narrative synthesis of the main concepts including terminology, classification criteria, grouping of conditions, critical appraisal items and gaps in research was undertaken.
RESULTS: A total of 2,675 publications were screened. Twenty sources met inclusion criteria, including published articles and grey literature. Thirty-five classification groups and over 150 sub-groups were described. The classification group labels and the criteria for inclusion of conditions varied across the frameworks. Several key themes in terminology and criteria useful for classification are discussed, and a core set of suggested terms and definitions is presented.
CONCLUSIONS: The quality of research on classification frameworks for voice disorders is low and not one system encompasses all voice disorders across the whole spectrum. Continued high quality research using consensus methodology and inter-rater reliability scores is recommended to develop and test an internationally agreed classification framework for voice disorders.
Affiliation(s)
- Christopher L Payten
  - Department of Speech Pathology and Audiology, Gold Coast Health, Gold Coast University Hospital, Southport, Queensland, Australia; Faculty of Medicine and Health, Sydney School of Health Sciences, Discipline of Speech Pathology, University of Sydney, Camperdown, New South Wales, Australia
- Greg Chiapello
  - Department of Speech Pathology and Audiology, Gold Coast Health, Gold Coast University Hospital, Southport, Queensland, Australia
- Kelly A Weir
  - Department of Allied Health Research, Gold Coast Health, Gold Coast University Hospital, Southport, Queensland, Australia; Menzies Health Institute Queensland, School of Health Sciences, Griffith University, Gold Coast Campus, Southport, Queensland, Australia
- Catherine J Madill
  - Faculty of Medicine and Health, Sydney School of Health Sciences, Discipline of Speech Pathology, University of Sydney, Camperdown, New South Wales, Australia

5
Jing H, Ge H, Tang H, Weng W, Choi S, Wang C, Wang L, Cui X. Assessing respiratory airflow unsteadiness under different tidal respiratory frequencies using large eddy simulation method. Comput Biol Med 2024; 179:108834. [PMID: 38996553 DOI: 10.1016/j.compbiomed.2024.108834] [Received: 03/04/2024] [Revised: 06/11/2024] [Accepted: 06/29/2024] [Indexed: 07/14/2024]
Abstract
Unsteady respiratory airflow characteristics play a crucial role in understanding the deposition of toxic particles and inhaled aerosol drugs in the human respiratory tract. Considering the variations in respiratory flow rate and glottis motion under different respiratory frequencies, these respiratory airflow characteristics are studied by large-eddy simulations, including pressure field, power loss, modal spatial patterns, and vortex structures. Firstly, the results reveal that varying respiratory frequencies significantly affect airflow unsteadiness, turbulent evolution, and vortex structure dissipation, as they increase the complexity and butterfly effect introduced by the turbulent disturbance. Secondly, the pressure drop and flow rate at the glottis also conform to a power-law relationship considering the respiratory physiological characteristics, especially under low respiratory frequencies. Glottis motion plays different roles in energy consumption during inspiration and expiration, and its magnitude can be predicted using a polynomial function based on glottis area and respiratory flow rate under different respiratory frequencies. Finally, modal decomposition can be effectively applied to the study of respiratory flow characteristics, but we recommend studying inspiration and expiration separately. The spatial distribution of the dominant mode characterizes the majority of respiratory flow characteristics and is influenced by respiratory frequency. Spectral entropy results indicate that glottis motion and slow breathing both delay the transitions in the upper respiratory tract during inspiration and expiration. These results confirm that the respiratory physiology characteristics under different respiratory frequencies have a significant impact on the unsteady respiratory airflow characteristics and warrant further study.
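The power-law relationship between glottal pressure drop and flow rate mentioned in this abstract can be illustrated with a short sketch. The functional form Δp = a·Q^b, the function name, and the data below are assumptions made for illustration. The data are synthetic values generated from a known power law, not results from the paper, and the abstract does not give the authors' actual fitting procedure.

```python
import math


def fit_power_law(flow_rates, pressure_drops):
    """Least-squares fit of dp = a * Q**b via a straight line in log-log space.

    Taking logs gives log(dp) = log(a) + b * log(Q), so b is the slope
    and log(a) is the intercept of an ordinary linear regression.
    """
    xs = [math.log(q) for q in flow_rates]
    ys = [math.log(p) for p in pressure_drops]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = math.exp(mean_y - b * mean_x)
    return a, b


# Synthetic data from dp = 2 * Q**1.8 (hypothetical coefficients).
flows = [0.5, 1.0, 2.0, 4.0]
drops = [2.0 * q ** 1.8 for q in flows]

a, b = fit_power_law(flows, drops)
print(round(a, 6), round(b, 6))
```

With noise-free synthetic input the fit recovers the generating coefficients. With simulation or measurement data the residuals in log-log space indicate how well a single power law holds across respiratory frequencies.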
Affiliation(s)
- Hao Jing
  - School of Aerospace Engineering, Huazhong University of Science and Technology, Wuhan, 430074, China
- Haiwen Ge
  - Research Center for Intelligent Supercomputing, Zhejiang Laboratory, Hangzhou, 311101, China
- Hui Tang
  - Department of Mechanical Engineering, The Hong Kong Polytechnic University, Hong Kong, 999077, China
- Wenguo Weng
  - Institute of Public Safety Research, Department of Engineering Physics, Tsinghua University, Beijing, 100084, China
- Sanghun Choi
  - School of Mechanical Engineering, Kyungpook National University, Daegu, 41566, South Korea
- Chenglei Wang
  - Department of Mechanical Engineering, The Hong Kong Polytechnic University, Hong Kong, 999077, China
- Li Wang
  - School of Environmental Science and Engineering, Tianjin University, Tianjin, 300072, China
- Xinguang Cui
  - School of Aerospace Engineering, Huazhong University of Science and Technology, Wuhan, 430074, China

6
Zhang Z. Contribution of Undesired Medial Surface Shape to Suboptimal Voice Outcome After Medialization Laryngoplasty. J Voice 2024; 38:1220-1226. [PMID: 35410779 DOI: 10.1016/j.jvoice.2022.03.010] [Received: 02/03/2022] [Revised: 03/09/2022] [Accepted: 03/10/2022] [Indexed: 10/18/2022]
Abstract
OBJECTIVES: Voice production in pathological conditions or after surgical intervention often involves undesired medial surface shape such as reduced vertical thickness and/or left-right asymmetry in medial surface shape. The effect of such undesired medial surface on voice production remains unclear, and is often not taken into consideration during planning of surgical intervention, due to difficulty of imaging the medial surface in patients. This study aims to better understand how voice outcomes are impacted by undesired medial surface shape.
METHODS: Computational simulations were conducted to parametrically manipulate medial surface shape and stiffness and observe its consequence on voice production.
RESULTS: The results showed that undesired medial surface shape can result in incomplete glottal closure, weak voice production, increased phonation threshold, and significantly reduced vocal efficiency, particularly in the presence of left-right stiffness asymmetry.
CONCLUSIONS: In addition to approximating the vocal folds, medialization laryngoplasty should aim to sufficiently increase medial surface thickness, which may improve voice outcomes in patients whose voices remain unsatisfactory or suboptimal after initial intervention. While a divergent implant may increase medial surface thickness, precise implant placement in anticipation of tissue and implant deformation during the insertion process is equally important in order to achieve desired medial surface shape and optimal voice outcomes.
Affiliation(s)
- Zhaoyan Zhang
  - Department of Head and Neck Surgery, University of California, Los Angeles, California

7
Michaud-Dorko J, Farbos de Luzan C, Dion GR, Gutmark E, Oren L. Comparison of Aerodynamic and Elastic Properties in Tissue and Synthetic Models of Vocal Fold Vibrations. Bioengineering (Basel) 2024; 11:834. [PMID: 39199792 PMCID: PMC11351855 DOI: 10.3390/bioengineering11080834] [Received: 07/17/2024] [Revised: 08/13/2024] [Accepted: 08/14/2024] [Indexed: 09/01/2024] Open
Abstract
Three laryngeal models were used to investigate the aerodynamic and elastic properties of vocal fold vibration: cadaveric human, excised canine, and synthetic silicone vocal folds. The aim was to compare the characteristics of these models to enhance our understanding of phonatory mechanisms. Flow and medial glottal wall geometry were acquired via particle image velocimetry. Elastic properties were assessed from force-displacement tests. In relative terms, the human larynges had higher fundamental frequency values, while canine and synthetic models exhibited greater flow rates. Canine models demonstrated the highest divergence angles and vertical stiffness gradients, followed by the human model, with both displaying flow separation vortices during closing. Synthetic models, whose advantage is their accessibility and repeatability, displayed the lowest glottal divergence angles and total circulation values compared to tissue models, with no flow separation vortices. The elasticity tests revealed that tissue models showed significant hysteresis and vertical stiffness gradients, unlike the synthetic models. These results underscore the importance of model selection based on specific research needs and highlight the potential of canine and synthetic models for controlled experimental studies in phonation.
Affiliation(s)
- Jacob Michaud-Dorko
  - Department of Biomedical Engineering, University of Cincinnati, 665 Baldwin Hall, Cincinnati, OH 45221-0070, USA
- Charles Farbos de Luzan
  - Department of Otolaryngology-Head and Neck Surgery, University of Cincinnati, 231 Albert Sabin Way, Cincinnati, OH 45267-0528, USA
- Gregory R. Dion
  - Department of Biomedical Engineering, University of Cincinnati, 665 Baldwin Hall, Cincinnati, OH 45221-0070, USA
  - Department of Otolaryngology-Head and Neck Surgery, University of Cincinnati, 231 Albert Sabin Way, Cincinnati, OH 45267-0528, USA
- Ephraim Gutmark
  - Department of Otolaryngology-Head and Neck Surgery, University of Cincinnati, 231 Albert Sabin Way, Cincinnati, OH 45267-0528, USA
  - Department of Aerospace Engineering, University of Cincinnati, 799 Rhodes Hall, Cincinnati, OH 45221-0070, USA
- Liran Oren
  - Department of Biomedical Engineering, University of Cincinnati, 665 Baldwin Hall, Cincinnati, OH 45221-0070, USA
  - Department of Otolaryngology-Head and Neck Surgery, University of Cincinnati, 231 Albert Sabin Way, Cincinnati, OH 45267-0528, USA

8
Deng JJ, Peterson SD. Sensitivity of Phonation Onset Pressure to Vocal Fold Stiffness Distribution. J Biomech Eng 2024; 146:081003. [PMID: 38345603 DOI: 10.1115/1.4064718] [Received: 07/07/2023] [Indexed: 03/22/2024]
Abstract
Phonation onset is characterized by the unstable growth of vocal fold (VF) vibrations that ultimately results in self-sustained oscillation and the production of modal voice. Motivated by histological studies, much research has focused on the role of the layered structure of the vocal folds in influencing phonation onset, wherein the outer "cover" layer is relatively soft and the inner "body" layer is relatively stiff. Recent research, however, suggests that the body-cover (BC) structure over-simplifies actual stiffness distributions by neglecting important spatial variations, such as inferior-superior (IS) and anterior-posterior gradients and smooth transitions in stiffness from one histological layer to another. Herein, we explore sensitivity of phonation onset to stiffness gradients and smoothness. By assuming no a priori stiffness distribution and considering a second-order Taylor series sensitivity analysis of phonation onset pressure with respect to stiffness, we find two general smooth stiffness distributions most strongly influence onset pressure: a smooth stiffness containing aspects of BC differences and IS gradients in the cover, which plays a role in minimizing onset pressure, and uniform increases in stiffness, which raise onset pressure and frequency. While the smooth stiffness change contains aspects qualitatively similar to layered BC distributions used in computational studies, smooth transitions in stiffness result in higher sensitivity of onset pressure than discrete layering. These two general stiffness distributions also provide a simple, low-dimensional, interpretation of how complex variations in VF stiffness affect onset pressure, enabling refined exploration of the effects of stiffness distributions on phonation onset.
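The second-order Taylor series sensitivity analysis described in this abstract can be written generically as below. This is a sketch under assumed notation, not the paper's own formulation: $p_{\mathrm{on}}$ denotes onset pressure, $\mathbf{k}$ a vector of stiffness parameters, $\mathbf{k}_0$ the reference stiffness distribution, and $\mathbf{g}$ and $\mathbf{H}$ the gradient and Hessian of onset pressure with respect to stiffness.

```latex
p_{\mathrm{on}}(\mathbf{k}_0 + \delta\mathbf{k})
  \approx p_{\mathrm{on}}(\mathbf{k}_0)
  + \mathbf{g}^{\mathsf{T}}\,\delta\mathbf{k}
  + \tfrac{1}{2}\,\delta\mathbf{k}^{\mathsf{T}}\,\mathbf{H}\,\delta\mathbf{k},
\qquad
g_i = \left.\frac{\partial p_{\mathrm{on}}}{\partial k_i}\right|_{\mathbf{k}_0},
\quad
H_{ij} = \left.\frac{\partial^2 p_{\mathrm{on}}}{\partial k_i\,\partial k_j}\right|_{\mathbf{k}_0}
```

In an expansion of this kind, stiffness changes $\delta\mathbf{k}$ aligned with the gradient or with the dominant eigenvectors of $\mathbf{H}$ change onset pressure the most. That is one way a small number of stiffness distributions can summarize the sensitivity, as the abstract reports.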
Affiliation(s)
- Jonathan J Deng
  - Department of Mechanical and Mechatronics Engineering, University of Waterloo, Waterloo, ON, N2L 3G1, Canada
- Sean D Peterson
  - Department of Mechanical and Mechatronics Engineering, University of Waterloo, Waterloo, ON, N2L 3G1, Canada

9
Parra JA, Calvache C, Alzamendi GA, Ibarra EJ, Soláque L, Peterson SD, Zañartu M. Asymmetric triangular body-cover model of the vocal folds with bilateral intrinsic muscle activation. J Acoust Soc Am 2024; 156:939-953. [PMID: 39133633 DOI: 10.1121/10.0028164] [Received: 01/18/2024] [Accepted: 07/12/2024] [Indexed: 08/21/2024]
Abstract
Many voice disorders are linked to imbalanced muscle activity and known to exhibit asymmetric vocal fold vibration. However, the relation between imbalanced muscle activation and asymmetric vocal fold vibration is not well understood. This study introduces an asymmetric triangular body-cover model of the vocal folds, controlled by the activation of bilateral intrinsic laryngeal muscles, to investigate the effects of muscle imbalance on vocal fold oscillation. Various scenarios were considered, encompassing imbalance in individual muscles and muscle pairs, as well as accounting for asymmetry in lumped element parameters. Measurements of amplitude and phase asymmetries were employed to match the oscillatory behavior of two pathological cases: unilateral paralysis and muscle tension dysphonia. The resulting simulations exhibit muscle imbalance consistent with expectations in the composition of these voice disorders, yielding asymmetries exceeding 30% for paralysis and below 5% for dysphonia. This underscores the relevance of muscle imbalance in representing phonatory scenarios and its potential for characterizing asymmetry in vocal fold vibration.
Affiliation(s)
- Jesús A Parra
  - Department of Electronic Engineering, Universidad Técnica Federico Santa María, Valparaíso, Chile
- Carlos Calvache
  - Department of Mechatronics Engineering, Universidad Militar, Bogotá, Colombia
  - Department Communication Sciences and Disorders, Corporación Universitaria Iberoamericana, Bogotá, Colombia
- Gabriel A Alzamendi
  - Institute for Research and Development on Bioengineering and Bioinformatics, Consejo Nacional de Investigaciones Científicas y Técnicas, Universidad Nacional de Entre Ríos, Oro Verde, Entre Ríos, Argentina
  - Facultad de Ingeniería, Universidad Nacional de Entre Ríos, Entre Ríos, Argentina
- Emiro J Ibarra
  - Department of Electronic Engineering, Universidad Técnica Federico Santa María, Valparaíso, Chile
- Leonardo Soláque
  - Department of Mechatronics Engineering, Universidad Militar, Bogotá, Colombia
- Sean D Peterson
  - Department of Mechanical and Mechatronics Engineering, University of Waterloo, Waterloo, Ontario, Canada
- Matías Zañartu
  - Department of Electronic Engineering, Universidad Técnica Federico Santa María, Valparaíso, Chile

10
Thomson SL. Synthetic, self-oscillating vocal fold models for voice production research. J Acoust Soc Am 2024; 156:1283-1308. [PMID: 39172710 PMCID: PMC11348498 DOI: 10.1121/10.0028267] [Received: 08/31/2023] [Revised: 07/26/2024] [Accepted: 07/30/2024] [Indexed: 08/24/2024]
Abstract
Sound for the human voice is produced by vocal fold flow-induced vibration and involves a complex coupling between flow dynamics, tissue motion, and acoustics. Over the past three decades, synthetic, self-oscillating vocal fold models have played an increasingly important role in the study of these complex physical interactions. In particular, two types of models have been established: "membranous" vocal fold models, such as a water-filled latex tube, and "elastic solid" models, such as ultrasoft silicone formed into a vocal fold-like shape and in some cases with multiple layers of differing stiffness to mimic the human vocal fold tissue structure. In this review, the designs, capabilities, and limitations of these two types of models are presented. Considerations unique to the implementation of elastic solid models, including fabrication processes and materials, are discussed. Applications in which these models have been used to study the underlying mechanical principles that govern phonation are surveyed, and experimental techniques and configurations are reviewed. Finally, recommendations for continued development of these models for even more lifelike response and clinical relevance are summarized.
Affiliation(s)
- Scott L Thomson
  - Department of Mechanical and Civil Engineering, Brigham Young University-Idaho, Rexburg, Idaho 83460, USA

11
Cecchin-Albertoni C, Deny O, Planat-Bénard V, Guissard C, Paupert J, Vaysse F, Marty M, Casteilla L, Monsarrat P, Kémoun P. The oral organ: A new vision of the mouth as a whole for a gerophysiological approach to healthy aging. Ageing Res Rev 2024; 99:102360. [PMID: 38821417 DOI: 10.1016/j.arr.2024.102360] [Received: 01/25/2024] [Revised: 05/07/2024] [Accepted: 05/28/2024] [Indexed: 06/02/2024]
Abstract
This article brings a new perspective on oral physiology by presenting the oral organ as an integrated entity within the entire organism and its surrounding environment. Rather than considering the mouth solely as a collection of discrete functions, this novel approach emphasizes its role as a dynamic interphase, supporting interactions between the body and external factors. As a resilient ecosystem, the equilibrium of mouth ecological niches is the result of a large number of interconnected factors including the heterogeneity of different oral structures, diversity of resources, external and internal pressures and biological actors. The manuscript seeks to deepen the understanding of age-related changes within the oral cavity and throughout the organism, aligning with the evolving field of gerophysiology. The strategic position and fundamental function of the mouth make it an invaluable target for early prevention, diagnosis, treatment, and even reversal of aging effects throughout the entire organism. Recognizing the oral cavity's capacity for sensory perception, element capture and information processing underscores its vital role in continuous health monitoring. Overall, this integrated understanding of oral physiology aims at advancing comprehensive approaches to oral healthcare and promoting broader awareness of its implications for overall well-being.
Collapse
Affiliation(s)
- Chiara Cecchin-Albertoni
- Oral Medicine Department and CHU de Toulouse, Toulouse Institute of Oral Medicine and Science, Toulouse, France; RESTORE Research Center, Université de Toulouse, INSERM, CNRS, EFS, ENVT, Université P. Sabatier, Toulouse, France
| | - Olivier Deny
- Oral Medicine Department and CHU de Toulouse, Toulouse Institute of Oral Medicine and Science, Toulouse, France; RESTORE Research Center, Université de Toulouse, INSERM, CNRS, EFS, ENVT, Université P. Sabatier, Toulouse, France
| | - Valérie Planat-Bénard
- RESTORE Research Center, Université de Toulouse, INSERM, CNRS, EFS, ENVT, Université P. Sabatier, Toulouse, France
| | - Christophe Guissard
- Oral Medicine Department and CHU de Toulouse, Toulouse Institute of Oral Medicine and Science, Toulouse, France; RESTORE Research Center, Université de Toulouse, INSERM, CNRS, EFS, ENVT, Université P. Sabatier, Toulouse, France
| | - Jenny Paupert
- RESTORE Research Center, Université de Toulouse, INSERM, CNRS, EFS, ENVT, Université P. Sabatier, Toulouse, France
| | - Frédéric Vaysse
- Oral Medicine Department and CHU de Toulouse, Toulouse Institute of Oral Medicine and Science, Toulouse, France
| | - Mathieu Marty
- Oral Medicine Department and CHU de Toulouse, Toulouse Institute of Oral Medicine and Science, Toulouse, France; LIRDEF, Faculty of Educational Sciences, Paul Valery University, Montpellier CEDEX 5 34199, France
| | - Louis Casteilla
- RESTORE Research Center, Université de Toulouse, INSERM, CNRS, EFS, ENVT, Université P. Sabatier, Toulouse, France
| | - Paul Monsarrat
- Oral Medicine Department and CHU de Toulouse, Toulouse Institute of Oral Medicine and Science, Toulouse, France; RESTORE Research Center, Université de Toulouse, INSERM, CNRS, EFS, ENVT, Université P. Sabatier, Toulouse, France; Artificial and Natural Intelligence Toulouse Institute ANITI, Toulouse, France
| | - Philippe Kémoun
- Oral Medicine Department and CHU de Toulouse, Toulouse Institute of Oral Medicine and Science, Toulouse, France; RESTORE Research Center, Université de Toulouse, INSERM, CNRS, EFS, ENVT, Université P. Sabatier, Toulouse, France.
| |
|
12
|
Zhang Z. Principal dimensions of voice production and their role in vocal expression. THE JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA 2024; 156:278-283. [PMID: 38980102 PMCID: PMC11236430 DOI: 10.1121/10.0027913] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Received: 05/08/2024] [Revised: 06/20/2024] [Accepted: 06/24/2024] [Indexed: 07/10/2024]
Abstract
How we produce and perceive voice is constrained by laryngeal physiology and biomechanics. Such constraints may present themselves as principal dimensions in the voice outcome space that are shared among speakers. This study attempts to identify such principal dimensions in the voice outcome space and the underlying laryngeal control mechanisms in a three-dimensional computational model of voice production. A large-scale voice simulation was performed with parametric variations in vocal fold geometry and stiffness, glottal gap, vocal tract shape, and subglottal pressure. Principal component analysis was applied to data combining both the physiological control parameters and voice outcome measures. The results showed three dominant dimensions accounting for at least 50% of the total variance. The first two dimensions describe respiratory-laryngeal coordination in controlling the energy balance between low- and high-frequency harmonics in the produced voice, and the third dimension describes control of the fundamental frequency. The dominance of these three dimensions suggests that voice changes along these principal dimensions are likely to be more consistently produced and perceived by most speakers than other voice changes, and thus are more likely to have emerged during evolution and be used to convey important personal information, such as emotion and larynx size.
Affiliation(s)
- Zhaoyan Zhang
- Department of Head and Neck Surgery, University of California, Los Angeles, 31-24 Rehab Center, 1000 Veteran Avenue, Los Angeles, California 90095-1794, USA
| |
|
13
|
Borjon JI, Abney DH, Yu C, Smith LB. Infant vocal productions coincide with body movements. Dev Sci 2024; 27:e13491. [PMID: 38433472 PMCID: PMC11161311 DOI: 10.1111/desc.13491] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/12/2023] [Revised: 02/14/2024] [Accepted: 02/21/2024] [Indexed: 03/05/2024]
Abstract
Producing recognizable words is a difficult motor task; a one-syllable word can require the coordination of over 80 muscles. Thus, it is not surprising that the development of word productions in infancy lags considerably behind receptive language and is a known limiting factor in language development. A large literature has focused on the vocal apparatus, its articulators, and language development. There has been limited study of the relations between non-speech motor skills and the quality of early speech productions. Here we present evidence that the spontaneous vocalizations of 9- to 24-month-old infants recruit extraneous, synergistic co-activations of hand and head movements and that the temporal precision of the co-activation of vocal and extraneous muscle groups tightens with age and improved recognizability of speech. These results implicate an interaction between the muscle groups that produce speech and other body movements and provide new empirical pathways for understanding the role of motor development in language acquisition. RESEARCH HIGHLIGHTS: The spontaneous vocalizations of 9- to 24-month-old infants recruit extraneous, synergistic co-activations of hand and head movements. The temporal precision of these hand and head movements during vocal production tightens with age and improved speech recognition. These results implicate an interaction between the muscle groups producing speech and other body movements. These results provide new empirical pathways for understanding the role of motor development in language acquisition.
Affiliation(s)
- Jeremy I. Borjon
- Department of Psychology, University of Houston, Houston, USA
- Texas Institute for Measurement, Evaluation, and Statistics, University of Houston, Houston, USA
- Texas Center for Learning Disorders, University of Houston, Houston, USA
| | - Drew H. Abney
- Department of Psychology, University of Georgia, Athens, USA
| | - Chen Yu
- Department of Psychology, University of Texas, Austin, USA
| | - Linda B. Smith
- Department of Psychological and Brain Sciences, Indiana University, Bloomington, USA
| |
|
14
|
Patel RR, Döllinger M, Jakubaß B, Pinhack H, Katz U, Semmler M. Analyzing Vocal Fold Frequency Dynamics Using High-Speed 3D Laser Video Endoscopy. Laryngoscope 2024; 134:3267-3276. [PMID: 38481073 PMCID: PMC11182720 DOI: 10.1002/lary.31394] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/16/2023] [Revised: 02/24/2024] [Accepted: 02/29/2024] [Indexed: 06/18/2024]
Abstract
OBJECTIVE To examine changes in lateral and vertical vibratory motion along the anterior, middle, and posterior sections of the vocal folds, as a function of vocal frequency variations. METHODS Absolute measurements of vocal fold surface dynamics from high-speed videoendoscopy with a custom laser endoscope were made on 23 vocally healthy adults during sustained /i:/ production at 10%, 20%, and 80% of pitch range. The 3D parameters of amplitude (mm), maximum velocity opening/closing (mm/s), and mean velocity opening/closing (mm/s) were computed for the lateral and vertical vibratory motion along the anterior, middle, and posterior sections of the vocal folds. Linear mixed model analysis was conducted to evaluate the differences in (a) vocal frequency levels (high vs. normal vs. low pitch), (b) axis level (vertical vs. lateral), (c) position level (anterior vs. middle vs. posterior), and (d) gender differences (male vs. female). RESULTS Overall, the superior surface vertical motion of the vocal fold is greater than the lateral motion, especially in males. Along the superior surface, the mean and maximum closing velocities are greater posteriorly for low pitch. The location (anterior, middle, and posterior) along the superior surface is relevant only for vocal fold closing rather than opening, as the dynamics are different along the various locations. CONCLUSIONS The study highlights the significance of assessing the vertical motion of the superior surface of the vocal fold to understand the complex dynamics of voice production. LEVEL OF EVIDENCE NA Laryngoscope, 134:3267-3276, 2024.
Affiliation(s)
- Rita R. Patel
- Department of Otolaryngology Head and Neck Surgery, Indiana University, Indianapolis, Indiana, United States
| | - Michael Döllinger
- Division of Phoniatrics and Pediatric Audiology at the Department of Otorhinolaryngology Head & Neck Surgery, University Hospital Erlangen, Friedrich-Alexander-Universität Erlangen-Nürnberg, Erlangen, Germany
| | - Bernhard Jakubaß
- Division of Phoniatrics and Pediatric Audiology at the Department of Otorhinolaryngology Head & Neck Surgery, University Hospital Erlangen, Friedrich-Alexander-Universität Erlangen-Nürnberg, Erlangen, Germany
| | - Hanna Pinhack
- Division of Phoniatrics and Pediatric Audiology at the Department of Otorhinolaryngology Head & Neck Surgery, University Hospital Erlangen, Friedrich-Alexander-Universität Erlangen-Nürnberg, Erlangen, Germany
| | - Ute Katz
- Division of Phoniatrics and Pediatric Audiology at the Department of Otorhinolaryngology Head & Neck Surgery, University Hospital Erlangen, Friedrich-Alexander-Universität Erlangen-Nürnberg, Erlangen, Germany
| | - Marion Semmler
- Division of Phoniatrics and Pediatric Audiology at the Department of Otorhinolaryngology Head & Neck Surgery, University Hospital Erlangen, Friedrich-Alexander-Universität Erlangen-Nürnberg, Erlangen, Germany
| |
|
15
|
Jia SJ, Jing JQ, Yang CJ. A Review on Autism Spectrum Disorder Screening by Artificial Intelligence Methods. J Autism Dev Disord 2024:10.1007/s10803-024-06429-9. [PMID: 38842671 DOI: 10.1007/s10803-024-06429-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Accepted: 05/30/2024] [Indexed: 06/07/2024]
Abstract
PURPOSE With the increasing prevalence of autism spectrum disorders (ASD), the importance of early screening and diagnosis has been subject to considerable discussion. Given the subtle differences between ASD children and typically developing children during the early stages of development, it is imperative to investigate the utilization of automatic recognition methods powered by artificial intelligence. We aim to summarize the research work on this topic and sort out the markers that can be used for identification. METHODS We searched the papers published in the Web of Science, PubMed, Scopus, Medline, SpringerLink, Wiley Online Library, and EBSCO databases from 1st January 2013 to 13th November 2023, and 43 articles were included. RESULTS These articles mainly divided recognition markers into five categories: gaze behaviors, facial expressions, motor movements, voice features, and task performance. Based on the above markers, the accuracy of artificial intelligence screening ranged from 62.13 to 100%, the sensitivity ranged from 69.67 to 100%, and the specificity ranged from 54 to 100%. CONCLUSION Therefore, artificial intelligence recognition holds promise as a tool for identifying children with ASD. However, screening models still need continual refinement, and accuracy should be improved through multimodal screening, thereby facilitating timely intervention and treatment.
Affiliation(s)
- Si-Jia Jia
- Faculty of Education, East China Normal University, Shanghai, China
| | - Jia-Qi Jing
- Faculty of Education, East China Normal University, Shanghai, China
| | - Chang-Jiang Yang
- Faculty of Education, East China Normal University, Shanghai, China.
- China Research Institute of Care and Education of Infants and Young, Shanghai, China.
| |
|
16
|
Gao S, Ma EPM. The Relationship Between Voice Parameters and Speech Intelligibility: A Scoping Review. J Voice 2024:S0892-1997(24)00130-9. [PMID: 38755076 DOI: 10.1016/j.jvoice.2024.04.013] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/09/2023] [Revised: 04/07/2024] [Accepted: 04/08/2024] [Indexed: 05/18/2024]
Abstract
OBJECTIVE To synthesize existing evidence of the relationship between voice parameters and speech intelligibility. METHODS Following Preferred Reporting Items for Systematic Reviews and Meta-Analysis extension for Scoping Review (PRISMA-ScR) guidelines, 13 databases were searched and a manual search was conducted. A narrative synthesis of methodological quality, study characteristics, participant demographics, voice parameter categorization, and their relationship to speech intelligibility was conducted. A Grading of Recommendations Assessment, Development, and Evaluation (GRADE) assessment was also performed. RESULTS A total of 5593 studies were retrieved, and 30 eligible studies were included in the final scoping review. The studies were given scores of 10-25 (average 16.93) out of 34 in the methodological quality assessment. Research that analyzed voice parameters related to speech intelligibility, encompassing perceptual, acoustic, and aerodynamic parameters, was included. Validated and nonvalidated perceptual voice assessments showed divergent results regarding the relationship between perceptual parameters and speech intelligibility. The relationship between acoustic parameters and speech intelligibility was found to be complex and the results were inconsistent. The limited research on aerodynamic parameters did not reach a consensus on their relationship with speech intelligibility. Studies in which listeners were not speech-language pathologists (SLPs) far outnumbered those with SLP listeners, and research conducted in English contexts significantly exceeded that in non-English contexts. The GRADE evaluation indicated that the quality of evidence varied from low to moderate. DISCUSSION The results for the relationship between voice parameters and intelligibility showed significant heterogeneity. Future research should consider age-related voice changes and include diverse age groups. 
To enhance validity and comparability, it will be necessary to report effect sizes, tool validity, inter-rater reliability, and calibration procedures. Voice assessments should account for the validation status of tools because of their potential impact on the outcomes. The linguistic context may also influence the results.
Affiliation(s)
- Shaohua Gao
- Voice Research Laboratory, Faculty of Education, The University of Hong Kong, Pok Fu Lam, Hong Kong
| | - Estella P-M Ma
- Voice Research Laboratory, Faculty of Education, The University of Hong Kong, Pok Fu Lam, Hong Kong.
| |
|
17
|
Bonini LDS, Dos Santos AP, Vitor JDS, Brasolotto AG, Antonetti-Carvalho AE, Silverio KCA. Water Resistance Therapy in Individuals with Parkinson's Disease: A Session-by-Session Analysis of the Vocal Quality. J Voice 2024:S0892-1997(24)00106-1. [PMID: 38735802 DOI: 10.1016/j.jvoice.2024.03.031] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/15/2024] [Revised: 03/23/2024] [Accepted: 03/26/2024] [Indexed: 05/14/2024]
Abstract
OBJECTIVES To verify the session-by-session effects of water resistance therapy (WRT) on the vocal quality of individuals with Parkinson's disease (PD). METHODS This is a retrospective analytical study. The samples were acquired from a database composed of 10 men aged between 50 and 90 years diagnosed with PD. The participants underwent WRT with a resonance tube and were guided to perform the following phonatory tasks: comfortable pitch and loudness, high pitch, low pitch, ascending and descending glissandos, and sentence uttering. Tube depth ranged from 2 cm to 9 cm. WRT was implemented twice per week, totaling eight sessions, each lasting 45 minutes. Participants were assessed before and after each therapy session. The data were assessed with spectrographic analysis, vocal intensity, cepstral peak prominence-smoothed, alpha ratio, L1-L0, oscillatory frequency, and auditory-perceptual assessment of overall degree, roughness, breathiness, and instability. One-way repeated measures analysis of variance and Friedman tests were applied (P < 0.05), with Holm-Sidak and Tukey tests used as post hoc tests. RESULTS After the sixth session, the spectrographic analysis revealed that the tracing color intensity of medium frequencies darkened, and a better result was observed after the eighth session. Vocal intensity improved from the third session, and L1-L0 followed the same pattern. The auditory-perceptual assessment of overall degree revealed the best results only after the second, third, and fourth sessions; however, after the eighth session, the instability increased. CONCLUSIONS WRT allowed better results from the third session, with some improvements in the sixth session. However, the instability increased after the eighth session; thus, it is important to review the phonatory tasks and session numbers to avoid overloading the phonatory system.
Affiliation(s)
- Letícia de Souza Bonini
- Speech-Language Pathology and Audiology Department at Faculdade de Odontologia de Bauru, Universidade de São Paulo, Bauru, São Paulo, Brazil.
| | - Ana Paula Dos Santos
- Speech-Language Pathology and Audiology Department at Faculdade de Odontologia de Bauru, Universidade de São Paulo, Bauru, São Paulo, Brazil.
| | - Jhonatan da Silva Vitor
- Speech-Language Pathology and Audiology Department at Faculdade de Odontologia de Bauru, Universidade de São Paulo, Bauru, São Paulo, Brazil.
| | - Alcione Ghedini Brasolotto
- Speech-Language Pathology and Audiology Department at Faculdade de Odontologia de Bauru, Universidade de São Paulo, Bauru, São Paulo, Brazil.
| | - Angélica Emygdio Antonetti-Carvalho
- Speech-Language Pathology and Audiology Department at Faculdade de Odontologia de Bauru, Universidade de São Paulo, Bauru, São Paulo, Brazil.
| | - Kelly Cristina Alves Silverio
- Speech-Language Pathology and Audiology Department at Faculdade de Odontologia de Bauru, Universidade de São Paulo, Bauru, São Paulo, Brazil.
| |
|
18
|
Robotti C, Costantini G, Saggio G, Cesarini V, Calastri A, Maiorano E, Piloni D, Perrone T, Sabatini U, Ferretti VV, Cassaniti I, Baldanti F, Gravina A, Sakib A, Alessi E, Pietrantonio F, Pascucci M, Casali D, Zarezadeh Z, Zoppo VD, Pisani A, Benazzo M. Machine Learning-based Voice Assessment for the Detection of Positive and Recovered COVID-19 Patients. J Voice 2024; 38:796.e1-796.e13. [PMID: 34965907 PMCID: PMC8616736 DOI: 10.1016/j.jvoice.2021.11.004] [Citation(s) in RCA: 4] [Impact Index Per Article: 4.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/28/2021] [Revised: 11/17/2021] [Accepted: 11/18/2021] [Indexed: 12/12/2022]
Abstract
Many virological tests have been implemented during the Coronavirus Disease 2019 (COVID-19) pandemic for diagnostic purposes, but they appear unsuitable for screening purposes. Furthermore, current screening strategies are not accurate enough to effectively curb the spread of the disease. Therefore, the present study was conducted within a controlled clinical environment to determine possible detectable variations in the voice of COVID-19 patients, recovered and healthy subjects, and also to determine whether machine learning-based voice assessment (MLVA) can accurately discriminate between them, thus potentially serving as a more effective mass-screening tool. Three different subpopulations were consecutively recruited: positive COVID-19 patients, recovered COVID-19 patients and healthy individuals as controls. Positive patients were recruited within 10 days from nasal swab positivity. Recovery from COVID-19 was established clinically, virologically and radiologically. Healthy individuals reported no COVID-19 symptoms and yielded negative results at serological testing. All study participants provided three trials for multiple vocal tasks (sustained vowel phonation, speech, cough). All recordings were initially divided into three different binary classifications with a feature selection, ranking and cross-validated RBF-SVM pipeline. This yielded a mean accuracy of 90.24%, a mean sensitivity of 91.15%, a mean specificity of 89.13% and a mean AUC of 0.94 across all tasks and all comparisons, and identified the sustained vowel as the most effective vocal task for COVID discrimination. Moreover, a three-way classification was carried out on an external test set comprising 30 subjects, 10 per class, with a mean accuracy of 80% and an accuracy of 100% for the detection of positive subjects.
Within this assessment, recovered individuals proved to be the most difficult class to identify, and all the misclassified subjects were declared positive; this might be related to mid and short-term vocal traces of COVID-19, even after the clinical resolution of the infection. In conclusion, MLVA may accurately discriminate between positive COVID-19 patients, recovered COVID-19 patients and healthy individuals. Further studies should test MLVA among larger populations and asymptomatic positive COVID-19 patients to validate this novel screening technology and test its potential application as a potentially more effective surveillance strategy for COVID-19.
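The accuracy, sensitivity, and specificity values quoted in this abstract are standard functions of confusion-matrix counts for a binary comparison. The following is a minimal sketch of that arithmetic only; the helper name `binary_metrics` and the counts are hypothetical and are not taken from the study.

```python
def binary_metrics(tp, fp, tn, fn):
    """Return accuracy, sensitivity, and specificity from confusion counts.

    tp/fn count the positive class (correctly / incorrectly classified);
    tn/fp count the negative class (correctly / incorrectly classified).
    """
    total = tp + fp + tn + fn
    if total == 0 or (tp + fn) == 0 or (tn + fp) == 0:
        raise ValueError("each class needs at least one sample")
    return {
        "accuracy": (tp + tn) / total,    # share of all correct decisions
        "sensitivity": tp / (tp + fn),    # true-positive rate
        "specificity": tn / (tn + fp),    # true-negative rate
    }


# Hypothetical counts for illustration only; not the study's data.
metrics = binary_metrics(tp=45, fp=6, tn=44, fn=5)
```

With 50 samples per class in this made-up example, sensitivity and specificity are computed per class, while accuracy pools both classes.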
Affiliation(s)
- Carlo Robotti
- Department of Otolaryngology - Head and Neck Surgery, Fondazione IRCCS Policlinico San Matteo, Pavia, Italy; Department of Clinical, Surgical, Diagnostic and Pediatric Sciences, University of Pavia, Pavia, Italy.
| | - Giovanni Costantini
- Department of Electronic Engineering, University of Rome Tor Vergata, Rome, Italy.
| | - Giovanni Saggio
- Department of Electronic Engineering, University of Rome Tor Vergata, Rome, Italy.
| | - Valerio Cesarini
- Department of Electronic Engineering, University of Rome Tor Vergata, Rome, Italy
| | - Anna Calastri
- Department of Otolaryngology - Head and Neck Surgery, Fondazione IRCCS Policlinico San Matteo, Pavia, Italy
| | - Eugenia Maiorano
- Department of Otolaryngology - Head and Neck Surgery, Fondazione IRCCS Policlinico San Matteo, Pavia, Italy
| | - Davide Piloni
- Pneumology Unit, Fondazione IRCCS Policlinico San Matteo, Pavia, Italy
| | - Tiziano Perrone
- Department of Internal Medicine, Fondazione IRCCS Policlinico San Matteo, University of Pavia, Pavia, Italy
| | - Umberto Sabatini
- Department of Internal Medicine, Fondazione IRCCS Policlinico San Matteo, University of Pavia, Pavia, Italy
| | - Virginia Valeria Ferretti
- Clinical Epidemiology and Biometry Unit, Fondazione IRCCS Policlinico San Matteo Foundation, Pavia, Italy
| | - Irene Cassaniti
- Molecular Virology Unit, Microbiology and Virology Department, Fondazione IRCCS Policlinico San Matteo, Pavia, Italy
| | - Fausto Baldanti
- Department of Clinical, Surgical, Diagnostic and Pediatric Sciences, University of Pavia, Pavia, Italy; Molecular Virology Unit, Microbiology and Virology Department, Fondazione IRCCS Policlinico San Matteo, Pavia, Italy
| | - Andrea Gravina
- Otorhinolaryngology Department, University of Rome Tor Vergata, Rome, Italy
| | - Ahmed Sakib
- Otorhinolaryngology Department, University of Rome Tor Vergata, Rome, Italy
| | - Elena Alessi
- Internal Medicine Unit, Ospedale dei Castelli ASL Roma 6, Ariccia, Italy
| | | | - Matteo Pascucci
- Internal Medicine Unit, Ospedale dei Castelli ASL Roma 6, Ariccia, Italy
| | - Daniele Casali
- Department of Electronic Engineering, University of Rome Tor Vergata, Rome, Italy
| | - Zakarya Zarezadeh
- Department of Electronic Engineering, University of Rome Tor Vergata, Rome, Italy
| | - Vincenzo Del Zoppo
- Department of Electronic Engineering, University of Rome Tor Vergata, Rome, Italy
| | - Antonio Pisani
- Department of Brain and Behavioral Sciences, University of Pavia, Pavia, Italy; IRCCS Mondino Foundation, Pavia, Italy
| | - Marco Benazzo
- Department of Otolaryngology - Head and Neck Surgery, Fondazione IRCCS Policlinico San Matteo, Pavia, Italy; Department of Clinical, Surgical, Diagnostic and Pediatric Sciences, University of Pavia, Pavia, Italy
| |
|
19
|
Cao S, Rosenzweig I, Bilotta F, Jiang H, Xia M. Automatic detection of obstructive sleep apnea based on speech or snoring sounds: a narrative review. J Thorac Dis 2024; 16:2654-2667. [PMID: 38738242 PMCID: PMC11087644 DOI: 10.21037/jtd-24-310] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/26/2024] [Accepted: 04/15/2024] [Indexed: 05/14/2024]
Abstract
Background and Objective Obstructive sleep apnea (OSA) is a common chronic disorder characterized by repeated breathing pauses during sleep caused by upper airway narrowing or collapse. The gold standard for OSA diagnosis is the polysomnography test, which is time consuming, expensive, and invasive. In recent years, more cost-effective approaches for OSA detection based on the predictive value of speech and snoring have emerged. In this paper, we offer a comprehensive summary of current research progress on the applications of speech or snoring sounds for the automatic detection of OSA and discuss the key challenges that need to be overcome for future research into this novel approach. Methods PubMed, IEEE Xplore, and Web of Science databases were searched with related keywords. Literature published between 1989 and 2022 examining the potential of using speech or snoring sounds for automated OSA detection was reviewed. Key Content and Findings Speech and snoring sounds contain a large amount of information about OSA, and they have been extensively studied in the automatic screening of OSA. By importing features extracted from speech and snoring sounds into artificial intelligence models, clinicians can automatically screen for OSA. Features such as formants, linear prediction cepstral coefficients, and mel-frequency cepstral coefficients, and artificial intelligence algorithms including support vector machines, Gaussian mixture models, and hidden Markov models have been extensively studied for the detection of OSA. Conclusions Due to the significant advantages of noninvasive, low-cost, and contactless data collection, an automatic approach based on speech or snoring sounds seems to be a promising tool for the detection of OSA.
Affiliation(s)
- Shuang Cao
- Department of Anesthesiology, The Ninth People’s Hospital, Shanghai Jiao Tong University School of Medicine, Shanghai, China
| | - Ivana Rosenzweig
- Sleep and Brain Plasticity Centre, CNS, IoPPN, King’s College London, London, UK
- Sleep Disorders Centre, Guy’s and St Thomas’ Hospital, GSTT NHS, London, UK
| | - Federico Bilotta
- Department of Anaesthesia and Critical Care Medicine, Policlinico Umberto 1 Hospital, Sapienza University of Rome, Rome, Italy
| | - Hong Jiang
- Department of Anesthesiology, The Ninth People’s Hospital, Shanghai Jiao Tong University School of Medicine, Shanghai, China
| | - Ming Xia
- Department of Anesthesiology, The Ninth People’s Hospital, Shanghai Jiao Tong University School of Medicine, Shanghai, China
| |
|
20
|
Oreskovic J, Kaufman J, Fossat Y. Impact of Audio Data Compression on Feature Extraction for Vocal Biomarker Detection: Validation Study. JMIR BIOMEDICAL ENGINEERING 2024; 9:e56246. [PMID: 38875677 PMCID: PMC11058552 DOI: 10.2196/56246] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/10/2024] [Revised: 02/28/2024] [Accepted: 03/23/2024] [Indexed: 06/16/2024] Open
Abstract
BACKGROUND Vocal biomarkers, derived from acoustic analysis of vocal characteristics, offer noninvasive avenues for medical screening, diagnostics, and monitoring. Previous research demonstrated the feasibility of predicting type 2 diabetes mellitus through acoustic analysis of smartphone-recorded speech. Building upon this work, this study explores the impact of audio data compression on acoustic vocal biomarker development, which is critical for broader applicability in health care. OBJECTIVE The objective of this research is to analyze how common audio compression algorithms (MP3, M4A, and WMA) applied by 3 different conversion tools at 2 bitrates affect features crucial for vocal biomarker detection. METHODS The impact of audio data compression on acoustic vocal biomarker development was investigated using uncompressed voice samples converted into MP3, M4A, and WMA formats at 2 bitrates (320 and 128 kbps) with MediaHuman (MH) Audio Converter, WonderShare (WS) UniConverter, and Fast Forward Moving Picture Experts Group (FFmpeg). The data set comprised recordings from 505 participants, totaling 17,298 audio files, collected using a smartphone. Participants recorded a fixed English sentence up to 6 times daily for up to 14 days. Feature extraction, including pitch, jitter, intensity, and Mel-frequency cepstral coefficients (MFCCs), was conducted using Python and Parselmouth. The Wilcoxon signed rank test and the Bonferroni correction for multiple comparisons were used for statistical analysis. RESULTS In this study, 36,970 audio files were initially recorded from 505 participants, with 17,298 recordings meeting the fixed sentence criteria after screening. Differences between the audio conversion software, MH, WS, and FFmpeg, were notable, impacting compression outcomes such as constant or variable bitrates. Analysis encompassed diverse data compression formats and a wide array of voice features and MFCCs. 
Wilcoxon signed rank tests yielded P values, with those below the Bonferroni-corrected significance level indicating significant alterations due to compression. The results indicated feature-specific impacts of compression across formats and bitrates. MH-converted files exhibited greater resilience compared to WS-converted files. Bitrate also influenced feature stability, with 38 cases affected uniquely by a single bitrate. Notably, voice features showed greater stability than MFCCs across conversion methods. CONCLUSIONS Compression effects were found to be feature specific, with MH and FFmpeg showing greater resilience. Some features were consistently affected, emphasizing the importance of understanding feature resilience for diagnostic applications. Considering the implementation of vocal biomarkers in health care, finding features that remain consistent through compression for data storage or transmission purposes is valuable. Focused on specific features and formats, future research could broaden the scope to include diverse features, real-time compression algorithms, and various recording methods. This study enhances our understanding of audio compression's influence on voice features and MFCCs, providing insights for developing applications across fields. The research underscores the significance of feature stability in working with compressed audio data, laying a foundation for informed voice data use in evolving technological landscapes.
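The abstract describes comparing P values from Wilcoxon signed rank tests against a Bonferroni-corrected significance level. A minimal sketch of that correction step follows, assuming a family-wise alpha of 0.05 (the abstract does not state the value used). The function name `bonferroni_flags`, the feature names, and the P values are hypothetical and serve only to illustrate the arithmetic.

```python
def bonferroni_flags(p_values, alpha=0.05):
    """Flag tests whose P value falls below the Bonferroni-corrected level.

    p_values maps a feature name to the P value of its paired test.
    The corrected threshold is alpha divided by the number of tests.
    """
    m = len(p_values)
    if m == 0:
        raise ValueError("at least one P value is required")
    threshold = alpha / m
    flags = {name: p < threshold for name, p in p_values.items()}
    return threshold, flags


# Hypothetical P values for illustration only; not the study's results.
p_values = {"pitch": 0.20, "jitter": 0.004, "intensity": 0.03, "mfcc_1": 0.0001}
threshold, flags = bonferroni_flags(p_values)
```

In this made-up example, `intensity` would pass an uncorrected 0.05 cutoff but not the corrected one, which shows why the correction matters when many features are tested at once.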
|
21
|
Li Z, Zhang D, Chen H, Liu Y, Wang HC. Voice Pitch Shaping and Genderization: New Needs of Cosmetic Phonoplastic Surgery. Aesthetic Plast Surg 2024:10.1007/s00266-024-03919-0. [PMID: 38565723 DOI: 10.1007/s00266-024-03919-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/18/2023] [Accepted: 02/08/2024] [Indexed: 04/04/2024]
Abstract
Voices can convey content, emotion, and essential information about an individual's gender and social information. Closely related to gender identification and sexual attraction, voices also positively affect many psychological factors of individuals. Surgeries have evolved from treating congenital diseases to fulfilling an individual's aesthetic needs for voice. Voice shaping is emerging as the next cosmetic surgery hotspot after skincare and appearance and body shaping. This paper summarizes the development of voice pitch shaping and genderization procedures out of the cosmetic need. LEVEL OF EVIDENCE IV: This journal requires that authors assign a level of evidence to each article. For a full description of these evidence-based medicine ratings, please refer to the Table of Contents or the online Instructions to Authors https://www.springer.com/00266 .
Affiliation(s)
- Zhijin Li
  - Department of Plastic Surgery, Peking Union Medical College Hospital, Peking Union Medical College and Chinese Academy of Medical Sciences, Beijing, China
- Dingyue Zhang
  - Peking Union Medical College and Chinese Academy of Medical Sciences, Beijing, China
- Hongsai Chen
  - Department of Otorhinolaryngology, Shanghai Ninth People's Hospital, Shanghai Jiaotong University School of Medicine, Shanghai, China
- Ying Liu
  - Department of Plastic and Reconstructive Surgery, Shanghai Ninth People's Hospital, Shanghai Jiaotong University School of Medicine, No. 639 of Zhizaoju Road, District Huangpu, Shanghai, 200011, China.
- Hayson Chenyu Wang
  - Department of Plastic and Reconstructive Surgery, Shanghai Ninth People's Hospital, Shanghai Jiaotong University School of Medicine, No. 639 of Zhizaoju Road, District Huangpu, Shanghai, 200011, China.
|
22
|
Cruz DRD, Zheng A, Debele T, Larson P, Dion GR, Park YC. Drug delivery systems for wound healing treatment of upper airway injury. Expert Opin Drug Deliv 2024; 21:573-591. [PMID: 38588553 PMCID: PMC11208077 DOI: 10.1080/17425247.2024.2340653] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/27/2023] [Accepted: 04/04/2024] [Indexed: 04/10/2024]
Abstract
INTRODUCTION Endotracheal intubation is a common procedure to maintain an open airway with risks for traumatic injury. Pathological changes resulting from intubation can cause upper airway complications, including vocal fold scarring, laryngotracheal stenosis, and granulomas and present with symptoms such as dysphonia, dysphagia, and dyspnea. Current intubation-related laryngotracheal injury treatment approaches lack standardized guidelines, relying on individual clinician experience, and surgical and medical interventions have limitations and carry risks. AREAS COVERED The clinical and preclinical therapeutics for wound healing in the upper airway are described. This review discusses the current developments on local drug delivery systems in the upper airway utilizing particle-based delivery systems, including nanoparticles and microparticles, and bulk-based delivery systems, encompassing hydrogels and polymer-based approaches. EXPERT OPINION Complex laryngotracheal diseases pose challenges for effective treatment, struggling due to the intricate anatomy, limited access, and recurrence. Symptomatic management often requires invasive surgical procedures or medications that are unable to achieve lasting effects. Recent advances in nanotechnology and biocompatible materials provide potential solutions, enabling precise drug delivery, personalization, and extended treatment efficacy. Combining these technologies could lead to groundbreaking treatments for upper airways diseases, significantly improving patients' quality of life. Research and innovation in this field are crucial for further advancements.
Affiliation(s)
- Denzel Ryan D. Cruz
  - Medical Scientist Training Program, College of Medicine, University of Cincinnati, Cincinnati, OH, USA
- Avery Zheng
  - Chemical Engineering Program, College of Engineering and Applied Sciences, University of Cincinnati, Cincinnati, OH, USA
- Tilahun Debele
  - Chemical Engineering Program, College of Engineering and Applied Sciences, University of Cincinnati, Cincinnati, OH, USA
- Peter Larson
  - Department of Otolaryngology – Head and Neck Surgery, College of Medicine, University of Cincinnati, Cincinnati, OH, USA
- Gregory R. Dion
  - Department of Otolaryngology – Head and Neck Surgery, College of Medicine, University of Cincinnati, Cincinnati, OH, USA
- Yoonjee C. Park
  - Chemical Engineering Program, College of Engineering and Applied Sciences, University of Cincinnati, Cincinnati, OH, USA
|
23
|
Sarmet M, Santos DB, Mangilli LD, Million JL, Maldaner V, Zeredo JL. Chronic respiratory failure negatively affects speech function in patients with bulbar and spinal onset amyotrophic lateral sclerosis: retrospective data from a tertiary referral center. Logoped Phoniatr Vocol 2024; 49:17-26. [PMID: 35767076 DOI: 10.1080/14015439.2022.2092209] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/22/2021] [Revised: 02/04/2022] [Accepted: 06/15/2022] [Indexed: 10/17/2022]
Abstract
Background: Although dysarthria and respiratory failure are widely described in the literature as part of the natural history of amyotrophic lateral sclerosis (ALS), the specific interaction between them has been little explored. Aim: To investigate the relationship between chronic respiratory failure and the speech of ALS patients. Materials and methods: In this cross-sectional retrospective study we reviewed the medical records of all patients diagnosed with ALS who were followed at a tertiary referral center. In order to determine the presence and degree of speech impairment, the Amyotrophic Lateral Sclerosis Functional Rating Scale-revised (ALSFRS-R) speech sub-scale was used. Respiratory function was assessed through spirometry and through venous blood gasometry obtained from a morning peripheral venous sample. To determine whether differences among groups classified by speech function were significant, maximum and mean spirometry values of participants were compared using multivariate analysis of variance (MANOVA) with Tukey's post hoc test. Results: Seventy-five cases were selected, of which 73.3% presented speech impairment and 70.7% respiratory impairment. Respiratory and speech functions were moderately correlated (seated FVC r = 0.64; supine FVC r = 0.60; seated FEV1 r = 0.59 and supine FEV1 r = 0.54, p < .001). Multivariable logistic regression revealed that the following variables were significantly associated with the presence of speech impairment after adjusting for other risk factors: seated FVC (odds ratio [OR] = 0.862) and seated FEV1 (OR = 1.106). The final model was 81.1% predictive of speech impairment. The presence of daytime hypercapnia was not correlated to increasing speech impairment. Conclusion: The restrictive pattern developed by ALS patients negatively influences speech function. Speech is a complex and multifactorial process, and lung volume plays a pivotal role in its function. Thus, we were able to find that lung volumes presented a significant correlation to speech function, especially in those with bulbar onset and respiratory impairment. Neurobiological and physiological aspects of this relationship should be explored in further studies with the ALS population.
Affiliation(s)
- Max Sarmet
  - Graduate Department of Health Science and Technology, University of Brasília (UnB), Brasília, Brazil
  - Hospital de Apoio de Brasília (HAB), Tertiary Referral Center of Neuromuscular Diseases, Brasília, Brazil
- Dante Brasil Santos
  - Hospital de Apoio de Brasília (HAB), Tertiary Referral Center of Neuromuscular Diseases, Brasília, Brazil
  - UniEvangélica, Graduate Program of Human Movement and Rehabilitation, Anápolis, Brazil
- Janae Lyon Million
  - Department of Human Biology, University of California Santa Cruz, Santa Cruz, CA, United States of America
- Vinicius Maldaner
  - Hospital de Apoio de Brasília (HAB), Tertiary Referral Center of Neuromuscular Diseases, Brasília, Brazil
  - UniEvangélica, Graduate Program of Human Movement and Rehabilitation, Anápolis, Brazil
- Jorge L Zeredo
  - Graduate Department of Health Science and Technology, University of Brasília (UnB), Brasília, Brazil
|
24
|
Franzone R, Petrigna L, Signorelli D, Musumeci G. The Relationship between Posture and Muscle Tensive Dysphonia in Teachers: A Systematic Scoping Review. J Funct Morphol Kinesiol 2024; 9:60. [PMID: 38651418 PMCID: PMC11036206 DOI: 10.3390/jfmk9020060] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/31/2024] [Revised: 03/26/2024] [Accepted: 03/27/2024] [Indexed: 04/25/2024] Open
Abstract
Teachers usually present work-related pain such as neck pain. Their posture could be the cause of these problems; indeed, it is often a sway-back posture. Furthermore, teachers can also experience problems with their voice such as dysphonia, specifically muscle tension dysphonia (MTD). This scoping review aims to find the correlation between teachers' posture and MTD. It also studies how a posture-based treatment can influence this disorder. Randomized controlled trials, controlled clinical trials, prospective cohort studies, and cross-sectional studies that considered the relationship between posture and MTD and that included teachers in their sample were eligible. The search led to an initial number of 396 articles; after the screening process, a final number of eight articles were included. A total of 303 patients were analyzed and all showed altered alignment of the head around the cervical spine with hypertonus of the cricothyroid, suprahyoid, and sternocleidomastoid muscles. Although MTD is a disorder with a multifactorial etiology, the articles revealed a correlation between posture and MTD related to a forward protraction of the cervical spine with a hypertonus of the laryngeal and hyoid musculature. This study also detected that an intervention in posture could reduce vocal disorders.
Affiliation(s)
- Giuseppe Musumeci
  - Department of Biomedical and Biotechnological Sciences, Section of Anatomy, Histology and Movement Science, School of Medicine, University of Catania, Via S. Sofia 97, 95123 Catania, Italy; (R.F.); (L.P.); (D.S.)
|
25
|
Che Z, Wan X, Xu J, Duan C, Zheng T, Chen J. Speaking without vocal folds using a machine-learning-assisted wearable sensing-actuation system. Nat Commun 2024; 15:1873. [PMID: 38472193 PMCID: PMC10933441 DOI: 10.1038/s41467-024-45915-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/12/2023] [Accepted: 02/06/2024] [Indexed: 03/14/2024] Open
Abstract
Voice disorders resulting from various pathological vocal fold conditions or postoperative recovery of laryngeal cancer surgeries, are common causes of dysphonia. Here, we present a self-powered wearable sensing-actuation system based on soft magnetoelasticity that enables assisted speaking without relying on the vocal folds. It holds a lightweighted mass of approximately 7.2 g, skin-alike modulus of 7.83 × 10^5 Pa, stability against skin perspiration, and a maximum stretchability of 164%. The wearable sensing component can effectively capture extrinsic laryngeal muscle movement and convert them into high-fidelity and analyzable electrical signals, which can be translated into speech signals with the assistance of machine learning algorithms with an accuracy of 94.68%. Then, with the wearable actuation component, the speech could be expressed as voice signals while circumventing vocal fold vibration. We expect this approach could facilitate the restoration of normal voice function and significantly enhance the quality of life for patients with dysfunctional vocal folds.
Affiliation(s)
- Ziyuan Che
  - Department of Bioengineering, University of California, Los Angeles, Los Angeles, CA, 90095, USA
- Xiao Wan
  - Department of Bioengineering, University of California, Los Angeles, Los Angeles, CA, 90095, USA
- Jing Xu
  - Department of Bioengineering, University of California, Los Angeles, Los Angeles, CA, 90095, USA
- Chrystal Duan
  - Department of Bioengineering, University of California, Los Angeles, Los Angeles, CA, 90095, USA
- Tianqi Zheng
  - Department of Bioengineering, University of California, Los Angeles, Los Angeles, CA, 90095, USA
- Jun Chen
  - Department of Bioengineering, University of California, Los Angeles, Los Angeles, CA, 90095, USA.
|
26
|
Park J, Choi S, Takatoh J, Zhao S, Harrahill A, Han BX, Wang F. Brainstem control of vocalization and its coordination with respiration. Science 2024; 383:eadi8081. [PMID: 38452069 PMCID: PMC11223444 DOI: 10.1126/science.adi8081] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/18/2023] [Accepted: 01/18/2024] [Indexed: 03/09/2024]
Abstract
Phonation critically depends on precise controls of laryngeal muscles in coordination with ongoing respiration. However, the neural mechanisms governing these processes remain unclear. We identified excitatory vocalization-specific laryngeal premotor neurons located in the retroambiguus nucleus (RAmVOC) in adult mice as being both necessary and sufficient for driving vocal cord closure and eliciting mouse ultrasonic vocalizations (USVs). The duration of RAmVOC activation can determine the lengths of both USV syllables and concurrent expiration periods, with the impact of RAmVOC activation depending on respiration phases. RAmVOC neurons receive inhibition from the preBötzinger complex, and inspiration needs override RAmVOC-mediated vocal cord closure. Ablating inhibitory synapses in RAmVOC neurons compromised this inspiration gating of laryngeal adduction, resulting in discoordination of vocalization with respiration. Our study reveals the circuits for vocal production and vocal-respiratory coordination.
Affiliation(s)
- Jaehong Park
  - Department of Brain and Cognitive Sciences, McGovern Institute for Brain Research, Massachusetts Institute of Technology, Cambridge, MA 02139, USA
  - Department of Biomedical Engineering, Duke University, Durham, NC, 27708, USA
- Seonmi Choi
  - Department of Brain and Cognitive Sciences, McGovern Institute for Brain Research, Massachusetts Institute of Technology, Cambridge, MA 02139, USA
- Jun Takatoh
  - Department of Brain and Cognitive Sciences, McGovern Institute for Brain Research, Massachusetts Institute of Technology, Cambridge, MA 02139, USA
- Shengli Zhao
  - Department of Neurobiology, Duke University Medical Center, Durham, NC, 27710, USA
- Andrew Harrahill
  - Department of Brain and Cognitive Sciences, McGovern Institute for Brain Research, Massachusetts Institute of Technology, Cambridge, MA 02139, USA
- Bao-Xia Han
  - Department of Neurobiology, Duke University Medical Center, Durham, NC, 27710, USA
- Fan Wang
  - Department of Brain and Cognitive Sciences, McGovern Institute for Brain Research, Massachusetts Institute of Technology, Cambridge, MA 02139, USA
|
27
|
Schlegel P, Rhyn Chung H, Döllinger M, Chhetri DK. Reconstruction of Vocal Fold Medial Surface 3D Trajectories: Effects of Neuromuscular Stimulation and Airflow. Laryngoscope 2024; 134:1249-1257. [PMID: 37672673 PMCID: PMC10915101 DOI: 10.1002/lary.31029] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/09/2023] [Revised: 08/12/2023] [Accepted: 08/22/2023] [Indexed: 09/08/2023]
Abstract
INTRODUCTION Analysis of medial surface dynamics of the vocal folds (VF) is critical to understanding voice production and treatment of voice disorders. We analyzed VF medial surface vibratory dynamics, evaluating the effects of airflow and nerve stimulation using 3D reconstruction and empirical eigenfunctions (EEF). STUDY DESIGN In vivo canine hemilarynx phonation. METHODS An in vivo canine hemilarynx was phonated while graded stimulation of the recurrent and superior laryngeal nerves (RLN and SLN) was performed. For each phonatory condition, vibratory cycles were 3D reconstructed from tattooed landmarks on the VF medial surface at low, medium, and high airflows. Parameters describing medial surface trajectory shape were calculated, and underlying patterns were emphasized using EEFs. Fundamental frequency and smoothed cepstral peak prominence (CPPS) were calculated from acoustic data. RESULTS Convex-hull area of landmark trajectories increased with increasing flow and decreasing nerve activation level. Trajectory shapes observed included circular, ellipsoid, bent, and figure-eight. They were more circular on the superior and anterior VF, and more elliptical and line-like on the inferior and posterior VF. The EEFs capturing synchronal opening and closing (EEF1) and alternating convergent/divergent (EEF2) glottis shapes were mostly unaffected by flow and nerve stimulation levels. CPPS increased with higher airflow except for low RLN activation and very dominant SLN stimulation. CONCLUSION We analyzed VF vibration as a function of neuromuscular stimulation and airflow levels. Oscillation patterns such as figure-eight and bent trajectories were linked to high nerve activation and flow. Further studies investigating longer sections of 3D reconstructed oscillations are needed. LEVEL OF EVIDENCE N/A, Basic Science Laryngoscope, 134:1249-1257, 2024.
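The convex-hull area reported in the abstract above can be illustrated with a short sketch. It assumes that each landmark trajectory has already been projected onto a 2D plane, which the abstract does not specify, and it uses the standard monotone chain hull and the shoelace formula rather than whatever software the authors used. The function names and the sample points are hypothetical.

```python
def convex_hull(points):
    """Return the convex hull of 2D points in counter-clockwise order
    (Andrew's monotone chain algorithm)."""
    pts = sorted({(float(x), float(y)) for x, y in points})
    if len(pts) <= 2:
        return pts

    def cross(o, a, b):
        # z-component of (a - o) x (b - o); positive for a left turn
        return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

    lower = []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    upper = []
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    # The last point of each chain is the first point of the other one.
    return lower[:-1] + upper[:-1]


def polygon_area(vertices):
    """Area of a simple polygon from its ordered vertices (shoelace formula)."""
    n = len(vertices)
    if n < 3:
        return 0.0
    twice_area = 0.0
    for i in range(n):
        x1, y1 = vertices[i]
        x2, y2 = vertices[(i + 1) % n]
        twice_area += x1 * y2 - x2 * y1
    return abs(twice_area) / 2.0


def trajectory_hull_area(points):
    """Convex-hull area enclosed by one projected landmark trajectory."""
    return polygon_area(convex_hull(points))


# Hypothetical projected trajectory samples (arbitrary units).
samples = [(0, 0), (2, 0), (2, 2), (0, 2), (1, 1)]
print(trajectory_hull_area(samples))
```

A larger hull area corresponds to a landmark sweeping out a wider region during a vibratory cycle. A line-like trajectory yields an area close to zero.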
Affiliation(s)
- Patrick Schlegel
  - Department of Head and Neck Surgery, University of California, Los Angeles; Los Angeles, CA
- Hye Rhyn Chung
  - Department of Head and Neck Surgery, University of California, Los Angeles; Los Angeles, CA
- Michael Döllinger
  - Division of Phoniatrics and Pediatric Audiology, Department of Head and Neck Surgery, Friedrich-Alexander-University Erlangen-Nürnberg, Erlangen, Germany
- Dinesh K. Chhetri
  - Department of Head and Neck Surgery, University of California, Los Angeles; Los Angeles, CA
|
28
|
Schlegel P, Berry DA, Moffatt C, Zhang Z, Chhetri DK. Register transitions in an in vivo canine model as a function of intrinsic laryngeal muscle stimulation, fundamental frequency, and sound pressure level. J Acoust Soc Am 2024; 155:2139-2150. [PMID: 38498507 PMCID: PMC10954347 DOI: 10.1121/10.0025135] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/12/2023] [Revised: 01/09/2024] [Accepted: 02/16/2024] [Indexed: 03/20/2024]
Abstract
Phonatory instabilities and involuntary register transitions can occur during singing. However, little is known regarding the mechanisms which govern such transitions. To investigate this phenomenon, we systematically varied laryngeal muscle activation and airflow in an in vivo canine larynx model during phonation. We calculated voice range profiles showing average nerve activations for all combinations of fundamental frequency (F0) and sound pressure level (SPL). Further, we determined closed-quotient (CQ) and minimum-posterior-area (MPA) based on high-speed video recordings. While different combinations of muscle activation favored different combinations of F0 and SPL, in the investigated larynx there was a consistent region of instability at about 400 Hz which essentially precluded phonation. An explanation for this region may be a larynx specific coupling between sound source and subglottal tract or an effect based purely on larynx morphology. Register transitions crossed this region, with different combinations of cricothyroid and thyroarytenoid muscle (TA) activation stabilizing higher or lower neighboring frequencies. Observed patterns in CQ and MPA dependent on TA activation reproduced patterns found in singers in previous work. Lack of control of TA stimulation may result in phonation instabilities, and enhanced control of TA stimulation may help to avoid involuntary register transitions, especially in the singing voice.
Affiliation(s)
- Patrick Schlegel
  - Department of Head and Neck Surgery, David Geffen School of Medicine at the University of California-Los Angeles, Los Angeles, California 90095, USA
- David A Berry
  - Department of Head and Neck Surgery, David Geffen School of Medicine at the University of California-Los Angeles, Los Angeles, California 90095, USA
- Clare Moffatt
  - Department of Head and Neck Surgery, David Geffen School of Medicine at the University of California-Los Angeles, Los Angeles, California 90095, USA
- Zhaoyan Zhang
  - Department of Head and Neck Surgery, David Geffen School of Medicine at the University of California-Los Angeles, Los Angeles, California 90095, USA
- Dinesh K Chhetri
  - Department of Head and Neck Surgery, David Geffen School of Medicine at the University of California-Los Angeles, Los Angeles, California 90095, USA
|
29
|
Riede T, Kobrina A, Pasch B. Anatomy and mechanisms of vocal production in harvest mice. J Exp Biol 2024; 227:jeb246553. [PMID: 38269528 DOI: 10.1242/jeb.246553] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/11/2023] [Accepted: 01/18/2024] [Indexed: 01/26/2024]
Abstract
Characterizing mechanisms of vocal production provides important insight into the ecology of acoustic divergence. In this study, we characterized production mechanisms of two types of vocalizations emitted by western harvest mice (Reithrodontomys megalotis), a species uniquely positioned to inform trait evolution because it is a sister taxon to peromyscines (Peromyscus and Onychomys spp.), which use vocal fold vibrations to produce long-distance calls, but more ecologically and acoustically similar to baiomyines (Baiomys and Scotinomys spp.), which employ a whistle mechanism. We found that long-distance calls (∼10 kHz) were produced by airflow-induced vocal fold vibrations, whereas high-frequency quavers used in close-distance social interactions (∼80 kHz) were generated by a whistle mechanism. Both production mechanisms were facilitated by a characteristic laryngeal morphology. Our findings indicate that the use of vocal fold vibrations for long-distance communication is widespread in reithrodontomyines (Onychomys, Peromyscus, Reithrodontomys spp.) despite overlap in frequency content that characterizes baiomyine whistled vocalizations. The results illustrate how different production mechanisms shape acoustic variation in rodents and contribute to ecologically relevant communication distances.
Affiliation(s)
- Tobias Riede
  - Department of Physiology, Midwestern University Glendale, Glendale, AZ 85308, USA
- Anastasiya Kobrina
  - Department of Biological Sciences, Northern Arizona University, Flagstaff, AZ 86011, USA
- Bret Pasch
  - Department of Biological Sciences, Northern Arizona University, Flagstaff, AZ 86011, USA
  - Wildlife Conservation and Management, School of Natural Resources and the Environment, The University of Arizona, Tucson, AZ 85721, USA
|
30
|
Elemans CPH, Jiang W, Jensen MH, Pichler H, Mussman BR, Nattestad J, Wahlberg M, Zheng X, Xue Q, Fitch WT. Evolutionary novelties underlie sound production in baleen whales. Nature 2024; 627:123-129. [PMID: 38383781 DOI: 10.1038/s41586-024-07080-1] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 08/10/2023] [Accepted: 01/16/2024] [Indexed: 02/23/2024]
Abstract
Baleen whales (mysticetes) use vocalizations to mediate their complex social and reproductive behaviours in vast, opaque marine environments [1]. Adapting to an obligate aquatic lifestyle demanded fundamental physiological changes to efficiently produce sound, including laryngeal specializations [2-4]. Whereas toothed whales (odontocetes) evolved a nasal vocal organ [5], mysticetes have been thought to use the larynx for sound production [1,6-8]. However, there has been no direct demonstration that the mysticete larynx can phonate, or if it does, how it produces the great diversity of mysticete sounds [9]. Here we combine experiments on the excised larynx of three mysticete species with detailed anatomy and computational models to show that mysticetes evolved unique laryngeal structures for sound production. These structures allow some of the largest animals that ever lived to efficiently produce frequency-modulated, low-frequency calls. Furthermore, we show that this phonation mechanism is likely to be ancestral to all mysticetes and shares its fundamental physical basis with most terrestrial mammals, including humans [10], birds [11], and their closest relatives, odontocetes [5]. However, these laryngeal structures set insurmountable physiological limits to the frequency range and depth of their vocalizations, preventing them from escaping anthropogenic vessel noise [12,13] and communicating at great depths [14], thereby greatly reducing their active communication range.
Affiliation(s)
- Coen P H Elemans
  - Sound Communication and Behaviour Group, Department of Biology, University of Southern Denmark, Odense, Denmark.
- Weili Jiang
  - Department of Mechanical Engineering, Rochester Institute of Technology, Rochester, NY, USA
- Mikkel H Jensen
  - Sound Communication and Behaviour Group, Department of Biology, University of Southern Denmark, Odense, Denmark
- Helena Pichler
  - Department of Behavioral and Cognitive Biology, Faculty of Life Sciences, University of Vienna, Vienna, Austria
- Bo R Mussman
  - Department of Radiology, Odense University Hospital, Odense, Denmark
  - Department of Clinical Research, University of Southern Denmark, Odense, Denmark
- Jacob Nattestad
  - Department of Radiology, Odense University Hospital, Odense, Denmark
- Magnus Wahlberg
  - Sound Communication and Behaviour Group, Department of Biology, University of Southern Denmark, Odense, Denmark
- Xudong Zheng
  - Department of Mechanical Engineering, Rochester Institute of Technology, Rochester, NY, USA
- Qian Xue
  - Department of Mechanical Engineering, Rochester Institute of Technology, Rochester, NY, USA
- W Tecumseh Fitch
  - Department of Behavioral and Cognitive Biology, Faculty of Life Sciences, University of Vienna, Vienna, Austria.
  - Vienna Cognitive Science Hub, University of Vienna, Vienna, Austria.
|
31
|
Kreiman J. Information conveyed by voice quality. J Acoust Soc Am 2024; 155:1264-1271. [PMID: 38345424 DOI: 10.1121/10.0024609] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/21/2023] [Accepted: 01/09/2024] [Indexed: 02/15/2024]
Abstract
The problem of characterizing voice quality has long caused debate and frustration. The richness of the available descriptive vocabulary is overwhelming, but the density and complexity of the information voices convey lead some to conclude that language can never adequately specify what we hear. Others argue that terminology lacks an empirical basis, so that language-based scales are inadequate a priori. Efforts to provide meaningful instrumental characterizations have also had limited success. Such measures may capture sound patterns but cannot at present explain what characteristics, intentions, or identity listeners attribute to the speaker based on those patterns. However, some terms continually reappear across studies. These terms align with acoustic dimensions accounting for variance across speakers and languages and correlate with size and arousal across species. This suggests that labels for quality rest on a bedrock of biology: We have evolved to perceive voices in terms of size/arousal, and these factors structure both voice acoustics and descriptive language. Such linkages could help integrate studies of signals and their meaning, producing a truly interdisciplinary approach to the study of voice.
Affiliation(s)
- Jody Kreiman
  - Departments of Head and Neck Surgery and Linguistics, University of California, Los Angeles, Los Angeles, California 90095-1794, USA
|
32
|
Delviniotis DS, Theodoridis S, Delvinioti N. Aerodynamic Parameters in Byzantine Chant Voices: Comparisons Across Pitch and Loudness. J Voice 2024:S0892-1997(23)00413-7. [PMID: 38246827 DOI: 10.1016/j.jvoice.2023.12.021] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/01/2023] [Revised: 12/27/2023] [Accepted: 12/27/2023] [Indexed: 01/23/2024]
Abstract
OBJECTIVE This study was designed to assess the impact of phonation frequency and loudness increase on aerodynamic parameters of the singing voice in Byzantine chant (BC). DESIGN Aerodynamic measurements in BC were obtained and statistically analyzed. METHOD Fifteen experienced BC chanters, all baritones, performed the ascending notes G2, C3, E3, G3, C4, E4, and G4, at normal and high levels of loudness within a mask, while repeating strings of /pi/ syllables. The parameters of airflow (FR), subglottal pressure (Psub), and sound pressure level (SPL) were directly measured, and from them, the glottal flow resistance (Rg) and vocal efficiency (VE) were calculated. All the parameters' values were statistically analyzed. RESULTS Statistically significant differences for FR, Psub, and SPL parameters in BC between the two loudness levels, at constant pitch, and for Psub, SPL, Rg, and VE among different pitches, at constant loudness levels were detected. When loudness increases, a) only the mean values of FR, Psub, and SPL, within C3-C4, increase, whereas those of Rg and VE do not show any change, and b) at G2, only the mean Psub increases, while in the upper range E4-G4, both mean SPL and mean VE decrease. When pitch is raised, a) for each level of loudness, within G2-E4 pitch range, the means of Psub, SPL, Rg, and VE increase while this is not the case for FR, and b) in the highest range (E4-G4), average SPL and VE drop while Rg and Psub remain stable. Our findings suggest that: a) most participants increase Psub and SPL without modification of Rg when loudness increases, and b) most participants increase both SPL and Psub while changing Rg with pitch rise. Idiosyncratic differences among the participants were detected in Rg and Psub, because of pitch rise, and, also, in Rg and VE due to loudness increase. 
CONCLUSIONS The results from this study reveal that, within the C3-C4 pitch range: a) there is independent control between the loudness and glottal adduction, and b) Psub is the main tool for increasing both the loudness and SPL. For some exceptions among the participants, either the Rg alteration or other modifications of the vocal system are, possibly, the cause of the loudness increase. The increased mean values of SPL, Rg, and Psub with pitch rise, for most participants, suggest that both glottal adduction and Psub increase together with the SPL and pitch increase. The VE increase within G2-E4 pitches reaches a maximum value at E4. Some exceptions among the participants exist that suggest the possible use of different phonatory strategies when changing either the pitch or the vocal loudness.
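The two derived quantities named in this abstract, glottal flow resistance (Rg) and vocal efficiency (VE), can be sketched from the three measured ones. The abstract does not give the formulas, so the following is only a minimal sketch under common textbook definitions: Rg as Psub divided by airflow, and VE as radiated acoustic power divided by aerodynamic power (Psub times airflow). The spherical-radiation assumption, the microphone distance, the air constants, and all example values are assumptions for illustration, not figures from the study.

```python
import math

# Assumed constants (not from the abstract): air at room conditions.
RHO_C = 1.2 * 343.0   # air density (kg/m^3) times speed of sound (m/s)
P_REF = 20e-6         # reference sound pressure for dB SPL, in Pa


def glottal_resistance(psub_pa, flow_m3s):
    """Glottal flow resistance Rg = Psub / FR, in Pa*s/m^3."""
    return psub_pa / flow_m3s


def vocal_efficiency(psub_pa, flow_m3s, spl_db, distance_m):
    """Radiated acoustic power divided by aerodynamic power (Psub * FR).

    Assumes the source radiates spherically and that SPL was measured
    at distance_m from the mouth; both are simplifications.
    """
    p_rms = P_REF * 10 ** (spl_db / 20)          # RMS sound pressure in Pa
    intensity = p_rms ** 2 / RHO_C               # W/m^2, far-field approximation
    acoustic_power = 4 * math.pi * distance_m ** 2 * intensity
    aerodynamic_power = psub_pa * flow_m3s       # W
    return acoustic_power / aerodynamic_power


# Hypothetical reading: 800 Pa, 0.2 L/s, 80 dB SPL at 0.3 m.
rg = glottal_resistance(800.0, 0.0002)
ve = vocal_efficiency(800.0, 0.0002, 80.0, 0.3)
```

Under these definitions, raising Psub and SPL together while airflow stays in proportion leaves Rg unchanged, which is the pattern the abstract reports for loudness increases within C3-C4.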
Affiliation(s)
- Dimitrios S Delviniotis
  - Department of Informatics and Telecommunications, National and Kapodistrian University of Athens, Ilisia, Athens, Greece
- Sergios Theodoridis
  - Department of Informatics and Telecommunications, National and Kapodistrian University of Athens, Ilisia, Athens, Greece
- Nektaria Delvinioti
  - Department of Music Studies, National and Kapodistrian University of Athens, Ilisia, Athens, Greece
|
33
|
Sauder CL, Kapsner-Smith MR, Simmons E, Meyer T, Doyle PC, Eadie TL. The Effect of Rating Method on Reliability of Judgments of Strain Across Populations. Am J Speech Lang Pathol 2024; 33:393-405. [PMID: 38060689 PMCID: PMC11000812 DOI: 10.1044/2023_ajslp-23-00174] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/16/2023] [Revised: 08/17/2023] [Accepted: 10/17/2023] [Indexed: 01/05/2024]
Abstract
PURPOSE Variability in auditory-perceptual ratings of voice limits their utility, with the poorest reliability often noted for vocal strain. The purpose of this study was to determine whether an experimental method, called visual sort and rate (VSR), promoted stronger rater reliability than visual analog scale (VAS), for ratings of strain in two clinical populations: adductor laryngeal dystonia (ADLD) and vocal hyperfunction (VH). METHOD Connected speech samples from speakers with ADLD and VH as well as age- and sex-matched controls were selected from a database. Fifteen inexperienced listeners rated strain for two speaker sets (25 ADLD speakers and five controls; 25 VH speakers and five controls) across four rating blocks: VAS-ADLD, VSR-ADLD, VAS-VH, and VSR-VH. For the VAS task, listeners rated each speaker for strain using a vertically oriented 100-mm VAS. For the VSR task, stimuli were distributed into sets of samples with a range of severities in each set. Listeners sorted and ranked samples for strain within each set, and final ratings were captured on a vertically oriented 100-mm VAS. Intrarater reliability (Pearson's r) and interrater variability (mean of the squared differences between a listener's ratings and group mean ratings) were compared across rating methods and populations using two repeated-measures analyses of variance. RESULTS Intrarater reliability of strain was significantly stronger when listeners used VSR compared to VAS; listeners also showed significantly better intrarater reliability in ADLD than VH. Listeners demonstrated significantly less interrater variability (better reliability) when using VSR compared to VAS. No significant effect of population or interactions was found between listeners for measures of interrater variability. CONCLUSIONS VSR increases intrarater reliability for ratings of vocal strain in speakers with VH and ADLD. 
VSR decreases variability of auditory-perceptual judgments of strain between inexperienced listeners in these clinical populations. Future research should determine whether benefits of VSR extend to voice clinicians and/or clinical settings.
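The two reliability measures defined in this abstract can be illustrated with a short sketch: intrarater reliability as Pearson's r between a listener's first and repeated ratings, and interrater variability as the mean of the squared differences between a listener's ratings and the group mean ratings. The data layout, the function names, and the choice to include each listener in the group mean are assumptions made for illustration; the abstract does not specify them.

```python
from statistics import mean


def pearson_r(x, y):
    """Pearson correlation between two equal-length rating lists."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)


def interrater_variability(ratings):
    """Mean squared difference between each listener and the group mean.

    ratings maps a listener id to that listener's ratings, one per stimulus
    (here on a 0-100 mm scale). The group mean includes the listener's own
    rating, which is an assumption of this sketch.
    """
    n_stimuli = len(next(iter(ratings.values())))
    group_mean = [mean(r[i] for r in ratings.values()) for i in range(n_stimuli)]
    return {
        listener: mean((r[i] - group_mean[i]) ** 2 for i in range(n_stimuli))
        for listener, r in ratings.items()
    }


# Hypothetical ratings from three listeners for three stimuli.
example = {"A": [10, 50, 90], "B": [20, 50, 80], "C": [30, 50, 70]}
variability = interrater_variability(example)
```

Lower values of the second measure mean a listener sits closer to the group consensus, which is why the abstract equates less interrater variability with better reliability.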
Affiliation(s)
- Cara L. Sauder
  - Department of Speech & Hearing Sciences, University of Washington, Seattle
- Emily Simmons
  - Department of Speech & Hearing Sciences, University of Washington, Seattle
- Tanya Meyer
  - Department of Otolaryngology—Head & Neck Surgery, University of Washington School of Medicine, Seattle
- Philip C. Doyle
  - Division of Laryngology, Department of Otolaryngology—Head & Neck Surgery, Stanford University School of Medicine, CA
- Tanya L. Eadie
  - Department of Speech & Hearing Sciences, University of Washington, Seattle
  - Department of Otolaryngology—Head & Neck Surgery, University of Washington School of Medicine, Seattle
|
34
|
Chung HR, Reddy NK, Manzoor D, Schlegel P, Zhang Z, Chhetri DK. Histologic Examination of Vocal Fold Mucosal Wave and Vibration. Laryngoscope 2024; 134:264-271. [PMID: 37522475 PMCID: PMC10828106 DOI: 10.1002/lary.30928] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/24/2023] [Revised: 05/29/2023] [Accepted: 07/18/2023] [Indexed: 08/01/2023]
Abstract
OBJECTIVES Despite gross anatomic and histologic differences between human and canine vocal folds, similar wave patterns have been described yet not fully characterized. We reconstructed vocal fold (VF) vibration in a canine hemilarynx and performed histologic examination of the same vocal fold. We demonstrate comparable wave patterns while exploring the importance of certain anatomic architectures. METHODS An in vivo canine hemilarynx was phonated against a glass prism at low and high muscle activation conditions. Vibration was captured using high-speed video, and trajectories of VF medial surface tattooed landmarks were 3D-reconstructed. The method of empirical eigenfunctions was used to capture the essential dynamics of vibratory movement. Histologic examination of the hemilarynx was performed. RESULTS Oscillation patterns were highly similar between the in vivo canine and previous reports of ex vivo human models. The two most dominant eigenfunctions comprised over 90% of total variance of movement, representing opening/closing and convergent/divergent movement patterns, respectively. We demonstrate a vertical phase difference during the glottal cycle. The time delay between the inferior and superior VF was greater during opening than closing for both activation conditions. Histological examination of canine VF showed not only a thicker lamina propria layer superiorly but also a distinct pattern of thyroarytenoid muscle fibers and fascicles as described in human studies. CONCLUSIONS Histologic and vibratory examination of the canine vocal fold demonstrated human vocal fold vibratory patterns despite certain microstructural differences. This study suggests that the multilayered lamina propria may not be fundamental to vibratory patterns necessary for human-like voice production. LEVEL OF EVIDENCE NA (Basic science study) Laryngoscope, 134:264-271, 2024.
Affiliation(s)
- Hye Rhyn Chung
  - University of California, Los Angeles, David Geffen School of Medicine
- Neha K. Reddy
  - University of California, Los Angeles, David Geffen School of Medicine
- Daniel Manzoor
  - Department of Pathology, University of California, Los Angeles
- Patrick Schlegel
  - Department of Head and Neck Surgery, University of California, Los Angeles
- Zhaoyan Zhang
  - Department of Head and Neck Surgery, University of California, Los Angeles
- Dinesh K. Chhetri
  - Department of Head and Neck Surgery, University of California, Los Angeles
|
35
|
Sundberg J, Salomão GL, Scherer KR. Emotional expressivity in singing. Assessing physiological and acoustic indicators of two opera singers' voice characteristics. J Acoust Soc Am 2024; 155:18-28. [PMID: 38169520 DOI: 10.1121/10.0023938] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/24/2023] [Accepted: 11/21/2023] [Indexed: 01/05/2024]
Abstract
In an earlier study, we analyzed how audio signals obtained from three professional opera singers varied when they sang one-octave-wide eight-tone scales in ten different emotional colors. The results showed systematic variations in voice source and long-term-average spectrum (LTAS) parameters associated with major emotion "families". For two of the singers, subglottal pressure (Psub) was also recorded, thus allowing analysis of an additional main physiological voice control parameter, glottal resistance (defined as the ratio between Psub and glottal flow), which is related to glottal adduction. In the present study, we analyze voice source and LTAS parameters derived from the audio signal and their correlation with Psub and glottal resistance. The measured parameters showed a systematic relationship with the four emotion families observed in our previous study. They also varied systematically with values of the ten emotions along the valence, power, and arousal dimensions; valence showed a significant correlation with the ratio between acoustic voice source energy and subglottal pressure, while power varied significantly with sound level and two measures related to the spectral dominance of the lowest spectrum partial, the fundamental.
Affiliation(s)
- Johan Sundberg
  - Department of Speech Music and Hearing, School of Electrical Engineering, Royal Institute of Technology (KTH), Stockholm, Sweden
- Gláucia Laís Salomão
  - Stockholm University Brain Imaging Centre (SUBIC), Department of Linguistics, Stockholm University, Stockholm, Sweden
- Klaus R Scherer
  - Department of Psychology, University of Geneva, Geneva, Switzerland
|
36
|
Cavalcanti JC, Eriksson A, Barbosa PA. Multiparametric Analysis of Speaking Fundamental Frequency in Genetically Related Speakers Using Different Speech Materials: Some Forensic Implications. J Voice 2024; 38:243.e11-243.e29. [PMID: 34629229 DOI: 10.1016/j.jvoice.2021.08.013] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Received: 04/22/2021] [Revised: 08/03/2021] [Accepted: 08/09/2021] [Indexed: 11/18/2022]
Abstract
OBJECTIVES To assess the speaker-discriminatory potential of a set of fundamental frequency estimates in intraidentical twin pair comparisons and cross-pair comparisons (i.e., among all speakers). PARTICIPANTS A total of 20 Brazilian Portuguese speakers of the same dialect, namely 10 male identical twin pairs aged between 19 and 35, were recruited. METHOD The participants were recorded directly through professional microphones while taking part in a spontaneous dialogue over mobile phones. Acoustic measurements were performed in connected speech samples, and in lengthened vowels, at least 160 ms long produced during spontaneous speech. RESULTS f0 baseline, central tendency, and extreme values were found mostly discriminatory in intra-twin pair and cross-pair comparisons. These were also the estimates displaying the largest effect sizes. Overall, only three identical twins were found statistically different regarding their f0 patterns in connected speech, but not for lengthened vowel-based f0 metrics. Estimates of f0 variation and modulation were found the least discriminatory across speakers, which may signal the control of speaking style and dialect on dynamic patterns of f0. Concerning system performance, the base value of f0 (f0 baseline) was found the most reliable metric, displaying the lowest equal error rate (EER). CONCLUSIONS The outcomes suggest that, although identical twins were very closely related regarding their f0 patterns, some pairs could still be differentiated acoustically, only in connected speech. Such findings reinforce the relevance of analyzing long-term f0 metrics for speaker comparison purposes, with particular consideration to f0 baseline. Furthermore, f0 differences across subjects were suggested as more expressive in connected speech than in lengthened vowels.
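The equal error rate (EER) mentioned in this abstract is the operating point at which the false acceptance rate and the false rejection rate coincide. The abstract does not describe how the EER was computed, so the following is only a generic threshold-sweep sketch; the function name, the convention that higher scores indicate the same speaker, and the example scores are all assumptions.

```python
def equal_error_rate(same_speaker_scores, different_speaker_scores):
    """Approximate the EER by sweeping a decision threshold over all scores.

    A comparison is accepted as "same speaker" when its score is at or above
    the threshold. The function returns the mean of the false acceptance and
    false rejection rates at the threshold where the two are closest.
    """
    thresholds = sorted(set(same_speaker_scores) | set(different_speaker_scores))
    best_gap, best_eer = None, None
    for t in thresholds:
        # Same-speaker comparisons wrongly rejected at this threshold.
        frr = sum(s < t for s in same_speaker_scores) / len(same_speaker_scores)
        # Different-speaker comparisons wrongly accepted at this threshold.
        far = sum(s >= t for s in different_speaker_scores) / len(different_speaker_scores)
        gap = abs(far - frr)
        if best_gap is None or gap < best_gap:
            best_gap, best_eer = gap, (far + frr) / 2
    return best_eer


# Hypothetical similarity scores with one overlapping pair.
eer = equal_error_rate([0.6, 0.4, 0.9, 0.8], [0.1, 0.5, 0.3, 0.2])
```

A lower EER means the metric separates same-speaker from different-speaker comparisons more cleanly, which is the sense in which the abstract calls the f0 baseline the most reliable estimate.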
Affiliation(s)
- Julio Cesar Cavalcanti
  - Department of Linguistics, Stockholm University, Stockholm, Sweden; Institute of Language Studies, Campinas State University, Campinas, São Paulo, Brazil
- Anders Eriksson
  - Department of Linguistics, Stockholm University, Stockholm, Sweden
- Plinio A Barbosa
  - Institute of Language Studies, Campinas State University, Campinas, São Paulo, Brazil
|
37
|
Luizard P, Bailly L, Yousefi-Mashouf H, Girault R, Orgéas L, Henrich Bernardoni N. Flow-induced oscillations of vocal-fold replicas with tuned extensibility and material properties. Sci Rep 2023; 13:22658. [PMID: 38114547 PMCID: PMC10730560 DOI: 10.1038/s41598-023-48080-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/01/2023] [Accepted: 11/22/2023] [Indexed: 12/21/2023] Open
Abstract
Human vocal folds are highly deformable non-linear oscillators. During phonation, they stretch up to 50% under the complex action of laryngeal muscles. Exploring the fluid/structure/acoustic interactions on a human-scale replica to study the role of the laryngeal muscles remains a challenge. For that purpose, we designed a novel in vitro testbed to control vocal-folds pre-phonatory deformation. The testbed was used to study the vibration and the sound production of vocal-fold replicas made of (i) silicone elastomers commonly used in voice research and (ii) a gelatin-based hydrogel we recently optimized to approximate the mechanics of vocal folds during finite strains under tension, compression and shear loadings. The geometrical and mechanical parameters measured during the experiments emphasized the effect of the vocal-fold material and pre-stretch on the vibration patterns and sounds. In particular, increasing the material stiffness increases glottal flow resistance, subglottal pressure required to sustain oscillations and vibratory fundamental frequency. In addition, although the hydrogel vocal folds only oscillate at low frequencies (close to 60 Hz), the subglottal pressure they require for that purpose is realistic (within the range 0.5-2 kPa), as well as their glottal opening and contact during a vibration cycle. The results also evidence the effect of adhesion forces on vibration and sound production.
Affiliation(s)
- Paul Luizard
  - Univ. Grenoble Alpes, CNRS, Grenoble INP, GIPSA-lab, Grenoble, 38000, France
  - CNRS, Centrale Marseille, Aix Marseille Univ, LMA UMR 7031, Marseille, France
  - Audio Communication Group, Technische Universität Berlin, Einsteinufer 17c, Berlin, 10587, Germany
- Lucie Bailly
  - Univ. Grenoble Alpes, CNRS, Grenoble INP, 3SR, Grenoble, 38000, France
- Hamid Yousefi-Mashouf
  - Univ. Grenoble Alpes, CNRS, Grenoble INP, GIPSA-lab, Grenoble, 38000, France
  - Univ. Grenoble Alpes, CNRS, Grenoble INP, 3SR, Grenoble, 38000, France
- Raphaël Girault
  - Univ. Grenoble Alpes, CNRS, Grenoble INP, GIPSA-lab, Grenoble, 38000, France
- Laurent Orgéas
  - Univ. Grenoble Alpes, CNRS, Grenoble INP, 3SR, Grenoble, 38000, France
|
38
|
Mandour YMH, El Hamshary A, Abdel-Elhay SA, Abdel-Hamid MS, Gomaa M. Laryngeal Changes After Septoplasty and Turbinectomy. Indian J Otolaryngol Head Neck Surg 2023; 75:3242-3247. [PMID: 37974822 PMCID: PMC10645820 DOI: 10.1007/s12070-023-03951-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/26/2023] [Accepted: 06/08/2023] [Indexed: 11/19/2023] Open
Abstract
Evidence on the impact of septoplasty on the mechanism of voice production, and on vocal cord and laryngeal mucosal changes, is limited. Nasal obstruction is a common medical issue that is linked to changes in the resonance quality of the voice. The aim was to assess voice alterations in patients with a deviated nasal septum and inferior turbinate hypertrophy, using laryngeal stroboscopy before and after septoplasty and turbinectomy. In this prospective case-control study, patients in group A had inferior turbinate hypertrophy and a nasal septal deviation, while participants in group B were healthy controls matched for age and gender. All included patients had laryngeal stroboscopy and acoustic voice characteristics evaluated both preoperatively and three months after surgery. Healthy controls underwent only the baseline evaluation. We included 30 patients with a mean age of 24.43 ± 7.81 years; males accounted for two thirds of the included cases. Speech testing showed that amplitude perturbation improved significantly after septoplasty (p < 0.05), while fundamental frequency and noise-to-harmonics ratio (NHR) parameters did not show statistically significant improvement compared to preoperative measurements and control groups. Paired comparison of laryngeal erythema, mucosal edema and mucosal waves showed significant improvement compared to preoperative laryngeal stroboscopic findings (p < 0.001 each). Septal deviation improved significantly following surgery. Nasal obstruction caused by nasal septal deviation and inferior turbinate hypertrophy is associated with amplitude perturbation, laryngeal erythema, mucosal edema, and mucosal waves in these patients.
|
39
|
Burk F, Traser L, Burdumy M, Richter B, Echternach M. Dynamic changes of vocal tract dimensions with sound pressure level during messa di voce. J Acoust Soc Am 2023; 154:3595-3603. [PMID: 38038612 DOI: 10.1121/10.0022582] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/12/2023] [Accepted: 11/14/2023] [Indexed: 12/02/2023]
Abstract
The messa di voce (MdV), which consists of a continuous crescendo and subsequent decrescendo on one pitch is one of the more difficult exercises of the technical repertoire of Western classical singing. With rising lung pressure, regulatory adjustments both on the level of the glottis and the vocal tract are required to keep the pitch stable. The dynamic changes of vocal tract dimensions with the bidirectional variation of sound pressure level (SPL) during MdV were analyzed by two-dimensional real-time magnetic resonance imaging (25 frames/s) and synchronous audio recordings in 12 professional singer subjects. Close associations in the respective articulatory kinetics were found between SPL and lip opening, jaw opening, pharynx width, uvula elevation, and vertical larynx position. However, changes in vocal tract dimensions during plateaus of SPL suggest that perceived loudness could have been varied beyond the dimension of SPL. Further multimodal investigation, including the analysis of sound spectra, is needed for a better understanding of the role of vocal tract resonances in the control of vocal loudness in human phonation.
Affiliation(s)
- Fabian Burk
  - Department of Otorhinolaryngology and Plastic Surgery, SRH Wald-Klinikum Gera, Gera, Germany
  - Institute of Musicians' Medicine, University Medical Center Freiburg, Freiburg im Breisgau, Germany
- Louisa Traser
  - Institute of Musicians' Medicine, University Medical Center Freiburg, Freiburg im Breisgau, Germany
- Michael Burdumy
  - Department of Radiology, Medical Physics, University Medical Center Freiburg, Freiburg im Breisgau, Germany
- Bernhard Richter
  - Institute of Musicians' Medicine, University Medical Center Freiburg, Freiburg im Breisgau, Germany
- Matthias Echternach
  - Division of Phoniatrics and Pediatric Audiology, Department of Otorhinolaryngology, Munich University Hospital (LMU), Munich, Germany
|
40
|
Paiva GM, Silva POC, Silva LJAD, Nascimento KA, Silva ABDVE, Abreu SRD, Almeida AAFD, Lopes LW. Spectral and cepstral measurements in women with behavioral dysphonia. Codas 2023; 36:e20220327. [PMID: 37970895 DOI: 10.1590/2317-1782/20232022327pt] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/16/2022] [Accepted: 03/20/2023] [Indexed: 11/19/2023] Open
Abstract
PURPOSE To investigate whether there are differences in cepstral and spectral acoustic measures between women with behavioral dysphonia with and without laryngeal lesions and verify whether there is a correlation between such measures and the auditory-perceptual evaluation of voice quality. METHODS The sample comprised 78 women with behavioral dysphonia without laryngeal lesions (BDWOL) and 68 with behavioral dysphonia with laryngeal lesions (vocal nodules) (BDWL). Cepstral peak prominence (CPP), cepstral peak prominence-smoothed (CPPS), spectral decrease, and H1-H2 (difference between the amplitude of the first and second harmonics) were extracted. They were submitted to the auditory-perceptual evaluation (APE) of the grade of hoarseness (GH), roughness (RO), breathiness (BR), and strain (ST). RESULTS BDWL women had higher H1-H2 values and lower CPP and CPPS values than BDWOL women. More deviant voices had lower CPP and CPPS values. Breathy voices had lower CPP and CPPS values and higher H1-H2 values than rough ones. There was a weak negative correlation between CPP and RO, a moderate negative correlation with GH, and a strong negative correlation with BR. CPPS had a moderate negative correlation with GH, RO, and BR. H1-H2 had a weak positive correlation with BR. There was a weak positive correlation between spectral decrease and ST. CONCLUSION H1-H2, CPP, and CPPS were different between BDWOL and BDWL women. Furthermore, cepstral and spectral measures were correlated with the different APE parameters.
|
41
|
Bouhabel S, Park S, Kolosova K, Latifi N, Kost K, Li-Jessen NYK, Mongeau L. Functional Analysis of Injectable Substance Treatment on Surgically Injured Rabbit Vocal Folds. J Voice 2023; 37:829-839. [PMID: 34353684 PMCID: PMC8807745 DOI: 10.1016/j.jvoice.2021.06.001] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/10/2019] [Revised: 05/28/2021] [Accepted: 06/02/2021] [Indexed: 02/04/2023]
Abstract
OBJECTIVES The objective of this study was to evaluate the efficacy of immediate injection treatments of dexamethasone, hyaluronic acid (HA)/gelatin (Ge) hydrogel and glycol-chitosan solution on the phonatory function of rabbit larynges at 42 days after surgical injury of the vocal folds, piloting a novel ex vivo phonatory functional analysis protocol. METHODS A modified microflap procedure was performed on the left vocal fold of 12 rabbits to induce an acute injury. Animals were randomized into one of four treatment groups with 0.1 mL injections of dexamethasone, HA/Ge hydrogel, glycol-chitosan or saline as control. The left mid vocal fold lamina propria was injected immediately following injury. The right vocal fold served as an uninjured control. Larynges were harvested at Day 42 after injection, then were subjected to airflow-bench evaluation. Acoustic, aerodynamic and laryngeal high-speed videoendoscopy (HSV) analyses were performed. HSV segments of the vibrating vocal folds were rated by three expert laryngologists. Six parameters related to vocal fold vibratory characteristics were evaluated on a Likert scale. RESULTS The fundamental frequency, one possible surrogate of vocal fold stiffness and scarring, was lower in the dexamethasone and HA/Ge hydrogel treatment groups compared to that of the saline control (411.52±11.63 Hz). The lowest fundamental frequency value was observed in the dexamethasone group (348.79±14.99 Hz). Expert visual ratings of the HSV segments indicated an overall positive outcome in the dexamethasone treatment group, though the impacts were below statistical significance. CONCLUSION Dexamethasone injections might be used as an adjunctive option for iatrogenic vocal fold scarring. An increased sample size, histological correlate, and experimental method improvements will be needed to confirm this finding. 
Results suggested a promising use of HSV and acoustic analysis techniques to identify and monitor post-surgical vocal fold repair and scarring, providing a useful tool for future studies of vocal fold scar treatments.
Affiliation(s)
- Sarah Bouhabel
  - Department of Otolaryngology - Head and Neck Surgery, McGill University, Montreal, Quebec, Canada
- Scott Park
  - Department of Mechanical Engineering, McGill University, Montreal, Quebec, Canada
- Ksenia Kolosova
  - Department of Physics, McGill University, Montreal, Quebec, Canada
- Neda Latifi
  - Department of Mechanical Engineering, McGill University, Montreal, Quebec, Canada
- Karen Kost
  - Department of Otolaryngology - Head and Neck Surgery, McGill University, Montreal, Quebec, Canada
- Nicole Y K Li-Jessen
  - Department of Otolaryngology - Head and Neck Surgery, McGill University, Montreal, Quebec, Canada; School of Communication Sciences and Disorders, McGill University, Montreal, Quebec, Canada
- Luc Mongeau
  - Department of Otolaryngology - Head and Neck Surgery, McGill University, Montreal, Quebec, Canada; Department of Mechanical Engineering, McGill University, Montreal, Quebec, Canada
|
42
|
Albino DDO, do Nascimento UN, Plec EMRL, Santos MAR, Gama ACC. Comparison between the acoustic fundamental frequency of the voice and the vibration frequency of the vocal folds analyzed by digital kymography. Codas 2023; 35:e20220173. [PMID: 37909493 PMCID: PMC10702710 DOI: 10.1590/2317-1782/20232022173pt] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/29/2022] [Accepted: 11/07/2022] [Indexed: 11/03/2023] Open
Abstract
PURPOSE To compare the frequency of vocal fold opening variation, analyzed by digital kymography, with the fundamental voice frequency obtained by acoustic analysis, in individuals without laryngeal alteration. METHODS Observational analytical cross-sectional study. The participants were 48 women and 38 men from 18 to 55 years of age. The evaluation comprised voice acoustic analysis, using the habitual emission of the vowel /a/ for 3 seconds and of the days of the week, and digital kymography (DKG), using the habitual emission of the vowels /i/ and /ɛ/. The measurements analyzed were acoustic fundamental frequency (f0), extracted by the Computerized Speech Lab (CSL) program, and dominant frequency of the variation of right (R-freq) and left (L-freq) vocal fold opening, obtained through the KIPS image processing program. The kymograms were assembled by manually demarcating the region with vertical lines delimiting its width and horizontal lines separating the posterior, middle and anterior thirds of the rima glottidis. In the statistical analysis, the Anderson-Darling test was used to verify the normality of the sample. The ANOVA and Tukey tests were performed for the comparison of measurements between the groups. For the comparison of age between the groups, the Mann-Whitney test was used. RESULTS There were no differences between the frequency values measured by digital kymography and the acoustic fundamental frequency in individuals without laryngeal alteration. CONCLUSION The values of the dominant frequency of the vocal fold opening variation, as assessed by digital kymography, and the acoustic fundamental frequency of the voice are similar, allowing comparison between these measurements in the multidimensional evaluation of the voice in individuals without laryngeal alteration.
Affiliation(s)
- Déborah de Oliveira Albino
  - Departamento de Fonoaudiologia, Faculdade de Medicina, Universidade Federal de Minas Gerais - UFMG - Belo Horizonte (MG), Brasil
- Ualisson Nogueira do Nascimento
  - Programa de Pós-graduação em Ciências Fonoaudiológicas, Faculdade de Medicina, Universidade Federal de Minas Gerais - UFMG - Belo Horizonte (MG), Brasil
- Elisa Meiti Ribeiro Lin Plec
  - Programa de Pós-graduação em Ciências Fonoaudiológicas, Faculdade de Medicina, Universidade Federal de Minas Gerais - UFMG - Belo Horizonte (MG), Brasil
- Ana Cristina Côrtes Gama
  - Departamento de Fonoaudiologia, Faculdade de Medicina, Universidade Federal de Minas Gerais - UFMG - Belo Horizonte (MG), Brasil
|
43
|
Jiang W, Zheng X, Farbos de Luzan C, Oren L, Gutmark E, Xue Q. The Effects of Negative Pressure Induced by Flow Separation Vortices on Vocal Fold Dynamics during Voice Production. Bioengineering (Basel) 2023; 10:1215. [PMID: 37892945 PMCID: PMC10604472 DOI: 10.3390/bioengineering10101215] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/15/2023] [Revised: 09/22/2023] [Accepted: 10/17/2023] [Indexed: 10/29/2023] Open
Abstract
This study used a two-dimensional flow-structure-interaction computer model to investigate the effects of flow-separation-vortex-induced negative pressure on vocal fold vibration and flow dynamics. The study found that negative pressure induced by flow separation vortices enhances vocal fold vibration by increasing aeroelastic energy transfer during vibration. The results showed that the intraglottal pressure was predominantly negative after flow separation before gradually recovering to zero at the glottis exit. When the negative pressure was removed, the vibration amplitude and flow rate were reduced by up to 20%, and the closing speed, flow skewness quotient, and maximum flow declination rate were reduced by up to 40%. The study provides insights into the complex interactions between flow dynamics, vocal fold vibration, and energy transfer during voice production.
Affiliation(s)
- Weili Jiang
  - Mechanical Engineering Department, Rochester Institute of Technology, Rochester, NY 14623, USA
  - Mechanical Engineering Department, University of Maine, Orono, ME 04469, USA
- Xudong Zheng
  - Mechanical Engineering Department, Rochester Institute of Technology, Rochester, NY 14623, USA
  - Mechanical Engineering Department, University of Maine, Orono, ME 04469, USA
- Charles Farbos de Luzan
  - Department of Otolaryngology Head and Neck Surgery, University of Cincinnati, Cincinnati, OH 45267, USA
- Liran Oren
  - Department of Otolaryngology Head and Neck Surgery, University of Cincinnati, Cincinnati, OH 45267, USA
- Ephraim Gutmark
  - Department of Aerospace Engineering and Engineering Mechanics, University of Cincinnati, Cincinnati, OH 45267, USA
- Qian Xue
  - Mechanical Engineering Department, Rochester Institute of Technology, Rochester, NY 14623, USA
  - Mechanical Engineering Department, University of Maine, Orono, ME 04469, USA
|
44
|
Zhang Z. The influence of source-filter interaction on the voice source in a three-dimensional computational model of voice production. J Acoust Soc Am 2023; 154:2462-2475. [PMID: 37855666 PMCID: PMC10589054 DOI: 10.1121/10.0021879] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/12/2023] [Revised: 09/28/2023] [Accepted: 09/30/2023] [Indexed: 10/20/2023]
Abstract
The goal of this computational study is to quantify global effects of vocal tract constriction at various locations (false vocal folds, aryepiglottic folds, pharynx, oral cavity, and lips) on the voice source across a large range of vocal fold conditions. The results showed that while inclusion of a uniform vocal tract had notable effects on the voice source, further constricting the vocal tract only had small effects except for conditions of extreme constriction, at which constrictions at any location along the vocal tract decreased the mean and peak-to-peak amplitude of the glottal flow waveform. Although narrowing in the epilarynx increased the normalized maximum flow declination rate, vocal tract constriction in general slightly reduced the source strength and high-frequency harmonic production at the glottis, except for a limited set of vocal fold conditions (e.g., soft, long vocal folds subject to relatively high pressure). This suggests that simultaneous laryngeal and vocal tract adjustments are required to maximize source-filter interaction. While vocal tract adjustments are often assumed to improve voice production, our results indicate that such improvements are mainly due to changes in vocal tract acoustic response rather than improved voice production at the glottis.
Affiliation(s)
- Zhaoyan Zhang
- UCLA School of Medicine, 31-24 Rehabilitation Center, 1000 Veteran Avenue, Los Angeles, California 90095-1794, USA
|
45
|
Vurma A, Meister E, Meister L, Ross J, Raju M, Kala V, Dede T. The intensities of vowels and plosive bursts and their impact on text intelligibility in singing. The Journal of the Acoustical Society of America 2023; 154:2653-2664. [PMID: 37877771 DOI: 10.1121/10.0021968] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/30/2023] [Accepted: 10/05/2023] [Indexed: 10/26/2023]
Abstract
In classical singing, there are often problems with the intelligibility of sung text. The present study aims to test the hypotheses that (1) in loud operatic singing, compared with speaking, the intensity of voiceless plosives increases less than the intensity of vowels, leading to poorer recognition of plosives; and (2) pronouncing the plosive bursts with greater intensity improves their recognition. The acoustic analysis of nine opera arias in Italian from the Classical and Romantic periods performed by ten classically trained singers showed that the average difference in the intensity of vowels when sung and spoken was 14.6 dB [standard deviation (SD) = 7.2 dB], while the difference in the intensity of voiceless plosive bursts was only 6.6 dB (SD = 6 dB). In a perception test with 73 participants, increasing the intensity of the plosive bursts generally improved the recognition of plosives in the sung /a-plosive-a/ sequences, but mainly when reverberation and/or pink noise imitating instrumental accompaniments were added to the stimuli. At the same time, recognition of plosives was often better than chance even when the plosive burst was missing and replaced by silence.
Affiliation(s)
- Allan Vurma
- Estonian Academy of Music and Theatre, Tallinn, 10116, Estonia
- Einar Meister
- Tallinn University of Technology, Tallinn, 19086, Estonia
- Lya Meister
- Tallinn University of Technology, Tallinn, 19086, Estonia
- Jaan Ross
- Estonian Academy of Music and Theatre, Tallinn, 10116, Estonia
- Marju Raju
- Estonian Academy of Music and Theatre, Tallinn, 10116, Estonia
- Veeda Kala
- Estonian Academy of Music and Theatre, Tallinn, 10116, Estonia
- Tuuri Dede
- Estonian Academy of Music and Theatre, Tallinn, 10116, Estonia
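The intensity figures in the Vurma et al. abstract (vowels 14.6 dB louder when sung than spoken, voiceless plosive bursts only 6.6 dB louder) can be turned into linear ratios to see what the 8 dB gap means. The following is a minimal sketch, not the authors' analysis: the function and constant names are hypothetical, and it treats the reported mean dB differences as if they were single level differences, which ignores the spread given by the standard deviations.

```python
# Illustrative only: converts the mean level differences quoted in the
# abstract into linear ratios. Names below are assumptions, not from the paper.

def db_to_intensity_ratio(delta_db: float) -> float:
    """Ratio of acoustic intensities (power-like quantity) for a dB difference."""
    return 10 ** (delta_db / 10)


def db_to_amplitude_ratio(delta_db: float) -> float:
    """Ratio of sound-pressure amplitudes for the same dB difference."""
    return 10 ** (delta_db / 20)


VOWEL_GAIN_DB = 14.6  # sung minus spoken, vowels (from the abstract)
BURST_GAIN_DB = 6.6   # sung minus spoken, voiceless plosive bursts (from the abstract)

# How much further vowels rise above plosive bursts in singing than in speech.
gap_db = VOWEL_GAIN_DB - BURST_GAIN_DB

if __name__ == "__main__":
    print(f"gap: {gap_db:.1f} dB")
    print(f"intensity ratio: {db_to_intensity_ratio(gap_db):.2f}")
    print(f"amplitude ratio: {db_to_amplitude_ratio(gap_db):.2f}")
```

Under these assumptions, the 8 dB gap corresponds to an intensity ratio of roughly 6.3, which gives a sense of how far the bursts fall behind the vowels in loud singing.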
|
46
|
Behlau M, Madazio G, Yamasaki R. Dynamic vocal analysis: vocal functionality evaluation. Codas 2023; 35:e20210083. [PMID: 37729254 PMCID: PMC10546986 DOI: 10.1590/2317-1782/20232021083pt] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/29/2021] [Accepted: 10/10/2022] [Indexed: 09/22/2023] Open
Abstract
Dynamic vocal analysis (DVA) is an auditory-perceptual and acoustic vocal assessment strategy that provides estimates on the biomechanics and aerodynamics of vocal production by performing frequency and intensity variation tasks and using voice acoustic spectrography. The objective of this experience report is to demonstrate the use of DVA in the assessment of vocal functionality of dysphonic and non-dysphonic individuals, with a special focus on the laryngeal musculature. Phonatory tasks consisted of sustained vowel, "a" or "é", and/or connected speech, in three intensities (habitual, soft, and loud) and three frequencies (habitual, high, and low), as well as ascending and descending glissando. The adjustments of the laryngeal and paralaryngeal muscles can be inferred from the different DVA tasks. The main characteristics of the laryngeal muscles analyzed are control of glottic adduction, stretching, and shortening of the vocal folds; those of the paralaryngeal musculature relate mainly to the vertical laryngeal position in the neck. While the sustained vowel evaluates the vocal functionality with a focus on the larynx, connected speech allows the evaluation of the articulatory adjustments employed. Acoustic spectrography software can be used to visualize the performance of such tasks. The clinical application of the DVA will be exemplified using acoustic spectrography plates from normal and dysphonic voices, taken from a voice bank. Individuals who perform the DVA tasks in a balanced way, with adequate vocal quality and without phonatory effort, demonstrate good vocal functionality. On the other hand, difficulties in performing these tasks with worsening vocal quality and/or increased muscle tension may be indications of altered vocal functionality.
Affiliation(s)
- Mara Behlau
- Centro de Estudos da Voz - CEV - São Paulo, SP, Brasil.
- Rosiane Yamasaki
- Centro de Estudos da Voz - CEV - São Paulo, SP, Brasil.
- Universidade Federal de São Paulo - UNIFESP - São Paulo, SP, Brasil.
|
47
|
Qayyum U, Mumtaz N, Saqulain G. Vocal health of parents of children with hearing assistive devices. Pak J Med Sci 2023; 39:1434-1439. [PMID: 37680838 PMCID: PMC10480716 DOI: 10.12669/pjms.39.5.7570] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/18/2023] [Revised: 05/30/2023] [Accepted: 06/16/2023] [Indexed: 09/09/2023] Open
Abstract
Background & Objectives Vocal health (VH) is the need of the hour. There is a gap in the literature on the VH of parents of children with hearing assistive devices (HAD) during the habilitation process of their children. This study aimed to explore the vocal health of parents of children with hearing assistive devices. Methods This cross-sectional study was conducted at Riphah International University from September to December 2021. The study recruited N=384 parents of hearing-impaired children (HIC) who had used HAD for at least two years, of both genders and aged 2-9 years, using convenience sampling. The Voice-Related Quality of Life (V-RQOL) measure and the Vocal Health Index (VHI-10) were used for data collection. Data were analyzed in SPSS Version 25. Descriptive statistics, ANOVA, and t-tests were used to compare group means. P<0.05 was considered significant. Results An excellent V-RQOL score was found in 350 (91.14%) of the parents of children using hearing assistive devices. There was no significant difference in V-RQOL with regard to type of hearing assistive device used (p=0.102), laterality of device use (p=0.918), or degree of hearing loss (p=0.143). However, type of hearing loss revealed a significant difference (p=0.021). VHI scores also showed significantly (p=0.008) lower means in parents of children with cochlear implants. Conclusion The current study concludes that parents raising hearing-impaired children with hearing assistive devices possess good vocal health as determined by VHI and V-RQOL scores, with only a very small number of parents reporting vocal symptoms.
Affiliation(s)
- Uzma Qayyum
- Uzma Qayyum, MS (SLP) Speech Language Pathologist, Department of Speech Language Pathology, Riphah International University, Lahore, Pakistan
- Nazia Mumtaz
- Nazia Mumtaz, FCPS (Rehab Sciences) Head of Department, Department of Speech Language Pathology, Faculty of Rehab and Allied Health Sciences, Riphah International University, Lahore, Pakistan
- Ghulam Saqulain
- Ghulam Saqulain, FCPS (Otorhinolaryngology) Head of Department & Professor, Department of Otorhinolaryngology, Capital Hospital PGMI, Islamabad, Pakistan
|
48
|
Rai S, Ramdas D, Jacob NL, Bajaj G, Balasubramanium RK, Bhat JS. Normative data for certain vocal fold biomarkers among young normophonic adults using ultrasonography. Eur Arch Otorhinolaryngol 2023; 280:4165-4173. [PMID: 37221308 PMCID: PMC10382443 DOI: 10.1007/s00405-023-08025-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/13/2023] [Accepted: 05/09/2023] [Indexed: 05/25/2023]
Abstract
PURPOSE The current study aimed to profile vocal fold morphology, vocal fold symmetry, gender and task-specific data for vocal fold length (VFL) and vocal fold displacement velocity (VFDV) in young normophonic adults in the age range of 18-30 years using ultrasonography (USG). METHODS Participants underwent USG across quiet breathing, /a/ phonation and /i/ phonation tasks, and acoustic analysis was conducted to explore the relationship between USG and acoustic measures. RESULTS The study found that males have longer vocal folds than females, and overall greater velocities were observed in /a/ phonation, followed by /i/ phonation, with the lowest velocity observed in the quiet breathing task. CONCLUSIONS The obtained norms can be used as a quantitative benchmark for analyzing the vocal fold behavior in young adults.
Affiliation(s)
- Santosh Rai
- Department of Radiodiagnosis and Imaging, Kasturba Medical College, Mangalore, Manipal Academy of Higher Education, Manipal, 575001 Karnataka India
- Divya Ramdas
- Department of Audiology and Speech Language Pathology, Kasturba Medical College, Mangalore, Manipal Academy of Higher Education, Manipal, 575001 Karnataka India
- Nidhi Lalu Jacob
- Department of Audiology and Speech Language Pathology, Kasturba Medical College, Mangalore, Manipal Academy of Higher Education, Manipal, 575001 Karnataka India
- Gagan Bajaj
- Department of Audiology and Speech Language Pathology, Kasturba Medical College, Mangalore, Manipal Academy of Higher Education, Manipal, 575001 Karnataka India
- Radish Kumar Balasubramanium
- Department of Audiology and Speech Language Pathology, Kasturba Medical College, Mangalore, Manipal Academy of Higher Education, Manipal, 575001 Karnataka India
- Jayashree S. Bhat
- Department of Audiology and Speech Language Pathology, Nitte Institute of Speech and Hearing, Deralakatte, Mangalore, Karnataka India
|
49
|
Chan MPY, Kuang J. The effect of tone language background on cue integration in pitch perception. The Journal of the Acoustical Society of America 2023; 154:819-830. [PMID: 37563829 DOI: 10.1121/10.0020565] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/09/2023] [Accepted: 07/18/2023] [Indexed: 08/12/2023]
Abstract
This study explores the effect of native language and musicality on voice quality cue integration in pitch perception. Previous work by Cui and Kang [(2019). J. Acoust. Soc. Am. 146(6), 4086-4096] found no differences in pitch perception strategies between English and Mandarin speakers. The present study asks whether Cantonese listeners may perform differently, as Cantonese consists of multiple level tones. Participants completed two experiments: (i) a forced choice pitch classification experiment involving four spectral slope permutations that vary in fo across an 11 step continuum, and (ii) the MBEMA test that quantifies listeners' musicality. Results show that Cantonese speakers do not differ from English and Mandarin speakers in terms of overall categoricity and perceptual shift, that Cantonese speakers do not have advantages in musicality, and that musicality is a significant predictor for participants' pitch perception strategies. Listeners with higher musicality scores tend to rely more on fo cues than voice quality cues compared to listeners with lower musicality. These findings support the notion that voice quality integration in pitch perception is not language specific, and may be a universal psychoacoustic phenomenon at a non-lexical level.
Affiliation(s)
- May Pik Yu Chan
- Department of Linguistics, University of Pennsylvania, Philadelphia, Pennsylvania 19104-6228, USA
- Jianjing Kuang
- Department of Linguistics, University of Pennsylvania, Philadelphia, Pennsylvania 19104-6228, USA
|
50
|
Idrisoglu A, Dallora AL, Anderberg P, Berglund JS. Applied Machine Learning Techniques to Diagnose Voice-Affecting Conditions and Disorders: Systematic Literature Review. J Med Internet Res 2023; 25:e46105. [PMID: 37467031 PMCID: PMC10398366 DOI: 10.2196/46105] [Citation(s) in RCA: 5] [Impact Index Per Article: 5.0] [Received: 01/30/2023] [Revised: 04/26/2023] [Accepted: 05/23/2023] [Indexed: 07/20/2023] Open
Abstract
BACKGROUND Normal voice production depends on the synchronized cooperation of multiple physiological systems, which makes the voice sensitive to changes. Any systematic, neurological, and aerodigestive distortion is prone to affect voice production through reduced cognitive, pulmonary, and muscular functionality. This sensitivity inspired using voice as a biomarker to examine disorders that affect the voice. Technological improvements and emerging machine learning (ML) technologies have enabled possibilities of extracting digital vocal features from the voice for automated diagnosis and monitoring systems. OBJECTIVE This study aims to summarize a comprehensive view of research on voice-affecting disorders that uses ML techniques for diagnosis and monitoring through voice samples where systematic conditions, nonlaryngeal aerodigestive disorders, and neurological disorders are specifically of interest. METHODS This systematic literature review (SLR) investigated the state of the art of voice-based diagnostic and monitoring systems with ML technologies, targeting voice-affecting disorders without direct relation to the voice box from the point of view of applied health technology. Through a comprehensive search string, studies published from 2012 to 2022 from the databases Scopus, PubMed, and Web of Science were scanned and collected for assessment. To minimize bias, retrieval of the relevant references in other studies in the field was ensured, and 2 authors assessed the collected studies. Low-quality studies were removed through a quality assessment and relevant data were extracted through summary tables for analysis. The articles were checked for similarities between author groups to prevent cumulative redundancy bias during the screening process, where only 1 article was included from the same author group. 
RESULTS In the analysis of the 145 included studies, support vector machines were the most utilized ML technique (51/145, 35.2%), with the most studied disease being Parkinson disease (PD; reported in 87/145, 60%, studies). After 2017, 16 additional voice-affecting disorders were examined, in contrast to the 3 investigated previously. Furthermore, an upsurge in the use of artificial neural network-based architectures was observed after 2017. Almost half of the included studies were published in the last 2 years (2021 and 2022). A broad interest from many countries was observed. Notably, nearly one-half (n=75) of the studies relied on 10 distinct data sets, and 11/145 (7.6%) used demographic data as an input for ML models. CONCLUSIONS This SLR revealed considerable interest across multiple countries in using ML techniques for diagnosing and monitoring voice-affecting disorders, with PD being the most studied disorder. However, the review identified several gaps, including limited and unbalanced data set usage in studies, and a focus on diagnostic testing rather than disorder-specific monitoring. Despite being limited to peer-reviewed publications written in English, the SLR provides valuable insights into the current state of research on ML-based voice-affecting disorder diagnosis and monitoring and highlights areas to address in future research.
Affiliation(s)
- Alper Idrisoglu
- Department of Health, Blekinge Institute of Technology, Karlskrona, Sweden
- Ana Luiza Dallora
- Department of Health, Blekinge Institute of Technology, Karlskrona, Sweden
- Peter Anderberg
- Department of Health, Blekinge Institute of Technology, Karlskrona, Sweden
- School of Health Sciences, University of Skövde, Skövde, Sweden
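The proportions quoted in the Idrisoglu et al. abstract can be recomputed from the stated counts as a quick consistency check. This is a small sketch with hypothetical names (`share`, `REPORTED`); the counts and percentages are taken from the abstract, and nothing here reproduces the review's own analysis.

```python
# Illustrative consistency check of the percentages stated in the abstract.
# Names are assumptions introduced for this sketch.

TOTAL_STUDIES = 145  # included studies, from the abstract


def share(count: int, total: int = TOTAL_STUDIES) -> float:
    """Percentage of `total` represented by `count`, rounded to one decimal."""
    return round(100 * count / total, 1)


# label -> (count from the abstract, percentage stated in the abstract)
REPORTED = {
    "support vector machines": (51, 35.2),
    "Parkinson disease": (87, 60.0),
    "demographic data as input": (11, 7.6),
}

if __name__ == "__main__":
    for label, (count, stated) in REPORTED.items():
        print(f"{label}: computed {share(count)}%, stated {stated}%")
    # The abstract describes n=75 studies as "nearly one-half".
    print(f"studies relying on 10 data sets: {share(75)}%")
```

The three stated percentages agree with the counts at one decimal place. The 75 studies described as "nearly one-half" work out to about 51.7% of 145, so marginally more than half.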
|