1
Huang J, Guo P, Zhang S, Ji M, An R. Use of Deep Neural Networks to Predict Obesity With Short Audio Recordings: Development and Usability Study. JMIR AI 2024; 3:e54885. [PMID: 39052997 DOI: 10.2196/54885] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/26/2023] [Revised: 03/10/2024] [Accepted: 06/13/2024] [Indexed: 07/27/2024]
Abstract
BACKGROUND The escalating global prevalence of obesity has necessitated the exploration of novel diagnostic approaches. Recent scientific inquiries have indicated potential alterations in voice characteristics associated with obesity, suggesting the feasibility of using voice as a noninvasive biomarker for obesity detection. OBJECTIVE This study aims to use deep neural networks to predict obesity status through the analysis of short audio recordings, investigating the relationship between vocal characteristics and obesity. METHODS A pilot study was conducted with 696 participants, using self-reported BMI to classify individuals into obesity and nonobesity groups. Audio recordings of participants reading a short script were transformed into spectrograms and analyzed using an adapted YOLOv8 model (Ultralytics). The model performance was evaluated using accuracy, recall, precision, and F1-scores. RESULTS The adapted YOLOv8 model demonstrated a global accuracy of 0.70 and a macro F1-score of 0.65. It was more effective in identifying nonobesity (F1-score of 0.77) than obesity (F1-score of 0.53). This moderate level of accuracy highlights the potential and challenges in using vocal biomarkers for obesity detection. CONCLUSIONS While the study shows promise in the field of voice-based medical diagnostics for obesity, it faces limitations such as reliance on self-reported BMI data and a small, homogenous sample size. These factors, coupled with variability in recording quality, necessitate further research with more robust methodologies and diverse samples to enhance the validity of this novel approach. The findings lay a foundational step for future investigations in using voice as a noninvasive biomarker for obesity detection.
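The reported macro F1-score can be cross-checked against the per-class scores, because a macro average is the unweighted mean of the class-level F1-scores. The following is a minimal sketch in Python that uses only the figures quoted in the abstract; the function and variable names are illustrative and do not come from the paper.

```python
from statistics import mean


def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (0.0 if both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def macro_f1(per_class_f1: list[float]) -> float:
    """Unweighted mean of the per-class F1-scores."""
    return mean(per_class_f1)


# Per-class F1-scores quoted in the abstract: nonobesity 0.77, obesity 0.53.
reported = {"nonobesity": 0.77, "obesity": 0.53}
print(round(macro_f1(list(reported.values())), 2))  # 0.65, matching the reported macro F1
```

Because the macro average weights both classes equally, the weaker obesity class pulls the summary score below the global accuracy of 0.70.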
Affiliation(s)
- Jingyi Huang
  - School of Economics and Management, Shanghai University of Sport, Shanghai, China
- Peiqi Guo
  - Brown School, Washington University in St. Louis, St. Louis, MO, United States
- Sheng Zhang
  - School of Journalism and Communication, Shanghai University of Sport, Shanghai, China
- Mengmeng Ji
  - Division of Public Health Sciences, Department of Surgery, Washington University School of Medicine in St. Louis, St. Louis, MO, United States
- Ruopeng An
  - Brown School, Washington University in St. Louis, St. Louis, MO, United States
  - Division of Data and Computational Sciences, Washington University in St. Louis, St. Louis, MO, United States
2
Souza JA, Pasqualoto AS, Cielo CA, Andriollo DB, Moraes DAO. Can We Use the Maximum Phonation Time as a Screening of Pulmonary Forced Vital Capacity in Post-COVID-19 Syndrome Patients? J Voice 2024:S0892-1997(24)00118-8. [PMID: 38649315 DOI: 10.1016/j.jvoice.2024.04.001] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/29/2023] [Revised: 03/30/2024] [Accepted: 04/01/2024] [Indexed: 04/25/2024]
Abstract
OBJECTIVE To verify the accuracy of the maximum phonation time of the vowel /a/ (MPT/a/), fricative /s/ (MPT/s/), number counting (MPTC), and number reached in this count (CN) to estimate forced vital capacity (FVC) in patients with post-COVID-19 syndrome. METHOD Cross-sectional study involving adult patients, who were admitted to the intensive care unit and referred to the Post-COVID-19 Rehabilitation Outpatient Clinic. Voice function was assessed using a Vocal Handicap Index (VHI) self-assessment questionnaire and MPT tests. To perform the phonatory tests, the patients remained in a standing posture and were instructed to inhale as much air as possible and, during a single exhalation, at usual pitch and loudness, sustain the emission of /a/ and /s/; and in another breath, to perform the ascending numerical count, starting from the number one up to the highest number they could reach. Pulmonary function was assessed by spirometry. The receiver operating characteristic (ROC) curve was plotted, and FVC values lower than the normal limit by Z-score (fifth percentile) were classified as impaired lung function. The predictive values and likelihood ratios were calculated. RESULTS A total of 70 patients participated, with 20-30% having a high VHI. Approximately 24% had an FVC impairment and significantly low values of MPT/a/, MPT/s/, MPTC, and CN. The test results showed overall accuracy of 70% and the cutoff points of 9.69, 6.78, 10.60, and 13, respectively, with high sensitivity and negative predictive value but low specificity, positive predictive value, and positive likelihood ratio. CONCLUSIONS Our results suggest that the MPT has moderate discriminatory power for FVC impairment, indicating that it is not a reliable indicator of pulmonary function in the population studied. Therefore, in patients with an MPT of less than 10.60 seconds, or a CN lower than 13, other criteria should be added to improve the diagnostic accuracy and support the decision to perform more complex investigations.
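The accuracy measures named in this abstract (sensitivity, specificity, predictive values, and positive likelihood ratio) all derive from a 2x2 table built at a chosen cutoff. The sketch below shows that calculation in Python. It assumes that a value below the cutoff counts as a positive screen, as the conclusion suggests; the 10.60 s cutoff is taken from the abstract, while the function name and the sample measurements are hypothetical and are not study data.

```python
def screening_metrics(values, impaired, cutoff):
    """Compare a 'value below cutoff' screen with the reference standard.

    `impaired[i]` is True when the reference test (here, spirometry) shows
    impairment. Assumes every row and column of the 2x2 table is non-empty.
    """
    pairs = list(zip(values, impaired))
    tp = sum(v < cutoff and d for v, d in pairs)        # screen +, impaired
    fp = sum(v < cutoff and not d for v, d in pairs)    # screen +, not impaired
    fn = sum(v >= cutoff and d for v, d in pairs)       # screen -, impaired
    tn = sum(v >= cutoff and not d for v, d in pairs)   # screen -, not impaired
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
        "lr_positive": sensitivity / (1 - specificity),
        "accuracy": (tp + tn) / len(pairs),
    }


# Hypothetical counting times in seconds and reference results (not study data).
times = [8.0, 9.5, 10.0, 12.0, 9.0, 11.0, 14.0, 15.5, 16.0, 18.0]
fvc_impaired = [True, True, True, True, False, False, False, False, False, False]
print(screening_metrics(times, fvc_impaired, cutoff=10.60))
```

Moving the cutoff trades sensitivity against specificity, which is the trade-off an ROC curve summarizes across all candidate cutoffs.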
Affiliation(s)
- Juliana Alves Souza
  - Department of Speech, Hearing and Language Sciences and Postgraduate Program in Human Communication Disorders, Voice Laboratory of the Universidade Federal de Santa Maria (UFSM), Santa Maria, Brazil
- Adriane Schmidt Pasqualoto
  - Department of Speech, Hearing and Language Sciences and Postgraduate Program in Human Communication Disorders, Voice Laboratory of the Universidade Federal de Santa Maria (UFSM), Santa Maria, Brazil; Department of Physiotherapy and Postgraduate Program in Human Communication Disorders at Universidade Federal de Santa Maria (UFSM), Santa Maria, Brazil
- Carla Aparecida Cielo
  - Department of Speech, Hearing and Language Sciences and Postgraduate Program in Human Communication Disorders, Voice Laboratory of the Universidade Federal de Santa Maria (UFSM), Santa Maria, Brazil
- Débora Bonesso Andriollo
  - Department of Speech, Hearing and Language Sciences and Postgraduate Program in Human Communication Disorders, Voice Laboratory of the Universidade Federal de Santa Maria (UFSM), Santa Maria, Brazil
- Denis Altieri Oliveira Moraes
  - Department of Speech, Hearing and Language Sciences and Postgraduate Program in Human Communication Disorders, Voice Laboratory of the Universidade Federal de Santa Maria (UFSM), Santa Maria, Brazil; Department of Statistics and Postgraduate Program in Human Communication Disorders at Universidade Federal de Santa Maria (UFSM), Santa Maria, Brazil
3
Feltrin TD, Gracioli MDSP, Cielo CA, Souza JA, Moraes DADO, Pasqualoto AS. Maximum Phonation Times as Biomarkers of Lung Function. J Voice 2024:S0892-1997(23)00406-X. [PMID: 38331702 DOI: 10.1016/j.jvoice.2023.12.014] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/19/2023] [Revised: 12/14/2023] [Accepted: 12/15/2023] [Indexed: 02/10/2024]
Abstract
PURPOSE To verify whether measurements of maximal phonation times are biomarkers of forced vital capacity in patients with chronic obstructive pulmonary disease, and to characterize the vocal aspects of these patients, taking into account variables, such as age, body mass index, use of bronchodilators, presence of symptoms, and quality of life related to voice. METHODS Complete records of 25 subjects with chronic obstructive pulmonary disease, both sexes, aged 31 to 85 years, evaluated by forced vital capacity, maximum phonation times of /a/, and numerical count and number reached at this count, Vocal Symptom Scale, Voice Quality of Life. Data were presented descriptively and statistically analyzed using Student's t test for independent samples and Mann-Whitney U test. A significance level of 5% was accepted. The receiver operating characteristic curve was plotted and the standardized value of forced vital capacity <80% was considered as an indicator of pulmonary dysfunction. RESULTS Patients exhibited reduced maximum phonation times for /a/, numeric counting, and reached digits in counting; discrepancies in Vocal Signs and Symptoms and Voice Quality of Life Scale scores. Numeric counting times of up to 12.5 seconds indicated that forced vital capacity may be impaired. CONCLUSION The patients with chronic obstructive pulmonary disease examined in this study exhibited vocal deviations as evidenced by reduced maximum phonation times of /a/, numeric counting, and the digit reached during counting, as well as deviations in vocal self-assessment. Maximum phonation time in numerical counting was considered a biomarker of pulmonary function impairment.
4
Lechien JR, Geneid A, Bohlender JE, Cantarella G, Avellaneda JC, Desuter G, Sjogren EV, Finck C, Hans S, Hess M, Oguz H, Remacle MJ, Schneider-Stickler B, Tedla M, Schindler A, Vilaseca I, Zabrodsky M, Dikkers FG, Crevier-Buchman L. Consensus for voice quality assessment in clinical practice: guidelines of the European Laryngological Society and Union of the European Phoniatricians. Eur Arch Otorhinolaryngol 2023; 280:5459-5473. [PMID: 37707614 DOI: 10.1007/s00405-023-08211-6] [Citation(s) in RCA: 8] [Impact Index Per Article: 8.0] [Received: 08/17/2023] [Accepted: 08/21/2023] [Indexed: 09/15/2023]
Abstract
INTRODUCTION To update the European guidelines for the assessment of voice quality (VQ) in clinical practice. METHODS Nineteen laryngologists-phoniatricians of the European Laryngological Society (ELS) and the Union of the European Phoniatricians (UEP) participated in a modified Delphi process to propose statements about subjective and objective VQ assessments. Two anonymized voting rounds determined a consensus statement to be acceptable when 80% of experts agreed with a rating of at least 3/4. The statements rated ≥ 3/4 by 60-80% of experts were improved and resubmitted to voting until they were validated or rejected. RESULTS Of the 90 initial statements, 51 were validated after two voting rounds. A multidimensional set of minimal VQ evaluations was proposed and included: baseline VQ anamnesis (e.g., allergy, medical and surgical history, medication, addiction, singing practice, job, and posture), videolaryngostroboscopy (mucosal wave symmetry, amplitude, morphology, and movements), patient-reported VQ assessment (30- or 10-voice handicap index), perception (Grade, Roughness, Breathiness, Asthenia, and Strain), aerodynamics (maximum phonation time), acoustics (Mean F0, Jitter, Shimmer, and noise-to-harmonic ratio), and clinical instruments associated with voice comorbidities (reflux symptom score, reflux sign assessment, eating-assessment tool-10, and dysphagia handicap index). For perception, aerodynamics and acoustics, experts provided guidelines for the methods of measurement. Some additional VQ evaluations are proposed for voice professionals or patients with some laryngeal diseases. CONCLUSION The ELS-UEP consensus for VQ assessment provides clinical statements for the baseline and pre- to post-treatment evaluations of VQ and aims to improve collaborative research by adopting a common and validated VQ evaluation approach.
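The voting rule described in the methods can be written as a small decision function. This is a sketch under stated assumptions: the abstract gives the 80% acceptance level and the 60-80% revision band, but it does not say how exact boundary values are handled or what happens below 60%, so treating exactly 80% as validated and anything under 60% as rejected is an assumption made here. The function name and the example ratings are hypothetical.

```python
def delphi_outcome(ratings, min_rating=3, accept_share=0.80, revise_share=0.60):
    """Classify one statement from the experts' ratings on a 1-4 scale.

    Assumptions (not stated in the abstract): a share of exactly 80% counts
    as validated, and a share below 60% counts as rejected.
    """
    share = sum(r >= min_rating for r in ratings) / len(ratings)
    if share >= accept_share:
        return "validated"
    if share >= revise_share:
        return "revise and resubmit"
    return "rejected"


# Hypothetical ratings from a 19-member panel (not data from the study).
print(delphi_outcome([4] * 16 + [2] * 3))   # 16/19 rate at least 3
print(delphi_outcome([3] * 13 + [1] * 6))   # 13/19 rate at least 3
print(delphi_outcome([4] * 10 + [1] * 9))   # 10/19 rate at least 3
```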
Affiliation(s)
- Jerome R Lechien
  - Department of Otolaryngology-Head Neck Surgery, Foch Hospital, University of Paris Saclay, Paris, France
  - Department of Otolaryngology-Head Neck Surgery, CHU Saint-Pierre, Brussels, Belgium
  - Department of Laryngology and Broncho-Esophagology, EpiCURA Hospital, Anatomy Department of University of Mons, Mons, Belgium
  - Phonetics and Phonology Laboratory (UMR 7018 CNRS, Université Sorbonne Nouvelle/Paris 3), Paris, France
- Ahmed Geneid
  - Department of Otolaryngology and Phoniatrics-Head and Neck Surgery, Helsinki University Hospital and University of Helsinki, Helsinki, Finland
- Jörg E Bohlender
  - Department of Phoniatrics and Speech Pathology, Clinic for Otorhinolaryngology, Head and Neck Surgery, University Hospital Zurich, University of Zurich, Zurich, Switzerland
- Giovanna Cantarella
  - Department of Otolaryngology and Head and Neck Surgery Fondazione, IRCCS Ca' Granda Ospedale Maggiore Policlinico, Milan, Italy
  - Department of Clinical Sciences and Community Health Università degli Studi di Milano, Milan, Italy
- Juan C Avellaneda
  - Department of Surgery, Otolaryngology Service. Hospital Universitario Mayor Mederi, Universidad del Rosario, Bogotá, Colombia
- Gauthier Desuter
  - ENT, Head and Neck Surgery, Antwerp University Hospital, Edegem, Belgium
- Elisabeth V Sjogren
  - Department of Otorhinolaryngology, Head and Neck Surgery, Leiden University Medical Center, Leiden, The Netherlands
- Camille Finck
  - Department of Otorhinolaryngology-Head and Neck Surgery, CHU de Liege, Université de Liège, Liège, Belgium
- Stephane Hans
  - Department of Otolaryngology-Head Neck Surgery, Foch Hospital, University of Paris Saclay, Paris, France
  - Phonetics and Phonology Laboratory (UMR 7018 CNRS, Université Sorbonne Nouvelle/Paris 3), Paris, France
- Markus Hess
  - Medical Voice Center (MEVOC), Hamburg, Germany
- Haldun Oguz
  - Department of Otolaryngology, Fonomer, Ankara, Turkey
- Marc J Remacle
  - Department of Otolaryngology-Head Neck Surgery, Foch Hospital, University of Paris Saclay, Paris, France
  - Department of Otorhinolaryngology-Head and Neck Surgery, Center Hospitalier de Luxembourg, Eich, Luxembourg
- Miroslav Tedla
  - Department of Otolaryngology, Head and Neck Surgery, Comenius University, University Hospital, Bratislava, Slovakia
- Antonio Schindler
  - Department of Biomedical and Clinical Sciences, Università degli Studi di Milano, Milan, Italy
- Isabel Vilaseca
  - Department of Otorhinolaryngology, Hospital Clínic, Barcelona, Spain
  - University of Barcelona, Barcelona, Spain
- Michal Zabrodsky
  - Department of Otorhinolaryngology and Head and Neck Surgery, University Hospital Motol, First Faculty of Medicine, Charles University, Prague, Czech Republic
- Frederik G Dikkers
  - Department of Otorhinolaryngology-Head and Neck Surgery, Amsterdam UMC Location AMC, University of Amsterdam, Amsterdam, The Netherlands
- Lise Crevier-Buchman
  - Department of Otolaryngology-Head Neck Surgery, Foch Hospital, University of Paris Saclay, Paris, France
  - Phonetics and Phonology Laboratory (UMR 7018 CNRS, Université Sorbonne Nouvelle/Paris 3), Paris, France
5
Lopes BP, Korn GP, Nunes FB, Gama ACC. Immediate effects of the incentive spirometer in women with healthy voice. Codas 2023; 36:e20220291. [PMID: 37970892 DOI: 10.1590/2317-1782/20232022291pt] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/17/2022] [Accepted: 05/16/2023] [Indexed: 11/19/2023]
Abstract
PURPOSE To evaluate the immediate effect of the incentive spirometer on acoustic measures, aerodynamic measures and on the auditory-perceptual assessment of vocal quality in vocally healthy women. METHODS This is an experimental intra-subject comparison study with the participation of 22 women without vocal complaints. Acoustic measures, aerodynamic measures and auditory-perceptual assessment of vocal quality were obtained before and immediately after using the incentive spirometer by the participants. The device was used in the orthostatic position and the participants performed three sets of ten repetitions with a one-minute interval between sets. RESULTS After using the incentive spirometer, there was a significant reduction in jitter, shimmer and PPQ (period perturbation quotient) measurements and an increase in maximum expiratory volume, while the other acoustic and aerodynamic measurements were not significantly impacted. In addition, there was improvement in vocal quality in eight (36.4%) participants and 11 (50.0%) participants showed no changes in the auditory perceptual assessment of voice quality after using the incentive spirometer. CONCLUSION The use of the incentive spirometer is safe and, in its immediate effect, positively impacts the acoustic measures of short-term aperiodicity of frequency and intensity and increases the maximum expiratory volume in women with healthy voices.
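Jitter and shimmer, the "short-term aperiodicity of frequency and intensity" measures mentioned here, are cycle-to-cycle perturbation ratios. The sketch below uses one common "local" definition: the mean absolute difference between consecutive cycles divided by the mean value, expressed as a percentage. The abstract does not state which software or exact formula the authors used, so this definition is an assumption, and the function name and sample values are hypothetical.

```python
def local_perturbation(values):
    """Mean absolute difference between consecutive cycles, divided by the
    mean value, as a percentage.

    Pass glottal cycle durations to get local jitter, or per-cycle peak
    amplitudes to get local shimmer.
    """
    if len(values) < 2:
        raise ValueError("need at least two cycles")
    diffs = [abs(a - b) for a, b in zip(values, values[1:])]
    mean_diff = sum(diffs) / len(diffs)
    mean_value = sum(values) / len(values)
    return 100 * mean_diff / mean_value


# Hypothetical cycle durations in seconds (about 200 Hz), not study data.
periods = [0.0050, 0.0051, 0.0049, 0.0050]
print(f"local jitter: {local_perturbation(periods):.2f}%")
```

A perfectly periodic signal gives 0%, so lower values after the intervention correspond to the reduction in perturbation reported in the results.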
Affiliation(s)
- Bárbara Pereira Lopes
  - Programa de Pós-graduação em Ciências Fonoaudiológicas, Universidade Federal de Minas Gerais - UFMG - Belo Horizonte (MG), Brasil
- Gustavo Polacow Korn
  - Departamento de Otorrinolaringologia, Faculdade de Medicina, Universidade Federal de São Paulo - UNIFESP - São Paulo (SP) Brasil
- Flávio Barbosa Nunes
  - Departamento de Otorrinolaringologia, Faculdade de Medicina, Universidade Federal de Minas Gerais - UFMG - Belo Horizonte (MG), Brasil
- Ana Cristina Côrtes Gama
  - Departamento de Fonoaudiologia, Faculdade de Medicina, Universidade Federal de Minas Gerais - UFMG - Belo Horizonte (MG) Brasil
6
Silek H, Dogan M. Voice Analysis in Patients with Essential Tremor. J Voice 2023:S0892-1997(23)00144-3. [PMID: 37336699 DOI: 10.1016/j.jvoice.2023.04.019] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/25/2023] [Revised: 04/24/2023] [Accepted: 04/24/2023] [Indexed: 06/21/2023]
Abstract
OBJECTIVE The objective of this study was to reveal the phonetic characteristics of patients with or without voice tremor in patients with essential tremor (ET), determine whether these phonetic features are ET specific, and test the influence of ET on vocal tremor. METHODS The study included a total of 30 patients with ET and 29 healthy volunteers. The severity of ET was evaluated using the Washington Heights Inwood Genetic Study of Essential Tremor (WHIGET) tremor rating scale. Patients with major tremor complaints for at least 3 years, WHIGET scores below 15, and patients newly diagnosed in our clinic and for whom drug therapy has not yet been started were selected. RESULTS A total of 59 participants (n = 34 with ET and n = 25 as control) were included in the study. The ages of the participants ranged from 20 to 82 years, with a mean age of 54.50 ± 15.04 years. The gender distribution was 57.6% male and 42.4% female, and there was no statistically significant difference between the two groups in terms of age and gender. The study found that individuals with ET had significantly higher jitter, shimmer, S/Z, Pataka, frequency tremor intensity index, amplitude tremor intensity index, and frequency tremor power index values than the control group. However, there was no statistically significant difference between the two groups in terms of MPT, frequency tremor cyclicality, amplitude tremor cyclicality, frequency tremor frequency, and amplitude tremor frequency values. CONCLUSION Our study shows that, even in the absence of essential voice tremor, there is an effect of ET on voice quality. These findings contribute to the understanding of the nonmotor symptoms of ET and may aid in the diagnosis and management of this condition. Further research is needed to explore the potential use of acoustic analysis parameters in the diagnosis and monitoring of ET.
Affiliation(s)
- Hakan Silek
  - Department of Neurology, Faculty of Medicine, Yeditepe University, Istanbul, Turkey
- Muzeyyen Dogan
  - Department of Otolaryngology and Head & Neck Surgery, Faculty of Medicine, Yeditepe University, Istanbul, Turkey
7
Garbey M, Joerger G, Lesport Q, Girma H, McNett S, Abu-Rub M, Kaminski H. A Digital Telehealth System to Compute the Myasthenia Gravis Core Examination Metrics. JMIR NEUROTECHNOLOGY 2023; 2:e43387. [PMID: 37435094 PMCID: PMC10334459 DOI: 10.2196/43387] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 07/13/2023]
Abstract
Background Telemedicine practice for neurological diseases has grown significantly during the COVID-19 pandemic. Telemedicine offers an opportunity to assess digitalization of examinations and enhances access to modern computer vision and artificial intelligence processing to annotate and quantify examinations in a consistent and reproducible manner. The Myasthenia Gravis Core Examination (MG-CE) has been recommended for the telemedicine evaluation of patients with myasthenia gravis. Objective We aimed to assess the ability to take accurate and robust measurements during the examination, which would allow improvement in workflow efficiency by making the data acquisition and analytics fully automatic and thereby limit the potential for observation bias. Methods We used Zoom (Zoom Video Communications) videos of patients with myasthenia gravis undergoing the MG-CE. The core examination tests required 2 broad categories of processing. First, computer vision algorithms were used to analyze videos with a focus on eye or body motions. Second, for the assessment of examinations involving vocalization, a different category of signal processing methods was required. In this way, we provide an algorithm toolbox to assist clinicians with the MG-CE. We used a data set of 6 patients recorded during 2 sessions. Results Digitalization and control of quality of the core examination are advantageous and let the medical examiner concentrate on the patient instead of managing the logistics of the test. This approach showed the possibility of standardized data acquisition during telehealth sessions and provided real-time feedback on the quality of the metrics the medical doctor is assessing. Overall, our new telehealth platform showed submillimeter accuracy for ptosis and eye motion. In addition, the method showed good results in monitoring muscle weakness, demonstrating that continuous analysis is likely superior to pre-exercise and post-exercise subjective assessment. Conclusions We demonstrated the ability to objectively quantitate the MG-CE. Our results indicate that the MG-CE should be revisited to consider some of the new metrics that our algorithm identified. We provide a proof of concept involving the MG-CE, but the method and tools developed can be applied to many neurological disorders and have great potential to improve clinical care.
Affiliation(s)
- Marc Garbey
  - Department of Surgery, School of Medicine & Health Sciences, George Washington University, Washington, DC, United States
  - ORintelligence LLC, Houston, TX, United States
  - Laboratoire des Sciences de l’Ingénieur pour l’Environnement (LaSIE UMR-CNRS 7356), University of La Rochelle, La Rochelle, France
  - Care Constitution Corporation, Washington, DC, United States
- Guillaume Joerger
  - ORintelligence LLC, Houston, TX, United States
  - Care Constitution Corporation, Washington, DC, United States
- Quentin Lesport
  - Department of Surgery, School of Medicine & Health Sciences, George Washington University, Washington, DC, United States
  - Laboratoire des Sciences de l’Ingénieur pour l’Environnement (LaSIE UMR-CNRS 7356), University of La Rochelle, La Rochelle, France
  - Care Constitution Corporation, Washington, DC, United States
- Helen Girma
  - Department of Neurology & Rehabilitation Medicine, School of Medicine & Health Sciences, George Washington University, Washington, DC, United States
- Sienna McNett
  - Department of Neurology & Rehabilitation Medicine, School of Medicine & Health Sciences, George Washington University, Washington, DC, United States
- Mohammad Abu-Rub
  - Department of Neurology & Rehabilitation Medicine, School of Medicine & Health Sciences, George Washington University, Washington, DC, United States
- Henry Kaminski
  - Department of Neurology & Rehabilitation Medicine, School of Medicine & Health Sciences, George Washington University, Washington, DC, United States
8
Alam MZ, Simonetti A, Brillantino R, Tayler N, Grainge C, Siribaddana P, Nouraei SAR, Batchelor J, Rahman MS, Mancuzo EV, Holloway JW, Holloway JA, Rezwan FI. Predicting Pulmonary Function From the Analysis of Voice: A Machine Learning Approach. Front Digit Health 2022; 4:750226. [PMID: 35211691 PMCID: PMC8861188 DOI: 10.3389/fdgth.2022.750226] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Received: 07/30/2021] [Accepted: 01/14/2022] [Indexed: 11/21/2022]
Abstract
Introduction To self-monitor asthma symptoms, existing methods (e.g. peak flow metre, smart spirometer) require special equipment and are not always used by the patients. Voice recording has the potential to generate surrogate measures of lung function and this study aims to apply machine learning approaches to predict lung function and severity of abnormal lung function from recorded voice for asthma patients. Methods A threshold-based mechanism was designed to separate speech and breathing from 323 recordings. Features extracted from these were combined with biological factors to predict lung function. Three predictive models were developed using Random Forest (RF), Support Vector Machine (SVM), and linear regression algorithms: (a) regression models to predict lung function, (b) multi-class classification models to predict severity of lung function abnormality, and (c) binary classification models to predict lung function abnormality. Training and test samples were separated (70%:30%, using balanced portioning), features were normalised, 10-fold cross-validation was used and model performances were evaluated on the test samples. Results The RF-based regression model performed better with the lowest root mean square error of 10.86. To predict severity of lung function impairment, the SVM-based model performed best in multi-class classification (accuracy = 73.20%), whereas the RF-based model performed best in binary classification models for predicting abnormal lung function (accuracy = 85%). Conclusion Our machine learning approaches can predict lung function, from recorded voice files, better than published approaches. This technique could be used to develop future telehealth solutions including smartphone-based applications which have potential to aid decision making and self-monitoring in asthma.
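The 70%:30% separation "using balanced portioning" can be illustrated with a stratified split, in which each class keeps roughly the same share in the training and test parts. Reading "balanced portioning" as stratification is an assumption made here; the abstract does not describe the procedure. The function name, labels, and class counts below are hypothetical.

```python
import random
from collections import defaultdict


def stratified_split(labels, test_fraction=0.30, seed=0):
    """Return (train_indices, test_indices), sampling the test part
    separately within each class so class shares stay similar."""
    by_class = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)
    rng = random.Random(seed)  # fixed seed keeps the split reproducible
    train, test = [], []
    for indices in by_class.values():
        rng.shuffle(indices)
        n_test = round(len(indices) * test_fraction)
        test.extend(indices[:n_test])
        train.extend(indices[n_test:])
    return sorted(train), sorted(test)


# Hypothetical labels: 20 normal and 10 abnormal recordings (not study data).
labels = ["normal"] * 20 + ["abnormal"] * 10
train_idx, test_idx = stratified_split(labels)
print(len(train_idx), len(test_idx))  # 21 9
```

Cross-validation and feature normalisation would then be fitted on the training indices only, so that the held-out 30% stays untouched until the final evaluation.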
Affiliation(s)
- Md. Zahangir Alam
  - Human Development and Health, Faculty of Medicine, University of Southampton, Southampton, United Kingdom
  - Department of Computer Science and Engineering, Bangladesh University of Engineering and Technology, Dhaka, Bangladesh
- Albino Simonetti
  - Human Development and Health, Faculty of Medicine, University of Southampton, Southampton, United Kingdom
  - Department of Information and Electrical Engineering and Applied Mathematics/DIEM, University of Salerno, Fisciano, Italy
- Raffaele Brillantino
  - Human Development and Health, Faculty of Medicine, University of Southampton, Southampton, United Kingdom
  - Department of Information and Electrical Engineering and Applied Mathematics/DIEM, University of Salerno, Fisciano, Italy
- Nick Tayler
  - Peter Doherty Institute, The University of Melbourne, Melbourne, VIC, Australia
- Chris Grainge
  - Hunter Medical Research Institute, The University of Newcastle, Newcastle, NSW, Australia
  - Department of Respiratory Medicine, John Hunter Hospital, Newcastle, NSW, Australia
- Pandula Siribaddana
  - Postgraduate Institute of Medicine, University of Colombo, Colombo, Sri Lanka
- S. A. Reza Nouraei
  - Clinical Informatics Research Unit, University of Southampton, Southampton, United Kingdom
  - Robert White Centre for Airway Voice and Swallowing, Poole Hospital, Poole, United Kingdom
- James Batchelor
  - Clinical Informatics Research Unit, University of Southampton, Southampton, United Kingdom
- M. Sohel Rahman
  - Department of Computer Science and Engineering, Bangladesh University of Engineering and Technology, Dhaka, Bangladesh
- Eliane V. Mancuzo
  - Medical School, Universidade Federal de Minas Gerais, Belo Horizonte, Brazil
- John W. Holloway
  - Human Development and Health, Faculty of Medicine, University of Southampton, Southampton, United Kingdom
  - National Institute for Health Research Southampton Biomedical Research Centre, University Hospital Southampton, Southampton, United Kingdom
- Judith A. Holloway
  - Clinical and Experimental Sciences, Faculty of Medicine, University of Southampton, Southampton, United Kingdom
  - MSc Allergy, Faculty of Medicine, University of Southampton, Southampton, United Kingdom
- Faisal I. Rezwan
  - Human Development and Health, Faculty of Medicine, University of Southampton, Southampton, United Kingdom
  - Department of Computer Science, Aberystwyth University, Aberystwyth, United Kingdom
  - *Correspondence: Faisal I. Rezwan
9
Black RJ, Novakovic D, Plit M, Miles A, MacDonald P, Madill C. Swallowing and laryngeal complications in lung and heart transplantation: Etiologies and diagnosis. J Heart Lung Transplant 2021; 40:1483-1494. [PMID: 34836605 DOI: 10.1016/j.healun.2021.08.006] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Received: 04/22/2021] [Revised: 07/29/2021] [Accepted: 08/19/2021] [Indexed: 10/20/2022]
Abstract
Despite continued surgical advancements in the field of cardiothoracic transplantation, post-operative complications remain a burden for the patient and the multidisciplinary team. Lesser-known complications including swallowing disorders (dysphagia), and voice disorders (dysphonia), are now being reported. Such disorders are known to be associated with increased morbidity and mortality in other medical populations, however their etiology amongst the heart and lung transplant populations has received little attention in the literature. This paper explores the potential mechanisms of oropharyngeal dysphagia and dysphonia following transplantation and discusses optimal modalities of diagnostic evaluation and management. A greater understanding of the implications of swallowing and laryngeal dysfunction in the heart and lung transplant populations is important to expedite early diagnosis and management in order to optimize patient outcomes, minimize allograft injury and improve quality of life.
Affiliation(s)
- Rebecca J Black
  - Speech Pathology Department, St Vincent's Hospital, Darlinghurst, NSW, Australia; Faculty of Medicine and Health, The University of Sydney, Australia
- Daniel Novakovic
  - Faculty of Medicine and Health, The University of Sydney, Australia
- Peter MacDonald
  - Faculty of Medicine and Health, The University of Sydney, Australia
- Catherine Madill
  - Faculty of Medicine and Health, The University of Sydney, Australia
10
Vatanparvar K, Nathan V, Nemati E, Rahman MM, McCaffrey D, Kuang J, Gao JA. SpeechSpiro: Lung Function Assessment from Speech Pattern as an Alternative to Spirometry for Mobile Health Tracking. ANNUAL INTERNATIONAL CONFERENCE OF THE IEEE ENGINEERING IN MEDICINE AND BIOLOGY SOCIETY. IEEE ENGINEERING IN MEDICINE AND BIOLOGY SOCIETY. ANNUAL INTERNATIONAL CONFERENCE 2021; 2021:7237-7243. [PMID: 34892769 DOI: 10.1109/embc46164.2021.9630077] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/14/2023]
Abstract
Respiratory illnesses are common in the United States and globally; people deal with these illnesses in various forms, such as asthma, chronic obstructive pulmonary diseases, or infectious respiratory diseases (e.g., coronavirus). The lung function of subjects affected by these illnesses degrades due to infection or inflammation in their respiratory airways. Typically, lung function is assessed using in-clinic medical equipment and, more recently, portable spirometry devices. Research has shown that obstruction and restriction in the respiratory airways affect individuals' voice characteristics. Hence, audio features could play a role in predicting lung function and the severity of the obstruction. In this paper, we go beyond well-known voice audio features and create a hybrid deep learning model using CNN-LSTM to discover spatiotemporal patterns in speech and predict lung function parameters with accuracy comparable to conventional devices. We validate the performance and generalizability of our method using data collected from 201 subjects enrolled in two studies, conducted internally and in collaboration with a pulmonary hospital. SpeechSpiro measures lung function parameters (e.g., forced vital capacity) with a mean normalized RMSE of 12% and an R² score of up to 76% using 60-second phone audio recordings of individuals reading a passage. Clinical relevance - Speech-based spirometry has the potential to eliminate the need for an additional device to carry out lung function assessment outside clinical settings; hence, it can enable continuous, mobile tracking of an individual's condition, whether healthy or with a respiratory illness, using a smartphone.
|
11
|
Assessment of Dysphonia in Children with Pompe Disease Using Auditory-Perceptual and Acoustic/Physiologic Methods. J Clin Med 2021; 10:jcm10163617. [PMID: 34441913 PMCID: PMC8396833 DOI: 10.3390/jcm10163617] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/14/2021] [Revised: 08/07/2021] [Accepted: 08/11/2021] [Indexed: 11/17/2022] Open
Abstract
Bulbar and respiratory weakness occur commonly in children with Pompe disease and frequently lead to dysarthria. However, changes in vocal quality associated with this motor speech disorder are poorly described. The goal of this study was to characterize the vocal function of children with Pompe disease using auditory-perceptual and physiologic/acoustic methods. High-quality voice recordings were collected from 21 children with Pompe disease. The Grade, Roughness, Breathiness, Asthenia, and Strain (GRBAS) scale was used to assess voice quality, and ratings were compared to physiologic/acoustic measurements collected during sustained phonation tasks, reading of a standard passage, and repetition of a short phrase at maximal volume. Based on ratings of grade, dysphonia was present in 90% of participants and was most commonly rated as mild or moderate in severity. Duration of sustained phonation tasks was reduced and shimmer was increased in comparison to published reference values for children without dysphonia. Specific measures of loudness were found to have statistically significant relationships with perceptual ratings of grade, breathiness, asthenia, and strain. Our data suggest that dysphonia is common in children with Pompe disease and primarily reflects impairments in respiratory and laryngeal function; however, the primary cause of dysphonia remains unclear. Future studies should seek to quantify the relative contribution of deficits in individual speech subsystems to voice quality and motor speech performance more broadly.
|
12
|
Arnold RJ, Gaskill CS, Bausek N. Effect of Combined Respiratory Muscle Training (cRMT) on Dysphonia following Single CVA: A Retrospective Pilot Study. J Voice 2021:S0892-1997(21)00109-0. [PMID: 33992476 DOI: 10.1016/j.jvoice.2021.03.014] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 12/23/2020] [Revised: 03/27/2021] [Accepted: 03/30/2021] [Indexed: 10/21/2022]
Abstract
BACKGROUND Although dysphonia is less prevalent than dysphagia following cerebrovascular accidents, dysphonia does contribute to the burden of disease resulting from stroke. Strengthening muscles of the larynx and respiratory tract through respiratory muscle training (RMT) has proven effective in improving voice after neurological insult. However, approaches to strengthen only the expiratory muscle groups (EMST) dominate the clinical study literature, with variable outcomes. By focusing on exhalation, the contribution of inspiratory muscles to phonation may have been overlooked. This study investigated the effect of combined respiratory muscle training (cRMT) to improve voice function in stroke patients. METHODS Recorded data of twenty patients with dysphonia following stroke were allocated to an intervention group (IG) or a control group (CG) based upon whether they chose cRMT or not while awaiting pro bono voice therapy services. The intervention group (n = 10) was treated daily with three 5-minute sessions of combined resistive respiratory muscle training for 28 days, while the control group (n = 10) received no cRMT or other exercise intervention. Perceptual and acoustic measurements as well as a pulmonary function test were assessed pre- and post-intervention. RESULTS The intervention group demonstrated significant improvements after 28 days of cRMT in peak flow (127%), patient self-perception of voice improvement (84.41%), as well as in five of the six categories of the Consensus Auditory-Perceptual Evaluation of Voice (CAPE-V): overall severity (63.22%), breathiness (61.06%), strain (63.43%), pitch range (48.11%), and loudness (57.51%), compared to the control group, which did not receive treatment. Furthermore, cRMT also led to significant improvements in maximum phonation time (212.5%), acoustic parameters of vocal intensity, and total semitone range (165.45%). CONCLUSIONS This pilot study shows promise for the feasibility and effectiveness of cRMT in lessening the signs and symptoms of dysphonia while simultaneously improving breath support.
Affiliation(s)
- Robert J Arnold
- Chief Clinical Officer, Applied Clinical Scientist, Southeastern Biocommunication Associates, LLC., Birmingham, Alabama
- Christopher S Gaskill
- Consulting Voice Scientist, Southeastern Biocommunication Associates, LLC., Birmingham, Alabama
- Nina Bausek
- Research Collaborator, Department of Cardiovascular Medicine, Mayo Clinic, Rochester, Minnesota.
|