Reference Citation Analysis: Find an Article, Find a Category, Find a Journal, Find a Scholar

For: Ghasemzadeh H, Tajik Khass M, Khalil Arjmandi M, Pooyan M. Detection of vocal disorders based on phase space parameters and Lyapunov spectrum. Biomed Signal Process Control 2015. [DOI: 10.1016/j.bspc.2015.07.002] [Citation(s) in RCA: 21] [Impact Index Per Article: 2.3] [Reference Citation Analysis] [What about the content of this article? (0)] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/28/2022]

For:	Ghasemzadeh H, Tajik Khass M, Khalil Arjmandi M, Pooyan M. Detection of vocal disorders based on phase space parameters and Lyapunov spectrum. Biomed Signal Process Control 2015. [DOI: 10.1016/j.bspc.2015.07.002] [Citation(s) in RCA: 21] [Impact Index Per Article: 2.3] [Reference Citation Analysis] [What about the content of this article? (0)] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/28/2022]

Number

Cited by Other Article(s)

Ghasemzadeh H, Hillman RE, Mehta DD. Consistency of the Signature of Phonotraumatic Vocal Hyperfunction Across Different Ambulatory Voice Measures. JOURNAL OF SPEECH, LANGUAGE, AND HEARING RESEARCH : JSLHR 2024:1-24. [PMID: 38861454 DOI: 10.1044/2024_jslhr-23-00515] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/13/2024]

Abstract

PURPOSE

Although different factors and voice measures have been associated with phonotraumatic vocal hyperfunction (PVH), it is unclear what percentage of individuals with PVH exhibit such differences during their daily lives. This study used a machine learning approach to quantify the consistency with which PVH manifests according to ambulatory voice measures. Analyses included acoustic parameters of phonation as well as temporal aspects of phonation and rest, with the goal of determining optimally consistent signatures of PVH.

METHOD

Ambulatory neck-surface acceleration signals were recorded over 1 week from 116 female participants diagnosed with PVH and age-, sex-, and occupation-matched vocally healthy controls. The consistency of the manifestation of PVH was defined as the percentage of participants in each group that exhibited an atypical signature based on a target voice measure. Evaluation of each machine learning model used nested 10-fold cross-validation to improve the generalizability of findings. In Experiment 1, we trained separate logistic regression models based on the distributional characteristics of 14 voice measures and durations of voicing and resting segments. In Experiments 2 and 3, features of voicing and resting duration augmented the existing distributional characteristics to examine whether more consistent signatures would result.

RESULTS

Experiment 1 showed that the difference in the magnitude of the first two harmonics (H1-H2) exhibited the most consistent signature (69.4% of participants with PVH and 20.4% of controls had an atypical H1-H2 signature), followed by spectral tilt over eight harmonics (73.6% participants with PVH and 32.1% of controls had an atypical spectral tilt signature) and estimated sound pressure level (SPL; 66.9% participants with PVH and 27.6% of controls had an atypical SPL signature). Additionally, 77.6% of participants with PVH had atypical resting duration, with 68.9% exhibiting atypical voicing duration. Experiments 2 and 3 showed that augmenting the best-performing voice measures with univariate features of voicing or resting durations yielded only incremental improvement in the classifier's performance.

CONCLUSIONS

Females with PVH were more likely to use more abrupt vocal fold closure (lower H1-H2), phonate louder (higher SPL), and take shorter vocal rests. They were also less likely to use higher fundamental frequency during their daily activities. The difference in the voicing duration signature between participants with PVH and controls had a large effect size, providing strong empirical evidence regarding the role of voice use in the development of PVH.

Collapse

Ghasemzadeh H, Hillman RE, Mehta DD. Toward Generalizable Machine Learning Models in Speech, Language, and Hearing Sciences: Estimating Sample Size and Reducing Overfitting. JOURNAL OF SPEECH, LANGUAGE, AND HEARING RESEARCH : JSLHR 2024;67:753-781. [PMID: 38386017 PMCID: PMC11005022 DOI: 10.1044/2023_jslhr-23-00273] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Received: 04/25/2023] [Revised: 08/29/2023] [Accepted: 12/19/2023] [Indexed: 02/23/2024]

Abstract

PURPOSE

Many studies using machine learning (ML) in speech, language, and hearing sciences rely upon cross-validations with single data splitting. This study's first purpose is to provide quantitative evidence that would incentivize researchers to instead use the more robust data splitting method of nested k-fold cross-validation. The second purpose is to present methods and MATLAB code to perform power analysis for ML-based analysis during the design of a study.

METHOD

First, the significant impact of different cross-validations on ML outcomes was demonstrated using real-world clinical data. Then, Monte Carlo simulations were used to quantify the interactions among the employed cross-validation method, the discriminative power of features, the dimensionality of the feature space, the dimensionality of the model, and the sample size. Four different cross-validation methods (single holdout, 10-fold, train-validation-test, and nested 10-fold) were compared based on the statistical power and confidence of the resulting ML models. Distributions of the null and alternative hypotheses were used to determine the minimum required sample size for obtaining a statistically significant outcome (5% significance) with 80% power. Statistical confidence of the model was defined as the probability of correct features being selected for inclusion in the final model.

RESULTS

ML models generated based on the single holdout method had very low statistical power and confidence, leading to overestimation of classification accuracy. Conversely, the nested 10-fold cross-validation method resulted in the highest statistical confidence and power while also providing an unbiased estimate of accuracy. The required sample size using the single holdout method could be 50% higher than what would be needed if nested k-fold cross-validation were used. Statistical confidence in the model based on nested k-fold cross-validation was as much as four times higher than the confidence obtained with the single holdout-based model. A computational model, MATLAB code, and lookup tables are provided to assist researchers with estimating the minimum sample size needed during study design.

CONCLUSION

The adoption of nested k-fold cross-validation is critical for unbiased and robust ML studies in the speech, language, and hearing sciences.

SUPPLEMENTAL MATERIAL

https://doi.org/10.23641/asha.25237045.

Collapse

Ghasemzadeh H, Doyle PC, Searl J. Image representation of the acoustic signal: An effective tool for modeling spectral and temporal dynamics of connected speech. THE JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA 2022;152:580. [PMID: 35931551 PMCID: PMC9458292 DOI: 10.1121/10.0012734] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 01/10/2022] [Revised: 06/09/2022] [Accepted: 06/30/2022] [Indexed: 06/15/2023]

Alhussain G, Shuweihdi F, Abd-alrazaq A, Alali H, Househ M. The Effectiveness of Supervised Machine Learning in Screening and Diagnosing Voice Disorders: A Systematic Review and Meta-Analysis (Preprint).. [DOI: 10.2196/preprints.38472] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 09/02/2023]

Abstract

BACKGROUND

Voice screening and diagnosis are processes that are used during voice disorders investigations. Both have limited standardized tests, which are affected by the clinician’s experience and subjective judgment. Machine learning (ML) algorithms were introduced and employed in screening/diagnosing patients’ voices as an objective tool. The effectiveness of ML algorithms in assessing and diagnosing voice disorders has been investigated by numerous studies.

OBJECTIVE

This systematic review aims to assess the effectiveness of ML algorithms in screening and diagnosing voice disorders.

METHODS

An electronic search was conducted in five databases. We included studies that examined the performance (accuracy, sensitivity, and specificity) of any ML algorithms in detecting abnormal voice samples. Two reviewers independently selected the studies, extracted data from the included studies, and assessed the risk of bias in the included studies. The methodological quality of each study was assessed using the QUADAS-2 tool. Characteristics of studies, population, and index tests were extracted. Meta-analyses were conducted for pooling accuracy, sensitivity, and specificity of ML techniques. Sources of heterogeneity were addressed by excluding some studies and discussing the possible sources of it.

RESULTS

Out of 1409 records retrieved, 13 studies were included (participants: 4079) in this review. Thirteen machine learning techniques were used in the included studies, but the most commonly used technique was SVM. The pooled accuracy, sensitivity, and specificity of ML techniques in screening voice disorders were 93%, 96%, and 93%, respectively. LS-SVM had the highest accuracy (99%) while K-NN had the highest sensitivity (98%) and specificity (98%). Quadric Discriminant analysis (QDA) achieved the lowest accuracy (91%), sensitivity (89%), and specificity (89%).

CONCLUSIONS

ML showed promising findings in screening voice disorders. However, the findings could not be conclusive in diagnosing voice disorders due to the limited number of studies that used ML for diagnosing purposes, thus, more investigations need to be made. Accordingly, it might not be possible to use ML as a substitution for the current diagnostic tools. Instead, it might be used as a decision support tool for clinicians to assess their patients, this could improve the management process for voice disorders assessment.

Collapse

Al-Hussain G, Shuweihdi F, Alali H, Househ M, Abd-Alrazaq A. The Effectiveness of Supervised Machine Learning in Screening and Diagnosing Voice Disorders: A Systematic Review and Meta-Analysis (Preprint). J Med Internet Res 2022;24:e38472. [PMID: 36239999 PMCID: PMC9617188 DOI: 10.2196/38472] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/04/2022] [Revised: 06/17/2022] [Accepted: 07/28/2022] [Indexed: 11/13/2022] Open

Voice pathology detection by using the deep network architecture. Appl Soft Comput 2021. [DOI: 10.1016/j.asoc.2021.107310] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/24/2022]

Jeffrey Kuo CF, Li YC, Weng WH, Pinos Leon KB, Chu YH. Applied image processing techniques in video laryngoscope for occult tumor detection. Biomed Signal Process Control 2020. [DOI: 10.1016/j.bspc.2019.101633] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 10/26/2022]

Ghasemzadeh H, Deliyski DD, Ford DS, Kobler JB, Hillman RE, Mehta DD. Method for Vertical Calibration of Laser-Projection Transnasal Fiberoptic High-Speed Videoendoscopy. J Voice 2019;34:847-861. [PMID: 31151853 DOI: 10.1016/j.jvoice.2019.04.015] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.4] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/31/2019] [Accepted: 04/29/2019] [Indexed: 01/28/2023]

Abstract

The ability to provide absolute calibrated measurement of the laryngeal structures during phonation is of paramount importance to voice science and clinical practice. Calibrated three-dimensional measurement could provide essential information for modeling purposes, for studying the developmental aspects of vocal fold vibration, for refining functional voice assessment and treatment outcomes evaluation, and for more accurate staging and grading of laryngeal disease. Recently, a laser-calibrated transnasal fiberoptic endoscope compatible with high-speed videoendoscopy (HSV) and capable of providing three-dimensional measurements was developed. The optical principle employed is to project a grid of 7 × 7 green laser points across the field of view (FOV) at an angle relative to the imaging axis, such that (after calibration) the position of each laser point within the FOV encodes the vertical distance from the tip of the endoscope to the laryngeal tissues. The purpose of this study was to develop a precise method for vertical calibration of the endoscope. Investigating the position of the laser points showed that, besides the vertical distance, they also depend on the parameters of the lens coupler, including the FOV position within the image frame and the rotation angle of the endoscope. The presented automatic calibration method was developed to compensate for the effect of these parameters. Statistical image processing and pattern recognition were used to detect the FOV, the center of FOV, and the fiducial marker. This step normalizes the HSV frames to a standard coordinate system and removes the dependence of the laser-point positions on the parameters of the lens coupler. Then, using a statistical learning technique, a calibration protocol was developed to model the trajectories of all laser points as the working distance was varied. Finally, a set of experiments was conducted to measure the accuracy and reliability of every step of the procedure. The system was able to measure absolute vertical distance with mean percent error in the range of 1.7% to 4.7%, depending on the working distance.

Collapse

On the design of automatic voice condition analysis systems. Part I: Review of concepts and an insight to the state of the art. Biomed Signal Process Control 2019. [DOI: 10.1016/j.bspc.2018.12.024] [Citation(s) in RCA: 28] [Impact Index Per Article: 5.6] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/14/2022]

Calibrated steganalysis of mp3stego in multi-encoder scenario. Inf Sci (N Y) 2019. [DOI: 10.1016/j.ins.2018.12.035] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.8] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/19/2022]

Ankışhan H. Classification of acoustic signals with new feature: Fibonacci space (FSp). Biomed Signal Process Control 2019. [DOI: 10.1016/j.bspc.2018.08.037] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.6] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 10/27/2022]

Multi-Scale Permutation Entropy Based on Improved LMD and HMM for Rolling Bearing Diagnosis. ENTROPY 2017. [DOI: 10.3390/e19040176] [Citation(s) in RCA: 55] [Impact Index Per Article: 7.9] [Reference Citation Analysis] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 11/16/2022]