1
Brahmi Z, Mahyoob M, Al-Sarem M, Algaraady J, Bousselmi K, Alblwi A. Exploring the Role of Machine Learning in Diagnosing and Treating Speech Disorders: A Systematic Literature Review. Psychol Res Behav Manag 2024; 17:2205-2232. [PMID: 38835654] [PMCID: PMC11149643] [DOI: 10.2147/prbm.s460283]
Abstract
Purpose Speech disorders profoundly impact quality of life by hindering effective communication and social interaction. This study addresses the gap in systematic reviews of machine learning-based assistive technology for individuals with speech disorders. Its overarching purpose is to offer a comprehensive overview of the field through a Systematic Literature Review (SLR) and to provide insight into the landscape of ML-based solutions and related studies. Methods The study systematically examines the existing literature on machine learning-based assistive technology for speech disorders, following an SLR methodology. Specific attention is given to ML techniques, the characteristics of the datasets exploited in the training phase, speaker languages, feature extraction techniques, and the features employed by ML algorithms. Originality This study contributes to the existing literature by systematically exploring the machine learning landscape in assistive technology for speech disorders. Its originality lies in the focused investigation of ML-based speech recognition for users with impaired speech over ten years (2014-2023). The emphasis on systematic research questions related to ML techniques, dataset characteristics, languages, feature extraction techniques, and feature sets adds a unique and comprehensive perspective to the current discourse. Findings The review identifies significant trends and critical studies published between 2014 and 2023. Across the 65 papers analyzed from prestigious journals, support vector machines and neural networks (CNN, DNN) were the most utilized ML techniques (20% and 16.92%, respectively), and the most studied disorder was dysarthria (35/65 studies, 54%). An upsurge in the use of neural network-based architectures, mainly CNN and DNN, was observed after 2018. Almost half of the included studies were published between 2021 and 2022.
Affiliation(s)
- Zaki Brahmi, Department of Computer Science, Taibah University, Madina, Kingdom of Saudi Arabia
- Mohammad Mahyoob, Department of Languages and Translation, Taibah University, Madina, Kingdom of Saudi Arabia
- Mohammed Al-Sarem, Department of Computer Science, Taibah University, Madina, Kingdom of Saudi Arabia
- Khadija Bousselmi, Department of Computer Science, LISTIC, University of Savoie Mont Blanc, Chambéry, France
- Abdulaziz Alblwi, Department of Computer Science, Taibah University, Madina, Kingdom of Saudi Arabia
2
Woisard V, Balaguer M, Fredouille C, Farinas J, Ghio A, Lalain M, Puech M, Astesano C, Pinquier J, Lepage B. Construction of an automatic score for the evaluation of speech disorders among patients treated for a cancer of the oral cavity or the oropharynx: The Carcinologic Speech Severity Index. Head Neck 2021; 44:71-88. [PMID: 34729847] [DOI: 10.1002/hed.26903]
Abstract
BACKGROUND Speech disorders impact quality of life for patients treated for oral cavity and oropharynx cancers. However, there is a lack of uniform, applicable methods for measuring the impact on speech production after treatment at this tumor location. OBJECTIVE The objective of this work is to (1) model an automatic speech severity index applicable in clinical practice, equivalent or superior to a severity score obtained by human listeners, from acoustic parameters extracted (a) directly from the speech signal and (b) from subsequent speech processing, and (2) derive an automatic speech intelligibility classification (i.e., mild, moderate, severe) to predict speech disability and handicap by combining the listener comprehension score with self-reported speech-related quality of life. METHODS Eighty-seven patients treated for cancer of the oral cavity or the oropharynx and 35 controls performed different speech production tasks and completed questionnaires on speech-related quality of life. The audio recordings were then evaluated by human perception and by automatic speech processing, and a score describing the severity of the patients' speech disorders was developed through a classic logistic regression model. RESULTS Among the parameters extracted by automatic processing of the speech signal, six were retained, yielding correlations of 0.87 with the perceptual reference score, 0.77 with the comprehension score, and 0.5 with speech-related quality of life. The parameters that contributed the most are based on automatic speech recognition systems: mainly the average normalized likelihood score on a text-reading task and the cumulative-rank score on pseudowords. The reduced automatic C2SI is modeled as: Y_C2SI = 11.48726 + (1.52926 × X_averaged normalized likelihood, reading) − (1.94e-06 × X_cumulative-rank score, pseudowords).
CONCLUSION Automatic processing of speech yields valid, reliable, and reproducible parameters that can serve as reference measures for the follow-up of patients treated for cancer of the oral cavity or the oropharynx.
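As a minimal sketch, the reduced C2SI regression reported in this abstract can be evaluated directly from its two retained ASR-based parameters. The coefficients come from the abstract itself; the input values in the usage line are purely hypothetical and serve only to illustrate the arithmetic.

```python
def c2si_score(avg_norm_likelihood_reading: float,
               cum_rank_score_pseudowords: float) -> float:
    """Reduced automatic Carcinologic Speech Severity Index (Woisard et al.).

    Coefficients are taken verbatim from the published formula; both
    inputs are ASR-derived features (reading-task likelihood, pseudoword
    cumulative-rank score).
    """
    return (11.48726
            + 1.52926 * avg_norm_likelihood_reading
            - 1.94e-06 * cum_rank_score_pseudowords)

# Hypothetical feature values, for illustration only:
print(round(c2si_score(-2.0, 1_000_000.0), 3))  # → 6.489
```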
Affiliation(s)
- Virginie Woisard, ENT Department, University Hospital of Toulouse, Toulouse, France; Oncorehabilitation Unit, University Institute of Cancer of Toulouse Oncopole, Toulouse, France; Laboratoire Octogone-Lordat, Jean Jaures University Toulouse II, Toulouse, France
- Mathieu Balaguer, ENT Department, University Hospital of Toulouse, Toulouse, France; Institut de Recherche en Informatique de Toulouse, CNRS, Paul Sabatier University Toulouse III, Toulouse, France
- Corinne Fredouille, Laboratoire d'Informatique d'Avignon, Avignon University, Avignon, France
- Jérôme Farinas, Oncorehabilitation Unit, University Institute of Cancer of Toulouse Oncopole, Toulouse, France
- Alain Ghio, Laboratoire Parole et Langage, Aix-Marseille University, Marseille, France
- Muriel Lalain, Laboratoire Parole et Langage, Aix-Marseille University, Marseille, France
- Michèle Puech, ENT Department, University Hospital of Toulouse, Toulouse, France; Oncorehabilitation Unit, University Institute of Cancer of Toulouse Oncopole, Toulouse, France
- Corine Astesano, Laboratoire Octogone-Lordat, Jean Jaures University Toulouse II, Toulouse, France
- Julien Pinquier, Oncorehabilitation Unit, University Institute of Cancer of Toulouse Oncopole, Toulouse, France
- Benoît Lepage, ENT Department, University Hospital of Toulouse, Toulouse, France; USMR, Université Paul Sabatier Toulouse III, Toulouse, France
3
Perspectives on Speech and Language Interaction for Daily Assistive Technology. ACM Transactions on Accessible Computing 2015. [DOI: 10.1145/2791576]
4
Perspectives on Speech and Language Interaction for Daily Assistive Technology. ACM Transactions on Accessible Computing 2015. [DOI: 10.1145/2756765]