1
Rogers HP, Hseu A, Kim J, Silberholz E, Jo S, Dorste A, Jenkins K. Voice as a Biomarker of Pediatric Health: A Scoping Review. Children (Basel, Switzerland) 2024; 11:684. [PMID: 38929263] [PMCID: PMC11201680] [DOI: 10.3390/children11060684] [Received: 04/23/2024] [Revised: 05/24/2024] [Accepted: 05/29/2024] [Indexed: 06/28/2024]
Abstract
The human voice has the potential to serve as a valuable biomarker for the early detection, diagnosis, and monitoring of pediatric conditions. This scoping review synthesizes the current knowledge on the application of artificial intelligence (AI) in analyzing pediatric voice as a biomarker for health. The included studies featured voice recordings from pediatric populations aged 0-17 years, utilized feature extraction methods, and analyzed pathological biomarkers using AI models. Data from 62 studies were extracted, encompassing study and participant characteristics, recording sources, feature extraction methods, and AI models. Data from 39 models across 35 studies were evaluated for accuracy, sensitivity, and specificity. The review showed a global representation of pediatric voice studies, with a focus on developmental, respiratory, speech, and language conditions. The most frequently studied conditions were autism spectrum disorder, intellectual disabilities, asphyxia, and asthma. Mel-Frequency Cepstral Coefficients were the most utilized feature extraction method, while Support Vector Machines were the predominant AI model. The analysis of pediatric voice using AI demonstrates promise as a non-invasive, cost-effective biomarker for a broad spectrum of pediatric conditions. Further research is necessary to standardize the feature extraction methods and AI models utilized for the evaluation of pediatric voice as a biomarker for health. Standardization has significant potential to enhance the accuracy and applicability of these tools in clinical settings across a variety of conditions and voice recording types. Further development of this field has enormous potential for the creation of innovative diagnostic tools and interventions for pediatric populations globally.
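The abstract above names Mel-Frequency Cepstral Coefficients as the most common feature extraction method across the reviewed studies. As an illustration only, the following is a minimal numpy sketch of the standard MFCC pipeline (pre-emphasis, framing, power spectrum, mel filterbank, log, DCT); it is not any particular study's implementation, and all parameter values (`sr`, `n_fft`, `hop`, `n_mels`, `n_ceps`) are illustrative defaults.

```python
import numpy as np

def mfcc(signal, sr=16000, n_fft=512, hop=160, n_mels=26, n_ceps=13):
    """Minimal MFCC extraction: pre-emphasis, framing, power spectrum,
    mel filterbank, log compression, DCT. Illustrative, not tuned."""
    # Pre-emphasis boosts high frequencies.
    sig = np.append(signal[0], signal[1:] - 0.97 * signal[:-1])
    # Slice into overlapping frames and apply a Hamming window.
    n_frames = 1 + (len(sig) - n_fft) // hop
    idx = np.arange(n_fft)[None, :] + hop * np.arange(n_frames)[:, None]
    frames = sig[idx] * np.hamming(n_fft)
    # Per-frame power spectrum.
    power = np.abs(np.fft.rfft(frames, n_fft)) ** 2 / n_fft
    # Triangular mel filterbank spanning 0 .. sr/2.
    def hz_to_mel(f): return 2595 * np.log10(1 + f / 700)
    def mel_to_hz(m): return 700 * (10 ** (m / 2595) - 1)
    mel_pts = np.linspace(hz_to_mel(0), hz_to_mel(sr / 2), n_mels + 2)
    bins = np.floor((n_fft + 1) * mel_to_hz(mel_pts) / sr).astype(int)
    fbank = np.zeros((n_mels, n_fft // 2 + 1))
    for m in range(1, n_mels + 1):
        l, c, r = bins[m - 1], bins[m], bins[m + 1]
        fbank[m - 1, l:c] = (np.arange(l, c) - l) / max(c - l, 1)
        fbank[m - 1, c:r] = (r - np.arange(c, r)) / max(r - c, 1)
    log_mel = np.log(power @ fbank.T + 1e-10)
    # Type-II DCT decorrelates the log-mel energies; keep n_ceps coeffs.
    n = np.arange(n_mels)
    dct = np.cos(np.pi * np.outer(np.arange(n_ceps), (2 * n + 1) / (2 * n_mels)))
    return log_mel @ dct.T

# Example: one second of a synthetic 220 Hz tone standing in for voice.
sr = 16000
t = np.arange(sr) / sr
feats = mfcc(np.sin(2 * np.pi * 220 * t), sr=sr)
print(feats.shape)
```

In the studies reviewed, a matrix like `feats` (frames x coefficients) would then be fed to a classifier such as a Support Vector Machine, the model the review found most prevalent.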
Affiliation(s)
- Hannah Paige Rogers: Department of Cardiology, Boston Children’s Hospital, Harvard Medical School, 300 Longwood Avenue, Boston, MA 02115, USA
- Anne Hseu: Department of Otolaryngology, Boston Children’s Hospital, 333 Longwood Ave, Boston, MA 02115, USA
- Jung Kim: Department of Pediatrics, Boston Children’s Hospital, Boston, MA 02115, USA
- Stacy Jo: Department of Otolaryngology, Boston Children’s Hospital, 333 Longwood Ave, Boston, MA 02115, USA
- Anna Dorste: Boston Children’s Hospital, 300 Longwood Avenue, Boston, MA 02115, USA
- Kathy Jenkins: Department of Cardiology, Boston Children’s Hospital, Harvard Medical School, 300 Longwood Avenue, Boston, MA 02115, USA
2
Abdusalomov AB, Safarov F, Rakhimov M, Turaev B, Whangbo TK. Improved Feature Parameter Extraction from Speech Signals Using Machine Learning Algorithm. Sensors (Basel, Switzerland) 2022; 22:8122. [PMID: 36365819] [PMCID: PMC9654697] [DOI: 10.3390/s22218122] [Received: 09/27/2022] [Revised: 10/14/2022] [Accepted: 10/20/2022] [Indexed: 06/16/2023]
Abstract
Speech recognition refers to the capability of software or hardware to receive a speech signal, identify the speaker's features in the signal, and recognize the speaker. In general, the speech recognition process involves three main steps: acoustic processing, feature extraction, and classification/recognition. Feature extraction represents a speech signal using a predetermined number of signal components, because the full acoustic signal is too cumbersome to handle and some of its information is irrelevant to the identification task. This study proposes a machine learning-based approach to feature parameter extraction from speech signals that improves the performance of speech recognition applications in real-time smart city environments. In addition, the mapping of main-memory blocks to the cache is exploited to reduce computing time; the cache block size is a parameter that strongly affects cache performance. Implementing such processes in real-time systems demands high computation speed, so processing speed plays an important role, requiring modern technologies and fast algorithms that accelerate the extraction of feature parameters from speech signals. Problems with overclocking during the digital processing of speech signals have yet to be completely resolved. The experimental results demonstrate that the proposed method successfully extracts the signal features and achieves strong classification performance relative to other conventional speech recognition algorithms.
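The abstract combines frame-level feature extraction with cache-conscious block processing. As a schematic illustration only (not the paper's implementation), the sketch below computes two classic per-frame features, short-time energy and zero-crossing rate, while walking the frame matrix in contiguous chunks; the `FRAME` and `BLOCK_FRAMES` sizes are assumed values chosen for illustration, not measured cache parameters.

```python
import numpy as np

FRAME = 256        # samples per analysis frame (illustrative)
BLOCK_FRAMES = 64  # frames per pass, so each pass touches a bounded,
                   # contiguous region of memory (cache-friendly access)

def frame_features(block):
    """Short-time energy and zero-crossing rate for a block of frames."""
    energy = np.mean(block ** 2, axis=1)
    zcr = np.mean(np.abs(np.diff(np.sign(block), axis=1)) > 0, axis=1)
    return np.stack([energy, zcr], axis=1)

def extract(signal):
    n_frames = len(signal) // FRAME
    frames = signal[: n_frames * FRAME].reshape(n_frames, FRAME)
    out = []
    # Process the frame matrix in contiguous chunks rather than all at
    # once, mirroring the block-wise memory access the paper advocates.
    for start in range(0, n_frames, BLOCK_FRAMES):
        out.append(frame_features(frames[start : start + BLOCK_FRAMES]))
    return np.concatenate(out)

# One second of synthetic noise standing in for a speech signal.
rng = np.random.default_rng(0)
feats = extract(rng.standard_normal(16000))
print(feats.shape)
```

The resulting per-frame feature matrix is the kind of input the abstract's final classification/recognition step would consume.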
Affiliation(s)
- Furkat Safarov: Department of Computer Engineering, Gachon University, Sujeong-gu, Seongnam-si 461-701, Gyeonggi-do, Korea
- Mekhriddin Rakhimov: Department of Artificial Intelligence, Tashkent University of Information Technologies Named after Muhammad Al-Khwarizmi, Tashkent 100200, Uzbekistan
- Boburkhon Turaev: Department of Artificial Intelligence, Tashkent University of Information Technologies Named after Muhammad Al-Khwarizmi, Tashkent 100200, Uzbekistan
- Taeg Keun Whangbo: Department of Computer Engineering, Gachon University, Sujeong-gu, Seongnam-si 461-701, Gyeonggi-do, Korea