1
Van Der Donckt J, Kappen M, Degraeve V, Demuynck K, Vanderhasselt MA, Van Hoecke S. Ecologically valid speech collection in behavioral research: The Ghent Semi-spontaneous Speech Paradigm (GSSP). Behav Res Methods 2024; 56:5693-5708. [PMID: 38091208 PMCID: PMC11335842 DOI: 10.3758/s13428-023-02300-4]
Abstract
This paper introduces the Ghent Semi-spontaneous Speech Paradigm (GSSP), a new method for collecting unscripted speech data for affective-behavioral research in both experimental and real-world settings through the description of peer-rated pictures with a consistent affective load. The GSSP was designed to meet five criteria: (1) allow flexible speech recording durations, (2) provide a straightforward and non-interfering task, (3) allow for experimental control, (4) favor spontaneous speech for its prosodic richness, and (5) require minimal human interference to enable scalability. The validity of the GSSP was evaluated through an online task, in which this paradigm was implemented alongside a fixed-text read-aloud task. The results indicate that participants were able to describe images for an adequate duration, and acoustic analysis demonstrated, for most features, a trend in line with the targeted speech styles (i.e., unscripted spontaneous speech versus scripted read-aloud speech). A speech style classification model using acoustic features achieved a balanced accuracy of 83% on within-dataset validation, indicating separability between the GSSP and the read-aloud speech task. Furthermore, when this model was validated on an external dataset containing interview and read-aloud speech, a balanced accuracy of 70% was obtained, indicating an acoustic correspondence between GSSP speech and spontaneous interviewee speech. The GSSP is of special interest for behavioral and speech researchers looking to capture spontaneous speech, in both longitudinal ambulatory behavioral studies and laboratory studies. To facilitate future research on speech styles, acoustics, and affective states, the task implementation code, the collected dataset, and analysis notebooks are made available.
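Balanced accuracy, the metric reported in this abstract, is the mean of per-class recalls, which prevents an imbalanced class distribution from inflating the score. A minimal illustration in plain Python (not the authors' code; the "gssp"/"read" labels are illustrative):

```python
def balanced_accuracy(y_true, y_pred):
    """Mean of per-class recalls; robust to class imbalance."""
    recalls = []
    for cls in sorted(set(y_true)):
        indices = [i for i, y in enumerate(y_true) if y == cls]
        hits = sum(1 for i in indices if y_pred[i] == cls)
        recalls.append(hits / len(indices))
    return sum(recalls) / len(recalls)

# A classifier that always predicts the majority class ("read") scores
# only 50% balanced accuracy on an imbalanced set, unlike plain accuracy.
labels = ["gssp", "read", "read", "read"]
always_read = ["read"] * 4
print(balanced_accuracy(labels, always_read))  # 0.5
```

This is why the 83% within-dataset figure is meaningful even if the two speech styles were not sampled in equal proportion.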
Affiliation(s)
- Jonas Van Der Donckt
- IDLab, Ghent University - imec, Technologiepark Zwijnaarde 122, 9052, Ghent, Zwijnaarde, Belgium.
- Department of Electronics and Information Systems, Ghent University, Ghent, Belgium.
- Mitchel Kappen
- Department of Head and Skin, Ghent University, University Hospital Ghent (UZ Ghent), Department of Psychiatry and Medical Psychology, Corneel Heymanslaan 10, 9000, Gent, Belgium.
- Ghent Experimental Psychiatry (GHEP) Lab, Ghent University, Ghent, Belgium.
- Vic Degraeve
- IDLab, Ghent University - imec, Technologiepark Zwijnaarde 122, 9052, Ghent, Zwijnaarde, Belgium
- Department of Electronics and Information Systems, Ghent University, Ghent, Belgium
- Kris Demuynck
- IDLab, Ghent University - imec, Technologiepark Zwijnaarde 122, 9052, Ghent, Zwijnaarde, Belgium
- Department of Electronics and Information Systems, Ghent University, Ghent, Belgium
- Marie-Anne Vanderhasselt
- Department of Head and Skin, Ghent University, University Hospital Ghent (UZ Ghent), Department of Psychiatry and Medical Psychology, Corneel Heymanslaan 10, 9000, Gent, Belgium
- Ghent Experimental Psychiatry (GHEP) Lab, Ghent University, Ghent, Belgium
- Department of Experimental Clinical and Health Psychology, Ghent University, Ghent, Belgium
- Sofie Van Hoecke
- IDLab, Ghent University - imec, Technologiepark Zwijnaarde 122, 9052, Ghent, Zwijnaarde, Belgium
- Department of Electronics and Information Systems, Ghent University, Ghent, Belgium
2
Yang M, El-Attar AA, Chaspari T. Deconstructing demographic bias in speech-based machine learning models for digital health. Front Digit Health 2024; 6:1351637. [PMID: 39119589 PMCID: PMC11306200 DOI: 10.3389/fdgth.2024.1351637]
Abstract
Introduction: Machine learning (ML) algorithms have been heralded as promising solutions for assistive systems in digital healthcare, due to their ability to detect fine-grained patterns that are not easily perceived by humans. Yet, ML algorithms have also been critiqued for treating individuals differently based on their demographics, thus propagating existing disparities. This paper explores gender and race bias in speech-based ML algorithms that detect behavioral and mental health outcomes. Methods: This paper examines potential sources of bias in the data used to train the ML models, encompassing acoustic features extracted from speech signals and associated labels, as well as in the ML decisions. The paper further examines approaches to reduce existing bias by using, as the ML input, the features that are least informative of one's demographic information, and by transforming the feature space in an adversarial manner to diminish the evidence of demographic information while retaining information about the focal behavioral and mental health state. Results: Results are presented in two domains, the first pertaining to gender and race bias when estimating levels of anxiety, and the second pertaining to gender bias in detecting depression. Findings indicate the presence of statistically significant differences in both acoustic features and labels among demographic groups, as well as differential ML performance among groups. The statistically significant differences present in the label space are partially preserved in the ML decisions. Although variations in ML performance across demographic groups were noted, results are mixed regarding the models' ability to accurately estimate healthcare outcomes for the sensitive groups.
Discussion: These findings underscore the necessity for careful and thoughtful design in developing ML models that are capable of maintaining crucial aspects of the data and performing effectively across all populations in digital healthcare applications.
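The "differential ML performance among groups" examined above is often quantified as the gap in a per-group metric such as recall. A hedged plain-Python sketch (the group labels and toy data are illustrative, not taken from the paper):

```python
def recall(y_true, y_pred, positive=1):
    """Fraction of true positives that the model recovers."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p == positive)
    pos = sum(1 for t in y_true if t == positive)
    return tp / pos if pos else 0.0

def recall_gap(y_true, y_pred, groups):
    """Max difference in recall between demographic groups (0 = parity)."""
    per_group = {}
    for g in set(groups):
        idx = [i for i, gg in enumerate(groups) if gg == g]
        per_group[g] = recall([y_true[i] for i in idx], [y_pred[i] for i in idx])
    return max(per_group.values()) - min(per_group.values())

y_true = [1, 0, 1, 1, 0, 1]
y_pred = [1, 0, 0, 1, 0, 1]
groups = ["a", "a", "a", "b", "b", "b"]
print(recall_gap(y_true, y_pred, groups))  # group a: 0.5, group b: 1.0 -> gap 0.5
```

A nonzero gap signals the kind of group-dependent behavior the paper's debiasing approaches aim to reduce.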
Affiliation(s)
- Michael Yang
- Computer Science & Engineering, Texas A&M University, College Station, TX, United States
- Abd-Allah El-Attar
- Computer Science & Engineering, Texas A&M University Qatar, Al Rayyan, Qatar
- Theodora Chaspari
- Institute of Cognitive Science & Computer Science, University of Colorado Boulder, Boulder, CO, United States
3
Soula M, Messas NI, Aridhi S, Urbinelli R, Guyon A. Effects of trace element dietary supplements on voice parameters and some physiological and psychological parameters related to stress. Heliyon 2024; 10:e29127. [PMID: 38655294 PMCID: PMC11035998 DOI: 10.1016/j.heliyon.2024.e29127]
Abstract
Trace elements, often used as dietary supplements, are widely accessible without prescription at pharmacies. Pronutri has pioneered Nutripuncture®, a methodology that uses orally consumed trace elements to elicit a physiological response akin to that of acupuncture. Pronutri has empirically observed that the user's voice becomes deeper following an exclusive ingestion procedure. Given that alterations in vocal characteristics are often linked to stress, the Pronutri researchers postulated that the pills can promptly alleviate stress upon ingestion. Nevertheless, there is a lack of scientific substantiation of the impact of these supplements on voice (or stress) indicators. The aim of this research was to determine whether trace element ingestion has a consistent impact on vocal characteristics, namely the fundamental frequency of the voice, as well as on other physiological and psychological stress measurements. To achieve this objective, we devised a unique methodology to examine this hypothesis: a monocentric, crossover, randomized, triple-blind, placebo-controlled trial with a sample of 43 healthy individuals. This study demonstrates that, compared to placebo tablets, consuming 10 trace-element-containing tablets at once is enough to cause noticeable changes in the vocal spectrum in the direction of a richer voice timbre, and a decrease in the occurrence of spontaneous electrodermal activity, suggesting a stress reduction. However, no significant changes were observed in the other parameters tested, including vocal measures such as the voice fundamental frequency F0, its standard deviation, jitter, and shimmer. Additionally, physiological measures such as respiratory rate, oxygenation, and heart rate variability parameters, as well as psychological measures such as self-assessed analog scales of anxiety, stress, muscle tension, and nervous tension, did not show any significant changes. Ultimately, our research revealed that the ingestion of 10 trace element pills may promptly elicit a targeted impact on both the vocal spectrum and electrodermal activity. Despite the limited impact, these findings warrant further research to explore the long-term effects of trace elements on voice and stress reduction.
Affiliation(s)
- Maxime Soula
- Université Côte d'Azur, Institut Neuromod, Mod4NeuCog, France
- Slah Aridhi
- Sensoria Analytics, Sophia Antipolis, France
- Alice Guyon
- Université côte d'Azur, CNRS UMR 7275, Institut de Pharmacologie Moléculaire et Cellulaire, 660 route des Lucioles, 06560, Valbonne Sophia Antipolis, France
- Université Côte d'Azur, Institut Neuromod, Mod4NeuCog, France
4
Kappen M, Vanhollebeke G, Van Der Donckt J, Van Hoecke S, Vanderhasselt MA. Acoustic and prosodic speech features reflect physiological stress but not isolated negative affect: a multi-paradigm study on psychosocial stressors. Sci Rep 2024; 14:5515. [PMID: 38448417 PMCID: PMC10918109 DOI: 10.1038/s41598-024-55550-3]
Abstract
Heterogeneity in speech under stress has been a recurring issue in stress research, potentially due to varied stress induction paradigms. This study investigated speech features in semi-guided speech following two distinct psychosocial stress paradigms (Cyberball and MIST) and their respective control conditions. Only negative affect increased during Cyberball, while self-reported stress, skin conductance response rate, and negative affect all increased during the MIST. Fundamental frequency (F0), speech rate, and jitter changed significantly during the MIST, but not during Cyberball; HNR and shimmer did not show the expected changes. These results indicate that the observed speech features are robust in semi-guided speech and sensitive to stressors that elicit additional physiological stress responses, rather than to isolated increases in negative affect. These differences between stressors may explain the heterogeneity in the literature. Our findings support the potential of speech as a biomarker of stress level, especially when stress elicits physiological reactions, similar to other biomarkers. This highlights its promise as a tool for measuring stress in everyday settings, given its affordability, non-intrusiveness, and ease of collection. Future research should test the robustness and specificity of these results in naturalistic settings, such as freely spoken speech and noisy environments, while exploring and validating a broader range of informative speech features in the context of stress.
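Jitter, one of the features that shifted under the MIST, measures cycle-to-cycle variability of the glottal period. A simplified local-jitter computation over extracted pitch periods (illustrative only; the study used standard acoustic toolchains, not this snippet):

```python
def local_jitter(periods):
    """Mean absolute difference of consecutive pitch periods,
    normalized by the mean period (dimensionless; often reported in %)."""
    if len(periods) < 2:
        raise ValueError("need at least two pitch periods")
    diffs = [abs(b - a) for a, b in zip(periods, periods[1:])]
    mean_period = sum(periods) / len(periods)
    return (sum(diffs) / len(diffs)) / mean_period

steady = [0.0050] * 5  # perfectly periodic voicing (5 ms periods)
perturbed = [0.0050, 0.0052, 0.0049, 0.0051, 0.0050]
print(local_jitter(steady))     # 0.0
print(local_jitter(perturbed))  # ~0.04, i.e. about 4% local jitter
```

Perfectly periodic voicing yields zero jitter; stress-related irregularity in vocal-fold vibration raises it.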
Affiliation(s)
- Mitchel Kappen
- Department of Head and Skin, Department of Psychiatry and Medical Psychology, Ghent University, University Hospital Ghent (UZ Ghent), Ghent, Belgium.
- Ghent Experimental Psychiatry (GHEP) Lab, Ghent University, Ghent, Belgium.
- Department of Experimental Clinical and Health Psychology, Ghent University, Ghent, Belgium.
- Gert Vanhollebeke
- Department of Head and Skin, Department of Psychiatry and Medical Psychology, Ghent University, University Hospital Ghent (UZ Ghent), Ghent, Belgium
- Ghent Experimental Psychiatry (GHEP) Lab, Ghent University, Ghent, Belgium
- Department of Electronics and Information Systems, Ghent University, Ghent, Belgium
- Jonas Van Der Donckt
- IDLab, Ghent University - Imec, Ghent, Belgium
- Department of Electronics and Information Systems, Ghent University, Ghent, Belgium
- Sofie Van Hoecke
- IDLab, Ghent University - Imec, Ghent, Belgium
- Department of Electronics and Information Systems, Ghent University, Ghent, Belgium
- Marie-Anne Vanderhasselt
- Department of Head and Skin, Department of Psychiatry and Medical Psychology, Ghent University, University Hospital Ghent (UZ Ghent), Ghent, Belgium
- Ghent Experimental Psychiatry (GHEP) Lab, Ghent University, Ghent, Belgium
5
Luo J, Wu Y, Liu M, Li Z, Wang Z, Zheng Y, Feng L, Lu J, He F. Differentiation between depression and bipolar disorder in child and adolescents by voice features. Child Adolesc Psychiatry Ment Health 2024; 18:19. [PMID: 38287442 PMCID: PMC10826007 DOI: 10.1186/s13034-024-00708-0]
Abstract
OBJECTIVE Major depressive disorder (MDD) and bipolar disorder (BD) are serious, chronic, disabling mental and emotional disorders whose symptoms often manifest atypically in children and adolescents, making diagnosis difficult without objective physiological indicators. We therefore aimed to objectively identify MDD and BD in children and adolescents by exploring their voiceprint features. METHODS This study included a total of 150 participants: 50 MDD patients, 50 BD patients, and 50 healthy controls aged between 6 and 16 years. After collecting voiceprint data, a chi-square test was used to screen and extract voiceprint features specific to emotional disorders in children and adolescents. The selected voiceprint features were then used to establish training and testing datasets in a 7:3 ratio. The performance of various machine learning and deep learning algorithms was compared on the training dataset, and the optimal algorithm was selected to classify the testing dataset and calculate the sensitivity, specificity, accuracy, and ROC curve. RESULTS The three groups showed differences in clustering centers for various voice features such as root mean square energy, power spectral slope, low-frequency percentile energy level, high-frequency spectral slope, spectral harmonic gain, and audio signal energy level. The linear SVM model showed the best performance on the training dataset, achieving a total accuracy of 95.6% in classifying the three groups in the testing dataset, with a sensitivity of 93.3% for MDD and 100% for BD, a specificity of 93.3%, an AUC of 1 for BD, and an AUC of 0.967 for MDD. CONCLUSION By exploring the characteristics of voice features in children and adolescents, machine learning can effectively differentiate between MDD and BD, and voice features hold promise as an objective physiological indicator for the auxiliary diagnosis of mood disorders in clinical practice.
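The sensitivity and specificity figures reported above reduce to one-vs-rest confusion-matrix counts for each diagnostic class. A minimal sketch of that computation (plain Python on toy labels, not the study's pipeline):

```python
def sensitivity_specificity(y_true, y_pred, positive):
    """One-vs-rest sensitivity (recall on the positive class) and
    specificity (correct-rejection rate on all other classes)."""
    tp = fn = tn = fp = 0
    for t, p in zip(y_true, y_pred):
        if t == positive:
            tp += (p == positive)
            fn += (p != positive)
        else:
            tn += (p != positive)
            fp += (p == positive)
    return tp / (tp + fn), tn / (tn + fp)

# Toy three-class example with MDD, BD, and healthy controls (HC):
y_true = ["MDD", "MDD", "BD", "BD", "HC", "HC"]
y_pred = ["MDD", "BD", "BD", "BD", "HC", "HC"]
sens, spec = sensitivity_specificity(y_true, y_pred, "MDD")
print(sens, spec)  # 0.5 1.0
```

Treating each class as "positive" in turn is how per-class sensitivity can be reported alongside a single specificity in a multi-class setting.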
Affiliation(s)
- Jie Luo
- National Clinical Research Center for Mental Disorders, Beijing Key Laboratory of Mental Disorders, Beijing Anding Hospital, Beijing Institute for Brain Disorders Capital Medical University, De Sheng Men Wai An Kang Hu Tong 5 Hao, Xi Cheng Qu, Beijing, 100088, People's Republic of China
- Yuanzhen Wu
- National Clinical Research Center for Mental Disorders, Beijing Key Laboratory of Mental Disorders, Beijing Anding Hospital, Beijing Institute for Brain Disorders Capital Medical University, De Sheng Men Wai An Kang Hu Tong 5 Hao, Xi Cheng Qu, Beijing, 100088, People's Republic of China
- Mengqi Liu
- National Clinical Research Center for Mental Disorders, Beijing Key Laboratory of Mental Disorders, Beijing Anding Hospital, Beijing Institute for Brain Disorders Capital Medical University, De Sheng Men Wai An Kang Hu Tong 5 Hao, Xi Cheng Qu, Beijing, 100088, People's Republic of China
- Zhaojun Li
- Beijing Institute of Technology, School of Integrated Circuits and Electronics, Zhongguancun South Street 5 Hao, Hai Dian Qu, Beijing, 100081, China
- Zhuo Wang
- Beijing Institute of Technology, School of Integrated Circuits and Electronics, Zhongguancun South Street 5 Hao, Hai Dian Qu, Beijing, 100081, China
- Yi Zheng
- National Clinical Research Center for Mental Disorders, Beijing Key Laboratory of Mental Disorders, Beijing Anding Hospital, Beijing Institute for Brain Disorders Capital Medical University, De Sheng Men Wai An Kang Hu Tong 5 Hao, Xi Cheng Qu, Beijing, 100088, People's Republic of China
- Lihui Feng
- Beijing Institute of Technology, School of Optics and Photonics, Zhongguancun South Street 5 Hao, Hai Dian Qu, Beijing, 100081, China
- Jihua Lu
- Beijing Institute of Technology, School of Integrated Circuits and Electronics, Zhongguancun South Street 5 Hao, Hai Dian Qu, Beijing, 100081, China.
- Fan He
- National Clinical Research Center for Mental Disorders, Beijing Key Laboratory of Mental Disorders, Beijing Anding Hospital, Beijing Institute for Brain Disorders Capital Medical University, De Sheng Men Wai An Kang Hu Tong 5 Hao, Xi Cheng Qu, Beijing, 100088, People's Republic of China.
6
Kappen M, Vanderhasselt MA, Slavich GM. Speech as a promising biosignal in precision psychiatry. Neurosci Biobehav Rev 2023; 148:105121. [PMID: 36914080 PMCID: PMC11219249 DOI: 10.1016/j.neubiorev.2023.105121]
Abstract
Health research and health care alike are presently based on infrequent assessments that provide an incomplete picture of clinical functioning. Consequently, opportunities to identify and prevent health events before they occur are missed. New health technologies are addressing these critical issues by enabling the continual monitoring of health-related processes using speech. These technologies are a great match for the healthcare environment because they make high-frequency assessments non-invasive and highly scalable. Indeed, existing tools can now extract a wide variety of health-relevant biosignals from smartphones by analyzing a person's voice and speech. These biosignals are linked to health-relevant biological pathways and have shown promise in detecting several disorders, including depression and schizophrenia. However, more research is needed to identify the speech signals that matter most, validate these signals against ground-truth outcomes, and translate these data into biomarkers and just-in-time adaptive interventions. We discuss these issues herein by describing how assessing everyday psychological stress through speech can help both researchers and health care providers monitor the impact that stress has on a wide variety of mental and physical health outcomes, such as self-harm, suicide, substance abuse, depression, and disease recurrence. If done appropriately and securely, speech is a novel digital biosignal that could play a key role in predicting high-priority clinical outcomes and delivering tailored interventions that help people when they need it most.
Affiliation(s)
- Mitchel Kappen
- Department of Head and Skin, Ghent University, University Hospital Ghent (UZ Ghent), Department of Psychiatry and Medical Psychology, Ghent, Belgium
- Marie-Anne Vanderhasselt
- Department of Head and Skin, Ghent University, University Hospital Ghent (UZ Ghent), Department of Psychiatry and Medical Psychology, Ghent, Belgium
- George M Slavich
- Department of Psychiatry and Biobehavioral Sciences, University of California, Los Angeles, CA, USA.
7
Kappen M, van der Donckt J, Vanhollebeke G, Allaert J, Degraeve V, Madhu N, Van Hoecke S, Vanderhasselt MA. Acoustic speech features in social comparison: how stress impacts the way you sound. Sci Rep 2022; 12:22022. [PMID: 36539505 PMCID: PMC9767914 DOI: 10.1038/s41598-022-26375-9]
Abstract
The use of speech as a digital biomarker to detect stress levels is increasingly gaining attention. Yet, heterogeneous effects of stress on specific acoustic speech features have been observed, possibly due to previous studies' use of different stress labels/categories and the lack of solid stress induction paradigms or validation of experienced stress. Here, we deployed a controlled, within-subject psychosocial stress induction experiment in which participants received both neutral (control condition) and negative (negative condition) comparative feedback after solving a challenging cognitive task. This study is the first to use a (non-actor) within-participant design that verifies a successful stress induction using both self-report (i.e., decreased reported valence) and physiological measures (i.e., increased heart rate acceleration using event-related cardiac responses during feedback exposure). Analyses of acoustic speech features showed a significant increase in Fundamental Frequency (F0) and Harmonics-to-Noise Ratio (HNR), and a significant decrease in shimmer during the negative feedback condition. Our results using read-out-loud speech comply with earlier research, yet we are the first to validate these results in a well-controlled but ecologically-valid setting to guarantee the generalization of our findings to real-life settings. Further research should aim to replicate these results in a free speech setting to test the robustness of our findings for real-world settings and should include semantics to also take into account what you say and not only how you say it.
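Fundamental frequency (F0), the feature that rose under negative feedback in this study, can be estimated from the autocorrelation peak of a voiced frame. A bare-bones sketch on a synthetic tone (plain Python; real pipelines such as Praat use far more robust estimators than this):

```python
import math

def estimate_f0(frame, sr, fmin=75.0, fmax=500.0):
    """Pick the lag in [sr/fmax, sr/fmin] with the highest autocorrelation."""
    best_lag, best_corr = 0, 0.0
    for lag in range(int(sr / fmax), int(sr / fmin) + 1):
        corr = sum(frame[i] * frame[i - lag] for i in range(lag, len(frame)))
        if corr > best_corr:
            best_corr, best_lag = corr, lag
    return sr / best_lag if best_lag else 0.0

sr = 8000
tone = [math.sin(2 * math.pi * 200 * n / sr) for n in range(800)]  # 200 Hz tone
print(estimate_f0(tone, sr))  # 200.0
```

The lag search range corresponds to the typical F0 range of adult speech (roughly 75-500 Hz); stress-related F0 increases show up as a shorter best-matching period.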
Affiliation(s)
- Mitchel Kappen
- Department of Head and Skin, Department of Psychiatry and Medical Psychology, Ghent University, University Hospital Ghent (UZ Ghent), Corneel Heymanslaan 10-13K12, 9000 Ghent, Belgium
- Ghent Experimental Psychiatry (GHEP) Lab, Ghent University, Ghent, Belgium
- Department of Experimental Clinical and Health Psychology, Ghent University, Ghent, Belgium
- Gert Vanhollebeke
- Department of Head and Skin, Department of Psychiatry and Medical Psychology, Ghent University, University Hospital Ghent (UZ Ghent), Corneel Heymanslaan 10-13K12, 9000 Ghent, Belgium
- Ghent Experimental Psychiatry (GHEP) Lab, Ghent University, Ghent, Belgium
- Department of Electronics and Information Systems, Ghent University, Ghent, Belgium
- Jens Allaert
- Department of Head and Skin, Department of Psychiatry and Medical Psychology, Ghent University, University Hospital Ghent (UZ Ghent), Corneel Heymanslaan 10-13K12, 9000 Ghent, Belgium
- Ghent Experimental Psychiatry (GHEP) Lab, Ghent University, Ghent, Belgium
- Department of Experimental Clinical and Health Psychology, Ghent University, Ghent, Belgium
- Vic Degraeve
- IDLab, Ghent University-Imec, Ghent, Belgium
- Nilesh Madhu
- IDLab, Ghent University-Imec, Ghent, Belgium
- Sofie Van Hoecke
- IDLab, Ghent University-Imec, Ghent, Belgium
- Marie-Anne Vanderhasselt
- Department of Head and Skin, Department of Psychiatry and Medical Psychology, Ghent University, University Hospital Ghent (UZ Ghent), Corneel Heymanslaan 10-13K12, 9000 Ghent, Belgium
- Ghent Experimental Psychiatry (GHEP) Lab, Ghent University, Ghent, Belgium
8
Gao X, Ma K, Yang H, Wang K, Fu B, Zhu Y, She X, Cui B. A rapid, non-invasive method for fatigue detection based on voice information. Front Cell Dev Biol 2022; 10:994001. [PMID: 36176279 PMCID: PMC9513181 DOI: 10.3389/fcell.2022.994001]
Abstract
Fatigue results from a series of physiological and psychological changes caused by continuous energy consumption. It can affect the physiological state of operators, thereby reducing their labor capacity; it can also reduce efficiency and, in serious cases, cause severe accidents, and it can trigger pathology-related changes. By establishing appropriate methods to closely monitor the fatigue status of personnel and relieve fatigue in time, operation-related injuries can be reduced. Existing fatigue detection methods are mostly subjective, such as fatigue scales, or involve the use of professional instruments, which are more demanding for operators and cannot detect fatigue levels in real time. Speech contains information that can be used as acoustic biomarkers to monitor physiological and psychological status. In this study, we constructed a fatigue model based on sleep deprivation by collecting various physiological indexes, such as P300 and glucocorticoid levels in saliva, as well as fatigue questionnaires filled in by 15 participants under different fatigue procedures, and graded the fatigue levels accordingly. We then extracted speech features at different instances and constructed a model matching the speech features to the degree of fatigue using a machine learning algorithm, thereby establishing a method to rapidly judge the degree of fatigue from speech. The accuracy of the judgment based on unitary voice could reach 94%, whereas that based on long speech could reach 81%. Our fatigue detection method based on acoustic information can easily and rapidly determine the fatigue levels of participants. The method operates in real time, is non-invasive and efficient, and can be combined with the advantages of information technology and big data to expand its applicability.
Affiliation(s)
- Bo Cui
- Correspondence: Xiaojun She; Bo Cui