1
Zhang X, Zhang X, Chen W, Li C, Yu C. Improving speech depression detection using transfer learning with wav2vec 2.0 in low-resource environments. Sci Rep 2024; 14:9543. [PMID: 38664511] [PMCID: PMC11045867] [DOI: 10.1038/s41598-024-60278-1]
Abstract
Depression, a pervasive global mental disorder, profoundly impacts daily lives. Despite numerous deep learning studies focused on depression detection through speech analysis, the shortage of annotated bulk samples hampers the development of effective models. In response to this challenge, our research introduces a transfer learning approach for detecting depression in speech, aiming to overcome constraints imposed by limited resources. In the context of feature representation, we obtain depression-related features by fine-tuning wav2vec 2.0. By integrating 1D-CNN and attention pooling structures, we generate advanced features at the segment level, thereby enhancing the model's capability to capture temporal relationships within audio frames. In the realm of prediction results, we integrate LSTM and self-attention mechanisms. This incorporation assigns greater weights to segments associated with depression, thereby augmenting the model's discernment of depression-related information. The experimental results indicate that our model has achieved impressive F1 scores, reaching 79% on the DAIC-WOZ dataset and 90.53% on the CMDC dataset. It outperforms recent baseline models in the field of speech-based depression detection. This provides a promising solution for effective depression detection in low-resource environments.
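The attention-pooling step this abstract describes (collapsing frame-level wav2vec-style features into one segment-level vector) can be illustrated with a minimal numpy sketch; the feature dimensions, random frames, and scoring vector `w` are illustrative assumptions, not the paper's trained model:

```python
import numpy as np

def attention_pool(frames, w):
    """Collapse frame-level features (T, D) into one segment vector (D,).

    Scores each frame with a scoring vector w, softmax-normalizes the
    scores, and returns the attention-weighted mean of the frames.
    """
    scores = frames @ w                      # (T,) one score per frame
    scores = scores - scores.max()           # numerical stability
    alpha = np.exp(scores) / np.exp(scores).sum()
    return alpha @ frames                    # (D,) weighted average

rng = np.random.default_rng(0)
frames = rng.normal(size=(50, 8))   # 50 frames of 8-dim wav2vec-like features
w = rng.normal(size=8)              # illustrative (would be learned in practice)
segment = attention_pool(frames, w)
```

With `w = 0` the weights are uniform and the pooling reduces to a plain mean, which is why attention pooling is a strict generalization of average pooling.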
Affiliation(s)
- Xu Zhang
- School of Software Engineering, Xiamen University of Technology, Xiamen, 361024, China
- Xiangcheng Zhang
- School of Computer and Information Engineering, Xiamen University of Technology, Xiamen, 361024, China
- Weisi Chen
- School of Software Engineering, Xiamen University of Technology, Xiamen, 361024, China
- Chenlong Li
- School of Computer and Information Engineering, Xiamen University of Technology, Xiamen, 361024, China
- Chengyuan Yu
- School of Computer and Information Engineering, Jiangxi Agricultural University, Nanchang, 330045, China
2
Li G, Zarei MA, Alibakhshi G, Labbafi A. Teachers and educators' experiences and perceptions of artificial-powered interventions for autism groups. BMC Psychol 2024; 12:199. [PMID: 38605422] [PMCID: PMC11010416] [DOI: 10.1186/s40359-024-01664-2]
Abstract
BACKGROUND Artificial intelligence-powered interventions have emerged as promising tools to support autistic individuals. However, more research is needed to examine how teachers and educators perceive and experience these AI systems when implemented. OBJECTIVES The first objective was to investigate informants' perceptions and experiences of AI-empowered interventions for children with autism. In particular, the study explores the informants' perceived benefits and challenges of using AI-empowered interventions and their recommendations for avoiding the perceived challenges. METHODOLOGY A qualitative phenomenological approach was used. Twenty educators and parents with experience implementing AI interventions for autism were recruited through purposive sampling. Semi-structured and focus-group interviews were conducted, transcribed verbatim, and analyzed using thematic analysis. FINDINGS The analysis identified four major themes: perceived benefits of AI interventions, implementation challenges, needed support, and recommendations for improvement. Benefits included increased engagement and personalized learning. Challenges included technology issues, training needs, and data privacy concerns. CONCLUSIONS AI-powered interventions show potential to improve autism support, but significant challenges must be addressed to ensure effective implementation from an educator's perspective. The benefits of personalized learning and student engagement demonstrate the potential value of these technologies. With adequate training, technical support, and measures to ensure data privacy, educators are likely to find integrating AI systems into their daily practices easier. IMPLICATIONS To realize the full benefits of AI for autism, developers must work closely with educators to understand their needs, optimize implementation, and build trust through transparent privacy policies and procedures. With proper support, AI interventions can transform how autistic individuals are educated by tailoring instruction to each student's unique profile and needs.
Affiliation(s)
- Guang Li
- School of History, Capital Normal University, Beijing, China
- Akram Labbafi
- PhD Candidate of English Language Teaching, Maraghe Branch, Islamic Azad University, Tehran, Iran
3
Xu X, Li J, Zhu Z, Zhao L, Wang H, Song C, Chen Y, Zhao Q, Yang J, Pei Y. A Comprehensive Review on Synergy of Multi-Modal Data and AI Technologies in Medical Diagnosis. Bioengineering (Basel) 2024; 11:219. [PMID: 38534493] [DOI: 10.3390/bioengineering11030219]
Abstract
Disease diagnosis represents a critical and arduous endeavor within the medical field. Artificial intelligence (AI) techniques, spanning from machine learning and deep learning to large model paradigms, stand poised to significantly augment physicians in rendering more evidence-based decisions, thus presenting a pioneering solution for clinical practice. Traditionally, the amalgamation of diverse medical data modalities (e.g., image, text, speech, genetic data, physiological signals) is imperative to facilitate a comprehensive disease analysis, a topic of burgeoning interest among both researchers and clinicians in recent times. Hence, there exists a pressing need to synthesize the latest strides in multi-modal data and AI technologies in the realm of medical diagnosis. In this paper, we narrow our focus to five specific disorders (Alzheimer's disease, breast cancer, depression, heart disease, epilepsy), elucidating advanced endeavors in their diagnosis and treatment through the lens of artificial intelligence. Our survey not only delineates detailed diagnostic methodologies across varying modalities but also underscores commonly utilized public datasets, the intricacies of feature engineering, prevalent classification models, and envisaged challenges for future endeavors. In essence, our research endeavors to contribute to the advancement of diagnostic methodologies, furnishing invaluable insights for clinical decision making.
Affiliation(s)
- Xi Xu
- Faculty of Information Technology, Beijing University of Technology, Beijing 100124, China
- Jianqiang Li
- Faculty of Information Technology, Beijing University of Technology, Beijing 100124, China
- Zhichao Zhu
- Faculty of Information Technology, Beijing University of Technology, Beijing 100124, China
- Linna Zhao
- Faculty of Information Technology, Beijing University of Technology, Beijing 100124, China
- Huina Wang
- Faculty of Information Technology, Beijing University of Technology, Beijing 100124, China
- Changwei Song
- Faculty of Information Technology, Beijing University of Technology, Beijing 100124, China
- Yining Chen
- Faculty of Information Technology, Beijing University of Technology, Beijing 100124, China
- Qing Zhao
- Faculty of Information Technology, Beijing University of Technology, Beijing 100124, China
- Jijiang Yang
- Tsinghua National Laboratory for Information Science and Technology, Tsinghua University, Beijing 100084, China
- Yan Pei
- School of Computer Science and Engineering, The University of Aizu, Aizuwakamatsu 965-8580, Japan
4
Han MM, Li XY, Yi XY, Zheng YS, Xia WL, Liu YF, Wang QX. Automatic recognition of depression based on audio and video: A review. World J Psychiatry 2024; 14:225-233. [PMID: 38464777] [PMCID: PMC10921287] [DOI: 10.5498/wjp.v14.i2.225]
Abstract
Depression is a common mental health disorder. With current depression detection methods, specialized physicians often engage in conversations and physiological examinations based on standardized scales as auxiliary measures for depression assessment. Non-biological markers-typically classified as verbal or non-verbal and deemed crucial evaluation criteria for depression-have not been effectively utilized. Specialized physicians usually require extensive training and experience to capture changes in these features. Advancements in deep learning technology have provided technical support for capturing non-biological markers. Several researchers have proposed automatic depression estimation (ADE) systems based on sounds and videos to assist physicians in capturing these features and conducting depression screening. This article summarizes commonly used public datasets and recent research on audio- and video-based ADE based on three perspectives: Datasets, deficiencies in existing research, and future development directions.
Affiliation(s)
- Meng-Meng Han
- Shandong Mental Health Center, Shandong University, Jinan 250014, Shandong Province, China
- Key Laboratory of Computing Power Network and Information Security, Ministry of Education, Shandong Computer Science Center (National Supercomputer Center in Jinan), Qilu University of Technology (Shandong Academy of Sciences), Jinan 250353, Shandong Province, China
- Xing-Yun Li
- Key Laboratory of Computing Power Network and Information Security, Ministry of Education, Shandong Computer Science Center (National Supercomputer Center in Jinan), Qilu University of Technology (Shandong Academy of Sciences), Jinan 250353, Shandong Province, China
- Shandong Engineering Research Center of Big Data Applied Technology, Faculty of Computer Science and Technology, Qilu University of Technology (Shandong Academy of Sciences), Jinan 250353, Shandong Province, China
- Shandong Provincial Key Laboratory of Computer Networks, Shandong Fundamental Research Center for Computer Science, Jinan 250353, Shandong Province, China
- Xin-Yu Yi
- Key Laboratory of Computing Power Network and Information Security, Ministry of Education, Shandong Computer Science Center (National Supercomputer Center in Jinan), Qilu University of Technology (Shandong Academy of Sciences), Jinan 250353, Shandong Province, China
- Shandong Engineering Research Center of Big Data Applied Technology, Faculty of Computer Science and Technology, Qilu University of Technology (Shandong Academy of Sciences), Jinan 250353, Shandong Province, China
- Shandong Provincial Key Laboratory of Computer Networks, Shandong Fundamental Research Center for Computer Science, Jinan 250353, Shandong Province, China
- Yun-Shao Zheng
- Department of Ward Two, Shandong Mental Health Center, Shandong University, Jinan 250014, Shandong Province, China
- Wei-Li Xia
- Shandong Mental Health Center, Shandong University, Jinan 250014, Shandong Province, China
- Ya-Fei Liu
- Shandong Mental Health Center, Shandong University, Jinan 250014, Shandong Province, China
- Qing-Xiang Wang
- Shandong Mental Health Center, Shandong University, Jinan 250014, Shandong Province, China
5
Han J, Li H, Lin H, Wu P, Wang S, Tu J, Lu J. Depression prediction based on LassoNet-RNN model: A longitudinal study. Heliyon 2023; 9:e20684. [PMID: 37842633] [PMCID: PMC10570602] [DOI: 10.1016/j.heliyon.2023.e20684]
Abstract
Depression has become a widespread health concern. Understanding its influencing factors can promote mental health and provide a basis for exploring preventive measures. Combining LassoNet with a recurrent neural network (RNN), this study constructed a screening model, LassoNet-RNN, for identifying influencing factors of individual depression. Based on multi-wave surveys from the China Health and Retirement Longitudinal Study (CHARLS) dataset (11,661 observations), we analyzed the multivariate time series data and identified 27 characteristic variables selected from four perspectives: demographics, health-related risk factors, household economic status, and living environment. Additionally, we obtained importance rankings of the characteristic variables. These results offer insightful recommendations for theoretical development and practical decision making in public health.
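LassoNet itself couples a neural network with an L1-penalized linear skip connection; the plain lasso component that drives its feature selection can be sketched with cyclic coordinate descent. The synthetic data, penalty, and "true" coefficients below are illustrative assumptions, not CHARLS variables:

```python
import numpy as np

def soft_threshold(x, t):
    """Proximal operator of the L1 norm."""
    return np.sign(x) * np.maximum(np.abs(x) - t, 0.0)

def lasso_cd(X, y, lam, sweeps=200):
    """Lasso regression by cyclic coordinate descent.

    Minimizes (1/2n)||y - Xw||^2 + lam * ||w||_1; coefficients driven to
    zero drop their features, yielding a sparse importance ranking.
    """
    n, d = X.shape
    w = np.zeros(d)
    for _ in range(sweeps):
        for j in range(d):
            r = y - X @ w + X[:, j] * w[j]      # residual excluding feature j
            rho = X[:, j] @ r / n
            w[j] = soft_threshold(rho, lam) / (X[:, j] @ X[:, j] / n)
    return w

# synthetic survey-style data: only features 0, 2, and 5 actually matter
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 8))
y = 2.0 * X[:, 0] - 1.5 * X[:, 2] + 0.5 * X[:, 5] + 0.1 * rng.normal(size=500)
w_hat = lasso_cd(X, y, lam=0.05)
ranking = np.argsort(-np.abs(w_hat))            # feature-importance order
```

Ranking features by the magnitude of the surviving coefficients mirrors, in spirit, the variable-importance lists the study reports.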
Affiliation(s)
- Jiatong Han
- School of Computer Science, Nanjing Audit University, China
- Hao Li
- School of Computer Science, Nanjing Audit University, China
- Han Lin
- Jiangsu Key Laboratory of Public Project Audit, School of Engineering Audit, Nanjing Audit University, China
- Pingping Wu
- Jiangsu Key Laboratory of Public Project Audit, School of Engineering Audit, Nanjing Audit University, China
- Shidan Wang
- School of Computer Science, Nanjing Audit University, China
- Juan Tu
- Key Laboratory of Modern Acoustics (MOE), School of Physics, Nanjing University, China
- Jing Lu
- Key Laboratory of Modern Acoustics (MOE), School of Physics, Nanjing University, China
6
Yang W, Liu J, Cao P, Zhu R, Wang Y, Liu JK, Wang F, Zhang X. Attention guided learnable time-domain filterbanks for speech depression detection. Neural Netw 2023; 165:135-149. [PMID: 37285730] [DOI: 10.1016/j.neunet.2023.05.041]
Abstract
Depression, as a global mental health problem, lacks effective screening methods that can support early detection and treatment. This paper aims to facilitate the large-scale screening of depression by focusing on the speech depression detection (SDD) task. Currently, direct modeling on the raw signal yields a large number of parameters, and existing deep learning-based SDD models mainly use fixed Mel-scale spectral features as input. However, these features are not designed for depression detection, and the manual settings limit the exploration of fine-grained feature representations. In this paper, we learn effective representations of the raw signals from an interpretable perspective. Specifically, we present a joint learning framework with attention-guided learnable time-domain filterbanks for depression classification (DALF), which combines a depression filterbanks feature learning (DFBL) module and a multi-scale spectral attention learning (MSSA) module. DFBL produces biologically meaningful acoustic features by employing learnable time-domain filters, and MSSA guides the learnable filters to better retain useful frequency sub-bands. We collect a new dataset, the Neutral Reading-based Audio Corpus (NRAC), to facilitate research in depression analysis, and we evaluate the performance of DALF on the NRAC and public DAIC-WOZ datasets. The experimental results demonstrate that our method outperforms state-of-the-art SDD methods with an F1 of 78.4% on the DAIC-WOZ dataset. In particular, DALF achieves F1 scores of 87.3% and 81.7% on the two parts of the NRAC dataset. By analyzing the filter coefficients, we find that the most important frequency range identified by our method is 600-700 Hz, which corresponds to the Mandarin vowels /e/ and /ê/ and can be considered an effective biomarker for the SDD task. Taken together, our DALF model provides a promising approach to depression detection.
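A fixed (non-learned) stand-in for a time-domain filterbank applied to the raw waveform can be sketched with windowed-sinc band-pass kernels; DALF's filters are learned end-to-end rather than designed like this, and the band edges, kernel width, and test tone below are illustrative assumptions. The 600-700 Hz band echoes the frequency range the abstract highlights:

```python
import numpy as np

def sinc_bandpass(f_lo, f_hi, sr, width=101):
    """FIR band-pass kernel: difference of two Hamming-windowed sinc low-passes."""
    t = np.arange(width) - (width - 1) / 2
    def lowpass(fc):
        return 2 * fc / sr * np.sinc(2 * fc / sr * t) * np.hamming(width)
    return lowpass(f_hi) - lowpass(f_lo)

def filterbank_energies(signal, bands, sr):
    """Convolve the raw waveform with each band's kernel; return log energies."""
    energies = []
    for f_lo, f_hi in bands:
        y = np.convolve(signal, sinc_bandpass(f_lo, f_hi, sr), mode="same")
        energies.append(np.log(np.mean(y ** 2) + 1e-10))
    return np.array(energies)

sr = 16000
t = np.arange(sr) / sr
tone = np.sin(2 * np.pi * 650 * t)            # energy inside the 600-700 Hz band
bands = [(100, 200), (600, 700), (3000, 3500)]
energies = filterbank_energies(tone, bands, sr)
```

A learnable version would parameterize each band's cutoff frequencies and update them by backpropagation, which is the core idea behind time-domain filterbank learning.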
Affiliation(s)
- Wenju Yang
- College of Computer Science and Engineering, Northeastern University, Shenyang, 110819, Liaoning, China; Key Laboratory of Intelligent Computing in Medical Image, Ministry of Education, Northeastern University, Shenyang, 110819, Liaoning, China
- Jiankang Liu
- College of Computer Science and Engineering, Northeastern University, Shenyang, 110819, Liaoning, China; Key Laboratory of Intelligent Computing in Medical Image, Ministry of Education, Northeastern University, Shenyang, 110819, Liaoning, China
- Peng Cao
- College of Computer Science and Engineering, Northeastern University, Shenyang, 110819, Liaoning, China; Key Laboratory of Intelligent Computing in Medical Image, Ministry of Education, Northeastern University, Shenyang, 110819, Liaoning, China
- Rongxin Zhu
- Early Intervention Unit, Department of Psychiatry, Affiliated Nanjing Brain Hospital, Nanjing Medical University, Nanjing, 210096, China
- Yang Wang
- Early Intervention Unit, Department of Psychiatry, Affiliated Nanjing Brain Hospital, Nanjing Medical University, Nanjing, 210096, China
- Jian K Liu
- School of Computing, University of Leeds, Leeds, LS2 9JT, United Kingdom
- Fei Wang
- Early Intervention Unit, Department of Psychiatry, Affiliated Nanjing Brain Hospital, Nanjing Medical University, Nanjing, 210096, China
- Xizhe Zhang
- School of Biomedical Engineering and Informatics, Nanjing Medical University, Nanjing, 211166, China
7
Pan W, Deng F, Wang X, Hang B, Zhou W, Zhu T. Exploring the ability of vocal biomarkers in distinguishing depression from bipolar disorder, schizophrenia, and healthy controls. Front Psychiatry 2023; 14:1079448. [PMID: 37575564] [PMCID: PMC10415910] [DOI: 10.3389/fpsyt.2023.1079448]
Abstract
Background Vocal features have been exploited to distinguish depression from healthy controls. While there have been some claims of success, the degree to which changes in vocal features are specific to depression has not been systematically studied. Hence, we examined the performance of vocal features in differentiating depression from bipolar disorder (BD), schizophrenia, and healthy controls, as well as in pairwise classifications of the three disorders. Methods We sampled 32 BD patients, 106 depression patients, 114 healthy controls, and 20 schizophrenia patients. We extracted i-vectors from Mel-frequency cepstrum coefficients (MFCCs) and built logistic regression models with ridge regularization and 5-fold cross-validation on the training set, then applied the models to the test set. There were seven classification tasks: any disorder versus healthy controls; depression versus healthy controls; BD versus healthy controls; schizophrenia versus healthy controls; depression versus BD; depression versus schizophrenia; and BD versus schizophrenia. Results The area under the curve (AUC) for classifying depression versus BD was 0.5 (F-score = 0.44). For the other comparisons, AUC scores ranged from 0.75 to 0.92, and F-scores ranged from 0.73 to 0.91. The model performance (AUC) for classifying depression versus BD was significantly worse than that for classifying BD versus schizophrenia (corrected p < 0.05); there were no significant differences among the remaining pairwise comparisons of the seven classification tasks. Conclusion Vocal features showed discriminatory potential in classifying depression against healthy controls, as well as between depression and other mental disorders. Future research should systematically examine the mechanisms by which voice features distinguish depression from other mental disorders and develop more sophisticated machine learning models so that voice analysis can better assist clinical diagnosis.
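The classification stage (ridge-regularized logistic regression on vocal features) can be sketched as follows; i-vector extraction is omitted, and the toy Gaussian clusters, learning rate, and penalty are illustrative assumptions rather than the study's actual data or hyperparameters:

```python
import numpy as np

def fit_logreg_ridge(X, y, lam=1.0, lr=0.1, steps=500):
    """Logistic regression with an L2 (ridge) penalty, fit by gradient descent.

    X: (n, d) features (e.g. i-vectors); y: (n,) labels in {0, 1}.
    """
    w = np.zeros(X.shape[1])
    b = 0.0
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-(X @ w + b)))      # predicted probabilities
        gw = X.T @ (p - y) / len(y) + lam * w        # gradient + ridge term
        gb = np.mean(p - y)
        w -= lr * gw
        b -= lr * gb
    return w, b

def predict(X, w, b):
    return (1.0 / (1.0 + np.exp(-(X @ w + b))) >= 0.5).astype(int)

# toy pairwise task: two Gaussian clusters standing in for two diagnostic groups
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(-1, 1, (100, 5)), rng.normal(1, 1, (100, 5))])
y = np.repeat([0, 1], 100)
w, b = fit_logreg_ridge(X, y, lam=0.01)
acc = (predict(X, w, b) == y).mean()
```

In practice each of the seven pairwise tasks would get its own model, with the ridge strength tuned by the 5-fold cross-validation the abstract describes.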
Affiliation(s)
- Wei Pan
- Key Laboratory of Adolescent Cyberpsychology and Behavior (CCNU), Ministry of Education, Wuhan, China
- School of Psychology, Central China Normal University, Wuhan, China
- Key Laboratory of Human Development and Mental Health of Hubei Province, Wuhan, China
- Fusong Deng
- Wuhan Wuchang Hospital, Wuchang Hospital Affiliated to Wuhan University of Science and Technology, Wuhan, China
- Xianbin Wang
- Key Laboratory of Adolescent Cyberpsychology and Behavior (CCNU), Ministry of Education, Wuhan, China
- School of Psychology, Central China Normal University, Wuhan, China
- Key Laboratory of Human Development and Mental Health of Hubei Province, Wuhan, China
- Bowen Hang
- Key Laboratory of Adolescent Cyberpsychology and Behavior (CCNU), Ministry of Education, Wuhan, China
- School of Psychology, Central China Normal University, Wuhan, China
- Key Laboratory of Human Development and Mental Health of Hubei Province, Wuhan, China
- Wenwei Zhou
- Key Laboratory of Adolescent Cyberpsychology and Behavior (CCNU), Ministry of Education, Wuhan, China
- School of Psychology, Central China Normal University, Wuhan, China
- Key Laboratory of Human Development and Mental Health of Hubei Province, Wuhan, China
- Tingshao Zhu
- Institute of Psychology, Chinese Academy of Sciences, Beijing, China
- CAS Key Laboratory of Behavioral Science, Institute of Psychology, Chinese Academy of Sciences, Beijing, China
8
Du M, Liu S, Wang T, Zhang W, Ke Y, Chen L, Ming D. Depression recognition using a proposed speech chain model fusing speech production and perception features. J Affect Disord 2023; 323:299-308. [PMID: 36462607] [DOI: 10.1016/j.jad.2022.11.060]
Abstract
BACKGROUND The growing number of patients with depression puts great pressure on clinical diagnosis. Audio-based diagnosis is a helpful auxiliary tool for early mass screening. However, current methods consider only speech perception features, ignoring patients' vocal tract changes, which may partly explain their poor recognition performance. METHODS This work proposes a novel machine speech chain model for depression recognition (MSCDR) that can capture text-independent depressive speech representations from the speaker's mouth to the listener's ear to improve recognition performance. In the proposed MSCDR, linear predictive coding (LPC) and Mel-frequency cepstral coefficient (MFCC) features are extracted to describe the processes of speech generation and speech perception, respectively. Then, a one-dimensional convolutional neural network and a long short-term memory network sequentially capture intra- and inter-segment dynamic depressive features for classification. RESULTS We tested the MSCDR on two public datasets with different languages and paradigms, namely, the Distress Analysis Interview Corpus-Wizard of Oz and the Multi-modal Open Dataset for Mental-disorder Analysis. The accuracy of the MSCDR on the two datasets was 0.77 and 0.86, and the average F1 scores were 0.75 and 0.86, better than existing methods. This improvement reveals the complementarity of speech production and perception features in carrying depressive information. LIMITATIONS The sample size was relatively small, which may limit clinical translation to some extent. CONCLUSION These experiments demonstrate the good generalization ability and superiority of the proposed MSCDR and suggest that the vocal tract changes in patients with depression deserve attention in audio-based depression diagnosis.
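The speech-production half of the model rests on linear predictive coding, which fits an all-pole model of the vocal tract. A self-contained LPC implementation via the autocorrelation method (Levinson-Durbin recursion) can be sketched as below, checked against a synthetic AR(1) signal; the order, signal, and seed are illustrative assumptions:

```python
import numpy as np

def lpc(signal, order):
    """LPC via the autocorrelation method (Levinson-Durbin recursion).

    Returns coefficients c[1..order] such that s[n] ~ sum_k c[k] * s[n-k],
    a classic parametric model of the vocal-tract filter.
    """
    n = len(signal)
    r = np.array([signal[:n - k] @ signal[k:] for k in range(order + 1)])
    a = np.zeros(order + 1)   # polynomial A(z) = 1 + a[1] z^-1 + ...
    a[0] = 1.0
    err = r[0]
    for i in range(1, order + 1):
        acc = r[i] + sum(a[j] * r[i - j] for j in range(1, i))
        k = -acc / err                      # reflection coefficient
        a_prev = a.copy()
        for j in range(1, i):
            a[j] = a_prev[j] + k * a_prev[i - j]
        a[i] = k
        err *= (1.0 - k * k)                # residual prediction error
    return -a[1:]                           # predictor coefficients

# sanity check on a synthetic AR(1) signal: s[n] = 0.9 s[n-1] + e[n]
rng = np.random.default_rng(0)
e = rng.normal(size=20000)
s = np.zeros_like(e)
for n in range(1, len(s)):
    s[n] = 0.9 * s[n - 1] + e[n]
coef = lpc(s, 1)
```

For an AR(1) process the order-1 LPC coefficient should recover the generating pole (about 0.9 here), which is a convenient correctness check for the recursion.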
Affiliation(s)
- Minghao Du
- Tianjin International Joint Research Center for Neural Engineering, Academy of Medical Engineering and Translational Medicine, Tianjin University, Tianjin, China
- Shuang Liu
- Tianjin International Joint Research Center for Neural Engineering, Academy of Medical Engineering and Translational Medicine, Tianjin University, Tianjin, China
- Tao Wang
- Tianjin International Joint Research Center for Neural Engineering, Academy of Medical Engineering and Translational Medicine, Tianjin University, Tianjin, China
- Wenquan Zhang
- Tianjin International Joint Research Center for Neural Engineering, Academy of Medical Engineering and Translational Medicine, Tianjin University, Tianjin, China
- Yufeng Ke
- Tianjin International Joint Research Center for Neural Engineering, Academy of Medical Engineering and Translational Medicine, Tianjin University, Tianjin, China
- Long Chen
- Tianjin International Joint Research Center for Neural Engineering, Academy of Medical Engineering and Translational Medicine, Tianjin University, Tianjin, China
- Dong Ming
- Tianjin International Joint Research Center for Neural Engineering, Academy of Medical Engineering and Translational Medicine, Tianjin University, Tianjin, China; Lab of Neural Engineering & Rehabilitation, Department of Biomedical Engineering, College of Precision Instruments and Optoelectronics Engineering, Tianjin University, Tianjin, China
9
A New Regression Model for Depression Severity Prediction Based on Correlation among Audio Features Using a Graph Convolutional Neural Network. Diagnostics (Basel) 2023; 13:727. [PMID: 36832211] [PMCID: PMC9955540] [DOI: 10.3390/diagnostics13040727]
Abstract
Recent studies have revealed mutually correlated audio features in the voices of depressed patients; the voices of these patients can therefore be characterized by the combinatorial relationships among those features. Many deep learning-based methods have been proposed to predict depression severity from audio data, but existing methods assume that the individual audio features are independent. Hence, in this paper, we propose a new deep learning-based regression model that predicts depression severity on the basis of the correlation among audio features. The proposed model is built on a graph convolutional neural network and trains voice characteristics using graph-structured data generated to express the correlation among audio features. We conducted prediction experiments on depression severity using the DAIC-WOZ dataset employed in several previous studies. The proposed model achieved a root mean square error (RMSE) of 2.15, a mean absolute error (MAE) of 1.25, and a symmetric mean absolute percentage error of 50.96%; in terms of RMSE and MAE, it significantly outperformed existing state-of-the-art prediction methods. From these results, we conclude that the proposed model can be a promising tool for depression diagnosis.
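A single graph-convolution layer over a feature-correlation graph, the core operation such a model builds on, can be sketched in numpy; the correlation threshold, feature count, and random embeddings are illustrative assumptions, not the paper's configuration:

```python
import numpy as np

def gcn_layer(X, A, W):
    """One graph-convolution layer: ReLU(D^-1/2 (A + I) D^-1/2 X W).

    Nodes are audio features; edges encode their pairwise correlation.
    Adding the identity (self-loops) keeps each node's own signal.
    """
    A_hat = A + np.eye(A.shape[0])
    d = A_hat.sum(axis=1)                      # degrees (>= 1 via self-loops)
    D_inv_sqrt = np.diag(1.0 / np.sqrt(d))
    return np.maximum(D_inv_sqrt @ A_hat @ D_inv_sqrt @ X @ W, 0.0)

rng = np.random.default_rng(0)
samples = rng.normal(size=(200, 6))            # 200 clips, 6 audio features
corr = np.corrcoef(samples, rowvar=False)      # feature-feature correlation
A = (np.abs(corr) > 0.1).astype(float)         # threshold into an adjacency
np.fill_diagonal(A, 0.0)
X = rng.normal(size=(6, 4))                    # illustrative node embeddings
W = rng.normal(size=(4, 3))                    # illustrative layer weights
H = gcn_layer(X, A, W)
```

Stacking such layers and pooling the node outputs into a scalar would yield the regression head that predicts a severity score.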
10
Eysenbach G, Jang EH, Lee SH, Choi KY, Park JG, Shin HC. Automatic Depression Detection Using Smartphone-Based Text-Dependent Speech Signals: Deep Convolutional Neural Network Approach. J Med Internet Res 2023; 25:e34474. [PMID: 36696160] [PMCID: PMC9909514] [DOI: 10.2196/34474]
Abstract
BACKGROUND Automatic diagnosis of depression based on speech can complement mental health treatment methods in the future. Previous studies have reported that acoustic properties can be used to identify depression. However, few studies have attempted a large-scale differential diagnosis of patients with depressive disorders using the acoustic characteristics of non-English speakers. OBJECTIVE This study proposes a framework for automatic depression detection using large-scale acoustic characteristics based on the Korean language. METHODS We recruited 153 patients who met the criteria for major depressive disorder and 165 healthy controls without current or past mental illness. Participants' voices were recorded on a smartphone while they read predefined text-based sentences. Three approaches were evaluated and compared for detecting depression using data sets with text-dependent read-speech tasks: conventional machine learning models based on acoustic features; a proposed model that trains and classifies log-Mel spectrograms using a deep convolutional neural network (CNN) with a relatively small number of parameters; and models that train and classify log-Mel spectrograms using well-known pretrained networks. RESULTS The proposed CNN model automatically detected depression from the acoustic characteristics of predefined text-based sentence reading, with a highest accuracy of 78.14% on the speech data. Our results show that deep-learned acoustic characteristics outperform both the conventional approach and the pretrained models. CONCLUSIONS Monitoring the mood of patients with major depressive disorder and detecting the consistency of objective descriptions are important research topics. This study suggests that analyzing speech recorded while reading text-dependent sentences could help predict depression status automatically by capturing its characteristic features. Our method is smartphone based, easily accessible, and can contribute to the automatic identification of depressive states.
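The log-Mel spectrogram input shared by all three deep approaches can be computed from scratch as follows; the FFT size, hop length, and mel-band count are common defaults chosen for illustration, not necessarily the study's settings:

```python
import numpy as np

def hz_to_mel(f):
    return 2595.0 * np.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def log_mel_spectrogram(signal, sr, n_fft=512, hop=128, n_mels=40):
    """STFT power spectrum -> triangular mel filterbank -> log."""
    # frame, window, and FFT
    frames = np.lib.stride_tricks.sliding_window_view(signal, n_fft)[::hop]
    spec = np.abs(np.fft.rfft(frames * np.hanning(n_fft), axis=1)) ** 2
    # triangular mel filterbank over FFT bins
    mels = np.linspace(hz_to_mel(0.0), hz_to_mel(sr / 2), n_mels + 2)
    bins = np.floor((n_fft + 1) * mel_to_hz(mels) / sr).astype(int)
    fb = np.zeros((n_mels, n_fft // 2 + 1))
    for i in range(n_mels):
        lo, c, hi = bins[i], bins[i + 1], bins[i + 2]
        for b in range(lo, c):
            fb[i, b] = (b - lo) / max(c - lo, 1)   # rising edge
        for b in range(c, hi):
            fb[i, b] = (hi - b) / max(hi - c, 1)   # falling edge
    return np.log(spec @ fb.T + 1e-10)

sr = 16000
t = np.arange(sr) / sr
mel = log_mel_spectrogram(np.sin(2 * np.pi * 440 * t), sr)   # 1 s test tone
```

The resulting (frames x mel-bands) matrix is what a 2D CNN then treats as an image-like input.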
Affiliation(s)
- Eun Hye Jang
- Medical Information Research Section, Electronics and Telecommunications Research Institute, Daejeon, Republic of Korea
- Seung-Hwan Lee
- Clinical Emotion and Cognition Research Laboratory, Inje University, Goyang, Republic of Korea; Department of Psychiatry, Inje University, Ilsan-Paik Hospital, Goyang, Republic of Korea; Bwave Inc, Goyang, Republic of Korea
- Kwang-Yeon Choi
- Department of Psychiatry, College of Medicine, Chungnam National University, Daejeon, Republic of Korea
- Jeon Gue Park
- Artificial Intelligence Research Laboratory, Electronics and Telecommunications Research Institute, Daejeon, Republic of Korea; Tutorus Labs Inc, Seoul, Republic of Korea
- Hyun-Chool Shin
- Department of Electronics Engineering, Soongsil University, Seoul, Republic of Korea
11
Smart voice recognition based on deep learning for depression diagnosis. Artificial Life and Robotics 2023. [DOI: 10.1007/s10015-023-00852-4]
|
12
|
Chen Y, Ma S, Yang X, Liu D, Yang J. Screening Children's Intellectual Disabilities with Phonetic Features, Facial Phenotype and Craniofacial Variability Index. Brain Sci 2023; 13:brainsci13010155. [PMID: 36672135 PMCID: PMC9857173 DOI: 10.3390/brainsci13010155] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/05/2022] [Revised: 12/31/2022] [Accepted: 01/09/2023] [Indexed: 01/18/2023] Open
Abstract
BACKGROUND Intellectual Disability (ID) is a kind of developmental deficiency syndrome caused by congenital diseases or postnatal events. Efficient early screening would allow timely intervention, which may improve the condition of patients and enhance their self-care ability. Early screening of ID is typically achieved by clinical interview, which requires in-depth participation of medical professionals and related medical resources. METHODS A new method for screening ID is proposed that analyzes the facial phenotype and phonetic characteristics of young subjects. First, the geometric features of subjects' faces and phonetic features of subjects' voices are extracted from interview videos; then the craniofacial variability index (CVI) is calculated from the geometric features and the risk of ID is given with the measure of CVI. Furthermore, machine learning algorithms are utilized to establish a method for further screening of ID based on facial and phonetic features. RESULTS The proposed method was evaluated using three feature sets: geometric features, CVI features, and phonetic features. The best accuracy achieved was close to 80%. CONCLUSIONS The results using the three feature sets suggest that the proposed method may be applied in a clinical setting in the future after continuous improvement.
Collapse
Affiliation(s)
- Yuhe Chen
- School of Foreign Languages, Huazhong University of Science and Technology, Wuhan 430074, China
| | - Simeng Ma
- Department of Psychiatry, Renmin Hospital of Wuhan University, Wuhan 430060, China
| | - Xiaoyu Yang
- Department of Pharmacy, Union Hospital, Tongji Medical College, Huazhong University of Science and Technology, Wuhan 430030, China
- Hubei Province Clinical Research Center for Precision Medicine for Critical Illness, Wuhan 430030, China
| | - Dujuan Liu
- Department of Psychiatry, Renmin Hospital of Wuhan University, Wuhan 430060, China
- Correspondence: (D.L.); (J.Y.)
| | - Jun Yang
- School of Computer Science & Technology, Huazhong University of Science and Technology, Wuhan 430074, China
- School of Information Engineering, Wuhan University of Technology, Wuhan 430070, China
- Correspondence: (D.L.); (J.Y.)
| |
Collapse
|
13
|
Liu Z, Yu H, Li G, Chen Q, Ding Z, Feng L, Yao Z, Hu B. Ensemble learning with speaker embeddings in multiple speech task stimuli for depression detection. Front Neurosci 2023; 17:1141621. [PMID: 37034153 PMCID: PMC10076578 DOI: 10.3389/fnins.2023.1141621] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/10/2023] [Accepted: 03/09/2023] [Indexed: 04/11/2023] Open
Abstract
Introduction As a biomarker of depression, the speech signal has attracted the interest of many researchers because it is easy to collect and non-invasive. However, subjects' speech variation under different scenes and emotional stimuli, the insufficient amount of depression speech data for deep learning, and the variable length of frame-level speech features all affect recognition performance. Methods To address the above problems, this study proposes a multi-task ensemble learning method based on speaker embeddings for depression classification. First, we extract the Mel Frequency Cepstral Coefficients (MFCC), the Perceptual Linear Predictive Coefficients (PLP), and the Filter Bank (FBANK) features from the out-domain dataset (CN-Celeb) and train a Resnet x-vector extractor, a Time Delay Neural Network (TDNN) x-vector extractor, and an i-vector extractor. Then, we extract the corresponding fixed-length speaker embeddings from the depression speech database of the Gansu Provincial Key Laboratory of Wearable Computing. Support Vector Machine (SVM) and Random Forest (RF) classifiers are used to obtain the classification results of the speaker embeddings in nine speech tasks. To make full use of the information from speech tasks with different scenes and emotions, we aggregate the classification results of the nine tasks into new features and then obtain the final classification results using a Multilayer Perceptron (MLP). To take advantage of the complementary effects of different features, Resnet x-vectors based on different acoustic features are fused in the ensemble learning method.
Results Experimental results demonstrate that (1) MFCC-based Resnet x-vectors perform best among the nine speaker embeddings for depression detection; (2) interview speech is better than picture-description speech, and the neutral stimulus is the best among the three emotional valences in the depression recognition task; (3) our multi-task ensemble learning method with MFCC-based Resnet x-vectors can effectively identify depressed patients; (4) in all cases, the combination of MFCC-based Resnet x-vectors and PLP-based Resnet x-vectors in our ensemble learning method achieves the best results, outperforming other studies using the same depression speech database. Discussion Our multi-task ensemble learning method with MFCC-based Resnet x-vectors can effectively fuse the depression-related information of different stimuli, which provides a new approach for depression detection. A limitation of this method is that the speaker embedding extractors were pre-trained on an out-domain dataset. We will consider using an augmented in-domain dataset for pre-training to further improve depression recognition performance.
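The second-stage aggregation described above (per-task classifier scores recombined into new features for a final classifier) can be illustrated with a NumPy-only sketch; the simulated task scores and the logistic-regression aggregator below are stand-ins for the paper's SVM/RF base classifiers and MLP, not its actual pipeline.

```python
import numpy as np

rng = np.random.default_rng(0)
n_subjects, n_tasks = 200, 9
y = rng.integers(0, 2, n_subjects)  # 0 = control, 1 = depressed

# Stand-in for the per-task base classifiers: each of the nine speech tasks
# yields a noisy, weakly informative probability of depression per subject.
task_probs = np.clip(
    0.5 + 0.2 * (2 * y[:, None] - 1) + 0.25 * rng.standard_normal((n_subjects, n_tasks)),
    0.0, 1.0,
)

# Stage 2: the nine task-level scores become one feature vector per subject;
# a logistic-regression aggregator is fitted by gradient descent.
X = np.hstack([task_probs, np.ones((n_subjects, 1))])  # append a bias column
w = np.zeros(X.shape[1])
for _ in range(2000):
    p = 1.0 / (1.0 + np.exp(-X @ w))
    w -= 0.1 * X.T @ (p - y) / n_subjects

pred = (1.0 / (1.0 + np.exp(-X @ w)) > 0.5).astype(int)
acc = (pred == y).mean()
```

Because each task score is only weakly informative, the aggregator's gain over any single task illustrates why combining tasks with different scenes and emotional stimuli helps.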
Collapse
Affiliation(s)
- Zhenyu Liu
- Gansu Provincial Key Laboratory of Wearable Computing, School of Information Science and Engineering, Lanzhou University, Lanzhou, China
| | - Huimin Yu
- Gansu Provincial Key Laboratory of Wearable Computing, School of Information Science and Engineering, Lanzhou University, Lanzhou, China
| | - Gang Li
- Tianshui Third People’s Hospital, Tianshui, China
| | - Qiongqiong Chen
- Second Provincial People’s Hospital of Gansu, Lanzhou, China
- Affiliated Hospital of Northwest Minzu University, Lanzhou, China
| | - Zhijie Ding
- Tianshui Third People’s Hospital, Tianshui, China
| | - Lei Feng
- Department of Psychiatry, Beijing Anding Hospital of Capital Medical University, Beijing, China
| | - Zhijun Yao
- Gansu Provincial Key Laboratory of Wearable Computing, School of Information Science and Engineering, Lanzhou University, Lanzhou, China
| | - Bin Hu
- Gansu Provincial Key Laboratory of Wearable Computing, School of Information Science and Engineering, Lanzhou University, Lanzhou, China
- *Correspondence: Bin Hu,
| |
Collapse
|
14
|
Alghowinem S, Gedeon T, Goecke R, Cohn JF, Parker G. Interpretation of Depression Detection Models via Feature Selection Methods. IEEE TRANSACTIONS ON AFFECTIVE COMPUTING 2023; 14:133-152. [PMID: 36938342 PMCID: PMC10019578 DOI: 10.1109/taffc.2020.3035535] [Citation(s) in RCA: 7] [Impact Index Per Article: 7.0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/18/2023]
Abstract
Given the prevalence of depression worldwide and its major impact on society, several studies have employed artificial intelligence modelling to automatically detect and assess depression. However, the interpretation of these models and their cues is rarely discussed in detail in the AI community, though it has received increased attention lately. In this study, we aim to analyse the commonly selected features using a proposed framework of several feature selection methods and their effect on the classification results, which provides an interpretation of the depression detection model. The developed framework aggregates and selects the most promising features for modelling depression detection from 38 feature selection algorithms of different categories. Using three real-world depression datasets, 902 behavioural cues were extracted from speech behaviour, speech prosody, eye movement and head pose. To verify the generalisability of the proposed framework, we applied the entire process to the depression datasets individually and when combined. The results from the proposed framework showed that speech behaviour features (e.g. pauses) are the most distinctive features of the depression detection model. From the speech prosody modality, the strongest feature groups were F0, HNR, formants, and MFCC, while for the eye activity modality they were left-right eye movement and gaze direction, and for the head modality it was yaw head movement. Modelling depression detection using the selected features (even though there are only 9 features) outperformed using all features in all the individual and combined datasets. Our feature selection framework not only provided an interpretation of the model, but also produced higher depression-detection accuracy with a small number of features across varied datasets. This could help reduce the processing time needed to extract features and to create the model.
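The core idea of the framework, aggregating rankings from many feature selection methods and keeping the features with the best combined rank, can be illustrated on synthetic data; the three criteria below are simple stand-ins for the 38 algorithms used in the study.

```python
import numpy as np

rng = np.random.default_rng(1)
n, d = 300, 20
X = rng.standard_normal((n, d))
# Only features 0 and 3 actually drive the (binary) label.
y = (X[:, 0] + 0.8 * X[:, 3] + 0.5 * rng.standard_normal(n) > 0).astype(int)

def rank_scores(scores):
    # Convert scores to ranks: 0 = most promising feature.
    return np.argsort(np.argsort(-scores))

# Three simple selection criteria standing in for the framework's 38 methods.
corr = np.abs(np.array([np.corrcoef(X[:, j], y)[0, 1] for j in range(d)]))
fisher = np.abs(X[y == 1].mean(0) - X[y == 0].mean(0)) / (
    X[y == 1].std(0) + X[y == 0].std(0) + 1e-12
)
mean_diff = np.abs(X[y == 1].mean(0) - X[y == 0].mean(0))

# Aggregate: average rank across criteria; lowest mean rank wins.
mean_rank = np.mean([rank_scores(s) for s in (corr, fisher, mean_diff)], axis=0)
selected = np.argsort(mean_rank)[:2]
```

Averaging ranks rather than raw scores sidesteps the fact that different selection criteria live on incomparable scales.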
Collapse
Affiliation(s)
- Sharifa Alghowinem
- Media Lab, Massachusetts Institute of Technology, Cambridge, MA, USA, with Prince Sultan University, Riyadh, Saudi Arabia and with the Australian National University, Canberra, Australia
| | - Tom Gedeon
- Australian National University, Canberra, Australia
| | | | | | | |
Collapse
|
15
|
König A, Tröger J, Mallick E, Mina M, Linz N, Wagnon C, Karbach J, Kuhn C, Peter J. Detecting subtle signs of depression with automated speech analysis in a non-clinical sample. BMC Psychiatry 2022; 22:830. [PMID: 36575442 PMCID: PMC9793349 DOI: 10.1186/s12888-022-04475-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Figures] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Received: 06/15/2022] [Accepted: 12/14/2022] [Indexed: 12/28/2022] Open
Abstract
BACKGROUND Automated speech analysis has gained increasing attention as an aid to diagnosing depression. Most previous studies, however, focused on comparing speech in patients with major depressive disorder to that in healthy volunteers. An alternative may be to associate speech with depressive symptoms in a non-clinical sample, as this may help to find early and sensitive markers in those at risk of depression. METHODS We included n = 118 healthy young adults (mean age: 23.5 ± 3.7 years; 77% women) and asked them to talk about a positive and a negative event in their life. Then, we assessed the level of depressive symptoms with a self-report questionnaire, with scores ranging from 0 to 60. We transcribed the speech data and extracted acoustic as well as linguistic features. We then tested whether individuals below or above the cut-off of clinically relevant depressive symptoms differed in speech features. Next, we predicted whether someone would be below or above that cut-off, as well as the individual scores on the depression questionnaire. Since depression is associated with cognitive slowing and attentional deficits, we finally correlated depression scores with performance in the Trail Making Test. RESULTS In our sample, n = 93 individuals scored below and n = 25 scored above the cut-off for clinically relevant depressive symptoms. Most speech features did not differ significantly between the groups, but individuals above the cut-off spoke more than those below it in both the positive and the negative story. In addition, higher depression scores in that group were associated with slower completion time of the Trail Making Test. We were able to predict with 93% accuracy who would be below or above the cut-off. In addition, we were able to predict the individual depression scores with a low mean absolute error (3.90), with the best performance achieved by a support vector machine.
CONCLUSIONS Our results indicate that even in a sample without a clinical diagnosis of depression, changes in speech relate to higher depression scores. This should be investigated in more detail in the future. In a longitudinal study, it may be tested whether speech features found in our study represent early and sensitive markers for subsequent depression in individuals at risk.
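Predicting individual questionnaire scores from speech features, as above, is a plain regression problem evaluated by mean absolute error; the sketch below uses ridge regression on synthetic features as a stand-in for the study's support vector regressor, so the features, sample size pairing, and error level are illustrative only.

```python
import numpy as np

rng = np.random.default_rng(3)
n, d = 118, 10  # n mirrors the study's sample size; the features are synthetic
X = rng.standard_normal((n, d))
# Synthetic questionnaire scores clipped to the 0-60 range of the instrument.
scores = np.clip(10 + X @ rng.uniform(-2, 2, d) + rng.standard_normal(n), 0, 60)

# Closed-form ridge regression (a stand-in for the paper's SVM regressor).
lam = 1.0
w = np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ (scores - scores.mean()))
pred = X @ w + scores.mean()
mae = np.abs(pred - scores).mean()
```

Reporting MAE on the questionnaire's own scale, as the study does, makes the error directly interpretable against the clinical cut-off.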
Collapse
Affiliation(s)
- Alexandra König
- Institut National de Recherche en Informatique Et en Automatique (INRIA), Sophia Antipolis, Stars Team, Valbonne, France
| | | | | | | | | | - Carole Wagnon
- University Hospital of Old Age Psychiatry and Psychotherapy, University of Bern, Bolligenstrasse 111, CH-3000 Bern 60, Switzerland
| | - Julia Karbach
- Department of Psychology, University of Koblenz-Landau, Koblenz, Germany
| | - Caroline Kuhn
- Department of Psychology, Clinical Neuropsychology, University of Saarland, Saarbrücken, Germany
| | - Jessica Peter
- University Hospital of Old Age Psychiatry and Psychotherapy, University of Bern, Bolligenstrasse 111, CH-3000, Bern 60, Switzerland.
| |
Collapse
|
16
|
Francese R, Attanasio P. Emotion detection for supporting depression screening. MULTIMEDIA TOOLS AND APPLICATIONS 2022; 82:12771-12795. [PMID: 36570729 PMCID: PMC9761032 DOI: 10.1007/s11042-022-14290-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Figures] [Subscribe] [Scholar Register] [Received: 11/22/2021] [Revised: 10/14/2022] [Accepted: 12/03/2022] [Indexed: 06/17/2023]
Abstract
Depression is the most prevalent mental disorder in the world. One of the most widely adopted tools for depression screening is the Beck Depression Inventory-II (BDI-II) questionnaire. Patients may minimize or exaggerate their answers. Thus, to further examine the patient's mood while filling in the questionnaire, we propose a mobile application that captures the BDI-II patient's responses together with their images and speech. Deep learning techniques such as Convolutional Neural Networks analyze the patient's audio and image data. The application displays the correlation between the patient's emotional scores and BDI-II scores to the clinician at the end of the questionnaire, indicating the relationship between the patient's emotional state and the depression screening score. We conducted a preliminary evaluation involving clinicians and patients to assess (i) the acceptability of the proposed application for use in clinics and (ii) the patient user experience. The participants were eight clinicians who tried the tool with 21 of their patients. The results seem to confirm the acceptability of the app in clinical practice.
Collapse
Affiliation(s)
- Rita Francese
- Computer Science Department, Università degli Studi di Salerno, Via Giovanni Paolo II, 132, Fisciano, 84084 (SA) Italy
| | | |
Collapse
|
17
|
Barua PD, Vicnesh J, Lih OS, Palmer EE, Yamakawa T, Kobayashi M, Acharya UR. Artificial intelligence assisted tools for the detection of anxiety and depression leading to suicidal ideation in adolescents: a review. Cogn Neurodyn 2022:1-22. [PMID: 36467993 PMCID: PMC9684805 DOI: 10.1007/s11571-022-09904-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/09/2022] [Revised: 09/26/2022] [Accepted: 10/17/2022] [Indexed: 11/24/2022] Open
Abstract
Epidemiological studies report high levels of anxiety and depression amongst adolescents. These psychiatric conditions and complex interplays of biological, social and environmental factors are important risk factors for suicidal behaviours and suicide, which show a peak in late adolescence and early adulthood. Although deaths by suicide have fallen globally in recent years, suicide deaths are increasing in some countries, such as the US. Suicide prevention is a challenging global public health problem. Currently, there are no validated clinical biomarkers for diagnosing suicidality, and traditional methods exhibit limitations. Artificial intelligence (AI) is burgeoning in many fields, including the diagnosis of medical conditions. This review paper summarizes recent studies (past 8 years) that employed AI tools for the automated detection of depression and/or anxiety disorder and discusses the limitations and effects of some modalities. The studies assert that AI tools produce promising results and could overcome the limitations of traditional diagnostic methods. Although using AI tools for detecting suicidal ideation has limitations, these are outweighed by the advantages. Thus, this review article also proposes, for future work, extracting a fusion of features such as facial images, speech signals, and visual and clinical history features from deep models for the automated detection of depression and/or anxiety disorder in individuals. This may pave the way for the identification of individuals with suicidal thoughts.
Collapse
Affiliation(s)
- Prabal Datta Barua
- School of Management and Enterprise, University of Southern Queensland, Springfield, Australia
| | - Jahmunah Vicnesh
- Department of Electronics and Computer Engineering, Ngee Ann Polytechnic, Singapore, Singapore
| | - Oh Shu Lih
- Department of Electronics and Computer Engineering, Ngee Ann Polytechnic, Singapore, Singapore
| | - Elizabeth Emma Palmer
- Discipline of Pediatric and Child Health, School of Clinical Medicine, University of New South Wales, Kensington, Australia
- Sydney Children’s Hospitals Network, Sydney, Australia
| | - Toshitaka Yamakawa
- Department of Computer Science and Electrical Engineering, Kumamoto University, Kumamoto, Japan
| | - Makiko Kobayashi
- Department of Computer Science and Electrical Engineering, Kumamoto University, Kumamoto, Japan
| | - Udyavara Rajendra Acharya
- Department of Electronics and Computer Engineering, Ngee Ann Polytechnic, Singapore, Singapore
- School of Science and Technology, Singapore University of Social Sciences, Singapore, Singapore
- Department of Bioinformatics and Medical Engineering, Asia University, Taizhong, Taiwan
- International Research Organization for Advanced Science and Technology (IROAST), Kumamoto University, Kumamoto, Japan
| |
Collapse
|
18
|
Newborn Cry-Based Diagnostic System to Distinguish between Sepsis and Respiratory Distress Syndrome Using Combined Acoustic Features. Diagnostics (Basel) 2022; 12:diagnostics12112802. [PMID: 36428865 PMCID: PMC9689015 DOI: 10.3390/diagnostics12112802] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/12/2022] [Revised: 11/05/2022] [Accepted: 11/11/2022] [Indexed: 11/18/2022] Open
Abstract
Crying is the only means of communication for a newborn baby with its surrounding environment, but it also provides significant information about the newborn's health, emotions, and needs. The cries of newborn babies have long been known as a biomarker for the diagnosis of pathologies. However, to the best of our knowledge, exploring the discrimination of two pathology groups by means of cry signals is unprecedented. Therefore, this study aimed to distinguish septic newborns from those with Neonatal Respiratory Distress Syndrome (RDS) by employing the Machine Learning (ML) methods of Multilayer Perceptron (MLP) and Support Vector Machine (SVM). Furthermore, the cry signal was analyzed from the following two different perspectives: (1) the musical perspective, by studying the spectral feature set of Harmonic Ratio (HR), and (2) the speech processing perspective, using the short-term feature set of Gammatone Frequency Cepstral Coefficients (GFCCs). In order to assess the role of employing features from both short-term and spectral modalities in distinguishing the two pathology groups, they were fused into one feature set named the combined features. The hyperparameters (HPs) of the implemented ML approaches were fine-tuned to fit each experiment. Finally, by normalizing and fusing the features originating from the two modalities, the overall performance of the proposed design was improved across all evaluation measures, achieving accuracies of 92.49% and 95.3% with the MLP and SVM classifiers, respectively. The SVM outperformed the MLP classifier on all evaluation measures presented in this study except the Area Under the Receiver Operating Characteristic Curve (AUC-ROC), which signifies the ability of the proposed design in class separation. The achieved results highlighted the role of combining features from different levels and modalities for a more powerful analysis of the cry signals, as well as of including a neural network (NN)-based classifier.
Consequently, attaining a 95.3% accuracy for the separation of two entangled pathology groups of RDS and sepsis elucidated the promising potential for further studies with larger datasets and more pathology groups.
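The normalize-then-fuse step credited with the performance gain above can be sketched in a few lines; the feature dimensions below are illustrative placeholders, not those of the actual HR and GFCC sets.

```python
import numpy as np

def zscore(F):
    # Normalize each feature column to zero mean and unit variance.
    return (F - F.mean(axis=0)) / (F.std(axis=0) + 1e-12)

rng = np.random.default_rng(2)
n = 50  # number of cry recordings (illustrative)
spectral = rng.uniform(0, 1, (n, 4))        # stand-in for Harmonic Ratio statistics
short_term = rng.uniform(-100, 100, (n, 13))  # stand-in for GFCC summary features

# Normalizing each modality before concatenation keeps the large-scale
# cepstral features from dominating the small-scale spectral ones in a
# downstream distance-based classifier such as an SVM.
combined = np.hstack([zscore(spectral), zscore(short_term)])
```

Without the per-modality z-scoring, the cepstral columns (spanning hundreds of units) would swamp the spectral columns (spanning fractions of a unit) in any kernel or distance computation.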
Collapse
|
19
|
Dhelim S, Chen L, Ning H, Nugent C. Artificial intelligence for suicide assessment using Audiovisual Cues: a review. Artif Intell Rev 2022. [DOI: 10.1007/s10462-022-10290-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/06/2022]
|
20
|
Malhotra A, Jindal R. Deep learning techniques for suicide and depression detection from online social media: A scoping review. Appl Soft Comput 2022. [DOI: 10.1016/j.asoc.2022.109713] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/29/2022]
|
21
|
Zlatintsi A, Filntisis PP, Garoufis C, Efthymiou N, Maragos P, Menychtas A, Maglogiannis I, Tsanakas P, Sounapoglou T, Kalisperakis E, Karantinos T, Lazaridi M, Garyfalli V, Mantas A, Mantonakis L, Smyrnis N. E-Prevention: Advanced Support System for Monitoring and Relapse Prevention in Patients with Psychotic Disorders Analyzing Long-Term Multimodal Data from Wearables and Video Captures. SENSORS (BASEL, SWITZERLAND) 2022; 22:7544. [PMID: 36236643 PMCID: PMC9572170 DOI: 10.3390/s22197544] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Figures] [Subscribe] [Scholar Register] [Received: 08/24/2022] [Revised: 09/23/2022] [Accepted: 09/26/2022] [Indexed: 06/16/2023]
Abstract
Wearable technologies and digital phenotyping foster unique opportunities for designing novel intelligent electronic services that can address various well-being issues in patients with mental disorders (i.e., schizophrenia and bipolar disorder), thus having the potential to revolutionize psychiatry and its clinical practice. In this paper, we present e-Prevention, an innovative integrated system for medical support that facilitates effective monitoring and relapse prevention in patients with mental disorders. The technologies offered through e-Prevention include: (i) long-term continuous recording of biometric and behavioral indices through a smartwatch; (ii) video recordings of patients while being interviewed by a clinician, using a tablet; (iii) automatic and systematic storage of these data in a dedicated Cloud server; and (iv) the ability of relapse detection and prediction. This paper focuses on the description of the e-Prevention system and the methodologies developed for the identification of feature representations that correlate with and can predict psychopathology and relapses in patients with mental disorders. Specifically, we tackle the problem of relapse detection and prediction using Machine and Deep Learning techniques on all collected data. The results are promising, indicating that such predictions could be made, leading eventually to the prediction of psychopathology and the prevention of relapses.
Collapse
Affiliation(s)
- Athanasia Zlatintsi
- School of ECE, National Technical University of Athens, 157 73 Athens, Greece
| | | | - Christos Garoufis
- School of ECE, National Technical University of Athens, 157 73 Athens, Greece
| | - Niki Efthymiou
- School of ECE, National Technical University of Athens, 157 73 Athens, Greece
| | - Petros Maragos
- School of ECE, National Technical University of Athens, 157 73 Athens, Greece
| | - Andreas Menychtas
- Department of Digital Systems, University of Piraeus, 185 34 Pireas, Greece
| | - Ilias Maglogiannis
- Department of Digital Systems, University of Piraeus, 185 34 Pireas, Greece
| | - Panayiotis Tsanakas
- School of ECE, National Technical University of Athens, 157 73 Athens, Greece
| | | | - Emmanouil Kalisperakis
- Laboratory of Cognitive Neuroscience and Sensorimotor Control, University Mental Health, Neurosciences and Precision Medicine Research Institute “COSTAS STEFANIS”, 115 27 Athens, Greece
- 1st Department of Psychiatry, Eginition Hospital, Medical School, National and Kapodistrian University of Athens, 115 28 Athens, Greece
| | - Thomas Karantinos
- Laboratory of Cognitive Neuroscience and Sensorimotor Control, University Mental Health, Neurosciences and Precision Medicine Research Institute “COSTAS STEFANIS”, 115 27 Athens, Greece
| | - Marina Lazaridi
- Laboratory of Cognitive Neuroscience and Sensorimotor Control, University Mental Health, Neurosciences and Precision Medicine Research Institute “COSTAS STEFANIS”, 115 27 Athens, Greece
- 1st Department of Psychiatry, Eginition Hospital, Medical School, National and Kapodistrian University of Athens, 115 28 Athens, Greece
| | - Vasiliki Garyfalli
- Laboratory of Cognitive Neuroscience and Sensorimotor Control, University Mental Health, Neurosciences and Precision Medicine Research Institute “COSTAS STEFANIS”, 115 27 Athens, Greece
- 1st Department of Psychiatry, Eginition Hospital, Medical School, National and Kapodistrian University of Athens, 115 28 Athens, Greece
| | - Asimakis Mantas
- Laboratory of Cognitive Neuroscience and Sensorimotor Control, University Mental Health, Neurosciences and Precision Medicine Research Institute “COSTAS STEFANIS”, 115 27 Athens, Greece
| | - Leonidas Mantonakis
- Laboratory of Cognitive Neuroscience and Sensorimotor Control, University Mental Health, Neurosciences and Precision Medicine Research Institute “COSTAS STEFANIS”, 115 27 Athens, Greece
- 1st Department of Psychiatry, Eginition Hospital, Medical School, National and Kapodistrian University of Athens, 115 28 Athens, Greece
| | - Nikolaos Smyrnis
- Laboratory of Cognitive Neuroscience and Sensorimotor Control, University Mental Health, Neurosciences and Precision Medicine Research Institute “COSTAS STEFANIS”, 115 27 Athens, Greece
- 2nd Department of Psychiatry, University General Hospital “ATTIKON”, Medical School, National and Kapodistrian University of Athens, 124 62 Athens, Greece
| |
Collapse
|
22
|
Depression detection based on nonlinear and linear speech features in I-vector/SVDA framework. Comput Biol Med 2022; 149:105926. [DOI: 10.1016/j.compbiomed.2022.105926] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/05/2022] [Revised: 07/07/2022] [Accepted: 07/30/2022] [Indexed: 11/18/2022]
|
23
|
Wu P, Wang R, Lin H, Zhang F, Tu J, Sun M. Automatic depression recognition by intelligent speech signal processing: A systematic survey. CAAI TRANSACTIONS ON INTELLIGENCE TECHNOLOGY 2022. [DOI: 10.1049/cit2.12113] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/19/2022] Open
Affiliation(s)
- Pingping Wu
- Jiangsu Key Laboratory of Public Project Audit, School of Engineering Audit Nanjing Audit University Nanjing China
| | - Ruihao Wang
- School of Information Engineering Nanjing Audit University Nanjing China
| | - Han Lin
- Jiangsu Key Laboratory of Public Project Audit, School of Engineering Audit Nanjing Audit University Nanjing China
| | - Fanlong Zhang
- School of Information Engineering Nanjing Audit University Nanjing China
| | - Juan Tu
- Key Laboratory of Modern Acoustics (MOE), School of Physics Nanjing University Nanjing China
| | - Miao Sun
- Faculty of Electrical Engineering, Mathematics & Computer Science Delft University of Technology Delft The Netherlands
| |
Collapse
|
24
|
Kshirsagar PR, Manoharan H, Selvarajan S, Alterazi HA, Singh D, Lee HN. Perception Exploration on Robustness Syndromes With Pre-processing Entities Using Machine Learning Algorithm. Front Public Health 2022; 10:893989. [PMID: 35784247 PMCID: PMC9243559 DOI: 10.3389/fpubh.2022.893989] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/11/2022] [Accepted: 04/27/2022] [Indexed: 11/13/2022] Open
Abstract
The majority of current-generation individuals all around the world are dealing with a variety of health-related issues. The most common cause of health problems has been found to be depression, which is caused by intellectual difficulties. However, most people are unable to recognize such occurrences in themselves, and no procedures for discriminating them from normal people have been created so far. Even some advanced technologies do not support distinct classes of individuals, as language writing skills vary greatly across numerous places, making the central operations cumbersome. As a result, the primary goal of the proposed research is to create a unique model that can detect a variety of diseases in humans, thereby averting a high level of depression. A machine learning method known as the Convolutional Neural Network (CNN) model has been incorporated into this evolutionary process for extracting numerous features in three distinct units. The CNN also detects early-stage problems, since it accepts input in the form of writing and sketching, both of which are turned into images. Furthermore, with this sort of image emotion analysis, ordinary reactions may be easily differentiated, resulting in more accurate prediction results. Characteristics such as reference line, tilt, length, edge, constraint, alignment, separation, and sectors are analyzed to test the usefulness of the CNN for recognizing abnormalities, and the extracted features yield an enhanced value of around 74%, higher than that of conventional models.
Collapse
Affiliation(s)
- Pravin R. Kshirsagar
- Department of Artificial Intelligence, G.H. Raisoni College of Engineering, Nagpur, India
| | - Hariprasath Manoharan
- Department of Electronics and Communication Engineering, Panimalar Institute of Technology, Chennai, India
| | - Shitharth Selvarajan
- Department of Computer Science and Engineering, Kebri Dehar University, Kebri Dehar, Ethiopia
| | - Hassan A. Alterazi
- Department of Information Technology, Faculty of Computing and Information Technology, King Abdulaziz University, Jeddah, Saudi Arabia
| | - Dilbag Singh
- School of Electrical Engineering and Computer Science, Gwangju Institute of Science and Technology, Gwangju, South Korea
| | - Heung-No Lee
- School of Electrical Engineering and Computer Science, Gwangju Institute of Science and Technology, Gwangju, South Korea
- *Correspondence: Heung-No Lee
| |
Collapse
|
25
|
He L, Tiwari P, Lv C, Wu W, Guo L. Reducing noisy annotations for depression estimation from facial images. Neural Netw 2022; 153:120-129. [DOI: 10.1016/j.neunet.2022.05.025] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/25/2021] [Revised: 04/17/2022] [Accepted: 05/25/2022] [Indexed: 11/28/2022]
|
26
|
Abstract
BACKGROUND In this modern era, depression is one of the most prevalent mental disorders, affecting millions of individuals today. The symptoms of depression are heterogeneous and often coincide with those of other disorders such as bipolar disorder, Parkinson's disease, and schizophrenia. It is a serious mental illness that may lead to other health problems if left untreated. Currently, identifying individuals with depression rests entirely on the clinician's expertise and experience. To assist clinicians in identifying the characteristics of and classifying depressed people, researchers in this field have incorporated different types of data modalities and machine learning techniques. This study aims to answer important questions about the trends in publications, data modalities, machine learning models, dataset usage, pre-processing techniques, and feature extraction and selection techniques that are prevalent, and to guide the direction of future research on depression diagnosis. METHODS This systematic review was conducted using a broad range of articles from two major databases: IEEE Xplore and PubMed. Studies from 2011 to April 2021 were retrieved, resulting in a total of 590 articles (53 from IEEE Xplore and 537 from PubMed). Of those, the articles that satisfied the defined inclusion criteria were investigated further. RESULTS A total of 135 articles were identified and analysed for this review. High growth in the number of publications has been observed in recent years, along with significant diversity in the use of data modalities and machine learning classifiers. fMRI data with an SVM classifier was the most popular choice among researchers. In most studies, data scarcity and small sample sizes, particularly for neuroimaging data, are major concerns. Similar data modalities tend to be processed with identical pre-processing tools. This study also provides a statistical analysis of the current framework with respect to modality, machine learning classifier, sample size, and accuracy, applying one-way ANOVA and the Tukey-Kramer test. CONCLUSION The results indicate that an effective fusion of machine learning techniques with a suitable data modality holds promise for assisting clinicians in automatic depression diagnosis.
Collapse
Affiliation(s)
- Sweta Bhadra
- Department of CS & IT, Cotton University, Guwahati, India
| | | |
Collapse
|
27
|
Ravi V, Wang J, Flint J, Alwan A. FrAug: A Frame Rate Based Data Augmentation Method for Depression Detection from Speech Signals. Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP) 2022; 2022:6267-6271. [PMID: 35531125 PMCID: PMC9070766 DOI: 10.1109/icassp43922.2022.9746307] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/14/2023]
Abstract
In this paper, a data augmentation method is proposed for depression detection from speech signals. Samples for data augmentation were created by changing the frame-width and the frame-shift parameters during the feature extraction process. Unlike other data augmentation methods (such as VTLP, pitch perturbation, or speed perturbation), the proposed method does not explicitly change acoustic parameters but rather the time-frequency resolution of frame-level features. The proposed method was evaluated using two different datasets, models, and input acoustic features. For the DAIC-WOZ (English) dataset when using the DepAudioNet model and mel-Spectrograms as input, the proposed method resulted in an improvement of 5.97% (validation) and 25.13% (test) when compared to the baseline. The improvements for the CONVERGE (Mandarin) dataset when using the x-vector embeddings with CNN as the backend and MFCCs as input features were 9.32% (validation) and 12.99% (test). Baseline systems do not incorporate any data augmentation. Further, the proposed method outperformed commonly used data-augmentation methods such as noise augmentation, VTLP, Speed, and Pitch Perturbation. All improvements were statistically significant.
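The frame-rate idea can be sketched outside any toolkit: re-framing the same waveform with different frame-width and frame-shift parameters yields several time-frequency "views" of one recording, each usable as an augmented training sample. This is a minimal numpy illustration, not the authors' implementation; the (width, shift) settings below are invented for the example.

```python
import numpy as np

def frame_signal(x, frame_len, frame_shift):
    """Split waveform x into overlapping frames of frame_len samples,
    advancing frame_shift samples per frame (no padding)."""
    n_frames = 1 + (len(x) - frame_len) // frame_shift
    idx = np.arange(frame_len)[None, :] + frame_shift * np.arange(n_frames)[:, None]
    return x[idx]

sr = 16000
x = np.random.default_rng(0).normal(size=sr)  # 1 s of placeholder audio

# Each (frame width, frame shift) pair gives a different time-frequency
# resolution over identical acoustics -- one augmented "view" per setting.
views = []
for width_ms, shift_ms in [(25, 10), (32, 16), (64, 32)]:  # illustrative settings
    frames = frame_signal(x, sr * width_ms // 1000, sr * shift_ms // 1000)
    views.append(frames)
```

Frame-level features (e.g. mel filterbank energies or MFCCs) would then be computed per view, so the model sees the same utterance at several resolutions without any explicit acoustic perturbation.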
Collapse
Affiliation(s)
- Vijay Ravi
- Dept. of Electrical and Computer Engineering, University of California, Los Angeles, USA
| | - Jinhan Wang
- Dept. of Electrical and Computer Engineering, University of California, Los Angeles, USA
| | - Jonathan Flint
- Dept. of Psychiatry and Biobehavioral Sciences, University of California, Los Angeles, USA
| | - Abeer Alwan
- Dept. of Electrical and Computer Engineering, University of California, Los Angeles, USA
| |
Collapse
|
28
|
Gupta S, Goel L, Singh A, Prasad A, Ullah MA. Psychological Analysis for Depression Detection from Social Networking Sites. COMPUTATIONAL INTELLIGENCE AND NEUROSCIENCE 2022; 2022:4395358. [PMID: 35432513 PMCID: PMC9007657 DOI: 10.1155/2022/4395358] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Received: 12/07/2021] [Revised: 02/28/2022] [Accepted: 03/24/2022] [Indexed: 11/23/2022]
Abstract
Rapid technological advancements are altering people's communication styles. With the growth of the Internet, social networks (Twitter, Facebook, Telegram, and Instagram) have become popular forums for people to share their thoughts, psychological behavior, and emotions. Psychological analysis examines text and extracts facts, features, and important information from users' opinions. Researchers working on psychological analysis rely on social networks to detect depression-related behavior and activity. Social networks provide abundant data on the mindset of a person at the onset of depression, such as low sociability and activities such as undergoing medical treatment, a primary emphasis on oneself, and a high rate of activity during the day and night. In this paper, we used five machine learning classifiers for depression detection in tweets: decision trees, K-nearest neighbors, support vector machines, logistic regression, and LSTM. The dataset was collected in two forms, balanced and imbalanced, and oversampling techniques were studied. The results show that the LSTM classification model outperforms the other baseline models in this depression detection healthcare approach for both balanced and imbalanced data.
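Balancing a dataset by oversampling, as mentioned above, can be done at its simplest by randomly duplicating minority-class rows until the classes are equal in size. A minimal numpy sketch (the feature rows and labels are toy stand-ins for vectorized tweets); libraries such as imbalanced-learn offer more elaborate schemes like SMOTE:

```python
import numpy as np

def random_oversample(X, y, seed=0):
    """Duplicate minority-class rows until every class matches the largest."""
    rng = np.random.default_rng(seed)
    X, y = np.asarray(X), np.asarray(y)
    labels, counts = np.unique(y, return_counts=True)
    n_max = counts.max()
    keep = []
    for lbl, cnt in zip(labels, counts):
        idx = np.flatnonzero(y == lbl)
        extra = rng.choice(idx, size=n_max - cnt, replace=True)  # resample with replacement
        keep.append(np.concatenate([idx, extra]))
    keep = np.concatenate(keep)
    return X[keep], y[keep]

X = np.arange(10).reshape(5, 2)   # 5 toy "tweets" as feature rows
y = np.array([0, 0, 0, 0, 1])     # imbalanced: 4 vs. 1
Xb, yb = random_oversample(X, y)  # both classes now size 4
```

Oversampling is applied to the training split only; duplicating rows before the train/test split would leak copies of test items into training.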
Collapse
Affiliation(s)
- Sonam Gupta
- Department of Computer Science and Engineering, Ajay Kumar Garg Engineering College, Ghaziabad, India
| | - Lipika Goel
- Gokaraju Rangaraju Institute of Engineering and Technology, Hyderabad, India
| | - Arjun Singh
- School of Computing and Information Technology, Manipal University Jaipur, Jaipur, India
| | - Ajay Prasad
- University of Petroleum and Energy Studies, Dehradun, India
| | - Mohammad Aman Ullah
- Department of Computer Science and Engineering, International Islamic University Chittagong, Chittagong, Bangladesh
| |
Collapse
|
29
|
Machine Learning Algorithms for Depression: Diagnosis, Insights, and Research Directions. ELECTRONICS 2022. [DOI: 10.3390/electronics11071111] [Citation(s) in RCA: 9] [Impact Index Per Article: 4.5] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 01/10/2023]
Abstract
Over the years, stress, anxiety, and modern-day fast-paced lifestyles have had immense psychological effects on people’s minds worldwide. Global technological development in healthcare digitizes copious data, enabling the various forms of human biology to be mapped more accurately than with traditional measuring techniques. Machine learning (ML) has been credited as an efficient approach for analyzing the massive amounts of data in the healthcare domain. ML methodologies are being utilized in mental health to predict the probabilities of mental disorders and, therefore, potential treatment outcomes. This review paper lists the different machine learning algorithms used to detect and diagnose depression. The ML-based depression detection algorithms are categorized into three classes: classification, deep learning, and ensemble. A general model for depression diagnosis involving data extraction, pre-processing, ML classifier training, detection classification, and performance evaluation is presented. Moreover, the paper presents an overview of the objectives and limitations of different research studies in the domain of depression detection, and discusses future research possibilities in the field of depression diagnosis.
Collapse
|
30
|
ENIC: Ensemble and Nature Inclined Classification with Sparse Depiction based Deep and Transfer Learning for Biosignal Classification. Appl Soft Comput 2022. [DOI: 10.1016/j.asoc.2022.108416] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/30/2022]
|
31
|
Automatic Identification of Emotional Information in Spanish TV Debates and Human–Machine Interactions. APPLIED SCIENCES-BASEL 2022. [DOI: 10.3390/app12041902] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 02/04/2023]
Abstract
Automatic emotion detection is a very attractive field of research that can help build more natural human–machine interaction systems. However, several issues arise when real scenarios are considered, such as the tendency toward neutrality, which makes it difficult to obtain balanced datasets, or the lack of standards for the annotation of emotional categories. Moreover, the intrinsic subjectivity of emotional information increases the difficulty of obtaining valuable data to train machine learning-based algorithms. In this work, two different real scenarios were tackled: human–human interactions in TV debates and human–machine interactions with a virtual agent. For comparison purposes, an analysis of the emotional information was conducted in both. Thus, a profiling of the speakers associated with each task was carried out. Furthermore, different classification experiments show that deep learning approaches can be useful for detecting speakers’ emotional information, mainly for arousal, valence, and dominance levels, reaching a 0.7 F1-score.
Collapse
|
32
|
Tonn P, Seule L, Degani Y, Herzinger S, Klein A, Schulze N. Evaluation of a Digital Content-free Speech Analysis Tool to Measure Affective Distress in Mental Health (Preprint). JMIR Form Res 2022; 6:e37061. [PMID: 36040767 PMCID: PMC9472064 DOI: 10.2196/37061] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/04/2022] [Revised: 05/08/2022] [Accepted: 05/09/2022] [Indexed: 11/13/2022] Open
Affiliation(s)
- Peter Tonn
- Neuropsychiatric Center of Hamburg, Hamburg, Germany
| | - Lea Seule
- Neuropsychiatric Center of Hamburg, Hamburg, Germany
| | | | | | | | - Nina Schulze
- Neuropsychiatric Center of Hamburg, Hamburg, Germany
| |
Collapse
|
33
|
Birnbaum ML, Abrami A, Heisig S, Ali A, Arenare E, Agurto C, Lu N, Kane JM, Cecchi G. Acoustic and Facial Features From Clinical Interviews for Machine Learning-Based Psychiatric Diagnosis: Algorithm Development. JMIR Ment Health 2022; 9:e24699. [PMID: 35072648 PMCID: PMC8822433 DOI: 10.2196/24699] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Figures] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Received: 10/01/2020] [Revised: 04/29/2021] [Accepted: 12/01/2021] [Indexed: 01/26/2023] Open
Abstract
BACKGROUND In contrast to all other areas of medicine, psychiatry is still nearly entirely reliant on subjective assessments such as patient self-report and clinical observation. The lack of objective information on which to base clinical decisions can contribute to reduced quality of care. Behavioral health clinicians need objective and reliable patient data to support effective targeted interventions. OBJECTIVE We aimed to investigate whether reliable inferences-psychiatric signs, symptoms, and diagnoses-can be extracted from audiovisual patterns in recorded evaluation interviews of participants with schizophrenia spectrum disorders and bipolar disorder. METHODS We obtained audiovisual data from 89 participants (mean age 25.3 years; male: 48/89, 53.9%; female: 41/89, 46.1%): individuals with schizophrenia spectrum disorders (n=41), individuals with bipolar disorder (n=21), and healthy volunteers (n=27). We developed machine learning models based on acoustic and facial movement features extracted from participant interviews to predict diagnoses and detect clinician-coded neuropsychiatric symptoms, and we assessed model performance using area under the receiver operating characteristic curve (AUROC) in 5-fold cross-validation. RESULTS The model successfully differentiated between schizophrenia spectrum disorders and bipolar disorder (AUROC 0.73) when aggregating face and voice features. Facial action units including cheek-raising muscle (AUROC 0.64) and chin-raising muscle (AUROC 0.74) provided the strongest signal for men. Vocal features, such as energy in the frequency band 1 to 4 kHz (AUROC 0.80) and spectral harmonicity (AUROC 0.78), provided the strongest signal for women. Lip corner-pulling muscle signal discriminated between diagnoses for both men (AUROC 0.61) and women (AUROC 0.62). 
Several psychiatric signs and symptoms were successfully inferred: blunted affect (AUROC 0.81), avolition (AUROC 0.72), lack of vocal inflection (AUROC 0.71), asociality (AUROC 0.63), and worthlessness (AUROC 0.61). CONCLUSIONS This study represents advancement in efforts to capitalize on digital data to improve diagnostic assessment and supports the development of a new generation of innovative clinical tools by employing acoustic and facial data analysis.
Collapse
Affiliation(s)
- Michael L Birnbaum
- Department of Psychiatry, The Zucker Hillside Hospital, Northwell Health, Glen Oaks, NY, United States.,The Feinstein Institute for Medical Research, Northwell Health, Manhasset, NY, United States.,The Donald and Barbara Zucker School of Medicine at Hofstra/Northwell, Hempstead, NY, United States
| | - Avner Abrami
- Computational Biology Center, IBM Research, Yorktown Heights, NY, United States
| | - Stephen Heisig
- Icahn School of Medicine at Mount Sinai, New York City, NY, United States
| | - Asra Ali
- Department of Psychiatry, The Zucker Hillside Hospital, Northwell Health, Glen Oaks, NY, United States.,The Feinstein Institute for Medical Research, Northwell Health, Manhasset, NY, United States
| | - Elizabeth Arenare
- Department of Psychiatry, The Zucker Hillside Hospital, Northwell Health, Glen Oaks, NY, United States.,The Feinstein Institute for Medical Research, Northwell Health, Manhasset, NY, United States
| | - Carla Agurto
- Computational Biology Center, IBM Research, Yorktown Heights, NY, United States
| | - Nathaniel Lu
- Department of Psychiatry, The Zucker Hillside Hospital, Northwell Health, Glen Oaks, NY, United States.,The Feinstein Institute for Medical Research, Northwell Health, Manhasset, NY, United States
| | - John M Kane
- Department of Psychiatry, The Zucker Hillside Hospital, Northwell Health, Glen Oaks, NY, United States.,The Feinstein Institute for Medical Research, Northwell Health, Manhasset, NY, United States.,The Donald and Barbara Zucker School of Medicine at Hofstra/Northwell, Hempstead, NY, United States
| | - Guillermo Cecchi
- Computational Biology Center, IBM Research, Yorktown Heights, NY, United States
| |
Collapse
|
34
|
Artificial Intelligence Enabled Personalised Assistive Tools to Enhance Education of Children with Neurodevelopmental Disorders-A Review. INTERNATIONAL JOURNAL OF ENVIRONMENTAL RESEARCH AND PUBLIC HEALTH 2022; 19:ijerph19031192. [PMID: 35162220 PMCID: PMC8835076 DOI: 10.3390/ijerph19031192] [Citation(s) in RCA: 10] [Impact Index Per Article: 5.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Received: 12/07/2021] [Revised: 01/07/2022] [Accepted: 01/10/2022] [Indexed: 11/26/2022]
Abstract
Mental disorders (MDs) with onset in childhood or adolescence include neurodevelopmental disorders (NDDs) (intellectual disability and specific learning disabilities, such as dyslexia, attention deficit disorder (ADHD), and autism spectrum disorders (ASD)), as well as a broad range of mental health disorders (MHDs), including anxiety, depressive, stress-related and psychotic disorders. There is a high co-morbidity of NDDs and MHDs. Globally, there have been dramatic increases in the diagnosis of childhood-onset mental disorders, with a 2- to 3-fold rise in prevalence for several MHDs in the US over the past 20 years. Depending on the type of MD, children often grapple with social and communication deficits and difficulties adapting to changes in their environment, which can impact their ability to learn effectively. To improve outcomes for children, it is important to provide timely and effective interventions. This review summarises the range and effectiveness of AI-assisted tools, developed using machine learning models, which have been applied to address learning challenges in students with a range of NDDs. Our review summarises the evidence that AI tools can be successfully used to improve social interaction and supportive education. Based on the limitations of existing AI tools, we provide recommendations for the development of future AI tools with a focus on providing personalised learning for individuals with NDDs.
Collapse
|
35
|
Calić G, Petrović-Lazić M, Mentus T, Babac S. Acoustic features of voice in adults suffering from depression. PSIHOLOSKA ISTRAZIVANJA 2022. [DOI: 10.5937/psistra25-39224] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 01/08/2023]
Abstract
In order to examine the differences between people suffering from depression (EG, N=18), healthy controls (CG1, N=24), and people with a diagnosed psychogenic voice disorder (CG2, N=9), nine acoustic features of voice were assessed among a total of 51 participants using the MDVP software programme ("Kay Elemetrics" Corp., model 4300). The nine acoustic parameters were analysed on the basis of the sustained phonation of the vowel /a/. The results revealed that the mean values of all acoustic parameters differed in the EG compared to both CG1 and CG2 as follows: the parameters which indicate frequency variability (Jitt, PPQ), amplitude variability (Shim, vAm, APQ), and noise and tremor (NHR, VTI) were higher; only fundamental frequency (F0) and the soft phonation index (SPI) were lower (F0 compared to CG1, and SPI compared to CG1 and CG2). Only the PPQ parameter was not significant. vAm and APQ had the highest discriminant value for depression. The acoustic features of voice analysed in this study, with regard to the sustained phonation of a vowel, were different and discriminant in the EG compared to CG1 and CG2. In voice analysis, the parameters vAm and APQ could potentially serve as markers indicative of depression. The results of this research point to the importance of the voice, that is, its acoustic indicators, in recognizing depression. Important parameters that could help create a programme for the automatic recognition of depression are those from the domain of voice intensity variation.
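For readers unfamiliar with these MDVP-style measures, local jitter and shimmer can be approximated directly from per-cycle pitch periods and peak amplitudes. A rough numpy sketch, assuming the glottal cycles have already been segmented (the toy numbers are invented, and real analysis software applies additional smoothing and voicing checks):

```python
import numpy as np

def local_jitter(periods):
    """Mean absolute difference between consecutive pitch periods,
    as a percentage of the mean period (cf. MDVP Jitt)."""
    p = np.asarray(periods, dtype=float)
    return 100.0 * np.mean(np.abs(np.diff(p))) / np.mean(p)

def local_shimmer(amplitudes):
    """Mean absolute difference between consecutive cycle peak amplitudes,
    as a percentage of the mean amplitude (cf. MDVP Shim)."""
    a = np.asarray(amplitudes, dtype=float)
    return 100.0 * np.mean(np.abs(np.diff(a))) / np.mean(a)

# Toy per-cycle measurements from a sustained /a/ (values invented).
periods = [0.0100, 0.0102, 0.0099, 0.0101]   # seconds per glottal cycle
amps    = [0.80, 0.78, 0.81, 0.79]           # peak amplitude per cycle
jitt, shim = local_jitter(periods), local_shimmer(amps)
```

Higher values on both measures indicate less stable phonation, which is the direction the study reports for the depressed group.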
Collapse
|
36
|
Hajduska-Dér B, Kiss G, Sztahó D, Vicsi K, Simon L. The applicability of the Beck Depression Inventory and Hamilton Depression Scale in the automatic recognition of depression based on speech signal processing. Front Psychiatry 2022; 13:879896. [PMID: 35990073 PMCID: PMC9385975 DOI: 10.3389/fpsyt.2022.879896] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Received: 02/20/2022] [Accepted: 07/18/2022] [Indexed: 11/25/2022] Open
Abstract
Depression is a growing problem worldwide, impacting an increasing number of patients and also affecting health systems and the global economy. The most common diagnostic rating scales of depression are self-reported or clinician-administered, and they differ in the symptoms they sample. Speech is a promising biomarker in the diagnostic assessment of depression, due to its non-invasiveness and its cost- and time-efficiency. In our study, we aim for a more accurate, sensitive model for determining depression based on speech processing. Regression and classification models were developed using machine learning. During the research, we had access to a large speech database that includes samples from depressed and healthy subjects. The database contains the Beck Depression Inventory (BDI) score of each subject and the Hamilton Rating Scale for Depression (HAMD) score of 20% of the subjects. This provided an opportunity to compare the usefulness of BDI and HAMD for training models for the automatic recognition of depression based on speech signal processing. We found that the estimated values of the acoustic model trained on BDI scores are closer to the HAMD assessment than to the BDI scores, and that the partial use of HAMD scores instead of BDI scores in training improves the accuracy of automatic depression recognition.
Collapse
Affiliation(s)
- Bálint Hajduska-Dér
- Department of Psychiatry and Psychotherapy, Semmelweis University, Budapest, Hungary
| | - Gábor Kiss
- Department of Telecommunications and Media Informatics, Budapest University of Technology and Economics, Budapest, Hungary
| | - Dávid Sztahó
- Department of Telecommunications and Media Informatics, Budapest University of Technology and Economics, Budapest, Hungary
| | - Klára Vicsi
- Department of Telecommunications and Media Informatics, Budapest University of Technology and Economics, Budapest, Hungary
| | - Lajos Simon
- Department of Psychiatry and Psychotherapy, Semmelweis University, Budapest, Hungary
| |
Collapse
|
37
|
Klangpornkun N, Ruangritchai M, Munthuli A, Onsuwan C, Jaisin K, Pattanaseri K, Lortrakul J, Thanakulakkarachai P, Anansiripinyo T, Amornlaksananon A, Laohawee S, Tantibundhit C. Classification of Depression and Other Psychiatric Conditions Using Speech Features Extracted from a Thai Psychiatric and Verbal Screening Test. ANNUAL INTERNATIONAL CONFERENCE OF THE IEEE ENGINEERING IN MEDICINE AND BIOLOGY SOCIETY. IEEE ENGINEERING IN MEDICINE AND BIOLOGY SOCIETY. ANNUAL INTERNATIONAL CONFERENCE 2021; 2021:651-656. [PMID: 34891377 DOI: 10.1109/embc46164.2021.9629571] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/13/2023]
Abstract
Depression is a common and serious mental illness which negatively affects daily functioning. To prevent the progression of the illness into severe or long-term consequences, early diagnosis is crucial. We developed an automated speech feature analysis application for depression and other psychiatric disorders derived from a newly developed Thai psychiatric and verbal screening test. The screening test includes the Thai versions of the Patient Health Questionnaire-9 (PHQ-9) and the Hamilton Depression Rating Scale (HAM-D), and 32 additional emotion-induced questions. A case-control study was conducted on speech features from 66 participants: 27 had depression (DP), 12 had other psychiatric disorders (OP), and 27 were normal controls (NC). Five-fold cross-validation across 6 settings of 5 classifiers, combining PHQ-9 and HAM-D scores with speech features, was examined. Results showed the highest performance from the multilayer perceptron (MLP) classifier, which yielded 83.33% sensitivity, 91.67% specificity, and 83.33% accuracy, with negative-emotional questions being most effective in classification. The automated speech feature analysis showed promising results for screening patients with depression or other psychiatric disorders. The current application is accessible through a smartphone, making it a feasible and intuitive setup for low-resource countries such as Thailand.
Collapse
|
38
|
Kwon N, Kim S. Depression Severity Detection Using Read Speech with a Divide-and-Conquer Approach. ANNUAL INTERNATIONAL CONFERENCE OF THE IEEE ENGINEERING IN MEDICINE AND BIOLOGY SOCIETY. IEEE ENGINEERING IN MEDICINE AND BIOLOGY SOCIETY. ANNUAL INTERNATIONAL CONFERENCE 2021; 2021:633-637. [PMID: 34891373 DOI: 10.1109/embc46164.2021.9629868] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 11/06/2022]
Abstract
We propose a divide-and-conquer approach to detecting depression severity using speech. We divide speech features based on their attributes, i.e., acoustic, prosodic, and language features, then fuse them in a modeling stage with fully connected deep neural networks. In experiments with 76 clinically depressed patients (38 severe and 38 moderate in terms of the Montgomery-Asberg Depression Rating Scale (MADRS)), we obtain 78% accuracy, while the patients' self-reported scores classify their status with 79% accuracy.
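The divide-and-conquer fusion described here can be pictured as one small branch per feature group whose outputs are concatenated into a joint classifier. Below is a forward-pass sketch in plain numpy with random, untrained weights; the group names, feature sizes, and layer widths are invented for illustration, whereas the paper's branches are trained, fully connected deep networks.

```python
import numpy as np

rng = np.random.default_rng(0)

def branch(x, out_dim=16):
    """One fully connected ReLU layer with random weights (untrained sketch)."""
    w = rng.normal(size=(x.size, out_dim)) * 0.1
    return np.maximum(0.0, x @ w)

# Hypothetical per-attribute feature groups for one utterance.
groups = {
    "acoustic": rng.normal(size=40),
    "prosodic": rng.normal(size=12),
    "language": rng.normal(size=100),
}

# "Divide": each attribute group gets its own branch ...
hidden = [branch(feats) for feats in groups.values()]

# ... "conquer": concatenate branch outputs and classify jointly.
fused = np.concatenate(hidden)            # 3 branches * 16 dims = 48
w_out = rng.normal(size=fused.size) * 0.1
p_severe = 1.0 / (1.0 + np.exp(-(fused @ w_out)))  # severe vs. moderate
```

Keeping the branches separate until the fusion stage lets each feature family be normalized and sized independently before the joint decision.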
Collapse
|
39
|
Prabhu S, Mittal H, Varagani R, Jha S, Singh S. Harnessing emotions for depression detection. Pattern Anal Appl 2021. [DOI: 10.1007/s10044-021-01020-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/29/2022]
|
40
|
Niu M, Liu B, Tao J, Li Q. A time-frequency channel attention and vectorization network for automatic depression level prediction. Neurocomputing 2021. [DOI: 10.1016/j.neucom.2021.04.056] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/28/2022]
|
41
|
Little B, Alshabrawy O, Stow D, Ferrier IN, McNaney R, Jackson DG, Ladha K, Ladha C, Ploetz T, Bacardit J, Olivier P, Gallagher P, O'Brien JT. Deep learning-based automated speech detection as a marker of social functioning in late-life depression. Psychol Med 2021; 51:1441-1450. [PMID: 31944174 PMCID: PMC8311821 DOI: 10.1017/s0033291719003994] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Received: 08/06/2019] [Revised: 10/23/2019] [Accepted: 12/13/2019] [Indexed: 11/24/2022]
Abstract
BACKGROUND Late-life depression (LLD) is associated with poor social functioning. However, previous research uses bias-prone self-report scales to measure social functioning and a more objective measure is lacking. We tested a novel wearable device to measure speech that participants encounter as an indicator of social interaction. METHODS Twenty-nine participants with LLD and 29 age-matched controls wore a wrist-worn device continuously for seven days, which recorded their acoustic environment. Acoustic data were automatically analysed using deep learning models that had been developed and validated on an independent speech dataset. Total speech activity and the proportion of speech produced by the device wearer were both detected whilst maintaining participants' privacy. Participants underwent a neuropsychological test battery and clinical and self-report scales to measure severity of depression, general and social functioning. RESULTS Compared to controls, participants with LLD showed poorer self-reported social and general functioning. Total speech activity was much lower for participants with LLD than controls, with no overlap between groups. The proportion of speech produced by the participants was smaller for LLD than controls. In LLD, both speech measures correlated with attention and psychomotor speed performance but not with depression severity or self-reported social functioning. CONCLUSIONS Using this device, LLD was associated with lower levels of speech than controls and speech activity was related to psychomotor retardation. We have demonstrated that speech activity measured by wearable technology differentiated LLD from controls with high precision and, in this study, provided an objective measure of an aspect of real-world social functioning in LLD.
Collapse
Affiliation(s)
- Bethany Little
- Institute of Neuroscience, Newcastle University, Newcastle upon Tyne, UK
| | - Ossama Alshabrawy
- Interdisciplinary Computing and Complex BioSystems (ICOS) group, School of Computing, Newcastle University, Newcastle upon Tyne, UK
- Faculty of Science, Damietta University, New Damietta, Egypt
| | - Daniel Stow
- Institute of Health and Society, Newcastle University, Newcastle upon Tyne, UK
| | - I. Nicol Ferrier
- Institute of Neuroscience, Newcastle University, Newcastle upon Tyne, UK
| | | | - Daniel G. Jackson
- Open Lab, School of Computing, Newcastle University, Newcastle upon Tyne, UK
| | - Karim Ladha
- Open Lab, School of Computing, Newcastle University, Newcastle upon Tyne, UK
| | | | - Thomas Ploetz
- School of Interactive Computing, Georgia Institute of Technology, Atlanta, GA, USA
| | - Jaume Bacardit
- Interdisciplinary Computing and Complex BioSystems (ICOS) group, School of Computing, Newcastle University, Newcastle upon Tyne, UK
| | - Patrick Olivier
- Faculty of Information Technology, Monash University, Melbourne, Australia
| | - Peter Gallagher
- Institute of Neuroscience, Newcastle University, Newcastle upon Tyne, UK
| | - John T. O'Brien
- Institute of Neuroscience, Newcastle University, Newcastle upon Tyne, UK
- Department of Psychiatry, University of Cambridge, Cambridge, UK
| |
Collapse
|
42
|
Dong Y, Yang X. A hierarchical depression detection model based on vocal and emotional cues. Neurocomputing 2021. [DOI: 10.1016/j.neucom.2021.02.019] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 10/22/2022]
|
43
|
Amir O, Anker SD, Gork I, Abraham WT, Pinney SP, Burkhoff D, Shallom ID, Haviv R, Edelman ER, Lotan C. Feasibility of remote speech analysis in evaluation of dynamic fluid overload in heart failure patients undergoing haemodialysis treatment. ESC Heart Fail 2021; 8:2467-2472. [PMID: 33955187 PMCID: PMC8318440 DOI: 10.1002/ehf2.13367] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/06/2020] [Revised: 03/02/2021] [Accepted: 04/01/2021] [Indexed: 12/02/2022] Open
Abstract
Aims: This study aimed to assess the ability of a voice analysis application to discriminate between wet and dry states in chronic heart failure (CHF) patients undergoing regularly scheduled haemodialysis treatment for volume overload resulting from chronic renal failure.
Methods and results: In this single-centre, observational study, five patients with CHF, peripheral oedema of grade ≥2, and pulmonary congestion-related dyspnoea, undergoing haemodialysis three times per week, recorded five sentences into a standard smartphone/tablet before and after haemodialysis. Further recordings were provided that same noon/early evening and the next morning and evening. Patient weight was measured at the hospital before and after each haemodialysis session. Recordings were analysed by a smartphone application (app) algorithm to compare speech measures (SMs) of utterances collected over time. On average, patients provided recordings throughout 25.8 ± 3.9 dialysis treatment cycles, resulting in a total of 472 recordings. Weight changes of 1.95 ± 0.64 kg were documented during cycles. Median baseline SM prior to dialysis was 0.87 ± 0.17; it rose to 1.07 ± 0.15 by noon following the end of the dialysis session (P = 0.0355) and remained at a similar level until the following morning (P = 0.007). By the evening of the day following dialysis, SMs returned to baseline levels (0.88 ± 0.19). Changes in patient weight immediately after dialysis correlated with SM changes, with the strongest correlation measured on the evening of the dialysis day [slope: −0.40 ± 0.15 (95% confidence interval: −0.71 to −0.10), P = 0.0096].
Conclusions: The fluid-controlled haemodialysis model demonstrated the ability of the app algorithm to identify cyclic changes in SMs reflecting bodily fluid levels. The voice analysis platform bears considerable potential as a harbinger of impending fluid overload in a range of clinical scenarios, which would enhance monitoring and triage efforts, ultimately optimizing remote CHF management.
Affiliation(s)
- Offer Amir: Department of Cardiology, Hadassah Medical Center, Faculty of Medicine, Hebrew University of Jerusalem, Jerusalem, Israel; Azrieli Faculty of Medicine, Bar-Ilan University, Safed, Israel
- Stefan D Anker: Department of Cardiology (CVK) and Berlin Institute of Health Center for Regenerative Therapies (BCRT), German Centre for Cardiovascular Research (DZHK) partner site Berlin, Charité-Universitätsmedizin Berlin, Augustenburger Platz, Berlin, D-13353, Germany
- Ittamar Gork: Department of Cardiology, Hadassah Medical Center, Faculty of Medicine, Hebrew University of Jerusalem, Jerusalem, Israel
- William T Abraham: Division of Cardiovascular Medicine, The Ohio State University, Columbus, OH, USA
- Elazer R Edelman: Institute for Medical Engineering and Science, MIT, Cambridge, MA, USA
- Chaim Lotan: Department of Cardiology, Hadassah Medical Center, Faculty of Medicine, Hebrew University of Jerusalem, Jerusalem, Israel
44
Xiao Y, Wang T, Deng W, Yang L, Zeng B, Lao X, Zhang S, Liu X, Ouyang D, Liao G, Liang Y. Data mining of an acoustic biomarker in tongue cancers and its clinical validation. Cancer Med 2021; 10:3822-3835. [PMID: 33938165] [PMCID: PMC8178493] [DOI: 10.1002/cam4.3872] [Received: 09/26/2020] [Revised: 01/30/2021] [Accepted: 03/14/2021] Open
Abstract
The promise of speech disorders as biomarkers in clinical examination has been identified in a broad spectrum of neurodegenerative diseases. However, to the best of our knowledge, a validated acoustic marker with established discriminative and evaluative properties has not yet been developed for oral tongue cancers. Here we cross-sectionally collected a screening dataset that included acoustic parameters extracted from three sustained vowels /ɑ/, /i/, /u/ and binary perceptual outcomes from 12 consonant-vowel syllables. We used a support vector machine with a linear kernel on this dataset to identify the formant centralization ratio (FCR) as a dominant predictor of different perceptual outcomes across gender and syllable. The Acoustic analysis, Perceptual evaluation and Quality of Life assessment (APeQoL) was used to validate the FCR in 33 patients with primary resectable oral tongue cancers. Measurements were taken before (pre-op) and four to six weeks after (post-op) surgery. The speech handicap index (SHI), a speech-specific questionnaire, was also administered at these time points. Pre-op correlation analysis within the APeQoL revealed overall consistency and a strong correlation between FCR and SHI scores. FCRs also increased significantly with increasing T classification pre-operatively, especially for women. Longitudinally, the main effects of T classification and the extent of resection, as well as their interaction effects with time (pre-op vs. post-op), on FCRs were all significant. For pre-operative FCR, after merging the two datasets, a cut-off value of 0.970 produced an AUC of 0.861 (95% confidence interval: 0.785-0.938) for T3-4 patients. In sum, this study determined that FCR is an acoustic marker with the potential to detect disease and related speech function in oral tongue cancers. These are preliminary findings that need to be replicated in longitudinal studies and/or larger cohorts.
Affiliation(s)
- Yudong Xiao, Tao Wang, Wei Deng, Le Yang, Bin Zeng, Xiaomei Lao, Sien Zhang, Xiangqi Liu, Daiqiao Ouyang, Guiqing Liao, Yujie Liang: Department of Oral and Maxillofacial Surgery, Guanghua School of Stomatology, Guangdong Provincial Key Laboratory of Stomatology, Sun Yat-sen University, Guangzhou, China
45
Alemayehu D, Hemmings R, Natarajan K, Roychoudhury S. Perspectives on Virtual (Remote) Clinical Trials as the "New Normal" to Accelerate Drug Development. Clin Pharmacol Ther 2021; 111:373-381. [PMID: 33792920] [DOI: 10.1002/cpt.2248] [Received: 01/13/2021] [Accepted: 03/12/2021]
Abstract
Although the digital revolution has transformed many areas of human endeavor, pharmaceutical drug development has been relatively slow to embrace emerging technologies to enhance efficiency and optimize value in clinical trials. The topic has garnered even greater attention in the face of the coronavirus disease 2019 (COVID-19) outbreak, which has caused unprecedented disruption in the conduct of clinical trials and presented considerable challenges and opportunities for clinical trialists and data analysts. In this paper, we highlight virtual or digital clinical trials as viable options to enhance efficiency in drug development and, more importantly, to offer diverse patients easier and more attractive means of participating in clinical trials. Special reference is made to the implications of artificial intelligence and machine-learning tools for trial execution and data acquisition, processing, and analysis in a virtual trial setting. Issues of patient safety, measurement validity, and data integrity are reviewed, and considerations are put forth regarding the mitigation of underlying regulatory and operational barriers.
46
Mohammadi Y, Moradi MH. Prediction of Depression Severity Scores Based on Functional Connectivity and Complexity of the EEG Signal. Clin EEG Neurosci 2021; 52:52-60. [PMID: 33040603] [DOI: 10.1177/1550059420965431]
Abstract
Background: Depression is one of the most common mental disorders and the leading cause of functional disability. This study aims to determine whether functional connectivity and complexity of brain activity can predict the severity of depression (Beck Depression Inventory-II scores).
Methods: Resting-state, eyes-closed EEG data were recorded from 60 depressed patients. A phase synchronization measure was used to estimate functional connectivity between all pairs of EEG channels in the delta (1-4 Hz), theta (4-8 Hz), alpha (8-13 Hz), and beta (13-30 Hz) frequency bands. To quantify the local value of functional connectivity, two graph-theory metrics, degree and clustering coefficient (CC), were measured. In addition, Lempel-Ziv complexity (LZC) and fuzzy entropy (FuzzyEn) were used to measure the complexity of the EEG signal.
Results: Correlation analysis revealed a significant negative relationship between the graph metrics and depression severity in the alpha band. This association was strongly positive for the complexity measures in the alpha and delta bands. A linear regression model also predicted depression severity well from EEG features of the alpha band (r = 0.839; P < .0001; root mean square error of 7.69).
Conclusion: We found that the brain activity of patients with depression was related to depression severity: abnormal brain activity reflects an increase in the severity of depression. The presented regression model provides a quantitative prediction of depression severity, which may inform EEG-based assessment and has potential application in the medical treatment of depressive disorder.
Affiliation(s)
- Yousef Mohammadi: Biomedical Engineering Department, Amirkabir University of Technology, Tehran, Islamic Republic of Iran
- Mohammad Hassan Moradi: Biomedical Engineering Department, Amirkabir University of Technology, Tehran, Islamic Republic of Iran
47
Analysis of gender and identity issues in depression detection on de-identified speech. Comput Speech Lang 2021. [DOI: 10.1016/j.csl.2020.101118]
48
Solomon DH, Rudin RS. Digital health technologies: opportunities and challenges in rheumatology. Nat Rev Rheumatol 2020; 16:525-535. [PMID: 32709998] [DOI: 10.1038/s41584-020-0461-x] [Accepted: 06/24/2020]
Abstract
The past decade in rheumatology has seen tremendous innovation in digital health technologies, including the electronic health record, virtual visits, mobile health, wearable technology, digital therapeutics, artificial intelligence and machine learning. The increased availability of these technologies offers opportunities for improving important aspects of rheumatology, including access, outcomes, adherence and research. However, despite its growth in some areas, particularly with non-health-care consumers, digital health technology has not substantially changed the delivery of rheumatology care. This Review discusses key barriers and opportunities to improve application of digital health technologies in rheumatology. Key topics include smart design, voice enablement and the integration of electronic patient-reported outcomes. Smart design involves active engagement with the end users of the technologies, including patients and clinicians through focus groups, user testing sessions and prototype review. Voice enablement using voice assistants could be critical for enabling patients with hand arthritis to effectively use smartphone apps and might facilitate patient engagement with many technologies. Tracking many rheumatic diseases requires frequent monitoring of patient-reported outcomes. Current practice only collects this information sporadically, and rarely between visits. Digital health technology could enable patient-reported outcomes to inform appropriate timing of face-to-face visits and enable improved application of treat-to-target strategies. However, best practice standards for digital health technologies do not yet exist. To achieve the potential of digital health technology in rheumatology, rheumatology professionals will need to be more engaged upstream in the technology design process and provide leadership to effectively incorporate the new tools into clinical care.
Affiliation(s)
- Daniel H Solomon: Brigham and Women's Hospital, Harvard Medical School, Boston, MA, USA
49
Su C, Xu Z, Pathak J, Wang F. Deep learning in mental health outcome research: a scoping review. Transl Psychiatry 2020; 10:116. [PMID: 32532967] [PMCID: PMC7293215] [DOI: 10.1038/s41398-020-0780-3] [Received: 08/31/2019] [Revised: 02/17/2020] [Accepted: 02/26/2020] Open
Abstract
Mental illnesses, such as depression, are highly prevalent and have been shown to impact an individual's physical health. Recently, artificial intelligence (AI) methods have been introduced to assist mental health providers, including psychiatrists and psychologists, in decision-making based on patients' historical data (e.g., medical records, behavioral data, social media usage, etc.). Deep learning (DL), one of the most recent generations of AI technology, has demonstrated superior performance in many real-world applications ranging from computer vision to healthcare. The goal of this study is to review existing research on applications of DL algorithms in mental health outcome research. Specifically, we first briefly review the state-of-the-art DL techniques. We then survey the literature on DL applications in mental health outcomes. According to the application scenario, we categorize the relevant articles into four groups: diagnosis and prognosis based on clinical data, analysis of genetics and genomics data for understanding mental health conditions, vocal and visual expression data analysis for disease detection, and estimation of the risk of mental illness using social media data. Finally, we discuss challenges in using DL algorithms to improve our understanding of mental health conditions and suggest several promising directions for their application in improving mental health diagnosis and treatment.
Affiliation(s)
- Chang Su, Zhenxing Xu, Jyotishman Pathak, Fei Wang: Department of Healthcare Policy and Research, Weill Cornell Medicine, New York, NY, USA
50