1
Zhang Z, Zhang S, Ni D, Wei Z, Yang K, Jin S, Huang G, Liang Z, Zhang L, Li L, Ding H, Zhang Z, Wang J. Multimodal Sensing for Depression Risk Detection: Integrating Audio, Video, and Text Data. Sensors (Basel) 2024; 24:3714. [PMID: 38931497; PMCID: PMC11207438; DOI: 10.3390/s24123714]
Abstract
Depression is a major psychological disorder with a growing impact worldwide. Traditional methods for detecting the risk of depression, predominantly reliant on psychiatric evaluations and self-assessment questionnaires, are often criticized for their inefficiency and lack of objectivity. Advancements in deep learning have paved the way for innovations in depression risk detection methods that fuse multimodal data. This paper introduces a novel framework, the Audio, Video, and Text Fusion-Three Branch Network (AVTF-TBN), designed to amalgamate auditory, visual, and textual cues for a comprehensive analysis of depression risk. Our approach encompasses three dedicated branches-Audio Branch, Video Branch, and Text Branch-each responsible for extracting salient features from the corresponding modality. These features are subsequently fused through a multimodal fusion (MMF) module, yielding a robust feature vector that feeds into a predictive modeling layer. To further our research, we devised an emotion elicitation paradigm based on two distinct tasks-reading and interviewing-implemented to gather a rich, sensor-based depression risk detection dataset. The sensory equipment, such as cameras, captures subtle facial expressions and vocal characteristics essential for our analysis. The research thoroughly investigates the data generated by varying emotional stimuli and evaluates the contribution of different tasks to emotion evocation. During the experiment, the AVTF-TBN model has the best performance when the data from the two tasks are simultaneously used for detection, where the F1 Score is 0.78, Precision is 0.76, and Recall is 0.81. Our experimental results confirm the validity of the paradigm and demonstrate the efficacy of the AVTF-TBN model in detecting depression risk, showcasing the crucial role of sensor-based data in mental health detection.
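The branch-then-fuse design summarized above (per-modality feature extraction, concatenation in a fusion module, then a predictive layer) can be sketched in miniature. Everything below is an illustrative stand-in: the branch functions, weights, and inputs are hypothetical placeholders, not the actual AVTF-TBN networks or dataset.

```python
# Hypothetical sketch of three-branch multimodal fusion: each branch maps
# its modality to a small feature vector, the vectors are concatenated,
# and a linear layer produces a risk score. Branch internals are toy
# stand-ins for the paper's audio, video, and text networks.

def audio_branch(waveform):
    # Stand-in acoustic features: mean and peak amplitude.
    return [sum(waveform) / len(waveform), max(waveform)]

def video_branch(frames):
    # Stand-in visual feature: per-frame mean intensity, averaged.
    means = [sum(f) / len(f) for f in frames]
    return [sum(means) / len(means)]

def text_branch(tokens):
    # Stand-in lexical feature: average token length.
    return [sum(len(t) for t in tokens) / len(tokens)]

def fuse_and_score(audio, video, text, weights, bias=0.0):
    # Fusion by concatenation followed by a linear scoring layer.
    fused = audio_branch(audio) + video_branch(video) + text_branch(text)
    return sum(w * x for w, x in zip(weights, fused)) + bias

score = fuse_and_score(
    audio=[0.1, 0.4, 0.3],
    video=[[10, 20], [30, 40]],
    text=["low", "mood"],
    weights=[0.5, 0.2, 0.01, 0.3],
)
print(round(score, 3))
```

In a real system each branch would be a trained network and the linear layer would be learned; the shape of the computation (extract per modality, concatenate, predict) is the point of the sketch.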
Affiliation(s)
- Zhenwei Zhang
- School of Biomedical Engineering, Health Science Center, Shenzhen University, Shenzhen 518060, China
- Guangdong Provincial Key Laboratory of Biomedical Measurements and Ultrasound Imaging, Shenzhen 518060, China
- Shengming Zhang
- Affiliated Mental Health Center, Southern University of Science and Technology, Shenzhen 518055, China
- Dong Ni
- School of Biomedical Engineering, Health Science Center, Shenzhen University, Shenzhen 518060, China
- Guangdong Provincial Key Laboratory of Biomedical Measurements and Ultrasound Imaging, Shenzhen 518060, China
- Zhaoguo Wei
- Shenzhen Kangning Hospital, Shenzhen 518020, China
- Shenzhen Mental Health Center, Shenzhen 518020, China
- Kongjun Yang
- Shenzhen Kangning Hospital, Shenzhen 518020, China
- Shenzhen Mental Health Center, Shenzhen 518020, China
- Shan Jin
- Shenzhen Kangning Hospital, Shenzhen 518020, China
- Shenzhen Mental Health Center, Shenzhen 518020, China
- Gan Huang
- School of Biomedical Engineering, Health Science Center, Shenzhen University, Shenzhen 518060, China
- Guangdong Provincial Key Laboratory of Biomedical Measurements and Ultrasound Imaging, Shenzhen 518060, China
- Zhen Liang
- School of Biomedical Engineering, Health Science Center, Shenzhen University, Shenzhen 518060, China
- Guangdong Provincial Key Laboratory of Biomedical Measurements and Ultrasound Imaging, Shenzhen 518060, China
- Li Zhang
- School of Biomedical Engineering, Health Science Center, Shenzhen University, Shenzhen 518060, China
- Guangdong Provincial Key Laboratory of Biomedical Measurements and Ultrasound Imaging, Shenzhen 518060, China
- Linling Li
- School of Biomedical Engineering, Health Science Center, Shenzhen University, Shenzhen 518060, China
- Guangdong Provincial Key Laboratory of Biomedical Measurements and Ultrasound Imaging, Shenzhen 518060, China
- Huijun Ding
- School of Biomedical Engineering, Health Science Center, Shenzhen University, Shenzhen 518060, China
- Guangdong Provincial Key Laboratory of Biomedical Measurements and Ultrasound Imaging, Shenzhen 518060, China
- Zhiguo Zhang
- School of Computer Science and Technology, Harbin Institute of Technology, Shenzhen 518055, China
- Peng Cheng Laboratory, Shenzhen 518055, China
- Jianhong Wang
- Shenzhen Kangning Hospital, Shenzhen 518020, China
- Shenzhen Mental Health Center, Shenzhen 518020, China
2
Yan T, Chen G, Zhang H, Wang G, Yan Z, Li Y, Xu S, Zhou Q, Shi R, Tian Z, Wang B. Convolutional neural network with parallel convolution scale attention module and ResCBAM for breast histology image classification. Heliyon 2024; 10:e30889. [PMID: 38770292; PMCID: PMC11103517; DOI: 10.1016/j.heliyon.2024.e30889]
Abstract
Breast cancer is the leading cause of morbidity and death among women worldwide. Compared with other cancers, early detection of breast cancer offers a greater improvement in patient prognosis, so clinical care requires rapid and accurate diagnosis. Developing an automatic breast cancer detection system suited to patient imaging is therefore of great significance for assisting clinical treatment. Accurate classification of pathological images plays a key role in computer-aided medical diagnosis and prognosis. However, in automatic recognition and classification of breast cancer pathological images, loss of scale information, loss of image information caused by insufficient feature fusion, and very large model structures can lead to inaccurate or inefficient classification. To minimize these effects, we propose a lightweight PCSAM-ResCBAM model based on a two-stage convolutional neural network. The model comprises a Parallel Convolution Scale Attention Module network (PCSAM-Net) and a Residual Convolutional Block Attention Module network (ResCBAM-Net). The first-stage convolutional network is built from a four-layer PCSAM module to predict and classify patches extracted from images. To improve the network's ability to represent global image features, we propose a tiled feature fusion method that fuses patch features from the same image, along with a residual convolutional attention module. On this basis, the second-stage convolutional network classifies whole images. We evaluated the proposed model on the ICIAR2018 and BreakHis datasets. Through ablation studies, we found that scale attention and dilated convolution play an important role in improving model performance. Our proposed model outperforms existing state-of-the-art models on the 200× and 400× magnification datasets, with a maximum accuracy of 98.74%.
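The two-stage idea described above (score patches first, then fuse patch-level results into an image-level decision) can be sketched as follows. The patch scorer here is a placeholder threshold rule, not the PCSAM or ResCBAM networks, and the fusion is simple vote averaging rather than the paper's tiled feature fusion.

```python
# Hypothetical two-stage patch-then-image classifier: stage one scores
# each patch, stage two fuses patch scores into one image label.

def score_patch(patch, threshold=0.5):
    # Placeholder patch classifier: mean intensity vs. a threshold.
    return 1.0 if sum(patch) / len(patch) > threshold else 0.0

def classify_image(patches, threshold=0.5):
    # Second stage: fuse patch-level scores into an image-level label
    # by majority of positive patches.
    scores = [score_patch(p, threshold) for p in patches]
    return 1 if sum(scores) / len(scores) >= 0.5 else 0

patches = [[0.9, 0.8], [0.2, 0.1], [0.7, 0.6]]
label = classify_image(patches)
print(label)
```

The actual model fuses learned patch *features* (not hard labels) before the second-stage network, which preserves more information than the voting shown here.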
Affiliation(s)
- Ting Yan
- Translational Medicine Research Center, Shanxi Medical University, Taiyuan, China
- Guohui Chen
- Translational Medicine Research Center, Shanxi Medical University, Taiyuan, China
- Huimin Zhang
- College of Information and Computer, Taiyuan University of Technology, Taiyuan, China
- Guolan Wang
- Computer Information Engineering Institute, Shanxi Technology and Business College, Taiyuan, China
- Zhenpeng Yan
- Translational Medicine Research Center, Shanxi Medical University, Taiyuan, China
- Ying Li
- College of Information and Computer, Taiyuan University of Technology, Taiyuan, China
- Songrui Xu
- Translational Medicine Research Center, Shanxi Medical University, Taiyuan, China
- Qichao Zhou
- Translational Medicine Research Center, Shanxi Medical University, Taiyuan, China
- Ruyi Shi
- Department of Cell Biology and Genetics, Shanxi Medical University, Taiyuan, Shanxi, 030001, China
- Zhi Tian
- Second Clinical Medical College, Shanxi Medical University, 382 Wuyi Road, Taiyuan, Shanxi, 030001, China
- Department of Orthopedics, The Second Hospital of Shanxi Medical University, Shanxi Key Laboratory of Bone and Soft Tissue Injury Repair, 382 Wuyi Road, Taiyuan, Shanxi, 030001, China
- Bin Wang
- College of Information and Computer, Taiyuan University of Technology, Taiyuan, China
3
Han MM, Li XY, Yi XY, Zheng YS, Xia WL, Liu YF, Wang QX. Automatic recognition of depression based on audio and video: A review. World J Psychiatry 2024; 14:225-233. [PMID: 38464777; PMCID: PMC10921287; DOI: 10.5498/wjp.v14.i2.225]
Abstract
Depression is a common mental health disorder. With current depression detection methods, specialized physicians often engage in conversations and physiological examinations based on standardized scales as auxiliary measures for depression assessment. Non-biological markers-typically classified as verbal or non-verbal and deemed crucial evaluation criteria for depression-have not been effectively utilized. Specialized physicians usually require extensive training and experience to capture changes in these features. Advancements in deep learning technology have provided technical support for capturing non-biological markers. Several researchers have proposed automatic depression estimation (ADE) systems based on sounds and videos to assist physicians in capturing these features and conducting depression screening. This article summarizes commonly used public datasets and recent research on audio- and video-based ADE based on three perspectives: Datasets, deficiencies in existing research, and future development directions.
Affiliation(s)
- Meng-Meng Han
- Shandong Mental Health Center, Shandong University, Jinan 250014, Shandong Province, China
- Key Laboratory of Computing Power Network and Information Security, Ministry of Education, Shandong Computer Science Center (National Supercomputer Center in Jinan), Qilu University of Technology (Shandong Academy of Sciences), Jinan 250353, Shandong Province, China
- Xing-Yun Li
- Key Laboratory of Computing Power Network and Information Security, Ministry of Education, Shandong Computer Science Center (National Supercomputer Center in Jinan), Qilu University of Technology (Shandong Academy of Sciences), Jinan 250353, Shandong Province, China
- Shandong Engineering Research Center of Big Data Applied Technology, Faculty of Computer Science and Technology, Qilu University of Technology (Shandong Academy of Sciences), Jinan 250353, Shandong Province, China
- Shandong Provincial Key Laboratory of Computer Networks, Shandong Fundamental Research Center for Computer Science, Jinan 250353, Shandong Province, China
- Xin-Yu Yi
- Key Laboratory of Computing Power Network and Information Security, Ministry of Education, Shandong Computer Science Center (National Supercomputer Center in Jinan), Qilu University of Technology (Shandong Academy of Sciences), Jinan 250353, Shandong Province, China
- Shandong Engineering Research Center of Big Data Applied Technology, Faculty of Computer Science and Technology, Qilu University of Technology (Shandong Academy of Sciences), Jinan 250353, Shandong Province, China
- Shandong Provincial Key Laboratory of Computer Networks, Shandong Fundamental Research Center for Computer Science, Jinan 250353, Shandong Province, China
- Yun-Shao Zheng
- Department of Ward Two, Shandong Mental Health Center, Shandong University, Jinan 250014, Shandong Province, China
- Wei-Li Xia
- Shandong Mental Health Center, Shandong University, Jinan 250014, Shandong Province, China
- Ya-Fei Liu
- Shandong Mental Health Center, Shandong University, Jinan 250014, Shandong Province, China
- Qing-Xiang Wang
- Shandong Mental Health Center, Shandong University, Jinan 250014, Shandong Province, China
4
Li X, Yi X, Ye J, Zheng Y, Wang Q. SFTNet: A microexpression-based method for depression detection. Comput Methods Programs Biomed 2024; 243:107923. [PMID: 37989077; DOI: 10.1016/j.cmpb.2023.107923]
Abstract
BACKGROUND AND OBJECTIVES Depression is a typical mental illness, and early screening can effectively prevent exacerbation of the condition. Many studies have found that the expressions of depressed patients differ from those of other subjects, and microexpressions have been used in the clinical detection of mental illness. However, there are few methods for the automatic detection of depression based on microexpressions. METHODS A new dataset of 156 participants (76 in the case group and 80 in the control group) was created. All data were collected in the context of a new emotional stimulation experiment and doctor-patient conversation. We first analyzed the Average Number of Occurrences (ANO) and Average Duration (AD) of facial expressions in the case group and the control group. Then, we proposed SFTNet, a two-stream model for identifying depression based on microexpressions, which consists of a single-temporal network (STNet), a full-temporal network (FTNet), and a decision network. STNet extracts features from facial images at a single time node, FTNet extracts features from all time nodes, and the decision network combines the two features to identify depression through decision fusion. The code for SFTNet is available at https://github.com/muzixingyun/SFTNet. RESULTS We found that the AD of all subjects was less than 20 frames (2/3 of a second) and that the facial expressions of the control group were richer. SFTNet achieved excellent results on the emotional stimulus experimental dataset, with Accuracy, Precision and Recall of 0.873, 0.888 and 0.846, respectively. We also conducted experiments on the doctor-patient conversation dataset, where Accuracy, Precision and Recall were 0.829, 0.817 and 0.837, respectively. SFTNet can also be applied to the microexpression detection task with higher accuracy than SOTA models. CONCLUSIONS In the emotional stimulation experiment, subjects in the case group were more likely to show negative emotions. Compared to SOTA models, our depression detection method is more accurate and can assist doctors in the diagnosis of depression.
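The two-stream decision-fusion structure described above (one stream for a single time node, one for all time nodes, combined by a decision layer) can be sketched as follows. The scoring functions are illustrative stand-ins, not the STNet/FTNet architectures, and simple probability averaging stands in for the learned decision network.

```python
# Hypothetical two-stream decision fusion: one stream scores a single
# frame, the other scores the whole frame sequence, and the decision
# step averages the two probabilities (late fusion).

def single_node_score(frame):
    # Stand-in for a single-time-node stream: clamp the frame's mean
    # feature value into [0, 1] as a pseudo-probability.
    return min(1.0, max(0.0, sum(frame) / len(frame)))

def full_sequence_score(frames):
    # Stand-in for a full-temporal stream: average over all frames.
    per_frame = [sum(f) / len(f) for f in frames]
    return min(1.0, max(0.0, sum(per_frame) / len(per_frame)))

def decision_fusion(frames, key_index=0):
    # Combine the two streams' outputs by averaging.
    return 0.5 * (single_node_score(frames[key_index])
                  + full_sequence_score(frames))

frames = [[0.6, 0.8], [0.2, 0.4], [0.5, 0.5]]
prob = decision_fusion(frames)
print(round(prob, 2))
```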
Affiliation(s)
- Xingyun Li
- Faculty of Computer Science and Technology, Qilu University of Technology (Shandong Academy of Sciences), Jinan 250353, China
- Xinyu Yi
- Faculty of Computer Science and Technology, Qilu University of Technology (Shandong Academy of Sciences), Jinan 250353, China
- Jiayu Ye
- Faculty of Computer Science and Technology, Qilu University of Technology (Shandong Academy of Sciences), Jinan 250353, China
5
Bhadra S, Kumar CJ. Enhancing the efficacy of depression detection system using optimal feature selection from EHR. Comput Methods Biomech Biomed Engin 2024; 27:222-236. [PMID: 36820618; DOI: 10.1080/10255842.2023.2181660]
Abstract
Diagnosing depression at an early stage is crucial and majorly depends on the clinician's skill. The present work aims to develop an automated tool for assisting the diagnostic procedure of depression using multiple machine-learning techniques. The dataset of sample size 4184 used in this study contains biometric and demographic information of individuals with or without depression, accessed from the University of Nice Sophia-Antipolis. The Artificial Neural Network (ANN), Support Vector Machine (SVM), Random Forest (RF) and Extreme Gradient Boosting (XGBoost) are used for classifying the depressed from the control group. To enhance the computational efficiency, various feature selection algorithms like Recursive Feature Elimination (RFE), Mutual Information (MI) and three bio-inspired techniques, viz. Particle Swarm Optimization (PSO), Genetic Algorithm (GA) and Firefly Algorithms (FA) have been incorporated. To enhance the feature selection process further, majority voting is carried out in all possible combinations of three, four and five feature selection techniques. These feature selection techniques bring down the feature set size significantly to a mean of 33 from the actual size of 61 which is a reduction of 45.90%. The classification accuracy of the enhanced model varies between 84.18% and 88.46%, which is a significant improvement in performance as compared to the pre-existing models (83.76-85.89%). The proposed predictive models outperform the pre-existing classification models without feature selection and thereby enhancing both the performance and efficiency of the diagnostic process.
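The majority-voting step described above (keep a feature when enough of the selection algorithms agree on it) can be sketched as follows. The three selector outputs below are made-up examples, not results from RFE, MI, PSO, GA, or FA on the actual dataset.

```python
# Hypothetical majority voting across feature selectors: each selector
# contributes a set of chosen feature indices, and a feature survives
# when its vote count reaches a quorum (default: at least half of the
# selectors, rounded up).

def majority_vote(selections, n_features, quorum=None):
    # selections: list of sets of selected feature indices.
    quorum = quorum if quorum is not None else (len(selections) + 1) // 2
    votes = [0] * n_features
    for chosen in selections:
        for idx in chosen:
            votes[idx] += 1
    return sorted(i for i, v in enumerate(votes) if v >= quorum)

kept = majority_vote(
    selections=[{0, 1, 3}, {1, 2, 3}, {1, 3, 4}],
    n_features=5,
)
print(kept)
```

With these toy inputs only the features chosen by at least two of the three selectors survive, which mirrors how voting shrinks the feature set before classification.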
Affiliation(s)
- Sweta Bhadra
- Department of Computer Science and Information Technology, Cotton University, Guwahati, India
- Chandan Jyoti Kumar
- Department of Computer Science and Information Technology, Cotton University, Guwahati, India
6
Li Z, An Z, Cheng W, Zhou J, Zheng F, Hu B. MHA: a multimodal hierarchical attention model for depression detection in social media. Health Inf Sci Syst 2023; 11:6. [PMID: 36660408; PMCID: PMC9846704; DOI: 10.1007/s13755-022-00197-5]
Abstract
As a serious mental disease, depression causes great harm to the physical and mental health of individuals and has become an important cause of suicide. It is therefore necessary to accurately identify and treat depressed patients. Compared with traditional clinical diagnosis methods, the large amount of real, heterogeneous data on social media provides new ideas for depression detection research. In this paper, we construct a depression detection dataset based on Weibo and propose a Multimodal Hierarchical Attention (MHA) model for social media depression detection. Multimodal data are fed into the model, and the attention mechanism is applied within and between modalities simultaneously. Experimental results show that the proposed model achieves the best classification performance. In addition, we propose a distribution normalization method that optimizes the data distribution and improves the accuracy of depression detection.
Affiliation(s)
- Zepeng Li
- Gansu Provincial Key Laboratory of Wearable Computing, School of Information Science and Engineering, Lanzhou University, Lanzhou 730000, Gansu, China
- Zhengyi An
- Gansu Provincial Key Laboratory of Wearable Computing, School of Information Science and Engineering, Lanzhou University, Lanzhou 730000, Gansu, China
- Wenchuan Cheng
- Gansu Provincial Key Laboratory of Wearable Computing, School of Information Science and Engineering, Lanzhou University, Lanzhou 730000, Gansu, China
- Jiawei Zhou
- Gansu Provincial Key Laboratory of Wearable Computing, School of Information Science and Engineering, Lanzhou University, Lanzhou 730000, Gansu, China
- Fang Zheng
- Gansu Provincial Key Laboratory of Wearable Computing, School of Information Science and Engineering, Lanzhou University, Lanzhou 730000, Gansu, China
- Bin Hu
- Gansu Provincial Key Laboratory of Wearable Computing, School of Information Science and Engineering, Lanzhou University, Lanzhou 730000, Gansu, China
- School of Medical Technology, Beijing Institute of Technology, Beijing 100081, China
- CAS Center for Excellence in Brain Science and Intelligence Technology, Shanghai Institutes for Biological Sciences, Chinese Academy of Sciences, Shanghai 200000, China
7
Li Y, Liu Z, Zhou L, Yuan X, Shangguan Z, Hu X, Hu B. A facial depression recognition method based on hybrid multi-head cross attention network. Front Neurosci 2023; 17:1188434. [PMID: 37292164; PMCID: PMC10244529; DOI: 10.3389/fnins.2023.1188434]
Abstract
Introduction Deep-learning methods based on convolutional neural networks (CNNs) have demonstrated impressive performance in depression analysis. Nevertheless, some critical challenges need to be resolved in these methods: (1) Because of spatial locality, it is still difficult for CNNs to learn long-range inductive biases in the low-level feature extraction of different facial regions. (2) It is difficult for a model with only a single attention head to concentrate on various parts of the face simultaneously, making it less sensitive to other important facial regions associated with depression. In facial depression recognition, many of the clues come from a few areas of the face simultaneously, e.g., the mouth and eyes. Methods To address these issues, we present an end-to-end integrated framework called the Hybrid Multi-head Cross Attention Network (HMHN), which includes two stages. The first stage consists of the Grid-Wise Attention block (GWA) and the Deep Feature Fusion block (DFF) for low-level visual depression feature learning. In the second stage, we obtain the global representation by encoding high-order interactions among local features with the Multi-head Cross Attention block (MAB) and the Attention Fusion block (AFB). Results We experimented on the AVEC2013 and AVEC2014 depression datasets. The results on AVEC 2013 (RMSE = 7.38, MAE = 6.05) and AVEC 2014 (RMSE = 7.60, MAE = 6.01) demonstrate the efficacy of our method, which outperformed most state-of-the-art video-based depression recognition approaches. Discussion We propose a deep learning hybrid model for depression recognition that captures the higher-order interactions between the depression features of multiple facial regions, which can effectively reduce the error in depression recognition and shows great potential for clinical experiments.
8
Liu Z, Yuan X, Li Y, Shangguan Z, Zhou L, Hu B. PRA-Net: Part-and-Relation Attention Network for depression recognition from facial expression. Comput Biol Med 2023; 157:106589. [PMID: 36934531; DOI: 10.1016/j.compbiomed.2023.106589]
Abstract
Artificial intelligence methods are widely applied to depression recognition and provide an objective solution. Many effective automated methods for detecting depression use facial expressions, which are strong indicators of psychiatric disorders. However, these methods suffer from insufficient representations of depression. To this end, we propose a novel Part-and-Relation Attention Network (PRA-Net), which can enhance depression representations by accurately focusing on features that are highly correlated with depression. Specifically, we first partition the feature map instead of the original image, in order to obtain part features rich in semantic information. Afterwards, self-attention is used to calculate the weight of each part feature. Next, the relationship between each part feature and the global content representation is explored by relation attention to refine the weight. Finally, all features are aggregated via both weights into a more compact and depression-informative representation for depression score prediction. Extensive experiments demonstrate the superiority of our method. Compared to other end-to-end methods, our method achieves state-of-the-art performance on AVEC2013 and AVEC2014.
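The weighted-aggregation step described above (assign each part feature a weight, then sum the weighted parts into one representation) can be sketched as follows. The relevance scores and part features are illustrative placeholders; in PRA-Net the weights come from learned self-attention and relation attention, not from fixed numbers.

```python
# Hypothetical attention-weighted aggregation of part features: a
# softmax over relevance scores produces the weights, and the weighted
# sum of part features forms the final representation.
import math

def softmax(scores):
    # Standard softmax: exponentiate and normalize to sum to 1.
    exps = [math.exp(s) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def aggregate(part_features, relevance_scores):
    # Weighted sum of part feature vectors.
    weights = softmax(relevance_scores)
    dim = len(part_features[0])
    return [sum(w * f[i] for w, f in zip(weights, part_features))
            for i in range(dim)]

parts = [[1.0, 0.0], [0.0, 1.0]]
rep = aggregate(parts, relevance_scores=[0.0, 0.0])
print([round(x, 2) for x in rep])
```

With equal relevance scores the softmax yields uniform weights, so the representation is the plain average of the parts; unequal scores would shift the representation toward the more depression-relevant regions.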
Affiliation(s)
- Zhenyu Liu
- Gansu Provincial Key Laboratory of Wearable Computing, School of Information Science and Engineering, Lanzhou University, Lanzhou, China
- Xiaoyan Yuan
- Gansu Provincial Key Laboratory of Wearable Computing, School of Information Science and Engineering, Lanzhou University, Lanzhou, China
- Yutong Li
- Gansu Provincial Key Laboratory of Wearable Computing, School of Information Science and Engineering, Lanzhou University, Lanzhou, China
- Zixuan Shangguan
- Gansu Provincial Key Laboratory of Wearable Computing, School of Information Science and Engineering, Lanzhou University, Lanzhou, China
- Li Zhou
- Gansu Provincial Key Laboratory of Wearable Computing, School of Information Science and Engineering, Lanzhou University, Lanzhou, China
- Bin Hu
- Gansu Provincial Key Laboratory of Wearable Computing, School of Information Science and Engineering, Lanzhou University, Lanzhou, China
9
Wu H, Zhang Z, Li X, Shang K, Han Y, Geng Z, Pan T. A novel pedal musculoskeletal response based on differential spatio-temporal LSTM for human activity recognition. Knowl Based Syst 2022. [DOI: 10.1016/j.knosys.2022.110187]
10
Othmani A, Zeghina AO, Muzammel M. A Model of Normality Inspired Deep Learning Framework for Depression Relapse Prediction Using Audiovisual Data. Comput Methods Programs Biomed 2022; 226:107132. [PMID: 36183638; DOI: 10.1016/j.cmpb.2022.107132]
Abstract
BACKGROUND Depression (Major Depressive Disorder) is one of the most common mental illnesses. According to the World Health Organization, more than 300 million people in the world are affected. A first depressive episode can resolve through spontaneous remission within 6 to 12 months. It has been shown that depression affects speech production and facial expressions. Although numerous studies in the literature address depression recognition using audiovisual cues, depression relapse prediction from audiovisual cues has not been studied. METHOD In this paper, we propose a deep learning-based approach for depression recognition and depression relapse prediction using audiovisual data. For more versatility and reusability, the proposed approach is based on a Model of Normality inspired framework, in which we define depression relapse by the closeness of the audiovisual patterns of a subject after a symptom-free period to the audiovisual patterns of depressed subjects. A Model of Normality is a distance-based anomaly detection approach that computes a distance of normality between the deep audiovisual encoding of a test sample and a representation learned from audiovisual encodings of anomaly-free data. RESULTS The proposed approach shows very promising results, with an accuracy of 87.4% and an F1-score of 82.3% for relapse/depression prediction using a Leave-One-Subject-Out training strategy on the DAIC-WOZ dataset. CONCLUSION The proposed model of normality-based framework is accurate in detecting depression and in predicting depression relapse. A prospective monitoring system is proposed for assisting depressed patients. The framework is easily extensible, and other modalities will be integrated in future work.
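The distance-based "model of normality" idea described above can be sketched as follows: summarize anomaly-free embeddings by their centroid, then flag a test embedding whose distance to that centroid exceeds a threshold. The embeddings and threshold below are illustrative placeholders, not the paper's learned audiovisual encodings.

```python
# Hypothetical model-of-normality anomaly detector: distance from a
# test embedding to the centroid of normal embeddings, compared
# against a threshold.
import math

def centroid(embeddings):
    # Component-wise mean of the normal (anomaly-free) embeddings.
    dim = len(embeddings[0])
    return [sum(e[i] for e in embeddings) / len(embeddings)
            for i in range(dim)]

def distance(a, b):
    # Euclidean distance between two embeddings.
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def is_anomalous(test_emb, normal_embs, threshold):
    # Far from the normal centroid => flagged as anomalous.
    return distance(test_emb, centroid(normal_embs)) > threshold

normal = [[0.0, 0.0], [0.2, 0.0], [0.0, 0.2]]
flag = is_anomalous([1.0, 1.0], normal, threshold=0.5)
print(flag)
```

In the paper the "normal" representation is learned from deep audiovisual encodings rather than being a raw centroid, but the decision rule has this same distance-then-threshold shape.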
Affiliation(s)
- Alice Othmani
- Université Paris-Est Créteil (UPEC), LISSI, Vitry sur Seine 94400, France.
- Muhammad Muzammel
- Université Paris-Est Créteil (UPEC), LISSI, Vitry sur Seine 94400, France
11
Aria M, Hashemzadeh M, Farajzadeh N. QDL-CMFD: A Quality-independent and deep Learning-based Copy-Move image forgery detection method. Neurocomputing 2022. [DOI: 10.1016/j.neucom.2022.09.017]
12
Dilian O, Kimmel R, Tezmah-Shahar R, Agmon M. Can We Quantify Aging-Associated Postural Changes Using Photogrammetry? A Systematic Review. Sensors (Basel) 2022; 22:6640. [PMID: 36081099; PMCID: PMC9459795; DOI: 10.3390/s22176640]
Abstract
BACKGROUND Aging is widely known to be associated with changes in standing posture. Recent advancements in computerized image processing have allowed improved analyses of several health conditions using photographs. However, photogrammetry's potential for assessing aging-associated postural changes remains unclear. Thus, the aim of this review is to evaluate the potential of photogrammetry for quantifying age-related postural changes. MATERIALS AND METHODS We searched the PubMed Central, Scopus, Embase, and SciELO databases from the beginning of records to March 2021. Inclusion criteria were: (a) participants were older adults aged ≥60; (b) standing posture was assessed by photogrammetric means. PRISMA guidelines were followed. We used the Newcastle-Ottawa Scale to assess methodological quality. RESULTS Of 946 articles reviewed, after screening and the removal of duplicates, 11 reports were found eligible for full-text assessment, of which 5 full studies met the inclusion criteria. Significant changes occurring with aging included deepening of thoracic kyphosis, flattening of lumbar lordosis, and increased sagittal inclination. CONCLUSIONS These changes agree with commonly described aging-related postural changes. However, detailed quantification of these changes was not found; the photogrammetric methods used were often unvalidated and did not adhere to known protocols. These methodological difficulties call for further studies using validated photogrammetric methods and improved research methodologies.
Affiliation(s)
- Omer Dilian
- The Cheryl Spencer School of Nursing, Faculty of Social Welfare and Health Sciences, University of Haifa, Haifa 3498838, Israel
- Ron Kimmel
- Department of Computer Science, Technion Israel Institute of Technology, Haifa 3200003, Israel
- Roy Tezmah-Shahar
- The Cheryl Spencer School of Nursing, Faculty of Social Welfare and Health Sciences, University of Haifa, Haifa 3498838, Israel
- Maayan Agmon
- The Cheryl Spencer School of Nursing, Faculty of Social Welfare and Health Sciences, University of Haifa, Haifa 3498838, Israel
13
He L, Tiwari P, Lv C, Wu W, Guo L. Reducing noisy annotations for depression estimation from facial images. Neural Netw 2022; 153:120-129. [DOI: 10.1016/j.neunet.2022.05.025] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/25/2021] [Revised: 04/17/2022] [Accepted: 05/25/2022] [Indexed: 11/28/2022]
14
Li Y, Zhong Z, Zhang F, Zhao X. Artificial Intelligence-Based Human-Computer Interaction Technology Applied in Consumer Behavior Analysis and Experiential Education. Front Psychol 2022; 13:784311. [PMID: 35465552 PMCID: PMC9020504 DOI: 10.3389/fpsyg.2022.784311] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/27/2021] [Accepted: 01/10/2022] [Indexed: 11/24/2022] Open
Abstract
In the study of consumer behavior, it is necessary to examine the relationship between consumers' psychological activities and their behavioral patterns when they acquire and use products or services. With the development of the Internet and mobile terminals, electronic commerce (E-commerce) has become an important form of consumption for people. To conduct experiential education in E-commerce consumer behavior courses, it is necessary to understand consumer satisfaction. From the perspective of E-commerce companies, this study proposes to use artificial intelligence (AI) image recognition technology to recognize and analyze consumer facial expressions. First, it analyzes the way of human-computer interaction (HCI) in the context of E-commerce and obtains consumer satisfaction with the product through HCI technology. Then, a deep neural network (DNN) is used to predict consumers' psychological behavior and consumer psychology to realize personalized product recommendations. In consumer behavior course education, this helps instructors understand consumer satisfaction and design courses accordingly. The experimental results show that consumers are highly satisfied with the products recommended by the system, with the degree of satisfaction reaching 93.2%. It is found that the DNN model can learn consumer behavior rules during evaluation, and its prediction effect is increased by 10% compared with the traditional model, which confirms the effectiveness of the recommendation system under the DNN model. This study provides a reference for consumer psychological behavior analysis based on HCI in the context of AI, which is of great significance for understanding consumer satisfaction in consumer behavior education in the context of E-commerce.
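The pipeline this abstract describes (facial-expression features in, a satisfaction score out of a deep network) can be illustrated with a minimal feed-forward sketch. The feature names, network size, and weights below are illustrative assumptions for exposition only; they are not taken from the paper.

```python
import math
import random

def sigmoid(x):
    """Logistic activation, squashing any real input into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))

class TinyMLP:
    """A one-hidden-layer network standing in for the paper's DNN."""

    def __init__(self, n_in, n_hidden, seed=0):
        rng = random.Random(seed)
        # Random untrained weights; a real system would fit these to
        # labeled satisfaction data.
        self.w1 = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)]
                   for _ in range(n_hidden)]
        self.w2 = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden)]

    def forward(self, x):
        # Hidden layer, then a single sigmoid output in (0, 1),
        # interpretable as a satisfaction probability.
        h = [sigmoid(sum(w * xi for w, xi in zip(row, x))) for row in self.w1]
        return sigmoid(sum(w * hi for w, hi in zip(self.w2, h)))

# Hypothetical expression features: [smile intensity, frown intensity, gaze hold]
model = TinyMLP(n_in=3, n_hidden=4)
score = model.forward([0.9, 0.2, 0.7])
print(score)
```

In a deployed recommender, the score would rank candidate products per user rather than being read off directly.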
Affiliation(s)
- Yanmin Li
- Pan Tianshou College of Architecture, Art and Design, Ningbo University, Ningbo, China
- Ziqi Zhong
- Department of Management, The London School of Economics and Political Science, London, United Kingdom
- Fengrui Zhang
- College of Life Science, Sichuan Agricultural University, Yaan, China
- Xinjie Zhao
- School of Software and Microelectronics, Peking University, Beijing, China
15
He L, Guo C, Tiwari P, Su R, Pandey HM, Dang W. DepNet: An automated industrial intelligent system using deep learning for video‐based depression analysis. INT J INTELL SYST 2021. [DOI: 10.1002/int.22704] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/06/2022]
Affiliation(s)
- Lang He
- School of Computer Science and Technology, Xi'an University of Posts and Telecommunications, Xi'an, Shaanxi, China
- Shaanxi Key Laboratory of Network Data Analysis and Intelligent Processing, Xi'an, Shaanxi, China
- Xi'an Key Laboratory of Big Data and Intelligent Computing, Xi'an, Shaanxi, China
- Chenguang Guo
- School of Electronics and Information, Northwestern Polytechnical University, Xi'an, Shaanxi, China
- Prayag Tiwari
- Department of Computer Science, Aalto University, Espoo, Finland
- Rui Su
- School of Foreign Languages, Northwest University, Xi'an, Shaanxi, China
- Hari Mohan Pandey
- Department of Computer Science, Edge Hill University, Ormskirk, United Kingdom
- Wei Dang
- Xi'an Mental Health Center, Xi'an, Shaanxi, China
16
Chen Q, Chaturvedi I, Ji S, Cambria E. Sequential fusion of facial appearance and dynamics for depression recognition. Pattern Recognit Lett 2021. [DOI: 10.1016/j.patrec.2021.07.005] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/24/2022]
17
Chen J, Lu W, Xue F. "Looking beneath the surface": A visual-physical feature hybrid approach for unattended gauging of construction waste composition. JOURNAL OF ENVIRONMENTAL MANAGEMENT 2021; 286:112233. [PMID: 33684803 DOI: 10.1016/j.jenvman.2021.112233] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 12/22/2020] [Revised: 02/17/2021] [Accepted: 02/19/2021] [Indexed: 06/12/2023]
Abstract
There are various scenarios challenging human experts to judge the interior of something based on limited surface information. Likewise, at waste disposal facilities around the world, human inspectors are often challenged to gauge the composition of waste bulks to determine admissibility and chargeable levy. Manual approaches are laborious, hazardous, and prone to carelessness and fatigue, making unattended gauging of construction waste composition using simple surface information highly desired. This research attempts to contribute to automated waste composition gauging by harnessing a valuable dataset from Hong Kong. Firstly, visual features, called visual inert probability (VIP), characterizing inert and non-inert materials are extracted from 1127 photos of waste bulks using a fine-tuned convolutional neural network (CNN). Then, these visual features together with easy-to-obtain physical features (e.g., weight and depth) are fed to a tailor-made support vector machine (SVM) model to determine waste composition as measured by the proportions of inert and non-inert materials. The visual-physical feature hybrid model achieved a waste composition gauging accuracy of 94% in the experiments. This high performance implies that the model, with proper adaption and integration, could replace human inspectors to smooth the operation of the waste disposal facilities.
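The hybrid step in this abstract, where a CNN-derived visual inert probability (VIP) is combined with easy-to-obtain physical features and passed to a classifier, can be sketched with a simple linear decision function standing in for the trained SVM. The weights, bias, and feature values below are illustrative assumptions, not the paper's fitted model.

```python
def hybrid_score(vip, weight_t, depth_m, w=(2.0, 0.05, -0.3), b=-1.2):
    """Linear decision function f(x) = w·x + b over the hybrid feature vector.

    vip      -- CNN-derived visual inert probability in [0, 1] (hypothetical)
    weight_t -- load weight in tonnes (physical feature)
    depth_m  -- load depth in metres (physical feature)
    A positive score indicates a predominantly inert load.
    """
    features = (vip, weight_t, depth_m)
    return sum(wi * xi for wi, xi in zip(w, features)) + b

def classify(vip, weight_t, depth_m):
    """Threshold the score into the two composition classes."""
    return "inert" if hybrid_score(vip, weight_t, depth_m) > 0 else "non-inert"

print(classify(0.92, 12.0, 1.5))  # high VIP, heavy load -> "inert"
print(classify(0.10, 2.0, 0.5))   # low VIP, light load  -> "non-inert"
```

In the paper the classifier is an SVM trained on 1127 labeled photos; the linear form above only illustrates how visual and physical features enter a single decision.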
Affiliation(s)
- Junjie Chen
- Department of Real Estate and Construction, The University of Hong Kong, Pokfulam Road, Hong Kong, China
- Weisheng Lu
- Department of Real Estate and Construction, The University of Hong Kong, Pokfulam Road, Hong Kong, China
- Fan Xue
- Department of Real Estate and Construction, The University of Hong Kong, Pokfulam Road, Hong Kong, China