Abstract
BACKGROUND
Efficient detection of depression stigma in mass media is important for designing effective stigma reduction strategies. Using linguistic analysis methods, this paper aims to build computational models for detecting stigma expressions in Chinese social media posts (Sina Weibo).
METHODS
A total of 15,879 Weibo posts with keywords were collected and analyzed. First, a content analysis was conducted on all 15,879 posts to determine whether each reflected depression stigma. Second, using four algorithms (Simple Logistic Regression, Multilayer Perceptron Neural Networks, Support Vector Machine, and Random Forest), two groups of classification models were built based on selected linguistic features: one for differentiating between posts with and without depression stigma, and one for differentiating among posts with three specific types of depression stigma.
RESULTS
First, 967 of 15,879 posts (6.09%) indicated depression stigma. Of these, 39.30%, 15.82%, and 14.99% endorsed the stigmatizing views that "People with depression are unpredictable", "Depression is a sign of personal weakness", and "Depression is not a real medical illness", respectively. Second, the highest F-Measure value for differentiating between stigma and non-stigma posts reached 75.2%, and the highest F-Measure value for differentiating among the three specific types of stigma reached 86.2%.
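For readers unfamiliar with the metric, the F-Measure can be illustrated with a minimal sketch. It assumes the common F1 form (the harmonic mean of precision and recall, with beta = 1); the abstract does not state which variant or averaging scheme was used. The function name `f_measure` and the confusion counts in the example are hypothetical and are not taken from the study; only the 967 and 15,879 figures come from the text above.

```python
def f_measure(tp: int, fp: int, fn: int, beta: float = 1.0) -> float:
    """F-measure for the positive class from confusion counts.

    Assumption: the standard F-beta definition; beta = 1 gives F1.
    """
    if tp == 0:
        # No true positives: precision and recall are both zero.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)


# Share of stigma posts reported in the abstract (967 of 15,879).
stigma_share = 967 / 15879
stigma_share_text = f"{stigma_share:.2%}"

# Hypothetical counts for illustration only (not study data):
# 30 true positives, 10 false positives, 10 false negatives.
example_f1 = f_measure(tp=30, fp=10, fn=10)

print(stigma_share_text)
print(round(example_f1, 3))
```

Because only about 6% of posts are stigmatizing, plain accuracy would look high even for a model that never flags stigma, which is one reason an F-Measure is a more informative summary for this kind of imbalanced task.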
LIMITATIONS
Because the dataset of Chinese Weibo posts is limited and imbalanced, the findings of this study may not generalize well.
CONCLUSIONS
This paper confirms that incorporating linguistic analysis methods into the online detection of stigma can improve the performance of stigma reduction programs.