1. Wang T, Xu X. The good, the bad, and the ambivalent: Extrapolating affective values for 38,000+ Chinese words via a computational model. Behav Res Methods 2024; 56:5386-5405. [PMID: 37968560 DOI: 10.3758/s13428-023-02274-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 10/16/2023] [Indexed: 11/17/2023]
Abstract
Word affective ratings are important tools in psycholinguistic research, natural language processing, and many other fields. However, even for well-studied languages, such norms are usually limited in scale. To extrapolate affective (i.e., valence and arousal) values for words in the SUBTLEX-CH database (Cai & Brysbaert, 2010, PLoS ONE, 5(6):e10729), we implemented a computational neural network which captured how words' vector-based semantic representations corresponded to the probability densities of their valence and arousal. Based on these probability density functions, we predicted not only a word's affective values, but also their respective degrees of variability that could characterize individual differences in human affective ratings. The resulting estimates of affective values largely converged with human ratings for both valence and arousal, and the estimated degrees of variability also captured important features of the variability in human ratings. We released the extrapolated affective values, together with their corresponding degrees of variability, for over 38,000 Chinese words in the Open Science Framework (https://osf.io/s9zmd/). We also discussed how the view of embodied cognition could be illuminated by this computational model.
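As a rough illustration of how a predicted probability density can yield both an affective value and a degree of variability, the sketch below summarizes a discretized density by its mean and standard deviation. The 9-point scale, the function name `summarize_density`, and the two example densities are assumptions made for illustration only; the abstract does not specify how the density is represented, and this is not the authors' implementation.

```python
import math


def summarize_density(scale_points, probabilities):
    """Return (mean, standard deviation) of a discretized rating density.

    scale_points: rating values (assumed here to be a 1-9 scale).
    probabilities: non-negative weights, one per scale point; they are
    renormalized so that they sum to 1.
    """
    if len(scale_points) != len(probabilities):
        raise ValueError("scale_points and probabilities must align")
    total = sum(probabilities)
    if total <= 0:
        raise ValueError("probabilities must have positive mass")
    probs = [p / total for p in probabilities]
    # Expected rating: the point estimate of the affective value.
    mean = sum(x * p for x, p in zip(scale_points, probs))
    # Spread of the density: a proxy for variability across raters.
    variance = sum(p * (x - mean) ** 2 for x, p in zip(scale_points, probs))
    return mean, math.sqrt(variance)


# Hypothetical densities over an assumed 9-point valence scale.
scale = list(range(1, 10))
consensual = [0, 0, 0, 0, 0.05, 0.15, 0.4, 0.3, 0.1]              # raters agree
ambivalent = [0.2, 0.15, 0.1, 0.03, 0.04, 0.03, 0.1, 0.15, 0.2]   # raters split

for label, density in (("consensual", consensual), ("ambivalent", ambivalent)):
    mean, sd = summarize_density(scale, density)
    print(label, round(mean, 2), round(sd, 2))
```

A bimodal density such as the hypothetical `ambivalent` one has a mid-scale mean but a large spread, which is why a variability estimate carries information that the mean alone does not.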
Affiliation(s)
- Tianqi Wang
  - School of Foreign Languages, Shanghai Jiao Tong University, 800 Dongchuan Rd., Shanghai, 200240, China
  - Speech Science Laboratory, The University of Hong Kong, Hong Kong, China
  - Academic Unit of Human Communication, Development, and Information Sciences, The University of Hong Kong, Hong Kong, China
- Xu Xu
  - School of Foreign Languages, Shanghai Jiao Tong University, 800 Dongchuan Rd., Shanghai, 200240, China
2. Martínez-Huertas JÁ, Jorge-Botana G, Martínez-Mingo A, Iglesias D, Olmos R. Are valence and arousal related to the development of amodal representations of words? A computational study. Cogn Emot 2023:1-9. [PMID: 37987756 DOI: 10.1080/02699931.2023.2283882] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/27/2023] [Accepted: 11/07/2023] [Indexed: 11/22/2023]
Abstract
In this study, we analyzed the relationship between the amodal (semantic) development of words and two popular emotional norms (emotional valence and arousal) in English and Spanish languages. To do so, we combined the strengths of semantics from vector space models (vector length, semantic diversity, and word maturity measures), and feature-based models of emotions. First, we generated a common vector space representing the meaning of words at different developmental stages (five and four developmental stages for English and Spanish, respectively) using the Word Maturity methodology to align different vector spaces. Second, we analyzed the amodal development of words through mixed-effects models with crossed random effects for words and variables using a continuous time metric. Third, the emotional norms were included as covariates in the statistical models. We evaluated more than 23,000 words, whose emotional norms were available for more than 10,000 words, in each language separately. Results showed a curve of amodal development with an increasing linear effect and a small quadratic deceleration. A relevant influence on the amodal development of words was found only for emotional valence (not for arousal), suggesting that positive words have an earlier amodal development and a less pronounced semantic change across early lifespan.
Affiliation(s)
- José Ángel Martínez-Huertas
  - Department of Methodology of Behavioral Sciences, Universidad Nacional de Educación a Distancia, Madrid, Spain
- Guillermo Jorge-Botana
  - Department of Psychobiology and Methodology of Behavioral Sciences, Universidad Complutense de Madrid, Madrid, Spain
- Diego Iglesias
  - Department of Social Psychology and Methodology, Universidad Autónoma de Madrid, Madrid, Spain
- Ricardo Olmos
  - Department of Social Psychology and Methodology, Universidad Autónoma de Madrid, Madrid, Spain
3. A Failed Cross-Validation Study on the Relationship between LIWC Linguistic Indicators and Personality: Exemplifying the Lack of Generalizability of Exploratory Studies. Psych 2022. [DOI: 10.3390/psych4040059] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/06/2022] Open
Abstract
(1) Background: Previous meta-analytic research found small to moderate relationships between the Big Five personality traits and different linguistic computational indicators. However, previous studies included multiple linguistic indicators to predict personality from an exploratory framework. The aim of this study was to conduct a cross-validation study analyzing the relationships between language indicators and personality traits to test the generalizability of previous results; (2) Methods: 643 Spanish undergraduate students were tasked to write a self-description in 500 words (which was evaluated with the LIWC) and to answer a standardized Big Five questionnaire. Two different analytical approaches using multiple linear regression were followed: first, using the complete data and, second, by conducting different cross-validation studies; (3) Results: The results showed medium effect sizes in the first analytical approach. On the contrary, it was found that language and personality relationships were not generalizable in the cross-validation studies; (4) Conclusions: We concluded that moderate effect sizes could be obtained when the language and personality relationships were analyzed in single samples, but it was not possible to generalize the model estimates to other samples. Thus, previous exploratory results found on this line of research appear to be incompatible with a nomothetic approach.
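The contrast between the two analytical approaches can be sketched with simulated data. The example below fits a multiple linear regression with several predictors to an outcome that is pure noise, then compares in-sample R² with R² on a fresh sample. Everything here is an assumption for illustration (sample size, number of predictors, function names, Gaussian noise); it does not use LIWC indicators or the study's data, and it shows only the general overfitting mechanism that cross-validation is meant to expose.

```python
import random


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def fit_ols(X, y):
    """Ordinary least squares with an intercept, via the normal equations."""
    rows = [[1.0] + list(r) for r in X]
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    return solve(xtx, xty)


def r_squared(beta, X, y):
    """R^2 of a fitted model on (X, y); can be negative on new data."""
    pred = [beta[0] + sum(b * v for b, v in zip(beta[1:], r)) for r in X]
    mean_y = sum(y) / len(y)
    ss_res = sum((yi - pi) ** 2 for yi, pi in zip(y, pred))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1.0 - ss_res / ss_tot


def simulate(n=40, p=10, seed=0):
    """Return (in-sample R^2, holdout R^2) when predictors are unrelated to y."""
    rng = random.Random(seed)

    def sample():
        X = [[rng.gauss(0, 1) for _ in range(p)] for _ in range(n)]
        y = [rng.gauss(0, 1) for _ in range(n)]
        return X, y

    X_train, y_train = sample()
    X_test, y_test = sample()
    beta = fit_ols(X_train, y_train)
    return r_squared(beta, X_train, y_train), r_squared(beta, X_test, y_test)


def average_r2(seeds=range(20)):
    """Average the two R^2 values over several simulated replications."""
    results = [simulate(seed=s) for s in seeds]
    mean_in = sum(r[0] for r in results) / len(results)
    mean_out = sum(r[1] for r in results) / len(results)
    return mean_in, mean_out


mean_in, mean_out = average_r2()
print("in-sample R^2:", round(mean_in, 3), "holdout R^2:", round(mean_out, 3))
```

With many predictors relative to the sample size, the in-sample fit looks non-trivial even though nothing generalizes, which is the pattern a held-out evaluation is designed to catch.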
4. Günther F, Marelli M. Patterns in CAOSS: Distributed representations predict variation in relational interpretations for familiar and novel compound words. Cogn Psychol 2022; 134:101471. [PMID: 35339747 DOI: 10.1016/j.cogpsych.2022.101471] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/19/2021] [Revised: 01/25/2022] [Accepted: 02/28/2022] [Indexed: 12/01/2022]
Abstract
While distributional semantic models that represent word meanings as high-dimensional vectors induced from large text corpora have been shown to successfully predict human behavior across a wide range of tasks, they have also received criticism from different directions. These include concerns over their interpretability (how can numbers specifying abstract, latent dimensions represent meaning?) and their ability to capture variation in meaning (how can a single vector representation capture multiple different interpretations for the same expression?). Here, we demonstrate that semantic vectors can indeed rise up to these challenges, by training a mapping system (a simple linear regression) that predicts inter-individual variation in relational interpretations for compounds such as wood brush (for example brush FOR wood, or brush MADE OF wood) from (compositional) semantic vectors representing the meanings of these compounds. These predictions consistently beat different random baselines, both for familiar compounds (moon light, Experiment 1) as well as novel compounds (wood brush, Experiment 2), demonstrating that distributional semantic vectors encode variations in qualitative interpretations that can be decoded using techniques as simple as linear regression.
Affiliation(s)
- Marco Marelli
  - University of Milano-Bicocca, Milan, Italy; NeuroMI, Milan Center for Neuroscience, Milan, Italy
5. Martínez-Huertas JÁ, Jorge-Botana G, Olmos R. Emotional Valence Precedes Semantic Maturation of Words: A Longitudinal Computational Study of Early Verbal Emotional Anchoring. Cogn Sci 2021; 45:e13026. [PMID: 34288038 DOI: 10.1111/cogs.13026] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 07/27/2020] [Revised: 06/12/2021] [Accepted: 06/25/2021] [Indexed: 11/26/2022]
Abstract
We present a longitudinal computational study on the connection between emotional and amodal word representations from a developmental perspective. In this study, children's and adult word representations were generated using the latent semantic analysis (LSA) vector space model and Word Maturity methodology. Some children's word representations were used to set a mapping function between amodal and emotional word representations with a neural network model using ratings from 9-year-old children. The neural network was trained and validated in the child semantic space. Then, the resulting neural network was tested with adult word representations using ratings from an adult data set. Samples of 1210 and 5315 words were used in the child and the adult semantic spaces, respectively. Results suggested that the emotional valence of words can be predicted from amodal vector representations even at the child stage, and accurate emotional propagation was found in the adult word vector representations. In this way, different propagative processes were observed in the adult semantic space. These findings highlight a potential mechanism for early verbal emotional anchoring. Moreover, different multiple linear regression and mixed-effect models revealed moderation effects for the performance of the longitudinal computational model. First, words with early maturation and subsequent semantic definition promoted emotional propagation. Second, an interaction effect between age of acquisition and abstractness was found to explain model performance. The theoretical and methodological implications are discussed.
Affiliation(s)
- Ricardo Olmos
  - Faculty of Psychology, Universidad Autónoma de Madrid
6. Moreno JD, Martínez-Huertas JÁ, Olmos R, Jorge-Botana G, Botella J. Can personality traits be measured analyzing written language? A meta-analytic study on computational methods. Personality and Individual Differences 2021. [DOI: 10.1016/j.paid.2021.110818] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Indexed: 01/08/2023]
7. Günther F, Petilli MA, Vergallito A, Marelli M. Images of the unseen: extrapolating visual representations for abstract and concrete words in a data-driven computational model. Psychological Research 2020; 86:2512-2532. [PMID: 33180152 DOI: 10.1007/s00426-020-01429-7] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.8] [Indexed: 11/28/2022]
Abstract
Theories of grounded cognition assume that conceptual representations are grounded in sensorimotor experience. However, abstract concepts such as jealousy or childhood have no directly associated referents with which such sensorimotor experience can be made; therefore, the grounding of abstract concepts has long been a topic of debate. Here, we propose (a) that systematic relations exist between semantic representations learned from language on the one hand and perceptual experience on the other hand, (b) that these relations can be learned in a bottom-up fashion, and (c) that it is possible to extrapolate from this learning experience to predict expected perceptual representations for words even where direct experience is missing. To test this, we implement a data-driven computational model that is trained to map language-based representations (obtained from text corpora, representing language experience) onto vision-based representations (obtained from an image database, representing perceptual experience), and apply its mapping function onto language-based representations for abstract and concrete words outside the training set. In three experiments, we present participants with these words, accompanied by two images: the image predicted by the model and a random control image. Results show that participants' judgements were in line with model predictions even for the most abstract words. This preference was stronger for more concrete items and decreased for the more abstract ones. Taken together, our findings have substantial implications in support of the grounding of abstract words, suggesting that we can tap into our previous experience to create possible visual representation we don't have.
Affiliation(s)
- Alessandra Vergallito
  - University of Milano-Bicocca, Milan, Italy; NeuroMI, Milan Center for Neuroscience, Milan, Italy
- Marco Marelli
  - University of Milano-Bicocca, Milan, Italy; NeuroMI, Milan Center for Neuroscience, Milan, Italy