1
Plisiecki H, Sobieszek A. Emotion topology: extracting fundamental components of emotions from text using word embeddings. Front Psychol 2024; 15:1401084. [PMID: 39439759 PMCID: PMC11494860 DOI: 10.3389/fpsyg.2024.1401084] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/14/2024] [Accepted: 09/03/2024] [Indexed: 10/25/2024] Open
Abstract
This exploratory study examined the potential of word embeddings, an automated numerical representation of written text, as a novel method for emotion decomposition analysis. Drawing from a substantial dataset scraped from a Social Media site, we constructed emotion vectors to extract the dimensions of emotions, as annotated by the readers of the texts, directly from human language. Our findings demonstrated that word embeddings yield emotional components akin to those found in previous literature, offering an alternative perspective not bounded by theoretical presuppositions, as well as showing that the dimensional structure of emotions is reflected in the semantic structure of their text-based expressions. Our study highlights word embeddings as a promising tool for uncovering the nuances of human emotions and comments on the potential of this approach for other psychological domains, providing a basis for future studies. The exploratory nature of this research paves the way for further development and refinement of this method, promising to enrich our understanding of emotional constructs and psychological phenomena in a more ecologically valid and data-driven manner.
Affiliation(s)
- Hubert Plisiecki
- Research Lab for the Digital Social Sciences, IFIS PAN, Warsaw, Poland
- Adam Sobieszek
- Department of Psychology, University of Warsaw, Warsaw, Poland
2
Izydorczyk D, Bröder A. What is the airspeed velocity of an unladen swallow? Modeling numerical judgments of realistic stimuli. Psychon Bull Rev 2024; 31:1-15. [PMID: 37803234 PMCID: PMC11192830 DOI: 10.3758/s13423-023-02331-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 06/28/2023] [Indexed: 10/08/2023]
Abstract
Research on processes of multiple-cue judgments usually uses artificial stimuli with predefined cue structures, such as artificial bugs with four binary features like back color, belly color, gland size, and spot shape. One reason for using artificial stimuli is that the cognitive models used in this area need known cues and cue values. This limitation makes it difficult to apply the models to research questions with complex naturalistic stimuli with unknown cue structure. In two studies, building on early categorization research, we demonstrate how cues and cue values of complex naturalistic stimuli can be extracted from pairwise similarity ratings with a multidimensional scaling analysis. These extracted cues can then be used in a state-of-the-art hierarchical Bayesian model of numerical judgments. In the first study, we show that predefined cue structures of artificial stimuli are well recovered by an MDS analysis of similarity judgments and that using these MDS-based attributes as cues in a cognitive model of judgment data from an existing experiment leads to the same inferences as when the original cue values were used. In the second study, we use the same procedure to replicate previous findings from multiple-cue judgment literature using complex naturalistic stimuli.
Affiliation(s)
- David Izydorczyk
- Department of Psychology, School of Social Sciences, University of Mannheim, Mannheim, Germany.
- Arndt Bröder
- Department of Psychology, School of Social Sciences, University of Mannheim, Mannheim, Germany
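The abstract above describes extracting cue values from pairwise similarity ratings with multidimensional scaling. As a rough illustration only, the sketch below applies classical (Torgerson) MDS to a small distance matrix and recovers a single dimension by power iteration. It is not the authors' procedure: the function name `classical_mds_1d`, the one-dimensional restriction, and the assumption that similarity ratings have already been converted to distances are all choices made for this example.

```python
import math


def classical_mds_1d(dist, iters=200):
    """Recover one coordinate per item from a symmetric distance matrix.

    Hypothetical helper for illustration: classical MDS restricted to the
    leading dimension, with the eigenvector found by power iteration.
    """
    n = len(dist)
    # Squared distances between every pair of items.
    sq = [[dist[i][j] ** 2 for j in range(n)] for i in range(n)]
    row_mean = [sum(row) / n for row in sq]
    grand_mean = sum(row_mean) / n
    # Double centering turns squared distances into inner products.
    # (Row means equal column means because the matrix is symmetric.)
    b = [
        [-0.5 * (sq[i][j] - row_mean[i] - row_mean[j] + grand_mean) for j in range(n)]
        for i in range(n)
    ]
    # Power iteration for the leading eigenvector of the centered matrix.
    v = [float(i + 1) for i in range(n)]
    eigenvalue = 0.0
    for _ in range(iters):
        w = [sum(b[i][j] * v[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            break
        v = [x / norm for x in w]
        eigenvalue = norm
    # Scale the unit eigenvector by the square root of its eigenvalue.
    return [math.sqrt(eigenvalue) * x for x in v]


# Three items lying on a line at positions 0, 1 and 3 (made-up data).
distances = [[0, 1, 3], [1, 0, 2], [3, 2, 0]]
coords = classical_mds_1d(distances)
```

The recovered coordinates are only defined up to translation and reflection, so they should be compared through the pairwise distances they imply rather than through their raw values.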
3
Cao X, Kosinski M. Large language models know how the personality of public figures is perceived by the general public. Sci Rep 2024; 14:6735. [PMID: 38509191 PMCID: PMC10954708 DOI: 10.1038/s41598-024-57271-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/14/2023] [Accepted: 03/15/2024] [Indexed: 03/22/2024] Open
Abstract
We show that people's perceptions of public figures' personalities can be accurately predicted from their names' location in GPT-3's semantic space. We collected Big Five personality perceptions of 226 public figures from 600 human raters. Cross-validated linear regression was used to predict human perceptions from public figures' name embeddings extracted from GPT-3. The models' accuracy ranged from r = .78 to .88 without controls and from r = .53 to .70 when controlling for public figures' likability and demographics, after correcting for attenuation. Prediction models showed high face validity as revealed by the personality-descriptive adjectives occupying their extremes. Our findings reveal that GPT-3 word embeddings capture signals pertaining to individual differences and intimate traits.
Affiliation(s)
- Xubo Cao
- Stanford University, Stanford, USA.
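The accuracies in the abstract above are reported "after correcting for attenuation". The abstract does not say which correction was applied; a common choice is Spearman's formula, which divides the observed correlation by the square root of the product of the two measures' reliabilities. The sketch below assumes that formula, and the function name and the numbers in the usage line are hypothetical, not values from the paper.

```python
import math


def disattenuate(r_observed, reliability_x, reliability_y):
    """Spearman's correction for attenuation (assumed here, not confirmed
    by the abstract): r_true = r_observed / sqrt(rel_x * rel_y)."""
    if not (0 < reliability_x <= 1 and 0 < reliability_y <= 1):
        raise ValueError("reliabilities must lie in the interval (0, 1]")
    return r_observed / math.sqrt(reliability_x * reliability_y)


# Made-up inputs: observed r = .60, reliabilities .80 and .90.
corrected = disattenuate(0.60, 0.80, 0.90)
```

With perfectly reliable measures (both reliabilities equal to 1) the correction leaves the observed correlation unchanged; lower reliabilities push the corrected estimate upward.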
4
Leach S, Kitchin AP, Sutton RM. Word embeddings reveal growing moral concern for people, animals and the environment. British Journal of Social Psychology 2023; 62:1925-1938. [PMID: 37403899 DOI: 10.1111/bjso.12663] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/06/2022] [Accepted: 06/01/2023] [Indexed: 07/06/2023]
Abstract
The Enlightenment idea of historical moral progress asserts that civil societies become more moral over time. This is often understood as an expanding moral circle and is argued to be tightly linked with language use, with some suggesting that shifts in how we express concern for others can be considered an important indicator of moral progress. Our research explores these notions by examining historical trends in natural language use during the 19th and 20th centuries. We found that the associations between words denoting moral concern and words referring to people, animals, and the environment grew stronger over time. The findings support widely-held views about the nature of moral progress by showing that language has changed in a way that reflects greater concern for others.
Affiliation(s)
- Stefan Leach
- School of Psychology, University of Kent, Canterbury, UK
5
Ansteeg L, Leoné F, Dijkstra T. Characterizing the semantic and form-based similarity spaces of the mental lexicon by means of the multi-arrangement method. Front Psychol 2022; 13:945094. [PMID: 36033027 PMCID: PMC9407019 DOI: 10.3389/fpsyg.2022.945094] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/16/2022] [Accepted: 07/25/2022] [Indexed: 12/04/2022] Open
Abstract
Collecting human similarity judgments is instrumental to measuring and modeling neurocognitive representations (e.g., through representational similarity analysis) and has been made more efficient by the multi-arrangement task. While this task has been tested for collecting semantic similarity judgments, it is unclear whether it also lends itself to phonological and orthographic similarity judgments of words. We have extended the task to include these lexical modalities and compared the results between modalities and against computational models. We find that similarity judgments can be collected for all three modalities, although word forms were considered more difficult to sort and resulted in less consistent inter- and intra-rater agreement than semantics. For all three modalities we can construct stable group-level representational similarity matrices. However, these do not capture significant idiosyncratic similarity information unique to each participant. We discuss the potential underlying causes for differences between modalities and their effect on the application of the multi-arrangement task.
6
How much is a cow like a meow? A novel database of human judgements of audiovisual semantic relatedness. Atten Percept Psychophys 2022; 84:1317-1327. [PMID: 35449432 DOI: 10.3758/s13414-022-02488-1] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Accepted: 03/27/2022] [Indexed: 11/08/2022]
Abstract
Semantic information about objects, events, and scenes influences how humans perceive, interact with, and navigate the world. The semantic information about any object or event can be highly complex and frequently draws on multiple sensory modalities, which makes it difficult to quantify. Past studies have primarily relied on either a simplified binary classification of semantic relatedness based on category or on algorithmic values based on text corpora rather than human perceptual experience and judgement. With the aim to further accelerate research into multisensory semantics, we created a constrained audiovisual stimulus set and derived similarity ratings between items within three categories (animals, instruments, household items). A set of 140 participants provided similarity judgments between sounds and images. Participants either heard a sound (e.g., a meow) and judged which of two pictures of objects (e.g., a picture of a dog and a duck) it was more similar to, or saw a picture (e.g., a picture of a duck) and selected which of two sounds it was more similar to (e.g., a bark or a meow). Judgements were then used to calculate similarity values of any given cross-modal pair. An additional 140 participants provided word judgement to calculate similarity of word-word pairs. The derived and reported similarity judgements reflect a range of semantic similarities across three categories and items, and highlight similarities and differences among similarity judgments between modalities. We make the derived similarity values available in a database format to the research community to be used as a measure of semantic relatedness in cognitive psychology experiments, enabling more robust studies of semantics in audiovisual environments.
7
Bhatia S, Aka A. Cognitive Modeling With Representations From Large-Scale Digital Data. Current Directions in Psychological Science 2022. [DOI: 10.1177/09637214211068113] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Indexed: 11/16/2022]
Abstract
Deep-learning methods can extract high-dimensional feature vectors for objects, concepts, images, and texts from large-scale digital data sets. These vectors are proxies for the mental representations that people use in everyday cognition and behavior. For this reason, they can serve as inputs into computational models of cognition, giving these models the ability to process and respond to naturalistic prompts. Over the past few years, researchers have applied this approach to topics such as similarity judgment, memory search, categorization, decision making, and conceptual knowledge. In this article, we summarize these applications, identify underlying trends, and outline directions for future research on the computational modeling of naturalistic cognition and behavior.
Affiliation(s)
- Sudeep Bhatia
- Department of Psychology and Department of Marketing, University of Pennsylvania
- Ada Aka
- Department of Psychology and Department of Marketing, University of Pennsylvania
8
Günther F, Marelli M. Patterns in CAOSS: Distributed representations predict variation in relational interpretations for familiar and novel compound words. Cogn Psychol 2022; 134:101471. [PMID: 35339747 DOI: 10.1016/j.cogpsych.2022.101471] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/19/2021] [Revised: 01/25/2022] [Accepted: 02/28/2022] [Indexed: 12/01/2022]
Abstract
While distributional semantic models that represent word meanings as high-dimensional vectors induced from large text corpora have been shown to successfully predict human behavior across a wide range of tasks, they have also received criticism from different directions. These include concerns over their interpretability (how can numbers specifying abstract, latent dimensions represent meaning?) and their ability to capture variation in meaning (how can a single vector representation capture multiple different interpretations for the same expression?). Here, we demonstrate that semantic vectors can indeed rise up to these challenges, by training a mapping system (a simple linear regression) that predicts inter-individual variation in relational interpretations for compounds such as wood brush (for example brush FOR wood, or brush MADE OF wood) from (compositional) semantic vectors representing the meanings of these compounds. These predictions consistently beat different random baselines, both for familiar compounds (moon light, Experiment 1) as well as novel compounds (wood brush, Experiment 2), demonstrating that distributional semantic vectors encode variations in qualitative interpretations that can be decoded using techniques as simple as linear regression.
Affiliation(s)
- Marco Marelli
- University of Milano-Bicocca, Milan, Italy; NeuroMI, Milan Center for Neuroscience, Milan, Italy
9
Gandhi N, Zou W, Meyer C, Bhatia S, Walasek L. Computational Methods for Predicting and Understanding Food Judgment. Psychol Sci 2022; 33:579-594. [PMID: 35298316 DOI: 10.1177/09567976211043426] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Indexed: 11/16/2022] Open
Abstract
People make subjective judgments about the healthiness of different foods every day, and these judgments in turn influence their food choices and health outcomes. Despite the importance of such judgments, there are few quantitative theories about their psychological underpinnings. This article introduces a novel computational approach that can approximate people's knowledge representations for thousands of common foods. We used these representations to predict how both lay decision-makers (the general population) and experts judge the healthiness of individual foods. We also applied our method to predict the impact of behavioral interventions, such as the provision of front-of-pack nutrient and calorie information. Across multiple studies with data from 846 adults, our models achieved very high accuracy rates (r² = .65-.77) and significantly outperformed competing models based on factual nutritional content. These results illustrate how new computational methods applied to established psychological theory can be used to better predict, understand, and influence health behavior.
Affiliation(s)
- Natasha Gandhi
- Behaviour and Wellbeing Science Group, Warwick Manufacturing Group (WMG), University of Warwick
- Wanling Zou
- Department of Psychology, University of Pennsylvania
- Caroline Meyer
- Behaviour and Wellbeing Science Group, Warwick Manufacturing Group (WMG), University of Warwick
- Sudeep Bhatia
- Department of Psychology, University of Pennsylvania
10
Verheyen S, Storms G. Whether the Pairwise Rating Method and the Spatial Arrangement Method yield comparable dimensionalities depends on the dimensionality choice procedure. Methods in Psychology 2021. [DOI: 10.1016/j.metip.2021.100060] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/29/2022] Open
11
Richie R, Bhatia S. Similarity Judgment Within and Across Categories: A Comprehensive Model Comparison. Cogn Sci 2021; 45:e13030. [PMID: 34379325 DOI: 10.1111/cogs.13030] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 03/30/2020] [Revised: 06/17/2021] [Accepted: 06/25/2021] [Indexed: 10/20/2022]
Abstract
Similarity is one of the most important relations humans perceive, arguably subserving category learning and categorization, generalization and discrimination, judgment and decision making, and other cognitive functions. Researchers have proposed a wide range of representations and metrics that could be at play in similarity judgment, yet have not comprehensively compared the power of these representations and metrics for predicting similarity within and across different semantic categories. We performed such a comparison by pairing nine prominent vector semantic representations with seven established similarity metrics that could operate on these representations, as well as supervised methods for dimensional weighting in the similarity function. This approach yields a factorial model structure with 126 distinct representation-metric pairs, which we tested on a novel dataset of similarity judgments between pairs of cohyponymic words in eight categories. We found that cosine similarity and Pearson correlation were the overall best performing unweighted similarity functions, and that word vectors derived from free association norms often outperformed word vectors derived from text (including those specialized for similarity). Importantly, models that used human similarity judgments to learn category-specific weights on dimensions yielded substantially better predictions than all unweighted approaches across all types of similarity functions and representations, although dimension weights did not generalize well across semantic categories, suggesting strong category context effects in similarity judgment. We discuss implications of these results for cognitive modeling and natural language processing, as well as for theories of the representations and metrics involved in similarity.
Affiliation(s)
- Russell Richie
- Department of Psychology, University of Pennsylvania; Children's Hospital of Philadelphia
- Sudeep Bhatia
- Department of Psychology, University of Pennsylvania
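The abstract above singles out cosine similarity and Pearson correlation as the best-performing unweighted similarity functions. As a minimal sketch of those two metrics on plain Python lists (the function names and toy vectors are illustrative, not taken from the paper), note that Pearson correlation is simply the cosine similarity of the mean-centered vectors:

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def pearson_correlation(u, v):
    """Pearson r, computed as the cosine similarity of centered vectors."""
    mean_u = sum(u) / len(u)
    mean_v = sum(v) / len(v)
    return cosine_similarity(
        [a - mean_u for a in u],
        [b - mean_v for b in v],
    )


# Toy "word vectors" (made up for illustration).
cat = [0.9, 0.1, 0.3]
dog = [0.8, 0.2, 0.4]
similarity = cosine_similarity(cat, dog)
correlation = pearson_correlation(cat, dog)
```

Because the two metrics differ only in the centering step, they tend to agree closely on embedding vectors whose components already average near zero. Both functions divide by a vector norm, so they are undefined for all-zero vectors, and Pearson correlation is additionally undefined for constant vectors.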
12
Abstract
Adult semantic memory has been traditionally conceptualized as a relatively static memory system that consists of knowledge about the world, concepts, and symbols. Considerable work in the past few decades has challenged this static view of semantic memory, and instead proposed a more fluid and flexible system that is sensitive to context, task demands, and perceptual and sensorimotor information from the environment. This paper (1) reviews traditional and modern computational models of semantic memory, within the umbrella of network (free association-based), feature (property generation norms-based), and distributional semantic (natural language corpora-based) models, (2) discusses the contribution of these models to important debates in the literature regarding knowledge representation (localist vs. distributed representations) and learning (error-free/Hebbian learning vs. error-driven/predictive learning), and (3) evaluates how modern computational models (neural network, retrieval-based, and topic models) are revisiting the traditional "static" conceptualization of semantic memory and tackling important challenges in semantic modeling such as addressing temporal, contextual, and attentional influences, as well as incorporating grounding and compositionality into semantic representations. The review also identifies new challenges regarding the abundance and availability of data, the generalization of semantic models to other languages, and the role of social interaction and collaboration in language learning and development. The concluding section advocates the need for integrating representational accounts of semantic memory with process-based accounts of cognitive behavior, as well as the need for explicit comparisons of computational models to human baselines in semantic tasks to adequately assess their psychological plausibility as models of human semantic memory.
13
Zou W, Bhatia S. Judgment errors in naturalistic numerical estimation. Cognition 2021; 211:104647. [PMID: 33706155 DOI: 10.1016/j.cognition.2021.104647] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 04/02/2020] [Revised: 02/19/2021] [Accepted: 02/23/2021] [Indexed: 11/29/2022]
Abstract
People estimate numerical quantities (such as the calories of foods) on a day-to-day basis. Although these estimates influence behavior and determine wellbeing, they are prone to two important types of errors. Scaling errors occur when people make mistakes reporting their beliefs about a particular numerical quantity (e.g. by inflating small numbers). Belief errors occur when people make mistakes using their knowledge of the judgment target to form their beliefs about the numerical quantity (e.g. by overweighting certain cues). In this paper, we quantitatively model numerical estimates, and in turn, scaling and belief errors, in everyday judgment tasks. Our approach is unique in using insights from semantic memory research to specify knowledge for naturalistic judgment targets, allowing our models to formally describe nuanced errors in belief not considered in prior research. In Studies 1 and 2, we find that belief error models predict participant estimates and errors with very high out-of-sample accuracy rates, significantly outperforming the predictions of scaling error models. In fact, the best-fitting belief error models can closely mimic the inverse-S shaped patterns captured by scaling error models, suggesting that the types of responses previously attributed to scaling errors can be seen as errors of belief. In Studies 3 to 8, we find that belief error models are also able to predict people's responses in semantic judgment, free association, and verbal protocol tasks related to numerical judgment, and thus provide a good account of the cognitive underpinnings of judgment.