1
Rosen ZP, Dale R. BERTs of a feather: Studying inter- and intra-group communication via information theory and language models. Behav Res Methods 2024; 56:3140-3160. [PMID: 38030924] [DOI: 10.3758/s13428-023-02267-2]
Abstract
When communicating, individuals alter their language to fulfill a myriad of social functions. In particular, linguistic convergence and divergence are fundamental in establishing and maintaining group identity. Quantitatively characterizing linguistic convergence is important when testing hypotheses surrounding language, including interpersonal and group communication. We provide a quantitative interpretation of linguistic convergence grounded in information theory. We then construct a computational model, built on top of a neural network model of language, that can be deployed to measure and test hypotheses about linguistic convergence in "big data." We demonstrate the utility of our convergence measurement in two case studies: (1) showing that our measurement is indeed sensitive to linguistic convergence across turns in dyadic conversation, and (2) showing that our convergence measurement is sensitive to social factors that mediate convergence in Internet-based communities (specifically, r/MensRights and r/MensLib). Our measurement also captures differences in which social factors influence web-based communities. We conclude by discussing methodological and theoretical implications of this semantic convergence analysis.
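The paper's convergence measure is built on top of a neural language model; as a hedged sketch of the information-theoretic idea it grounds, two speakers' turns can be scored by the KL divergence between their smoothed word distributions. The toy turns and helper names below are illustrative, not the authors' implementation:

```python
import math
from collections import Counter

def word_dist(tokens, vocab, alpha=1.0):
    """Additively smoothed unigram distribution over a shared vocabulary."""
    counts = Counter(tokens)
    total = len(tokens) + alpha * len(vocab)
    return {w: (counts[w] + alpha) / total for w in vocab}

def convergence(turn_a, turn_b):
    """KL divergence between the two turns' word distributions;
    lower divergence = more linguistic convergence."""
    vocab = set(turn_a) | set(turn_b)
    p = word_dist(turn_a, vocab)
    q = word_dist(turn_b, vocab)
    return sum(p[w] * math.log2(p[w] / q[w]) for w in vocab)

aligned = convergence("the cats sat on the mat".split(),
                      "the cats lay on the mat".split())
diverged = convergence("the cats sat on the mat".split(),
                       "stock prices fell sharply again today".split())
assert aligned < diverged  # shared wording yields lower divergence
```

Swapping the unigram distributions for a language model's predictive distributions moves this closer in spirit to the paper's BERT-based measure.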
Affiliation(s)
- Zachary P Rosen
- Communication Studies, Saddleback Community College, Mission Viejo, CA, USA
- Rick Dale
- Department of Communication, UCLA, Los Angeles, CA, USA
2
Johns BT. Determining the Relativity of Word Meanings Through the Construction of Individualized Models of Semantic Memory. Cogn Sci 2024; 48:e13413. [PMID: 38402448] [DOI: 10.1111/cogs.13413]
Abstract
Distributional models of lexical semantics are capable of acquiring sophisticated representations of word meanings. The main theoretical insight provided by these models is that they demonstrate the systematic connection between the knowledge that people acquire and the experience that they have with the natural language environment. However, linguistic experience is inherently variable and differs radically across people due to demographic and cultural variables. Recently, distributional models have been used to examine how word meanings vary across languages, revealing considerable variability in the meanings of words for most semantic categories. The goal of this article is to examine how variable word meanings are across individual language users within a single language. This was accomplished by assembling 500 individual user corpora obtained from the online forum Reddit. Each user corpus ranged between 3.8 and 32.3 million words, and a count-based distributional framework was used to extract word meanings for each user. These representations were then used to estimate the semantic alignment of word meanings across individual language users. It was found that there are significant levels of relativity in word meanings across individuals, and these differences are partially explained by other psycholinguistic factors, such as concreteness, semantic diversity, and social aspects of language usage. These results point to word meanings being fundamentally relative and contextually fluid, with this relativity being tied to the individualized nature of linguistic experience.
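A minimal sketch of this count-based, per-user approach (toy corpora and hypothetical helper names, not the paper's actual pipeline): build one co-occurrence vector for a target word from each user's corpus, then measure cross-user alignment with cosine similarity:

```python
import math
from collections import Counter

def context_vector(tokens, target, window=2):
    """Co-occurrence counts for `target` within a +/-window, one user's corpus."""
    vec = Counter()
    for i, tok in enumerate(tokens):
        if tok == target:
            lo, hi = max(0, i - window), min(len(tokens), i + window + 1)
            for j in range(lo, hi):
                if j != i:
                    vec[tokens[j]] += 1
    return vec

def cosine(u, v):
    dot = sum(u[k] * v[k] for k in set(u) & set(v))
    nu = math.sqrt(sum(x * x for x in u.values()))
    nv = math.sqrt(sum(x * x for x in v.values()))
    return dot / (nu * nv) if nu and nv else 0.0

# Three hypothetical users whose usage of "python" differs.
user_a = "i wrote a python script to parse the data with python code".split()
user_b = "the python snake is a large python found in tropical forests".split()
user_c = "i ran the python script on my data and the python code worked".split()

va = context_vector(user_a, "python")
vb = context_vector(user_b, "python")
vc = context_vector(user_c, "python")
# The two "programmer" users align more closely with each other.
assert cosine(va, vc) > cosine(va, vb)
```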
3
Johns BT, Taler V, Jones MN. Contextual dynamics in lexical encoding across the ageing spectrum: A simulation study. Q J Exp Psychol (Hove) 2023; 76:2164-2182. [PMID: 36458499] [PMCID: PMC10466941] [DOI: 10.1177/17470218221145685]
Abstract
The field of psycholinguistics has recently questioned the primacy of word frequency (WF) in influencing word recognition and production, focusing instead on the importance of a word's contextual diversity (CD). WF is operationalised by counting the number of occurrences of a word in a corpus, while a word's CD is a count of the number of contexts that a word occurs in, with repetitions within a context being ignored. Numerous studies have converged on the conclusion that CD is a better predictor of word recognition latency and accuracy than frequency. These findings support a cognitive mechanism based on the principle of likely need over the principle of repetition in lexical organisation. In the current study, we trained the semantic distinctiveness model on communication patterns from social media platforms totalling over 55 billion word tokens and examined the ability of theoretically distinct models to explain word recognition latency and accuracy data from over 1 million participants in the English Crowdsourcing Project norms of Mandera et al., consisting of approximately 59,000 words across six age bands ranging from ages 10 to 60 years. There was a clear quantitative trend across the age bands: a shift from a social environment-based attention mechanism in the "younger" models to a clear dominance of a discourse-based attention mechanism as models "aged." This pattern suggests that there is a dynamic interaction between the cognitive mechanisms of lexical organisation and environmental information that emerges across ageing.
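The WF/CD distinction defined above can be made concrete in a few lines (toy contexts; the study itself uses billions of tokens and the semantic distinctiveness model):

```python
from collections import Counter

def word_frequency(contexts):
    """WF: total occurrences of each word across all contexts."""
    wf = Counter()
    for ctx in contexts:
        wf.update(ctx)
    return wf

def contextual_diversity(contexts):
    """CD: number of contexts a word occurs in; repetitions within a
    single context are ignored, per the definition above."""
    cd = Counter()
    for ctx in contexts:
        cd.update(set(ctx))
    return cd

contexts = [
    "the dog chased the dog".split(),          # "dog" repeats within one context
    "a cat slept".split(),
    "the cat watched the cat and cat".split(), # "cat" repeats within one context
    "a dog barked".split(),
    "cat food".split(),
]
wf, cd = word_frequency(contexts), contextual_diversity(contexts)
assert wf["cat"] == 5 and cd["cat"] == 3  # frequent, but in only three contexts
```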
Affiliation(s)
- Brendan T Johns
- Department of Psychology, McGill University, Montreal, Quebec, Canada
4
Diachek E, Brown-Schmidt S, Polyn SM. Items Outperform Adjectives in a Computational Model of Binary Semantic Classification. Cogn Sci 2023; 47:e13336. [PMID: 37695844] [DOI: 10.1111/cogs.13336]
Abstract
Semantic memory encompasses one's knowledge about the world. Distributional semantic models, which construct vector spaces with embedded words, are a proposed framework for understanding the representational structure of human semantic knowledge. Unlike some classic semantic models, distributional semantic models lack a mechanism for specifying the properties of concepts, which raises questions regarding their utility for a general theory of semantic knowledge. Here, we develop a computational model of a binary semantic classification task, in which participants judged target words for the referent's size or animacy. We created a family of models, evaluating multiple distributional semantic models, and mechanisms for performing the classification. The most successful model constructed two composite representations for each extreme of the decision axis (e.g., one averaging together representations of characteristically big things and another of characteristically small things). Next, the target item was compared to each composite representation, allowing the model to classify more than 1,500 words with human-range performance and to predict response times. We propose that when making a decision on a binary semantic classification task, humans use task prompts to retrieve instances representative of the extremes on that semantic dimension and compare the probe to those instances. This proposal is consistent with the principles of the instance theory of semantic memory.
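The most successful model's mechanism can be sketched directly: average exemplar embeddings into one composite per pole of the decision axis, then classify a probe by cosine similarity to each composite. The 3-d vectors below are hand-made toys, not corpus-derived embeddings:

```python
import math

def composite(vectors):
    """Average several exemplar embeddings into one composite."""
    n = len(vectors)
    return [sum(v[i] for v in vectors) / n for i in range(len(vectors[0]))]

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v)))

def classify(probe, big_exemplars, small_exemplars, emb):
    """Compare the probe to the composite at each pole of the decision axis."""
    big = composite([emb[w] for w in big_exemplars])
    small = composite([emb[w] for w in small_exemplars])
    return "big" if cosine(emb[probe], big) > cosine(emb[probe], small) else "small"

# Hand-made 3-d vectors (real models use corpus-derived embeddings).
emb = {"elephant": [0.9, 0.8, 0.1], "whale":  [0.8, 0.9, 0.2],
       "ant":      [0.1, 0.2, 0.9], "pebble": [0.2, 0.1, 0.8],
       "truck":    [0.7, 0.9, 0.3], "crumb":  [0.1, 0.1, 0.9]}
assert classify("truck", ["elephant", "whale"], ["ant", "pebble"], emb) == "big"
```

Distance from the probe to the closer composite can also stand in as a response-time predictor, in the spirit of the paper's account.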
Affiliation(s)
- Evgeniia Diachek
- Department of Psychology and Human Development, Peabody College, Vanderbilt University
- Sarah Brown-Schmidt
- Department of Psychology and Human Development, Peabody College, Vanderbilt University
- Sean M Polyn
- Department of Psychology, College of Arts and Sciences, Vanderbilt University
5
Johns BT. Computing the Relativity of Word Meanings through the Construction of Individualized Models of Semantic Memory. Cogn Syst Res 2023. [DOI: 10.1016/j.cogsys.2023.02.009]
6
Taler V, Johns B. Using big data to understand bilingual performance in semantic fluency: Findings from the Canadian Longitudinal Study on Aging. PLoS One 2022; 17:e0277660. [PMID: 36441767] [PMCID: PMC9704680] [DOI: 10.1371/journal.pone.0277660]
Abstract
OBJECTIVES: This study aimed to characterize verbal fluency performance in monolinguals and bilinguals using data from the Canadian Longitudinal Study on Aging (CLSA).
METHODS: A large sample of adults aged 45-85 (n = 12,875) completed a one-minute animal fluency task in English. Participants were English-speaking monolinguals (n = 9,759), bilinguals who spoke English as their first language (L1 bilinguals, n = 1,836), and bilinguals who spoke English as their second language (L2 bilinguals, n = 1,280). Using a distributional modeling approach to quantify the semantic similarity of words, we examined the impact of word frequency and pairwise semantic similarity on performance on this task.
RESULTS: Overall, L1 bilinguals outperformed monolinguals on the verbal fluency task: they produced more items, and these items were of lower average frequency and semantic similarity. Monolinguals in turn outperformed L2 bilinguals on these measures. The results held across age groups, education levels, and income levels.
DISCUSSION: These results demonstrate an advantage for bilinguals over monolinguals on a category fluency task performed in the first language, indicating that, at least in the CLSA sample, bilinguals have superior semantic search capabilities in their first language compared to monolingual speakers of that language.
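The fluency measures used above (item count, mean frequency, mean pairwise semantic similarity of a participant's list) can be sketched as follows, with hypothetical embeddings and frequencies standing in for the distributional model:

```python
import math
from itertools import combinations

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v)))

def fluency_scores(items, emb, freq):
    """Item count, mean log10 word frequency, and mean pairwise
    semantic similarity of one participant's fluency list."""
    n = len(items)
    mean_freq = sum(math.log10(freq[w]) for w in items) / n
    sims = [cosine(emb[a], emb[b]) for a, b in combinations(items, 2)]
    return n, mean_freq, sum(sims) / len(sims)

# Toy embeddings and corpus frequencies (hypothetical values).
emb = {"dog": [1.0, 0.9, 0.1], "cat": [0.9, 1.0, 0.1], "horse": [0.8, 0.9, 0.2],
       "lynx": [0.4, 0.5, 0.8], "newt": [0.1, 0.2, 1.0]}
freq = {"dog": 1_000_000, "cat": 900_000, "horse": 500_000,
        "lynx": 2_000, "newt": 1_000}

_, f_typical, s_typical = fluency_scores(["dog", "cat", "horse"], emb, freq)
_, f_diverse, s_diverse = fluency_scores(["dog", "lynx", "newt"], emb, freq)
# The stronger list in the study's sense: rarer, less tightly clustered items.
assert f_diverse < f_typical and s_diverse < s_typical
```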
Affiliation(s)
- Vanessa Taler
- School of Psychology, University of Ottawa, Ottawa, Canada; Bruyère Research Institute, Ottawa, Canada
- Brendan Johns
- Department of Psychology, McGill University, Montreal, Canada
7
Stevenson S, Merlo P. Beyond the Benchmarks: Toward Human-Like Lexical Representations. Front Artif Intell 2022; 5:796741. [PMID: 35685444] [PMCID: PMC9170951] [DOI: 10.3389/frai.2022.796741]
Abstract
To process language in a way that is compatible with human expectations in a communicative interaction, we need computational representations of lexical properties that form the basis of human knowledge of words. In this article, we concentrate on word-level semantics. We discuss key concepts and issues that underlie the scientific understanding of the human lexicon: its richly structured semantic representations, their ready and continual adaptability, and their grounding in crosslinguistically valid conceptualization. We assess the state of the art in natural language processing (NLP) in achieving these identified properties, and suggest ways in which the language sciences can inspire new approaches to their computational instantiation.
Affiliation(s)
- Suzanne Stevenson
- Department of Computer Science, University of Toronto, Toronto, ON, Canada
- Paola Merlo
- Linguistics Department, University of Geneva, Geneva, Switzerland
8
Günther F, Marelli M. Patterns in CAOSS: Distributed representations predict variation in relational interpretations for familiar and novel compound words. Cogn Psychol 2022; 134:101471. [PMID: 35339747] [DOI: 10.1016/j.cogpsych.2022.101471]
Abstract
While distributional semantic models that represent word meanings as high-dimensional vectors induced from large text corpora have been shown to successfully predict human behavior across a wide range of tasks, they have also received criticism from different directions. These include concerns over their interpretability (how can numbers specifying abstract, latent dimensions represent meaning?) and their ability to capture variation in meaning (how can a single vector representation capture multiple different interpretations for the same expression?). Here, we demonstrate that semantic vectors can indeed rise to these challenges, by training a mapping system (a simple linear regression) that predicts inter-individual variation in relational interpretations for compounds such as wood brush (for example brush FOR wood, or brush MADE OF wood) from (compositional) semantic vectors representing the meanings of these compounds. These predictions consistently beat different random baselines, both for familiar compounds (moon light, Experiment 1) as well as novel compounds (wood brush, Experiment 2), demonstrating that distributional semantic vectors encode variations in qualitative interpretations that can be decoded using techniques as simple as linear regression.
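A hedged sketch of the mapping idea: fit a linear regression from (composed) compound vectors to relation-preference ratings. The vectors and ratings below are invented, and plain gradient-descent least squares stands in for the paper's setup:

```python
def fit_linear(X, y, lr=0.05, epochs=2000):
    """Least-squares fit of y ~ w.x + b by per-sample gradient descent."""
    w, b = [0.0] * len(X[0]), 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            err = sum(wj * xj for wj, xj in zip(w, xi)) + b - yi
            w = [wj - lr * err * xj for wj, xj in zip(w, xi)]
            b -= lr * err
    return w, b

def predict(w, b, x):
    return sum(wj * xj for wj, xj in zip(w, x)) + b

# Invented compound vectors (e.g., concatenated modifier and head embeddings)
# and invented mean preference ratings for a MADE-OF interpretation.
X = [[0.9, 0.1, 0.8, 0.2],   # a "wood brush"-like compound: MADE OF plausible
     [0.8, 0.2, 0.7, 0.3],
     [0.1, 0.9, 0.2, 0.8],   # a "moon light"-like compound: MADE OF implausible
     [0.2, 0.8, 0.1, 0.9]]
y = [0.9, 0.8, 0.2, 0.1]
w, b = fit_linear(X, y)
# A held-out MADE-OF-like vector receives a higher predicted rating.
assert predict(w, b, [0.85, 0.15, 0.75, 0.25]) > predict(w, b, [0.15, 0.85, 0.15, 0.85])
```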
Affiliation(s)
- Marco Marelli
- University of Milano-Bicocca, Milan, Italy; NeuroMI, Milan Center for Neuroscience, Milan, Italy
9
Jacobs AM, Kinder A. Computational Models of Readers' Apperceptive Mass. Front Artif Intell 2022; 5:718690. [PMID: 35280232] [PMCID: PMC8905622] [DOI: 10.3389/frai.2022.718690]
Abstract
Recent progress in machine-learning-based distributed semantic models (DSMs) offers new ways to simulate the apperceptive mass (AM; Kintsch, 1980) of reader groups or individual readers and to predict their performance in reading-related tasks. The AM integrates the mental lexicon with world knowledge, acquired, for example, via reading books. Following pioneering work by Denhière and Lemaire (2004), here we computed DSMs based on a representative corpus of German children and youth literature (Jacobs et al., 2020) as null models of the part of the AM that represents distributional semantic input, for readers of different reading ages (grades 1-2, 3-4, and 5-6). After a series of DSM quality tests, we evaluated the performance of these models quantitatively in various tasks to simulate the different reader groups' hypothetical semantic and syntactic skills. In a final study, we compared the models' performance with that of adult and child readers in two rating tasks. Overall, the results show that performance in practically all tasks improves with increasing reading age. The approach taken in these studies reveals the limits of DSMs for simulating human AM, as well as their potential for applications in scientific studies of literature, research in education, or developmental science.
Affiliation(s)
- Arthur M. Jacobs
- Experimental and Neurocognitive Psychology Group, Department of Educational Science and Psychology, Freie Universität Berlin, Berlin, Germany
- Center for Cognitive Neuroscience Berlin (CCNB), Freie Universität Berlin, Berlin, Germany
- Annette Kinder
- Learning Psychology Group, Department of Educational Science and Psychology, Freie Universität Berlin, Berlin, Germany
10
Iordan MC, Giallanza T, Ellis CT, Beckage NM, Cohen JD. Context Matters: Recovering Human Semantic Structure from Machine Learning Analysis of Large-Scale Text Corpora. Cogn Sci 2022; 46:e13085. [PMID: 35146779] [PMCID: PMC9285590] [DOI: 10.1111/cogs.13085]
Abstract
Applying machine learning algorithms to automatically infer relationships between concepts from large-scale collections of documents presents a unique opportunity to investigate at scale how human semantic knowledge is organized, how people use it to make fundamental judgments ("How similar are cats and bears?"), and how these judgments depend on the features that describe concepts (e.g., size, furriness). However, efforts to date have exhibited a substantial discrepancy between algorithm predictions and human empirical judgments. Here, we introduce a novel approach to generating embeddings for this purpose motivated by the idea that semantic context plays a critical role in human judgment. We leverage this idea by constraining the topic or domain from which documents used for generating embeddings are drawn (e.g., referring to the natural world vs. transportation apparatus). Specifically, we trained state-of-the-art machine learning algorithms using contextually constrained text corpora (domain-specific subsets of Wikipedia articles, 50+ million words each) and showed that this procedure greatly improved predictions of empirical similarity judgments and feature ratings of contextually relevant concepts. Furthermore, we describe a novel, computationally tractable method for improving predictions of contextually unconstrained embedding models based on dimensionality reduction of their internal representation to a small number of contextually relevant semantic features. By improving the correspondence between predictions derived automatically by machine learning methods using vast amounts of data and more limited, but direct, empirical measurements of human judgments, our approach may help leverage the availability of online corpora to better understand the structure of human semantic representations and how people make judgments based on them.
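The effect of contextual constraint can be illustrated with simple count vectors: built from a domain-restricted subset, an ambiguous word like "jaguar" aligns with contextually relevant concepts, while adding out-of-domain documents dilutes that alignment (toy sentences and helper names, not the Wikipedia corpora or embedding models used in the study):

```python
import math
from collections import Counter

def cooc_vectors(sentences, targets, window=2):
    """Count co-occurrences of each target word within a +/-window."""
    vecs = {t: Counter() for t in targets}
    for sent in sentences:
        toks = sent.split()
        for i, tok in enumerate(toks):
            if tok in vecs:
                lo, hi = max(0, i - window), min(len(toks), i + window + 1)
                for j in range(lo, hi):
                    if j != i:
                        vecs[tok][toks[j]] += 1
    return vecs

def cosine(u, v):
    dot = sum(u[k] * v[k] for k in set(u) & set(v))
    nu = math.sqrt(sum(x * x for x in u.values()))
    nv = math.sqrt(sum(x * x for x in v.values()))
    return dot / (nu * nv) if nu and nv else 0.0

nature = ["the jaguar hunts in the forest",
          "the bear hunts in the forest",
          "a jaguar sleeps in the dense forest"]
auto = ["the jaguar car sped on the track",
        "the jaguar engine roared on the track"]

targets = {"jaguar", "bear"}
v_nature = cooc_vectors(nature, targets)      # contextually constrained
v_all = cooc_vectors(nature + auto, targets)  # unconstrained
sim_nature = cosine(v_nature["jaguar"], v_nature["bear"])
sim_all = cosine(v_all["jaguar"], v_all["bear"])
assert sim_nature > sim_all  # constraint sharpens the relevant similarity
```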
Affiliation(s)
- Tyler Giallanza
- Princeton Neuroscience Institute & Department of Psychology, Princeton University
- Jonathan D Cohen
- Princeton Neuroscience Institute & Department of Psychology, Princeton University
11
12
Distributional social semantics: Inferring word meanings from communication patterns. Cogn Psychol 2021; 131:101441. [PMID: 34666227] [DOI: 10.1016/j.cogpsych.2021.101441]
Abstract
Distributional models of lexical semantics have proven to be powerful accounts of how word meanings are acquired from the natural language environment (Günther, Rinaldi, & Marelli, 2019; Kumar, 2020). Standard models of this type acquire the meaning of words through the learning of word co-occurrence statistics across large corpora. However, these models ignore social and communicative aspects of language processing, which are considered central to usage-based and adaptive theories of language (Tomasello, 2003; Beckner et al., 2009). Johns (2021) recently demonstrated that integrating social and communicative information into a lexical strength measure allowed for benchmark fits to be attained for lexical organization data, indicating that the social world contains important statistical information for language learning and processing. Through the analysis of the communication patterns of over 330,000 individuals on the online forum Reddit, totaling approximately 55 billion words of text, the current article demonstrates that social information about word usage allows unique aspects of a word's meaning to be acquired, providing a new pathway for distributional model development.
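A minimal sketch of the social-distributional idea: represent each word by the distribution of users who produce it, so words used by the same people acquire similar vectors. The posts and helper names below are hypothetical:

```python
import math
from collections import Counter

def user_vectors(posts):
    """Represent each word by counts of which users produced it
    (a word-by-user matrix, stored as dicts)."""
    vecs = {}
    for user, text in posts:
        for word in text.split():
            vecs.setdefault(word, Counter())[user] += 1
    return vecs

def cosine(u, v):
    dot = sum(u[k] * v[k] for k in set(u) & set(v))
    nu = math.sqrt(sum(x * x for x in u.values()))
    nv = math.sqrt(sum(x * x for x in v.values()))
    return dot / (nu * nv) if nu and nv else 0.0

# Hypothetical posts: two gaming users and two cooking users.
posts = [("gamer1", "the raid boss fight was hard"),
         ("gamer2", "that raid boss dropped rare loot"),
         ("cook1", "simmer the broth for two hours"),
         ("cook2", "strain the broth then simmer again")]
vecs = user_vectors(posts)
# Words produced by the same users acquire similar social vectors.
assert cosine(vecs["raid"], vecs["boss"]) > cosine(vecs["raid"], vecs["broth"])
```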
13
Hills TT, Kenett YN. Is the Mind a Network? Maps, Vehicles, and Skyhooks in Cognitive Network Science. Top Cogn Sci 2021; 14:189-208. [PMID: 34435461] [DOI: 10.1111/tops.12570]
Abstract
Cognitive researchers often carve cognition up into structures and processes. Cognitive processes operate on structures, like vehicles driving over a map. Language alongside semantic and episodic memory are proposed to have structure, as are perceptual systems. Over these structures, processes operate to construct memory and solve problems by retrieving and manipulating information. Network science offers an approach to representing cognitive structures and has made tremendous inroads into understanding the nature of cognitive structure and process. But is the mind a network? If so, what kind? In this article, we briefly review the main metaphors, assumptions, and pitfalls prevalent in cognitive network science (maps and vehicles; one network/process to rule them all), highlight the need for new metaphors that elaborate on the map-and-vehicle framework (wormholes, skyhooks, and generators), and present open questions in studying the mind as a network (the challenge of capturing network change, what the edges of cognitive networks should be made of, and aggregated vs. individual-based networks). One critical lesson of this exercise is that the richness of the mind-as-network approach makes it a powerful tool in its own right: it has helped make our assumptions more visible, generated new and fascinating questions, and enriched the prospects for future research. A second lesson is that the mind as a network, though useful, is incomplete. The mind is not a network, but it may contain them.
Affiliation(s)
- Yoed N Kenett
- Faculty of Industrial Engineering and Management, Technion - Israel Institute of Technology
14
Zhang Y, Ridchenko M, Hayashi A, Hamrick P. Episodic memory contributions to second language lexical development persist at higher proficiencies. Appl Cogn Psychol 2021. [DOI: 10.1002/acp.3865]
Affiliation(s)
- Yin Zhang
- Language and Cognition Research Laboratory, Kent State University, Kent, Ohio, USA
- Maryna Ridchenko
- Language and Cognition Research Laboratory, Kent State University, Kent, Ohio, USA
- Aimi Hayashi
- Language and Cognition Research Laboratory, Kent State University, Kent, Ohio, USA
- Phillip Hamrick
- Language and Cognition Research Laboratory, Kent State University, Kent, Ohio, USA
15
Abstract
Adult semantic memory has been traditionally conceptualized as a relatively static memory system that consists of knowledge about the world, concepts, and symbols. Considerable work in the past few decades has challenged this static view of semantic memory, and instead proposed a more fluid and flexible system that is sensitive to context, task demands, and perceptual and sensorimotor information from the environment. This paper (1) reviews traditional and modern computational models of semantic memory, within the umbrella of network (free association-based), feature (property generation norms-based), and distributional semantic (natural language corpora-based) models, (2) discusses the contribution of these models to important debates in the literature regarding knowledge representation (localist vs. distributed representations) and learning (error-free/Hebbian learning vs. error-driven/predictive learning), and (3) evaluates how modern computational models (neural network, retrieval-based, and topic models) are revisiting the traditional "static" conceptualization of semantic memory and tackling important challenges in semantic modeling such as addressing temporal, contextual, and attentional influences, as well as incorporating grounding and compositionality into semantic representations. The review also identifies new challenges regarding the abundance and availability of data, the generalization of semantic models to other languages, and the role of social interaction and collaboration in language learning and development. The concluding section advocates the need for integrating representational accounts of semantic memory with process-based accounts of cognitive behavior, as well as the need for explicit comparisons of computational models to human baselines in semantic tasks to adequately assess their psychological plausibility as models of human semantic memory.
16
Beekhuizen B, Armstrong BC, Stevenson S. Probing Lexical Ambiguity: Word Vectors Encode Number and Relatedness of Senses. Cogn Sci 2021; 45:e12943. [PMID: 34018227] [DOI: 10.1111/cogs.12943]
Abstract
Lexical ambiguity-the phenomenon of a single word having multiple, distinguishable senses-is pervasive in language. Both the degree of ambiguity of a word (roughly, its number of senses) and the relatedness of those senses have been found to have widespread effects on language acquisition and processing. Recently, distributional approaches to semantics, in which a word's meaning is determined by its contexts, have led to successful research quantifying the degree of ambiguity, but these measures have not distinguished between the ambiguity of words with multiple related senses versus multiple unrelated meanings. In this work, we present the first assessment of whether distributional meaning representations can capture the ambiguity structure of a word, including both the number and relatedness of senses. On a very large sample of English words, we find that some, but not all, distributional semantic representations that we test exhibit detectable differences between sets of monosemes (unambiguous words; N = 964), polysemes (with multiple related senses; N = 4,096), and homonyms (with multiple unrelated senses; N = 355). Our findings begin to answer open questions from earlier work regarding whether distributional semantic representations of words, which successfully capture various semantic relationships, also reflect fine-grained aspects of meaning structure that influence human behavior. Our findings emphasize the importance of measuring whether proposed lexical representations capture such distinctions: In addition to standard benchmarks that test the similarity structure of distributional semantic models, we need to also consider whether they have cognitively plausible ambiguity structure.
Affiliation(s)
- Blair C Armstrong
- Department of Psychology and Department of Language Studies, University of Toronto, Scarborough; Basque Center on Cognition, Brain, & Language
17
Li J, Joanisse MF. Word Senses as Clusters of Meaning Modulations: A Computational Model of Polysemy. Cogn Sci 2021; 45:e12955. [DOI: 10.1111/cogs.12955]
Affiliation(s)
- Jiangtian Li
- Department of Psychology, University of Toronto, Scarborough
- Marc F. Joanisse
- Department of Psychology, The University of Western Ontario
- Brain and Mind Institute, The University of Western Ontario
18
Abstract
Previous research has speculated that semantic diversity and lexical ambiguity may be closely related constructs. Our research sought to test this claim with respect to the semantic diversity measure proposed by Hoffman et al. (2013). To this end, we replicated the procedure described by Hoffman et al., Behavior Research Methods, 45(3), 718-730 (2013), for computing multidimensional representations of contextual information using Latent Semantic Analysis, and from these we derived semantic diversity values for 28,555 words. We then replicated the facilitatory effect of semantic diversity on word recognition using existing data resources and observed this effect to be greater for low-frequency words. Yet, we found no relationship between this measure and lexical ambiguity effects in word recognition. Further analysis of the LSA-based contextual representations used to compute Hoffman et al.'s (2013) measure of semantic diversity revealed that they do not capture the distinct meanings of ambiguous words. Instead, these contextual representations appear to capture general information about the topics and types of written material in which words occur. These analyses suggest that the semantic diversity metric proposed by Hoffman et al. (2013) facilitates word recognition because high-diversity words are likely to have been encountered no matter what one has read, whereas many participants may simply not have encountered lower-diversity words because the topics and types of written material in which they occur are more restricted.
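Semantic diversity in the sense discussed above can be sketched as the mean dissimilarity among the contexts containing a word. The toy below uses raw count vectors in place of the LSA representations used by Hoffman et al., so it is an illustration of the construct, not a replication:

```python
import math
from collections import Counter
from itertools import combinations

def cosine(u, v):
    dot = sum(u[k] * v[k] for k in set(u) & set(v))
    nu = math.sqrt(sum(x * x for x in u.values()))
    nv = math.sqrt(sum(x * x for x in v.values()))
    return dot / (nu * nv) if nu and nv else 0.0

def semantic_diversity(word, contexts):
    """Mean pairwise dissimilarity of the contexts containing `word`.
    (Hoffman et al. use LSA vectors over 1,000-word passages; raw count
    vectors keep this toy self-contained.)"""
    vecs = [Counter(c) for c in contexts if word in c]
    sims = [cosine(a, b) for a, b in combinations(vecs, 2)]
    return 1 - sum(sims) / len(sims)

contexts = [
    "the game time ran out before the final whistle".split(),
    "cooking time matters when the oven is hot".split(),
    "time flies during a long train journey".split(),
    "the oven was hot so the cooking went fast".split(),
    "a hot oven makes cooking easy".split(),
]
# "time" appears in varied contexts; "oven" in homogeneous ones.
assert semantic_diversity("time", contexts) > semantic_diversity("oven", contexts)
```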
Affiliation(s)
- Benedetta Cevoli
- Department of Psychology, Royal Holloway, University of London, Egham, TW20 0EX, UK
- Chris Watkins
- Computer Science Department, Royal Holloway, University of London, Egham, TW20 0EX, UK
- Kathleen Rastle
- Department of Psychology, Royal Holloway, University of London, Egham, TW20 0EX, UK
19
Kelly MA, Arora N, West RL, Reitter D. Holographic Declarative Memory: Distributional Semantics as the Architecture of Memory. Cogn Sci 2020; 44:e12904. [PMID: 33140517] [DOI: 10.1111/cogs.12904]
Abstract
We demonstrate that the key components of cognitive architectures (declarative and procedural memory) and their key capabilities (learning, memory retrieval, probability judgment, and utility estimation) can be implemented as algebraic operations on vectors and tensors in a high-dimensional space using a distributional semantics model. High-dimensional vector spaces underlie the success of modern machine learning techniques based on deep learning. However, while neural networks have an impressive ability to process data to find patterns, they do not typically model high-level cognition, and it is often unclear how they work. Symbolic cognitive architectures can capture the complexities of high-level cognition and provide human-readable, explainable models, but scale poorly to naturalistic, non-symbolic, or big data. Vector-symbolic architectures, where symbols are represented as vectors, bridge the gap between the two approaches. We posit that cognitive architectures, if implemented in a vector-space model, represent a useful, explanatory model of the internal representations of otherwise opaque neural architectures. Our proposed model, Holographic Declarative Memory (HDM), is a vector-space model based on distributional semantics. HDM accounts for primacy and recency effects in free recall, the fan effect in recognition, probability judgments, and human performance on an iterated decision task. HDM provides a flexible, scalable alternative to symbolic cognitive architectures at a level of description that bridges symbolic, quantum, and neural models of cognition.
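HDM stores associations as algebraic operations on high-dimensional vectors. In the holographic tradition this abstract relies on, a standard binding operator is circular convolution, with an involution serving as the approximate inverse for retrieval; the sketch below illustrates that operator pair (HDM's exact operators and parameters may differ):

```python
import math
import random

def circular_convolution(a, b):
    """Bind two vectors: c[k] = sum_i a[i] * b[(k - i) mod n]."""
    n = len(a)
    return [sum(a[i] * b[(k - i) % n] for i in range(n)) for k in range(n)]

def involution(a):
    """Approximate inverse used for unbinding: a*[i] = a[-i mod n]."""
    n = len(a)
    return [a[-i % n] for i in range(n)]

def cosine(u, v):
    dot = sum(x * y for x, y in zip(u, v))
    return dot / (math.sqrt(sum(x * x for x in u)) * math.sqrt(sum(y * y for y in v)))

random.seed(0)
n = 512
rand_vec = lambda: [random.gauss(0.0, 1.0 / math.sqrt(n)) for _ in range(n)]
role, filler, other = rand_vec(), rand_vec(), rand_vec()

trace = circular_convolution(role, filler)                  # store the pair
retrieved = circular_convolution(involution(role), trace)   # unbind with role
assert cosine(retrieved, filler) > 0.5      # a noisy but recognizable copy
assert abs(cosine(retrieved, other)) < 0.3  # unrelated vector stays dissimilar
```

The retrieved vector is a degraded copy of the filler, which is why such models clean up retrievals against a lexicon of known item vectors.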
Collapse
Affiliation(s)
- Mary Alexandria Kelly
- Department of Computer Science, Bucknell University
- College of Information Sciences and Computing, The Pennsylvania State University
| | - Nipun Arora
- Department of Cognitive Science, Carleton University
| | - Robert L West
- Department of Cognitive Science, Carleton University
| | - David Reitter
- College of Information Sciences and Computing, The Pennsylvania State University
- Google Research
| |
Collapse
|
20
|
Nosofsky RM, Sanders CA, Meagher BJ, Douglas BJ. Search for the Missing Dimensions: Building a Feature-Space Representation for a Natural-Science Category Domain. Comput Brain Behav 2019. [DOI: 10.1007/s42113-019-00033-2] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.8] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/24/2022]
|
21
|
Johns BT, Mewhort DJK, Jones MN. The Role of Negative Information in Distributional Semantic Learning. Cogn Sci 2019; 43:e12730. [PMID: 31087587 DOI: 10.1111/cogs.12730] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.2] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/14/2018] [Revised: 01/18/2019] [Accepted: 03/25/2019] [Indexed: 11/29/2022]
Abstract
Distributional models of semantics learn word meanings from contextual co-occurrence patterns across a large sample of natural language. Early models, such as LSA and HAL (Landauer & Dumais, 1997; Lund & Burgess, 1996), counted co-occurrence events; later models, such as BEAGLE (Jones & Mewhort, 2007), replaced counting co-occurrences with vector accumulation. All of these models learned from positive information only: Words that occur together within a context become related to each other. A recent class of distributional models, referred to as neural embedding models, are based on a prediction process embedded in the functioning of a neural network: Such models predict words that should surround a target word in a given context (e.g., word2vec; Mikolov, Sutskever, Chen, Corrado, & Dean, 2013). An error signal derived from the prediction is used to update each word's representation via backpropagation. However, another key difference in predictive models is their use of negative information in addition to positive information to develop a semantic representation. The models use negative examples to predict words that should not surround a word in a given context. As before, an error signal derived from the prediction prompts an update of the word's representation, a procedure referred to as negative sampling. Standard uses of word2vec recommend a greater or equal ratio of negative to positive sampling. The use of negative information in developing a representation of semantic information is often thought to be intimately associated with word2vec's prediction process. We assess the role of negative information in developing a semantic representation and show that its power does not reflect the use of a prediction mechanism. Finally, we show how negative information can be efficiently integrated into classic count-based semantic models using parameter-free analytical transformations.
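The negative-sampling procedure described above can be sketched as a single skip-gram-style gradient step: the observed (target, context) pair is pushed toward a higher score while sampled negative pairs are pushed lower. This is a minimal numpy sketch under assumed toy data (the vocabulary, dimensionality, and learning rate are illustrative), not the paper's models or the word2vec reference implementation.

```python
import numpy as np

rng = np.random.default_rng(1)
vocab = ["dog", "cat", "car", "tree", "run"]
d = 50
idx = {w: i for i, w in enumerate(vocab)}
# separate "input" (target) and "output" (context) embeddings, as in word2vec
W_in = rng.normal(0, 0.1, (len(vocab), d))
W_out = rng.normal(0, 0.1, (len(vocab), d))

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def sgns_step(target, context, negatives, lr=0.1):
    # one skip-gram-with-negative-sampling update:
    # raise the score of the observed pair (label 1) and
    # lower the scores of the sampled negative pairs (label 0)
    t = idx[target]
    grad_t = np.zeros(d)
    for word, label in [(context, 1.0)] + [(n, 0.0) for n in negatives]:
        j = idx[word]
        err = sigmoid(W_in[t] @ W_out[j]) - label  # prediction error
        grad_t += err * W_out[j]
        W_out[j] -= lr * err * W_in[t]
    W_in[t] -= lr * grad_t

def score(a, b):
    return float(sigmoid(W_in[idx[a]] @ W_out[idx[b]]))

before = score("dog", "run")
for _ in range(50):
    # "run" is observed near "dog"; "car" and "tree" are negative samples
    sgns_step("dog", "run", negatives=["car", "tree"])
after = score("dog", "run")
print(after > before)  # training raises the positive pair's score: True
```

The abstract's point is that this negative-information term, not the prediction mechanism per se, carries much of the representational benefit, and that an equivalent effect can be obtained analytically in count-based models.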
Collapse
Affiliation(s)
- Brendan T Johns
- Department of Communicative Disorders and Sciences, University at Buffalo
| | | | - Michael N Jones
- Department of Psychological and Brain Sciences, Indiana University
| |
Collapse
|
23
|
Johns BT. Mining a Crowdsourced Dictionary to Understand Consistency and Preference in Word Meanings. Front Psychol 2019; 10:268. [PMID: 30833917 PMCID: PMC6387934 DOI: 10.3389/fpsyg.2019.00268] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.2] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/09/2018] [Accepted: 01/28/2019] [Indexed: 12/02/2022] Open
Abstract
Big data approaches to psychology have become increasingly popular (Jones, 2017). Two of the main developments in this line of research are the advent of distributional models of semantics (e.g., Landauer and Dumais, 1997), which learn the meaning of words from large text corpora, and the collection of mega datasets of human behavior (e.g., the English Lexicon Project; Balota et al., 2007). The current article combines these two approaches, with the goal of understanding the consistency and preference that people have for word meanings. This was accomplished by mining a large amount of data from an online, crowdsourced dictionary and analyzing these data with a distributional model. Overall, it was found that even for words that are not an active part of the language environment, there is a large amount of consistency in the word meanings that different people have. Additionally, it was demonstrated that users of a language have strong preferences for word meanings, such that definitions of words that do not conform to people's conceptions are rejected by a community of language users. The results of this article provide insights into the cultural evolution of word meanings and shed light on alternative methodologies that can be used to understand lexical behavior.
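One way to operationalize the consistency of crowdsourced definitions, in the spirit of the analysis described above, is to represent each contributed definition as a vector and take the mean pairwise cosine similarity across contributors. The sketch below uses toy definitions and simple bag-of-words vectors purely for illustration; the paper's corpus and distributional model differ.

```python
import numpy as np
from itertools import combinations

# hypothetical definitions of one word contributed by different users
definitions = [
    "a domesticated animal kept as a pet",
    "an animal that people keep at home",
    "a tamed animal living with people",
]

# build a shared vocabulary and simple bag-of-words vectors
vocab = sorted({w for text in definitions for w in text.split()})

def vectorize(text):
    v = np.zeros(len(vocab))
    for w in text.split():
        v[vocab.index(w)] += 1
    return v

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

vecs = [vectorize(text) for text in definitions]

# consistency = mean pairwise cosine across all contributed definitions
consistency = np.mean([cosine(a, b) for a, b in combinations(vecs, 2)])
print(round(float(consistency), 2))
```

A high mean pairwise similarity would indicate that different contributors converge on the same meaning; a definition with low similarity to the rest is the kind of non-conforming entry the abstract describes a community rejecting.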
Collapse
Affiliation(s)
- Brendan T Johns
- Department of Communicative Disorders and Sciences, University at Buffalo, Buffalo, NY, United States
| |
Collapse
|