1
Marjieh R, Sucholutsky I, van Rijn P, Jacoby N, Griffiths TL. Large language models predict human sensory judgments across six modalities. Sci Rep 2024;14:21445. PMID: 39271909; PMCID: PMC11399123; DOI: 10.1038/s41598-024-72071-1.
Abstract
Determining the extent to which the perceptual world can be recovered from language is a longstanding problem in philosophy and cognitive science. We show that state-of-the-art large language models can unlock new insights into this problem by providing a lower bound on the amount of perceptual information that can be extracted from language. Specifically, we elicit pairwise similarity judgments from GPT models across six psychophysical datasets. We show that the judgments are significantly correlated with human data across all domains, recovering well-known representations like the color wheel and pitch spiral. Surprisingly, we find that co-training a model (GPT-4) on vision and language does not necessarily yield improvements specific to the visual modality: its predictions are highly correlated with human data irrespective of whether it receives direct visual input or purely textual descriptors. To study the impact of specific languages, we also apply the models to a multilingual color-naming task. We find that GPT-4 replicates cross-linguistic variation in English and Russian, illuminating the interaction of language and perception.
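For readers who want the shape of the method, a minimal sketch of the elicitation-and-correlation pipeline follows. It is an illustration, not the authors' code: the prompt wording, the 0-1 rating scale, and the use of the OpenAI chat API with Spearman correlation are all assumptions, and human_ratings stands in for one of the psychophysical datasets.

    # Hypothetical sketch: elicit pairwise similarity ratings from an LLM and
    # correlate them with human psychophysical judgments. Prompt wording and
    # rating scale are assumptions, not the authors' protocol.
    from itertools import combinations
    from openai import OpenAI
    from scipy.stats import spearmanr

    client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

    def llm_similarity(a: str, b: str, model: str = "gpt-4") -> float:
        """Ask the model for a 0-1 similarity rating between two stimuli."""
        prompt = (f"On a scale from 0 (completely dissimilar) to 1 (identical), "
                  f"how similar are '{a}' and '{b}'? Reply with a number only.")
        reply = client.chat.completions.create(
            model=model,
            messages=[{"role": "user", "content": prompt}],
            temperature=0,
        )
        return float(reply.choices[0].message.content.strip())

    def correlate_with_humans(stimuli, human_ratings):
        """Spearman correlation between model and human pairwise judgments.

        human_ratings maps each stimulus pair (a, b) to the mean human
        similarity score from a psychophysical dataset (not bundled here).
        """
        pairs = list(combinations(stimuli, 2))
        model_scores = [llm_similarity(a, b) for a, b in pairs]
        human_scores = [human_ratings[p] for p in pairs]
        return spearmanr(model_scores, human_scores)

Recovering a structure like the color wheel would then amount to embedding the model's similarity matrix (e.g., with multidimensional scaling) and inspecting its geometry.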
Affiliation(s)
- Raja Marjieh
- Department of Psychology, Princeton University, Princeton, USA.
- Ilia Sucholutsky
- Department of Computer Science, Princeton University, Princeton, USA
- Pol van Rijn
- Max Planck Institute for Empirical Aesthetics, Frankfurt am Main, Germany
- Nori Jacoby
- Max Planck Institute for Empirical Aesthetics, Frankfurt am Main, Germany
- Department of Psychology, Cornell University, Ithaca, USA
- Thomas L Griffiths
- Department of Psychology, Princeton University, Princeton, USA
- Department of Computer Science, Princeton University, Princeton, USA
2
Reilly J, Shain C, Borghesani V, Kuhnke P, Vigliocco G, Peelle JE, Mahon BZ, Buxbaum LJ, Majid A, Brysbaert M, Borghi AM, De Deyne S, Dove G, Papeo L, Pexman PM, Poeppel D, Lupyan G, Boggio P, Hickok G, Gwilliams L, Fernandino L, Mirman D, Chrysikou EG, Sandberg CW, Crutch SJ, Pylkkänen L, Yee E, Jackson RL, Rodd JM, Bedny M, Connell L, Kiefer M, Kemmerer D, de Zubicaray G, Jefferies E, Lynott D, Siew CSQ, Desai RH, McRae K, Diaz MT, Bolognesi M, Fedorenko E, Kiran S, Montefinese M, Binder JR, Yap MJ, Hartwigsen G, Cantlon J, Bi Y, Hoffman P, Garcea FE, Vinson D. What we mean when we say semantic: Toward a multidisciplinary semantic glossary. Psychon Bull Rev 2024. PMID: 39231896; DOI: 10.3758/s13423-024-02556-7.
Abstract
Tulving characterized semantic memory as a vast repository of meaning that underlies language and many other cognitive processes. This perspective on lexical and conceptual knowledge galvanized a new era of research undertaken by numerous fields, each with its own idiosyncratic methods and terminology. For example, "concept" has different meanings in philosophy, linguistics, and psychology. As such, many fundamental constructs used to delineate semantic theories remain underspecified and/or opaque. Weak construct specificity is among the leading causes of the replication crisis now facing psychology and related fields. Term ambiguity hinders cross-disciplinary communication, falsifiability, and incremental theory-building. Numerous cognitive subdisciplines (e.g., vision, affective neuroscience) have recently addressed these limitations by developing consensus-based guidelines and definitions. The project that follows represents our effort to produce a multidisciplinary semantic glossary consisting of succinct definitions, background, principled dissenting views, ratings of agreement, and subjective confidence for 17 target constructs (e.g., abstractness, abstraction, concreteness, concept, embodied cognition, event semantics, lexical-semantic, modality, representation, semantic control, semantic feature, simulation, semantic distance, semantic dimension). We discuss potential benefits and pitfalls (e.g., implicit bias, prescriptiveness) of these efforts to specify a common nomenclature that other researchers might index in specifying their own theoretical perspectives (e.g., "They said X, but I mean Y").
Affiliation(s)
- Cory Shain
- Massachusetts Institute of Technology, Cambridge, MA, USA
- Philipp Kuhnke
- Max Planck Institute for Human Cognitive and Brain Sciences, Leipzig, Germany
- Leipzig University, Leipzig, Germany
- Laurel J Buxbaum
- Thomas Jefferson University, Moss Rehabilitation Research Institute, Elkins Park, PA, USA
- Guy Dove
- University of Louisville, Louisville, KY, USA
- Liuba Papeo
- Centre National de La Recherche Scientifique (CNRS), University Claude-Bernard Lyon, Lyon, France
- Paulo Boggio
- Universidade Presbiteriana Mackenzie, São Paulo, Brazil
- Eiling Yee
- University of Connecticut, Storrs, CT, USA
- Ken McRae
- Western University, London, ON, Canada
- Melvin J Yap
- Max Planck Institute for Human Cognitive and Brain Sciences, Leipzig, Germany
- National University of Singapore, Singapore, Singapore
- Gesa Hartwigsen
- Max Planck Institute for Human Cognitive and Brain Sciences, Leipzig, Germany
- Leipzig University, Leipzig, Germany
- Yanchao Bi
- University of Edinburgh, Edinburgh, UK
- Beijing Normal University, Beijing, China
3
Michaelov JA, Bardolph MD, Van Petten CK, Bergen BK, Coulson S. Strong Prediction: Language Model Surprisal Explains Multiple N400 Effects. Neurobiol Lang 2024;5:107-135. PMID: 38645623; PMCID: PMC11025652; DOI: 10.1162/nol_a_00105.
Abstract
Theoretical accounts of the N400 are divided as to whether the amplitude of the N400 response to a stimulus reflects the extent to which the stimulus was predicted, the extent to which the stimulus is semantically similar to its preceding context, or both. We use state-of-the-art machine learning tools to investigate which of these three accounts is best supported by the evidence. GPT-3, a neural language model trained to compute the conditional probability of any word based on the words that precede it, was used to operationalize contextual predictability. In particular, we used an information-theoretic construct known as surprisal (the negative logarithm of the conditional probability). Contextual semantic similarity was operationalized using two high-quality co-occurrence-derived vector-based meaning representations for words: GloVe and fastText. The cosine between the vector representation of the sentence frame and the final word was used to derive contextual cosine similarity estimates. A series of regression models was constructed, in which these variables, along with cloze probability and plausibility ratings, were used to predict single-trial N400 amplitudes recorded from healthy adults as they read sentences whose final word varied in its predictability, plausibility, and semantic relationship to the likeliest sentence completion. Statistical model comparison indicated that GPT-3 surprisal provided the best account of N400 amplitude and suggested that apparently disparate N400 effects of expectancy, plausibility, and contextual semantic similarity can be reduced to variation in the predictability of words. The results are argued to support predictive coding in the human language network.
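The two key predictors are easy to state precisely: the surprisal of a word w given its context is -log P(w | context), and contextual similarity is the cosine between the final word's vector and (one common choice) the average of the context words' vectors. The sketch below computes both. It uses GPT-2 via the Hugging Face transformers library as a freely available stand-in for GPT-3, so the model choice and the averaging of frame vectors are assumptions, not the paper's exact pipeline.

    # Hypothetical sketch of the two predictors: LM surprisal and contextual
    # cosine similarity. GPT-2 stands in for GPT-3; not the authors' code.
    import numpy as np
    import torch
    from transformers import GPT2LMHeadModel, GPT2TokenizerFast

    tok = GPT2TokenizerFast.from_pretrained("gpt2")
    lm = GPT2LMHeadModel.from_pretrained("gpt2").eval()

    def surprisal(context: str, word: str) -> float:
        """Surprisal of `word` given `context`, in bits: -log2 P(word | context)."""
        ids = tok(context + " " + word, return_tensors="pt").input_ids
        # assumes BPE tokenization of the context is a stable prefix
        n_ctx = tok(context, return_tensors="pt").input_ids.shape[1]
        with torch.no_grad():
            logprobs = lm(ids).logits.log_softmax(-1)
        # each position i is predicted by the logits at position i - 1
        logp = sum(logprobs[0, i - 1, ids[0, i]].item()
                   for i in range(n_ctx, ids.shape[1]))
        return -logp / np.log(2)  # natural log -> bits

    def contextual_cosine(frame_vecs: np.ndarray, word_vec: np.ndarray) -> float:
        """Cosine between averaged sentence-frame vectors (e.g., GloVe or
        fastText rows) and the final word's vector."""
        frame = frame_vecs.mean(axis=0)
        return float(frame @ word_vec /
                     (np.linalg.norm(frame) * np.linalg.norm(word_vec)))

    print(surprisal("He spread the warm bread with", "butter"))  # low
    print(surprisal("He spread the warm bread with", "socks"))   # high

On the paper's conclusion, it is the first quantity, not the second, that should carry most of the variance in single-trial N400 amplitude.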
Affiliation(s)
- James A. Michaelov
- Department of Cognitive Science, University of California, San Diego, La Jolla, CA, USA
- Megan D. Bardolph
- Department of Cognitive Science, University of California, San Diego, La Jolla, CA, USA
- Cyma K. Van Petten
- Department of Psychology, Binghamton University, State University of New York, Binghamton, NY, USA
- Benjamin K. Bergen
- Department of Cognitive Science, University of California, San Diego, La Jolla, CA, USA
- Seana Coulson
- Department of Cognitive Science, University of California, San Diego, La Jolla, CA, USA
4
Seidl AH, Indarjit M, Borovsky A. Touch to learn: Multisensory input supports word learning and processing. Dev Sci 2024;27:e13419. PMID: 37291692; PMCID: PMC10704002; DOI: 10.1111/desc.13419.
Abstract
Infants experience language in rich multisensory environments. For example, they may first be exposed to the word applesauce while touching, tasting, smelling, and seeing applesauce. In three experiments using different methods, we asked whether the number of distinct senses linked with the semantic features of objects would impact word recognition and learning. Specifically, in Experiment 1 we asked whether words linked with more multisensory experiences were learned earlier than words linked with fewer multisensory experiences. In Experiment 2, we asked whether 2-year-olds' known words linked with more multisensory experiences were better recognized than those linked with fewer. Finally, in Experiment 3, we taught 2-year-olds labels for novel objects that were linked with either visual experience alone or visual and tactile experience, and asked whether this impacted their ability to learn the new label-to-object mappings. The results converge to support an account in which richer multisensory experiences better support word learning. We discuss two pathways through which rich multisensory experiences might support word learning.
Affiliation(s)
- Amanda H Seidl
- Speech, Language, and Hearing Sciences, Purdue University, West Lafayette, Indiana, USA
- Michelle Indarjit
- Speech, Language, and Hearing Sciences, Purdue University, West Lafayette, Indiana, USA
- Arielle Borovsky
- Speech, Language, and Hearing Sciences, Purdue University, West Lafayette, Indiana, USA
5
Maimon A, Wald IY, Ben Oz M, Codron S, Netzer O, Heimler B, Amedi A. The Topo-Speech sensory substitution system as a method of conveying spatial information to the blind and vision impaired. Front Hum Neurosci 2023;16:1058093. PMID: 36776219; PMCID: PMC9909096; DOI: 10.3389/fnhum.2022.1058093.
Abstract
Humans, like most animals, integrate sensory input in the brain from different sensory modalities. Yet humans are distinct in their ability to grasp symbolic input, which is interpreted into a cognitive mental representation of the world. This representation merges with external sensory input, providing modality integration of a different sort. This study evaluates the Topo-Speech algorithm in blind and visually impaired people. The system provides spatial information about the external world by applying sensory substitution alongside symbolic representations in a manner that corresponds with the unique way our brains acquire and process information. Spatial information customarily acquired through vision is conveyed through the auditory channel, combining sensory (auditory) features with symbolic language (spoken naming) features: Topo-Speech sweeps the visual scene or image, representing each object's identity by speaking its name, while conveying its location by mapping the x-axis of the scene to the time at which the name is announced and the y-axis to the pitch of the voice. This proof-of-concept study primarily explores the practical applicability of this approach in 22 visually impaired and blind individuals. The findings showed that individuals from both populations could effectively interpret and use the algorithm after a single training session: blind participants reached an average accuracy of 74.45% and visually impaired participants 72.74%, with all participants above chance level. These results are comparable to those of the sighted, as shown in previous research. As such, we demonstrate practically how aspects of spatial information can be transmitted through non-visual channels. To complement the findings, we weigh in on debates concerning models of spatial knowledge (the persistent, cumulative, and convergent models) and the capacity for spatial representation in the blind. We suggest the present findings support the convergence model and the view that blind people are capable of aspects of spatial representation, as conveyed by the algorithm, comparable to those of the sighted. Finally, we present possible future developments, implementations, and use cases for the system as an aid for the blind and visually impaired.
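The mapping itself is simple enough to state as code. The sketch below is a hypothetical reconstruction of the scheme the abstract describes: x position sets when a name is spoken during a fixed-duration sweep, and y position sets the voice's pitch. The sweep duration, pitch range, log spacing, and scene dimensions are illustrative assumptions, not the published parameters.

    # Hypothetical sketch of the Topo-Speech mapping: horizontal position ->
    # announcement time within a left-to-right sweep; vertical position ->
    # voice pitch. Parameter values are illustrative, not the published ones.
    def topo_speech_schedule(objects, scene_w=640, scene_h=480,
                             sweep_s=2.0, f_lo=120.0, f_hi=400.0):
        """Map (name, x, y) detections to (onset_s, pitch_hz, name) events."""
        events = []
        for name, x, y in objects:
            onset = (x / scene_w) * sweep_s  # left -> early, right -> late
            # y = 0 is the top of the scene -> highest pitch (log-spaced)
            pitch = f_lo * (f_hi / f_lo) ** (1 - y / scene_h)
            events.append((onset, pitch, name))
        return sorted(events)  # play back in sweep order

    for onset, pitch, name in topo_speech_schedule([("cup", 80, 120),
                                                    ("key", 520, 400)]):
        print(f"t={onset:.2f}s  f0={pitch:.0f}Hz  say '{name}'")

Feeding each event to a pitch-shifted text-to-speech voice would complete the loop; only the scheduling logic is shown here.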
Affiliation(s)
- Amber Maimon
- Baruch Ivcher School of Psychology, The Baruch Ivcher Institute for Brain, Cognition, and Technology, Reichman University, Herzliya, Israel
- The Ruth and Meir Rosenthal Brain Imaging Center, Reichman University, Herzliya, Israel
- Iddo Yehoshua Wald
- Baruch Ivcher School of Psychology, The Baruch Ivcher Institute for Brain, Cognition, and Technology, Reichman University, Herzliya, Israel
- The Ruth and Meir Rosenthal Brain Imaging Center, Reichman University, Herzliya, Israel
- Meshi Ben Oz
- Baruch Ivcher School of Psychology, The Baruch Ivcher Institute for Brain, Cognition, and Technology, Reichman University, Herzliya, Israel
- The Ruth and Meir Rosenthal Brain Imaging Center, Reichman University, Herzliya, Israel
- Sophie Codron
- Baruch Ivcher School of Psychology, The Baruch Ivcher Institute for Brain, Cognition, and Technology, Reichman University, Herzliya, Israel
- The Ruth and Meir Rosenthal Brain Imaging Center, Reichman University, Herzliya, Israel
- Ophir Netzer
- Gonda Brain Research Center, Bar Ilan University, Ramat Gan, Israel
- Benedetta Heimler
- Center of Advanced Technologies in Rehabilitation (CATR), Sheba Medical Center, Ramat Gan, Israel
- Amir Amedi
- Baruch Ivcher School of Psychology, The Baruch Ivcher Institute for Brain, Cognition, and Technology, Reichman University, Herzliya, Israel
- The Ruth and Meir Rosenthal Brain Imaging Center, Reichman University, Herzliya, Israel
6
Mamus E, Speed LJ, Rissman L, Majid A, Özyürek A. Lack of Visual Experience Affects Multimodal Language Production: Evidence From Congenitally Blind and Sighted People. Cogn Sci 2023;47:e13228. PMID: 36607157; PMCID: PMC10078191; DOI: 10.1111/cogs.13228.
Abstract
The human experience is shaped by information from different perceptual channels, but it is still debated whether and how differential experience influences language use. To address this, we compared congenitally blind, blindfolded, and sighted people's descriptions of the same motion events, experienced auditorily by all participants (i.e., via sound alone) and conveyed in speech and gesture. Comparing blind and sighted participants to blindfolded participants helped us disentangle the effects of a lifetime of blindness from the task-specific effects of experiencing a motion event by sound alone. Compared to sighted people, blind people's speech focused more on path and less on manner of motion, and encoded paths in a more segmented fashion using more landmarks and path verbs. Gestures followed the speech, such that blind people pointed to landmarks more and depicted manner less than sighted people did. This suggests that visual experience affects how people express spatial events in multimodal language, and that blindness may enhance sensitivity to paths of motion due to changes in event construal. These findings have implications for claims that language processes are deeply rooted in our sensory experiences.
Affiliation(s)
- Ezgi Mamus
- Centre for Language Studies, Radboud University
- Max Planck Institute for Psycholinguistics
- Lilia Rissman
- Department of Psychology, University of Wisconsin - Madison
- Asifa Majid
- Department of Experimental Psychology, University of Oxford
- Aslı Özyürek
- Centre for Language Studies, Radboud University
- Max Planck Institute for Psycholinguistics
- Donders Center for Cognition, Radboud University
7
Maimon A, Yizhar O, Buchs G, Heimler B, Amedi A. A case study in phenomenology of visual experience with retinal prosthesis versus visual-to-auditory sensory substitution. Neuropsychologia 2022;173:108305. PMID: 35752268; PMCID: PMC9297294; DOI: 10.1016/j.neuropsychologia.2022.108305.
Abstract
The phenomenology of the blind has provided an age-old, unparalleled means of exploring the enigmatic link between brain and mind. This paper delves into the unique phenomenological experience of a man who became blind in adulthood. He subsequently underwent an Argus II retinal prosthesis implant, with its associated training, as well as extensive training on the EyeMusic visual-to-auditory sensory substitution device (SSD), thereby becoming the first reported case to date of dual proficiency with both devices. He offers a firsthand account of what he considers the great potential of combining sensory substitution devices with visual prostheses as part of a complete visual restoration protocol. While the Argus II retinal prosthesis alone provided him with immediate visual percepts by way of electrically stimulated phosphenes elicited by the device, the EyeMusic SSD requires extensive training from the outset. Following the extensive training program with the EyeMusic, our subject reports that the SSD allowed him a richer, more complex perceptual experience that felt more "second nature" to him, while the Argus II prosthesis (which also requires training) did not allow him to achieve the same levels of automaticity and transparency. Following long-term use of the EyeMusic SSD, he reported that visual percepts, representing mainly but not limited to the colors portrayed by the EyeMusic, are elicited in association with auditory stimuli, indicating the acquisition of a high level of automaticity. Finally, the case study indicates an additive benefit of combining both devices on the user's subjective phenomenological visual experience.
Affiliation(s)
- Amber Maimon
- The Baruch Ivcher Institute for Brain, Cognition, and Technology, The Baruch Ivcher School of Psychology, Reichman University, Herzliya, Israel
- The Ruth & Meir Rosenthal Brain Imaging Center, Reichman University, Herzliya, Israel
- Or Yizhar
- The Baruch Ivcher Institute for Brain, Cognition, and Technology, The Baruch Ivcher School of Psychology, Reichman University, Herzliya, Israel
- Department of Cognitive and Brain Sciences, The Hebrew University of Jerusalem, Jerusalem, Israel
- Max Planck Institute for Human Development, Research Group Adaptive Memory and Decision Making, Berlin, Germany
- Max Planck Institute for Human Development, Max Planck Dahlem Campus of Cognition (MPDCC), Berlin, Germany
- Galit Buchs
- The Baruch Ivcher Institute for Brain, Cognition, and Technology, The Baruch Ivcher School of Psychology, Reichman University, Herzliya, Israel
- Department of Cognitive and Brain Sciences, The Hebrew University of Jerusalem, Jerusalem, Israel
- Benedetta Heimler
- Center of Advanced Technologies in Rehabilitation (CATR), Sheba Medical Center, Ramat Gan, Israel
- Amir Amedi
- The Baruch Ivcher Institute for Brain, Cognition, and Technology, The Baruch Ivcher School of Psychology, Reichman University, Herzliya, Israel
- The Ruth & Meir Rosenthal Brain Imaging Center, Reichman University, Herzliya, Israel
8
Vitali H, Campus C, De Giorgis V, Signorini S, Gori M. The vision of dreams: from ontogeny to dream engineering in blindness. J Clin Sleep Med 2022;18:2051-2062. PMID: 35499135; PMCID: PMC9340600; DOI: 10.5664/jcsm.10026.
Abstract
The mechanisms involved in the origin of dreams remain one of the great unknowns in science. In the 21st century, studies in the field have focused on 3 main topics: functional networks that underlie dreaming, neural correlates of dream contents, and signal propagation. We review neuroscientific studies of dreaming processes, focusing on their cortical correlates. The involvement of frontoparietal regions in the dream-retrieval process allows us to discuss it in light of the Global Workspace theory of consciousness. However, dreaming in distinct sleep stages maintains relevant differences, suggesting that multiple generators are implicated. Then, given the strong influence of light perception on sleep regulation and the mostly visual content of dreams, we investigate the effect of blindness on the organization of dreams. Blind individuals are a worthwhile population for clarifying the role of perceptual systems in dream generation and for making inferences about its top-down and/or bottom-up origin. Indeed, congenitally blind people maintain the ability to produce visual dreams, suggesting that bottom-up mechanisms could be associated with innate body schemes or multisensory integration processes. Finally, we propose the new dream-engineering technique as a tool to clarify the mechanisms of multisensory integration during sleep and related mental activity, with possible implications for rehabilitation in sensory-impaired individuals. The Theory of Proto-consciousness suggests that the interaction of the brain states underlying waking and dreaming ensures the optimal functioning of both. By understanding the origin of dreams and the capabilities of our brain during a dreamlike state, we could therefore introduce the dreamlike state as a rehabilitative tool.
Affiliation(s)
- Helene Vitali
- U-VIP: Unit for Visually Impaired People, Istituto Italiano di Tecnologia, Genova, Italy
- Claudio Campus
- U-VIP: Unit for Visually Impaired People, Istituto Italiano di Tecnologia, Genova, Italy
- Monica Gori
- U-VIP: Unit for Visually Impaired People, Istituto Italiano di Tecnologia, Genova, Italy