Peters B, DiCarlo JJ, Gureckis T, Haefner R, Isik L, Tenenbaum J, Konkle T, Naselaris T, Stachenfeld K, Tavares Z, Tsao D, Yildirim I, Kriegeskorte N. How does the primate brain combine generative and discriminative computations in vision? ARXIV 2024:arXiv:2401.06005v1. [PMID: 38259351 PMCID: PMC10802669] [Indexed: 01/24/2024]
Abstract
Vision is widely understood as an inference problem. However, two contrasting conceptions of the inference process have each been influential in research on biological vision as well as the engineering of machine vision. The first emphasizes bottom-up signal flow, describing vision as a largely feedforward, discriminative inference process that filters and transforms the visual information to remove irrelevant variation and represent behaviorally relevant information in a format suitable for downstream functions of cognition and behavioral control. In this conception, vision is driven by the sensory data, and perception is direct because the processing proceeds from the data to the latent variables of interest. The notion of "inference" in this conception is that of the engineering literature on neural networks, where feedforward convolutional neural networks processing images are said to perform inference. The alternative conception is that of vision as an inference process in Helmholtz's sense, where the sensory evidence is evaluated in the context of a generative model of the causal processes that give rise to it. In this conception, vision inverts a generative model through an interrogation of the sensory evidence in a process often thought to involve top-down predictions of sensory data to evaluate the likelihood of alternative hypotheses. The authors include scientists rooted in roughly equal numbers in each of the conceptions and motivated to overcome what might be a false dichotomy between them and engage the other perspective in the realm of theory and experiment. The primate brain employs an unknown algorithm that may combine the advantages of both conceptions. We explain and clarify the terminology, review the key empirical evidence, and propose an empirical research program that transcends the dichotomy and sets the stage for revealing the mysterious hybrid algorithm of primate vision.
Affiliation(s)
- Benjamin Peters
  - Zuckerman Mind Brain Behavior Institute, Columbia University
  - School of Psychology & Neuroscience, University of Glasgow
- James J DiCarlo
  - Department of Brain and Cognitive Sciences, MIT
  - McGovern Institute for Brain Research, MIT
  - NSF Center for Brains, Minds and Machines, MIT
  - Quest for Intelligence, Schwarzman College of Computing, MIT
- Ralf Haefner
  - Brain and Cognitive Sciences, University of Rochester
  - Center for Visual Science, University of Rochester
- Leyla Isik
  - Department of Cognitive Science, Johns Hopkins University
- Joshua Tenenbaum
  - Department of Brain and Cognitive Sciences, MIT
  - NSF Center for Brains, Minds and Machines, MIT
  - Computer Science and Artificial Intelligence Laboratory, MIT
- Talia Konkle
  - Department of Psychology, Harvard University
  - Center for Brain Science, Harvard University
  - Kempner Institute for Natural and Artificial Intelligence, Harvard University
- Zenna Tavares
  - Zuckerman Mind Brain Behavior Institute, Columbia University
  - Data Science Institute, Columbia University
- Doris Tsao
  - Dept of Molecular & Cell Biology, University of California Berkeley
  - Howard Hughes Medical Institute
- Ilker Yildirim
  - Department of Psychology, Yale University
  - Department of Statistics and Data Science, Yale University
- Nikolaus Kriegeskorte
  - Zuckerman Mind Brain Behavior Institute, Columbia University
  - Department of Psychology, Columbia University
  - Department of Neuroscience, Columbia University
  - Department of Electrical Engineering, Columbia University