1
Deakin J, Schofield A, Heinke D. Support for the Time-Varying Drift Rate Model of Perceptual Discrimination in Dynamic and Static Noise Using Bayesian Model-Fitting Methodology. Entropy (Basel, Switzerland) 2024; 26:642. PMID: 39202112; PMCID: PMC11354202; DOI: 10.3390/e26080642.
Abstract
The drift-diffusion model (DDM) is a common approach to understanding human decision making. It models decision making as the accumulation of evidence about visual stimuli until sufficient evidence is reached to make a decision (the decision boundary). Recently, Smith and colleagues proposed an extension of the DDM, the time-varying DDM (TV-DDM), in which the standard simplification that evidence accumulation operates on a fully formed representation of perceptual information is replaced with a perceptual integration stage that modulates evidence accumulation. They suggested that this model particularly captures decision making for stimuli with dynamic noise. We tested this new model in two studies using Bayesian parameter estimation and model comparison with marginal likelihoods. The first study replicated Smith and colleagues' findings using the classical random-dot kinematogram (RDK) task, which requires judging the motion direction of randomly moving dots (motion discrimination task). In the second study, we used a novel type of stimulus designed to be like RDKs but with randomized hue of stationary dots (color discrimination task). This study also found TV-DDM to be superior, suggesting that perceptual integration is also relevant for static noise, possibly where integration over space is required. We also found support for within-trial changes in decision boundaries ("collapsing boundaries"). Interestingly, and in contrast to most studies, the boundaries increased with increasing task difficulty (amount of noise). Future studies will need to test this finding in a formal model.
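The core TV-DDM idea, evidence accumulation whose drift rate ramps up as perceptual integration proceeds, can be sketched in a few lines. This is a minimal illustration assuming an exponential ramp toward an asymptotic drift rate; the function name and all parameter values are illustrative, not Smith and colleagues' fitted model.

```python
import math
import random

def simulate_tv_ddm(v_inf=1.0, tau=0.2, a=1.0, z=0.5, s=1.0,
                    dt=0.001, t_max=5.0, rng=random):
    """Simulate one trial of a time-varying drift-diffusion model.

    The drift rate grows toward its asymptote v_inf with time constant
    tau, standing in for the perceptual-integration stage. Returns
    (choice, reaction_time); choice is +1 (upper boundary), -1 (lower),
    or 0 if no boundary is reached before the deadline.
    """
    x = z * a            # start point between boundaries 0 and a
    t = 0.0
    while t < t_max:
        v_t = v_inf * (1.0 - math.exp(-t / tau))          # time-varying drift
        x += v_t * dt + s * math.sqrt(dt) * rng.gauss(0.0, 1.0)
        t += dt
        if x >= a:
            return +1, t
        if x <= 0.0:
            return -1, t
    return 0, t_max      # deadline reached with no decision
```

Shrinking `tau` toward zero recovers a standard constant-drift DDM, which is one way to see the two models as nested.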
Affiliation(s)
- Jordan Deakin
- School of Psychology, University of Birmingham, Edgbaston, Birmingham B15 2TT, UK
- Faculty of Psychology and Human Movement Science, General Psychology, Universität Hamburg, Von-Melle-Park 11, 20146 Hamburg, Germany
- Dietmar Heinke
- School of Psychology, University of Birmingham, Edgbaston, Birmingham B15 2TT, UK
2
Xia Z, Wu T, Wang Z, Zhou M, Wu B, Chan CY, Kong LB. Dense monocular depth estimation for stereoscopic vision based on pyramid transformer and multi-scale feature fusion. Sci Rep 2024; 14:7037. PMID: 38528098; DOI: 10.1038/s41598-024-57908-z.
Abstract
Stereoscopic display technology plays a significant role in industries such as film, television, and autonomous driving. The accuracy of depth estimation is crucial for achieving high-quality and realistic stereoscopic display effects. To address the inherent challenges of applying Transformers to depth estimation, the Stereoscopic Pyramid Transformer-Depth (SPT-Depth) method is introduced. It uses stepwise downsampling to acquire both shallow and deep semantic information, which are subsequently fused. The training process is divided into fine and coarse convergence stages with distinct training strategies and hyperparameters, resulting in a substantial reduction in both training and validation losses. In the training strategy, a shift- and scale-invariant mean squared error function is employed to compensate for the lack of translational invariance in Transformers. Additionally, an edge-smoothing function is applied to reduce noise in the depth map, enhancing the model's robustness. SPT-Depth achieves a global receptive field while effectively reducing time complexity. Compared with the baseline method on the New York University Depth V2 (NYU Depth V2) dataset, there is a 10% reduction in Absolute Relative Error (Abs Rel) and a 36% decrease in Root Mean Square Error (RMSE). Compared with state-of-the-art methods, there is a 17% reduction in RMSE.
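The shift- and scale-invariant error mentioned above can be illustrated with one common formulation (in the spirit of MiDaS-style alignment losses): fit a least-squares scale and shift that aligns the prediction to the ground truth, then take the mean squared residual. This is a sketch of the general idea under those assumptions, not necessarily the paper's exact loss; the function name is ours.

```python
def ssi_mse(pred, target):
    """Scale-and-shift-invariant MSE between predicted and ground-truth
    depth values (flattened to 1-D sequences of equal length).

    The prediction is first aligned to the target with the scale s and
    shift t minimising sum((s*p + t - g)^2), so any affine rescaling of
    a correct depth map incurs zero loss.
    """
    n = len(pred)
    mp = sum(pred) / n
    mg = sum(target) / n
    var = sum((p - mp) ** 2 for p in pred)
    cov = sum((p - mp) * (g - mg) for p, g in zip(pred, target))
    s = cov / var if var > 0 else 0.0   # least-squares scale
    t = mg - s * mp                     # least-squares shift
    return sum((s * p + t - g) ** 2 for p, g in zip(pred, target)) / n
```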
Affiliation(s)
- Zhongyi Xia
- College of New Materials and New Energies, Shenzhen Technology University, Shenzhen, 518118, Guangdong, China
- College of Applied Technology, Shenzhen University, Shenzhen, 518000, Guangdong, China
- Tianzhao Wu
- College of New Materials and New Energies, Shenzhen Technology University, Shenzhen, 518118, Guangdong, China
- College of Applied Technology, Shenzhen University, Shenzhen, 518000, Guangdong, China
- Zhuoyan Wang
- College of New Materials and New Energies, Shenzhen Technology University, Shenzhen, 518118, Guangdong, China
- College of Applied Technology, Shenzhen University, Shenzhen, 518000, Guangdong, China
- Man Zhou
- College of New Materials and New Energies, Shenzhen Technology University, Shenzhen, 518118, Guangdong, China
- College of Applied Technology, Shenzhen University, Shenzhen, 518000, Guangdong, China
- Boqi Wu
- Jilin Jianzhu University, Changchun, 130118, Jilin, China
- C Y Chan
- College of New Materials and New Energies, Shenzhen Technology University, Shenzhen, 518118, Guangdong, China
- Ling Bing Kong
- College of New Materials and New Energies, Shenzhen Technology University, Shenzhen, 518118, Guangdong, China
3
Dyballa L, Rudzite AM, Hoseini MS, Thapa M, Stryker MP, Field GD, Zucker SW. Population encoding of stimulus features along the visual hierarchy. Proc Natl Acad Sci U S A 2024; 121:e2317773121. PMID: 38227668; PMCID: PMC10823231; DOI: 10.1073/pnas.2317773121.
Abstract
The retina and primary visual cortex (V1) both exhibit diverse neural populations sensitive to diverse visual features. Yet it remains unclear how neural populations in each area partition stimulus space to span these features. One possibility is that neural populations are organized into discrete groups of neurons, with each group signaling a particular constellation of features. Alternatively, neurons could be continuously distributed across feature-encoding space. To distinguish these possibilities, we presented a battery of visual stimuli to the mouse retina and V1 while measuring neural responses with multi-electrode arrays. Using machine learning approaches, we developed a manifold embedding technique that captures how neural populations partition feature space and how visual responses correlate with physiological and anatomical properties of individual neurons. We show that retinal populations discretely encode features, while V1 populations provide a more continuous representation. Applying the same analysis approach to convolutional neural networks that model visual processing, we demonstrate that they partition features much more similarly to the retina, indicating they are more like big retinas than little brains.
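One simple way to ask whether a population "discretely" or "continuously" tiles feature space is a clustering-tendency statistic over the neurons' feature vectors. The Hopkins-style diagnostic below is a generic stand-in offered for illustration only; it is not the manifold-embedding technique the authors developed. Higher values indicate stronger clustering; regular or continuous coverage gives lower values.

```python
import random

def hopkins(points, m=20, rng=None):
    """Hopkins-style clustering-tendency statistic for 2-D feature vectors.

    Compares nearest-neighbour distances from uniform probe points (u)
    against those from sampled data points (w); H = u / (u + w) is close
    to 1 for strongly clustered data.
    """
    rng = rng or random.Random(0)
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    lo, hi = (min(xs), min(ys)), (max(xs), max(ys))

    def nn_dist(q, pool):
        # distance to the nearest point in pool, excluding q itself
        return min((q[0] - p[0]) ** 2 + (q[1] - p[1]) ** 2
                   for p in pool if p is not q) ** 0.5

    sample = rng.sample(points, m)
    uniform = [(rng.uniform(lo[0], hi[0]), rng.uniform(lo[1], hi[1]))
               for _ in range(m)]
    u = sum(nn_dist(q, points) for q in uniform)
    w = sum(nn_dist(q, points) for q in sample)
    return u / (u + w)
```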
Affiliation(s)
- Luciano Dyballa
- Department of Computer Science, Yale University, New Haven, CT 06511
- Mahmood S. Hoseini
- Department of Physiology, University of California, San Francisco, CA 94143
- Mishek Thapa
- Department of Neurobiology, Duke University, Durham, NC 27708
- Department of Ophthalmology, David Geffen School of Medicine, Stein Eye Institute, University of California, Los Angeles, CA 90095
- Michael P. Stryker
- Department of Physiology, University of California, San Francisco, CA 94143
- Kavli Institute for Fundamental Neuroscience, University of California, San Francisco, CA 94143
- Greg D. Field
- Department of Neurobiology, Duke University, Durham, NC 27708
- Department of Ophthalmology, David Geffen School of Medicine, Stein Eye Institute, University of California, Los Angeles, CA 90095
- Steven W. Zucker
- Department of Computer Science, Yale University, New Haven, CT 06511
- Department of Biomedical Engineering, Yale University, New Haven, CT 06511
4
van Dyck LE, Gruber WR. Modeling Biological Face Recognition with Deep Convolutional Neural Networks. J Cogn Neurosci 2023; 35:1521-1537. PMID: 37584587; DOI: 10.1162/jocn_a_02040.
Abstract
Deep convolutional neural networks (DCNNs) have become the state-of-the-art computational models of biological object recognition. Their remarkable success has helped vision science break new ground, and recent efforts have started to transfer this achievement to research on biological face recognition. In this regard, face detection can be investigated by comparing face-selective biological neurons and brain areas to artificial neurons and model layers. Similarly, face identification can be examined by comparing in vivo and in silico multidimensional "face spaces." In this review, we summarize the first studies that use DCNNs to model biological face recognition. On the basis of a broad spectrum of behavioral and computational evidence, we conclude that DCNNs are useful models that closely resemble the general hierarchical organization of face recognition in the ventral visual pathway and the core face network. In two exemplary spotlights, we emphasize the unique scientific contributions of these models. First, studies on face detection in DCNNs indicate that elementary face selectivity emerges automatically through feedforward processing even in the absence of visual experience. Second, studies on face identification in DCNNs suggest that identity-specific experience and generative mechanisms facilitate this particular challenge. Taken together, as this novel modeling approach enables close control of predisposition (i.e., architecture) and experience (i.e., training data), it may be suited to inform long-standing debates on the substrates of biological face recognition.
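The multidimensional "face space" notion in the review can be made concrete with a toy verification rule: two images are judged the same identity when their embedding vectors are close. This generic cosine-similarity sketch is our illustration (the function names and threshold are assumptions), not a specific model from the reviewed studies.

```python
import math

def cosine(u, v):
    """Cosine similarity between two embedding vectors."""
    dot = sum(x * y for x, y in zip(u, v))
    nu = math.sqrt(sum(x * x for x in u))
    nv = math.sqrt(sum(x * x for x in v))
    return dot / (nu * nv)

def verify(emb_a, emb_b, threshold=0.7):
    """Toy face verification in an embedding 'face space': same identity
    iff the two embeddings exceed a similarity threshold."""
    return cosine(emb_a, emb_b) >= threshold
```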
5
Doerig A, Sommers RP, Seeliger K, Richards B, Ismael J, Lindsay GW, Kording KP, Konkle T, van Gerven MAJ, Kriegeskorte N, Kietzmann TC. The neuroconnectionist research programme. Nat Rev Neurosci 2023. PMID: 37253949; DOI: 10.1038/s41583-023-00705-w.
Abstract
Artificial neural networks (ANNs) inspired by biology are beginning to be widely used to model behavioural and neural data, an approach we call 'neuroconnectionism'. ANNs have been not only lauded as the current best models of information processing in the brain but also criticized for failing to account for basic cognitive functions. In this Perspective article, we propose that arguing about the successes and failures of a restricted set of current ANNs is the wrong approach to assess the promise of neuroconnectionism for brain science. Instead, we take inspiration from the philosophy of science, and in particular from Lakatos, who showed that the core of a scientific research programme is often not directly falsifiable but should be assessed by its capacity to generate novel insights. Following this view, we present neuroconnectionism as a general research programme centred around ANNs as a computational language for expressing falsifiable theories about brain computation. We describe the core of the programme, the underlying computational framework and its tools for testing specific neuroscientific hypotheses and deriving novel understanding. Taking a longitudinal view, we review past and present neuroconnectionist projects and their responses to challenges and argue that the research programme is highly progressive, generating new and otherwise unreachable insights into the workings of the brain.
Affiliation(s)
- Adrien Doerig
- Institute of Cognitive Science, University of Osnabrück, Osnabrück, Germany
- Donders Institute for Brain, Cognition and Behaviour, Nijmegen, The Netherlands
- Rowan P Sommers
- Department of Neurobiology of Language, Max Planck Institute for Psycholinguistics, Nijmegen, The Netherlands
- Katja Seeliger
- Max Planck Institute for Human Cognitive and Brain Sciences, Leipzig, Germany
- Blake Richards
- Department of Neurology and Neurosurgery, McGill University, Montréal, QC, Canada
- School of Computer Science, McGill University, Montréal, QC, Canada
- Mila, Montréal, QC, Canada
- Montréal Neurological Institute, Montréal, QC, Canada
- Learning in Machines and Brains Program, CIFAR, Toronto, ON, Canada
- Konrad P Kording
- Learning in Machines and Brains Program, CIFAR, Toronto, ON, Canada
- Bioengineering, Neuroscience, University of Pennsylvania, Philadelphia, PA, USA
- Tim C Kietzmann
- Institute of Cognitive Science, University of Osnabrück, Osnabrück, Germany
6
Deeb A, Ibrahim A, Salem M, Pichler J, Tkachov S, Karaj A, Al Machot F, Kyandoghere K. A Robust Automated Analog Circuits Classification Involving a Graph Neural Network and a Novel Data Augmentation Strategy. Sensors (Basel, Switzerland) 2023; 23:2989. PMID: 36991700; PMCID: PMC10054122; DOI: 10.3390/s23062989.
Abstract
Analog mixed-signal (AMS) verification is one of the essential tasks in the development process of modern systems-on-chip (SoC). Most parts of the AMS verification flow are already automated, except for stimuli generation, which has been performed manually and is thus challenging and time-consuming; automation is therefore a necessity. To generate stimuli, subcircuits or subblocks of a given analog circuit module must be identified/classified. However, no reliable industrial tool currently exists that can automatically identify/classify analog subcircuits (for example, within a circuit design process) or automatically classify a given analog circuit at hand. Besides verification, several other processes would profit enormously from a robust and reliable automated classification model for analog circuit modules (which may belong to different levels). This paper presents how to use a Graph Convolutional Network (GCN) model and proposes a novel data augmentation strategy to automatically classify analog circuits of a given level. The model can later be upscaled or integrated within a more complex functional module (for structure recognition of complex analog circuits), targeting the identification of subcircuits within a more complex analog circuit module. The integrated novel data augmentation technique is particularly crucial because, in practical settings, generally only a relatively limited dataset of analog circuit schematics (i.e., sample architectures) is available. Through a comprehensive ontology, we first introduce a graph representation framework for circuit schematics, which converts the circuits' netlists into graphs. Then, we use a robust classifier consisting of a GCN processor to determine the label corresponding to a given input analog circuit schematic.
Furthermore, the classification performance is improved and made robust by the novel data augmentation technique. Classification accuracy was enhanced from 48.2% to 76.6% using feature-matrix augmentation, and from 72% to 92% using dataset augmentation by flipping. A 100% accuracy was achieved after applying either multi-stage augmentation or hyperphysical augmentation. Overall, extensive tests demonstrate high accuracy for the analog circuit classification endeavor. This is solid support for future upscaling towards automated detection of analog circuit structures, which is a prerequisite not only for stimuli generation in AMS verification but also for other critical endeavors in the engineering of AMS circuits.
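The netlist-to-graph step described above can be illustrated with a toy converter that maps a SPICE-like netlist onto a bipartite component/net graph, the kind of structure a GCN could consume. The line format and the typing-by-prefix rule are our simplifying assumptions, not the paper's ontology.

```python
def netlist_to_graph(netlist):
    """Convert a SPICE-like netlist into a bipartite graph of components
    and nets.

    Each line is assumed to be "name pin1 pin2 ... value"; the component
    type is taken from the first letter of the name (R, C, M, ...).
    Returns (nodes, edges): nodes maps node name -> type label, and
    edges lists (component, net) pairs.
    """
    nodes, edges = {}, []
    for line in netlist:
        parts = line.split()
        name, pins = parts[0], parts[1:-1]
        nodes[name] = name[0].upper()     # component node, typed by prefix
        for net in pins:
            nodes.setdefault(net, "NET")  # net node
            edges.append((name, net))     # component-to-net edge
    return nodes, edges
```

For example, a simple RC divider `["R1 in out 1k", "C1 out 0 10n"]` yields five nodes (R1, C1, in, out, 0) and four edges.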
Affiliation(s)
- Ali Deeb
- Institute for Smart Systems Technologies, Universitaet Klagenfurt, 9020 Klagenfurt, Austria
- Mohamed Salem
- Infineon Technologies Austria, 9500 Villach, Austria
- Anjeza Karaj
- Infineon Technologies Austria, 9500 Villach, Austria
- Fadi Al Machot
- Faculty of Science and Technology, Norwegian University of Life Sciences (NMBU), 1430 Ås, Norway
- Kyamakya Kyandoghere
- Institute for Smart Systems Technologies, Universitaet Klagenfurt, 9020 Klagenfurt, Austria
7
Kirubeswaran OR, Storrs KR. Inconsistent illusory motion in predictive coding deep neural networks. Vision Res 2023; 206:108195. PMID: 36801664; DOI: 10.1016/j.visres.2023.108195.
Abstract
Why do we perceive illusory motion in some static images? Several accounts point to eye movements, response latencies to different image elements, or interactions between image patterns and motion energy detectors. Recently, PredNet, a recurrent deep neural network (DNN) based on predictive coding principles, was reported to reproduce the "Rotating Snakes" illusion, suggesting a role for predictive coding. We begin by replicating this finding, then use a series of "in silico" psychophysics and electrophysiology experiments to examine whether PredNet behaves consistently with human observers and non-human primate neural data. A pretrained PredNet predicted illusory motion for all subcomponents of the Rotating Snakes pattern, consistent with human observers. However, we found no simple response delays in internal units, unlike evidence from electrophysiological data. PredNet's detection of motion in gradients depended on contrast, whereas in humans it depends predominantly on luminance. Finally, we examined the robustness of the illusion across ten PredNets of identical architecture, retrained on the same video data. There was large variation across network instances in whether they reproduced the Rotating Snakes illusion, and in what motion, if any, they predicted for simplified variants. Unlike human observers, no network predicted motion for greyscale variants of the Rotating Snakes pattern. Our results sound a cautionary note: even when a DNN successfully reproduces some idiosyncrasy of human vision, more detailed investigation can reveal inconsistencies between humans and the network, and between different instances of the same network. These inconsistencies suggest that predictive coding does not reliably give rise to human-like illusory motion.
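An "in silico psychophysics" probe of predicted motion can be as simple as asking which spatial shift of the input best explains a model's predicted next frame. The 1-D toy below is our illustration of that idea, not the analysis pipeline used in the paper.

```python
def predicted_shift(frame, prediction, max_shift=3):
    """Estimate the drift (in pixels) implied by a model's predicted next
    frame, as the circular shift of the 1-D input that best matches the
    prediction in a least-squares sense.
    """
    n = len(frame)

    def err(s):
        # squared error between the prediction and the input shifted by s
        return sum((prediction[i] - frame[(i - s) % n]) ** 2
                   for i in range(n))

    return min(range(-max_shift, max_shift + 1), key=err)
```

A network that "sees" rightward illusory motion in a static pattern would produce a prediction whose best-matching shift is nonzero even though the input never moved.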
Affiliation(s)
| | - Katherine R Storrs
- Department of Experimental Psychology, Justus Liebig University Giessen, Germany; Centre for Mind, Brain and Behaviour (CMBB), University of Marburg and Justus Liebig University Giessen, Germany; School of Psychology, University of Auckland, New Zealand
| |
8
Wang J, Chen Y, Dong Z, Gao M, Lin H, Miao Q. SABV-Depth: A biologically inspired deep learning network for monocular depth estimation. Knowl Based Syst 2023. DOI: 10.1016/j.knosys.2023.110301.