1. Forest S, Quinton JC, Lefort M. A Dynamic Neural Field Model of Multimodal Merging: Application to the Ventriloquist Effect. Neural Comput 2022;34:1701-1726. PMID: 35798331. DOI: 10.1162/neco_a_01509.
Abstract
Multimodal merging encompasses the ability to localize stimuli based on imprecise information sampled through individual senses such as sight and hearing. Merging decisions are typically described using Bayesian models that fit behaviors over many trials, encapsulated in a probability distribution. We introduce a novel computational model based on dynamic neural fields able to simulate decision dynamics and generate localization decisions, trial by trial, adapting to varying degrees of discrepancy between audio and visual stimulations. Neural fields are commonly used to model neural processes at a mesoscopic scale, for instance neurophysiological activity in the superior colliculus. Our model is fit to human psychophysical data on the ventriloquist effect, additionally testing the influence of retinotopic projection onto the superior colliculus and providing a quantitative performance comparison to the Bayesian reference model. While the models perform equally well on average, a qualitative analysis of the free parameters in our model offers insights into the dynamics of the decision and the individual variations in perception caused by noise. We finally show that the increase in the number of free parameters does not result in overfitting, and that the parameter space may either be reduced to fit specific criteria or be exploited to perform well on more demanding tasks in the future. Indeed, beyond decision or localization tasks, our model opens the door to the simulation of behavioral dynamics, as well as saccade generation driven by multimodal stimulation.
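The field dynamics that this class of models builds on can be sketched generically. The snippet below is a minimal one-dimensional Amari-style neural field with difference-of-Gaussians lateral interaction, driven by two displaced Gaussian bumps standing in for discrepant visual and auditory inputs; it is not the authors' fitted model, and every parameter value and input amplitude is an illustrative assumption.

```python
import numpy as np

# Minimal 1-D dynamic neural field (Amari-style) sketch. All parameter
# values below are illustrative assumptions, not fitted values.
N = 101
x = np.linspace(-1.0, 1.0, N)     # spatial positions of the field
dx = x[1] - x[0]
tau, dt = 0.1, 0.01               # time constant and Euler step

def kernel(d, a_exc=1.0, s_exc=0.1, a_inh=0.5, s_inh=0.4):
    """Difference-of-Gaussians lateral interaction:
    local excitation, broader inhibition."""
    return (a_exc * np.exp(-d**2 / (2 * s_exc**2))
            - a_inh * np.exp(-d**2 / (2 * s_inh**2)))

W = kernel(x[:, None] - x[None, :])   # N x N interaction matrix

def step(u, inp):
    """One Euler step of  tau * du/dt = -u + W @ f(u) * dx + inp."""
    f = np.maximum(u, 0.0)            # rectified firing rate
    return u + dt / tau * (-u + W @ f * dx + inp)

# Bimodal drive: a "visual" bump at 0.2 and a weaker displaced
# "auditory" bump at -0.1 (positions and amplitudes are assumptions).
inp = np.exp(-(x - 0.2)**2 / 0.01) + 0.8 * np.exp(-(x + 0.1)**2 / 0.01)
u = np.zeros(N)
for _ in range(500):
    u = step(u, inp)

decision = x[np.argmax(u)]            # peak location = merged localization
```

Relaxing the field and reading out the peak yields one localization decision per trial, which is the qualitative mechanism the abstract describes; the fitted model additionally tunes these dynamics to the human data.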
Affiliation(s)
- Simon Forest
- Univ. Grenoble Alpes, CNRS, Grenoble INP, LJK, UMR 5224, F-38000 Grenoble, France; Université de Lyon, Université Claude Bernard Lyon 1, CNRS, INSA Lyon, LIRIS, UMR 5205, F-69621 Villeurbanne, France
- Jean-Charles Quinton
- Univ. Grenoble Alpes, CNRS, Grenoble INP, LJK, UMR 5224, F-38000 Grenoble, France
- Mathieu Lefort
- Université de Lyon, Université Claude Bernard Lyon 1, CNRS, INSA Lyon, LIRIS, UMR 5205, F-69621 Villeurbanne, France
2. Laeng B, Flaaten CB, Walle KM, Hochkeppler A, Specht K. "Mickey Mousing" in the Brain: Motion-Sound Synesthesia and the Subcortical Substrate of Audio-Visual Integration. Front Hum Neurosci 2021;15:605166. PMID: 33658913. PMCID: PMC7917298. DOI: 10.3389/fnhum.2021.605166.
Abstract
Motion-sound synesthesia is characterized by illusory auditory sensations linked to the pattern and rhythm of motion (dubbed "Mickey Mousing," as in cinema) of visually experienced but soundless objects, such as an optical flow array, a bouncing ball, or a galloping horse. In an MRI study with a group of three synesthetes and a group of eighteen control participants, we found structural changes in the brains of synesthetes in the subcortical multisensory areas of the superior and inferior colliculi. In addition, functional magnetic resonance imaging data showed activity in motion-sensitive regions, as well as temporal and occipital areas and the cerebellum. However, the synesthetes had higher activation within the left and right cuneus, with stronger activations when viewing optical flow stimuli. There was also a general difference between the two groups in the connectivity of the colliculi with the above-mentioned regions. These findings implicate low-level mechanisms within the human neuraxis as a substrate for local connectivity and cross-activity between perceptual processes that are "distant" in terms of cortical topography. The present findings underline the importance of considering the role of subcortical systems and their connectivity to multimodal regions of the cortex, and they strengthen a parsimonious account of synesthesia, at least of the visual-auditory type.
Affiliation(s)
- Bruno Laeng
- Department of Psychology, University of Oslo, Oslo, Norway; RITMO Centre for Interdisciplinary Studies in Rhythm, Time and Motion, University of Oslo, Oslo, Norway
- Camilla Barthel Flaaten
- Department of Psychology, University of Oslo, Oslo, Norway; NORMENT Centre for Research on Mental Disorders, Division of Mental Health and Addiction, University of Oslo and Oslo University Hospital, Oslo, Norway
- Kjersti Maehlum Walle
- Department of Psychology, University of Oslo, Oslo, Norway; Norwegian Institute of Public Health, Oslo, Norway
- Anne Hochkeppler
- German Centre for Neurodegenerative Diseases (DZNE), Magdeburg, Germany; Department of Biological and Medical Psychology, University of Bergen, Bergen, Norway
- Karsten Specht
- Department of Biological and Medical Psychology, University of Bergen, Bergen, Norway; Department of Education, UiT/The Arctic University of Norway, Tromsø, Norway; Mohn Medical Imaging and Visualization Centre, Haukeland University Hospital, Bergen, Norway
3.
Abstract
The natural environment and our interaction with it are essentially multisensory: we may deploy visual, tactile, and/or auditory senses to perceive, learn, and interact with our environment. Our objective in this study is to develop a scene-analysis algorithm using multisensory information, specifically vision and audio. We develop a proto-object-based audiovisual saliency map (AVSM) for the analysis of dynamic natural scenes. A specialized audiovisual camera with a 360° field of view, capable of locating sound direction, is used to collect spatiotemporally aligned audiovisual data. We demonstrate that the performance of the proto-object-based audiovisual saliency map in detecting and localizing salient objects/events is in agreement with human judgment. In addition, the proto-object-based AVSM, which we compute as a linear combination of visual and auditory feature conspicuity maps, captures a higher number of valid salient events than unisensory saliency maps. Such an algorithm can be useful in surveillance, robotic navigation, video compression, and related applications.
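The final combination stage described above, a linear combination of visual and auditory feature conspicuity maps, can be sketched as follows. The normalization, the equal weights, and the toy maps are illustrative assumptions, not the paper's implementation, which builds its conspicuity maps from proto-object features.

```python
import numpy as np

def normalize(m):
    """Scale a conspicuity map to [0, 1]; constant maps become zero."""
    rng = m.max() - m.min()
    return (m - m.min()) / rng if rng > 0 else np.zeros_like(m)

def avsm(visual, auditory, w_v=0.5, w_a=0.5):
    """Audiovisual saliency as a linear combination of normalized
    unisensory conspicuity maps (equal weights assumed here)."""
    return w_v * normalize(visual) + w_a * normalize(auditory)

# Toy spatially aligned maps: a visual blob and an auditory source that
# overlap at (16, 40), which should dominate the combined map.
h, w = 32, 64
yy, xx = np.mgrid[0:h, 0:w]
vis = np.exp(-((yy - 16)**2 + (xx - 40)**2) / 50.0)
aud = np.exp(-((yy - 16)**2 + (xx - 40)**2) / 200.0) \
    + 0.4 * np.exp(-((yy - 8)**2 + (xx - 10)**2) / 200.0)

s = avsm(vis, aud)
peak = np.unravel_index(np.argmax(s), s.shape)  # most salient location
```

A location supported by both modalities outscores one supported by either alone, which is why the combined map captures more valid salient events than the unisensory maps.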
4. Oess T, Löhr MPR, Schmid D, Ernst MO, Neumann H. From Near-Optimal Bayesian Integration to Neuromorphic Hardware: A Neural Network Model of Multisensory Integration. Front Neurorobot 2020;14:29. PMID: 32499692. PMCID: PMC7243343. DOI: 10.3389/fnbot.2020.00029.
Abstract
While interacting with the world, our senses and nervous system are constantly challenged to identify the origin and coherence of sensory input signals of various intensities. This problem becomes apparent when stimuli from different modalities need to be combined, e.g., to find out whether an auditory stimulus and a visual stimulus belong to the same object. To cope with this problem, humans and most other animal species are equipped with complex neural circuits that enable fast and reliable combination of signals from various sensory organs. This multisensory integration starts in the brain stem to facilitate unconscious reflexes and continues on ascending pathways to cortical areas for further processing. To investigate the underlying mechanisms in detail, we developed a canonical neural network model for multisensory integration that resembles neurophysiological findings. For example, the model comprises multisensory integration neurons that receive excitatory and inhibitory inputs from unimodal auditory and visual neurons, respectively, as well as feedback from cortex. Such feedback projections facilitate multisensory response enhancement and lead to the commonly observed inverse effectiveness of neural activity in multisensory neurons. Two versions of the model are implemented: a rate-based neural network model for qualitative analysis, and a variant that employs spiking neurons for deployment on a neuromorphic processor. This dual approach allows us to create an evaluation environment with the ability to test model performance with real-world inputs. As a platform for deployment we chose IBM's neurosynaptic chip TrueNorth. Behavioral studies in humans indicate that temporal and spatial offsets, as well as the reliability of stimuli, are critical parameters for integrating signals from different modalities. The model reproduces such behavior in experiments with different sets of stimuli; in particular, model performance is tested for stimuli with varying spatial offset. In addition, we demonstrate that, due to the emergent properties of the network dynamics, model performance is close to optimal Bayesian inference for the integration of multimodal sensory signals. Furthermore, the implementation of the model on a neuromorphic processing chip enables a complete neuromorphic processing cascade, from sensory perception to multisensory integration, and the evaluation of model performance for real-world inputs.
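The optimal Bayesian benchmark the network is compared against is, for two independent Gaussian cues, the reliability-weighted average: each cue is weighted by its inverse variance, and the fused variance is smaller than either unisensory variance. A minimal sketch of that reference computation (the cue positions and variances below are illustrative numbers, not the paper's stimuli):

```python
def fuse(mu_v, var_v, mu_a, var_a):
    """Maximum-likelihood fusion of two independent Gaussian cues:
    weights are proportional to inverse variance (reliability)."""
    w_v = (1 / var_v) / (1 / var_v + 1 / var_a)  # visual weight
    mu = w_v * mu_v + (1 - w_v) * mu_a           # fused estimate
    var = 1 / (1 / var_v + 1 / var_a)            # fused variance
    return mu, var

# A reliable visual cue at 0 deg and a noisier auditory cue at 10 deg:
# the fused estimate is pulled toward the visual cue, and the fused
# variance is below both unisensory variances.
mu, var = fuse(0.0, 1.0, 10.0, 4.0)
```

With these numbers the visual weight is 0.8, so the fused estimate lands at 2.0 deg with variance 0.8; a network is "near-optimal" to the extent that its read-out tracks this weighted average across reliability conditions.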
Affiliation(s)
- Timo Oess
- Applied Cognitive Psychology, Institute of Psychology and Education, Ulm University, Ulm, Germany
- Maximilian P R Löhr
- Vision and Perception Science Lab, Institute of Neural Information Processing, Ulm University, Ulm, Germany
- Daniel Schmid
- Vision and Perception Science Lab, Institute of Neural Information Processing, Ulm University, Ulm, Germany
- Marc O Ernst
- Applied Cognitive Psychology, Institute of Psychology and Education, Ulm University, Ulm, Germany
- Heiko Neumann
- Vision and Perception Science Lab, Institute of Neural Information Processing, Ulm University, Ulm, Germany
5. Wang D, Zhang Y, Xin J. An emergent deep developmental model for auditory learning. J Exp Theor Artif Intell 2019. DOI: 10.1080/0952813x.2019.1672795.
Affiliation(s)
- Dongshu Wang
- School of Electrical Engineering, Zhengzhou University, Zhengzhou, PR China
- Yadong Zhang
- School of Electrical Engineering, Zhengzhou University, Zhengzhou, PR China
- Jianbin Xin
- School of Electrical Engineering, Zhengzhou University, Zhengzhou, PR China
6. Ursino M, Cuppini C, Magosso E. Neurocomputational approaches to modelling multisensory integration in the brain: A review. Neural Netw 2014;60:141-165. DOI: 10.1016/j.neunet.2014.08.003.