1
Huang J, Wang A, Zhang M. The audiovisual competition effect induced by temporal asynchronous encoding weakened the visual dominance in working memory retrieval. Memory 2024; 32:1069-1082. [PMID: 39067050] [DOI: 10.1080/09658211.2024.2381782]
Abstract
Converging evidence suggests a facilitation effect of multisensory interactions on memory performance, reflected in higher accuracy or faster response times under bimodal encoding than under unimodal encoding. However, relatively little attention has been given to the effect of multisensory competition on memory. The present study used an adaptive staircase test to measure the point of subjective simultaneity (PSS), combined with a delayed match-to-sample (DMS) task, to probe the effect of audiovisual competition during the encoding stage on subsequent unisensory retrieval. The results showed robust visual dominance and multisensory interference effects in working memory (WM) retrieval, regardless of whether the audiovisual presentation was subjectively synchronous or asynchronous. However, visual dominance was weakened when the auditory stimulus was presented before the visual stimulus during encoding, particularly when the stimuli were semantically incongruent. These findings revealed that prior entry of sensory information at the early perceptual stage could affect processing at later cognitive stages to some extent, and supported the view that the visuospatial sketchpad holds a persistent advantage in multisensory WM.
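To make the measurement procedure concrete, the sketch below shows one generic way to estimate a PSS with an adaptive staircase: a simulated observer makes temporal-order judgments while a 1-up/1-down rule adjusts the audiovisual stimulus-onset asynchrony (SOA), and the PSS is read off the later reversal points. This is a minimal illustration with made-up parameters and a toy observer, not the authors' actual procedure.

```python
import random
import statistics

def toy_toj_response(soa_ms, pss_ms=30.0, noise_sd_ms=40.0):
    """Simulated temporal-order judgment: returns True for 'audio first'.
    soa_ms is visual onset minus audio onset (positive: audio leads)."""
    return soa_ms + random.gauss(0.0, noise_sd_ms) > pss_ms

def staircase_pss(start_soa=150.0, step=15.0, n_reversals=12):
    """1-up/1-down staircase converging on the 50% point of the TOJ curve (the PSS)."""
    soa, last_resp, reversal_soas = start_soa, None, []
    while len(reversal_soas) < n_reversals:
        resp = toy_toj_response(soa)
        if last_resp is not None and resp != last_resp:
            reversal_soas.append(soa)          # direction flip: record a reversal
        soa += -step if resp else step         # 'audio first' -> reduce the audio lead
        last_resp = resp
    return statistics.mean(reversal_soas[2:])  # discard the earliest reversals

print(f"Estimated PSS: {staircase_pss():.1f} ms (simulated true PSS = 30 ms)")
```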
Affiliation(s)
- Jie Huang
- Department of Psychology, Research Center for Psychology and Behavioral Sciences, Soochow University, Suzhou, People's Republic of China
- Aijun Wang
- Department of Psychology, Research Center for Psychology and Behavioral Sciences, Soochow University, Suzhou, People's Republic of China
- Ming Zhang
- School of Psychology, Northeast Normal University, Changchun, People's Republic of China
- Department of Psychology, Suzhou University of Science and Technology, Suzhou, People's Republic of China
- Cognitive Neuroscience Laboratory, Graduate School of Interdisciplinary Science and Engineering in Health Systems, Okayama University, Okayama, Japan
2
García-Lázaro HG, Teng S. Sensory and Perceptual Decisional Processes Underlying the Perception of Reverberant Auditory Environments. eNeuro 2024; 11:ENEURO.0122-24.2024. [PMID: 39122554] [PMCID: PMC11335967] [DOI: 10.1523/eneuro.0122-24.2024]
Abstract
Reverberation, a ubiquitous feature of real-world acoustic environments, exhibits statistical regularities that human listeners leverage to self-orient, facilitate auditory perception, and understand their environment. Despite extensive research on sound source representation in the auditory system, it remains unclear how the brain represents real-world reverberant environments. Here, we characterized the neural response to reverberation of varying realism by applying multivariate pattern analysis to electroencephalographic (EEG) brain signals. Human listeners (12 males and 8 females) heard speech samples convolved with real-world and synthetic reverberant impulse responses and judged whether the speech samples were in a "real" or "fake" environment, focusing on the reverberant background rather than the properties of the speech itself. Participants distinguished real from synthetic reverberation with ∼75% accuracy; EEG decoding revealed a multistage time course, with dissociable components early in the stimulus presentation and later in the perioffset stage. The early component predominantly occurred in temporal electrode clusters, while the later component was prominent in centroparietal clusters. These findings suggest distinct neural stages in perceiving natural acoustic environments, likely reflecting sensory encoding and higher-level perceptual decision-making processes. Overall, our findings provide evidence that reverberation, rather than being largely suppressed as a noise-like signal, carries relevant environmental information and gains representation along the auditory system. This understanding also has practical applications: it suggests that reverberation could serve as a cue to aid navigation for blind and visually impaired people, and it could help enhance the sense of realism in immersive virtual reality, gaming, music, and film production.
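The stimulus manipulation at the core of this design, placing speech in a reverberant environment, amounts to convolving a dry recording with a room impulse response (RIR). The snippet below illustrates that step with a crude synthetic RIR (exponentially decaying noise); it is a generic sketch with arbitrary constants and a placeholder signal, not the authors' stimulus pipeline.

```python
import numpy as np
from scipy.signal import fftconvolve

def synthetic_rir(fs=16000, rt60_s=0.6, length_s=1.0, seed=0):
    """Crude synthetic impulse response: white noise shaped by an exponential
    decay whose time constant follows a nominal RT60 (arbitrary value)."""
    rng = np.random.default_rng(seed)
    t = np.arange(int(fs * length_s)) / fs
    decay = np.exp(-6.91 * t / rt60_s)          # ~60 dB of decay at t = RT60
    return rng.standard_normal(t.size) * decay

def reverberate(dry, rir):
    """Convolve a dry signal with an impulse response and renormalize."""
    wet = fftconvolve(dry, rir)[: dry.size]
    return wet / (np.max(np.abs(wet)) + 1e-12)

fs = 16000
dry_speech = np.random.default_rng(1).standard_normal(fs * 2) * 0.1  # stand-in for real speech
wet_speech = reverberate(dry_speech, synthetic_rir(fs))
```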
Affiliation(s)
- Santani Teng
- Smith-Kettlewell Eye Research Institute, San Francisco, California 94115
3
Lee J, Park S. Multi-modal Representation of the Size of Space in the Human Brain. J Cogn Neurosci 2024; 36:340-361. [PMID: 38010320] [DOI: 10.1162/jocn_a_02092]
Abstract
To estimate the size of an indoor space, we must analyze the visual boundaries that limit the spatial extent and acoustic cues from reflected interior surfaces. We used fMRI to examine how the brain processes the geometric size of indoor scenes when various types of sensory cues are presented individually or together. Specifically, we asked whether the size of space is represented in a modality-specific way or in an integrative way that combines multimodal cues. In a block-design study, images or sounds depicting small and large indoor spaces were presented. Visual stimuli were real-world pictures of empty spaces that were small or large. Auditory stimuli were sounds convolved with different reverberations. Using a multivoxel pattern classifier, we asked whether the two sizes of space could be classified in visual, auditory, and combined visual-auditory conditions. We identified both sensory-specific and multimodal representations of the size of space. To further investigate the nature of the multimodal region, we examined whether it contained multimodal information in a coexistent or integrated form. We found that the angular gyrus and the right medial frontal gyrus had modality-integrated representations, displaying sensitivity to the match in spatial size information conveyed through image and sound. Background functional connectivity analysis further demonstrated that the connection between sensory-specific and modality-integrated regions increased in the multimodal condition compared with single-modality conditions. Our results suggest that spatial size perception relies on both sensory-specific and multimodal representations, as well as their interplay during multimodal perception.
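The decoding logic described here, classifying small versus large spaces from ROI voxel patterns and testing generalization across modalities, can be sketched as follows with synthetic data standing in for real patterns. The classifier choice, voxel counts, and the simulated 'size' signal are illustrative assumptions, not the study's pipeline.

```python
import numpy as np
from sklearn.svm import SVC
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)

def fake_roi_patterns(n_trials=40, n_voxels=120, effect=0.5):
    """Placeholder ROI patterns for small (0) vs. large (1) spaces."""
    y = np.repeat([0, 1], n_trials // 2)
    X = rng.standard_normal((n_trials, n_voxels))
    X[y == 1, : n_voxels // 4] += effect        # weak 'size' signal in a subset of voxels
    return X, y

clf = make_pipeline(StandardScaler(), SVC(kernel="linear"))

# Within-modality decoding (e.g., visual blocks only), 5-fold cross-validated.
X_vis, y_vis = fake_roi_patterns()
acc_within = cross_val_score(clf, X_vis, y_vis, cv=5).mean()

# Cross-modality generalization: train on visual patterns, test on auditory patterns.
X_aud, y_aud = fake_roi_patterns()
acc_cross = clf.fit(X_vis, y_vis).score(X_aud, y_aud)

print(f"within-modality: {acc_within:.2f}, cross-modality: {acc_cross:.2f}")
```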
4
Lee J, Park S. Multi-modal representation of the size of space in the human brain. bioRxiv 2023:2023.07.24.550343. [Preprint]. [PMID: 37546991] [PMCID: PMC10402083] [DOI: 10.1101/2023.07.24.550343]
Abstract
To estimate the size of an indoor space, we must analyze the visual boundaries that limit the spatial extent and acoustic cues from reflected interior surfaces. We used fMRI to examine how the brain processes the geometric size of indoor scenes when various types of sensory cues are presented individually or together. Specifically, we asked whether the size of space is represented in a modality-specific way or in an integrative way that combines multimodal cues. In a block-design study, images or sounds depicting small- and large-sized indoor spaces were presented. Visual stimuli were real-world pictures of empty spaces that were small or large. Auditory stimuli were sounds convolved with different reverberations. Using a multi-voxel pattern classifier, we asked whether the two sizes of space could be classified in visual, auditory, and combined visual-auditory conditions. We identified both sensory-specific and multimodal representations of the size of space. To further investigate the nature of the multimodal region, we examined whether it contained multimodal information in a coexistent or integrated form. We found that the angular gyrus (AG) and the right inferior frontal gyrus (IFG) pars opercularis had modality-integrated representations, displaying sensitivity to the match in spatial size information conveyed through image and sound. Background functional connectivity analysis further demonstrated that the connection between sensory-specific and modality-integrated regions increased in the multimodal condition compared with single-modality conditions. Our results suggest that spatial size perception relies on both sensory-specific and multimodal representations, as well as their interplay during multimodal perception.
Affiliation(s)
- Jaeeun Lee
- Department of Psychology, University of Minnesota, Minneapolis, MN
- Soojin Park
- Department of Psychology, Yonsei University, Seoul, South Korea
5
Palenciano AF, Senoussi M, Formica S, González-García C. Canonical template tracking: Measuring the activation state of specific neural representations. Front Neuroimaging 2023; 1:974927. [PMID: 37555182] [PMCID: PMC10406196] [DOI: 10.3389/fnimg.2022.974927]
Abstract
Multivariate analyses of neural data have become increasingly influential in cognitive neuroscience because they make it possible to address questions about the representational signatures of neurocognitive phenomena. Here, we describe Canonical Template Tracking: a multivariate approach that employs independent localizer tasks to assess the activation state of specific representations during the execution of cognitive paradigms. We illustrate the benefits of this methodology in characterizing the particular content and format of task-induced representations, comparing it with standard (cross-)decoding and representational similarity analyses. We then discuss relevant design decisions for experiments using this analysis approach, focusing on the nature of the localizer tasks from which the canonical templates are derived. We further provide a step-by-step tutorial of this method, stressing the relevant analysis choices for functional magnetic resonance imaging and magneto/electroencephalography data. Importantly, we point out potential pitfalls in implementing canonical template tracking and interpreting its results, together with recommendations to mitigate them. To conclude, we provide some examples from previous literature that highlight the potential of this analysis to address relevant theoretical questions in cognitive neuroscience.
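In schematic form, the approach builds one 'canonical' pattern per condition from independent localizer data and then quantifies, trial by trial, how strongly the main-task data express each template (here via Pearson correlation). The code below is a toy rendering of that idea on random data; the sizes and variable names are invented, and it is not the tutorial code from the paper.

```python
import numpy as np

rng = np.random.default_rng(42)
n_voxels, n_loc_trials, n_task_trials, n_conditions = 200, 60, 50, 3

# Independent localizer data: trials x voxels, plus a condition label per trial.
loc_X = rng.standard_normal((n_loc_trials, n_voxels))
loc_y = rng.integers(0, n_conditions, n_loc_trials)

# Canonical templates: the mean localizer pattern of each condition.
templates = np.stack([loc_X[loc_y == c].mean(axis=0) for c in range(n_conditions)])

def zscore(a, axis=-1):
    return (a - a.mean(axis=axis, keepdims=True)) / a.std(axis=axis, keepdims=True)

# Main-task data: correlate every trial pattern with every template.
task_X = rng.standard_normal((n_task_trials, n_voxels))
activation_index = zscore(task_X) @ zscore(templates).T / n_voxels   # trials x conditions, Pearson r

print(activation_index[:5].round(2))   # 'activation state' of each template on the first trials
```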
Affiliation(s)
- Ana F. Palenciano
- Mind, Brain, and Behavior Research Center, University of Granada, Granada, Spain
- Mehdi Senoussi
- CLLE Lab, CNRS UMR 5263, University of Toulouse, Toulouse, France
- Department of Experimental Psychology, Ghent University, Ghent, Belgium
- Silvia Formica
- Department of Psychology, Berlin School of Mind and Brain, Humboldt Universität zu Berlin, Berlin, Germany
6
Bo K, Cui L, Yin S, Hu Z, Hong X, Kim S, Keil A, Ding M. Decoding the temporal dynamics of affective scene processing. Neuroimage 2022; 261:119532. [PMID: 35931307] [DOI: 10.1016/j.neuroimage.2022.119532]
Abstract
Natural images containing affective scenes are used extensively to investigate the neural mechanisms of visual emotion processing. fMRI studies have shown that these images activate a large-scale distributed brain network that encompasses areas in visual, temporal, and frontal cortices. The underlying spatial and temporal dynamics, however, remain to be better characterized. We recorded simultaneous EEG-fMRI data while participants passively viewed affective images from the International Affective Picture System (IAPS). Applying multivariate pattern analysis to decode the EEG data, and representational similarity analysis to fuse the EEG data with the simultaneously recorded fMRI data, we found that: (1) ∼80 ms after picture onset, perceptual processing of complex visual scenes began in early visual cortex, proceeding to ventral visual cortex at ∼100 ms; (2) between ∼200 and ∼300 ms (pleasant pictures: ∼200 ms; unpleasant pictures: ∼260 ms), affect-specific neural representations began to form, supported mainly by areas in occipital and temporal cortices; and (3) affect-specific neural representations were stable, lasting up to ∼2 s, and exhibited temporally generalizable activity patterns. These results suggest that affective scene representations in the brain are formed temporally in a valence-dependent manner and may be sustained by recurrent neural interactions among distributed brain areas.
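The EEG-fMRI fusion step rests on representational similarity analysis: build a representational dissimilarity matrix (RDM) from the EEG data at each time point and correlate it with the RDM of an fMRI region, yielding a time course of when that region's representational geometry appears in the EEG signal. The following is a bare-bones sketch on random data; condition counts and the distance and correlation choices are assumptions, not the authors' exact settings.

```python
import numpy as np
from scipy.spatial.distance import pdist
from scipy.stats import spearmanr

rng = np.random.default_rng(0)
n_conditions, n_channels, n_times, n_voxels = 20, 64, 100, 300

# Condition-averaged EEG (conditions x channels x time) and fMRI ROI patterns (conditions x voxels).
eeg = rng.standard_normal((n_conditions, n_channels, n_times))
fmri_roi = rng.standard_normal((n_conditions, n_voxels))

# One RDM for the ROI, and one per EEG time point (condensed vectors of pairwise distances).
fmri_rdm = pdist(fmri_roi, metric="correlation")
fusion = np.empty(n_times)
for t in range(n_times):
    eeg_rdm = pdist(eeg[:, :, t], metric="correlation")
    rho, _ = spearmanr(eeg_rdm, fmri_rdm)
    fusion[t] = rho          # EEG-fMRI representational similarity at time t
```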
Affiliation(s)
- Ke Bo
- J. Crayton Pruitt Family Department of Biomedical Engineering, University of Florida, Gainesville, FL 32611, USA; Department of Psychological and Brain Sciences, Dartmouth College, Hanover, NH 03755, USA
- Lihan Cui
- J. Crayton Pruitt Family Department of Biomedical Engineering, University of Florida, Gainesville, FL 32611, USA
- Siyang Yin
- J. Crayton Pruitt Family Department of Biomedical Engineering, University of Florida, Gainesville, FL 32611, USA
- Zhenhong Hu
- J. Crayton Pruitt Family Department of Biomedical Engineering, University of Florida, Gainesville, FL 32611, USA
- Xiangfei Hong
- J. Crayton Pruitt Family Department of Biomedical Engineering, University of Florida, Gainesville, FL 32611, USA; Shanghai Key Laboratory of Psychotic Disorders, Shanghai Mental Health Center, Shanghai Jiao Tong University School of Medicine, Shanghai, 200030, China
- Sungkean Kim
- J. Crayton Pruitt Family Department of Biomedical Engineering, University of Florida, Gainesville, FL 32611, USA; Department of Human-Computer Interaction, Hanyang University, Ansan, Republic of Korea
- Andreas Keil
- Department of Psychology, University of Florida, Gainesville, FL 32611, USA
- Mingzhou Ding
- J. Crayton Pruitt Family Department of Biomedical Engineering, University of Florida, Gainesville, FL 32611, USA
7
The spatiotemporal neural dynamics of object location representations in the human brain. Nat Hum Behav 2022; 6:796-811. [PMID: 35210593] [PMCID: PMC9225954] [DOI: 10.1038/s41562-022-01302-0]
Abstract
To interact with objects in complex environments, we must know what they are and where they are in spite of challenging viewing conditions. Here, we investigated where, how and when representations of object location and category emerge in the human brain when objects appear on cluttered natural scene images using a combination of functional magnetic resonance imaging, electroencephalography and computational models. We found location representations to emerge along the ventral visual stream towards lateral occipital complex, mirrored by gradual emergence in deep neural networks. Time-resolved analysis suggested that computing object location representations involves recurrent processing in high-level visual cortex. Object category representations also emerged gradually along the ventral visual stream, with evidence for recurrent computations. These results resolve the spatiotemporal dynamics of the ventral visual stream that give rise to representations of where and what objects are present in a scene under challenging viewing conditions.
8
Multivariate Analysis of Evoked Responses during the Rubber Hand Illusion Suggests a Temporal Parcellation into Manipulation and Illusion-Specific Correlates. eNeuro 2022; 9:ENEURO.0355-21.2021. [PMID: 34980661] [PMCID: PMC8805188] [DOI: 10.1523/eneuro.0355-21.2021]
Abstract
The neurophysiological processes reflecting body illusions such as the rubber hand remain debated. Previous studies investigating the neural responses evoked by the illusion-inducing stimulation have provided diverging reports as to when these responses reflect the illusory state of the artificial limb becoming embodied. One reason for these diverging reports may be that different studies contrasted different experimental conditions to isolate potential correlates of the illusion, but individual contrasts may reflect multiple facets of the adopted experimental paradigm and not just the illusory state. To resolve these controversies, we recorded EEG responses in human participants and combined multivariate (cross-)classification with multiple Illusion and non-Illusion conditions. These conditions were designed to probe for markers of the illusory state that generalize across the spatial arrangements of limbs or the specific nature of the control object (a rubber hand or the participant's real hand), and hence are independent of the precise experimental conditions used as a contrast for the illusion. Our results reveal a parcellation of evoked responses into a temporal sequence of events. Around 125 and 275 ms following stimulus onset, the neurophysiological signals reliably differentiate the illusory state from non-Illusion epochs. These results consolidate previous work by demonstrating multiple neurophysiological correlates of the rubber hand illusion and illustrate how multivariate approaches can help pinpoint those that are independent of the precise experimental configuration used to induce the illusion.
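The cross-classification logic, training a decoder on one illusion-versus-control contrast and testing whether it generalizes to a different contrast, can be sketched as below with simulated epochs; the classifier and the planted 'illusion' signal are placeholders, not the study's analysis.

```python
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

rng = np.random.default_rng(3)
n_trials, n_channels = 80, 64

def fake_epochs(shift=0.5):
    """Placeholder EEG amplitude patterns: control (0) vs. Illusion (1) epochs."""
    y = np.repeat([0, 1], n_trials // 2)
    X = rng.standard_normal((n_trials, n_channels))
    X[y == 1, :10] += shift                 # putative illusion-related signal
    return X, y

# Contrast A: Illusion vs. rubber-hand control; Contrast B: Illusion vs. real-hand control.
X_a, y_a = fake_epochs()
X_b, y_b = fake_epochs()

clf = LinearDiscriminantAnalysis().fit(X_a, y_a)
cross_acc = clf.score(X_b, y_b)   # above-chance generalization implies a contrast-independent marker
print(f"cross-classification accuracy: {cross_acc:.2f}")
```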
9
Lowe MX, Mohsenzadeh Y, Lahner B, Charest I, Oliva A, Teng S. Cochlea to categories: The spatiotemporal dynamics of semantic auditory representations. Cogn Neuropsychol 2021; 38:468-489. [PMID: 35729704] [PMCID: PMC10589059] [DOI: 10.1080/02643294.2022.2085085]
Abstract
How does the auditory system categorize natural sounds? Here we apply multimodal neuroimaging to illustrate the progression from acoustic to semantically dominated representations. Combining magnetoencephalographic (MEG) and functional magnetic resonance imaging (fMRI) scans of observers listening to naturalistic sounds, we found superior temporal responses beginning ∼55 ms post-stimulus onset, spreading to extratemporal cortices by ∼100 ms. Early regions were distinguished less by onset/peak latency than by functional properties and overall temporal response profiles. Early, acoustically dominated representations trended systematically toward category dominance over time (after ∼200 ms) and space (beyond primary cortex). Semantic category representation was spatially specific: vocalizations were preferentially distinguished in frontotemporal voice-selective regions and the fusiform; scenes and objects were distinguished in parahippocampal and medial place areas. Our results are consistent with real-world events being coded via an extended auditory processing hierarchy, in which acoustic representations rapidly enter multiple streams specialized by category, including areas typically considered visual cortex.
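Time-resolved decoding of this kind is typically implemented by fitting a separate classifier at every time point of the sensor data and tracking when accuracy rises above chance. A generic sketch with simulated MEG epochs follows; sensor counts, the time base, and the planted category signal are all assumptions, not the study's data or pipeline.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(7)
n_trials, n_sensors, n_times = 120, 100, 60        # e.g., -100 to 500 ms in 10 ms steps

# Placeholder epochs with a category signal emerging after time index 25.
y = np.repeat([0, 1], n_trials // 2)                # e.g., vocalization vs. non-vocalization
X = rng.standard_normal((n_trials, n_sensors, n_times))
X[y == 1, :20, 25:] += 0.3

clf = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
accuracy = np.array([cross_val_score(clf, X[:, :, t], y, cv=5).mean() for t in range(n_times)])
onset_idx = int(np.argmax(accuracy > 0.6))          # crude estimate of decoding onset
print(f"decoding first exceeds 60% at time index {onset_idx}")
```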
Affiliation(s)
- Matthew X. Lowe
- Computer Science and Artificial Intelligence Lab (CSAIL), MIT, Cambridge, MA
- Unlimited Sciences, Colorado Springs, CO
- Yalda Mohsenzadeh
- Computer Science and Artificial Intelligence Lab (CSAIL), MIT, Cambridge, MA
- The Brain and Mind Institute, The University of Western Ontario, London, ON, Canada
- Department of Computer Science, The University of Western Ontario, London, ON, Canada
- Benjamin Lahner
- Computer Science and Artificial Intelligence Lab (CSAIL), MIT, Cambridge, MA
- Ian Charest
- Département de Psychologie, Université de Montréal, Montréal, Québec, Canada
- Center for Human Brain Health, University of Birmingham, UK
- Aude Oliva
- Computer Science and Artificial Intelligence Lab (CSAIL), MIT, Cambridge, MA
- Santani Teng
- Computer Science and Artificial Intelligence Lab (CSAIL), MIT, Cambridge, MA
- Smith-Kettlewell Eye Research Institute (SKERI), San Francisco, CA
10
Cichy RM, Oliva A. A M/EEG-fMRI Fusion Primer: Resolving Human Brain Responses in Space and Time. Neuron 2020; 107:772-781. [DOI: 10.1016/j.neuron.2020.07.001]
11
Functional Imaging of Visuospatial Attention in Complex and Naturalistic Conditions. Curr Top Behav Neurosci 2020. [PMID: 30547430] [DOI: 10.1007/7854_2018_73]
Abstract
One of the ultimate goals of cognitive neuroscience is to understand how the brain works in the real world. Functional imaging with naturalistic stimuli provides us with the opportunity to study the brain in situations similar to everyday life. This includes the processing of complex stimuli that can trigger many types of signals related both to the physical characteristics of the external input and to the internal knowledge that we have about natural objects and environments. In this chapter, I will first outline different types of stimuli that have been used in naturalistic imaging studies. These include static pictures, short video clips, full-length movies, and virtual reality, each with specific advantages and disadvantages. Next, I will turn to the main issue of visual-spatial orienting in naturalistic conditions and its neural substrates. I will discuss different classes of internal signals, related to objects, scene structure, and long-term memory. All of these, together with external signals about stimulus salience, have been found to modulate the activity and the connectivity of the frontoparietal attention networks. I will conclude by pointing out some promising future directions for functional imaging with naturalistic stimuli. Although this field of research is still in its early days, I consider that it will play a major role in bridging the gap between standard laboratory paradigms and mechanisms of brain functioning in the real world.
12
Gordon G. Social behaviour as an emergent property of embodied curiosity: a robotics perspective. Philos Trans R Soc Lond B Biol Sci 2019; 374:20180029. [PMID: 30853006] [PMCID: PMC6452242] [DOI: 10.1098/rstb.2018.0029]
Abstract
Social interaction is an extremely complex yet vital component in daily life. We present a bottom-up approach for the emergence of social behaviours from the interaction of the curiosity drive, i.e. the intrinsic motivation to learn as much as possible, and the embedding environment of an agent. Implementing artificial curiosity algorithms in robots that explore human-like environments results in the emergence of a hierarchical structure of learning and behaviour. This structure resembles the sequential emergence of behavioural patterns in human babies, culminating in social behaviours, such as face detection, tracking and attention-grabbing facial expressions. These results suggest that an embodied curiosity drive may be the progenitor of many social behaviours if satiated by a social environment. This article is part of the theme issue 'From social brains to social robots: applying neurocognitive insights to human-robot interaction'.
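As a cartoon of the underlying mechanism, a curiosity drive can be modelled as an intrinsic reward for learning progress: the agent keeps choosing whatever it is currently getting better at predicting, so richly structured (e.g., social) stimuli hold its attention longest. The toy agent below is only a schematic illustration of that general idea, not the robot architecture used in the paper; all dynamics and constants are invented.

```python
import random
from collections import defaultdict

class CuriousAgent:
    """Learning-progress-driven agent: prefers the action whose prediction
    error has been shrinking fastest (an intrinsic 'curiosity' reward)."""

    def __init__(self, actions, window=5):
        self.actions, self.window = actions, window
        self.errors = defaultdict(list)             # prediction-error history per action

    def learning_progress(self, action):
        e, w = self.errors[action], self.window
        if len(e) < 2 * w:
            return float("inf")                     # unexplored actions look maximally promising
        return sum(e[-2 * w:-w]) / w - sum(e[-w:]) / w

    def choose(self):
        return max(self.actions, key=self.learning_progress)

    def observe(self, action, prediction_error):
        self.errors[action].append(prediction_error)

# Toy world: "faces" stay learnable longer than a static "wall", so the agent
# gravitates toward them -- curiosity alone yields a socially oriented policy.
agent = CuriousAgent(["wall", "faces"])
for _ in range(200):
    a = agent.choose()
    decay = 0.99 if a == "faces" else 0.80          # how quickly each source becomes predictable
    agent.observe(a, decay ** len(agent.errors[a]) + random.gauss(0.0, 0.01))
print({a: len(agent.errors[a]) for a in agent.actions})
```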
Affiliation(s)
- Goren Gordon
- Curiosity Lab, Department of Industrial Engineering, Tel-Aviv University, Tel-Aviv, Israel
13
Aller M, Noppeney U. To integrate or not to integrate: Temporal dynamics of hierarchical Bayesian causal inference. PLoS Biol 2019; 17:e3000210. [PMID: 30939128] [PMCID: PMC6461295] [DOI: 10.1371/journal.pbio.3000210]
Abstract
To form a percept of the environment, the brain needs to solve the binding problem: inferring whether signals come from a common cause and are integrated, or come from independent causes and are segregated. Behaviourally, humans solve this problem near-optimally as predicted by Bayesian causal inference, but the neural mechanisms remain unclear. Combining Bayesian modelling, electroencephalography (EEG), and multivariate decoding in an audiovisual spatial localisation task, we show that the brain accomplishes Bayesian causal inference by dynamically encoding multiple spatial estimates. Initially, auditory and visual signal locations are estimated independently; next, an estimate is formed that combines information from vision and audition. Yet it is only from 200 ms onwards that the brain integrates audiovisual signals weighted by their bottom-up sensory reliabilities and top-down task relevance into spatial priority maps that guide behavioural responses. As predicted by Bayesian causal inference, these spatial priority maps take into account the brain's uncertainty about the world's causal structure and flexibly arbitrate between sensory integration and segregation. The dynamic evolution of perceptual estimates thus reflects the hierarchical nature of Bayesian causal inference, a statistical computation that is crucial for effective interactions with the environment.
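The core computation can be written out directly: evaluate the likelihood of the auditory and visual samples under a common-cause versus an independent-cause model, form the posterior probability of a common cause, and average the fused (reliability-weighted) and segregated location estimates by that posterior. The function below is a generic, textbook-style sketch with made-up noise and prior parameters, not the model as fitted in the paper.

```python
import numpy as np

def bci_model_averaging(x_a, x_v, sigma_a=8.0, sigma_v=2.0, sigma_p=15.0, p_common=0.5):
    """Bayesian causal inference for one audiovisual trial (model averaging).
    x_a, x_v: noisy auditory and visual location samples (deg); spatial prior is N(0, sigma_p^2).
    All parameter values here are arbitrary illustrations."""
    va, vv, vp = sigma_a**2, sigma_v**2, sigma_p**2

    # Likelihood of the two samples under a common cause (C = 1) vs. independent causes (C = 2).
    denom_c1 = va * vv + va * vp + vv * vp
    like_c1 = np.exp(-0.5 * ((x_a - x_v) ** 2 * vp + x_a**2 * vv + x_v**2 * va) / denom_c1) \
        / (2 * np.pi * np.sqrt(denom_c1))
    like_c2 = np.exp(-0.5 * (x_a**2 / (va + vp) + x_v**2 / (vv + vp))) \
        / (2 * np.pi * np.sqrt((va + vp) * (vv + vp)))
    post_c1 = like_c1 * p_common / (like_c1 * p_common + like_c2 * (1 - p_common))

    # Location estimates under each causal structure.
    w_sum = 1 / va + 1 / vv + 1 / vp
    s_fused = (x_a / va + x_v / vv) / w_sum              # reliability-weighted fusion (+ prior at 0)
    s_a_seg = (x_a / va) / (1 / va + 1 / vp)
    s_v_seg = (x_v / vv) / (1 / vv + 1 / vp)

    # Model averaging: weight fused and segregated estimates by the posterior over causes.
    s_a_hat = post_c1 * s_fused + (1 - post_c1) * s_a_seg
    s_v_hat = post_c1 * s_fused + (1 - post_c1) * s_v_seg
    return post_c1, s_a_hat, s_v_hat

# Small audiovisual conflict -> integration dominates; large conflict -> segregation.
print(bci_model_averaging(x_a=6.0, x_v=4.0))
print(bci_model_averaging(x_a=25.0, x_v=-5.0))
```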
Affiliation(s)
- Máté Aller
- Computational Neuroscience and Cognitive Robotics Centre, University of Birmingham, Birmingham, United Kingdom
- Uta Noppeney
- Computational Neuroscience and Cognitive Robotics Centre, University of Birmingham, Birmingham, United Kingdom
14
Bart E, Hegdé J. Deep Synthesis of Realistic Medical Images: A Novel Tool in Clinical Research and Training. Front Neuroinform 2018; 12:82. [PMID: 30515089] [PMCID: PMC6255819] [DOI: 10.3389/fninf.2018.00082]
Abstract
Making clinical decisions based on medical images is fundamentally an exercise in statistical decision-making, because the decision-maker must distinguish image features that are clinically diagnostic (i.e., signal) from a large number of non-diagnostic features (i.e., noise). To perform this task, the decision-maker must have learned the underlying statistical distributions of the signal and noise to begin with. The same is true for machine learning algorithms that perform a given diagnostic task. In order to train and test human experts or expert machine systems in any diagnostic or analytical task, it is advisable to use large sets of images, so as to capture the underlying statistical distributions adequately. Large numbers of images are also useful in clinical and scientific research about the underlying diagnostic process, which remains poorly understood. Unfortunately, it is often difficult to obtain medical images of given specific descriptions in sufficiently large numbers. This represents a significant barrier to progress in the arenas of clinical care, education, and research. Here we describe a novel methodology that helps overcome this barrier. The method leverages the burgeoning technologies of deep learning (DL) and deep synthesis (DS) to synthesize medical images de novo. We provide a proof of principle of this approach using mammograms as an illustrative case. During the initial, prerequisite DL phase of the study, we trained a publicly available deep learning neural network (DNN), using open-sourced, radiologically vetted mammograms as labeled examples. During the subsequent DS phase of the study, the fully trained DNN was made to synthesize, de novo, images that capture the image statistics of a given input image. The resulting images indicated that our DNN was able to faithfully capture the image statistics of visually diverse sets of mammograms. We also briefly outline rigorous psychophysical testing methods for measuring the extent to which synthesized mammograms appear, to human experts, sufficiently like their original counterparts. These tests reveal that mammography experts fail to distinguish synthesized mammograms from their original counterparts at a statistically significant level, suggesting that the synthesized images were sufficiently realistic. Taken together, these results demonstrate that deep synthesis has the potential to be impactful in all fields in which medical images play a key role, most notably in radiology and pathology.
Affiliation(s)
- Evgeniy Bart
- Palo Alto Research Center, Palo Alto, CA, United States
- Jay Hegdé
- Department of Neuroscience and Regenerative Medicine, James and Jean Culver Vision Discovery Institute, The Graduate School, Augusta University, Augusta, GA, United States; Department of Ophthalmology, Medical College of Georgia, Augusta University, Augusta, GA, United States
15
Cavallo A, Romeo L, Ansuini C, Podda J, Battaglia F, Veneselli E, Pontil M, Becchio C. Prospective motor control obeys to idiosyncratic strategies in autism. Sci Rep 2018; 8:13717. [PMID: 30209274] [PMCID: PMC6135837] [DOI: 10.1038/s41598-018-31479-2]
Abstract
Disturbance of primary prospective motor control has been proposed to contribute to faults in higher mind functions of individuals with autism spectrum disorder, but little research has been conducted to characterize prospective control strategies in autism. In the current study, we applied pattern-classification analyses to kinematic features to verify whether children with autism spectrum disorder (ASD) and typically developing (TD) children altered their initial grasp in anticipation of self- and other-actions. Results indicate that children with autism adjusted their behavior to accommodate onward actions. The way they did so, however, varied idiosyncratically from one individual to another, which suggests that previous characterizations of general lack of prospective control strategies may be overly simplistic. These findings link abnormalities in anticipatory control with increased variability and offer insights into the difficulties that individuals with ASD may experience in social interaction.
Affiliation(s)
- Andrea Cavallo
- Department of Psychology, University of Torino, Torino, Italy; Cognition, Motion & Neuroscience Unit, Fondazione Istituto Italiano di Tecnologia, Genova, Italy
- Luca Romeo
- Computational Statistics and Machine Learning, Fondazione Istituto Italiano di Tecnologia, Genova, Italy; Dipartimento di Ingegneria dell'Informazione, Università Politecnica delle Marche, Ancona, Italy
- Caterina Ansuini
- Cognition, Motion & Neuroscience Unit, Fondazione Istituto Italiano di Tecnologia, Genova, Italy
- Jessica Podda
- Cognition, Motion & Neuroscience Unit, Fondazione Istituto Italiano di Tecnologia, Genova, Italy
- Massimiliano Pontil
- Computational Statistics and Machine Learning, Fondazione Istituto Italiano di Tecnologia, Genova, Italy; Department of Computer Science, University College London, London, UK
- Cristina Becchio
- Department of Psychology, University of Torino, Torino, Italy; Cognition, Motion & Neuroscience Unit, Fondazione Istituto Italiano di Tecnologia, Genova, Italy
16
Pinheiro-Chagas P, Piazza M, Dehaene S. Decoding the processing stages of mental arithmetic with magnetoencephalography. Cortex 2018; 114:124-139. [PMID: 30177399] [DOI: 10.1016/j.cortex.2018.07.018]
Abstract
Elementary arithmetic is highly prevalent in our daily lives. However, despite decades of research, we are only beginning to understand how the brain solves simple calculations. Here, we applied machine learning techniques to magnetoencephalography (MEG) signals in an effort to decompose the successive processing stages and mental transformations underlying elementary arithmetic. Adult subjects verified single-digit addition and subtraction problems such as 3 + 2 = 9, in which each successive symbol was presented sequentially. MEG signals revealed a cascade of partially overlapping brain states. While the first operand could be transiently decoded above chance level, primarily based on its visual properties, the decoding of the second operand was more accurate and lasted longer. Representational similarity analyses suggested that this decoding rested on both visual and magnitude codes. We were also able to decode the operation type (addition vs. subtraction) during practically the entire trial after the presentation of the operation sign. At the decision stage, MEG indicated fast and highly overlapping temporal dynamics for (1) identifying the proposed result, (2) judging whether it was correct or incorrect, and (3) pressing the response button. Surprisingly, however, the internally computed result could not be decoded. Our results provide a first comprehensive picture of the unfolding processing stages underlying arithmetic calculations at the single-trial level, and suggest that externally and internally generated neural codes may have different neural substrates.
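The claim that a representation is decodable during practically the entire trial is typically assessed with temporal generalization: a decoder trained at one time point is tested at all others, and a broad square of above-chance scores indicates a sustained, stable code. Below is a generic sketch on simulated epochs; trial counts, the time base, and the planted operation-type signal are assumptions, not the study's data.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(11)
n_trials, n_sensors, n_times = 100, 150, 40

# Placeholder MEG epochs; label = operation type (0 = addition, 1 = subtraction),
# with a sustained signal after the operation sign "appears" (time index 15).
y = np.repeat([0, 1], n_trials // 2)
X = rng.standard_normal((n_trials, n_sensors, n_times))
X[y == 1, :15, 15:] += 0.4

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, stratify=y, random_state=0)

# Temporal generalization: train a decoder at each time point, test it at every other one.
gen = np.zeros((n_times, n_times))
for t_train in range(n_times):
    clf = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
    clf.fit(X_tr[:, :, t_train], y_tr)
    for t_test in range(n_times):
        gen[t_train, t_test] = clf.score(X_te[:, :, t_test], y_te)
# A square block of above-chance scores in 'gen' indicates a stable, sustained neural code.
```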
Affiliation(s)
- Pedro Pinheiro-Chagas
- Cognitive Neuroimaging Unit, CEA DRF/I2BM, INSERM, Université Paris-Sud, Université Paris-Saclay, NeuroSpin Center, Gif/Yvette, France
- Manuela Piazza
- Center for Mind/Brain Sciences, University of Trento, Rovereto, Italy
- Stanislas Dehaene
- Cognitive Neuroimaging Unit, CEA DRF/I2BM, INSERM, Université Paris-Sud, Université Paris-Saclay, NeuroSpin Center, Gif/Yvette, France; Collège de France, 11 Place Marcelin Berthelot, Paris, France
17
Modality-Independent Coding of Scene Categories in Prefrontal Cortex. J Neurosci 2018; 38:5969-5981. [PMID: 29858483] [DOI: 10.1523/jneurosci.0272-18.2018]
Abstract
Natural environments convey information through multiple sensory modalities, all of which contribute to people's percepts. Although it has been shown that visual or auditory content of scene categories can be decoded from brain activity, it remains unclear how humans represent scene information beyond a specific sensory modality domain. To address this question, we investigated how categories of scene images and sounds are represented in several brain regions. A group of healthy human subjects (both sexes) participated in the present study, where their brain activity was measured with fMRI while viewing images or listening to sounds of different real-world environments. We found that both visual and auditory scene categories can be decoded not only from modality-specific areas, but also from several brain regions in the temporal, parietal, and prefrontal cortex (PFC). Intriguingly, only in the PFC, but not in any other regions, categories of scene images and sounds appear to be represented in similar activation patterns, suggesting that scene representations in PFC are modality-independent. Furthermore, the error patterns of neural decoders indicate that category-specific neural activity patterns in the middle and superior frontal gyri are tightly linked to categorization behavior. Our findings demonstrate that complex scene information is represented at an abstract level in the PFC, regardless of the sensory modality of the stimulus.

SIGNIFICANCE STATEMENT Our experience in daily life includes multiple sensory inputs, such as images, sounds, or scents from the surroundings, which all contribute to our understanding of the environment. Here, for the first time, we investigated where and how in the brain information about the natural environment from multiple senses is merged to form modality-independent representations of scene categories. We show direct decoding of scene categories across sensory modalities from patterns of neural activity in the prefrontal cortex (PFC). We also conclusively tie these neural representations to human categorization behavior by comparing patterns of errors between a neural decoder and behavior. Our findings suggest that PFC is a central hub for integrating sensory information and computing modality-independent representations of scene categories.
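Linking decoder errors to behaviour, as described above, usually means correlating the off-diagonal cells of a neural confusion matrix with those of a behavioural confusion matrix. The snippet below illustrates that comparison with invented 4 x 4 matrices; the categories and numbers are placeholders, not the study's data.

```python
import numpy as np
from scipy.stats import spearmanr

# Hypothetical confusion matrices (rows: true scene category, columns: decoded/reported category);
# values are illustrative and sum to 1 per row.
neural_confusion = np.array([
    [0.70, 0.15, 0.10, 0.05],
    [0.20, 0.65, 0.05, 0.10],
    [0.05, 0.05, 0.75, 0.15],
    [0.05, 0.10, 0.20, 0.65],
])
behav_confusion = np.array([
    [0.75, 0.10, 0.10, 0.05],
    [0.15, 0.70, 0.05, 0.10],
    [0.05, 0.05, 0.80, 0.10],
    [0.05, 0.10, 0.15, 0.70],
])

# Compare only the error patterns: correlate off-diagonal cells across the two matrices.
off_diag = ~np.eye(4, dtype=bool)
rho, p = spearmanr(neural_confusion[off_diag], behav_confusion[off_diag])
print(f"error-pattern correlation: rho = {rho:.2f}, p = {p:.3f}")
```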
18
Kaya EM, Elhilali M. Modelling auditory attention. Philos Trans R Soc Lond B Biol Sci 2017; 372.
Abstract
Sounds in everyday life seldom appear in isolation. Both humans and machines are constantly flooded with a cacophony of sounds that need to be sorted through and scoured for relevant information, a phenomenon referred to as the 'cocktail party problem'. A key component in parsing acoustic scenes is the role of attention, which mediates perception and behaviour by focusing both sensory and cognitive resources on pertinent information in the stimulus space. The current article provides a review of modelling studies of auditory attention. The review highlights how the term attention refers to a multitude of behavioural and cognitive processes that can shape sensory processing. Attention can be modulated by 'bottom-up' sensory-driven factors, as well as 'top-down' task-specific goals, expectations and learned schemas. Essentially, it acts as a selection process or processes that focus both sensory and cognitive resources on the most relevant events in the soundscape, with relevance being dictated by the stimulus itself (e.g. a loud explosion) or by a task at hand (e.g. listening for announcements in a busy airport). Recent computational models of auditory attention provide key insights into its role in facilitating perception in cluttered auditory scenes. This article is part of the themed issue 'Auditory and visual scene analysis'.
Affiliation(s)
- Emine Merve Kaya
- Laboratory for Computational Audio Perception, Department of Electrical and Computer Engineering, The Johns Hopkins University, 3400 N Charles Street, Barton Hall, Baltimore, MD 21218, USA
- Mounya Elhilali
- Laboratory for Computational Audio Perception, Department of Electrical and Computer Engineering, The Johns Hopkins University, 3400 N Charles Street, Barton Hall, Baltimore, MD 21218, USA
19
Dykstra AR, Cariani PA, Gutschalk A. A roadmap for the study of conscious audition and its neural basis. Philos Trans R Soc Lond B Biol Sci 2017; 372:20160103. [PMID: 28044014] [PMCID: PMC5206271] [DOI: 10.1098/rstb.2016.0103]
Abstract
How and which aspects of neural activity give rise to subjective perceptual experience (i.e. conscious perception) is a fundamental question of neuroscience. To date, the vast majority of work concerning this question has come from vision, raising the issue of the generalizability of prominent resulting theories. However, recent work has begun to shed light on the neural processes subserving conscious perception in other modalities, particularly audition. Here, we outline a roadmap for the future study of conscious auditory perception and its neural basis, paying particular attention to how conscious perception emerges (and of which elements or groups of elements) in complex auditory scenes. We begin by discussing the functional role of the auditory system, particularly as it pertains to conscious perception. Next, we ask: what are the phenomena that need to be explained by a theory of conscious auditory perception? After surveying the available literature for candidate neural correlates, we end by considering the implications that such results have for a general theory of conscious perception, as well as prominent outstanding questions and the approaches and techniques that can best be used to address them. This article is part of the themed issue 'Auditory and visual scene analysis'.
Affiliation(s)
- Andrew R Dykstra
- Department of Neurology, Ruprecht-Karls-Universität Heidelberg, Heidelberg, Germany
- Alexander Gutschalk
- Department of Neurology, Ruprecht-Karls-Universität Heidelberg, Heidelberg, Germany
20
Kondo HM, van Loon AM, Kawahara JI, Moore BCJ. Auditory and visual scene analysis: an overview. Philos Trans R Soc Lond B Biol Sci 2017; 372:20160099. [PMID: 28044011] [DOI: 10.1098/rstb.2016.0099]
Abstract
We perceive the world as stable and composed of discrete objects even though auditory and visual inputs are often ambiguous owing to spatial and temporal occluders and changes in the conditions of observation. This raises important questions regarding where and how 'scene analysis' is performed in the brain. Recent advances from both auditory and visual research suggest that the brain does not simply process the incoming scene properties. Rather, top-down processes such as attention, expectations and prior knowledge facilitate scene perception. Thus, scene analysis is linked not only with the extraction of stimulus features and the formation and selection of perceptual objects, but also with selective attention, perceptual binding and awareness. This special issue covers novel advances in scene-analysis research obtained using a combination of psychophysics, computational modelling, neuroimaging and neurophysiology, and presents new empirical and theoretical approaches. For an integrative understanding of scene analysis beyond and across sensory modalities, we provide a collection of 15 articles that enable comparison and integration of recent findings in auditory and visual scene analysis. This article is part of the themed issue 'Auditory and visual scene analysis'.
Affiliation(s)
- Hirohito M Kondo
- Human Information Science Laboratory, NTT Communication Science Laboratories, NTT Corporation, Atsugi, Kanagawa 243-0198, Japan
- Anouk M van Loon
- Department of Experimental and Applied Psychology, Vrije Universiteit Amsterdam, Amsterdam 1081 BT, The Netherlands; Institute of Brain and Behavior Amsterdam, Vrije Universiteit Amsterdam, Amsterdam 1081 BT, The Netherlands
- Jun-Ichiro Kawahara
- Department of Psychology, Graduate School of Letters, Hokkaido University, Sapporo 060-0810, Japan
- Brian C J Moore
- Department of Experimental Psychology, University of Cambridge, Downing Street, Cambridge CB2 3EB, UK
21
Groen IIA, Silson EH, Baker CI. Contributions of low- and high-level properties to neural processing of visual scenes in the human brain. Philos Trans R Soc Lond B Biol Sci 2017; 372:20160102. [PMID: 28044013] [DOI: 10.1098/rstb.2016.0102]
Abstract
Visual scene analysis in humans has been characterized by the presence of regions in extrastriate cortex that are selectively responsive to scenes compared with objects or faces. While these regions have often been interpreted as representing high-level properties of scenes (e.g. category), they also exhibit substantial sensitivity to low-level (e.g. spatial frequency) and mid-level (e.g. spatial layout) properties, and it is unclear how these disparate findings can be united in a single framework. In this opinion piece, we suggest that this problem can be resolved by questioning the utility of the classical low- to high-level framework of visual perception for scene processing, and discuss why low- and mid-level properties may be particularly diagnostic for the behavioural goals specific to scene perception as compared to object recognition. In particular, we highlight the contributions of low-level vision to scene representation by reviewing (i) retinotopic biases and receptive field properties of scene-selective regions and (ii) the temporal dynamics of scene perception that demonstrate overlap of low- and mid-level feature representations with those of scene category. We discuss the relevance of these findings for scene perception and suggest a more expansive framework for visual scene analysis. This article is part of the themed issue 'Auditory and visual scene analysis'.
Affiliation(s)
- Iris I A Groen
- Laboratory of Brain and Cognition, National Institutes of Health, 10 Center Drive 10-3N228, Bethesda, MD, USA
- Edward H Silson
- Laboratory of Brain and Cognition, National Institutes of Health, 10 Center Drive 10-3N228, Bethesda, MD, USA
- Chris I Baker
- Laboratory of Brain and Cognition, National Institutes of Health, 10 Center Drive 10-3N228, Bethesda, MD, USA