1. Audio-Visual Integration in a Redundant Target Paradigm: A Comparison between Rhesus Macaque and Man. Front Syst Neurosci 2017; 11:89. PMID: 29238295; PMCID: PMC5712580; DOI: 10.3389/fnsys.2017.00089.
Abstract
The mechanisms underlying multi-sensory interactions are still poorly understood, despite the considerable progress made since the first neurophysiological recordings of multi-sensory neurons. While most single-cell neurophysiology has been performed in anesthetized or passive-awake laboratory animals, the vast majority of behavioral data stems from studies with human subjects. Interpreting the neurophysiological data thus implicitly assumes that laboratory animals exhibit perceptual phenomena comparable, or even identical, to those observed in human subjects. To test this underlying assumption explicitly, we characterized how two rhesus macaques and four humans detect changes in the intensity of auditory, visual, and audio-visual stimuli. These intensity changes consisted of a gradual envelope modulation for the sound, and a luminance step for the LED. Subjects had to detect any perceived intensity change as quickly as possible. By comparing the monkeys' results with those of the human subjects, we found that (1) unimodal reaction times differed across modality, acoustic modulation frequency, and species; (2) the largest facilitation of reaction times for audio-visual stimuli was observed at stimulus onset asynchronies for which the unimodal reactions would occur at the same time (response synchrony, rather than physical synchrony); and (3) audio-visual reaction-time facilitation was largest when the unimodal auditory stimuli were difficult to detect, i.e., at slow unimodal reaction times. We conclude that, despite marked unimodal heterogeneity, similar multisensory rules applied to both species. Single-cell neurophysiology in the rhesus macaque may therefore yield valuable insights into the mechanisms governing audio-visual integration in the human brain.
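
Reaction-time facilitation in redundant-target paradigms like this one is commonly benchmarked against a race model, in which the faster of two independent unimodal detection processes triggers the response; bimodal reaction times faster than the race-model bound indicate genuine integration. Below is a minimal sketch of that benchmark, assuming simulated reaction-time samples rather than the study's data:

```python
# Minimal race-model benchmark for audiovisual reaction-time facilitation.
# All reaction times below are simulated for illustration only.
import numpy as np

rng = np.random.default_rng(0)
rt_a = rng.normal(280, 40, 500)    # hypothetical auditory RTs (ms)
rt_v = rng.normal(250, 35, 500)    # hypothetical visual RTs (ms)
rt_av = rng.normal(215, 30, 500)   # hypothetical audiovisual RTs (ms)

t = np.linspace(100, 450, 200)     # time axis for the cumulative distributions
cdf = lambda x: np.mean(x[:, None] <= t[None, :], axis=0)
F_a, F_v, F_av = cdf(rt_a), cdf(rt_v), cdf(rt_av)

# Independent race: P(min(A, V) <= t) = F_a + F_v - F_a * F_v.
# Miller's inequality uses the upper bound min(F_a + F_v, 1).
F_race = F_a + F_v - F_a * F_v
miller = np.minimum(F_a + F_v, 1.0)

# Positive values mean the bimodal CDF exceeds the bound: evidence for
# integration beyond statistical facilitation by independent channels.
print(f"max excess over race prediction: {np.max(F_av - F_race):.3f}")
print(f"max excess over Miller's bound:  {np.max(F_av - miller):.3f}")
```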

2. Level-weighted averaging in elevation to synchronous amplitude-modulated sounds. J Acoust Soc Am 2017; 142:3094. PMID: 29195479; PMCID: PMC6147220; DOI: 10.1121/1.5011182.
Abstract
To program a goal-directed response in the presence of multiple sounds, the audiomotor system should separate the sound sources. The authors examined whether the brain can segregate synchronous broadband sounds in the midsagittal plane, using amplitude modulations as an acoustic discrimination cue. To succeed in this task, the brain has to use pinna-induced spectral-shape cues and temporal envelope information. The authors tested spatial segregation performance in the midsagittal plane in two paradigms in which human listeners were required to localize, or distinguish, a target amplitude-modulated broadband sound when a non-modulated broadband distractor was played simultaneously at another location. The level difference between the amplitude-modulated and distractor stimuli was systematically varied, as well as the modulation frequency of the target sound. The authors found that participants were unable to segregate, or localize, the synchronous sounds. Instead, they invariably responded toward a level-weighted average of both sound locations, irrespective of the modulation frequency. An increased variance in the response distributions for double sounds of equal level was also observed, which cannot be accounted for by a segregation model, or by a probabilistic averaging model.
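
The level-weighted averaging result lends itself to a compact model: the response points to an average of the two source elevations, weighted by their sound levels. A minimal sketch, assuming power-law (10^(dB/10)) level weights, which is an illustrative choice rather than the paper's fitted model:

```python
# Level-weighted averaging model for double-sound elevation responses: the
# response lands on an average of the two source elevations, weighted by
# sound level. Power-law weights (10**(dB/10)) are an illustrative choice.
def weighted_average_response(elev_target, elev_distractor,
                              level_target_db, level_distractor_db):
    """Predicted response elevation (deg) for a target/distractor pair."""
    w_t = 10.0 ** (level_target_db / 10.0)       # weight of the AM target
    w_d = 10.0 ** (level_distractor_db / 10.0)   # weight of the distractor
    return (w_t * elev_target + w_d * elev_distractor) / (w_t + w_d)

# Equal levels: response midway. Target 10 dB louder: response pulled toward it.
print(weighted_average_response(20.0, -20.0, 60.0, 60.0))  # -> 0.0 deg
print(weighted_average_response(20.0, -20.0, 60.0, 50.0))  # -> ~16.4 deg
```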

3. Single-sided deafness and directional hearing: contribution of spectral cues and high-frequency hearing loss in the hearing ear. Front Neurosci 2014; 8:188. PMID: 25071433; PMCID: PMC4082092; DOI: 10.3389/fnins.2014.00188.
Abstract
Direction-specific interactions of sound waves with the head, torso, and pinna provide unique spectral-shape cues that are used for the localization of sounds in the vertical plane, whereas horizontal sound localization is based primarily on the processing of binaural acoustic differences in arrival time (interaural time differences, or ITDs) and sound level (interaural level differences, or ILDs). Because these binaural sound-localization cues are absent in listeners with total single-sided deafness (SSD), their ability to localize sound is heavily impaired. However, some studies have reported that SSD listeners can, to some extent, localize sound sources in azimuth, although the mechanisms underlying this ability are unclear. To investigate whether SSD listeners rely on the monaural, pinna-induced spectral-shape cues of their hearing ear for directional hearing, we measured localization performance for low-pass filtered (LP, <1.5 kHz), high-pass filtered (HP, >3 kHz), and broadband (BB, 0.5–20 kHz) noises in the two-dimensional frontal hemifield. We also tested whether the localization performance of SSD listeners deteriorated further when the pinna cavities of their hearing ear were filled with a mold that disrupted their spectral-shape cues. To remove the potential use of perceived sound level as an invalid azimuth cue, we randomly varied stimulus presentation levels over a broad range (45–65 dB SPL). Several listeners with SSD could localize HP and BB sound sources in the horizontal plane, but inter-subject variability was considerable. Localization performance of these listeners deteriorated markedly after their spectral pinna cues had been disrupted. We further show that the inter-subject variability among SSD listeners can be explained to a large extent by the severity of the high-frequency hearing loss in their hearing ear.
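
Localization performance in this kind of study is typically summarized by a linear regression of response direction on stimulus direction, yielding a gain (slope near 1 indicates accurate localization) and a bias (offset). A minimal sketch on simulated azimuth data; all numbers are illustrative:

```python
# Standard stimulus-response regression used to summarize localization:
# response = gain * target + bias. Gain near 1 and bias near 0 indicate
# accurate localization. All data are simulated for illustration.
import numpy as np

rng = np.random.default_rng(1)
target_az = rng.uniform(-75, 75, 120)                       # stimulus azimuth (deg)
response_az = 0.8 * target_az + 5 + rng.normal(0, 8, 120)   # hypothetical responses

gain, bias = np.polyfit(target_az, response_az, 1)
residual_sd = np.std(response_az - (gain * target_az + bias))
print(f"gain = {gain:.2f}, bias = {bias:.1f} deg, residual sd = {residual_sd:.1f} deg")
```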

4. Task-related preparatory modulations multiply with acoustic processing in monkey auditory cortex. Eur J Neurosci 2014; 39:1538-50. PMID: 24649904; DOI: 10.1111/ejn.12532.
Abstract
We characterised task-related top-down signals in monkey auditory cortex cells by comparing single-unit activity during passive sound exposure with neuronal activity during predictable and unpredictable reaction-time tasks for a variety of spectral-temporally modulated broadband sounds. Although the animals were not trained to attend to particular spectral or temporal sound modulations, their reaction times demonstrated clear sensitivity to the acoustic spectral-temporal modulations for unpredictable modulation onsets. Interestingly, this sensitivity was absent for predictable trials with fast manual responses, but re-emerged for the slower reactions in these trials. Our analysis of the neural activity patterns revealed a task-related dynamic modulation of auditory cortex neurons that was locked to the animal's reaction time, but invariant to the spectral and temporal acoustic modulations. This finding suggests a dissociation between acoustic and behavioral signals at the single-unit level. We further demonstrate that single-unit activity during task execution is better described by a multiplicative gain modulation of the acoustic-evoked activity by a task-related top-down signal than by a linear summation of these signals.
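
The contrast in the final sentence is between a multiplicative model, r(t) = g(t) * s(t), and an additive model, r(t) = a*g(t) + b*s(t), where s(t) is the acoustic-evoked response and g(t) the task-related top-down signal. A minimal sketch of that model comparison on simulated firing rates; the signal shapes and noise level are illustrative:

```python
# Model comparison: multiplicative gain, r = a * (g * s), versus additive
# summation, r = a * g + b * s, for simulated firing rates. s(t) mimics an
# acoustic-evoked response, g(t) a ramping task-related top-down signal.
import numpy as np

rng = np.random.default_rng(2)
t = np.linspace(0, 1, 200)
s = 10 + 8 * np.sin(2 * np.pi * 4 * t)       # hypothetical acoustic response
g = 1 + 2 * t                                # hypothetical top-down signal
r_obs = g * s + rng.normal(0, 1, t.size)     # simulated rate (multiplicative)

X_mult = (g * s)[:, None]                    # single multiplicative regressor
X_add = np.column_stack([g, s])              # two additive regressors
for name, X in (("multiplicative", X_mult), ("additive", X_add)):
    coef, *_ = np.linalg.lstsq(X, r_obs, rcond=None)
    sse = np.sum((r_obs - X @ coef) ** 2)
    print(f"{name:>14s}: SSE = {sse:.1f}")
```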

5. Stable bottom-up processing during dynamic top-down modulations in monkey auditory cortex. Eur J Neurosci 2013; 37:1830-42. PMID: 23510187; DOI: 10.1111/ejn.12180.
Abstract
It is unclear whether top-down processing in the auditory cortex (AC) interferes with its bottom-up analysis of sound. Recent studies have reported non-acoustic modulations of AC responses and have shown that attention can change a neuron's spectrotemporal tuning. As a result, the AC would seem ill-suited to represent a stable acoustic environment, which is deemed crucial for auditory perception. To assess whether top-down signals influence acoustic tuning in tasks without directed attention, we compared monkey single-unit AC responses to dynamic spectrotemporal sounds under different behavioral conditions. Recordings were made mostly from neurons located in primary fields (primary AC and area R of the AC) that were well tuned to pure tones, with short onset latencies. We demonstrate that responses in the AC were substantially modulated during an auditory detection task and that these modulations were systematically related to top-down processes. Importantly, despite these significant modulations, the spectrotemporal receptive fields of all neurons remained remarkably stable. Our results suggest multiplexed encoding of bottom-up acoustic and top-down task-related signals in single AC neurons, a mechanism that preserves a stable representation of the acoustic environment despite strong non-acoustic modulations.
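
Receptive-field stability across conditions can be checked by estimating each neuron's spectrotemporal receptive field (STRF) separately per behavioral condition and correlating the two estimates. A minimal sketch using ridge-regularized reverse correlation on simulated data; the stimulus statistics, ground-truth filter, and ridge parameter are illustrative assumptions, not the study's method:

```python
# STRF stability check: estimate the spectrotemporal receptive field in two
# behavioral conditions by ridge-regularized reverse correlation, then
# correlate the estimates. All signals are simulated for illustration.
import numpy as np

rng = np.random.default_rng(3)
n_t, n_f, n_lag = 5000, 16, 10
strf_true = rng.normal(0, 1, (n_f, n_lag))      # ground-truth filter

def estimate_strf(noise_sd, ridge=10.0):
    stim = rng.normal(0, 1, (n_t, n_f))         # spectrogram-like stimulus
    # Lagged design matrix: rate(t) = sum over f, l of strf[f, l] * stim[t - l, f].
    X = np.zeros((n_t, n_f * n_lag))
    for lag in range(n_lag):
        X[lag:, lag * n_f:(lag + 1) * n_f] = stim[:n_t - lag]
    rate = X @ strf_true.T.ravel() + rng.normal(0, noise_sd, n_t)
    return np.linalg.solve(X.T @ X + ridge * np.eye(n_f * n_lag), X.T @ rate)

w_passive = estimate_strf(noise_sd=2.0)          # "passive" condition
w_task = estimate_strf(noise_sd=2.0)             # "task" condition
print(f"STRF correlation across conditions: {np.corrcoef(w_passive, w_task)[0, 1]:.2f}")
```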

6. The influence of static eye and head position on the ventriloquist effect. Eur J Neurosci 2013; 37:1501-10. PMID: 23463919; DOI: 10.1111/ejn.12176.
Abstract
Orienting responses to audiovisual events have shorter reaction times and better accuracy and precision when the images and sounds in the environment are aligned in space and time. How the brain constructs an integrated audiovisual percept is a computational puzzle, because the auditory and visual senses are represented in different reference frames: the retina encodes visual locations with respect to the eyes, whereas the sound-localisation cues are referenced to the head. In the well-known ventriloquist effect, the auditory spatial percept of the ventriloquist's voice is attracted toward the synchronous visual image of the dummy; but does this visual bias on sound localisation operate in a common reference frame that correctly takes eye and head position into account? Here we studied this question by independently varying the initial eye and head orientations and the amount of audiovisual spatial mismatch. Human subjects pointed head and/or gaze to auditory targets in elevation, and were instructed to ignore co-occurring visual distracters. The results demonstrate that different initial head and eye orientations are accurately and appropriately incorporated into the audiovisual response. Effectively, sounds and images are perceptually fused according to their physical locations in space, independent of the observer's point of view. Implications for neurophysiological findings and for modelling efforts that aim to reconcile sensory and motor signals for goal-directed behaviour are discussed.
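
The reference-frame bookkeeping at stake here can be made concrete in one dimension: retinal (eye-centered) visual locations must be combined with eye-in-head position before they can be fused with head-centered sound locations. A toy sketch in which the percept is invariant to initial eye position, as the results above imply; the bias weight and all values are illustrative:

```python
# One-dimensional (elevation) sketch of the reference-frame bookkeeping in
# the ventriloquist paradigm. Vision is retinal (eye-centered); sound cues
# are head-centered; fusing them at physical locations requires adding
# eye-in-head position to the retinal location. All values are illustrative.

def visual_in_head_coords(retinal_elev, eye_in_head_elev):
    # Head-centered visual elevation = retinal location + eye position.
    return retinal_elev + eye_in_head_elev

def ventriloquist_percept(sound_elev_head, visual_elev_head, bias=0.4):
    # Auditory percept attracted toward the (head-centered) visual location.
    return (1 - bias) * sound_elev_head + bias * visual_elev_head

# Same physical scene (sound at +10 deg, light at +20 deg, head-centered),
# viewed with three different eye positions: if eye position is correctly
# taken into account, the fused percept does not change.
for eye in (-15.0, 0.0, 15.0):
    retinal = 20.0 - eye                         # where the light hits the retina
    v_head = visual_in_head_coords(retinal, eye)
    print(f"eye at {eye:+.0f} deg -> percept {ventriloquist_percept(10.0, v_head):.1f} deg")
```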

7. Age-related hearing loss and ear morphology affect vertical but not horizontal sound-localization performance. J Assoc Res Otolaryngol 2013; 14:261-73. PMID: 23319012; DOI: 10.1007/s10162-012-0367-7.
Abstract
Several studies have attributed the deterioration of sound localization in the horizontal (azimuth) and vertical (elevation) planes to an age-related decline in binaural processing and to high-frequency hearing loss (HFHL). The latter might underlie the decreased elevation performance of older adults. However, as the pinnae keep growing throughout life, we hypothesized that larger ears might enable older adults to localize sounds in elevation on the basis of lower frequencies, thus (partially) compensating for their HFHL. In addition, it is not clear whether sound localization has already matured at a very young age, when the body is still growing and the binaural and monaural sound-localization cues change accordingly. The present study investigated the sound-localization performance of children (7-11 years), young adults (20-34 years), and older adults (63-80 years) under open-loop conditions in the two-dimensional frontal hemifield. We studied the effects of age-related hearing loss and ear size on localization responses to brief broadband sound bursts with different bandwidths. We found similar localization abilities in azimuth for all listeners, including the older adults with HFHL. Sound localization in elevation by the children and the young adults with smaller ears improved when the stimuli contained frequencies above 7 kHz. Subjects with larger ears could also judge the elevation of sound sources restricted to lower-frequency content. Despite their larger ears, sound localization in elevation deteriorated in older adults with HFHL. We conclude that the binaural localization cues are used successfully well into the later stages of life, but that pinna growth cannot compensate for the more profound HFHL with age.

8. Contribution of monaural and binaural cues to sound localization in listeners with acquired unilateral conductive hearing loss: Improved directional hearing with a bone-conduction device. Hear Res 2012; 286:9-18. PMID: 22616091; DOI: 10.1016/j.heares.2012.02.012.

9. Applying double-magnetic induction to measure head-unrestrained gaze shifts: calibration and validation in monkey. Biol Cybern 2010; 103:415-432. PMID: 21082199; DOI: 10.1007/s00422-010-0408-4.
Abstract
The double magnetic induction (DMI) method has been used successfully to record head-unrestrained gaze shifts in human subjects (Bremen et al., J Neurosci Methods 160:75–84, 2007a; J Neurophysiol 98:3759–3769, 2007b). This method employs a small gold ring placed on the eye that, when positioned within oscillating magnetic fields, induces orientation-dependent voltages in a pickup coil in front of the eye. Here we develop and test a streamlined calibration routine for use with experimental animals, in particular monkeys. The calibration routine only requires the animal to accurately follow visual targets presented at random locations in the visual field, a task that animals readily learn. In addition, we exploit the fact that the pickup coil can be fixed rigidly and reproducibly to implants on the animal's skull, so that the accumulation of calibration data across sessions leads to increasing accuracy. As a first step, we simulated gaze shifts and the resulting DMI signals. Our simulations showed that the complex DMI signals can be calibrated effectively with random target sequences, which naturally elicit a substantial decoupling of eye and head orientations. Subsequently, we tested our paradigm on three macaque monkeys. Our results show that the data for a successful calibration can be collected in a single recording session in which the monkey makes about 1,500-2,000 goal-directed saccades. We obtained a resolution of 30 arc minutes over a measurement range of [-60, +60]°. This resolution is comparable to the fixation resolution of the monkey's oculomotor system and to that of the standard scleral search-coil method.
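
The heart of such a calibration is learning the inverse mapping from the nonlinear ring signal back to eye orientation, using fixations of known targets as training data. A minimal one-dimensional sketch with a polynomial fit; the forward signal model, noise level, and polynomial inverse are illustrative stand-ins for the actual DMI field geometry and calibration procedure:

```python
# Sketch of a DMI calibration: learn the inverse mapping from the nonlinear
# ring signal to eye orientation using fixations of known targets. Both the
# forward model and the polynomial inverse are illustrative stand-ins.
import numpy as np

rng = np.random.default_rng(4)
eye_deg = rng.uniform(-60, 60, 1500)             # known fixation targets (deg)

def dmi_signal(eye):
    # Illustrative nonlinear (but monotonic) voltage-vs-orientation curve.
    th = np.deg2rad(eye)
    return np.sin(th) * np.cos(th / 2) + rng.normal(0, 0.005, np.shape(eye))

v = dmi_signal(eye_deg)

# Inverse mapping: express eye orientation as a polynomial in the signal.
coeffs = np.polyfit(v, eye_deg, deg=7)
eye_hat = np.polyval(coeffs, v)
print(f"calibration residual: {np.std(eye_deg - eye_hat) * 60:.0f} arcmin")
```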

10.
Abstract
Orienting responses to audiovisual events in the environment can benefit markedly from the integration of visual and auditory spatial information. Logically, however, audiovisual integration should only be deemed successful for stimuli that are spatially and temporally aligned, as only those could have been emitted by a single object in space-time. Because humans have no prior knowledge about whether novel auditory and visual events do indeed emanate from the same object, such information needs to be extracted from a variety of sources. For example, an expectation about alignment or misalignment could modulate the strength of multisensory integration. If evidence from previous trials repeatedly favoured aligned audiovisual inputs, the internal state might assume alignment for the next trial as well, and hence respond to a new audiovisual event as if it were aligned. To test for such a strategy, subjects oriented a head-fixed pointer as fast as possible to a visual flash that was consistently paired, though not always spatially aligned, with a co-occurring broadband sound. We varied the probability of audiovisual alignment between experiments. Reaction times were consistently lower in blocks containing only aligned audiovisual stimuli than in blocks that also contained pseudorandomly presented, spatially disparate stimuli. The results demonstrate dynamic updating of the subject's prior expectation of audiovisual congruency. We discuss a model of prior-probability estimation to explain the results.
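
One simple form such prior updating could take is an exponentially weighted running estimate of the alignment probability, nudged after every trial. The sketch below simulates this idea; the learning rate and the mapping from prior to reaction time are illustrative assumptions, not the model discussed in the paper:

```python
# Exponentially weighted running estimate of the probability that sound and
# light are aligned, updated after every trial. Learning rate and the
# prior-to-RT mapping are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(5)

def run_block(p_aligned, n_trials=200, alpha=0.1):
    """Simulate one block; return the trial-by-trial prior estimate."""
    p_hat, history = 0.5, []
    for _ in range(n_trials):
        aligned = rng.random() < p_aligned        # outcome of this trial
        p_hat += alpha * (aligned - p_hat)        # p <- p + alpha * (outcome - p)
        history.append(p_hat)
    return np.array(history)

# Hypothetical link: stronger expectation of alignment -> faster orienting.
to_rt = lambda p: 320 - 60 * p                    # ms, illustrative

for p_block in (1.0, 0.5):
    rts = to_rt(run_block(p_block))
    print(f"P(aligned) = {p_block}: mean predicted RT {rts.mean():.0f} ms")
```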

11. Improved horizontal directional hearing in bone conduction device users with acquired unilateral conductive hearing loss. J Assoc Res Otolaryngol 2010; 12:1-11. PMID: 20838845; PMCID: PMC3015026; DOI: 10.1007/s10162-010-0235-2.
Abstract
We examined horizontal directional hearing in patients with acquired severe unilateral conductive hearing loss (UCHL). All patients (n = 12) had been fitted with a bone-conduction device (BCD) to restore bilateral hearing, and were tested in the unaided (monaural) and aided (binaural) hearing conditions. Five listeners without hearing loss served as a control group, listening either with one ear plugged and muffed (monaural) or with both ears (binaural). We randomly varied stimulus presentation levels to assess whether listeners relied on the acoustic head-shadow effect (HSE) for horizontal (azimuth) localization. Moreover, to prevent sound localization on the basis of monaural spectral-shape cues from the head and pinna, subjects were presented with narrow-band (1/3-octave) noises. We demonstrate that the BCD significantly improved sound localization in 8 of the 12 UCHL patients. Interestingly, under monaural hearing (BCD off), we observed fairly good unaided azimuth localization in 4 of the 12 patients. Our multiple-regression analysis shows that all patients relied on the ambiguous HSE for localization, whereas the acutely plugged control listeners did not. Our data confirm and further extend the results of recent studies on the use of sound-localization cues in chronic and acute monaural listening.
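
The multiple-regression analysis referred to above can be sketched as fitting response azimuth against both the veridical source azimuth and the sound level at the hearing ear (which carries the head-shadow cue): an HSE-reliant listener loads on the level term, a genuine azimuth localizer on the azimuth term. A sketch on simulated data; the head-shadow model and all coefficients are illustrative:

```python
# Multiple-regression sketch: response azimuth modeled as a weighted sum of
# the veridical source azimuth and the (head-shadow-related) sound level at
# the hearing ear. An HSE-reliant listener has a large level weight; a true
# azimuth localizer has a large azimuth weight. All data are simulated.
import numpy as np

rng = np.random.default_rng(6)
n = 300
azimuth = rng.uniform(-75, 75, n)                # source azimuth (deg)
level = rng.uniform(45, 65, n)                   # roved free-field level (dB)
proximal = level + 0.1 * azimuth                 # crude head-shadow model

# Hypothetical HSE-driven listener: responses track proximal level only.
response = 6.0 * (proximal - proximal.mean()) + rng.normal(0, 10, n)

X = np.column_stack([azimuth, proximal - proximal.mean(), np.ones(n)])
b, *_ = np.linalg.lstsq(X, response, rcond=None)
print(f"azimuth weight = {b[0]:.2f}, level weight = {b[1]:.2f}, offset = {b[2]:.1f}")
```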

12. The effect of spatial-temporal audiovisual disparities on saccades in a complex scene. Exp Brain Res 2009; 198:425-37. PMID: 19415249; PMCID: PMC2733184; DOI: 10.1007/s00221-009-1815-4.
Abstract
In a previous study we quantified the effect of multisensory integration on the latency and accuracy of saccadic eye movements toward spatially aligned audiovisual (AV) stimuli within a rich AV background (Corneil et al., J Neurophysiol 88:438–454, 2002). In those experiments both stimulus modalities belonged to the same object, and subjects were instructed to foveate that source, irrespective of modality. Under natural conditions, however, subjects have no prior knowledge as to whether visual and auditory events originate from the same object, or from different objects in space and time. In the present experiments we included these possibilities by introducing various spatial and temporal disparities between the visual and auditory events within the AV background. Subjects had to orient quickly and accurately to the visual target while ignoring the auditory distractor. It proved quite difficult to produce fast responses (<250 ms) that were not aurally driven, and subjects therefore made many erroneous saccades. Interestingly, for spatially aligned events this inability to ignore auditory stimuli produced reaction times that were shorter, and responses that were more accurate, than for the unisensory target conditions. These findings demonstrate effective multisensory integration and follow the same integration rules as in our previous study. In contrast, with increasing spatial disparity, integration gradually broke down as the subjects' responses became bistable: saccades were directed either to the auditory stimulus (fast responses) or to the visual stimulus (late responses). Interestingly, in this case too the responses were faster and more accurate than those to the respective unisensory stimuli.

13.
Abstract
This paper reports on the acute effects of a monaural plug on directional hearing in the horizontal (azimuth) and vertical (elevation) planes of human listeners. Sound-localization behavior was tested with rapid head-orienting responses toward brief high-pass filtered (>3 kHz; HP) and broadband (0.5–20 kHz; BB) noises, at sound levels between 30 and 60 dB, A-weighted (dBA). To deny listeners any consistent azimuth-related head-shadow cue, sound levels were randomly interleaved. The plug immediately degraded azimuth performance, as evidenced by a sound-level-dependent shift ("bias") of responses contralateral to the plug and a level-dependent change in the slope of the stimulus-response relation ("gain"). Although the azimuth bias and gain were highly correlated, they could not be predicted from the plug's acoustic attenuation. Interestingly, listeners performed best for low-intensity stimuli on their normal-hearing side. These data demonstrate that listeners rely on monaural spectral cues for sound-source azimuth localization as soon as the binaural difference cues break down. The elevation response components were also affected by the plug: elevation gain depended on both stimulus azimuth and sound level, and, as for azimuth, localization was best for low-intensity stimuli on the hearing side. Our results show that the neural computation of elevation incorporates a binaural weighting process that relies on the perceived, rather than the actual, sound-source azimuth. It is our conjecture that sound localization ensues from a weighting of all acoustic cues for both azimuth and elevation, in which the weights may be partially determined, and rapidly updated, by the reliability of each cue.
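
The binaural weighting proposed in the closing sentences can be written as a mixture of the two ears' spectral elevation estimates, with a mixing weight that depends on the perceived (not actual) azimuth. A toy sketch; the sigmoidal weight function and all numbers are illustrative:

```python
# Azimuth-dependent binaural weighting for perceived elevation: each ear
# supplies a spectral elevation estimate, mixed by a weight that depends on
# the *perceived* azimuth. The sigmoidal weight is an illustrative choice.
import numpy as np

def right_ear_weight(perceived_azimuth_deg, slope=0.05):
    # ~1 for far-right sounds, ~0 for far-left, 0.5 straight ahead.
    return 1.0 / (1.0 + np.exp(-slope * perceived_azimuth_deg))

def perceived_elevation(elev_left, elev_right, perceived_azimuth_deg):
    w = right_ear_weight(perceived_azimuth_deg)
    return (1 - w) * elev_left + w * elev_right

# A left-ear plug corrupts the left spectral estimate (here: 0 deg instead
# of the true 20 deg) and biases perceived azimuth toward the right: the
# further right the percept, the less the corrupted left ear contributes.
for az in (-40.0, 0.0, 40.0):
    print(f"perceived azimuth {az:+.0f} deg -> elevation "
          f"{perceived_elevation(0.0, 20.0, az):.1f} deg")
```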

14.
Abstract
Human sound localization results primarily from the processing of binaural differences in sound level and arrival time for locations in the horizontal plane (azimuth), and of spectral-shape cues, generated by the head and pinnae, for positions in the vertical plane (elevation). The latter mechanism incorporates two processing stages: a spectral-to-spatial mapping stage, and a binaural weighting stage that determines the contribution of each ear to perceived elevation as a function of sound azimuth. We recently demonstrated that binaural pinna molds virtually abolish the ability to localize sound-source elevation but that, after several weeks, subjects regain normal localization performance. It is not clear which processing stage underlies this remarkable plasticity, because the auditory system could have learned the new spectral cues separately for each ear (spatial-mapping adaptation), or for one ear only, while extending that ear's contribution into the contralateral hemifield (binaural-weighting adaptation). To dissociate these possibilities, we applied a long-term monaural spectral perturbation in 13 subjects. Our results show that in eight experiments listeners learned to localize accurately with new spectral cues that differed substantially from those provided by their own ears. Interestingly, five subjects, whose spectral cues were not sufficiently perturbed, never attained stable localization performance. Our findings indicate that the analysis of spectral cues may involve a correlation process between the sensory input and a stored spectral representation of the subject's ears, and that learning acts predominantly at the spectral-to-spatial mapping stage, rather than at the level of binaural weighting.
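
The correlation process suggested in the final sentence can be sketched as matching the incoming sensory spectrum against a stored set of ear-specific spectral templates, one per elevation, and reading out the best-matching elevation. A simulated sketch; the random templates, noise level, and argmax-of-correlation readout are illustrative assumptions:

```python
# Spectral-to-spatial correlation model: correlate the incoming spectrum with
# stored ear-specific templates (one per elevation) and read out the
# elevation of the best-matching template. Everything here is simulated.
import numpy as np

rng = np.random.default_rng(7)
elevations = np.arange(-50, 55, 5)               # template elevations (deg)
templates = rng.normal(0, 1, (elevations.size, 64))      # stored HRTF-like cues

true_idx = 12                                    # actual source: +10 deg
spectrum = templates[true_idx] + rng.normal(0, 0.5, 64)  # noisy sensory input

def estimate_elevation(spectrum, templates, elevations):
    # Correlate the z-scored input with every z-scored stored template.
    z = (spectrum - spectrum.mean()) / spectrum.std()
    tz = (templates - templates.mean(axis=1, keepdims=True)) \
         / templates.std(axis=1, keepdims=True)
    return elevations[np.argmax(tz @ z)]

print(f"true: {elevations[true_idx]} deg, "
      f"estimated: {estimate_elevation(spectrum, templates, elevations)} deg")
```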

15.
Abstract
Monaurally deaf people lack the binaural acoustic differences in sound level and timing that are needed to encode sound location in the horizontal plane (azimuth). It has therefore been proposed that these listeners rely on the spectral pinna cues of their normal ear to localize sounds. However, the acoustic head-shadow effect (HSE) might also serve as an azimuth cue, despite its ambiguity when absolute sound levels are unknown. Here, we assess the contribution of each cue to two-dimensional (2D) sound localization in monaurally deaf listeners. In a localization test with randomly interleaved sound levels, we show that all monaurally deaf listeners relied heavily on the HSE, whereas binaural control listeners ignored this cue. However, some monaural listeners responded in part to the actual sound-source azimuth, regardless of sound level, and we show that these listeners extracted the azimuth information from their pinna cues. The better monaural listeners could localize azimuth on the basis of spectral cues, the better they could also localize sound-source elevation. In a subsequent localization experiment with a single, fixed sound level, monaural listeners rapidly adopted a strategy based on the HSE. We conclude that monaural spectral cues are not sufficient for adequate 2D sound localization under unfamiliar acoustic conditions, and that monaural listeners therefore rely strongly on the ambiguous HSE, which may help them cope with familiar acoustic environments.