1
Laback B, Tabuchi H, Kohlrausch A. Evidence for proactive and retroactive temporal pattern analysis in simultaneous masking. J Acoust Soc Am 2024; 155:3742-3759. PMID: 38856312. DOI: 10.1121/10.0026240.
Abstract
Amplitude modulation (AM) of a masker reduces its masking of a simultaneously presented unmodulated pure-tone target, an effect that likely involves dip listening. This study tested the idea that dip-listening efficiency may depend on stimulus context, i.e., on the match in AM peakedness (AMP) between the masker and a precursor or postcursor stimulus, implying a form of temporal pattern analysis. Masked thresholds were measured in normal-hearing listeners using Schroeder-phase harmonic complexes as maskers and precursors or postcursors. Experiment 1 showed threshold elevation (i.e., interference) when a flat cursor preceded or followed a peaked masker, suggesting both proactive and retroactive temporal pattern analysis. Threshold decline (facilitation) was observed when the masker AMP was matched to that of the precursor, irrespective of stimulus AMP, suggesting only proactive processing. Subsequent experiments showed that both interference and facilitation (1) remained robust when a temporal gap was inserted between masker and cursor, (2) disappeared when an F0 difference was introduced between masker and precursor, and (3) decreased when the presentation level was reduced. These results suggest an important role for envelope regularity in dip listening, especially when masker and cursor are F0-matched and therefore form one perceptual stream. The reported effects appear to represent a time-domain variant of comodulation masking release.
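The Schroeder-phase maskers used in this study have a simple constructive definition. As an illustration only (the parameters below are assumed defaults, not the study's stimuli), one common phase convention is theta_n = sign * pi * n * (n + 1) / N, where sign = +1 or -1 yields a flat-envelope complex and sine phase (sign = 0) yields a peaked one:

```python
import numpy as np

def schroeder_complex(f0=100.0, n_harmonics=30, dur=0.5, fs=44100, sign=1):
    """Equal-amplitude harmonic complex with Schroeder phases
    theta_n = sign * pi * n * (n + 1) / N (one common convention).
    sign=+1 or -1 flattens the envelope; sign=0 (sine phase) peaks it."""
    t = np.arange(int(dur * fs)) / fs
    x = np.zeros_like(t)
    for n in range(1, n_harmonics + 1):
        theta = sign * np.pi * n * (n + 1) / n_harmonics
        x += np.sin(2 * np.pi * n * f0 * t + theta)
    return x / np.max(np.abs(x))  # normalize to a +/-1 peak

# Crest factor (peak / RMS) distinguishes flat from peaked waveforms
crest = lambda s: float(np.max(np.abs(s)) / np.sqrt(np.mean(s ** 2)))
flat = schroeder_complex(sign=1)
peaked = schroeder_complex(sign=0)
```

The flat (Schroeder) variant has a markedly lower crest factor than the sine-phase variant, which is what the AMP manipulation above exploits.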
Affiliations
- Bernhard Laback
- Austrian Academy of Sciences, Acoustics Research Institute, Wohllebengasse 12-14, 1040 Vienna, Austria
- Hisaaki Tabuchi
- Department of Psychology, University of Innsbruck, Universitätsstraße 15, 6020 Innsbruck, Austria
- Armin Kohlrausch
- Industrial Engineering & Innovation Sciences, Technische Universiteit Eindhoven, P.O. Box 513, 5600 MB Eindhoven, Netherlands
2
Pearson DV, Shen Y, McAuley JD, Kidd GR. The effect of rhythm on selective listening in multiple-source environments for young and older adults. Hear Res 2023; 435:108789. PMID: 37276686. PMCID: PMC10460128. DOI: 10.1016/j.heares.2023.108789.
Abstract
Understanding continuous speech with competing background sounds is challenging, particularly for older adults. One stimulus property that may aid listeners' understanding of to-be-attended (target) material is temporal regularity (rhythm). In the context of speech-in-noise understanding, McAuley and colleagues recently showed a target rhythm effect whereby recognition of target speech was better when the natural speech rhythm of the target talker was intact than when it was temporally altered. The current study replicates the target rhythm effect using a synthetic vowel sequence paradigm in young adults (Experiment 1) and then uses this paradigm to investigate potential age-related changes in the effect of rhythm on recognition (Experiment 2). Listeners identified the last three vowels of temporally regular (isochronous) and irregular (anisochronous) synthetic vowel sequences in quiet and with a competing background sequence of vowel-like harmonic tone complexes presented at various tempos. The results replicated the target rhythm effect: temporal regularity in the vowel sequences improved young listeners' identification accuracy relative to irregular sequences. The magnitude of the effect was not influenced by background tempo, although faster background tempos improved vowel identification accuracy regardless of regularity. Older listeners also demonstrated a target rhythm effect but received less benefit from the temporal regularity of the target sequences than did young listeners. This study highlights the importance of rhythm for understanding age-related differences in selective listening in complex environments and provides a novel paradigm for investigating effects of rhythm on perception.
Affiliations
- Dylan V Pearson
- Department of Speech, Language, and Hearing Sciences, Indiana University, United States
- Yi Shen
- Department of Speech and Hearing Sciences, University of Washington, United States
- J Devin McAuley
- Department of Psychology, Michigan State University, United States
- Gary R Kidd
- Department of Speech, Language, and Hearing Sciences, Indiana University, United States
3
Roberts B, Haywood NR. Asymmetric effects of sudden changes in timbre on auditory stream segregation. J Acoust Soc Am 2023; 154:363-378. PMID: 37462404. DOI: 10.1121/10.0020172.
Abstract
Two experiments explored the effects of abrupt transitions in timbral properties [amplitude modulation (AM), pure tones vs narrow-band noises, and attack/decay envelope] on streaming. Listeners reported continuously the number of streams heard during 18-s-long alternating low- and high-frequency (LHL-) sequences (frequency separation: 2-6 semitones) that underwent a coherent transition at 6 s or remained unchanged. In experiment 1, triplets comprised unmodulated pure tones or narrowly spaced tone pairs (dyads) producing 100%-depth AM (30- or 50-Hz modulation). In experiment 2, triplets comprised narrow-band noises, dyads, or pure tones with quasi-trapezoidal envelopes (10/80/10 ms), fast attacks and slow decays (10/90 ms), or vice versa (90/10 ms). Abrupt transitions led to direction-dependent changes in stream segregation. Transitions from modulated to unmodulated (or slower-modulated) tones, from noise bands to pure tones, or from slow- to fast-attack tones typically caused substantial loss of segregation (resetting), whereas transitions in the opposite direction mostly caused less or no resetting. Furthermore, for the smallest frequency separation, transitions in the latter direction usually led to increased segregation (overshoot). Overall, the results are reminiscent of the perceptual asymmetries found in auditory search for targets with or without a salient additional feature (or greater activation of that feature).
Affiliations
- Brian Roberts
- School of Psychology, Aston University, Birmingham B4 7ET, United Kingdom
- Nicholas R Haywood
- Department of Clinical Neurosciences, University of Cambridge, Cambridge CB2 0SZ, United Kingdom
4
Regev J, Zaar J, Relaño-Iborra H, Dau T. Age-related reduction of amplitude modulation frequency selectivity. J Acoust Soc Am 2023; 153:2298. PMID: 37092934. DOI: 10.1121/10.0017835.
Abstract
The perception of amplitude modulations (AMs) has been characterized by a frequency-selective process in the temporal envelope domain and simulated in computational auditory processing and perception models using a modulation filterbank. Such AM frequency-selective processing has been argued to be critical for the perception of complex sounds, including speech. This study aimed at investigating the effects of age on behavioral AM frequency selectivity in young (n = 11, 22-29 years) versus older (n = 10, 57-77 years) listeners with normal hearing, using a simultaneous AM masking paradigm with a sinusoidal carrier (2.8 kHz), target modulation frequencies of 4, 16, 64, and 128 Hz, and narrowband-noise modulation maskers. A reduction of AM frequency selectivity by a factor of up to 2 was found in the older listeners. While the observed AM selectivity co-varied with the unmasked AM detection sensitivity, the age-related broadening of the masked threshold patterns remained stable even when AM sensitivity was similar across groups for an extended stimulus duration. The results from the present study might provide a valuable basis for further investigations exploring the effects of age and reduced AM frequency selectivity on complex sound perception as well as the interaction of age and hearing impairment on AM processing and perception.
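The modulation-filterbank concept this abstract refers to can be illustrated with a single crude channel (a sketch under assumed parameters, not the authors' model): extract the temporal envelope, band-pass it in the modulation domain, and measure how much power the channel passes.

```python
import numpy as np
from scipy.signal import butter, hilbert, sosfiltfilt

def modulation_band_rms(x, fs, band):
    """One crude modulation-filter channel: Hilbert envelope,
    band-pass filtering in the modulation domain, RMS of the output."""
    env = np.abs(hilbert(x))
    env -= env.mean()                    # drop the DC (0-Hz) component
    sos = butter(2, band, btype="bandpass", fs=fs, output="sos")
    return float(np.sqrt(np.mean(sosfiltfilt(sos, env) ** 2)))

# 2.8-kHz carrier with 16-Hz sinusoidal AM (one of the study's target
# modulation frequencies); sampling rate is an arbitrary choice here
fs = 16000
t = np.arange(fs) / fs                   # 1 s of signal
x = (1 + 0.8 * np.sin(2 * np.pi * 16 * t)) * np.sin(2 * np.pi * 2800 * t)

rms_16 = modulation_band_rms(x, fs, (8, 32))   # channel around 16 Hz
rms_64 = modulation_band_rms(x, fs, (48, 96))  # channel around 64 Hz
```

The 16-Hz channel captures far more envelope power than the 64-Hz channel, which is the frequency-selective behavior in the envelope domain that the masking paradigm probes.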
Affiliations
- Jonathan Regev
- Hearing Systems Section, Department of Health Technology, Technical University of Denmark, Kongens Lyngby, 2800, Denmark
- Johannes Zaar
- Eriksholm Research Centre, Snekkersten, 3070, Denmark
- Helia Relaño-Iborra
- Hearing Systems Section, Department of Health Technology, Technical University of Denmark, Kongens Lyngby, 2800, Denmark
- Torsten Dau
- Hearing Systems Section, Department of Health Technology, Technical University of Denmark, Kongens Lyngby, 2800, Denmark
5
Cappelloni MS, Mateo VS, Maddox RK. Performance in an audiovisual selective attention task using speech-like stimuli depends on the talker identities, but not temporal coherence. Trends Hear 2023; 27:23312165231207235. PMID: 37847849. PMCID: PMC10586009. DOI: 10.1177/23312165231207235.
Abstract
Audiovisual integration of speech can benefit the listener by not only improving comprehension of what a talker is saying but also helping a listener select a particular talker's voice from a mixture of sounds. Binding, an early integration of auditory and visual streams that helps an observer allocate attention to a combined audiovisual object, is likely involved in processing audiovisual speech. Although temporal coherence of stimulus features across sensory modalities has been implicated as an important cue for non-speech stimuli (Maddox et al., 2015), the specific cues that drive binding in speech are not fully understood due to the challenges of studying binding in natural stimuli. Here we used speech-like artificial stimuli that allowed us to isolate three potential contributors to binding: temporal coherence (are the face and the voice changing synchronously?), articulatory correspondence (do visual faces represent the correct phones?), and talker congruence (do the face and voice come from the same person?). In a trio of experiments, we examined the relative contributions of each of these cues. Normal hearing listeners performed a dual task in which they were instructed to respond to events in a target auditory stream while ignoring events in a distractor auditory stream (auditory discrimination) and detecting flashes in a visual stream (visual detection). We found that viewing the face of a talker who matched the attended voice (i.e., talker congruence) offered a performance benefit. We found no effect of temporal coherence on performance in this task, prompting an important recontextualization of previous findings.
Affiliations
- Madeline S. Cappelloni
- Biomedical Engineering, University of Rochester, Rochester, NY, USA
- Center for Visual Science, University of Rochester, Rochester, NY, USA
- Del Monte Institute for Neuroscience, University of Rochester, Rochester, NY, USA
- Vincent S. Mateo
- Audio and Music Engineering, University of Rochester, Rochester, NY, USA
- Ross K. Maddox
- Biomedical Engineering, University of Rochester, Rochester, NY, USA
- Center for Visual Science, University of Rochester, Rochester, NY, USA
- Del Monte Institute for Neuroscience, University of Rochester, Rochester, NY, USA
- Neuroscience, University of Rochester, Rochester, NY, USA
6
Matz AF, Nie Y, Wheeler HJ. Auditory stream segregation of amplitude-modulated narrowband noise in cochlear implant users and individuals with normal hearing. Front Psychol 2022; 13:927854. PMID: 36118488. PMCID: PMC9479457. DOI: 10.3389/fpsyg.2022.927854.
Abstract
Voluntary stream segregation was investigated in cochlear implant (CI) users and normal-hearing (NH) listeners using a segregation-promoting objective approach that evaluated the role of spectral and amplitude-modulation (AM) rate separations in stream segregation and its build-up. Sequences of 9 or 3 pairs of A and B narrowband noise (NBN) bursts were presented, with the A and B bursts differing in the center frequency of the noise band, the AM rate, or both. In some sequences (delayed sequences), the last B burst was delayed by 35 ms from its otherwise-steady temporal position; in the others (no-delay sequences), the last B burst was temporally advanced by 0 to 10 ms. A single-interval yes/no procedure was used to measure participants' sensitivity (d′) in identifying delayed versus no-delay sequences, with higher d′ values indicating a greater ability to segregate the A and B subsequences. For NH listeners, performance improved with each increase in spectral separation; for CI users, performance was significantly better only in the condition with the largest spectral separation. For both groups, performance was significantly poorer for the largest AM-rate separation than for the condition with no AM-rate separation. A significant effect of sequence duration in both groups indicated that listeners improved as the stimulus sequences lengthened, supporting the build-up effect. These results suggest that CI users are less able than NH listeners to segregate NBN bursts into different auditory streams when the bursts are moderately separated in the spectral domain. Contrary to our hypothesis, the results indicate that AM-rate separation may interfere with the segregation of NBN streams. Additionally, our results add evidence to the literature that CI users build up stream segregation at a rate comparable to NH listeners when the inter-stream spectral separations are sufficiently large.
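For readers unfamiliar with the sensitivity index used above, d′ in a single-interval yes/no task is the difference between the z-transformed hit and false-alarm rates. The sketch below is illustrative only; the log-linear correction (adding 0.5 to each count) is an assumption on my part, not a procedure taken from the paper:

```python
from statistics import NormalDist

def d_prime(hits, misses, false_alarms, correct_rejections):
    """Yes/no sensitivity: d' = z(hit rate) - z(false-alarm rate),
    with a log-linear (add 0.5) correction against rates of 0 or 1."""
    z = NormalDist().inv_cdf
    hit_rate = (hits + 0.5) / (hits + misses + 1.0)
    fa_rate = (false_alarms + 0.5) / (false_alarms + correct_rejections + 1.0)
    return z(hit_rate) - z(fa_rate)

# e.g. 45 hits / 5 misses on delayed sequences, 5 false alarms /
# 45 correct rejections on no-delay sequences (hypothetical counts)
d = d_prime(45, 5, 5, 45)
```

Equal hit and false-alarm rates give d′ = 0 (no segregation ability); the hypothetical counts above give a d′ of roughly 2.5.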
Affiliations
- Alexandria F. Matz
- Department of Otolaryngology, Eastern Virginia Medical School, Norfolk, VA, United States
- Yingjiu Nie
- Department of Communication Sciences and Disorders, James Madison University, Harrisonburg, VA, United States
- Harley J. Wheeler
- Department of Speech-Language-Hearing Sciences, University of Minnesota, Twin Cities, Minneapolis, MN, United States
7
Cortical processing of binaural cues as shown by EEG responses to random-chord stereograms. J Assoc Res Otolaryngol 2021; 23:75-94. PMID: 34904205. PMCID: PMC8783002. DOI: 10.1007/s10162-021-00820-4.
Abstract
Spatial hearing facilitates the perceptual organization of complex soundscapes into accurate mental representations of sound sources in the environment. Yet, the role of binaural cues in auditory scene analysis (ASA) has received relatively little attention in recent neuroscientific studies employing novel, spectro-temporally complex stimuli. This may be because a stimulation paradigm that provides binaurally derived grouping cues of sufficient spectro-temporal complexity has not yet been established for neuroscientific ASA experiments. Random-chord stereograms (RCS) are a class of auditory stimuli that exploit spectro-temporal variations in the interaural envelope correlation of noise-like sounds with interaurally coherent fine structure; they evoke salient auditory percepts that emerge only under binaural listening. Here, our aim was to assess the usability of the RCS paradigm for indexing binaural processing in the human brain. To this end, we recorded EEG responses to RCS stimuli from 12 normal-hearing subjects. The stimuli consisted of an initial 3-s noise segment with interaurally uncorrelated envelopes, followed by another 3-s segment, where envelope correlation was modulated periodically according to the RCS paradigm. Modulations were applied either across the entire stimulus bandwidth (wideband stimuli) or in temporally shifting frequency bands (ripple stimulus). Event-related potentials and inter-trial phase coherence analyses of the EEG responses showed that the introduction of the 3- or 5-Hz wideband modulations produced a prominent change-onset complex and ongoing synchronized responses to the RCS modulations. In contrast, the ripple stimulus elicited a change-onset response but no response to ongoing RCS modulation. Frequency-domain analyses revealed increased spectral power at the fundamental frequency and the first harmonic of wideband RCS modulations. 
RCS stimulation yields robust EEG measures of binaurally driven auditory reorganization and has potential to provide a flexible stimulation paradigm suitable for isolating binaural effects in ASA experiments.
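The envelope-based binaural manipulation at the heart of the RCS paradigm can be made concrete with a toy computation (an illustrative sketch; the signal construction and parameters are my assumptions, not the study's stimuli): impose either matched or mismatched slow envelopes on a shared noise carrier and compare the correlation of the two ears' Hilbert envelopes.

```python
import numpy as np
from scipy.signal import hilbert

def interaural_envelope_correlation(left, right):
    """Pearson correlation between the Hilbert envelopes of the ear signals."""
    env_l = np.abs(hilbert(left))
    env_r = np.abs(hilbert(right))
    env_l -= env_l.mean()
    env_r -= env_r.mean()
    return float(np.dot(env_l, env_r)
                 / (np.linalg.norm(env_l) * np.linalg.norm(env_r)))

fs = 8000
t = np.arange(fs) / fs                       # 1 s
rng = np.random.default_rng(1)
carrier = rng.standard_normal(fs)            # shared (interaurally coherent) fine structure
slow = 1 + 0.9 * np.sin(2 * np.pi * 3 * t)   # 3-Hz envelope, always positive

# Matched envelopes -> high interaural envelope correlation;
# anti-phase envelopes on the same carrier -> much lower correlation
rho_same = interaural_envelope_correlation(slow * carrier, slow * carrier)
rho_diff = interaural_envelope_correlation(slow * carrier, (2 - slow) * carrier)
```

Modulating this correlation over time and frequency, as the RCS paradigm does, is what creates percepts that emerge only under binaural listening.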
8
Rajasingam SL, Summers RJ, Roberts B. The dynamics of auditory stream segregation: effects of sudden changes in frequency, level, or modulation. J Acoust Soc Am 2021; 149:3769. PMID: 34241493. DOI: 10.1121/10.0005049.
Abstract
Three experiments explored the effects of abrupt changes in stimulus properties on streaming dynamics. Listeners monitored 20-s-long low- and high-frequency (LHL-) tone sequences and reported the number of streams heard throughout. Experiments 1 and 2 used pure tones and examined the effects of changing triplet base frequency and level, respectively. Abrupt changes in base frequency (±3-12 semitones) caused significant magnitude-related falls in segregation (resetting), regardless of transition direction, but an asymmetry occurred for changes in level (±12 dB). Rising-level transitions usually decreased segregation significantly, whereas falling-level transitions had little or no effect. Experiment 3 used pure tones (unmodulated) and narrowly spaced (±25 Hz) tone pairs (dyads); the two evoke similar excitation patterns, but dyads are strongly modulated with a distinctive timbre. Dyad-only sequences induced a strongly segregated percept, limiting scope for further build-up. Alternation between groups of pure tones and dyads produced large, asymmetric changes in streaming. Dyad-to-pure transitions caused substantial resetting, but pure-to-dyad transitions sometimes elicited even greater segregation than for the corresponding interval in dyad-only sequences (overshoot). The results indicate that abrupt changes in timbre can strongly affect the likelihood of stream segregation without introducing significant peripheral-channeling cues. These asymmetric effects of transition direction are reminiscent of subtractive adaptation in vision.
Affiliations
- Saima L Rajasingam
- Department of Vision and Hearing Sciences, Anglia Ruskin University, Cambridge CB1 1PT, United Kingdom
- Robert J Summers
- School of Psychology, Aston University, Birmingham B4 7ET, United Kingdom
- Brian Roberts
- School of Psychology, Aston University, Birmingham B4 7ET, United Kingdom
9
Rapid assessment of non-verbal auditory perception in normal-hearing participants and cochlear implant users. J Clin Med 2021; 10:2093. PMID: 34068067. PMCID: PMC8152499. DOI: 10.3390/jcm10102093.
Abstract
In cases of severe hearing loss, cochlear implants (CIs) can restore hearing. Despite the advantages of CIs for speech perception, CI users still complain about their poor perception of their auditory environment. Aiming to assess non-verbal auditory perception in CI users, we developed five listening tests. These tests measure pitch change detection, pitch direction identification, pitch short-term memory, auditory stream segregation, and emotional prosody recognition, along with perceived intensity ratings. To test the potential benefit of visual cues for pitch processing, half of the trials in the three pitch tests included visual indications for performing the task. We tested 10 normal-hearing (NH) participants, with material presented as both original and vocoded sounds, and 10 post-lingually deaf CI users. With the vocoded sounds, the NH participants had reduced scores for the detection of small pitch differences, and reduced emotion recognition and streaming abilities, compared to the original sounds. Similarly, the CI users had deficits for small differences in the pitch change detection task and in emotion recognition, as well as a decreased streaming capacity. Overall, this assessment allows for the rapid detection of specific patterns of non-verbal auditory perception deficits. The current findings also open new perspectives on how to enhance pitch perception capacities using visual cues.
10
Mohn JL, Downer JD, O'Connor KN, Johnson JS, Sutter ML. Choice-related activity and neural encoding in primary auditory cortex and lateral belt during feature-selective attention. J Neurophysiol 2021; 125:1920-1937. PMID: 33788616. DOI: 10.1152/jn.00406.2020.
Abstract
Selective attention is necessary to sift through, form a coherent percept of, and make behavioral decisions on the vast amount of information present in most sensory environments. How and where selective attention is employed in cortex, and how this perceptual information then informs the relevant behavioral decisions, are still not well understood. Studies probing selective attention and decision-making in visual cortex have been enlightening as to how sensory attention might work in that modality; whether similar mechanisms are employed in auditory attention is not yet clear. We therefore trained rhesus macaques on a feature-selective attention task in which they switched between reporting changes in temporal (amplitude modulation, AM) and spectral (carrier bandwidth) features of a broadband noise stimulus. We investigated how the encoding of these features by single neurons in primary (A1) and secondary (middle lateral belt, ML) auditory cortex was affected by the different attention conditions. Neurons in A1 and ML showed mixed selectivity for sound and task features. We found no difference in AM encoding between the attention conditions, but choice-related activity in both A1 and ML neurons shifted between attentional conditions. This finding suggests that choice-related activity in auditory cortex does not simply reflect motor preparation or action, and it supports the relationship between reported choice-related activity and the decision and perceptual process.

New & Noteworthy: We recorded from primary and secondary auditory cortex while monkeys performed a nonspatial feature-attention task. Both areas exhibited rate-based choice-related activity. The manifestation of choice-related activity was attention dependent, suggesting that choice-related activity in auditory cortex does not simply reflect arousal or motor influences but relates to the specific perceptual choice.
Affiliations
- Jennifer L Mohn
- Center for Neuroscience, University of California, Davis, California
- Department of Neurobiology, Physiology and Behavior, University of California, Davis, California
- Joshua D Downer
- Center for Neuroscience, University of California, Davis, California
- Department of Otolaryngology-Head and Neck Surgery, University of California, San Francisco, California
- Kevin N O'Connor
- Center for Neuroscience, University of California, Davis, California
- Department of Neurobiology, Physiology and Behavior, University of California, Davis, California
- Jeffrey S Johnson
- Center for Neuroscience, University of California, Davis, California
- Department of Neurobiology, Physiology and Behavior, University of California, Davis, California
- Mitchell L Sutter
- Center for Neuroscience, University of California, Davis, California
- Department of Neurobiology, Physiology and Behavior, University of California, Davis, California
11
Johnson JS, Niwa M, O'Connor KN, Sutter ML. Amplitude modulation encoding in the auditory cortex: comparisons between the primary and middle lateral belt regions. J Neurophysiol 2020; 124:1706-1726. PMID: 33026929. DOI: 10.1152/jn.00171.2020.
Abstract
In macaques, the middle lateral auditory cortex (ML) is a belt region adjacent to the primary auditory cortex (A1) and believed to be at a hierarchically higher level. Although ML single-unit responses have been studied for several auditory stimuli, the ability of ML cells to encode amplitude modulation (AM), an ability that has been widely studied in A1, has not yet been characterized. Here, we compared the responses of A1 and ML neurons to AM noise in awake macaques. Although several of the basic properties of A1 and ML responses to AM noise were similar, we found several key differences. ML neurons were less likely to phase lock, did not phase lock as strongly, and were more likely to respond in a nonsynchronized fashion than A1 cells, consistent with a temporal-to-rate transformation as information ascends the auditory hierarchy. ML neurons tended to have lower temporally based (phase-locking) best modulation frequencies than A1 neurons. Neurons that decreased their firing rate in response to AM noise, relative to their firing rate in response to unmodulated noise, were more common in ML than in A1. In both A1 and ML, we found a prevalent class of neurons whose rate responses, relative to unmodulated noise, were typically enhanced at lower modulation frequencies and suppressed at middle modulation frequencies.

New & Noteworthy: ML neurons synchronized less than A1 neurons, consistent with a hierarchical temporal-to-rate transformation. Both A1 and ML had a class of modulation transfer functions previously unreported in the cortex, with a low-modulation-frequency (MF) peak, a middle-MF trough, and responses similar to unmodulated-noise responses at high MFs. The results support a hierarchical shift toward a two-pool opponent code, in which subtraction of neural activity between two populations of oppositely tuned neurons encodes AM.
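The phase locking compared here between A1 and ML is conventionally quantified with the vector-strength metric (a standard measure; the toy spike trains below are my illustration, not data or code from the study):

```python
import numpy as np

def vector_strength(spike_times, mod_freq):
    """Vector strength of spike times (s) relative to a modulation
    frequency (Hz): 1 = perfect phase locking, near 0 = none."""
    phases = 2 * np.pi * mod_freq * np.asarray(spike_times)
    return float(np.abs(np.mean(np.exp(1j * phases))))

rng = np.random.default_rng(0)
period = 1 / 16.0                                 # 16-Hz modulator
# Spikes tightly locked to one phase of the modulator vs. spikes at random times
locked = np.arange(50) * period + rng.normal(0, 0.002, 50)
unlocked = rng.uniform(0, 50 * period, 50)

vs_locked = vector_strength(locked, 16.0)
vs_unlocked = vector_strength(unlocked, 16.0)
```

A synchronized (phase-locked) neuron yields a vector strength near 1, while a nonsynchronized neuron, which may still carry AM information in its firing rate, yields a value near 0; this is the distinction behind the temporal-to-rate transformation described above.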
Affiliations
- Jeffrey S Johnson
- Center for Neuroscience, University of California, Davis, California
- Mamiko Niwa
- Center for Neuroscience, University of California, Davis, California
- Kevin N O'Connor
- Center for Neuroscience, University of California, Davis, California
- Department of Neurobiology, Physiology and Behavior, University of California, Davis, California
- Mitchell L Sutter
- Center for Neuroscience, University of California, Davis, California
- Department of Neurobiology, Physiology and Behavior, University of California, Davis, California
|
12
Gurariy G, Randall R, Greenberg AS. Manipulation of low-level features modulates grouping strength of auditory objects. Psychol Res 2020; 85:2256-2270. PMID: 32691138. DOI: 10.1007/s00426-020-01391-4.
Abstract
A central challenge of auditory processing involves the segregation, analysis, and integration of acoustic information into auditory perceptual objects for processing by higher-order cognitive operations. This study explores the influence of low-level features on auditory object perception. Participants provided perceived musicality ratings in response to randomly generated pure-tone sequences. Previous work has shown that music perception relies on the integration of discrete sounds into a holistic structure; hence, high (versus low) ratings were viewed as indicative of strong (versus weak) object formation. Additionally, participants rated sequences in which random subsets of tones were manipulated along one of three low-level dimensions (timbre, amplitude, or fade-in) at one of three strengths (low, medium, or high). Our primary findings demonstrate how low-level acoustic features modulate the perception of auditory objects, as measured by changes in musicality ratings for manipulated sequences. Secondarily, we used principal component analysis to categorize participants into subgroups based on differential sensitivities to low-level auditory dimensions, thereby highlighting the importance of individual differences in auditory perception. Finally, we report asymmetries in the effects of the low-level dimensions, most notably the perceptual significance of timbre. Together, these data contribute to our understanding of how low-level auditory features modulate auditory object perception.
Affiliations
- Gennadiy Gurariy
- Department of Biomedical Engineering, Medical College of Wisconsin & Marquette University, Milwaukee, USA
- Richard Randall
- School of Music and Neuroscience Institute, Carnegie Mellon University, Pittsburgh, USA
- Adam S Greenberg
- Department of Biomedical Engineering, Medical College of Wisconsin & Marquette University, Milwaukee, USA
13
Oster MM, Werner LA. Infants' use of isolated and combined temporal cues in speech sound segregation. J Acoust Soc Am 2020; 148:401. PMID: 32752747. PMCID: PMC7386947. DOI: 10.1121/10.0001582.
Abstract
This paper investigates infants' and adults' use of envelope cues, and of combined onset-asynchrony and envelope cues, in the segregation of concurrent vowels. Listeners heard superimposed vowel pairs consisting of two different vowels spoken by a male and a female talker and were trained to respond to one specific target vowel, either the male /u:/ or male /i:/. Vowel detection was measured in three conditions. In the baseline condition, the two superimposed vowels had similar amplitude envelopes and synchronous onsets. In the envelope-cue condition, the amplitude envelopes of the two vowels differed. In the combined-cue condition, both the onset times and the amplitude envelopes of the two vowels differed. Seven-month-old infants' concurrent-vowel segregation improved with both envelope cues and combined onset-asynchrony and envelope cues, to the same extent as adults'. A preliminary investigation with 3-month-old infants suggested that neither envelope cues nor combined asynchrony and envelope cues improved their ability to detect the target vowel. Taken together, these results suggest that envelope and combined onset-asynchrony cues are available to infants as they attempt to process competing speech sounds, at least after 7 months of age.
Affiliation(s)
- Monika-Maria Oster
- Listen and Talk, 8610 8th Avenue Northeast, Seattle, Washington 98115, USA
- Lynne A Werner
- Department of Speech and Hearing Sciences, University of Washington, 1417 Northeast 42nd Street, Seattle, Washington 98105, USA
14
Ecological origins of perceptual grouping principles in the auditory system. Proc Natl Acad Sci U S A 2019; 116:25355-25364. [PMID: 31754035] [DOI: 10.1073/pnas.1903887116]
Abstract
Events and objects in the world must be inferred from sensory signals to support behavior. Because sensory measurements are temporally and spatially local, the estimation of an object or event can be viewed as the grouping of these measurements into representations of their common causes. Perceptual grouping is believed to reflect internalized regularities of the natural environment, yet grouping cues have traditionally been identified using informal observation and investigated using artificial stimuli. The relationship of grouping to natural signal statistics has thus remained unclear, and additional or alternative cues remain possible. Here, we develop a general methodology for relating grouping to natural sensory signals and apply it to derive auditory grouping cues from natural sounds. We first learned local spectrotemporal features from natural sounds and measured their co-occurrence statistics. We then learned a small set of stimulus properties that could predict the measured feature co-occurrences. The resulting cues included established grouping cues, such as harmonic frequency relationships and temporal coincidence, but also revealed previously unappreciated grouping principles. Human perceptual grouping was predicted by natural feature co-occurrence, with humans relying on the derived grouping cues in proportion to their informativity about co-occurrence in natural sounds. The results suggest that auditory grouping is adapted to natural stimulus statistics, show how these statistics can reveal previously unappreciated grouping phenomena, and provide a framework for studying grouping in natural signals.
15
Wijetillake AA, van Hoesel RJM, Cowan R. Sequential stream segregation with bilateral cochlear implants. Hear Res 2019; 383:107812. [PMID: 31630083] [DOI: 10.1016/j.heares.2019.107812]
Abstract
Sequential stream segregation on the basis of binaural 'ear-of-entry', modulation rate and electrode place-of-stimulation cues was investigated in bilateral cochlear implant (CI) listeners using a rhythm anisochrony detection task. Sequences of alternating 'A' and 'B' bursts were presented via direct electrical stimulation and comprised either an isochronous timing structure or an anisochronous structure that was generated by delaying just the 'B' bursts. 'B' delay thresholds that enabled rhythm anisochrony detection were determined. Higher thresholds were assumed to indicate a greater likelihood of stream segregation, resulting specifically from stream integration breakdown. Results averaged across subjects showed that thresholds were significantly higher when monaural 'A' and 'B' bursts were presented contralaterally rather than ipsilaterally, and that diotic presentation of 'A', with a monaural 'B', yielded intermediate thresholds. When presented monaurally and ipsilaterally, higher thresholds were also found when successive bursts had mismatched rather than matched modulation rates. In agreement with previous studies, average delay thresholds also increased as electrode separation between bursts increased when presented ipsilaterally. No interactions were found between ear-of-entry, modulation rate and place-of-stimulation. However, combining moderate electrode difference cues with either diotic-'A' ear-of-entry cues or modulation-rate mismatch cues did yield greater threshold increases than observed with any of those cues alone. The results from the present study indicate that sequential stream segregation can be elicited in bilateral CI users by differences in the signal across ears (binaural cues), in modulation rate (monaural cues) and in place-of-stimulation (monaural cues), and that those differences can be combined to further increase segregation.
Affiliation(s)
- Robert Cowan
- The Hearing CRC, 550 Swanston St, Carlton, 3053, Victoria, Australia
16
Anderson SR, Kan A, Litovsky RY. Asymmetric temporal envelope encoding: Implications for within- and across-ear envelope comparison. J Acoust Soc Am 2019; 146:1189. [PMID: 31472559] [PMCID: PMC7051005] [DOI: 10.1121/1.5121423]
Abstract
Separating sound sources in acoustic environments relies on making ongoing, highly accurate spectro-temporal comparisons. However, listeners with hearing impairment may have varying quality of temporal encoding within or across ears, which may limit their ability to make spectro-temporal comparisons between places-of-stimulation. In this study in normal-hearing listeners, the depth of amplitude modulation (AM) for sinusoidally amplitude-modulated (SAM) tones was manipulated in an effort to reduce the coding of periodicity in the auditory nerve. The ability to judge differences in AM rates was studied for stimuli presented to different cochlear places-of-stimulation, within or across ears. It was hypothesized that if temporal encoding was poorer for one tone in a pair, then sensitivity to differences in AM rate of the pair would decrease. Results indicated that when the depth of AM was reduced from 50% to 20% for one SAM tone in a pair, sensitivity to differences in AM rate decreased. Sensitivity was greatest for AM rates near 90 Hz and depended upon the places-of-stimulation being compared. These results suggest that degraded temporal representations in the auditory nerve for one place-of-stimulation could lead to deficits comparing that temporal information with other places-of-stimulation.
Affiliation(s)
- Sean R Anderson
- Waisman Center, University of Wisconsin-Madison, Madison, Wisconsin 53705, USA
- Alan Kan
- Waisman Center, University of Wisconsin-Madison, Madison, Wisconsin 53705, USA
- Ruth Y Litovsky
- Waisman Center, University of Wisconsin-Madison, Madison, Wisconsin 53705, USA
17
Buss E, Lorenzi C, Cabrera L, Leibold LJ, Grose JH. Amplitude modulation detection and modulation masking in school-age children and adults. J Acoust Soc Am 2019; 145:2565. [PMID: 31046373] [PMCID: PMC6909994] [DOI: 10.1121/1.5098950]
Abstract
Two experiments were performed to better understand on- and off-frequency modulation masking in normal-hearing school-age children and adults. Experiment 1 estimated thresholds for detecting 16-, 64- or 256-Hz sinusoidal amplitude modulation (AM) imposed on a 4300-Hz pure tone. Thresholds tended to improve with age, with larger developmental effects for 64- and 256-Hz AM than 16-Hz AM. Detection of 16-Hz AM was also measured with a 1000-Hz off-frequency masker tone carrying 16-Hz AM. Off-frequency modulation masking was larger for younger than older children and adults when the masker was gated with the target, but not when the masker was continuous. Experiment 2 measured detection of 16- or 64-Hz sinusoidal AM carried on a bandpass noise with and without additional on-frequency masker AM. Children and adults demonstrated modulation masking with similar tuning to modulation rate. Rate-dependent age effects for AM detection on a pure-tone carrier are consistent with maturation of temporal resolution, an effect that may be obscured by modulation masking for noise carriers. Children were more susceptible than adults to off-frequency modulation masking for gated stimuli, consistent with maturation in the ability to listen selectively in frequency, but the children were not more susceptible to on-frequency modulation masking than adults.
Affiliation(s)
- Emily Buss
- Department of Otolaryngology/Head and Neck Surgery, School of Medicine, University of North Carolina, Chapel Hill, North Carolina 27599-7070, USA
- Christian Lorenzi
- Laboratoire des Systèmes Perceptifs, Département d'Études Cognitives, École Normale Supérieure, Université Paris Sciences et Lettres, Centre National de la Recherche Scientifique, Paris, France
- Laurianne Cabrera
- Laboratoire de Psychologie de la Perception, Université Paris Descartes, Centre National de la Recherche Scientifique, Paris, France
- Lori J Leibold
- Center for Hearing Research, Boys Town National Research Hospital, Omaha, Nebraska 68131, USA
- John H Grose
- Department of Otolaryngology/Head and Neck Surgery, School of Medicine, University of North Carolina, Chapel Hill, North Carolina 27599-7070, USA
18
Chakrabarty D, Elhilali M. A Gestalt inference model for auditory scene segregation. PLoS Comput Biol 2019; 15:e1006711. [PMID: 30668568] [PMCID: PMC6358108] [DOI: 10.1371/journal.pcbi.1006711]
Abstract
Our current understanding of how the brain segregates auditory scenes into meaningful objects is in line with a Gestalt framework. These Gestalt principles suggest a theory of how different attributes of the soundscape are extracted and then bound together into separate groups that reflect different objects or streams present in the scene. These cues are thought to reflect the underlying statistical structure of natural sounds, in a similar way that statistics of natural images are closely linked to the principles that guide figure-ground segregation and object segmentation in vision. In the present study, we leverage inference in stochastic neural networks to learn emergent grouping cues directly from natural soundscapes, including speech, music and sounds in nature. The model learns a hierarchy of local and global spectro-temporal attributes reminiscent of the simultaneous and sequential Gestalt cues that underlie the organization of auditory scenes. These mappings operate at multiple time scales to analyze an incoming complex scene and are then fused using a Hebbian network that binds together coherent features into perceptually segregated auditory objects. The proposed architecture successfully emulates a wide range of well-established auditory scene segregation phenomena and quantifies the complementary role of segregation and binding cues in driving auditory scene segregation.
Affiliation(s)
- Debmalya Chakrabarty
- Laboratory for Computational Audio Processing, Center for Speech and Language Processing, Department of Electrical and Computer Engineering, Johns Hopkins University, Baltimore, MD, USA
- Mounya Elhilali
- Laboratory for Computational Audio Processing, Center for Speech and Language Processing, Department of Electrical and Computer Engineering, Johns Hopkins University, Baltimore, MD, USA
19
Souza P, Hoover E. The Physiologic and Psychophysical Consequences of Severe-to-Profound Hearing Loss. Semin Hear 2018; 39:349-363. [PMID: 30443103] [DOI: 10.1055/s-0038-1670698]
Abstract
Substantial loss of cochlear function is required to elevate pure-tone thresholds to the severe hearing loss range; yet, individuals with severe or profound hearing loss continue to rely on hearing for communication. Despite the impairment, sufficient information is encoded at the periphery to make acoustic hearing a viable option. However, the probability of significant cochlear and/or neural damage associated with the loss has consequences for sound perception and speech recognition. These consequences include degraded frequency selectivity, which can be assessed with tests including psychoacoustic tuning curves and broadband rippled stimuli. Because speech recognition depends on the ability to resolve frequency detail, a listener with severe hearing loss is likely to have impaired communication in both quiet and noisy environments. However, the extent of the impairment varies widely among individuals. A better understanding of the fundamental abilities of listeners with severe and profound hearing loss and the consequences of those abilities for communication can support directed treatment options in this population.
Affiliation(s)
- Pamela Souza
- Department of Communication Sciences and Disorders, Northwestern University, Evanston, Illinois
- Eric Hoover
- Department of Hearing and Speech Sciences, University of Maryland, Baltimore, Maryland
20
Madsen SMK, Dau T, Moore BCJ. Effect of harmonic rank on sequential sound segregation. Hear Res 2018; 367:161-168. [PMID: 30006111] [DOI: 10.1016/j.heares.2018.06.002]
Abstract
The ability to segregate sounds from different sound sources is thought to depend on the perceptual salience of differences between the sounds, such as differences in frequency or fundamental frequency (F0). F0 discrimination of complex tones is better for tones with low harmonics than for tones that only contain high harmonics, suggesting greater pitch salience for the former. This leads to the expectation that the sequential stream segregation (streaming) of complex tones should be better for tones with low harmonics than for tones with only high harmonics. However, the results of previous studies are conflicting about whether this is the case. The goals of this study were to determine the effect of harmonic rank on streaming and to establish whether streaming is related to F0 discrimination. Thirteen young normal-hearing participants were tested. Streaming was assessed for pure tones and complex tones containing harmonics with various ranks using sequences of ABA triplets, where A and B differed in frequency or in F0. The participants were asked to try to hear two streams and to indicate when they heard one and when they heard two streams. F0 discrimination was measured for the same tones that were used as A tones in the streaming experiment. Both streaming and F0 discrimination worsened significantly with increasing harmonic rank. There was a significant relationship between streaming and F0 discrimination, indicating that good F0 discrimination is associated with good streaming. This supports the idea that the extent of stream segregation depends on the salience of the perceptual difference between successive sounds.
Affiliation(s)
- Sara M K Madsen
- Hearing Systems Group, Department of Electrical Engineering, Technical University of Denmark, DK-2800, Lyngby, Denmark
- Torsten Dau
- Hearing Systems Group, Department of Electrical Engineering, Technical University of Denmark, DK-2800, Lyngby, Denmark
- Brian C J Moore
- Department of Psychology, University of Cambridge, Cambridge, UK
21
Knyazeva S, Selezneva E, Gorkin A, Aggelopoulos NC, Brosch M. Neuronal Correlates of Auditory Streaming in Monkey Auditory Cortex for Tone Sequences without Spectral Differences. Front Integr Neurosci 2018; 12:4. [PMID: 29440999] [PMCID: PMC5797536] [DOI: 10.3389/fnint.2018.00004]
Abstract
This study finds a neuronal correlate of auditory perceptual streaming in the primary auditory cortex for sequences of tone complexes that have the same amplitude spectrum but a different phase spectrum. Our finding is based on microelectrode recordings of multiunit activity from 270 cortical sites in three awake macaque monkeys. The monkeys were presented with repeated sequences of a tone triplet that consisted of an A tone, a B tone, another A tone and then a pause. The A and B tones were composed of unresolved harmonics formed by adding the harmonics in cosine phase, in alternating phase, or in random phase. A previous psychophysical study on humans revealed that when the A and B tones are similar, humans integrate them into a single auditory stream; when the A and B tones are dissimilar, humans segregate them into separate auditory streams. We found that the similarity of neuronal rate responses to the triplets was highest when all A and B tones had cosine phase. Similarity was intermediate when the A tones had cosine phase and the B tones had alternating phase. Similarity was lowest when the A tones had cosine phase and the B tones had random phase. The present study corroborates and extends previous reports, showing similar correspondences between neuronal activity in the primary auditory cortex and auditory streaming of sound sequences. It also is consistent with Fishman’s population separation model of auditory streaming.
Affiliation(s)
- Stanislava Knyazeva
- Speziallabor Primatenneurobiologie, Leibniz-Institut für Neurobiologie, Magdeburg, Germany
- Elena Selezneva
- Speziallabor Primatenneurobiologie, Leibniz-Institut für Neurobiologie, Magdeburg, Germany
- Alexander Gorkin
- Speziallabor Primatenneurobiologie, Leibniz-Institut für Neurobiologie, Magdeburg, Germany; Laboratory of Psychophysiology, Institute of Psychology, Moscow, Russia
- Michael Brosch
- Speziallabor Primatenneurobiologie, Leibniz-Institut für Neurobiologie, Magdeburg, Germany; Center for Behavioral Brain Sciences, Otto-von-Guericke-University, Magdeburg, Germany
22
Paredes-Gallardo A, Madsen SMK, Dau T, Marozeau J. The Role of Temporal Cues in Voluntary Stream Segregation for Cochlear Implant Users. Trends Hear 2018; 22:2331216518773226. [PMID: 29766759] [PMCID: PMC5974563] [DOI: 10.1177/2331216518773226]
Abstract
The role of temporal cues in sequential stream segregation was investigated in cochlear implant (CI) listeners using a delay detection task composed of a sequence of bursts of pulses (B) on a single electrode interleaved with a second sequence (A) presented on the same electrode with a different pulse rate. In half of the trials, a delay was added to the last burst of the otherwise regular B sequence and the listeners were asked to detect this delay. As a jitter was added to the period between consecutive A bursts, time judgments between the A and B sequences provided an unreliable cue to perform the task. Thus, the segregation of the A and B sequences should improve performance. The pulse rate difference and the duration of the sequences were varied between trials. The performance in the detection task improved by increasing both the pulse rate differences and the sequence duration. This suggests that CI listeners can use pulse rate differences to segregate sequential sounds and that a segregated percept builds up over time. In addition, the contribution of place versus temporal cues for voluntary stream segregation was assessed by combining the results from this study with those from our previous study, where the same paradigm was used to determine the role of place cues on stream segregation. Pitch height differences between the A and the B sounds accounted for the results from both studies, suggesting that stream segregation is related to the salience of the perceptual difference between the sounds.
Affiliation(s)
- Andreu Paredes-Gallardo
- Hearing Systems Group, Department of Electrical Engineering, Technical University of Denmark, Kongens Lyngby, Denmark
- Sara M. K. Madsen
- Hearing Systems Group, Department of Electrical Engineering, Technical University of Denmark, Kongens Lyngby, Denmark
- Torsten Dau
- Hearing Systems Group, Department of Electrical Engineering, Technical University of Denmark, Kongens Lyngby, Denmark
- Jeremy Marozeau
- Hearing Systems Group, Department of Electrical Engineering, Technical University of Denmark, Kongens Lyngby, Denmark
23
Shinn-Cunningham B. Cortical and Sensory Causes of Individual Differences in Selective Attention Ability Among Listeners With Normal Hearing Thresholds. J Speech Lang Hear Res 2017; 60:2976-2988. [PMID: 29049598] [PMCID: PMC5945067] [DOI: 10.1044/2017_jslhr-h-17-0080]
Abstract
PURPOSE This review provides clinicians with an overview of recent findings relevant to understanding why listeners with normal hearing thresholds (NHTs) sometimes suffer from communication difficulties in noisy settings. METHOD The results from neuroscience and psychoacoustics are reviewed. RESULTS In noisy settings, listeners focus their attention by engaging cortical brain networks to suppress unimportant sounds; they then can analyze and understand an important sound, such as speech, amidst competing sounds. Differences in the efficacy of top-down control of attention can affect communication abilities. In addition, subclinical deficits in sensory fidelity can disrupt the ability to perceptually segregate sound sources, interfering with selective attention, even in listeners with NHTs. Studies of variability in control of attention and in sensory coding fidelity may help to isolate and identify some of the causes of communication disorders in individuals presenting at the clinic with "normal hearing." CONCLUSIONS How well an individual with NHTs can understand speech amidst competing sounds depends not only on the sound being audible but also on the integrity of cortical control networks and the fidelity of the representation of suprathreshold sound. Understanding the root cause of difficulties experienced by listeners with NHTs ultimately can lead to new, targeted interventions that address specific deficits affecting communication in noise. PRESENTATION VIDEO http://cred.pubs.asha.org/article.aspx?articleid=2601617.
Affiliation(s)
- Barbara Shinn-Cunningham
- Center for Research in Sensory Communication and Emerging Neural Technology, Boston University, MA
24
Auditory sequential accumulation of spectral information. Hear Res 2017; 356:118-126. [PMID: 29042121] [DOI: 10.1016/j.heares.2017.10.001]
Abstract
In many listening situations, information about the spectral content of a target sound may be distributed over time, and estimating the target spectrum requires efficient sequential processing. Listeners' ability to estimate the spectrum of a random-frequency, six-tone complex was investigated; the spectral content of the complex was revealed using a sequence of bursts. Whether each of the six tones was presented within each burst was determined at random according to a presentation probability. In separate conditions, the presentation probability (p) ranged from 0.2 to 1, the total number of bursts varied from 1 to 16, and the inter-burst interval was either 0 or 200 ms. To evaluate the information acquired by the listener, the burst sequence was followed, after a 500-ms silent interval, by the six-tone complex acting as an informational masker, and the listener was required to detect a pure-tone target presented simultaneously with the masker. Better performance in this task indicates more accurate estimation of the spectrum of the complex by the listener. Evidence for integration of information across bursts was observed, and the integration process did not significantly depend on inter-burst interval.
25
A Crucial Test of the Population Separation Model of Auditory Stream Segregation in Macaque Primary Auditory Cortex. J Neurosci 2017; 37:10645-10655. [PMID: 28954867] [DOI: 10.1523/jneurosci.0792-17.2017]
Abstract
An important aspect of auditory scene analysis is auditory stream segregation - the organization of sound sequences into perceptual streams reflecting different sound sources in the environment. Several models have been proposed to account for stream segregation. According to the "population separation" (PS) model, alternating ABAB tone sequences are perceived as a single stream or as two separate streams when "A" and "B" tones activate the same or distinct frequency-tuned neuronal populations in primary auditory cortex (A1), respectively. A crucial test of the PS model is whether it can account for the observation that A and B tones are generally perceived as a single stream when presented synchronously, rather than in an alternating pattern, even if they are widely separated in frequency. Here, we tested the PS model by recording neural responses to alternating (ALT) and synchronous (SYNC) tone sequences in A1 of male macaques. Consistent with predictions of the PS model, a greater effective tonotopic separation of A and B tone responses was observed under ALT than under SYNC conditions, thus paralleling the perceptual organization of the sequences. While other models of stream segregation, such as temporal coherence, are not excluded by the present findings, we conclude that PS is sufficient to account for the perceptual organization of ALT and SYNC sequences and thus remains a viable model of auditory stream segregation. SIGNIFICANCE STATEMENT According to the population separation (PS) model of auditory stream segregation, sounds that activate the same or separate neural populations in primary auditory cortex (A1) are perceived as one or two streams, respectively. It is unclear, however, whether the PS model can account for the perception of sounds as a single stream when they are presented synchronously. Here, we tested the PS model by recording neural responses to alternating (ALT) and synchronous (SYNC) tone sequences in macaque A1. A greater effective separation of tonotopic activity patterns was observed under ALT than under SYNC conditions, thus paralleling the perceptual organization of the sequences. Based on these findings, we conclude that PS remains a plausible neurophysiological model of auditory stream segregation.
26
Comparison of perceptual properties of auditory streaming between spectral and amplitude modulation domains. Hear Res 2017; 350:244-250. [PMID: 28323019] [DOI: 10.1016/j.heares.2017.03.006]
Abstract
The two-tone sequence (ABA_), which comprises two different sounds (A and B) and a silent gap, has been used to investigate how the auditory system organizes sequential sounds depending on various stimulus conditions or brain states. Auditory streaming can be evoked by differences not only in the tone frequency ("spectral cue": ΔF_TONE; TONE condition) but also in the amplitude modulation rate ("AM cue": ΔF_AM; AM condition). The aim of the present study was to explore the relationship between the perceptual properties of auditory streaming for the TONE and AM conditions. A sequence with a long duration (400 repetitions of ABA_) was used to examine the property of the bistability of streaming. The ratio of feature differences that evoked an equivalent probability of the segregated percept was close to the ratio of the Q-values of the auditory and modulation filters, consistent with a "channeling theory" of auditory streaming. On the other hand, for values of ΔF_AM and ΔF_TONE evoking equal probabilities of the segregated percept, the number of perceptual switches was larger for the TONE condition than for the AM condition, indicating that the mechanism(s) that determine the bistability of auditory streaming are different between or sensitive to the two domains. Nevertheless, the number of switches for individual listeners was positively correlated between the spectral and AM domains. The results suggest a possibility that the neural substrates for spectral and AM processes share a common switching mechanism but differ in location and/or in the properties of neural activity or the strength of internal noise at each level.
27
Cognitive representation of "musical fractals": Processing hierarchy and recursion in the auditory domain. Cognition 2017; 161:31-45. [PMID: 28103526] [PMCID: PMC5348576] [DOI: 10.1016/j.cognition.2017.01.001]
Abstract
The human ability to process hierarchical structures has been a longstanding research topic. However, the nature of the cognitive machinery underlying this faculty remains controversial. Recursion, the ability to embed structures within structures of the same kind, has been proposed as a key component of our ability to parse and generate complex hierarchies. Here, we investigated the cognitive representation of both recursive and iterative processes in the auditory domain. The experiment used a two-alternative forced-choice paradigm: participants were exposed to three-step processes in which pure-tone sequences were built either through recursive or iterative processes, and had to choose the correct completion. Foils were constructed according to generative processes that did not match the previous steps. Both musicians and non-musicians were able to represent recursion in the auditory domain, although musicians performed better. We also observed that general ‘musical’ aptitudes played a role in both recursion and iteration, although the influence of musical training was somewhat independent of melodic memory. Moreover, unlike iteration, recursion in audition was well correlated with its non-auditory (recursive) analogues in the visual and action sequencing domains. These results suggest that the cognitive machinery involved in establishing recursive representations is domain-general, even though this machinery requires access to information resulting from domain-specific processes.
Collapse
|
28
|
Shinn-Cunningham B, Best V, Lee AKC. Auditory Object Formation and Selection. SPRINGER HANDBOOK OF AUDITORY RESEARCH 2017. [DOI: 10.1007/978-3-319-51662-2_2] [Citation(s) in RCA: 31] [Impact Index Per Article: 4.4] [Reference Citation Analysis] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 12/03/2022]
|
29
|
Tóth B, Kocsis Z, Háden GP, Szerafin Á, Shinn-Cunningham BG, Winkler I. EEG signatures accompanying auditory figure-ground segregation. Neuroimage 2016; 141:108-119. [PMID: 27421185 PMCID: PMC5656226 DOI: 10.1016/j.neuroimage.2016.07.028] [Citation(s) in RCA: 16] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/02/2016] [Revised: 07/06/2016] [Accepted: 07/11/2016] [Indexed: 11/16/2022] Open
Abstract
In everyday acoustic scenes, figure-ground segregation typically requires one to group together sound elements over both time and frequency. Electroencephalogram was recorded while listeners detected repeating tonal complexes composed of a random set of pure tones within stimuli consisting of randomly varying tonal elements. The repeating pattern was perceived as a figure over the randomly changing background. It was found that detection performance improved both as the number of pure tones making up each repeated complex (figure coherence) increased, and as the number of repeated complexes (duration) increased - i.e., detection was easier when either the spectral or temporal structure of the figure was enhanced. Figure detection was accompanied by the elicitation of the object related negativity (ORN) and the P400 event-related potentials (ERPs), which have been previously shown to be evoked by the presence of two concurrent sounds. Both ERP components had generators within and outside of auditory cortex. The amplitudes of the ORN and the P400 increased with both figure coherence and figure duration. However, only the P400 amplitude correlated with detection performance. These results suggest that 1) the ORN and P400 reflect processes involved in detecting the emergence of a new auditory object in the presence of other concurrent auditory objects; 2) the ORN corresponds to the likelihood of the presence of two or more concurrent sound objects, whereas the P400 reflects the perceptual recognition of the presence of multiple auditory objects and/or preparation for reporting the detection of a target object.
Collapse
Affiliation(s)
- Brigitta Tóth
- Institute of Cognitive Neuroscience and Psychology, Research Centre for Natural Sciences, Hungarian Academy of Sciences, Budapest, Hungary; Center for Computational Neuroscience and Neural Technology, Boston University, Boston, USA.
| | - Zsuzsanna Kocsis
- Institute of Cognitive Neuroscience and Psychology, Research Centre for Natural Sciences, Hungarian Academy of Sciences, Budapest, Hungary; Department of Cognitive Science, Faculty of Natural Sciences, Budapest University of Technology and Economics, Budapest, Hungary
| | - Gábor P Háden
- Institute of Cognitive Neuroscience and Psychology, Research Centre for Natural Sciences, Hungarian Academy of Sciences, Budapest, Hungary
| | - Ágnes Szerafin
- Institute of Cognitive Neuroscience and Psychology, Research Centre for Natural Sciences, Hungarian Academy of Sciences, Budapest, Hungary; Department of Cognitive Science, Faculty of Natural Sciences, Budapest University of Technology and Economics, Budapest, Hungary
| | | | - István Winkler
- Institute of Cognitive Neuroscience and Psychology, Research Centre for Natural Sciences, Hungarian Academy of Sciences, Budapest, Hungary; Department of Cognitive and Neuropsychology, Institute of Psychology, University of Szeged, Szeged, Hungary
| |
Collapse
|
30
|
Fogerty D, Xu J, Gibbs BE. Modulation masking and glimpsing of natural and vocoded speech during single-talker modulated noise: Effect of the modulation spectrum. THE JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA 2016; 140:1800. [PMID: 27914381 PMCID: PMC5848862 DOI: 10.1121/1.4962494] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 05/10/2023]
Abstract
Compared to notionally steady-state noise, modulated maskers provide a perceptual benefit for speech recognition, in part due to preserved speech information during the amplitude dips of the masker. However, overlap in the modulation spectrum between the target speech and the competing modulated masker may potentially result in modulation masking, and thereby offset the release from energetic masking. The current study investigated masking release provided by single-talker modulated noise. The overlap in the modulation spectra of the target speech and the modulated noise masker was varied through time compression or expansion of the competing masker. Younger normal hearing adults listened to sentences that were unprocessed or noise vocoded to primarily limit speech recognition to the preserved temporal envelope cues. For unprocessed speech, results demonstrated improved performance with masker modulation spectrum shifted up or down compared to the target modulation spectrum, except for the most extreme time expansion. For vocoded speech, significant masking release was observed with the slowest masker rate. Perceptual results combined with acoustic analyses of the preserved glimpses of the target speech suggest contributions of modulation masking and cognitive-linguistic processing as factors contributing to performance.
Collapse
Affiliation(s)
- Daniel Fogerty
- Department of Communication Sciences and Disorders, University of South Carolina, Columbia, South Carolina 29208, USA
| | - Jiaqian Xu
- Department of Communication Sciences and Disorders, University of South Carolina, Columbia, South Carolina 29208, USA
| | - Bobby E Gibbs
- Department of Communication Sciences and Disorders, University of South Carolina, Columbia, South Carolina 29208, USA
| |
Collapse
|
32
|
Functional magnetic resonance imaging confirms forward suppression for rapidly alternating sounds in human auditory cortex but not in the inferior colliculus. Hear Res 2016; 335:25-32. [PMID: 26899342 DOI: 10.1016/j.heares.2016.02.010] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Received: 11/19/2015] [Revised: 02/08/2016] [Accepted: 02/15/2016] [Indexed: 11/21/2022]
Abstract
Forward suppression at the level of the auditory cortex has been suggested to subserve auditory stream segregation. Recent results in non-streaming stimulation contexts have indicated that forward suppression can also be observed in the inferior colliculus; whether this holds for streaming-related contexts remains unclear. Here, we used cardiac-gated fMRI to examine forward suppression in the inferior colliculus (and the rest of the human auditory pathway) in response to canonical streaming stimuli (rapid tone sequences comprised of either one repetitive tone or two alternating tones). The first stimulus is typically perceived as a single stream, the second as two interleaved streams. In different experiments using either pure tones differing in frequency or bandpass-filtered noise differing in inter-aural time differences, we observed stronger auditory cortex activation in response to alternating vs. repetitive stimulation, consistent with the presence of forward suppression. In contrast, activity in the inferior colliculus and other subcortical nuclei did not significantly differ between alternating and monotonic stimuli. This finding could be explained by active amplification of forward suppression in auditory cortex, by a low rate (or absence) of cells showing forward suppression in inferior colliculus, or both.
Collapse
|
33
|
Winkler I, Schröger E. Auditory perceptual objects as generative models: Setting the stage for communication by sound. BRAIN AND LANGUAGE 2015; 148:1-22. [PMID: 26184883 DOI: 10.1016/j.bandl.2015.05.003] [Citation(s) in RCA: 47] [Impact Index Per Article: 5.2] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 10/20/2014] [Revised: 03/03/2015] [Accepted: 05/03/2015] [Indexed: 06/04/2023]
Abstract
Communication by sounds requires that the communication channels (i.e. speech/speakers and other sound sources) have been established. This allows listeners to separate concurrently active sound sources, to track their identity, to assess the type of message arriving from them, and to decide whether and when to react (e.g., reply to the message). We propose that these functions rely on a common generative model of the auditory environment. This model predicts upcoming sounds on the basis of representations describing temporal/sequential regularities. Predictions help to identify the continuation of previously discovered sound sources, to detect the emergence of new sources, and to register changes in the behavior of known ones. The model produces auditory event representations which provide a full sensory description of the sounds, including their relation to the auditory context and the current goals of the organism. Event representations can be consciously perceived and serve as objects in various cognitive operations.
Collapse
Affiliation(s)
- István Winkler
- Institute of Cognitive Neuroscience and Psychology, Research Centre for Natural Sciences, Hungarian Academy of Sciences, Hungary; Institute of Psychology, University of Szeged, Hungary.
| | - Erich Schröger
- Institute for Psychology, University of Leipzig, Germany.
| |
Collapse
|
34
|
Arehart K, Souza P, Kates J, Lunner T, Pedersen MS. Relationship Among Signal Fidelity, Hearing Loss, and Working Memory for Digital Noise Suppression. Ear Hear 2015; 36:505-16. [PMID: 25985016 PMCID: PMC4549215 DOI: 10.1097/aud.0000000000000173] [Citation(s) in RCA: 27] [Impact Index Per Article: 3.0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/25/2022]
Abstract
OBJECTIVES This study considered speech modified by additive babble combined with noise-suppression processing. The purpose was to determine the relative importance of the signal modifications, individual peripheral hearing loss, and individual cognitive capacity on speech intelligibility and speech quality. DESIGN The participant group consisted of 31 individuals with moderate high-frequency hearing loss ranging in age from 51 to 89 years (mean = 69.6 years). Speech intelligibility and speech quality were measured using low-context sentences presented in babble at several signal-to-noise ratios. Speech stimuli were processed with a binary mask noise-suppression strategy with systematic manipulations of two parameters (error rate and attenuation values). The cumulative effects of signal modification produced by babble and signal processing were quantified using an envelope-distortion metric. Working memory capacity was assessed with a reading span test. Analysis of variance was used to determine the effects of signal processing parameters on perceptual scores. Hierarchical linear modeling was used to determine the role of degree of hearing loss and working memory capacity in individual listener response to the processed noisy speech. The model also considered improvements in envelope fidelity caused by the binary mask and the degradations to envelope caused by error and noise. RESULTS The participants showed significant benefits in terms of intelligibility scores and quality ratings for noisy speech processed by the ideal binary mask noise-suppression strategy. This benefit was observed across a range of signal-to-noise ratios and persisted when up to a 30% error rate was introduced into the processing. Average intelligibility scores and average quality ratings were well predicted by an objective metric of envelope fidelity. Degree of hearing loss and working memory capacity were significant factors in explaining individual listeners' intelligibility scores for binary mask processing applied to speech in babble. Degree of hearing loss and working memory capacity did not predict listeners' quality ratings. CONCLUSIONS The results indicate that envelope fidelity is a primary factor in determining the combined effects of noise and binary mask processing for intelligibility and quality of speech presented in babble noise. Degree of hearing loss and working memory capacity are significant factors in explaining variability in listeners' speech intelligibility scores but not in quality ratings.
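The ideal-binary-mask noise-suppression strategy described in this abstract can be sketched as follows. This is an illustrative reconstruction only, not the authors' implementation: the function names, the local-criterion parameter `lc_db`, and the bit-flip model of the study's "error rate" manipulation are assumptions.

```python
import numpy as np

def ideal_binary_mask(target_power, noise_power, lc_db=0.0):
    """Keep time-frequency cells whose local SNR exceeds a criterion
    (lc_db, in dB); discard (zero) the rest.  Inputs are per-cell power
    estimates of the clean target and of the noise alone."""
    snr_db = 10.0 * np.log10(target_power / noise_power)
    return (snr_db > lc_db).astype(float)

def apply_mask(mixture, mask, error_rate=0.0, rng=None):
    """Apply a binary mask to the mixture's T-F cells, optionally
    flipping a proportion of mask bits to mimic a processing error
    rate (an illustrative model of the study's error manipulation)."""
    rng = np.random.default_rng(0) if rng is None else rng
    noisy_mask = mask.copy()
    flip = rng.random(mask.shape) < error_rate
    noisy_mask[flip] = 1.0 - noisy_mask[flip]
    return mixture * noisy_mask
```

With `error_rate=0` this reduces to the ideal mask; raising it degrades the mask in the spirit of the study's error-rate parameter.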
Collapse
Affiliation(s)
- Kathryn Arehart
- Speech Language and Hearing Sciences, University of Colorado Boulder, Boulder, CO, USA; Communication Sciences and Disorders and Knowles Hearing Center, Northwestern University, Evanston, IL, USA; Eriksholm Research Centre, Oticon A/S, Snekkersten, Denmark; Linnaeus Centre HEAD, Department of Behavioural Sciences and Learning, Linköping University, Linköping, Sweden; and Oticon A/S, Smørum, Denmark
| | | | | | | | | |
Collapse
|
35
|
Farley BJ, Noreña AJ. Membrane potential dynamics of populations of cortical neurons during auditory streaming. J Neurophysiol 2015; 114:2418-30. [PMID: 26269558 DOI: 10.1152/jn.00545.2015] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/03/2015] [Accepted: 08/12/2015] [Indexed: 11/22/2022] Open
Abstract
How a mixture of acoustic sources is perceptually organized into discrete auditory objects remains unclear. One current hypothesis postulates that perceptual segregation of different sources is related to the spatiotemporal separation of cortical responses induced by each acoustic source or stream. In the present study, the dynamics of subthreshold membrane potential activity were measured across the entire tonotopic axis of the rodent primary auditory cortex during the auditory streaming paradigm using voltage-sensitive dye imaging. Consistent with the proposed hypothesis, we observed enhanced spatiotemporal segregation of cortical responses to alternating tone sequences as their frequency separation or presentation rate was increased, both manipulations known to promote stream segregation. However, across most streaming paradigm conditions tested, a substantial cortical region maintaining a response to both tones coexisted with more peripheral cortical regions responding more selectively to one of them. We propose that these coexisting subthreshold representation types could provide neural substrates to support the flexible switching between the integrated and segregated streaming percepts.
Collapse
Affiliation(s)
| | - Arnaud J Noreña
- Aix Marseille Université, Centre National de la Recherche Scientifique, Marseille, France; and
| |
Collapse
|
36
|
Nie Y, Nelson PB. Auditory stream segregation using amplitude modulated bandpass noise. Front Psychol 2015; 6:1151. [PMID: 26300831 PMCID: PMC4528102 DOI: 10.3389/fpsyg.2015.01151] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.1] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/14/2015] [Accepted: 07/23/2015] [Indexed: 12/23/2022] Open
Abstract
The purpose of this study was to investigate the roles of spectral overlap and amplitude modulation (AM) rate for stream segregation for noise signals, as well as to test the build-up effect based on these two cues. Segregation ability was evaluated using an objective paradigm with listeners' attention focused on stream segregation. Stimulus sequences consisted of two interleaved sets of bandpass noise bursts (A and B bursts). The A and B bursts differed in spectrum, AM-rate, or both. The amount of the difference between the two sets of noise bursts was varied. Long and short sequences were studied to investigate the build-up effect for segregation based on spectral and AM-rate differences. Results showed the following: (1). Stream segregation ability increased with greater spectral separation. (2). Larger AM-rate separations were associated with stronger segregation abilities. (3). Spectral separation was found to elicit the build-up effect for the range of spectral differences assessed in the current study. (4). AM-rate separation interacted with spectral separation suggesting an additive effect of spectral separation and AM-rate separation on segregation build-up. The findings suggest that, when normal-hearing listeners direct their attention towards segregation, they are able to segregate auditory streams based on reduced spectral contrast cues that vary by the amount of spectral overlap. Further, regardless of the spectral separation they are able to use AM-rate difference as a secondary/weaker cue. Based on the spectral differences, listeners can segregate auditory streams better as the listening duration is prolonged—i.e., sparse spectral cues elicit build-up segregation; however, AM-rate differences only appear to elicit build-up when in combination with spectral difference cues.
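An amplitude-modulated bandpass-noise burst of the general kind used as the A and B signals can be sketched as below. The sampling rate, band edges, AM rate, and burst duration are illustrative values, not the study's parameters, and the FFT-bin-zeroing bandpass filter is a simplification of whatever filtering the authors used.

```python
import numpy as np

def am_bandpass_burst(fs=16000, dur=0.1, band=(1000.0, 2000.0),
                      am_rate=40.0, depth=1.0, seed=0):
    """One sinusoidally amplitude-modulated bandpass-noise burst.
    Bandpass filtering is approximated by zeroing FFT bins outside
    the band; the envelope is 1 + depth*sin(2*pi*am_rate*t)."""
    n = int(fs * dur)
    rng = np.random.default_rng(seed)
    noise = rng.standard_normal(n)
    spec = np.fft.rfft(noise)
    freqs = np.fft.rfftfreq(n, 1.0 / fs)
    spec[(freqs < band[0]) | (freqs > band[1])] = 0.0
    carrier = np.fft.irfft(spec, n)
    t = np.arange(n) / fs
    env = 1.0 + depth * np.sin(2.0 * np.pi * am_rate * t)
    return env * carrier
```

Interleaving two such bursts that differ in `band` (spectral cue), `am_rate` (temporal cue), or both reproduces the A/B manipulation the abstract describes.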
Collapse
Affiliation(s)
- Yingjiu Nie
- Department of Communication Sciences and Disorders, James Madison University, Harrisonburg, VA, USA
| | - Peggy B Nelson
- Department of Speech-Language-Hearing Sciences, University of Minnesota, Minneapolis, MN, USA
| |
Collapse
|
37
|
Abstract
Amplitude modulations are fundamental features of natural signals, including human speech and nonhuman primate vocalizations. Because natural signals frequently occur in the context of other competing signals, we used a forward-masking paradigm to investigate how the modulation context of a prior signal affects cortical responses to subsequent modulated sounds. Psychophysical "modulation masking," in which the presentation of a modulated "masker" signal elevates the threshold for detecting the modulation of a subsequent stimulus, has been interpreted as evidence of a central modulation filterbank and modeled accordingly. Whether cortical modulation tuning is compatible with such models remains unknown. By recording responses to pairs of sinusoidally amplitude modulated (SAM) tones in the auditory cortex of awake squirrel monkeys, we show that the prior presentation of the SAM masker elicited persistent and tuned suppression of the firing rate to subsequent SAM signals. Population averages of these effects are compatible with adaptation in broadly tuned modulation channels. In contrast, modulation context had little effect on the synchrony of the cortical representation of the second SAM stimuli, and the tuning of such effects did not match that observed for firing rate. Our results suggest that, although the temporal representation of modulated signals is more robust to changes in stimulus context than representations based on average firing rate, this representation is not fully exploited: psychophysical modulation masking more closely mirrors physiological rate suppression, and rate tuning for a given stimulus feature in a given neuron's signal pathway appears sufficient to engender context-sensitive cortical adaptation.
Collapse
|
38
|
Christison-Lagay KL, Gifford AM, Cohen YE. Neural correlates of auditory scene analysis and perception. Int J Psychophysiol 2015; 95:238-245. [PMID: 24681354 PMCID: PMC4176604 DOI: 10.1016/j.ijpsycho.2014.03.004] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.2] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/09/2013] [Revised: 01/13/2014] [Accepted: 03/14/2014] [Indexed: 11/16/2022]
Abstract
The auditory system is designed to transform acoustic information from low-level sensory representations into perceptual representations. These perceptual representations are the computational result of the auditory system's ability to group and segregate spectral, spatial and temporal regularities in the acoustic environment into stable perceptual units (i.e., sounds or auditory objects). Current evidence suggests that the cortex-specifically, the ventral auditory pathway-is responsible for the computations most closely related to perceptual representations. Here, we discuss how the transformations along the ventral auditory pathway relate to auditory percepts, with special attention paid to the processing of vocalizations and categorization, and explore recent models of how these areas may carry out these computations.
Collapse
Affiliation(s)
- Kate L. Christison-Lagay
- Neuroscience Graduate Group, Perelman School of Medicine, University of Pennsylvania, Philadelphia, 19104
| | - Adam M. Gifford
- Neuroscience Graduate Group, Perelman School of Medicine, University of Pennsylvania, Philadelphia, 19104
| | - Yale E. Cohen
- Department of Otorhinolaryngology, University of Pennsylvania, Philadelphia, 19104
- Neuroscience, Perelman School of Medicine, University of Pennsylvania, Philadelphia, 19104
- Department of Bioengineering University of Pennsylvania, Philadelphia, 19104
| |
Collapse
|
39
|
Niwa M, O'Connor KN, Engall E, Johnson JS, Sutter ML. Hierarchical effects of task engagement on amplitude modulation encoding in auditory cortex. J Neurophysiol 2014; 113:307-27. [PMID: 25298387 DOI: 10.1152/jn.00458.2013] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.8] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/22/2022] Open
Abstract
We recorded from middle lateral belt (ML) and primary (A1) auditory cortical neurons while animals discriminated amplitude-modulated (AM) sounds and also while they sat passively. Engagement in AM discrimination improved ML and A1 neurons' ability to discriminate AM with both firing rate and phase-locking; however, task engagement affected neural AM discrimination differently in the two fields. The results suggest that these two areas utilize different AM coding schemes: a "single mode" in A1 that relies on increased activity for AM relative to unmodulated sounds and a "dual-polar mode" in ML that uses both increases and decreases in neural activity to encode modulation. In the dual-polar ML code, nonsynchronized responses might play a special role. The results are consistent with findings in the primary and secondary somatosensory cortices during discrimination of vibrotactile modulation frequency, implicating a common scheme in the hierarchical processing of temporal information among different modalities. The time course of activity differences between behaving and passive conditions was also distinct in A1 and ML and may have implications for auditory attention. At modulation depths ≥ 16% (approximately behavioral threshold), A1 neurons' improvement in distinguishing AM from unmodulated noise is relatively constant or improves slightly with increasing modulation depth. In ML, improvement during engagement is most pronounced near threshold and disappears at highly suprathreshold depths. This ML effect is evident later in the stimulus, and mainly in nonsynchronized responses. This suggests that attention-related increases in activity are stronger or longer-lasting for more difficult stimuli in ML.
Collapse
Affiliation(s)
- Mamiko Niwa
- Center for Neuroscience and Department of Neurobiology, Physiology, and Behavior, University of California, Davis, California
| | - Kevin N O'Connor
- Center for Neuroscience and Department of Neurobiology, Physiology, and Behavior, University of California, Davis, California
| | - Elizabeth Engall
- Center for Neuroscience and Department of Neurobiology, Physiology, and Behavior, University of California, Davis, California
| | - Jeffrey S Johnson
- Center for Neuroscience and Department of Neurobiology, Physiology, and Behavior, University of California, Davis, California
| | - M L Sutter
- Center for Neuroscience and Department of Neurobiology, Physiology, and Behavior, University of California, Davis, California
| |
Collapse
|
40
|
Riecke L, Scharke W, Valente G, Gutschalk A. Sustained selective attention to competing amplitude-modulations in human auditory cortex. PLoS One 2014; 9:e108045. [PMID: 25259525 PMCID: PMC4178064 DOI: 10.1371/journal.pone.0108045] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.7] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/02/2014] [Accepted: 08/23/2014] [Indexed: 11/18/2022] Open
Abstract
Auditory selective attention plays an essential role for identifying sounds of interest in a scene, but the neural underpinnings are still incompletely understood. Recent findings demonstrate that neural activity that is time-locked to a particular amplitude-modulation (AM) is enhanced in the auditory cortex when the modulated stream of sounds is selectively attended to under sensory competition with other streams. However, the target sounds used in the previous studies differed not only in their AM, but also in other sound features, such as carrier frequency or location. Thus, it remains uncertain whether the observed enhancements reflect AM-selective attention. The present study aims at dissociating the effect of AM frequency on response enhancement in auditory cortex by using an ongoing auditory stimulus that contains two competing targets differing exclusively in their AM frequency. Electroencephalography results showed a sustained response enhancement for auditory attention compared to visual attention, but not for AM-selective attention (attended AM frequency vs. ignored AM frequency). In contrast, the response to the ignored AM frequency was enhanced, although a brief trend toward response enhancement occurred during the initial 15 s. Together with the previous findings, these observations indicate that selective enhancement of attended AMs in auditory cortex is adaptive under sustained AM-selective attention. This finding has implications for our understanding of cortical mechanisms for feature-based attentional gain control.
Collapse
Affiliation(s)
- Lars Riecke
- Department of Cognitive Neuroscience, Faculty of Psychology and Neuroscience, Maastricht University, Maastricht, The Netherlands
| | - Wolfgang Scharke
- Department of Child and Adolescent Psychiatry, Psychotherapy and Psychosomatics, University Hospital, RWTH Aachen University, Aachen, Germany
| | - Giancarlo Valente
- Department of Cognitive Neuroscience, Faculty of Psychology and Neuroscience, Maastricht University, Maastricht, The Netherlands
| | - Alexander Gutschalk
- Department of Neurology, Ruprecht-Karls-Universität Heidelberg, Heidelberg, Germany
| |
Collapse
|
41
|
Nie Y, Zhang Y, Nelson PB. Auditory stream segregation using bandpass noises: evidence from event-related potentials. Front Neurosci 2014; 8:277. [PMID: 25309306 PMCID: PMC4162371 DOI: 10.3389/fnins.2014.00277] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.4] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/15/2014] [Accepted: 08/18/2014] [Indexed: 11/13/2022] Open
Abstract
The current study measured neural responses to investigate auditory stream segregation of noise stimuli with or without clear spectral contrast. Sequences of alternating A and B noise bursts were presented to elicit stream segregation in normal-hearing listeners. The successive B bursts in each sequence maintained an equal amount of temporal separation with manipulations introduced on the last stimulus. The last B burst was either delayed for 50% of the sequences or not delayed for the other 50%. The A bursts were jittered in between every two adjacent B bursts. To study the effects of spectral separation on streaming, the A and B bursts were further manipulated by using either bandpass-filtered noises widely spaced in center frequency or broadband noises. Event-related potentials (ERPs) to the last B bursts were analyzed to compare the neural responses to the delay vs. no-delay trials in both passive and attentive listening conditions. In the passive listening condition, a trend for a possible late mismatch negativity (MMN) or late discriminative negativity (LDN) response was observed only when the A and B bursts were spectrally separate, suggesting that spectral separation in the A and B burst sequences could be conducive to stream segregation at the pre-attentive level. In the attentive condition, a P300 response was consistently elicited regardless of whether there was spectral separation between the A and B bursts, indicating the facilitative role of voluntary attention in stream segregation. The results suggest that reliable ERP measures can be used as indirect indicators for auditory stream segregation in conditions of weak spectral contrast. These findings have important implications for cochlear implant (CI) studies-as spectral information available through a CI device or simulation is substantially degraded, it may require more attention to achieve stream segregation.
Collapse
Affiliation(s)
- Yingjiu Nie
- Department of Communication Sciences and Disorders, James Madison University, Harrisonburg, VA, USA
| | - Yang Zhang
- Department of Speech-Language-Hearing Sciences, University of Minnesota, Twin Cities, MN, USA
- Center for Neurobehavioral Development, University of Minnesota, Twin Cities, MN, USA
| | - Peggy B. Nelson
- Department of Speech-Language-Hearing Sciences, University of MinnesotaTwin-Cities, MN, USA
| |
42
Dolležal LV, Brechmann A, Klump GM, Deike S. Evaluating auditory stream segregation of SAM tone sequences by subjective and objective psychoacoustical tasks, and brain activity. Front Neurosci 2014; 8:119. [PMID: 24936170 PMCID: PMC4047832 DOI: 10.3389/fnins.2014.00119] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.7] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/19/2013] [Accepted: 05/03/2014] [Indexed: 11/13/2022] Open
Abstract
Auditory stream segregation refers to a segregated percept of signal streams with different acoustic features. Different approaches have been pursued in studies of stream segregation. In psychoacoustics, stream segregation has mostly been investigated with a subjective task asking the subjects to report their percept. Few studies have applied an objective task in which stream segregation is evaluated indirectly by determining thresholds for a percept that depends on whether auditory streams are segregated or not. Furthermore, both perceptual measures and physiological measures of brain activity have been employed, but little is known about their relation. The present study evaluates how the results from different tasks and measures are related, using examples relying on the ABA- stimulation paradigm and applying the same stimuli across tasks. We presented A and B signals that were sinusoidally amplitude-modulated (SAM) tones providing purely temporal, spectral, or both types of cues to evaluate perceptual stream segregation and its physiological correlate. Which types of cues are most prominent was determined by the choice of carrier and modulation frequencies (f_mod) of the signals. In the subjective task, subjects reported their percept; in the objective task, we measured their sensitivity for detecting time-shifts of B signals in an ABA- sequence. As a further measure of processes underlying stream segregation, we employed functional magnetic resonance imaging (fMRI). SAM tone parameters were chosen to evoke an integrated (1-stream), a segregated (2-stream), or an ambiguous percept by adjusting the f_mod difference between A and B tones (Δf_mod). The results of both psychoacoustical tasks are significantly correlated. BOLD responses in fMRI depend on Δf_mod between A and B SAM tones. The effect of Δf_mod, however, differs between auditory cortex and frontal regions, suggesting differences in representation related to the degree of perceptual ambiguity of the sequences.
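As a concrete illustration of the stimuli described in this abstract, a SAM tone is a pure-tone carrier multiplied by a raised sinusoidal envelope. The sketch below uses illustrative parameter values (the carrier frequency, modulation frequencies, duration, and sampling rate are assumptions, not the study's values):

```python
import numpy as np

def sam_tone(f_carrier, f_mod, depth=1.0, dur=0.1, fs=44100.0):
    """Sinusoidally amplitude-modulated (SAM) tone.

    y(t) = (1 + depth * sin(2*pi*f_mod*t)) * sin(2*pi*f_carrier*t)
    depth = 1.0 corresponds to 100% modulation.
    """
    t = np.arange(int(dur * fs)) / fs
    envelope = 1.0 + depth * np.sin(2.0 * np.pi * f_mod * t)
    return envelope * np.sin(2.0 * np.pi * f_carrier * t)

# Hypothetical A and B signals differing only in modulation frequency,
# i.e., a purely temporal Δf_mod cue on a shared carrier:
a = sam_tone(f_carrier=4000.0, f_mod=100.0)   # A tone
b = sam_tone(f_carrier=4000.0, f_mod=400.0)   # B tone, two octaves Δf_mod
```

Varying Δf_mod on a fixed carrier isolates the temporal (envelope-rate) cue, whereas changing the carrier instead would introduce a spectral cue.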
Affiliation(s)
- Lena-Vanessa Dolležal
- Animal Physiology and Behavior Group, Department for Neuroscience, School for Medicine and Health Sciences, Center of Excellence "Hearing4all," Carl von Ossietzky University Oldenburg, Oldenburg, Germany
- André Brechmann
- Special Lab Non-invasive Brain Imaging, Leibniz Institute for Neurobiology, Magdeburg, Germany
- Georg M Klump
- Animal Physiology and Behavior Group, Department for Neuroscience, School for Medicine and Health Sciences, Center of Excellence "Hearing4all," Carl von Ossietzky University Oldenburg, Oldenburg, Germany
- Susann Deike
- Special Lab Non-invasive Brain Imaging, Leibniz Institute for Neurobiology, Magdeburg, Germany
43
Shestopalova L, Bőhm TM, Bendixen A, Andreou AG, Georgiou J, Garreau G, Hajdu B, Denham SL, Winkler I. Do audio-visual motion cues promote segregation of auditory streams? Front Neurosci 2014; 8:64. [PMID: 24778604 PMCID: PMC3985028 DOI: 10.3389/fnins.2014.00064] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/28/2014] [Accepted: 03/19/2014] [Indexed: 11/19/2022] Open
Abstract
An audio-visual experiment using moving sound sources was designed to investigate whether the analysis of auditory scenes is modulated by synchronous presentation of visual information. Listeners were presented with an alternating sequence of two pure tones delivered by two separate sound sources. In different conditions, the two sound sources were either stationary or moving on random trajectories around the listener. Both the sounds and the movement trajectories were derived from recordings in which two humans were moving with loudspeakers attached to their heads. Visualized movement trajectories modeled by a computer animation were presented together with the sounds. In the main experiment, behavioral reports on sound organization were collected from young healthy volunteers. The proportion and stability of the different sound organizations were compared between the conditions in which the visualized trajectories matched the movement of the sound sources and when the two were independent of each other. The results corroborate earlier findings that separation of sound sources in space promotes segregation. However, no additional effect of auditory movement per se on the perceptual organization of sounds was obtained. Surprisingly, the presentation of movement-congruent visual cues did not strengthen the effects of spatial separation on segregating auditory streams. Our findings are consistent with the view that bistability in the auditory modality can occur independently from other modalities.
Affiliation(s)
- Lidia Shestopalova
- Pavlov Institute of Physiology, Russian Academy of Sciences, St. Petersburg, Russia
- Tamás M Bőhm
- Research Centre for Natural Sciences, Institute of Cognitive Neuroscience and Psychology, Hungarian Academy of Sciences, Budapest, Hungary; Department of Telecommunications and Media Informatics, Budapest University of Technology and Economics, Budapest, Hungary
- Alexandra Bendixen
- Auditory Psychophysiology Lab, Department of Psychology, Cluster of Excellence "Hearing4all", European Medical School, Carl von Ossietzky University of Oldenburg, Oldenburg, Germany
- Andreas G Andreou
- Department of Electrical and Computer Engineering, Johns Hopkins University, Baltimore, MD, USA; Department of Electrical and Computer Engineering, University of Cyprus, Nicosia, Cyprus
- Julius Georgiou
- Department of Electrical and Computer Engineering, University of Cyprus, Nicosia, Cyprus
- Guillaume Garreau
- Department of Electrical and Computer Engineering, University of Cyprus, Nicosia, Cyprus
- Botond Hajdu
- Research Centre for Natural Sciences, Institute of Cognitive Neuroscience and Psychology, Hungarian Academy of Sciences, Budapest, Hungary
- Susan L Denham
- School of Psychology, Cognition Institute, University of Plymouth, Plymouth, UK
- István Winkler
- Research Centre for Natural Sciences, Institute of Cognitive Neuroscience and Psychology, Hungarian Academy of Sciences, Budapest, Hungary; Department of Cognitive and Neuropsychology, Institute of Psychology, University of Szeged, Szeged, Hungary
44
Association of auditory steady state responses with perception of temporal modulations and speech in noise. ISRN OTOLARYNGOLOGY 2014; 2014:374035. [PMID: 25006511 PMCID: PMC4009337 DOI: 10.1155/2014/374035] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.8] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Received: 01/19/2014] [Accepted: 03/04/2014] [Indexed: 11/17/2022]
Abstract
Amplitude modulations in speech convey important acoustic information for speech perception. The auditory steady-state response (ASSR) is thought to be a physiological correlate of amplitude modulation perception. Limited research is available exploring the association between ASSR and modulation detection ability or speech perception. The correlation of modulation detection thresholds (MDTs) and speech perception in noise with ASSR was investigated in two experiments. Thirty normal-hearing individuals participated in Experiment 1, and eleven normal-hearing individuals within the age range of 18–24 years participated in Experiment 2. In the first experiment, MDTs were measured using ASSR and a behavioral method at 60 Hz, 80 Hz, and 120 Hz modulation frequencies. The ASSR threshold was obtained by estimating the minimum modulation depth required to elicit an ASSR (ASSR-MDT). There was a positive correlation between behavioral MDT and ASSR-MDT at all modulation frequencies. In the second experiment, the ASSR for amplitude modulation (AM) sweeps in four different frequency ranges (30–40 Hz, 40–50 Hz, 50–60 Hz, and 60–70 Hz) was recorded. The speech recognition threshold in noise (SRTn) was estimated using a staircase procedure. There was a positive correlation between the amplitude of the ASSR for the AM sweep in the 30–40 Hz range and SRTn. The results of the current study suggest that ASSR provides substantial information about temporal modulation perception and speech perception.
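The staircase procedure used here to estimate SRTn is, generically, an adaptive up-down track. The sketch below shows a standard 1-up/2-down rule (which converges on ~70.7% correct); it is a generic illustration, not the study's specific implementation, and the `respond` callback, step size, and reversal count are hypothetical:

```python
def staircase_2down_1up(respond, start=0.0, step=2.0, n_reversals=8):
    """Adaptive 1-up/2-down track; threshold = mean of reversal levels.

    `respond(level)` returns True for a correct response at that level
    (e.g., SNR in dB). Two consecutive correct responses make the task
    harder (level down); one incorrect response makes it easier (level up).
    """
    level, correct_streak, direction = start, 0, 0
    reversals = []
    while len(reversals) < n_reversals:
        if respond(level):
            correct_streak += 1
            if correct_streak == 2:          # two correct -> step down
                correct_streak = 0
                if direction == +1:          # direction changed: reversal
                    reversals.append(level)
                direction = -1
                level -= step
        else:                                # one wrong -> step up
            correct_streak = 0
            if direction == -1:              # direction changed: reversal
                reversals.append(level)
            direction = +1
            level += step
    return sum(reversals) / len(reversals)

# Deterministic toy observer: correct whenever level >= 5.0 dB.
# The track oscillates around 5.0, so the estimate lands there:
est = staircase_2down_1up(lambda level: level >= 5.0, start=10.0, step=2.0)
# est == 5.0
```

Real experiments replace the toy observer with listener responses and typically shrink the step size after the first few reversals.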
45
Attention effects on auditory scene analysis: insights from event-related brain potentials. PSYCHOLOGICAL RESEARCH 2014; 78:361-78. [PMID: 24553776 DOI: 10.1007/s00426-014-0547-7] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/08/2013] [Accepted: 02/06/2014] [Indexed: 10/25/2022]
Abstract
Sounds emitted by different sources arrive at our ears as a mixture that must be disentangled before meaningful information can be retrieved. It is still a matter of debate whether this decomposition happens automatically or requires the listener's attention. These opposite positions partly stem from different methodological approaches to the problem. We propose an integrative approach that combines the logic of previous measurements targeting either auditory stream segregation (interpreting a mixture as coming from two separate sources) or integration (interpreting a mixture as originating from only one source). By means of combined behavioral and event-related potential (ERP) measures, our paradigm has the potential to measure stream segregation and integration at the same time, providing the opportunity to obtain positive evidence of either one. This reduces the reliance on zero findings (i.e., the occurrence of stream integration in a given condition can be demonstrated directly, rather than indirectly based on the absence of empirical evidence for stream segregation, and vice versa). With this two-way approach, we systematically manipulate attention devoted to the auditory stimuli (by varying their task relevance) and to their underlying structure (by delivering perceptual tasks that require segregated or integrated percepts). ERP results based on the mismatch negativity (MMN) show no evidence for a modulation of stream integration by attention, while stream segregation results were less clear due to overlapping attention-related components in the MMN latency range. We suggest future studies combining the proposed two-way approach with some improvements in the ERP measurement of sequential stream segregation.
46
Wang Q, Bao M, Chen L. The role of spatiotemporal and spectral cues in segregating short sound events: evidence from auditory Ternus display. Exp Brain Res 2013; 232:273-82. [PMID: 24141518 DOI: 10.1007/s00221-013-3738-3] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/01/2013] [Accepted: 10/03/2013] [Indexed: 11/30/2022]
Abstract
Previous studies using auditory sequences with rapid repetition of tones revealed that spatiotemporal cues and spectral cues are important for fusing or segregating sound streams. However, the perceptual grouping was partially driven by cognitive processing of the periodicity cues of the long sequence. Here, we investigate whether these perceptual groupings (spatiotemporal grouping vs. frequency grouping) also apply to short auditory sequences, where auditory perceptual organization is mainly subserved by lower levels of perceptual processing. To answer this question, we conducted two experiments using an auditory Ternus display. The display was composed of three speakers (A, B and C), with each speaker consecutively emitting one sound, forming two frames (AB and BC). Experiment 1 manipulated both spatial and temporal factors: three 'within-frame intervals' (WFIs, the intervals between A and B, and between B and C), seven 'inter-frame intervals' (IFIs, the intervals between AB and BC) and two speaker layouts (inter-speaker distance: near or far). Experiment 2 manipulated the frequency difference between the two auditory frames, in addition to the spatiotemporal cues of Experiment 1. Listeners made a two-alternative forced choice (2AFC) to report their percept of a given Ternus display: element motion (auditory apparent motion from sound A to B to C) or group motion (auditory apparent motion from sound 'AB' to 'BC'). The results indicate that the perceptual grouping of short auditory sequences (operationalized by the perceptual decisions on the auditory Ternus display) was modulated by temporal and spectral cues, with the latter contributing more to segregating auditory events. Spatial layout played a lesser role in perceptual organization. These results can be accounted for by the 'peripheral channeling' theory.
Affiliation(s)
- Qingcui Wang
- Key Laboratory of Noise and Vibration Research, Institute of Acoustics, Chinese Academy of Sciences, Beijing, 100190, China
47
Szalárdy O, Winkler I, Schröger E, Widmann A, Bendixen A. Foreground-background discrimination indicated by event-related brain potentials in a new auditory multistability paradigm. Psychophysiology 2013; 50:1239-50. [DOI: 10.1111/psyp.12139] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.2] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/14/2012] [Accepted: 07/15/2013] [Indexed: 11/26/2022]
Affiliation(s)
- Orsolya Szalárdy
- Institute of Cognitive Neuroscience and Psychology, Research Centre for Natural Sciences, Hungarian Academy of Sciences, Budapest, Hungary
- Department of Cognitive Science, Faculty of Natural Sciences, Budapest University of Technology and Economics, Budapest, Hungary
- István Winkler
- Institute of Cognitive Neuroscience and Psychology, Research Centre for Natural Sciences, Hungarian Academy of Sciences, Budapest, Hungary
- Institute of Psychology, University of Szeged, Szeged, Hungary
- Erich Schröger
- Institute of Psychology, University of Leipzig, Leipzig, Germany
- Andreas Widmann
- Institute of Psychology, University of Leipzig, Leipzig, Germany
- Alexandra Bendixen
- Institute of Psychology, University of Leipzig, Leipzig, Germany
- Department of Psychology, Cluster of Excellence "Hearing4all," European Medical School, Carl von Ossietzky University of Oldenburg, Oldenburg, Germany
48
Szalárdy O, Bendixen A, Tóth D, Denham SL, Winkler I. Modulation frequency acts as a primary cue for auditory stream segregation. Learn Percept 2013. [DOI: 10.1556/lp.5.2013.suppl2.9] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.1] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/19/2022]
49
Bendixen A, Bőhm TM, Szalárdy O, Mill R, Denham SL, Winkler I. Different roles of similarity and predictability in auditory stream segregation. Learn Percept 2013. [DOI: 10.1556/lp.5.2013.suppl2.4] [Citation(s) in RCA: 20] [Impact Index Per Article: 1.8] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/19/2022]
50
Sayles M, Füllgrabe C, Winter IM. Neurometric amplitude-modulation detection threshold in the guinea-pig ventral cochlear nucleus. J Physiol 2013; 591:3401-19. [PMID: 23629508 DOI: 10.1113/jphysiol.2013.253062] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.5] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/08/2022] Open
Abstract
Amplitude modulation (AM) is a pervasive feature of natural sounds. Neural detection and processing of modulation cues is behaviourally important across species. Although most ecologically relevant sounds are not fully modulated, physiological studies have usually concentrated on fully modulated (100% modulation depth) signals. Psychoacoustic experiments mainly operate at low modulation depths, around detection threshold (∼5% AM). We presented sinusoidal amplitude-modulated tones, systematically varying modulation depth between zero and 100%, at a range of modulation frequencies, to anaesthetised guinea-pigs while recording spikes from neurons in the ventral cochlear nucleus (VCN). The cochlear nucleus is the site of the first synapse in the central auditory system. At this locus, significant signal processing occurs with respect to the representation of AM signals. Spike trains were analysed in terms of the vector strength of spike synchrony to the amplitude envelope. Neurons showed either low-pass or band-pass temporal modulation transfer functions, with the proportion of band-pass responses increasing with increasing sound level. The proportion of units showing a band-pass response varies with unit type: sustained chopper (CS) > transient chopper (CT) > primary-like (PL). Spike synchrony increased with increasing modulation depth. At the lowest modulation depth (6%), significant spike synchrony was only observed near to the unit's best modulation frequency for all unit types tested. Modulation tuning therefore became sharper with decreasing modulation depth. AM detection threshold was calculated for each individual unit as a function of modulation frequency. Chopper units have significantly better AM detection thresholds than do primary-like units. AM detection threshold is significantly worse at 40 dB vs. 10 dB above pure-tone spike rate threshold. Mean modulation detection thresholds for sounds 10 dB above pure-tone spike rate threshold at best modulation frequency are (95% CI) 11.6% (10.0-13.1) for PL units, 9.8% (8.2-11.5) for CT units, and 10.8% (8.4-13.2) for CS units. The most sensitive guinea-pig VCN single unit AM detection thresholds are similar to human psychophysical performance (∼3% AM), while the mean neurometric thresholds approach whole animal behavioural performance (∼10% AM).
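The vector-strength measure referred to in this abstract has a simple closed form: each spike time t_k is mapped to a phase θ_k = 2π·f_mod·t_k, and vector strength is the length of the mean resultant vector of those phases. A minimal sketch with hypothetical spike trains (not the study's data):

```python
import numpy as np

def vector_strength(spike_times, f_mod):
    """Vector strength of spike synchrony to a modulation frequency.

    1.0 means every spike falls at the same modulation phase (perfect
    phase locking); 0.0 means spikes are uniformly spread over the cycle.
    """
    phases = 2.0 * np.pi * f_mod * np.asarray(spike_times, dtype=float)
    return float(np.abs(np.mean(np.exp(1j * phases))))

# One spike per cycle of a 50 Hz modulation, always at the same phase:
locked = np.arange(100) / 50.0
# 100 spikes whose phases evenly tile the modulation cycle:
uniform = np.arange(100) / (100.0 * 50.0)
```

A per-unit neurometric AM detection threshold can then be defined as the smallest modulation depth at which vector strength is statistically significant, commonly assessed with the Rayleigh test.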
Affiliation(s)
- Mark Sayles
- Department of Otolaryngology - Head and Neck Surgery, Queen's Medical Centre, Nottingham, NG7 2UH, UK.