1. Cocktail party training induces increased speech intelligibility and decreased cortical activity in bilateral inferior frontal gyri. A functional near-infrared study. PLoS One 2022; 17:e0277801. [PMID: 36454948] [PMCID: PMC9714910] [DOI: 10.1371/journal.pone.0277801]
Abstract
The human brain networks responsible for selectively listening to a voice amid other talkers remain to be clarified. The present study aimed to investigate relationships between cortical activity and performance in a speech-in-speech task, before (Experiment I) and after training-induced improvements (Experiment II). In Experiment I, 74 participants performed a speech-in-speech task while their cortical activity was measured using a functional near-infrared spectroscopy (fNIRS) device. One target talker and one masker talker were simultaneously presented at three different target-to-masker ratios (TMRs): adverse, intermediate and favorable. Behavioral results show that performance increased monotonically with TMR in some participants but failed to decrease, or even improved, in the adverse-TMR condition for others. On the neural level, an extensive brain network including the frontal (left prefrontal cortex, right dorsolateral prefrontal cortex and bilateral inferior frontal gyri) and temporal (bilateral auditory cortex) regions was recruited more by the intermediate condition than by the other two. Additionally, bilateral frontal gyri and left auditory cortex activities were found to be positively correlated with behavioral performance in the adverse-TMR condition. In Experiment II, 27 participants, whose performance was the poorest in the adverse-TMR condition of Experiment I, were trained to improve performance in that condition. Results show significant performance improvements along with decreased activity in bilateral inferior frontal gyri, the right dorsolateral prefrontal cortex, the left inferior parietal cortex and the right auditory cortex in the adverse-TMR condition after training. Arguably, lower neural activity reflects more efficient masker inhibition after speech-in-speech training. As speech-in-noise tasks also involve frontal and temporal regions, we suggest that, regardless of the type of masking (speech or noise), the complexity of the task will prompt the involvement of a similar brain network. Furthermore, the initially significant cognitive recruitment will be reduced following training, leading to an economy of cognitive resources.
2. Vocal and semantic cues for the segregation of long concurrent speech stimuli in diotic and dichotic listening-The Long-SWoRD test. J Acoust Soc Am 2022; 151:1557. [PMID: 35364949] [DOI: 10.1121/10.0007225]
Abstract
It is not always easy to follow a conversation in a noisy environment. To distinguish between two speakers, a listener must mobilize many perceptual and cognitive processes to maintain attention on a target voice and avoid shifting attention to the background noise. The development of an intelligibility task with long stimuli, the Long-SWoRD test, is introduced. This protocol allows participants to fully benefit from cognitive resources, such as semantic knowledge, when separating two talkers in a realistic listening environment. Moreover, this task also provides the experimenters with a means to infer fluctuations in auditory selective attention. Two experiments document the performance of normal-hearing listeners in situations where the perceptual separability of the competing voices ranges from easy to hard, using a combination of voice and binaural cues. The results show a strong effect of voice differences when the voices are presented diotically. In addition, analyzing the influence of the semantic context on the pattern of responses indicates that semantic information induces a response bias both in situations where the competing voices are distinguishable and in situations where they are indistinguishable from one another.
3. Behavioral Account of Attended Stream Enhances Neural Tracking. Front Neurosci 2021; 15:674112. [PMID: 34966252] [PMCID: PMC8710602] [DOI: 10.3389/fnins.2021.674112]
Abstract
During the past decade, several studies have identified electroencephalographic (EEG) correlates of selective auditory attention to speech. In these studies, typically, listeners are instructed to focus on one of two concurrent speech streams (the "target"), while ignoring the other (the "masker"). EEG signals are recorded while participants are performing this task, and subsequently analyzed to recover the attended stream. An assumption often made in these studies is that the participant's attention can remain focused on the target throughout the test. To check this assumption, and assess when a participant's attention in a concurrent speech listening task was directed toward the target, the masker, or neither, we designed a behavioral listen-then-recall task (the Long-SWoRD test). After listening to two simultaneous short stories, participants had to identify keywords from the target story, randomly interspersed among words from the masker story and words from neither story, on a computer screen. To modulate task difficulty, and hence the likelihood of attentional switches, masker stories were originally uttered by the same talker as the target stories. The masker voice parameters were then manipulated to parametrically control the similarity of the two streams, from clearly dissimilar to almost identical. While participants listened to the stories, EEG signals were measured and subsequently analyzed using a temporal response function (TRF) model to reconstruct the speech stimuli. Responses in the behavioral recall task were used to infer, retrospectively, when attention was directed toward the target, the masker, or neither. During the model-training phase, the results of these behavioral-data-driven inferences were used as inputs to the model in addition to the EEG signals, to determine if this additional information would improve stimulus reconstruction accuracy, relative to the performance of models trained under the assumption that the listener's attention was unwaveringly focused on the target. Results from 21 participants show that information regarding the actual, as opposed to assumed, attentional focus can be used advantageously during model training, to enhance the subsequent (test-phase) accuracy of auditory stimulus reconstruction based on EEG signals. This is the case especially in challenging listening situations, where the participants' attention is less likely to remain focused entirely on the target talker. In situations where the two competing voices are clearly distinct and easily separated perceptually, the assumption that listeners are able to stay focused on the target is reasonable. The behavioral recall protocol introduced here provides experimenters with a means to behaviorally track fluctuations in auditory selective attention, including in combined behavioral/neurophysiological studies.
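To make the stimulus-reconstruction step described above concrete, the following is a minimal sketch of a backward (decoding) TRF: ridge regression from time-lagged EEG channels to a speech envelope. The data shapes, lag range, and regularization value are illustrative assumptions, not the study's actual pipeline.

```python
# Minimal sketch of a backward TRF (stimulus-reconstruction) decoder.
# Assumed/illustrative: data shapes, lag range, ridge parameter.
import numpy as np

def lagged_design(eeg, max_lag):
    """Stack shifted copies of the EEG so that row t sees samples t .. t+max_lag
    (the neural response lags the stimulus)."""
    cols = [np.roll(eeg, -lag, axis=0) for lag in range(max_lag + 1)]
    X = np.concatenate(cols, axis=1)
    if max_lag:
        X[-max_lag:] = 0.0          # discard wrap-around samples
    return X

def train_decoder(eeg, envelope, max_lag=32, lam=1e3):
    """Closed-form ridge solution: w = (X'X + lam*I)^{-1} X'y."""
    X = lagged_design(eeg, max_lag)
    XtX = X.T @ X + lam * np.eye(X.shape[1])
    return np.linalg.solve(XtX, X.T @ envelope)

def reconstruct(eeg, w, max_lag=32):
    return lagged_design(eeg, max_lag) @ w

# Toy usage: one EEG channel weakly tracks the target envelope.
rng = np.random.default_rng(0)
eeg = rng.standard_normal((5000, 16))           # 5000 samples x 16 channels
target_env = rng.standard_normal(5000)
eeg[:, 0] += 0.5 * target_env
w = train_decoder(eeg, target_env)
print(np.corrcoef(reconstruct(eeg, w), target_env)[0, 1])   # reconstruction accuracy
```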
4. Changes in Heart Rate Variability Following Acoustic Therapy in Individuals With Tinnitus. J Speech Lang Hear Res 2021; 64:1413-1419. [PMID: 33820426] [DOI: 10.1044/2021_jslhr-20-00596]
Abstract
Purpose: The aim of the study was to investigate changes in autonomic function, as measured by heart rate variability, in individuals with tinnitus following acoustic therapy implemented using tinnitus maskers presented via hearing aids.
Method: Twenty-six individuals with tinnitus and hearing impairment completed an 8-week field trial wearing hearing aids providing acoustic therapy via three tinnitus masker options set just below minimum masking level. Tinnitus handicap was measured using the Tinnitus Handicap Inventory at baseline (before starting acoustic therapy) and posttreatment (at end of 8-week trial). Resting heart rate and heart rate variability were measured using electrocardiography at baseline and posttreatment.
Results: There was a significant decrease in tinnitus handicap posttreatment compared to baseline. There was no change in heart rate, but there was a significant increase in heart rate variability posttreatment compared to baseline.
Conclusions: Acoustic therapy using tinnitus maskers delivered via hearing aids provided tinnitus relief and produced a concurrent increase in heart rate variability, suggesting a decrease in stress. Heart rate variability is a potential biomarker for tracking efficacy of acoustic therapy; however, further research is required.
5. Gradual decay and sudden death of short-term memory for pitch. J Acoust Soc Am 2021; 149:259. [PMID: 33514136] [PMCID: PMC7803383] [DOI: 10.1121/10.0002992]
Abstract
The ability to discriminate frequency differences between pure tones declines as the duration of the interstimulus interval (ISI) increases. The conventional explanation for this finding is that pitch representations gradually decay from auditory short-term memory. Gradual decay means that internal noise increases with increasing ISI duration. Another possibility is that pitch representations experience "sudden death," disappearing without a trace from memory. Sudden death means that listeners guess (respond at random) more often when the ISIs are longer. Since internal noise and guessing probabilities influence the shape of psychometric functions in different ways, they can be estimated simultaneously. Eleven amateur musicians performed a two-interval, two-alternative forced-choice frequency-discrimination task. The frequencies of the first tones were roved, and frequency differences and ISI durations were manipulated across trials. Data were analyzed using Bayesian models that simultaneously estimated internal noise and guessing probabilities. On average across listeners, internal noise increased monotonically as a function of increasing ISI duration, suggesting that gradual decay occurred. The guessing rate decreased with an increasing ISI duration between 0.5 and 2 s but then increased with further increases in ISI duration, suggesting that sudden death occurred but perhaps only at longer ISIs. Results are problematic for decay-only models of discrimination and contrast with those from a study on visual short-term memory, which found that over similar durations, visual representations experienced little gradual decay yet substantial sudden death.
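The distinction between gradual decay and sudden death can be made concrete with a simple psychometric model of the kind the Bayesian analysis above estimates; the parameterization below is only a sketch (notation mine), not the paper's exact model.

```latex
% Illustrative 2I-2AFC psychometric model separating the two memory failure modes:
P(\mathrm{correct} \mid \Delta f, \mathrm{ISI})
  \;=\; \tfrac{1}{2}\,\gamma_{\mathrm{ISI}}
  \;+\; \bigl(1 - \gamma_{\mathrm{ISI}}\bigr)\,
        \Phi\!\left(\frac{\Delta f}{\sqrt{2}\,\sigma_{\mathrm{ISI}}}\right)
```

Here σ_ISI is the internal-noise standard deviation (gradual decay flattens the slope as the ISI grows), γ_ISI is the probability that the trace is lost and the listener guesses (sudden death lowers the upper asymptote toward chance), and Φ is the standard normal distribution function; because the two parameters deform the psychometric function in different ways, both can be estimated from the same data.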
6. High-Frequency Sensorineural Hearing Loss Alters Cue-Weighting Strategies for Discriminating Stop Consonants in Noise. Trends Hear 2020; 23:2331216519886707. [PMID: 31722636] [PMCID: PMC6856982] [DOI: 10.1177/2331216519886707]
Abstract
There is increasing evidence that hearing-impaired (HI) individuals do not use the same listening strategies as normal-hearing (NH) individuals, even when wearing optimally fitted hearing aids. From this perspective, better characterization of individual perceptual strategies is an important step toward designing more effective speech-processing algorithms. Here, we describe two complementary approaches for (a) revealing the acoustic cues used by a participant in a /d/-/g/ categorization task in noise and (b) measuring the relative contributions of these cues to the decision. These two approaches involve natural speech recordings altered by the addition of a “bump noise.” The bumps were narrowband bursts of noise localized on the spectrotemporal locations of the acoustic cues, allowing the experimenter to manipulate the consonant percept. The cue-weighting strategies were estimated for three groups of participants: 17 NH listeners, 18 HI listeners with high-frequency loss, and 15 HI listeners with flat loss. HI participants were provided with individual frequency-dependent amplification to compensate for their hearing loss. Although all listeners relied more heavily on the high-frequency cue than on the low-frequency cue, considerable variability was observed in the individual weights, mostly explained by differences in internal noise. Individuals with high-frequency loss relied slightly less heavily on the high-frequency cue relative to the low-frequency cue, compared with NH individuals, suggesting a possible influence of supra-threshold deficits on cue-weighting strategies. Altogether, these results suggest a need for individually tailored speech-in-noise processing in hearing aids, if more effective speech discrimination in noise is to be achieved.
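The second step above, estimating relative cue weights from trial-by-trial responses, is essentially a reverse-correlation regression. The sketch below illustrates the idea with a simulated listener; the variable names, cue structure, and use of logistic regression are my assumptions, not the authors' exact procedure.

```python
# Schematic cue-weight estimation: regress trial-by-trial /g/ responses onto the
# signed perturbation applied to each cue region (illustrative simulation only).
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(1)
n_trials = 2000
low_cue = rng.normal(size=n_trials)     # perturbation of the low-frequency cue
high_cue = rng.normal(size=n_trials)    # perturbation of the high-frequency cue

# Simulated listener: relies 3x more on the high-frequency cue, with internal noise
decision_var = 1.0 * low_cue + 3.0 * high_cue + rng.normal(scale=1.5, size=n_trials)
resp_g = (decision_var > 0).astype(int)

model = LogisticRegression().fit(np.column_stack([low_cue, high_cue]), resp_g)
w = model.coef_.ravel()
print("relative weights (low, high):", w / np.abs(w).sum())
```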
7. On the utility of perceptual anchors during pure-tone frequency discrimination. J Acoust Soc Am 2020; 147:371. [PMID: 32006971] [PMCID: PMC7043863] [DOI: 10.1121/10.0000584]
Abstract
Perceptual anchors are representations of stimulus features stored in long-term memory rather than short-term memory. The present study investigated whether listeners use perceptual anchors to improve pure-tone frequency discrimination. Ten amateur musicians performed a two-interval, two-alternative forced-choice frequency-discrimination experiment. In one half of the experiment, the frequency of the first tone was fixed across trials, and in the other half, the frequency of the first tone was roved widely across trials. The durations of the interstimulus intervals (ISIs) and the frequency differences between the tones on each trial were also manipulated. The data were analyzed with a Bayesian model that assumed that performance was limited by sensory noise (related to the initial encoding of the stimuli), memory noise (which increased proportionally to the ISI), fluctuations in attention, and response bias. It was hypothesized that memory-noise variance increased more rapidly during roved-frequency discrimination than fixed-frequency discrimination because listeners used perceptual anchors in the latter condition. The results supported this hypothesis. The results also suggested that listeners experienced more lapses in attention during roved-frequency discrimination.
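A compact way to state the hypothesis tested above is as a variance decomposition in which memory noise grows with ISI at a condition-dependent rate; the notation below is mine and only schematic.

```latex
% Schematic decomposition (notation mine, not the paper's exact parameterization):
\sigma^2_{\mathrm{total}}(\mathrm{ISI})
  \;=\; \sigma^2_{\mathrm{sensory}} \;+\; k_{\mathrm{cond}}\cdot\mathrm{ISI},
\qquad k_{\mathrm{roved}} \;>\; k_{\mathrm{fixed}}
```

The perceptual-anchor account corresponds to the smaller growth rate k_fixed: a stable long-term reference reduces reliance on the decaying short-term trace, while lapses in attention and response bias enter the model as additional parameters that mix chance responding into, and shift, the psychometric function, respectively.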
8. Tracking the dynamic representation of consonants from auditory periphery to cortex. J Acoust Soc Am 2018; 144:2462. [PMID: 30404465] [DOI: 10.1121/1.5065492]
Abstract
In order to perceive meaningful speech, the auditory system must recognize different phonemes amidst a noisy and variable acoustic signal. To better understand the processing mechanisms underlying this ability, evoked cortical responses to different spoken consonants were measured with electroencephalography (EEG). Using multivariate pattern analysis (MVPA), binary classifiers attempted to discriminate between the EEG activity evoked by two given consonants at each peri-stimulus time sample, providing a dynamic measure of their cortical dissimilarity. To examine the relationship between representations at the auditory periphery and cortex, MVPA was also applied to modelled auditory-nerve (AN) responses to consonants, and time-evolving AN-based and EEG-based dissimilarities were compared with one another. Cortical dissimilarities between consonants were commensurate with their articulatory distinctions, particularly their manner of articulation, and to a lesser extent, their voicing. Furthermore, cortical distinctions between consonants in two periods of activity, centered at 130 and 400 ms after onset, aligned with their peripheral dissimilarities in distinct onset and post-onset periods, respectively. By relating speech representations across articulatory, peripheral, and cortical domains, these results advance our understanding of the crucial transformations in the auditory pathway that underlie the ability to perceive speech.
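The time-resolved classification step described above can be sketched as follows; the data shapes, injected effect, and choice of classifier are illustrative assumptions, not the study's actual analysis settings.

```python
# Minimal sketch of time-resolved MVPA: at each peri-stimulus sample, cross-validate
# a binary classifier that discriminates the EEG topographies evoked by two consonants.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(2)
n_trials, n_channels, n_times = 120, 32, 200
eeg = rng.standard_normal((n_trials, n_channels, n_times))
labels = np.repeat([0, 1], n_trials // 2)           # e.g., /d/ vs /g/ trials
eeg[labels == 1, :, 80:120] += 0.4                  # inject a decodable difference

accuracy = np.empty(n_times)
for t in range(n_times):
    accuracy[t] = cross_val_score(LogisticRegression(max_iter=1000),
                                  eeg[:, :, t], labels, cv=5).mean()
# 'accuracy' traces the cortical dissimilarity between the two consonants over time
print(accuracy.max(), accuracy.argmax())
```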
9. Effect of stimulus type and pitch salience on pitch-sequence processing. J Acoust Soc Am 2018; 143:3665. [PMID: 29960504] [DOI: 10.1121/1.5043405]
Abstract
Using a same-different discrimination task, it has been shown that discrimination performance for sequences of complex tones varying just detectably in pitch is less dependent on sequence length (1, 2, or 4 elements) when the tones contain resolved harmonics than when they do not [Cousineau, Demany, and Pressnitzer (2009). J. Acoust. Soc. Am. 126, 3179-3187]. This effect had been attributed to the activation of automatic frequency-shift detectors (FSDs) by the shifts in resolved harmonics. The present study provides evidence against this hypothesis by showing that the sequence-processing advantage found for complex tones with resolved harmonics is not found for pure tones or other sounds supposed to activate FSDs (narrow bands of noise and wide-band noises eliciting pitch sensations due to interaural phase shifts). The present results also indicate that for pitch sequences, processing performance is largely unrelated to pitch salience per se: for a fixed level of discriminability between sequence elements, sequences of elements with salient pitches are not necessarily better processed than sequences of elements with less salient pitches. An ideal-observer model for the same-different binary-sequence discrimination task is also developed in the present study. The model allows the computation of d' for this task using numerical methods.
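For readers unfamiliar with how d' can be computed numerically for a same-different task on multi-element sequences, here is a generic Monte Carlo construction under assumed Gaussian internal noise; it illustrates the approach but is not necessarily the specific model developed in the paper.

```python
# Generic Monte Carlo ideal observer for a same-different task on n-element sequences
# (Gaussian internal noise assumed; illustrative only).
import numpy as np
from scipy.stats import norm

def ideal_percent_correct(d_prime, n_elements, n_trials=20000, seed=0):
    rng = np.random.default_rng(seed)
    s = np.sqrt(2.0)     # SD of the difference between two unit-variance observations
    def log_lr(x):
        # log likelihood ratio of "different" (elements shifted by +/- d_prime at random)
        # versus "same" (no shift), summed over the elements of a sequence pair
        p_diff = 0.5 * (norm.pdf(x, d_prime, s) + norm.pdf(x, -d_prime, s))
        return np.log(p_diff / norm.pdf(x, 0.0, s)).sum(axis=1)
    same = rng.normal(0.0, s, (n_trials, n_elements))
    signs = rng.choice([-1.0, 1.0], size=(n_trials, n_elements))
    different = rng.normal(signs * d_prime, s)
    hits = (log_lr(different) > 0).mean()
    correct_rejections = (log_lr(same) <= 0).mean()
    return 0.5 * (hits + correct_rejections)

# Longer sequences help the ideal observer when each element is only just discriminable
print(ideal_percent_correct(1.0, 1), ideal_percent_correct(1.0, 4))
```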
10. Continued search for better prediction of aided speech understanding in multi-talker environments. J Acoust Soc Am 2017; 142:2386. [PMID: 29092591] [DOI: 10.1121/1.5008498]
Abstract
To better understand issues of hearing-aid benefit during natural listening, this study examined the added demand placed by the goal of understanding speech over the more typically studied goal of simply recognizing speech sounds. The study compared hearing-aid benefit in two conditions, and examined factors that might account for the observed benefits. In the phonetic condition, listeners needed only identify the correct sound to make a correct response. In the semantic condition, listeners had to understand what they had heard to respond correctly, because the answer did not include any keywords from the spoken speech. Hearing aids provided significant benefit for listeners in the phonetic condition. In the semantic condition, on the other hand, there were large inter-individual differences, with many listeners not experiencing any benefit from aiding. Neither a set of cognitive and linguistic tests, nor age, could explain this variability. Furthermore, analysis of psychometric functions showed that enhancement of target-speech fidelity through improvement of the signal-to-noise ratio had a larger impact on listeners' performance in the phonetic condition than in the semantic condition. These results demonstrate the importance of incorporating naturalistic elements in the simulation of multi-talker listening when assessing the benefits of intervention for communication success.
11. Frogs Exploit Statistical Regularities in Noisy Acoustic Scenes to Solve Cocktail-Party-like Problems. Curr Biol 2017; 27:743-750. [PMID: 28238657] [DOI: 10.1016/j.cub.2017.01.031]
Abstract
Noise is a ubiquitous source of errors in all forms of communication [1]. Noise-induced errors in speech communication, for example, make it difficult for humans to converse in noisy social settings, a challenge aptly named the "cocktail party problem" [2]. Many nonhuman animals also communicate acoustically in noisy social groups and thus face biologically analogous problems [3]. However, we know little about how the perceptual systems of receivers are evolutionarily adapted to avoid the costs of noise-induced errors in communication. In this study of Cope's gray treefrog (Hyla chrysoscelis; Hylidae), we investigated whether receivers exploit a potential statistical regularity present in noisy acoustic scenes to reduce errors in signal recognition and discrimination. We developed an anatomical/physiological model of the peripheral auditory system to show that temporal correlation in amplitude fluctuations across the frequency spectrum ("comodulation") [4-6] is a feature of the noise generated by large breeding choruses of sexually advertising males. In four psychophysical experiments, we investigated whether females exploit comodulation in background noise to mitigate noise-induced errors in evolutionarily critical mate-choice decisions. Subjects experienced fewer errors in recognizing conspecific calls and in selecting the calls of high-quality mates in the presence of simulated chorus noise that was comodulated. These data show unequivocally, and for the first time, that exploiting statistical regularities present in noisy acoustic scenes is an important biological strategy for solving cocktail-party-like problems in nonhuman animal communication.
12. Binaural Diplacusis and Its Relationship with Hearing-Threshold Asymmetry. PLoS One 2016; 11:e0159975. [PMID: 27536884] [PMCID: PMC4990190] [DOI: 10.1371/journal.pone.0159975]
Abstract
Binaural pitch diplacusis refers to a perceptual anomaly whereby the same sound is perceived as having a different pitch depending on whether it is presented in the left or the right ear. Results in the literature suggest that this phenomenon is more prevalent, and larger, in individuals with asymmetric hearing loss than in individuals with symmetric hearing. However, because studies devoted to this effect have thus far involved small samples, the prevalence of the effect, and its relationship with interaural asymmetries in hearing thresholds, remain unclear. In this study, psychometric functions for interaural pitch comparisons were measured in 55 subjects, including 12 normal-hearing and 43 hearing-impaired participants. Statistically significant pitch differences between the left and right ears were observed in normal-hearing participants, but the effect was usually small (less than 1.5/16 octave, or about 7%). For the hearing-impaired participants, statistically significant interaural pitch differences were found in about three-quarters of the cases. Moreover, for about half of these participants, the difference exceeded 1.5/16 octaves and, in some participants, was as large as or larger than 1/4 octave. This was the case even for the lowest frequency tested, 500 Hz. The pitch differences were weakly, but significantly, correlated with the difference in hearing thresholds between the two ears, such that larger threshold asymmetries were statistically associated with larger pitch differences. For the vast majority of the hearing-impaired participants, the direction of the pitch differences was such that pitch was perceived as higher on the side with the higher (i.e., ‘worse’) hearing thresholds than on the opposite side. These findings are difficult to reconcile with purely temporal models of pitch perception, but may be accounted for by place-based or spectrotemporal models.
13. Intelligent hearing aids: the next revolution. Annu Int Conf IEEE Eng Med Biol Soc 2016; 2016:72-76. [PMID: 28268284] [DOI: 10.1109/embc.2016.7590643]
Abstract
The first revolution in hearing aids came from nonlinear amplification, which allows better compensation for both soft and loud sounds. The second revolution stemmed from the introduction of digital signal processing, which allows better programmability and more sophisticated algorithms. The third revolution in hearing aids is wireless technology, which allows seamless connectivity between a pair of hearing aids and with more and more external devices. Each revolution has fundamentally transformed hearing aids and pushed the entire industry forward significantly. Machine learning has received significant attention in recent years and has been applied in many other industries, e.g., robotics, speech recognition, genetics, and crowdsourcing. We argue that the next revolution in hearing aids is machine intelligence. In fact, this revolution is already quietly happening. We review developments in at least three major areas: applications of machine learning in speech enhancement; applications of machine learning in individualization and customization of signal processing algorithms; and applications of machine learning in improving the efficiency and effectiveness of clinical tests. With the advent of the internet of things, these developments will accelerate. This revolution will bring patient satisfaction to a level never seen before.
14.
Abstract
The question of what makes a good melody has interested composers, music theorists, and psychologists alike. Many of the observed principles of good "melodic continuation" involve melodic contour, the pattern of rising and falling pitch within a sequence. Previous work has shown that contour perception can extend beyond pitch to other auditory dimensions, such as brightness and loudness. Here, we show that the generalization of contour perception to nontraditional dimensions also extends to melodic expectations. In the first experiment, subjective ratings for 3-tone sequences that vary in brightness or loudness conformed to the same general contour-based expectations as pitch sequences. In the second experiment, we modified the sequence of melody presentation such that melodies with the same beginning were blocked together. This change produced substantively different results, but the patterns of ratings remained similar across the 3 auditory dimensions. Taken together, these results suggest that (a) certain well-known principles of melodic expectation (such as the expectation for a reversal following a skip) are dependent on long-term context, and (b) these expectations are not unique to the dimension of pitch and may instead reflect more general principles of perceptual organization.
15. Neural representation of concurrent harmonic sounds in monkey primary auditory cortex: implications for models of auditory scene analysis. J Neurosci 2014; 34:12425-43. [PMID: 25209282] [PMCID: PMC4160777] [DOI: 10.1523/jneurosci.0025-14.2014]
Abstract
The ability to attend to a particular sound in a noisy environment is an essential aspect of hearing. To accomplish this feat, the auditory system must segregate sounds that overlap in frequency and time. Many natural sounds, such as human voices, consist of harmonics of a common fundamental frequency (F0). Such harmonic complex tones (HCTs) evoke a pitch corresponding to their F0. A difference in pitch between simultaneous HCTs provides a powerful cue for their segregation. The neural mechanisms underlying concurrent sound segregation based on pitch differences are poorly understood. Here, we examined neural responses in monkey primary auditory cortex (A1) to two concurrent HCTs that differed in F0 such that they are heard as two separate "auditory objects" with distinct pitches. We found that A1 can resolve, via a rate-place code, the lower harmonics of both HCTs, a prerequisite for deriving their pitches and for their perceptual segregation. Onset asynchrony between the HCTs enhanced the neural representation of their harmonics, paralleling their improved perceptual segregation in humans. Pitches of the concurrent HCTs could also be temporally represented by neuronal phase-locking at their respective F0s. Furthermore, a model of A1 responses using harmonic templates could qualitatively reproduce psychophysical data on concurrent sound segregation in humans. Finally, we identified a possible intracortical homolog of the "object-related negativity" recorded noninvasively in humans, which correlates with the perceptual segregation of concurrent sounds. Findings indicate that A1 contains sufficient spectral and temporal information for segregating concurrent sounds based on differences in pitch.
16. A demonstration of improved precision of word recognition scores. J Speech Lang Hear Res 2014; 57:543-555. [PMID: 24686502] [DOI: 10.1044/2014_jslhr-h-13-0017]
Abstract
Purpose: The purpose of this study was to demonstrate improved precision of word recognition scores (WRSs) by increasing list length and analyzing phonemic errors.
Method: Pure-tone thresholds (frequencies between 0.25 and 8.0 kHz) and WRSs were measured in 3 levels of speech-shaped noise (50, 52, and 54 dB HL) for 24 listeners with normal hearing. WRSs were obtained for half-lists and full lists of Northwestern University Test No. 6 (Tillman & Carhart, 1966) words presented at 48 dB HL. A resampling procedure was used to derive dimensionless effect sizes for identifying a change in hearing using the data. This allowed the direct comparison of the magnitude of shifts in WRS (%) and in the average pure-tone threshold (dB), which provided a context for interpreting the WRS.
Results: WRSs based on a 50-word list analyzed by the percentage of correct phonemes were significantly more sensitive for identifying a change in hearing than the WRSs based on 25-word lists analyzed by percentage of correct words.
Conclusion: Increasing the number of items that contribute to a WRS significantly increased the test's ability to identify a change in hearing. Clinical and research applications could potentially benefit from a more precise word recognition test, the only basic audiologic measure that estimates directly the distortion component of hearing loss and its effect on communication.
17. Auditory frequency and intensity discrimination explained using a cortical population rate code. PLoS Comput Biol 2013; 9:e1003336. [PMID: 24244142] [PMCID: PMC3828126] [DOI: 10.1371/journal.pcbi.1003336]
Abstract
The nature of the neural codes for pitch and loudness, two basic auditory attributes, has been a key question in neuroscience for over a century. A currently widespread view is that sound intensity (subjectively, loudness) is encoded in spike rates, whereas sound frequency (subjectively, pitch) is encoded in precise spike timing. Here, using information-theoretic analyses, we show that the spike rates of a population of virtual neural units with frequency-tuning and spike-count correlation characteristics similar to those measured in the primary auditory cortex of primates contain sufficient statistical information to account for the smallest frequency-discrimination thresholds measured in human listeners. The same population, and the same spike-rate code, can also account for the intensity-discrimination thresholds of humans. These results demonstrate the viability of a unified rate-based cortical population code for both sound frequency (pitch) and sound intensity (loudness), and thus suggest a resolution to a long-standing puzzle in auditory neuroscience.
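A back-of-envelope version of the kind of calculation described above is sketched below: the frequency-discrimination threshold implied by the Fisher information of a rate code across a population of frequency-tuned units. All parameter values are illustrative, and the spike-count correlations modeled in the paper are omitted here for brevity.

```python
# Threshold implied by the Fisher information of a population rate code
# (independent Poisson units with Gaussian log-frequency tuning; values illustrative).
import numpy as np

def rate_code_threshold(f0=1000.0, n_units=500, sigma_oct=0.5, peak_rate=50.0, dur=0.1):
    rng = np.random.default_rng(3)
    best_freqs = 2.0 ** rng.uniform(np.log2(200.0), np.log2(5000.0), n_units)
    x = np.log2(f0 / best_freqs)                       # distance from best frequency, octaves
    rates = peak_rate * np.exp(-0.5 * (x / sigma_oct) ** 2)
    counts = dur * rates                               # expected spike counts
    d_counts = counts * (-x / sigma_oct ** 2)          # d(count)/d(log2 f0)
    fisher = np.sum(d_counts ** 2 / np.maximum(counts, 1e-9))   # Poisson Fisher information
    delta_log2f = 1.0 / np.sqrt(fisher)                # log-frequency step giving d' = 1
    return f0 * (2.0 ** delta_log2f - 1.0)             # just-discriminable difference in Hz

print(rate_code_threshold())
```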
18. Effects of sensorineural hearing loss on temporal coding of harmonic and inharmonic tone complexes in the auditory nerve. Adv Exp Med Biol 2013; 787:109-18. [PMID: 23716215] [DOI: 10.1007/978-1-4614-1590-9_13]
Abstract
Listeners with sensorineural hearing loss (SNHL) often show poorer thresholds for fundamental-frequency (F0) discrimination and poorer discrimination between harmonic and frequency-shifted (inharmonic) complex tones than normal-hearing (NH) listeners, especially when these tones contain resolved or partially resolved components. It has been suggested that these perceptual deficits reflect reduced access to temporal-fine-structure (TFS) information and could be due to degraded phase locking in the auditory nerve (AN) with SNHL. In the present study, TFS and temporal-envelope (ENV) cues in single AN-fiber responses to band-pass-filtered harmonic and inharmonic complex tones were measured in chinchillas with either normal hearing or noise-induced SNHL. The stimuli were comparable to those used in recent psychophysical studies of F0 and harmonic/inharmonic discrimination. As in those studies, the rank of the center component was manipulated to produce different resolvability conditions, different phase relationships (cosine and random phase) were tested, and background noise was present. Neural TFS and ENV cues were quantified using cross-correlation coefficients computed using shuffled cross-correlograms between neural responses to REF (harmonic) and TEST (F0- or frequency-shifted) stimuli. In animals with SNHL, AN-fiber tuning curves showed elevated thresholds, broadened tuning, best-frequency shifts, and downward shifts in the dominant TFS response component; however, no significant degradation in the ability of AN fibers to encode TFS or ENV cues was found. Consistent with optimal-observer analyses, the results indicate that TFS and ENV cues depended only on the relevant frequency shift in Hz and thus were not degraded because phase locking remained intact. These results suggest that perceptual "TFS-processing" deficits do not simply reflect degraded phase locking at the level of the AN. To the extent that performance in F0- and harmonic/inharmonic discrimination tasks depends on TFS cues, it is likely through a more complicated (suboptimal) decoding mechanism, which may involve "spatiotemporal" (place-time) neural representations.
19. Illusory auditory continuity despite neural evidence to the contrary. Adv Exp Med Biol 2013; 787:483-9. [PMID: 23716255] [DOI: 10.1007/978-1-4614-1590-9_53]
Abstract
Many previous studies have shown that a tone that is momentarily interrupted can be perceived as continuous if the interruption is completely masked by noise. It has been suggested that this "continuity illusion" occurs only when peripheral neural responses contain no evidence that the signal was interrupted. In this study, we used a combination of psychophysical measures and computational simulations of peripheral auditory responses to examine whether the continuity illusion can be experienced under conditions where peripheral neural responses contain evidence that the signal did not continue through the masker. Our results provide an example of a salient continuity illusion despite evidence of an interruption in the peripheral representation, indicating that the illusion may depend more on global features of the interrupting sound, such as its long-term specific loudness, than on its fine-grained temporal structure.
20. Perception of across-frequency asynchrony by listeners with cochlear hearing loss. J Assoc Res Otolaryngol 2013; 14:573-89. [PMID: 23612740] [DOI: 10.1007/s10162-013-0387-y]
Abstract
Cochlear hearing loss is often associated with broader tuning of the cochlear filters. Cochlear response latencies are dependent on the filter bandwidths, so hearing loss may affect the relationship between latencies across different characteristic frequencies. This prediction was tested by investigating the perception of synchrony between two tones exciting different regions of the cochlea in listeners with hearing loss. Subjective judgments of synchrony were compared with thresholds for asynchrony discrimination in a three-alternative forced-choice task. In contrast to earlier data from normal-hearing (NH) listeners, the synchronous-response functions obtained from the hearing-impaired (HI) listeners differed in patterns of symmetry and often had a very low peak (i.e., maximum proportion of "synchronous" responses). Also in contrast to data from NH listeners, the quantitative and qualitative correspondence between the data from the subjective and the forced-choice tasks was often poor. The results do not provide strong evidence for the influence of changes in cochlear mechanics on the perception of synchrony in HI listeners, and it remains possible that age, independent of hearing loss, plays an important role in temporal synchrony and asynchrony perception.
21.
Abstract
Sound sequences, such as music, are usually organized perceptually into concurrent "streams." The mechanisms underlying this "auditory streaming" phenomenon are not completely known. The present study sought to test the hypothesis that synchrony limits listeners' ability to separate sound streams. To test this hypothesis, both perceptual-organization judgments and performance measures were used. In Experiment 1, listeners indicated whether they perceived sequences of alternating or synchronous tones as a single stream or as two streams. In Experiments 2 and 3, listeners detected rare changes in the intensity of "target" tones at one frequency in the presence of synchronous or asynchronous random-intensity "distractor" tones at another frequency. The results of these experiments showed that, for large frequency separations between the tones, the probability of perceiving two streams was lower on average for synchronous than for alternating tones, and that sensitivity to intensity changes in the target sequence was greater for asynchronous than for synchronous distractors. Overall, these results are consistent with the hypothesis that synchrony limits listeners' ability to form separate streams and/or to attend selectively to certain sounds in the presence of other sounds, even when the target and distractor sounds are well separated from each other in frequency.
22. Temporal coherence versus harmonicity in auditory stream formation. J Acoust Soc Am 2013; 133:EL188-EL194. [PMID: 23464127] [PMCID: PMC3579859] [DOI: 10.1121/1.4789866]
Abstract
This study sought to investigate the influence of temporal incoherence and inharmonicity on concurrent stream segregation, using performance-based measures. Subjects discriminated frequency shifts in a temporally regular sequence of target pure tones, embedded in a constant or randomly varying multi-tone background. Depending on the condition tested, the target tones were either temporally coherent or incoherent with, and either harmonically or inharmonically related to, the background tones. The results provide further evidence that temporal incoherence facilitates stream segregation and they suggest that deviations from harmonicity can cause similar facilitation effects, even when the targets and the maskers are temporally coherent.
23. Effects of temporal stimulus properties on the perception of across-frequency asynchrony. J Acoust Soc Am 2013; 133:982-997. [PMID: 23363115] [PMCID: PMC3574076] [DOI: 10.1121/1.4773350]
Abstract
The role of temporal stimulus parameters in the perception of across-frequency synchrony and asynchrony was investigated using pairs of 500-ms tones consisting of a 250-Hz tone and a tone with a higher frequency of 1, 2, 4, or 6 kHz. Subjective judgments suggested veridical perception of across-frequency synchrony but with greater sensitivity to changes in asynchrony for pairs in which the lower-frequency tone was leading than for pairs in which it was lagging. Consistent with the subjective judgments, thresholds for the detection of asynchrony measured in a three-alternative forced-choice task were lower when the signal interval contained a pair with the low-frequency tone leading than a pair with a high-frequency tone leading. A similar asymmetry was observed for asynchrony discrimination when the standard asynchrony was relatively small (≤20 ms) but not for larger standard asynchronies. Independent manipulation of onset and offset ramp durations indicated a dominant role of onsets in the perception of across-frequency asynchrony. A physiologically inspired model, involving broadly tuned monaural coincidence detectors that receive inputs from frequency-selective onset detectors, was able to accurately reproduce the asymmetric distributions of synchrony judgments. The model provides testable predictions for future physiological investigations of responses to broadband stimuli with across-frequency delays.
24. Temporal coherence and the streaming of complex sounds. Adv Exp Med Biol 2013; 787:535-43. [PMID: 23716261] [DOI: 10.1007/978-1-4614-1590-9_59]
Abstract
Humans and other animals can attend to one of multiple sounds, and follow it selectively over time. The neural underpinnings of this perceptual feat remain mysterious. Some studies have concluded that sounds are heard as separate streams when they activate well-separated populations of central auditory neurons, and that this process is largely pre-attentive. Here, we propose instead that stream formation depends primarily on temporal coherence between responses that encode various features of a sound source. Furthermore, we postulate that only when attention is directed toward a particular feature (e.g., pitch or location) do all other temporally coherent features of that source (e.g., timbre and location) become bound together as a stream that is segregated from the incoherent features of other sources. Experimental neurophysiological evidence in support of this hypothesis will be presented. The focus, however, will be on a computational realization of this idea and a discussion of the insights learned from simulations to disentangle complex sound sources such as speech and music. The model consists of a representational stage of early and cortical auditory processing that creates a multidimensional depiction of various sound attributes such as pitch, location, and spectral resolution. The following stage computes a coherence matrix that summarizes the pair-wise correlations between all channels making up the cortical representation. Finally, the perceived segregated streams are extracted by decomposing the coherence matrix into its uncorrelated components. Questions raised by the model are discussed, especially on the role of attention in streaming and the search for further neural correlates of streaming percepts.
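A toy illustration of the coherence-matrix idea follows: channels whose slow envelopes fluctuate together end up loading on the same component when the channel-by-channel coherence matrix is decomposed. The simulation details (two sources, eight channels, a plain eigendecomposition) are my simplifications, not the published model.

```python
# Toy temporal-coherence demo: channels driven by the same source envelope group
# together in the decomposition of the coherence (correlation) matrix.
import numpy as np

rng = np.random.default_rng(4)
n_samples, n_channels = 4000, 8

def slow_envelope(n):
    kernel = np.hanning(101)
    kernel /= kernel.sum()
    env = np.convolve(rng.standard_normal(n), kernel, mode="same")
    return env / env.std()

source_a, source_b = slow_envelope(n_samples), slow_envelope(n_samples)
channels = np.empty((n_channels, n_samples))
channels[:4] = source_a + 0.5 * rng.standard_normal((4, n_samples))         # source A
channels[4:] = 0.8 * source_b + 0.5 * rng.standard_normal((4, n_samples))   # source B

coherence = np.corrcoef(channels)                  # pairwise temporal coherence
eigvals, eigvecs = np.linalg.eigh(coherence)       # uncorrelated components
labels = np.argmax(np.abs(eigvecs[:, -2:]), axis=1)
print(labels)    # channels sharing an envelope fall into the same "stream" component
```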
25. Measuring decision weights in recognition experiments with multiple response alternatives: comparing the correlation and multinomial-logistic-regression methods. J Acoust Soc Am 2012; 132:3418-3427. [PMID: 23145622] [PMCID: PMC3505214] [DOI: 10.1121/1.4754523]
Abstract
Psychophysical "reverse-correlation" methods allow researchers to gain insight into the perceptual representations and decision weighting strategies of individual subjects in perceptual tasks. Although these methods have gained momentum, until recently their development was limited to experiments involving only two response categories. Recently, two approaches for estimating decision weights in m-alternative experiments have been put forward. One approach extends the two-category correlation method to m > 2 alternatives; the second uses multinomial logistic regression (MLR). In this article, the relative merits of the two methods are discussed, and the issues of convergence and statistical efficiency of the methods are evaluated quantitatively using Monte Carlo simulations. The results indicate that, for a range of values of the number of trials, the estimated weighting patterns are closer to their asymptotic values for the correlation method than for the MLR method. Moreover, for the MLR method, weight estimates for different stimulus components can exhibit strong correlations, making the analysis and interpretation of measured weighting patterns less straightforward than for the correlation method. These and other advantages of the correlation method, which include computational simplicity and a close relationship to other well-established psychophysical reverse-correlation methods, make it an attractive tool to uncover decision strategies in m-alternative experiments.
26. Auditory discrimination of frequency ratios: the octave singularity. J Exp Psychol Hum Percept Perform 2012; 39:788-801. [PMID: 23088507] [DOI: 10.1037/a0030095]
Abstract
Sensitivity to frequency ratios is essential for the perceptual processing of complex sounds and the appreciation of music. This study assessed the effect of ratio simplicity on ratio discrimination for pure tones presented either simultaneously or sequentially. Each stimulus consisted of four 100-ms pure tones, equally spaced in terms of frequency ratio and presented at a low intensity to limit interactions in the auditory periphery. Listeners had to discriminate between a reference frequency ratio of 0.97 octave (about 1.96:1) and target frequency ratios, which were larger than the reference. In the simultaneous condition, the obtained psychometric functions were nonmonotonic: as the target frequency ratio increased from 0.98 octave to 1.04 octaves, discrimination performance initially increased, then decreased, and then increased again; performance was better when the target was exactly one octave (2:1) than when the target was slightly larger. In the sequential condition, by contrast, the psychometric functions were monotonic and there was no effect of frequency ratio simplicity. A control experiment verified that the non-monotonicity observed in the simultaneous condition did not originate from peripheral interactions between the tones. Our results indicate that simultaneous octaves are recognized as "special" frequency intervals by a mechanism that is insensitive to the sign (positive or negative) of deviations from the octave, whereas this is apparently not the case for sequential octaves.
27. Separating the contributions of primary and unwanted cues in psychophysical studies. Psychol Rev 2012; 119:770-88. [PMID: 22844984] [DOI: 10.1037/a0029343]
Abstract
A fundamental issue in the design and the interpretation of experimental studies of perception relates to the question of whether the participants in these experiments could perform the perceptual task assigned to them using another feature, or cue, than that intended by the experimenter. An approach frequently used by auditory- and visual-perception researchers to guard against this possibility involves applying random variations to the stimuli across presentations or trials so as to make the "unwanted" cue unreliable for the participants. However, the theoretical basis of this widespread practice is not well developed. In this article, we describe a 2-channel model based on general principles of psychophysical signal detection theory, which can be used to assess the respective contributions of the unwanted cue and of the primary cue to performance or thresholds measured in perceptual discrimination experiments involving stimulus randomization. Example applications of the model to the analysis of results obtained in representative studies from the auditory- and visual-perception literature are provided. In several cases, the results of the model-based analyses indicate that the effectiveness of the randomization procedure was less than originally assumed by the authors of these studies. These findings underscore the importance of quantifying the potential influence of unwanted cues on the results of psychophysical experiments, even when stimulus randomization is used.
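One simple way to read such a two-channel signal-detection model, under the assumption of independent Gaussian channels combined optimally (my notation; the article's model may be parameterized differently), is:

```latex
d'_{\mathrm{observed}} \;=\; \sqrt{\,d'^{\,2}_{\mathrm{primary}} \;+\; d'^{\,2}_{\mathrm{unwanted}}\,}
```

On this reading, measured performance overstates sensitivity to the primary cue whenever randomization leaves the unwanted cue with some residual reliability, which is the situation the model is designed to quantify.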
28. Characterizing the dependence of pure-tone frequency difference limens on frequency, duration, and level. Hear Res 2012; 292:1-13. [PMID: 22841571] [DOI: 10.1016/j.heares.2012.07.004]
Abstract
This study examined the relationship between the difference limen for frequency (DLF) of pure tones and three commonly explored stimulus parameters of frequency, duration, and sensation level. Data from 12 published studies of pure-tone frequency discrimination (a total of 583 DLF measurements across 77 normal-hearing listeners) were analyzed using hierarchical (or "mixed-effects") generalized linear models. Model parameters were estimated using two approaches (Bayesian and maximum likelihood). A model in which log-transformed DLFs were predicted using a sum of power-law functions plus a random subject- or group-specific term was found to explain a substantial proportion of the variability in the psychophysical data. The results confirmed earlier findings of an inverse-square-root relationship between log-transformed DLFs and duration, and of an inverse relationship between log(DLF) and sensation level. However, they did not confirm earlier suggestions that log(DLF) increases approximately linearly with the square-root of frequency; instead, the relationship between frequency and log(DLF) was best fitted using a power function of frequency with an exponent of about 0.8. These results, and the comprehensive quantitative analysis of pure-tone frequency discrimination on which they are based, provide a new reference for the quantitative evaluation of models of frequency (or pitch) discrimination.
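The dependencies reported above can be summarized schematically as a sum of power-law terms acting on the log-transformed DLF (coefficients and the exact parameterization of the published hierarchical model are not reproduced here):

```latex
\log(\mathrm{DLF}) \;\approx\; \alpha
  \;+\; \beta_f\, f^{\,0.8}
  \;+\; \beta_d\, d^{-1/2}
  \;+\; \beta_{\mathrm{SL}}\, \mathrm{SL}^{-1}
  \;+\; u_{\mathrm{subject}}
```

where f is frequency, d is duration, SL is sensation level, and u_subject is the random subject- or group-specific term.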
29. Comparing models of the combined-stimulation advantage for speech recognition. J Acoust Soc Am 2012; 131:3970-3980. [PMID: 22559370] [PMCID: PMC3356316] [DOI: 10.1121/1.3699231]
Abstract
The "combined-stimulation advantage" refers to an improvement in speech recognition when cochlear-implant or vocoded stimulation is supplemented by low-frequency acoustic information. Previous studies have been interpreted as evidence for "super-additive" or "synergistic" effects in the combination of low-frequency and electric or vocoded speech information by human listeners. However, this conclusion was based on predictions of performance obtained using a suboptimal high-threshold model of information combination. The present study shows that a different model, based on Gaussian signal detection theory, can predict surprisingly large combined-stimulation advantages, even when performance with either information source alone is close to chance, without involving any synergistic interaction. A reanalysis of published data using this model reveals that previous results, which have been interpreted as evidence for super-additive effects in perception of combined speech stimuli, are actually consistent with a more parsimonious explanation, according to which the combined-stimulation advantage reflects an optimal combination of two independent sources of information. The present results do not rule out the possible existence of synergistic effects in combined stimulation; however, they emphasize the possibility that the combined-stimulation advantages observed in some studies can be explained simply by non-interactive combination of two information sources.
30. Further evidence that fundamental-frequency difference limens measure pitch discrimination. J Acoust Soc Am 2012; 131:3989-4001. [PMID: 22559372] [PMCID: PMC3356318] [DOI: 10.1121/1.3699253]
Abstract
Difference limens for complex tones (DLCs) that differ in F0 are widely regarded as a measure of periodicity-pitch discrimination. However, because F0 changes are inevitably accompanied by changes in the frequencies of the harmonics, DLCs may actually reflect the discriminability of individual components. To test this hypothesis, DLCs were measured for complex tones, the component frequencies of which were shifted coherently upward or downward by ΔF = 0%, 25%, 37.5%, or 50% of the F0, yielding fully harmonic (ΔF = 0%), strongly inharmonic (ΔF = 25%, 37.5%), or odd-harmonic (ΔF = 50%) tones. If DLCs truly reflect periodicity-pitch discriminability, they should be larger (worse) for inharmonic tones than for harmonic and odd-harmonic tones, because inharmonic tones have a weaker pitch. Consistent with this prediction, the results of two experiments showed a non-monotonic dependence of DLCs on ΔF, with larger DLCs for ΔFs of ±25% or ±37.5% than for ΔFs of 0% or ±50% of F0. These findings are consistent with models of pitch perception that involve harmonic templates, or with an autocorrelation-based model, provided that more than just the highest peak in the summary autocorrelogram is taken into account.
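As a concrete illustration of the stimulus manipulation, the sketch below synthesizes a complex tone whose component frequencies are all shifted by a fixed fraction of F0: a shift of 0 gives a harmonic tone, 0.25 or 0.375 a strongly inharmonic tone, and 0.5 an odd-harmonic-like tone. The sampling rate, duration, number of components, and equal component amplitudes are arbitrary choices for the example, not the stimulus parameters used in the study.

import numpy as np

def shifted_complex(f0=200.0, shift_frac=0.25, n_components=12, dur=0.5, fs=44100):
    # Complex tone with components at n*f0 + shift_frac*f0, equal amplitudes.
    t = np.arange(int(dur * fs)) / fs
    freqs = np.arange(1, n_components + 1) * f0 + shift_frac * f0
    sig = np.sum([np.sin(2 * np.pi * f * t) for f in freqs], axis=0)
    return sig / np.max(np.abs(sig))   # normalize to +/-1

harmonic   = shifted_complex(shift_frac=0.0)    # fully harmonic
inharmonic = shifted_complex(shift_frac=0.25)   # strongly inharmonic
odd_like   = shifted_complex(shift_frac=0.5)    # components at (n + 0.5)*f0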
Collapse
|
31
|
Neural mechanisms of rhythmic masking release in monkey primary auditory cortex: implications for models of auditory scene analysis. J Neurophysiol 2012; 107:2366-82. [PMID: 22323627 DOI: 10.1152/jn.01010.2011] [Citation(s) in RCA: 19] [Impact Index Per Article: 1.6] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/22/2022] Open
Abstract
The ability to detect and track relevant acoustic signals embedded in a background of other sounds is crucial for hearing in complex acoustic environments. This ability is exemplified by a perceptual phenomenon known as "rhythmic masking release" (RMR). To demonstrate RMR, a sequence of tones forming a target rhythm is intermingled with physically identical "Distracter" sounds that perceptually mask the rhythm. The rhythm can be "released from masking" by adding "Flanker" tones in adjacent frequency channels that are synchronous with the Distracters. RMR represents a special case of auditory stream segregation, whereby the target rhythm is perceptually segregated from the background of Distracters when they are accompanied by the synchronous Flankers. The neural basis of RMR is unknown. Previous studies suggest the involvement of primary auditory cortex (A1) in the perceptual organization of sound patterns. Here, we recorded neural responses to RMR sequences in A1 of awake monkeys in order to identify neural correlates and potential mechanisms of RMR. We also tested whether two current models of stream segregation, when applied to these responses, could account for the perceptual organization of RMR sequences. Results suggest a key role for suppression of Distracter-evoked responses by the simultaneous Flankers in the perceptual restoration of the target rhythm in RMR. Furthermore, predictions of stream segregation models paralleled the psychoacoustics of RMR in humans. These findings reinforce the view that preattentive or "primitive" aspects of auditory scene analysis may be explained by relatively basic neural mechanisms at the cortical level.
Collapse
|
32
|
Perception of across-frequency asynchrony and the role of cochlear delays. THE JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA 2012; 131:363-377. [PMID: 22280598 PMCID: PMC3272712 DOI: 10.1121/1.3665995] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.7] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Received: 05/12/2011] [Revised: 11/10/2011] [Accepted: 11/10/2011] [Indexed: 05/29/2023]
Abstract
Cochlear filtering results in earlier responses to high than to low frequencies. This study examined potential perceptual correlates of cochlear delays by measuring the perception of relative timing between tones of different frequencies. A brief 250-Hz tone was combined with a brief 1-, 2-, 4-, or 6-kHz tone. Two experiments were performed, one involving subjective judgments of perceived synchrony, the other involving asynchrony detection and discrimination. The functions relating the proportion of "synchronous" responses to the delay between the tones were similar for all tone pairs. Perceived synchrony was maximal when the tones in a pair were gated synchronously. The perceived-synchrony function slopes were asymmetric, being steeper on the low-frequency-leading side. In the second experiment, asynchrony-detection thresholds were lower for low-frequency-leading than for high-frequency-leading pairs. In contrast with previous studies, but consistent with the first experiment, thresholds did not depend on the frequency separation between the tones, perhaps because of the elimination of within-channel cues. The results of the two experiments were related quantitatively using a decision-theoretic model, and were found to be highly correlated. Overall, the results suggest that frequency-dependent cochlear group delays are compensated for at higher processing stages, resulting in veridical perception of timing relationships across frequency.
Collapse
|
33
|
A model-based analysis of the "combined-stimulation advantage". Hear Res 2011; 282:252-64. [PMID: 21801823 DOI: 10.1016/j.heares.2011.06.004] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Received: 05/20/2011] [Revised: 06/19/2011] [Accepted: 06/20/2011] [Indexed: 10/17/2022]
Abstract
Improvements in speech-recognition performance resulting from the addition of low-frequency information to electric (or vocoded) signals have attracted considerable interest in recent years. An important question is whether these improvements reflect a form of constructive perceptual interaction-whereby acoustic cues enhance the perception of electric or vocoded signals-or whether they can be explained without assuming any interaction. To address this question, speech-recognition performance was measured in 24 normal-hearing listeners using lowpass-filtered, vocoded, and "combined" (lowpass + vocoded) words presented either in quiet or in a realistic background (cafeteria noise), for different signal-to-noise ratios, different lowpass-filter cutoff frequencies, and different numbers of vocoder bands. The results of these measures were then compared to the predictions of three models of cue combination, including a "probability-summation" model and two Gaussian signal detection theory (SDT) models-one (the "independent-noises" model) involving pre-combination noises, and the other (the "late-noise" model) involving post-combination noise. Consistent with previous findings, speech-recognition performance with combined stimulation was significantly higher than performance with vocoded or lowpass stimuli alone, and it was also higher than predicted by the probability-summation model. The two Gaussian-SDT models could account quantitatively for the data. Moreover, a Bayesian model-comparison procedure demonstrated that, given the data, these two models were far more likely than the probability-summation model. Since these models do not involve any constructive-interaction mechanism, this demonstrates that constructive interactions are not needed to explain the combined-stimulation benefits measured in this study. It will be important for future studies to investigate whether this conclusion generalizes to other test conditions, including real electric-acoustic stimulation (EAS), and to further test the assumptions of these different models of the combined-stimulation advantage.
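The probability-summation benchmark used in the model comparison can be written in one line: if each source alone supports recognition of a speech unit with some probability and the two sources are treated as independent all-or-none channels, the combined probability is p_A + p_B - p_A*p_B. The snippet below computes that prediction; the single-source scores are invented, and the competing Gaussian-SDT predictions follow the quadrature rule sketched under entry 29 above.

def probability_summation(p_a, p_b):
    # Independent all-or-none channels: the item is received if either source alone succeeds.
    return p_a + p_b - p_a * p_b

# Invented single-source recognition scores for lowpass-only and vocoder-only speech
print(probability_summation(0.30, 0.40))   # 0.58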
Collapse
|
34
|
Psychometric functions for pure-tone frequency discrimination. THE JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA 2011; 130:263-72. [PMID: 21786896 PMCID: PMC3155586 DOI: 10.1121/1.3598448] [Citation(s) in RCA: 26] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 05/16/2023]
Abstract
The form of the psychometric function (PF) for auditory frequency discrimination is of theoretical interest and practical importance. In this study, PFs for pure-tone frequency discrimination were measured for several standard frequencies (200-8000 Hz) and levels [35-85 dB sound pressure level (SPL)] in normal-hearing listeners. The proportion-correct data were fitted using a cumulative-Gaussian function of the sensitivity index, d', computed as a power transformation of the frequency difference, Δf. The exponent of the power function corresponded to the slope of the PF on log(d')-log(Δf) coordinates. The influence of attentional lapses on PF-slope estimates was investigated. When attentional lapses were not taken into account, the estimated PF slopes on log(d')-log(Δf) coordinates were found to be significantly lower than 1, suggesting a nonlinear relationship between d' and Δf. However, when lapse rate was included as a free parameter in the fits, PF slopes were found not to differ significantly from 1, consistent with a linear relationship between d' and Δf. This was the case across the wide ranges of frequencies and levels tested in this study. Therefore, spectral and temporal models of frequency discrimination must account for a linear relationship between d' and Δf across a wide range of frequencies and levels.
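A concrete form of the fitted function may help: d' is a power function of the frequency difference, d' = (Δf/α)^β, so the slope of the PF on log(d')-log(Δf) coordinates equals β, and an optional lapse rate λ caps asymptotic performance. The sketch below defines this function under an assumed two-interval 2AFC mapping; the parameter values in the example call are arbitrary, and the actual fitting (for example, maximum likelihood on trial counts) is omitted.

import numpy as np
from scipy.stats import norm

def pf_2afc(delta_f, alpha, beta, lapse=0.0):
    # d' = (delta_f / alpha) ** beta; beta = 1 means d' is linear in delta_f.
    # PC = (1 - lapse) * Phi(d'/sqrt(2)) + lapse * 0.5   (assumed 2AFC mapping)
    d = (np.asarray(delta_f, dtype=float) / alpha) ** beta
    return (1.0 - lapse) * norm.cdf(d / np.sqrt(2.0)) + 0.5 * lapse

# Arbitrary example: frequency differences in Hz around a 1-kHz standard
print(pf_2afc([1.0, 2.0, 4.0, 8.0], alpha=2.0, beta=1.0, lapse=0.02))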
Collapse
|
35
|
Relationship between age of hearing-loss onset, hearing-loss duration, and speech recognition in individuals with severe-to-profound high-frequency hearing loss. J Assoc Res Otolaryngol 2011; 12:519-34. [PMID: 21350969 DOI: 10.1007/s10162-011-0261-8] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/10/2010] [Accepted: 02/07/2011] [Indexed: 11/25/2022] Open
Abstract
The factors responsible for interindividual differences in speech-understanding ability among hearing-impaired listeners are not well understood. Although audibility has been found to account for some of this variability, other factors may play a role. This study sought to examine whether part of the large interindividual variability of speech-recognition performance in individuals with severe-to-profound high-frequency hearing loss could be accounted for by differences in hearing-loss onset type (early, progressive, or sudden), age at hearing-loss onset, or hearing-loss duration. Other potential factors including age, hearing thresholds, speech-presentation levels, and speech audibility were controlled. Percent-correct (PC) scores for syllables in dissyllabic words, which were either unprocessed or lowpass filtered at cutoff frequencies ranging from 250 to 2,000 Hz, were measured in 20 subjects (40 ears) with severe-to-profound hearing losses above 1 kHz. For comparison purposes, 20 normal-hearing subjects (20 ears) were also tested using the same filtering conditions and a range of speech levels (10-80 dB SPL). Significantly higher asymptotic PCs were observed in the early (≤4 years) hearing-loss onset group than in both the progressive- and sudden-onset groups, even though the three groups did not differ significantly with respect to age, hearing thresholds, or speech audibility. In addition, significant negative correlations between PC and hearing-loss onset age, and positive correlations between PC and hearing-loss duration, were observed. These variables accounted for a greater proportion of the variance in speech-intelligibility scores than, and were not significantly correlated with, speech audibility, as quantified using a variant of the articulation index. Although the lack of statistical independence among hearing-loss onset type, hearing-loss onset age, hearing-loss duration, and age complicates and limits the interpretation of the results, these findings indicate that variables other than audibility can influence speech intelligibility in listeners with severe-to-profound high-frequency hearing loss.
Collapse
|
36
|
Abstract
Pitch, the perceptual correlate of fundamental frequency (F0), plays an important role in speech, music, and animal vocalizations. Changes in F0 over time help define musical melodies and speech prosody, while comparisons of simultaneous F0s are important for musical harmony and for segregating competing sound sources. This study compared listeners' ability to detect differences in F0 between pairs of sequential or simultaneous tones that were filtered into separate, nonoverlapping spectral regions. The timbre differences induced by filtering led to poor F0 discrimination in the sequential, but not the simultaneous, conditions. Temporal overlap of the two tones was not sufficient to produce good performance; instead, performance appeared to depend on the two tones being integrated into the same perceptual object. The results confirm the difficulty of comparing the pitches of sequential sounds with different timbres and suggest that, for simultaneous sounds, pitch differences may be detected through a decrease in perceptual fusion rather than an explicit coding and comparison of the underlying F0s.
Collapse
|
37
|
Recalibration of the auditory continuity illusion: sensory and decisional effects. Hear Res 2011; 277:152-62. [PMID: 21276844 DOI: 10.1016/j.heares.2011.01.013] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.8] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Received: 05/06/2010] [Revised: 01/17/2011] [Accepted: 01/19/2011] [Indexed: 12/01/2022]
Abstract
An interrupted sound can be perceived as continuous when noise masks the interruption, creating an illusion of continuity. Recent findings have shown that adaptor sounds preceding an ambiguous target sound can influence listeners' rating of target continuity. However, it remains unclear whether these aftereffects on perceived continuity influence sensory processes, decisional processes (i.e., criterion shifts), or both. The present study addressed this question. Results show that the target sound was more likely to be rated as 'continuous' when preceded by adaptors that were perceived as clearly discontinuous than when it was preceded by adaptors that were heard (illusorily or veridically) as continuous. Detection-theory analyses indicated that these contrastive aftereffects reflect a combination of sensory and decisional processes. The contrastive sensory aftereffect persisted even when adaptors and targets were presented to opposite ears, suggesting a neural origin in structures that receive binaural inputs. Finally, physically identical but perceptually ambiguous adaptors that were rated as 'continuous' induced more reports of target continuity than adaptors that were rated as 'discontinuous'. This assimilative aftereffect was purely decisional. These findings confirm that judgments of auditory continuity can be influenced by preceding events, and reveal that these aftereffects have both sensory and decisional components.
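The sensory-versus-decisional decomposition rests on standard yes/no signal detection theory. If 'continuous' responses to physically continuous targets are treated as hits and 'continuous' responses to physically interrupted targets as false alarms (an assumed mapping for illustration; the study used rating data), then d' indexes sensory sensitivity and the criterion c indexes response bias, so an aftereffect that changes d' is sensory while one that only shifts c is decisional. The trial counts below are invented.

from scipy.stats import norm

def sdt_yes_no(hits, misses, false_alarms, correct_rejections):
    # d' and criterion c for a yes/no task, with a log-linear correction
    # to avoid infinite z-scores when a rate is exactly 0 or 1.
    h = (hits + 0.5) / (hits + misses + 1.0)
    f = (false_alarms + 0.5) / (false_alarms + correct_rejections + 1.0)
    zh, zf = norm.ppf(h), norm.ppf(f)
    return zh - zf, -0.5 * (zh + zf)

d_cont, c_cont = sdt_yes_no(60, 40, 35, 65)   # after 'continuous'-sounding adaptors (invented counts)
d_disc, c_disc = sdt_yes_no(70, 30, 25, 75)   # after clearly discontinuous adaptors (invented counts)
print(d_cont, d_disc)   # a difference here would indicate a sensory effect
print(c_cont, c_disc)   # a difference here would indicate a decisional (criterion) effect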
Collapse
|
38
|
Behavioral measures of auditory streaming in ferrets (Mustela putorius). J Comp Psychol 2010; 124:317-30. [PMID: 20695663 DOI: 10.1037/a0018273] [Citation(s) in RCA: 25] [Impact Index Per Article: 1.9] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/08/2022]
Abstract
An important aspect of the analysis of auditory "scenes" relates to the perceptual organization of sound sequences into auditory "streams." In this study, we adapted two auditory perception tasks, used in recent human psychophysical studies, to obtain behavioral measures of auditory streaming in ferrets (Mustela putorius). One task involved the detection of shifts in the frequency of tones within an alternating tone sequence. The other task involved the detection of a stream of regularly repeating target tones embedded within a randomly varying multitone background. In both tasks, performance was measured as a function of various stimulus parameters, which previous psychophysical studies in humans have shown to influence auditory streaming. Ferret performance in the two tasks was found to vary as a function of these parameters in a way that is qualitatively consistent with the human data. These results suggest that auditory streaming occurs in ferrets, and that the two tasks described here may provide a valuable tool in future behavioral and neurophysiological studies of the phenomenon.
Collapse
|
39
|
Auditory stream segregation and the perception of across-frequency synchrony. J Exp Psychol Hum Percept Perform 2010; 36:1029-1039. [PMID: 20695716 DOI: 10.1037/a0017601] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.3] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/08/2022]
Abstract
This study explored the extent to which sequential auditory grouping affects the perception of temporal synchrony. In Experiment 1, listeners discriminated between 2 pairs of asynchronous "target" tones at different frequencies, A and B, in which the B tone either led or lagged. Thresholds were markedly higher when the target tones were temporally surrounded by "captor tones" at the A frequency than when the captor tones were absent or at a remote frequency. Experiment 2 extended these findings to asynchrony detection, revealing that the perception of synchrony, one of the most potent cues for simultaneous auditory grouping, is not immune to competing effects of sequential grouping. Experiment 3 examined the influence of ear separation on the interactions between sequential and simultaneous grouping cues. The results showed that, although ear separation could facilitate perceptual segregation and impair asynchrony detection, it did not prevent the perceptual integration of simultaneous sounds.
Collapse
|
40
|
Musical intervals and relative pitch: frequency resolution, not interval resolution, is special. THE JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA 2010; 128:1943-1951. [PMID: 20968366 PMCID: PMC2981111 DOI: 10.1121/1.3478785] [Citation(s) in RCA: 37] [Impact Index Per Article: 2.6] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Received: 12/03/2009] [Revised: 07/16/2010] [Accepted: 07/19/2010] [Indexed: 05/26/2023]
Abstract
Pitch intervals are central to most musical systems, which utilize pitch at the expense of other acoustic dimensions. It seemed plausible that pitch might uniquely permit precise perception of the interval separating two sounds, as this could help explain its importance in music. To explore this notion, a simple discrimination task was used to measure the precision of interval perception for the auditory dimensions of pitch, brightness, and loudness. Interval thresholds were then expressed in units of just-noticeable differences for each dimension, to enable comparison across dimensions. Contrary to expectation, when expressed in these common units, interval acuity was actually worse for pitch than for loudness or brightness. This likely indicates that the perceptual dimension of pitch is unusual not for interval perception per se, but rather for the basic frequency resolution it supports. The ubiquity of pitch in music may be due in part to this fine-grained basic resolution.
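The normalization that makes the cross-dimension comparison possible is simple arithmetic: an interval-discrimination threshold measured in physical units is divided by the just-noticeable difference for the same dimension, yielding acuity in common JND units. A toy example with invented values:

def interval_acuity_in_jnds(interval_threshold, jnd):
    # Express an interval-discrimination threshold in units of the basic JND.
    return interval_threshold / jnd

# e.g., a 1.5-semitone pitch-interval threshold with a 0.25-semitone frequency JND
print(interval_acuity_in_jnds(1.5, 0.25))   # 6.0 JND units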
Collapse
|
41
|
Does fundamental-frequency discrimination measure virtual pitch discrimination? THE JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA 2010; 128:1930-42. [PMID: 20968365 PMCID: PMC2981110 DOI: 10.1121/1.3478786] [Citation(s) in RCA: 12] [Impact Index Per Article: 0.9] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Received: 11/04/2009] [Revised: 07/16/2010] [Accepted: 07/19/2010] [Indexed: 05/30/2023]
Abstract
Studies of pitch perception often involve measuring difference limens for complex tones (DLCs) that differ in fundamental frequency (F0). These measures are thought to reflect F0 discrimination and to provide an indirect measure of subjective pitch strength. However, in many situations discrimination may be based on cues other than the pitch or the F0, such as differences in the frequencies of individual components or timbre (brightness). Here, DLCs were measured for harmonic and inharmonic tones under various conditions, including a randomized or fixed lowest harmonic number, with and without feedback. The inharmonic tones were produced by shifting the frequencies of all harmonics upwards by 6.25%, 12.5%, or 25% of F0. It was hypothesized that, if DLCs reflect residue-pitch discrimination, these frequency-shifted tones, which produce a weaker and more ambiguous pitch, would yield larger DLCs than the harmonic tones. However, if DLCs reflect comparisons of component pitches, or timbre, they should not be systematically influenced by frequency shifting. The results showed larger DLCs and more scattered pitch matches for inharmonic than for harmonic complexes, confirming that the inharmonic tones produced a less consistent pitch than the harmonic tones, and supporting the idea that DLCs reflect F0 (residue-pitch) discrimination.
Collapse
|
42
|
Abstract
Psychophysical reverse-correlation methods such as the "classification image" technique provide a unique tool to uncover the internal representations and decision strategies of individual participants in perceptual tasks. Over the past 30 years, these techniques have gained increasing popularity among both visual and auditory psychophysicists. However, thus far, principled applications of the psychophysical reverse-correlation approach have been almost exclusively limited to two-alternative decision (detection or discrimination) tasks. Whether and how reverse-correlation methods can be applied to uncover perceptual templates and decision strategies in situations involving more than just two response alternatives remain largely unclear. Here, the authors consider the problem of estimating perceptual templates and decision strategies in stimulus identification tasks with multiple response alternatives. They describe a modified correlational approach, which can be used to solve this problem. The approach is evaluated under a variety of simulated conditions, including different ratios of internal-to-external noise, different degrees of correlations between the sensory observations, and various statistical distributions of stimulus perturbations. The results indicate that the proposed approach is reasonably robust, suggesting that it could be used in future empirical studies.
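A generic version of the correlational idea (not necessarily the modified estimator proposed in the paper) is straightforward: for each response alternative, average the external stimulus perturbations over the trials that received that response and subtract the grand mean, giving one template-like map per alternative. The sketch below applies this to simulated data from a noisy template-matching observer.

import numpy as np

def response_conditioned_templates(perturbations, responses, n_alternatives):
    # Mean perturbation for each response alternative, minus the grand mean.
    # perturbations: (n_trials, n_features); responses: integers in [0, n_alternatives).
    grand_mean = perturbations.mean(axis=0)
    return np.array([perturbations[responses == k].mean(axis=0) - grand_mean
                     for k in range(n_alternatives)])

rng = np.random.default_rng(0)
noise = rng.normal(size=(5000, 6))                 # external perturbations, 6 features
true_templates = np.eye(3, 6)                      # each alternative keyed to one feature
internal_noise = 0.5 * rng.normal(size=(5000, 3))  # observer's internal noise
responses = np.argmax(noise @ true_templates.T + internal_noise, axis=1)
print(np.round(response_conditioned_templates(noise, responses, 3), 2))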
Collapse
|
43
|
Pitch perception for mixtures of spectrally overlapping harmonic complex tones. THE JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA 2010; 128:257-69. [PMID: 20649221 PMCID: PMC2921428 DOI: 10.1121/1.3372751] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.1] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Received: 11/04/2008] [Revised: 03/02/2010] [Accepted: 03/04/2010] [Indexed: 05/29/2023]
Abstract
This study measured difference limens for fundamental frequency (DLF0s) for a target harmonic complex in the presence of a simultaneous spectrally overlapping harmonic masker. The resolvability of the target harmonics was manipulated by bandpass filtering the stimuli into a low (800-2400 Hz) or high (1600-3200 Hz) spectral region, using different nominal F0s for the targets (100, 200, and 400 Hz), and different masker F0s (0, +9, or -9 semitones) relative to the target. Three different modes of masker presentation, relative to the target, were tested: ipsilateral, contralateral, and dichotic, with a higher masker level in the contralateral ear. Ipsilateral and dichotic maskers generally caused marked elevations in DLF0s compared to both the unmasked and contralateral masker conditions. Analyses based on excitation patterns revealed that ipsilaterally masked F0 difference limens were small (<2%) only when the excitation patterns evoked by the target-plus-masker mixture contained several salient (>1 dB) peaks at or close to target harmonic frequencies, even though these peaks were rarely produced by the target alone. The findings are discussed in terms of place- or place-time mechanisms of pitch perception.
Collapse
|
44
|
Neural adaptation to tone sequences in the songbird forebrain: patterns, determinants, and relation to the build-up of auditory streaming. J Comp Physiol A Neuroethol Sens Neural Behav Physiol 2010; 196:543-57. [PMID: 20563587 DOI: 10.1007/s00359-010-0542-4] [Citation(s) in RCA: 43] [Impact Index Per Article: 3.1] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/16/2010] [Revised: 05/08/2010] [Accepted: 05/28/2010] [Indexed: 11/29/2022]
Abstract
Neural responses to tones in the mammalian primary auditory cortex (A1) exhibit adaptation over the course of several seconds. Important questions remain about the taxonomic distribution of multi-second adaptation and its possible roles in hearing. It has been hypothesized that neural adaptation could explain the gradual "build-up" of auditory stream segregation. We investigated the influence of several stimulus-related factors on neural adaptation in the avian homologue of mammalian A1 (field L2) in starlings (Sturnus vulgaris). We presented awake birds with sequences of repeated triplets of two interleaved tones (ABA-ABA-...) in which we varied the frequency separation between the A and B tones (ΔF), the stimulus onset asynchrony (time from tone onset to onset within a triplet), and tone duration. We found that stimulus onset asynchrony generally had larger effects on adaptation compared with ΔF and tone duration over the parameter range tested. Using a simple model, we show how time-dependent changes in neural responses can be transformed into neurometric functions that make testable predictions about the dependence of the build-up of stream segregation on various spectral and temporal stimulus properties.
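One generic way to turn adapting responses into a neurometric build-up function is sketched below; it is a simplified stand-in for the modeling described, with every parameter value invented. Tone-evoked responses adapt exponentially over several seconds, the B-tone response at a site tuned to the A frequency is additionally suppressed by an amount assumed to grow with ΔF, and the probability of hearing two streams is read out as the probability that the noisy B response falls below a fixed threshold.

import numpy as np
from scipy.stats import norm

def adapted_response(t, r0=1.0, r_inf=0.4, tau=2.0):
    # Exponential multi-second adaptation of a tone-evoked response (t in seconds).
    return r_inf + (r0 - r_inf) * np.exp(-t / tau)

def p_two_streams(t, suppression, threshold=0.45, noise_sd=0.1):
    # suppression: fractional reduction of the B-tone response at an A-tuned site,
    # assumed to grow with the A-B frequency separation (ΔF). All values invented.
    r_b = adapted_response(t) * (1.0 - suppression)
    return norm.cdf((threshold - r_b) / noise_sd)

t = np.linspace(0.0, 10.0, 6)
print(np.round(p_two_streams(t, suppression=0.2), 2))   # small ΔF: gradual build-up
print(np.round(p_two_streams(t, suppression=0.7), 2))   # large ΔF: segregated from the start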
Collapse
|
45
|
Stimulus uncertainty and insensitivity to pitch-change direction. THE JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA 2010; 127:3026-37. [PMID: 21117752 PMCID: PMC2882662 DOI: 10.1121/1.3365252] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.1] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Received: 09/15/2009] [Revised: 02/16/2010] [Accepted: 02/18/2010] [Indexed: 05/19/2023]
Abstract
In a series of experiments, Semal and Demany [(2006). J. Acoust. Soc. Am. 120, 3907-3915] demonstrated that some normally hearing listeners are unable to determine the direction of small but detectable differences in frequency between pure tones. Unlike studies demonstrating similar effects in patients with brain damage, the authors used stimuli in which the standard frequency of the tones was highly uncertain (roved) over trials. In Experiment 1, listeners were identified as insensitive to the direction of pitch changes using stimuli with frequency roving. When listeners were retested using stimuli without roving in Experiment 2, impairments in pitch-direction identification were generally much less profound. In Experiment 3, frequency-roving range had a systematic effect on listeners' thresholds, and impairments in pitch-direction identification tended to occur only when the roving range was widest. In Experiment 4, the influence of frequency roving was similar for continuous frequency changes as for discrete changes. Possible explanations for the influence of roving on listeners' insensitivity to pitch-change direction are discussed.
Collapse
|
46
|
Behind the scenes of auditory perception. Curr Opin Neurobiol 2010; 20:361-6. [PMID: 20456940 DOI: 10.1016/j.conb.2010.03.009] [Citation(s) in RCA: 66] [Impact Index Per Article: 4.7] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/01/2010] [Revised: 03/16/2010] [Accepted: 03/29/2010] [Indexed: 11/30/2022]
Abstract
'Auditory scenes' often contain contributions from multiple acoustic sources. These are usually heard as separate auditory 'streams', which can be selectively followed over time. How and where these auditory streams are formed in the auditory system is one of the most fascinating questions facing auditory scientists today. Findings published within the past two years indicate that both cortical and subcortical processes contribute to the formation of auditory streams, and they raise important questions concerning the roles of primary and secondary areas of auditory cortex in this phenomenon. In addition, these findings underline the importance of taking into account the relative timing of neural responses, and the influence of selective attention, in the search for neural correlates of the perception of auditory streams.
Collapse
|
47
|
Can temporal fine structure represent the fundamental frequency of unresolved harmonics? THE JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA 2009; 125:2189-2199. [PMID: 19354395 PMCID: PMC2736736 DOI: 10.1121/1.3089220] [Citation(s) in RCA: 36] [Impact Index Per Article: 2.4] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Received: 03/21/2008] [Revised: 02/04/2009] [Accepted: 02/04/2009] [Indexed: 05/27/2023]
Abstract
At least two modes of pitch perception exist: in one, the fundamental frequency (F0) of harmonic complex tones is estimated using the temporal fine structure (TFS) of individual low-order resolved harmonics; in the other, F0 is derived from the temporal envelope of high-order unresolved harmonics that interact in the auditory periphery. Pitch is typically more accurate in the former than in the latter mode. Another possibility is that pitch can sometimes be coded via the TFS from unresolved harmonics. A recent study supporting this third possibility [Moore et al. (2006a). J. Acoust. Soc. Am. 119, 480-490] based its conclusion on a condition where phase interaction effects (implying unresolved harmonics) accompanied accurate F0 discrimination (implying TFS processing). The present study tests whether these results were influenced by audible distortion products. Experiment 1 replicated the original results, obtained using a low-level background noise. However, experiments 2-4 found no evidence for the use of TFS cues with unresolved harmonics when the background noise level was raised, or the stimulus level was lowered, to render distortion inaudible. Experiment 5 measured the presence and phase dependence of audible distortion products. The results provide no evidence that TFS cues are used to code the F0 of unresolved harmonics.
Collapse
|
48
|
63. Neural correlates of perceptual awareness versus informational masking in human auditory cortex: An MEG study. Clin Neurophysiol 2009. [DOI: 10.1016/j.clinph.2008.07.062] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 10/21/2022]
|
49
|
Abstract
In many psychophysical experiments, the participant's task is to detect small changes along a given stimulus dimension or to identify the direction (e.g., upward vs. downward) of such changes. The results of these experiments are traditionally analyzed with a constant-variance Gaussian (CVG) model or a high-threshold (HT) model. Here, the authors demonstrate that for changes along three basic sound dimensions (frequency, intensity, and amplitude-modulation rate), such models cannot account for the observed relationship between detection thresholds and direction-identification thresholds. It is shown that two alternative models can account for this relationship. One of them is based on the idea of sensory quanta; the other assumes that small changes are detected on the basis of Poisson processes with low means. The predictions of these two models are then compared against receiver operating characteristics (ROCs) for the detection of changes in sound intensity. It is concluded that human listeners' perception of small and unidimensional acoustic changes is better described by a discrete-state Poisson model than by the more commonly used CVG model or by the less favored HT and quantum models.
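The constant-variance Gaussian benchmark that the abstract argues against can be made explicit with a small simulation: under equal-variance Gaussian observations and optimal decision rules, identifying the direction of a change of a given size is predicted to be at least as accurate as detecting that a change occurred, so detectable-but-direction-ambiguous changes are hard to explain under that model. The internal change size, trial count, and task framings below are simplifying assumptions for illustration, not the designs used in the cited experiments.

import numpy as np

rng = np.random.default_rng(1)
n, delta = 200_000, 1.0            # change size expressed in internal-noise units

# Direction identification: a change is always present; judge up vs down.
direction = rng.choice([-1.0, 1.0], size=n)
obs = rng.normal(direction * delta, 1.0)
pc_identification = np.mean(np.sign(obs) == direction)   # optimal rule: sign of the observation

# 2AFC detection: one interval contains a change of random direction, the other none;
# the likelihood-ratio rule reduces to picking the interval with the larger |observation|.
obs_change = rng.normal(rng.choice([-delta, delta], size=n), 1.0)
obs_none = rng.normal(0.0, 1.0, size=n)
pc_detection = np.mean(np.abs(obs_change) > np.abs(obs_none))

print(pc_identification, pc_detection)   # identification >= detection under this CVG benchmark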
Collapse
|
50
|
Neural correlates of auditory perceptual awareness under informational masking. PLoS Biol 2008; 6:e138. [PMID: 18547141 PMCID: PMC2422852 DOI: 10.1371/journal.pbio.0060138] [Citation(s) in RCA: 143] [Impact Index Per Article: 8.9] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/12/2007] [Accepted: 04/23/2008] [Indexed: 11/18/2022] Open
Abstract
Our ability to detect target sounds in complex acoustic backgrounds is often limited not by the ear's resolution, but by the brain's information-processing capacity. The neural mechanisms and loci of this "informational masking" are unknown. We combined magnetoencephalography with simultaneous behavioral measures in humans to investigate neural correlates of informational masking and auditory perceptual awareness in the auditory cortex. Cortical responses were sorted according to whether or not target sounds were detected by the listener in a complex, randomly varying multi-tone background known to produce informational masking. Detected target sounds elicited a prominent, long-latency response (50-250 ms), whereas undetected targets did not. In contrast, both detected and undetected targets produced equally robust auditory middle-latency, steady-state responses, presumably from the primary auditory cortex. These findings indicate that neural correlates of auditory awareness in informational masking emerge between early and late stages of processing within the auditory cortex.
Collapse
|