1
Benefits of Text Supplementation on Sentence Recognition and Subjective Ratings With and Without Facial Cues for Listeners With Normal Hearing. Ear Hear 2022:00003446-990000000-00088. [PMID: 36534697 DOI: 10.1097/aud.0000000000001316]
Abstract
OBJECTIVES: Recognizing speech through telecommunication can be challenging in unfavorable listening conditions. Text supplementation or the provision of facial cues can facilitate speech recognition under some circumstances. However, our understanding of the combined benefit of text and facial cues in telecommunication is limited. The purpose of this study was to investigate the potential benefit of text supplementation for sentence recognition scores and subjective ratings of spoken speech, with and without facial cues available.

DESIGN: Twenty adult females (M = 24 years, range 21 to 29 years) with normal hearing performed a sentence recognition task and completed a subjective rating questionnaire in 24 conditions. The conditions varied by integrity of the available facial cues (clear facial cues, slight distortion facial cues, great distortion facial cues, no facial cues), signal-to-noise ratio (quiet, +1 dB, -3 dB), and text availability (with text, without text). When present, the text was an 86 to 88% accurate transcription of the auditory signal, presented at a 500 ms delay relative to the auditory signal.

RESULTS: The benefits of text supplementation were largest when facial cues were not available and when the signal-to-noise ratio was unfavorable. Although no recognition score benefit was present in quiet, the recognition benefit was significant at all levels of background noise for all levels of facial cue integrity. Moreover, participants' subjective ratings of text benefit were robust and present even in the absence of recognition benefit. Consistent with previous literature, facial cues were beneficial for sentence recognition scores in the most unfavorable signal-to-noise ratio, even when greatly distorted. Interestingly, although all levels of facial cues were beneficial for recognition scores, participants rated a significant benefit only with clear facial cues.

CONCLUSIONS: The benefit of text for auditory-only and auditory-visual speech recognition is evident in recognition scores and subjective ratings; the benefit is larger and more robust for subjective ratings than for scores. Text supplementation may therefore provide benefit that extends beyond speech recognition scores. Combined, these findings support the use of text supplementation in telecommunication even when facial cues are concurrently present, such as during teleconferencing or watching television.
2
Abstract
Most human auditory psychophysics research has historically been conducted in carefully controlled environments with calibrated audio equipment, and over potentially hours of repetitive testing with expert listeners. Here, we operationally define such conditions as having high ‘auditory hygiene’. From this perspective, conducting auditory psychophysical paradigms online presents a serious challenge, in that results may hinge on absolute sound presentation level, reliably estimated perceptual thresholds, low and controlled background noise levels, and sustained motivation and attention. We introduce a set of procedures that address these challenges and facilitate auditory hygiene for online auditory psychophysics. First, we establish a simple means of setting sound presentation levels. Across a set of four level-setting conditions conducted in person, we demonstrate the stability and robustness of this level setting procedure in open air and controlled settings. Second, we test participants’ tone-in-noise thresholds using widely adopted online experiment platforms and demonstrate that reliable threshold estimates can be derived online in approximately one minute of testing. Third, using these level and threshold setting procedures to establish participant-specific stimulus conditions, we show that an online implementation of the classic probe-signal paradigm can be used to demonstrate frequency-selective attention on an individual-participant basis, using a third of the trials used in recent in-lab experiments. Finally, we show how threshold and attentional measures relate to well-validated assays of online participants’ in-task motivation, fatigue, and confidence. This demonstrates the promise of online auditory psychophysics for addressing new auditory perception and neuroscience questions quickly, efficiently, and with more diverse samples. Code for the tests is publicly available through Pavlovia and Gorilla.
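The roughly one-minute threshold estimates described above can be obtained with standard adaptive tracking. As an illustrative sketch only (the study's actual code is released via Pavlovia and Gorilla; the function and listener here are hypothetical), a classic 2-down/1-up staircase, which converges on the ~70.7%-correct level, might look like this:

```python
import random

def two_down_one_up(respond, start_db=70.0, step_db=2.0, n_reversals=8):
    """2-down/1-up staircase: converges on the ~70.7%-correct level.
    `respond(level_db)` should return True when the trial is correct."""
    level, run, last_dir = start_db, 0, 0
    reversals = []
    while len(reversals) < n_reversals:
        if respond(level):
            run += 1
            if run < 2:
                continue            # need two correct in a row before stepping down
            run, direction = 0, -1
        else:
            run, direction = 0, +1  # any miss steps up
        if last_dir and direction != last_dir:
            reversals.append(level)
        last_dir = direction
        level += direction * step_db
    # Average the later reversals; the first ones mostly reflect the start level.
    return sum(reversals[2:]) / (len(reversals) - 2)

# Hypothetical simulated listener whose true threshold is 50 dB:
def fake_listener(level_db):
    return level_db + random.gauss(0.0, 1.0) > 50.0

estimate = two_down_one_up(fake_listener)  # lands near 50 dB
```

With a 2 dB step and eight reversals, such a track typically finishes in a few dozen trials, which is consistent with threshold estimation on the order of a minute.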
Affiliation(s)
- Sijia Zhao
- Department of Experimental Psychology, University of Oxford, Oxford, UK
- Christopher A Brown
- Department of Communication Science and Disorders, University of Pittsburgh, Pittsburgh, PA, USA
- Lori L Holt
- Department of Psychology, Carnegie Mellon University, Pittsburgh, PA, USA; Neuroscience Institute, Carnegie Mellon University, Pittsburgh, PA, USA
- Frederic Dick
- Department of Psychological Sciences, Birkbeck College, University of London, London, UK; Department of Experimental Psychology, PALS, University College London, London, UK
3
Reeves A, Seluakumaran K, Scharf B. Contralateral proximal interference. J Acoust Soc Am 2021; 149:3352. [PMID: 34241123 DOI: 10.1121/10.0004786]
Abstract
A contralateral "cue" tone presented in continuous broadband noise both lowers the threshold of a signal tone by guiding attention to it and raises its threshold by interference. Here, signal tones were fixed in duration (40 ms, 52 ms with ramps), frequency (1500 Hz), timing, and level, so attention did not need guidance. Interference by contralateral cues was studied in relation to cue-signal proximity, cue-signal temporal overlap, and cue-signal order (cue after: backward interference, BI; or cue first: forward interference, FI). Cues, also ramped, were 12 dB above the signal level. Long cues (300 or 600 ms) raised thresholds by 5.3 dB when the signal and cue overlapped and by 5.1 dB in FI and 3.2 dB in BI when cues and signals were separated by 40 ms. Short cues (40 ms) raised thresholds by 4.5 dB in FI and 4.0 dB in BI for separations of 7 to 40 ms, but by ∼13 dB when simultaneous and in phase. FI and BI are comparable in magnitude and hardly increase when the signal is close in time to abrupt cue transients. These results do not support the notion that masking of the signal is due to the contralateral cue onset/offset transient response. Instead, sluggish attention or temporal integration may explain contralateral proximal interference.
Affiliation(s)
- Adam Reeves
- Department of Psychology, Northeastern University, Boston, Massachusetts 02115, USA
- Kumar Seluakumaran
- Faculty of Medicine, Department of Physiology, University of Malaya, 50603 Kuala Lumpur, Malaysia
- Bertram Scharf
- Department of Psychology, Northeastern University, Boston, Massachusetts 02115, USA
4
Holt LL, Tierney AT, Guerra G, Laffere A, Dick F. Dimension-selective attention as a possible driver of dynamic, context-dependent re-weighting in speech processing. Hear Res 2018; 366:50-64. [PMID: 30131109 PMCID: PMC6107307 DOI: 10.1016/j.heares.2018.06.014]
Abstract
The contribution of acoustic dimensions to an auditory percept is dynamically adjusted and reweighted based on prior experience of how informative these dimensions are across the long-term and short-term environment. This is especially evident in speech perception, where listeners differentially weight information across multiple acoustic dimensions and use this information selectively to update expectations about future sounds. The dynamic and selective adjustment of how acoustic input dimensions contribute to perception has made it tempting to conceive of this as a form of non-spatial auditory selective attention. Here, we review several human speech perception phenomena that might be consistent with auditory selective attention, although the literature does not yet definitively support a mechanistic tie. We relate these human perceptual phenomena to illustrative nonhuman animal neurobiological findings that offer informative guideposts for testing mechanistic connections. We then present a novel empirical approach that can serve as a methodological bridge from human research to animal neurobiological studies. Finally, we describe four preliminary results that demonstrate its utility in advancing understanding of human non-spatial, dimension-based auditory selective attention.
Affiliation(s)
- Lori L Holt
- Department of Psychology, Carnegie Mellon University, Pittsburgh, PA, 15213, USA; Center for the Neural Basis of Cognition, Carnegie Mellon University, Pittsburgh, PA, 15213, USA
- Adam T Tierney
- Department of Psychological Sciences, Birkbeck College, University of London, London, WC1E 7HX, UK; Centre for Brain and Cognitive Development, Birkbeck College, London, WC1E 7HX, UK
- Giada Guerra
- Department of Psychological Sciences, Birkbeck College, University of London, London, WC1E 7HX, UK; Centre for Brain and Cognitive Development, Birkbeck College, London, WC1E 7HX, UK
- Aeron Laffere
- Department of Psychological Sciences, Birkbeck College, University of London, London, WC1E 7HX, UK
- Frederic Dick
- Department of Psychological Sciences, Birkbeck College, University of London, London, WC1E 7HX, UK; Centre for Brain and Cognitive Development, Birkbeck College, London, WC1E 7HX, UK; Department of Experimental Psychology, University College London, London, WC1H 0AP, UK
5
Extensive Tonotopic Mapping across Auditory Cortex Is Recapitulated by Spectrally Directed Attention and Systematically Related to Cortical Myeloarchitecture. J Neurosci 2017; 37:12187-12201. [PMID: 29109238 PMCID: PMC5729191 DOI: 10.1523/jneurosci.1436-17.2017]
Abstract
Auditory selective attention is vital in natural soundscapes. But it is unclear how attentional focus on the primary dimension of auditory representation—acoustic frequency—might modulate basic auditory functional topography during active listening. In contrast to visual selective attention, which is supported by motor-mediated optimization of input across saccades and pupil dilation, the primate auditory system has fewer means of differentially sampling the world. This makes spectrally directed endogenous attention a particularly crucial aspect of auditory attention. Using a novel functional paradigm combined with quantitative MRI, we establish in male and female listeners that human frequency-band-selective attention drives activation in both myeloarchitectonically estimated auditory core, and across the majority of tonotopically mapped nonprimary auditory cortex. The attentionally driven best-frequency maps show strong concordance with sensory-driven maps in the same subjects across much of the temporal plane, with poor concordance in areas outside traditional auditory cortex. There is significantly greater activation across most of auditory cortex when best frequency is attended, versus ignored; the same regions do not show this enhancement when attending to the least-preferred frequency band. Finally, the results demonstrate that there is spatial correspondence between the degree of myelination and the strength of the tonotopic signal across a number of regions in auditory cortex. Strong frequency preferences across tonotopically mapped auditory cortex spatially correlate with R1-estimated myeloarchitecture, indicating shared functional and anatomical organization that may underlie intrinsic auditory regionalization.

SIGNIFICANCE STATEMENT: Perception is an active process, especially sensitive to attentional state. Listeners direct auditory attention to track a violin's melody within an ensemble performance, or to follow a voice in a crowded cafe. Although diverse pathologies reduce quality of life by impacting such spectrally directed auditory attention, its neurobiological bases are unclear. We demonstrate that human primary and nonprimary auditory cortical activation is modulated by spectrally directed attention in a manner that recapitulates its tonotopic sensory organization. Further, the graded activation profiles evoked by single-frequency bands are correlated with attentionally driven activation when these bands are presented in complex soundscapes. Finally, we observe a strong concordance in the degree of cortical myelination and the strength of tonotopic activation across several auditory cortical regions.
6
Zimmermann JF, Moscovitch M, Alain C. Attending to auditory memory. Brain Res 2015; 1640:208-21. [PMID: 26638836 DOI: 10.1016/j.brainres.2015.11.032]
Abstract
Attention to memory describes the process of attending to memory traces when the object is no longer present. It has been studied primarily for representations of visual stimuli, with only a few studies examining attention to sound object representations in short-term memory. Here, we review the interplay of attention and auditory memory, with an emphasis on (1) attending to auditory memory in the absence of related external stimuli (i.e., reflective attention) and (2) effects of existing memory on guiding attention. Attention to auditory memory is discussed in the context of change deafness, and we argue that failures to detect changes in our auditory environments are most likely the result of a faulty comparison system for incoming and stored information. Objects are the primary building blocks of auditory attention, but attention can also be directed to individual features (e.g., pitch). We review short-term and long-term memory-guided modulation of attention based on characteristic features, location, and/or semantic properties of auditory objects, and propose that auditory attention-to-memory pathways emerge after sensory memory. A neural model for auditory attention to memory is developed, comprising two separate pathways in the parietal cortex, one involved in attention to higher-order features and the other in attention to sensory information. This article is part of a Special Issue entitled SI: Auditory working memory.
Affiliation(s)
- Jacqueline F Zimmermann
- University of Toronto, Department of Psychology, Sidney Smith Hall, 100 St. George Street, Toronto, Ontario, Canada M5S 3G3; Rotman Research Institute, Baycrest Hospital, 3560 Bathurst Street, Toronto, Ontario, Canada M6A 2E1
- Morris Moscovitch
- University of Toronto, Department of Psychology, Sidney Smith Hall, 100 St. George Street, Toronto, Ontario, Canada M5S 3G3; Rotman Research Institute, Baycrest Hospital, 3560 Bathurst Street, Toronto, Ontario, Canada M6A 2E1
- Claude Alain
- University of Toronto, Department of Psychology, Sidney Smith Hall, 100 St. George Street, Toronto, Ontario, Canada M5S 3G3; Rotman Research Institute, Baycrest Hospital, 3560 Bathurst Street, Toronto, Ontario, Canada M6A 2E1; Institute of Medical Sciences, University of Toronto, Toronto, Ontario, Canada
7
Wittekindt A, Kaiser J, Abel C. Attentional modulation of the inner ear: a combined otoacoustic emission and EEG study. J Neurosci 2014; 34:9995-10002. [PMID: 25057201 PMCID: PMC6608308 DOI: 10.1523/jneurosci.4861-13.2014]
Abstract
Attending to a single stimulus in a complex multisensory environment requires the ability to select relevant information while ignoring distracting input. The underlying mechanism and involved neuronal levels of this attentional gain control are still a matter of debate. Here, we investigated the influence of intermodal attention on different levels of auditory processing in humans. It is known that the activity of the cochlear amplifier can be modulated by efferent neurons of the medial olivocochlear complex. We used distortion product otoacoustic emission (DPOAE) measurements to monitor cochlear activity during an intermodal cueing paradigm. Simultaneously, central auditory processing was assessed by electroencephalography (EEG) with a steady-state paradigm targeting early cortical responses and analysis of alpha oscillations reflecting higher cognitive control of attentional modulation. We found effects of selective attention at all measured levels of the auditory processing: DPOAE levels differed significantly between periods of visual and auditory attention, showing a reduction during visual attention, but no change during auditory attention. Primary auditory cortex activity, as measured by the auditory steady-state response (ASSR), differed between conditions, with higher ASSRs during auditory than visual attention. Furthermore, the analysis of cortical oscillatory activity revealed increased alpha power over occipitoparietal and frontal regions during auditory compared with visual attention, putatively reflecting suppression of visual processing. In conclusion, this study showed both enhanced processing of attended acoustic stimuli in early sensory cortex and reduced processing of distracting input, both at higher cortical levels and at the most peripheral level of the hearing system, the cochlea.
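The alpha-oscillation analysis mentioned above is, at its core, band-limited spectral power. A minimal, dependency-free sketch (illustrative only; the study used standard EEG pipelines, and the function names here are invented for illustration):

```python
import cmath, math

def power_at(x, fs, f):
    """Power of signal x (a list of samples at rate fs) at frequency f, via the DFT."""
    n = len(x)
    s = sum(x[k] * cmath.exp(-2j * math.pi * f * k / fs) for k in range(n))
    return abs(s) ** 2 / n

def alpha_power(x, fs):
    """Mean power across the classic 8-12 Hz alpha band (1 Hz steps)."""
    return sum(power_at(x, fs, f) for f in range(8, 13)) / 5.0

fs = 250.0                                           # a typical EEG sampling rate
t = [k / fs for k in range(1000)]                    # 4 s of simulated data
alpha = [math.sin(2 * math.pi * 10 * s) for s in t]  # 10 Hz "alpha" oscillation
beta = [math.sin(2 * math.pi * 25 * s) for s in t]   # 25 Hz oscillation, for contrast
```

Comparing `alpha_power(alpha, fs)` against `alpha_power(beta, fs)` shows the 10 Hz signal dominating the 8 to 12 Hz band, which is the kind of contrast (occipitoparietal alpha during auditory vs. visual attention) the study quantified.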
Affiliation(s)
- Anna Wittekindt
- Institute of Medical Psychology, Goethe University Frankfurt, 60528 Frankfurt am Main, Germany
- Jochen Kaiser
- Institute of Medical Psychology, Goethe University Frankfurt, 60528 Frankfurt am Main, Germany
- Cornelius Abel
- Institute of Medical Psychology, Goethe University Frankfurt, 60528 Frankfurt am Main, Germany
9
Althen H, Wittekindt A, Gaese B, Kössl M, Abel C. Effect of contralateral pure tone stimulation on distortion emissions suggests a frequency-specific functioning of the efferent cochlear control. J Neurophysiol 2012; 107:1962-9. [DOI: 10.1152/jn.00418.2011]
Abstract
Contralateral acoustic stimulation (CAS) with white noise and pure tone stimuli was used to assess frequency specificity of efferent olivocochlear control of cochlear mechanics in the gerbil. Changes of the cochlear amplifier can be monitored by distortion product otoacoustic emissions (DPOAEs), which are a byproduct of the nonlinear amplification by the outer hair cells. We used the quadratic DPOAE f2-f1 as ipsilateral probe, as it is known to be sensitive to efferent olivocochlear activity. White noise CAS, used to evoke efferent activity, had maximal effects on the DPOAE level for f2-stimulus frequencies of 5–7 kHz. The dominant effect during CAS was a DPOAE level increase of up to 13.5 dB. The frequency specificity of the olivocochlear system was evaluated by presenting pure tones (0.5–38 kHz) as contralateral stimuli to evoke efferent activity. Maximal DPOAE level changes were triggered by CAS frequencies close to the frequency of the DPOAE elicitor tones (tested f2 range: 2.5–15 kHz). The effective CAS frequency range covered 1.4–2.4 octaves and was centered 0.42 octaves below the DPOAE elicitor tone f2. The frequency-specific effect of CAS with pure tones suggests a dedicated central control of mechanical adjustments for peripheral frequency processing.
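For readers unfamiliar with DPOAE nomenclature, the distortion-product frequencies follow directly from the two primary tones. A small sketch (the f2/f1 ratio of about 1.2 is a conventional choice in DPOAE work, not a value taken from this abstract):

```python
def dpoae_frequencies(f1, f2):
    """Frequencies of the two classic distortion products evoked by primaries f1 < f2:
    the quadratic difference tone f2 - f1 (the probe used in this study) and
    the cubic difference tone 2*f1 - f2 (the DPOAE measured most often)."""
    return {"quadratic f2-f1": f2 - f1, "cubic 2f1-f2": 2 * f1 - f2}

def octaves_below(f, n_oct):
    """Frequency lying n_oct octaves below f."""
    return f / (2.0 ** n_oct)

# With a conventional primary ratio of f2/f1 ~= 1.2, f2 = 6 kHz gives f1 = 5 kHz:
dps = dpoae_frequencies(5000.0, 6000.0)   # quadratic DP at 1000 Hz, cubic at 4000 Hz

# The abstract reports the most effective CAS frequencies ~0.42 octaves below f2:
center = octaves_below(6000.0, 0.42)      # roughly 4485 Hz
```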
Affiliation(s)
- H. Althen
- Institute for Cell Biology and Neuroscience, Department of Biological Sciences
- A. Wittekindt
- Institute for Cell Biology and Neuroscience, Department of Biological Sciences
- B. Gaese
- Institute for Cell Biology and Neuroscience, Department of Biological Sciences
- M. Kössl
- Institute for Cell Biology and Neuroscience, Department of Biological Sciences
- C. Abel
- Institute of Medical Psychology, Goethe University, Frankfurt am Main, Germany
|
10
|
Updating and feature overwriting in short-term memory for timbre. Atten Percept Psychophys 2010; 72:2289-303. [PMID: 21097870 DOI: 10.3758/bf03196702]
11
List A, Justus T. Relative priming of temporal local-global levels in auditory hierarchical stimuli. Atten Percept Psychophys 2010; 72:193-208. [PMID: 20045889 PMCID: PMC2802320 DOI: 10.3758/app.72.1.193]
Abstract
Priming is a useful tool for ascertaining the circumstances under which previous experiences influence behavior. Previously, using hierarchical stimuli, we demonstrated (Justus & List, 2005) that selectively attending to one temporal scale of an auditory stimulus improved subsequent attention to a repeated (vs. changed) temporal scale; that is, we demonstrated intertrial auditory temporal level priming. Here, we have extended those results to address whether level priming relied on absolute or relative temporal information. Both relative and absolute temporal information are important in auditory perception: Speech and music can be recognized over various temporal scales but become uninterpretable to a listener when presented too quickly or slowly. We first confirmed that temporal level priming generalized over new temporal scales. Second, in the context of multiple temporal scales, we found that temporal level priming operates predominantly on the basis of relative, rather than absolute, temporal information. These findings are discussed in the context of expectancies and relational invariance in audition.
12
Apoux F, Healy EW. On the number of auditory filter outputs needed to understand speech: further evidence for auditory channel independence. Hear Res 2009; 255:99-108. [PMID: 19539016 DOI: 10.1016/j.heares.2009.06.005]
Abstract
The number of auditory filter outputs required to identify phonemes was estimated in two experiments. Stimuli were divided into 30 contiguous equivalent rectangular bandwidths (ERB_N) spanning 80–7563 Hz. Normal-hearing listeners were presented with limited numbers of bands whose frequency locations were determined randomly from trial to trial, providing a general view (i.e., irrespective of specific band location) of the number of 1-ERB_N-wide speech bands needed to identify phonemes. The first experiment demonstrated that 20 such bands are required to accurately identify vowels, and 16 to identify consonants. In the second experiment, speech-shaped noise or time-reversed speech was introduced to the non-speech bands at various signal-to-noise ratios. Considerably elevated noise levels were necessary to substantially affect phoneme recognition, confirming a high degree of channel independence in the auditory system. The independence observed between auditory filter outputs supports current views of speech recognition in noise in which listeners extract and combine pieces of information randomly distributed in both time and frequency. These findings also suggest that the ability to partition incoming sounds into a large number of narrow bands, an ability often lost in cases of hearing impairment or cochlear implantation, is critical for speech recognition in noise.
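The 30-band division can be reproduced with the standard Glasberg and Moore ERB-rate formula (a sketch assuming that conventional formula; the paper's exact band edges may differ slightly):

```python
import math

def erb_number(f_hz):
    """Glasberg & Moore (1990) ERB-rate scale: frequency in Hz -> ERB_N number."""
    return 21.4 * math.log10(0.00437 * f_hz + 1.0)

def erb_to_hz(erb):
    """Inverse mapping: ERB_N number -> frequency in Hz."""
    return (10.0 ** (erb / 21.4) - 1.0) / 0.00437

def erb_band_edges(f_lo=80.0, f_hi=7563.0, n_bands=30):
    """Edges of n_bands contiguous bands of equal width on the ERB_N scale."""
    e_lo, e_hi = erb_number(f_lo), erb_number(f_hi)
    step = (e_hi - e_lo) / n_bands
    return [erb_to_hz(e_lo + i * step) for i in range(n_bands + 1)]

edges = erb_band_edges()
# The 80-7563 Hz span covers almost exactly 30 ERB_N on this scale,
# so each of the 30 contiguous bands is ~1 ERB_N wide.
```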
Affiliation(s)
- Frédéric Apoux
- Department of Speech and Hearing Science, The Ohio State University, Columbus, OH 43210, USA
13
Abstract
Three cued signal detection experiments demonstrated a role for auditory memory traces in frequency selectivity. The extent to which the cue predicted the signal frequency affected the size of the advantage for signals at the cue frequency over those at distant frequencies when the cue-signal gap was 10 sec but not when it was 1 sec. Detection of occasional signals presented at uncued frequencies was enhanced when they matched the frequency of cues from recent trials. With "relative" cues, which were usually followed by signals at the musical fifth above the cue frequency, performance on occasional signals at the cue frequency was enhanced relative to other unexpected frequencies. These results suggest that, regardless of the listener's expectations and intentions, the detectability of a signal is enhanced if its frequency matches an existing memory trace. One form of voluntary attention to frequency may involve maintaining traces that would otherwise slowly decay.
14
Scharf B, Reeves A, Suciu J. The time required to focus on a cued signal frequency. J Acoust Soc Am 2007; 121:2149-57. [PMID: 17471729 DOI: 10.1121/1.2537461]
Abstract
How quickly can a listener focus on a single tonal cue that indicates the frequency of an upcoming signal? Initial measurements were made with frequency uncertainty (signal frequency varies randomly from trial to trial) and with certainty (same frequency on all trials). Measured by a yes-no procedure, thresholds for 40- and 20-ms signals presented in continuous broadband noise at 50 dB SPL were higher in uncertainty than in certainty; the difference decreased monotonically from 5 dB at frequencies below 500 Hz to under 3 dB above about 2500 Hz. This decrease in the detrimental effect from uncertainty, which comes about with increasing signal frequency, may result from preferential attention to higher frequencies. In a second experiment, frequency again varied randomly, but each trial now began with a cue at the signal frequency. The critical variable was the delay from cue onset to signal onset. A delay of 352 ms eliminated the detrimental effect of frequency uncertainty at all frequencies. At the shortest delays of 52 and 82 ms the detrimental effect was reduced primarily at lower frequencies. Our analysis suggests that shifting focus to a cued frequency region, under optimal stimulus conditions, requires less than 52 ms.
Affiliation(s)
- Bertram Scharf
- Department of Psychology, Northeastern University, Boston, Massachusetts 02115, USA
15
Kidd G, Arbogast TL, Mason CR, Gallun FJ. The advantage of knowing where to listen. J Acoust Soc Am 2005; 118:3804-15. [PMID: 16419825 DOI: 10.1121/1.2109187]
Abstract
This study examined the role of focused attention along the spatial (azimuthal) dimension in a highly uncertain multitalker listening situation. The task of the listener was to identify key words from a target talker in the presence of two other talkers simultaneously uttering similar sentences. When the listener had no a priori knowledge about target location, or which of the three sentences was the target sentence, performance was relatively poor, near the value expected simply from choosing to focus attention on only one of the three locations. When the target sentence was cued before the trial, but location was uncertain, performance improved significantly relative to the uncued case. When spatial location information was provided before the trial, performance improved significantly for both cued and uncued conditions. If the location of the target was certain, proportion correct identification performance was higher than 0.9 independent of whether the target was cued beforehand. In contrast to studies in which known versus unknown spatial locations were compared for relatively simple stimuli and tasks, the results of the current experiments suggest that the focus of attention along the spatial dimension can play a very significant role in solving the "cocktail party" problem.
Affiliation(s)
- Gerald Kidd
- Department of Speech, Language and Hearing Sciences and Hearing Research Center, Boston University, 635 Commonwealth Avenue, Boston, Massachusetts 02215, USA
16
Justus T, List A. Auditory attention to frequency and time: an analogy to visual local-global stimuli. Cognition 2005; 98:31-51. [PMID: 16297675 PMCID: PMC1987383 DOI: 10.1016/j.cognition.2004.11.001]
Abstract
Two priming experiments demonstrated exogenous attentional persistence to the fundamental auditory dimensions of frequency (Experiment 1) and time (Experiment 2). In a divided-attention task, participants responded to an independent dimension, the identification of three-tone sequence patterns, for both prime and probe stimuli. The stimuli were specifically designed to parallel the local-global hierarchical letter stimuli of [Navon D. (1977). Forest before trees: The precedence of global features in visual perception. Cognitive Psychology, 9, 353-383] and the task was designed to parallel subsequent work in visual attention using Navon stimuli [Robertson, L. C. (1996). Attentional persistence for features of hierarchical patterns. Journal of Experimental Psychology: General, 125, 227-249; Ward, L. M. (1982). Determinants of attention to local and global features of visual forms. Journal of Experimental Psychology: Human Perception and Performance, 8, 562-581]. The results are discussed in terms of previous work in auditory attention and previous approaches to auditory local-global processing.
17
Ashkenazi A, Marks LE. Effect of endogenous attention on detection of weak gustatory and olfactory flavors. Percept Psychophys 2004; 66:596-608. [PMID: 15311659 DOI: 10.3758/bf03194904]
Abstract
The effect of endogenous attention on the detectability of weak flavorants was examined in an absolute detection (two-alternative forced-choice) task. Attention to sucrose improved the detectability of sucrose, a gustation-based flavorant, both when the alternative was water and when it was vanillin. But attention to vanillin did not improve the detectability of vanillin, an olfaction-based flavorant, either when the alternative was water or when it was sucrose. Nor did attention improve the detectability of vanillin when the alternative was citric acid, a tastant that is qualitatively less similar to vanillin than is sucrose. Attention had no positive effect on the detection of either sucrose or vanillin when it was mixed with the other substance. These findings suggest that although it is possible to attend selectively to gustatory flavors, it may be more difficult to attend selectively to olfactory flavors--perhaps because attention to flavors, which are taken in the mouth, is directed spatially toward the tongue, where gustatory, but not olfactory, receptors are located.
Affiliation(s)
- Amir Ashkenazi
- John B. Pierce Laboratory and Yale University, New Haven, Connecticut 06519, USA.
|
18
|
Richards VM, Neff DL. Cuing effects for informational masking. J Acoust Soc Am 2004; 115:289-300. [PMID: 14759022 DOI: 10.1121/1.1631942] [Citation(s) in RCA: 26] [Impact Index Per Article: 1.3] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 05/24/2023]
Abstract
The detection of a tone added to a random-frequency, multitone masker can be very poor even when the maskers have little energy in the frequency region of the signal. This paper examines the effects of adding a pretrial cue to reduce uncertainty for the masker or the signal. The first two experiments examined the effect of cuing a fixed-frequency signal as the number of masker components and presentation methods were manipulated. Cue effectiveness varied across observers, but could reduce thresholds by as much as 20 dB. Procedural comparisons indicated observers benefited more from having two masker samples to compare, with or without a signal cue, than having a single interval with one masker sample and a signal cue. The third experiment used random-frequency signals and compared no-cue, signal-cue, and masker-cue conditions, and also systematically varied the time interval between cue offset and trial onset. Thresholds with a cued random-frequency signal remained higher than for a cued fixed-frequency signal. For time intervals between the cue and trial of 50 ms or longer, thresholds were approximately the same with a signal or a masker cue and lower than when there was no cue. Without a cue or with a masker cue, analyses of possible decision strategies suggested observers attended to the potential signal frequencies, particularly the highest signal frequency. With a signal cue, observers appeared to attend to the frequency of the subsequent signal.
Affiliation(s)
- Virginia M Richards
- Department of Psychology, 3815 Walnut Street, University of Pennsylvania, Philadelphia, Pennsylvania 19104, USA.
|
19
|
Kim J, Davis C. Hearing foreign voices: does knowing what is said affect visual-masked-speech detection? Perception 2003; 32:111-20. [PMID: 12613790 DOI: 10.1068/p3466] [Citation(s) in RCA: 41] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 10/27/2022]
Abstract
We investigated audio-visual (AV) perceptual integration by examining the effect of seeing the speaker's synchronised moving face on masked-speech detection ability. Signal amplification and higher-level cognitive accounts of an AV advantage were contrasted, the latter by varying whether participants knew the language of the speaker. An AV advantage was shown for sentences whose mid-to-high-frequency acoustic envelope was highly correlated with articulator movement, regardless of knowledge of the language. For low-correlation sentences, knowledge of the language had a large impact; for participants with no knowledge of the language an AV inhibitory effect was found (providing support for reports of a compelling AV illusion). The results indicate a role for both sensory enhancement and higher-level cognitive factors in AV speech detection.
Affiliation(s)
- Jeesun Kim
- Department of Psychology, University of Melbourne, Parkville, Victoria 3010, Australia.
|
20
|
Prime DJ, Ward LM. Auditory frequency-based inhibition differs from spatial IOR. Percept Psychophys 2002; 64:771-84. [PMID: 12201336 DOI: 10.3758/bf03194744] [Citation(s) in RCA: 12] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 11/08/2022]
Abstract
Uninformative auditory frequency cues have a facilitatory effect on reaction time and accuracy of detection and intensity discrimination of target tones for cue-target intervals of up to 3 sec (Green & McKeown, 2001; Ward, 1997). Under some conditions, however, this facilitatory effect can reverse to an inhibitory effect at cue-target intervals longer than 450 msec (Mondor, Breau, & Milliken, 1998). The present work demonstrates that such inhibitory effects are not found in target-target experiments (Experiment 1) or in cue-target experiments requiring a go-no-go discrimination of the target (Experiment 2), whereas they do appear in the paradigm used by Mondor et al. (1998, Experiment 3), albeit unaffected by the similarity of cue and target. Thus, the frequency-based inhibitory effects sometimes found in auditory cuing tasks can be distinguished empirically from those characterizing spatial inhibition of return (IOR), which are found in both target-target and go-no-go cue-target paradigms. The present work and functional and neurophysiological arguments all support the position that different mechanisms underlie spatial IOR and the inhibitory effects sometimes found in auditory frequency processing.
Affiliation(s)
- David J Prime
- Department of Psychology, University of British Columbia, Vancouver, Canada.
|